| 1 | ---
 | 
| 2 | default_highlighter: oils-sh
 | 
| 3 | ---
 | 
| 4 | 
 | 
| 5 | A Tour of YSH
 | 
| 6 | =============
 | 
| 7 | 
 | 
| 8 | <!-- author's note about example names
 | 
| 9 | 
 | 
| 10 | - people: alice, bob
 | 
| 11 | - nouns: ale, bean
 | 
| 12 |   - peanut, coconut
 | 
| 13 | - 42 for integers
 | 
| 14 | -->
 | 
| 15 | 
 | 
| 16 | This doc describes the [YSH]($xref) language from a **clean slate**
 | 
| 17 | perspective.  We don't assume you know Unix shell, or the compatible
 | 
| 18 | [OSH]($xref).  But shell users will see the similarity, with simplifications
 | 
| 19 | and upgrades.
 | 
| 20 | 
 | 
| 21 | Remember, YSH is for Python and JavaScript users who avoid shell!  See the
 | 
| 22 | [project FAQ][FAQ] for more color on that.
 | 
| 23 | 
 | 
| 24 | [FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
 | 
| 25 | [path dependence]: https://en.wikipedia.org/wiki/Path_dependence
 | 
| 26 | 
 | 
| 27 | This document is **long** because it demonstrates nearly every feature of the
 | 
| 28 | language.  You may want to read it in multiple sittings, or read [The Simplest
 | 
| 29 | Explanation of
 | 
| 30 | Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
 | 
| 31 | (Until 2023, YSH was called the "Oil language".)
 | 
| 32 | 
 | 
| 33 | 
 | 
| 34 | Here's a summary of what follows:
 | 
| 35 | 
 | 
| 36 | 1. YSH has interleaved *word*, *command*, and *expression* languages.
 | 
| 37 |    - The command language has Ruby-like *blocks*, and the expression language
 | 
| 38 |      has Python-like *data types*.
 | 
| 39 | 2. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
 | 
| 40 |    `join()`.
 | 
| 41 | 3. Languages for *data*, like [JSON][], are complementary to YSH code.
 | 
| 42 | 4. OSH and YSH share both an *interpreter data model* and a *process model*
 | 
| 43 |    (provided by the Unix kernel).  Understanding these common models will make
 | 
| 44 |    you both a better shell user and YSH user.
 | 
| 45 | 
 | 
| 46 | Keep these points in mind as you read the details below.
 | 
| 47 | 
 | 
| 48 | [JSON]: https://json.org
 | 
| 49 | 
 | 
| 50 | <div id="toc">
 | 
| 51 | </div>
 | 
| 52 | 
 | 
| 53 | ## Preliminaries
 | 
| 54 | 
 | 
| 55 | Start YSH just like you start bash or Python:
 | 
| 56 | 
 | 
| 57 | <!-- oils-sh below skips code block extraction, since it doesn't run -->
 | 
| 58 | 
 | 
| 59 | ```sh-prompt
 | 
| 60 | bash$ ysh                # assuming it's installed
 | 
| 61 | 
 | 
| 62 | ysh$ echo 'hello world'  # command typed into YSH
 | 
| 63 | hello world
 | 
| 64 | ```
 | 
| 65 | 
 | 
| 66 | In the sections below, we'll save space by showing output **in comments**, with
 | 
| 67 | `=>`:
 | 
| 68 | 
 | 
| 69 |     echo 'hello world'       # => hello world
 | 
| 70 | 
 | 
| 71 | Multi-line output is shown like this:
 | 
| 72 | 
 | 
| 73 |     echo one
 | 
| 74 |     echo two
 | 
| 75 |     # =>
 | 
| 76 |     # one
 | 
| 77 |     # two
 | 
| 78 | 
 | 
| 79 | ## Examples
 | 
| 80 | 
 | 
| 81 | ### Hello World Script
 | 
| 82 | 
 | 
| 83 | You can also type commands into a file like `hello.ysh`.  This is a complete
 | 
| 84 | YSH program, which is identical to a shell program:
 | 
| 85 | 
 | 
| 86 |     echo 'hello world'     # => hello world
 | 
| 87 | 
 | 
| 88 | ### A Taste of YSH
 | 
| 89 | 
 | 
| 90 | Unlike shell, YSH has `var` and `const` keywords:
 | 
| 91 | 
 | 
| 92 |     const name = 'world'   # const is rarer, used at the top level
 | 
| 93 |     echo "hello $name"     # => hello world
 | 
| 94 | 
 | 
| 95 | They take rich Python-like expressions on the right:
 | 
| 96 | 
 | 
| 97 |     var x = 0              # an integer, not a string
 | 
| 98 |     setvar x = x * 2 + 1   # mutate with the 'setvar' keyword
 | 
| 99 | 
 | 
| 100 |     setvar x += 5          # Increment by 5
 | 
| 101 |     echo $x                # => 6
 | 
| 102 | 
 | 
| 103 |     var mylist = [x, 7]    # two integers [6, 7]
 | 
| 104 | 
 | 
| 105 | Expressions are often surrounded by `()`:
 | 
| 106 | 
 | 
| 107 |     if (x > 0) {
 | 
| 108 |       echo 'positive'
 | 
| 109 |     }  # => positive
 | 
| 110 | 
 | 
| 111 |     for i, item in (mylist) {  # 'mylist' is a variable, not a string
 | 
| 112 |       echo "[$i] item $item"
 | 
| 113 |     }
 | 
| 114 |     # =>
 | 
| 115 |     # [0] item 6
 | 
| 116 |     # [1] item 7
 | 
| 117 | 
 | 
| 118 | YSH has Ruby-like blocks:
 | 
| 119 | 
 | 
| 120 |     cd /tmp {
 | 
| 121 |       echo hi > greeting.txt  # file created inside /tmp
 | 
| 122 |       echo $PWD               # => /tmp
 | 
| 123 |     }
 | 
| 124 |     echo $PWD                 # prints the original directory
 | 
| 125 | 
 | 
| 126 | And utilities to read and write JSON:
 | 
| 127 | 
 | 
| 128 |     var person = {name: 'bob', age: 42}
 | 
| 129 |     json write (person)
 | 
| 130 |     # =>
 | 
| 131 |     # {
 | 
| 132 |     #   "name": "bob",
 | 
| 133 |     #   "age": 42
 | 
| 134 |     # }
 | 
| 135 | 
 | 
| 136 |     echo '["str", 42]' | json read  # sets '_reply' variable by default
 | 
| 137 | 
 | 
| 138 | The `=` keyword evaluates and prints an expression:
 | 
| 139 | 
 | 
| 140 |     = _reply
 | 
| 141 |     # => (List)   ["str", 42]
 | 
| 142 | 
 | 
| 143 | (Think of it like `var x = _reply`, without the `var`.)
 | 
| 144 | 
 | 
| 145 | ## Word Language: Expressions for Strings (and Arrays)
 | 
| 146 | 
 | 
| 147 | Let's describe the word language first, and then talk about commands and
 | 
| 148 | expressions.  Words are a rich language because **strings** are a central
 | 
| 149 | concept in shell.
 | 
| 150 | 
 | 
| 151 | ### Three Kinds of String Literals
 | 
| 152 | 
 | 
| 153 | You can choose the quoting style that's most convenient to write a given
 | 
| 154 | string.
 | 
| 155 | 
 | 
| 156 | #### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
 | 
| 157 | 
 | 
| 158 | Double-quoted strings allow **interpolation with `$`**:
 | 
| 159 | 
 | 
| 160 |     var person = 'alice'
 | 
| 161 |     echo "hi $person, $(echo bye)"  # => hi alice, bye
 | 
| 162 | 
 | 
| 163 | Write operators by escaping them with `\`:
 | 
| 164 | 
 | 
| 165 |     echo "\$ \" \\ "                # => $ " \
 | 
| 166 | 
 | 
| 167 | In single-quoted strings, all characters are **literal** (except `'`, which
 | 
| 168 | can't be expressed):
 | 
| 169 | 
 | 
| 170 |     echo 'c:\Program Files\'        # => c:\Program Files\
 | 
| 171 | 
 | 
| 172 | If you want C-style backslash **character escapes**, use a J8 string, which is
 | 
| 173 | like JSON, but with single quotes:
 | 
| 174 | 
 | 
| 175 |     echo u' A is \u{41} \n line two, with backslash \\'
 | 
| 176 |     # =>
 | 
| 177 |     #  A is A
 | 
| 178 |     #  line two, with backslash \
 | 
| 179 | 
 | 
| 180 | The `u''` strings are guaranteed to be valid Unicode (unlike JSON), but you can
 | 
| 181 | also use `b''` strings:
 | 
| 182 | 
 | 
| 183 |     echo b'byte \yff'  # byte that's not valid unicode, like \xff in other languages
 | 
| 184 |                        # do not confuse with \u{ff}
 | 
| 185 | 
 | 
| 186 | #### Multi-line Strings
 | 
| 187 | 
 | 
| 188 | Multi-line strings are surrounded with triple quotes.  They come in the same
 | 
| 189 | three varieties, and leading whitespace is stripped in a convenient way.
 | 
| 190 | 
 | 
| 191 |     sort <<< """
 | 
| 192 |     var sub: $x
 | 
| 193 |     command sub: $(echo hi)
 | 
| 194 |     expression sub: $[x + 3]
 | 
| 195 |     """
 | 
| 196 |     # =>
 | 
| 197 |     # command sub: hi
 | 
| 198 |     # expression sub: 9
 | 
| 199 |     # var sub: 6
 | 
| 200 | 
 | 
| 201 |     sort <<< '''
 | 
| 202 |     $2.00  # literal $, no interpolation
 | 
| 203 |     $1.99
 | 
| 204 |     '''
 | 
| 205 |     # =>
 | 
| 206 |     # $1.99
 | 
| 207 |     # $2.00
 | 
| 208 | 
 | 
| 209 |     sort <<< u'''
 | 
| 210 |     C\tD
 | 
| 211 |     A\tB
 | 
| 212 |     '''  # b''' strings also supported
 | 
| 213 |     # =>
 | 
| 214 |     # A        B
 | 
| 215 |     # C        D
 | 
| 216 | 
 | 
| 217 | (Use multiline strings instead of shell's [here docs]($xref:here-doc).)
 | 
| 218 | 
 | 
| 219 | ### Three Kinds of Substitution
 | 
| 220 | 
 | 
| 221 | YSH has syntax for 3 types of substitution, all of which start with `$`.  These
 | 
| 222 | things can all be converted to a **string**:
 | 
| 223 | 
 | 
| 224 | 1. Variables
 | 
| 225 | 2. The output of commands
 | 
| 226 | 3. The value of expressions
 | 
| 227 | 
 | 
| 228 | #### Variable Sub
 | 
| 229 | 
 | 
| 230 | The syntax `$a` or `${a}` converts a variable to a string:
 | 
| 231 | 
 | 
| 232 |     var a = 'ale'
 | 
| 233 |     echo $a                          # => ale
 | 
| 234 |     echo _${a}_                      # => _ale_
 | 
| 235 |     echo "_ $a _"                    # => _ ale _
 | 
| 236 | 
 | 
| 237 | The shell operator `:-` is occasionally useful in YSH:
 | 
| 238 | 
 | 
| 239 |     echo ${not_defined:-'default'}   # => default
 | 
| 240 | 
 | 
| 241 | #### Command Sub
 | 
| 242 | 
 | 
| 243 | The `$(echo hi)` syntax runs a command and captures its `stdout`:
 | 
| 244 | 
 | 
| 245 |     echo $(hostname)                 # => example.com
 | 
| 246 |     echo "_ $(hostname) _"           # => _ example.com _
 | 
| 247 | 
 | 
| 248 | #### Expression Sub
 | 
| 249 | 
 | 
| 250 | The `$[myexpr]` syntax evaluates an expression and converts it to a string:
 | 
| 251 | 
 | 
| 252 |     echo $[a]                        # => ale
 | 
| 253 |     echo $[1 + 2 * 3]                # => 7
 | 
| 254 |     echo "_ $[1 + 2 * 3] _"          # => _ 7 _
 | 
| 255 | 
 | 
| 256 | <!-- TODO: safe substitution with "$[a]"html -->
 | 
| 257 | 
 | 
| 258 | ### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
 | 
| 259 | 
 | 
| 260 | There are four constructs that evaluate to a **list of strings**, rather than
 | 
| 261 | a single string.
 | 
| 262 | 
 | 
| 263 | #### Globs
 | 
| 264 | 
 | 
| 265 | Globs like `*.py` evaluate to a list of files.
 | 
| 266 | 
 | 
| 267 |     touch foo.py bar.py  # create the files
 | 
| 268 |     write *.py
 | 
| 269 |     # =>
 | 
| 270 |     # bar.py
 | 
| 271 |     # foo.py
 | 
| 272 | 
 | 
| 273 | If no files match, it evaluates to an empty list (`[]`).
 | 
| 274 | 
 | 
| 275 | #### Brace Expansion
 | 
| 276 | 
 | 
| 277 | The brace expansion mini-language lets you write strings without duplication:
 | 
| 278 | 
 | 
| 279 |     write {alice,bob}@example.com
 | 
| 280 |     # =>
 | 
| 281 |     # alice@example.com
 | 
| 282 |     # bob@example.com
 | 
| 283 | 
 | 
| 284 | #### Splicing
 | 
| 285 | 
 | 
| 286 | The `@` operator splices an array into a command:
 | 
| 287 | 
 | 
| 288 |     var myarray = :| ale bean |
 | 
| 289 |     write S @myarray E
 | 
| 290 |     # =>
 | 
| 291 |     # S
 | 
| 292 |     # ale
 | 
| 293 |     # bean
 | 
| 294 |     # E
 | 
| 295 | 
 | 
| 296 | You also have `@[]` to splice an expression that evaluates to a list:
 | 
| 297 | 
 | 
| 298 |     write -- @[split('ale bean')]
 | 
| 299 |     # => 
 | 
| 300 |     # ale
 | 
| 301 |     # bean
 | 
| 302 | 
 | 
| 303 | Each item will be converted to a string.
 | 
| 304 | 
 | 
| 305 | #### Split Command Sub / Split Builtin Sub
 | 
| 306 | 
 | 
| 307 | There's also a variant of *command sub* that splits first:
 | 
| 308 | 
 | 
| 309 |     write @(seq 3)  # write gets 3 arguments
 | 
| 310 |     # =>
 | 
| 311 |     # 1
 | 
| 312 |     # 2
 | 
| 313 |     # 3
 | 
| 314 | 
 | 
| 315 | <!-- TODO: This should decode J8 notation, which includes "" j"" and b"" -->
 | 
| 316 | 
 | 
| 317 | ## Command Language: I/O, Control Flow, Abstraction
 | 
| 318 | 
 | 
| 319 | ### Simple Commands and Redirects
 | 
| 320 | 
 | 
| 321 | A simple command is a space-separated list of words, which are often unquoted.
 | 
| 322 | YSH looks up the first word to determine if it's a `proc` or shell builtin.
 | 
| 323 | 
 | 
| 324 |     echo 'hello world'   # The shell builtin 'echo'
 | 
| 325 | 
 | 
| 326 |     proc greet (name) {  # A proc is like a procedure or process
 | 
| 327 |       echo "hello $name"
 | 
| 328 |     }
 | 
| 329 | 
 | 
| 330 |     # Now the first word will resolve to the proc
 | 
| 331 |     greet alice          # => hello alice
 | 
| 332 | 
 | 
| 333 | If it's neither, then it's assumed to be an external command:
 | 
| 334 | 
 | 
| 335 |     ls -l /tmp           # The external 'ls' command
 | 
| 336 | 
 | 
| 337 | Commands accept traditional string arguments, as well as typed arguments in
 | 
| 338 | parentheses:
 | 
| 339 | 
 | 
| 340 |     # 'write' is a string arg; 'x' is a typed expression arg
 | 
| 341 |     json write (x)
 | 
| 342 | 
 | 
| 343 | You can **redirect** `stdin` and `stdout` of simple commands:
 | 
| 344 | 
 | 
| 345 |     echo hi > tmp.txt  # write to a file
 | 
| 346 |     sort < tmp.txt
 | 
| 347 | 
 | 
| 348 | Idioms for using stderr (identical to shell):
 | 
| 349 | 
 | 
| 350 |     ls /tmp 2>errors.txt
 | 
| 351 |     echo 'fatal error' 1>&2
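
As a small sketch of why the `2>` redirect is useful: the error message from a failing command goes to a file, while only `stdout` reaches the terminal. The path `/no-such-dir` is a made-up name assumed not to exist, and `errors.txt` is an arbitrary file name. The snippet uses only syntax shared by shell and YSH.

```sh
# 'ls' fails and writes its complaint to errors.txt, not the terminal.
# The '||' branch runs because the exit status is non-zero.
ls /no-such-dir 2>errors.txt || echo 'ls failed, details are in errors.txt'
# => ls failed, details are in errors.txt
```

Afterwards, `errors.txt` holds whatever message `ls` printed; the exact wording depends on the `ls` implementation.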
 | 
| 352 | 
 | 
| 353 | "Simple" commands in YSH can also have typed `()` and block `{}` args, which
 | 
| 354 | we'll see in the section on "procs".
 | 
| 355 | 
 | 
| 356 | ### Pipelines
 | 
| 357 | 
 | 
| 358 | Pipelines are a powerful method for manipulating data streams:
 | 
| 359 | 
 | 
| 360 |     ls | wc -l                       # count files in this directory
 | 
| 361 |     find /bin -type f | xargs wc -l  # count lines in each file in a subtree
 | 
| 362 | 
 | 
| 363 | The stream may contain (lines of) text, binary data, JSON, TSV, and more.
 | 
| 364 | Details below.
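
The pipelines above print results that depend on your file system. Here is a minimal sketch with a predictable result, assuming the external `seq`, `sort`, and `head` utilities are installed (they're not YSH builtins). Each stage reads lines of text from the previous one:

```sh
# Generate 1..5, sort numerically in reverse, keep the first two lines.
seq 5 | sort -rn | head -n 2
# =>
# 5
# 4
```

This form is the same in shell and YSH; only the commands on either side of `|` change.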
 | 
| 365 | 
 | 
| 366 | ### Multi-line Commands
 | 
| 367 | 
 | 
| 368 | The YSH `...` prefix lets you write long commands, pipelines, and `&&` chains
 | 
| 369 | without `\` line continuations.
 | 
| 370 | 
 | 
| 371 |     ... find /bin               # traverse this directory and
 | 
| 372 |         -type f -a -executable  # print executable files
 | 
| 373 |       | sort -r                 # reverse sort
 | 
| 374 |       | head -n 30              # limit to 30 files
 | 
| 375 |       ;
 | 
| 376 | 
 | 
| 377 | When this mode is active:
 | 
| 378 | 
 | 
| 379 | - A single newline behaves like a space
 | 
| 380 | - A blank line (two newlines in a row) is illegal, but a line that has only a
 | 
| 381 |   comment is allowed.  This prevents confusion if you forget the `;`
 | 
| 382 |   terminator.
 | 
| 383 | 
 | 
| 384 | ### `var`, `setvar`, `const` to Declare and Mutate
 | 
| 385 | 
 | 
| 386 | Constants can't be modified:
 | 
| 387 | 
 | 
| 388 |     const myconst = 'mystr'
 | 
| 389 |     # setvar myconst = 'foo' would be an error
 | 
| 390 | 
 | 
| 391 | Modify variables with the `setvar` keyword:
 | 
| 392 | 
 | 
| 393 |     var num_beans = 12
 | 
| 394 |     setvar num_beans = 13
 | 
| 395 | 
 | 
| 396 | A more complex example:
 | 
| 397 | 
 | 
| 398 |     var d = {name: 'bob', age: 42}  # dict literal
 | 
| 399 |     setvar d.name = 'alice'         # d.name is a synonym for d['name']
 | 
| 400 |     echo $[d.name]                  # => alice
 | 
| 401 | 
 | 
| 402 | That's most of what you need to know about assignments.  Advanced users may
 | 
| 403 | want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
 | 
| 404 | 
 | 
| 405 | <!--
 | 
| 406 |     var g = 1
 | 
| 407 |     var h = 2
 | 
| 408 |     proc demo(:out) {
 | 
| 409 |       setglobal g = 42
 | 
| 410 |       setref out = 43
 | 
| 411 |     }
 | 
| 412 |     demo :h  # pass a reference to h
 | 
| 413 |     echo "$g $h"  # => 42 43
 | 
| 414 | -->
 | 
| 415 | 
 | 
| 416 | More details: [Variable Declaration and Mutation](variables.html).
 | 
| 417 | 
 | 
| 418 | ### `for` Loop
 | 
| 419 | 
 | 
| 420 | Shell-style for loops iterate over **words**:
 | 
| 421 | 
 | 
| 422 |     for word in 'oils' $num_beans {pea,coco}nut {
 | 
| 423 |       echo $word
 | 
| 424 |     }
 | 
| 425 |     # =>
 | 
| 426 |     # oils
 | 
| 427 |     # 13
 | 
| 428 |     # peanut
 | 
| 429 |     # coconut
 | 
| 430 | 
 | 
| 431 | You can also request the loop index:
 | 
| 432 | 
 | 
| 433 |     for i, word in README.md *.py {
 | 
| 434 |       echo "$i - $word"
 | 
| 435 |     }
 | 
| 436 |     # =>
 | 
| 437 |     # 0 - README.md
 | 
| 438 |     # 1 - __init__.py
 | 
| 439 | 
 | 
| 440 | To iterate over typed data, use parentheses around an **expression**.  The
 | 
| 441 | expression should evaluate to an integer range, `List`, `Dict`, or `Str`
 | 
| 442 | (TODO).
 | 
| 443 | 
 | 
| 444 |     for i in (3 .. 5) {  # range operator ..
 | 
| 445 |       echo "i = $i"
 | 
| 446 |     }
 | 
| 447 |     # =>
 | 
| 448 |     # i = 3
 | 
| 449 |     # i = 4
 | 
| 450 | 
 | 
| 451 | List:
 | 
| 452 | 
 | 
| 453 |     var foods = ['ale', 'bean']
 | 
| 454 |     for item in (foods) {
 | 
| 455 |       echo $item
 | 
| 456 |     }
 | 
| 457 |     # =>
 | 
| 458 |     # ale
 | 
| 459 |     # bean
 | 
| 460 | 
 | 
| 461 | Again you can request the index:
 | 
| 462 | 
 | 
| 463 |     for i, item in (foods) {
 | 
| 464 |       echo "$i - $item"
 | 
| 465 |     }
 | 
| 466 |     # =>
 | 
| 467 |     # 0 - ale
 | 
| 468 |     # 1 - bean
 | 
| 469 | 
 | 
| 470 | Likewise, here's the most general form of the dictionary loop:
 | 
| 471 | 
 | 
| 472 |     var mydict = {pea: 42, nut: 10}
 | 
| 473 |     for i, k, v in (mydict) {
 | 
| 474 |       echo "$i - $k - $v"
 | 
| 475 |     }
 | 
| 476 |     # =>
 | 
| 477 |     # 0 - pea - 42
 | 
| 478 |     # 1 - nut - 10
 | 
| 479 | 
 | 
| 480 | There are two simpler forms:
 | 
| 481 | 
 | 
| 482 | - One variable gives you the key: `for k in (mydict)`
 | 
| 483 | - Two variables gives you the key and value: `for k, v in (mydict)`
 | 
| 484 | 
 | 
| 485 | (One way to think of it: `for` loops in YSH have the functionality of Python's
 | 
| 486 | `enumerate()`, `items()`, `keys()`, and `values()`.)
 | 
| 487 | 
 | 
| 488 | <!--
 | 
| 489 | TODO: Str loop should give you the (UTF-8 offset, rune)
 | 
| 490 | Or maybe just UTF-8 offset?  Decoding errors could be exceptions, or Unicode
 | 
| 491 | replacement.
 | 
| 492 | -->
 | 
| 493 | 
 | 
| 494 | ### `while` Loop
 | 
| 495 | 
 | 
| 496 | While loops can use a **command** as the termination condition:
 | 
| 497 | 
 | 
| 498 |     while test --file lock {
 | 
| 499 |       sleep 1
 | 
| 500 |     }
 | 
| 501 | 
 | 
| 502 | Or an **expression**, which is surrounded in `()`:
 | 
| 503 | 
 | 
| 504 |     var i = 3
 | 
| 505 |     while (i < 6) {
 | 
| 506 |       echo "i = $i"
 | 
| 507 |       setvar i += 1
 | 
| 508 |     }
 | 
| 509 |     # =>
 | 
| 510 |     # i = 3
 | 
| 511 |     # i = 4
 | 
| 512 |     # i = 5
 | 
| 513 | 
 | 
| 514 | ### `if elif` Conditional
 | 
| 515 | 
 | 
| 516 | If statements test the exit code of a command, and have optional `elif` and
 | 
| 517 | `else` clauses:
 | 
| 518 | 
 | 
| 519 |     if test --file foo {
 | 
| 520 |       echo 'foo is a file'
 | 
| 521 |       rm --verbose foo     # delete it
 | 
| 522 |     } elif test --dir foo {
 | 
| 523 |       echo 'foo is a directory'
 | 
| 524 |     } else {
 | 
| 525 |       echo 'neither'
 | 
| 526 |     }
 | 
| 527 | 
 | 
| 528 | Invert the exit code with `!`:
 | 
| 529 | 
 | 
| 530 |     if ! grep alice /etc/passwd { 
 | 
| 531 |       echo 'alice is not a user'
 | 
| 532 |     }
 | 
| 533 | 
 | 
| 534 | As with `while` loops, the condition can also be an **expression** wrapped in
 | 
| 535 | `()`:
 | 
| 536 | 
 | 
| 537 |     if (num_beans > 0) {
 | 
| 538 |       echo 'so many beans'
 | 
| 539 |     }
 | 
| 540 | 
 | 
| 541 |     var done = false
 | 
| 542 |     if (not done) {        # negate with 'not' operator (contrast with !)
 | 
| 543 |       echo "we aren't done"
 | 
| 544 |     }
 | 
| 545 | 
 | 
| 546 | ### `case` Conditional
 | 
| 547 | 
 | 
| 548 | The case statement is a series of conditionals and executable blocks.  The
 | 
| 549 | condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
 | 
| 550 | like `/d+/`, or a typed expression like `(42)`:
 | 
| 551 | 
 | 
| 552 |     var s = 'README.md'
 | 
| 553 |     case (s) {
 | 
| 554 |       *.py           { echo 'Python' }
 | 
| 555 |       *.cc | *.h     { echo 'C++' }
 | 
| 556 |       *              { echo 'Other' }
 | 
| 557 |     }
 | 
| 558 |     # => Other
 | 
| 559 | 
 | 
| 560 |     case (s) {
 | 
| 561 |       / dot* '.md' / { echo 'Markdown' }
 | 
| 562 |       (30 + 12)      { echo 'the integer 42' }
 | 
| 563 |       (else)         { echo 'neither' }
 | 
| 564 |     }
 | 
| 565 |     # => Markdown
 | 
| 566 | 
 | 
| 567 | <!-- TODO: document case on typed data -->
 | 
| 568 | 
 | 
| 569 | (Shell style like `if foo; then ... fi` and `case $x in ...  esac` is also legal,
 | 
| 570 | but discouraged in YSH code.)
 | 
| 571 | 
 | 
| 572 | ### Error Handling
 | 
| 573 | 
 | 
| 574 | If statements are also used for **error handling**.  Builtins and external
 | 
| 575 | commands use this style:
 | 
| 576 | 
 | 
| 577 |     if ! test -d /bin {
 | 
| 578 |       echo 'not a directory'
 | 
| 579 |     }
 | 
| 580 | 
 | 
| 581 |     if ! cp foo /tmp {
 | 
| 582 |       echo 'error copying'  # any non-zero status
 | 
| 583 |     }
 | 
| 584 | 
 | 
| 585 | Procs use this style (because of shell's *disabled `errexit` quirk*):
 | 
| 586 | 
 | 
| 587 |     try {
 | 
| 588 |       myproc
 | 
| 589 |     }
 | 
| 590 |     if (_status !== 0) {
 | 
| 591 |       echo 'failed'
 | 
| 592 |     }
 | 
| 593 | 
 | 
| 594 | For a complete list of examples, see [YSH vs. Shell Idioms > Error
 | 
| 595 | Handling](idioms.html#error-handling).  For design goals and a reference, see
 | 
| 596 | [YSH Fixes Shell's Error Handling](error-handling.html).
 | 
| 597 | 
 | 
| 598 | #### `break`, `continue`, `return`, `exit`
 | 
| 599 | 
 | 
| 600 | The `exit` **keyword** exits a process (it's not a shell builtin).  The other 3
 | 
| 601 | control flow keywords behave like they do in Python and JavaScript.
 | 
| 602 | 
 | 
| 603 | ### Ruby-like Blocks 
 | 
| 604 | 
 | 
| 605 | Here's a builtin command that takes a literal block argument:
 | 
| 606 | 
 | 
| 607 |     shopt --unset errexit {  # ignore errors
 | 
| 608 |       cp ale /tmp
 | 
| 609 |       cp bean /bin
 | 
| 610 |     }
 | 
| 611 | 
 | 
| 612 | Blocks are a special kind of typed argument passed to commands like `shopt`.
 | 
| 613 | Their type is `value.Command`.
 | 
| 614 | 
 | 
| 615 | ### Shell-like `proc`
 | 
| 616 | 
 | 
| 617 | You can define units of code with the `proc` keyword.
 | 
| 618 | 
 | 
| 619 |     proc mycopy (src, dest) {
 | 
| 620 |       ### Copy verbosely
 | 
| 621 | 
 | 
| 622 |       mkdir -p $dest
 | 
| 623 |       cp --verbose $src $dest
 | 
| 624 |     }
 | 
| 625 | 
 | 
| 626 | The `###` line is a "doc comment", and can be retrieved with `pp proc`.  Simple
 | 
| 627 | procs like this are invoked like a shell command:
 | 
| 628 | 
 | 
| 629 |     touch log.txt
 | 
| 630 |     mycopy log.txt /tmp   # first word 'mycopy' is a proc
 | 
| 631 | 
 | 
| 632 | Procs have more features, including **four** kinds of arguments:
 | 
| 633 | 
 | 
| 634 | 1. Word args (which are always strings)
 | 
| 635 | 1. Typed, positional args (aka positional args)
 | 
| 636 | 1. Typed, named args (aka named args)
 | 
| 637 | 1. A final block argument, which may be written with `{ }`.
 | 
| 638 | 
 | 
| 639 | At the call site, they can look like any of these forms:
 | 
| 640 | 
 | 
| 641 |     cd /tmp                      # word arg
 | 
| 642 | 
 | 
| 643 |     json write (d)               # word arg, then positional arg
 | 
| 644 | 
 | 
| 645 |     # error 'failed' (status=9)  # word arg, then named arg
 | 
| 646 | 
 | 
| 647 |     cd /tmp { echo $PWD }        # word arg, then block arg
 | 
| 648 | 
 | 
| 649 |     var mycmd = ^(echo hi)       # expression for a value.Command
 | 
| 650 |     eval (mycmd)                 # positional arg 
 | 
| 651 | 
 | 
| 652 | <!-- TODO: lazy arg list: ls8 | where [age > 10] -->
 | 
| 653 | 
 | 
| 654 | At the definition site, the kinds of parameters are separated with `;`, similar
 | 
| 655 | to the Julia language:
 | 
| 656 | 
 | 
| 657 |     proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
 | 
| 658 |       echo "$word1 $word2 $[pos1 + pos2]"
 | 
| 659 |       json write (rest_pos)
 | 
| 660 |     }
 | 
| 661 | 
 | 
| 662 |     proc p3 (w ; ; named1, named2, ...rest_named; block) {
 | 
| 663 |       echo "$w $[named1 + named2]"
 | 
| 664 |       eval (block)
 | 
| 665 |       json write (rest_named)
 | 
| 666 |     }
 | 
| 667 | 
 | 
| 668 |     proc p4 (; ; ; block) {
 | 
| 669 |       eval (block)
 | 
| 670 |     }
 | 
| 671 | 
 | 
| 672 | YSH also has Python-like functions defined with `func`.  These are part of the
 | 
| 673 | expression language, which we'll see later.
 | 
| 674 | 
 | 
| 675 | For more info, see the [Informal Guide to Procs and Funcs](proc-func.html)
 | 
| 676 | (under construction).
 | 
| 677 | 
 | 
| 678 | #### Builtin Commands
 | 
| 679 | 
 | 
| 680 | **Shell builtins** like `cd` and `read` are the "standard library" of the
 | 
| 681 | command language.  Each one takes various flags:
 | 
| 682 | 
 | 
| 683 |     cd -L .                      # follow symlinks
 | 
| 684 | 
 | 
| 685 |     echo foo | read --all        # read all of stdin
 | 
| 686 |     
 | 
| 687 | Here are some categories of builtin:
 | 
| 688 | 
 | 
| 689 | - I/O: `echo  write  read`
 | 
| 690 | - File system: `cd  test`
 | 
| 691 | - Processes: `fork  wait  forkwait  exec`
 | 
| 692 | - Interpreter settings: `shopt  shvar`
 | 
| 693 | - Meta: `command  builtin  runproc  type  eval`
 | 
| 694 | - Modules: `source  module`
 | 
| 695 | 
 | 
| 696 | <!-- TODO: Link to a comprehensive list of builtins -->
 | 
| 697 | 
 | 
| 698 | ## Expression Language: Python-like Types
 | 
| 699 | 
 | 
| 700 | YSH expressions look and behave more like Python or JavaScript than shell.  For
 | 
| 701 | example, we write `if (x < y)` instead of `if [ $x -lt $y ]`.  Expressions are
 | 
| 702 | usually surrounded by `( )`.  
 | 
| 703 | 
 | 
| 704 | At runtime, variables like `x` and `y` are bound to **typed data**, like
 | 
| 705 | integers, floats, strings, lists, and dicts.
 | 
| 706 | 
 | 
| 707 | <!--
 | 
| 708 | [Command vs. Expression Mode](command-vs-expression-mode.html) may help you
 | 
| 709 | understand how YSH is parsed.
 | 
| 710 | -->
 | 
| 711 | 
 | 
| 712 | ### Python-like `func`
 | 
| 713 | 
 | 
| 714 | At the end of the *Command Language*, we saw that procs are shell-like units of
 | 
| 715 | code.  Now let's talk about Python-like **functions** in YSH, which are
 | 
| 716 | different than `procs`:
 | 
| 717 | 
 | 
| 718 | - They're defined with the `func` keyword.
 | 
| 719 | - They're called in expressions, not in commands.
 | 
| 720 | - They're **pure**, and live in the **interior** of a process.
 | 
| 721 |   - In contrast, procs usually perform I/O, and have **exterior** boundaries.
 | 
| 722 | 
 | 
| 723 | Here's a function that mutates its argument:
 | 
| 724 | 
 | 
| 725 |     func popTwice(mylist) {
 | 
| 726 |       call mylist->pop()
 | 
| 727 |       call mylist->pop()
 | 
| 728 |     }
 | 
| 729 | 
 | 
| 730 |     var mylist = [3, 4]
 | 
| 731 | 
 | 
| 732 |     # The call keyword is an "adapter" between commands and expressions,
 | 
| 733 |     # like the = keyword.
 | 
| 734 |     call popTwice(mylist)
 | 
| 735 | 
 | 
| 736 | Here's a pure function:
 | 
| 737 | 
 | 
| 738 |     func myRepeat(s, n; special=false) {  # positional; named params
 | 
| 739 |       var parts = []
 | 
| 740 |       for i in (0 .. n) {
 | 
| 741 |         append $s (parts)
 | 
| 742 |       }
 | 
| 743 |       var result = join(parts)
 | 
| 744 | 
 | 
| 745 |       if (special) {
 | 
| 746 |         return ("$result !!")  # parens required for typed return
 | 
| 747 |       } else {
 | 
| 748 |         return (result)
 | 
| 749 |       }
 | 
| 750 |     }
 | 
| 751 | 
 | 
| 752 |     echo $[myRepeat('z', 3)]  # => zzz
 | 
| 753 | 
 | 
| 754 |     echo $[myRepeat('z', 3, special=true)]  # => zzz !!
 | 
| 755 | 
 | 
| 756 | Funcs are named using `camelCase`, while procs use `kebab-case`.  See the
 | 
| 757 | [Style Guide](style-guide.html) for more conventions.
 | 
| 758 | 
 | 
| 759 | #### Builtin Functions
 | 
| 760 | 
 | 
| 761 | In addition to builtin commands, YSH has Python-like builtin **functions**.
 | 
| 762 | These are like the "standard library" for the expression language.  Examples:
 | 
| 763 | 
 | 
| 764 | - Functions that take multiple types: `len()  type()`
 | 
| 765 | - Conversions: `bool()   int()   float()   str()  list()   ...`
 | 
| 766 | - Explicit word evaluation: `split()  join()  glob()  maybe()`  
 | 
| 767 | 
 | 
| 768 | <!-- TODO: Make a comprehensive list of func builtins. -->
 | 
| 769 | 
 | 
| 770 | 
 | 
| 771 | ### Data Types: `Int`, `Str`, `List`, `Dict`, ...
 | 
| 772 | 
 | 
| 773 | YSH has data types, each with an expression syntax and associated methods.
 | 
| 774 | 
 | 
| 775 | ### Methods
 | 
| 776 | 
 | 
| 777 | Mutating methods are looked up with a thin arrow `->`:
 | 
| 778 | 
 | 
| 779 |     var foods = ['ale', 'bean']
 | 
| 780 |     var last = foods->pop()  # bean
 | 
| 781 |     write @foods  # => ale
 | 
| 782 | 
 | 
| 783 | You can ignore the return value with the `call` keyword:
 | 
| 784 | 
 | 
| 785 |     call foods->pop()
 | 
| 786 | 
 | 
| 787 | Transforming methods use a fat arrow `=>`:
 | 
| 788 | 
 | 
| 789 |     var line = ' ale bean '
 | 
| 790 |     var trimmed = line => trim() => upper()  # 'ALE BEAN'
 | 
| 791 | 
 | 
| 792 | If the `=>` operator doesn't find a method with the given name in the object's
 | 
| 793 | type, it looks for free functions:
 | 
| 794 | 
 | 
| 795 |     # list() is a free function taking one arg
 | 
| 796 |     # join() is a free function taking two args
 | 
| 797 |     var x = {k1: 42, k2: 43} => list() => join('/')  # 'k1/k2'
 | 
| 798 | 
 | 
| 799 | This allows a left-to-right "method chaining" style.
 | 
| 800 | 
 | 
| 801 | ---
 | 
| 802 | 
 | 
| 803 | Now let's go through the data types in YSH.  We'll show the syntax for
 | 
| 804 | literals, and what **methods** they have.
 | 
| 805 | 
 | 
| 806 | #### Null and Bool
 | 
| 807 | 
 | 
| 808 | YSH uses JavaScript-like spellings for these three "atoms":
 | 
| 809 | 
 | 
| 810 |     var x = null
 | 
| 811 | 
 | 
| 812 |     var b1, b2 = true, false
 | 
| 813 | 
 | 
| 814 |     if (b1) {
 | 
| 815 |       echo 'yes'
 | 
| 816 |     }  # => yes
 | 
| 817 | 
 | 
| 818 | 
 | 
| 819 | #### Int
 | 
| 820 | 
 | 
| 821 | There are many ways to write integers:
 | 
| 822 | 
 | 
| 823 |     var small, big = 42, 65_536
 | 
| 824 |     echo "$small $big"                  # => 42 65536
 | 
| 825 | 
 | 
| 826 |     var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
 | 
| 827 |     echo "$hex $octal $binary"           # => 65536 493 21
 | 
| 828 | 
 | 
| 829 | <!--
 | 
| 830 | "Runes" are integers that represent Unicode code points.  They're not common in
 | 
| 831 | YSH code, but can make certain string algorithms more readable.
 | 
| 832 | 
 | 
| 833 |     # Pound rune literals are similar to ord('A')
 | 
| 834 |     const a = #'A'
 | 
| 835 | 
 | 
| 836 |     # Backslash rune literals can appear outside of quotes
 | 
| 837 |     const newline = \n  # Remember this is an integer
 | 
| 838 |     const backslash = \\  # ditto
 | 
| 839 | 
 | 
| 840 |     # Unicode rune literal is syntactic sugar for 0x3bc
 | 
| 841 |     const mu = \u{3bc}
 | 
| 842 | 
 | 
| 843 |     echo "chars $a $newline $backslash $mu"  # => chars 65 10 92 956
 | 
| 844 | -->
 | 
| 845 | 
 | 
| 846 | #### Float
 | 
| 847 | 
 | 
| 848 | Floats are written like you'd expect:
 | 
| 849 | 
 | 
| 850 |     var small = 1.5e-10
 | 
| 851 |     var big = 3.14
 | 
| 852 | 
 | 
| 853 | #### Str
 | 
| 854 | 
 | 
| 855 | See the section above called *Three Kinds of String Literals*.  It described
 | 
| 856 | `'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
 | 
| 857 | as their multiline variants.
 | 
| 858 | 
 | 
| 859 | Strings are UTF-8 encoded in memory, like strings in the [Go
 | 
| 860 | language](https://golang.org).  There isn't a separate string and unicode type,
 | 
| 861 | as there was in Python 2.
 | 
| 862 | 
 | 
| 863 | Strings are **immutable**, as in Python and JavaScript.  This means they only
 | 
| 864 | have **transforming** methods:
 | 
| 865 | 
 | 
| 866 |     var x = s => trim()
 | 
| 867 | 
 | 
| 868 | Other methods:
 | 
| 869 | 
 | 
| 870 | - `trimLeft()   trimRight()`
 | 
| 871 | - `trimPrefix()   trimSuffix()`
 | 
| 872 | - `upper()   lower()` (not implemented)
 | 
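
As a minimal sketch of how the transforming methods compose, the snippet below uses only `trim()`, `trimPrefix()`, and `trimSuffix()` from the lists above.  The variable names and the path are made up for illustration.  Each call returns a new string and leaves its receiver untouched:

```ysh
var path = '  /tmp/ale.txt  '

var clean = path => trim()                # strip surrounding whitespace
var name = clean => trimPrefix('/tmp/')   # drop a literal prefix, if present
var stem = name => trimSuffix('.txt')     # drop a literal suffix, if present

echo "[$clean] [$name] [$stem]"   # => [/tmp/ale.txt] [ale.txt] [ale]
echo "[$path]"                    # => [  /tmp/ale.txt  ]
```

The last line shows that `path` still has its original whitespace, which is what immutability means in practice.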
| 873 | 
 | 
| 874 | <!--
 | 
| 875 | The syntax `:symbol` could be an interned string.
 | 
| 876 | -->
 | 
| 877 | 
 | 
| 878 | #### List (and Arrays)
 | 
| 879 | 
 | 
| 880 | All lists can be expressed with Python-like literals:
 | 
| 881 | 
 | 
| 882 |     var foods = ['ale', 'bean', 'corn']
 | 
| 883 |     var recursive = [1, [2, 3]]
 | 
| 884 | 
 | 
| 885 | As a special case, lists of strings are called **arrays**.  It's often more
 | 
| 886 | convenient to write them with shell-like literals:
 | 
| 887 | 
 | 
| 888 |     # No quotes or commas
 | 
| 889 |     var foods = :| ale bean corn |
 | 
| 890 | 
 | 
| 891 |     # You can use the word language here
 | 
| 892 |     var other = :| foo $s *.py {alice,bob}@example.com |
 | 
| 893 | 
 | 
| 894 | Lists are **mutable**, as in Python and JavaScript.  So they mainly have
 | 
| 895 | mutating methods:
 | 
| 896 | 
 | 
| 897 |     call foods->reverse()
 | 
| 898 |     write -- @foods
 | 
| 899 |     # =>
 | 
| 900 |     # corn
 | 
| 901 |     # bean
 | 
| 902 |     # ale
 | 
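
Here's a short sketch of a few more mutating methods.  It assumes `append()`, `extend()`, and `pop()` exist on `List`, which this section doesn't list explicitly, so check the reference before relying on them.  The list name `nuts` is made up:

```ysh
var nuts = :| peanut coconut |

call nuts->append('walnut')              # add one item at the end
call nuts->extend(['pecan', 'almond'])   # add every item of another list
var last = nuts->pop()                   # remove and return the last item

echo "$last $[len(nuts)]"   # => almond 4

write -- @[nuts[1:3]]       # a slice makes a new list
# =>
# coconut
# walnut
```

Note the `call` keyword: it evaluates an expression and throws away the result, which is what you want when a method is used only for its side effect.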
| 903 | 
 | 
| 904 | #### Dict
 | 
| 905 | 
 | 
| 906 | Dicts use syntax that's more like JavaScript than Python.  Here's a dict
 | 
| 907 | literal:
 | 
| 908 | 
 | 
| 909 |     var d = {
 | 
| 910 |       name: 'bob',  # unquoted keys are allowed
 | 
| 911 |       age: 42,
 | 
| 912 |       'key with spaces': 'val'
 | 
| 913 |     }
 | 
| 914 | 
 | 
| 915 | There are two syntaxes for key lookup.  If the key doesn't exist, it's a fatal
 | 
| 916 | error.
 | 
| 917 | 
 | 
| 918 |     var v1 = d['name']
 | 
| 919 |     var v2 = d.name                # shorthand for the above
 | 
| 920 |     var v3 = d['key with spaces']  # no shorthand for this
 | 
| 921 | 
 | 
| 922 | Key names can be computed with expressions in `[]`:
 | 
| 923 | 
 | 
| 924 |     var key = 'alice'
 | 
| 925 |     var d2 = {[key ++ '_z']: 'ZZZ'}  # Computed key name
 | 
| 926 |     echo $[d2.alice_z]   # => ZZZ    # Reminder: expression sub
 | 
| 927 | 
 | 
| 928 | Omitting the value causes it to be taken from a variable of the same name:
 | 
| 929 | 
 | 
| 930 |     var d3 = {key}             # value is taken from the environment
 | 
| 931 |     echo "name is $[d3.key]"   # => name is alice
 | 
| 932 | 
 | 
| 933 | More:
 | 
| 934 | 
 | 
| 935 |     var empty = {}
 | 
| 936 |     echo $[len(empty)]  # => 0
 | 
| 937 | 
 | 
| 938 | Dicts are **mutable**, as in Python and JavaScript.  But the `keys()` and `values()`
 | 
| 939 | methods return new `List` objects:
 | 
| 940 | 
 | 
| 941 |     var keys = d2 => keys()    # => alice_z
 | 
| 942 |     # var vals = d3 => values()  # => alice
 | 
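
Since a missing key is fatal, it helps to test for membership before a lookup.  The sketch below assumes the Python-like `in` operator works on dict keys, and that `setvar` accepts both lookup syntaxes on its left-hand side.  The `person` dict is made up for illustration:

```ysh
var person = {name: 'alice', age: 42}

# Test membership first, because person.email would be a fatal error here
if ('email' in person) {
  echo $[person.email]
} else {
  echo 'no email yet'
}  # => no email yet

# setvar mutates the dict in place
setvar person.email = 'alice@example.com'
setvar person['age'] = 43

echo "$[person.email] $[person.age]"   # => alice@example.com 43
echo $[len(person)]                    # => 3
```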
| 943 | 
 | 
| 944 | #### `Place` type / "out params"
 | 
| 945 | 
 | 
| 946 | The `read` builtin can either set an implicit variable `_reply`:
 | 
| 947 | 
 | 
| 948 |     whoami | read --all  # sets _reply
 | 
| 949 | 
 | 
| 950 | Or you can pass it a `value.Place`, created with `&`:
 | 
| 951 | 
 | 
| 952 |     var x                      # implicitly initialized to null
 | 
| 953 |     whoami | read --all (&x)   # mutate this "place"
 | 
| 954 |     echo who=$x  # => who=andy
 | 
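
Your own procs can accept places too.  This is a sketch under two assumptions that aren't shown in this section: typed parameters go after a `;` in the proc signature, and a `Place` has a `setValue()` method.  The proc name `greet` is hypothetical:

```ysh
proc greet (name; out) {
  ### Hypothetical proc: store a greeting in the caller's variable
  call out->setValue("hello $name")
}

var greeting                 # implicitly initialized to null
greet alice (&greeting)      # pass a place, as with read --all (&x)
echo $greeting               # => hello alice
```

This gives procs a way to "return" a value without printing it to stdout and capturing it with a command sub.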
| 955 | 
 | 
| 956 | #### Quotation Types: value.Command (Block) and value.Expr
 | 
| 957 | 
 | 
| 958 | These types are for reflection on YSH code.  Most YSH programs won't use them
 | 
| 959 | directly.
 | 
| 960 | 
 | 
| 961 | - `Command`: an unevaluated code block.
 | 
| 962 |   - rarely-used literal: `^(ls | wc -l)`
 | 
| 963 | - `Expr`: an unevaluated expression.
 | 
| 964 |   - rarely-used literal: `^[42 + a[i]]`
 | 
| 965 | 
 | 
| 966 | <!-- TODO: implement Block, Expr, ArgList types (variants of value) -->
 | 
| 967 | 
 | 
| 968 | ### Operators
 | 
| 969 | 
 | 
| 970 | Operators are generally the same as in Python:
 | 
| 971 | 
 | 
| 972 |     if (10 <= num_beans and num_beans < 20) {
 | 
| 973 |       echo 'enough'
 | 
| 974 |     }  # => enough
 | 
| 975 | 
 | 
| 976 | YSH has a few operators that aren't in Python.  Equality can be approximate or
 | 
| 977 | exact:
 | 
| 978 | 
 | 
| 979 |     var n = ' 42 '
 | 
| 980 |     if (n ~== 42) {
 | 
| 981 |       echo 'equal after stripping whitespace and type conversion'
 | 
| 982 |     }  # => equal after stripping whitespace and type conversion
 | 
| 983 | 
 | 
| 984 |     if (n === 42) {
 | 
| 985 |       echo "not reached because strings and ints aren't equal"
 | 
| 986 |     }
 | 
| 987 | 
 | 
| 988 | <!-- TODO: is n === 42 a type error? -->
 | 
| 989 | 
 | 
| 990 | Pattern matching can be done with globs (`~~` and `!~~`)
 | 
| 991 | 
 | 
| 992 |     const filename = 'foo.py'
 | 
| 993 |     if (filename ~~ '*.py') {
 | 
| 994 |       echo 'Python'
 | 
| 995 |     }  # => Python
 | 
| 996 | 
 | 
| 997 |     if (filename !~~ '*.sh') {
 | 
| 998 |       echo 'not shell'
 | 
| 999 |     }  # => not shell
 | 
| 1000 | 
 | 
| 1001 | or regular expressions (`~` and `!~`).  See the Eggex section below for an
 | 
| 1002 | example of the latter.
 | 
| 1003 | 
 | 
| 1004 | Concatenation is `++` rather than `+` because it avoids confusion in the
 | 
| 1005 | presence of type conversion:
 | 
| 1006 | 
 | 
| 1007 |     var n = 42 + 1    # + is always arithmetic, never concatenation
 | 
| 1008 |     echo $n           # => 43
 | 
| 1009 | 
 | 
| 1010 |     var y = 'ale ' ++ "bean $n"  # concatenation
 | 
| 1011 |     echo $y  # => ale bean 43
 | 
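
A few more operators, as a hedged sketch.  It assumes the Python-like spellings `//`, `%`, `**`, and the `x if cond else y` ternary, and that `++` also works on lists.  The variable names are made up:

```ysh
var num_beans = 7

echo $[num_beans // 2]     # => 3     integer division
echo $[num_beans % 2]      # => 1     remainder
echo $[2 ** 10]            # => 1024  exponentiation
echo $[num_beans / 2]      # / yields a Float, even for two Ints

var size = 'big' if num_beans > 5 else 'small'   # Python-style ternary
echo $size                 # => big

var both = ['ale'] ++ ['bean']   # ++ concatenates lists as well as strings
echo $[len(both)]          # => 2
```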
| 1012 | 
 | 
| 1013 | <!--
 | 
| 1014 | TODO: change example above
 | 
| 1015 |     var n = '42' + 1    # string plus int does implicit conversion
 | 
| 1016 | -->
 | 
| 1017 | 
 | 
| 1018 | <!--
 | 
| 1019 | 
 | 
| 1020 | #### Summary of Operators
 | 
| 1021 | 
 | 
| 1022 | - Arithmetic: `+ - * / // %` and `**` for exponentiation
 | 
| 1023 |   - `/` always yields a float, and `//` is integer division
 | 
| 1024 | - Bitwise: `& | ^ ~`
 | 
| 1025 | - Logical: `and or not`
 | 
| 1026 | - Comparison: `==  <  >  <=  >=  in  'not in'` 
 | 
| 1027 |   - Approximate equality: `~==`
 | 
| 1028 |   - Eggex and glob match: `~  !~  ~~  !~~`
 | 
| 1029 | - Ternary: `1 if x else 0`
 | 
| 1030 | - Index and slice: `mylist[3]` and `mylist[1:3]`
 | 
| 1031 |   - `mydict->key` is a shortcut for `mydict['key']`
 | 
| 1032 | - Function calls
 | 
| 1033 |   - free: `f(x, y)`
 | 
| 1034 |   - transformations and chaining: `s => startsWith('prefix')`
 | 
| 1035 |   - mutating methods: `mylist->pop()`
 | 
| 1036 | - String and List: `++` for concatenation
 | 
| 1037 |   - This is a separate operator because the addition operator `+` does
 | 
| 1038 |     string-to-int conversion
 | 
| 1039 | 
 | 
| 1040 | TODO: What about list comprehensions?
 | 
| 1041 | -->
 | 
| 1042 | 
 | 
| 1043 | ### Egg Expressions (YSH Regexes)
 | 
| 1044 | 
 | 
| 1045 | An *Eggex* is a type of YSH expression that denotes a regular expression.  Eggexes
 | 
| 1046 | translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
 | 
| 1047 | --regexp-extended` (GNU only).
 | 
| 1048 | 
 | 
| 1049 | They're designed to be readable and composable.  Example:
 | 
| 1050 | 
 | 
| 1051 |     var D = / digit{1,3} /
 | 
| 1052 |     var ip_pattern = / D '.' D '.' D '.' D /
 | 
| 1053 | 
 | 
| 1054 |     var z = '192.168.0.1'
 | 
| 1055 |     if (z ~ ip_pattern) {           # Use the ~ operator to match
 | 
| 1056 |       echo "$z looks like an IP address"
 | 
| 1057 |     }  # => 192.168.0.1 looks like an IP address
 | 
| 1058 | 
 | 
| 1059 |     if (z !~ / '.255' %end /) {
 | 
| 1060 |       echo "doesn't end with .255"
 | 
| 1061 |     }  # => doesn't end with .255
 | 
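
The `~` operator finds a match anywhere in the string.  To match the whole string, anchor the pattern at both ends.  This sketch assumes `%start` is the counterpart of the `%end` anchor used above, and that an eggex substituted into a word turns into its ERE translation.  The name `strict_ip` is made up:

```ysh
var D = / digit{1,3} /

# %start and %end anchor the pattern to the whole string
var strict_ip = / %start D '.' D '.' D '.' D %end /

if ('192.168.0.1' ~ strict_ip) {
  echo 'whole string is an IP address'
}  # => whole string is an IP address

if ('x192.168.0.1' !~ strict_ip) {
  echo 'rejected: leading junk'
}  # => rejected: leading junk

echo $strict_ip   # prints the translated POSIX ERE
```

The last line is what makes eggexes usable with external tools: the printed string is what you'd pass to `egrep`.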
| 1062 | 
 | 
| 1063 | See the [Egg Expressions doc](eggex.html) for details.
 | 
| 1064 | 
 | 
| 1065 | ## Interlude
 | 
| 1066 | 
 | 
| 1067 | Let's review what we've seen before moving on to other YSH features.
 | 
| 1068 | 
 | 
| 1069 | ### Three Interleaved Languages
 | 
| 1070 | 
 | 
| 1071 | Here are the languages we saw in the last 3 sections:
 | 
| 1072 | 
 | 
| 1073 | 1. **Words** evaluate to a string, or list of strings.  This includes:
 | 
| 1074 |    - literals like `'mystr'`
 | 
| 1075 |    - substitutions like `${x}` and `$(hostname)`
 | 
| 1076 |    - globs like `*.sh`
 | 
| 1077 | 2. **Commands** are used for
 | 
| 1078 |    - I/O: pipelines, builtins like `read`
 | 
| 1079 |    - control flow: `if`, `for`
 | 
| 1080 |    - abstraction: `proc`
 | 
| 1081 | 3. **Expressions** on typed data are borrowed from Python, with some JavaScript
 | 
| 1082 |    influence.
 | 
| 1083 |    - Lists: `['ale', 'bean']` or `:| ale bean |`
 | 
| 1084 |    - Dicts: `{name: 'bob', age: 42}`
 | 
| 1085 |    - Functions: `split('ale bean')` and `join(['pea', 'nut'])`
 | 
| 1086 | 
 | 
| 1087 | ### How Do They Work Together?
 | 
| 1088 | 
 | 
| 1089 | Here are two examples:
 | 
| 1090 | 
 | 
| 1091 | (1) In this *command*, there are **four** *words*.  The fourth word is an
 | 
| 1092 | *expression sub* `$[]`.
 | 
| 1093 | 
 | 
| 1094 |     write hello $name $[d['age'] + 1]
 | 
| 1095 |     # =>
 | 
| 1096 |     # hello
 | 
| 1097 |     # world
 | 
| 1098 |     # 43
 | 
| 1099 | 
 | 
| 1100 | (2) In this assignment, the *expression* on the right hand side of `=`
 | 
| 1101 | concatenates two strings.  The first string is a literal, and the second is a
 | 
| 1102 | *command sub*.
 | 
| 1103 | 
 | 
| 1104 |     var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
 | 
| 1105 |     write $food  # => ale BEAN
 | 
| 1106 | 
 | 
| 1107 | So words, commands, and expressions are **mutually recursive**.  If you're a
 | 
| 1108 | conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
 | 
| 1109 | help you understand this on a deeper level.
 | 
| 1110 | 
 | 
| 1111 | <!--
 | 
| 1112 | One way to think about these sublanguages is to note that the `|` character
 | 
| 1113 | means something different in each context:
 | 
| 1114 | 
 | 
| 1115 | - In the command language, it's the pipeline operator, as in `ls | wc -l`
 | 
| 1116 | - In the word language, it's only valid in a literal string like `'|'`, `"|"`,
 | 
| 1117 |   or `\|`.  (It's also used in `${x|html}`, which formats a string.)
 | 
| 1118 | - In the expression language, it's the bitwise OR operator, as in Python and
 | 
| 1119 |   JavaScript.
 | 
| 1120 | -->
 | 
| 1121 | 
 | 
| 1122 | ## Languages for Data (Interchange Formats)
 | 
| 1123 | 
 | 
| 1124 | In addition to languages for **code**, YSH also deals with languages for
 | 
| 1125 | **data**.  [JSON]($xref) is a prominent example of the latter.
 | 
| 1126 | 
 | 
| 1127 | <!-- TODO: Link to slogans, fallacies, and concepts -->
 | 
| 1128 | 
 | 
| 1129 | ### UTF-8
 | 
| 1130 | 
 | 
| 1131 | UTF-8 is the foundation of our textual data languages.
 | 
| 1132 | 
 | 
| 1133 | <!-- TODO: there's a runes() iterator which gives integer offsets, usable for
 | 
| 1134 | slicing -->
 | 
| 1135 | 
 | 
| 1136 | <!-- TODO: write about J8 notation -->
 | 
| 1137 | 
 | 
| 1138 | ### Lines of Text (traditional), and JSON/J8 Strings
 | 
| 1139 | 
 | 
| 1140 | Traditional Unix tools like `grep` and `awk` operate on streams of lines.  YSH
 | 
| 1141 | supports this style, just like any other shell.
 | 
| 1142 | 
 | 
| 1143 | But YSH also has [J8 Notation][], a data format based on [JSON][].
 | 
| 1144 | 
 | 
| 1145 | [J8 Notation]: j8-notation.html
 | 
| 1146 | 
 | 
| 1147 | It lets you encode arbitrary byte strings into a single (readable) line,
 | 
| 1148 | including those with newlines and terminal escape sequences.
 | 
| 1149 | 
 | 
| 1150 | Example:
 | 
| 1151 | 
 | 
| 1152 |     # A line with a tab char in the middle
 | 
| 1153 |     var mystr = u'pea\t' ++ u'42\n'
 | 
| 1154 | 
 | 
| 1155 |     # Print it as JSON
 | 
| 1156 |     write $[toJson(mystr)]  # => "pea\t42\n"
 | 
| 1157 | 
 | 
| 1158 |     # JSON8 is the same, but it's not lossy for binary data
 | 
| 1159 |     write $[toJson8(mystr)]  # => "pea\t42\n"
 | 
| 1160 | 
 | 
| 1161 | ### Structured: JSON8, TSV8
 | 
| 1162 | 
 | 
| 1163 | You can write and read **tree-shaped** data as [JSON][]:
 | 
| 1164 | 
 | 
| 1165 |     var d = {key: 'value'}
 | 
| 1166 |     json write (d)                # dump variable d as JSON
 | 
| 1167 |     # =>
 | 
| 1168 |     # {
 | 
| 1169 |     #   "key": "value"
 | 
| 1170 |     # }
 | 
| 1171 | 
 | 
| 1172 |     echo '["ale", 42]' > example.json
 | 
| 1173 | 
 | 
| 1174 |     json read (&d2) < example.json  # parse JSON into var d2
 | 
| 1175 |     pp cell d2                      # inspect the in-memory value
 | 
| 1176 |     # =>
 | 
| 1177 |     # ['ale', 42]
 | 
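
When the JSON text is already in a variable, you may not need files or pipes at all.  The sketch below uses `toJson()` from the previous section and assumes a matching `fromJson()` function for the other direction:

```ysh
var d = {name: 'bob', age: 42}

var s = toJson(d)        # Dict -> JSON text, held in a Str
var d2 = fromJson(s)     # JSON text -> a new Dict

echo $[d2.name]          # => bob
echo $[d2.age + 1]       # => 43  (the Int survived the round trip)
```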
| 1178 | 
 | 
| 1179 | [JSON][] will lose information when strings have binary data, but the slight
 | 
| 1180 | [JSON8]($xref) upgrade won't:
 | 
| 1181 | 
 | 
| 1182 |     var b = {binary: $'\xff'}
 | 
| 1183 |     json8 write (b)
 | 
| 1184 |     # =>
 | 
| 1185 |     # {
 | 
| 1186 |     #   "binary": b'\yff'
 | 
| 1187 |     # }
 | 
| 1188 | 
 | 
| 1189 | [JSON]: $xref
 | 
| 1190 | 
 | 
| 1191 | <!--
 | 
| 1192 | TODO:
 | 
| 1193 | - Fix pp cell output
 | 
| 1194 | - Use json write (d) syntax
 | 
| 1195 | -->
 | 
| 1196 | 
 | 
| 1197 | **Table-shaped** data can be read and written as [TSV8]($xref).  (TODO: not yet
 | 
| 1198 | implemented.)
 | 
| 1199 | 
 | 
| 1200 | <!-- Figure out the API.  Does it work like JSON?
 | 
| 1201 | 
 | 
| 1202 | Or I think we just implement
 | 
| 1203 | - rows: 'where' or 'filter' (dplyr)
 | 
| 1204 | - cols: 'select' conflicts with shell builtin; call it 'cols'?
 | 
| 1205 | - sort: 'sort-by' or 'arrange' (dplyr)
 | 
| 1206 | - TSV8 <=> sqlite conversion.  Are these drivers or what?
 | 
| 1207 |   - and then let you pipe output?
 | 
| 1208 | 
 | 
| 1209 | Do we also need TSV8 space2tab or something?  For writing TSV8 inline.
 | 
| 1210 | 
 | 
| 1211 | More later:
 | 
| 1212 | - MessagePack (e.g. for shared library extension modules)
 | 
| 1213 |   - msgpack read, write?  I think user-defined function could be like this?
 | 
| 1214 | - SASH: Simple and Strict HTML?  For easy processing
 | 
| 1215 | -->
 | 
| 1216 | 
 | 
| 1217 | ## The Runtime Shared by OSH and YSH
 | 
| 1218 | 
 | 
| 1219 | Although we describe OSH and YSH as different languages, they use the **same**
 | 
| 1220 | interpreter under the hood.  This interpreter has various `shopt` flags that
 | 
| 1221 | are flipped for different behavior, e.g. with `shopt --set ysh:all`.
 | 
| 1222 | 
 | 
| 1223 | Understanding this interpreter and its interface to the Unix kernel will help
 | 
| 1224 | you understand **both** languages!
 | 
| 1225 | 
 | 
| 1226 | ### Interpreter Data Model
 | 
| 1227 | 
 | 
| 1228 | The [Interpreter State](interpreter-state.html) doc is **under construction**.
 | 
| 1229 | It will cover:
 | 
| 1230 | 
 | 
| 1231 | - Two separate namespaces (like Lisp 1 vs. 2):
 | 
| 1232 |   - **proc** namespace for procs as the first word
 | 
| 1233 |   - **variable** namespace
 | 
| 1234 | - The variable namespace has a **call stack**, for the local variables of a
 | 
| 1235 |   proc.
 | 
| 1236 |   - Each **stack frame** is a `{name -> cell}` mapping.
 | 
| 1237 |   - A **cell** has one of the above data types: `Bool`, `Int`, `Str`, etc.
 | 
| 1238 |   - A cell has `readonly`, `export`, and `nameref` **flags**.
 | 
| 1239 | - Boolean shell options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
 | 
| 1240 | - String shell options with `shvar`: `IFS`, `PATH`
 | 
| 1241 | - **Registers** that are silently modified by the interpreter
 | 
| 1242 |   - `$?` and `_status`
 | 
| 1243 |   - `$!` for the last PID
 | 
| 1244 |   - `_this_dir`
 | 
| 1245 |   - `_reply`
 | 
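
Two of these pieces of state can be observed directly.  This is a sketch, assuming that `try` records the exit code in the `_status` register, and that `shopt` accepts a block to limit the scope of an option change:

```ysh
# Register: 'try' records the exit code instead of aborting the script
try { false }
echo "status=$_status"       # => status=1

# Boolean option: turn errexit off, but only inside this block
shopt --unset errexit {
  false
  echo 'still running'       # => still running
}
```

Outside the block, `errexit` is back on, so a failing command would stop the script again.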
| 1246 | 
 | 
| 1247 | ### Process Model (the kernel)
 | 
| 1248 | 
 | 
| 1249 | The [Process Model](process-model.html) doc is **under construction**.  It will cover:
 | 
| 1250 | 
 | 
| 1251 | - Simple Commands, `exec` 
 | 
| 1252 | - Pipelines.  #[shell-the-good-parts](#blog-tag)
 | 
| 1253 | - `fork`, `forkwait`
 | 
| 1254 | - Command and process substitution.
 | 
| 1255 | - Related links:
 | 
| 1256 |   - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
 | 
| 1257 |     process-based concurrency into **synchronous** and **async** constructs.
 | 
| 1258 |   - [Three Comics For Understanding Unix
 | 
| 1259 |     Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)
 | 
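
Until that doc is written, here's a rough sketch of the constructs in the list above.  Each one starts at least one child process.  The comments describe the expected behavior under the usual Unix process model, not a guarantee about the implementation:

```ysh
# Pipeline: processes connected by a pipe
write bean ale | sort
# =>
# ale
# bean

# Command sub: run a child process and capture its stdout
var shout = $(echo bean | tr a-z A-Z)
echo "shout=$shout"          # => shout=BEAN

# fork starts a process in the background, like a trailing & in shell
fork { echo 'child' }
wait                         # block until it exits

# forkwait runs a block in a child process, like a ( subshell )
forkwait {
  cd /
  pwd                        # => /
}
pwd   # unchanged, because the cd happened in the child
```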
| 1260 | 
 | 
| 1261 | 
 | 
| 1262 | <!--
 | 
| 1263 | Process model additions: Capers, Headless shell 
 | 
| 1264 | 
 | 
| 1265 | some optimizations: See YSH starts fewer processes than other shells.
 | 
| 1266 | -->
 | 
| 1267 | 
 | 
| 1268 | ## Summary
 | 
| 1269 | 
 | 
| 1270 | YSH is a large language that evolved from Unix shell.  It has shell-like
 | 
| 1271 | commands, Python-like expressions on typed data, and Ruby-like command blocks.
 | 
| 1272 | 
 | 
| 1273 | Even though it's large, you can "forget" the bad parts of shell like `[ $x -lt
 | 
| 1274 | $y ]`.
 | 
| 1275 | 
 | 
| 1276 | These concepts are central to YSH:
 | 
| 1277 | 
 | 
| 1278 | 1. Interleaved *word*, *command*, and *expression* languages.
 | 
| 1279 | 2. A standard library of *shell builtins*, as well as *builtin functions*
 | 
| 1280 | 3. Languages for *data*: J8 Notation, including JSON8 and TSV8
 | 
| 1281 | 4. A *runtime* shared by OSH and YSH
 | 
| 1282 | 
 | 
| 1283 | ## Related Docs
 | 
| 1284 | 
 | 
| 1285 | - [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
 | 
| 1286 | - [YSH Language Influences](language-influences.html) - In addition to shell,
 | 
| 1287 |   Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
 | 
| 1288 | - [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
 | 
| 1289 |   you remember the syntax.
 | 
| 1290 | - [YSH Language Warts](warts.html) documents syntax that may be surprising.
 | 
| 1291 | 
 | 
| 1292 | ## Appendix: Features Not Shown
 | 
| 1293 | 
 | 
| 1294 | ### Advanced
 | 
| 1295 | 
 | 
| 1296 | These shell features are part of YSH, but aren't shown for brevity.
 | 
| 1297 | 
 | 
| 1298 | - The `fork` and `forkwait` builtins, for concurrent execution and subshells.
 | 
| 1299 | - Process Substitution: `diff <(sort left.txt) <(sort right.txt)`
 | 
| 1300 | 
 | 
| 1301 | ### Deprecated Shell Constructs
 | 
| 1302 | 
 | 
| 1303 | The shared interpreter supports many shell constructs that are deprecated:
 | 
| 1304 | 
 | 
| 1305 | - YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
 | 
| 1306 |   is on by default.
 | 
| 1307 | - Assignment builtins like `local` and `declare`.  Use YSH keywords.
 | 
| 1308 | - Boolean expressions like `[[ x =~ $pat ]]`.  Use YSH expressions.
 | 
| 1309 | - Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`.  Use YSH expressions.
 | 
| 1310 | - The `until` loop can always be replaced with a `while` loop.
 | 
| 1311 | - Most of what's in `${}` can be written in other ways.  For example
 | 
| 1312 |   `${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).
 | 
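
To make the replacements concrete, here's a small side-by-side sketch.  The shell spellings appear only in comments, and the variable names are made up:

```ysh
var x = 41
var s = '/tmp/ale'

# Instead of: local y=$(( x + 1 ))
var y = x + 1

# Instead of: [ $x -lt $y ]
if (x < y) {
  echo 'less'
}  # => less

# Instead of: [[ $s =~ ^/tmp ]]
if (s ~ / %start '/tmp' /) {
  echo 'under /tmp'
}  # => under /tmp

# Instead of: until (( i >= 3 )); do ...; done
var i = 0
while (i < 3) {
  setvar i += 1
}
echo $i  # => 3
```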
| 1313 | 
 | 
| 1314 | ### Not Yet Implemented
 | 
| 1315 | 
 | 
| 1316 | This document mentions a few constructs that aren't yet implemented.  Here's a
 | 
| 1317 | summary:
 | 
| 1318 | 
 | 
| 1319 | ```none
 | 
| 1320 | # Unimplemented syntax:
 | 
| 1321 | 
 | 
| 1322 | echo ${x|html}               # formatters
 | 
| 1323 | 
 | 
| 1324 | echo ${x %.2f}               # statically-parsed printf
 | 
| 1325 | 
 | 
| 1326 | var x = j"line\n"
 | 
| 1327 | echo j"line\n"               # JSON-style string literal
 | 
| 1328 | 
 | 
| 1329 | var x = "<p>$x</p>"html      
 | 
| 1330 | echo "<p>$x</p>"html         # tagged string
 | 
| 1331 | 
 | 
| 1332 | var x = 15 Mi                # units suffix
 | 
| 1333 | ```
 | 
| 1334 | 
 | 
| 1335 | Important builtins that aren't implemented:
 | 
| 1336 | 
 | 
| 1337 | - `describe` for testing
 | 
| 1338 | - `parseArgs()` to parse flags
 | 
| 1339 | - Builtins for [TSV8]($xref) - selection, projection, sorting
 | 
| 1340 | 
 | 
| 1341 | <!--
 | 
| 1342 | 
 | 
| 1343 | - To document: Method calls
 | 
| 1344 | - To implement: Capers: stateless coprocesses
 | 
| 1345 | -->
 | 
| 1346 | 
 | 
| 1347 | ## Appendix: Example of a YSH Module
 | 
| 1348 | 
 | 
| 1349 | YSH can be used to write simple "shell scripts" or longer programs.  It has
 | 
| 1350 | *procs* and *modules* to help with the latter.
 | 
| 1351 | 
 | 
| 1352 | A module is just a file, like this:
 | 
| 1353 | 
 | 
| 1354 | ```
 | 
| 1355 | #!/usr/bin/env ysh
 | 
| 1356 | ### Deploy script
 | 
| 1357 | 
 | 
| 1358 | module main || return 0         # declaration, "include guard"
 | 
| 1359 | use bin cp mkdir                # optionally declare binaries used
 | 
| 1360 | 
 | 
| 1361 | source $_this_dir/lib/util.ysh  # defines 'log' helper
 | 
| 1362 | 
 | 
| 1363 | const DEST = '/tmp/ysh-tour'
 | 
| 1364 | 
 | 
| 1365 | proc my-sync(...files) {
 | 
| 1366 |   ### Sync files and show which ones
 | 
| 1367 | 
 | 
| 1368 |   cp --verbose @files $DEST
 | 
| 1369 | }
 | 
| 1370 | 
 | 
| 1371 | proc main {
 | 
| 1372 |   mkdir -p $DEST
 | 
| 1373 | 
 | 
| 1374 |   touch {foo,bar}.py {build,test}.sh
 | 
| 1375 | 
 | 
| 1376 |   log "Copying source files"
 | 
| 1377 |   my-sync *.py *.sh
 | 
| 1378 | 
 | 
| 1379 |   if test --dir /tmp/logs {
 | 
| 1380 |     cd /tmp/logs
 | 
| 1381 | 
 | 
| 1382 |     log "Copying logs"
 | 
| 1383 |     my-sync *.log
 | 
| 1384 |   }
 | 
| 1385 | }
 | 
| 1386 | 
 | 
| 1387 | if is-main {                    # The only top-level statement
 | 
| 1388 |   main @ARGV
 | 
| 1389 | }
 | 
| 1390 | ```
 | 
| 1391 | 
 | 
| 1392 | <!--
 | 
| 1393 | TODO:
 | 
| 1394 | - Also show flags parsing?
 | 
| 1395 | - Show longer examples where it isn't boilerplate
 | 
| 1396 | -->
 | 
| 1397 | 
 | 
| 1398 | You wouldn't bother with the boilerplate for something this small.  But this
 | 
| 1399 | example illustrates the idea, which is that the top level often contains these
 | 
| 1400 | words: `proc`, `const`, `module`, `source`, and `use`.
 | 
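
The module above sources `lib/util.ysh` for its `log` helper, but the tour doesn't show that file.  A plausible sketch of it, purely as an assumption about what such a helper might look like:

```ysh
# lib/util.ysh (hypothetical contents, not taken from the Oils source tree)

proc log (...msg) {
  ### Print a message to stderr, so stdout stays clean for data
  echo @msg >&2
}
```

With this definition, `log "Copying logs"` writes one line to stderr, so the output of the script can still be piped somewhere without the log messages mixed in.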
| 1401 | 
 |