---
default_highlighter: oils-sh
---

A Feel For YSH Syntax
=====================

A short way to describe the [YSH]($xref) language:

> A Unix shell that's familiar to people who know Python, JavaScript, or Ruby.

This document gives you a feel for that, with brief examples. It's not a
comprehensive or precise guide. Roughly speaking, YSH code has more
punctuation than those 3 languages, but less than shell and Perl.

If you're totally unfamiliar with the language, read [The Simplest Explanation
of Oil](//www.oilshell.org/blog/2020/01/simplest-explanation.html) first. (Oil
was renamed [YSH]($xref) in 2023.)

<div id="toc">
</div>

## Preliminaries

Different parts of YSH are parsed in either **command** or **expression** mode.
Command mode is like shell:

    echo $x

Expression mode looks like Python or JavaScript, and appears on the right-hand
side of `=`:

    var x = 42 + array[i]

The examples below aren't organized along those lines, but they use `var` and
`echo` to remind you of the context. Some constructs are valid in both modes.

## Sigils

Sigils are punctuation characters that precede a name, e.g. the `$` in
`$mystr`.

Unlike Perl and PHP, YSH doesn't use sigils on the LHS of assignments, or in
expression mode. The [syntactic concepts](syntactic-concepts.html) doc
explains this difference.

### Very Common

The `$` and `@` sigils mean roughly what they do in shell, Perl, and
PowerShell.

`$` means *string* / *scalar*. These shell constructs are idiomatic in YSH:

    $myvar   ${myvar}
    $(hostname)
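
To get a hands-on feel for these three forms, here is a minimal sketch that sticks to syntax plain POSIX shell also accepts, so it can be tried in bash. The function name `show_subs` and the value of `myvar` are made up for illustration, and the sketch hasn't been checked against YSH's stricter options.

```shell
# Hypothetical helper: the two variable-substitution spellings and a
# command substitution, side by side.
show_subs() {
  myvar='log'                                   # illustrative value
  echo "$myvar"                                 # plain variable sub
  echo "${myvar}_file"                          # braces separate the name from "_file"
  echo "$(printf '%s' "$myvar" | tr a-z A-Z)"   # command sub captures stdout
}

show_subs
```

The braces in `${myvar}_file` matter: without them the shell would look up a variable named `myvar_file`.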

And these YSH language extensions also use `$`:

    echo $[42 + a[i]]            # string interpolation of expression
    grep $/ digit+ /             # inline eggex (not implemented yet)

`@` means *array* / *splice an array*:

    echo "$@"                    # Legacy syntax; prefer @ARGV

YSH:

    echo @strs                   # splice array

    echo @[split(x)] @[glob(x)]  # splice expressions that return arrays

    for i in @(seq 3) {          # split command sub
      echo $i
    }

    proc p(first, @rest) {       # named varargs in proc signatures
      write -- $first            # (procs are shell-like functions)
      write -- @rest
    }
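
For comparison, the legacy `"$@"` spelling can be tried in plain shell. This is a minimal sketch, assuming bash or another POSIX shell; `count_args` and `forward` are hypothetical helper names. It shows why splicing matters: each original argument stays a single word, while an unquoted `$*` is split again on whitespace.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Hypothetical wrapper: forwards its arguments in two different ways.
forward() {
  count_args "$@"   # splice: every original argument stays one word
  count_args $*     # unquoted: arguments are re-split on whitespace
}

forward 'one two' three
```

The first count is 2 and the second is 3, because `'one two'` is broken apart in the unquoted form.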

### Less Common

The colon means "unquoted word" in these two lines:

    var mysymbol = :key               # string, not implemented yet
    var myarray = :| one two three |  # array

It's also used to pass the name of a variable to a builtin:

    echo hi | read :myvar

A caret means "unevaluated":

    var cmd = ^(cd /tmp; ls *.txt)
    var expr = ^[42 + a[i]]           # unimplemented
    var template = ^"var = $var"      # unimplemented

<!--

`:` means lazily evaluated in these 2 cases (not implemented):

    when :(x > 0) { echo 'positive' }
    x = :[1 + 2]

-->

## Opening and Closing Delimiters

The `{}` `[]` and `()` characters have several different meanings, but we try
our best to make them consistent. They're subject to legacy constraints from
Bourne shell, Korn shell, and [bash]($xref).

### Braces: Command Blocks and Dict Literal Expressions

In expression mode, `{}` are used for dict literals (aka hash
tables, associative arrays), which makes YSH look like JavaScript:

    var d = {name: 'Bob', age: 10}

    while (x > 0) {
      setvar x -= 1
    }

In command mode, they're used for blocks of code:

    cd /tmp {
      echo $PWD
    }

Blocks are also used for "declarative" configuration:

    server www.example.com {
      port = 80
      root = '/home/www'
      section bar {
        ...
      }
    }

### Parens: Expression

Parens are used in expressions:

    var x = (42 + a[i]) * myfunc(42, 'foo')

    if (x > 0) {         # compare with if test -d /tmp
      echo 'positive'
    }

And signatures:

    proc p(x, y) {
      echo $x $y
    }

In [Eggex](eggex.html), they mean **grouping** and not capture, which is
consistent with other YSH expressions:

    var p = / digit+ ('seconds' | 'minutes' | 'hours' ) /


<!--
    echo .(4 + 5)
    echo foo > &(fd)
-->

### Parens with Sigil: Command Interpolation

The "sigil pairs" with parens enclose commands:

    echo $(ls | wc -l)                # command sub
    echo @(seq 3)                     # split command sub

    var myblock = ^(echo $PWD)        # block literal in expression mode

    diff <(sort left.txt) <(sort right.txt)  # bash syntax

Unlike brackets and braces, unquoted `()` characters can't appear inside
ordinary shell words, which makes them useful as delimiters.

### Brackets: Sequence, Subscript

In expression mode, `[]` means sequence:

    var mylist = ['one', 'two', 'three']

or subscript:

    var item = mylist[1]
    var item = mydict['foo']

### Brackets with a Sigil: Expression

The sigil pair `$[]` is common in command mode:

    echo $[42 + a[i]]

Quotations are valid in expression mode:

    var my_expr = ^[42 + a[i]]

Pass lazy arg lists to commands with `[]`. They're syntactic sugar for `^[]`:

    assert [42 === x]     # short version

    assert (^[42 === x])  # same thing

<!--

And are used in type expressions:

    Dict[Int, Str]
    Func[Int => Int]

-->

## Spaces Around `=` ?

In YSH, *your own* variables look like this:

    const x = 42
    var s = 'foo'
    setvar s = 'bar'

In contrast, special shell variables are written with a single `NAME=value`
argument:

    shvar PATH=/tmp {
      temporary
    }

Which is similar to the syntax of the `env` command:

    env PYTHONPATH=/tmp ./myscript.py
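
The `env` form can be tried in any POSIX shell. This is a minimal sketch, assuming `env` and `sh` are on the `PATH`; `GREETING` and `show_env_scope` are made-up names. The single `NAME=value` word is visible only to the child process, not to the shell that launched it.

```shell
# Hypothetical demo: GREETING is set for one child process only.
show_env_scope() {
  unset GREETING                                   # start from a clean slate
  env GREETING=hello sh -c 'echo "child sees: $GREETING"'
  echo "parent sees: ${GREETING:-nothing}"         # still unset in the parent
}

show_env_scope
```

Note there are no spaces around `=` here: `GREETING=hello` has to be one argument to `env`.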


## Naming Conventions for Identifiers

See the [Style Guide](style-guide.html).

<!--

    class Parser { }
    data Point(x Int, y Int)

    enum Expr { Unary(child Expr), Binary(left Expr, right Expr) }
-->

## Other Punctuation Usage

Here are other usages of the punctuation discussed:

    echo *.[ch]                    # glob char and char classes
    echo {alice,bob}@example.com   # brace expansion
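
The glob on the first line can be tried in plain shell. This is a minimal sketch, assuming `mktemp -d` is available; `glob_demo` and the file names are hypothetical. It creates three files in a scratch directory and lets `*.[ch]` pick out the two whose extension is a single `c` or `h`.

```shell
# Hypothetical demo: *.[ch] matches names ending in ".c" or ".h".
glob_demo() {
  dir=$(mktemp -d) || return 1
  touch "$dir/main.c" "$dir/main.h" "$dir/notes.txt"
  (cd "$dir" && echo *.[ch])     # subshell, so the caller's directory is unchanged
  rm -r "$dir"
}

glob_demo
```

Here `[ch]` is a glob character class, unrelated to the `[]` list and subscript syntax of expression mode.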

Eggex:

    / [a-f A-F 0-9] /            # char classes use []

    / digit+ ('ms' | 'us') /     # non-capturing group
    < digit+ >                   # capturing group
    < digit+ :hour >             # named capture

    dot{3,4} a{+ N}              # repetition

The `~` character is used in operators that mean "pattern" or "approximate":

    if (s ~ /d+/) {
      echo 'number'
    }

    if (s ~~ '*.py') {
      echo 'Python'
    }

    if (mystr ~== myint) {
      echo 'string equals number'
    }

Extended globs are discouraged in YSH because they're a weird way of writing
regular expressions. But they also use "sigil pairs" with parens:

    ,(*.py|*.sh)   # preferred synonym for @(*.py|*.sh)
    +(...)         # bash/ksh-compatible
    *(...)
    ?(...)
    !(...)

Shell arithmetic is also discouraged in favor of YSH arithmetic:

    echo $((1 + 2))  # shell: confusing coercions, dynamically parsed
    echo $[1 + 2]    # YSH: types, statically parsed
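
To see what "confusing coercions" can look like in practice, here is a minimal sketch of two shell arithmetic surprises, assuming bash; `arith_surprises` and `empty` are made-up names. It covers only the shell side; nothing here demonstrates how YSH's `$[]` handles the same inputs.

```shell
# Hypothetical demo of implicit conversions in shell arithmetic.
arith_surprises() {
  empty=''
  echo $((empty + 1))   # an empty string silently counts as 0
  echo $((010 + 1))     # a leading zero means octal, so this is 8 + 1
}

arith_surprises
```

Neither line reports an error, which is how a typo or a zero-padded number can turn into a wrong result.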

<!--
    ! ? suffixes (not implemented)
-->

## Related Docs

- [Syntactic Concepts in the YSH Language](syntactic-concepts.html)
- [Language Influences](language-influences.html)

## Appendix: Table of Sigil Pairs

This table is mainly for YSH language designers. Many constructs aren't
implemented, but we reserve space for them. The [Oils
Reference](ref/index.html) is more complete.

    Example       Description        What's Inside  Where Valid  Notes

    $(hostname)   Command Sub        Command        cmd,expr
    @(seq 3)      Split Command Sub  Command        cmd,expr     should decode J8
                                                                 strings

    { echo hi }   Block Literal      Command        cmd          shell requires ;
    ^(echo hi)    Unevaluated Block  Command        expr         rare

    >(sort -n)    Process Sub        Command        cmd          rare
    <(echo hi)    Process Sub        Command        cmd          rare

    :|foo $bar|   Array Literal      Words          expr

    $[42 + a[i]]  Stringify Expr     Expression     cmd,expr
    @[glob(x)]    Array-ify Expr     Expression     cmd,expr     not implemented
    ^[42 + a[i]]  Unevaluated Expr   Expression     expr         not implemented

    ^"$1 $2"      Unevaluated Str    DQ String      expr         not implemented

    ${x %2d}      Var Sub            Formatting     cmd,expr     not implemented
    ${x|html}     Var Sub            Formatting     cmd,expr     not implemented

    json (x)      Typed Arg List     Argument       cmd
                                     Expressions

    $/d+/         Inline Eggex       Eggex Expr     cmd          not implemented

    r''           Raw String         String         expr         cmd when shopt
                  Literal                                        parse_raw_string

    j""           JSON8 String       String         cmd,expr     not implemented
                  Literal

    #'a'          Char Literal       UTF-8 char     expr

    Discouraged / Deprecated

    ${x%%pre}     Shell Var Sub      Shell          cmd,expr     mostly deprecated
    $((1+2))      Shell Arith Sub    Shell Arith    cmd          deprecated

    @(*.py|*.sh)  Extended Glob      Glob Words     cmd          deprecated
    +(...)
    *(...)
    ?(...)
    !(...)

    ,(*.py|*.sh)  Extended Glob      Glob Words     cmd          break conflict
                                                                 with split command
                                                                 sub

Key to "where valid" column:

- `cmd` means `lex_mode_e.ShCommand`
- `expr` means `lex_mode_e.Expr`

Some unused sigil pairs:

    ~()   -()   =()   /()   _()   .()