---
title: YSH Expression Language (Oils Reference)
all_docs_url: ..
body_css_class: width40
default_highlighter: oils-sh
preserve_anchor_case: yes
---

<div class="doc-ref-header">

[Oils Reference](index.html) &mdash;
Chapter **YSH Expression Language**

</div>

This chapter describes the YSH expression language, which includes [Egg
Expressions]($xref:eggex).

<div id="dense-toc">
</div>
21
22## Assignment
23
24### assign
25
26The `=` operator is used with assignment keywords:
27
28 var x = 42
29 setvar x = 43
30
31 const y = 'k'
32
33 setglobal z = 'g'
34
35### aug-assign
36
37The augmented assignment operators are:
38
39 += -= *= /= **= //= %=
40 &= |= ^= <<= >>=
41
42They are used with `setvar` and `setglobal`. For example:
43
44 setvar x += 2
45
46is the same as:
47
48 setvar x = x + 2
49
50Likewise, these are the same:
51
52 setglobal a[i] -= 1
53
54 setglobal a[i] = a[i] - 1
55
56## Literals
57
58### atom-literal
59
60YSH uses JavaScript-like spellings for these three "atoms":
61
62 null # type Null
63 true false # type Bool
64
65Note: to signify "no value", you may sometimes use an empty string `''`,
66instead of `null`.
67
68### int-literal
69
70Examples of integer literals:
71
72 var decimal = 42
73 var big = 42_000
74
75 var hex = 0x0010_ffff
76
77 var octal = 0o755
78
79 var binary = 0b0001_0000
80
81### float-lit
82
83Examples of float literals:
84
85 var myfloat = 3.14
86
87 var f2 = -1.5e-100
88
89### ysh-string
90
91YSH has single and double-quoted strings borrowed from Bourne shell, and
92C-style strings borrowed from J8 Notation.
93
94Double quoted strings respect `$` interpolation:
95
96 var dq = "hello $world and $(hostname)"
97
98You can add a `$` before the left quote to be explicit: `$"x is $x"` rather
99than `"x is $x"`.
100
101Single quoted strings may be raw:
102
103 var s = r'line\n' # raw string means \n is literal, NOT a newline
104
105Or *J8 strings* with backslash escapes:
106
107 var s = u'line\n \u{3bc}' # unicode string means \n is a newline
108 var s = b'line\n \u{3bc} \yff' # same thing, but also allows bytes
109
110Both `u''` and `b''` strings evaluate to the single `Str` type. The difference
111is that `b''` strings allow the `\yff` byte escape.
112
113#### Notes
114
115There's no way to express a single quote in raw strings. Use one of the other
116forms instead:
117
118 var sq = "single quote: ' "
119 var sq = u'single quote: \' '
120
121Sometimes you can omit the `r`, e.g. where there are no backslashes and thus no
122ambiguity:
123
124 echo 'foo'
125 echo r'foo' # same thing
126
127The `u''` and `b''` strings are called *J8 strings* because the syntax in YSH
128**code** matches JSON-like **data**.
129
130 var strU = u'mu = \u{3bc}' # J8 string with escapes
131 var strB = b'bytes \yff' # J8 string that can express byte strings
132
133More examples:
134
135 var myRaw = r'[a-z]\n' # raw strings can be used for regexes (not
136 # eggexes)
137
138### triple-quoted
139
140Triple-quoted string literals have leading whitespace stripped on each line.
141They come in the same variants:
142
143 var dq = """
144 hello $world and $(hostname)
145 no leading whitespace
146 """
147
148 var myRaw = r'''
149 raw string
150 no leading whitespace
151 '''
152
153 var strU = u'''
154 string that happens to be unicode \u{3bc}
155 no leading whitespace
156 '''
157
158 var strB = b'''
159 string that happens to be bytes \u{3bc} \yff
160 no leading whitespace
161 '''
162
163Again, you can omit the `r` prefix if there's no backslash, because it's not
164ambiguous:
165
166 var myRaw = '''
167 raw string
168 no leading whitespace
169 '''
170
171### str-template
172
173String templates use the same syntax as double-quoted strings:
174
175 var mytemplate = ^"name = $name, age = $age"
176
177Related topics:
178
179- [Str => replace](chap-type-method.html#replace)
180- [ysh-string](chap-expr-lang.html#ysh-string)
181
182### list-literal
183
184Lists have a Python-like syntax:
185
186 var mylist = ['one', 'two', [42, 43]]
187
188And a shell-like syntax:
189
190 var list2 = :| one two |
191
192The shell-like syntax accepts the same syntax as a simple command:
193
194 ls $mystr @ARGV *.py {foo,bar}@example.com
195
196 # Rather than executing ls, evaluate words into a List
197 var cmd = :| ls $mystr @ARGV *.py {foo,bar}@example.com |
198
199### dict-literal
200
201Dicts look like JavaScript.
202
203 var d = {
204 key1: 'value', # key can be unquoted if it looks like a var name
205 'key2': 42, # or quote it
206
207 ['key2' ++ suffix]: 43, # bracketed expression
208 }
209
210Omitting a value means that the corresponding key takes the value of a var of
211the same name:
212
213 ysh$ var x = 42
214 ysh$ var y = 43
215
216 ysh$ var d = {x, y} # values omitted
217 ysh$ = d
218 (Dict) {x: 42, y: 43}
219
220### range
221
222A range is a sequence of numbers that can be iterated over:
223
224 for i in (0 .. 3) {
225 echo $i
226 }
227 => 0
228 => 1
229 => 2
230
231As with slices, the last number isn't included. To iterate from 1 to n, you
232can use this idiom:
233
234 for i in (1 .. n+1) {
235 echo $i
236 }
237
238### block-expr
239
240In YSH expressions, we use `^()` to create a [Command][] object:
241
242 var myblock = ^(echo $PWD; ls *.txt)
243
244It's more common for [Command][] objects to be created with block arguments,
245which are not expressions:
246
247 cd /tmp {
248 echo $PWD
249 ls *.txt
250 }
251
252[Command]: chap-type-method.html#Command
253
254### expr-literal
255
An expression literal is an [Expr][] object that holds an unevaluated
expression:

    var myexpr = ^[1 + 2*3]

[Expr]: chap-type-method.html#Expr
261
262## Operators
263
264### op-precedence
265
266YSH operator precedence is identical to Python's operator precedence.
267
268New operators:
269
270- `++` has the same precedence as `+`
271- `->` and `=>` have the same precedence as `.`
272
273<!-- TODO: show grammar -->
274
275
276<h3 id="concat">concat <code>++</code></h3>
277
278The concatenation operator works on `Str` objects:
279
280 ysh$ var s = 'hello'
281 ysh$ var t = s ++ ' world'
282
283 ysh$ = t
284 (Str) "hello world"
285
286and `List` objects:
287
288 ysh$ var L = ['one', 'two']
289 ysh$ var M = L ++ ['three', '4']
290
291 ysh$ = M
292 (List) ["one", "two", "three", "4"]
293
294String interpolation can be nicer than `++`:
295
296 var t2 = "${s} world" # same as t
297
298Likewise, splicing lists can be nicer:
299
300 var M2 = :| @L three 4 | # same as M
301
302### ysh-equals
303
304YSH has strict equality:
305
306 a === b # Python-like, without type conversion
307 a !== b # negated
308
309And type converting equality:
310
311 '3' ~== 3 # True, type conversion
312
313The `~==` operator expects a string as the left operand.
314
315---
316
317Note that:
318
319- `3 === 3.0` is false because integers and floats are different types, and
320 there is no type conversion.
321- `3 ~== 3.0` is an error, because the left operand isn't a string.
322
323You may want to use explicit `int()` and `float()` to convert numbers, and then
324compare them.
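
A small sketch of that advice, assuming `int()` accepts a `Float` and truncates
it. The variable names are illustrative:

```ysh
var a = 3      # Int
var b = 3.0    # Float

# Convert explicitly so both operands are Int before comparing
if (a === int(b)) {
  echo 'equal after converting to Int'
}
```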
325
326---
327
Compare objects for identity with `is`:

    ysh$ var d = {}
    ysh$ var e = d

    ysh$ = d is e
    (Bool) true

    ysh$ = d is {other: 'dict'}
    (Bool) false

To negate `is`, use `is not` (like Python):

    ysh$ = d is not {other: 'dict'}
    (Bool) true
343
344### ysh-in
345
346The `in` operator tests if a key is in a dictionary:
347
348 var d = {k: 42}
349 if ('k' in d) {
350 echo yes
351 } # => yes
352
Unlike Python, `in` doesn't work on `Str` and `List` instances. This is because
those operations take linear time rather than constant time (O(n) rather than
O(1)).
356
357TODO: Use `includes() / contains()` methods instead.
358
359### ysh-compare
360
361The comparison operators apply to integers or floats:
362
363 4 < 4 # => false
364 4 <= 4 # => true
365
366 5.0 > 5.0 # => false
367 5.0 >= 5.0 # => true
368
369Example in context:
370
371 if (x < 0) {
372 echo 'x is negative'
373 }
374
375### ysh-logical
376
377The logical operators take boolean operands, and are spelled like Python:
378
379 not
380 and or
381
382Note that they are distinct from `! && ||`, which are part of the [command
383language](chap-cmd-lang.html).
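
As an illustration, here is a minimal sketch that uses the expression operators
inside a condition, followed by a command-language `&&` for contrast. The
variable `x` is made up for the example:

```ysh
var x = 5

# Expression language: and, not operate on Bool values
if (x > 0 and not (x > 10)) {
  echo 'x is between 1 and 10'
}

# Command language: && operates on exit statuses of commands
true && echo 'previous command succeeded'
```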
384
385### ysh-arith
386
YSH supports most of the arithmetic operators from Python. Notably, `//` and `%`
differ from Python as [they round toward zero, not negative
infinity](https://www.oilshell.org/blog/2024/03/release-0.21.0.html#integers-dont-do-whatever-python-or-c-does).
390
391Use `+ - *` for `Int` or `Float` addition, subtraction and multiplication. If
392any of the operands are `Float`s, then the output will also be a `Float`.
393
394Use `/` and `//` for `Float` division and `Int` division, respectively. `/`
395will _always_ result in a `Float`, meanwhile `//` will _always_ result in an
396`Int`.
397
398 = 1 / 2 # => (Float) 0.5
399 = 1 // 2 # => (Int) 0
400
401Use `%` to compute the _remainder_ of integer division. The left operand must
402be an `Int` and the right a _positive_ `Int`.
403
    = 1 % 2   # => (Int) 1
    = -4 % 2  # => (Int) 0
406
407Use `**` for exponentiation. The left operand must be an `Int` and the right a
408_positive_ `Int`.
409
410All arithmetic operators may coerce either of their operands from strings to a
411number, provided those strings are formatted as numbers.
412
413 = 10 + '1' # => (Int) 11
414
415Operators like `+ - * /` will coerce strings to _either_ an `Int` or `Float`.
416However, operators like `// ** %` and bit shifts will coerce strings _only_ to
417an `Int`.
418
419 = '1.14' + '2' # => (Float) 3.14
420 = '1.14' % '2' # Type Error: Left operand is a Str
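
To make the rounding toward zero concrete, here is a short sketch with a
negative left operand. The values in the comments follow from truncating
division. Python's floor division would give `-4` and `1` for the first two
lines instead:

```ysh
var n = -7

echo $[n // 2]   # => -3, rounds toward zero
echo $[n % 2]    # => -1, remainder takes the sign of the left operand
echo $[7 / 2]    # => 3.5, / always produces a Float
```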
421
422### ysh-bitwise
423
424Bitwise operators are like Python and C:
425
426 ~ # unary complement
427
428 & | ^ # binary and, or, xor
429
430 >> << # bit shift
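
A few worked values, as a sketch. The variables `a` and `b` are only for
illustration:

```ysh
var a = 0b1100   # 12
var b = 0b1010   # 10

echo $[a & b]    # => 8
echo $[a | b]    # => 14
echo $[a ^ b]    # => 6
echo $[1 << 4]   # => 16
echo $[a >> 2]   # => 3
echo $[~0]       # => -1
```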
431
432### ysh-ternary
433
434The ternary operator is borrowed from Python:
435
    var display = 'yes' if len(s) else 'empty'
437
438### ysh-index
439
440`Str` objects can be indexed by byte:
441
    ysh$ var mystr = 'cat'
    ysh$ = mystr[1]
    (Str) 'a'

    ysh$ = mystr[-1] # index from the end
    (Str) 't'
448
449`List` objects:
450
451 ysh$ var mylist = [1, 2, 3]
452 ysh$ = mylist[2]
453 (Int) 3
454
455`Dict` objects are indexed by string key:
456
457 ysh$ var mydict = {'key': 42}
458 ysh$ = mydict['key']
459 (Int) 42
460
461### ysh-attr
462
463The expression `mydict.key` is short for `mydict['key']`.
464
465(Like JavaScript, but unlike Python.)
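
For example, this sketch reads the same entry both ways. The names are
illustrative:

```ysh
var mydict = {key: 42}

echo $[mydict.key]      # => 42
echo $[mydict['key']]   # => 42
```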
466
467### ysh-slice
468
469Slicing gives you a subsequence of a `Str` or `List`, like Python.
470
471Negative indices are relative to the end.
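
A minimal sketch, assuming Python-like `begin:end` semantics where the end index
is excluded. The comments show the values expected under that assumption:

```ysh
var mylist = [10, 20, 30, 40]

var middle = mylist[1:3]   # [20, 30]
var tail = mylist[-2:]     # [30, 40], counting from the end

var s = 'hello'
var sub = s[1:3]           # 'el'
```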
472
473### func-call
474
475A function call expression looks like Python:
476
477 ysh$ = f('s', 't', named=42)
478
479A semicolon `;` can be used after positional args and before named args, but
480isn't always required:
481
482 ysh$ = f('s', 't'; named=42)
483
484In these cases, the `;` is necessary:
485
486 ysh$ = f(...args; ...kwargs)
487
488 ysh$ = f(42, 43; ...kwargs)
489
490### thin-arrow
491
492The thin arrow is for mutating methods:
493
494 var mylist = ['bar']
495 call mylist->pop()
496
497<!--
498TODO
499 var mydict = {name: 'foo'}
500 call mydict->erase('name')
501-->
502
503### fat-arrow
504
505The fat arrow is for transforming methods:
506
507 if (s => startsWith('prefix')) {
508 echo 'yes'
509 }
510
511If the method lookup on `s` fails, it looks for free functions. This means it
512can be used for "chaining" transformations:
513
514 var x = myFunc() => list() => join()
515
516### match-ops
517
518YSH has four pattern matching operators: `~ !~ ~~ !~~`.
519
Does a string match an **eggex**?
521
522 var filename = 'x42.py'
523 if (filename ~ / d+ /) {
524 echo 'number'
525 }
526
527Does a string match a POSIX regular expression (ERE syntax)?
528
529 if (filename ~ '[[:digit:]]+') {
530 echo 'number'
531 }
532
533Negate the result with the `!~` operator:
534
535 if (filename !~ /space/ ) {
536 echo 'no space'
537 }
538
539 if (filename !~ '[[:space:]]' ) {
540 echo 'no space'
541 }
542
543Does a string match a **glob**?
544
545 if (filename ~~ '*.py') {
546 echo 'Python'
547 }
548
549 if (filename !~~ '*.py') {
550 echo 'not Python'
551 }
552
553Take care not to confuse glob patterns and regular expressions.
554
555- Related doc: [YSH Regex API](../ysh-regex-api.html)
556
557## Eggex
558
559### re-literal
560
561An eggex literal looks like this:
562
563 / expression ; flags ; translation preference /
564
565The flags and translation preference are both optional.
566
567Examples:
568
569 var pat = / d+ / # => [[:digit:]]+
570
571You can specify flags passed to libc `regcomp()`:
572
573 var pat = / d+ ; reg_icase reg_newline /
574
575You can specify a translation preference after a second semi-colon:
576
577 var pat = / d+ ; ; ERE /
578
579Right now the translation preference does nothing. It could be used to
580translate eggex to PCRE or Python syntax.
581
582- Related doc: [Egg Expressions](../eggex.html)
583
584### re-primitive
585
586There are two kinds of eggex primitives.
587
588"Zero-width assertions" match a position rather than a character:
589
590 %start # translates to ^
591 %end # translates to $
592
593Literal characters appear within **single** quotes:
594
595 'oh *really*' # translates to regex-escaped string
596
597Double-quoted strings are **not** eggex primitives. Instead, you can use
598splicing of strings:
599
600 var dq = "hi $name"
601 var eggex = / @dq /
602
603### class-literal
604
605An eggex character class literal specifies a set. It can have individual
606characters and ranges:
607
608 [ 'x' 'y' 'z' a-f A-F 0-9 ] # 3 chars, 3 ranges
609
610Omit quotes on ASCII characters:
611
612 [ x y z ] # avoid typing 'x' 'y' 'z'
613
Sets of characters can be written as strings:
615
616 [ 'xyz' ] # any of 3 chars, not a sequence of 3 chars
617
618Backslash escapes are respected:
619
620 [ \\ \' \" \0 ]
621 [ \xFF \u0100 ]
622
623Splicing:
624
625 [ @str_var ]
626
Negation always uses `!`:
628
629 ![ a-f A-F 'xyz' @str_var ]
630
631### named-class
632
633Perl-like shortcuts for sets of characters:
634
635 [ dot ] # => .
636 [ digit ] # => [[:digit:]]
637 [ space ] # => [[:space:]]
638 [ word ] # => [[:alpha:]][[:digit:]]_
639
640Abbreviations:
641
642 [ d s w ] # Same as [ digit space word ]
643
644Valid POSIX classes:
645
646 alnum cntrl lower space
647 alpha digit print upper
648 blank graph punct xdigit
649
650Negated:
651
652 !digit !space !word
653 !d !s !w
654 !alnum # etc.
655
656### re-repeat
657
658Eggex repetition looks like POSIX syntax:
659
660 / 'a'? / # zero or one
661 / 'a'* / # zero or more
662 / 'a'+ / # one or more
663
664Counted repetitions:
665
    / 'a'{3} /    # exactly 3 repetitions
    / 'a'{2,4} /  # between 2 and 4 repetitions
668
669### re-compound
670
671Sequence expressions with a space:
672
673 / word digit digit / # Matches 3 characters in sequence
674 # Examples: a42, b51
675
676(Compare `/ [ word digit ] /`, which is a set matching 1 character.)
677
678Alternation with `|`:
679
680 / word | digit / # Matches 'a' OR '9', for example
681
682Grouping with parentheses:
683
684 / (word digit) | \\ / # Matches a9 or \
685
686### re-capture
687
688To retrieve a substring of a string that matches an Eggex, use a "capture
689group" like `<capture ...>`.
690
691Here's an eggex with a **positional** capture:
692
    var pat = / 'hi ' <capture d+> /  # access with _group(1)
                                      # or Match => group(1)
695
696Captures can be **named**:
697
698 <capture d+ as month> # access with _group('month')
699 # or Match => group('month')
700
701Captures can also have a type **conversion func**:
702
703 <capture d+ : int> # _group(1) returns Int
704
705 <capture d+ as month: int> # _group('month') returns Int
706
707Related docs and help topics:
708
709- [YSH Regex API](../ysh-regex-api.html)
710- [`_group()`](chap-builtin-func.html#_group)
711- [`Match => group()`](chap-type-method.html#group)
712
713### re-splice
714
715To build an eggex out of smaller expressions, you can **splice** eggexes
716together:
717
718 var D = / [0-9][0-9] /
719 var time = / @D ':' @D / # [0-9][0-9]:[0-9][0-9]
720
If the variable begins with a capital letter, you can omit `@`:

    var time2 = / D ':' D /  # same pattern as time
724
725You can also splice a string:
726
727 var greeting = 'hi'
728 var pat = / @greeting ' world' / # hi world
729
730Splicing is **not** string concatenation; it works on eggex subtrees.
731
732### re-flags
733
734Valid ERE flags, which are passed to libc's `regcomp()`:
735
736- `reg_icase` aka `i` - ignore case
737- `reg_newline` - 4 matching changes related to newlines
738
739See `man regcomp`.
740
741### re-multiline
742
743Multi-line eggexes aren't yet implemented. Splicing makes it less necessary:
744
745 var Name = / <capture [a-z]+ as name> /
746 var Num = / <capture d+ as num> /
747 var Space = / <capture s+ as space> /
748
749 # For variables named like CapWords, splicing @Name doesn't require @
750 var lexer = / Name | Num | Space /