---
in_progress: yes
css_files: ../../web/base.css ../../web/manual.css ../../web/toc.css
---

Oil's Expression Language: A Mix of Python and JavaScript
=========================================================

Recall that Oil is composed of three interleaved languages:
[words](word-language.html), [commands](command-language.html), and
**expressions**.

This doc describes expressions, but only the things that are **not** in:

- [A Tour of the Oil Language](oil-language-tour.html). The best intro.
- The `#expr-lang` section of [Oil Help
  Topics](oil-help-topics.html#expr-lang). A reference.
- [Egg Expressions](eggex.html). A "sublanguage" of this language.

TODO: This doc should have example shell sessions, like the tour does.

<div id="toc">
</div>

## Preliminaries

### Comparison to Python and JavaScript

For a short summary, see [Oil vs. Python](oil-vs-python.html).

### Constructs Shared Between Word and Expression Languages

String literals can be used in both words and expressions:

    echo 'foo'
    var x = 'foo'

    echo "hello $name"
    var y = "hello $name"

    echo $'\t TAB'
    var z = $'\t TAB'

This includes multi-line string literals:

    echo '''
    hello
    world
    '''

    var x = '''
    hello
    world
    '''

    # (and the 2 other kinds)

Command substitution is shared:

    echo $(hostname)
    var a = $(hostname)  # no quotes necessary
    var b = "name is $(hostname)"

String substitution is shared:

    echo ${MYVAR:-}
    var c = ${MYVAR:-}
    var d = "var is ${MYVAR:-}"

Not shared:

- Unquoted substitution `$foo` isn't available in expression mode. (It should
  be either bare `foo` or `"$foo"`.)
- Expression sub `$[1 + 2]` is usually not necessary in expression mode, so it
  isn't available. You can use a quoted string like `var x = "$[1 + 2]"`.

## Literals for Data Types

### String Literals: Like Shell, But Less Confusion About Backslashes

Oil has 3 kinds of string literal. See the docs in the intro for detail, as
well as the [Strings](strings.html) doc.

As a detail, Oil disallows this case:

    $ var x = '\n'
      var x = '\n'
               ^~
      [ interactive ]:1: Strings with backslashes should look like r'\n' or $'\n'

In expression mode, you're forced to specify an explicit `r` or `$` when the
string has backslashes. This is because shell has the opposite default from
Python: In shell, unadorned strings are raw. In Python, unadorned strings
respect C escapes.
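
The Python half of that comparison is easy to check. The snippet below is plain Python, not Oil, and is only meant to show the two behaviors that Oil asks you to choose between explicitly: an unadorned Python string interprets `\n`, while an `r` prefix keeps the backslash.

```python
# Python, not Oil: unadorned strings respect C escapes, r'' strings are raw.
escaped = '\n'   # one character: a newline
raw = r'\n'      # two characters: a backslash followed by the letter n

print(len(escaped))  # 1
print(len(raw))      # 2
```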

### Float Literals

- Floating point literals are also like C/Python: `1.23e-10`. Except:
  - A number is required before the `.` now
  - No `1_000_000.123_456` because that was hard to implement as a hand-written
    Python regex.

Those last two caveats about floats are TODOs:
<https://github.com/oilshell/oil/issues/483>

### List Type: Both "Array" and List Literals

There is a single list type, but it has two syntaxes:

- `:| one two three |` for an "array" of strings. This is equivalent to
  `['one', 'two', 'three']`.
- `[1, [2, 'three', {}]]` for arbitrary Python-like "lists".

Longer example:

    var x = :| a b c |
    var x = :|
      'single quoted'
      "double quoted $var"
      $'c string'
      glob/*.py
      brace-{a,b,c}-{1..3}
    |

### Dict Literals Look Like JavaScript

Dict literals use JavaScript's rules, which are similar but not identical to
Python.

The key can be either a **bare word** or **bracketed expression**.

(1) For example, `{age: 30}` means what `{'age': 30}` does in Python. That is,
`age` is **not** the name of a variable. This fits more with the "dict as ad
hoc struct" philosophy.

(2) In `{[age]: 30}`, `age` is a variable. You can put an arbitrary expression
in there like `{['age'.upper()]: 30}`. (Note: Lua also has this bracketed key
syntax.)

(3) `{age, key2}` is the same as `{age: age, key2: key2}`. That is, if the
name is a bare word, you can leave off the value, and it will be looked up in
the context where the dictionary is defined.

This is what ES2015 calls "shorthand object properties":

- <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer>
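
For readers coming from Python, the three key forms can be lined up against their Python spellings. This is a translation table written as Python code, not Oil code; the variable values (`30`, `'x'`) are made up for illustration.

```python
# Python equivalents of the three Oil key forms described above.
age = 30
key2 = 'x'

bare_key = {'age': 30}                  # Oil: {age: 30}
bracketed = {age: 30}                   # Oil: {[age]: 30}, key is the VALUE of age
computed = {'age'.upper(): 30}          # Oil: {['age'.upper()]: 30}
shorthand = {'age': age, 'key2': key2}  # Oil: {age, key2}

print(bare_key)   # {'age': 30}
print(bracketed)  # {30: 30}
print(computed)   # {'AGE': 30}
print(shorthand)  # {'age': 30, 'key2': 'x'}
```

Note that the Python literal `{age: 30}` corresponds to the *bracketed* Oil form, which is the main trap when moving between the two languages.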

### Block, Expr

TODO:

    var myblock = ^(ls | wc -l)
    var myexpr = ^[1 + 2]

## Operators on Multiple Types

Like JavaScript, Oil has two types of equality, but uses `===` and `~==` rather
than `===` and `==`.

### Exact Equality `=== !==`

- TODO: types must be the same, so `'42' === 42` is not just false, but it's an
  **error**.

### Approximate Equality `~==`

- There's no negative form like `!==`. Use `not (a ~== b)` instead.
- Valid Operand Types:
  - LHS: `Str` only
  - RHS: `Str`, `Int`, `Bool`

Examples:

    ' foo ' ~== 'foo'  # whitespace stripped on LEFT only
    ' 42 ' ~== 42
    ' TRue ' ~== true  # true, false, 0, 1, and I think T, F

Currently, there are no semantics for floats, so none of these work:

    ' 42.0 ' ~== 42
    ' 42 ' ~== 42.0
    42.0 ~== 42
    42 ~== 42.0

(Should `float_equals()` be a separate function?)
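
The rules above can be modeled roughly in Python. This is a sketch under several assumptions, not Oil's implementation: `approx_equal` is a hypothetical name, Python's `int()` stands in for Oil's integer parsing, only the boolean spellings `true`, `false`, `0` and `1` are accepted (the `T`/`F` forms are left out because the text is unsure about them), and a non-numeric left side compared to an `Int` is treated as false rather than an error.

```python
def approx_equal(lhs, rhs):
    """Rough model of `lhs ~== rhs` (hypothetical helper, not Oil's code)."""
    if not isinstance(lhs, str):
        raise TypeError('LHS of ~== must be a Str')
    s = lhs.strip()  # whitespace is stripped on the LEFT only

    # bool is tested before int because bool is a subclass of int in Python
    if isinstance(rhs, bool):
        # Assumption: only these spellings count, compared case-insensitively
        spellings = {'true': True, '1': True, 'false': False, '0': False}
        return spellings.get(s.lower()) is rhs
    if isinstance(rhs, int):
        try:
            return int(s) == rhs
        except ValueError:
            return False  # assumption: not an integer string means "not equal"
    if isinstance(rhs, str):
        return s == rhs
    # Floats (and anything else) have no semantics yet
    raise TypeError('RHS of ~== must be a Str, Int, or Bool')


print(approx_equal(' foo ', 'foo'))   # True
print(approx_equal('foo', ' foo '))   # False: the right side is not stripped
print(approx_equal(' 42 ', 42))       # True
print(approx_equal(' TRue ', True))   # True
```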

### Function and Method Calls

    var result = add(x, y)
    var result = foo(x, named='default')

    if (s.startswith('prefix')) {
      echo yes
    }

Use Cases:

    var d = {1: 2, 3: 4}
    const k = keys(d)

## Boolean Operators

### Logical: `not` `and` `or`

Like Python.

### Ternary

    var cond = true
    var x = 'yes' if cond else 'no'

## Arithmetic

<!--
TODO: Should the string to number/integer conversions also handle these cases?

    '1_000' => 1000
    '0xff' => 255
    '0o010' => 8
    '0b0001_0000' => 16

Right now comparison operators convert decimal strings.
-->

### Arithmetic `+ - * /`

These are like Python, but they do string to number conversion (unary `-` is
the exception and does not). A number is an integer or float.

That is:

- `'1' + '2'` evaluates to `3` because `1 + 2` evaluates to `3`.
- `'1' + '2.5'` evaluates to `3.5` because `1 + 2.5` evaluates to `3.5`.
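
A minimal Python model of that conversion step is below. The helper names `to_number` and `add` are hypothetical, and Python's `int()` and `float()` stand in for whatever string formats Oil actually accepts (for example, Python's `int()` accepts `'1_000'`, which Oil may not).

```python
def to_number(x):
    """Convert a string operand to int or float; pass numbers through."""
    if isinstance(x, str):
        try:
            return int(x)
        except ValueError:
            return float(x)  # raises ValueError if x isn't numeric at all
    return x


def add(a, b):
    """Model of Oil's `+`: convert both operands, then add as Python would."""
    return to_number(a) + to_number(b)


print(add('1', '2'))    # 3
print(add('1', '2.5'))  # 3.5
```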

### Arithmetic `// %` and `**`

Also like Python, but they do string to **integer** conversion.

- `'9' // '4'` evaluates to `2` because `9 // 4` evaluates to `2`.

### Bitwise `~ & | ^ << >>`

Like Python.

## Comparison of Integers and Floats `< <= > >=`

These operators also do string to number conversion. That is:

- `'22' < '3'` is false because `22 < 3` is false. (It would be true under
  lexicographical comparison.)
- `'3.1' <= '3.14'` is true because `3.1 <= 3.14` is true.
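
The difference between the two orderings is easy to see in Python, where comparing strings directly is lexicographic and the conversion has to be written out by hand. This is plain Python for comparison, not Oil.

```python
# Lexicographic: '2' sorts before '3', so the string '22' is "less than" '3'.
lexicographic = '22' < '3'

# Numeric: what Oil's `<` does after converting both operands.
numeric = int('22') < int('3')
numeric_float = float('3.1') <= float('3.14')

print(lexicographic)  # True
print(numeric)        # False
print(numeric_float)  # True
```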

TODO:

- Do we have `is` and `is not`? I think it's useful for lists and dicts
- Remove chained comparison? This syntax is directly from Python.
  - That is, `x op y op z` is a shortcut for `x op y and y op z`
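
For reference, this is how the chained form behaves in Python, where the syntax comes from. The values are arbitrary.

```python
x, y, z = 1, 5, 3

chained = x < y < z          # Python's chained comparison
expanded = x < y and y < z   # the expansion described above

print(chained, expanded)  # False False
```

One difference the expansion hides: in the chained form Python evaluates `y` only once, which matters when `y` is an expression with side effects.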

## String Pattern Matching `~` and `~~`

- Eggex: `~` `!~`
  - Similar to bash's `[[ $x =~ $pat ]]`
- Glob: `~~` `!~~`
  - Similar to bash's `[[ $x == *.py ]]`
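
The two kinds of matching differ in more than pattern syntax. As a loose analogy in Python (not Oil, and with Python's `re` syntax standing in for Eggex, which has its own notation), a regex match is a search anywhere in the string, while a glob must cover the whole string:

```python
import re
from fnmatch import fnmatchcase

x = 'foo.py'

# Regex, like bash's =~ : the pattern may match anywhere in the string.
regex_hit = re.search(r'o+\.', x) is not None

# Glob, like bash's == : the pattern must match the entire string.
glob_hit = fnmatchcase(x, '*.py')
glob_miss = fnmatchcase(x, '*.p')

print(regex_hit)  # True
print(glob_hit)   # True
print(glob_miss)  # False
```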

## String and List Operators

In addition to pattern matching.

### Concatenation with `++`

    s ++ 'suffix'
    L ++ [1, 2] ++ :| a b |

### Indexing `a[i]`

    var s = 'foo'
    var second = s[1]  # are these integers though? maybe slicing gives you things of length 1
    echo $second  # 'o'

    var a = :| spam eggs ham |
    var second = a[1]
    echo $second  # => 'eggs'

    echo $[a[-1]]  # => ham

Semantics are like Python: Out of bounds is an error.

### Slicing `a[i:j]`

    var s = 'food'
    var slice = s[1:3]
    echo $slice  # 'oo'

    var a = :| spam eggs ham |
    var slice = a[1:3]
    write -- @slice  # eggs, ham

Semantics are like Python: Out of bounds is **not** an error.
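
Since both sections defer to Python, here is the Python behavior being referred to, using the same list as the examples above. This is Python, not Oil.

```python
a = ['spam', 'eggs', 'ham']

last = a[-1]         # negative indices count from the end
clamped = a[1:100]   # a slice end past the list is clamped, not an error
empty = a[5:]        # a slice start past the list gives an empty list

try:
    a[5]             # indexing past the end IS an error
    index_error = None
except IndexError as e:
    index_error = str(e)

print(last)         # ham
print(clamped)      # ['eggs', 'ham']
print(empty)        # []
print(index_error)  # list index out of range
```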

## Dict Operators

### Membership with `in`

- And `not in`
- But strings and arrays use functions?
  - .find() ? It's more of an algorithm.

### `d->key` is a shortcut for `d['key']`

> the distinction between attributes and dictionary members always seemed weird
> and unnecessary to me.

I've been thinking about this for [the Oil
language](http://www.oilshell.org/blog/2019/08/22.html), which is heavily
influenced by Python.

The problem is that dictionary attributes come from user data, i.e. from JSON,
while methods like `.keys()` come from the interpreter, and Python allows you
to provide user-defined methods like `mydict.mymethod()` too.

Mixing all of those things in the same namespace seems like a bad idea.
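
To make the collision concrete, here is a small Python sketch. `AttrDict` is a hypothetical class written only for this illustration; it exposes dictionary entries as attributes, which is the kind of shared namespace being argued against.

```python
import json


class AttrDict(dict):
    """Hypothetical dict that exposes its entries as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup (methods etc.) fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


# User data that happens to contain an entry named "keys"
d = AttrDict(json.loads('{"name": "alice", "keys": "user data"}'))

print(d.name)            # alice
print(d['keys'])         # user data
print(callable(d.keys))  # True: the built-in method shadows the user's entry
```

The entry named `keys` is still reachable with brackets, but not through the attribute syntax, because the interpreter's method wins the lookup.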

In Oil I might introduce an `->` operator, so `d->mykey` is a shortcut for
`d['mykey']`.

```
d.keys(), d.values(), d.items()  # methods
d->mykey
d['mykey']
```

Maybe you could disallow user-defined attributes on dictionaries, and make them
free:

```
keys(d), values(d), items(d)
d.mykey  # The whole namespace is available for users
```

However I don't like that this makes dictionaries a special case. Thoughts?

## Deferred

### List and Dict Comprehensions

List comprehensions might be useful for a "faster" for loop? It only does
expressions?

### Splat `*` and `**`

Python allows splatting into lists:

    a = [1, 2]
    b = [*a, 3]

And dicts:

    d = {'name': 'alice'}
    d2 = {**d, 'age': 42}

### Ranges `1:n` (vs slices)

Deferred because you can use:

    for i in @(seq $n) {
      echo $i
    }

This gives you strings but that's OK for now. We don't yet have a "fast" for
loop.

Notes:

- Oil slices don't have a "step" argument. Justification:
  - R only has `start:end`, it doesn't have `start:end:step`
  - Julia has `start:step:end`!
  - I don't think the **step** is so useful that it has to be first class
    syntax. In other words, Python's syntax is optimized for a rare case --
    e.g. `a[::2]`.
- Python has slices, but it doesn't have a range syntax. You have to write
  `range(0, n)`.
- A syntactic difference between slices and ranges: slice endpoints can be
  **implicit**, like `a[:n]` and `a[3:]`.
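
The Python forms mentioned in these notes, side by side. This is Python for reference only; the list contents are arbitrary.

```python
a = list(range(10))  # Python has no `0:10` range literal, only range()

stepped = a[::2]       # the "step" form that Oil slices leave out
head = a[:3]           # implicit start
tail = a[7:]           # implicit end
explicit = a[0:3]      # same as head, with both endpoints written out

print(stepped)   # [0, 2, 4, 6, 8]
print(head)      # [0, 1, 2]
print(tail)      # [7, 8, 9]
print(explicit)  # [0, 1, 2]
```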

## Appendices

### Oil vs. Tea

- Tea: truthiness of `Str*` is a problem. Nul, etc.
  - `if (mystr)` vs `if (len(mystr))`
  - though I think strings should be non-nullable value types? They are
    slices.
  - they start off as the empty slice
- Automatic conversions of strings to numbers
  - `42` and `3.14` and `1e100`

### Implementation Notes

- Limitation:
  - Start with Str, StrArray, and AssocArray data model
  - Then add int, float, bool, null (for JSON)
  - Then add fully recursive data model (depends on FC)
    - `value = ... | dict[str, value]`