1 | ---
2 | in_progress: yes
3 | css_files: ../../web/base.css ../../web/manual.css ../../web/toc.css
4 | ---
5 |
6 | Oil's Expression Language: A Mix of Python and JavaScript
7 | =========================================================
8 |
9 | Recall that Oil is composed of three interleaved languages:
10 | [words](word-language.html), [commands](command-language.html), and
11 | **expressions**.
12 |
13 | This doc describes expressions, but only the things that are **not** in:
14 |
15 | - [A Tour of the Oil Language](oil-language-tour.html). The best intro.
16 | - The `#expr-lang` section of [Oil Help
17 | Topics](oil-help-topics.html#expr-lang). A reference.
- [Egg Expressions](eggex.html). A "sublanguage" of this language.
19 |
20 | TODO: This doc should have example shell sessions, like the tour does.
21 |
22 | <div id="toc">
23 | </div>
24 |
25 | ## Preliminaries
26 |
27 | ### Comparison to Python and JavaScript
28 |
29 | For a short summary, see [Oil vs. Python](oil-vs-python.html).
30 |
31 | ### Constructs Shared Between Word and Expression Languages
32 |
33 | String literals can be used in both words and expressions:
34 |
35 | echo 'foo'
36 | var x = 'foo'
37 |
38 | echo "hello $name"
39 | var y = "hello $name"
40 |
41 | echo $'\t TAB'
42 | var z = $'\t TAB'
43 |
44 | This includes multi-line string literals:
45 |
46 | echo '''
47 | hello
48 | world
49 | '''
50 |
51 | var x = '''
52 | hello
53 | world
54 | '''
55 |
56 | # (and the 2 other kinds)
57 |
58 | Command substitution is shared:
59 |
60 | echo $(hostname)
61 | var a = $(hostname) # no quotes necessary
62 | var b = "name is $(hostname)"
63 |
64 | String substitution is shared:
65 |
66 | echo ${MYVAR:-}
67 | var c = ${MYVAR:-}
68 | var d = "var is ${MYVAR:-}"
69 |
70 | Not shared:
71 |
- Unquoted substitution `$foo` isn't available in expression mode. Use either
  bare `foo` or `"$foo"` instead.
74 | - Expression sub `$[1 + 2]` is usually not necessary in expression mode, so it
75 | isn't available. You can use a quoted string like `var x = "$[1 + 2]"`.
76 |
77 | ## Literals for Data Types
78 |
79 | ### String Literals: Like Shell, But Less Confusion About Backslashes
80 |
81 | Oil has 3 kinds of string literal. See the docs in the intro for detail, as
82 | well as the [Strings](strings.html) doc.
83 |
84 | As a detail, Oil disallows this case:
85 |
86 | $ var x = '\n'
87 | var x = '\n'
88 | ^~
89 | [ interactive ]:1: Strings with backslashes should look like r'\n' or $'\n'
90 |
In expression mode, you're forced to specify an explicit `r` or `$` when the
string has backslashes. This is because shell and Python have opposite
defaults: in shell, unadorned strings are raw, while in Python, unadorned
strings respect C escapes.
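
The Python side of that difference can be seen directly. The snippet below is plain Python rather than Oil, and is only meant to illustrate why an unadorned `'\n'` would be ambiguous to someone who knows both languages:

```python
# Plain Python, shown only to illustrate the two conventions.
c_escaped = '\n'   # Python default: C escapes are respected, so this is one newline
raw = r'\n'        # raw string: a backslash followed by the letter n

print(len(c_escaped))  # 1
print(len(raw))        # 2
```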
95 |
96 | ### Float Literals
97 |
- Floating point literals are like C/Python: `1.23e-10`. Except:
  - A digit is required before the `.` for now
  - No `1_000_000.123_456`, because that was hard to implement as a
    hand-written Python regex.
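
As a rough illustration of those two restrictions, here is a Python sketch. The pattern `FLOAT_RE` and the helper `looks_like_float` are hypothetical names, not Oil's actual lexer, and the sketch assumes digits are also required after the `.`:

```python
import re

# Hypothetical pattern for the restricted form described above: digits are
# required before the '.', and '_' separators are not accepted.
FLOAT_RE = re.compile(r'[0-9]+\.[0-9]+(?:[eE][+-]?[0-9]+)?')

def looks_like_float(s):
    """Return True if the whole string matches the restricted float form."""
    return FLOAT_RE.fullmatch(s) is not None

print(looks_like_float('1.23e-10'))           # True
print(looks_like_float('.5'))                 # False
print(looks_like_float('1_000_000.123_456'))  # False

# For comparison, Python's own float() accepts both of the rejected forms.
print(float('.5'))                 # 0.5
print(float('1_000_000.123_456'))  # 1000000.123456
```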
102 |
103 | Those last two caveats about floats are TODOs:
104 | <https://github.com/oilshell/oil/issues/483>
105 |
106 | ### List Type: Both "Array" and List Literals
107 |
108 | There is a single list type, but it has two syntaxes:
109 |
110 | - `:| one two three |` for an "array" of strings. This is equivalent to
111 | `['one', 'two', 'three']`.
112 | - `[1, [2, 'three', {}]]` for arbitrary Python-like "lists".
113 |
114 | Longer example:
115 |
116 | var x = :| a b c |
117 | var x = :|
118 | 'single quoted'
119 | "double quoted $var"
120 | $'c string'
121 | glob/*.py
122 | brace-{a,b,c}-{1..3}
123 | |
124 |
125 | ### Dict Literals Look Like JavaScript
126 |
127 | Dict literals use JavaScript's rules, which are similar but not identical to
128 | Python.
129 |
130 | The key can be either a **bare word** or **bracketed expression**.
131 |
132 | (1) For example, `{age: 30}` means what `{'age': 30}` does in Python. That is,
133 | `age` is **not** the name of a variable. This fits more with the "dict as ad
134 | hoc struct" philosophy.
135 |
136 | (2) In `{[age]: 30}`, `age` is a variable. You can put an arbitrary expression
137 | in there like `{['age'.upper()]: 30}`. (Note: Lua also has this bracketed key
138 | syntax.)
139 |
140 | (3) `{age, key2}` is the same as `{age: age, key2: key2}`. That is, if the
141 | name is a bare word, you can leave off the value, and it will be looked up in
142 | the context where the dictionary is defined.
143 |
144 | This is what ES2015 calls "shorthand object properties":
145 |
146 | - <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer>
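
For readers coming from Python, the three key forms can be lined up against their Python equivalents. This is a sketch in Python, with the corresponding Oil syntax in comments. The variables `age` and `key2` and their values are made up for illustration:

```python
# Python equivalents of the three key forms (Oil syntax in the comments).
age = 'years'
key2 = 'k2'   # hypothetical variable, used only for the shorthand example

bare = {'age': 30}                      # Oil: {age: 30}    bare word is a string key
bracketed = {age: 30}                   # Oil: {[age]: 30}  key is the value of `age`
computed = {'age'.upper(): 30}          # Oil: {['age'.upper()]: 30}
shorthand = {'age': age, 'key2': key2}  # Oil: {age, key2}

print(bare)       # {'age': 30}
print(bracketed)  # {'years': 30}
print(computed)   # {'AGE': 30}
print(shorthand)  # {'age': 'years', 'key2': 'k2'}
```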
147 |
148 | ### Block, Expr
149 |
150 | TODO:
151 |
152 | var myblock = ^(ls | wc -l)
153 | var myexpr = ^[1 + 2]
154 |
155 | ## Operators on Multiple Types
156 |
157 | Like JavaScript, Oil has two types of equality, but uses `===` and `~==` rather
158 | than `===` and `==`.
159 |
160 | ### Exact Equality `=== !==`
161 |
162 | - TODO: types must be the same, so `'42' === 42` is not just false, but it's an
163 | **error**.
164 |
165 | ### Approximate Equality `~==`
166 |
167 | - There's no negative form like `!==`. Use `not (a ~== b)` instead.
168 | - Valid Operand Types:
169 | - LHS: `Str` only
170 | - RHS: `Str`, `Int`, `Bool`
171 |
172 | Examples:
173 |
174 | ' foo ' ~== 'foo' # whitespace stripped on LEFT only
175 | ' 42 ' ~== 42
176 | ' TRue ' ~== true # true, false, 0, 1, and I think T, F
177 |
178 | Currently, there are no semantics for floats, so none of these work:
179 |
180 | ' 42.0 ' ~== 42
181 | ' 42 ' ~== 42.0
182 | 42.0 ~== 42
183 | 42 ~== 42.0
184 |
185 | (Should `float_equals()` be a separate function?)
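
The rules above can be modeled in a few lines of Python. This is a sketch of the described behavior, not Oil's implementation. The function name `approx_equals` is made up, and the set of accepted boolean spellings (`true`, `false`, `0`, `1`, case-insensitive) is an assumption, since the doc itself is unsure about `T` and `F`. A non-numeric string compared against an `Int` is treated as "not equal" here, which may differ from what Oil does.

```python
import re

def approx_equals(left, right):
    """Model of `left ~== right` as described above (not Oil's implementation)."""
    if not isinstance(left, str):
        raise TypeError('LHS must be a Str')
    s = left.strip()  # whitespace is stripped on the LEFT only

    # Check bool before int, because bool is a subclass of int in Python.
    if isinstance(right, bool):
        # Assumption: only these spellings are accepted, case-insensitively.
        table = {'true': True, '1': True, 'false': False, '0': False}
        return table.get(s.lower()) == right
    if isinstance(right, int):
        # Only plain decimal integers are converted.
        return re.fullmatch(r'-?[0-9]+', s) is not None and int(s) == right
    if isinstance(right, str):
        return s == right
    raise TypeError('RHS must be a Str, Int, or Bool')

print(approx_equals(' foo ', 'foo'))   # True
print(approx_equals('foo', ' foo '))   # False: the right side is not stripped
print(approx_equals(' 42 ', 42))       # True
print(approx_equals(' TRue ', True))   # True
```

In this model a float on either side raises `TypeError`, which matches the statement that floats have no semantics yet.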
186 |
187 | ### Function and Method Calls
188 |
189 | var result = add(x, y)
190 | var result = foo(x, named='default')
191 |
192 | if (s.startswith('prefix')) {
193 | echo yes
194 | }
195 |
196 | Use Cases:
197 |
198 | var d = {1: 2, 3: 4}
199 | const k = keys(d)
200 |
201 |
202 | ## Boolean Operators
203 |
204 | ### Logical: `not` `and` `or`
205 |
206 | Like Python.
207 |
208 | ### Ternary
209 |
210 | var cond = true
211 | var x = 'yes' if cond else 'no'
212 |
213 | ## Arithmetic
214 |
215 | <!--
216 | TODO: Should the string to number/integer conversions also handle these cases?
217 |
218 | '1_000' => 1000
219 | '0xff' => 255
220 | '0o010' => 8
221 | '0b0001_0000' => 32
222 |
223 | Right now comparison operators convert decimal strings.
224 | -->
225 |
226 | ### Arithmetic `+ - * /`
227 |
These work like Python's operators, except that they convert strings to
numbers. (Unary `-` does not.) A number is an integer or a float.
230 |
231 | That is:
232 |
233 | - `'1' + '2'` evaluates to `3` because `1 + 2` evaluates to `3`.
234 | - `'1' + '2.5'` evaluates to `3.5` because `1 + 2.5` evaluates to `3.5`.
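
A minimal Python model of this conversion is shown below. The helper names `to_number` and `add` are hypothetical, and the "try integer first, then float" order is an assumption that happens to reproduce the two examples above:

```python
def to_number(x):
    """Convert a string to an int if possible, else a float; pass numbers through."""
    if isinstance(x, str):
        try:
            return int(x)
        except ValueError:
            return float(x)
    return x

def add(a, b):
    """Model of Oil's `+`: convert both operands, then add."""
    return to_number(a) + to_number(b)

print(add('1', '2'))    # 3
print(add('1', '2.5'))  # 3.5
print('1' + '2')        # 12  (plain Python concatenates strings instead)
```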
235 |
236 | ### Arithmetic `// %` and `**`
237 |
238 | Also like Python, but they do string to **integer** conversion.
239 |
- `'9' // '4'` evaluates to `2` because `9 // 4` evaluates to `2`.
241 |
242 | ### Bitwise `~ & | ^ << >>`
243 |
244 | Like Python.
245 |
246 | ## Comparison of Integers and Floats `< <= > >=`
247 |
248 | These operators also do string to number conversion. That is:
249 |
- `'22' < '3'` is false because `22 < 3` is false. (It would be true under
  lexicographical comparison.)
252 | - `'3.1' <= '3.14'` is true because `3.1 <= 3.14` is true.
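
The contrast with lexicographical comparison can be checked in Python, which compares strings character by character. The helpers `to_number`, `less_than` and `less_equal` are hypothetical names that model the conversion described above:

```python
def to_number(x):
    """Convert a string to an int if possible, else a float; pass numbers through."""
    if isinstance(x, str):
        try:
            return int(x)
        except ValueError:
            return float(x)
    return x

def less_than(a, b):
    return to_number(a) < to_number(b)

def less_equal(a, b):
    return to_number(a) <= to_number(b)

print('22' < '3')                 # True: Python compares strings lexicographically
print(less_than('22', '3'))       # False: compared as the numbers 22 and 3
print(less_equal('3.1', '3.14'))  # True
```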
253 |
254 | TODO:
255 |
256 | - Do we have `is` and `is not`? I think it's useful for lists and dicts
257 | - Remove chained comparison? This syntax is directly from Python.
258 | - That is, `x op y op z` is a shortcut for `x op y and y op z`
259 |
260 | ## String Pattern Matching `~` and `~~`
261 |
262 | - Eggex: `~` `!~`
263 | - Similar to bash's `[[ $x =~ $pat ]]`
264 | - Glob: `~~` `!~~`
265 | - Similar to bash's `[[ $x == *.py ]]`
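
For readers more familiar with Python than bash, the two kinds of matching correspond roughly to `re.search` and `fnmatch.fnmatchcase`. This is only an analogy: eggex has its own syntax, and the sketch below uses Python regex syntax instead.

```python
import fnmatch
import re

name = 'build.py'

# Regex search, comparable to bash's [[ $x =~ $pat ]]
print(re.search(r'\.py$', name) is not None)  # True

# Glob match against the whole string, comparable to bash's [[ $x == *.py ]]
print(fnmatch.fnmatchcase(name, '*.py'))      # True
print(fnmatch.fnmatchcase(name, '*.sh'))      # False
```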
266 |
267 | ## String and List Operators
268 |
These are in addition to the pattern matching operators above.
270 |
271 | ### Concatenation with `++`
272 |
273 | s ++ 'suffix'
274 | L ++ [1, 2] ++ :| a b |
275 |
276 | ### Indexing `a[i]`
277 |
278 | var s = 'foo'
279 | var second = s[1] # are these integers though? maybe slicing gives you things of length 1
280 | echo $second # 'o'
281 |
282 | var a = :| spam eggs ham |
283 | var second = a[1]
284 | echo $second # => 'eggs'
285 |
286 | echo $[a[-1]] # => ham
287 |
288 | Semantics are like Python: Out of bounds is an error.
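
Since the semantics are modeled on Python, the Python behavior is a reasonable reference point. The snippet below is plain Python, not Oil:

```python
s = 'foo'
a = ['spam', 'eggs', 'ham']

print(s[1])   # o
print(a[1])   # eggs
print(a[-1])  # ham

try:
    a[3]  # one past the last valid index
except IndexError as e:
    print('error:', e)  # error: list index out of range
```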
289 |
290 | ### Slicing `a[i:j]`
291 |
    var s = 'food'
    var slice = s[1:3]
    echo $slice  # 'oo'

    var a = :| spam eggs ham |
    var slice = a[1:3]
    write -- @slice  # eggs, ham
299 |
300 | Semantics are like Python: Out of bounds is **not** an error.
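
The Python behavior that this refers to looks as follows. Again this is plain Python, shown for reference. Out-of-range endpoints are clamped rather than rejected:

```python
s = 'food'
a = ['spam', 'eggs', 'ham']

print(s[1:3])    # oo
print(a[1:3])    # ['eggs', 'ham']
print(a[1:100])  # ['eggs', 'ham']  (the end is clamped to the list length)
print(a[5:])     # []               (a start past the end gives an empty list)
```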
301 |
302 | ## Dict Operators
303 |
304 | ### Membership with `in`
305 |
306 | - And `not in`
307 | - But strings and arrays use functions?
308 | - .find() ? It's more of an algorithm.
309 |
310 | ### `d->key` is a shortcut for `d['key']`
311 |
312 | > the distinction between attributes and dictionary members always seemed weird
313 | > and unnecessary to me.
314 |
315 | I've been thinking about this for [the Oil
316 | language](http://www.oilshell.org/blog/2019/08/22.html), which is heavily
317 | influenced by Python.
318 |
319 | The problem is that dictionary attributes come from user data, i.e. from JSON,
320 | while methods like `.keys()` come from the interpreter, and Python allows you
321 | to provide user-defined methods like `mydict.mymethod()` too.
322 |
323 | Mixing all of those things in the same namespace seems like a bad idea.
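
The collision is easy to reproduce in Python. The `AttrDict` class below is a hypothetical example written for this doc, not part of any library. It exposes keys as attributes, and a key named `keys` from user data then becomes unreachable through attribute syntax:

```python
class AttrDict(dict):
    """Hypothetical dict that also exposes its keys as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so built-in
        # methods such as keys() always win over user data.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

d = AttrDict({'mykey': 1, 'keys': 'user data'})

print(d.mykey)           # 1
print(d['keys'])         # user data
print(callable(d.keys))  # True: the built-in method shadows the user's key
```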
324 |
In Oil I might introduce an `->` operator, so `d->mykey` is a shortcut for
`d['mykey']`.
327 |
328 | ```
329 | d.keys(), d.values(), d.items() # methods
330 | d->mykey
331 | d['mykey']
332 | ```
333 |
Maybe you could disallow user-defined attributes on dictionaries, and make the
built-in methods free functions:
336 |
337 | ```
338 | keys(d), values(d), items(d)
339 | d.mykey # The whole namespace is available for users
340 | ```
341 |
342 | However I don't like that this makes dictionaries a special case. Thoughts?
343 |
344 | ## Deferred
345 |
346 | ### List and Dict Comprehensions
347 |
348 | List comprehensions might be useful for a "faster" for loop? It only does
349 | expressions?
350 |
351 | ### Splat `*` and `**`
352 |
353 | Python allows splatting into lists:
354 |
355 | a = [1, 2]
356 | b = [*a, 3]
357 |
358 | And dicts:
359 |
    d = {'name': 'alice'}
    d2 = {**d, 'age': 42}
362 |
363 | ### Ranges `1:n` (vs slices)
364 |
365 | Deferred because you can use
366 |
367 | for i in @(seq $n) {
368 | echo $i
369 | }
370 |
371 | This gives you strings but that's OK for now. We don't yet have a "fast" for
372 | loop.
373 |
374 | Notes:
375 |
376 | - Oil slices don't have a "step" argument. Justification:
377 | - R only has `start:end`, it doesn't have `start:end:step`
378 | - Julia has `start:step:end`!
379 | - I don't think the **step** is so useful that it has to be first class
380 | syntax. In other words, Python's syntax is optimized for a rare case --
381 | e.g. `a[::2]`.
382 | - Python has slices, but it doesn't have a range syntax. You have to write
383 | `range(0, n)`.
384 | - A syntactic difference between slices and ranges: slice endpoints can be
385 | **implicit**, like `a[:n]` and `a[3:]`.
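
The Python forms mentioned in these notes are shown below for reference. This is plain Python, and the values of `a` and `n` are made up for illustration:

```python
a = ['a', 'b', 'c', 'd', 'e']
n = 3

print(a[::2])             # ['a', 'c', 'e']  (the step form, which Oil slices omit)
print(list(range(0, n)))  # [0, 1, 2]        (a function call, not range syntax)
print(a[:n])              # ['a', 'b', 'c']  (implicit start)
print(a[3:])              # ['d', 'e']       (implicit end)
```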
386 |
387 | ## Appendices
388 |
389 | ### Oil vs. Tea
390 |
391 | - Tea: truthiness of `Str*` is a problem. Nul, etc.
392 | - `if (mystr)` vs `if (len(mystr))`
393 | - though I think strings should be non-nullable value types? They are
394 | slices.
395 | - they start off as the empty slice
396 | - Automatic conversions of strings to numbers
397 | - `42` and `3.14` and `1e100`
398 |
399 | ### Implementation Notes
400 |
401 | - Limitation:
402 | - Start with Str, StrArray, and AssocArray data model
403 | - Then add int, float, bool, null (for JSON)
404 | - Then add fully recursive data model (depends on FC)
405 | - `value = ... | dict[str, value]`
406 |