| 1 | ---
 | 
| 2 | in_progress: yes
 | 
| 3 | css_files: ../../web/base.css ../../web/manual.css ../../web/toc.css
 | 
| 4 | ---
 | 
| 5 | 
 | 
| 6 | Oil's Expression Language: A Mix of Python and JavaScript
 | 
| 7 | =========================================================
 | 
| 8 | 
 | 
| 9 | Recall that Oil is composed of three interleaved languages:
 | 
| 10 | [words](word-language.html), [commands](command-language.html), and
 | 
| 11 | **expressions**.
 | 
| 12 | 
 | 
| 13 | This doc describes expressions, but only the things that are **not** in:
 | 
| 14 | 
 | 
| 15 | - [A Tour of the Oil Language](oil-language-tour.html).  The best intro.
 | 
| 16 | - The `#expr-lang` section of [Oil Help
 | 
| 17 |   Topics](oil-help-topics.html#expr-lang).  A reference.
 | 
| 18 | - [Egg Expressions](eggex.html).  A "sublanguage" of this language.
 | 
| 19 | 
 | 
| 20 | TODO: This doc should have example shell sessions, like the tour does.
 | 
| 21 | 
 | 
| 22 | <div id="toc">
 | 
| 23 | </div>
 | 
| 24 | 
 | 
| 25 | ## Preliminaries
 | 
| 26 | 
 | 
| 27 | ### Comparison to Python and JavaScript
 | 
| 28 | 
 | 
| 29 | For a short summary, see [Oil vs. Python](oil-vs-python.html).  
 | 
| 30 | 
 | 
| 31 | ### Constructs Shared Between Word and Expression Languages
 | 
| 32 | 
 | 
| 33 | String literals can be used in both words and expressions:
 | 
| 34 | 
 | 
| 35 |     echo 'foo'
 | 
| 36 |     var x = 'foo'
 | 
| 37 | 
 | 
| 38 |     echo "hello $name"
 | 
| 39 |     var y = "hello $name"
 | 
| 40 | 
 | 
| 41 |     echo $'\t TAB'
 | 
| 42 |     var z = $'\t TAB'
 | 
| 43 | 
 | 
| 44 | This includes multi-line string literals:
 | 
| 45 | 
 | 
| 46 |     echo '''
 | 
| 47 |     hello 
 | 
| 48 |     world
 | 
| 49 |     '''
 | 
| 50 | 
 | 
| 51 |     var x = '''
 | 
| 52 |     hello
 | 
| 53 |     world
 | 
| 54 |     '''
 | 
| 55 | 
 | 
| 56 |     # (and the 2 other kinds)
 | 
| 57 | 
 | 
| 58 | Command substitution is shared:
 | 
| 59 | 
 | 
| 60 |     echo $(hostname)
 | 
| 61 |     var a = $(hostname)  # no quotes necessary
 | 
| 62 |     var b = "name is $(hostname)"
 | 
| 63 | 
 | 
| 64 | String substitution is shared:
 | 
| 65 | 
 | 
| 66 |     echo ${MYVAR:-}
 | 
| 67 |     var c = ${MYVAR:-}
 | 
| 68 |     var d = "var is ${MYVAR:-}"
 | 
| 69 | 
 | 
| 70 | Not shared:
 | 
| 71 | 
 | 
| 72 | - Unquoted substitution `$foo` isn't available in expression mode.  (It should
 | 
| 73 |   be either bare `foo` or `"$foo"`.)
 | 
| 74 | - Expression sub `$[1 + 2]` is usually not necessary in expression mode, so it
 | 
| 75 |   isn't available.  You can use a quoted string like `var x = "$[1 + 2]"`.
 | 
| 76 | 
 | 
| 77 | ## Literals for Data Types
 | 
| 78 | 
 | 
| 79 | ### String Literals: Like Shell, But Less Confusion About Backslashes
 | 
| 80 | 
 | 
| 81 | Oil has 3 kinds of string literal.  See the docs in the intro for detail, as
 | 
| 82 | well as the [Strings](strings.html) doc.
 | 
| 83 | 
 | 
| 84 | As a detail, Oil disallows this case:
 | 
| 85 | 
 | 
| 86 |     $ var x = '\n'
 | 
| 87 |       var x = '\n'
 | 
| 88 |                ^~
 | 
| 89 |     [ interactive ]:1: Strings with backslashes should look like r'\n' or $'\n'
 | 
| 90 | 
 | 
| 91 | In expression mode, you're forced to specify an explicit `r` or `$` when the
 | 
| 92 | string has backslashes.  This is because shell has the opposite default from
 | 
| 93 | Python: In shell, unadorned strings are raw.  In Python, unadorned strings
 | 
| 94 | respect C escapes.
 | 
| 95 | 
 | 
| 96 | ### Float Literals
 | 
| 97 | 
 | 
| 98 | - Floating point literals are also like C/Python: `1.23e-10`.  Except:
 | 
| 99 |   - A number is required before the `.` now
 | 
| 100 |   - No `1_000_000.123_456` because that was hard to implement as a hand-written
 | 
| 101 |     Python regex.
 | 
| 102 | 
 | 
| 103 | Those last two caveats about floats are TODOs:
 | 
| 104 | <https://github.com/oilshell/oil/issues/483>
 | 
| 105 | 
 | 
| 106 | ### List Type: Both "Array" and List Literals
 | 
| 107 | 
 | 
| 108 | There is a single list type, but it has two syntaxes:
 | 
| 109 | 
 | 
| 110 | - `:| one two three |` for an "array" of strings.  This is equivalent to
 | 
| 111 |   `['one', 'two', 'three']`.
 | 
| 112 | - `[1, [2, 'three', {}]]` for arbitrary Python-like "lists".
 | 
| 113 | 
 | 
| 114 | Longer example:
 | 
| 115 | 
 | 
| 116 |     var x = :| a b c |
 | 
| 117 |     var x = :|
 | 
| 118 |       'single quoted'
 | 
| 119 |       "double quoted $var"
 | 
| 120 |       $'c string'
 | 
| 121 |       glob/*.py
 | 
| 122 |       brace-{a,b,c}-{1..3}
 | 
| 123 |     |
 | 
| 124 | 
 | 
| 125 | ### Dict Literals Look Like JavaScript
 | 
| 126 | 
 | 
| 127 | Dict literals use JavaScript's rules, which are similar but not identical to
 | 
| 128 | Python.
 | 
| 129 | 
 | 
| 130 | The key can be either a **bare word** or **bracketed expression**.
 | 
| 131 | 
 | 
| 132 | (1) For example, `{age: 30}` means what `{'age': 30}` does in Python.  That is,
 | 
| 133 | `age` is **not** the name of a variable.  This fits more with the "dict as ad
 | 
| 134 | hoc struct" philosophy.
 | 
| 135 | 
 | 
| 136 | (2) In `{[age]: 30}`, `age` is a variable.  You can put an arbitrary expression
 | 
| 137 | in there like `{['age'.upper()]: 30}`.  (Note: Lua also has this bracketed key
 | 
| 138 | syntax.)
 | 
| 139 | 
 | 
| 140 | (3) `{age, key2}` is the same as `{age: age, key2: key2}`.  That is, if the
 | 
| 141 | name is a bare word, you can leave off the value, and it will be looked up in
 | 
| 142 | the context where the dictionary is defined. 
 | 
| 143 | 
 | 
| 144 | This is what ES2015 calls "shorthand object properties":
 | 
| 145 | 
 | 
| 146 | - <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer>
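For readers coming from Python, the three key forms above correspond roughly to the following. This is a Python sketch for comparison only, not Oil code, and the variable names and values are illustrative.

```python
age = 30
key2 = 'x'

# (1) Oil's {age: 30}: the bare word is a string key, like Python's {'age': 30}
d1 = {'age': 30}

# (2) Oil's {[age]: 30}: the key is an expression, like Python's {age: 30}
d2 = {age: 30}              # the key is the int 30, the value of `age`
d3 = {'age'.upper(): 30}    # the key is the string 'AGE'

# (3) Oil's {age, key2}: each name is both the key and the variable looked up
d4 = {'age': age, 'key2': key2}
```

Python has no direct equivalent of form (3), so the sketch spells the keys out by hand.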
 | 
| 147 | 
 | 
| 148 | ### Block, Expr
 | 
| 149 | 
 | 
| 150 | TODO:
 | 
| 151 | 
 | 
| 152 |     var myblock = ^(ls | wc -l)  
 | 
| 153 |     var myexpr = ^[1 + 2]
 | 
| 154 | 
 | 
| 155 | ## Operators on Multiple Types
 | 
| 156 | 
 | 
| 157 | Like JavaScript, Oil has two types of equality, but uses `===` and `~==` rather
 | 
| 158 | than `===` and `==`.
 | 
| 159 | 
 | 
| 160 | ### Exact Equality `=== !==`
 | 
| 161 | 
 | 
| 162 | - TODO: types must be the same, so `'42' === 42` is not just false, but it's an
 | 
| 163 |   **error**.
 | 
| 164 | 
 | 
| 165 | ### Approximate Equality `~==`
 | 
| 166 | 
 | 
| 167 | - There's no negative form like `!==`.  Use `not (a ~== b)` instead.
 | 
| 168 | - Valid Operand Types:
 | 
| 169 |   - LHS: `Str` only
 | 
| 170 |   - RHS: `Str`, `Int`, `Bool`
 | 
| 171 | 
 | 
| 172 | Examples:
 | 
| 173 | 
 | 
| 174 |     ' foo ' ~== 'foo'  # whitespace stripped on LEFT only
 | 
| 175 |     ' 42 ' ~== 42
 | 
| 176 |     ' TRue ' ~== true  # true, false, 0, 1, and I think T, F
 | 
| 177 | 
 | 
| 178 | Currently, there are no semantics for floats, so none of these work:
 | 
| 179 | 
 | 
| 180 |     ' 42.0 ' ~== 42
 | 
| 181 |     ' 42 ' ~== 42.0
 | 
| 182 |     42.0 ~== 42
 | 
| 183 |     42 ~== 42.0
 | 
| 184 | 
 | 
| 185 | (Should `float_equals()` be a separate function?)
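The operand rules above can be modeled in a few lines of Python. This is only a sketch of the semantics as described in this section, not Oil's implementation: the function name `approx_equal` is hypothetical, and the exact set of accepted boolean spellings is an assumption.

```python
def approx_equal(left, right):
    """Hypothetical model of `left ~== right`; not Oil's implementation."""
    if not isinstance(left, str):
        raise TypeError('LHS of ~== must be a Str')
    s = left.strip()  # whitespace is stripped on the LEFT only

    # bool must be checked before int, because bool is a subclass of int
    if isinstance(right, bool):
        lowered = s.lower()
        if lowered in ('true', '1'):    # assumption: accepted spellings
            return right is True
        if lowered in ('false', '0'):   # assumption: accepted spellings
            return right is False
        return False
    if isinstance(right, int):
        try:
            return int(s) == right
        except ValueError:
            return False
    if isinstance(right, str):
        return s == right
    raise TypeError('RHS of ~== must be a Str, Int, or Bool')
```

With this model, `approx_equal(' 42 ', 42)` is true, while `approx_equal('foo', ' foo ')` is false because the right operand is never stripped. Floats are deliberately left out, matching the note above.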
 | 
| 186 | 
 | 
| 187 | ### Function and Method Calls
 | 
| 188 | 
 | 
| 189 |     var result = add(x, y)
 | 
| 190 |     var result = foo(x, named='default')
 | 
| 191 | 
 | 
| 192 |     if (s.startswith('prefix')) {
 | 
| 193 |       echo yes
 | 
| 194 |     }
 | 
| 195 | 
 | 
| 196 | Use Cases:
 | 
| 197 | 
 | 
| 198 |     var d = {1: 2, 3: 4}
 | 
| 199 |     const k = keys(d)
 | 
| 200 | 
 | 
| 201 | 
 | 
| 202 | ## Boolean Operators
 | 
| 203 | 
 | 
| 204 | ### Logical: `not` `and` `or`
 | 
| 205 | 
 | 
| 206 | Like Python.
 | 
| 207 | 
 | 
| 208 | ### Ternary
 | 
| 209 | 
 | 
| 210 |     var cond = true
 | 
| 211 |     var x = 'yes' if cond else 'no'
 | 
| 212 | 
 | 
| 213 | ## Arithmetic
 | 
| 214 | 
 | 
| 215 | <!--
 | 
| 216 | TODO: Should the string to number/integer conversions also handle these cases?
 | 
| 217 | 
 | 
| 218 |     '1_000' => 1000   
 | 
| 219 |     '0xff' => 255
 | 
| 220 |     '0o010' => 8
 | 
| 221 |     '0b0001_0000' => 16
 | 
| 222 | 
 | 
| 223 | Right now comparison operators convert decimal strings.
 | 
| 224 | -->
 | 
| 225 | 
 | 
| 226 | ### Arithmetic `+ - * /`
 | 
| 227 | 
 | 
| 228 | These are like Python, but they convert strings to numbers (unary `-` does
 | 
| 229 | not).  A number is an integer or float.
 | 
| 230 | 
 | 
| 231 | That is:
 | 
| 232 | 
 | 
| 233 | - `'1' + '2'` evaluates to `3` because `1 + 2` evaluates to `3`.
 | 
| 234 | - `'1' + '2.5'` evaluates to `3.5` because `1 + 2.5` evaluates to `3.5`.
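One way to picture the conversion is a helper that tries an integer first and falls back to a float. The sketch below is a Python model of the two examples above; `to_number` and `add` are hypothetical names, not Oil functions.

```python
def to_number(x):
    """Convert a string to an int if possible, else to a float."""
    if isinstance(x, (int, float)):
        return x
    try:
        return int(x)
    except ValueError:
        return float(x)


def add(a, b):
    """Model of Oil's `+`: convert both operands, then add as in Python."""
    return to_number(a) + to_number(b)
```

Under this model, `add('1', '2')` gives the integer `3` and `add('1', '2.5')` gives the float `3.5`.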
 | 
| 235 | 
 | 
| 236 | ### Arithmetic `// %` and `**`
 | 
| 237 | 
 | 
| 238 | Also like Python, but they do string to **integer** conversion.
 | 
| 239 | 
 | 
| 240 | - `'9' // '4'` evaluates to `2` because `9 // 4` evaluates to `2`.
 | 
| 241 | 
 | 
| 242 | ### Bitwise `~ & | ^ << >>`
 | 
| 243 | 
 | 
| 244 | Like Python.
 | 
| 245 | 
 | 
| 246 | ## Comparison of Integers and Floats `< <= > >=`
 | 
| 247 | 
 | 
| 248 | These operators also do string to number conversion.  That is:
 | 
| 249 | 
 | 
| 250 | - `'22' < '3'` is false because `22 < 3` is false.  (It would be true under
 | 
| 251 |   lexicographical comparison.)
 | 
| 252 | - `'3.1' <= '3.14'` is true because `3.1 <= 3.14` is true.
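For contrast, plain Python compares strings lexicographically and does not convert them. The snippet below shows the difference between the two kinds of comparison; it is Python, not Oil, and the variable names are illustrative.

```python
# Python compares strings character by character, so '2' < '3' decides it
lexicographic = '22' < '3'

# After converting to numbers, the comparison matches the Oil examples above
numeric = int('22') < int('3')
float_cmp = float('3.1') <= float('3.14')
```

Here `lexicographic` is true, while `numeric` is false and `float_cmp` is true.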
 | 
| 253 | 
 | 
| 254 | TODO:
 | 
| 255 | 
 | 
| 256 | - Do we have `is` and `is not`?  I think it's useful for lists and dicts
 | 
| 257 | - Remove chained comparison?  This syntax is directly from Python.
 | 
| 258 |   - That is, `x op y op z` is a shortcut for `x op y and y op z`
 | 
| 259 | 
 | 
| 260 | ## String Pattern Matching `~` and `~~`
 | 
| 261 | 
 | 
| 262 | - Eggex: `~` `!~` 
 | 
| 263 |   - Similar to bash's `[[ $x =~ $pat ]]`
 | 
| 264 | - Glob: `~~` `!~~`
 | 
| 265 |   - Similar to bash's `[[ $x == *.py ]]`
 | 
| 266 | 
 | 
| 267 | ## String and List Operators
 | 
| 268 | 
 | 
| 269 | In addition to pattern matching.
 | 
| 270 | 
 | 
| 271 | ### Concatenation with `++`
 | 
| 272 | 
 | 
| 273 |     s ++ 'suffix'
 | 
| 274 |     L ++ [1, 2] ++ :| a b |
 | 
| 275 | 
 | 
| 276 | ### Indexing `a[i]`
 | 
| 277 | 
 | 
| 278 |     var s = 'foo'
 | 
| 279 |     var second = s[1]    # are these integers though?  maybe slicing gives you things of length 1
 | 
| 280 |     echo $second  # 'o'
 | 
| 281 | 
 | 
| 282 |     var a = :| spam eggs ham |
 | 
| 283 |     var second = a[1]
 | 
| 284 |     echo $second  # => 'eggs'
 | 
| 285 | 
 | 
| 286 |     echo $[a[-1]]  # => ham
 | 
| 287 | 
 | 
| 288 | Semantics are like Python:  Out of bounds is an error.
 | 
| 289 | 
 | 
| 290 | ### Slicing `a[i:j]`
 | 
| 291 | 
 | 
| 292 |     var s = 'food'
 | 
| 293 |     var slice = s[1:3]
 | 
| 294 |     echo $slice  # 'oo'
 | 
| 295 | 
 | 
| 296 |     var a = :| spam eggs ham |
 | 
| 297 |     var slice = a[1:3]
 | 
| 298 |     write -- @slice  # eggs, ham
 | 
| 299 | 
 | 
| 300 | Semantics are like Python:  Out of bounds is **not** an error.
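Since both indexing and slicing follow Python, the difference in out-of-bounds behavior can be checked directly in Python. This snippet is Python rather than Oil, and it reuses the example values from this section.

```python
a = ['spam', 'eggs', 'ham']

second = a[1]         # 'eggs'
last = a[-1]          # 'ham': negative indices count from the end
clamped = a[1:100]    # the end is clamped to the list length, no error
empty = a[5:10]       # a slice entirely out of bounds is just empty
oo = 'food'[1:3]      # strings slice the same way

# Indexing out of bounds raises instead of clamping
try:
    a[3]
    index_failed = False
except IndexError:
    index_failed = True
```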
 | 
| 301 | 
 | 
| 302 | ## Dict Operators
 | 
| 303 | 
 | 
| 304 | ### Membership with `in`
 | 
| 305 | 
 | 
| 306 | - And `not in`
 | 
| 307 | - But strings and arrays use functions?
 | 
| 308 |   - .find() ?  It's more of an algorithm.
 | 
| 309 | 
 | 
| 310 | ### `d->key` is a shortcut for `d['key']`
 | 
| 311 | 
 | 
| 312 | > the distinction between attributes and dictionary members always seemed weird
 | 
| 313 | > and unnecessary to me.
 | 
| 314 | 
 | 
| 315 | I've been thinking about this for [the Oil
 | 
| 316 | language](http://www.oilshell.org/blog/2019/08/22.html), which is heavily
 | 
| 317 | influenced by Python.
 | 
| 318 | 
 | 
| 319 | The problem is that dictionary attributes come from user data, i.e. from JSON,
 | 
| 320 | while methods like `.keys()` come from the interpreter, and Python allows you
 | 
| 321 | to provide user-defined methods like `mydict.mymethod()` too.
 | 
| 322 | 
 | 
| 323 | Mixing all of those things in the same namespace seems like a bad idea.
 | 
| 324 | 
 | 
| 325 | In Oil I might introduce an `->` operator, so `d->mykey` is a shortcut for
 | 
| 326 | `d['mykey']`.
 | 
| 327 | 
 | 
| 328 | ```
 | 
| 329 | d.keys(), d.values(), d.items()  # methods
 | 
| 330 | d->mykey
 | 
| 331 | d['mykey']
 | 
| 332 | ```
 | 
| 333 | 
 | 
| 334 | Maybe you could disallow user-defined attributes on dictionaries, and make them
 | 
| 335 | free:
 | 
| 336 | 
 | 
| 337 | ```
 | 
| 338 | keys(d), values(d), items(d)
 | 
| 339 | d.mykey  # The whole namespace is available for users
 | 
| 340 | ```
 | 
| 341 | 
 | 
| 342 | However I don't like that this makes dictionaries a special case.  Thoughts?
 | 
| 343 | 
 | 
| 344 | ## Deferred
 | 
| 345 | 
 | 
| 346 | ### List and Dict Comprehensions
 | 
| 347 | 
 | 
| 348 | List comprehensions might be useful for a "faster" for loop?  It only does
 | 
| 349 | expressions?
 | 
| 350 | 
 | 
| 351 | ### Splat `*` and `**`
 | 
| 352 | 
 | 
| 353 | Python allows splatting into lists:
 | 
| 354 | 
 | 
| 355 |     a = [1, 2] 
 | 
| 356 |     b = [*a, 3]
 | 
| 357 | 
 | 
| 358 | And dicts:
 | 
| 359 | 
 | 
| 360 |     d = {'name': 'alice'}
 | 
| 361 |     d2 = {**d, 'age': 42}
 | 
| 362 | 
 | 
| 363 | ### Ranges `1:n` (vs slices)
 | 
| 364 | 
 | 
| 365 | Deferred because you can use:
 | 
| 366 | 
 | 
| 367 |     for i in @(seq $n) {
 | 
| 368 |       echo $i
 | 
| 369 |     }
 | 
| 370 | 
 | 
| 371 | This gives you strings but that's OK for now.  We don't yet have a "fast" for
 | 
| 372 | loop.
 | 
| 373 | 
 | 
| 374 | Notes:
 | 
| 375 | 
 | 
| 376 | - Oil slices don't have a "step" argument.  Justification:
 | 
| 377 |   - R only has `start:end`, it doesn't have `start:end:step`
 | 
| 378 |   - Julia has `start:step:end`!
 | 
| 379 |   - I don't think the **step** is so useful that it has to be first class
 | 
| 380 |     syntax.  In other words, Python's syntax is optimized for a rare case --
 | 
| 381 |     e.g. `a[::2]`.
 | 
| 382 | - Python has slices, but it doesn't have a range syntax.  You have to write
 | 
| 383 |   `range(0, n)`. 
 | 
| 384 | - A syntactic difference between slices and ranges: slice endpoints can be
 | 
| 385 |   **implicit**, like `a[:n]` and `a[3:]`.
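The Python side of these notes looks like the following. It is a small Python illustration of the points above, not Oil code, and `n` and `a` are illustrative names.

```python
n = 5
a = list(range(0, n))   # Python has no range literal, only the range() call

head = a[:3]    # implicit start
tail = a[3:]    # implicit end
evens = a[::2]  # the "step" form, which Oil slices leave out
```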
 | 
| 386 | 
 | 
| 387 | ## Appendices
 | 
| 388 | 
 | 
| 389 | ### Oil vs. Tea
 | 
| 390 | 
 | 
| 391 | - Tea: truthiness of `Str*` is a problem.  Nul, etc.
 | 
| 392 |   - `if (mystr)` vs `if (len(mystr))`
 | 
| 393 |   - though I think strings should be non-nullable value types?  They are
 | 
| 394 |     slices.
 | 
| 395 |   - they start off as the empty slice
 | 
| 396 | - Automatic conversions of strings to numbers
 | 
| 397 |   - `42` and `3.14` and `1e100`
 | 
| 398 | 
 | 
| 399 | ### Implementation Notes
 | 
| 400 | 
 | 
| 401 | - Limitation:
 | 
| 402 |   - Start with Str, StrArray, and AssocArray data model
 | 
| 403 |   - Then add int, float, bool, null (for JSON)
 | 
| 404 |   - Then add fully recursive data model (depends on FC)
 | 
| 405 |     - `value = ... | dict[str, value]`
 | 
| 406 | 
 |