---
in_progress: yes
default_highlighter: oils-sh
css_files: ../../web/base.css ../../web/manual.css ../../web/toc.css
---

Word Language
=============

Recall that Oil is composed of three interleaved languages: **words**,
[commands](command-language.html), and [expressions](expression-language.html).

This doc describes words, but only the things that are **not** in:

- [A Tour of the Oil Language](oil-language-tour.html)
- The `#word-lang` section of [OSH Help
  Topics](osh-help-topics.html#word-lang)
- The `#word-lang` section of [Oil Help
  Topics](oil-help-topics.html#word-lang)

<div id="toc">
</div>

## What's a Word?

A word is an expression like `$x`, `"hello $name"`, or `{build,test}/*.py`.  It
evaluates to a string or an array of strings.

Generally speaking, Oil behaves like a simpler version of POSIX shell / bash.
Sophisticated users can read [Simple Word Evaluation](simple-word-eval.html)
for a comparison.

## Contexts Where Words Are Used

### Words Are Part of Expressions and Commands

Part of an expression:

    var x = ${y:-'default'}

Part of a command:

    echo ${y:-'default'}

### Word Sequences: in for loops and array literals

The three contexts where splitting and globbing apply are the ones where a
**sequence** of words is evaluated (`EvalWordSequence`):

1. [Command]($help:simple-command): `echo $x foo`
2. [For loop]($help:for): `for i in $x foo; do ...`
3. [Array Literals]($help:array): `a=($x foo)` and `var a = :| $x foo |` ([oil-array]($help))

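As a minimal sketch of the first two contexts, the POSIX-compatible snippet
below counts the words produced by the same sequence `$x foo` in a command and
in a `for` loop.  The helper names `count_args` and `count_loop_words` are
hypothetical, and the default `$IFS` is assumed.  Both contexts should report
the same number of words, because the same splitting rules apply.  The array
literal context is left out because it isn't POSIX.

```sh
# Hypothetical helper: print how many arguments it received.
count_args() {
  echo "$#"
}

# Hypothetical helper: count the iterations of a for loop over the same
# word sequence.
count_loop_words() {
  x='a b'
  n=0
  for i in $x foo; do  # $x is unquoted, so it is split on $IFS
    n=$((n + 1))
  done
  echo "$n"
}

x='a b'
count_args $x foo    # command context: $x is split into 'a' and 'b'
count_loop_words     # for loop context: the same splitting applies
```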
### Oil vs. Bash Array Literals

Oil has a new array syntax, but it also supports the bash-compatible syntax:

```
local myarray=(one two *.py)  # bash

var myarray = :| one two *.py |  # Oil style
```

### Oil Discourages Context-Sensitive Evaluation

Shell also has contexts where it evaluates words to a **single string**, rather
than a sequence, like:

```sh
# RHS of Assignment
x="${not_array[@]}"
x=*.py  # not a glob

# Redirect Arg
echo foo > "${not_array[@]}"
echo foo > *.py  # not a glob

# Case variables and patterns
case "${not_array1[@]}" in 
  "${not_array2[@]}")
    echo oops
    ;;
esac

case *.sh in   # not a glob
  *.py)        # a string pattern, not a file system glob
    echo oops
    ;;
esac
```

The behavior of these snippets diverges a lot in existing shells.  That is,
shells are buggy and poorly-specified.

Oil disallows most of them.  Arrays are considered separate from strings and
don't randomly "decay".

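For the plain-string cases, the single-string rule can be observed in a POSIX
shell.  The sketch below is an illustration of shell behavior rather than a
statement about Oil.  The variable names and the `classify` helper are made up,
and the default `$IFS` is assumed.  The array cases are deliberately left out,
since those are the ones where shells disagree.

```sh
x='a  b'     # two spaces between the words

# RHS of assignment: the value is neither split nor globbed.
y=$x
pattern=*.py

echo "$y"        # the two spaces survive
echo "$pattern"  # still the literal string *.py, even if .py files exist

# Case: *.py is a string pattern matched against the word, not a file
# system glob.
classify() {
  case $1 in
    *.py) echo python ;;
    *)    echo other ;;
  esac
}

classify build.py   # matches the pattern whether or not build.py exists
```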
Related: the RHS of an Oil assignment is an expression, which can be of any
type, including an array:

```
var parts = split(x)       # returns an array
var python = glob('*.py')  # ditto

var s = join(parts)        # returns a string
```

## Sigils

This is a recap of [A Feel for Oil's Syntax](syntax-feelings.html).

### `$` Means "Returns One String"

Examples:

- All substitutions: var, command, arith
  - TODO: Do we have `$[a[x+1]]` as an expression substitution?
  - Or `$[ /pat+ /]`?
  - I don't think so.

- Inline function calls, an Oil extension: `$[join(myarray)]`

(C-style strings like `$'\n'` use `$`, but that's more of a bash anachronism.
In Oil, `c'\n'` is preferred.)

### `@` Means "Returns An Array of Strings"

Enabled with `shopt -s parse_at`.

Examples:

- `@myarray`
- `@[arrayfunc(x, y)]`

These are both Oil extensions.

Array literals also evaluate to an array of strings, but they are written with
`:| |` rather than `@`:

```
var myarray = :| 1 2 3 |
```

## OSH Features

### Word Splitting and Empty String Elision

OSH uses POSIX behavior for unquoted substitutions like `$x`.

- The string value is split into args with `$IFS`.
- If the string value is empty, no args are produced.

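Both rules can be checked with a short POSIX sketch.  The `count_args` helper
is hypothetical and simply reports how many arguments it received; the default
`$IFS` (space, tab, newline) is assumed.  Quoting the substitution turns off
both splitting and elision.

```sh
# Hypothetical helper: print how many arguments it received.
count_args() {
  echo "$#"
}

x='a b  c'
empty=''

count_args $x         # unquoted: split on $IFS into 3 args
count_args "$x"       # quoted: 1 arg
count_args $empty     # unquoted and empty: elided, 0 args
count_args "$empty"   # quoted and empty: 1 arg
```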
### Implicit Joining

Shell has odd "joining" semantics, which are supported in Oil but generally
discouraged:

    set -- 'a b' 'c d'
    argv.py X"$@"X  # => ['Xa b', 'c dX']

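The joining rule can be reproduced in a POSIX shell without `argv.py`.  In this
sketch, `show_args` is a hypothetical stand-in that prints each argument in
brackets, and `joining_demo` is a made-up wrapper.  With quoted `"$@"`, the
prefix attaches to the first element and the suffix to the last, and each
element keeps its internal space.  With unquoted `$*`, the elements are also
split on `$IFS` (assumed to be the default).

```sh
# Hypothetical helper: print each argument in brackets, all on one line.
show_args() {
  for arg in "$@"; do
    printf '[%s]' "$arg"
  done
  echo
}

joining_demo() {
  set -- 'a b' 'c d'
  show_args X"$@"X   # quoted:   [Xa b][c dX]
  show_args X$*X     # unquoted: [Xa][b][c][dX]
}

joining_demo
```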
In Oil, the RHS of an assignment is an expression, and joining only occurs
within double quotes:

    # Oil
    var joined = $x$y    # parse error
    var joined = "$x$y"  # OK

    # Shell
    joined=$x$y          # OK
    joined="$x$y"        # OK

<a name="extended-glob"></a>
### Extended Globs

Extended globs in OSH are a "legacy syntax" modelled after the behavior of
`bash` and `mksh`.  This feature adds alternation, repetition, and negation to
globs, giving them the power of regexes.

You can use them to match strings:

    $ [[ foo.cc == *.@(cc|h) ]] && echo 'matches'  # => matches

Or produce lists of filename arguments:

    $ touch foo.cc foo.h
    $ echo *.@(cc|h)  # => foo.cc foo.h

There are some limitations and differences:

- Extended globs are supported only when Oil is built with GNU libc.
  - GNU libc has the `FNM_EXTMATCH` extension to `fnmatch()`.  Unlike bash and
    mksh, Oil doesn't implement its own extended glob matcher.
- They're more **static**, like in `mksh`.  When an extended glob appears in a
  word, we evaluate the word, match filenames, and **skip** the rest of the
  word evaluation pipeline.  This means:
  - Automatic word splitting is skipped in something like
    `$unquoted/@(*.cc|h)`.
  - You can't use arrays like `"$@"` and extended globs in the same word, e.g.
    `"$@"_*.@(cc|h)`.  This is usually nonsensical anyway.
- OSH only accepts them in **contexts** that make sense.
  - For example, `echo foo > @(cc|h)` is a runtime error in OSH, but other
    shells will write a file literally named `@(cc|h)`.
  - OSH doesn't accept `${undef:-@(cc)}`.  But it does accept `${x%@(cc)}`,
    since string strip operators like `%` accept a glob.
- Extended globbing is always on in OSH, regardless of `shopt -s extglob`.
  - Trivia: `bash` can't parse some extended globs unless `extglob` is on.  But
    it parses others when it's off.
- Extended globs can't be used in the `PATTERN` in `${x//PATTERN/replace}`.
  This is because we only translate normal (non-extended) globs to regexes (in
  order to get the position information necessary for string replacement).
- They're not supported when `shopt --set simple_word_eval` is on (Oil word
  evaluation).
  - For similar reasons, they're also not supported in assignment builtins.
    (This is a good thing!)