---
in_progress: yes
default_highlighter: oils-sh
css_files: ../../web/base.css ../../web/manual.css ../../web/toc.css
---

Word Language
=============

Recall that Oil is composed of three interleaved languages: **words**,
[commands](command-language.html), and [expressions](expression-language.html).

This doc describes words, but only the things that are **not** in:

- [A Tour of the Oil Language](oil-language-tour.html)
- The `#word-lang` section of [OSH Help
  Topics](osh-help-topics.html#word-lang)
- The `#word-lang` section of [Oil Help
  Topics](oil-help-topics.html#word-lang)

<div id="toc">
</div>

## What's a Word?

A word is an expression like `$x`, `"hello $name"`, or `{build,test}/*.py`. It
evaluates to a string or an array of strings.

Generally speaking, Oil behaves like a simpler version of POSIX shell / bash.
Sophisticated users can read [Simple Word Evaluation](simple-word-eval.html)
for a comparison.

## Contexts Where Words Are Used

### Words Are Part of Expressions and Commands

Part of an expression:

    var x = ${y:-'default'}

Part of a command:

    echo ${y:-'default'}

### Word Sequences: in for loops and array literals

The three contexts where splitting and globbing apply are the ones where a
**sequence** of words is evaluated (`EvalWordSequence`):

1. [Command]($help:simple-command): `echo $x foo`
2. [For loop]($help:for): `for i in $x foo; do ...`
3. [Array Literals]($help:array): `a=($x foo)` and `var a = :| $x foo |` ([oil-array]($help))
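As a minimal sketch of the first two contexts, the POSIX shell snippet below
evaluates the same word sequence `$x foo` once as command arguments and once as
a for loop list. It assumes the default `$IFS`; the variable names are
illustrative and not part of Oil.

```sh
x='a b'

# Context 1, simple command: printf reuses its format for each argument,
# so every word in the sequence is wrapped in angle brackets.
command_words=$(printf '<%s>' $x foo)

# Context 2, for loop: the same sequence drives the iterations.
loop_words=''
for i in $x foo; do
  loop_words="$loop_words<$i>"
done
```

Both variables end up holding `<a><b><foo>`, since the unquoted `$x` is split
into two words in each context.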

### Oil vs. Bash Array Literals

Oil has a new array syntax, but it also supports the bash-compatible syntax:

```
local myarray=(one two *.py)  # bash

var myarray = :| one two *.py |  # Oil style
```

### Oil Discourages Context-Sensitive Evaluation

Shell also has contexts where it evaluates words to a **single string**, rather
than a sequence, like:

```sh
# RHS of Assignment
x="${not_array[@]}"
x=*.py  # not a glob

# Redirect Arg
echo foo > "${not_array[@]}"
echo foo > *.py  # not a glob

# Case variables and patterns
case "${not_array1[@]}" in
  "${not_array2[@]}")
    echo oops
    ;;
esac

case *.sh in  # not a glob
  *.py)       # a string pattern, not a file system glob
    echo oops
    ;;
esac
```

The behavior of these snippets diverges a lot in existing shells. That is,
shells are buggy and poorly specified.

Oil disallows most of them. Arrays are considered separate from strings and
don't randomly "decay".
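As a rough sketch of two of the single-string contexts above, the POSIX shell
snippet below shows that an assignment RHS and a case pattern are not expanded
against the file system. The helper name `matches_py` is hypothetical and only
used for illustration.

```sh
# RHS of assignment: evaluated to a single string, so *.py is not expanded
# against the file system, even if matching files exist.
x=*.py

# Case pattern: *.py is matched against the string, not against file names.
matches_py() {
  case $1 in
    *.py) echo yes ;;
    *)    echo no ;;
  esac
}
```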

Related: the RHS of an Oil assignment is an expression, which can be of any
type, including an array:

```
var parts = split(x)       # returns an array
var python = glob('*.py')  # ditto

var s = join(parts)        # returns a string
```

## Sigils

This is a recap of [A Feel for Oil's Syntax](syntax-feelings.html).

### `$` Means "Returns One String"

Examples:

- All substitutions: var, command, arith
  - TODO: Do we have `$[a[x+1]]` as an expression substitution?
  - Or `$[ /pat+ /]`?
  - I don't think so.

- Inline function calls, a YSH extension: `$[join(myarray)]`

(C-style strings like `$'\n'` use `$`, but that's more of a bash anachronism.
In Oil, `c'\n'` is preferred.)

### `@` Means "Returns An Array of Strings"

Enabled with `shopt -s parse_at`.

Examples:

- `@myarray`
- `@[arrayfunc(x, y)]`

These are both Oil extensions.

Array literals also evaluate to an array of strings, although the syntax shown
here is written with `:| |` rather than `@`:

```
var myarray = :| 1 2 3 |
```

## OSH Features

### Word Splitting and Empty String Elision

OSH uses POSIX behavior for unquoted substitutions like `$x`:

- The string value is split into args with `$IFS`.
- If the string value is empty, no args are produced.
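The two rules above can be sketched in POSIX shell as follows. This assumes the
default `$IFS` (space, tab, newline); `count_args` is a hypothetical helper that
only reports how many arguments it received.

```sh
# Hypothetical helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

x='a b  c'   # two spaces before c; a run of IFS whitespace is one separator
empty=''

n_split=$(count_args $x)        # unquoted: split into a, b, c
n_quoted=$(count_args "$x")     # quoted: splitting is suppressed
n_elided=$(count_args $empty)   # unquoted empty value: no args produced
n_kept=$(count_args "$empty")   # quoted empty value: kept as one empty arg
```

Under these assumptions the four counts are 3, 1, 0 and 1.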

### Implicit Joining

Shell has odd "joining" semantics, which are supported in Oil but generally
discouraged:

    set -- 'a b' 'c d'
    argv.py X"$@"X  # => ['Xa b', 'c dX']

In Oil, the RHS of an assignment is an expression, and joining only occurs
within double quotes:

    # Oil
    var joined = $x$y    # parse error
    var joined = "$x$y"  # OK

    # Shell
    joined=$x$y          # OK
    joined="$x$y"        # OK
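To make the shell joining behavior concrete, here is a small POSIX shell sketch
that contrasts the quoted and unquoted forms. It assumes the default `$IFS`;
`show_args` is a hypothetical stand-in for `argv.py` that prints each argument
in brackets so the word boundaries are visible.

```sh
# Hypothetical helper: prints each argument wrapped in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]' "$arg"
  done
}

set -- 'a b' 'c d'

# Quoted: the literal X is joined to the first and last elements of "$@",
# and each element stays a single word.
quoted=$(show_args X"$@"X)

# Unquoted: the same joining happens, and the result is also split on
# whitespace.
unquoted=$(show_args X$@X)
```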

<a name="extended-glob"></a>
### Extended Globs

Extended globs in OSH are a "legacy syntax" modelled after the behavior of
`bash` and `mksh`. This feature adds alternation, repetition, and negation to
globs, giving them the power of regexes.

You can use them to match strings:

    $ [[ foo.cc == *.@(cc|h) ]] && echo 'matches'  # => matches

Or produce lists of filename arguments:

    $ touch foo.cc foo.h
    $ echo *.@(cc|h)  # => foo.cc foo.h
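The sketch below illustrates alternation, repetition, and negation with string
matching only, so it does not depend on which files exist. It assumes bash;
the function `classify` and its labels are hypothetical, and other shells may
behave differently.

```bash
shopt -s extglob  # enable extended globs in bash

# Hypothetical helper that classifies a file name by extended glob pattern.
classify() {
  local name=$1
  if [[ $name == *.@(cc|h) ]]; then
    echo cxx        # alternation: ends in .cc or .h
  elif [[ $name == +([0-9]).txt ]]; then
    echo numbered   # repetition: one or more digits before .txt
  elif [[ $name == !(*.py) ]]; then
    echo other      # negation: anything that does not match *.py
  else
    echo python
  fi
}
```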

There are some limitations and differences:

- Extended globs are supported only when Oil is built with GNU libc.
  - GNU libc has the `FNM_EXTMATCH` extension to `fnmatch()`. Unlike bash and
    mksh, Oil doesn't implement its own extended glob matcher.
- They're more **static**, like in `mksh`. When an extended glob appears in a
  word, we evaluate the word, match filenames, and **skip** the rest of the
  word evaluation pipeline. This means:
  - Automatic word splitting is skipped in something like
    `$unquoted/@(*.cc|h)`.
  - You can't use arrays like `"$@"` and extended globs in the same word, e.g.
    `"$@"_*.@(cc|h)`. This is usually nonsensical anyway.
- OSH only accepts them in **contexts** that make sense.
  - For example, `echo foo > @(cc|h)` is a runtime error in OSH, but other
    shells will write a file literally named `@(cc|h)`.
  - OSH doesn't accept `${undef:-@(cc)}`. But it does accept `${x%@(cc)}`,
    since string strip operators like `%` accept a glob.
- Extended globbing is always on in OSH, regardless of `shopt -s extglob`.
  - Trivia: `bash` can't parse some extended globs unless `extglob` is on. But
    it parses others when it's off.
- Extended globs can't be used in the `PATTERN` in `${x//PATTERN/replace}`.
  This is because we only translate normal (non-extended) globs to regexes (in
  order to get the position information necessary for string replacement).
- They're not supported when `shopt --set simple_word_eval` is on (Oil word
  evaluation).
  - For similar reasons, they're also not supported in assignment builtins.
    (This is a good thing!)