---
in_progress: yes
---

Interpreter State
=================

The Oils project has a single interpreter that supports both the OSH and YSH
languages.

It's useful to think of Unix shell in historical layers:

- [OSH]($xref): A compatible but cleaned-up shell language.
  1. Thompson Shell (pipelines, exit status)
  2. Bourne Shell (variables, functions)
  3. [Korn Shell]($xref:ksh) (indexed arrays)
  4. [Bash]($xref:bash) (`shopt`, associative arrays)
- [YSH]($xref): A new shell language that manipulates the same interpreter
  state in a cleaner way.

<!--
TODO:

- New "Pulp"?
- Use fenced code blocks
  - and run through BOTH bash and osh
  - and link to this doc
  - bash 4.4 in a sandbox?
-->


<div id="toc">
</div>

## Example

Shell has many syntaxes for the same semantics, which can be confusing. For
example, in bash, these four statements do similar things:

```sh-prompt
$ foo='bar'
$ declare -g foo=bar
$ x='foo=bar'; typeset $x
$ printf -v foo bar

$ echo $foo
bar
```

In addition, YSH adds JavaScript-like syntax:

```
var foo = 'bar'
```

YSH syntax can express more data types, but it may also confuse new users.

So the sections below describe the shell from a **semantic** perspective, which
should help users reason about their programs.

Quick tip: Use the [pp]($help) builtin to inspect shell variables.

## Design Goals

### Simplify and Rationalize bash

POSIX shell has a fairly simple model: everything is a string, and `"$@"` is a
special case.

Bash adds many features on top of POSIX, including arrays and associative
arrays. Oils implements those features, and a few more.

However, it also significantly simplifies the model.

A primary difference is mentioned in [Known Differences](known-differences.html):

- In bash, the *locations* of values are tagged with types, e.g. `declare -A
  unset_assoc_array`.
- In Oils, *values* are tagged with types. This is how common dynamic languages
  like Python and JavaScript behave.

In other words, Oils "salvages" the confusing semantics of bash and produces
something simpler, while still being very compatible.
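
To make the first bullet concrete, here is a small bash sketch. The names
`tagged` and `untagged` are made up for illustration. The same subscript
assignment behaves differently depending on whether the location was tagged
with `-A` beforehand:

```bash
declare -A tagged          # the location is tagged as an associative array
tagged[key]=v              # 'key' is used as a string key

untagged[key]=v            # no tag: bash creates an indexed array, and 'key'
                           # is evaluated as arithmetic (an unset name is 0)

echo "${!tagged[@]}"       # prints: key
echo "${!untagged[@]}"     # prints: 0
```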

### Add New Features and Types

TODO

- eggex type
- later: floating point type

## High Level Description

### Memory Is a Stack

- Shell has a stack but no heap. The stack stores:
  - Variables that are local to a function.
  - The **arguments array** which is spelled `"$@"` in shell, and `@ARGV` in
    YSH.
- Shell's memory has values and locations, but **no** references/pointers.
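
As a rough illustration of the two things a stack frame holds, the sketch
below uses a hypothetical function `show_frame`. Its local variable and its
arguments array exist only while it runs, and the caller's are untouched
afterward:

```bash
show_frame() {
  local x=inner                 # lives in this function's stack frame
  echo "$# args, x=$x"          # "$@" here is the function's own arguments
}

x=outer                         # a global variable
set -- a b c                    # the top-level arguments array

show_frame one                  # prints: 1 args, x=inner
echo "$# args, x=$x"            # prints: 3 args, x=outer
```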

<!--
later: YSH adds references to data structures on the heap, which may be recursive.
-->

### Environment Variables Become Global Variables

On initialization, environment variables like `PYTHONPATH=.` are copied into
the shell's memory as global variables, with the `export` flag set.

Global variables are stored in the first stack frame, i.e. the one at index
`0`.
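
One way to observe this from the outside is sketched below. `DEMO_VAR` and
`child_view` are made-up names. The variable reaches the child shell only
through the environment, yet the child can read it like any other variable
and passes it on to its own children without calling `export`:

```bash
# DEMO_VAR is passed to the child shell through the environment only;
# the child never runs 'export' itself.
child_view() {
  DEMO_VAR=from_env sh -c '
    echo "variable: $DEMO_VAR"             # visible as an ordinary variable
    sh -c "echo grandchild: \$DEMO_VAR"    # still exported to its own children
  '
}

child_view
# prints:
#   variable: from_env
#   grandchild: from_env
```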

### Functions and Variables Are Separate

There are two distinct namespaces. For example:

```
foo() {
  echo 'function named foo'
}
foo=bar  # a variable; doesn't affect the function

foo          # still runs the function
echo "$foo"  # prints the variable's value: bar
```

### Variable Name Lookup with "Dynamic Scope"

OSH has it, but YSH limits it.
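
For readers who haven't met dynamic scope before, here is a minimal sketch of
the shell behavior, using the hypothetical functions `show_x` and `caller`.
A function that doesn't declare a variable itself sees whatever its callers
have declared:

```bash
show_x() {
  echo "x=$x"        # no local x here, so the lookup walks up the call stack
}

caller() {
  local x=from_caller
  show_x             # finds caller's local x
}

x=global
caller               # prints: x=from_caller
show_x               # prints: x=global
```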

### Limitations of Arrays And Compound Data Structures

Shell is a value-oriented language.

- Can't Be Nested
- Can't Be Passed to Functions or Returned From Functions
- Can't Take References; Must be Copied

Example:

```
declare -a myarray=("${other_array[@]}")   # shell

var myarray = :| @other_array |            # Oils
```

Reason: There's no garbage collection.
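
The copy in the shell line above can be checked directly in bash. In this
sketch (the element values are arbitrary), growing the copy leaves the
original array alone:

```bash
other_array=('a b' c)
declare -a myarray=("${other_array[@]}")   # copies every element
myarray+=(d)                               # only the copy grows

echo "${#other_array[@]}"   # prints: 2
echo "${#myarray[@]}"       # prints: 3
```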

### Integers and Coercion

- Strings are coerced to integers to do math.
- What about `-i` in bash?
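
A short bash sketch of both bullets follows. It only shows what bash does;
how Oils treats `-i` is the open question above. The variable names are
arbitrary:

```bash
a='2'
b='3'

echo "$a$b"            # strings concatenate; prints: 23
echo "$(( a + b ))"    # arithmetic coerces both strings; prints: 5

declare -i n           # bash: integer attribute on the location
n='a + b'              # the assigned string is evaluated as arithmetic
echo "$n"              # prints: 5
```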


### Unix `fork()` Has Copy-On-Write Semantics

See the [Process Model](process-model.html) document.
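
The visible consequence for interpreter state can be sketched with a
subshell, which shells typically run in a forked child. The child starts with
a copy of the parent's variables, and its changes never flow back:

```bash
x=parent
( x=child; echo "in subshell: x=$x" )   # prints: in subshell: x=child
echo "after subshell: x=$x"             # prints: after subshell: x=parent
```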


## Key Data Types

TODO: [core/runtime.asdl]($oils-src)

<!--
TODO:
- Make a graphviz diagram once everything is settled?
-->

### `cell`

TODO

- [export]($help) only applies to **strings**

### `value`

Undef, Str, Sequential/Indexed Arrays, Associative Array

- OSH has `value.BashArray`, and YSH has `value.List`.
- no integers, but there is `(( ))`
- `"$@"` is an array, and `"${a[@]}"` too
  - not true in bash -- it's fuzzy there
  - but `$@` and `${a[@]}` are NOT arrays
- flags: readonly and exported (but arrays/assoc arrays shouldn't be exported)
  - TODO: find that
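
The difference between the quoted and unquoted forms shows up as soon as an
element contains a space. This bash sketch uses a made-up helper,
`count_args`, and assumes the default `IFS`:

```bash
count_args() {
  echo "$#"
}

set -- 'a b' c          # two arguments, one containing a space
arr=('a b' c)           # the same two elements in a named array

count_args "$@"         # prints: 2
count_args $@           # prints: 3 (the elements were split into words)

count_args "${arr[@]}"  # prints: 2
count_args ${arr[@]}    # prints: 3
```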

### `cmd_value` for shell builtins

Another important type:

```
  assign_arg = (lvalue lval, value? rval, int spid)

  cmd_value =
    Argv(string* argv, int* arg_spids, command__BraceGroup? block)
  | Assign(builtin builtin_id,
           string* argv, int* arg_spids,
           assign_arg* pairs)
```


## Printing State

### Shell Builtins

Oils supports various shell and bash operations to view the interpreter state.

- `set` prints variables and their values
- `set -o` prints options
- `declare/typeset/readonly/export -p` prints a subset of variables
- `test -v` tests if a variable is defined.
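
A brief sketch of two of these in use is below. The output shown in the
comment is what bash prints; the exact format under OSH may differ. The name
`no_such_var` is assumed to be unset:

```bash
foo=bar

declare -p foo                     # bash prints: declare -- foo="bar"

test -v foo && echo 'foo is defined'
test -v no_such_var || echo 'no_such_var is not defined'
```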

### [pp]($help) in Oils

Pretty prints a cell.

This is cleaner!

TODO: What about functions


## Modifying State

### YSH Keywords

TODO: See YSH Keywords doc.

### Shell Assignment Builtins: declare/typeset, readonly, export

...

### [unset]($help)

You can't unset an array in OSH? But you can in bash.
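
The bash half of that statement can be checked with the sketch below. It says
nothing about OSH, which is the open question here:

```bash
arr=(1 2 3)
echo "${#arr[@]}"      # prints: 3

unset arr              # removes the whole array in bash
echo "${#arr[@]}"      # prints: 0
```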

### Other Builtins

- [read]($help). Sometimes sets the magic `$REPLY` variable.
- [getopts]($help)
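
Both builtins modify state through variables rather than return values. A
minimal bash sketch, with a hypothetical `parse_flags` function and a made-up
`-n` flag:

```bash
read -r <<< 'hello world'     # no variable name given, so bash uses REPLY
echo "$REPLY"                 # prints: hello world

parse_flags() {
  local OPTIND=1 opt          # getopts keeps its position in OPTIND
  while getopts 'n:' opt; do
    case $opt in
      n) echo "n=$OPTARG" ;;  # the flag's argument arrives in OPTARG
    esac
  done
}

parse_flags -n 5              # prints: n=5
```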


## Links

- [Process Model](process-model.html)
- <https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays>
- <https://www.thegeekstuff.com/2010/06/bash-array-tutorial>


## Appendix: Bash Issues

<!--
### Surprising Parsing

Parsing bash is undecidable.

    A[x]
    a[x]
-->

### Strings and Arrays Are Confused

    Horrible

    a=('1 2' 3)
    b=(1 '2 3')  # two different elements

    [[ $a == $b ]]
    [[ ${a[0]} == ${b[0]} ]]

    [[ ${a[@]} == ${b[@]} ]]
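
A runnable variant of the same comparisons is sketched below. It uses
explicit quoting and `${a[*]}` so the outcome doesn't depend on how `[[`
treats an unquoted `${a[@]}`, and it assumes the default `IFS`:

```bash
a=('1 2' 3)
b=(1 '2 3')

[[ $a == "${a[0]}" ]] && echo 'bare $a means ${a[0]}'
[[ "${a[0]}" == "${b[0]}" ]] || echo 'first elements differ'
[[ "${a[*]}" == "${b[*]}" ]] && echo 'joined with spaces, the arrays look equal'
```

All three messages are printed: two arrays with different elements compare
equal once they are flattened to strings.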


Associative arrays and being undefined

- half an array type
  - strict_array removes this
  - case $x in "$@"
- half an associative array type

### Indexed Arrays and Associative Arrays Are Confused

### Empty and Unset Are Confused

- An empty array conflicts with `set -o nounset` (in bash 4.3). I can't
  recommend it in good faith.
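
For context, a commonly used guard for this situation is sketched below. It
is not something this document prescribes, and `count_args` is a made-up
helper. The `${name[@]+...}` form expands to nothing when the array has no
elements, so `nounset` never sees a bare `"${empty[@]}"`:

```bash
count_args() {
  echo "$#"
}

empty=()
full=('a b' c)

# Run each check in a command substitution so nounset stays contained.
n_empty=$(set -o nounset; count_args ${empty[@]+"${empty[@]}"})
n_full=$(set -o nounset; count_args ${full[@]+"${full[@]}"})

echo "$n_empty $n_full"     # prints: 0 2
```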

<!--
test -v (???)  Was there a bug here?
-->



<!--

## Quirky Syntax and Semantics in Shell Sublanguages

### Command

Mentioned above:

    a[x+1]+=x
    a[x+1]+=$x

    s+='foo'

### Word

Mentioned above:

    echo ${a[0]}
    echo "${a[0]}"
    echo ${a[i+1]}

### Arithmetic Does Integer Coercion

SURPRISING!  Avoid if you can!!!

    (( a[ x+1 ] += s ))  # 


### Boolean: [[ $a = $b ]]

Operates on strings only.  Can't compare

-->