---
default_highlighter: oils-sh
---

Variable Declaration, Mutation, and Scope
=========================================

This doc addresses these questions:

- How do variables behave in YSH?
- What are some practical guidelines for using them?

<div id="toc">
</div>

## YSH Design Goals

YSH is a graceful upgrade to shell, and the behavior of variables follows from
that philosophy.

- OSH implements shell-compatible behavior.
- YSH enhances shell with **new features** like expressions over typed data,
  which will be familiar to Python and JavaScript programmers.
- It's a **stricter** language.
  - Procs (shell functions) are self-contained and modular. They're
    understandable by reading their signature.
  - We removed [dynamic scope]($xref:dynamic-scope). This mechanism isn't
    familiar to most programmers, and may cause accidental mutation (bugs).
  - YSH has variable **declarations** like JavaScript, which can prevent
    trivial bugs.
- Even though YSH is stricter, it should still be convenient to use
  interactively.

## Keywords Are More Consistent and Powerful Than Builtins

YSH has 5 keywords that affect shell variables. Unlike shell builtins, they're
statically parsed, and take dynamically-typed **expressions** on the right.

### Declare With `var` and `const`

It looks like JavaScript:

    var name = 'Bob'
    const age = (20 + 1) * 2

    echo "$name is $age years old"  # Bob is 42 years old

Note that `const` is enforced by a dynamic check. It's meant to be used at the
top level only, not within `proc` or `func`.

    const age = 'other'  # Will fail because `readonly` bit is set

### Mutate With `setvar` and `setglobal`

    proc p {
      var name = 'Bob'       # declare
      setvar name = 'Alice'  # mutate

      setglobal g = 42       # create or mutate a global variable
    }

### "Return" By Mutating a `Place` (advanced)

A `Place` is a more principled mechanism that "replaces" shell's dynamic scope.
To use it:

1. Create a place with the `&` prefix operator.
1. Pass the place around as you would any other value.
1. Assign to the place with its `setValue(x)` method.

Example:

    proc p (s; out) {  # s is a word param; the place is a typed param
      # mutate the place
      call out->setValue("prefix-$s")
    }

    var x
    p foo (&x)  # pass a word arg, then a place as a typed arg
    echo x=$x   # => x=prefix-foo

- *Style guideline*: In some situations, it's better to "return" a value on
  stdout, and use `$(myproc)` to retrieve it.
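
For the stdout style, here's a minimal sketch in plain shell. The `add_prefix`
function is a hypothetical name used only for illustration; it isn't part of
YSH or of the example above.

```shell
# Hypothetical helper: "returns" its result by printing it to stdout.
add_prefix() {
  printf 'prefix-%s\n' "$1"
}

x=$(add_prefix foo)  # command substitution captures the output
echo "x=$x"          # => x=prefix-foo
```

Command substitution runs the function in a subshell, so this style suits
values that can be passed along as text.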

### Comparison to Shell

Shell and [bash]($xref) have grown many mechanisms for "declaring" and mutating
variables:

- "bare" assignments like `x=foo`
- **builtins** like `declare`, `local`, and `readonly`
- The `-n` "nameref" flag

Examples:

    readonly name=World  # no spaces allowed around =
    declare foo="Hello $name"
    foo=$((42 + a[2]))
    declare -n ref=foo   # $foo can be written through $ref

These constructs are all discouraged in YSH code.

## Keywords Behave Differently at the Top Level (Like JavaScript)

The "top level" of the interpreter is used in two situations:

1. When using YSH **interactively**.
2. As the **global** scope of a batch program.

Experienced YSH users may notice that `var` and `setvar` behave differently in
the top-level scope vs. `proc` scope. This is caused by the tension between
the interactive shell and the strictness of YSH.

In particular, the `source` builtin is dynamic, so YSH can't know all the names
defined at the top level.

For reference, JavaScript's modern `let` keyword has similar behavior.

### Usage Guidelines

Before going into detail on keyword behavior, here are some practical
guidelines:

- **Interactive** sessions: Use shell's `x=y`, or YSH `setvar`. You can think
  of `setvar` like Python's assignment operator: it creates or mutates a
  variable.
  - **Short scripts** (~20 lines) can also use this style.
- **Long programs**: Refactor them into composable "functions", i.e. `proc`.
  - First wrap the **whole program** into `proc main { }`.
  - The top level should only have `const` declarations. (You can use `var`,
    but it has special rules, explained below.)
  - The body of `proc` and `func` should have variables declared with `var`.
  - Inside these code blocks, use `setvar` to mutate **local** variables, and
    `setglobal` to mutate **globals**.

That's all you need to remember. The following sections explain the rationale
for these guidelines.

### The Top-Level Scope Has Only Dynamic Checks

The lack of static checks affects the recommended usage for both interactive
sessions and batch scripts.

#### Interactive Use: `setvar` only

As mentioned, you only need the `setvar` keyword in an interactive shell:

    ysh$ setvar x = 42  # create variable 'x'
    ysh$ setvar x = 43  # mutate it

Details on top-level behavior:

- `var` behaves like `setvar`: It creates or mutates a variable. In other
  words, a `var` definition can be **redefined** at the top level.
- A `const` can also redefine a `var`.
- A `var` can't redefine a `const` because there's a **dynamic** check that
  disallows mutation (like shell's `readonly`).
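
The `readonly` comparison can be seen in plain shell. This is a sketch of the
shell analogy only, not of how YSH implements `const`: the second assignment is
syntactically valid, and is rejected only when it runs.

```shell
# The assignment runs in a subshell, so that a failure (which some shells
# treat as fatal) doesn't stop the enclosing script.
if (readonly c=C; c=other) 2>/dev/null; then
  result='assignment succeeded'
else
  result='assignment rejected at run time'
fi
echo "$result"  # => assignment rejected at run time
```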

#### Batch Use: `const` only

It's simpler to use only constants at the top level.

    const USER = 'bob'
    const HOST = 'example.com'

    proc p {
      ssh $USER@$HOST ls -l
    }

This is so you don't have to worry about a `var` being redefined by a statement
like `source mylib.sh`. A `const` can't be redefined because it can't be
mutated.

It may be useful to put mutable globals in a constant dictionary, as it will
prevent them from being redefined:

    const G = {mystate: 0}

    proc p {
      setglobal G.mystate = 1
    }

### `proc` and `func` Scope Have Static Checks

These YSH code units have additional **static checks** (parse errors):

- Every variable must be declared once and only once with `var`. A duplicate
  declaration is a parse error.
- `setvar` of an undeclared variable is a parse error.

## Procs Don't Use "Dynamic Scope"

Procs are designed to be encapsulated and composable like processes. But the
[dynamic scope]($xref:dynamic-scope) rule that Bourne shell functions use
breaks encapsulation.

Dynamic scope means that a function can **read and mutate** the locals of its
caller, its caller's caller, and so forth. Example:

    g() {
      echo "f_var is $f_var"  # g can see f's local variables
    }

    f() {
      local f_var=42
      g
    }

    f
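
The example above shows `g` reading a local of `f`. The mutation half can be
sketched the same way in plain shell. This variant reuses the names `f`, `g`,
and `f_var`, and assumes `f_var` isn't already set in the environment.

```shell
g() {
  f_var=mutated  # no 'local' here: this writes to the caller's f_var
}

f() {
  local f_var=42
  g
  echo "f_var is now $f_var"
}

f                              # => f_var is now mutated
echo "global: ${f_var-unset}"  # => global: unset
```

The write lands in the local variable of `f`, not in a global, which is why
this kind of mutation is easy to miss when reading `g` on its own.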

YSH code should use `proc` instead. Inside a proc call, the `dynamic_scope`
option is implicitly disabled (equivalent to `shopt --unset dynamic_scope`).

### Reading Variables

This means that adding the `proc` keyword to the definition of `g` changes its
behavior:

    proc g() {
      echo "f_var is $f_var"  # Undefined!
    }

This affects all kinds of variable references:

    proc p {
      echo $foo         # look up foo in command mode
      var y = foo + 42  # look up foo in expression mode
    }

As in Python and JavaScript, a local `foo` can *shadow* a global `foo`. Using
`CAPS` for globals is a common style that avoids confusion. Remember that
globals should usually be constants in YSH.

### Shell Language Constructs That Write Variables

In shell, these language constructs assign to variables using dynamic
scope. In YSH, they only mutate the **local** scope:

- `x=val`
  - And variants `x+=val`, `a[i]=val`, `a[i]+=val`
- `export x=val` and `readonly x=val`
- `${x=default}`
- `mycmd {x}>out` (stores a file descriptor in `$x`)
- `(( x = 42 + y ))`
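
As a sketch of the shell behavior being contrasted here, the following
plain-shell snippet uses three of the constructs above. The function names
`caller` and `callee` are hypothetical, and the arithmetic uses the portable
`$(( ))` form rather than `(( ))`.

```shell
callee() {
  x=mutated           # bare assignment
  : "${y=default}"    # assigns only because y is unset
  : $(( n = n + 1 ))  # arithmetic assignment
}

caller() {
  local x=val y n=41  # y is declared but left unset
  callee
  echo "x=$x y=$y n=$n"
}

caller  # => x=mutated y=default n=42
```

Under dynamic scope, all three writes in `callee` land in the locals of
`caller`.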

### Builtins That Write Variables

These builtins are also "isolated" inside procs, using local scope:

- [read](ref/chap-builtin-cmd.html#read) (`$REPLY`)
- [readarray](ref/chap-builtin-cmd.html#readarray) aka `mapfile`
- [getopts](ref/chap-builtin-cmd.html#getopts) (`$OPTIND`, `$OPTARG`, etc.)
- [printf](ref/chap-builtin-cmd.html#printf) `-v`
- [unset](ref/chap-osh-assign.html#unset)

YSH Builtins:

- [compadjust](ref/chap-builtin-cmd.html#compadjust)
- [try](ref/chap-builtin-cmd.html#try) and `_status`

<!-- TODO: should YSH builtins always behave the same way?  Isn't that a little
faster? I think read --all is not consistent. -->
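
For contrast, here's a sketch of how `read` behaves in a plain shell function,
where dynamic scope is in effect. The names `read_greeting`, `caller`, and
`line` are hypothetical.

```shell
read_greeting() {
  # 'read' assigns to whichever 'line' is visible through dynamic scope.
  read -r line <<EOF
hello world
EOF
}

caller() {
  local line=''
  read_greeting
  echo "line=$line"
}

caller                       # => line=hello world
echo "global: ${line-unset}" # => global: unset
```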

### Reminder: Proc Scope is Flat

All local variables in shell functions and procs live in the same scope. This
includes variables declared in conditional blocks (`if` and `case`) and loops
(`for` and `while`).

    proc p {
      for i in 1 2 3 {
        echo $i
      }
      echo $i  # i is still 3
    }

This includes first-class YSH blocks:

    proc p {
      var x = 42
      cd /tmp {
        var x = 0  # ERROR: x is already declared
      }
    }

## More Details

### Examples of Place Mutation

The expression to the left of `=` is called a **place**. These are basically
Python or JavaScript expressions, except that you add the `setvar` or
`setglobal` keyword.

    setvar x[1] = 2      # array element
    setvar d['key'] = 3  # dict element
    setvar d.key = 3     # syntactic sugar for the above
    setvar x, y = y, x   # swap

### Bare Assignment

[Hay](hay.html) allows `const` declarations without the keyword:

    hay define Package

    Package cpython {
      version = '3.12'  # like const version = ...
    }

### Temp Bindings

Temp bindings precede a simple command:

    PYTHONPATH=. mycmd

They create a new namespace on the stack where each cell has the `export` flag
set (`declare -x`).

In YSH, the lack of dynamic scope means that they can't be read inside a
`proc`. So they're only useful for setting environment variables, and can be
replaced with:

    env PYTHONPATH=. mycmd
    env PYTHONPATH=. $0 myproc  # using the ARGV dispatch pattern
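
The following sketch, in plain shell, suggests why `env` is a workable
replacement when the goal is only to set an environment variable. `GREETING`
is a hypothetical variable assumed to be unset beforehand, and the child
`sh -c` stands in for `mycmd`.

```shell
# Both forms place GREETING in the child's environment only.
with_binding=$(GREETING=hi sh -c 'echo "child sees $GREETING"')
with_env=$(env GREETING=hi sh -c 'echo "child sees $GREETING"')

echo "$with_binding"              # => child sees hi
echo "$with_env"                  # => child sees hi
echo "parent: ${GREETING-unset}"  # => parent: unset
```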

## Appendix A: More on Shell vs. YSH

This section may help experienced shell users understand YSH.

Shell:

    g=G                   # global variable
    readonly c=C          # global constant

    myfunc() {
      local x=X           # local variable
      readonly y=Y        # local constant

      x=mutated           # mutate local
      g=mutated           # mutate global
      newglobal=G         # create new global

      caller_var=mutated  # dynamic scope (YSH doesn't have this)
    }

YSH:

    var g = 'G'    # global variable (discouraged)
    const c = 'C'  # global constant

    proc myproc {
      var x = 'L'  # local variable

      setvar x = 'mutated'       # mutate local
      setglobal g = 'mutated'    # mutate global
      setglobal newglobal = 'G'  # create new global
    }

## Appendix B: Problems With Top-Level Scope In Other Languages

- Julia 1.5 (August 2020): [The return of "soft scope" in the
  REPL](https://julialang.org/blog/2020/08/julia-1.5-highlights/#the_return_of_soft_scope_in_the_repl).
  - In contrast to Julia, YSH behaves the same in batch mode vs. interactive
    mode, and doesn't print warnings. However, it behaves differently at the
    top level. For this reason, we recommend using only `setvar` in
    interactive shells, and only `const` in the global scope of programs.
- Racket: [The Top Level is Hopeless](https://gist.github.com/samth/3083053)
  - From [A Principled Approach to REPL Interpreters](https://2020.splashcon.org/details/splash-2020-Onward-papers/5/A-principled-approach-to-REPL-interpreters)
    (Onward 2020). Thanks to Michael Greenberg (of Smoosh) for this reference.
  - The behavior of `var` at the top level was partly inspired by this
    paper. It's consistent with bash's `declare`, and similar to JavaScript's
    `let`.

## Related Documents

- [Interpreter State](interpreter-state.html)
  - The shell has a stack of namespaces.
  - Each namespace contains {variable name -> cell} bindings.
  - Cells have a tagged value (string, array, etc.) and 3 flags (readonly,
    export, nameref).
- [Guide to Procs and Funcs](proc-func.html)