| 1 | ---
|
| 2 | default_highlighter: oils-sh
|
| 3 | ---
|
| 4 |
|
| 5 | Variable Declaration, Mutation, and Scope
|
| 6 | =========================================
|
| 7 |
|
| 8 | This doc addresses these questions:
|
| 9 |
|
| 10 | - How do variables behave in YSH?
|
| 11 | - What are some practical guidelines for using them?
|
| 12 |
|
| 13 | <div id="toc">
|
| 14 | </div>
|
| 15 |
|
| 16 | ## YSH Design Goals
|
| 17 |
|
| 18 | YSH is a graceful upgrade to shell, and the behavior of variables follows from
|
| 19 | that philosophy.
|
| 20 |
|
| 21 | - OSH implements shell-compatible behavior.
|
| 22 | - YSH enhances shell with **new features** like expressions over typed data,
|
| 23 |   which will be familiar to Python and JavaScript programmers.
|
| 24 | - It's a **stricter** language.
|
| 25 |   - Procs (shell functions) are self-contained and modular. They're
|
| 26 |     understandable by reading their signature.
|
| 27 |   - We removed [dynamic scope]($xref:dynamic-scope). This mechanism isn't
|
| 28 |     familiar to most programmers, and may cause accidental mutation (bugs).
|
| 29 |   - YSH has variable **declarations** like JavaScript, which can prevent
|
| 30 |     trivial bugs.
|
| 31 | - Even though YSH is stricter, it should still be convenient to use
|
| 32 |   interactively.
|
| 33 |
|
| 34 | ## Keywords Are More Consistent and Powerful Than Builtins
|
| 35 |
|
| 36 | YSH has 5 keywords that affect shell variables. Unlike shell builtins, they're
|
| 37 | statically-parsed, and take dynamically-typed **expressions** on the right.
|
| 38 |
|
| 39 | ### Declare With `var` and `const`
|
| 40 |
|
| 41 | It looks like JavaScript:
|
| 42 |
|
| 43 |     var name = 'Bob'
|
| 44 |     const age = (20 + 1) * 2
|
| 45 |
|
| 46 |     echo "$name is $age years old"  # Bob is 42 years old
|
| 47 |
|
| 48 | Note that `const` is enforced by a dynamic check. It's meant to be used at the
|
| 49 | top level only, not within `proc` or `func`.
|
| 50 |
|
| 51 |     const age = 'other'  # Will fail because `readonly` bit is set
|
| 52 |
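The `readonly` bit is the same mechanism that plain shell uses. As a minimal sketch in POSIX-style shell (not YSH), reusing the name `age` purely for illustration, the check fires when the assignment runs rather than when the script is parsed:

```shell
readonly age=42

# The assignment is attempted at runtime. Running it in a subshell keeps the
# failure from aborting the rest of the script.
if ( age='other' ) 2>/dev/null; then
  status='assignment succeeded'
else
  status='assignment rejected'
fi

echo "$status, age is still $age"
```

The exact error message and exit code differ between shells, so the sketch only checks whether the assignment succeeded.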
|
| 53 | ### Mutate With `setvar` and `setglobal`
|
| 54 |
|
| 55 |     proc p {
|
| 56 |       var name = 'Bob'       # declare
|
| 57 |       setvar name = 'Alice'  # mutate
|
| 58 |
|
| 59 |       setglobal g = 42       # create or mutate a global variable
|
| 60 |     }
|
| 61 |
|
| 62 | ### "Return" By Mutating a `Place` (advanced)
|
| 63 |
|
| 64 | A `Place` is a more principled mechanism that "replaces" shell's dynamic scope.
|
| 65 | To use it:
|
| 66 |
|
| 67 | 1. Create a place with the `&` prefix operator.
|
| 68 | 1. Pass the place around as you would any other value.
|
| 69 | 1. Assign to the place with its `setValue(x)` method.
|
| 70 |
|
| 71 | Example:
|
| 72 |
|
| 73 |     proc p (s; out) {       # place is a typed param
|
| 74 |       # mutate the place
|
| 75 |       call out->setValue("prefix-$s")
|
| 76 |     }
|
| 77 |
|
| 78 |     var x
|
| 79 |     p ('foo', &x)           # pass a place
|
| 80 |     echo x=$x  # => x=prefix-foo
|
| 81 |
|
| 82 | - *Style guideline*: In some situations, it's better to "return" a value on
|
| 83 | stdout, and use `$(myproc)` to retrieve it.
|
| 84 |
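The stdout style from the guideline above can be sketched in plain shell. The function name `add_prefix` is hypothetical, and the sketch uses a shell function rather than a YSH `proc`; the capture syntax `$(...)` is the part that carries over.

```shell
# Hypothetical helper: "returns" its result by printing it on stdout.
add_prefix() {
  printf 'prefix-%s\n' "$1"
}

# The caller captures stdout with command substitution.
x=$(add_prefix foo)
echo "x=$x"
```

The trade-off is that command substitution runs the callee in a separate subshell, so this style suits a single string result rather than several outputs.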
|
| 85 | ### Comparison to Shell
|
| 86 |
|
| 87 | Shell and [bash]($xref) have grown many mechanisms for "declaring" and mutating
|
| 88 | variables:
|
| 89 |
|
| 90 | - "bare" assignments like `x=foo`
|
| 91 | - **builtins** like `declare`, `local`, and `readonly`
|
| 92 | - The `-n` "nameref" flag
|
| 93 |
|
| 94 | Examples:
|
| 95 |
|
| 96 |     readonly name=World        # no spaces allowed around =
|
| 97 |     declare foo="Hello $name"
|
| 98 |     foo=$((42 + a[2]))
|
| 99 |     declare -n ref=foo         # $foo can be written through $ref
|
| 100 |
|
| 101 | These constructs are all discouraged in YSH code.
|
| 102 |
|
| 103 | ## Keywords Behave Differently at the Top Level (Like JavaScript)
|
| 104 |
|
| 105 | The "top-level" of the interpreter is used in two situations:
|
| 106 |
|
| 107 | 1. When using YSH **interactively**.
|
| 108 | 2. As the **global** scope of a batch program.
|
| 109 |
|
| 110 | Experienced YSH users may notice that `var` and `setvar` behave differently in
|
| 111 | the top-level scope vs. `proc` scope. This is caused by the tension between
|
| 112 | the interactive shell and the strictness of YSH.
|
| 113 |
|
| 114 | In particular, the `source` builtin is dynamic, so YSH can't know all the names
|
| 115 | defined at the top level.
|
| 116 |
|
| 117 | For reference, JavaScript's modern `let` keyword has similar behavior.
|
| 118 |
|
| 119 | ### Usage Guidelines
|
| 120 |
|
| 121 | Before going into detail on keyword behavior, here are some practical
|
| 122 | guidelines:
|
| 123 |
|
| 124 | - **Interactive** sessions: Use shell's `x=y`, or YSH `setvar`. You can think
|
| 125 |   of `setvar` like Python's assignment operator: it creates or mutates a
|
| 126 |   variable.
|
| 127 |   - **Short scripts** (~20 lines) can also use this style.
|
| 128 | - **Long programs**: Refactor them into composable "functions", i.e. `proc`.
|
| 129 |   - First wrap the **whole program** into `proc main { }`.
|
| 130 |   - The top level should only have `const` declarations. (You can use `var`,
|
| 131 |     but it has special rules, explained below.)
|
| 132 |   - The body of `proc` and `func` should have variables declared with `var`.
|
| 133 |   - Inside these code blocks, use `setvar` to mutate **local** variables, and
|
| 134 |     `setglobal` to mutate **globals**.
|
| 135 |
|
| 136 | That's all you need to remember. The following sections explain the rationale
|
| 137 | for these guidelines.
|
| 138 |
|
| 139 | ### The Top-Level Scope Has Only Dynamic Checks
|
| 140 |
|
| 141 | The lack of static checks affects the recommended usage for both interactive
|
| 142 | sessions and batch scripts.
|
| 143 |
|
| 144 | #### Interactive Use: `setvar` only
|
| 145 |
|
| 146 | As mentioned, you only need the `setvar` keyword in an interactive shell:
|
| 147 |
|
| 148 |     ysh$ setvar x = 42  # create variable 'x'
|
| 149 |     ysh$ setvar x = 43  # mutate it
|
| 150 |
|
| 151 | Details on top-level behavior:
|
| 152 |
|
| 153 | - `var` behaves like `setvar`: It creates or mutates a variable. In other
|
| 154 | words, a `var` definition can be **redefined** at the top-level.
|
| 155 | - A `const` can also redefine a `var`.
|
| 156 | - A `var` can't redefine a `const` because there's a **dynamic** check that
|
| 157 | disallows mutation (like shell's `readonly`).
|
| 158 |
|
| 159 | #### Batch Use: `const` only
|
| 160 |
|
| 161 | It's simpler to use only constants at the top level.
|
| 162 |
|
| 163 |     const USER = 'bob'
|
| 164 |     const HOST = 'example.com'
|
| 165 |
|
| 166 |     proc p {
|
| 167 |       ssh $USER@$HOST ls -l
|
| 168 |     }
|
| 169 |
|
| 170 | This is so you don't have to worry about a `var` being redefined by a statement
|
| 171 | like `source mylib.sh`. A `const` can't be redefined because it can't be
|
| 172 | mutated.
|
| 173 |
|
| 174 | It may be useful to put mutable globals in a constant dictionary, as it will
|
| 175 | prevent them from being redefined:
|
| 176 |
|
| 177 |     const G = { mystate: 0 }
|
| 178 |
|
| 179 |     proc p {
|
| 180 |       setglobal G.mystate = 1
|
| 181 |     }
|
| 182 |
|
| 183 | ### `proc` and `func` Scope Have Static Checks
|
| 184 |
|
| 185 | These YSH code units have additional **static checks** (parse errors):
|
| 186 |
|
| 187 | - Every variable must be declared once and only once with `var`. A duplicate
|
| 188 | declaration is a parse error.
|
| 189 | - `setvar` of an undeclared variable is a parse error.
|
| 190 |
|
| 191 | ## Procs Don't Use "Dynamic Scope"
|
| 192 |
|
| 193 | Procs are designed to be encapsulated and composable like processes. But the
|
| 194 | [dynamic scope]($xref:dynamic-scope) rule that Bourne shell functions use
|
| 195 | breaks encapsulation.
|
| 196 |
|
| 197 | Dynamic scope means that a function can **read and mutate** the locals of its
|
| 198 | caller, its caller's caller, and so forth. Example:
|
| 199 |
|
| 200 |     g() {
|
| 201 |       echo "f_var is $f_var"  # g can see f's local variables
|
| 202 |     }
|
| 203 |
|
| 204 |     f() {
|
| 205 |       local f_var=42; g       # declare the local, then call g
|
| 206 |     }
|
| 207 |
|
| 208 |     f
|
| 209 |
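The example above illustrates the read case. A minimal sketch of the mutation case follows, in plain shell with hypothetical function names `caller` and `helper`, and assuming a shell that supports `local` (bash, dash, and most others):

```shell
helper() {
  # No 'local' here, so this assignment finds the nearest 'count' on the
  # call stack, which is caller's local variable.
  count=99
}

caller() {
  local count=1
  helper
  echo "count is $count"
}

caller
```

Nothing in the signature of `helper` hints that it modifies `count`, which is the encapsulation problem described above.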
|
| 210 | YSH code should use `proc` instead. Inside a proc call, the `dynamic_scope`
|
| 211 | option is implicitly disabled (equivalent to `shopt --unset dynamic_scope`).
|
| 212 |
|
| 213 | ### Reading Variables
|
| 214 |
|
| 215 | This means that adding the `proc` keyword to the definition of `g` changes its
|
| 216 | behavior:
|
| 217 |
|
| 218 |     proc g() {
|
| 219 |       echo "f_var is $f_var"  # Undefined!
|
| 220 |     }
|
| 221 |
|
| 222 | This affects all kinds of variable references:
|
| 223 |
|
| 224 |     proc p {
|
| 225 |       echo $foo         # look up foo in command mode
|
| 226 |       var y = foo + 42  # look up foo in expression mode
|
| 227 |     }
|
| 228 |
|
| 229 | As in Python and JavaScript, a local `foo` can *shadow* a global `foo`. Using
|
| 230 | `CAPS` for globals is a common style that avoids confusion. Remember that
|
| 231 | globals should usually be constants in YSH.
|
| 232 |
|
| 233 | ### Shell Language Constructs That Write Variables
|
| 234 |
|
| 235 | In shell, these language constructs assign to variables using dynamic
|
| 236 | scope. In YSH, they only mutate the **local** scope:
|
| 237 |
|
| 238 | - `x=val`
|
| 239 | - And variants `x+=val`, `a[i]=val`, `a[i]+=val`
|
| 240 | - `export x=val` and `readonly x=val`
|
| 241 | - `${x=default}`
|
| 242 | - `mycmd {x}>out` (stores a file descriptor in `$x`)
|
| 243 | - `(( x = 42 + y ))`
|
| 244 |
|
| 245 | ### Builtins That Write Variables
|
| 246 |
|
| 247 | These builtins are also "isolated" inside procs, using local scope:
|
| 248 |
|
| 249 | - [read](ref/chap-builtin-cmd.html#read) (`$REPLY`)
|
| 250 | - [readarray](ref/chap-builtin-cmd.html#readarray) aka `mapfile`
|
| 251 | - [getopts](ref/chap-builtin-cmd.html#getopts) (`$OPTIND`, `$OPTARG`, etc.)
|
| 252 | - [printf](ref/chap-builtin-cmd.html#printf) -v
|
| 253 | - [unset](ref/chap-osh-assign.html#unset)
|
| 254 |
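For contrast, here is how such a builtin behaves in a plain shell function, where dynamic scope is in effect. In this sketch (the function names `read_line` and `outer` are hypothetical), `read` inside one function fills in a local variable that belongs to its caller:

```shell
read_line() {
  # 'read' assigns to 'line'; under dynamic scope that is outer's local.
  read -r line <<EOF
hello from stdin
EOF
}

outer() {
  local line='not read yet'
  read_line
  echo "line is: $line"
}

outer
```

Inside a YSH `proc`, the same `read` would write to the proc's own local scope instead, as described above.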
|
| 255 | YSH Builtins:
|
| 256 |
|
| 257 | - [compadjust](ref/chap-builtin-cmd.html#compadjust)
|
| 258 | - [try](ref/chap-builtin-cmd.html#try) and `_error`
|
| 259 |
|
| 260 | <!-- TODO: should YSH builtins always behave the same way? Isn't that a little
|
| 261 | faster? I think read --all is not consistent. -->
|
| 262 |
|
| 263 | ### Reminder: Proc Scope is Flat
|
| 264 |
|
| 265 | All local variables in shell functions and procs live in the same scope. This
|
| 266 | includes variables declared in conditional blocks (`if` and `case`) and loops
|
| 267 | (`for` and `while`).
|
| 268 |
|
| 269 |     proc p {
|
| 270 |       for i in 1 2 3 {
|
| 271 |         echo $i
|
| 272 |       }
|
| 273 |       echo $i  # i is still 3
|
| 274 |     }
|
| 275 |
|
| 276 | This includes first-class YSH blocks:
|
| 277 |
|
| 278 |     proc p {
|
| 279 |       var x = 42
|
| 280 |       cd /tmp {
|
| 281 |         var x = 0  # ERROR: x is already declared
|
| 282 |       }
|
| 283 |     }
|
| 284 |
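The shell-function half of this claim can be checked with a short sketch in plain shell. The function name `check` and the variable `verdict` are hypothetical; the point is that a `local` declared inside an `if` body is still visible after the `fi`:

```shell
check() {
  if [ "$1" -gt 0 ]; then
    local verdict='positive'   # declared inside the 'if' body
  fi
  # Still visible here: the whole function shares one scope.
  echo "verdict=${verdict-}"
}

check 5
```

When the branch is not taken (for example `check 0`), `verdict` is never set and the function prints an empty value.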
|
| 285 | ## More Details
|
| 286 |
|
| 287 | ### Examples of Place Mutation
|
| 288 |
|
| 289 | The expression to the left of `=` is called a **place**. These are basically
|
| 290 | Python or JavaScript expressions, except that you add the `setvar` or
|
| 291 | `setglobal` keyword.
|
| 292 |
|
| 293 |     setvar x[1] = 2      # array element
|
| 294 |     setvar d['key'] = 3  # dict element
|
| 295 |     setvar d.key = 3     # syntactic sugar for the above
|
| 296 |     setvar x, y = y, x   # swap
|
| 297 |
|
| 298 | ### Bare Assignment
|
| 299 |
|
| 300 | [Hay](hay.html) allows `const` declarations without the keyword:
|
| 301 |
|
| 302 |     hay define Package
|
| 303 |
|
| 304 |     Package cpython {
|
| 305 |       version = '3.12'  # like const version = ...
|
| 306 |     }
|
| 307 |
|
| 308 | ### Temp Bindings
|
| 309 |
|
| 310 | Temp bindings precede a simple command:
|
| 311 |
|
| 312 |     PYTHONPATH=. mycmd
|
| 313 |
|
| 314 | They create a new namespace on the stack where each cell has the `export` flag
|
| 315 | set (`declare -x`).
|
| 316 |
|
| 317 | In YSH, the lack of dynamic scope means that they can't be read inside a
|
| 318 | `proc`. So they're only useful for setting environment variables, and can be
|
| 319 | replaced with:
|
| 320 |
|
| 321 |     env PYTHONPATH=. mycmd
|
| 322 |     env PYTHONPATH=. $0 myproc  # using the ARGV dispatch pattern
|
| 323 |
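A minimal sketch of both forms in plain shell, using a hypothetical variable `GREETING` and a child `sh` process standing in for `mycmd`:

```shell
unset GREETING   # start from a known state

# The temp binding is visible in the child process's environment...
child_saw=$(GREETING=hello sh -c 'echo "$GREETING"')
echo "child saw: $child_saw"

# ...but it does not persist in the current shell afterwards.
echo "parent has: ${GREETING-nothing}"

# 'env' gives the same result for an external command.
env_saw=$(env GREETING=hello sh -c 'echo "$GREETING"')
echo "env child saw: $env_saw"
```

Because `env` is an external command, it can only run other executables. That is why the second form above re-invokes the script via `$0` to reach a proc.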
|
| 324 | ## Appendix A: More on Shell vs. YSH
|
| 325 |
|
| 326 | This section may help experienced shell users understand YSH.
|
| 327 |
|
| 328 | Shell:
|
| 329 |
|
| 330 |     g=G                         # global variable
|
| 331 |     readonly c=C                # global constant
|
| 332 |
|
| 333 |     myfunc() {
|
| 334 |       local x=X                 # local variable
|
| 335 |       readonly y=Y              # local constant
|
| 336 |
|
| 337 |       x=mutated                 # mutate local
|
| 338 |       g=mutated                 # mutate global
|
| 339 |       newglobal=G               # create new global
|
| 340 |
|
| 341 |       caller_var=mutated        # dynamic scope (YSH doesn't have this)
|
| 342 |     }
|
| 343 |
|
| 344 | YSH:
|
| 345 |
|
| 346 |     var g = 'G'                 # global variable (discouraged)
|
| 347 |     const c = 'C'               # global constant
|
| 348 |
|
| 349 |     proc myproc {
|
| 350 |       var x = 'L'               # local variable
|
| 351 |
|
| 352 |       setvar x = 'mutated'      # mutate local
|
| 353 |       setglobal g = 'mutated'   # mutate global
|
| 354 |       setglobal newglobal = 'G' # create new global
|
| 355 |     }
|
| 356 |
|
| 357 | ## Appendix B: Problems With Top-Level Scope In Other Languages
|
| 358 |
|
| 359 | - Julia 1.5 (August 2020): [The return of "soft scope" in the
|
| 360 |   REPL](https://julialang.org/blog/2020/08/julia-1.5-highlights/#the_return_of_soft_scope_in_the_repl).
|
| 361 |   - In contrast to Julia, YSH behaves the same in batch mode vs. interactive
|
| 362 |     mode, and doesn't print warnings. However, it behaves differently at the
|
| 363 |     top level. For this reason, we recommend using only `setvar` in
|
| 364 |     interactive shells, and only `const` in the global scope of programs.
|
| 365 | - Racket: [The Top Level is Hopeless](https://gist.github.com/samth/3083053)
|
| 366 |   - From [A Principled Approach to REPL Interpreters](https://2020.splashcon.org/details/splash-2020-Onward-papers/5/A-principled-approach-to-REPL-interpreters)
|
| 367 |     (Onward 2020). Thanks to Michael Greenberg (of Smoosh) for this reference.
|
| 368 |   - The behavior of `var` at the top level was partly inspired by this
|
| 369 |     paper. It's consistent with bash's `declare`, and similar to JavaScript's
|
| 370 |     `let`.
|
| 371 |
|
| 372 | ## Related Documents
|
| 373 |
|
| 374 | - [Interpreter State](interpreter-state.html)
|
| 375 |   - The shell has a stack of namespaces.
|
| 376 |   - Each namespace contains {variable name -> cell} bindings.
|
| 377 |   - Cells have a tagged value (string, array, etc.) and 3 flags (readonly,
|
| 378 |     export, nameref).
|
| 379 | - [Guide to Procs and Funcs](proc-func.html)
|
| 380 |
|