---
in_progress: yes
---

Interpreter State
=================

The Oils project has a single interpreter that supports both the OSH and YSH
languages.

It's useful to think of Unix shell in historical layers:

- [OSH]($xref): A compatible but cleaned-up shell language.
  1. Thompson Shell (pipelines, exit status)
  2. Bourne Shell (variables, functions)
  3. [Korn Shell]($xref:ksh) (indexed arrays)
  4. [Bash]($xref:bash) (`shopt`, associative arrays)
- [YSH]($xref): A new shell language that manipulates the same interpreter
  state in a cleaner way.

<!--
TODO:

- New "Pulp"?
- Use fenced code blocks
  - and run through BOTH bash and osh
    - and link to this doc
  - bash 4.4 in a sandbox?
-->


<div id="toc">
</div>

## Example

Shell has many syntaxes for the same semantics, which can be confusing.  For
example, in bash, these four statements do similar things:

```sh-prompt
$ foo='bar'
$ declare -g foo=bar
$ x='foo=bar'; typeset $x
$ printf -v foo bar

$ echo $foo
bar
```

In addition, YSH adds JavaScript-like syntax:

```
var foo = 'bar'
```

YSH syntax can express more data types, but it may also confuse new users.

So the sections below describe the shell from a **semantic** perspective, which
should help users reason about their programs.

Quick tip: Use the [pp]($help) builtin to inspect shell variables.

## Design Goals

### Simplify and Rationalize bash

POSIX shell has a fairly simple model: everything is a string, and `"$@"` is a
special case.

Bash adds many features on top of POSIX, including arrays and associative
arrays.  Oils implements those features, and a few more.

However, it also significantly simplifies the model.

A primary difference is mentioned in [Known Differences](known-differences.html):

- In bash, the *locations* of values are tagged with types, e.g. `declare -A
  unset_assoc_array`.
- In Oils, *values* are tagged with types.  This is how common dynamic languages
  like Python and JavaScript behave.

In other words, Oils "salvages" the confusing semantics of bash and produces
something simpler, while still being very compatible.

### Add New Features and Types

TODO

- eggex type
- later: floating point type

## High Level Description

### Memory Is a Stack

- Shell has a stack but no heap.  The stack stores:
  - Variables that are local to a function.
  - The **arguments array**, which is spelled `"$@"` in shell and `@ARGV` in
    YSH.
- Shell's memory has values and locations, but **no** references/pointers.

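The stack behavior described above can be sketched in a few lines of shell.
This is only an illustration: `show_frame` and `label` are made-up names, and
`local` is assumed to be available (bash and dash accept it, although POSIX
doesn't specify it).

```sh
# Each call pushes a frame that holds its own arguments array and locals.
show_frame() {
  local label='inner'            # stored in this call's frame only
  echo "$label: $# args: $*"
}

set -- a b c                     # arguments array of the top-level frame
label='outer'                    # a global, stored in the bottom frame

show_frame x y                   # prints: inner: 2 args: x y
echo "$label: $# args: $*"       # prints: outer: 3 args: a b c
```

After the call returns, its frame is popped, so the caller's `label` and
`"$@"` are exactly what they were before.
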
<!--
later: YSH adds references to data structures on the heap, which may be recursive.
-->

### Environment Variables Become Global Variables

On initialization, environment variables like `PYTHONPATH=.` are copied into
the shell's memory as global variables, with the `export` flag set.

Global variables are stored in the first stack frame, i.e. the one at index
`0`.

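One way to observe this from the outside is to start a child shell with a
variable in its environment.  The sketch below assumes an `sh` binary is on
the `PATH`; the names `seen` and `inherited` are illustrative.

```sh
# Start a child shell with PYTHONPATH=. in its environment.  Inside the child,
# the value is readable like any other global variable.
seen=$(PYTHONPATH=. sh -c 'echo "$PYTHONPATH"')
echo "child sees: $seen"

# The variable keeps its export flag, so a grandchild process inherits it
# without the child calling `export` again.
inherited=$(PYTHONPATH=. sh -c 'sh -c "echo \$PYTHONPATH"')
echo "grandchild sees: $inherited"
```

Both lines should report `.`, since neither child modifies the variable.
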
### Functions and Variables Are Separate

There are two distinct namespaces.  For example:

```
foo() {
  echo 'function named foo'
}
foo=bar   # a variable; doesn't affect the function
```

### Variable Name Lookup with "Dynamic Scope"

OSH has dynamic scope, but YSH limits it.

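As a rough sketch of what dynamic scope means in a Bourne-style shell: a
function that reads a variable it didn't define finds the nearest definition
on the *call stack*, not in the enclosing source text.  The function names
here are hypothetical, and `local` is assumed to be available.

```sh
x='global'

show_x() {
  echo "x=$x"            # x isn't defined here; lookup walks up the call stack
}

caller() {
  local x='local to caller'
  show_x                 # finds caller's local x, not the global
}

from_caller=$(caller)
from_top=$(show_x)
echo "$from_caller"      # prints: x=local to caller
echo "$from_top"         # prints: x=global
```

The same function body gives two different answers depending on who called
it, which is the behavior YSH limits.
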
### Limitations of Arrays and Compound Data Structures

Shell is a value-oriented language.  Arrays and other compound values:

- Can't be nested
- Can't be passed to functions or returned from functions
- Can't be referenced; they must be copied

Example:

```
declare -a myarray=("${other_array[@]}")   # shell

var myarray = :| @other_array |            # Oils
```

Reason: There's no garbage collection.

### Integers and Coercion

- Strings are coerced to integers to do math.
- What about `-i` in bash?


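A minimal sketch of the string-to-integer coercion mentioned above, using
POSIX arithmetic expansion.  The variable names are illustrative.

```sh
count='41'                 # a string, like every POSIX shell value
next=$(( count + 1 ))      # coerced to an integer inside $(( ))
echo "$next"               # prints: 42

doubled="$next$next"       # outside arithmetic, this is string concatenation
echo "$doubled"            # prints: 4242
```

The result of the arithmetic is stored as a string again, so the coercion
happens each time a value enters an arithmetic context.
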
### Unix `fork()` Has Copy-On-Write Semantics

See the [Process Model](process-model.html) document.


## Key Data Types

TODO: [core/runtime.asdl]($oils-src)

<!--
TODO:
- Make a graphviz diagram once everything is settled?
-->

### `cell`

TODO

- [export]($help) only applies to **strings**

### `value`

Undef, Str, Sequential/Indexed Arrays, Associative Array

- OSH has `value.BashArray`, and YSH has `value.List`.
- There are no integers, but there is `(( ))`.
- `"$@"` is an array, and so is `"${a[@]}"`.
  - This is not true in bash, where it's fuzzy.
  - But the unquoted `$@` and `${a[@]}` are NOT arrays.
- Flags: readonly and exported (but arrays and associative arrays shouldn't be
  exported)
  - TODO: find that

### `cmd_value` for shell builtins

Another important type:

```
  assign_arg = (lvalue lval, value? rval, int spid)

  cmd_value =
    Argv(string* argv, int* arg_spids, command__BraceGroup? block)
  | Assign(builtin builtin_id,
           string* argv, int* arg_spids,
           assign_arg* pairs)
```


## Printing State

### Shell Builtins

Oils supports various shell and bash operations to view the interpreter state.

- `set` prints variables and their values
- `set -o` prints options
- `declare/typeset/readonly/export -p` prints a subset of variables
- `test -v` tests if a variable is defined

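A short sketch of a few of these builtins, restricted to the forms that POSIX
specifies (`declare -p` and `test -v` are bash extensions and are left out).
The variable names `greeting` and `DEMO_EDITOR` are made up for the example.

```sh
greeting='hello'
readonly greeting
export DEMO_EDITOR='vi'            # hypothetical variable name

readonly -p | grep 'greeting'      # the readonly subset of variables
export -p | grep 'DEMO_EDITOR'     # the exported subset of variables
set -o | grep 'nounset'            # one line of the option listing
set | grep '^greeting='            # all variables, filtered to one
```

The exact output format differs between shells, so scripts shouldn't rely on
parsing it.
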
### [pp]($help) in Oils

Pretty prints a cell.

This is cleaner!

TODO: What about functions?

## Modifying State

### YSH Keywords

TODO: See YSH Keywords doc.

### Shell Assignment Builtins: declare/typeset, readonly, export

...

### [unset]($help)

You can't unset an array in OSH?  But you can in bash.

### Other Builtins

- [read]($help).  Sometimes sets the magic `$REPLY` variable.
- [getopts]($help)

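As an illustration of a builtin that modifies state as a side effect,
`getopts` writes to the variable you name and to `OPTARG` and `OPTIND` on each
call.  This is a sketch with made-up option letters and file names, not
something taken from the Oils sources.

```sh
set -- -v -o out.txt input.txt     # pretend these are the script's arguments
OPTIND=1                           # reset in case getopts ran earlier

verbose=0
output=''
while getopts 'vo:' opt; do        # each call sets opt, OPTARG, and OPTIND
  case $opt in
    v) verbose=1 ;;
    o) output=$OPTARG ;;           # getopts stores the option's argument here
    *) echo 'usage: demo [-v] [-o FILE] ARG...' >&2 ;;
  esac
done
shift $((OPTIND - 1))              # OPTIND indexes the first non-option arg

echo "verbose=$verbose output=$output rest=$*"
# prints: verbose=1 output=out.txt rest=input.txt
```
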
## Links

- [Process Model](process-model.html)
- <https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays>
- <https://www.thegeekstuff.com/2010/06/bash-array-tutorial>


## Appendix: Bash Issues

<!--
### Surprising Parsing

Parsing bash is undecidable.

    A[x]
    a[x]
-->

### Strings and Arrays Are Confused

    # Horrible

    a=('1 2' 3)
    b=(1 '2 3')  # two different elements

    [[ $a == $b ]]
    [[ ${a[0]} == ${b[0]} ]]

    [[ ${a[@]} == ${b[@]} ]]


Associative arrays and being undefined

- half an array type
  - `strict_array` removes this
  - `case $x in "$@"`
- half an associative array type

### Indexed Arrays and Associative Arrays Are Confused

### Empty and Unset Are Confused

- An empty array conflicts with `set -o nounset` (in bash 4.3).  I can't
  recommend it in good faith.

<!--
test -v (???)  Was there a bug here?
-->

<!--

## Quirky Syntax and Semantics in Shell Sublanguages

### Command

Mentioned above:

    a[x+1]+=x
    a[x+1]+=$x

    s+='foo'

### Word

Mentioned above:

    echo ${a[0]}
    echo "${a[0]}"
    echo ${a[i+1]}

### Arithmetic Does Integer Coercion

SURPRISING!  Avoid if you can!!!

    (( a[ x+1 ] += s ))  #


### Boolean: [[ $a = $b ]]

Operates on strings only.  Can't compare

-->