| 1 | ---
 | 
| 2 | default_highlighter: oils-sh
 | 
| 3 | ---
 | 
| 4 | 
 | 
| 5 | YSH Fixes Shell's Error Handling (`errexit`)
 | 
| 6 | ============================================
 | 
| 7 | 
 | 
| 8 | <style>
 | 
| 9 |   .faq {
 | 
| 10 |     font-style: italic;
 | 
| 11 |     color: purple;
 | 
| 12 |   }
 | 
| 13 | 
 | 
| 14 |   /* copied from web/blog.css */
 | 
| 15 |   .attention {
 | 
| 16 |     text-align: center;
 | 
| 17 |     background-color: #DEE;
 | 
| 18 |     padding: 1px 0.5em;
 | 
| 19 | 
 | 
| 20 |     /* to match p tag etc. */
 | 
| 21 |     margin-left: 2em;
 | 
| 22 |   }
 | 
| 23 | </style>
 | 
| 24 | 
 | 
| 25 | YSH is unlike other shells:
 | 
| 26 | 
 | 
| 27 | - It never silently ignores an error, and it never loses an exit code.
 | 
| 28 | - There's no reason to write a YSH script without `errexit`, which is on by
 | 
| 29 |   default.
 | 
| 30 | 
 | 
| 31 | This document explains how YSH makes these guarantees.  We first review shell
 | 
| 32 | error handling, and discuss its fundamental problems.  Then we show idiomatic
 | 
| 33 | YSH code, and look under the hood at the underlying mechanisms.
 | 
| 34 | 
 | 
| 35 | (If you just want to **use** YSH, see [YSH Error Handling: A Quick
 | 
| 36 | Guide](ysh-error.html).)
 | 
| 37 | 
 | 
| 38 | [file a bug]: https://github.com/oilshell/oil/issues
 | 
| 39 | 
 | 
| 40 | <div id="toc">
 | 
| 41 | </div>
 | 
| 42 | 
 | 
| 43 | ## Review of Shell Error Handling Mechanisms
 | 
| 44 | 
 | 
| 45 | POSIX shell has fundamental problems with error handling.  With `set -e` aka
 | 
| 46 | `errexit`, you're [damned if you do and damned if you don't][bash-faq].
 | 
| 47 | 
 | 
| 48 | GNU [bash]($xref) fixes some of the problems, but **adds its own**, e.g. with
 | 
| 49 | respect to process subs, command subs, and assignment builtins.
 | 
| 50 | 
 | 
| 51 | YSH fixes all the problems by adding new builtin commands, special variables,
 | 
| 52 | and global options.  But you see a simple interface with `try` and `_error`.
 | 
| 53 | 
 | 
| 54 | Let's review a few concepts before discussing YSH.
 | 
| 55 | 
 | 
| 56 | ### POSIX Shell
 | 
| 57 | 
 | 
| 58 | - The special variable `$?` is the exit status of the "last command".  It's a
 | 
| 59 |   number between `0` and `255`.
 | 
| 60 | - If `errexit` is enabled, the shell will abort if `$?` is nonzero.
 | 
| 61 |   - This is subject to the *Disabled `errexit` Quirk*, which I describe below.
 | 
| 62 | 
 | 
| 63 | These mechanisms are fundamentally incomplete.
 | 
| 64 | 
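Here's a minimal sketch of these two mechanisms in plain POSIX shell.  The variable names are made up for illustration, and `errexit` is enabled in a child `sh` so that the abort doesn't take down the calling script.

```shell
# Record $? of a failing command.  The 'cmd || var=$?' form keeps this
# snippet safe even if the calling shell has errexit enabled.
status_of_false=0
false || status_of_false=$?
echo "false exited with status $status_of_false"

# With errexit, the child shell aborts at 'false', so the trailing
# 'echo' never runs and the child exits with the status of 'false'.
errexit_status=0
errexit_output=$(sh -c 'set -e; false; echo "not reached"') || errexit_status=$?
echo "sh with errexit exited with status $errexit_status, output: '$errexit_output'"
```

Both statuses are `1`, and the child shell prints nothing.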
 | 
| 65 | ### Bash
 | 
| 66 | 
 | 
| 67 | Bash improves error handling for pipelines like `ls /bad | wc`.
 | 
| 68 | 
 | 
| 69 | - `${PIPESTATUS[@]}` stores the exit codes of all processes in a pipeline.
 | 
| 70 | - When `set -o pipefail` is enabled, `$?` takes into account every process in a
 | 
| 71 |   pipeline.
 | 
| 72 |   - Without this setting, the failure of `ls` would be ignored.
 | 
| 73 | - `shopt -s inherit_errexit` was added in bash 4.4 to restore error
 | 
| 74 |   handling in command sub child processes.  This fixes a bash-specific bug.
 | 
| 75 | 
 | 
| 76 | But there are still places where bash will lose an exit code.
 | 
| 77 | 
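As a small illustration of `PIPESTATUS` and `pipefail`, the sketch below runs `false | true` in child bash processes, so that no option leaks into the calling script.  The variable names are assumptions chosen for this example.

```shell
# Default: $? is the status of the last process only ('true').
without_pipefail=$(bash -c 'false | true; echo $?')

# pipefail: $? is non-zero if any process in the pipeline failed.
with_pipefail=$(bash -c 'set -o pipefail; false | true; echo $?')

# PIPESTATUS holds one status per process.  Read it right away, because
# the next command overwrites it.
all_statuses=$(bash -c 'false | true; echo "${PIPESTATUS[@]}"')

echo "without pipefail: $without_pipefail"   # 0
echo "with pipefail:    $with_pipefail"      # 1
echo "PIPESTATUS:       $all_statuses"       # 1 0
```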
 | 
| 78 |  
 | 
| 79 | 
 | 
| 80 | ## Fundamental Problems
 | 
| 81 | 
 | 
| 82 | Let's look at **four** fundamental issues with shell error handling.  They
 | 
| 83 | underlie the **nine** [shell pitfalls enumerated in the
 | 
| 84 | appendix](#list-of-pitfalls).
 | 
| 85 | 
 | 
| 86 | ### When Is `$?` Set?
 | 
| 87 | 
 | 
| 88 | Each external process and shell builtin has one exit status.  But the
 | 
| 89 | definition of `$?` is obscure: it's tied to the `pipeline` rule in the POSIX
 | 
| 90 | shell grammar, which does **not** correspond to a single process or builtin.
 | 
| 91 | 
 | 
| 92 | We saw that `pipefail` fixes one case:
 | 
| 93 | 
 | 
| 94 |     ls /nonexistent | wc   # 2 processes, 2 exit codes, but just one $?
 | 
| 95 | 
 | 
| 96 | But there are others:
 | 
| 97 | 
 | 
| 98 |     local x=$(false)                 # 2 exit codes, but just one $?
 | 
| 99 |     diff <(sort left) <(sort right)  # 3 exit codes, but just one $?
 | 
| 100 | 
 | 
| 101 | This issue means that shell scripts fundamentally **lose errors**.  The
 | 
| 102 | language is unreliable.
 | 
| 103 | 
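The process substitution case can be reproduced in bash with the sketch below.  The `/nonexistent/...` paths are placeholders that are assumed not to exist, and the example needs a system that supports process substitution.

```shell
# Both 'sort' commands fail because the input files do not exist, yet
# diff compares two empty streams and reports success.
diff_status=$(bash -c '
  diff <(sort /nonexistent/left 2>/dev/null) <(sort /nonexistent/right 2>/dev/null)
  echo $?
')
echo "diff status with two failed sorts: $diff_status"   # 0
```

Two failures happened, but the only status the script can see is `0`.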
 | 
| 104 | ### What Does `$?` Mean?
 | 
| 105 | 
 | 
| 106 | Each process or builtin decides the meaning of its exit status independently.
 | 
| 107 | Here are two common choices:
 | 
| 108 | 
 | 
| 109 | 1. **The Failure Paradigm**
 | 
| 110 |    - `0` for success, or non-zero for an error.
 | 
| 111 |    - Examples: most shell builtins, `ls`, `cp`, ...
 | 
| 112 | 1. **The Boolean Paradigm**
 | 
| 113 |    - `0` for true, `1` for false, or a different number like `2` for an error.
 | 
| 114 |    - Examples: the `test` builtin, `grep`, `diff`, ...
 | 
| 115 | 
 | 
| 116 | New error handling constructs in YSH deal with this fundamental inconsistency.
 | 
| 117 | 
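To see the boolean paradigm in action, this sketch records the three kinds of status that `grep` can return.  The temporary file and the variable names are assumptions made for the example.  POSIX only promises a status greater than `1` for errors; GNU grep uses `2`.

```shell
sample=$(mktemp)
printf 'class Foo(object):\n' > "$sample"

# 'cmd || var=$?' records the status without aborting under errexit.
status_match=0;    grep -q 'class'  "$sample" || status_match=$?
status_no_match=0; grep -q 'struct' "$sample" || status_no_match=$?
status_error=0;    grep -q 'class'  "$sample.missing" 2>/dev/null || status_error=$?
rm -f "$sample"

echo "match=$status_match no-match=$status_no_match error=$status_error"
```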
 | 
| 118 | ### The Meaning of `if`
 | 
| 119 | 
 | 
| 120 | Shell's `if` statement tests whether a command exits zero or non-zero:
 | 
| 121 | 
 | 
| 122 |     if grep class *.py; then
 | 
| 123 |       echo 'found class'
 | 
| 124 |     else
 | 
| 125 |       echo 'not found'  # is this true?
 | 
| 126 |     fi
 | 
| 127 | 
 | 
| 128 | So while you'd expect `if` to work in the boolean paradigm, it's closer to
 | 
| 129 | the failure paradigm.  This means that using `if` with certain commands can
 | 
| 130 | cause the *Error or False Pitfall*:
 | 
| 131 | 
 | 
| 132 |     if grep 'class\(' *.py; then  # grep syntax error, status 2
 | 
| 133 |       echo 'found class('
 | 
| 134 |     else
 | 
| 135 |       echo 'not found is a lie'
 | 
| 136 |     fi
 | 
| 137 |     # => grep: Unmatched ( or \(
 | 
| 138 |     # => not found is a lie
 | 
| 139 | 
 | 
| 140 | That is, the `else` clause conflates grep's **error** status 2 and **false**
 | 
| 141 | status 1.
 | 
| 142 | 
 | 
| 143 | Strangely enough, I encountered this pitfall while trying to disallow shell's
 | 
| 144 | error handling pitfalls in YSH!  I describe this in another appendix as the
 | 
| 145 | "[meta pitfall](#the-meta-pitfall)".
 | 
| 146 | 
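In plain shell, one way to avoid the *Error or False Pitfall* is to capture the status and branch on it with `case` rather than `if`.  The sketch below assumes a hypothetical helper named `grep_tristate`; it isn't part of any shell or of YSH.

```shell
# grep_tristate is a hypothetical helper name, not part of any shell.
# usage: grep_tristate PATTERN [FILE...]
grep_tristate() {
  local rc=0
  grep -q -- "$@" 2>/dev/null || rc=$?
  case $rc in
    0) echo 'found' ;;
    1) echo 'not found' ;;
    *) echo "error: grep exited with status $rc"; return "$rc" ;;
  esac
}

sample=$(mktemp)
printf 'class Foo(object):\n' > "$sample"

result_found=$(grep_tristate 'class' "$sample")
result_missing=$(grep_tristate 'struct' "$sample")
# The unmatched \( is a regex syntax error, so this is neither true nor false.
result_error=$(grep_tristate 'class\(' "$sample") || true
rm -f "$sample"

printf '%s\n' "$result_found" "$result_missing" "$result_error"
```

The third call reports an error instead of claiming 'not found'.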
 | 
| 147 | ### Design Mistake: The Disabled `errexit` Quirk
 | 
| 148 | 
 | 
| 149 | There's more bad news about the design of shell's `if` statement.  It's subject
 | 
| 150 | to the *Disabled `errexit` Quirk*, which means when you use a **shell function**
 | 
| 151 | in a conditional context, errors are unexpectedly **ignored**.
 | 
| 152 | 
 | 
| 153 | That is, while `if ls /tmp` is useful, `if my-ls-function /tmp` should be
 | 
| 154 | avoided.  It yields surprising results.
 | 
| 155 | 
 | 
| 156 | I call this the *`if myfunc` Pitfall*, and show an example in [the
 | 
| 157 | appendix](#disabled-errexit-quirk-if-myfunc-pitfall).
 | 
| 158 | 
 | 
| 159 | We can't fix this decades-old bug in shell.  Instead we disallow dangerous code
 | 
| 160 | with `strict_errexit`, and add new error handling mechanisms.
 | 
| 161 | 
 | 
| 162 |  
 | 
| 163 | 
 | 
| 164 | ## YSH Error Handling: The Big Picture 
 | 
| 165 | 
 | 
| 166 | We've reviewed how POSIX shell and bash work, and showed fundamental problems
 | 
| 167 | with the shell language.
 | 
| 168 | 
 | 
| 169 | But when you're using YSH, **you don't have to worry about any of this**!
 | 
| 170 | 
 | 
| 171 | ### YSH Fails On Every Error
 | 
| 172 | 
 | 
| 173 | This means you don't have to explicitly check for errors.  Examples:
 | 
| 174 | 
 | 
| 175 |     shopt --set ysh:upgrade     # Enable good error handling in bin/osh
 | 
| 176 |                                 # It's the default in bin/ysh.
 | 
| 177 |     shopt --set strict_errexit  # Disallow bad shell error handling.
 | 
| 178 |                                 # Also the default in bin/ysh.
 | 
| 179 | 
 | 
| 180 |     local date=$(date X)        # 'date' failure is fatal
 | 
| 181 |     # => date: invalid date 'X' 
 | 
| 182 | 
 | 
| 183 |     echo $(date X)              # ditto
 | 
| 184 | 
 | 
| 185 |     echo $(date X) $(ls > F)    # 'ls' isn't executed; 'date' fails first
 | 
| 186 | 
 | 
| 187 |     ls /bad | wc                # 'ls' failure is fatal
 | 
| 188 | 
 | 
| 189 |     diff <(sort A) <(sort B)    # 'sort' failure is fatal
 | 
| 190 | 
 | 
| 191 | On the other hand, you won't experience this problem caused by `pipefail`:
 | 
| 192 | 
 | 
| 193 |     yes | head                 # doesn't fail due to SIGPIPE
 | 
| 194 | 
 | 
| 195 | The details are explained below.
 | 
| 196 | 
 | 
| 197 | ### `try` Handles Command and Expression Errors
 | 
| 198 | 
 | 
| 199 | You may want to **handle failure** instead of aborting the shell.  In this
 | 
| 200 | case, use the `try` builtin and inspect the `_error` variable it sets.
 | 
| 201 | 
 | 
| 202 |     try {                     # try takes a block of commands
 | 
| 203 |       ls /etc
 | 
| 204 |       ls /BAD                 # it stops at the first failure
 | 
| 205 |       ls /lib
 | 
| 206 |     }                         # After try, $? is always 0
 | 
| 207 |     if (_error.code !== 0) {  # Now check _error
 | 
| 208 |       echo 'failed'
 | 
| 209 |     }
 | 
| 210 | 
 | 
| 211 | Note that:
 | 
| 212 | 
 | 
| 213 | - The `_error.code` variable is different than `$?`.
 | 
| 214 |   - The leading `_` is a PHP-like convention for special variables /
 | 
| 215 |     "registers" in YSH.
 | 
| 216 | - Idiomatic YSH programs don't look at `$?`.
 | 
| 217 | 
 | 
| 218 | You also have fine-grained control over every process in a pipeline:
 | 
| 219 | 
 | 
| 220 |     try {
 | 
| 221 |       ls /bad | wc
 | 
| 222 |     }
 | 
| 223 |     write -- @_pipeline_status  # every exit status
 | 
| 224 | 
 | 
| 225 | And each process substitution:
 | 
| 226 | 
 | 
| 227 |     try {
 | 
| 228 |       diff <(sort left.txt) <(sort right.txt)
 | 
| 229 |     }
 | 
| 230 |     write -- @_process_sub_status  # every exit status
 | 
| 231 | 
 | 
| 232 | 
 | 
| 233 |  
 | 
| 234 | 
 | 
| 235 | <div class="attention">
 | 
| 236 | 
 | 
| 237 | See [YSH vs. Shell Idioms > Error Handling](idioms.html#error-handling) for
 | 
| 238 | more examples.
 | 
| 239 | 
 | 
| 240 | </div>
 | 
| 241 | 
 | 
| 242 |  
 | 
| 243 | 
 | 
| 244 | Certain expressions produce fatal errors, like:
 | 
| 245 | 
 | 
| 246 |     var x = 42 / 0  # divide by zero will abort shell
 | 
| 247 | 
 | 
| 248 | The `try` builtin also handles them:
 | 
| 249 | 
 | 
| 250 |     try {
 | 
| 251 |        var x = 42 / 0
 | 
| 252 |     }
 | 
| 253 |     if failed {
 | 
| 254 |       echo 'divide by zero'
 | 
| 255 |     }
 | 
| 256 | 
 | 
| 257 | More examples: 
 | 
| 258 | 
 | 
| 259 | - Index out of bounds `a[i]` 
 | 
| 260 | - Nonexistent key `d->foo` or `d['foo']`.
 | 
| 261 | 
 | 
| 262 | Such expression evaluation errors result in status `3`, which is an arbitrary non-zero
 | 
| 263 | status that's not used by other shells.  Status `2` is generally for syntax
 | 
| 264 | errors and status `1` is for most runtime failures.
 | 
| 265 | 
 | 
| 266 | ### `boolstatus` Enforces 0 or 1 Status
 | 
| 267 | 
 | 
| 268 | The `boolstatus` builtin addresses the *Error or False Pitfall*:
 | 
| 269 | 
 | 
| 270 |     if boolstatus grep 'class' *.py {  # may abort the program
 | 
| 271 |       echo 'found'      # status 0 means 'found'
 | 
| 272 |     } else {
 | 
| 273 |       echo 'not found'  # status 1 means 'not found'
 | 
| 274 |     }
 | 
| 275 | 
 | 
| 276 | Rather than confusing **error** with **false**, `boolstatus` will abort the
 | 
| 277 | program if `grep` doesn't return 0 or 1.
 | 
| 278 | 
 | 
| 279 | You can think of this as a shortcut for
 | 
| 280 | 
 | 
| 281 |     try {
 | 
| 282 |       grep 'class' *.py
 | 
| 283 |     }
 | 
| 284 |     case (_error.code) {
 | 
| 285 |       (0)    { echo 'found' }
 | 
| 286 |       (1)    { echo 'not found' }
 | 
| 287 |       (else) { echo 'fatal'
 | 
| 288 |                exit $[_error.code]
 | 
| 289 |              }
 | 
| 290 |     }
 | 
| 291 | 
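For comparison, here's a rough approximation of the same idea in plain shell.  `boolstatus_sketch` is a hypothetical function written for this example, not the YSH builtin, and it only mimics the "0 or 1, otherwise abort" rule.  It's usable after `if` because it checks the status explicitly rather than relying on `errexit`.

```shell
# Hypothetical helper: pass through status 0 or 1, abort on anything else.
boolstatus_sketch() {
  local rc=0
  "$@" || rc=$?
  case $rc in
    0|1) return "$rc" ;;
    *)   echo "fatal: '$1' exited with status $rc" >&2
         exit "$rc" ;;
  esac
}

sample=$(mktemp)
printf 'class Foo(object):\n' > "$sample"

if boolstatus_sketch grep -q 'class' "$sample"; then answer_found='found'; else answer_found='not found'; fi
if boolstatus_sketch grep -q 'struct' "$sample"; then answer_missing='found'; else answer_missing='not found'; fi

# A grep syntax error ends the subshell instead of printing 'not found'.
fatal_status=0
fatal_output=$(
  if boolstatus_sketch grep -q 'class\(' "$sample" 2>/dev/null; then
    echo 'found'
  else
    echo 'not found'
  fi
) || fatal_status=$?
rm -f "$sample"

echo "$answer_found / $answer_missing / status $fatal_status, output '$fatal_output'"
```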
 | 
| 292 | ### FAQ on Language Design
 | 
| 293 | 
 | 
| 294 | <div class="faq">
 | 
| 295 | 
 | 
| 296 | Why is there `try` but no `catch`?
 | 
| 297 | 
 | 
| 298 | </div>
 | 
| 299 | 
 | 
| 300 | First, it offers more flexibility:
 | 
| 301 | 
 | 
| 302 | - The handler usually inspects `_error.code`, but it may also inspect
 | 
| 303 |   `_pipeline_status` or `_process_sub_status`.
 | 
| 304 | - The handler may use `case` instead of `if`, e.g. to distinguish true / false
 | 
| 305 |   / error.
 | 
| 306 | 
 | 
| 307 | Second, it makes the language smaller:
 | 
| 308 | 
 | 
| 309 | - `try` / `catch` would require specially parsed keywords.  But our `try` is a
 | 
| 310 |   shell builtin that takes a block, like `cd` or `shopt`.
 | 
| 311 | - The builtin also lets us write either `try ls` or `try { ls }`, which is hard
 | 
| 312 |   with a keyword.
 | 
| 313 | 
 | 
| 314 | Another way to remember this is that there are **three parts** to handling an
 | 
| 315 | error, each of which has independent choices:
 | 
| 316 | 
 | 
| 317 | 1. Does `try` take a simple command or a block?  For example, `try ls` versus
 | 
| 318 |    `try { ls; var x = 42 / n }`
 | 
| 319 | 2. Which status do you want to inspect?
 | 
| 320 | 3. Inspect it with `if` or `case`?  As mentioned, `boolstatus` is a special
 | 
| 321 |    case of `try / case`.
 | 
| 322 | 
 | 
| 323 | <div class="faq">
 | 
| 324 | 
 | 
| 325 | Why is `_error.code` different from `$?`?
 | 
| 326 | 
 | 
| 327 | </div>
 | 
| 328 | 
 | 
| 329 | This avoids special cases in the interpreter for `try`, which is again a
 | 
| 330 | builtin that takes a block.
 | 
| 331 | 
 | 
| 332 | The exit status of `try` is always `0`.  If it returned a non-zero status, the
 | 
| 333 | `errexit` rule would trigger, and you wouldn't be able to handle the error!
 | 
| 334 | 
 | 
| 335 | Generally, [errors occur *inside* blocks, not
 | 
| 336 | outside](proc-block-func.html#errors).
 | 
| 337 | 
 | 
| 338 | Again, idiomatic YSH scripts never look at `$?`, which is only used to trigger
 | 
| 339 | shell's `errexit` rule.  Instead they invoke `try` and inspect `_error.code`
 | 
| 340 | when they want to handle errors.
 | 
| 341 | 
 | 
| 342 | <div class="faq">
 | 
| 343 | 
 | 
| 344 | Why `boolstatus`?  Can't you just change what `if` means in YSH?
 | 
| 345 | 
 | 
| 346 | </div>
 | 
| 347 | 
 | 
| 348 | I've learned the hard way that when there's a shell **semantics** change, there
 | 
| 349 | must be a **syntax** change.  In general, you should be able to read code on
 | 
| 350 | its own, without context.
 | 
| 351 | 
 | 
| 352 | Readers shouldn't have to constantly look up whether `ysh:upgrade` is on.  There
 | 
| 353 | are some cases where this is necessary, but it should be minimized.
 | 
| 354 | 
 | 
| 355 | Also, both `if foo` and `if boolstatus foo` are useful in idiomatic YSH code.
 | 
| 356 | 
 | 
| 357 |  
 | 
| 358 | 
 | 
| 359 | <div class="attention">
 | 
| 360 | 
 | 
| 361 | **Most users can skip to [the summary](#summary).**  You don't need to know all
 | 
| 362 | the details to use YSH.
 | 
| 363 | 
 | 
| 364 | </div>
 | 
| 365 | 
 | 
| 366 |  
 | 
| 367 | 
 | 
| 368 | ## Reference: Global Options
 | 
| 369 | 
 | 
| 370 | 
 | 
| 371 | Under the hood, we implement the `errexit` option from POSIX, bash options like
 | 
| 372 | `pipefail` and `inherit_errexit`, and add more options of our
 | 
| 373 | own.  They're all hidden behind [option groups](options.html) like `strict:all`
 | 
| 374 | and `ysh:upgrade`.
 | 
| 375 | 
 | 
| 376 | The following sections explain new YSH options.
 | 
| 377 | 
 | 
| 378 | ### `command_sub_errexit` Adds More Errors
 | 
| 379 | 
 | 
| 380 | In all Bourne shells, the status of command subs is lost, so errors are ignored
 | 
| 381 | (details in the [appendix](#quirky-behavior-of)).  For example:
 | 
| 382 | 
 | 
| 383 |     echo $(date X) $(date Y)  # 2 failures, both ignored
 | 
| 384 |     echo                      # program continues
 | 
| 385 | 
 | 
| 386 | The `command_sub_errexit` option makes both `date` invocations an error.
 | 
| 387 | The status `$?` of the parent `echo` command will be `1`, so if `errexit` is
 | 
| 388 | on, the shell will abort.
 | 
| 389 | 
 | 
| 390 | (Other shells should implement `command_sub_errexit`!)
 | 
| 391 | 
 | 
| 392 | ### `process_sub_fail` Is Analogous to `pipefail`
 | 
| 393 | 
 | 
| 394 | Similarly, in this example, `sort` will fail if the file doesn't exist.
 | 
| 395 | 
 | 
| 396 |     diff <(sort left.txt) <(sort right.txt)  # any failures are ignored
 | 
| 397 | 
 | 
| 398 | But there's no way to see this error in bash.  YSH adds `process_sub_fail`,
 | 
| 399 | which folds the failure into `$?` so `errexit` can do its job.
 | 
| 400 | 
 | 
| 401 | You can also inspect the special `_process_sub_status` array variable to
 | 
| 402 | implement custom error logic.
 | 
| 403 | 
 | 
| 404 | ### `strict_errexit` Flags Two Problems
 | 
| 405 | 
 | 
| 406 | Like other `strict_*` options, YSH `strict_errexit` improves your shell
 | 
| 407 | programs, even if you run them under another shell like [bash]($xref)!  It's
 | 
| 408 | like a linter *at runtime*, so it can catch things that [ShellCheck][] can't.
 | 
| 409 | 
 | 
| 410 | [ShellCheck]: https://www.shellcheck.net/
 | 
| 411 | 
 | 
| 412 | `strict_errexit` disallows code that exhibits these problems:
 | 
| 413 | 
 | 
| 414 | 1. The `if myfunc` Pitfall
 | 
| 415 | 1. The `local x=$(false)` Pitfall
 | 
| 416 | 
 | 
| 417 | See the appendix for examples of each.
 | 
| 418 | 
 | 
| 419 | #### Rules to Prevent the `if myfunc` Pitfall
 | 
| 420 | 
 | 
| 421 | In any conditional context, `strict_errexit` disallows:
 | 
| 422 | 
 | 
| 423 | 1. All commands except `((`, `[[`, and some simple commands (e.g. `echo foo`).
 | 
| 424 |    - Detail: `! ls` is considered a pipeline in the shell grammar.  We have to
 | 
| 425 |      allow it, while disallowing `ls | grep foo`.
 | 
| 426 | 2. Function/proc invocations (which are a special case of simple
 | 
| 427 |    commands.)
 | 
| 428 | 3. Command sub and process sub (`shopt --unset allow_csub_psub`)
 | 
| 429 | 
 | 
| 430 | This means that you should check the exit status of functions and pipelines
 | 
| 431 | differently.  See [Does a Function
 | 
| 432 | Succeed?](idioms.html#does-a-function-succeed), [Does a Pipeline
 | 
| 433 | Succeed?](idioms.html#does-a-pipeline-succeed), and other [YSH vs. Shell
 | 
| 434 | Idioms](idioms.html).
 | 
| 435 | 
 | 
| 436 | #### Rule to Prevent the `local x=$(false)` Pitfall
 | 
| 437 | 
 | 
| 438 | - Command subs and process subs are disallowed in assignment builtins: `local`,
 | 
| 439 |   `declare` aka `typeset`, `readonly`, and `export`.
 | 
| 440 | 
 | 
| 441 | No:
 | 
| 442 | 
 | 
| 443 |     local x=$(false)
 | 
| 444 | 
 | 
| 445 | Yes:
 | 
| 446 | 
 | 
| 447 |     var x = $(false)   # YSH style
 | 
| 448 | 
 | 
| 449 |     local x            # Shell style
 | 
| 450 |     x=$(false)
 | 
| 451 | 
 | 
| 452 | ### `sigpipe_status_ok` Ignores an Issue With `pipefail`
 | 
| 453 | 
 | 
| 454 | When you turn on `pipefail`, you may inadvertently run into this behavior:
 | 
| 455 | 
 | 
| 456 |     yes | head
 | 
| 457 |     # => y
 | 
| 458 |     # ...
 | 
| 459 | 
 | 
| 460 |     echo ${PIPESTATUS[@]}
 | 
| 461 |     # => 141 0
 | 
| 462 | 
 | 
| 463 | That is, `head` closes the pipe after 10 lines, causing the `yes` command to
 | 
| 464 | **fail** with `SIGPIPE` status `141`.
 | 
| 465 | 
 | 
| 466 | This error shouldn't be fatal, so OSH has a `sigpipe_status_ok` option, which
 | 
| 467 | is on by default in YSH.
 | 
| 468 | 
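The sketch below reproduces this behavior in bash with a one-line `head`.  The exact status of `yes` is an assumption about the environment: it's normally `141` (128 plus signal number 13), but if the calling process ignores `SIGPIPE`, `yes` may exit with a different non-zero status.

```shell
# Statuses of both processes; 'yes' dies once 'head' closes the pipe.
pipe_statuses=$(bash -c 'yes 2>/dev/null | head -n 1 >/dev/null; echo "${PIPESTATUS[@]}"')

# With pipefail, the status of 'yes' becomes the status of the pipeline.
pipefail_status=$(bash -c 'set -o pipefail; yes 2>/dev/null | head -n 1 >/dev/null; echo $?')

echo "PIPESTATUS:    $pipe_statuses"     # typically "141 0"
echo "with pipefail: $pipefail_status"   # typically 141
```

With `errexit` also on, that non-zero pipeline status would abort a bash script, even though nothing went wrong.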
 | 
| 469 | ### `verbose_errexit`
 | 
| 470 | 
 | 
| 471 | When `verbose_errexit` is on, the shell prints errors to `stderr` when the
 | 
| 472 | `errexit` rule is triggered.
 | 
| 473 | 
 | 
| 474 | ### FAQ on Options
 | 
| 475 | 
 | 
| 476 | <div class="faq">
 | 
| 477 | 
 | 
| 478 | Why is there no `_command_sub_status`?  And why is `command_sub_errexit` named
 | 
| 479 | differently than `process_sub_fail` and `pipefail`?
 | 
| 480 | 
 | 
| 481 | </div>
 | 
| 482 | 
 | 
| 483 | Command subs are executed **serially**, while process subs and pipeline parts
 | 
| 484 | run **in parallel**.
 | 
| 485 | 
 | 
| 486 | So a command sub can "abort" its parent command, setting `$?` immediately.
 | 
| 487 | The parallel constructs must wait until all parts are done and save statuses in
 | 
| 488 | an array.  Afterward, they determine `$?` based on the value of `pipefail` and
 | 
| 489 | `process_sub_fail`.
 | 
| 490 | 
 | 
| 491 | <div class="faq">
 | 
| 492 | 
 | 
| 493 | Why are `strict_errexit` and `command_sub_errexit` different options?
 | 
| 494 | 
 | 
| 495 | </div>
 | 
| 496 | 
 | 
| 497 | Because `shopt --set strict:all` can be used to improve scripts that are run
 | 
| 498 | under other shells like [bash]($xref).  It's like a runtime linter that
 | 
| 499 | disallows dangerous constructs.
 | 
| 500 | 
 | 
| 501 | On the other hand, if you write code with `command_sub_errexit` on, it's
 | 
| 502 | impossible to get the same failures under bash.  So `command_sub_errexit` is
 | 
| 503 | not a `strict_*` option, and it's meant for code that runs only under YSH.
 | 
| 504 | 
 | 
| 505 | <div class="faq">
 | 
| 506 | 
 | 
| 507 | What's the difference between bash's `inherit_errexit` and YSH
 | 
| 508 | `command_sub_errexit`?  Don't they both relate to command subs?
 | 
| 509 | 
 | 
| 510 | </div>
 | 
| 511 | 
 | 
| 512 | - `inherit_errexit` enables failure in the **child** process running the
 | 
| 513 |   command sub.
 | 
| 514 | - `command_sub_errexit` enables failure in the **parent** process, after the
 | 
| 515 |   command sub has finished.
 | 
| 516 | 
 | 
| 517 |  
 | 
| 518 | 
 | 
| 519 | ## Summary
 | 
| 520 | 
 | 
| 521 | YSH uses three mechanisms to fix error handling once and for all.
 | 
| 522 | 
 | 
| 523 | It has two new **builtins** that relate to errors:
 | 
| 524 | 
 | 
| 525 | 1. `try` lets you explicitly handle errors when `errexit` is on.
 | 
| 526 | 1. `boolstatus` enforces a true/false meaning.  (This builtin is less common.)
 | 
| 527 | 
 | 
| 528 | It has three **special variables**:
 | 
| 529 | 
 | 
| 530 | 1. The `_error` register, which is set by `try`.
 | 
| 531 |    - Remember that `_error.code` is distinct from `$?`, and that idiomatic YSH
 | 
| 532 |      programs don't use `$?`.
 | 
| 533 | 1. The `_pipeline_status` array (another name for bash's `PIPESTATUS`).
 | 
| 534 | 1. The `_process_sub_status` array for process substitutions.
 | 
| 535 | 
 | 
| 536 | Finally, it supports all of these **global options**:
 | 
| 537 | 
 | 
| 538 | - From POSIX shell:
 | 
| 539 |   - `errexit`
 | 
| 540 | - From [bash]($xref):
 | 
| 541 |   - `pipefail`
 | 
| 542 |   - `inherit_errexit` aborts the child process of a command sub.
 | 
| 543 | - New:
 | 
| 544 |   - `command_sub_errexit` aborts the parent process immediately after a failed
 | 
| 545 |     command sub.
 | 
| 546 |   - `process_sub_fail` is analogous to `pipefail`.
 | 
| 547 |   - `strict_errexit` flags two common problems.
 | 
| 548 |   - `sigpipe_status_ok` ignores a spurious "broken pipe" failure.
 | 
| 549 |   - `verbose_errexit` controls whether error messages are printed.
 | 
| 550 | 
 | 
| 551 | When using `bin/osh`, set all options at once with `shopt --set ysh:upgrade
 | 
| 552 | strict:all`.  Or use `bin/ysh`, where they're set by default.
 | 
| 553 | 
 | 
| 554 | <!--
 | 
| 555 | Related 2020 blog post [Reliable Error
 | 
| 556 | Handling](https://www.oilshell.org/blog/2020/10/osh-features.html#reliable-error-handling).
 | 
| 557 | -->
 | 
| 558 | 
 | 
| 559 | 
 | 
| 560 | ## Related Docs
 | 
| 561 | 
 | 
| 562 | - [YSH vs. Shell Idioms](idioms.html) shows more examples of `try` and `boolstatus`.
 | 
| 563 | - [Shell Idioms](shell-idioms.html) has a section on fixing `strict_errexit`
 | 
| 564 |   problems in Bourne shell.
 | 
| 565 | 
 | 
| 566 | Good articles on `errexit`:
 | 
| 567 | 
 | 
| 568 | - Bash FAQ: [Why doesn't `set -e` do what I expected?][bash-faq]
 | 
| 569 | - [Bash: Error Handling](http://fvue.nl/wiki/Bash:_Error_handling) from
 | 
| 570 |   `fvue.nl`
 | 
| 571 | 
 | 
| 572 | [bash-faq]: http://mywiki.wooledge.org/BashFAQ/105
 | 
| 573 | 
 | 
| 574 | Spec Test Suites:
 | 
| 575 | 
 | 
| 576 | - <https://www.oilshell.org/release/latest/test/spec.wwz/survey/errexit.html>
 | 
| 577 | - <https://www.oilshell.org/release/latest/test/spec.wwz/survey/errexit-oil.html>
 | 
| 578 | 
 | 
| 579 | These docs aren't about error handling, but they're also painstaking
 | 
| 580 | backward-compatible overhauls of shell!
 | 
| 581 | 
 | 
| 582 | - [Simple Word Evaluation in Unix Shell](simple-word-eval.html)
 | 
| 583 | - [Egg Expressions (YSH Regexes)](eggex.html)
 | 
| 584 | 
 | 
| 585 | For reference, this work on error handling was described in [Four Features That
 | 
| 586 | Justify a New Unix
 | 
| 587 | Shell](https://www.oilshell.org/blog/2020/10/osh-features.html) (October 2020).
 | 
| 588 | Since then, we changed `try` and `_error` to be more powerful and general.
 | 
| 589 | 
 | 
| 590 |  
 | 
| 591 | 
 | 
| 592 | ## Appendices
 | 
| 593 | 
 | 
| 594 | ### List Of Pitfalls
 | 
| 595 | 
 | 
| 596 | We mentioned some of these pitfalls:
 | 
| 597 | 
 | 
| 598 | 1. The `if myfunc` Pitfall, caused by the Disabled `errexit` Quirk (`strict_errexit`)
 | 
| 599 | 1. The `local x=$(false)` Pitfall (`strict_errexit`)
 | 
| 600 | 1. The Error or False Pitfall (`boolstatus`, `try` / `case`)
 | 
| 601 |    - Special case: When the child process is another instance of the shell, the
 | 
| 602 |      Meta Pitfall is possible.
 | 
| 603 | 1. The Process Sub Pitfall (`process_sub_fail` and `_process_sub_status`)
 | 
| 604 | 1. The `yes | head` Pitfall (`sigpipe_status_ok`)
 | 
| 605 | 
 | 
| 606 | There are two pitfalls related to command subs:
 | 
| 607 | 
 | 
| 608 | 6. The `echo $(false)` Pitfall (`command_sub_errexit`)
 | 
| 609 | 6. Bash's `inherit_errexit` pitfall.
 | 
| 610 |    - As mentioned, this bash 4.4 option fixed a bug in earlier versions of
 | 
| 611 |      bash.  YSH reimplements it and turns it on by default.
 | 
| 612 | 
 | 
| 613 | Here are two more pitfalls that don't require changes to YSH:
 | 
| 614 | 
 | 
| 615 | 8. The Trailing `&&` Pitfall
 | 
| 616 |    - When `test -d /bin && echo found` is at the end of a function, the exit
 | 
| 617 |      code is surprising.
 | 
| 618 |    - Solution: always use `if` rather than `&&`.
 | 
| 619 |    - More reasons: the `if` is easier to read, and `&&` isn't useful when
 | 
| 620 |      `errexit` is on.
 | 
| 621 | 8. The surprising return value of `(( i++ ))`, `let`, `expr`, etc.
 | 
| 622 |    - Solution: Use `i=$((i + 1))`, which is valid POSIX shell.
 | 
| 623 |    - In YSH, use `setvar i += 1`.
 | 
| 624 | 
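These last two pitfalls can be demonstrated with the bash sketch below.  The function name `check_dir` and the path `/nonexistent-dir` are placeholders, and each case runs in a child bash with `errexit` on.

```shell
# Trailing &&: the function returns the status of the failed test (1),
# so calling it under errexit aborts the script.
and_status=0
and_output=$(bash -c '
  set -e
  check_dir() { test -d /nonexistent-dir && echo found; }
  check_dir
  echo "after check_dir"
') || and_status=$?

# Same check written with if: a false condition without else yields 0.
if_output=$(bash -c '
  set -e
  check_dir() { if test -d /nonexistent-dir; then echo found; fi; }
  check_dir
  echo "after check_dir"
')

# (( i++ )) evaluates to the old value 0, which bash maps to status 1.
incr_status=0
incr_output=$(bash -c 'set -e; i=0; (( i++ )); echo "i=$i"') || incr_status=$?

# The POSIX arithmetic expansion form is an assignment and returns 0.
posix_output=$(bash -c 'set -e; i=0; i=$((i + 1)); echo "i=$i"')

echo "trailing &&: status=$and_status output='$and_output'"
echo "if:          output='$if_output'"
echo "(( i++ )):   status=$incr_status output='$incr_output'"
echo "i=\$((i+1)):  output='$posix_output'"
```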
 | 
| 625 | #### Example of `inherit_errexit` Pitfall
 | 
| 626 | 
 | 
| 627 | In bash, `errexit` is disabled in command sub child processes:
 | 
| 628 | 
 | 
| 629 |     set -e
 | 
| 630 |     shopt -s inherit_errexit  # needed to avoid 'touch two'
 | 
| 631 |     echo $(touch one; false; touch two)
 | 
| 632 | 
 | 
| 633 | Without the option, it will touch both files, even though `false` fails
 | 
| 634 | after the first `touch`.
 | 
| 635 | 
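To check this on your own machine, the sketch below runs the example twice in a scratch directory.  It assumes bash 4.4 or later (older versions reject `inherit_errexit`), and the variable names are made up for illustration.

```shell
workdir=$(mktemp -d)

# Default bash: errexit is switched off inside $(...), so 'touch two' runs.
(cd "$workdir" && bash -c 'set -e; echo $(touch one; false; touch two)' >/dev/null)
if [ -e "$workdir/two" ]; then two_by_default=created; else two_by_default=absent; fi
rm -f "$workdir/one" "$workdir/two"

# With inherit_errexit: the child shell stops at 'false'.
(cd "$workdir" && bash -c 'set -e; shopt -s inherit_errexit; echo $(touch one; false; touch two)' >/dev/null)
if [ -e "$workdir/one" ]; then one_with_option=created; else one_with_option=absent; fi
if [ -e "$workdir/two" ]; then two_with_option=created; else two_with_option=absent; fi
rm -rf "$workdir"

echo "default:         two is $two_by_default"
echo "inherit_errexit: one is $one_with_option, two is $two_with_option"
```

Note that the parent `echo` still succeeds in both runs.  Failing the parent is what `command_sub_errexit` adds in YSH.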
 | 
| 636 | #### Bash Has a Grammatical Quirk With `set -o failglob`
 | 
| 637 | 
 | 
| 638 | This isn't a pitfall, but a quirk that also relates to errors and shell's
 | 
| 639 | **grammar**.  Recall that the definition of `$?` is tied to the grammar.
 | 
| 640 | 
 | 
| 641 | Consider this program:
 | 
| 642 | 
 | 
| 643 |     set -o failglob
 | 
| 644 |     echo *.ZZ        # no files match
 | 
| 645 |     echo status=$?   # show failure
 | 
| 646 |     # => status=1
 | 
| 647 | 
 | 
| 648 | This is the same program with a newline replaced by a semicolon:
 | 
| 649 | 
 | 
| 650 |     set -o failglob
 | 
| 651 | 
 | 
| 652 |     # Surprisingly, bash doesn't execute what's after ; 
 | 
| 653 |     echo *.ZZ; echo status=$?
 | 
| 654 |     # => (no output)
 | 
| 655 | 
 | 
| 656 | But it behaves differently. This is because newlines and semicolons are handled
 | 
| 657 | in different **productions of the grammar**, and produce distinct syntax trees.
 | 
| 658 | 
 | 
| 659 | (A related quirk is that this same difference can affect the number of
 | 
| 660 | processes that shells start!)
 | 
| 661 | 
 | 
| 662 | ### Disabled `errexit` Quirk / `if myfunc` Pitfall
 | 
| 663 | 
 | 
| 664 | This quirk is a bad interaction between the `if` statement, shell functions,
 | 
| 665 | and `errexit`.  It's a **mistake** in the design of the shell language.
 | 
| 666 | Example:
 | 
| 667 | 
 | 
| 668 |     set -o errexit     # don't ignore errors
 | 
| 669 | 
 | 
| 670 |     myfunc() {
 | 
| 671 |       ls /bad          # fails with a non-zero status
 | 
| 672 |       echo 'should not get here'
 | 
| 673 |     }
 | 
| 674 | 
 | 
| 675 |     myfunc  # Good: script aborts before echo
 | 
| 676 |     # => ls: '/bad': no such file or directory
 | 
| 677 | 
 | 
| 678 |     if myfunc; then  # Surprise!  It behaves differently in a condition.
 | 
| 679 |       echo OK
 | 
| 680 |     fi
 | 
| 681 |     # => ls: '/bad': no such file or directory
 | 
| 682 |     # => should not get here
 | 
| 683 | 
 | 
| 684 | We see "should not get here" because the shell **silently disables** `errexit`
 | 
| 685 | while executing the condition of `if`.  This relates to the fundamental
 | 
| 686 | problems above:
 | 
| 687 | 
 | 
| 688 | 1. Does the function use the failure paradigm or the boolean paradigm?
 | 
| 689 | 2. `if` tests a single exit status, but every command in a function has an exit
 | 
| 690 |    status.  Which one should we consider?
 | 
| 691 | 
 | 
| 692 | This quirk occurs in all **conditional contexts**:
 | 
| 693 | 
 | 
| 694 | 1. The condition of the `if`, `while`, and `until`  constructs
 | 
| 695 | 2. A command/pipeline prefixed by `!` (negation)
 | 
| 696 | 3. Every clause in `||` and `&&` except the last.
 | 
| 697 | 
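The three conditional contexts can be compared side by side with this bash sketch.  It uses `false` in place of `ls /bad` so the output is predictable, and the `prelude` and `case_*` variables are helpers invented for the example.

```shell
# Shared prelude for every case: errexit on, and a function whose first
# command fails.  If errexit is honored, the echo is never reached.
prelude='set -e; myfunc() { false; echo "should not get here"; }; '

# Plain call: errexit works, and the child aborts with status 1.
plain_status=0
plain_output=$(bash -c "${prelude}myfunc") || plain_status=$?

case_if='if myfunc; then :; fi'
case_not='! myfunc; true'          # trailing true keeps the child status at 0
case_or='myfunc || true'

# Conditional contexts: errexit is silently disabled inside myfunc.
if_output=$(bash -c "${prelude}${case_if}")
not_output=$(bash -c "${prelude}${case_not}")
or_output=$(bash -c "${prelude}${case_or}")

echo "plain: status=$plain_status output='$plain_output'"
echo "if:    $if_output"
echo "!:     $not_output"
echo "||:    $or_output"
```

Only the plain call stops at `false`.  The other three print "should not get here".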
 | 
| 698 | ### The Meta Pitfall
 | 
| 699 | 
 | 
| 700 | I encountered the *Error or False Pitfall* while trying to disallow other error
 | 
| 701 | handling pitfalls!  The *meta pitfall* arises from a combination of the issues
 | 
| 702 | discussed:
 | 
| 703 | 
 | 
| 704 | 1. The `if` statement tests for zero or non-zero status.
 | 
| 705 | 1. The condition of an `if` may start child processes.  For example, in `if
 | 
| 706 |    myfunc | grep foo`,  the `myfunc` invocation must be run in a subshell.
 | 
| 707 | 1. You may want an external process to use the **boolean paradigm**, and
 | 
| 708 |    that includes **the shell itself**.  When any of the `strict_` options
 | 
| 709 |    encounters bad code, it aborts the shell with **error** status `1`, not
 | 
| 710 |    boolean **false** `1`.
 | 
| 711 | 
 | 
| 712 | The result of this fundamental issue is that `strict_errexit` is quite strict.
 | 
| 713 | On the other hand, the resulting style is straightforward and explicit.
 | 
| 714 | Earlier attempts allowed code that is too subtle.
 | 
| 715 | 
 | 
| 716 | ### Quirky Behavior of `$?`
 | 
| 717 | 
 | 
| 718 | This is a different way of summarizing the information above.
 | 
| 719 | 
 | 
| 720 | Simple commands have an obvious behavior:
 | 
| 721 | 
 | 
| 722 |     echo hi           # $? is 0
 | 
| 723 |     false             # $? is 1
 | 
| 724 | 
 | 
| 725 | But the parent process loses errors from failed command subs:
 | 
| 726 | 
 | 
| 727 |     echo $(false)     # $? is 0
 | 
| 728 |                       # YSH makes it fail with command_sub_errexit
 | 
| 729 | 
 | 
| 730 | Surprisingly, bare assignments take on the value of any command subs:
 | 
| 731 | 
 | 
| 732 |     x=$(false)        # $? is 1 -- we did NOT lose the exit code
 | 
| 733 | 
 | 
| 734 | But assignment builtins have the problem again:
 | 
| 735 | 
 | 
| 736 |     local x=$(false)  # $? is 0 -- exit code is clobbered
 | 
| 737 |                       # disallowed by YSH strict_errexit
 | 
| 738 | 
 | 
| 739 | So shell is confusing and inconsistent, but YSH fixes all these problems.  You
 | 
| 740 | never lose the exit code of `false`.
 | 
| 741 | 
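The four cases above can be checked in bash with the following sketch.  Each case runs in a child bash that prints `$?`; the variable names are assumptions for this example.  The last line shows the shell-style fix of declaring and assigning separately.

```shell
# Command sub inside a simple command: the failure is lost.
command_sub_status=$(bash -c 'echo $(false) >/dev/null; echo $?')

# Bare assignment: the status of the command sub is kept.
bare_assign_status=$(bash -c 'x=$(false); echo $?')

# Assignment builtin: the status of 'local' itself (0) clobbers it.
local_assign_status=$(bash -c 'f() { local x=$(false); echo $?; }; f')

# Declare first, then assign: the status is kept again.
split_assign_status=$(bash -c 'f() { local x; x=$(false); echo $?; }; f')

echo "echo \$(false):     $command_sub_status"    # 0
echo "x=\$(false):        $bare_assign_status"    # 1
echo "local x=\$(false):  $local_assign_status"   # 0
echo "local x; x=\$(..):  $split_assign_status"   # 1
```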
 | 
| 742 | 
 | 
| 743 |  
 | 
| 744 | 
 | 
| 745 | ## Acknowledgments
 | 
| 746 | 
 | 
| 747 | - Thank you to `ca2013` for extensive review and proofreading of this doc.
 | 
| 748 | 
 | 
| 749 | 
 |