---
default_highlighter: oils-sh
---

YSH vs. Shell Idioms
====================

This is an informal, lightly-organized list of recommended idioms for the
[YSH]($xref) language. Each section has snippets labeled *No* and *Yes*.

- Use the *Yes* style when you want to write in YSH, and don't care about
  compatibility with other shells.
- The *No* style is discouraged in new code, but YSH will run it. The [OSH
  language]($xref:osh-language) is compatible with
  [POSIX]($xref:posix-shell-spec) and [bash]($xref).

[J8 Notation]: j8-notation.html

<!-- cmark.py expands this -->
<div id="toc">
</div>

## Use [Simple Word Evaluation](simple-word-eval.html) to Avoid "Quoting Hell"

### Substitute Variables

No:

    local x='my song.mp3'
    ls "$x"  # quotes required to avoid mangling

Yes:

    var x = 'my song.mp3'
    ls $x  # no quotes needed

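To see why the quotes matter in the *No* style, here is a minimal POSIX shell
sketch of the word splitting that Simple Word Evaluation turns off. The
`count_args` helper is hypothetical, introduced only for this illustration, and
the default `IFS` is assumed.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received
count_args() {
  echo "$#"
}

x='my song.mp3'

unquoted=$(count_args $x)    # the shell splits on the space: 2 arguments
quoted=$(count_args "$x")    # quoting keeps the value as 1 argument

echo "unquoted=$unquoted quoted=$quoted"
```
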
### Splice Arrays

No:

    local myflags=( --all --long )
    ls "${myflags[@]}" "$@"

Yes:

    var myflags = :| --all --long |
    ls @myflags @ARGV

### Explicitly Split, Glob, and Omit Empty Args

YSH doesn't split arguments after variable expansion.

No:

    local packages='python-dev gawk'
    apt install $packages

Yes:

    var packages = 'python-dev gawk'
    apt install @[split(packages)]

Even better:

    var packages = :| python-dev gawk |  # array literal
    apt install @packages                # splice array

---

YSH doesn't glob after variable expansion.

No:

    local pat='*.py'
    echo $pat

Yes:

    var pat = '*.py'
    echo @[glob(pat)]  # explicit call

---

YSH doesn't omit unquoted words that evaluate to the empty string.

No:

    local e=''
    cp $e other $dest  # cp gets 2 args, not 3, in sh

Yes:

    var e = ''
    cp @[maybe(e)] other $dest  # explicit call

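The empty-argument case is easy to miss, so here is a small shell sketch of
what the *No* snippet does to its argument list. As an assumption for
illustration, a hypothetical `count_args` helper stands in for `cp`, and `dest`
is set to `/tmp`.

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

e=''
dest=/tmp

elided=$(count_args $e other $dest)      # empty unquoted word vanishes: 2 args
kept=$(count_args "$e" other "$dest")    # quoted empty string survives: 3 args

echo "elided=$elided kept=$kept"
```
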
### Iterate a Number of Times (Split Command Sub)

No:

    local n=3
    for x in $(seq $n); do  # No implicit splitting of unquoted words in YSH
      echo $x
    done

OK:

    var n = 3
    for x in @(seq $n) {  # Explicit splitting
      echo $x
    }

Better:

    var n = 3
    for x in (1 .. n+1) {  # Range, avoids external program
      echo $x
    }

Note that `{1..3}` works in bash and YSH, but the numbers must be constant.

## Avoid Ad Hoc Parsing and Splitting

In other words, avoid *groveling through backslashes and spaces* in shell.

Instead, emit and consume [J8 Notation]($xref:j8-notation):

- J8 strings are [JSON]($xref) strings, with an upgrade for byte string
  literals
- [JSON8]($xref) is [JSON]($xref), with this same upgrade
- [TSV8]($xref) is TSV with this upgrade (not yet implemented)

Custom parsing and serializing should be limited to "the edges" of your YSH
programs.

### More Strategies For Structured Data

- **Wrap** and Adapt External Tools. Parse their output, and emit [J8 Notation][].
  - These can be one-off, "bespoke" wrappers in your program, or maintained
    programs. Use the `proc` construct and `flagspec`!
  - Example: [uxy](https://github.com/sustrik/uxy) wrappers.
  - TODO: Examples written in YSH and in other languages.
- **Patch** Existing Tools.
  - Enhance GNU grep, etc. to emit [J8 Notation][]. Add a
    `--j8` flag.
- **Write Your Own** Structured Versions.
  - For example, you can write a structured subset of `ls` in Python with
    little effort.

<!--
  ls -q and -Q already exist, but --j8 or --tsv8 is probably fine
-->

## The `write` Builtin Is Simpler Than `printf` and `echo`

### Write an Arbitrary Line

No:

    printf '%s\n' "$mystr"

Yes:

    write -- $mystr

The `write` builtin accepts `--` so it doesn't confuse flags and args.

### Write Without a Newline

No:

    echo -n "$mystr"  # breaks if mystr is -e

Yes:

    write --end '' -- $mystr
    write -n -- $mystr  # -n is an alias for --end ''

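The `-e` hazard in the *No* snippet can be reproduced with a short shell
sketch. It compares `echo -n` against `printf '%s'`, which is the usual
portable workaround in plain shell. What `echo` prints here varies between
shells, so the sketch only relies on the `printf` result.

```shell
mystr='-e'

# printf's format string is fixed, so the value is always treated as data
safe=$(printf '%s' "$mystr")

# echo may parse the value as a flag; in bash this prints nothing at all
risky=$(echo -n "$mystr")

printf 'printf gave: [%s]\n' "$safe"
printf 'echo gave:   [%s]\n' "$risky"
```
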
### Write an Array of Lines

    var myarray = :| one two three |
    write -- @myarray

## New Long Flags on the `read` builtin

### Read a Line

No:

    read line  # Mangles your backslashes!

Better:

    read -r line  # Still messes with leading and trailing whitespace

    IFS= read -r line  # OK, but doesn't work in YSH

Yes:

    read --raw-line  # Gives you the line, without trailing \n

(Note that `read --raw-line` is still an unbuffered read, which means it slowly
reads a byte at a time. We plan to add buffered reads as well.)

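The three shell variants above differ in ways that are easy to check. The
following is a minimal sketch in POSIX shell, with made-up input strings, that
shows what each form of `read` does to backslashes and surrounding whitespace.

```shell
# Plain read treats backslash as an escape character and drops it
plain=$(printf '%s\n' 'a\tb' | { read line; printf '%s' "$line"; })

# -r keeps backslashes, but IFS still trims leading and trailing whitespace
raw=$(printf '%s\n' 'a\tb' | { read -r line; printf '%s' "$line"; })
trimmed=$(printf '%s\n' '  hi  ' | { read -r line; printf '%s' "$line"; })

# An empty IFS for this one command preserves the line exactly
exact=$(printf '%s\n' '  hi  ' | { IFS= read -r line; printf '%s' "$line"; })

printf '[%s] [%s] [%s] [%s]\n' "$plain" "$raw" "$trimmed" "$exact"
```
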
### Read a Whole File

No:

    read -d ''  # harder to read, easy to forget -r

Yes:

    read --all           # sets $_reply
    read --all (&myvar)  # sets $myvar

### Read a Number of Bytes

No:

    read -n 3  # slow because it respects -d delim
               # also strips whitespace

Better:

    read -N 3  # good behavior, but easily confused with -n

Yes:

    read --num-bytes 3           # sets $_reply
    read --num-bytes 3 (&myvar)  # sets $myvar

### Read Until `\0` (consume `find -print0`)

No:

    # Obscure syntax that bash accepts, but not other shells
    read -r -d '' myvar

Yes:

    read -0 (&myvar)

## YSH Enhancements to Builtins

### Use `shopt` Instead of `set`

Using a single builtin for all options makes scripts easier to read:

Discouraged:

    set -o errexit
    shopt -s dotglob

Idiomatic:

    shopt --set errexit
    shopt --set dotglob

(As always, `set` can be used when you care about compatibility with other
shells.)

### Use `:` When Mentioning Variable Names

YSH accepts this optional "pseudo-sigil" to make code more explicit.

No:

    read -0 record < file.bin
    echo $record

Yes:

    read -0 (&record) < file.bin
    echo $record

### Consider Using `--long-flags`

Easier to write:

    test -d /tmp
    test -d / && test -f /vmlinuz

    shopt -u extglob

Easier to read:

    test --dir /tmp
    test --dir / && test --file /vmlinuz

    shopt --unset extglob

## Use Blocks to Save and Restore Context

### Do Something In Another Directory

No:

    ( cd /tmp; echo $PWD )  # subshell is unnecessary (and limited)

No:

    pushd /tmp
    echo $PWD
    popd

Yes:

    cd /tmp {
      echo $PWD
    }

### Batch I/O

No:

    echo 1 > out.txt
    echo 2 >> out.txt  # appending is less efficient
                       # because of the extra open() and close()

No:

    { echo 1
      echo 2
    } > out.txt

Yes:

    fopen > out.txt {
      echo 1
      echo 2
    }

The `fopen` builtin is syntactic sugar -- it lets you see redirects before the
code that uses them.

### Temporarily Set Shell Options

No:

    set +o errexit
    myfunc  # without error checking
    set -o errexit

Yes:

    shopt --unset errexit {
      myfunc
    }

### Use the `forkwait` builtin for Subshells, not `()`

No:

    ( cd /tmp; rm *.sh )

Yes:

    forkwait {
      cd /tmp
      rm *.sh
    }

Better:

    cd /tmp {  # no process created
      rm *.sh
    }

### Use the `fork` builtin for async, not `&`

No:

    myfunc &

    { sleep 1; echo one; sleep 2; } &

Yes:

    fork { myfunc }

    fork { sleep 1; echo one; sleep 2 }

## Use Procs (Better Shell Functions)

### Use Named Parameters Instead of `$1`, `$2`, ...

No:

    f() {
      local src=$1
      local dest=${2:-/tmp}

      cp "$src" "$dest"
    }

Yes:

    proc f(src, dest='/tmp') {  # Python-like default values
      cp $src $dest
    }

### Use Named Varargs Instead of `"$@"`

No:

    f() {
      local first=$1
      shift

      echo $first
      echo "$@"
    }

Yes:

    proc f(first, @rest) {  # @ means "the rest of the arguments"
      write -- $first
      write -- @rest        # @ means "splice this array"
    }

You can also use the implicit `ARGV` variable:

    proc p {
      cp -- @ARGV /tmp
    }

### Use "Out Params" instead of `declare -n`

Out params are one way to "return" values from a `proc`.

No:

    f() {
      local in=$1
      local -n out=$2

      out=PREFIX-$in
    }

    myvar='init'
    f zzz myvar  # assigns myvar to 'PREFIX-zzz'

Yes:

    proc f(in, :out) {  # : is an out param, i.e. a string "reference"
      setref out = "PREFIX-$in"
    }

    var myvar = 'init'
    f zzz :myvar  # assigns myvar to 'PREFIX-zzz'.
                  # colon is required

### Note: Procs Don't Mess With Their Callers

That is, [dynamic scope]($xref:dynamic-scope) is turned off when procs are
invoked.

Here's an example of shell functions reading variables in their caller:

    bar() {
      echo $foo_var  # looks up the stack
    }

    foo() {
      foo_var=x
      bar
    }

    foo

In YSH, you have to pass params explicitly:

    proc bar {
      echo $foo_var  # error, not defined
    }

Shell functions can also **mutate** variables in their caller! But procs can't
do this, which makes code easier to reason about.

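Both halves of that claim, reading and mutating a caller's variable, can be
seen in a short shell sketch. It reuses the `foo` and `bar` names from the
example above. The `seen` variable is made up for illustration, and `local` is
assumed to be available (it is in bash and dash, though not required by POSIX).

```shell
bar() {
  seen=$foo_var       # reads foo's local by looking up the call stack
  foo_var='mutated'   # and silently overwrites it
}

foo() {
  local foo_var='x'
  bar
  echo "$seen $foo_var"
}

result=$(foo)
echo "$result"
```
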
## Use Modules

YSH has a few lightweight features that make it easier to organize code into
files. It doesn't have "namespaces".

### Relative Imports

Suppose we are running `bin/mytool`, and we want `BASE_DIR` to be the root of
the repository so we can do a relative import of `lib/foo.sh`.

No:

    # All of these are common idioms, with caveats
    BASE_DIR=$(dirname $0)/..

    BASE_DIR=$(dirname ${BASH_SOURCE[0]})/..

    BASE_DIR=$(cd $(dirname $0)/.. && pwd)

    BASE_DIR=$(dirname $(dirname $(readlink -f $0)))

    source $BASE_DIR/lib/foo.sh

Yes:

    const BASE_DIR = "$_this_dir/.."

    source $BASE_DIR/lib/foo.sh

    # Or simply:
    source $_this_dir/../lib/foo.sh

The value of `_this_dir` is the directory that contains the currently executing
file.

### Include Guards

No:

    # libfoo.sh
    if test -z "$__LIBFOO_SH"; then
      return
    fi
    __LIBFOO_SH=1

Yes:

    # libfoo.sh
    module libfoo.sh || return 0

### Taskfile Pattern

No:

    deploy() {
      echo ...
    }
    "$@"

Yes:

    proc deploy() {
      echo ...
    }
    runproc @ARGV  # gives better error messages

## Error Handling

[YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html) once and
for all! Here's a comprehensive list of error handling idioms.

### Don't Use `&&` Outside of `if` / `while`

It's implicit because `errexit` is on in YSH.

No:

    mkdir /tmp/dest && cp foo /tmp/dest

Yes:

    mkdir /tmp/dest
    cp foo /tmp/dest

It also avoids the *Trailing `&&` Pitfall* mentioned at the end of the [error
handling](error-handling.html) doc.

### Ignore an Error

No:

    ls /bad || true  # OK because ls is external
    myfunc || true   # suffers from the "Disabled errexit Quirk"

Yes:

    try { ls /bad }
    try { myfunc }

### Retrieve A Command's Status When `errexit` is On

No:

    # set -e is enabled earlier

    set +e
    mycommand  # this ignores errors when mycommand is a function
    status=$?  # save it before it changes
    set -e

    echo $status

Yes:

    try {
      mycommand
    }
    echo $[_error.code]

### Does a Builtin Or External Command Succeed?

These idioms are OK in both shell and YSH (shown here in YSH syntax):

    if ! cp foo /tmp {
      echo 'error copying'  # any non-zero status
    }

    if ! test -d /bin {
      echo 'not a directory'
    }

To be consistent with the idioms below, you can also write them like this:

    try {
      cp foo /tmp
    }
    if failed {  # shortcut for (_error.code !== 0)
      echo 'error copying'
    }

### Does a Function Succeed?

When the command is a shell function, you shouldn't use `if myfunc` directly.
This is because shell has the *Disabled `errexit` Quirk*, which is detected by
YSH `strict_errexit`.

**No**:

    if myfunc; then  # errors not checked in body of myfunc
      echo 'success'
    fi

**Yes**. The *`$0` Dispatch Pattern* is a workaround that works in all shells.

    if $0 myfunc; then  # invoke a new shell
      echo 'success'
    fi

    "$@"  # Run the function $1 with args $2, $3, ...

**Yes**. The YSH `try` builtin sets the special `_error` variable and returns
`0`.

    try {
      myfunc  # doesn't abort
    }
    if failed {
      echo 'myfunc failed'
    }

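The *Disabled `errexit` Quirk* is easier to believe once you see it run. Here
is a minimal sketch in plain shell, where the body of `myfunc` is made up for
illustration: the failing `false` inside the function does not stop it, because
the function is being used as an `if` condition.

```shell
myfunc() {
  false             # under errexit, this would normally abort the shell
  echo 'reached'    # but as an `if` condition, errexit is disabled in here
}

# Run in a command substitution so that set -e doesn't leak into this shell
out=$(
  set -e
  if myfunc; then
    echo 'success'
  fi
)

printf '%s\n' "$out"
```

The function reports success even though one of its commands failed, which is
the situation the `try` idiom avoids.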
### Does a Pipeline Succeed?

No:

    if ps | grep python; then
      echo 'found'
    fi

This is technically correct when `pipefail` is on, but it's impossible for
YSH `strict_errexit` to distinguish it from `if myfunc | grep python` ahead
of time (the ["meta" pitfall](error-handling.html#the-meta-pitfall)). If you
know what you're doing, you can disable `strict_errexit`.

Yes:

    try {
      ps | grep python
    }
    if failed {
      echo 'not found'
    }

    # You can also examine the status of each part of the pipeline
    if (_pipeline_status[0] !== 0) {
      echo 'ps failed'
    }

### Does a Command With Process Subs Succeed?

Similar to the pipeline example above:

No:

    if ! comm <(sort left.txt) <(sort right.txt); then
      echo 'error'
    fi

Yes:

    try {
      comm <(sort left.txt) <(sort right.txt)
    }
    if failed {
      echo 'error'
    }

    # You can also examine the status of each process sub
    if (_process_sub_status[0] !== 0) {
      echo 'first process sub failed'
    }

(I used `comm` in this example because it doesn't have a true / false / error
status like `diff`.)

### Handle Errors in YSH Expressions

    try {
      var x = 42 / 0
      echo "result is $[42 / 0]"
    }
    if failed {
      echo 'divide by zero'
    }

### Test Boolean Statuses, like `grep`, `diff`, `test`

The YSH `boolstatus` builtin distinguishes **error** from **false**.

**No**, this is subtly wrong. `grep` has 3 different exit statuses: found, not
found, and error.

    if grep 'class' *.py {
      echo 'found'               # status 0 means found
    } else {
      echo 'not found OR ERROR'  # any non-zero status
    }

**Yes**. `boolstatus` aborts the program if `grep` doesn't return 0 or 1.

    if boolstatus grep 'class' *.py {  # may abort
      echo 'found'      # status 0 means found
    } else {
      echo 'not found'  # status 1 means not found
    }

More flexible style:

    try {
      grep 'class' *.py
    }
    case (_error.code) {
      (0)    { echo 'found' }
      (1)    { echo 'not found' }
      (else) { echo 'fatal' }
    }

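The three `grep` statuses that `boolstatus` distinguishes can be observed
directly. This is a small shell sketch that uses a temporary file and made-up
patterns. It assumes `mktemp` is available, and it only relies on the error
status being greater than 1, which is what POSIX specifies.

```shell
tmp=$(mktemp)
printf 'class Foo:\n' > "$tmp"

# Capture each status with || so the sketch also works under errexit
found=0;   grep -q 'class'   "$tmp" || found=$?                            # match
missing=0; grep -q 'nomatch' "$tmp" || missing=$?                          # no match
err=0;     grep -q 'class'   "$tmp.no-such-file" 2>/dev/null || err=$?     # error

rm -f "$tmp"
echo "found=$found missing=$missing err=$err"
```
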
## Use YSH Expressions, Initializations, and Assignments (var, setvar)

### Initialize and Assign Strings and Integers

No:

    local mystr=foo
    mystr='new value'

    local myint=42  # still a string in shell

Yes:

    var mystr = 'foo'
    setvar mystr = 'new value'

    var myint = 42  # a real integer

### Expressions on Integers

No:

    x=$(( 1 + 2*3 ))
    (( x = 1 + 2*3 ))

Yes:

    setvar x = 1 + 2*3

### Mutate Integers

No:

    (( i++ ))  # interacts poorly with errexit
    i=$(( i+1 ))

Yes:

    setvar i += 1  # like Python, with a keyword

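The "interacts poorly with errexit" comment deserves a concrete case. In the
sketch below, which assumes `bash` is on your `PATH`, the post-increment
expression evaluates to the old value `0`. Bash treats a zero result as exit
status 1, so `set -e` aborts the script before the `echo` runs.

```shell
# (( i++ )) evaluates to the old value 0, which bash reports as exit status 1
status=0
out=$(bash -c 'set -e; i=0; (( i++ )); echo "i=$i"') || status=$?

# The assignment form always succeeds
ok=$(bash -c 'set -e; i=0; i=$(( i + 1 )); echo "i=$i"')

echo "status=$status out=[$out] ok=[$ok]"
```
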
### Initialize and Assign Arrays

Arrays in YSH look like `:| my array |` and `['my', 'array']`.

No:

    local -a myarray=(one two three)
    myarray[3]='THREE'

Yes:

    var myarray = :| one two three |
    setvar myarray[3] = 'THREE'

    var same = ['one', 'two', 'three']
    var typed = [1, 2, true, false, null]

### Initialize and Assign Dicts

Dicts in YSH look like `{key: 'value'}`.

No:

    local -A myassoc=(['key']=value ['k2']=v2)
    myassoc['key']=V

Yes:

    # keys don't need to be quoted
    var myassoc = {key: 'value', k2: 'v2'}
    setvar myassoc['key'] = 'V'

### Get Values From Arrays and Dicts

No:

    local x=${a[i-1]}
    x=${a[i]}

    local y=${A['key']}

Yes:

    var x = a[i-1]
    setvar x = a[i]

    var y = A['key']

### Conditions and Comparisons

No:

    if (( x > 0 )); then
      echo 'positive'
    fi

Yes:

    if (x > 0) {
      echo 'positive'
    }

### Substituting Expressions in Words

No:

    echo flag=$((1 + a[i] * 3))  # C-like arithmetic

Yes:

    echo flag=$[1 + a[i] * 3]    # Arbitrary YSH expressions

    # Possible, but a local var might be more readable
    echo flag=$['1' if x else '0']

## Use [Egg Expressions](eggex.html) instead of Regexes

### Test for a Match

No:

    local pat='[[:digit:]]+'
    if [[ $x =~ $pat ]]; then
      echo 'number'
    fi

Yes:

    if (x ~ /digit+/) {
      echo 'number'
    }

Or extract the pattern:

    var pat = / digit+ /
    if (x ~ pat) {
      echo 'number'
    }

### Extract Submatches

No:

    if [[ $x =~ foo-([[:digit:]]+) ]]; then
      echo "${BASH_REMATCH[1]}"  # first submatch
    fi

Yes:

    if (x ~ / 'foo-' <capture d+> /) {  # <> is capture
      echo $[_group(1)]                 # first submatch
    }

## Glob Matching

No:

    if [[ $x == *.py ]]; then
      echo 'Python'
    fi

Yes:

    if (x ~~ '*.py') {
      echo 'Python'
    }

No:

    case $x in
      *.py)
        echo Python
        ;;
      *.sh)
        echo Shell
        ;;
    esac

Yes (purely a style preference):

    case $x {  # curly braces
      (*.py)   # balanced parens
        echo 'Python'
        ;;
      (*.sh)
        echo 'Shell'
        ;;
    }

## TODO

### Distinguish Between Variables and Functions

- `$RANDOM` vs. `random()`
- `LANG=C` vs. `shopt --setattr LANG=C`

## Related Documents

- [Shell Language Idioms](shell-idioms.html). This advice applies to shells
  other than YSH.
- [What Breaks When You Upgrade to YSH](upgrade-breakage.html). Shell constructs that YSH
  users should avoid.
- [YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html). YSH fixes the
  flaky error handling in POSIX shell and bash.
- TODO: Go through more of the [Pure Bash
  Bible](https://github.com/dylanaraps/pure-bash-bible). YSH provides
  alternatives for such quirky syntax.
