---
default_highlighter: oils-sh
---

YSH vs. Shell Idioms
====================

This is an informal, lightly-organized list of recommended idioms for the
[YSH]($xref) language. Each section has snippets labeled *No* and *Yes*.

- Use the *Yes* style when you want to write in YSH, and don't care about
  compatibility with other shells.
- The *No* style is discouraged in new code, but YSH will run it. The [OSH
  language]($xref:osh-language) is compatible with
  [POSIX]($xref:posix-shell-spec) and [bash]($xref).

[J8 Notation]: j8-notation.html

<!-- cmark.py expands this -->
<div id="toc">
</div>

## Use [Simple Word Evaluation](simple-word-eval.html) to Avoid "Quoting Hell"

### Substitute Variables

No:

    local x='my song.mp3'
    ls "$x"  # quotes required to avoid mangling

Yes:

    var x = 'my song.mp3'
    ls $x  # no quotes needed

### Splice Arrays

No:

    local myflags=( --all --long )
    ls "${myflags[@]}" "$@"

Yes:

    var myflags = :| --all --long |
    ls @myflags @ARGV

### Explicitly Split, Glob, and Omit Empty Args

YSH doesn't split arguments after variable expansion.

No:

    local packages='python-dev gawk'
    apt install $packages

Yes:

    var packages = 'python-dev gawk'
    apt install @[split(packages)]

Even better:

    var packages = :| python-dev gawk |  # array literal
    apt install @packages                # splice array

---

YSH doesn't glob after variable expansion.

No:

    local pat='*.py'
    echo $pat

Yes:

    var pat = '*.py'
    echo @[glob(pat)]  # explicit call

---

YSH doesn't omit unquoted words that evaluate to the empty string.

No:

    local e=''
    cp $e other $dest  # cp gets 2 args, not 3, in sh

Yes:

    var e = ''
    cp @[maybe(e)] other $dest  # explicit call

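Splitting and empty-word elision are easiest to see by counting the arguments a command receives. The following is a minimal sketch in plain `sh` (not YSH); `count_args` is a hypothetical helper introduced only for this illustration, and the default `IFS` is assumed.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

packages='python-dev gawk'
e=''

split_count=$(count_args $packages)       # unquoted: split into 2 words
quoted_count=$(count_args "$packages")    # quoted: stays 1 word
elided_count=$(count_args $e other dest)  # unquoted empty string vanishes: 2 args
kept_count=$(count_args "$e" other dest)  # quoted empty string is kept: 3 args

echo "split=$split_count quoted=$quoted_count elided=$elided_count kept=$kept_count"
```

Run under `sh` or `bash`, this prints `split=2 quoted=1 elided=2 kept=3`. The quoted rows show the behavior that YSH gives you without the quotes.
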
### Iterate a Number of Times (Split Command Sub)

No:

    local n=3
    for x in $(seq $n); do  # No implicit splitting of unquoted words in YSH
      echo $x
    done

OK:

    var n = 3
    for x in @(seq $n) {  # Explicit splitting
      echo $x
    }

Better:

    var n = 3
    for x in (1 .. n+1) {  # Range, avoids external program
      echo $x
    }

Note that `{1..3}` works in bash and YSH, but the numbers must be constant.

## Avoid Ad Hoc Parsing and Splitting

In other words, avoid *groveling through backslashes and spaces* in shell.

Instead, emit and consume [J8 Notation]($xref:j8-notation):

- J8 strings are [JSON]($xref) strings, with an upgrade for byte string
  literals
- [JSON8]($xref) is [JSON]($xref), with this same upgrade
- [TSV8]($xref) is TSV with this upgrade (not yet implemented)

Custom parsing and serializing should be limited to "the edges" of your YSH
programs.

### More Strategies For Structured Data

- **Wrap** and Adapt External Tools. Parse their output, and emit [J8 Notation][].
  - These can be one-off, "bespoke" wrappers in your program, or maintained
    programs. Use the `proc` construct and `flagspec`!
  - Example: [uxy](https://github.com/sustrik/uxy) wrappers.
  - TODO: Examples written in YSH and in other languages.
- **Patch** Existing Tools.
  - Enhance GNU grep, etc. to emit [J8 Notation][]. Add a
    `--j8` flag.
- **Write Your Own** Structured Versions.
  - For example, you can write a structured subset of `ls` in Python with
    little effort.

<!--
  ls -q and -Q already exist, but --j8 or --tsv8 is probably fine
-->

## The `write` Builtin Is Simpler Than `printf` and `echo`

### Write an Arbitrary Line

No:

    printf '%s\n' "$mystr"

Yes:

    write -- $mystr

The `write` builtin accepts `--` so it doesn't confuse flags and args.

### Write Without a Newline

No:

    echo -n "$mystr"  # breaks if mystr is -e

Yes:

    write --end '' -- $mystr
    write -n -- $mystr  # -n is an alias for --end ''

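To see the `-e` pitfall concretely, here is a small sketch. It assumes `bash` is installed and on `PATH`, and it uses `printf '%s'` only as a point of comparison for plain shell.

```shell
mystr='-e'

# bash's echo treats the value as its own -e flag, so nothing is printed.
via_echo=$(bash -c 'echo -n "$1"' _ "$mystr")

# printf only interprets the format string, so the value survives intact.
via_printf=$(printf '%s' "$mystr")

echo "echo gave: [$via_echo]  printf gave: [$via_printf]"
```

The string is silently swallowed by `echo`, which is the kind of flag/argument confusion that `write --` rules out.
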
### Write an Array of Lines

    var myarray = :| one two three |
    write -- @myarray

## New Long Flags on the `read` builtin

### Read a Line

No:

    read line  # Bad because it mangles your backslashes!

For now, please use this bash idiom to read a single line:

    read -r line  # Easy to forget -r for "raw"

YSH used to have `read --line`, but there was a design problem: reading
buffered lines doesn't mix well with reading directly from file descriptors,
and shell does the latter.

That is, `read -r` is suboptimal because it makes many syscalls, but it's
already established in shell.

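The difference that `-r` makes shows up as soon as the input contains a backslash. A small sketch in plain `sh`; the sample path is made up for illustration.

```shell
input='C:\new\table'

# Without -r, read treats each backslash as an escape character and drops it.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; })

# With -r, the line is stored byte for byte.
raw=$(printf '%s\n' "$input" | { read -r line; printf '%s' "$line"; })

printf 'plain: %s\nraw:   %s\n' "$plain" "$raw"
```

Here `plain` ends up as `C:newtable`, while `raw` keeps the original `C:\new\table`.
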
### Read a Whole File

No:

    read -d ''  # harder to read, easy to forget -r

Yes:

    read --all           # sets $_reply
    read --all (&myvar)  # sets $myvar

### Read a Number of Bytes

No:

    read -n 3  # slow because it respects -d delim
               # also strips whitespace

Better:

    read -N 3  # good behavior, but easily confused with -n

Yes:

    read --num-bytes 3           # sets $_reply
    read --num-bytes 3 (&myvar)  # sets $myvar

### Read Until `\0` (consume `find -print0`)

No:

    # Obscure syntax that bash accepts, but not other shells
    read -r -d '' myvar

Yes:

    read -0 (&myvar)

## YSH Enhancements to Builtins

### Use `shopt` Instead of `set`

Using a single builtin for all options makes scripts easier to read:

Discouraged:

    set -o errexit
    shopt -s dotglob

Idiomatic:

    shopt --set errexit
    shopt --set dotglob

(As always, `set` can be used when you care about compatibility with other
shells.)

### Use `:` When Mentioning Variable Names

YSH accepts this optional "pseudo-sigil" to make code more explicit.

No:

    read -0 record < file.bin
    echo $record

Yes:

    read -0 (&record) < file.bin
    echo $record

### Consider Using `--long-flags`

Easier to write:

    test -d /tmp
    test -d / && test -f /vmlinuz

    shopt -u extglob

Easier to read:

    test --dir /tmp
    test --dir / && test --file /vmlinuz

    shopt --unset extglob

## Use Blocks to Save and Restore Context

### Do Something In Another Directory

No:

    ( cd /tmp; echo $PWD )  # subshell is unnecessary (and limited)

No:

    pushd /tmp
    echo $PWD
    popd

Yes:

    cd /tmp {
      echo $PWD
    }

### Batch I/O

No:

    echo 1 > out.txt
    echo 2 >> out.txt  # appending is less efficient
                       # because open() and close()

No:

    { echo 1
      echo 2
    } > out.txt

Yes:

    fopen > out.txt {
      echo 1
      echo 2
    }

The `fopen` builtin is syntactic sugar -- it lets you see redirects before the
code that uses them.

### Temporarily Set Shell Options

No:

    set +o errexit
    myfunc  # without error checking
    set -o errexit

Yes:

    shopt --unset errexit {
      myfunc
    }

### Use the `forkwait` builtin for Subshells, not `()`

No:

    ( cd /tmp; rm *.sh )

Yes:

    forkwait {
      cd /tmp
      rm *.sh
    }

Better:

    cd /tmp {  # no process created
      rm *.sh
    }

### Use the `fork` builtin for async, not `&`

No:

    myfunc &

    { sleep 1; echo one; sleep 2; } &

Yes:

    fork { myfunc }

    fork { sleep 1; echo one; sleep 2 }

## Use Procs (Better Shell Functions)

### Use Named Parameters Instead of `$1`, `$2`, ...

No:

    f() {
      local src=$1
      local dest=${2:-/tmp}

      cp "$src" "$dest"
    }

Yes:

    proc f(src, dest='/tmp') {  # Python-like default values
      cp $src $dest
    }

### Use Named Varargs Instead of `"$@"`

No:

    f() {
      local first=$1
      shift

      echo $first
      echo "$@"
    }

Yes:

    proc f(first, @rest) {  # @ means "the rest of the arguments"
      write -- $first
      write -- @rest        # @ means "splice this array"
    }

You can also use the implicit `ARGV` variable:

    proc p {
      cp -- @ARGV /tmp
    }

### Use "Out Params" instead of `declare -n`

Out params are one way to "return" values from a `proc`.

No:

    f() {
      local in=$1
      local -n out=$2

      out=PREFIX-$in
    }

    myvar='init'
    f zzz myvar  # assigns myvar to 'PREFIX-zzz'

Yes:

    proc f(in, :out) {  # : is an out param, i.e. a string "reference"
      setref out = "PREFIX-$in"
    }

    var myvar = 'init'
    f zzz :myvar  # assigns myvar to 'PREFIX-zzz'.
                  # colon is required

### Note: Procs Don't Mess With Their Callers

That is, [dynamic scope]($xref:dynamic-scope) is turned off when procs are
invoked.

Here's an example of shell functions reading variables in their caller:

    bar() {
      echo $foo_var  # looks up the stack
    }

    foo() {
      foo_var=x
      bar
    }

    foo

In YSH, you have to pass params explicitly:

    proc bar {
      echo $foo_var  # error, not defined
    }

Shell functions can also **mutate** variables in their caller! But procs can't
do this, which makes code easier to reason about.

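As a concrete illustration of that mutation, here is a short sketch in plain shell. It relies on `local`, which bash, dash, and most other shells provide although POSIX doesn't specify it; the function names `bump` and `caller` are hypothetical.

```shell
counter=0

bump() {
  counter=$((counter + 1))  # no local declaration: resolved dynamically
}

caller() {
  local counter=10  # local to caller, but still visible to bump
  bump
  seen_in_caller=$counter
}

caller
echo "caller saw $seen_in_caller, global is still $counter"
```

This prints `caller saw 11, global is still 0`: `bump` silently changed a variable that was supposedly local to `caller`.
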
## Use Modules

YSH has a few lightweight features that make it easier to organize code into
files. It doesn't have "namespaces".

### Relative Imports

Suppose we are running `bin/mytool`, and we want `BASE_DIR` to be the root of
the repository so we can do a relative import of `lib/foo.sh`.

No:

    # All of these are common idioms, with caveats
    BASE_DIR=$(dirname $0)/..

    BASE_DIR=$(dirname ${BASH_SOURCE[0]})/..

    BASE_DIR=$(cd $(dirname $0)/.. && pwd)

    BASE_DIR=$(dirname $(dirname $(readlink -f $0)))

    source $BASE_DIR/lib/foo.sh

Yes:

    const BASE_DIR = "$_this_dir/.."

    source $BASE_DIR/lib/foo.sh

    # Or simply:
    source $_this_dir/../lib/foo.sh

The value of `_this_dir` is the directory that contains the currently executing
file.

### Include Guards

No:

    # libfoo.sh
    if test -n "$__LIBFOO_SH"; then  # already sourced
      return
    fi
    __LIBFOO_SH=1

Yes:

    # libfoo.sh
    module libfoo.sh || return 0

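A shell-style guard has to return when its marker variable is already set, which is easy to get backwards. The sketch below exercises one in plain `sh`; the temporary directory and the `load_count` counter are hypothetical and exist only for the demonstration.

```shell
tmpdir=$(mktemp -d)

# Write a tiny library that counts how many times its body runs.
cat > "$tmpdir/libfoo.sh" <<'EOF'
if test -n "$__LIBFOO_SH"; then
  return 0  # already loaded
fi
__LIBFOO_SH=1
load_count=$((load_count + 1))
EOF

load_count=0
. "$tmpdir/libfoo.sh"
. "$tmpdir/libfoo.sh"  # the guard makes this a no-op

echo "load_count=$load_count"
rm -r "$tmpdir"
```

The body runs once, so this prints `load_count=1`.
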
### Taskfile Pattern

No:

    deploy() {
      echo ...
    }
    "$@"

Yes:

    proc deploy() {
      echo ...
    }
    runproc @ARGV  # gives better error messages

## Error Handling

[YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html) once and
for all! Here's a comprehensive list of error handling idioms.

### Don't Use `&&` Outside of `if` / `while`

It's implicit because `errexit` is on in YSH.

No:

    mkdir /tmp/dest && cp foo /tmp/dest

Yes:

    mkdir /tmp/dest
    cp foo /tmp/dest

It also avoids the *Trailing `&&` Pitfall* mentioned at the end of the [error
handling](error-handling.html) doc.

### Ignore an Error

No:

    ls /bad || true  # OK because ls is external
    myfunc || true   # suffers from the "Disabled errexit Quirk"

Yes:

    try { ls /bad }
    try { myfunc }

### Retrieve A Command's Status When `errexit` is On

No:

    # set -e is enabled earlier

    set +e
    mycommand  # this ignores errors when mycommand is a function
    status=$?  # save it before it changes
    set -e

    echo $status

Yes:

    try {
      mycommand
    }
    echo $[_error.code]

### Does a Builtin Or External Command Succeed?

These idioms are OK in both shell and YSH:

    if ! cp foo /tmp {
      echo 'error copying'  # any non-zero status
    }

    if ! test -d /bin {
      echo 'not a directory'
    }

To be consistent with the idioms below, you can also write them like this:

    try {
      cp foo /tmp
    }
    if failed {  # shortcut for (_error.code !== 0)
      echo 'error copying'
    }

### Does a Function Succeed?

When the command is a shell function, you shouldn't use `if myfunc` directly.
This is because shell has the *Disabled `errexit` Quirk*, which is detected by
YSH `strict_errexit`.

**No**:

    if myfunc; then  # errors not checked in body of myfunc
      echo 'success'
    fi

**Yes**. The *`$0` Dispatch Pattern* is a workaround that works in all shells.

    if $0 myfunc; then  # invoke a new shell
      echo 'success'
    fi

    "$@"  # Run the function $1 with args $2, $3, ...

**Yes**. The YSH `try` builtin sets the special `_error` variable and returns
`0`.

    try {
      myfunc  # doesn't abort
    }
    if (_error.code === 0) {
      echo 'success'
    }

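The quirk itself can be reproduced in a few lines of plain shell. In this sketch the body of `myfunc` is invented for illustration.

```shell
myfunc() {
  false                 # fails, but errexit is ignored while inside `if`
  echo 'still running'  # so this line executes anyway
}

output=$(
  set -e
  if myfunc; then
    echo 'success'
  fi
)

printf '%s\n' "$output"
```

Even with `set -e`, the output is `still running` followed by `success`: the failure of `false` is never noticed.
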
### Does a Pipeline Succeed?

No:

    if ps | grep python; then
      echo 'found'
    fi

This is technically correct when `pipefail` is on, but it's impossible for
YSH `strict_errexit` to distinguish it from `if myfunc | grep python` ahead
of time (the ["meta" pitfall](error-handling.html#the-meta-pitfall)). If you
know what you're doing, you can disable `strict_errexit`.

Yes:

    try {
      ps | grep python
    }
    if (_error.code === 0) {
      echo 'found'
    }

    # You can also examine the status of each part of the pipeline
    if (_pipeline_status[0] !== 0) {
      echo 'ps failed'
    }

### Does a Command With Process Subs Succeed?

Similar to the pipeline example above:

No:

    if ! comm <(sort left.txt) <(sort right.txt); then
      echo 'error'
    fi

Yes:

    try {
      comm <(sort left.txt) <(sort right.txt)
    }
    if failed {
      echo 'error'
    }

    # You can also examine the status of each process sub
    if (_process_sub_status[0] !== 0) {
      echo 'first process sub failed'
    }

(I used `comm` in this example because it doesn't have a true / false / error
status like `diff`.)

### Handle Errors in YSH Expressions

    try {
      var x = 42 / 0
      echo "result is $[42 / 0]"
    }
    if failed {
      echo 'divide by zero'
    }

### Test Boolean Statuses, like `grep`, `diff`, `test`

The YSH `boolstatus` builtin distinguishes **error** from **false**.

**No**, this is subtly wrong. `grep` has 3 different return values.

    if grep 'class' *.py {
      echo 'found'               # status 0 means found
    } else {
      echo 'not found OR ERROR'  # any non-zero status
    }

**Yes**. `boolstatus` aborts the program if `grep` doesn't return 0 or 1.

    if boolstatus grep 'class' *.py {  # may abort
      echo 'found'      # status 0 means found
    } else {
      echo 'not found'  # status 1 means not found
    }

More flexible style:

    try {
      grep 'class' *.py
    }
    case (_error.code) {
      (0)    { echo 'found' }
      (1)    { echo 'not found' }
      (else) { echo 'fatal' }
    }

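The three `grep` statuses are easy to observe directly. The following is a sketch in plain shell; `grep_status` is a hypothetical helper, and the sample file is created just for the demonstration.

```shell
# grep_status is a hypothetical helper: it prints grep's exit status.
grep_status() {
  rc=0
  grep -q "$@" 2>/dev/null || rc=$?
  echo "$rc"
}

sample=$(mktemp)
printf 'class Foo:\n' > "$sample"

found=$(grep_status 'class' "$sample")           # 0: a line matched
missing=$(grep_status 'interface' "$sample")     # 1: no line matched
broken=$(grep_status 'class' "$sample.missing")  # greater than 1: an error occurred

echo "found=$found missing=$missing broken=$broken"
rm -f "$sample"
```

A plain `if grep ...` folds the last two cases together, which is the distinction `boolstatus` preserves.
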
## Use YSH Expressions, Initializations, and Assignments (var, setvar)

### Initialize and Assign Strings and Integers

No:

    local mystr=foo
    mystr='new value'

    local myint=42  # still a string in shell

Yes:

    var mystr = 'foo'
    setvar mystr = 'new value'

    var myint = 42  # a real integer

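What "still a string" means in practice can be shown with a brief sketch in plain `sh`; the variable names are illustrative only.

```shell
myint=42

concatenated="$myint+1"  # ordinary string concatenation
summed=$((myint + 1))    # arithmetic has to be requested with $(( ))

echo "concatenated=$concatenated summed=$summed"
```

This prints `concatenated=42+1 summed=43`.
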
### Expressions on Integers

No:

    x=$(( 1 + 2*3 ))
    (( x = 1 + 2*3 ))

Yes:

    setvar x = 1 + 2*3

### Mutate Integers

No:

    (( i++ ))  # interacts poorly with errexit
    i=$(( i+1 ))

Yes:

    setvar i += 1  # like Python, with a keyword

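The `errexit` interaction comes from `(( ))` returning status 1 whenever its expression evaluates to zero, which `i++` does when `i` starts at 0. Here is a sketch that assumes `bash` is installed and on `PATH`.

```shell
rc=0
# i++ evaluates to the old value 0, so (( )) returns 1 and errexit aborts bash.
out=$(bash -c 'set -e; i=0; (( i++ )); echo "i=$i"') || rc=$?

echo "exit status: $rc, output: [$out]"
```

The inner script exits with status 1 before reaching its `echo`, even though the increment is not an error in any meaningful sense.
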
### Initialize and Assign Arrays

Arrays in YSH look like `:| my array |` and `['my', 'array']`.

No:

    local -a myarray=(one two three)
    myarray[3]='THREE'

Yes:

    var myarray = :| one two three |
    setvar myarray[3] = 'THREE'

    var same = ['one', 'two', 'three']
    var typed = [1, 2, true, false, null]

### Initialize and Assign Dicts

Dicts in YSH look like `{key: 'value'}`.

No:

    local -A myassoc=(['key']=value ['k2']=v2)
    myassoc['key']=V

Yes:

    # keys don't need to be quoted
    var myassoc = {key: 'value', k2: 'v2'}
    setvar myassoc['key'] = 'V'

### Get Values From Arrays and Dicts

No:

    local x=${a[i-1]}
    x=${a[i]}

    local y=${A['key']}

Yes:

    var x = a[i-1]
    setvar x = a[i]

    var y = A['key']

### Conditions and Comparisons

No:

    if (( x > 0 )); then
      echo 'positive'
    fi

Yes:

    if (x > 0) {
      echo 'positive'
    }

### Substituting Expressions in Words

No:

    echo flag=$((1 + a[i] * 3))  # C-like arithmetic

Yes:

    echo flag=$[1 + a[i] * 3]  # Arbitrary YSH expressions

    # Possible, but a local var might be more readable
    echo flag=$['1' if x else '0']

## Use [Egg Expressions](eggex.html) instead of Regexes

### Test for a Match

No:

    local pat='[[:digit:]]+'
    if [[ $x =~ $pat ]]; then
      echo 'number'
    fi

Yes:

    if (x ~ /digit+/) {
      echo 'number'
    }

Or extract the pattern:

    var pat = / digit+ /
    if (x ~ pat) {
      echo 'number'
    }

### Extract Submatches

No:

    if [[ $x =~ foo-([[:digit:]]+) ]]; then
      echo "${BASH_REMATCH[1]}"  # first submatch
    fi

Yes:

    if (x ~ / 'foo-' <capture d+> /) {  # <> is capture
      echo $[_group(1)]                 # first submatch
    }

## Glob Matching

No:

    if [[ $x == *.py ]]; then
      echo 'Python'
    fi

Yes:

    if (x ~~ '*.py') {
      echo 'Python'
    }

No:

    case $x in
      *.py)
        echo Python
        ;;
      *.sh)
        echo Shell
        ;;
    esac

Yes (purely a style preference):

    case $x {  # curly braces
      (*.py)   # balanced parens
        echo 'Python'
        ;;
      (*.sh)
        echo 'Shell'
        ;;
    }

## TODO

### Distinguish Between Variables and Functions

- `$RANDOM` vs. `random()`
- `LANG=C` vs. `shopt --setattr LANG=C`

## Related Documents

- [Shell Language Idioms](shell-idioms.html). This advice applies to shells
  other than YSH.
- [What Breaks When You Upgrade to YSH](upgrade-breakage.html). Shell constructs that YSH
  users should avoid.
- [YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html). YSH fixes the
  flaky error handling in POSIX shell and bash.
- TODO: Go through more of the [Pure Bash
  Bible](https://github.com/dylanaraps/pure-bash-bible). YSH provides
  alternatives for such quirky syntax.
