---
default_highlighter: oils-sh
---

YSH vs. Shell Idioms
====================

This is an informal, lightly-organized list of recommended idioms for the
[YSH]($xref) language. Each section has snippets labeled *No* and *Yes*.

- Use the *Yes* style when you want to write in YSH, and don't care about
  compatibility with other shells.
- The *No* style is discouraged in new code, but YSH will run it. The [OSH
  language]($xref:osh-language) is compatible with
  [POSIX]($xref:posix-shell-spec) and [bash]($xref).

[J8 Notation]: j8-notation.html

<!-- cmark.py expands this -->
<div id="toc">
</div>

## Use [Simple Word Evaluation](simple-word-eval.html) to Avoid "Quoting Hell"

### Substitute Variables

No:

    local x='my song.mp3'
    ls "$x"  # quotes required to avoid mangling

Yes:

    var x = 'my song.mp3'
    ls $x  # no quotes needed

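To see the mangling that the *No* style guards against, here is a minimal
sketch in plain POSIX shell. `count_args` is a hypothetical helper defined only
for this demo, and the default `IFS` is assumed.

```sh
# Hypothetical helper for this demo: prints how many arguments it received.
count_args() { echo "$#"; }

x='my song.mp3'
unquoted=$(count_args $x)     # $x is split on the space into 2 words
quoted=$(count_args "$x")     # quotes keep it as 1 word
echo "unquoted=$unquoted quoted=$quoted"
```

In YSH, the unquoted form behaves like the quoted one, which is why the quotes
can be dropped.
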
### Splice Arrays

No:

    local myflags=( --all --long )
    ls "${myflags[@]}" "$@"

Yes:

    var myflags = :| --all --long |
    ls @myflags @ARGV

### Explicitly Split, Glob, and Omit Empty Args

YSH doesn't split arguments after variable expansion.

No:

    local packages='python-dev gawk'
    apt install $packages

Yes:

    var packages = 'python-dev gawk'
    apt install @[split(packages)]

Even better:

    var packages = :| python-dev gawk |  # array literal
    apt install @packages                # splice array

---

YSH doesn't glob after variable expansion.

No:

    local pat='*.py'
    echo $pat

Yes:

    var pat = '*.py'
    echo @[glob(pat)]   # explicit call

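A sketch of the implicit globbing that shell performs after substitution, in
plain POSIX shell. The scratch directory and the file names are made up for
this demo.

```sh
# Scratch directory with two matching files (names are illustrative).
dir=$(mktemp -d)
touch "$dir/a.py" "$dir/b.py"
cd "$dir" || exit 1

pat='*.py'
unquoted=$(echo $pat)      # the shell expands the pattern after substitution
quoted=$(echo "$pat")      # quotes keep the pattern literal
echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```
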
---

YSH doesn't omit unquoted words that evaluate to the empty string.

No:

    local e=''
    cp $e other $dest            # cp gets 2 args, not 3, in sh

Yes:

    var e = ''
    cp @[maybe(e)] other $dest   # explicit call

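The "2 args, not 3" behavior can be checked with a small POSIX shell sketch.
`count_args` is a hypothetical helper and `/tmp` is just a placeholder value
for `dest`.

```sh
# Hypothetical helper for this demo: prints how many arguments it received.
count_args() { echo "$#"; }

e=''
dest=/tmp
unquoted=$(count_args $e other $dest)      # the empty word disappears
quoted=$(count_args "$e" other "$dest")    # the empty string is kept
echo "unquoted=$unquoted quoted=$quoted"
```

In YSH the word is never silently dropped, so `maybe()` is how you ask for the
omission on purpose.
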
### Iterate a Number of Times (Split Command Sub)

No:

    local n=3
    for x in $(seq $n); do  # No implicit splitting of unquoted words in YSH
      echo $x
    done

OK:

    var n = 3
    for x in @(seq $n) {   # Explicit splitting
      echo $x
    }

Better:

    var n = 3
    for x in (1 .. n+1) {  # Range, avoids external program
      echo $x
    }

Note that `{1..3}` works in bash and YSH, but the numbers must be constant.

## Avoid Ad Hoc Parsing and Splitting

In other words, avoid *groveling through backslashes and spaces* in shell.

Instead, emit and consume [J8 Notation]($xref:j8-notation):

- J8 strings are [JSON]($xref) strings, with an upgrade for byte string
  literals
- [JSON8]($xref) is [JSON]($xref), with this same upgrade
- [TSV8]($xref) is TSV with this upgrade (not yet implemented)

Custom parsing and serializing should be limited to "the edges" of your YSH
programs.

### More Strategies For Structured Data

- **Wrap** and Adapt External Tools. Parse their output, and emit [J8 Notation][].
  - These can be one-off, "bespoke" wrappers in your program, or maintained
    programs. Use the `proc` construct and `flagspec`!
  - Example: [uxy](https://github.com/sustrik/uxy) wrappers.
  - TODO: Examples written in YSH and in other languages.
- **Patch** Existing Tools.
  - Enhance GNU grep, etc. to emit [J8 Notation][]. Add a
    `--j8` flag.
- **Write Your Own** Structured Versions.
  - For example, you can write a structured subset of `ls` in Python with
    little effort.

<!--
  ls -q and -Q already exist, but --j8 or --tsv8 is probably fine
-->

## The `write` Builtin Is Simpler Than `printf` and `echo`

### Write an Arbitrary Line

No:

    printf '%s\n' "$mystr"

Yes:

    write -- $mystr

The `write` builtin accepts `--` so it doesn't confuse flags and args.

### Write Without a Newline

No:

    echo -n "$mystr"  # breaks if mystr is -e

Yes:

    write --end '' -- $mystr
    write -n -- $mystr  # -n is an alias for --end ''

### Write an Array of Lines

    var myarray = :| one two three |
    write -- @myarray

## New Long Flags on the `read` builtin

### Read a Line

No:

    read line     # Bad because it mangles your backslashes!

For now, please use this bash idiom to read a single line:

    read -r line  # Easy to forget -r for "raw"

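A minimal POSIX shell sketch of the backslash mangling. The sample input string
is made up for this demo.

```sh
input='C:\temp\new'   # sample line containing backslashes

# Without -r, read treats each backslash as an escape character and drops it.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; })

# With -r, the line comes through unchanged.
raw=$(printf '%s\n' "$input" | { read -r line; printf '%s' "$line"; })

printf 'read:    %s\n' "$plain"
printf 'read -r: %s\n' "$raw"
```
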
YSH used to have `read --line`, but there was a design problem: reading
buffered lines doesn't mix well with reading directly from file descriptors,
and shell does the latter.

That is, `read -r` is suboptimal because it makes many syscalls, but it's
already established in shell.

### Read a Whole File

No:

    read -d ''           # harder to read, easy to forget -r

Yes:

    read --all           # sets $_reply
    read --all (&myvar)  # sets $myvar

### Read a Number of Bytes

No:

    read -n 3            # slow because it respects -d delim
                         # also strips whitespace

Better:

    read -N 3            # good behavior, but easily confused with -n

Yes:

    read --num-bytes 3           # sets $_reply
    read --num-bytes 3 (&myvar)  # sets $myvar

### Read Until `\0` (consume `find -print0`)

No:

    # Obscure syntax that bash accepts, but not other shells
    read -r -d '' myvar

Yes:

    read -0 (&myvar)

## YSH Enhancements to Builtins

### Use `shopt` Instead of `set`

Using a single builtin for all options makes scripts easier to read:

Discouraged:

    set -o errexit
    shopt -s dotglob

Idiomatic:

    shopt --set errexit
    shopt --set dotglob

(As always, `set` can be used when you care about compatibility with other
shells.)

### Use `:` When Mentioning Variable Names

YSH accepts this optional "pseudo-sigil" to make code more explicit.

No:

    read -0 record < file.bin
    echo $record

Yes:

    read -0 (&record) < file.bin
    echo $record

### Consider Using `--long-flags`

Easier to write:

    test -d /tmp
    test -d / && test -f /vmlinuz

    shopt -u extglob

Easier to read:

    test --dir /tmp
    test --dir / && test --file /vmlinuz

    shopt --unset extglob

## Use Blocks to Save and Restore Context

### Do Something In Another Directory

No:

    ( cd /tmp; echo $PWD )  # subshell is unnecessary (and limited)

No:

    pushd /tmp
    echo $PWD
    popd

Yes:

    cd /tmp {
      echo $PWD
    }

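One way the subshell form is limited: assignments made inside `( ... )` happen
in a child process and are lost afterward. A minimal POSIX shell sketch, where
`count` is an illustrative variable name:

```sh
count=0
( cd /tmp && count=1 )   # the assignment happens in a child process
echo "count=$count"      # the parent shell still sees the old value
```

A block passed to `cd` runs in the same process, so it doesn't have this
problem.
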
### Batch I/O

No:

    echo 1 > out.txt
    echo 2 >> out.txt  # appending is less efficient
                       # because of the extra open() and close()

No:

    { echo 1
      echo 2
    } > out.txt

Yes:

    fopen > out.txt {
      echo 1
      echo 2
    }

The `fopen` builtin is syntactic sugar -- it lets you see redirects before the
code that uses them.

### Temporarily Set Shell Options

No:

    set +o errexit
    myfunc  # without error checking
    set -o errexit

Yes:

    shopt --unset errexit {
      myfunc
    }

### Use the `forkwait` builtin for Subshells, not `()`

No:

    ( cd /tmp; rm *.sh )

Yes:

    forkwait {
      cd /tmp
      rm *.sh
    }

Better:

    cd /tmp {  # no process created
      rm *.sh
    }

### Use the `fork` builtin for async, not `&`

No:

    myfunc &

    { sleep 1; echo one; sleep 2; } &

Yes:

    fork { myfunc }

    fork { sleep 1; echo one; sleep 2 }

## Use Procs (Better Shell Functions)

### Use Named Parameters Instead of `$1`, `$2`, ...

No:

    f() {
      local src=$1
      local dest=${2:-/tmp}

      cp "$src" "$dest"
    }

Yes:

    proc f(src, dest='/tmp') {   # Python-like default values
      cp $src $dest
    }

### Use Named Varargs Instead of `"$@"`

No:

    f() {
      local first=$1
      shift

      echo $first
      echo "$@"
    }

Yes:

    proc f(first, @rest) {  # @ means "the rest of the arguments"
      write -- $first
      write -- @rest        # @ means "splice this array"
    }

You can also use the implicit `ARGV` variable:

    proc p {
      cp -- @ARGV /tmp
    }

### Use "Out Params" instead of `declare -n`

Out params are one way to "return" values from a `proc`.

No:

    f() {
      local in=$1
      local -n out=$2

      out=PREFIX-$in
    }

    myvar='init'
    f zzz myvar         # assigns myvar to 'PREFIX-zzz'

Yes:

    proc f(in, :out) {  # : is an out param, i.e. a string "reference"
      setref out = "PREFIX-$in"
    }

    var myvar = 'init'
    f zzz :myvar        # assigns myvar to 'PREFIX-zzz'.
                        # colon is required

### Note: Procs Don't Mess With Their Callers

That is, [dynamic scope]($xref:dynamic-scope) is turned off when procs are
invoked.

Here's an example of shell functions reading variables in their caller:

    bar() {
      echo $foo_var  # looks up the stack
    }

    foo() {
      foo_var=x
      bar
    }

    foo

In YSH, you have to pass params explicitly:

    proc bar {
      echo $foo_var  # error, not defined
    }

Shell functions can also **mutate** variables in their caller! But procs can't
do this, which makes code easier to reason about.

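A minimal sketch of that mutation in plain shell. The function and variable
names follow the example above, and the string values are made up. It uses
`local`, which is not in POSIX but is supported by bash, dash, and most other
shells.

```sh
bar() {
  foo_var='changed by bar'    # no local declaration: writes to the caller's variable
}

foo() {
  local foo_var='set by foo'
  bar
  echo "$foo_var"             # shows whatever bar left behind
}

result=$(foo)
echo "$result"
```

Even though `foo_var` is declared `local` in `foo`, the callee `bar` can still
overwrite it.
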
## Use Modules

YSH has a few lightweight features that make it easier to organize code into
files. It doesn't have "namespaces".

### Relative Imports

Suppose we are running `bin/mytool`, and we want `BASE_DIR` to be the root of
the repository so we can do a relative import of `lib/foo.sh`.

No:

    # All of these are common idioms, with caveats
    BASE_DIR=$(dirname $0)/..

    BASE_DIR=$(dirname ${BASH_SOURCE[0]})/..

    BASE_DIR=$(cd $(dirname $0)/.. && pwd)

    BASE_DIR=$(dirname $(dirname $(readlink -f $0)))

    source $BASE_DIR/lib/foo.sh

Yes:

    const BASE_DIR = "$_this_dir/.."

    source $BASE_DIR/lib/foo.sh

    # Or simply:
    source $_this_dir/../lib/foo.sh

The value of `_this_dir` is the directory that contains the currently executing
file.

### Include Guards

No:

    # libfoo.sh
    if test -n "$__LIBFOO_SH"; then  # already loaded
      return
    fi
    __LIBFOO_SH=1

Yes:

    # libfoo.sh
    module libfoo.sh || return 0

### Taskfile Pattern

No:

    deploy() {
      echo ...
    }
    "$@"

Yes:

    proc deploy() {
      echo ...
    }
    runproc @ARGV  # gives better error messages

## Error Handling

[YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html) once and
for all! Here's a comprehensive list of error handling idioms.

### Don't Use `&&` Outside of `if` / `while`

Stopping at the first failed command is implicit because `errexit` is on in
YSH.

No:

    mkdir /tmp/dest && cp foo /tmp/dest

Yes:

    mkdir /tmp/dest
    cp foo /tmp/dest

It also avoids the *Trailing `&&` Pitfall* mentioned at the end of the [error
handling](error-handling.html) doc.

### Ignore an Error

No:

    ls /bad || true  # OK because ls is external
    myfunc || true   # suffers from the "Disabled errexit Quirk"

Yes:

    try ls /bad
    try myfunc

### Retrieve A Command's Status When `errexit` is On

No:

    # set -e is enabled earlier

    set +e
    mycommand  # this ignores errors when mycommand is a function
    status=$?  # save it before it changes
    set -e

    echo $status

Yes:

    try mycommand
    echo $_status

### Does a Builtin Or External Command Succeed?

These idioms are OK in both shell and YSH:

    if ! cp foo /tmp {
      echo 'error copying'  # any non-zero status
    }

    if ! test -d /bin {
      echo 'not a directory'
    }

To be consistent with the idioms below, you can also write them like this:

    try cp foo /tmp
    if (_status !== 0) {
      echo 'error copying'
    }

### Does a Function Succeed?

When the command is a shell function, you shouldn't use `if myfunc` directly.
This is because shell has the *Disabled `errexit` Quirk*, which is detected by
YSH `strict_errexit`.

**No**:

    if myfunc; then  # errors not checked in body of myfunc
      echo 'success'
    fi

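The quirk can be reproduced with a short POSIX shell sketch. The body of
`myfunc` here is made up for the demo: its first command fails, which would
normally abort the shell under `set -e`.

```sh
myfunc() {
  false                   # would normally abort under errexit
  echo 'still running'
}

out=$(
  set -e
  if myfunc; then         # errexit is ignored while the condition runs
    echo 'success'
  fi
)
echo "$out"
```
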
**Yes**. The *`$0` Dispatch Pattern* is a workaround that works in all shells.

    if $0 myfunc; then  # invoke a new shell
      echo 'success'
    fi

    "$@"  # Run the function $1 with args $2, $3, ...

**Yes**. The YSH `try` builtin sets the special `_status` variable and returns
`0`.

    try myfunc  # doesn't abort
    if (_status === 0) {
      echo 'success'
    }

### `try` Also Takes a Block

A block arg is useful for multiple commands:

    try {  # stops at the first error
      chmod +x myfile
      cp myfile /bin
    }
    if (_status !== 0) {
      echo 'error'
    }

### Does a Pipeline Succeed?

No:

    if ps | grep python; then
      echo 'found'
    fi

This is technically correct when `pipefail` is on, but it's impossible for
YSH `strict_errexit` to distinguish it from `if myfunc | grep python` ahead
of time (the ["meta" pitfall](error-handling.html#the-meta-pitfall)). If you
know what you're doing, you can disable `strict_errexit`.

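The `pipefail` condition matters because, without it, a pipeline's status is
that of its last command only. A minimal POSIX shell sketch, assuming
`pipefail` is off (the default in shells that have it):

```sh
false | cat          # the first command fails, the last one succeeds
status=$?
echo "status=$status"   # the failure of 'false' is not reflected here
```
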
Yes:

    try {
      ps | grep python
    }
    if (_status === 0) {
      echo 'found'
    }

    # You can also examine the status of each part of the pipeline
    if (_pipeline_status[0] !== 0) {
      echo 'ps failed'
    }

### Does a Command With Process Subs Succeed?

Similar to the pipeline example above:

No:

    if ! comm <(sort left.txt) <(sort right.txt); then
      echo 'error'
    fi

Yes:

    try {
      comm <(sort left.txt) <(sort right.txt)
    }
    if (_status !== 0) {
      echo 'error'
    }

    # You can also examine the status of each process sub
    if (_process_sub_status[0] !== 0) {
      echo 'first process sub failed'
    }

(I used `comm` in this example because it doesn't have a true / false / error
status like `diff`.)

### Handle Errors in YSH Expressions

    try {
      var x = 42 / 0
      echo "result is $[42 / 0]"
    }
    if (_status !== 0) {
      echo 'divide by zero'
    }

### Test Boolean Statuses, like `grep`, `diff`, `test`

The YSH `boolstatus` builtin distinguishes **error** from **false**.

**No**, this is subtly wrong. `grep` has 3 different return values.

    if grep 'class' *.py {
      echo 'found'               # status 0 means found
    } else {
      echo 'not found OR ERROR'  # any non-zero status
    }

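The three statuses can be observed with a small POSIX shell sketch. The temp
file and search strings are made up for the demo. POSIX only promises a status
greater than 1 on error; GNU grep uses 2.

```sh
file=$(mktemp)
printf 'class Foo:\n' > "$file"

found=0;   grep -q 'class' "$file" || found=$?            # match
missing=0; grep -q 'no-such-text' "$file" || missing=$?   # no match
failed=0                                                  # unreadable input
grep -q 'class' "$file.does-not-exist" 2>/dev/null || failed=$?

echo "found=$found missing=$missing failed=$failed"
rm -f "$file"
```
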
**Yes**. `boolstatus` aborts the program if `grep` doesn't return 0 or 1.

    if boolstatus grep 'class' *.py {  # may abort
      echo 'found'               # status 0 means found
    } else {
      echo 'not found'           # status 1 means not found
    }

More flexible style:

    try grep 'class' *.py
    case $_status {
      (0) echo 'found'
          ;;
      (1) echo 'not found'
          ;;
      (*) echo 'fatal'
          exit $_status
          ;;
    }

## Use YSH Expressions, Initializations, and Assignments (var, setvar)

### Initialize and Assign Strings and Integers

No:

    local mystr=foo
    mystr='new value'

    local myint=42  # still a string in shell

Yes:

    var mystr = 'foo'
    setvar mystr = 'new value'

    var myint = 42  # a real integer

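A short POSIX shell sketch of what "still a string" means in practice. The
variable names are illustrative.

```sh
a=1
b=2
concat=$a+$b          # plain assignment just pastes the strings together
sum=$(( a + b ))      # arithmetic only happens inside $(( ))
echo "concat=$concat sum=$sum"
```

With a real integer type, as in YSH, no special arithmetic context is needed to
get the sum.
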
### Expressions on Integers

No:

    x=$(( 1 + 2*3 ))
    (( x = 1 + 2*3 ))

Yes:

    setvar x = 1 + 2*3

### Mutate Integers

No:

    (( i++ ))  # interacts poorly with errexit
    i=$(( i+1 ))

Yes:

    setvar i += 1  # like Python, with a keyword

### Initialize and Assign Arrays

Arrays in YSH look like `:| my array |` and `['my', 'array']`.

No:

    local -a myarray=(one two three)
    myarray[3]='THREE'

Yes:

    var myarray = :| one two three |
    setvar myarray[3] = 'THREE'

    var same = ['one', 'two', 'three']
    var typed = [1, 2, true, false, null]

### Initialize and Assign Dicts

Dicts in YSH look like `{key: 'value'}`.

No:

    local -A myassoc=(['key']=value ['k2']=v2)
    myassoc['key']=V

Yes:

    # keys don't need to be quoted
    var myassoc = {key: 'value', k2: 'v2'}
    setvar myassoc['key'] = 'V'

### Get Values From Arrays and Dicts

No:

    local x=${a[i-1]}
    x=${a[i]}

    local y=${A['key']}

Yes:

    var x = a[i-1]
    setvar x = a[i]

    var y = A['key']

### Conditions and Comparisons

No:

    if (( x > 0 )); then
      echo 'positive'
    fi

Yes:

    if (x > 0) {
      echo 'positive'
    }

### Substituting Expressions in Words

No:

    echo flag=$((1 + a[i] * 3))  # C-like arithmetic

Yes:

    echo flag=$[1 + a[i] * 3]    # Arbitrary YSH expressions

    # Possible, but a local var might be more readable
    echo flag=$['1' if x else '0']

## Use [Egg Expressions](eggex.html) instead of Regexes

### Test for a Match

No:

    local pat='[[:digit:]]+'
    if [[ $x =~ $pat ]]; then
      echo 'number'
    fi

Yes:

    if (x ~ /digit+/) {
      echo 'number'
    }

Or extract the pattern:

    var pat = / digit+ /
    if (x ~ pat) {
      echo 'number'
    }

### Extract Submatches

No:

    if [[ $x =~ foo-([[:digit:]]+) ]]; then
      echo "${BASH_REMATCH[1]}"  # first submatch
    fi

Yes:

    if (x ~ / 'foo-' <capture d+> /) {  # <> is capture
      echo $[_group(1)]                 # first submatch
    }

## Glob Matching

No:

    if [[ $x == *.py ]]; then
      echo 'Python'
    fi

Yes:

    if (x ~~ '*.py') {
      echo 'Python'
    }

No:

    case $x in
      *.py)
        echo Python
        ;;
      *.sh)
        echo Shell
        ;;
    esac

Yes (purely a style preference):

    case $x {     # curly braces
      (*.py)      # balanced parens
        echo 'Python'
        ;;
      (*.sh)
        echo 'Shell'
        ;;
    }

## TODO

### Distinguish Between Variables and Functions

- `$RANDOM` vs. `random()`
- `LANG=C` vs. `shopt --setattr LANG=C`

## Related Documents

- [Shell Language Idioms](shell-idioms.html). This advice applies to shells
  other than YSH.
- [What Breaks When You Upgrade to YSH](upgrade-breakage.html). Shell constructs that YSH
  users should avoid.
- [YSH Fixes Shell's Error Handling (`errexit`)](error-handling.html). YSH fixes the
  flaky error handling in POSIX shell and bash.
- TODO: Go through more of the [Pure Bash
  Bible](https://github.com/dylanaraps/pure-bash-bible). YSH provides
  alternatives for such quirky syntax.