---
default_highlighter: oils-sh
---

Guide to Procs and Funcs
========================

YSH has two major units of code: shell-like `proc`, and Python-like `func`.

- Roughly speaking, procs are for commands and **I/O**, while funcs are for
  pure **computation**.
- Procs are often **big**, and may call **small** funcs. On the other hand,
  it's possible, but rarer, for funcs to call procs.
- You can write shell scripts **mostly** with procs, and perhaps a few funcs.

This doc compares the two mechanisms, and gives rough guidelines.

<!--
See the blog for more conceptual background: [Oils is
Exterior-First](https://www.oilshell.org/blog/2023/06/ysh-design.html).
-->

<div id="toc">
</div>

## Tip: Start Simple

Before going into detail, here's a quick reminder that you don't have to use
**either** procs or funcs. YSH is a language that scales both down and up.

You can start with just a list of plain commands:

    mkdir -p /tmp/dest
    cp --verbose *.txt /tmp/dest

Then copy those into procs as the script gets bigger:

    proc build-app {
      ninja --verbose
    }

    proc deploy {
      mkdir -p /tmp/dest
      cp --verbose *.txt /tmp/dest
    }

    build-app
    deploy

Then add funcs if you need pure computation:

    func isTestFile(name) {
      return (name => endsWith('_test.py'))
    }

    if (isTestFile('my_test.py')) {
      echo 'yes'
    }
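
The two can also be combined. Here's a minimal sketch of a proc that handles the I/O and delegates a small decision to a func. The proc name `list-tests` and the file names are made up for illustration.

```ysh
# Pure computation: decide whether a name looks like a test file
func isTestFile(name) {
  return (name => endsWith('_test.py'))
}

# I/O: print the matching names, one per line
proc list-tests (...paths) {
  for path in (paths) {
    if (isTestFile(path)) {
      echo $path
    }
  }
}

list-tests foo_test.py bar.py baz_test.py
```

With these arguments, only `foo_test.py` and `baz_test.py` should be printed.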

## At a Glance

### Procs vs. Funcs

This table summarizes the difference between procs and funcs. The rest of the
doc will elaborate on these issues.

<style>
  thead {
    background-color: #eee;
    font-weight: bold;
  }
  table {
    font-family: sans-serif;
    border-collapse: collapse;
  }

  tr {
    border-bottom: solid 1px;
    border-color: #ddd;
  }

  td {
    padding: 8px;  /* override default of 5px */
  }
</style>

<table>
  <thead>
    <tr>
      <td></td>
      <td>Proc</td>
      <td>Func</td>
    </tr>
  </thead>

  <tr>
    <td>Design Influence</td>
<td>

Shell-like.

</td>
<td>

Python- and JavaScript-like, but **pure**.

</td>
  </tr>

  <tr>
    <td>Shape</td>

<td>

Procs are shaped like Unix processes: with `argv`, an integer return code, and
`stdin` / `stdout` streams.

They're a generalization of Bourne shell "functions".

</td>
<td>

Funcs are shaped like mathematical functions.

</td>
  </tr>

  <tr>
<td>

Architectural Role ([Oils is Exterior First](https://www.oilshell.org/blog/2023/06/ysh-design.html))

</td>
<td>

**Exterior**: processes and files.

</td>

<td>

**Interior**: functions and garbage-collected data structures.

</td>
  </tr>

  <tr>
    <td>I/O</td>
    <td>

Procs may start external processes and pipelines. Can perform I/O anywhere.

</td>
    <td>

Funcs need an explicit `value.IO` param to perform I/O.

</td>
  </tr>

  <tr>
    <td>Example Definition</td>
<td>

    proc print-max (; x, y) {
      echo $[x if x > y else y]
    }

</td>
<td>

    func computeMax(x, y) {
      return (x if x > y else y)
    }

</td>
  </tr>

  <tr>
    <td>Example Call</td>
<td>

    print-max (3, 4)

Procs can be put in pipelines:

    print-max (3, 4) | tee out.txt

</td>
<td>

    var m = computeMax(3, 4)

Or throw away the return value, which is useful for functions that mutate:

    call computeMax(3, 4)

</td>
  </tr>

  <tr>
    <td>Naming Convention</td>
<td>

`kebab-case`

</td>
<td>

`camelCase`

</td>
  </tr>

  <tr>
<td>

[Syntax Mode](command-vs-expression-mode.html) of call site

</td>
    <td>Command Mode</td>
    <td>Expression Mode</td>
  </tr>

  <tr>
    <td>Kinds of Parameters / Arguments</td>
    <td>

1. Word aka string
1. Typed and Positional
1. Typed and Named
1. Block

Examples shown below.

</td>
    <td>

1. Positional
1. Named

(both typed)

</td>
  </tr>

  <tr>
    <td>Return Value</td>
    <td>Integer status 0-255</td>
    <td>

Any type of value, e.g.

    return ([42, {name: 'bob'}])

</td>
  </tr>

  <tr>
    <td>Interface Evolution</td>
<td>

**Slower**: Procs exposed to the outside world may need to evolve in a compatible or "versionless" way.

</td>
<td>

**Faster**: Funcs may be refactored internally.

</td>
  </tr>

  <tr>
    <td>Parallelism?</td>
<td>

Procs can be parallel with:

- shell constructs: pipelines, `&` aka `fork`
- external tools and the [$0 Dispatch
  Pattern](https://www.oilshell.org/blog/2021/08/xargs.html): xargs, make,
  Ninja, etc.

</td>
<td>

Funcs are inherently **serial**, unless wrapped in a proc.

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">More <code>proc</code> features ...</td>
  </tr>

  <tr>
    <td>Kinds of Signature</td>
    <td>

Open `proc p {` or <br/>
Closed `proc p () {`

</td>
    <td>-</td>
  </tr>

  <tr>
    <td>Lazy Args</td>
<td>

    assert [42 === x]

</td>
    <td>-</td>
  </tr>

</table>

### Func Calls and Defs

Now that we've compared procs and funcs, let's look more closely at funcs.
They're inherently **simpler**: they have 2 types of args and params, rather
than 4.

YSH argument binding is based on Julia, which has all the power of Python, but
without the "evolved warts" (e.g. `/` and `*`).

In general, with all the bells and whistles, func definitions look like:

    # pos args and named args separated with ;
    func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
      return (len(rest_pos) + len(rest_named))
    }

Func calls look like:

    # spread operator ... at call site
    var pos_args = [3, 4]
    var named_args = {foo: 'bar'}
    var x = f(1, 2, ...pos_args; n1=43, ...named_args)

Note that positional args/params and named args/params can be thought of as two
"separate worlds".
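
To make the binding concrete, here's a sketch that puts the definition and call above together and traces which params receive which args. The comments follow the binding rules as this doc describes them. The second call, with no extra args, is an added illustration.

```ysh
func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
  return (len(rest_pos) + len(rest_named))
}

var pos_args = [3, 4]
var named_args = {foo: 'bar'}

# Positional world: p1=1, p2=2, rest_pos=[3, 4]
# Named world:      n1=43, n2='foo' (default), rest_named={foo: 'bar'}
var x = f(1, 2, ...pos_args; n1=43, ...named_args)
echo $x  # => 3

# No extra args, so both rest params are empty
echo $[f(1, 2)]  # => 0
```

Note that `foo` lands in `rest_named` only because it doesn't match a declared named param like `n1` or `n2`.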

This table shows simpler, more common cases.


<table>
  <thead>
    <tr>
      <td>Args / Params</td>
      <td>Call Site</td>
      <td>Definition</td>
    </tr>
  </thead>

  <tr>
    <td>Positional Args</td>
<td>

    var x = myMax(3, 4)

</td>
<td>

    func myMax(x, y) {
      return (x if x > y else y)
    }

</td>
  </tr>

  <tr>
    <td>Spread Pos Args</td>
<td>

    var args = [3, 4]
    var x = myMax(...args)

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td>Rest Pos Params</td>
<td>

    var x = myPrintf("%s is %d", 'bob', 30)

</td>
<td>

    func myPrintf(fmt, ...args) {
      # ...
    }

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Named Args</td>
<td>

    var x = mySum(3, 4, start=5)

</td>
<td>

    func mySum(x, y; start=0) {
      return (x + y + start)
    }

</td>
  </tr>

  <tr>
    <td>Spread Named Args</td>
<td>

    var opts = {start: 5}
    var x = mySum(3, 4; ...opts)

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td>Rest Named Params</td>
<td>

    var x = f(start=5, end=7)

</td>
<td>

    func f(; ...opts) {
      if ('start' not in opts) {
        setvar opts.start = 0
      }
      # ...
    }

</td>
  </tr>

</table>

### Proc Calls and Defs

Like funcs, procs have 2 kinds of typed args/params: positional and named.

But they may also have **string aka word** args/params, and a **block**
arg/param.

In general, a proc signature has 4 sections, like this:

    proc p (
        w1, w2, ...rest_word;   # word params
        p1, p2, ...rest_pos;    # pos params
        n1, n2, ...rest_named;  # named params
        block                   # block param
    ) {
      echo 'body'
    }

In general, a proc call looks like this:

    var pos_args = [3, 4]
    var named_args = {foo: 'bar'}

    p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args) {
      echo 'block'
    }

The block can also be passed as an expression after a second semicolon:

    p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args; block)

<!--
- Block is really last positional arg: `cd /tmp { echo $PWD }`
-->

Some simpler examples:

<table>
  <thead>
    <tr>
      <td>Args / Params</td>
      <td>Call Site</td>
      <td>Definition</td>
    </tr>
  </thead>

  <tr>
    <td>Word args</td>
<td>

    my-cd /tmp

</td>
<td>

    proc my-cd (dest) {
      cd $dest
    }

</td>
  </tr>

  <tr>
    <td>Rest Word Params</td>
<td>

    my-cd -L /tmp

</td>
<td>

    proc my-cd (...flags) {
      cd @flags
    }

</td>
  </tr>

  <tr>
    <td>Spread Word Args</td>
<td>

    var flags = :| -L /tmp |
    my-cd @flags

</td>
<td>

(as above)

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Typed Pos Arg</td>
<td>

    print-max (3, 4)

</td>
<td>

    proc print-max ( ; x, y) {
      echo $[x if x > y else y]
    }

</td>
  </tr>

  <tr>
    <td>Typed Named Arg</td>
<td>

    print-max (3, 4, start=5)

</td>
<td>

    proc print-max ( ; x, y; start=0) {
      # ...
    }

</td>
  </tr>

  <tr>
    <td colspan=3 style="text-align: center; padding: 3em">...</td>
  </tr>

  <tr>
    <td>Block Argument</td>
<td>

    my-cd /tmp {
      echo $PWD
      echo hi
    }

</td>
<td>

    proc my-cd (dest; ; ; block) {
      cd $dest (; ; block)
    }

</td>
  </tr>

  <tr>
    <td>All Four Kinds</td>
<td>

    p 'word' (42, verbose=true) {
      echo $PWD
      echo hi
    }

</td>
<td>

    proc p (w; myint; verbose=false; block) {
      = w
      = myint
      = verbose
      = block
    }

</td>
  </tr>

</table>

## Common Features

Let's recap the common features of procs and funcs.

### Spread Args, Rest Params

- Spread arg list `...` at call site
- Rest params `...` at definition

### The `error` builtin raises exceptions

The `error` builtin is idiomatic in both funcs and procs:

    func f(x) {
      if (x <= 0) {
        error 'Should be positive' (status=99)
      }
    }

Tip: reserve such errors for **exceptional** situations. For example, an input
string being invalid may not be uncommon, while a disk full I/O error is more
exceptional.

(The `error` builtin is implemented with C++ exceptions, which are slow in the
error case.)
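
A caller can handle such an error with the `try` builtin. This is a rough sketch, not a definitive recipe: it assumes the `_error` register with `code` and `message` fields, as described in the YSH error handling docs, and `checkPositive` is a hypothetical func.

```ysh
func checkPositive(x) {
  if (x <= 0) {
    error 'Should be positive'
  }
  return (x)
}

# try runs the block and records any failure in _error, rather than
# aborting the script
try {
  call checkPositive(-1)
}
if (_error.code !== 0) {
  echo "caught: $[_error.message]"
}
```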

### Out Params: `&myvar` is of type `value.Place`

Out params are more common in procs, because they don't have a typed return
value.

    proc p ( ; out) {
      call out->setValue(42)
    }
    var x
    p (&x)
    echo "x set to $x"  # => x set to 42

But they can also be used in funcs:

    func f (out) {
      call out->setValue(42)
    }
    var x
    call f(&x)
    echo "x set to $x"  # => x set to 42

Observation: procs can do everything funcs can. But you may want the purity
and familiar syntax of a `func`.

---

Design note: out params are a nicer way of doing what bash does with `declare
-n` aka `nameref` variables. They don't rely on [dynamic
scope]($xref:dynamic-scope).
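
Out params also let a proc hand back more than one value. Here's a small sketch that uses only the `setValue` method shown above. The proc name `min-max` and its param names are made up for illustration.

```ysh
proc min-max ( ; x, y, out_min, out_max) {
  # Each out param is a value.Place that refers to a variable in the caller
  call out_min->setValue(x if x < y else y)
  call out_max->setValue(x if x > y else y)
}

var lo
var hi
min-max (3, 4, &lo, &hi)
echo "lo=$lo hi=$hi"  # => lo=3 hi=4
```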

## Proc-Only Features

Procs have some features that funcs don't have.

### Lazy Arg Lists `where [x > 10]`

A lazy arg list is implemented with `shopt --set parse_bracket`, and is syntax
sugar for an unevaluated `value.Expr`.

Longhand:

    var my_expr = ^[42 === x]  # value of type Expr
    assert (my_expr)

Shorthand:

    assert [42 === x]  # equivalent to the above

### Open Proc Signatures bind `argv`

TODO: Implement new `ARGV` semantics.

When a proc signature omits `()`, it's called **"open"** because the caller can
pass "extra" arguments:

    proc my-open {
      write 'args are' @ARGV
    }
    # All valid:
    my-open
    my-open 1
    my-open 1 2

Stricter closed procs:

    proc my-closed (x) {
      write 'arg is' $x
    }
    my-closed      # runtime error: missing argument
    my-closed 1    # valid
    my-closed 1 2  # runtime error: too many arguments


An "open" proc is nearly identical to a shell function:

    shfunc() {
      write 'args are' @ARGV
    }

## Usage Notes

### 3 Ways to Return a Value

Let's review the recommended ways to "return" a value:

1. `return (x)` in a `func`.
   - The parentheses are required because expressions like `(x + 1)` should
     look different than words.
1. Pass a `value.Place` instance to a proc or func.
   - That is, out param `&out`.
1. Print to stdout in a `proc`
   - Capture it with command sub: `$(myproc)`
   - Or with `read`: `myproc | read --all; echo $_reply`
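
The three recommended ways can be compared side by side. This is a minimal sketch with made-up names (`double`, `double-to`, `double-print`), each computing the same result.

```ysh
# 1. A func returns a typed value
func double(x) {
  return (x * 2)
}
var a = double(21)

# 2. A proc sets an out param, which is a value.Place
proc double-to ( ; x, out) {
  call out->setValue(x * 2)
}
var b
double-to (21, &b)

# 3. A proc prints to stdout, and the caller captures it with command sub
proc double-print ( ; x) {
  echo $[x * 2]
}
var c = $(double-print (21))

echo "a=$a b=$b c=$c"  # => a=42 b=42 c=42
```

One difference to keep in mind: `a` and `b` hold integers, while `c` holds the string `'42'`, because stdout is always text.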

Obsolete ways of "returning":

1. Using `declare -n` aka `nameref` variables in bash.
1. Relying on [dynamic scope]($xref:dynamic-scope) in POSIX shell.

### Procs Compose in Pipelines / "Bernstein Chaining"

Some YSH users may tend toward funcs because they're more familiar. But shell
composition with procs is very powerful!

Procs have at least two kinds of composition that funcs don't have.

See #[shell-the-good-parts]($blog-tag):

1. [Shell Has a Forth-Like
   Quality](https://www.oilshell.org/blog/2017/01/13.html) - Bernstein
   chaining.
1. [Pipelines Support Vectorized, Point-Free, and Imperative
   Style](https://www.oilshell.org/blog/2017/01/15.html) - the shell can
   transparently run procs as elements of pipelines.
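
As a small illustration of the second kind of composition, here's a sketch of two procs used as pipeline elements. The names `upper` and `first-two` are made up, and the sketch assumes the usual `tr` and `head` tools are installed.

```ysh
# Each proc reads stdin and writes stdout, like any other process
proc upper {
  tr 'a-z' 'A-Z'
}

proc first-two {
  head -n 2
}

# write prints one argument per line
write apple banana cherry | upper | first-two
```

This should print `APPLE` and `BANANA` on separate lines. Neither proc needs to know whether its neighbor in the pipeline is another proc or an external command.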

<!--

In summary:

* func signatures look like JavaScript, Julia, and Go.
  * named and positional are separated with `;` in the signature.
  * The prefix `...` "spread" operator takes the place of Python's `*args` and `**kwargs`.
  * There are optional type annotations
* procs are like shell functions
  * but they also allow you to name parameters, and throw errors if the arity
is wrong.
  * and they take blocks.

-->

## Summary

YSH is influenced by both shell and Python, so it has both procs and funcs.

Many programmers will gravitate towards funcs because they're familiar, but
procs are more powerful and shell-like.

Make your YSH programs more powerful by learning to use procs!

## Appendix

### Implementation Details

Procs and funcs both have these concerns:

1. Evaluation of default args at definition time.
1. Evaluation of actual args at the call site.
1. Arg-Param binding for builtin functions, e.g. with `typed_args.Reader`.
1. Arg-Param binding for user-defined functions.

So the implementation can be thought of as a **2 &times; 4 matrix**, with some
code shared. This code is mostly in [ysh/func_proc.py]($oils-src).

### Related

- [Variable Declaration, Mutation, and Scope](variables.html) - in particular,
  procs don't have [dynamic scope]($xref:dynamic-scope).
- [Block Literals](block-literals.html) (in progress)

<!--
TODO: any reference topics?
-->

<!--
OK we're getting close here -- #**language-design>Unifying Proc and Func Params**

I think we need to write a quick guide first, not a reference


It might have some **tables**

It might mention concrete use cases like the **flag parser** -- #**oil-dev>Progress on argparse**


### Diff-based explanation

- why not Python -- because of `/` and `*` special cases
- Julia influence
- lazy args for procs `where` filters and `awk`
- out Ref parameters are for "returning" without printing to stdout

#**language-design>N ways to "return" a value**


- What does shell have?
  - it has blocks, e.g. with redirects
  - it has functions without params -- only named params


- Ruby influence -- rich DSLs


So I think you can say we're a mix of

- shell
- Python
- Julia (mostly subsumes Python?)
- Ruby


### Implementation-based explanation

- ASDL schemas -- #**oil-dev>Good Proc/Func refactoring**


### Big Idea: procs are for I/O, funcs are for computation

We may want to go full in on this idea with #**language-design>func evaluator without redirects and $?**


### Very Basic Advice, Up Front


Done with #**language-design>value.Place, & operator, read builtin**

Place works with both func and proc


### Bump

I think this might go in the backlog - #**blog-ideas**


#**language-design>Simplify proc param passing?**

-->



<!-- vim sw=2 -->