Why Sponsor Oils? | source | all docs for version 0.22.0 | all versions | oilshell.org

Guide to Procs and Funcs

YSH has two major units of code: shell-like proc, and Python-like func.

Roughly speaking, procs are for commands and I/O, while funcs are for pure computation.
Procs are often big, and may call small funcs. On the other hand, it's possible, but rarer, for funcs to call procs.
You can write shell scripts mostly with procs, and perhaps a few funcs.

This doc compares the two mechanisms, and gives rough guidelines.

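Table of Contents

Tip: Start Simple

At a Glance

Procs vs. Funcs

Func Calls and Defs

Proc Calls and Defs

Common Features

Spread Args, Rest Params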

The error builtin raises exceptions

Out Params: &myvar is of type value.Place

Proc-Only Features

Lazy Arg Lists where [x > 10]

Open Proc Signatures bind argv

Usage Notes

3 Ways to Return a Value

Procs Compose in Pipelines / "Bernstein Chaining"

Summary

Appendix

Implementation Details

Tip: Start Simple

Before going into detail, here's a quick reminder that you don't have to use either procs or funcs. YSH is a language that scales both down and up.

You can start with just a list of plain commands:

mkdir -p /tmp/dest
cp --verbose *.txt /tmp/dest

Then copy those into procs as the script gets bigger:

proc build-app {
  ninja --verbose
}

proc deploy {
  mkdir -p /tmp/dest
  cp --verbose *.txt /tmp/dest
}

build-app
deploy

Then add funcs if you need pure computation:

func isTestFile(name) {
  return (name => endsWith('_test.py'))
}

if (isTestFile('my_test.py')) {
  echo 'yes'
}

At a Glance

Procs vs. Funcs

This table summarizes the difference between procs and funcs. The rest of the doc will elaborate on these issues.

	Proc	Func
Design Influence	Shell-like.	Python- and JavaScript-like, but pure.
Shape	Procs are shaped like Unix processes: with `argv`, an integer return code, and `stdin` / `stdout` streams. They're a generalization of Bourne shell "functions".	Funcs are shaped like mathematical functions.
Architectural Role (Oils is Exterior First)	Exterior: processes and files.	Interior: functions and garbage-collected data structures.
I/O	Procs may start external processes and pipelines. Can perform I/O anywhere.	Funcs need an explicit `value.IO` param to perform I/O.
Example Definition	`proc print-max (; x, y) { echo $[x if x > y else y] }`	`func computeMax(x, y) { return (x if x > y else y) }`
Example Call	`print-max (3, 4)`. Procs can be put in pipelines: `print-max (3, 4) | tee out.txt`	`var m = computeMax(3, 4)`. Or throw away the return value, which is useful for functions that mutate: `call computeMax(3, 4)`
Naming Convention	`kebab-case`	`camelCase`
Syntax Mode of call site	Command Mode	Expression Mode
Kinds of Parameters / Arguments	Word (aka string); Typed and Positional; Typed and Named; Block. Examples are shown below.	Positional; Named (both typed)
Return Value	Integer status 0-255	Any type of value, e.g. `return ([42, {name: 'bob'}])`
Interface Evolution	Slower: Procs exposed to the outside world may need to evolve in a compatible or "versionless" way.	Faster: Funcs may be refactored internally.
Parallelism?	Procs can be parallel with shell constructs (pipelines, `&` aka `fork`), and with external tools and the $0 Dispatch Pattern (xargs, make, Ninja, etc.).	Funcs are inherently serial, unless wrapped in a proc.
More `proc` features ...
Kinds of Signature	Open `proc p {` or Closed `proc p () {`	-
Lazy Args	`assert [42 === x]`	-

Func Calls and Defs

Now that we've compared procs and funcs, let's look more closely at funcs. They're inherently simpler: they have 2 types of args and params, rather than 4.

YSH argument binding is based on Julia, which has all the power of Python, but without the "evolved warts" (e.g. / and *).

In general, with all the bells and whistles, func definitions look like:

# pos args and named args separated with ;
func f(p1, p2, ...rest_pos; n1=42, n2='foo', ...rest_named) {
  return (len(rest_pos) + len(rest_named))
}

Func calls look like:

# spread operator ... at call site
var pos_args = [3, 4]
var named_args = {foo: 'bar'}
var x = f(1, 2, ...pos_args; n1=43, ...named_args)

Note that positional args/params and named args/params can be thought of as two "separate worlds".

This table shows simpler, more common cases.

Args / Params	Call Site	Definition
Positional Args	`var x = myMax(3, 4)`	`func myMax(x, y) { return (x if x > y else y) }`
Spread Pos Args	`var args = [3, 4]; var x = myMax(...args)`	(as above)
Rest Pos Params	`var x = myPrintf("%s is %d", 'bob', 30)`	`func myPrintf(fmt, ...args) { # ... }`
...
Named Args	`var x = mySum(3, 4, start=5)`	`func mySum(x, y; start=0) { return (x + y + start) }`
Spread Named Args	`var opts = {start: 5}; var x = mySum(3, 4; ...opts)`	(as above)
Rest Named Params	`var x = f(start=5, end=7)`	`func f(; ...opts) { if ('start' not in opts) { setvar opts.start = 0 } # ... }`

Proc Calls and Defs

Like funcs, procs have 2 kinds of typed args/params: positional and named.

But they may also have string aka word args/params, and a block arg/param.

In general, a proc signature has 4 sections, like this:

proc p (
    w1, w2, ...rest_word;     # word params
    p1, p2, ...rest_pos;      # pos params
    n1, n2, ...rest_named;    # named params
    block                     # block param
) {
  echo 'body'
}

In general, a proc call looks like this:

var pos_args = [3, 4]
var named_args = {foo: 'bar'}

p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args) {
  echo 'block'
}

The block can also be passed as an expression after a second semicolon:

p /bin /tmp (1, 2, ...pos_args; n1=43, ...named_args; block)

Some simpler examples:

Args / Params	Call Site	Definition
Word args	`my-cd /tmp`	`proc my-cd (dest) { cd $dest }`
Rest Word Params	`my-cd -L /tmp`	`proc my-cd (...flags) { cd @flags }`
Spread Word Args	`var flags = :| -L /tmp |; my-cd @flags`	(as above)
...
Typed Pos Arg	`print-max (3, 4)`	`proc print-max ( ; x, y) { echo $[x if x > y else y] }`
Typed Named Arg	`print-max (3, 4, start=5)`	`proc print-max ( ; x, y; start=0) { # ... }`
...
Block Argument	`my-cd /tmp { echo $PWD; echo hi }`	`proc my-cd (dest; ; ; block) { cd $dest (; ; block) }`
All Four Kinds	`p 'word' (42, verbose=true) { echo $PWD; echo hi }`	`proc p (w; myint; verbose=false; block) { = w; = myint; = verbose; = block }`

Common Features

Let's recap the common features of procs and funcs.

Spread Args, Rest Params

Spread arg list ... at call site
Rest params ... at definition
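
As a rough illustration of both halves, here is a small sketch that reuses only syntax shown elsewhere in this guide. The names countArgs and show-words are made up for the example. Note that typed args are spread with ..., while word args are spliced with @.

```ysh
# Rest param: extra positional args are collected into a list
func countArgs(first, ...rest) {
  return (1 + len(rest))
}

# Spread: a list is expanded into positional args at the call site
var more = [20, 30]
var n = countArgs(10, ...more)
echo "n = $n"  # => n = 3

# Rest word params in a proc (hypothetical proc name)
proc show-words (...words) {
  write -- @words  # one word per line
}

# Word args are spliced with @ rather than ...
var flags = :| -L /tmp |
show-words @flags
```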

The `error` builtin raises exceptions

The error builtin is idiomatic in both funcs and procs:

func f(x) {   
  if (x <= 0) {
    error 'Should be positive' (status=99)
  }
}

Tip: reserve such errors for exceptional situations. For example, an input string being invalid may not be uncommon, while a disk full I/O error is more exceptional.

(The error builtin is implemented with C++ exceptions, which are slow in the error case.)

Out Params: `&myvar` is of type `value.Place`

Out params are more common in procs, because procs don't have a typed return value.

proc p ( ; out) {
  call out->setValue(42)
}
var x
p (&x)
echo "x set to $x"  # => x set to 42

But they can also be used in funcs:

func f (out) {
  call out->setValue(42)
}
var x
call f(&x)
echo "x set to $x"  # => x set to 42

Observation: procs can do everything funcs can. But you may want the purity and familiar syntax of a func.

Design note: out params are a nicer way of doing what bash does with declare -n aka nameref variables. They don't rely on dynamic scope.

Proc-Only Features

Procs have some features that funcs don't have.

Lazy Arg Lists `where [x > 10]`

A lazy arg list is implemented with shopt --set parse_bracket, and is syntax sugar for an unevaluated value.Expr.

Longhand:

var my_expr = ^[42 === x]  # value of type Expr
assert (my_expr)

Shorthand:

assert [42 === x]  # equivalent to the above

Open Proc Signatures bind `argv`

TODO: Implement new ARGV semantics.

When a proc signature omits (), it's called "open" because the caller can pass "extra" arguments:

proc my-open {
  write 'args are' @ARGV
}
# All valid:
my-open
my-open 1 
my-open 1 2

Stricter closed procs:

proc my-closed (x) {
  write 'arg is' $x
}
my-closed      # runtime error: missing argument
my-closed 1    # valid
my-closed 1 2  # runtime error: too many arguments

An "open" proc is nearly identical to a shell function:

shfunc() {
  write 'args are' @ARGV
}

Usage Notes

3 Ways to Return a Value

Let's review the recommended ways to "return" a value:

1. return (x) in a func.
   - The parentheses are required because expressions like (x + 1) should look different than words.
2. Pass a value.Place instance to a proc or func.
   - That is, out param &out.
3. Print to stdout in a proc.
   - Capture it with command sub: $(myproc)
   - Or with read: myproc | read --all; echo $_reply
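
The three approaches can be sketched side by side. This is only an illustration built from constructs used earlier in this guide; the names double, set-answer, and print-answer are hypothetical.

```ysh
# 1. Typed return value from a func
func double(x) {
  return (x * 2)
}
var a = double(21)

# 2. Out param: the proc receives a value.Place and sets it
proc set-answer (; out) {
  call out->setValue(42)
}
var b
set-answer (&b)

# 3. Print to stdout, and capture it with a command sub
proc print-answer {
  echo 42
}
var c = $(print-answer)  # c is the string '42'

echo "$a $b $c"  # => 42 42 42
```

The first two keep the typed value (an integer here), while the third always goes through a string on stdout.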

Obsolete ways of "returning":

Using declare -n aka nameref variables in bash.
Relying on dynamic scope in POSIX shell.

Procs Compose in Pipelines / "Bernstein Chaining"

Some YSH users may tend toward funcs because they're more familiar. But shell composition with procs is very powerful!

They have at least two kinds of composition that funcs don't have.

See #shell-the-good-parts:

Shell Has a Forth-Like Quality - Bernstein chaining.
Pipelines Support Vectorized, Point-Free, and Imperative Style - the shell can transparently run procs as elements of pipelines.
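
A minimal sketch of the pipeline case, assuming the external tools sort and tr are available. The proc names list-fruit and shout are invented for the example.

```ysh
# A proc that writes lines to stdout
proc list-fruit {
  echo banana
  echo apple
}

# A proc that acts as a filter: stdin is inherited by tr
proc shout {
  tr a-z A-Z
}

# Procs and external commands mix freely as pipeline elements
list-fruit | sort | shout
```

A func can't be placed in a pipeline like this unless it is wrapped in a proc.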

Summary

YSH is influenced by both shell and Python, so it has both procs and funcs.

Many programmers will gravitate towards funcs because they're familiar, but procs are more powerful and shell-like.

Make your YSH programs more powerful by learning to use procs!

Appendix

Implementation Details

Procs and funcs both have these concerns:

Evaluation of default args at definition time.
Evaluation of actual args at the call site.
Arg-Param binding for builtin functions, e.g. with typed_args.Reader.
Arg-Param binding for user-defined functions.

So the implementation can be thought of as a 2 × 4 matrix, with some code shared. This code is mostly in ysh/func_proc.py.

Related

Variable Declaration, Mutation, and Scope - in particular, procs don't have dynamic scope.
Block Literals (in progress)

Generated on Thu, 25 Jul 2024 00:09:30 +0000
