Syntactic Concepts in YSH

These documents introduce the YSH language:

In contrast, the concepts introduced below may help advanced users remember YSH and its syntax. Read on to learn about:

Table of Contents
Command vs. Expression Mode
Lexer Modes
More Information
Sigils and Sigil Pairs
Valid Contexts
Parse Options to Take Over (), [], @, and =
Static Parsing
Aside: Duplicate Functionality in Bash
Related Links
Related Documents
Appendix: Hand-Written vs. Generated Parsers

Command vs. Expression Mode

The YSH parser starts out in command mode:

echo "hello $name"

for i in 1 2 3 {
  echo $i
}

But it switches to expression mode in a few places:

var x = 42 + a[i]          # the RHS of = is a YSH expression

echo $[mydict['key']]      # interpolated expressions with $[]

json write ({key: "val"})  # typed args inside ()

See Command vs. Expression Mode for details.

Lexer Modes

Lexer modes are a technique that YSH uses to manage the complex syntax of shell, which evolved over many decades.

For example, : means something different in each of these lines:

PATH=/bin:/usr/bin          # Literal string
echo ${x:-default}          # Part of an operator
echo $(( x > y ? 42 : 0 ))  # Arithmetic operator
var myslice = a[3:5]        # YSH expression

To solve this problem, YSH has a lexer that can run in many modes. Multiple parsers read from this single lexer, but they demand different tokens, depending on the parsing context.
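
As a rough illustration, the three shell cases above can be run in any POSIX-style shell. The variable names below are invented for this sketch. The point is only that the same character produces three different results, depending on the syntax around it.

```shell
# Illustrative names only; none of these come from the YSH docs.
path_like=/bin:/usr/bin                # ':' is an ordinary character in the word
unset maybe_unset
with_default=${maybe_unset:-default}   # ':-' is part of an expansion operator
a=5
b=3
ternary=$(( a > b ? 42 : 0 ))          # ':' is part of the arithmetic ternary

echo "$path_like"
echo "$with_default"
echo "$ternary"
```

A single lexer with no notion of context would have to pick one meaning for : and leave the rest to later stages. Letting each parser request tokens in its own mode avoids that.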

More Information

Sigils and Sigil Pairs

A sigil is a symbol like the $ in $mystr.

A sigil pair is a sigil with opening and closing delimiters, like ${var} and @(seq 3).

An appendix of A Feel For YSH Syntax lists the sigil pairs in the YSH language.

Valid Contexts

Each sigil pair may be available in command mode, expression mode, or both.

For example, command substitution is available in both:

echo $(hostname)      # command mode
var x = $(hostname)   # expression mode

So are raw and C-style string literals:

echo $'foo\n'  # the bash-compatible way to do it
var s = $'foo\n'

echo r'c:\Program Files\'
var raw = r'c:\Program Files\'

But array literals only make sense in expression mode:

var myarray = :| one two three |

echo one two three  # no array literal needed

A sigil pair often changes the lexer mode to parse what's inside.

Parse Options to Take Over (), [], @, and =

Most users don't have to worry about parse options. Instead, they run either bin/osh or bin/ysh, which are actually aliases for the same binary. The difference is that bin/ysh has the option group ysh:all on by default.

Nonetheless, here are two examples.

The parse_at option (in group ysh:upgrade) turns @ into the splice operator when it's at the front of a word:

$ var myarray = :| one two three |

$ echo @myarray         # @ isn't an operator in shell
@myarray

$ shopt -s parse_at     # parse the @ symbol
$ echo @myarray
one two three

$ echo '@myarray'       # quote it to get the old behavior
@myarray
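
For comparison, here is a sketch of the closest POSIX shell idiom. "$@" expands to one word per element, much as a spliced array does, while "$*" joins everything into a single word. The count_args helper is hypothetical and exists only to make the word count visible.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() {
  echo "$#"
}

set -- one 'two three' four      # three positional parameters

spliced=$(count_args "$@")       # one word per element
joined=$(count_args "$*")        # a single word containing all elements

echo "spliced=$spliced joined=$joined"
```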

The parse_bracket option (also in group ysh:upgrade) lets you pass unevaluated expressions to a command with []:

assert (^[42 === x])   # assert is passed an expression, not value
assert [42 === x]      # syntax sugar with parse_bracket

Static Parsing

POSIX specifies that Unix shell has multiple stages of parsing and evaluation. For example:

$ x=2 
$ code='3 * x'
$ echo $(( code ))  # Silent eval of a string.  Dangerous!
6
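
The example above relies on bash re-parsing the contents of a variable. A more portable sketch of the same hazard writes $code inside $(( )), which shells expand to text first and then parse as arithmetic. The is_int helper is a hypothetical guard, not part of any shell. It shows one way to reject strings that are not plain decimal integers before they reach arithmetic.

```shell
# Hypothetical guard: succeed only for non-empty strings of decimal digits.
is_int() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

x=2
code='3 * x'

# $code is expanded to text, and that text is then parsed as arithmetic.
result=$(( $code ))
echo "$result"

# The guard sees the same string as data and refuses it.
if is_int "$code"; then
  verdict=accepted
else
  verdict=rejected
fi
echo "$verdict"
```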

YSH expressions are parsed in a single stage, and then evaluated, which makes it more like Python or JavaScript:

$ setvar code = '3 * x'
$ echo $[ code ]
3 * x

Another example: shell assignment builtins like readonly and local are dynamically parsed, while YSH assignment keywords like const and var are statically parsed.
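
To make the contrast concrete, here is a small sketch in plain shell. Because readonly is an ordinary builtin, the name=value pair can arrive as data at run time, so the variable being defined is not visible to a parser reading the script. The variable names are invented for this example.

```shell
# The assignment is just a string until the builtin parses it at run time.
pair='color=blue'
readonly "$pair"       # defines a readonly variable named 'color'

echo "$color"

# The variable can no longer be reassigned. The subshell keeps that failure
# from stopping the rest of the script.
if (color=red) 2>/dev/null; then
  changed=yes
else
  changed=no
fi
echo "$changed"
```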

Aside: Duplicate Functionality in Bash

It's confusing that bash has both statically- and dynamically-parsed variants of the same functionality.

Boolean expressions:

C-style string literals:

Related Links

Related Documents

Appendix: Hand-Written vs. Generated Parsers

The OSH language is parsed "by hand", while the YSH expression language is parsed with tables generated from a grammar (a modified version of Python's pgen).

This is mostly an implementation detail, but users may notice that OSH gives more specific error messages!

Hand-written parsers give you more control over errors. Eventually the YSH language may have a hand-written parser as well. Either way, feel free to file bugs about error messages that confuse you.

Generated on Wed, 03 Jul 2024 13:47:37 +0000