| 1 | trees: Sketch of Storage / Networking Architecture
 | 
| 2 | ==================================================
 | 
| 3 | 
 | 
| 4 | As usual, we try not to invent anything big or new, but instead focus on
 | 
| 5 | composing and rationalizing existing software and protocols:
 | 
| 6 | 
 | 
| 7 | - Many good implementations of POSIX file systems (Linux ext4, ZFS, etc.)
 | 
| 8 | - git, a distributed version control system
 | 
| 9 |   - in particular the packfile format
 | 
| 10 |   - the ssh send/receive pattern
 | 
| 11 | - Static WWW file servers like Apache and nginx
 | 
| 12 | - tar files, gzip files
 | 
| 13 | 
 | 
| 14 | ## Use Cases
 | 
| 15 | 
 | 
| 16 | 1. Building CI containers faster with wedges
 | 
| 17 |    - native deps: re2c, bloaty, uftrace, ...
 | 
| 18 |    - Python deps, e.g. MyPy
 | 
| 19 |    - R deps, e.g. dplyr
 | 
| 20 |    - wedge source is a .treeptr tarball
 | 
| 21 |    - wedge derived is a .treeptr file
 | 
| 22 | 2. CI serving `.wwz` files.  We need fast random access.
 | 
| 23 | 3. Running benchmarks on multiple machines
 | 
| 24 |    - `oils-for-unix` tarball from EVERY commit, sync'd to different CI tasks
 | 
| 25 | 4. Comparisons across distros, OSes, and hardware
 | 
| 26 |    - building same packages on Debian, Ubuntu, Alpine
 | 
| 27 |    - and FreeBSD
 | 
| 28 |    - x86 / x86-64 / ARM
 | 
| 29 | 5. Web .log files can be .treeptr files
 | 
| 30 | 
 | 
| 31 | ## Silo: Large Trees Managed Outside Git
 | 
| 32 | 
 | 
| 33 | You can `git pull` and `git push` without paying for these large objects, e.g.
 | 
| 34 | container images.
 | 
| 35 | 
 | 
| 36 | To start, trees use regular compression with `gzip`.  Later, the Silo will introspect
 | 
| 37 | trees and take **hints** for **differential** compression.
 | 
| 38 | 
 | 
| 39 | Related:
 | 
| 40 | 
 | 
| 41 | - git annex
 | 
| 42 | - git LFS
 | 
| 43 | 
 | 
| 44 | ### Data
 | 
| 45 | 
 | 
| 46 |     https://oilshell.org/
 | 
| 47 |       deps.silo/
 | 
| 48 |         objects/            # everything is a blob at first
 | 
| 49 |           00/               # checksums calculated with git hash-object
 | 
| 50 |             123456.gz       # may be a .tar file, but silo doesn't know
 | 
| 51 |         pack/               # like git, it can have deltas, and be repacked
 | 
| 52 |           foo.pack
 | 
| 53 |           foo.idx
 | 
| 54 |         derived/            # DERIVED trees, e.g. different deltas,
 | 
| 55 |                             # different compression, SquashFS, ...
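
The layout above names each object by its checksum.  A minimal sketch of that naming, assuming the checksum is exactly what `git hash-object` prints for a blob (SHA-1 over a `blob <size>\0` header plus the content) and that the first two hex digits name the directory, as in the `00/123456.gz` example.  The function names are hypothetical:

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """Same value as `git hash-object` for a blob in a SHA-1 repository."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def object_path(checksum: str, suffix: str = ".gz") -> str:
    """Relative path inside a Silo: two hex digits for the directory, then the rest."""
    return "objects/%s/%s%s" % (checksum[:2], checksum[2:], suffix)


# Where a blob would land in the Silo
print(object_path(git_blob_hash(b"test content\n")))
```

Whether the checksum covers the compressed or the uncompressed bytes is not stated above; this sketch hashes whatever bytes it is given.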
 | 
| 56 | 
 | 
| 57 | ### Commands
 | 
| 58 | 
 | 
| 59 |     silo verify             # blobs should have valid checksums
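
A rough sketch of what `silo verify` could do, under two assumptions that the text does not confirm: every object is gzip-compressed, and the checksum in the path is the `git hash-object` value of the *uncompressed* bytes.  `verify_silo` is a hypothetical name:

```python
import gzip
import hashlib
import os


def git_blob_hash(data: bytes) -> str:
    """Same value as `git hash-object` for a blob in a SHA-1 repository."""
    return hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()


def verify_silo(silo_dir: str) -> list:
    """Return relative paths of objects whose content does not match their name."""
    bad = []
    objects_dir = os.path.join(silo_dir, "objects")
    for prefix in sorted(os.listdir(objects_dir)):
        subdir = os.path.join(objects_dir, prefix)
        for name in sorted(os.listdir(subdir)):
            if not name.endswith(".gz"):
                continue  # this sketch only understands gzip'd blobs
            expected = prefix + name[:-len(".gz")]
            with gzip.open(os.path.join(subdir, name), "rb") as f:
                actual = git_blob_hash(f.read())
            if actual != expected:
                bad.append(os.path.join(prefix, name))
    return bad
```

An empty result means every blob checked out; anything else is a list of objects to re-fetch or delete.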
 | 
| 60 | 
 | 
| 61 | Existing tools:
 | 
| 62 | 
 | 
| 63 |     rsync        # back up the entire thing
 | 
| 64 |     rclone       # ditto, but works with cloud storage
 | 
| 65 | 
 | 
| 66 |     ssh rm "$@"  # a list of vrefs to delete can be calculated by 'medo reachable'
 | 
| 67 |     scp          # create a new silo from 'medo reachable' manifest
 | 
| 68 | 
 | 
| 69 |     du --si -s   # Total size of the Silo
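
The garbage collection hinted at by the `ssh rm` line can be sketched as a set difference: everything stored in the Silo, minus everything some `.treeptr` still points to.  This assumes a `.treeptr` is a JSON object with a `checksum` field; that field name, and the function names, are hypothetical:

```python
import json
import os


def reachable(medo_dir: str) -> set:
    """Checksums referenced by every .treeptr file under medo_dir."""
    found = set()
    for root, _dirs, files in os.walk(medo_dir):
        for name in files:
            if name.endswith(".treeptr"):
                with open(os.path.join(root, name)) as f:
                    found.add(json.load(f)["checksum"])  # hypothetical field
    return found


def stored(silo_dir: str) -> set:
    """Checksums present as silo_dir/objects/XX/REST.gz."""
    found = set()
    objects_dir = os.path.join(silo_dir, "objects")
    for prefix in os.listdir(objects_dir):
        for name in os.listdir(os.path.join(objects_dir, prefix)):
            found.add(prefix + name.split(".")[0])
    return found


def garbage(medo_dirs: list, silo_dir: str) -> set:
    """Objects that no Medo refers to, i.e. candidates for deletion."""
    live = set()
    for medo_dir in medo_dirs:
        live |= reachable(medo_dir)
    return stored(silo_dir) - live
```

Because several Medos can point at one Silo, `garbage` takes a list of Medo directories; deleting based on only one of them would remove objects the others still need.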
 | 
| 70 | 
 | 
| 71 | ## Medo (meadow): Named and Versioned Subtrees in `git`
 | 
| 72 | 
 | 
| 73 | To start, this will untar and uncompress blobs from a Silo.  We can also:
 | 
| 74 | 
 | 
| 75 | - Materialize a git `tree`, e.g. in a packfile
 | 
| 76 | - Mount a git `tree` directly with FUSE.  The pack `.idx` is sorted by object ID
 | 
| 77 |   and has a fan-out table, so lookups are a binary search, which makes this possible.
 | 
| 78 |   - TODO: write prototype with pygit2 wrapping libgit2
 | 
| 79 |   - [FUSE bindings seem in question](https://stackoverflow.com/questions/52925566/which-module-is-the-actual-interface-to-fuse-from-python-3)
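
To make the `.idx` point concrete, here is a simplified in-memory model of the lookup, not a parser for the real file.  It assumes the pack index v2 convention that `fanout[b]` is the number of object IDs whose first byte is less than or equal to `b`; the function names are hypothetical:

```python
import bisect


def build_fanout(sorted_ids: list) -> list:
    """fanout[b] = number of IDs whose first byte is <= b."""
    counts = [0] * 256
    for oid in sorted_ids:
        counts[oid[0]] += 1
    fanout, total = [], 0
    for count in counts:
        total += count
        fanout.append(total)
    return fanout


def lookup(sorted_ids: list, fanout: list, oid: bytes) -> int:
    """Return the position of oid in sorted_ids, or -1 if it is absent."""
    first = oid[0]
    lo = fanout[first - 1] if first > 0 else 0  # IDs with a smaller first byte
    hi = fanout[first]
    i = bisect.bisect_left(sorted_ids, oid, lo, hi)
    return i if i < hi and sorted_ids[i] == oid else -1
```

In a real index the returned position would select the object's offset in the `.pack`, which is what a FUSE read would need in order to serve a file without unpacking the whole tree.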
 | 
| 80 | 
 | 
| 81 | ### Data
 | 
| 82 | 
 | 
| 83 |     ~/git/oilshell/oil/    
 | 
| 84 |       deps/                          # This 3-medo structure is arbitrary; they're
 | 
| 85 |                                      # generally mounted in different places, and
 | 
| 86 |                                      # used by different tools
 | 
| 87 |        
 | 
| 88 |         source.medo/                 # Relocatable data
 | 
| 89 |           SILO.json                  # Can point to multiple Silos
 | 
| 90 |           Python-3.10.4.treeptr      # with checksum and provenance (original URL)
 | 
| 91 | 
 | 
| 92 |         derived.medo/                # derived values, some are wedges with absolute paths
 | 
| 93 |           SILO.json                  # Can point to multiple Silos
 | 
| 94 |           debian/
 | 
| 95 |             bullseye/
 | 
| 96 |               Python-3.10.4.treeptr
 | 
| 97 |           ubuntu/
 | 
| 98 |             20.04/
 | 
| 99 |               Python-3.10.4.treeptr  # derived data has provenance:
 | 
| 100 |                                      # base layer, mounts of input / code, env / shell command
 | 
| 101 |             22.04/
 | 
| 102 |               Python-3.10.4.treeptr
 | 
| 103 | 
 | 
| 104 |         opaque.medo/                # Opaque values that can use more provenance.
 | 
| 105 |           SILO.json
 | 
| 106 |           images/                   # 'docker save' format.  Make sure it can be imported.
 | 
| 107 |             debian/
 | 
| 108 |               bullseye/
 | 
| 109 |                 slim.treeptr
 | 
| 110 | 
 | 
| 111 |           layers/
 | 
| 112 |             debian/
 | 
| 113 |               bullseye/
 | 
| 114 |                 mypy-deps.treeptr   # packages needed to build it
 | 
| 115 | 
 | 
| 116 | ### Commands
 | 
| 117 | 
 | 
| 118 |     # Get files to build.  This does uncompress/untar.
 | 
| 119 |     medo expand deps/source.medo/Python-3.10.4.treeptr _tmp/source/
 | 
| 120 | 
 | 
| 121 |     # Or sync files that are already built.  If they already exist, verify
 | 
| 122 |     # checksums.
 | 
| 123 |     medo expand deps/derived.medo/debian/bullseye/ /wedge/oilshell.org/deps
 | 
| 124 | 
 | 
| 125 |     # Combine SILO.json and the JSON in the .treeptr
 | 
| 126 |     medo url-for deps/source.medo/Python-3.10.4.treeptr
 | 
| 127 | 
 | 
| 128 |     # Verify checksums.
 | 
| 129 |     medo verify deps.medo/ /wedge/oilshell.org/deps
 | 
| 130 | 
 | 
| 131 |     # Makes a tarball and .treeptr that you can scp/rsync
 | 
| 132 |     medo add /wedge/oilshell.org/bash-4.4/ deps.medo/ubuntu/18.04/bash-4.4.treeptr
 | 
| 133 | 
 | 
| 134 |     medo reachable deps.medo/  # first step of garbage collection
 | 
| 135 | 
 | 
| 136 |     medo mount  # much later: FUSE mount
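
A minimal sketch of `medo url-for`, which combines `SILO.json` with the JSON in a `.treeptr`.  The JSON schema is not specified above, so the `silos`, `url`, and `checksum` field names are assumptions made for illustration.  Since `SILO.json` can point to multiple Silos, the sketch returns one candidate URL per Silo:

```python
import json


def url_for(silo_json: str, treeptr_json: str) -> list:
    """Return one candidate URL per Silo.  All JSON field names are hypothetical."""
    silos = json.loads(silo_json)["silos"]
    checksum = json.loads(treeptr_json)["checksum"]
    rel_path = "objects/%s/%s.gz" % (checksum[:2], checksum[2:])
    return [silo["url"].rstrip("/") + "/" + rel_path for silo in silos]


silo_json = '{"silos": [{"url": "https://oilshell.org/deps.silo/"}]}'
treeptr_json = '{"checksum": "00123456"}'
print(url_for(silo_json, treeptr_json))
```

Keeping the Silo location out of the `.treeptr` is what makes the Medo relocatable: moving the data only requires editing `SILO.json`.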
 | 
| 137 | 
 | 
| 138 | ## `/wedge`: A binary-centric "semi-distro" that works with OCI containers, and without
 | 
| 139 | 
 | 
| 140 | A package exports one or more binaries, and is a `treeptr` value:
 | 
| 141 | 
 | 
| 142 | - metadata is stored in a `.medo` directory
 | 
| 143 | - data is stored in a Silo
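
The split between metadata and data can be sketched as follows: archive a directory, store the archive in a Silo under its checksum, and write a small JSON pointer into a Medo.  This is only an illustration in the spirit of `medo add`; the function name `add_tree` and the `.treeptr` fields (`checksum`, `provenance`) are hypothetical, and the checksum is assumed to be the `git hash-object` value of the uncompressed tar bytes:

```python
import gzip
import hashlib
import io
import json
import os
import tarfile


def add_tree(src_dir: str, silo_dir: str, treeptr_path: str) -> str:
    """Archive src_dir into the Silo and write a JSON pointer to it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(src_dir, arcname=".")
    data = buf.getvalue()
    checksum = hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()

    # Data goes to the Silo, named by checksum
    obj_dir = os.path.join(silo_dir, "objects", checksum[:2])
    os.makedirs(obj_dir, exist_ok=True)
    with gzip.open(os.path.join(obj_dir, checksum[2:] + ".gz"), "wb") as f:
        f.write(data)

    # Metadata goes to the Medo, which lives in git
    os.makedirs(os.path.dirname(treeptr_path) or ".", exist_ok=True)
    with open(treeptr_path, "w") as f:
        json.dump({"checksum": checksum,
                   "provenance": {"source_path": src_dir}}, f)
    return checksum
```

Only the small `.treeptr` file would be committed; the tarball can then be copied to a remote Silo with `scp` or `rsync`.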
 | 
| 144 | 
 | 
| 145 | The package typically lives in a subdirectory of `/wedge`.  This is due to
 | 
| 146 | `configure --prefix=/wedge/...`.
 | 
| 147 | 
 | 
| 148 | What can you do with it?
 | 
| 149 | 
 | 
| 150 | - A wedge can be mounted, e.g. `--mount type=bind,...`
 | 
| 151 | - It can be copied into an image: `COPY ...`
 | 
| 152 |   - for quick deployment to cloud services, like Github Actions or fly.io
 | 
| 153 | - It has provenance, like other treeptr values.  The provenance is either:
 | 
| 154 |   - the original URL, for source data
 | 
| 155 |   - the code, data, and environment used to build it
 | 
| 156 | 
 | 
| 157 | Related:
 | 
| 158 | 
 | 
| 159 | - GNU Stow (symlinks)
 | 
| 160 | - GoboLinux
 | 
| 161 | - Distri (exchange dirs with FUSE)
 | 
| 162 | - Nix/Bazel: a wedge is a "purely functional" value
 | 
| 163 | - Docker: wedges are meant to be created in containers, and mounted in
 | 
| 164 |   containers
 | 
| 165 | 
 | 
| 166 | ### Data
 | 
| 167 | 
 | 
| 168 |     /wedge/                     # an absolute path, for configure --prefix=/wedge/..
 | 
| 169 |       oils-for-unix.org/        # scoped to domain
 | 
| 170 |         pkg/                    # arbitrary structure, for dev dependencies
 | 
| 171 |           Python-3.10.4.treeptr # metadata
 | 
| 172 |           Python-3.10.4/
 | 
| 173 |             python              # Executable, which needs a 'python3' symlink
 | 
| 174 | 
 | 
| 175 | ## Design Notes
 | 
| 176 | 
 | 
| 177 | ### Data and Metadata Formats
 | 
| 178 | 
 | 
| 179 | Text:
 | 
| 180 | 
 | 
| 181 | - JSON for .treeptr, MEDO.json, SILO.json
 | 
| 182 | - lockfile / "world" / manifest - what does this look like?
 | 
| 183 | 
 | 
| 184 | Data:
 | 
| 185 | 
 | 
| 186 | - `git`
 | 
| 187 |   - blob
 | 
| 188 |   - tree for FS metadata 
 | 
| 189 |   - no commit objects!
 | 
| 190 |   - packfile for multiple objects
 | 
| 191 | - Archiving: `.tar`
 | 
| 192 |   - OCI layers use `.tar`
 | 
| 193 | - Compression: `.gz`, `.bz2`, etc.
 | 
| 194 | - Encryption (well, LUKS does the whole system)
 | 
| 195 | 
 | 
| 196 | ### knot: Incremental, Parallel, Coarse-Grained, Containerized Builds with Ninja
 | 
| 197 | 
 | 
| 198 | It's a wrapper like `ninja_lib.py`.  Importantly, everything you build should
 | 
| 199 | be versioned, immutable, and cached, so it doesn't use timestamps!  
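
One way to build without timestamps is to key each step on the checksums of what goes into it.  The sketch below is an assumption about how that could look, not how `knot` is implemented: `action_key` and `build` are hypothetical names, SHA-256 is an arbitrary choice, and a dict stands in for the Silo:

```python
import hashlib
import json


def action_key(command: list, input_checksums: dict, env: dict) -> str:
    """Derive a cache key from the inputs of a build step, never from mtimes."""
    payload = json.dumps(
        {"command": command, "inputs": input_checksums, "env": env},
        sort_keys=True,  # key must not depend on dict ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build(cache: dict, command: list, input_checksums: dict, env: dict, run):
    """Run the step only if no result for the same inputs is cached."""
    key = action_key(command, input_checksums, env)
    if key not in cache:   # cache miss: run the step once
        cache[key] = run()
    return cache[key]      # cache hit: reuse the immutable result
```

Touching a file without changing its content leaves the key unchanged, so nothing is rebuilt; changing any input, the command, or the environment produces a new key.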
 | 
| 200 | 
 | 
| 201 | Distributed builds, too?  Multiple workers can pull and publish intermediate
 | 
| 202 | values to the same Silo.
 | 
| 203 | 
 | 
| 204 | Key ideas:
 | 
| 205 | 
 | 
| 206 | - the knot worker pulls tasks and is pointed at source.medo and derived.medo
 | 
| 207 |   directories.
 | 
| 208 | - All of this metadata is in git.  The git repo is sync'd on worker
 | 
| 209 |   initialization, and continually updated.
 | 
| 210 |   - TODO: if 2 workers grab the same task, it should be OK.  One of their git
 | 
| 211 |     commits will fail?
 | 
| 212 | - The worker does a lazy 'medo sync'
 | 
| 213 | - The worker keeps a local cache of the Silo, according to the parts of the
 | 
| 214 |   Medo it needs
 | 
| 215 |   - It can give HINTS for differential compression, saying "I have
 | 
| 216 |     Python-3.10.4, send me a delta for Python-3.10.5"
 | 
| 217 |   - If all metadata is local, it can be even smarter
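
The "I have X, send me a delta for Y" hint can be illustrated with zlib's preset dictionary, where both sides use the base version as shared context.  This is a stand-in for the idea, not git's packfile delta format, and it only helps within zlib's 32 KiB window.  The function names are hypothetical:

```python
import zlib


def compress_with_hint(new: bytes, base: bytes) -> bytes:
    """Server side: compress `new`, using `base` (which the client has) as a dictionary."""
    compressor = zlib.compressobj(zdict=base)
    return compressor.compress(new) + compressor.flush()


def decompress_with_hint(blob: bytes, base: bytes) -> bytes:
    """Client side: the same `base` must be supplied to reconstruct `new`."""
    decompressor = zlib.decompressobj(zdict=base)
    return decompressor.decompress(blob) + decompressor.flush()
```

When the new version shares most of its bytes with the base, the hinted output mostly consists of back-references into the dictionary, so it can be much smaller than compressing the new version on its own.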
 | 
| 218 | 
 | 
| 219 | (Name: it's geometry like "wedge", and hopefully cuts a "Gordian knot.")
 | 
| 220 | 
 | 
| 221 | 
 | 
| 222 | ## TODO 
 | 
| 223 | 
 | 
| 224 | ### Research
 | 
| 225 | 
 | 
| 226 | - shrub vs. blob?
 | 
| 227 |   - a shrub is a subtree, unlike a git `tree` object which is like an inode
 | 
| 228 |   - is all of the metadata like paths and sizes stored client side?  Then the
 | 
| 229 |     client can give repacking hints for differential compression, rather than
 | 
| 230 |     the server doing anything smart.
 | 
| 231 |   - medo explode?  You change the reference client-side
 | 
| 232 |   - or silo explode?  It can redirect from blob to shrub
 | 
| 233 | - TODO: look at git tree format, and whether an entire subtree/shrub of
 | 
| 234 |   metadata can be stored client-side.  We want ONLY trees, and blobs should be
 | 
| 235 |   DANGLING.
 | 
| 236 |   - Use pack format, or maybe a text format.
 | 
| 237 | 
 | 
| 238 | ```
 | 
| 239 | ~/git/oilshell/oil$ git cat-file -p master^{tree}
 | 
| 240 | 040000 tree 37689433372bc7f1db7109fe1749bff351cba5b0    .builds
 | 
| 241 | 040000 tree 5d6b8fdbeb144b771e10841b7286df42bfce4c52    .circleci
 | 
| 242 | 100644 blob 6385fd579efef14978900830e5fd74bbac907011    .cirrus.yml
 | 
| 243 | 100644 blob 343af37bf39d45b147bda8a85e8712b0292ddfea    .clang-format
 | 
| 244 | 040000 tree 03400f57a8475d0cc696557833088d718adb2493    .github
 | 
| 245 | ```
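
For the research item on the git tree format: the raw body of a tree object is a sequence of `<mode> <name>\0<20-byte binary ID>` entries, which `git cat-file -p` pretty-prints as above.  A small parser sketch, assuming a SHA-1 repository (20-byte IDs); `parse_tree` is a hypothetical name:

```python
def parse_tree(raw: bytes) -> list:
    """Parse a raw git tree object body into (mode, name, hex_id) tuples."""
    entries = []
    i = 0
    while i < len(raw):
        space = raw.index(b" ", i)        # mode never contains a space
        nul = raw.index(b"\x00", space)   # name is NUL-terminated
        mode = raw[i:space].decode("ascii")
        name = raw[space + 1:nul].decode("utf-8")
        hex_id = raw[nul + 1:nul + 21].hex()
        entries.append((mode, name, hex_id))
        i = nul + 21
    return entries


raw = (b"40000 .builds\x00"
       + bytes.fromhex("37689433372bc7f1db7109fe1749bff351cba5b0")
       + b"100644 .cirrus.yml\x00"
       + bytes.fromhex("6385fd579efef14978900830e5fd74bbac907011"))
for mode, name, hex_id in parse_tree(raw):
    print(mode, name, hex_id)
```

Note that the raw format stores a directory's mode as `40000`; the leading zero in `040000` is added by `git cat-file -p`.  A tree only holds names, modes, and IDs, so it can be kept client-side while the blobs it refers to stay elsewhere.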
 | 
| 246 | 
 | 
| 247 | ### More
 | 
| 248 | 
 | 
| 249 | - Analog for low-level `runc`, `crun`
 | 
| 250 | - Analog for high-level `docker run`, `podman run`
 | 
| 251 | - The equivalent of inotify() on a silo / medo.
 | 
| 252 |   - could be a REST API on `https://app.oilshell.org/soil.medo/events/` for tarballs
 | 
| 253 |   - it tells you what Silo to fetch from
 | 
| 254 | - Source browser for https://www.oilshell.org/deps.silo
 | 
| 255 | 
 | 
| 256 | ## Ideas / Slogans
 | 
| 257 | 
 | 
| 258 | - "Distributed OS without RPCs".  We use the paradigms of state
 | 
| 259 |   synchronization, dependency graphs (partial orders), and probably low-level
 | 
| 260 |   "events".
 | 
| 261 | - Silo is the **data plane**; Medo is the **control plane**
 | 
| 262 |   - Hay config files will also be a control plane
 | 
| 263 | - Silo is a **mechanism**; Medo is for **policy**
 | 
| 264 | - `/wedge` is a **middle ground** between Docker and Nix/Bazel
 | 
| 265 |   - Nix / Bazel are purely functional, but require rewriting upstream build
 | 
| 266 |     systems in their own language (to fully make use of them)
 | 
| 267 |     - Concretely: I don't want to rewrite the R build system for the tidyverse.
 | 
| 268 |       I want to use the Debian packaging that already works, and that core R
 | 
| 269 |       developers maintain.
 | 
| 270 |   - `/wedge` is purely functional in the sense that wedges are literally
 | 
| 271 |     **values**.  But like Docker, you can use shell commands that mutate layers
 | 
| 272 |     to create them.  You can run entire language package managers and build
 | 
| 273 |     systems via shell.
 | 
| 274 |   - Wedges compose with, and compose better than, Docker layers.
 |