| 1 | Notes on VM Opcodes
 | 
| 2 | ===================
 | 
| 3 | 
 | 
| 4 | 2018 Bytecode assessment
 | 
| 5 | -------------------------
 | 
| 6 | 
 | 
| 7 | ### Same
 | 
| 8 | 
 | 
| 9 | Stack Management Bytecodes that are the same:
 | 
| 10 | 
 | 
| 11 | - `POP_TOP`
 | 
| 12 | - `DUP_TOP_TWO`
 | 
| 13 | - `ROT_{TWO,THREE,FOUR}`
 | 
| 14 | 
 | 
| 15 | (We could switch to a register VM, but that would be a completely orthogonal
 | 
| 16 | change.)
 | 
| 17 | 
 | 
| 18 | Control Flow Bytecodes that are the same:
 | 
| 19 | 
 | 
| 20 | - `SETUP_LOOP`
 | 
| 21 | - `SETUP_WITH`
 | 
| 22 |   - `WITH_CLEANUP`
 | 
| 23 | - `SETUP_{EXCEPT,FINALLY}`
 | 
| 24 |   - `END_FINALLY`
 | 
| 25 | - `POP_BLOCK`
 | 
| 26 | - `JUMP_{FORWARD,ABSOLUTE,...}`
 | 
| 27 | - `POP_JUMP_*`
 | 
| 28 | - `{BREAK,CONTINUE}_LOOP`
 | 
| 29 | - `RETURN_VALUE`
 | 
| 30 | - `RAISE_VARARGS` -- although it's not as general
 | 
| 31 | - `GET_ITER`
 | 
| 32 | 
 | 
| 33 | Data structure bytecodes that are likely the same:
 | 
| 34 | 
 | 
| 35 | - `{LIST,SET,MAP}_ADD`
 | 
| 36 | - `STORE_MAP`
 | 
| 37 | - `BUILD_{TUPLE,LIST,SET,MAP}`
 | 
| 38 | - `SLICE_*`
 | 
| 39 | 
 | 
| 40 | At least in the beginning they are the same.  Later we might have specialized
 | 
| 41 | data structures, e.g. `Array<Str>`, which is extremely common in shell.
 | 
| 42 | 
 | 
| 43 | ### Changed
 | 
| 44 | 
 | 
| 45 | Load / Store bytecodes that will take indices instead of names:
 | 
| 46 | 
 | 
| 47 | - `{LOAD,STORE}_NAME`
 | 
| 48 |   - fast variants go away: `{LOAD,STORE}_FAST`
 | 
| 49 | - `{LOAD,STORE}_GLOBAL`
 | 
| 50 | - `{LOAD,STORE}_ATTR` - for object members
 | 
| 51 | 
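A minimal sketch of what "indices instead of names" could look like: a compile-time table hands each name a slot, and the VM then works on a flat list. `SlotTable`, `compile_assignments` and `run` are hypothetical names for illustration, not part of any existing compiler.

```python
class SlotTable(object):
    """Hypothetical compile-time symbol table: maps each name to a slot index."""

    def __init__(self):
        self.index_of = {}  # name -> slot index

    def resolve(self, name):
        # Assign the next free slot the first time a name is seen.
        if name not in self.index_of:
            self.index_of[name] = len(self.index_of)
        return self.index_of[name]


def compile_assignments(pairs, table):
    """Emit (opcode, slot) pairs for statements of the form `dst = src`."""
    code = []
    for dst, src in pairs:
        code.append(('LOAD_NAME', table.resolve(src)))
        code.append(('STORE_NAME', table.resolve(dst)))
    return code


def run(code, slots):
    """Execute the emitted code against a flat list of slots (no dict lookups)."""
    stack = []
    for op, index in code:
        if op == 'LOAD_NAME':
            stack.append(slots[index])
        elif op == 'STORE_NAME':
            slots[index] = stack.pop()
    return slots
```

With this scheme the names only exist in the compiler; the operand of every load and store is already an integer by the time the VM sees it.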
 | 
| 52 | 
 | 
| 53 | Highly changed, based on language semantics:
 | 
| 54 | 
 | 
| 55 | - `CALL_FUNCTION_*` -- Instead of four variants, we may just have one more
 | 
| 56 |   static kind.
 | 
| 57 |   - It will support `f(msg, *args)` and `f(*args, **kwargs)`, but maybe not
 | 
| 58 |     much else?
 | 
| 59 | 
 | 
| 60 | Bytecodes that can be type-specialized:
 | 
| 61 | 
 | 
| 62 | - `BINARY_*`
 | 
| 63 | - `UNARY_*`
 | 
| 64 | - `COMPARE_OP` -- or maybe just don't allow nonsensical comparisons
 | 
| 65 | 
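One way type specialization might work, assuming the compiler knows operand types statically: pick a specialized opcode from a table, and reject combinations that have no entry (which also covers "don't allow nonsensical comparisons"). The opcode names on the right-hand side and the `specialize` helper are made up for this sketch.

```python
# Hypothetical table: (generic opcode, left type, right type) -> specialized opcode.
SPECIALIZED = {
    ('BINARY_ADD', int, int): 'BINARY_ADD_INT',
    ('BINARY_ADD', str, str): 'BINARY_ADD_STR',
    ('COMPARE_OP', int, int): 'COMPARE_INT',
    ('COMPARE_OP', str, str): 'COMPARE_STR',
}


def specialize(opcode, left_type, right_type):
    """Choose a specialized opcode at compile time, or fail with a type error."""
    key = (opcode, left_type, right_type)
    if key not in SPECIALIZED:
        raise TypeError('%s is not defined for %s and %s' % (
            opcode, left_type.__name__, right_type.__name__))
    return SPECIALIZED[key]
```

For example, `specialize('BINARY_ADD', int, int)` selects `'BINARY_ADD_INT'`, while `specialize('COMPARE_OP', int, str)` raises `TypeError` before any bytecode is emitted.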
 | 
| 66 | Maybe type-specialized:
 | 
| 67 | 
 | 
| 68 | - `FOR_ITER` -- iterating items in a list, iterating characters in a string
 | 
| 69 |   could be compiled statically.  In other words, the iterator protocol isn't
 | 
| 70 |   quite necessary.
 | 
| 71 | 
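To illustrate the `FOR_ITER` point: when the compiler knows it is looping over a list or a string, the loop can be lowered to an index and a bounds check, with no iterator object involved. This is only a sketch; `for_each_indexed` is a hypothetical helper, and it assumes the body does not resize the collection.

```python
def for_each_indexed(seq, body):
    """Loop over a list or string by index, without calling iter() or next()."""
    i = 0
    n = len(seq)  # read once; assumes the body does not resize seq
    while i < n:
        body(seq[i])
        i += 1


items_seen = []
for_each_indexed(['a', 'b'], items_seen.append)   # items in a list

chars_seen = []
for_each_indexed('xyz', chars_seen.append)        # characters in a string
```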
 | 
| 72 | ### Removed
 | 
| 73 | 
 | 
| 74 | Dynamic bytecodes that will go away, because names are statically resolved:
 | 
| 75 | 
 | 
| 76 | - `BUILD_CLASS`
 | 
| 77 | - `MAKE_FUNCTION`
 | 
| 78 | - `IMPORT_{NAME,STAR}`
 | 
| 79 | - maybe: `MAKE_CLOSURE`: this should be done statically?  Closures and classes
 | 
| 80 |   should be the same?  It's like calling a constructor.
 | 
| 81 | 
 | 
| 82 | Other Removed:
 | 
| 83 | 
 | 
| 84 | - `DELETE_NAME`: Namespaces are static
 | 
| 85 | - Might be unnecessary for our purposes: `YIELD_FROM`
 | 
| 86 | - `EXEC_STMT`: I want a different interface to the compiler, for
 | 
| 87 |   metaprogramming purposes.
 | 
| 88 | 
 | 
| 89 | Deprecated:
 | 
| 90 | 
 | 
| 91 | - `PRINT_*` -- this should just be a normal function call
 | 
| 92 | 
 | 
| 93 | ### Additions
 | 
| 94 | 
 | 
| 95 | - Bytecodes for ASDL structures?
 | 
| 96 | - Bytecodes for shell?
 | 
| 97 | - For parsing VM?
 | 
| 98 | 
 | 
| 99 | 2017
 | 
| 100 | ----
 | 
| 101 | 
 | 
| 102 | This is an elaboration on:
 | 
| 103 | 
 | 
| 104 | https://docs.python.org/2/library/dis.html
 | 
| 105 | 
 | 
| 106 | I copy the descriptions and add my notes, based on what I'm working on.
 | 
| 107 | 
 | 
| 108 | 
 | 
| 109 | 
 | 
| 110 | `SETUP_LOOP(delta)`
 | 
| 111 | 
 | 
| 112 | Pushes a block for a loop onto the block stack. The block spans from the
 | 
| 113 | current instruction with a size of delta bytes.
 | 
| 114 | 
 | 
| 115 | NOTES: compiler2 generates an extra `SETUP_LOOP` for generator expressions,
 | 
| 116 | along with `POP_BLOCK`.
 | 
| 117 | 
 | 
| 118 | 
 | 
| 119 | `POP_BLOCK()`
 | 
| 120 | 
 | 
| 121 | Removes one block from the block stack. Per frame, there is a stack of blocks,
 | 
| 122 | denoting nested loops, try statements, and such.
 | 
| 123 | 
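A rough Python model of the per-frame block stack that `SETUP_LOOP` and `POP_BLOCK` manipulate. The `Block` triple is loosely modeled on what CPython stores per block; treating `delta` as relative to the offset of the next instruction is my reading of `ceval.c`, so take it as an assumption. `Frame` and its methods are names invented for this sketch.

```python
import collections

# (kind of block, jump target when the block is exited, value stack depth at entry)
Block = collections.namedtuple('Block', ['type', 'handler', 'level'])


class Frame(object):
    def __init__(self):
        self.block_stack = []
        self.value_stack = []

    def setup_loop(self, next_pc, delta):
        # Assumption: the block ends `delta` bytes after the next instruction.
        handler = next_pc + delta
        self.block_stack.append(Block('loop', handler, len(self.value_stack)))

    def pop_block(self):
        return self.block_stack.pop()

    def break_loop(self):
        # Unwind blocks until a loop is found, trimming the value stack back
        # to its depth at loop entry, then report where to jump.
        while self.block_stack:
            block = self.block_stack.pop()
            del self.value_stack[block.level:]
            if block.type == 'loop':
                return block.handler
        raise RuntimeError('break outside loop')
```

The recorded `level` is what lets a `break` discard whatever the loop body left on the value stack.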
 | 
| 124 | 
 | 
| 125 | `LOAD_CLOSURE(i)`
 | 
| 126 | 
 | 
| 127 | Pushes a reference to the cell contained in slot `i` of the cell and free
 | 
| 128 | variable storage. The name of the variable is `co_cellvars[i]` if i is less
 | 
| 129 | than the length of `co_cellvars`. Otherwise it is
 | 
| 130 | `co_freevars[i - len(co_cellvars)]`.
 | 
| 131 | 
 | 
| 132 | NOTES: compiler2 generates an extra one of these
 | 
| 133 | 
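The slot-naming rule above can be written down directly against a real code object. `deref_name` is a hypothetical helper; it only restates the rule from the description and does not inspect any bytecode.

```python
def deref_name(code, i):
    """Name of the variable in slot i of the cell and free variable storage."""
    ncells = len(code.co_cellvars)
    if i < ncells:
        return code.co_cellvars[i]
    return code.co_freevars[i - ncells]


def outer():
    a = 1              # captured by the nested functions
    def inner():
        b = 2          # cell variable of inner: captured by innermost
        def innermost():
            return a + b
        return innermost
    return inner


inner_code = outer().__code__
slot_names = [deref_name(inner_code, 0), deref_name(inner_code, 1)]
```

Here `inner` has one cell variable (`b`) and one free variable (`a`), so slot 0 is named from `co_cellvars` and slot 1 from `co_freevars`.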
 | 
| 134 | 
 | 
| 135 | `MAKE_CLOSURE(argc)`
 | 
| 136 | 
 | 
| 137 | Creates a new function object, sets its `func_closure` slot, and pushes it on
 | 
| 138 | the stack. `TOS` is the code associated with the function, `TOS1` the tuple
 | 
| 139 | containing cells for the closure’s free variables. The function also has `argc`
 | 
| 140 | default parameters, which are found below the cells.
 | 
| 141 | 
 | 
| 142 | 
 | 
| 143 | `LOAD_DEREF(i)`
 | 
| 144 | 
 | 
| 145 | Loads the cell contained in slot `i` of the cell and free variable storage.
 | 
| 146 | Pushes a reference to the object the cell contains on the stack.
 | 
| 147 | 
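The difference between the cell and the object inside it can be seen from Python itself. This sketch uses only `__closure__` and `cell_contents`; the function names are made up.

```python
def make_counter():
    count = [0]
    def bump():
        count[0] += 1
        return count[0]
    return bump


bump = make_counter()
cell = bump.__closure__[0]   # the cell itself: what LOAD_CLOSURE refers to
value = cell.cell_contents   # the object in the cell: what LOAD_DEREF pushes
bump()
```

After the call, `cell.cell_contents` is still the same list object as `value`, now holding `[1]`.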
 | 
| 148 | 
 | 
| 149 | `GET_ITER()`
 | 
| 150 | 
 | 
| 151 | Implements `TOS = iter(TOS)`.
 | 
| 152 | 
 | 
| 153 | NOTES: Hm how do I implement this?  It turns it from a collection into an
 | 
| 154 | iterator.  Gah.
 | 
| 155 | 
 | 
| 156 |     PyObject *iter = PyObject_GetIter(iterable); 
 | 
| 157 | 
 | 
| 158 |     objects/abstract.c - 
 | 
| 159 |     objects/iterobject.c - PySeqIter_New
 | 
| 160 |     PySeqIter_Type has a it_seq field.  The PyObject being iterated over.  It
 | 
| 161 |     maintains an index too.
 | 
| 162 |     How does items() work as an iterable then?
 | 
| 163 | 
 | 
| 164 |     Then iter_iternext() calls:
 | 
| 165 |     PySequence_GetItem(seq, it->it_index)
 | 
| 166 | 
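A Python sketch of the sequence iterator described in these notes: it keeps the sequence and an index, and each step is a plain indexed lookup. The field names follow the notes (`it_seq`, `it_index`); the class itself is illustrative and is not a translation of `iterobject.c`.

```python
class SeqIter(object):
    """Iterator over anything that supports integer indexing."""

    def __init__(self, seq):
        self.it_seq = seq     # the object being iterated over
        self.it_index = 0     # next index to fetch

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None:
            raise StopIteration
        try:
            item = self.it_seq[self.it_index]
        except IndexError:
            self.it_seq = None  # exhausted: drop the reference
            raise StopIteration
        self.it_index += 1
        return item

    next = __next__  # Python 2 spelling of the same method
```

So `GET_ITER` on a sequence amounts to wrapping it in an object like this, e.g. `list(SeqIter('abc'))` walks the string one index at a time.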
 | 
| 167 | 
 | 
| 168 | 
 | 
| 169 | `LOAD_FAST(var_num)`
 | 
| 170 | 
 | 
| 171 | Pushes a reference to the local `co_varnames[var_num]` onto the stack.
 | 
| 172 | 
 | 
| 173 | NOTES:
 | 
| 174 | This still does a named lookup?  Generator expressions do `LOAD_FAST 0 (.0)`
 | 
| 175 | since there is no formal parameter name.
 | 
| 176 | 
 | 
| 177 | Oh I see, there is a `PyObject** fastlocals` in EvalFrame
 | 
| 178 | 
 | 
| 179 | It's initialized to `f->f_localsplus` -- frame holds them.  Oh I see, that's
 | 
| 180 | where the frame setup is different!  Don't need `inspect.getcallargs`.
 | 
| 181 | 
 | 
| 182 | 
 | 
| 183 | FastCall populates fastlocals from `PyObject** args` and `nargs`.
 | 
| 184 | 
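A small model of the idea, assuming positional arguments only: the locals live in an array sized from the code object, the arguments are copied into the first slots, and `LOAD_FAST` is an array index. `setup_fastlocals` and `load_fast` are hypothetical names; the real frame setup also handles defaults, keywords, and cells.

```python
def setup_fastlocals(code, args):
    """Build the locals array for a call (positional arguments only)."""
    if len(args) != code.co_argcount:
        raise TypeError('expected %d arguments, got %d' % (
            code.co_argcount, len(args)))
    fastlocals = [None] * code.co_nlocals  # one slot per name in co_varnames
    for i, arg in enumerate(args):
        fastlocals[i] = arg                # arguments occupy the first slots
    return fastlocals


def load_fast(fastlocals, var_num):
    # No name lookup at runtime: var_num was fixed by the compiler.
    return fastlocals[var_num]


def add(x, y):
    total = x + y
    return total


fastlocals = setup_fastlocals(add.__code__, (3, 4))
```

The name in `co_varnames[var_num]` is only needed for debugging and error messages; the lookup itself never touches a dictionary.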
 | 
| 185 | 
 | 
| 186 | 
 | 
| 187 | 
 | 
| 188 | 
 | 
| 189 | 
 | 
| 190 | 
 |