Notes on VM Opcodes
===================

2018 Bytecode assessment
------------------------

### Same

Stack Management Bytecodes that are the same:

- `POP_TOP`
- `DUP_TOP_TWO`
- `ROT_{TWO,THREE,FOUR}`

(We could switch to a register VM, but that would be a completely orthogonal
change.)

Control Flow Bytecodes that are the same:

- `SETUP_LOOP`
- `SETUP_WITH`
- `WITH_CLEANUP`
- `SETUP_{EXCEPT,FINALLY}`
- `END_FINALLY`
- `POP_BLOCK`
- `JUMP_{FORWARD,ABSOLUTE,...}`
- `POP_JUMP_*`
- `{BREAK,CONTINUE}_LOOP`
- `RETURN_VALUE`
- `RAISE_VARARGS` -- although it's not as general
- `GET_ITER`

Data structure bytecodes that are likely the same:

- `{LIST,SET,MAP}_ADD`
- `STORE_MAP`
- `BUILD_{TUPLE,LIST,SET,MAP}`
- `SLICE_*`

At least in the beginning they are the same. Later we might have specialized
data structures, e.g. `Array<Str>`, which is extremely common in shell.

### Changed

Load / Store bytecodes that will take indices instead of names:

- `{LOAD,STORE}_NAME`
- fast variants go away: `{LOAD,STORE}_FAST`
- `{LOAD,STORE}_GLOBAL`
- `{LOAD,STORE}_ATTR` - for object members

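A minimal sketch of the "indices instead of names" idea, not the actual design: a compile-time table assigns each name a slot, and the load/store instructions carry that slot index. `SlotTable`, `compile_assignments`, `run`, and the `LOAD`/`STORE` tuples are all made up for illustration.

```python
class SlotTable:
    """Hypothetical compile-time table mapping names to slot indices."""

    def __init__(self):
        self.slots = {}

    def index_of(self, name):
        # Assign the next free slot the first time a name is seen.
        if name not in self.slots:
            self.slots[name] = len(self.slots)
        return self.slots[name]


def compile_assignments(pairs, table):
    """Compile `dst = src` pairs into index-based LOAD/STORE instructions."""
    code = []
    for dst, src in pairs:
        code.append(('LOAD', table.index_of(src)))
        code.append(('STORE', table.index_of(dst)))
    return code


def run(code, slots):
    """Execute against a flat list of slots; no name lookups at runtime."""
    stack = []
    for op, index in code:
        if op == 'LOAD':
            stack.append(slots[index])
        elif op == 'STORE':
            slots[index] = stack.pop()
    return slots


table = SlotTable()
table.index_of('x')  # x gets slot 0
program = compile_assignments([('y', 'x'), ('z', 'y')], table)
result = run(program, ['hello', None, None])
```

With names resolved ahead of time, the interpreter loop only indexes into a flat list, which is presumably why separate `_FAST` variants would no longer be needed.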

Highly Changed based on language semantics:

- `CALL_FUNCTION_*` -- Instead of four variants, we may have just one, more
  static, kind.
  - It will support `f(msg, *args)` and `f(*args, **kwargs)`, but maybe not
    much else?

Bytecodes that can be type-specialized:

- `BINARY_*`
- `UNARY_*`
- `COMPARE_OP` -- or maybe just don't allow nonsensical comparisons

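One way type specialization could look, as a sketch under assumptions: if the compiler knows both operand types, it picks a specialized opcode and otherwise falls back to the generic one. The type names (`Int`, `Str`) and the specialized opcode names (`ADD_INT`, `CONCAT_STR`, `LESS_INT`) are hypothetical. The `strict` flag models the "don't allow nonsensical comparisons" option by rejecting unknown combinations at compile time.

```python
# Hypothetical specialized opcodes, keyed by (operator, left type, right type).
SPECIALIZED = {
    ('+', 'Int', 'Int'): 'ADD_INT',
    ('+', 'Str', 'Str'): 'CONCAT_STR',
    ('<', 'Int', 'Int'): 'LESS_INT',
}

# Generic opcodes that dispatch on runtime types, as CPython does.
GENERIC = {'+': 'BINARY_ADD', '<': 'COMPARE_OP'}


def select_opcode(op, left_type, right_type, strict=False):
    """Pick an opcode for a binary operator given static operand types."""
    key = (op, left_type, right_type)
    if key in SPECIALIZED:
        return SPECIALIZED[key]
    if strict:
        # Reject combinations with no specialized form at compile time.
        raise TypeError('unsupported operand types for %s: %s, %s'
                        % (op, left_type, right_type))
    return GENERIC[op]


int_add = select_opcode('+', 'Int', 'Int')    # specialized
mixed_add = select_opcode('+', 'Int', 'Str')  # falls back to the generic opcode
```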
Maybe type-specialized:

- `FOR_ITER` -- iterating over the items in a list or the characters in a
  string could be compiled statically. In other words, the iterator protocol
  isn't strictly necessary.

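To make the `FOR_ITER` point concrete, here is a sketch comparing the protocol-based loop with an index-based lowering that a compiler could emit when the operand is statically known to be a list or string. Both function names are hypothetical. The lowering reads the length once, so it assumes the sequence is not resized inside the loop.

```python
def loop_with_protocol(items):
    """What GET_ITER / FOR_ITER do: go through the iterator protocol."""
    out = []
    it = iter(items)
    while True:
        try:
            x = next(it)
        except StopIteration:
            break
        out.append(x)
    return out


def loop_lowered(items):
    """Index-based lowering for a value known to be a list or str."""
    out = []
    i = 0
    n = len(items)  # assumption: the length does not change during the loop
    while i < n:
        out.append(items[i])
        i += 1
    return out


same_for_list = loop_with_protocol([3, 4, 5]) == loop_lowered([3, 4, 5])
same_for_str = loop_with_protocol('abc') == loop_lowered('abc')
```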
### Removed

Dynamic bytecodes that will go away, because names are statically resolved:

- `BUILD_CLASS`
- `MAKE_FUNCTION`
- `IMPORT_{NAME,STAR}`
- maybe: `MAKE_CLOSURE`: this should be done statically? Closures and classes
  should be the same? It's like calling a constructor.

Other Removed:

- `DELETE_NAME`: Namespaces are static
- Might be unnecessary for our purposes: `YIELD_FROM`
- `EXEC_STMT`: I want a different interface to the compiler, for
  metaprogramming purposes.

Deprecated:

- `PRINT_*` -- this should just be a normal function call

### Additions

- Bytecodes for ASDL structures?
- Bytecodes for shell?
- For parsing VM?

2017
----

This is an elaboration on:

https://docs.python.org/2/library/dis.html

I copy the descriptions and add my notes, based on what I'm working on.

`SETUP_LOOP(delta)`

Pushes a block for a loop onto the block stack. The block spans from the
current instruction with a size of `delta` bytes.

NOTES: compiler2 generates an extra `SETUP_LOOP` for generator expressions,
along with a `POP_BLOCK`.

`POP_BLOCK()`

Removes one block from the block stack. Per frame, there is a stack of blocks,
denoting nested loops, try statements, and such.

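A rough model of the per-frame block stack, as a sketch only: the `Block` and `Frame` classes are hypothetical, and `try`/`finally` handling is ignored. It assumes the exit address is the offset of the following instruction plus `delta`, which is my reading of CPython 2.7's `ceval.c`.

```python
class Block:
    def __init__(self, kind, handler, level):
        self.kind = kind        # e.g. 'loop'
        self.handler = handler  # address to jump to when the block is exited
        self.level = level      # value stack depth when the block was pushed


class Frame:
    def __init__(self):
        self.stack = []        # value stack
        self.block_stack = []  # one stack of blocks per frame

    def setup_loop(self, next_pc, delta):
        # SETUP_LOOP: remember where the loop ends and how deep the stack is.
        self.block_stack.append(Block('loop', next_pc + delta, len(self.stack)))

    def pop_block(self):
        # POP_BLOCK: normal exit from the innermost block.
        return self.block_stack.pop()

    def break_loop(self):
        # BREAK_LOOP: unwind to the innermost loop block, drop the values
        # pushed inside it, and return the address to jump to.
        while True:
            block = self.block_stack.pop()
            del self.stack[block.level:]
            if block.kind == 'loop':
                return block.handler


frame = Frame()
frame.setup_loop(next_pc=3, delta=20)   # outer loop, exits to 23
frame.stack.append('iterator')
frame.setup_loop(next_pc=10, delta=8)   # inner loop, exits to 18
frame.stack.append('temp')
target = frame.break_loop()             # break out of the inner loop only
```

The saved stack level is what lets a `break` discard temporaries left on the value stack without knowing how many there are.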

`LOAD_CLOSURE(i)`

Pushes a reference to the cell contained in slot `i` of the cell and free
variable storage. The name of the variable is `co_cellvars[i]` if `i` is less
than the length of `co_cellvars`. Otherwise it is
`co_freevars[i - len(co_cellvars)]`.

NOTES: compiler2 generates an extra one of these.

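The slot-to-name rule above can be checked directly against real code objects. This sketch uses the `__code__` attribute; `slot_name` is a helper written for this note, not a CPython function.

```python
def outer():
    x = 1          # captured by inner, so x lives in a cell of outer
    def inner():
        return x   # x is a free variable of inner
    return inner


def slot_name(code, i):
    """Name of slot `i` in the combined cell + free variable storage."""
    if i < len(code.co_cellvars):
        return code.co_cellvars[i]
    return code.co_freevars[i - len(code.co_cellvars)]


inner = outer()
outer_code = outer.__code__
inner_code = inner.__code__

outer_slot0 = slot_name(outer_code, 0)  # found in co_cellvars
inner_slot0 = slot_name(inner_code, 0)  # found in co_freevars
```

The same variable shows up as a cell variable in the defining function and as a free variable in the nested one.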

`MAKE_CLOSURE(argc)`

Creates a new function object, sets its `func_closure` slot, and pushes it on
the stack. `TOS` is the code associated with the function, `TOS1` the tuple
containing cells for the closure’s free variables. The function also has `argc`
default parameters, which are found below the cells.

`LOAD_DEREF(i)`

Loads the cell contained in slot `i` of the cell and free variable storage.
Pushes a reference to the object the cell contains on the stack.

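What `MAKE_CLOSURE` and `LOAD_DEREF` operate on can be inspected from Python itself. This sketch uses `__closure__`, the newer spelling of `func_closure`, and builds a second function by hand from a code object plus a tuple of cells, which roughly mirrors the inputs `MAKE_CLOSURE` takes. The names `make_adder`, `add5`, and `add7` are only for illustration.

```python
import types


def make_adder(n):
    def add(x):
        return x + n   # reading n goes through a cell
    return add


add5 = make_adder(5)
cells = add5.__closure__            # tuple of cells, one per free variable
captured = cells[0].cell_contents   # the object the cell refers to

# Build a function by hand: same code object, different cells.
add7 = types.FunctionType(
    add5.__code__,               # the code object (TOS)
    globals(),
    'add7',
    None,                        # no default arguments
    make_adder(7).__closure__,   # the tuple of cells (TOS1)
)
result = add7(1)
```

This may bear on the question under "Removed" about whether closures could be created statically: the code object is shared, and only the cells differ per instance.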

`GET_ITER()`

Implements `TOS = iter(TOS)`.

NOTES: Hm, how do I implement this? It turns a collection into an iterator.
Gah.

    PyObject *iter = PyObject_GetIter(iterable);

- objects/abstract.c -
- objects/iterobject.c - `PySeqIter_New`

`PySeqIter_Type` has an `it_seq` field: the `PyObject` being iterated over. It
maintains an index too. How does `items()` work as an iterable then?

Then `iter_iternext()` calls:

    PySequence_GetItem(seq, it->it_index)

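The sequence iterator described in these notes is small enough to model in Python. This is a sketch of the behavior as I understand it, not a translation of the C source: `SeqIter` mirrors the `it_seq` and `it_index` fields, and `get_iter` is a simplified stand-in for `PyObject_GetIter`.

```python
class SeqIter:
    """Python model of the sequence iterator in iterobject.c."""

    def __init__(self, seq):
        self.it_seq = seq    # the object being iterated over
        self.it_index = 0    # position of the next item

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None:
            raise StopIteration
        try:
            item = self.it_seq[self.it_index]   # like PySequence_GetItem
        except IndexError:
            self.it_seq = None                  # exhausted: drop the sequence
            raise StopIteration
        self.it_index += 1
        return item

    next = __next__  # Python 2 spelling


def get_iter(obj):
    """Simplified GET_ITER: prefer __iter__, else fall back to indexing."""
    if hasattr(obj, '__iter__'):
        return obj.__iter__()
    if hasattr(obj, '__getitem__'):
        return SeqIter(obj)
    raise TypeError('%r object is not iterable' % type(obj).__name__)


class Squares:
    """Has only __getitem__, so it needs the sequence iterator."""

    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i


squares = list(get_iter(Squares()))
```

As far as I can tell, types such as lists define their own iterator and take the first branch, so the index-based fallback only matters for objects that implement `__getitem__` alone.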

`LOAD_FAST(var_num)`

Pushes a reference to the local `co_varnames[var_num]` onto the stack.

NOTES: This still does a named lookup? Generator expressions do
`LOAD_FAST 0 (.0)` since there is no formal parameter name.

Oh I see, there is a `PyObject** fastlocals` in EvalFrame.

It's initialized to `f->f_localsplus` -- the frame holds them. Oh I see, that's
where the frame setup is different! Don't need `inspect.getcallargs`.

FastCall populates `fastlocals` from `PyObject** args` and `nargs`.

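Both observations can be seen from Python: `co_varnames` gives the name for each local slot, and a generator expression's code object has the implicit argument `.0` in slot 0. `populate_fastlocals` below is a hypothetical model of the frame setup for positional arguments only, not CPython's actual logic.

```python
def f(a, b):
    total = a + b
    return total


names = f.__code__.co_varnames        # arguments first, then other locals

gen = (x for x in [1, 2, 3])
gen_names = gen.gi_code.co_varnames   # slot 0 is the implicit argument


def populate_fastlocals(code, args):
    """Model of frame setup: positional args fill the first local slots."""
    fastlocals = [None] * code.co_nlocals
    for i, value in enumerate(args):
        fastlocals[i] = value
    return fastlocals


slots = populate_fastlocals(f.__code__, (10, 20))
```

Since arguments land directly in the first slots, a call needs no name-to-value dictionary, which matches the note that `inspect.getcallargs` is not needed on this path.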