-
Notifications
You must be signed in to change notification settings - Fork 286
BST: convert all nodes to directly operate at vregs instead of names #1377
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7142a25
to
a2caa3b
Compare
ok this should be ready for an initial highlevel review + suggestions for renaming etc... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great! I left some initial thoughts on naming / some random other things I noticed. Let me know when you think it's ready for full review.
src/core/bst.h
Outdated
|
||
class BST_slice : public BST { | ||
// base class of all nodes which have a single destination vreg | ||
class BST_dst : public BST_stmt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only because you asked for thoughts on naming :P When I first saw BST_dst
I thought it was the destination that a bst node assigns to, not a bst node with a destination. I have a very small preference for something like BST_stmt_withdest
, and maybe calling isBST_dst
something like has_dest
.
class SliceVisitor; | ||
class BST_keyword; | ||
|
||
static constexpr int VREG_UNDEFINED = std::numeric_limits<int>::min(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment here would be nice, especially if this is pulling double duty as both a source and a destination vreg.
src/core/bst.h
Outdated
static BST_UnpackIntoArray* create(int num_elts) { | ||
BST_UnpackIntoArray* o | ||
= (BST_UnpackIntoArray*)new char[offsetof(BST_UnpackIntoArray, vreg_dst) + num_elts * sizeof(int)]; | ||
new (o) BST_UnpackIntoArray(num_elts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small personal preference, but I usually like to override operator new
rather than using placement-new. I like it because it makes it clear you are just changing how the allocation works, while still letting the compiler deal with the casting + constructor. It also blocks the user from doing something like new BST_UnpackIntoArray(*BST_UnpackIntoArray::create(10))
, which I think would currently compile but crash.
public: | ||
BST_expr* target; | ||
BST_expr* value; | ||
int vreg_src = VREG_UNDEFINED; // this vreg will not get killed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be great to have a comment saying why we need a CopyVReg bytecode. And maybe have a comment towards the top of the file that goes over the properties of the bytecode, such as that source vregs are always killed after use, how constants are represented, etc.
src/core/bst.h
Outdated
virtual bool isBST_dst() const { return false; } | ||
}; | ||
|
||
class BST_stmt : public BST { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is the only class that inherits from BST now -- can we just fold these fields into BST itself?
virtual bool visit_keyword(BST_keyword* node) { return false; } | ||
virtual bool visit_langprimitive(BST_LangPrimitive* node) { return false; } | ||
virtual bool visit_jump(BST_Jump* node) { return false; } | ||
virtual bool visit_landingpad(BST_Landingpad* node) override { return false; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the override
markers here or add them to all of the functions? newer gccs complain. same with FlattenVisitor and BasicBlockTypePropagator.
Maybe we should turn off -Winconsistent-missing-override
until we can get that checked in CI :P
src/core/bst.h
Outdated
static const BST_TYPE::BST_TYPE TYPE = BST_TYPE::DeleteSubSlice; | ||
}; | ||
|
||
class BST_Ellipsis : public BST_dst { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be treated as a constant rather than another node (looks like this is how CPython does it).
src/runtime/types.h
Outdated
public: | ||
DEFAULT_CLASS(module_cls); | ||
|
||
std::vector<Box*> constants; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is worth a comment here -- I wasn't expecting to find that constants are stored twice on the module, once in the str/int/long/etc_constants
and once in constants
. I think we should eventually move to having the constants be stored on the code objects, but for now I think the current way is fine but should have a comment / todo.
bool has_dest_vreg() const override { return true; } | ||
}; | ||
|
||
#define BSTVARVREGS(opcode, base_class, num_elts, vreg_dst) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kmod Is this what you had in mind?
I also tried to reduce the code duplication using macros... But in general I don't worry too much about the Nodes currently because I'm currently working on a python script which auto generates them in order to support the 1 and 4 byte vregs...
I cleaned it up, it's ready now for review 👍. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some very minor comments; looks great, let's merge!
src/runtime/types.h
Outdated
public: | ||
std::unique_ptr<SourceInfo> source; // source can be NULL for functions defined in the C/C++ runtime | ||
std::unique_ptr<SourceInfo> source; // source can be NULL for functions defined in the C/C++ runtime | ||
const ConstantVRegInfo constant_vregs; // keeps track of all constants inside the bytecode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but could we change this to BORROWED(ConstantVRegInfo)
?
// baseline jit: [user visible] [cross block] | ||
// llvm jit : [user visible] | ||
class VRegInfo { | ||
private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but could we remove the BOM in cfg.cpp? In vim I could do this with :set nobomb
. I'm not really sure of the pros or cons but I don't think any of the other files have one.
return phi; | ||
} | ||
|
||
class SymTableDstVRegDeleter : NoopBSTVisitor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have a comment for what this is used for?
**basic design:** This PR changes our BST nodes to directly operate on vregs instead of pointers to other nodes and names (except a few exceptions: `BST_Invoke`, `BST_MakeFunction` and `BST_MakeClass` which still needs to get converted). Most nodes got a destination vreg and one or more source vregs. Currently all of them are 32bit long but I plan to store them more compact very soon. Some nodes support a variable size of operands (e.g. the tuple node) but the size can't change after creating the node. I removed several unneeded opcodes and split a lot of nodes into separate opcodes (it may make sense to split them even further in the future). Generally all instructions except `CopyVReg` kill the source operand vregs except if the source is a ref to a constant. If one needs the preserve the source vreg on needs to create a new temporary using the `CopyVReg` opcode. There is a special vreg number: `VREG_UNDEFINED = std::numeric_limits<int>::min()`. - when it's set as an operand vreg: it means that this is a not-set optional argument. (e.g. for a slice which only has `lower` set, `upper` would be `VREG_UNDEFINED`) - if it's the destination it's means the result value should get immediately killed (e.g. `invoke 15 16: %undef = %11(%14)` this is a call whose result gets ignored) all other negative vreg numbers are indices into a constant table (after adding 1 and making them positive). (e.g. `(4, 2, 'lala')` generates: `%undef = (%-1|4|, %-2|2|, %-3|'lala'|)` this creates a tuple whose elements are the constant idx -1, -2 and -3. In order to make it easier for a human to understand we print the actual value of the constant between | characters) - constants can be all str and numeric types and 'None'. - every constant will only get stored once in the table this reduces the total memory usage by about 20% currently but I'm very sure with the future changes it will be significantly lower. **near future:** - change the jump and branch instruction to reference `CFGBlocks` by index. - store all `InternedString` inside a table and use indices into the the table to access them. - remove the 'BoxedCode*' member - devirtualize the classes = with this changes the bytecode can get freely copied around (only need to update the CFGBlock table) which allows us to attach the directly next to each other. - I plan to use one bit of the the opcode to mark the instruction as only requiring 8bit vreg operands (which should handle the majority of cases with 128 temps and 127 constants + 1undef vreg value) - another bit will get used to specify if this instruction is inside an `invoke`. if this bit is set there are 2 one 1 or 4 bytes long block indices directly behind the instruction. - serialize the bytecode to disk. (maybe serialize the constants using pickle) **thing which need to get improved** - currently the constant table get's attached to the `BoxedModule` maybe there is a better location, I also needed to pass the `BoxedModule` into some functions e.g. BST printing because otherwise we could not pretty-print the constants - `BST_Name` is not an opcode it's just used to initialize the arguments when a function get's called and stores where and how the arguments need to get stored. - more consistent opcode names and rename `TmpValue` to something better - we currently don't print the `InternedString` name - we only print the vreg number **additional changed made which are hidden in the large diff** 👎 - removed unused code initializing the items of `BST_Dict` (we use/used separate assignments to add the items) - lower `ExtSlice` inside the CFG phase to a tuple of slices - separated opcode for load subscript when it needs to be a slice and when it's only lower and upper (=`__getslice__`) before this got handled in the interpreter/jit - generate a constant `None` load inside the CFG when `None` gets loaded by name
👍 👍 |
basic design:
This PR changes our BST nodes to directly operate on vregs instead of pointers to other nodes and names (except a few exceptions:
BST_Invoke
,BST_MakeFunction
andBST_MakeClass
which still needs to get converted).Most nodes got a destination vreg and one or more source vregs. Currently all of them are 32bit long but I plan to store them more compact very soon. Some nodes support a variable size of operands (e.g. the tuple node) but the size can't change after creating the node. I removed several unneeded opcodes and split a lot of nodes into separate opcodes (it may make sense to split them even further in the future).
Generally all instructions except
CopyVReg
kill the source operand vregs except if the source is a ref to a constant. If one needs the preserve the source vreg on needs to create a new temporary using theCopyVReg
opcode.There is a special vreg number:
VREG_UNDEFINED = std::numeric_limits<int>::min()
.lower
set,upper
would beVREG_UNDEFINED
)invoke 15 16: %undef = %11(%14)
this is a call whose result gets ignored)all other negative vreg numbers are indices into a constant table (after adding 1 and making them positive).
(e.g.
(4, 2, 'lala')
generates:%undef = (%-1|4|, %-2|2|, %-3|'lala'|)
this creates a tuple whose elements are the constant idx -1, -2 and -3. In order to make it easier for a human to understand we print the actual value of the constant between | characters)this reduces the total memory usage by about 20% currently but I'm very sure with the future changes it will be significantly lower.
near future:
CFGBlocks
by index.InternedString
inside a table and use indices into the the table to access them.BoxedCode*
member= with this changes the bytecode can get freely copied around (only need to update the CFGBlock table) which allows us to store them directly next to each other.
invoke
. if this bit is set there are two 1 or 4 bytes long block indices directly behind the instruction.thing which need to get improved
BoxedModule
maybe there is a better location, I also needed to pass theBoxedModule
into some functions e.g. BST printing because otherwise we could not pretty-print the constantsBST_Name
is not an opcode it's just used to initialize the arguments when a function get's called and stores where and how the arguments need to get stored.TmpValue
to something betterInternedString
name - we only print the vreg numberadditional changed made which are hidden in the large diff 👎
BST_Dict
(we use/used separate assignments to add the items)ExtSlice
inside the CFG phase to a tuple of slices__getslice__
) before this got handled in the interpreter/jitNone
load inside the CFG whenNone
gets loaded by name