Skip to content

Conversation

undingen
Copy link
Contributor

@undingen undingen commented Sep 27, 2016

basic design:
This PR changes our BST nodes to directly operate on vregs instead of pointers to other nodes and names (except a few exceptions: BST_Invoke, BST_MakeFunction and BST_MakeClass which still needs to get converted).
Most nodes got a destination vreg and one or more source vregs. Currently all of them are 32bit long but I plan to store them more compact very soon. Some nodes support a variable size of operands (e.g. the tuple node) but the size can't change after creating the node. I removed several unneeded opcodes and split a lot of nodes into separate opcodes (it may make sense to split them even further in the future).
Generally all instructions except CopyVReg kill the source operand vregs except if the source is a ref to a constant. If one needs the preserve the source vreg on needs to create a new temporary using the CopyVReg opcode.

There is a special vreg number: VREG_UNDEFINED = std::numeric_limits<int>::min().

  • when it's set as an operand vreg: it means that this is a not-set optional argument. (e.g. for a slice which only has lower set, upper would be VREG_UNDEFINED)
  • if it's the destination it's means the result value should get immediately killed (e.g. invoke 15 16: %undef = %11(%14) this is a call whose result gets ignored)

all other negative vreg numbers are indices into a constant table (after adding 1 and making them positive).
(e.g. (4, 2, 'lala') generates: %undef = (%-1|4|, %-2|2|, %-3|'lala'|) this creates a tuple whose elements are the constant idx -1, -2 and -3. In order to make it easier for a human to understand we print the actual value of the constant between | characters)

  • constants can be all str and numeric types and 'None'.
  • every constant will only get stored once in the table

this reduces the total memory usage by about 20% currently but I'm very sure with the future changes it will be significantly lower.

near future:

  • change the jump and branch instruction to reference CFGBlocks by index.
  • store all InternedString inside a table and use indices into the the table to access them.
  • remove the BoxedCode* member
  • devirtualize the classes
    = with this changes the bytecode can get freely copied around (only need to update the CFGBlock table) which allows us to store them directly next to each other.
  • I plan to use one bit of the the opcode to mark the instruction as only requiring 8bit vreg operands (which should handle the majority of cases with 128 temps and 127 constants + 1undef vreg value)
  • another bit will get used to specify if this instruction is inside an invoke. if this bit is set there are two 1 or 4 bytes long block indices directly behind the instruction.
  • serialize the bytecode to disk. (maybe serialize the constant table using pickle)

thing which need to get improved

  • currently the constant table get's attached to the BoxedModule maybe there is a better location, I also needed to pass the BoxedModule into some functions e.g. BST printing because otherwise we could not pretty-print the constants
  • BST_Name is not an opcode it's just used to initialize the arguments when a function get's called and stores where and how the arguments need to get stored.
  • more consistent opcode names and rename TmpValue to something better
  • we currently don't print the InternedString name - we only print the vreg number

additional changed made which are hidden in the large diff 👎

  • removed unused code initializing the items of BST_Dict (we use/used separate assignments to add the items)
  • lower ExtSlice inside the CFG phase to a tuple of slices
  • separated opcode for load subscript when it needs to be a slice and when it's only lower and upper (=__getslice__) before this got handled in the interpreter/jit
  • generate a constant None load inside the CFG when None gets loaded by name

@undingen undingen added the wip label Sep 27, 2016
@undingen undingen force-pushed the new_bst4 branch 4 times, most recently from 7142a25 to a2caa3b Compare September 29, 2016 21:09
@undingen
Copy link
Contributor Author

ok this should be ready for an initial highlevel review + suggestions for renaming etc...

Copy link
Collaborator

@kmod kmod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great! I left some initial thoughts on naming / some random other things I noticed. Let me know when you think it's ready for full review.

src/core/bst.h Outdated

class BST_slice : public BST {
// base class of all nodes which have a single destination vreg
class BST_dst : public BST_stmt {
Copy link
Collaborator

@kmod kmod Oct 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only because you asked for thoughts on naming :P When I first saw BST_dst I thought it was the destination that a bst node assigns to, not a bst node with a destination. I have a very small preference for something like BST_stmt_withdest, and maybe calling isBST_dst something like has_dest.

class SliceVisitor;
class BST_keyword;

static constexpr int VREG_UNDEFINED = std::numeric_limits<int>::min();
Copy link
Collaborator

@kmod kmod Oct 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comment here would be nice, especially if this is pulling double duty as both a source and a destination vreg.

src/core/bst.h Outdated
static BST_UnpackIntoArray* create(int num_elts) {
BST_UnpackIntoArray* o
= (BST_UnpackIntoArray*)new char[offsetof(BST_UnpackIntoArray, vreg_dst) + num_elts * sizeof(int)];
new (o) BST_UnpackIntoArray(num_elts);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small personal preference, but I usually like to override operator new rather than using placement-new. I like it because it makes it clear you are just changing how the allocation works, while still letting the compiler deal with the casting + constructor. It also blocks the user from doing something like new BST_UnpackIntoArray(*BST_UnpackIntoArray::create(10)), which I think would currently compile but crash.

public:
BST_expr* target;
BST_expr* value;
int vreg_src = VREG_UNDEFINED; // this vreg will not get killed!
Copy link
Collaborator

@kmod kmod Oct 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be great to have a comment saying why we need a CopyVReg bytecode. And maybe have a comment towards the top of the file that goes over the properties of the bytecode, such as that source vregs are always killed after use, how constants are represented, etc.

src/core/bst.h Outdated
virtual bool isBST_dst() const { return false; }
};

class BST_stmt : public BST {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is the only class that inherits from BST now -- can we just fold these fields into BST itself?

virtual bool visit_keyword(BST_keyword* node) { return false; }
virtual bool visit_langprimitive(BST_LangPrimitive* node) { return false; }
virtual bool visit_jump(BST_Jump* node) { return false; }
virtual bool visit_landingpad(BST_Landingpad* node) override { return false; }
Copy link
Collaborator

@kmod kmod Oct 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the override markers here or add them to all of the functions? newer gccs complain. same with FlattenVisitor and BasicBlockTypePropagator.

Maybe we should turn off -Winconsistent-missing-override until we can get that checked in CI :P

src/core/bst.h Outdated
static const BST_TYPE::BST_TYPE TYPE = BST_TYPE::DeleteSubSlice;
};

class BST_Ellipsis : public BST_dst {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be treated as a constant rather than another node (looks like this is how CPython does it).

public:
DEFAULT_CLASS(module_cls);

std::vector<Box*> constants;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worth a comment here -- I wasn't expecting to find that constants are stored twice on the module, once in the str/int/long/etc_constants and once in constants. I think we should eventually move to having the constants be stored on the code objects, but for now I think the current way is fine but should have a comment / todo.

bool has_dest_vreg() const override { return true; }
};

#define BSTVARVREGS(opcode, base_class, num_elts, vreg_dst) \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmod Is this what you had in mind?
I also tried to reduce the code duplication using macros... But in general I don't worry too much about the Nodes currently because I'm currently working on a python script which auto generates them in order to support the 1 and 4 byte vregs...

@undingen
Copy link
Contributor Author

undingen commented Oct 3, 2016

I cleaned it up, it's ready now for review 👍.
But I left my changes in a separate commit to it's easier to see what I changed.

@undingen undingen changed the title wip: bst: convert all nodes to directly operate at vregs instead of names bst: convert all nodes to directly operate at vregs instead of names Oct 4, 2016
@undingen undingen removed the wip label Oct 4, 2016
Copy link
Collaborator

@kmod kmod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some very minor comments; looks great, let's merge!

public:
std::unique_ptr<SourceInfo> source; // source can be NULL for functions defined in the C/C++ runtime
std::unique_ptr<SourceInfo> source; // source can be NULL for functions defined in the C/C++ runtime
const ConstantVRegInfo constant_vregs; // keeps track of all constants inside the bytecode
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but could we change this to BORROWED(ConstantVRegInfo)?

// baseline jit: [user visible] [cross block]
// llvm jit : [user visible]
class VRegInfo {
private:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but could we remove the BOM in cfg.cpp? In vim I could do this with :set nobomb. I'm not really sure of the pros or cons but I don't think any of the other files have one.

return phi;
}

class SymTableDstVRegDeleter : NoopBSTVisitor {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have a comment for what this is used for?

**basic design:**
This PR changes our BST nodes to directly operate on vregs instead of pointers to other nodes and names (except a few exceptions: `BST_Invoke`, `BST_MakeFunction` and `BST_MakeClass` which still needs to get converted).
Most nodes got a destination vreg and one or more source vregs. Currently all of them are 32bit long but I plan to store them more compact very soon. Some nodes support a variable size of operands (e.g. the tuple node) but the size can't change after creating the node. I removed several unneeded opcodes and split a lot of nodes into separate opcodes (it may make sense to split them even further in the future).
Generally all instructions except `CopyVReg` kill the source operand vregs except if the source is a ref to a constant. If one needs the preserve the source vreg on needs to create a new temporary using the `CopyVReg` opcode.

There is a special vreg number: `VREG_UNDEFINED = std::numeric_limits<int>::min()`.
- when it's set as an operand vreg: it means that this is a not-set optional argument. (e.g. for a slice which only has `lower` set, `upper` would be `VREG_UNDEFINED`)
- if it's the destination it's means the result value should get immediately killed (e.g. `invoke 15 16: %undef = %11(%14)` this is a call whose result gets ignored)

all other negative vreg numbers are indices into a constant table (after adding 1 and making them positive).
(e.g. `(4, 2, 'lala')` generates:  `%undef = (%-1|4|, %-2|2|, %-3|'lala'|)` this creates a tuple whose elements are the constant idx -1, -2 and -3. In order to make it easier for a human to understand we print the actual value of the constant between | characters)
- constants can be all str and numeric types and 'None'.
- every constant will only get stored once in the table

this reduces the total memory usage by about 20% currently but I'm very sure with the future changes it will be significantly lower.

**near future:**
- change the jump and branch instruction to reference `CFGBlocks` by index.
- store all `InternedString` inside a table and use indices into the the table to access them.
- remove the 'BoxedCode*' member
- devirtualize the classes
= with this changes the bytecode can get freely copied around (only need to update the CFGBlock table) which allows us to attach the directly next to each other.

- I plan to use one bit of the the opcode to mark the instruction as only requiring 8bit vreg operands (which should handle the majority of cases with 128 temps and 127 constants + 1undef vreg value)
- another bit will get used to specify if this instruction is inside an `invoke`. if this bit is set there are 2 one 1 or 4 bytes long block indices directly behind the instruction.

- serialize the bytecode to disk. (maybe serialize the constants using pickle)

**thing which need to get improved**
- currently the constant table get's attached to the `BoxedModule` maybe there is a better location, I also needed to pass the `BoxedModule` into some functions e.g. BST printing because otherwise we could not pretty-print the constants
- `BST_Name` is not an opcode it's just used to initialize the arguments when a function get's called and stores where and how the arguments need to get stored.
- more consistent opcode names and rename `TmpValue` to something better
- we currently don't print the `InternedString` name - we only print the vreg number

**additional changed made which are hidden in the large diff** 👎
- removed unused code initializing the items of `BST_Dict` (we use/used separate  assignments to add the items)
- lower `ExtSlice` inside the CFG phase to a tuple of slices
- separated opcode for load subscript when it needs to be a slice and when it's only lower and upper (=`__getslice__`) before this got handled in the interpreter/jit
- generate a constant `None` load inside the CFG when `None` gets loaded by name
@undingen undingen changed the title bst: convert all nodes to directly operate at vregs instead of names BST: convert all nodes to directly operate at vregs instead of names Oct 5, 2016
@undingen undingen merged commit 9cd8e75 into pyston:master Oct 5, 2016
@kmod
Copy link
Collaborator

kmod commented Oct 5, 2016

👍 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants