@@ -4,6 +4,7 @@ What's currently here:
 
 - lexer.py: lexer for C, originally written by Mark Shannon
 - plexer.py: OO interface on top of lexer.py; main class: `PLexer`
+- parser.py: Parser for instruction definition DSL; main class `Parser`
 - `generate_cases.py`: driver script to read `Python/bytecodes.c` and
   write `Python/generated_cases.c.h`
 
@@ -18,16 +19,16 @@ The DSL for the instruction definitions in `Python/bytecodes.c` is described
 Note that there is some dummy C code at the top and bottom of the file
 to fool text editors like VS Code into believing this is valid C code.
 
-## A bit about the parsers
+## A bit about the parser
 
-The parser classes use use a pretty standard recursive descent scheme,
+The parser class uses a pretty standard recursive descent scheme,
 but with unlimited backtracking.
 The `PLexer` class tokenizes the entire input before parsing starts.
 We do not run the C preprocessor.
 Each parsing method returns either an AST node (a `Node` instance)
 or `None`, or raises `SyntaxError` (showing the error in the C source).
 
-All parsing methods are decorated with `@contextual`, which automatically
+Most parsing methods are decorated with `@contextual`, which automatically
 resets the tokenizer input position when `None` is returned.
 Parsing methods may also raise `SyntaxError`, which is irrecoverable.
 When a parsing method returns `None`, it is possible that after backtracking
@@ -36,13 +37,3 @@ a different parsing method returns a valid AST.
 Neither the lexer nor the parsers are complete or fully correct.
 Most known issues are tersely indicated by `# TODO:` comments.
 We plan to fix issues as they become relevant.
-
-A particular problem is that we don't know which identifiers are typedefs.
-This makes some parts of the C grammar ambiguous, for example,
-`(x)*y` could be a cast to type `x` of the pointer dereferencing `*y`,
-or it could mean to compute `x` times `y`.
-Similarly, `(x)(y)` could cast `y` to type `x`,
-or call function `x` with argument `y`.
-Our parser currently interprets such cases as casts.
-We will solve this when we need to understand expressions in more detail
-(for example, by providing a list of known typedefs generated by hand).
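
The backtracking scheme described in the second hunk can be illustrated with a minimal sketch. This is not the code in `parser.py`: the `TinyParser` class, its `pos` and `tokens` attributes, and the `expect`, `call`, `name`, and `atom` methods are hypothetical names invented for this example, and the decorator shown only mimics the documented behaviour of `@contextual` (reset the input position when `None` is returned).

```python
import functools


def contextual(func):
    """Assumed behaviour: restore the input position if `func` returns None."""
    @functools.wraps(func)
    def wrapper(self):
        begin = self.pos
        node = func(self)
        if node is None:
            self.pos = begin  # backtrack so another alternative can be tried
        return node
    return wrapper


class TinyParser:
    """Hypothetical parser over an already tokenized list of strings."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def expect(self, text):
        # Consume the next token if it equals `text`; otherwise return None.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == text:
            self.pos += 1
            return text
        return None

    @contextual
    def name(self):
        # name: any identifier-like token
        if self.pos < len(self.tokens) and self.tokens[self.pos].isidentifier():
            self.pos += 1
            return self.tokens[self.pos - 1]
        return None

    @contextual
    def call(self):
        # call: name '(' ')'
        name = self.name()
        if name and self.expect("(") and self.expect(")"):
            return ("call", name)
        return None

    def atom(self):
        # Try each alternative in turn; a failed one has already been rewound.
        node = self.call()
        if node is not None:
            return node
        name = self.name()
        if name is not None:
            return ("name", name)
        # No alternative matched: this error is not recoverable.
        raise SyntaxError(f"unexpected token at position {self.pos}")
```

With the token list `["x", "+"]`, `call()` consumes `x`, fails to find `(`, and returns `None`; the decorator rewinds `pos` to 0, so the following `name()` attempt in `atom()` starts from the same token and succeeds.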