@rudi-c rudi-c commented Aug 28, 2015

Not all references in the program are being scanned, but we get away with it for now. Until we have a moving collector, don't actually scan those references, but make note that they are there.
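The idea of noting a reference without scanning it could be sketched roughly as below. This is a minimal illustration and not the code in this PR: `Visitor`, `visitRedundant`, `MarkingVisitor`, `MovingVisitor` and `Node` are hypothetical names, and the sketch assumes a redundant reference is one that the collector already reaches some other way.

```cpp
#include <vector>

// Hypothetical visitor interface (illustrative names, not Pyston's actual API).
class Visitor {
public:
    virtual ~Visitor() = default;

    // A reference the collector must follow to keep the target alive.
    virtual void visit(void** slot) = 0;

    // A reference that is assumed to be reachable through some other path.
    // The default does nothing, so a marking collector pays almost no cost.
    virtual void visitRedundant(void** slot) { (void)slot; }
};

// Marking only needs each live object once, so redundant visits stay no-ops.
class MarkingVisitor : public Visitor {
public:
    std::vector<void*> worklist;
    void visit(void** slot) override {
        if (*slot)
            worklist.push_back(*slot);
    }
};

// A moving collector has to know about every slot so it can rewrite the
// pointer after the object moves, so it records redundant slots as well.
class MovingVisitor : public Visitor {
public:
    std::vector<void**> slots;
    void visit(void** slot) override { slots.push_back(slot); }
    void visitRedundant(void** slot) override { slots.push_back(slot); }
};

// Example object with one ordinary and one redundant reference.
struct Node {
    void* child;
    void* cached; // assumed to be reachable elsewhere too
    void gc_visit(Visitor* v) {
        v->visit(&child);
        v->visitRedundant(&cached);
    }
};
```

With this shape, existing `gc_visit` functions can start declaring the extra references now, and only a future moving visitor would override `visitRedundant`.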

I have no idea how the extra code I added ended up making things faster; it's probably a fluke, but at least it's not slower.

Comparing to '['baseline']'
running ['python', '../pyston-perf/benchmarking/benchmark_suite/fannkuch_med.py']
1.03298902512
              pyston (calibration)                      :   1.03s baseline: 1.02 (  1.0%)
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']
4.54294991493
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']
2.61548089981
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']
2.55432510376
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']
2.54696393013
              pyston django_template3.py                :   2.55s baseline: 2.55 ( -0.1%)
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']
3.22923612595
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']
2.15312814713
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']
2.12213397026
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']
2.12070083618
              pyston pyxl_bench.py                      :   2.12s baseline: 2.16 ( -1.7%)
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2.py']
3.63884902
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/b$
nchmark_suite/sqlalchemy_imperative2.py']
2.62657094002
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/b$
nchmark_suite/sqlalchemy_imperative2.py']
2.63046097755
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/b$
nchmark_suite/sqlalchemy_imperative2.py']
2.62272500992
              pyston sqlalchemy_imperative2.py          :   2.62s baseline: 2.62 (  0.1%)
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']
16.8686971664
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']
15.7492239475
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']
15.7083630562
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']
14.6849870682
              pyston django_template3_10x.py            :  14.68s baseline: 14.72 ( -0.2%)
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']
18.3980820179
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']
16.8311018944
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']
16.9298191071
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']
16.213558197
              pyston pyxl_bench_10x.py                  :  16.21s baseline: 16.62 ( -2.5%)
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']
22.6907730103
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']
20.723708868
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']
20.665446043
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']
20.5785398483
              pyston sqlalchemy_imperative2_10x.py      :  20.58s baseline: 20.34 (  1.2%)
              pyston (geomean-2ec9)                     :   2.42s baseline: 2.43 ( -0.6%)
Deleting report 'last'
Saving results to 'last'

rudi-c commented Aug 28, 2015

This makes a little bit more sense.

              pyston (calibration)                      :   1.02s baseline: 1.01 (  1.3%)                                                                                  
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']                                   
4.57949185371                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']                                   
2.64598608017                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']                                   
2.58076500893                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3.py']                                   
2.55589199066                                                                                                                                                              
              pyston django_template3.py                :   2.56s baseline: 2.52 (  1.4%)                                                                                  
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']                                         
3.24633908272                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']                                         
2.11768198013                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']                                         
2.15740418434                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench.py']                                         
2.13894510269                                                                                                                                                              
              pyston pyxl_bench.py                      :   2.12s baseline: 2.13 ( -0.5%)                                                                                  
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2.py']                             
3.63308310509                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2.py']                             
2.62131094933                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2.py']                             
2.63231086731                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2.py']                             
2.61750292778                                                                                                                                                              
              pyston sqlalchemy_imperative2.py          :   2.62s baseline: 2.61 (  0.3%)                                                                                  
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']                               
16.9421300888                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']                               
15.7274749279                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']                               
15.8433899879                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/django_template3_10x.py']                               
14.8036231995                                                                                                                                                              
              pyston django_template3_10x.py            :  14.80s baseline: 14.59 (  1.5%)                                                                                 
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']                                     
18.2095270157                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']                                     
16.5363659859                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']                                     
16.4009490013                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/pyxl_bench_10x.py']                                     
16.7375910282                                                                                                                                                              
              pyston pyxl_bench_10x.py                  :  16.40s baseline: 16.68 ( -1.6%)                                                                                 
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']                         
22.6622071266                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']                         
20.5999491215                                                                                                                                                              
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']                         
20.642480135                                                                                                                                                               
running ['../pyston-perf/benchmarking/../../pyston/./pyston_release', '../pyston-perf/benchmarking/benchmark_suite/sqlalchemy_imperative2_10x.py']                         
20.5364289284                                                                                                                                                              
              pyston sqlalchemy_imperative2_10x.py      :  20.54s baseline: 20.39 (  0.7%)                                                                                 
              pyston (geomean-2ec9)                     :   2.42s baseline: 2.41 (  0.4%)                                                                                  
Deleting report 'last'                                                                                                                                                     
Saving results to 'last'                                                                                                                                                   

rudi-c commented Aug 29, 2015

For visiting objects on the stack, I have another variant like this:

// Interface for objects that are not GC-allocated themselves but hold
// pointers to GC-tracked objects.
class UntrackedObjectWithTrackedPointers {
public:
    virtual ~UntrackedObjectWithTrackedPointers() = default;
    virtual void gc_visit(GCVisitor* visitor) = 0;
};

// RAII handle: registers obj for GC visiting for the lifetime of the handle.
class GCHandled {
    UntrackedObjectWithTrackedPointers* obj;

    // Shadow stack of currently live handled objects, itself a GC root.
    class GCHandledStack : public GCAllocatedRuntime {
        std::vector<UntrackedObjectWithTrackedPointers*> stack;

        friend GCHandled;
    public:
        // Forward the visit to every live handled object.
        virtual void gc_visit(GCVisitor* visitor) {
            for (UntrackedObjectWithTrackedPointers* handled : stack) {
                if (handled) {
                    handled->gc_visit(visitor);
                }
            }
        }

        void push(UntrackedObjectWithTrackedPointers* o) {
            stack.push_back(o);
        }

        void pop() {
            stack.pop_back();
        }
    };

    static GCHandledStack* gc_handled_stack;

public:
    GCHandled(UntrackedObjectWithTrackedPointers* obj);
    ~GCHandled();
};

GCHandled::GCHandledStack* GCHandled::gc_handled_stack = NULL;

GCHandled::GCHandled(UntrackedObjectWithTrackedPointers* obj) : obj(obj) {
    // Lazily create the shadow stack and register it as a permanent root.
    if (!gc_handled_stack) {
        gc_handled_stack = new GCHandledStack();
        registerPermanentRoot(gc_handled_stack);
    }
    gc_handled_stack->push(obj);
}

GCHandled::~GCHandled() {
    // Handles are destroyed in reverse order of construction (LIFO),
    // so the top of the stack is this handle's object.
    gc_handled_stack->pop();
}

The advantage is that it can be used for any kind of stack-bound lifetime, including pointers held in unique_ptrs. So we would also use this version for the rewriter.

std::unique_ptr<Rewriter> rewriter(Rewriter::createRewriter(return_addr, num_orig_args, "runtimeCall"));                                               
gc::GCHandled handled(rewriter.get());                                                                                                                                                                                                                                             

However, the StackObjectWithGCHandler in the third commit is a bit nicer to use because you don't have to create a separate object. Any object that inherits from StackObjectWithGCHandler will have its GC handler called, even if it is stack-allocated.
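For comparison, an inheritance-based variant might look something like the sketch below. This is not the code from the third commit: the body of `StackObjectWithGCHandler`, the `gc_handler` and `visitAll` names, the stand-in `GCVisitor`, and `FakeRewriter` are all assumptions made for illustration. The base class registers `this` on construction and unregisters on destruction, which relies on stack objects being destroyed in reverse order of construction.

```cpp
#include <vector>

// Stand-in for the real visitor type; it only counts visits here.
struct GCVisitor {
    int visited = 0;
};

// Hypothetical base class: any subclass gets its handler called while alive.
class StackObjectWithGCHandler {
public:
    StackObjectWithGCHandler() { live().push_back(this); }
    // Assumes strictly LIFO lifetimes, as with stack-allocated objects.
    virtual ~StackObjectWithGCHandler() { live().pop_back(); }

    // Copying would register the same handler twice, so forbid it.
    StackObjectWithGCHandler(const StackObjectWithGCHandler&) = delete;
    StackObjectWithGCHandler& operator=(const StackObjectWithGCHandler&) = delete;

    virtual void gc_handler(GCVisitor* visitor) = 0;

    // What the collector would call when scanning roots.
    static void visitAll(GCVisitor* visitor) {
        for (StackObjectWithGCHandler* o : live())
            o->gc_handler(visitor);
    }

private:
    static std::vector<StackObjectWithGCHandler*>& live() {
        static std::vector<StackObjectWithGCHandler*> objs;
        return objs;
    }
};

// Example subclass; no separate handle object is needed at the use site.
class FakeRewriter : public StackObjectWithGCHandler {
public:
    void gc_handler(GCVisitor* visitor) override { visitor->visited++; }
};

int countVisitsWithNestedObjects() {
    GCVisitor v;
    FakeRewriter outer;
    {
        FakeRewriter inner;
        StackObjectWithGCHandler::visitAll(&v); // visits outer and inner
    }
    StackObjectWithGCHandler::visitAll(&v);     // visits only outer
    return v.visited;
}
```

The trade-off is that this only works for types that can inherit from the base class, whereas the handle variant works with any pointer, including one owned by a unique_ptr.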

We could keep both, or just the more general one. What do you think?

kmod commented Sep 1, 2015

I'm +1 on the variant you posted in your comment -- it seems similar to what other VMs use for their precise stack scanning, and it seems like it could be a step towards that for us. I think the extra-object issue can be resolved by making the handle the only object; i.e., for the rewriter case, do something like

gc::RootingHandler<std::unique_ptr<Rewriter>> rewriter(createRewriter());
rewriter->commit();

Well, it might be tricky specifically in the case of unique_ptr.
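One way the handle-as-only-object idea could work with unique_ptr is sketched below. It is only an illustration under assumptions: the `RootBase` registry, the stand-in `GCVisitor` and `Rewriter` types, and this particular `RootingHandler` implementation are hypothetical. The sketch relies on the C++ rule that `operator->` is applied repeatedly until a raw pointer is reached, so returning a reference to the wrapped unique_ptr lets `rewriter->commit()` reach the `Rewriter`.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Stand-in for the real visitor type; it only counts visits here.
struct GCVisitor {
    int visited = 0;
};

// Hypothetical registry of live handles, scanned as roots by the collector.
class RootBase {
public:
    virtual void gc_visit(GCVisitor* visitor) = 0;

    static void visitRoots(GCVisitor* visitor) {
        for (RootBase* r : roots())
            r->gc_visit(visitor);
    }

    RootBase(const RootBase&) = delete;
    RootBase& operator=(const RootBase&) = delete;

protected:
    RootBase() { roots().push_back(this); }
    // Assumes handles live on the stack and are destroyed in LIFO order.
    virtual ~RootBase() { roots().pop_back(); }

private:
    static std::vector<RootBase*>& roots() {
        static std::vector<RootBase*> r;
        return r;
    }
};

// The handle owns the smart pointer, so no second object is needed.
template <typename Ptr>
class RootingHandler final : public RootBase {
    Ptr ptr;

public:
    explicit RootingHandler(Ptr p) : ptr(std::move(p)) {}

    // Returning the smart pointer by reference makes handle->member() chain
    // through Ptr::operator-> to the underlying object.
    Ptr& operator->() { return ptr; }

    void gc_visit(GCVisitor* visitor) override {
        if (ptr)
            ptr->gc_visit(visitor);
    }
};

// Stand-in for the real Rewriter.
struct Rewriter {
    bool committed = false;
    void commit() { committed = true; }
    void gc_visit(GCVisitor* visitor) { visitor->visited++; }
};

bool handleForwardsCallsAndVisits() {
    RootingHandler<std::unique_ptr<Rewriter>> rewriter(
        std::unique_ptr<Rewriter>(new Rewriter()));
    rewriter->commit();

    GCVisitor v;
    RootBase::visitRoots(&v);
    return rewriter->committed && v.visited == 1;
}
```

A handle around a plain object rather than a pointer-like type would need a different `operator->` that returns a raw pointer, which may be part of why the unique_ptr case is awkward to generalize.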

kmod commented Sep 1, 2015

I'm a bit confused about the last commit in this PR -- were we not already scanning interpreter references?

rudi-c commented Sep 1, 2015

I was concerned about vregs because it's not a direct pointer to the Pyston heap. But I just realized it's allocated with alloca, so the references still end up on the stack. That commit can probably be discarded.

rudi-c commented Sep 1, 2015

OK, I've removed the commit that deals with ASTInterpreter. As far as calling the GC handler on stack objects is concerned, I'll put that in another commit when I scan the rewriter.

For marking collectors, the redundant visits are no-ops to avoid the
performance hit.
@rudi-c rudi-c force-pushed the redundantvisit branch 3 times, most recently from 218c35e to d80c374 Compare September 1, 2015 20:21
kmod added a commit that referenced this pull request Sep 1, 2015
Notion of redundant visits to slowly move towards scanning everything
@kmod kmod merged commit 7c96b62 into pyston:master Sep 1, 2015
kmod commented Sep 1, 2015

cool :)
