Open
There are some concepts we've talked about (and in some cases have touched on using FunctionNodes, #121) that we need to figure out how to integrate cleanly into our pipeline interfaces.
Among these are:
- Block Transformers and Estimators (operating on big feature spaces split across several RDDs)
- Aggregators (sampling, windowing, caching, etc.)
- Joining (both in the context of block nodes, and e.g. join(identity, transformer))
- Operating on grouped data (e.g. Grouped(Transformer[A, B]): Transformer[Seq[A], Seq[B]])
- Hyperparameter tuning & cross validation (making sure to allow optimizations like how linear mappers can train multiple lambdas at once)
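
To make the grouping and joining items a bit more concrete, here is a rough sketch of what those two signatures might look like. Everything in it is an assumption for discussion: the `Transformer` trait is a hypothetical single-item stand-in rather than our actual interface, `PipelineSketch`, `identityNode` and `doubler` are made-up names, and RDDs are left out entirely so the snippet stays self-contained.

```scala
object PipelineSketch {
  // Hypothetical stand-in for a pipeline Transformer (single-item only, no RDDs).
  trait Transformer[A, B] extends Serializable {
    def apply(in: A): B
  }

  object Transformer {
    // Builds a Transformer from a plain function.
    def apply[A, B](f: A => B): Transformer[A, B] = new Transformer[A, B] {
      def apply(in: A): B = f(in)
    }
  }

  // Grouped(Transformer[A, B]): Transformer[Seq[A], Seq[B]]
  // Applies the wrapped transformer to every element of a group.
  object Grouped {
    def apply[A, B](t: Transformer[A, B]): Transformer[Seq[A], Seq[B]] =
      Transformer[Seq[A], Seq[B]](group => group.map(a => t(a)))
  }

  // join(left, right): feeds the same input to both transformers
  // and pairs up their outputs.
  def join[A, B, C](left: Transformer[A, B], right: Transformer[A, C]): Transformer[A, (B, C)] =
    Transformer[A, (B, C)](a => (left(a), right(a)))

  // Illustrative transformers only.
  val identityNode: Transformer[Int, Int] = Transformer[Int, Int](x => x)
  val doubler: Transformer[Int, Int] = Transformer[Int, Int](_ * 2)
}
```

Under these assumptions, `PipelineSketch.Grouped(PipelineSketch.doubler)` maps a `Seq[Int]` to a `Seq[Int]`, and `PipelineSketch.join(PipelineSketch.identityNode, PipelineSketch.doubler)` maps an `Int` to an `(Int, Int)` pair. How either of these should interact with block nodes and estimators is still open.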