A sklearn-like library for sparse factorization machines and thier variants for classification and regression in Python.
Factorization machines (FMs) are machine learning models based on feature interaction (co-occurrence) through polynomial terms. Sparse FMs use not all feature interactions but partial of them by using sparsity-inducing regularization, namely, sparse FMs are FMs with feature interaction selection.
This package provides some solvers for optimizing FM-like models with some regularizations.
Regularizer \ Purpose | Feature Interaction Selection | Feature Selection |
---|---|---|
l1 |
✗ | ✓ |
l21 |
✗ | ✓ |
squaredl12 (omegati ) |
✓ (recommended) | ✓ |
squaredl21 (omegacs ) |
✗ | ✓ (recommended) |
For more detail, please see our paper.
Sparse FMs | Sparse Higher-order FMs | Sparse All-subsets Model | |
---|---|---|---|
pcd |
l1 , squaredl12 , omegati |
l1 , omegati |
l1 , omegati |
pbcd |
l1 , l21 , squaredl21 , omegacs |
l1 , l21 , omegacs |
l21 , omegacs |
psgd |
l1 , l2 , squaredl12 , squaredl21 |
None | None |
The pcd
and pbcd
algorithms are easy to use and produce a sparse solution, so basically you should use pcd
for feature interaction selection and pbcd
for feature selection.
However, for large-scale datasets, the use of the psgd
is recommended because of its scalability.
pip install git+https://github.com/neonnnnn/sparsepoly
- Kyohei Atarashi, Satoshi Oyama, and Masahito Kurihara. Factorization Machines with Regularization for Sparse Feature Interactions. Journal of Machine Learning Research, 22(153), pp. 1--50, 2021.
- Kyohei Atarashi, 2020-present