Pikala is a practical, physics‑aware upgrade of a Skala XC pipeline, adding the optimizations summarized below.
Pikala achieves accuracy exceeding B3LYP (trained on 49k samples, not only MSR-ACC; I am obtaining the data publishing licence).
See the detailed notes in doc/ for more.
- B3LYP/def2-TZVP + gCP + D4: a smaller basis aiming for the same accuracy.
- Δ‑learning: the model predicts a correction on top of the stored baseline XC energy (E_xc0_total).
- Mixture of Linear Experts (MoLE): faster training and inference (see the sketch after this list).
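For concreteness, here is a minimal, self-contained sketch of what a Mixture of Linear Experts layer with a temperature- and noise-controlled softmax gate can look like; the class name, arguments, and gating details are illustrative assumptions, not the repo's actual module.

```python
# Minimal Mixture-of-Linear-Experts sketch (illustrative; class/argument names are
# assumptions and not the repo's actual module).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoLE(nn.Module):
    def __init__(self, d_in: int, d_out: int, n_experts: int = 8):
        super().__init__()
        self.experts = nn.Linear(d_in, d_out * n_experts)  # all K linear experts in one matmul
        self.gate = nn.Linear(d_in, n_experts)              # per-point gating logits
        self.n_experts, self.d_out = n_experts, d_out

    def forward(self, x, temp: float = 1.0, noise_std: float = 0.0):
        logits = self.gate(x)
        if self.training and noise_std > 0:
            logits = logits + noise_std * torch.randn_like(logits)   # gate exploration noise
        w = F.softmax(logits / temp, dim=-1)                          # (..., K) soft expert weights
        y = self.experts(x).view(*x.shape[:-1], self.n_experts, self.d_out)
        return torch.einsum('...k,...kd->...d', w, y), w              # mix expert outputs

# Example: 4096 grid-point features of width 64, 8 experts, scalar output per point.
layer = MoLE(64, 1, n_experts=8)
out, gate_w = layer(torch.randn(4096, 64), temp=2.0, noise_std=0.5)
```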
Build the smallest set (writes E_xc0_total in each shard):
.venv/bin/python scripts/build_msracc_100.py \
--src msr-acc/tae25 --out data/msracc_s100_def2svp --k 100 \
--grid-level 3 --scf-policy fast --disp auto --with-gcp auto
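To sanity-check a shard, the Δ-learning target is just the reference XC energy minus the stored baseline E_xc0_total. The snippet below assumes a .npz shard layout and a hypothetical reference key; the real schema may differ.

```python
# Hypothetical sanity check of a shard's Δ-learning target. The .npz layout, the shard
# filename, and the reference key "e_xc_ref" are assumptions; only E_xc0_total is named
# by the build script above.
import numpy as np

shard = np.load("data/msracc_s100_def2svp/shard_000.npz")  # assumed path/format
e_base = shard["E_xc0_total"]                              # baseline XC energy per molecule
e_ref = shard["e_xc_ref"]                                  # assumed reference-energy key
delta = e_ref - e_base                                     # Δ-learning residual target
print(f"Δ target: mean={delta.mean():.6f}, std={delta.std():.6f}")
```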
Pretrain with subset grids, Δ‑learning, and MoLE (K=8):
.venv/bin/torchrun --standalone --nproc_per_node=8 \
scripts/train_pretrain_msracc_s100_ddp.py \
--data data/msracc_s100_def2svp --epochs 20 --max-lr 1e-3 \
--subset-K 65536 --micro-grids 200000 --amp --compile \
--moe-k 8 --moe-temp-start 2.0 --moe-temp-end 0.8 --moe-temp-epochs 10 \
--moe-noise-start 0.5 --moe-noise-end 0.05 --moe-noise-epochs 10 \
--moe-balance 1e-3 --moe-balance-beta 0.02 --moe-ortho 1e-4 \
--replay-cap 64 --replay-prob 0.25
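The --moe-temp-*, --moe-noise-*, and --moe-balance-* flags suggest linearly annealed gate temperature/noise plus an EMA-based load-balance penalty. Below is a rough, self-contained sketch of that interpretation; the exact functional forms used by the training script are assumptions.

```python
# Rough sketch of the gate schedules and load-balance penalty suggested by the flags
# above; the linear annealing and EMA forms are assumptions, not the script's code.
import torch

def anneal(start: float, end: float, epoch: int, n_epochs: int) -> float:
    """Linearly interpolate from start to end over n_epochs, then hold at end."""
    t = min(epoch / max(n_epochs, 1), 1.0)
    return start + (end - start) * t

def balance_penalty(gate_weights: torch.Tensor, usage_ema: torch.Tensor, beta: float):
    """EMA of per-expert usage, penalized for deviating from the uniform distribution."""
    usage_ema = beta * gate_weights.mean(dim=0) + (1 - beta) * usage_ema
    penalty = ((usage_ema - 1.0 / gate_weights.shape[-1]) ** 2).sum()
    return penalty, usage_ema

# Example for epoch 3 of the run above: temp 2.0 -> 0.8 and noise 0.5 -> 0.05 over 10 epochs.
epoch = 3
temp = anneal(2.0, 0.8, epoch, 10)     # --moe-temp-start/--moe-temp-end/--moe-temp-epochs
noise = anneal(0.5, 0.05, epoch, 10)   # --moe-noise-start/--moe-noise-end/--moe-noise-epochs
w = torch.softmax(torch.randn(4096, 8) / temp, dim=-1)                 # stand-in gate weights, K=8
penalty, ema = balance_penalty(w, torch.full((8,), 1 / 8), beta=0.02)  # --moe-balance-beta
loss_extra = 1e-3 * penalty                                            # scaled by --moe-balance
```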
Resume on the next hop and calibrate the gate bias to the global EMA target:
.venv/bin/torchrun --standalone --nproc_per_node=4 \
scripts/train_pretrain_msracc_s100_ddp.py \
--data data/<next_hop_dir> --epochs 12 --max-lr 1e-3 \
--subset-K 65536 --micro-grids 200000 --amp --compile --resume \
--moe-k 8 ... (same MoE flags) \
--calibrate-gate-bias --calib-mols 2 --calib-K 65536
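Gate-bias calibration can be read as shifting the gate's bias so that mean expert usage on a few calibration molecules matches the global EMA usage target. A minimal sketch of that idea follows; the closed-form log-ratio shift is an assumption, not necessarily the script's exact method.

```python
# Minimal sketch of gate-bias calibration: shift the gate bias so that mean softmax
# usage on calibration points matches a target usage (e.g. the global EMA from training).
# The closed-form log-ratio update is an assumption, not the repo's exact method.
import torch

@torch.no_grad()
def calibrate_gate_bias(gate: torch.nn.Linear, x_calib: torch.Tensor,
                        target_usage: torch.Tensor) -> None:
    current = torch.softmax(gate(x_calib), dim=-1).mean(dim=0)   # usage on calibration points
    gate.bias += torch.log(target_usage.clamp_min(1e-8) / current.clamp_min(1e-8))

# Example: nudge an 8-expert gate toward uniform usage on 65536 calibration points.
gate = torch.nn.Linear(64, 8)
calibrate_gate_bias(gate, torch.randn(65536, 64), torch.full((8,), 1 / 8))
```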
