Configuration
The main configuration file is:
configs/default.yaml
It controls the dataset, model, optimizer, trainer, checkpoints, evaluation, and Aim tracking.
Typical fields
The default config controls:
- random seed,
- dataset sizes,
- input dimension,
- batch size,
- model dimensions,
- optimizer settings,
- trainer settings,
- Aim repository path,
- checkpoint path,
- evaluation checkpoint path.
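The fields above can be checked once the config is loaded. The following is a minimal sketch, assuming the YAML file has already been parsed into a plain dictionary. Only trainer and aim appear as section names on this page; the other entries in REQUIRED_SECTIONS, and the helper missing_sections itself, are hypothetical and should be adjusted to the actual keys in configs/default.yaml.

```python
from typing import Any, Mapping

# Assumed top-level sections. Only "trainer" and "aim" are shown on this page;
# the remaining names are hypothetical placeholders.
REQUIRED_SECTIONS = (
    "seed", "data", "model", "optimizer",
    "trainer", "aim", "checkpoint", "evaluation",
)


def missing_sections(cfg: Mapping[str, Any], required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required top-level keys that are absent from cfg, in order."""
    return [name for name in required if name not in cfg]


if __name__ == "__main__":
    # A deliberately incomplete config to show what gets reported.
    cfg = {"seed": 0, "trainer": {"max_epochs": 10}, "aim": {"repo": "local/aim"}}
    print(missing_sections(cfg))
```

Failing early on a missing section tends to give a clearer error than a KeyError raised deep inside the training code.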
Trainer settings
Example:
trainer:
  max_epochs: 10
  accelerator: auto
  devices: auto
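For CPU/GPU portability, auto is a good default for both accelerator and devices, since the same config then runs unchanged on either kind of machine.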
For explicit single-GPU training:
trainer:
  accelerator: gpu
  devices: 1
For distributed training:
trainer:
  accelerator: gpu
  devices: 2
  strategy: ddp
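The three trainer variants above follow a simple rule based on how many GPUs are requested. As a rough sketch, the hypothetical helper trainer_settings below (not part of the template) maps a GPU count to the matching trainer section:

```python
def trainer_settings(num_gpus: int) -> dict:
    """Map a requested GPU count to the trainer settings shown above."""
    if num_gpus < 0:
        raise ValueError("num_gpus must be non-negative")
    if num_gpus == 0:
        # No explicit GPU request: leave the hardware choice to the trainer.
        return {"accelerator": "auto", "devices": "auto"}
    settings = {"accelerator": "gpu", "devices": num_gpus}
    if num_gpus > 1:
        # Multi-GPU runs use the ddp strategy, as in the distributed example.
        settings["strategy"] = "ddp"
    return settings


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, trainer_settings(n))
```

The returned dictionary has the same keys as the trainer section, so it could be written into a new config file or merged over the defaults.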
Aim settings
Example:
aim:
  repo: local/aim
  experiment_name: ml-template
When starting a new project, change experiment_name so that its runs are grouped under their own experiment in Aim rather than under ml-template.
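How the template hands these values to Aim is not shown on this page. One plausible wiring, sketched here as an assumption, is to translate the aim section into keyword arguments for aim.Run, whose constructor accepts repo and experiment. The helper name aim_run_kwargs is hypothetical.

```python
from typing import Any, Mapping


def aim_run_kwargs(aim_cfg: Mapping[str, Any]) -> dict:
    """Translate the aim config section into keyword arguments for aim.Run."""
    return {
        "repo": aim_cfg["repo"],
        # The config key is experiment_name; aim.Run calls it experiment.
        "experiment": aim_cfg["experiment_name"],
    }


if __name__ == "__main__":
    kwargs = aim_run_kwargs({"repo": "local/aim", "experiment_name": "ml-template"})
    print(kwargs)
    # With Aim installed, a run could then be opened with:
    #   from aim import Run
    #   run = Run(**kwargs)
```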
Checkpoint settings
The template saves checkpoints under:
local/checkpoints/
A common default evaluation checkpoint is:
local/checkpoints/best.ckpt
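Evaluation fails if the configured checkpoint does not exist yet, for example before the first training run. The sketch below uses a hypothetical helper, resolve_eval_checkpoint, to check the path up front and report which checkpoints are available. The default directory and file name are taken from the paths above; the helper is not part of the template.

```python
from pathlib import Path


def resolve_eval_checkpoint(
    ckpt_dir: str = "local/checkpoints", name: str = "best.ckpt"
) -> Path:
    """Return the checkpoint path, or raise with a list of what is available."""
    path = Path(ckpt_dir) / name
    if path.is_file():
        return path
    # Collect the .ckpt files that do exist to make the error actionable.
    available = sorted(p.name for p in Path(ckpt_dir).glob("*.ckpt"))
    raise FileNotFoundError(
        f"{path} not found; available checkpoints: {available or 'none'}"
    )
```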
Adding new configs
Add new YAML files under:
configs/
For example:
configs/debug.yaml
configs/gpu.yaml
configs/large_model.yaml
Then run:
pixi run python train.py --config configs/debug.yaml
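The implementation of train.py is not shown here. As an illustration only, a --config flag like the one in the command above could be parsed with argparse as follows; falling back to configs/default.yaml when the flag is omitted is an assumption.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the --config flag used in the command above."""
    parser = argparse.ArgumentParser(description="Train a model from a YAML config.")
    parser.add_argument(
        "--config",
        type=Path,
        default=Path("configs/default.yaml"),  # assumed fallback
        help="Path to a YAML config file under configs/",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Explicit argv for the demo; pass None to read the real command line.
    args = parse_args(["--config", "configs/debug.yaml"])
    print(f"Using config: {args.config}")
```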