Phi-4 Research Assistant Training
Control Panel
Training Configuration
Model
- Model: unsloth/phi-4-unsloth-bnb-4bit
- Learning Rate: 2e-05
- Per-Device Batch Size: 16
- Gradient Accumulation: 3
- Total Effective Batch Size: 16 (per device) × 4 (GPUs) × 3 (gradient accumulation) = 192
- Epochs: 3
- Precision: BF16
- Max Sequence Length: 2048
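The hyperparameters above map fairly directly onto a Hugging Face `TrainingArguments` object. The sketch below is illustrative only: the use of plain `TrainingArguments` (rather than, say, TRL's `SFTConfig`), the logging and save settings, and the output directory are assumptions, not the Space's actual code.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the control-panel settings to TrainingArguments.
training_args = TrainingArguments(
    output_dir="./results",           # checkpoints land here (see Notes below)
    per_device_train_batch_size=16,   # per-GPU batch size
    gradient_accumulation_steps=3,    # 16 x 4 GPUs x 3 = 192 effective
    learning_rate=2e-5,
    num_train_epochs=3,
    bf16=True,                        # BF16 mixed precision
    gradient_checkpointing=True,      # memory optimization listed below
    logging_steps=10,                 # assumption: log every 10 steps
    save_strategy="epoch",            # assumption: save once per epoch
)
```

Note that the max sequence length (2048) is not a `TrainingArguments` field; it is applied when loading the model and tokenizing the dataset.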
Hardware
- GPU: 4× L4 (24 GB VRAM per GPU, total: 96 GB)
- Multi-GPU Strategy: DDP (Distributed Data Parallel)
- Memory Optimizations: Gradient Checkpointing
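For the 4-bit Unsloth checkpoint, loading typically goes through `FastLanguageModel`. This is a hedged sketch of how the model and gradient checkpointing could be wired up; the LoRA rank and target modules shown are placeholders, not values taken from this Space.

```python
from unsloth import FastLanguageModel

# Under DDP, this script runs once per GPU, launched e.g. via:
#   torchrun --nproc_per_node=4 train.py
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/phi-4-unsloth-bnb-4bit",
    max_seq_length=2048,   # matches the configured max sequence length
    load_in_4bit=True,
)

# Assumed LoRA setup; "unsloth" checkpointing trades extra compute for lower VRAM.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                 # placeholder LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_gradient_checkpointing="unsloth",
)
```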
Dataset
- Dataset: George-API/phi4-cognitive-dataset
- Dataset Split: train
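Loading the dataset is a single call to the `datasets` library; the snippet below only inspects the split and makes no assumptions about its schema.

```python
from datasets import load_dataset

# Pull the training split listed above.
dataset = load_dataset("George-API/phi4-cognitive-dataset", split="train")

print(len(dataset))           # number of training examples
print(dataset.column_names)   # inspect the schema before tokenizing
```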
Training Information
Hardware:
- 4× NVIDIA L4 GPUs (24 GB VRAM per GPU, 96 GB total)
- Training with BF16 precision
- Using Distributed Data Parallel (DDP) across the 4 GPUs
- Effective batch size: 16 (per device) × 4 (GPUs) × 3 (gradient accumulation) = 192
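As a quick sanity check, the effective batch size can be recomputed from the launcher's world size; the variable names here are purely illustrative.

```python
per_device_batch_size = 16
num_gpus = 4                      # world size reported by the DDP launcher
gradient_accumulation_steps = 3

effective_batch_size = per_device_batch_size * num_gpus * gradient_accumulation_steps
assert effective_batch_size == 192
```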
Notes:
- Training may take several hours depending on dataset size
- Check the Space logs for real-time progress
- Model checkpoints will be saved to the ./results directory
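If a run is interrupted, the saved checkpoints in ./results can be listed and training resumed from the latest one. The directory layout below reflects the standard Trainer convention (checkpoint-<global_step> subfolders); the `trainer` object referenced in the comment is assumed to be built from the configuration sketched above.

```python
import os

results_dir = "./results"

# Trainer writes checkpoints as ./results/checkpoint-<global_step>.
checkpoints = sorted(
    (d for d in os.listdir(results_dir) if d.startswith("checkpoint-")),
    key=lambda d: int(d.split("-")[-1]),
) if os.path.isdir(results_dir) else []

print(checkpoints)

# With an existing trainer object, resuming picks up the latest checkpoint:
# trainer.train(resume_from_checkpoint=True)
```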