Autonomous Energy Intelligence for LLM fine-tuning — powered by Gemini on AMD MI300X
A multi-agent swarm optimizes hyperparameters to minimize Joules per token while enforcing energy governance policies in real time. Built with real AMD MI300X telemetry from QLoRA fine-tuning runs.
Smaller batch sizes do not reduce power draw on MI300X. The GPU saturates at ~750W regardless — smaller batches just take longer, burning 26,534 extra Joules for the same result.
Both runs draw the same power (~750W peak), but the small-batch run takes 40s longer. The red zone is wasted energy.
Both runs process the same 245K tokens. The gap between the curves is pure waste.
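To put the gap on the Joules-per-token scale the project optimizes for, the extra energy can be divided by the token count. The snippet below is a minimal sketch, not the project's own code: the function name is hypothetical, and the token count uses the rounded "245K" figure, so the result is approximate.

```python
# Figures quoted in the text above.
EXTRA_JOULES = 26_534      # extra energy burned by the small-batch run
TOKENS_PROCESSED = 245_000  # "245K" tokens, rounded; an approximation


def overhead_joules_per_token(extra_joules: float, tokens: int) -> float:
    """Extra energy per token attributable to the slower run (hypothetical helper)."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return extra_joules / tokens


if __name__ == "__main__":
    overhead = overhead_joules_per_token(EXTRA_JOULES, TOKENS_PROCESSED)
    print(f"Overhead: {overhead:.3f} J/token")
```

With these inputs the overhead comes to roughly 0.11 J per token, paid purely for the longer runtime.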
Methodology: power sampled via amdsmi.amdsmi_get_power_info() every 0.5s. Energy calculated using trapezoidal integration. CO2 factor: 390 gCO2/kWh (US average). Both runs: Qwen2.5-7B, QLoRA NF4, LoRA r=16, 500 Hermes traces, effective batch size 8.
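The energy and CO2 figures follow from the sampled power trace. The sketch below shows one way to do the trapezoidal integration and the CO2 conversion. It assumes the samples have already been collected as (time, watts) pairs, and the function names are hypothetical rather than taken from the project. It does not call amdsmi, so it runs without a GPU.

```python
from typing import Sequence

GCO2_PER_KWH = 390.0    # US-average factor quoted above
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ


def energy_joules(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate a sampled power trace (W) over time (s) with the trapezoidal rule."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        # Area of one trapezoid: mean of the two power readings times the interval.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_to_gco2(joules: float, gco2_per_kwh: float = GCO2_PER_KWH) -> float:
    """Convert energy in Joules to grams of CO2 using a grid emission factor."""
    return joules / JOULES_PER_KWH * gco2_per_kwh


if __name__ == "__main__":
    # Synthetic trace for illustration only: constant 750 W for 40 s at 0.5 s spacing.
    times = [0.5 * i for i in range(81)]
    watts = [750.0] * len(times)
    joules = energy_joules(times, watts)
    print(f"{joules:.0f} J, {joules_to_gco2(joules):.2f} gCO2")
```

A constant 750W over 40s would integrate to 30,000 J, so the measured 26,534 J gap implies the GPU was not at peak for the whole extra interval. At 390 gCO2/kWh, 26,534 J corresponds to roughly 2.9 g of CO2.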