GPU Metrics Ingestion API
Stream GPU telemetry from your agent to AluminatiAi for real-time energy monitoring, cost attribution, and efficiency analysis.
/v1/. The /api/ prefix is internal and may change between releases. Always use https://aluminatiai.com/v1/ in your agent and integrations.Authentication
All requests must include your API key in the X-API-Key header. Keys are prefixed with alum_.
X-API-Key: alum_YOUR_KEY_HEREFind your key in Dashboard → Setup. You can rotate it at any time (old key is immediately invalidated).
POST /v1/metrics/ingest
Ingest one or more GPU metric samples. Send an array of metric objects. The agent should call this endpoint approximately every 60 seconds per GPU.
Required fields
timestampUTC timestamp of the reading.
Must not be more than 5 minutes in the future.
gpu_indexZero-based index of the GPU on the host.
Must be ≥ 0.
gpu_uuidUnique device identifier from nvidia-smi (e.g. GPU-abc123).
power_draw_wInstantaneous power draw in watts.
Range: 0 – 1 500 W.
utilization_gpu_pctGPU compute utilization (0 – 100).
utilization_memory_pctGPU memory bandwidth utilization (0 – 100).
temperature_cGPU temperature in degrees Celsius.
Range: 0 – 120 °C.
memory_used_mbGPU memory currently in use, in megabytes.
Must be ≥ 0.
Optional fields
gpu_nameHuman-readable GPU model (e.g. NVIDIA A100 80GB).
power_limit_wConfigured TDP cap in watts.
energy_delta_jEnergy consumed since last sample, in joules. Must be ≥ 0.
fan_speed_pctFan speed 0 – 100.
memory_total_mbTotal installed GPU memory in megabytes.
sm_clock_mhzCurrent SM (CUDA core) clock in MHz.
memory_clock_mhzCurrent memory clock in MHz.
job_idIdentifier for the workload currently running on this GPU.
team_idTeam or cost-center tag for attribution. Max 128 chars.
model_tagModel name or experiment label. Max 128 chars.
scheduler_sourceScheduler that placed this workload.
One of: "kubernetes", "slurm", "runai", "manual"
Example request
curl -X POST https://aluminatiai.com/v1/metrics/ingest \
-H "Content-Type: application/json" \
-H "X-API-Key: alum_YOUR_KEY_HERE" \
-d '[{
"timestamp": "2026-02-25T12:00:00Z",
"gpu_index": 0,
"gpu_uuid": "GPU-abc123",
"power_draw_w": 312.5,
"utilization_gpu_pct": 87,
"utilization_memory_pct": 62,
"temperature_c": 71,
"memory_used_mb": 51200
}]'Example payload
[
{
"timestamp": "2026-02-25T12:00:00Z",
"gpu_index": 0,
"gpu_uuid": "GPU-abc123",
"gpu_name": "NVIDIA A100 80GB",
"power_draw_w": 312.5,
"power_limit_w": 400,
"energy_delta_j": 18750,
"utilization_gpu_pct": 87,
"utilization_memory_pct": 62,
"temperature_c": 71,
"memory_used_mb": 51200,
"memory_total_mb": 81920,
"job_id": "job_train_llama3",
"team_id": "ml-infra",
"model_tag": "llama3-70b-finetune",
"scheduler_source": "kubernetes"
}
]Responses
{
"success": true,
"inserted": 1,
"message": "Successfully ingested 1 metric"
}{
"error": "Metric at index 0 missing required fields: gpu_uuid, power_draw_w"
}| 401 | Missing or invalid API key. |
| 429 | Rate limit exceeded. Check X-RateLimit-Reset header. |
| 500 | Internal server error. Retry with exponential back-off. |
GET /v1/metrics/ingest
Health check. No authentication required. Returns endpoint metadata.
curl https://aluminatiai.com/v1/metrics/ingest{
"status": "ok",
"endpoint": "GPU Metrics Ingestion API",
"methods": ["POST"],
"documentation": "https://aluminatiai.com/docs/api"
}GET /v1/chargeback/report
Pull a cost attribution report for a date range. Returns GPU usage broken down by team, GPU architecture, and model tag — suitable for billing your internal teams or exporting to finance tooling.
Query parameters
fromStart date (inclusive).
Format: YYYY-MM-DD
toEnd date (inclusive).
Format: YYYY-MM-DD
formatResponse format. Default: "json". Use "csv" to download a spreadsheet.
Example — JSON report
curl "https://aluminatiai.com/v1/chargeback/report?from=2026-03-01&to=2026-03-31" \
-H "X-API-Key: alum_YOUR_KEY_HERE"{
"period": { "from": "2026-03-01", "to": "2026-03-31" },
"generated_at": "2026-04-01T09:00:00.000Z",
"summary": {
"total_cost_usd": 4821.60,
"total_gpu_hours": 320.14
},
"teams": [
{
"team_id": "ml-infra",
"total_cost_usd": 3214.40,
"total_gpu_hours": 213.63,
"by_gpu": [
{ "gpu_arch": "A100", "gpu_hours": 213.63, "rate": 15.04, "cost_usd": 3213.20 }
],
"by_model": [
{ "model_tag": "llama3-70b-finetune", "gpu_hours": 180.10, "cost_usd": 2708.30 },
{ "model_tag": "mistral-7b-eval", "gpu_hours": 33.53, "cost_usd": 504.10 }
]
}
]
}Example — CSV download
curl "https://aluminatiai.com/v1/chargeback/report?from=2026-03-01&to=2026-03-31&format=csv" \
-H "X-API-Key: alum_YOUR_KEY_HERE" \
-o march-chargeback.csvCSV columns: team_id, model_tag, gpu_arch, gpu_hours, rate_per_hour_usd, cost_usd
POST /v1/agent/heartbeat
Sent automatically by the agent every 5 minutes. Records that a host is alive and reporting. Visible in the Fleet page of the dashboard. You can also call this manually to test connectivity.
curl -X POST https://aluminatiai.com/v1/agent/heartbeat \
-H "X-API-Key: alum_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{ "agent_version": "0.2.0", "gpu_count": 4, "hostname": "gpu-node-01" }'Rate limits
| Endpoint | Limit | Window |
|---|---|---|
| POST /v1/metrics/ingest | 100 requests | per minute, per API key |
| GET /v1/chargeback/report | 60 requests | per minute |
| POST /v1/agent/heartbeat | 60 requests | per minute |
| POST /v1/metrics/ingest (key rotation) | 5 requests | per hour |
Rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) are included on every ingest response.