API Reference

GPU Metrics Ingestion API

Stream GPU telemetry from your agent to NemulAI for real-time energy monitoring, cost attribution, and efficiency analysis.

Base URL: https://nemulai.com

Stable API prefix: All public endpoints use /v1/. The /api/ prefix is internal and may change between releases. Always use https://nemulai.com/v1/ in your agent and integrations.

Authentication

All requests must include your API key in the X-API-Key header. Keys are prefixed with alum_.

X-API-Key: alum_YOUR_KEY_HERE

Find your key in Dashboard → Setup. You can rotate it at any time (old key is immediately invalidated).

POST /v1/metrics/ingest

Ingest one or more GPU metric samples. Send an array of metric objects. The agent should call this endpoint approximately every 60 seconds per GPU.

Rate limit: 100 req / minMax 1 000 metrics per request

Required fields

timestamp

string (ISO 8601)required

UTC timestamp of the reading.

Must not be more than 5 minutes in the future.

gpu_index

integerrequired

Zero-based index of the GPU on the host.

Must be ≥ 0.

gpu_uuid

stringrequired

Unique device identifier from nvidia-smi (e.g. GPU-abc123).

power_draw_w

floatrequired

Instantaneous power draw in watts.

Range: 0 – 1 500 W.

utilization_gpu_pct

integerrequired

GPU compute utilization (0 – 100).

utilization_memory_pct

integerrequired

GPU memory bandwidth utilization (0 – 100).

temperature_c

floatrequired

GPU temperature in degrees Celsius.

Range: 0 – 120 °C.

memory_used_mb

integerrequired

GPU memory currently in use, in megabytes.

Must be ≥ 0.

Optional fields

gpu_name

stringoptional

Human-readable GPU model (e.g. NVIDIA A100 80GB).

power_limit_w

floatoptional

Configured TDP cap in watts.

energy_delta_j

floatoptional

Energy consumed since last sample, in joules. Must be ≥ 0.

fan_speed_pct

integeroptional

Fan speed 0 – 100.

memory_total_mb

integeroptional

Total installed GPU memory in megabytes.

sm_clock_mhz

integeroptional

Current SM (CUDA core) clock in MHz.

memory_clock_mhz

integeroptional

Current memory clock in MHz.

job_id

stringoptional

Identifier for the workload currently running on this GPU.

team_id

stringoptional

Team or cost-center tag for attribution. Max 128 chars.

model_tag

stringoptional

Model name or experiment label. Max 128 chars.

scheduler_source

stringoptional

Scheduler that placed this workload.

One of: "kubernetes", "slurm", "runai", "manual"

Example request

curl -X POST https://nemulai.com/v1/metrics/ingest \
  -H "Content-Type: application/json" \
  -H "X-API-Key: alum_YOUR_KEY_HERE" \
  -d '[{
    "timestamp": "2026-02-25T12:00:00Z",
    "gpu_index": 0,
    "gpu_uuid": "GPU-abc123",
    "power_draw_w": 312.5,
    "utilization_gpu_pct": 87,
    "utilization_memory_pct": 62,
    "temperature_c": 71,
    "memory_used_mb": 51200
  }]'

Example payload

[
  {
    "timestamp": "2026-02-25T12:00:00Z",
    "gpu_index": 0,
    "gpu_uuid": "GPU-abc123",
    "gpu_name": "NVIDIA A100 80GB",
    "power_draw_w": 312.5,
    "power_limit_w": 400,
    "energy_delta_j": 18750,
    "utilization_gpu_pct": 87,
    "utilization_memory_pct": 62,
    "temperature_c": 71,
    "memory_used_mb": 51200,
    "memory_total_mb": 81920,
    "job_id": "job_train_llama3",
    "team_id": "ml-infra",
    "model_tag": "llama3-70b-finetune",
    "scheduler_source": "kubernetes"
  }
]

Responses

200 OKMetrics accepted

{
  "success": true,
  "inserted": 1,
  "message": "Successfully ingested 1 metric"
}

400 Bad RequestValidation failure

{
  "error": "Metric at index 0 missing required fields: gpu_uuid, power_draw_w"
}

401	Missing or invalid API key.
429	Rate limit exceeded. Check X-RateLimit-Reset header.
500	Internal server error. Retry with exponential back-off.

GET /v1/metrics/ingest

Health check. No authentication required. Returns endpoint metadata.

curl https://nemulai.com/v1/metrics/ingest

{
  "status": "ok",
  "endpoint": "GPU Metrics Ingestion API",
  "methods": ["POST"],
  "documentation": "https://nemulai.com/docs/api"
}

GET /v1/chargeback/report

Pull a cost attribution report for a date range. Returns GPU usage broken down by team, GPU architecture, and model tag — suitable for billing your internal teams or exporting to finance tooling.

Auth: cookie or X-API-Keyformat: json | csv

Query parameters

from

stringrequired

Start date (inclusive).

Format: YYYY-MM-DD

to

stringrequired

End date (inclusive).

Format: YYYY-MM-DD

format

stringoptional

Response format. Default: "json". Use "csv" to download a spreadsheet.

Example — JSON report

curl "https://nemulai.com/v1/chargeback/report?from=2026-03-01&to=2026-03-31" \
  -H "X-API-Key: alum_YOUR_KEY_HERE"

{
  "period": { "from": "2026-03-01", "to": "2026-03-31" },
  "generated_at": "2026-04-01T09:00:00.000Z",
  "summary": {
    "total_cost_usd": 4821.60,
    "total_gpu_hours": 320.14
  },
  "teams": [
    {
      "team_id": "ml-infra",
      "total_cost_usd": 3214.40,
      "total_gpu_hours": 213.63,
      "by_gpu": [
        { "gpu_arch": "A100", "gpu_hours": 213.63, "rate": 15.04, "cost_usd": 3213.20 }
      ],
      "by_model": [
        { "model_tag": "llama3-70b-finetune", "gpu_hours": 180.10, "cost_usd": 2708.30 },
        { "model_tag": "mistral-7b-eval",      "gpu_hours": 33.53,  "cost_usd": 504.10  }
      ]
    }
  ]
}

Example — CSV download

curl "https://nemulai.com/v1/chargeback/report?from=2026-03-01&to=2026-03-31&format=csv" \
  -H "X-API-Key: alum_YOUR_KEY_HERE" \
  -o march-chargeback.csv

CSV columns: team_id, model_tag, gpu_arch, gpu_hours, rate_per_hour_usd, cost_usd

POST /v1/agent/heartbeat

Sent automatically by the agent every 5 minutes. Records that a host is alive and reporting. Visible in the Fleet page of the dashboard. You can also call this manually to test connectivity.

Auth: X-API-KeyAgent use only

curl -X POST https://nemulai.com/v1/agent/heartbeat \
  -H "X-API-Key: alum_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{ "agent_version": "0.2.0", "gpu_count": 4, "hostname": "gpu-node-01" }'

Rate limits

Endpoint	Limit	Window
POST /v1/metrics/ingest	100 requests	per minute, per API key
GET /v1/chargeback/report	60 requests	per minute
POST /v1/agent/heartbeat	60 requests	per minute
POST /v1/metrics/ingest (key rotation)	5 requests	per hour

Rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) are included on every ingest response.

NemulAI API · v1 (stable)

Agent docs Setup guide Contact us Dashboard