API Reference
Automate your infrastructure with our RESTful API. Manage clusters, deployments, and resources programmatically.
Authentication
All API requests require authentication using an API token. Include your token in the Authorization header:
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
  https://api.nailabx.com/v1/clusters

Generate API tokens from your dashboard settings.
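As a sketch using only Python's standard library (the official SDKs below may be more convenient), the authenticated call above can be built like so; the `build_request` helper is illustrative and `YOUR_API_TOKEN` is a placeholder:

```python
import json
import urllib.request

API_BASE = "https://api.nailabx.com/v1"

def build_request(path, token, method="GET", payload=None):
    """Build an authenticated API request.

    Every call carries the token in an "Authorization: Bearer" header;
    request bodies, when present, are JSON-encoded.
    """
    body = None
    headers = {"Authorization": f"Bearer {token}"}
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{API_BASE}{path}", data=body, headers=headers, method=method
    )

# The curl call above as a request object; send it with urllib.request.urlopen.
req = build_request("/clusters", "YOUR_API_TOKEN")
```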
Base URL
https://api.nailabx.com/v1

Clusters API
POST
/clusters

Create a new GPU cluster.
{
  "name": "my-cluster",
  "gpu_type": "h100",
  "gpu_count": 2,
  "region": "us-east-1",
  "storage_gb": 500,
  "network_tier": "premium"
}

Response:
{
  "id": "clst_abc123",
  "name": "my-cluster",
  "status": "provisioning",
  "gpu_type": "h100",
  "gpu_count": 2,
  "region": "us-east-1",
  "ip_address": "10.0.1.42",
  "created_at": "2025-02-09T10:30:00Z"
}

GET
/clusters

List all your clusters.
curl -H "Authorization: Bearer YOUR_TOKEN" \
  https://api.nailabx.com/v1/clusters

GET
/clusters/{id}

Get details about a specific cluster.
DELETE
/clusters/{id}

Terminate a cluster and release resources.
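Putting the cluster endpoints together, here is a sketch of a create → poll → terminate lifecycle. The polling loop is an assumption on top of the documented responses: only the "provisioning" status is shown above, so check what terminal statuses your clusters actually report.

```python
import json
import time
import urllib.request

API_BASE = "https://api.nailabx.com/v1"
TOKEN = "YOUR_API_TOKEN"

def build_request(method, path, payload=None):
    # Shared plumbing: Bearer auth, plus a JSON body when one is given.
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": f"Bearer {TOKEN}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{API_BASE}{path}", data=body, headers=headers, method=method
    )

def api_call(method, path, payload=None):
    with urllib.request.urlopen(build_request(method, path, payload)) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else None

def provision_and_release(spec, poll_seconds=15):
    # POST /clusters returns "status": "provisioning"; poll GET /clusters/{id}
    # until it leaves that state, then DELETE the cluster when finished.
    cluster = api_call("POST", "/clusters", spec)
    while cluster["status"] == "provisioning":
        time.sleep(poll_seconds)
        cluster = api_call("GET", f"/clusters/{cluster['id']}")
    # ... run workloads against cluster["ip_address"] ...
    api_call("DELETE", f"/clusters/{cluster['id']}")
```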
Deployments API
POST
/deployments

Deploy a model or application.
{
  "cluster_id": "clst_abc123",
  "name": "inference-api",
  "type": "inference",
  "image": "nailabx/pytorch:latest",
  "command": ["python", "serve.py"],
  "replicas": 3,
  "port": 8080,
  "env": {
    "MODEL_PATH": "/models/llama-7b"
  }
}

PATCH
/deployments/{id}/scale

Scale a deployment up or down.
{
  "replicas": 5
}

Storage API
POST
/volumes

Create a persistent storage volume.
{
  "name": "dataset-volume",
  "size_gb": 1000,
  "region": "us-east-1",
  "type": "nvme-ssd"
}

Official SDKs
Python
pip install nailabx

Full-featured Python SDK
JavaScript
npm install @nailabx/sdk

Node.js and browser support
Go
go get github.com/nailabx/go-sdk

High-performance Go client
Rate Limits
API requests are rate-limited to ensure fair usage:
- 100 requests/minute per API token
- 1000 requests/hour per account
- Rate limit headers are included in all responses
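The exact header names aren't documented here, so this retry sketch assumes the common convention of an HTTP 429 status plus a Retry-After header; adjust it to the headers your responses actually carry.

```python
import time
import urllib.error
import urllib.request

def backoff_delay(attempt, retry_after=None):
    # Honor the server's Retry-After hint when present; otherwise
    # back off exponentially: 1s, 2s, 4s, ...
    return float(retry_after) if retry_after is not None else float(2 ** attempt)

def call_with_backoff(req, max_retries=5):
    for attempt in range(max_retries):
        try:
            return urllib.request.urlopen(req)
        except urllib.error.HTTPError as err:
            # 429 is the conventional "rate limited" status; anything else
            # (or the final attempt) is re-raised to the caller.
            if err.code != 429 or attempt == max_retries - 1:
                raise
            time.sleep(backoff_delay(attempt, err.headers.get("Retry-After")))
```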
Error Handling
All errors follow a consistent format:
{
  "error": {
    "code": "INSUFFICIENT_QUOTA",
    "message": "Insufficient GPU quota for requested configuration",
    "details": {
      "requested": 8,
      "available": 4
    }
  }
}
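Because the shape is consistent, client code can unpack errors generically. A small sketch using the example payload above:

```python
import json

# The example error body from above.
ERROR_BODY = """
{
  "error": {
    "code": "INSUFFICIENT_QUOTA",
    "message": "Insufficient GPU quota for requested configuration",
    "details": {"requested": 8, "available": 4}
  }
}
"""

def parse_api_error(body):
    # Every error carries a machine-readable "code" to branch on and a
    # human-readable "message"; "details" is optional extra context.
    err = json.loads(body)["error"]
    return err["code"], err["message"], err.get("details", {})

code, message, details = parse_api_error(ERROR_BODY)
```

Branching on `code` rather than matching `message` strings keeps clients robust if the human-readable text changes.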