Public JSON API
Free, static, CORS-friendly JSON for everything localllm-advisor knows about: GPU compatibility, ranked recommendations per use case, per-model VRAM tables, and the tier list. It is served from a CDN with no auth and no rate limits, so there is no need to scrape the site.
Quick start
Every endpoint is a plain JSON file under https://localllm-advisor.com/api/v1. Hit it from any HTTP client — browser, fetch, requests, curl, even an LLM tool-call.
curl
# Top coding models for an RTX 4090
curl -s https://localllm-advisor.com/api/v1/gpu/nvidia-rtx-4090/coding.json | jq '.recommendations[:5]'
# Which GPUs can run Qwen 3 32B?
curl -s https://localllm-advisor.com/api/v1/model/qwen-3-32b-dense.json | jq '.runnable_on[:5]'
JavaScript / TypeScript
const res = await fetch(
  "https://localllm-advisor.com/api/v1/gpu/nvidia-rtx-4090/coding.json"
);
const data = await res.json();
for (const m of data.recommendations.slice(0, 5)) {
  console.log(`${m.name} ${m.quant} ~${m.estimated_tps} tok/s`);
}
Python
import requests

# Get the top coding models for an RTX 4090
r = requests.get(
    "https://localllm-advisor.com/api/v1/gpu/nvidia-rtx-4090/coding.json",
    timeout=10,
)
r.raise_for_status()  # fail loudly on a non-2xx response
data = r.json()

print(f"Top 5 coding models on the {data['gpu']['name']}:")
for m in data["recommendations"][:5]:
    print(f"  {m['name']:30s} {m['quant']:8s} ~{m['estimated_tps']} tok/s")
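Once a response is loaded, further narrowing can happen client-side. The sketch below is one possible way to filter a recommendations list by throughput and VRAM headroom. It assumes the field names shown in this page's sample response (`recommendations`, `estimated_tps`, `vram_pct`); the helper name `pick_models` and the second entry in the demo payload are made up for illustration.

```python
from typing import Any, Dict, List


def pick_models(
    payload: Dict[str, Any],
    min_tps: float = 0.0,
    max_vram_pct: float = 100.0,
) -> List[Dict[str, Any]]:
    """Keep recommendations that meet both thresholds, in the order the API returned them."""
    return [
        m
        for m in payload.get("recommendations", [])
        if m["estimated_tps"] >= min_tps and m["vram_pct"] <= max_vram_pct
    ]


# Trimmed demo payload. The first entry copies values from the sample response;
# the second entry is invented purely to exercise the filter.
sample = {
    "recommendations": [
        {"name": "Qwen3-Coder 30B A3B", "quant": "Q4_K_M", "vram_pct": 78.1, "estimated_tps": 52},
        {"name": "Hypothetical Model 70B", "quant": "Q2_K", "vram_pct": 84.0, "estimated_tps": 9},
    ]
}

fast_enough = pick_models(sample, min_tps=20)
print([m["name"] for m in fast_enough])
```

Filtering locally keeps the request count at one per (GPU, use case) pair, which suits a static, CDN-served file.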
Endpoints
https://localllm-advisor.com/api/v1/index.json
Top-level registry: version, generated_at, and a list of every endpoint with descriptions.
https://localllm-advisor.com/api/v1/models.json
Lightweight directory of every model in the dataset (id, slug, family, params, benchmarks).
https://localllm-advisor.com/api/v1/gpus.json
Lightweight directory of every GPU we have specs for (vram, bandwidth, vendor, tdp, price).
https://localllm-advisor.com/api/v1/tier-list.json
Curated tier list (S/A/B/C/D) of canonical models grouped by VRAM ceiling. Same data the /tier-list page renders.
https://localllm-advisor.com/api/v1/gpu/{slug}.json
Per-GPU summary: top recommendations across every use case.
Try: https://localllm-advisor.com/api/v1/gpu/nvidia-rtx-4090.json
https://localllm-advisor.com/api/v1/gpu/{slug}/{useCase}.json
Per-(GPU, use case) ranked recommendations. Use cases: chat, coding, reasoning, creative.
Try: https://localllm-advisor.com/api/v1/gpu/nvidia-rtx-4090/coding.json
https://localllm-advisor.com/api/v1/model/{slug}.json
Per-model: full quantization table plus which popular GPUs can run it (with tps estimates).
Try: https://localllm-advisor.com/api/v1/model/qwen-3-32b-dense.json
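The templated endpoints above follow a regular pattern, so a client can build their URLs from a slug and a use case. A minimal sketch follows. The function names `gpu_url` and `model_url` are hypothetical helpers, not part of the API; the base URL, path shapes, and the four use cases are taken from the list above.

```python
from typing import Optional

BASE_URL = "https://localllm-advisor.com/api/v1"
USE_CASES = ("chat", "coding", "reasoning", "creative")


def gpu_url(slug: str, use_case: Optional[str] = None) -> str:
    """URL of the per-GPU summary, or of one (GPU, use case) ranking."""
    if use_case is None:
        return f"{BASE_URL}/gpu/{slug}.json"
    if use_case not in USE_CASES:
        raise ValueError(f"unknown use case {use_case!r}; expected one of {USE_CASES}")
    return f"{BASE_URL}/gpu/{slug}/{use_case}.json"


def model_url(slug: str) -> str:
    """URL of the per-model quantization table."""
    return f"{BASE_URL}/model/{slug}.json"


print(gpu_url("nvidia-rtx-4090", "coding"))
print(model_url("qwen-3-32b-dense"))
```

Validating the use case before the request is a design choice here: a static file server would presumably answer a misspelled use case with a 404, and a local `ValueError` is easier to diagnose.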
Sample response
https://localllm-advisor.com/api/v1/gpu/nvidia-rtx-4090/coding.json
{
  "schema": "[email protected]",
  "generated_at": "2026-04-25T12:00:00.000Z",
  "gpu": {
    "slug": "nvidia-rtx-4090",
    "name": "NVIDIA RTX 4090",
    "vendor": "nvidia",
    "vram_mb": 24576,
    "bandwidth_gbps": 1008,
    "tdp_watts": 450,
    "price_usd": 1999
  },
  "use_case": "coding",
  "benchmark_channels": ["bigcodebench", "humaneval", "math", "ifeval"],
  "count": 25,
  "recommendations": [
    {
      "id": "qwen3-coder-30b-a3b",
      "slug": "qwen3-coder-30b-a3b",
      "name": "Qwen3-Coder 30B A3B",
      "family": "qwen",
      "params_b": 30,
      "architecture": "moe",
      "quality_score": 70,
      "quant": "Q4_K_M",
      "bpw": 4.83,
      "vram_mb": 19200,
      "vram_pct": 78.1,
      "estimated_tps": 52
    }
    // ...
  ]
}
License & attribution
The dataset is licensed under CC BY 4.0. You may use it commercially, redistribute it, and build derivatives. We ask only that you link back to localllm-advisor.com so users can find updates and methodology.
Stability & versioning
The URL prefix /api/v1/ is the contract. Breaking changes (renamed fields, removed endpoints) bump the major version (/api/v2/) and we run both for at least 6 months. Backwards-compatible additions (new optional fields, new endpoints) land on v1 without a bump.
Each response carries a schema field (e.g. [email protected]) so consumers can branch on the version if needed.
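Branching on the schema field might look like the sketch below, which routes a payload to a parser keyed by its exact schema string and rejects anything unrecognized. The helper `parse_by_schema` and the schema strings in the demo are placeholders invented for illustration; a real client would register the values the API actually returns.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


def parse_by_schema(payload: Dict[str, Any], handlers: Dict[str, Handler]) -> Any:
    """Dispatch a payload to the handler registered for its exact schema string."""
    schema = payload.get("schema")
    if schema not in handlers:
        raise ValueError(f"unsupported schema: {schema!r}")
    return handlers[schema](payload)


# Placeholder schema identifiers, NOT the API's real values.
handlers: Dict[str, Handler] = {
    "placeholder-schema-a": lambda p: p["recommendations"],
    "placeholder-schema-b": lambda p: p["items"],
}

demo = {"schema": "placeholder-schema-a", "recommendations": [{"name": "demo"}]}
print(parse_by_schema(demo, handlers))
```

Failing on an unknown schema, rather than guessing, keeps a client from silently misreading a payload whose shape has changed.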
Methodology
Compatibility uses the same fitting heuristic as the rest of the site (largest fitting quant under 85% of VRAM). Ranking inside each (GPU, use case) endpoint is a benchmark-channel composite with a completeness penalty, so partially-benchmarked models can't cherry-pick. Full details are on the methodology page.
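The fitting rule can be sketched as follows. This is an illustration under stated assumptions, not the site's implementation: it reads "largest" as the quant with the biggest VRAM footprint, treats "under 85%" as a strict inequality, and uses a hypothetical `(name, vram_mb)` table. Only the Q4_K_M figure and the 24576 MB card come from the sample response; the other rows are invented.

```python
from typing import List, Optional, Tuple

VRAM_BUDGET = 0.85  # fraction of GPU VRAM a quant may occupy, per the methodology text

Quant = Tuple[str, int]  # (quant name, VRAM needed in MB); shape assumed for this sketch


def largest_fitting_quant(quants: List[Quant], gpu_vram_mb: int) -> Optional[Quant]:
    """Return the quant with the largest footprint still under the VRAM budget, if any."""
    budget_mb = gpu_vram_mb * VRAM_BUDGET
    fitting = [q for q in quants if q[1] < budget_mb]
    return max(fitting, key=lambda q: q[1]) if fitting else None


# Q4_K_M / 19200 MB is from the sample response; the other rows are made up.
table = [("Q8_0", 32000), ("Q5_K_M", 22000), ("Q4_K_M", 19200), ("Q3_K_M", 15000)]

print(largest_fitting_quant(table, 24576))  # budget is 24576 * 0.85 = 20889.6 MB
```

With these numbers, the 22000 MB row exceeds the 20889.6 MB budget, so the 19200 MB row is selected, which agrees with the 78.1% `vram_pct` shown in the sample response.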