What is model spoofing in LLM APIs?

Model spoofing is when an API provider claims to offer a premium model like GPT-4 but actually serves a cheaper model like GPT-3.5-turbo. They may use system prompts to make the cheaper model mimic the expensive one.

What is an API relay chain?

A relay chain is when your API requests pass through intermediary servers before reaching the actual model provider. Relays add latency, introduce data interception risks, and may modify requests or responses.

Is there a free fake API detection tool?

Yes, api-dna.com offers a completely free fake API detection tool. It checks for model substitution, relay chains, and token inflation. No sign-up required.

Fake API Detection Tool — Identify Relay Chains & Model Spoofing

Learn how fake API providers operate and how to detect model substitution, relay chains, and token inflation with a free detection tool.

The LLM API resale market is booming — and so is API fraud. Providers advertise GPT-4 or Claude 3.5 access but quietly serve cheaper models, inflate token counts, and route requests through unsecured relay servers. If you're paying for API access, you need a fake API detection tool that goes beyond simple key validation.

How Fake API Providers Operate

Understanding the fraud techniques helps you detect them. Here are the three main methods fake API providers use:

1. Model Substitution (Model Spoofing)

The provider claims to offer GPT-4 at a premium price but actually routes your requests to GPT-3.5-turbo or a similarly cheap model. They may use system prompts to make the cheaper model mimic the more expensive one, for example by instructing GPT-3.5 to "respond as if you are GPT-4." The "model" field in the API response may even say "gpt-4," since the provider controls what the response reports, but the actual output quality reveals the truth.

2. Token Inflation

The provider reports two to three times as many tokens as your request actually consumed. Since most users pay per token, this directly inflates your bill. For example, a request that genuinely uses 500 tokens might be reported as 1,200 tokens, a 2.4x markup. Over thousands of requests, this adds up to significant overcharges. Read more in our Token Inflation Explained guide.
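The arithmetic behind the 500-versus-1,200-token example can be sketched in a few lines. The function name, the price of $0.03 per 1,000 tokens, and the request volume are hypothetical values chosen only for illustration; substitute your provider's actual rate.

```python
def inflation_report(actual_tokens, reported_tokens, price_per_1k_tokens, requests=1):
    """Return (inflation ratio, total overcharge in dollars).

    price_per_1k_tokens and requests are illustrative inputs, not real pricing.
    """
    if actual_tokens <= 0:
        raise ValueError("actual_tokens must be positive")
    ratio = reported_tokens / actual_tokens
    extra_tokens = reported_tokens - actual_tokens
    overcharge = extra_tokens / 1000 * price_per_1k_tokens * requests
    return ratio, overcharge


# 500 real tokens billed as 1,200, repeated over 10,000 requests.
ratio, overcharge = inflation_report(500, 1200, price_per_1k_tokens=0.03, requests=10_000)
print(f"inflation ratio: {ratio:.1f}x, overcharge: ${overcharge:.2f}")
```

The overcharge scales linearly with request volume, which is why a small per-request discrepancy is easy to miss on any single call.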

3. Relay Chains

Your requests pass through one or more intermediary servers before reaching the actual model provider. Each relay adds latency, introduces a data interception risk, and may modify the request or response. Some relays are legitimate (load balancers, caching layers), but many are unauthorized resellers who intercept and log your data. Our API Relay Detection guide covers this in detail.

How Fake API Detection Works

API-DNA uses a multi-layered detection approach:

Behavioral Fingerprinting

We send carefully crafted prompts that produce distinct responses from different models. By analyzing the response style, reasoning patterns, and output characteristics, we can identify the real model — even if the provider claims it's something else. For example, GPT-4 and GPT-3.5 have measurably different abilities on specific reasoning tasks.
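A minimal sketch of the probe-and-compare idea follows. It is not API-DNA's actual method: the probes, the substring scoring rule, the tolerance, and the two stub "models" are all assumptions made for illustration. In practice `ask` would wrap a real API call, and the trusted baseline would come from an endpoint you know serves the genuine model.

```python
from typing import Callable, List, Tuple

Probe = Tuple[str, str]  # (prompt, substring expected in a correct reply)

# Hypothetical probes; a real fingerprint would use many more, harder tasks.
PROBES: List[Probe] = [
    ("What is 17 * 23? Answer with the number only.", "391"),
    ("Spell the word 'relay' backwards.", "yaler"),
    ("How many letters are in the word 'token'? Answer with a digit.", "5"),
]


def probe_pass_rate(ask: Callable[[str], str], probes: List[Probe]) -> float:
    """Fraction of probes whose reply contains the expected answer."""
    passed = sum(1 for prompt, expected in probes
                 if expected.lower() in ask(prompt).lower())
    return passed / len(probes)


def looks_substituted(suspect_rate: float, trusted_rate: float,
                      tolerance: float = 0.2) -> bool:
    """Flag the suspect endpoint if it scores well below the trusted baseline."""
    return trusted_rate - suspect_rate > tolerance


# Stub endpoints standing in for real API calls (assumptions for the demo).
ANSWERS = {prompt: expected for prompt, expected in PROBES}

def trusted_stub(prompt: str) -> str:
    return ANSWERS[prompt]

def suspect_stub(prompt: str) -> str:
    return "391" if "17" in prompt else "I'm not sure."


trusted = probe_pass_rate(trusted_stub, PROBES)
suspect = probe_pass_rate(suspect_stub, PROBES)
print(trusted, suspect, looks_substituted(suspect, trusted))
```

Comparing against a measured baseline rather than a fixed score keeps the check meaningful when the probe set changes.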

Infrastructure Analysis

We trace the network path from the base URL you provide toward the model server. This includes checking IP addresses, ASN ownership (which organization operates the network the server sits on), TLS certificate details, and HTTP headers. If an endpoint claims to connect directly to OpenAI but resolves to a server on an unrelated hosting provider, that's a strong sign of a relay.
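As a rough illustration of the first step, the sketch below compares a base URL's hostname with the provider's public API host and reads the DNS names from the server's TLS certificate. The allowlist and function names are assumptions for this example, not part of API-DNA; a mismatch is a prompt for further investigation, since some intermediaries are legitimate.

```python
import socket
import ssl
from urllib.parse import urlparse

# Illustrative allowlist of first-party API hosts (verify before relying on it).
OFFICIAL_HOSTS = {
    "openai": "api.openai.com",
    "anthropic": "api.anthropic.com",
    "google": "generativelanguage.googleapis.com",
}


def host_matches_provider(base_url: str, provider: str) -> bool:
    """True if the base URL points at the provider's own API hostname."""
    host = urlparse(base_url).hostname  # None if the URL has no scheme
    return host is not None and host.lower() == OFFICIAL_HOSTS[provider]


def tls_dns_names(host: str, port: int = 443, timeout: float = 5.0) -> list:
    """Fetch the server certificate and return its subjectAltName DNS entries.

    Requires network access; the certificate is verified against system CAs.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


print(host_matches_provider("https://api.openai.com/v1", "openai"))
print(host_matches_provider("https://relay.example.com/v1", "openai"))
```

A hostname check only sees the first hop. Anything behind that server is invisible from the client side, which is why it is combined with behavioral and token checks.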

Token Auditing

We independently estimate the token count for each request and compare it against what the provider reports. Significant discrepancies indicate token inflation. Our estimation uses the same tokenization algorithms as the original providers, giving us accurate baselines.
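The comparison step might look like the sketch below. The `count_tokens` callable is an assumption: in a real audit it would be backed by the provider's tokenizer, while the whitespace splitter here is only a stand-in so the example runs without extra dependencies. The 10% tolerance is also an illustrative choice, meant to absorb the small per-message overhead that chat formats typically add.

```python
from typing import Callable, Dict


def audit_prompt_tokens(prompt: str, reported_tokens: int,
                        count_tokens: Callable[[str], int],
                        tolerance: float = 0.10) -> Dict[str, object]:
    """Compare the provider-reported prompt token count with a local estimate."""
    estimated = count_tokens(prompt)
    if estimated <= 0:
        raise ValueError("prompt produced no tokens; nothing to audit")
    ratio = reported_tokens / estimated
    return {
        "estimated": estimated,
        "reported": reported_tokens,
        "ratio": ratio,
        "inflated": ratio > 1 + tolerance,
    }


def whitespace_count(text: str) -> int:
    """Stand-in tokenizer for the demo; NOT how real models tokenize."""
    return len(text.split())


prompt = "one two three four five six seven eight nine ten"
# reported_tokens would normally come from the response's usage data.
result = audit_prompt_tokens(prompt, reported_tokens=24, count_tokens=whitespace_count)
print(result)
```

Passing the tokenizer in as a parameter keeps the audit logic independent of any one provider's tokenization scheme.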

Using the Fake API Detection Tool

To detect fake APIs with API-DNA:

Enter your API key and base URL — Supports OpenAI, Anthropic, Google, and any OpenAI-compatible endpoint
Click "Check" — The tool runs behavioral, infrastructure, and token analysis
Review the detection report — See claimed vs. detected model, relay chain analysis, and token audit results
Take action — If fraud is detected, switch to a verified provider or go direct

Red Flags That Your API Provider Is Fake

Before even running a detection scan, watch for these warning signs:

Prices significantly below market rate — If GPT-4 access costs 50% less than OpenAI's official pricing, something is off
Custom base URL required — Legitimate providers use official API endpoints; relays require custom URLs
No rate limits — Real API keys have usage tiers; unlimited access suggests a shared or stolen key
Vague model names — "GPT-4 class" or "Claude-equivalent" usually means a cheaper substitute
No official documentation — If the provider can't show you model cards or API docs, proceed with caution

For a full verification checklist, see our API Provider Verification guide.

Real-World Examples of API Fraud

API-DNA has detected numerous fraud cases. Common patterns include:

GPT-3.5 sold as GPT-4 — The most common fraud. The provider advertises GPT-4 pricing but routes to GPT-3.5. Cost savings for them: ~10x. Extra cost for you: ~3x.
Claude Haiku sold as Claude Opus — Similar model substitution, especially common with Anthropic resellers.
2x token inflation — Every request is billed at 2x the actual token count. Over a month of heavy usage, this can mean hundreds of dollars in overcharges.
Multi-hop relays — Your request passes through 2-3 intermediaries, each adding latency and data exposure risk.

Detect Fake APIs Now

Don't pay for models you're not getting. Run a free detection scan on your API key — behavioral fingerprinting, infrastructure analysis, and token auditing in one click.
