Rate limits
Phantom rate-limits API calls per token, with two independent buckets so a burst of reads can't starve writes:
| Bucket | Burst (per minute) | Sustained |
|---|---|---|
| read (GET, HEAD) | 120 | ~2 / sec |
| write (POST, PATCH, PUT, DELETE) | 30 | ~0.5 / sec |
Plus a defence-in-depth per-IP cap of 30 unauthenticated requests / minute to prevent token-probing on a fresh IP.
How to read the headers
Every response carries:
RateLimit-Limit: 120 ← bucket size
RateLimit-Remaining: 117 ← what's left
RateLimit-Reset: 43 ← seconds until the bucket refills
When you hit the cap:
HTTP/1.1 429 Too Many Requests
Retry-After: 43
RateLimit-Limit: 120
RateLimit-Remaining: 0
RateLimit-Reset: 43

{
  "error": {
    "code": "rate_limited",
    "message": "API rate limit exceeded. Try again in 43s.",
    "bucket": "read"
  }
}
Retry-After and RateLimit-Reset are always in seconds, and always equal (they refer to the same value).
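As a minimal sketch of reading the three headers described above, the snippet below parses them into a small record. The names `RateLimitState` and `parse_rate_limit` are hypothetical helpers for illustration, not part of any Phantom SDK, and the input is assumed to be a plain mapping of header names to string values.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitState:
    limit: int      # RateLimit-Limit: bucket size
    remaining: int  # RateLimit-Remaining: calls left in the current window
    reset_s: int    # RateLimit-Reset: seconds until the bucket refills


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitState]:
    """Return the rate-limit state, or None if a header is missing or malformed."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitState(
            limit=int(lowered["ratelimit-limit"]),
            remaining=int(lowered["ratelimit-remaining"]),
            reset_s=int(lowered["ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None


state = parse_rate_limit({
    "RateLimit-Limit": "120",
    "RateLimit-Remaining": "117",
    "RateLimit-Reset": "43",
})
print(state)
```

Returning `None` rather than raising keeps the caller's happy path simple if a proxy strips the headers; whether that is the right trade-off depends on your client.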
Sensible client behaviour
- Always read RateLimit-Remaining. Throttle yourself preemptively if it's below ~10 so you don't get caught in a 429 mid-task.
- Honour Retry-After. Don't busy-loop; wait the exact number of seconds it gave you.
- Exponential backoff on 429s if you have a budget. A repeat 429 means another caller on the same token is also hitting the limit; back off.
- Spread your work. A batch job that fires 100 writes in one second will trip the write limit of 30 per minute; pace it at no more than 30 writes per minute, which spreads 100 writes over at least four windows.
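The guidance above can be sketched as a single pacing function that decides how long to wait before the next call. This is one possible reading, not a prescribed algorithm: `next_delay`, `LOW_WATERMARK`, and `MAX_BACKOFF_S` are hypothetical names, the 300-second ceiling is an arbitrary choice, and the caller is assumed to track how many 429s it has seen in a row.

```python
import random
from typing import Optional

LOW_WATERMARK = 10     # the "below ~10" threshold from the guidance above
MAX_BACKOFF_S = 300.0  # hypothetical ceiling on extra backoff, not a documented value


def next_delay(status: int, remaining: int, reset_s: int,
               retry_after_s: Optional[int], consecutive_429s: int) -> float:
    """Seconds to wait before the next call on this token."""
    if status == 429:
        # Honour Retry-After; RateLimit-Reset is documented as carrying the same value.
        base = float(retry_after_s if retry_after_s is not None else reset_s)
        if consecutive_429s <= 1:
            return base
        # A repeat 429 suggests another caller shares the token:
        # add exponentially growing, jittered extra delay on top of the base wait.
        extra = min(MAX_BACKOFF_S, 2.0 ** (consecutive_429s - 1))
        return base + random.uniform(0.0, extra)
    if remaining < LOW_WATERMARK:
        # Spread the few remaining calls evenly over the rest of the window.
        return reset_s / max(remaining, 1)
    return 0.0
```

For example, a 200 response with 5 calls remaining and 40 seconds to reset yields an 8-second pause, while a first 429 with `Retry-After: 43` yields exactly 43 seconds.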
Bypassing the limit
You can't. If you need higher throughput, mint multiple tokens and have your callers round-robin between them, or contact us so we can talk about the use case. We won't bump the limit silently — the cap is part of how we keep one customer's traffic from affecting another's.
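A minimal sketch of the round-robin approach, assuming you already hold several minted tokens as strings. `token_cycle` is a hypothetical helper and the token values are placeholders; how the token is attached to a request is left out because it depends on your client.

```python
import itertools
from typing import Iterator, Sequence


def token_cycle(tokens: Sequence[str]) -> Iterator[str]:
    """Yield tokens in order, forever, so each call uses the next token."""
    if not tokens:
        raise ValueError("at least one token is required")
    return itertools.cycle(tokens)


tokens = token_cycle(["tok_a", "tok_b", "tok_c"])  # placeholder values
picks = [next(tokens) for _ in range(5)]
print(picks)  # ['tok_a', 'tok_b', 'tok_c', 'tok_a', 'tok_b']
```

Because each token has its own read and write buckets, rotating evenly keeps any single token from draining first.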
What counts as a "request"
- Every HTTP call to /api/v1/* consumes 1 from the appropriate bucket, including 4xx responses.
  - A response of 401 consumes 1 unauthenticated request (per-IP bucket).
  - A response of 403 from RequireApiScope has already consumed 1 from the read/write bucket: the token was valid, just not for this endpoint.
- A successful Idempotent-Replay: true response still consumes 1 from the bucket, because the replay logic runs after rate-limit accounting.
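To make the accounting rules above concrete, here is a client-side model of which bucket a call is charged against. It is only an illustration of the rules as stated: `charged_bucket` and the `"per-ip"` label are hypothetical, and the server's actual implementation is not shown in this document.

```python
READ_METHODS = {"GET", "HEAD"}
WRITE_METHODS = {"POST", "PATCH", "PUT", "DELETE"}


def charged_bucket(method: str, status: int) -> str:
    """Return the bucket that a call to /api/v1/* is charged against."""
    if status == 401:
        # No valid token, so the call counts against the unauthenticated per-IP cap.
        return "per-ip"
    verb = method.upper()
    # Every other outcome, including 403 and idempotent replays,
    # is charged to the token's bucket for that method.
    if verb in READ_METHODS:
        return "read"
    if verb in WRITE_METHODS:
        return "write"
    raise ValueError(f"method not covered by the bucket table: {method}")


print(charged_bucket("GET", 200))     # read
print(charged_bucket("POST", 403))    # write
print(charged_bucket("DELETE", 401))  # per-ip
```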
Burst handling
The bucket refills on a fixed 60-second window aligned to the wall clock, so at the top of every minute (second :00) every token's budget resets. That means you can briefly consume close to 2× the per-minute budget by spending one full budget at the end of a window (around :59) and another just after the reset (around :01 of the next minute). We accept that: it's negligible at the budgets above and a much simpler mental model than a true token bucket.
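The boundary effect can be reproduced with a small simulation of a fixed-window counter. This is a sketch of the general technique under the assumptions stated above (60-second windows aligned to the clock, a limit of 120), not Phantom's server code; `FixedWindowLimiter` and its methods are hypothetical names, and time is passed in as plain seconds so the example is deterministic.

```python
from collections import defaultdict
from typing import DefaultDict

WINDOW_S = 60  # fixed window length in seconds


class FixedWindowLimiter:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        # Calls counted per window index (timestamp // 60).
        self.used: DefaultDict[int, int] = defaultdict(int)

    def allow(self, now_s: float) -> bool:
        """Count the call and return True if the current window has budget left."""
        window = int(now_s // WINDOW_S)
        if self.used[window] >= self.limit:
            return False
        self.used[window] += 1
        return True

    def reset_in(self, now_s: float) -> int:
        """Whole seconds until the next window boundary."""
        return WINDOW_S - int(now_s % WINDOW_S)


limiter = FixedWindowLimiter(limit=120)
late = sum(limiter.allow(59.0) for _ in range(120))   # end of the first window
early = sum(limiter.allow(61.0) for _ in range(120))  # start of the next window
print(late + early)  # 240 calls accepted within about two seconds
```

A further call at second 62 is rejected, since the second window's budget is already spent.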
