API error codes and best practices
1) Why standardize errors
Predictability for clients: a single format and consistent retry behavior.
Faster debugging: `trace_id`/`request_id` and a stable `error_code`.
Security: SQL, stack traces, and configs do not leak.
Observability: reporting on error taxonomy (validation, quotas, timeouts, etc.).
2) Basic principles
1. A single response format for all 4xx/5xx (2xx responses with partial errors use a separate schema).
2. Clear HTTP semantics: the correct status code matters most.
3. Two levels of codes: transport (`status`) and a stable domain `error_code`.
4. Retriable vs non-retriable: state it explicitly and give a back-off hint.
5. Secure by default: details go only to clients with the right permissions; never include internal traces.
6. Localization: the machine-readable code stays stable; only the text is translated.
3) Single error format (based on RFC 7807)
Recommended JSON (extended `application/problem+json`):

    {
      "type": "https://api.example.com/errors/validation_failed",
      "title": "Validation failed",
      "status": 422,
      "error_code": "VAL_001",
      "detail": "Field 'email' must be a valid address",
      "instance": "req_01HZY...93",
      "trace_id": "a1b2c3d4e5f6",
      "retriable": false,
      "errors": [
        {"field": "email", "code": "email_invalid", "message": "Invalid email"}
      ],
      "hint": "Fix payload and retry",
      "meta": {"docs": "https://docs.example.com/errors#VAL_001"}
    }

Required: `type`, `title`, `status`, `error_code`, `trace_id`.
Optional: `errors[]` (per field), `retriable`, `hint`, `meta`.
Response headers:
- `Content-Type: application/problem+json`
- `X-Request-ID`/`Traceparent` (W3C)
- `Retry-After` (seconds or an HTTP date) for 429/503
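
The format above can be produced by one small helper so that every handler emits the same shape. The following is a minimal sketch, not a reference implementation: the function name `make_problem` and its parameters are assumptions made for illustration, and the framework-specific step of actually sending the response is left out.

```python
import json
from typing import Any, Optional

PROBLEM_CONTENT_TYPE = "application/problem+json"
REQUIRED_FIELDS = ("type", "title", "status", "error_code", "trace_id")


def make_problem(
    *,
    type_uri: str,
    title: str,
    status: int,
    error_code: str,
    trace_id: str,
    request_id: str,
    retriable: bool = False,
    retry_after_s: Optional[int] = None,
    **optional: Any,
) -> tuple[int, dict[str, str], str]:
    """Return (status, headers, body) for an error response (hypothetical helper)."""
    body: dict[str, Any] = {
        "type": type_uri,
        "title": title,
        "status": status,
        "error_code": error_code,
        "trace_id": trace_id,
        "retriable": retriable,
    }
    # Optional members (detail, instance, errors, hint, meta) are copied only when set.
    body.update({k: v for k, v in optional.items() if v is not None})
    headers = {"Content-Type": PROBLEM_CONTENT_TYPE, "X-Request-ID": request_id}
    if retry_after_s is not None:
        headers["Retry-After"] = str(retry_after_s)
    return status, headers, json.dumps(body)
```

Keeping the required members as named parameters means a handler cannot forget `error_code` or `trace_id`; the interpreter rejects the call.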
4) HTTP status semantics (the classics combined with practice)
2xx (success, with nuances)
200 OK - ordinary success.
201 Created - resource created; return a `Location` header.
202 Accepted - queued for asynchronous processing (return a `status_url`).
207 Multi-Status - partial success (avoid if possible).
4xx (client error)
400 Bad Request - broken syntax/format, not field validation (prefer 422 for that).
401 Unauthorized - missing/invalid token. Send `WWW-Authenticate`.
403 Forbidden - the token is valid, but permissions are insufficient (RBAC/ABAC/limits).
404 Not Found - no such resource/endpoint.
409 Conflict - optimistic locking, idempotency.
410 Gone - endpoint permanently removed.
412 Precondition Failed - ETag/If-Match failed.
415 Unsupported Media Type - unsupported `Content-Type`.
422 Unprocessable Entity - business-rule validation failed.
429 Too Many Requests - quota/rate limit exceeded (see § 7).
5xx (server error)
500 Internal Server Error - unexpected error; do not disclose details.
502 Bad Gateway - upstream error.
503 Service Unavailable - degradation/overload; send `Retry-After`.
504 Gateway Timeout - backend timeout.
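
The 415/400/422 split above can be read as an ordered series of checks: media type first, then syntax, then validation rules. Here is a minimal sketch of that ordering, assuming a hypothetical "create user" endpoint; the function name `choose_status` and the simplistic email rule are illustrative assumptions.

```python
import json


def choose_status(content_type: str, raw_body: bytes) -> int:
    """Pick a status code for a hypothetical 'create user' request."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        return 415  # unsupported Content-Type
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return 400  # broken syntax
    if not isinstance(payload, dict):
        return 400  # wrong top-level shape; there are no fields to validate
    if "@" not in str(payload.get("email", "")):
        return 422  # well-formed request that fails a validation rule
    return 201
```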
5) Domain taxonomy of `error_code`
Recommended prefixes:
- `AUTH_` - authentication/authorization.
- `VAL_` - input validation.
- `RATELIMIT_` - quotas and rate limits.
- `IDEMP_` - idempotency/duplicates.
- `CONFLICT_` - versions/state.
- `DEP_` - dependencies (PSP/DNS/SMTP).
- `PAY_` - business errors of the payment domain.
- `SEC_` - security (signatures, HMAC, mTLS).
- `INT_` - unexpected internal errors.
Rules:
- Codes stay stable over time (backward compatibility).
- Descriptions and examples live in the error catalog (docs + machine-readable JSON).
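
One way to keep the taxonomy honest is a small in-code catalog that rejects unknown prefixes and refuses to redefine an existing code. The sketch below assumes hypothetical names (`ErrorSpec`, `ErrorCatalog`); only the prefixes come from the list above.

```python
import json
from dataclasses import asdict, dataclass

KNOWN_PREFIXES = (
    "AUTH_", "VAL_", "RATELIMIT_", "IDEMP_", "CONFLICT_",
    "DEP_", "PAY_", "SEC_", "INT_",
)


@dataclass(frozen=True)
class ErrorSpec:
    code: str
    status: int
    retriable: bool
    title: str


class ErrorCatalog:
    """In-memory catalog of error codes (illustrative, not a real library)."""

    def __init__(self) -> None:
        self._specs: dict[str, ErrorSpec] = {}

    def register(self, spec: ErrorSpec) -> None:
        if not spec.code.startswith(KNOWN_PREFIXES):
            raise ValueError(f"unknown prefix: {spec.code}")
        if spec.code in self._specs:
            # Codes are add-only: an existing code is never redefined.
            raise ValueError(f"already registered: {spec.code}")
        self._specs[spec.code] = spec

    def get(self, code: str) -> ErrorSpec:
        return self._specs[code]

    def to_json(self) -> str:
        """Machine-readable export of the catalog."""
        return json.dumps([asdict(s) for s in self._specs.values()], indent=2)
```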
6) Retriable vs Non-retriable
Fields:
- `retriable: true|false`
- If `true`, always provide `Retry-After` (in seconds) or a documented back-off contract: exponential back-off starting at 1-2 s, capped at 30-60 s.
Usually retriable: `502/503/504`, some `500`, `429` (after the window resets).
Non-retriable: `400/401/403/404/409/410/415/422`.
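
On the client side, these rules reduce to two decisions: whether to retry and how long to wait. Below is a minimal sketch under the assumptions stated in the comments; the function names and default values are illustrative, and production clients typically add random jitter, which is omitted here to keep the schedule deterministic.

```python
from typing import Optional

# Statuses treated as retriable when the body carries no explicit flag.
RETRIABLE_STATUSES = {429, 502, 503, 504}


def should_retry(status: int, problem: dict) -> bool:
    """Prefer the server's explicit `retriable` flag; fall back to the status."""
    flag = problem.get("retriable")
    if isinstance(flag, bool):
        return flag
    return status in RETRIABLE_STATUSES


def backoff_delay(
    attempt: int,
    retry_after_s: Optional[float] = None,
    base_s: float = 1.0,
    cap_s: float = 60.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after_s is not None:
        return max(0.0, float(retry_after_s))  # the server's hint wins
    return min(cap_s, base_s * 2 ** attempt)  # 1, 2, 4, ... capped at cap_s
```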
7) Rate limit & quota errors (429)
Body:

    {
      "type": "https://api.example.com/errors/rate_limited",
      "title": "Rate limit exceeded",
      "status": 429,
      "error_code": "RATELIMIT_RPS",
      "detail": "Too many requests",
      "retriable": true
    }

Headers:
- `Retry-After: 12`
- `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`
- For quotas: `X-Quota-Limit`, `X-Quota-Remaining`, `X-Quota-Reset`
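
The headers above might be produced and consumed roughly as follows. This is a sketch with assumptions: the helper names are hypothetical, and `X-RateLimit-Reset` is taken to mean "seconds until the window resets" (some APIs use a Unix timestamp instead, so document whichever you choose). The parser accepts both `Retry-After` forms, seconds and an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def rate_limit_headers(limit: int, remaining: int, reset_s: int) -> dict[str, str]:
    """Headers for a 429 response; `reset_s` is seconds until the window resets."""
    return {
        "Retry-After": str(reset_s),
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_s),
    }


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Parse Retry-After given as delta-seconds or an HTTP date; None if invalid."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):  # exception type depends on Python version
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```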
8) Idempotence and conflicts
For write requests, require an `Idempotency-Key` header (unique within 24-72 hours).
A conflicting retry → 409 Conflict with `error_code: "IDEMP_REPLAY"`.
A resource version conflict (ETag) → 412 Precondition Failed.
In the response, include `resource_id`/`status_url` so the client can safely re-query.
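
A server-side sketch of the replay rule, under one reading of it: the first request with a given key is recorded, and a repeat within the TTL gets a 409 `IDEMP_REPLAY` that points at the original resource. The class `IdempotencyStore` is hypothetical and keeps state in memory; a real deployment would need shared storage and atomic check-and-set.

```python
import time
from typing import Optional


class IdempotencyStore:
    """In-memory Idempotency-Key registry (illustrative only)."""

    def __init__(self, ttl_s: float = 24 * 3600) -> None:
        self._ttl_s = ttl_s
        # key -> (resource_id, expires_at)
        self._seen: dict[str, tuple[str, float]] = {}

    def begin(self, key: str, resource_id: str,
              now: Optional[float] = None) -> Optional[dict]:
        """Return None for a new key, or a 409 problem body for a replay."""
        now = time.time() if now is None else now
        entry = self._seen.get(key)
        if entry is not None and entry[1] > now:
            return {
                "title": "Duplicate request",
                "status": 409,
                "error_code": "IDEMP_REPLAY",
                "retriable": False,
                # Lets the client look up the original result safely.
                "resource_id": entry[0],
            }
        self._seen[key] = (resource_id, now + self._ttl_s)
        return None
```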
9) Validation and 422
Return a list of errors by field:

    {
      "status": 422,
      "error_code": "VAL_001",
      "errors": [
        {"field": "email", "code": "email_invalid", "message": "Invalid email"},
        {"field": "age", "code": "min", "message": "Must be >= 18"}
      ]
    }

Rules:
- Do not duplicate the same errors under 400; 422 is preferred for business validation.
- Messages are human-readable; `code` is machine-readable.
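
A validator that collects every field error, rather than stopping at the first, yields the list shown above in a single response. A minimal sketch follows; the function names and the deliberately naive email check are assumptions for illustration.

```python
from typing import Any


def validate_user(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Collect all field errors for a hypothetical user payload."""
    errors: list[dict[str, str]] = []
    email = payload.get("email")
    if not isinstance(email, str) or "@" not in email:
        errors.append(
            {"field": "email", "code": "email_invalid", "message": "Invalid email"}
        )
    age = payload.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or age < 18:
        errors.append({"field": "age", "code": "min", "message": "Must be >= 18"})
    return errors


def validation_problem(errors: list[dict[str, str]], trace_id: str) -> dict[str, Any]:
    """Wrap field errors in the 422 problem body."""
    return {
        "type": "https://api.example.com/errors/validation_failed",
        "title": "Validation failed",
        "status": 422,
        "error_code": "VAL_001",
        "trace_id": trace_id,
        "retriable": False,
        "errors": errors,
    }
```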
10) Error security
Never expose: stack traces, SQL, file paths, private host names.
Redact PII; keep GDPR/DSAR obligations in mind.
For signatures/HMAC, distinguish `SEC_SIGNATURE_MISMATCH` (403) from `SEC_TIMESTAMP_SKEW` (401/403), with a hint such as "check clock skew: ±5 min."
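
The two signature errors can be told apart by checking the timestamp before the HMAC. The sketch below makes several assumptions not stated in the text: HMAC-SHA256, a signed string of the form `timestamp.body`, a Unix-seconds timestamp, and a malformed timestamp being reported as skew. The function name is hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_S = 300  # ±5 minutes


def verify_signature(
    secret: bytes,
    body: bytes,
    timestamp: str,
    signature_hex: str,
    now: Optional[float] = None,
) -> Optional[str]:
    """Return None when the request is valid, else a SEC_* error code."""
    now = time.time() if now is None else now
    try:
        ts = int(timestamp)
    except ValueError:
        return "SEC_TIMESTAMP_SKEW"  # assumption: malformed counts as skew
    if abs(now - ts) > MAX_SKEW_S:
        return "SEC_TIMESTAMP_SKEW"
    signed = timestamp.encode() + b"." + body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature_hex.encode()):
        return "SEC_SIGNATURE_MISMATCH"
    return None
```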
11) Correlation and observability
Always add `trace_id`/`X-Request-ID` and propagate them through logs/traces.
Aggregate errors by `error_code` and `status` → dashboards for "top errors" and "new vs known."
Alerts: spikes in 5xx/422/429, p95 latency, error rate.
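
The "top errors" and "new vs known" views are both simple aggregations over `(error_code, status)` pairs. A toy in-process sketch is shown here to make the idea concrete; the class `ErrorStats` is hypothetical, and in practice these counts would usually live in a metrics system rather than application memory.

```python
from collections import Counter
from typing import Iterable


class ErrorStats:
    """Counts errors by (error_code, status) and flags unknown codes."""

    def __init__(self, known_codes: Iterable[str]) -> None:
        self.known = set(known_codes)
        self.counts: Counter = Counter()

    def record(self, error_code: str, status: int) -> None:
        self.counts[(error_code, status)] += 1

    def top(self, n: int = 5) -> list[tuple[tuple[str, int], int]]:
        """Most frequent (error_code, status) pairs, highest count first."""
        return self.counts.most_common(n)

    def new_codes(self) -> list[str]:
        """Codes seen in traffic that are missing from the known catalog."""
        return sorted({code for code, _ in self.counts} - self.known)
```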
12) gRPC/GraphQL/Webhooks - mappings
gRPC ↔ HTTP
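As a starting point, the mapping can follow the conventional correspondence documented alongside the `google.rpc.Code` definitions. The sketch below uses plain status names so it needs no gRPC dependency; the helper `http_status_for` and its fallback to 500 for unrecognized names are assumptions.

```python
# Conventional gRPC status name -> HTTP status mapping.
GRPC_TO_HTTP = {
    "OK": 200,
    "CANCELLED": 499,
    "UNKNOWN": 500,
    "INVALID_ARGUMENT": 400,
    "DEADLINE_EXCEEDED": 504,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "PERMISSION_DENIED": 403,
    "RESOURCE_EXHAUSTED": 429,
    "FAILED_PRECONDITION": 400,
    "ABORTED": 409,
    "OUT_OF_RANGE": 400,
    "UNIMPLEMENTED": 501,
    "INTERNAL": 500,
    "UNAVAILABLE": 503,
    "DATA_LOSS": 500,
    "UNAUTHENTICATED": 401,
}


def http_status_for(grpc_code_name: str) -> int:
    """Translate a gRPC status name; unrecognized names fall back to 500."""
    return GRPC_TO_HTTP.get(grpc_code_name.upper(), 500)
```

Note that this table maps `FAILED_PRECONDITION` to 400, whereas the HTTP guidance in § 4 uses 412 for ETag preconditions; a gateway may want to override that entry.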
GraphQL
The transport returns 200 with `errors[]` in the body; add `extensions.code` and `trace_id`.
For "fatal" errors (authentication/quotas), a real HTTP 401/403/429 is better.
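
A GraphQL error carrying the stable code might be built like this. It is a sketch: placing `trace_id` inside `extensions` next to `code` is an assumption, as is the helper name `graphql_error`.

```python
from typing import Any, Optional


def graphql_error(
    message: str,
    code: str,
    trace_id: str,
    path: Optional[list] = None,
) -> dict[str, Any]:
    """Build a GraphQL response body with a single error entry."""
    error: dict[str, Any] = {
        "message": message,
        # `extensions` is the place for machine-readable additions.
        "extensions": {"code": code, "trace_id": trace_id},
    }
    if path:
        error["path"] = path
    return {"data": None, "errors": [error]}
```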
Webhooks
Treat only 2xx responses from the recipient as success.
Retry with exponential back-off; send `X-Webhook-ID` and `X-Signature`.
410 from the recipient - stop retrying (endpoint removed).
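
The sender's delivery loop can be reduced to a single decision per attempt. A minimal sketch follows, with hypothetical names and defaults (`max_attempts`, the 1 s base and 60 s cap are assumptions, not values from the text).

```python
from typing import Optional


def webhook_next_step(
    status: int,
    attempt: int,
    max_attempts: int = 6,
    base_s: float = 1.0,
    cap_s: float = 60.0,
) -> tuple[str, Optional[float]]:
    """Decide what to do after delivery attempt number `attempt` (0-based).

    Returns (action, delay_seconds); delay is set only for "retry".
    """
    if 200 <= status < 300:
        return ("delivered", None)
    if status == 410:
        return ("disabled", None)  # recipient says the endpoint is gone
    if attempt + 1 >= max_attempts:
        return ("failed", None)  # retry budget exhausted
    return ("retry", min(cap_s, base_s * 2 ** attempt))
```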
13) Error versioning
`type`/`error_code` are stable; new values are only ever added.
When the body schema changes, bump the API minor version or use `problem+json; v=2`.
Documentation: a code table with examples, plus a changelog of errors.
14) Documentation (OpenAPI fragments)
Global responses

    components:
      responses:
        Problem:
          description: Problem Details
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/Problem'
      schemas:
        Problem:
          type: object
          required: [type, title, status, error_code, trace_id]
          properties:
            type: { type: string, format: uri }
            title: { type: string }
            status: { type: integer }
            error_code: { type: string }
            detail: { type: string }
            instance: { type: string }
            trace_id: { type: string }
            retriable: { type: boolean }
            errors:
              type: array
              items:
                type: object
                properties:
                  field: { type: string }
                  code: { type: string }
                  message: { type: string }

Example of an endpoint

    paths:
      /v1/users:
        post:
          responses:
            '201': { description: Created }
            '401': { $ref: '#/components/responses/Problem' }
            '422': { $ref: '#/components/responses/Problem' }
            '429': { $ref: '#/components/responses/Problem' }
            '500': { $ref: '#/components/responses/Problem' }

15) Testing and quality
Contract tests: responses match `application/problem+json` and include the required fields.
Negative tests: all branches 401/403/404/409/422/429/500.
Chaos/latency: verify retries for 5xx/503/504/429 (`Retry-After`).
Security tests: no internal messages, correct PII masking.
Backward compatibility: old clients tolerate new fields (add, don't break).
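
A contract check for error responses can be a plain function that any test suite calls on every negative case. The following is a sketch: the function name, the list of leak markers, and the exact-case header lookup are simplifying assumptions (real HTTP header names are case-insensitive).

```python
import json

REQUIRED_FIELDS = ("type", "title", "status", "error_code", "trace_id")
# Crude, illustrative markers of leaked internals; extend for your stack.
LEAK_MARKERS = ("Traceback (most recent call last)", "SELECT ", "INSERT INTO")


def problem_violations(status: int, headers: dict[str, str], body: str) -> list[str]:
    """Return a list of contract violations; an empty list means the response passes."""
    found: list[str] = []
    if not headers.get("Content-Type", "").startswith("application/problem+json"):
        found.append("wrong Content-Type")
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return found + ["body is not JSON"]
    if not isinstance(doc, dict):
        return found + ["body is not a JSON object"]
    found += [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in doc]
    if doc.get("status") != status:
        found.append("body status differs from HTTP status")
    if status in (429, 503) and "Retry-After" not in headers:
        found.append("missing Retry-After")
    if any(marker in body for marker in LEAK_MARKERS):
        found.append("possible internal detail leak")
    return found
```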
16) Implementation checklist
- Single `problem+json` format + stable `error_code`.
- Correct HTTP/gRPC/GraphQL semantics.
- Retriable/non-retriable + `Retry-After`/back-off recommendations.
- Rate-limit headers and 429 behavior.
- Idempotency (`Idempotency-Key`, 409/412).
- Security: no stack traces/secrets, PII redaction.
- `trace_id`/`X-Request-ID` in all errors.
- Error catalog documentation and examples.
- Monitoring by error taxonomy.
- Automated tests for negative scenarios.
17) Mini-FAQ
How is 400 different from 422?
400 - malformed request (syntax/content type). 422 - syntactically valid, but business rules failed.
When is it 401 and when is it 403?
401 - missing/invalid token; 403 - the token is valid, but permissions are insufficient.
Is `Retry-After` always needed?
For 429/503, yes; for other retriable errors, an explicit recommendation is advisable.
Summary
Well-designed errors are part of the contract: the correct HTTP status, a single `problem+json` format, a stable `error_code`, explicit retry hints, and strong security. Standardize the format, document the taxonomy, add telemetry and tests, and your API becomes predictable, secure, and integrator-friendly.