Schema registry and data evolution
Why do I need a schema registry
The Schema Registry is a centralized source of truth for data contracts (APIs, events, streams, messages, stores) that provides:
- Predictable evolution: compatibility rules and automatic breakage checks.
- Repeatability and transparency: the history of versions, who/when/why changed.
- Standardization: uniform names, error formats, trace fields, PII labels.
- Integration with CI/CD: blocking breaking changes before production.
The registry ties together protocol-first design and contract compatibility, making changes fast and safe.
Formats and applications
JSON Schema: REST/HTTP payloads, documents, configurations.
Avro: event buses (Kafka/Pulsar); compact, with evolution via field names and defaults.
Protobuf: gRPC/RPC; efficient binary encoding, strict field tags.
GraphQL SDL: type and directive schema, evolution via `@deprecated`.
SQL DDL as an artifact: pin down agreed views (for example, external data marts), with caution.
Compatibility modes
BACKWARD: readers using the new schema can read old data/messages. Suits a producer that extends the payload additively.
FORWARD: old consumers read new data correctly (requires a tolerant reader).
FULL: both directions at once (stricter; a better fit for public contracts).
NONE: no checks; for sandboxes only.
Recommendations:
- Events: most often BACKWARD (the producer extends the payload with optional fields).
- Public APIs: FULL, or BACKWARD plus a strictly enforced tolerant reader on clients.
- Internal prototypes: temporarily NONE, but never on trunk.
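The modes above can be sketched as a small check. This is a simplified model, not any registry's actual algorithm: the `Field` class, the flat name-to-field schema, and the rule "a reader field must either exist in the writer with the same type or carry a default" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    type: str
    has_default: bool = False  # optional fields carry a default


Schema = dict[str, Field]


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if a reader using `reader` can decode data written with `writer`."""
    for name, field in reader.items():
        written = writer.get(name)
        if written is None:
            if not field.has_default:
                return False  # reader needs a value the writer never sent
        elif written.type != field.type:
            return False  # same field, different type
    return True


def is_compatible(old: Schema, new: Schema, mode: str) -> bool:
    if mode == "NONE":
        return True
    backward = can_read(reader=new, writer=old)  # new schema reads old data
    forward = can_read(reader=old, writer=new)   # old schema reads new data
    if mode == "BACKWARD":
        return backward
    if mode == "FORWARD":
        return forward
    if mode == "FULL":
        return backward and forward
    raise ValueError(f"unknown compatibility mode: {mode}")


v1 = {"payment_id": Field("string"), "amount": Field("long")}
v2_optional = {**v1, "risk_score": Field("double", has_default=True)}
v2_required = {**v1, "risk_score": Field("double")}
```

In this model, adding `risk_score` with a default passes FULL, while adding it without a default passes FORWARD only.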
Safe (additive) vs. dangerous changes
Additive (OK):
- Adding an optional field/type.
- Extending an enum with new values (with a tolerant reader).
- Adding an alternate projection/event (`.enriched`).
- Relaxing constraints (lowering `minLength`, raising `maximum`, but not the reverse).
Dangerous (breaking):
- Deleting/renaming fields or changing their type or required status.
- Changing the semantics of statuses/codes or the ordering in streams.
- Reusing protobuf tags.
- Changing the partitioning key in events.
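As a rough illustration of how a schema-diff step might sort structural changes into these two groups, here is a sketch over JSON-Schema-like dictionaries. The function name `classify_changes` and the rule set are assumptions; semantic changes (status meaning, ordering, partition key) cannot be detected this way and still need review.

```python
def classify_changes(old: dict, new: dict) -> list[tuple[str, str]]:
    """Return (kind, description) pairs, where kind is 'additive' or 'breaking'."""
    changes = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    old_req = set(old.get("required", []))
    new_req = set(new.get("required", []))

    # Removed (or renamed) fields break existing readers.
    for name in sorted(old_props.keys() - new_props.keys()):
        changes.append(("breaking", f"removed field {name}"))

    # New fields are safe only when they are optional.
    for name in sorted(new_props.keys() - old_props.keys()):
        kind = "breaking" if name in new_req else "additive"
        changes.append((kind, f"added field {name}"))

    # Fields present in both versions: check type and required status.
    for name in sorted(old_props.keys() & new_props.keys()):
        if old_props[name].get("type") != new_props[name].get("type"):
            changes.append(("breaking", f"changed type of {name}"))
        if name in new_req and name not in old_req:
            changes.append(("breaking", f"{name} became required"))
    return changes
```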
Registry organization
Naming and addressing
Groups/spaces: `payments`, `kyc`, `audit`.
Names: `payment.authorized.v1` (events), `payments.v1.CaptureRequest` (gRPC), `orders.v1.Order` (JSON Schema).
The major version goes in the name; minor versions live in the metadata/schema version.
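A naming convention like this is easy to enforce with a linter. The sketch below assumes lowercase dot-separated segments for events and a `package.vN.TypeName` shape for gRPC/JSON Schema types; the exact regular expressions and the `parse_name` helper are illustrative, not a standard.

```python
import re

_SEGMENT = r"[a-z][a-z0-9_]*"

# domain.action.v{major}, e.g. payment.authorized.v1
EVENT_NAME = re.compile(
    rf"^(?P<domain>{_SEGMENT})\.(?P<action>{_SEGMENT}(?:\.{_SEGMENT})*)"
    r"\.v(?P<major>[1-9][0-9]*)$"
)
# package.v{major}.TypeName, e.g. payments.v1.CaptureRequest
TYPE_NAME = re.compile(
    rf"^(?P<package>{_SEGMENT})\.v(?P<major>[1-9][0-9]*)"
    r"\.(?P<type>[A-Z][A-Za-z0-9]*)$"
)


def parse_name(name: str) -> dict:
    """Split a registry name into its parts or raise ValueError."""
    for kind, pattern in (("type", TYPE_NAME), ("event", EVENT_NAME)):
        match = pattern.match(name)
        if match:
            parts = {"kind": kind, **match.groupdict()}
            parts["major"] = int(parts["major"])
            return parts
    raise ValueError(f"name does not follow the naming convention: {name!r}")
```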
Metadata
`owner` (team), `domain`, `slas` (SLO/SLA), `security.tier` (PII/PCI), `retention`, `compatibility_mode`, `sunset`, `changelog`.
Lifecycle Management
Draft → Review → Approved → Released → Deprecated → Sunset.
Automatic validators/linters, manual design-review (API Guild), release notes.
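The lifecycle can be modeled as a small state machine so that tooling rejects skipped stages. This is a sketch: the `Stage` enum mirrors the stages listed above, while the extra Review → Draft edge (a rejected review) is an assumption not stated in the text.

```python
from enum import Enum


class Stage(Enum):
    DRAFT = "Draft"
    REVIEW = "Review"
    APPROVED = "Approved"
    RELEASED = "Released"
    DEPRECATED = "Deprecated"
    SUNSET = "Sunset"


ALLOWED = {
    Stage.DRAFT: {Stage.REVIEW},
    # Assumption: a rejected review sends the schema back to Draft.
    Stage.REVIEW: {Stage.APPROVED, Stage.DRAFT},
    Stage.APPROVED: {Stage.RELEASED},
    Stage.RELEASED: {Stage.DEPRECATED},
    Stage.DEPRECATED: {Stage.SUNSET},
    Stage.SUNSET: set(),  # terminal stage
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the transition skips a step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```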
Integration into CI/CD
1. Pre-commit: local linters (Spectral/Buf/Avro tools).
2. PR pipeline: schema-diff → compatibility mode check; breaking changes block the merge.
3. Artifact publish: push the approved schema to the registry + generate SDK/models.
4. Runtime guard (optional): the gateway/producer validates the payload against the current schema.
Example checks:
- `openapi-diff --fail-on-breaking`
- `buf breaking --against ...`
- `avro-compat --mode BACKWARD`
- generating golden samples and running consumer-driven contract (CDC) tests.
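A consumer-driven contract test over a golden sample can be as small as the sketch below. The `check_contract` helper and the idea of expressing a consumer's expectations as a field-to-type mapping are assumptions for illustration; real setups typically use a dedicated contract-testing tool.

```python
def check_contract(golden_sample: dict, expectations: dict[str, type]) -> list[str]:
    """Compare a producer's golden sample with the fields one consumer relies on."""
    problems = []
    for field, expected_type in expectations.items():
        if field not in golden_sample:
            problems.append(f"missing field: {field}")
        elif not isinstance(golden_sample[field], expected_type):
            actual = type(golden_sample[field]).__name__
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


# Golden sample published by the producer (hypothetical values).
sample = {"payment_id": "p-1", "amount": 1000, "currency": "EUR", "risk_score": 0.2}

# Each consumer declares only the fields it actually reads.
billing_expectations = {"payment_id": str, "amount": int, "currency": str}
```

A CI job would fail the build when `check_contract` returns a non-empty list for any registered consumer.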
Schema evolution: practices
Additive-first: new fields are `optional/nullable` (JSON), `optional` (proto3), or carry a default in Avro.
Inverted pyramid model: the core is stable; enrichments sit alongside it and are optional.
Dual-emit/dual-write for major versions: publish `v1` and `v2` in parallel.
Sunset plan: dates, usage tracking, warnings, adapters.
Tolerant reader: clients ignore unknown fields and handle new enum values correctly.
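A tolerant reader might look like the following sketch. The `Order` and `OrderStatus` types and the `UNKNOWN` fallback are assumptions chosen for illustration: the reader picks out only the fields it knows and maps unrecognized enum values to a safe default instead of failing.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    CREATED = "created"
    PAID = "paid"
    SHIPPED = "shipped"
    UNKNOWN = "unknown"  # fallback for values added after this client shipped


@dataclass
class Order:
    id: str
    status: OrderStatus


def read_order(payload: dict) -> Order:
    """Parse only the known fields; extra keys in the payload are ignored."""
    try:
        status = OrderStatus(payload["status"])
    except ValueError:
        status = OrderStatus.UNKNOWN
    return Order(id=payload["id"], status=status)
```

Required fields (`id`, `status`) still raise `KeyError` when absent, so the reader is tolerant of additions but not of removals.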
Examples of schemas and checks
JSON Schema (fragment, additive field)
json
{
  "$id": "orders.v1.Order",
  "type": "object",
  "required": ["id", "status"],
  "properties": {
    "id": { "type": "string", "format": "uuid" },
    "status": { "type": "string", "enum": ["created", "paid", "shipped"] },
    "risk_score": { "type": "number", "minimum": 0, "maximum": 1 }
  },
  "additionalProperties": true
}
Avro (default for compatibility)
json
{
  "type": "record",
  "name": "PaymentAuthorized",
  "namespace": "payment.v1",
  "fields": [
    { "name": "payment_id", "type": "string" },
    { "name": "amount", "type": "long" },
    { "name": "currency", "type": "string" },
    { "name": "risk_score", "type": ["null", "double"], "default": null }
  ]
}
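To see why the default matters, the sketch below mimics the idea behind Avro schema resolution in plain Python: a reader using the new schema fills in `risk_score` from the default when an old record lacks it. The `resolve` function is a simplification for illustration, not the API of any Avro library.

```python
READER_SCHEMA = {
    "type": "record",
    "name": "PaymentAuthorized",
    "namespace": "payment.v1",
    "fields": [
        {"name": "payment_id", "type": "string"},
        {"name": "amount", "type": "long"},
        {"name": "currency", "type": "string"},
        {"name": "risk_score", "type": ["null", "double"], "default": None},
    ],
}


def resolve(record: dict, reader_schema: dict) -> dict:
    """Project a decoded record onto the reader schema, applying defaults."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]  # field absent in old data
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return resolved


old_record = {"payment_id": "p-1", "amount": 1000, "currency": "EUR"}
```

Without the `default` entry, the same old record would be rejected, which is exactly the BACKWARD break a registry check is meant to catch.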
Protobuf (do not reuse tags)
proto
syntax = "proto3";
package payments.v1;

message CaptureRequest {
  string payment_id = 1;
  int64 amount = 2;
  string currency = 3;
  optional double risk_score = 4; // additive
}
// Tag 4 is reserved for risk_score; it must not be changed or removed without a v2.
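The tag rule can be checked mechanically. The sketch below assumes each message version has already been parsed into a mapping of tag number to `(name, type)`; the `check_tags` helper and that representation are hypothetical, and in practice a tool such as Buf performs this kind of check on the `.proto` files directly.

```python
def check_tags(
    old_fields: dict[int, tuple[str, str]],
    new_fields: dict[int, tuple[str, str]],
    reserved: frozenset[int] = frozenset(),
) -> list[str]:
    """Report tags that were reused or removed without being reserved."""
    problems = []
    for tag, old_field in sorted(old_fields.items()):
        new_field = new_fields.get(tag)
        if new_field is None:
            if tag not in reserved:
                problems.append(f"tag {tag} removed without being reserved")
        elif new_field != old_field:
            problems.append(f"tag {tag} reused: {old_field} -> {new_field}")
    return problems


capture_v1 = {
    1: ("payment_id", "string"),
    2: ("amount", "int64"),
    3: ("currency", "string"),
    4: ("risk_score", "double"),
}
```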
Event registry and partitioning
Naming events: `domain.action.v{major}` (`payment.captured.v1`).
The partitioning key is part of the contract (`payment_id`, `user_id`).
Core vs. enriched: `.v1` (core) and `.enriched.v1` (details).
Registry compatibility: modes are set at the topic/type level; CI rejects incompatible changes.
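The reason the partitioning key belongs to the contract is that it decides which events share a partition and therefore keep their relative order. The sketch below uses a SHA-256 based partitioner purely for illustration; it is not the hash any particular broker uses, and `partition_for` and `route` are hypothetical names.

```python
import hashlib


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a partition with a stable (process-independent) hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def route(event: dict, key_field: str, partitions: int = 12) -> int:
    """Pick the partition for an event based on the contract's key field."""
    return partition_for(str(event[key_field]), partitions)


authorized = {"payment_id": "pay-1", "user_id": "u-1", "type": "authorized"}
captured = {"payment_id": "pay-1", "user_id": "u-2", "type": "captured"}
```

Keyed by `payment_id`, both events land on the same partition and stay ordered; switching the key to `user_id` gives no such guarantee, which is why the change is listed as breaking.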
Migration Management
- Expand → Migrate → Contract (REST/gRPC): 1) add fields/tables; 2) start writing/reading the new fields; 3) delete the old ones after sunset.
- Dual-emit (events): publish `v1` and `v2` in parallel, migrate consumers/projections, then remove `v1`.
- Replay: rebuilding projections from the log against a new schema (only with compatibility and migrators in place).
- Adapters: gateways/proxies that translate `v1↔v2` for complex clients.
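Dual-emit and adapters can be sketched together. The `v2` shape below (amount and currency grouped under `money`) is a made-up breaking change for illustration, and `publish` stands in for whatever producer client is actually in use.

```python
from typing import Callable


def to_v2(event_v1: dict) -> dict:
    """Adapter v1 -> v2 (hypothetical v2 groups amount and currency)."""
    return {
        "payment_id": event_v1["payment_id"],
        "money": {"amount": event_v1["amount"], "currency": event_v1["currency"]},
    }


def to_v1(event_v2: dict) -> dict:
    """Adapter v2 -> v1 for clients that cannot migrate yet."""
    return {
        "payment_id": event_v2["payment_id"],
        "amount": event_v2["money"]["amount"],
        "currency": event_v2["money"]["currency"],
    }


def dual_emit(event_v1: dict, publish: Callable[[str, dict], None]) -> None:
    """Publish both major versions until v1 reaches its sunset date."""
    publish("payment.authorized.v1", event_v1)
    publish("payment.authorized.v2", to_v2(event_v1))
```

Once usage telemetry shows no remaining `v1` consumers, the first `publish` call is removed and the adapter can be retired.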
Security and compliance
PII/PCI labels in the schema: `x-pii: true`, `x-sensitivity: high`.
Access policies: who can publish/modify schemas (RBAC) and sign releases.
Cryptography: signed schema versions, immutable audit logs (WORM).
Right to be forgotten: mark the fields that require encryption/crypto-erasure and keep the guidance in the registry.
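Labels in the schema become useful once tooling reads them. The sketch below assumes the `x-pii` extension sits directly on top-level JSON Schema properties; `pii_fields`, `redact`, and the sample schema are hypothetical helpers for illustration, for example to mask values before logging.

```python
def pii_fields(schema: dict) -> list[str]:
    """List top-level properties labeled with x-pii: true."""
    return sorted(
        name
        for name, spec in schema.get("properties", {}).items()
        if spec.get("x-pii") is True
    )


def redact(payload: dict, schema: dict) -> dict:
    """Return a copy of the payload with PII values masked."""
    hidden = set(pii_fields(schema))
    return {key: ("***" if key in hidden else value) for key, value in payload.items()}


customer_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "email": {"type": "string", "x-pii": True, "x-sensitivity": "high"},
        "country": {"type": "string"},
    },
}
```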
Observability and audit
Dashboards: number of changes, types (minor/major), share of rejected PRs, version usage.
Audit trail: who changed the schema, links to PR/ADR, related release.
Runtime metrics: percentage of messages that failed validation; compatibility incidents.
Tools (sample stack)
OpenAPI/JSON Schema: Spectral, OpenAPI Diff, Schemathesis.
Protobuf/gRPC: Buf (`buf lint`, `buf breaking`), protoc linters.
Avro/Events: Confluent/Redpanda Schema Registry, Avro-tools, Karapace.
GraphQL: GraphQL Inspector, GraphQL Codegen.
Registries/catalogs: Artifact Registry, Git-based registry, Backstage Catalog, custom UI.
Documentation: Redocly/Stoplight, Swagger-UI, GraphiQL.
Anti-patterns
Swagger-wash: the schema does not reflect what the service actually does (or vice versa).
Disabled compatibility check: an "urgent" bypass → production breaks.
Reusing protobuf tags: silent data corruption.
Single compatibility mode "for everything": different domains require different modes.
Raw CDC (change data capture) streams as public schemas: the DB model leaks out and evolution becomes impossible.
Implementation checklist
- Artifact format and compatibility mode are defined per domain.
- Linters and schema-diff are configured in CI; PRs with breaking changes are blocked.
- Tolerant reader is enabled on clients; `additionalProperties: true` (where applicable).
- Major changes go through RFC/ADR, with a sunset plan and dual-emit/dual-write.
- Schemas are labeled with PII/PCI and access levels; auditing is enabled.
- Dashboards for version usage and compatibility failures are in place.
- Generating SDK/models from the registry is part of the pipeline.
- Documentation and golden samples updated automatically.
FAQ
Can I skip the registry and just store schemas in Git?
Yes, but the registry adds compatibility APIs, search, metadata, centralized policy, and on-the-fly validation. The best option is Git as storage + UI/policies on top.
How do I choose compatibility mode?
Look at the direction of change: if the producer extends the payload, use BACKWARD. For a public API/SDK, use FULL. For fast prototypes, temporarily NONE (not on trunk).
What if a breaking change is unavoidable?
Prepare v2: dual-emit/dual-run, sunset dates, adapters, usage telemetry, migration guides.
Do I need to validate payload in runtime?
For critical domains, yes: this prevents junk messages and speeds up diagnostics.
Result
The schema registry turns data evolution from risky improvisation into a managed process: uniform compatibility rules, automated validation, clear versioning, and a transparent history. Add the discipline of additive-first, tolerant reader, dual-emit, and sunset, and your contracts will evolve quickly, without breakage or late-night incidents.