Distributed Tracing: OpenTelemetry

1) Why OTel and what it provides

OpenTelemetry (OTel) is an open standard and a set of SDKs, agents, and collectors for telemetry (traces, metrics, logs), built around a single protocol, OTLP. Goals:
  • End-to-end visibility of request paths (gateway → services → DB/cache/queues).
  • Fast RCA of degradations and debugging of releases (canary/blue-green).
  • Integration with SLOs and automatic rollbacks (data-driven operational decisions).
  • Vendor neutrality: export to any backend without being tied to a single APM.

Core principles: standardize, sample smart, secure by default, correlate everything.

2) Basics: context, spans, attributes

Trace - a tree/graph of calls; Span - a single operation (RPC, SQL query, queue call).
Span Kind: `SERVER`, `CLIENT`, `PRODUCER`, `CONSUMER`, `INTERNAL`.
W3C Trace Context: the `traceparent` and `tracestate` headers; context is propagated between services.
Attributes - key/value pairs (keep cardinality low!), Events - timestamped points within a span, Status - code plus error description.
Links - relationships between spans (important for async/fan-out/fan-in).
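
To make the propagation step concrete, here is a minimal sketch of reading an incoming `traceparent` header and producing the header for an outgoing call. It uses only the standard library and handles just the four-field `version-traceid-parentid-flags` layout; the names `SpanContext`, `parse_traceparent`, and `child_traceparent` are illustrative and are not part of any OTel SDK, which would normally do this work for you.

```python
import re
import secrets
from typing import NamedTuple, Optional

# Four-field layout from W3C Trace Context: version-traceid-parentid-flags
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class SpanContext(NamedTuple):
    trace_id: str  # 32 lowercase hex chars, shared by every span in the trace
    span_id: str   # 16 lowercase hex chars, identifies the calling span
    sampled: bool  # bit 0 of trace-flags


def parse_traceparent(header: str) -> Optional[SpanContext]:
    """Return the caller's context, or None if the header is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # values the spec declares invalid
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


def child_traceparent(parent: SpanContext) -> str:
    """Header for an outgoing call: same trace_id, fresh span_id."""
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex chars
    flags = "01" if parent.sampled else "00"
    return f"00-{parent.trace_id}-{new_span_id}-{flags}"
```

A service that receives a request would call `parse_traceparent` on the incoming header and attach `child_traceparent(...)` to every downstream call, so all spans share one `trace_id`.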

Span naming:
  • HTTP: `HTTP {METHOD}` (with the route, e.g. `GET /withdraw`, recorded as an attribute)
  • DB: `DB SELECT` / `DB INSERT`
  • Queue: `QUEUE publish topic=X` / `QUEUE consume topic=X`

3) Semantic conventions (semconv)

Use stable attribute schemas:
  • HTTP/gRPC: `http.method`, `http.route`, `http.status_code`, `url.full`.
  • DB: `db.system=postgresql`, `db.statement` (sanitized only!), `db.name`.
  • Messaging: `messaging.system=kafka`, `messaging.operation=receive`, `messaging.destination`.
  • Cloud/K8s/Host: `cloud.region`, `k8s.pod.name`, `container.id`.
  • Resource attributes (required): `service.name`, `service.version`, `deployment.environment`.

Pin the schema version via `schemaUrl` in the SDK/Collector resources. Attribute names change between semconv versions (for example, `http.method` became `http.request.method` in the stable HTTP conventions), which is why the version should be explicit.

4) Sampling: head, tail, adaptive

Head-based (in the SDK): decides up front and cheaply; good for high QPS, but may miss the "interesting" traces.
Tail-based (in the Collector): decides after the trace completes; allows rules based on status, latency, and attributes.
Adaptive/dynamic: raises the sampling rate when errors or p95 latency increase.

Production-grade recipe: head sampling at 1-5% globally, plus tail sampling for what matters: `status = ERROR`, `latency > p95`, "money" routes, PSP/KYC errors.
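
The difference between the two decisions can be sketched in a few lines of plain Python. This is an illustration of the idea, not the SDK's or the Collector's actual implementation: `head_keep` makes a deterministic choice from the trace ID alone, while `tail_keep` looks at the finished trace. The 800 ms threshold, the route set, and the use of the slowest span as a stand-in for trace latency are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

MONEY_ROUTES = {"/withdraw", "/deposit"}  # assumed "money" routes
SLOW_MS = 800                             # assumed stand-in for "latency > p95"


def head_keep(trace_id_hex: str, ratio: float) -> bool:
    """Head decision: depends only on the trace ID, so every service agrees."""
    low64 = int(trace_id_hex, 16) & 0xFFFFFFFFFFFFFFFF
    return low64 < int(ratio * 2**64)


@dataclass
class FinishedSpan:
    route: str
    duration_ms: float
    is_error: bool


def tail_keep(trace_id_hex: str, spans: List[FinishedSpan], baseline: float = 0.05) -> str:
    """Tail decision on a completed trace; returns the matching rule or "drop"."""
    if not spans:
        return "drop"
    if any(s.is_error for s in spans):
        return "errors"
    # Simplification: the slowest span approximates the trace latency.
    if max(s.duration_ms for s in spans) > SLOW_MS:
        return "slow_traces"
    if any(s.route in MONEY_ROUTES for s in spans):
        return "important_routes"
    return "baseline_prob" if head_keep(trace_id_hex, baseline) else "drop"
```

The tail rules are checked first and the probabilistic baseline last, so an error or a slow withdrawal is kept no matter how the random draw would have gone.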

5) Correlation: metrics, logs, traces

Exemplars: `trace_id` markers on metric histograms (a quick jump to the trace).
Logs: add `trace_id`/`span_id` to every record and jump from logs to the trace.
SpanMetrics (processor): aggregates RED metrics (`requests, errors, duration`) from traces for SLOs/alerts.
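
For the log side of this correlation, a minimal sketch with the standard `logging` module is shown below. The `current_ids` context variable is a hypothetical stand-in for "the active span"; in a real service those IDs would come from the tracing SDK. The logger name `payments` is likewise only an example.

```python
import contextvars
import io
import logging

# Hypothetical holder for the active span's IDs; a real service would read
# them from its tracing SDK instead.
current_ids = contextvars.ContextVar("current_ids", default=("-", "-"))


class TraceContextFilter(logging.Filter):
    """Copy the active trace_id/span_id onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id, record.span_id = current_ids.get()
        return True  # never drop the record, only enrich it


def build_logger(stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
    ))
    logger = logging.getLogger("payments")
    logger.handlers = [handler]
    logger.addFilter(TraceContextFilter())
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
current_ids.set(("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"))
log.info("withdraw accepted")
```

With both IDs in every line, a log search by `trace_id` returns exactly the records that belong to one trace.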

6) Deployment architecture

Agent (DaemonSet) on every node collects telemetry from applications (OTLP) and forwards it.
Gateway (Cluster/Region) - central Collector (multiple replicas) with routing/sampling/enrichment pipelines.
OTLP: gRPC `4317`, HTTP `4318`; enable TLS/mTLS.

Advantages of "agent + gateway": isolation, buffering, local backpressure, simpler networking.

7) OpenTelemetry Collector - base template (gateway)

```yaml
# Gateway Collector. tail_sampling ships with the Collector contrib distribution.
receivers:
  otlp:
    protocols:
      grpc: { endpoint: 0.0.0.0:4317 }
      http: { endpoint: 0.0.0.0:4318 }

processors:
  memory_limiter: { check_interval: 5s, limit_percentage: 75 }
  batch: { timeout: 2s, send_batch_size: 8192 }
  attributes:
    actions:
      - key: deployment.environment
        action: upsert
        value: prod
  resource:
    attributes:
      - key: service.namespace
        action: upsert
        value: core
  tail_sampling:
    decision_wait: 5s
    policies:
      - name: errors
        type: status_code
        status_code: { status_codes: [ERROR] }
      - name: slow_traces
        type: latency
        latency: { threshold_ms: 800 }
      - name: important_routes
        type: string_attribute
        string_attribute:
          key: http.route
          values: ["/withdraw", "/deposit"]
      - name: baseline_prob
        type: probabilistic
        probabilistic: { sampling_percentage: 5 }

exporters:
  otlp/apm:
    endpoint: apm-backend:4317
    tls: { insecure: true }  # plaintext for the demo; use TLS/mTLS certificates in production
  prometheus:
    endpoint: 0.0.0.0:9464

extensions:
  health_check: {}
  pprof: { endpoint: 0.0.0.0:1777 }
  zpages: { endpoint: 0.0.0.0:55679 }

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    # memory_limiter goes first; batch goes after tail_sampling so only kept traces are batched.
    traces:  { receivers: [otlp], processors: [memory_limiter, attributes, resource, tail_sampling, batch], exporters: [otlp/apm] }
    metrics: { receivers: [otlp], processors: [batch], exporters: [prometheus] }
    # A pipeline needs at least one exporter; enable logs once a log backend exporter is configured.
    # logs:  { receivers: [otlp], processors: [batch], exporters: [] }
```

8) SpanMetrics and RED for SLOs

Add the processor:
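```yaml
# Keys are kept as in the original snippet. In recent Collector contrib releases
# spanmetrics is a connector rather than a processor, and the key names differ
# between the two, so verify this against your Collector version.
processors:
  spanmetrics:
    metrics_exporter: prometheus
    histogram:
      explicit:
        buckets: [50ms, 100ms, 200ms, 400ms, 800ms, 1600ms, 3200ms]

service:
  pipelines:
    # spanmetrics runs before tail_sampling so the RED metrics cover all spans.
    traces:  { receivers: [otlp], processors: [spanmetrics, tail_sampling, batch], exporters: [otlp/apm] }
    metrics: { receivers: [otlp], processors: [batch], exporters: [prometheus] }
```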

`traces_spanmetrics_calls{service, route, code}` and `duration_bucket` are now available for SLOs and alerts.
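
What this aggregation does can be illustrated with a small standard-library sketch. It is not the Collector's code: `RedAggregator` is a hypothetical class that counts calls and errors per `(service, route)` and places each span duration into a bucket using the bucket bounds listed in the snippet above. It stores per-bucket counts, whereas Prometheus exposes them cumulatively.

```python
from bisect import bisect_left
from collections import defaultdict

BUCKETS_MS = [50, 100, 200, 400, 800, 1600, 3200]  # upper bounds, in milliseconds


class RedAggregator:
    """Rate, Errors, Duration derived from finished spans."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.errors = defaultdict(int)
        # One extra slot at the end plays the role of the +Inf bucket.
        self.histogram = defaultdict(lambda: [0] * (len(BUCKETS_MS) + 1))

    def observe(self, service, route, duration_ms, is_error):
        key = (service, route)
        self.calls[key] += 1
        if is_error:
            self.errors[key] += 1
        # bisect_left finds the first bucket whose upper bound is >= duration.
        self.histogram[key][bisect_left(BUCKETS_MS, duration_ms)] += 1

    def error_rate(self, service, route):
        key = (service, route)
        return self.errors[key] / self.calls[key] if self.calls[key] else 0.0
```

Keeping only low-cardinality keys such as service and route is what makes these series safe to alert on.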

9) K8s: Collector (DaemonSet + Deployment)

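Agent (DaemonSet) fragment:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata: { name: otel-agent, namespace: observability }
spec:
  selector:
    matchLabels: { app: otel-agent }  # label name is an assumption; a selector is required
  template:
    metadata:
      labels: { app: otel-agent }
    spec:
      containers:
        - name: otelcol
          image: otel/opentelemetry-collector:latest  # pin a specific version in production
          args: ["--config=/conf/agent.yaml"]
          ports:
            - { containerPort: 4317, name: otlp-grpc }
            - { containerPort: 4318, name: otlp-http }
```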

Gateway (Deployment) - multiple replicas, Service ClusterIP/Ingress, HPA on CPU/QPS.

10) Security and privacy

TLS/mTLS between SDK → Agent → Gateway → Backend.
Authentication at the gateway ingress (Basic/OAuth/Headers); restrict allowed origins.
PII redaction: filter/mask attributes (`user.email`, `card.`) in a Collector processor.
Limits: cap event size and attribute count in the SDK (protection against cardinality growth).
RBAC in the backend, plus separate namespaces for projects/tenants.

Collector filter example:
```yaml
processors:
  attributes/redact:
    actions:
      - key: user.email
        action: delete
      - key: payment.card
        action: delete
```

11) Instrumentation: quick starts

Node.js

```js
// Written against the 1.x JS SDK packages; newer releases rename some of these exports.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-grpc";
import { Resource } from "@opentelemetry/resources";
import { SemanticResourceAttributes as R } from "@opentelemetry/semantic-conventions";

const sdk = new NodeSDK({
  // Send spans to the node-local agent over OTLP/gRPC.
  traceExporter: new OTLPTraceExporter({ url: "http://otel-agent.observability:4317" }),
  resource: new Resource({
    [R.SERVICE_NAME]: "payments-api",
    [R.SERVICE_VERSION]: "1.14.2",
    [R.DEPLOYMENT_ENVIRONMENT]: "prod",
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();
```

Java (Spring)

```yaml
# Gradle: io.opentelemetry.instrumentation:opentelemetry-spring-boot-starter
# application.yml
otel:
  service:
    name: orders-api
  exporter:
    otlp:
      endpoint: http://otel-agent.observability:4317
  traces:
    sampler: parentbased_traceidratio
    sampler-arg: 0.05
```

Python (FastAPI)

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider(resource=Resource.create({"service.name": "fraud-scoring", "deployment.environment": "prod"}))
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-agent.observability:4317", insecure=True)))
trace.set_tracer_provider(provider)
```

Go

```go
// Fragment: assumes ctx plus the log, otel, otlptracegrpc, resource, sdktrace and semconv imports.
exp, err := otlptracegrpc.New(ctx, otlptracegrpc.WithEndpoint("otel-agent.observability:4317"), otlptracegrpc.WithInsecure())
if err != nil {
	log.Fatalf("create OTLP exporter: %v", err)
}
res := resource.NewWithAttributes(semconv.SchemaURL, semconv.ServiceNameKey.String("gateway"), semconv.DeploymentEnvironmentKey.String("prod"))
tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp), sdktrace.WithResource(res), sdktrace.WithSampler(sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.05))))
otel.SetTracerProvider(tp)
```

12) Async: queues, buses, cron

PRODUCER/CONSUMER spans are related via `links`.
Propagate context in message headers (`traceparent`/`baggage`).
For batch consumption, create a span per message or aggregate into one span with the `messaging.batch.size` attribute.
For cron/jobs: a new trace per run, plus links to the initiating events (if any).
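
The producer/consumer relationship can be sketched without any broker. In this standard-library illustration the `Span` dataclass, `publish`, and `consume_batch` are hypothetical helpers, not SDK APIs: the producer writes its context into the message headers, and the batch consumer starts its own trace and records one link per message instead of pretending to be a child of any single producer.

```python
import secrets
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Span:
    name: str
    kind: str
    trace_id: str = field(default_factory=lambda: secrets.token_hex(16))
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    links: List[Tuple[str, str]] = field(default_factory=list)  # (trace_id, span_id)
    attributes: Dict[str, Any] = field(default_factory=dict)


def publish(topic: str, body: bytes) -> Dict[str, Any]:
    """PRODUCER side: write the span's context into the message headers."""
    span = Span(f"QUEUE publish topic={topic}", "PRODUCER")
    headers = {"traceparent": f"00-{span.trace_id}-{span.span_id}-01"}
    return {"topic": topic, "headers": headers, "body": body}


def consume_batch(topic: str, messages: List[Dict[str, Any]]) -> Span:
    """CONSUMER side: one span for the batch, linked to every producer span."""
    span = Span(f"QUEUE consume topic={topic}", "CONSUMER")
    span.attributes["messaging.batch.size"] = len(messages)
    for msg in messages:
        parts = msg["headers"].get("traceparent", "").split("-")
        if len(parts) == 4:  # skip messages that carry no usable context
            span.links.append((parts[1], parts[2]))
    return span
```

Links fit the batch case because a batch has many causes: a parent/child relation can name only one of them.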

13) Baggage and targeting

Keep a minimal set of stable keys (`tenant_id`, `region`, `vip_tier`) in baggage; PII is forbidden.
Pass baggage through the gateway/gateway logger so metrics can later be aggregated by segment.
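
One way to enforce the "minimal stable keys, no PII" rule at the edge is an allowlist applied to the incoming `baggage` header. The sketch below is a simplified standard-library parser (it drops member properties after `;`), and `ALLOWED_KEYS`, `parse_baggage`, and `filter_baggage` are names chosen for the example.

```python
from typing import Dict
from urllib.parse import quote, unquote

ALLOWED_KEYS = ("tenant_id", "region", "vip_tier")  # the stable keys named above


def parse_baggage(header: str) -> Dict[str, str]:
    """Parse a W3C `baggage` header into a dict, ignoring member properties."""
    entries: Dict[str, str] = {}
    for member in header.split(","):
        key_value = member.split(";", 1)[0]  # drop ";property" suffixes
        if "=" not in key_value:
            continue
        key, value = key_value.split("=", 1)
        entries[key.strip()] = unquote(value.strip())
    return entries


def filter_baggage(header: str) -> str:
    """Re-serialize only allowlisted keys so PII cannot ride along."""
    entries = parse_baggage(header)
    kept = [f"{k}={quote(entries[k], safe='')}" for k in ALLOWED_KEYS if k in entries]
    return ",".join(kept)
```

For example, `filter_baggage("tenant_id=acme, user.email=a%40b.com;ttl=1, region=eu-west-1")` keeps only the tenant and region entries.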

14) Integration with releases and SLO gating

Canary steps → check the `traces_spanmetrics_` metrics by route/user segment.
On degradation (5xx/p95): auto-stop and rollback (Argo Rollouts AnalysisTemplate + PromQL).
Exemplars on the metrics lead directly to the "bad" traces within the release window.
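
The gating decision itself is simple enough to sketch. The function below is an illustration only, not what Argo Rollouts runs: `RedSnapshot` and `canary_verdict` are hypothetical names, and the thresholds (1 percentage point of extra errors, 20% slower p95, at least 100 requests) are assumptions to be replaced by your own SLOs.

```python
from dataclasses import dataclass


@dataclass
class RedSnapshot:
    requests: int
    errors: int
    p95_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_verdict(
    baseline: RedSnapshot,
    canary: RedSnapshot,
    max_error_delta: float = 0.01,  # assumed: +1 percentage point of errors
    max_p95_ratio: float = 1.2,     # assumed: p95 at most 20% slower
    min_requests: int = 100,        # assumed: minimum traffic for a decision
) -> str:
    """Compare canary RED metrics with the stable version."""
    if canary.requests < min_requests:
        return "wait"  # not enough data to judge either way
    if canary.error_rate - baseline.error_rate > max_error_delta:
        return "rollback"
    if baseline.p95_ms > 0 and canary.p95_ms / baseline.p95_ms > max_p95_ratio:
        return "rollback"
    return "promote"
```

Comparing against the live baseline rather than a fixed number keeps the gate meaningful when overall traffic or latency shifts during the rollout.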

15) Limits and performance

Set limits: `OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT`, `OTEL_SPAN_EVENT_COUNT_LIMIT`, `OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT`.
Sample exception stack traces by probability/frequency.
Use the batch processor in both the SDK and the Collector; keep a queue so traces are not lost during bursts.
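
The effect of the two attribute limits can be shown with a short sketch. SDKs apply these limits internally; the helper functions `limits_from_env` and `apply_limits` below are hypothetical and only mimic the behaviour: attributes beyond the count limit are dropped and string values are truncated. The defaults used (128 attributes, no length limit) follow the OpenTelemetry specification.

```python
import os
from typing import Any, Dict, Mapping, Optional, Tuple


def limits_from_env(env: Mapping[str, str] = os.environ) -> Tuple[int, Optional[int]]:
    """Read the count and value-length limits, falling back to spec defaults."""
    count = int(env.get("OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT", "128"))
    raw_length = env.get("OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT")
    return count, (int(raw_length) if raw_length else None)


def apply_limits(
    attributes: Dict[str, Any],
    count_limit: int,
    value_length_limit: Optional[int],
) -> Tuple[Dict[str, Any], int]:
    """Return the attributes that fit, plus how many were dropped."""
    limited: Dict[str, Any] = {}
    dropped = 0
    for key, value in attributes.items():
        if len(limited) >= count_limit:
            dropped += 1  # over the count limit: drop, but keep a tally
            continue
        if isinstance(value, str) and value_length_limit is not None:
            value = value[:value_length_limit]  # truncate long strings
        limited[key] = value
    return limited, dropped
```

Exposing the dropped count is useful in practice: a steadily non-zero value points at a service that is attaching too much to its spans.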

16) Compatibility and migration

Propagators: use W3C; during migration, also support reading B3/X-Ray (dual propagation).
Export: OTLP → APM (Jaeger/Tempo/Elastic/X-Ray, etc.).
Stable semconv versions: pin `schemaUrl` and plan upgrades.

17) Anti-patterns

High-cardinality attributes (`user_id` in labels, dynamic keys).
No link between logs and `trace_id`.
Exporting directly from applications to an Internet-facing APM (no gateway, no TLS/mTLS).
Collecting 100% of traces in production: expensive and pointless.
Dumping SQL queries with user data into `db.statement`.
Inconsistent service names/versions: metrics become fragmented.

18) Rollout checklist (0-45 days)

Days 0-10

Enable the SDK/auto-instrumentation on 2-3 critical services.
Deploy Agent (DaemonSet) + Gateway (Deployment); configure OTLP 4317/4318 with TLS.
Add `service.name`, `service.version`, `deployment.environment` everywhere.

Days 11-25

Tail sampling for errors, latency, and "money" routes.
SpanMetrics → Prometheus; enable Exemplars and RED/SLO dashboards.
Propagate W3C context through the API gateway/NGINX/mesh; correlate logs.

Days 26-45

Cover queues/DB/cache; links for async.
PII redaction policies in the Collector; attribute limits in the SDK.
Integrate SLO-gated releases and automatic rollback.

19) Maturity metrics

Tracing coverage of incoming requests ≥ 95% (accounting for head/tail sampling).
Share of metrics with Exemplars ≥ 80%.
RCA time from metric to trace ≤ 2 minutes (p50).
0 PII leaks in attributes/events (scanner).
All services have `service.name/version/environment` and agreed semantics.

20) Appendix: useful snippets

NGINX propagation:
```nginx
proxy_set_header traceparent $http_traceparent;
proxy_set_header tracestate  $http_tracestate;
proxy_set_header baggage     $http_baggage;
```

Prometheus with Exemplars (Grafana):
```promql
histogram_quantile(0.95, sum(rate(traces_spanmetrics_duration_bucket{route="/withdraw"}[5m])) by (le))
```

Policy: forbid PII attributes (pseudo-linter)
```yaml
forbid_attributes:
  - user.email
  - payment.card
  - personal.
```
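
The pseudo-linter policy can be turned into a small check that runs in CI or in a test suite. The sketch below is one possible reading of the policy, not an existing tool: it assumes that an entry ending in `.` (such as `personal.`) is meant as a prefix and that every other entry must match the attribute key exactly. `is_forbidden` and `lint_span` are hypothetical names.

```python
from typing import Any, Dict, Iterable, List

FORBIDDEN = ["user.email", "payment.card", "personal."]  # entries from the policy


def is_forbidden(attribute_key: str, rules: Iterable[str] = FORBIDDEN) -> bool:
    """Assumption: a rule ending in "." is a prefix; anything else is an exact key."""
    for rule in rules:
        if rule.endswith("."):
            if attribute_key.startswith(rule):
                return True
        elif attribute_key == rule:
            return True
    return False


def lint_span(attributes: Dict[str, Any]) -> List[str]:
    """Return the offending attribute keys, sorted for stable output."""
    return sorted(key for key in attributes if is_forbidden(key))
```

A span carrying `user.email` and `personal.phone` would be reported with both keys, while `user.email_hash` passes because the exact-match rule does not cover it.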

21) Summary

OpenTelemetry turns observability into a standardized, manageable pipeline: unified semantics, secure propagation, smart sampling, and strong correlation with metrics and logs. Set up agent + gateway, add tail sampling, spanmetrics, and Exemplars, and keep PII and cardinality under control. Tracing then becomes a tool not only for debugging but also for automated SRE/release decisions, reducing MTTR and the risk of each release.
