GH GambleHub

Audit and logging tools

1) Why you need it

Objectives:
  • Traceability of actions (who/what/when/where/why).
  • Rapid incident investigations and forensics.
  • Regulatory and customer compliance.
  • Risk management and MTTR reduction in incidents.
  • Support for risk, anti-fraud, compliance models (KYC/AML/RTBF/Legal Hold).
Key principles:
  • Completeness of source coverage.
  • Record immutability and integrity.
  • Standardized event schemas.
  • Searchability and correlation.
  • Minimization of personal data and privacy control.

2) Tool landscape

2.1 Log management and indexing

Collection/agents: Fluent Bit/Fluentd, Vector, Logstash, Filebeat/Winlogbeat, OpenTelemetry Collector.
Storage and search: Elasticsearch/OpenSearch, Loki, ClickHouse, Splunk, Datadog Logs.
Streaming/buses: Kafka/Redpanda, NATS, Pulsar - for buffering and fan-out.
Parsing and normalization: Grok/regex, OTel processors, Logstash pipelines.

2.2 SIEM/Detect & Respond

SIEM: Splunk Enterprise Security, Microsoft Sentinel, Elastic Security, QRadar.
UEBA/behavioral analysis: modules built into the SIEM, ML detectors.
SOAR/orchestration: Cortex XSOAR, Tines, Shuffle - playbook automation.

2.3 Audit and immutability

Subsystem audit: Linux auditd/ausearch, Windows Event Logs, DB audit (pgAudit, MySQL audit), Kubernetes Audit Logs, CloudTrail/CloudWatch/Azure Monitor/GCP Cloud Logging.
Immutable storage: WORM buckets (Object Lock), S3 Glacier Vault Lock, write-once volumes, logs with a cryptographic signature/hash chain.
TSA/timestamps: binding to NTP/PTP, periodic anchoring of hashes to an external trusted time source.

2.4 Observability and traces

Metrics/traces: Prometheus + Tempo/Jaeger/OTel, correlation of logs ↔ traces by trace_id/span_id.
Dashboards and alerts: Grafana/Kibana/Datadog.
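
Log-to-trace correlation only works if every log line carries the trace context. Below is a minimal sketch using Python's standard `logging` module; the class name `JsonTraceFormatter`, the helper `make_logger` and the output field names are illustrative assumptions, not part of any specific stack listed above.

```python
import json
import logging
from datetime import datetime, timezone


class JsonTraceFormatter(logging.Formatter):
    """Render each record as one JSON line that carries trace_id/span_id."""

    def format(self, record):
        return json.dumps({
            # UTC timestamp derived from the record's creation time
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "severity": record.levelname.lower(),
            "message": record.getMessage(),
            # Values passed via `extra=` become attributes on the record
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
        })


def make_logger(name):
    """Hypothetical helper: a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonTraceFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `make_logger("auth-api").warning("login failed", extra={"trace_id": "4bf92f35", "span_id": "00f067aa"})` would then emit a line that can be joined with the matching trace in the tracing backend.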


3) Event sources (coverage scope)

Infrastructure: OS (syslog, auditd), containers (Docker), orchestration (Kubernetes Events + Audit), network devices, WAF/CDN, VPN, IAM.
Applications and APIs: API gateway, service mesh, web servers, backends, queues, schedulers, webhooks.
Databases and storage: queries, DDL/DML, access to secrets/keys, access to object storage.
Payment integrations: PSP/acquiring, chargeback events, 3DS.
Operations and processes: console and CI/CD logins, admin panels, configuration/feature flag changes, releases.
Security: IDS/IPS, EDR/AV, vulnerability scanners, DLP.
User events: authentication, login attempts, KYC status change, deposits/withdrawals, bets/games (with anonymization if necessary).


4) Data schemas and standards

Unified event model: `timestamp`, `event.category`, `event.action`, `user.id`, `subject.id`, `source.ip`, `http.request_id`, `trace.id`, `service.name`, `environment`, `severity`, `outcome`, `labels.*`.
Schema standards: ECS (Elastic Common Schema), OCSF (Open Cybersecurity Schema Framework), OpenTelemetry Logs.
Correlation keys: `trace_id`, `session_id`, `request_id`, `device_id`, `k8s.pod_uid`.
Quality: required fields, validation, deduplication, sampling for "noisy" sources.
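
As an illustration of the "required fields, validation" point, here is a minimal validator sketch. The choice of which fields are mandatory (`REQUIRED_FIELDS`) is an assumption made for the example; a real deployment would derive it from the agreed schema (ECS/OCSF).

```python
from datetime import datetime

# Assumed minimal set of mandatory fields (dotted paths from the unified model).
REQUIRED_FIELDS = ("timestamp", "event.category", "event.action", "service.name", "severity")


def get_path(event, dotted):
    """Return the value at a dotted path such as 'event.action', or None if absent."""
    node = event
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def validate_event(event):
    """Return a list of problems; an empty list means the event passed."""
    problems = [f"missing field: {p}" for p in REQUIRED_FIELDS if get_path(event, p) is None]
    ts = get_path(event, "timestamp")
    if ts is not None:
        try:
            # fromisoformat() in older Python versions does not accept a trailing 'Z'
            datetime.fromisoformat(str(ts).replace("Z", "+00:00"))
        except ValueError:
            problems.append("timestamp is not ISO 8601")
    return problems
```

A check like this can run in CI against parser fixtures and at ingest, so that events failing validation are routed to a dead-letter stream instead of silently polluting the index.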


5) Reference architecture

1. Collection on nodes/agents →

2. Pre-processing (parsing, PII redaction, normalization) →

3. Bus (Kafka) with retention ≥ 3-7 days →

4. Stream fan-out:
  • Online storage (search/correlation, hot storage 7-30 days).
  • Immutable archive (WORM/Glacier, 1-7 years for audit).
  • SIEM (detection and incidents).

5. Dashboards/search (operations, security, compliance).

6. SOAR for response automation.
Storage layers:
  • Hot: SSD/indexing, fast search (rapid response).
  • Warm: compression/less frequent access.
  • Cold/Archive (WORM): cheap, immutable long-term storage.

6) Immutability, integrity, trust

WORM/Object Lock: blocks deletion and modification for the duration of the policy.
Cryptographic signature and hash chain: applied per batch/chunk of logs.
Hash anchoring: periodic publication of hashes to an external registry or trusted timestamping service.
Time synchronization: NTP/PTP, drift monitoring; recording `clock.source`.
Change control: four-eyes/dual control for retention/Legal Hold policies.
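
The hash chain over batches can be sketched as follows. This is a simplified illustration under stated assumptions: SHA-256, canonical JSON serialization, and an all-zero genesis value (`GENESIS`) are choices made for the example; signing the resulting hashes and anchoring them externally are separate steps not shown here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value that precedes the first batch


def batch_digest(prev_hash, records):
    """Hash a batch together with the previous batch's hash."""
    # Canonical serialization so the same records always give the same bytes
    payload = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def build_chain(batches):
    """Return one chained hash per batch, in order."""
    chain, prev = [], GENESIS
    for records in batches:
        prev = batch_digest(prev, records)
        chain.append(prev)
    return chain


def verify_chain(batches, chain):
    """Recompute the chain from the stored batches and compare with stored hashes."""
    return build_chain(batches) == chain
```

Because each hash depends on the previous one, changing a record in any batch alters every later hash in the chain, so anchoring only the latest hash externally is enough to detect tampering with earlier batches.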


7) Privacy and compliance

PII minimization: store only the necessary fields, redact/mask at ingest.
Pseudonymization: `user.pseudo_id`; the mapping is stored separately with restricted access.
GDPR/DSAR/RTBF: source classification, controlled logical deletion/hiding in replicas, exceptions for legal retention obligations.
Legal Hold: "freeze" tags, suspension of deletion in archives; a log of all actions related to the hold.
Mapping to standards: ISO 27001 A.8/12/15, SOC 2 CC7, PCI DSS Req. 10, local market regulation.
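
A minimal sketch of masking and pseudonymization at ingest. The approach shown (keyed HMAC-SHA256 for `user.pseudo_id`, zeroing the last IPv4 octet) is one possible choice, not a prescribed method; the function names and the `p_` prefix length are assumptions for the example.

```python
import copy
import hashlib
import hmac


def pseudonymize(user_id, key):
    """Derive a stable pseudonym; the key must live outside the log platform."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "p_" + digest[:16]


def mask_ipv4(ip):
    """Zero the last octet of an IPv4 address; other values are returned unchanged."""
    parts = ip.split(".")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return ".".join(parts[:3] + ["0"])
    return ip


def redact_event(event, key):
    """Return a copy of the event with user.id replaced and source.ip masked."""
    out = copy.deepcopy(event)
    user = out.get("user", {})
    if "id" in user:
        user["pseudo_id"] = pseudonymize(user.pop("id"), key)
    source = out.get("source", {})
    if "ip" in source:
        source["ip"] = mask_ipv4(source["ip"])
    return out
```

Using a keyed hash rather than a plain hash means the pseudonym cannot be recomputed from a known user ID without the key, which supports keeping the mapping separate and restricted.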


8) Operations and processes

8.1 Playbooks/Runbooks

Source loss: how to identify (heartbeats), how to restore (replay from the bus), how to compensate for gaps.
Growing delays: check queues, sharding, indexes, backpressure.
Investigation of event X: KQL/ES-query template + link to the trace context.
Legal Hold: who places it, how to lift it, how to document it.
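
Source loss detection by heartbeat can be sketched as below. The 5-minute threshold and the source names in the usage note are hypothetical; in practice `last_seen` would be fed from the last event timestamp per source in the pipeline.

```python
from datetime import datetime, timedelta, timezone


def silent_sources(last_seen, now, max_silence=timedelta(minutes=5)):
    """Return the names of sources whose last heartbeat is older than max_silence."""
    return sorted(name for name, ts in last_seen.items() if now - ts > max_silence)
```

For example, with `now = datetime(2025, 10, 31, 12, 0, tzinfo=timezone.utc)` and a source `"psp-gateway"` last seen 12 minutes earlier, the function reports that source, which is the trigger for the replay-from-bus step of the runbook.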

8.2 RACI (in brief)

R (Responsible): Observability team for collection/delivery; SecOps for detection rules.
A (Accountable): CISO/Head of Ops for policies and budget.
C (Consulted): DPO/Legal for privacy; Architecture for schemas.
I (Informed): Support/Product/Risk Management.


9) Quality Metrics (SLO/KPI)

Coverage: % of critical sources connected (target ≥ 99%).
Ingest lag: p95 delivery delay (< 30 sec).
Indexing success: proportion of events with no parsing errors (> 99.9%).
Search latency: p95 < 2 sec for typical queries over a 24h window.
Drop rate: loss of events < 0.01%.
Alert fidelity: Precision/Recall by rules, share of false positives.
Cost per GB: Storage/index cost per period.
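
Two of these metrics can be computed with a few lines. The sketch below assumes the nearest-rank definition of p95 and takes the thresholds from the targets above (30 sec lag, 0.01% drop rate); the function names are illustrative.

```python
def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    # Integer form of ceil(0.95 * n), avoiding floating-point rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def drop_rate(sent, received):
    """Share of events lost between producer and storage, in percent."""
    return 0.0 if sent == 0 else 100.0 * (sent - received) / sent


def check_slo(lags_sec, sent, received):
    """Compare measured values with the targets listed above."""
    return {
        "ingest_lag_ok": p95(lags_sec) < 30,
        "drop_rate_ok": drop_rate(sent, received) < 0.01,
    }
```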


10) Retention policies (example)

Category                   | Hot  | Warm | Archive (WORM) | Total
Admin panel audit          | 14 d | 90 d | 5 years        | 5 years
Payment events             | 7 d  | 60 d | 7 years        | 7 years
Technical application logs | 3 d  | 30 d | 1 year         | 1 year
Security (IDS/EDR)         | 14 d | 90 d | 2 years        | 2 years

The exact periods are set by Legal/DPO in line with local regulations.
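
The example retention values can be expressed as policy-as-code. The sketch below rests on two assumptions that the table does not state: the Hot and Warm values are treated as cumulative age limits (warm ends at day 90, not 90 days after hot), and a year is approximated as 365 days. The category keys are hypothetical names.

```python
# (hot_days, warm_days, total_days) per category, taken from the example values.
RETENTION_DAYS = {
    "admin_audit": (14, 90, 5 * 365),
    "payments": (7, 60, 7 * 365),
    "app_logs": (3, 30, 365),
    "security": (14, 90, 2 * 365),
}


def storage_tier(category, age_days):
    """Return the tier where an event of the given age is expected to be found."""
    hot, warm, total = RETENTION_DAYS[category]
    if age_days <= hot:
        return "hot"
    if age_days <= warm:
        return "warm"
    if age_days <= total:
        return "archive"
    return "expired"
```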


11) Detection and alerts (skeleton)

Rules (rule-as-code):
  • Suspicious authentication (impossible travel, TOR, repeated failures).
  • Escalation of privileges/roles.
  • Configuration/secret changes outside the release schedule.
  • Abnormal transaction patterns (AML/anti-fraud signals).
  • Mass data uploads (DLP triggers).
  • Reliability: bursts of 5xx errors, latency degradation, multiple pod restarts.
Contexts:
  • Enrichment with geo/IP reputation, linking to releases/feature flags, linking to traces.
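
One of the rules above, repeated authentication failures, can be sketched as rule-as-code. The flat event keys (`action`, `outcome`, `source_ip`), the threshold of 5 and the 10-minute window are assumptions for the example; a SIEM would express the same logic in its own rule language.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def failed_login_bursts(events, threshold=5, window=timedelta(minutes=10)):
    """Return source IPs with at least `threshold` failed logins inside a sliding window."""
    recent = defaultdict(deque)  # source_ip -> timestamps of recent failures
    flagged = set()
    for e in sorted(events, key=lambda e: e["timestamp"]):
        if e["action"] != "login" or e["outcome"] != "failure":
            continue
        q = recent[e["source_ip"]]
        q.append(e["timestamp"])
        # Drop failures that have fallen out of the window
        while e["timestamp"] - q[0] > window:
            q.popleft()
        if len(q) >= threshold:
            flagged.add(e["source_ip"])
    return sorted(flagged)
```

Keeping rules as plain functions with fixtures makes them testable in CI, which is the main practical benefit of the rule-as-code approach.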

12) Log access security

RBAC and segregation of duties: separate roles for readers/analysts/admins.
Just-in-time access: temporary tokens, audit of all reads of "sensitive" indexes.
Encryption: in-transit (TLS), at-rest (KMS/CMK), key isolation.
Secrets and keys: rotation, limiting the export of events with PII.


13) Implementation Roadmap

MVP (4-6 weeks):

1. Source catalog + minimum schema (ECS/OCSF).

2. Agent on nodes + OTel Collector; centralized parsing.

3. Storage Hot (OpenSearch/Elasticsearch/Loki) + dashboards.

4. Basic alerts (authentication, 5xx, config changes).

5. Archive in object storage with Object Lock (WORM).

Phase 2:
  • Kafka as a bus, replay, retry queue.
  • SIEM + first correlation rules, SOAR playbooks.
  • Cryptographic signing of batches, hash anchoring.
  • Legal Hold policies, DSAR/RTBF procedures.
Phase 3:
  • UEBA/ML detection.
  • Data Catalog, lineage.
  • Cost optimization: sampling "noisy" logs, tiering.

14) Frequent mistakes and how to avoid them

Log noise without a schema: → introduce mandatory fields and sampling.
No traces: → propagate trace_id through core services and proxies.
A single "monolith" of logs: → split by domain and criticality level.
No immutability: → enable WORM/Object Lock and signing.
Secrets in the logs: → filters/redactors, token scanners, reviews.
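
A filter for secrets in log messages can be sketched with regular expressions. The two patterns below are illustrative assumptions only; a real deployment would maintain its own pattern list and pair it with token scanners and reviews.

```python
import re

# Hypothetical patterns: a bearer token header and common key=value secrets.
SECRET_PATTERNS = [
    re.compile(r"(authorization:\s*bearer\s+)[A-Za-z0-9._~+/=-]+", re.IGNORECASE),
    re.compile(r"((?:password|passwd|api_key|token)\s*[=:]\s*)[^\s&,;]+", re.IGNORECASE),
]


def scrub(message):
    """Replace secret values with a placeholder, keeping the key for context."""
    for pattern in SECRET_PATTERNS:
        message = pattern.sub(lambda m: m.group(1) + "[REDACTED]", message)
    return message
```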


15) Launch checklist

  • Source register prioritized by criticality.
  • Unified schema and validators (CI for parsers).
  • Agent strategy (daemonset in k8s, Beats/OTel).
  • Bus and retention.
  • Hot/Cold/Archive + WORM
  • RBAC, encryption, access log.
  • Basic alerts and SOAR playbooks.
  • Dashboards for Ops/Sec/Compliance.
  • DSAR/RTBF/Legal Hold policies.
  • KPI/SLO + storage budget.

16) Examples of events (simplified)

```json
{
  "timestamp": "2025-10-31T19:20:11.432Z",
  "event": {"category": "authentication", "action": "login", "outcome": "failure"},
  "user": {"id": "u_12345", "pseudo_id": "p_abcd"},
  "source": {"ip": "203.0.113.42"},
  "http": {"request_id": "req-7f91"},
  "trace": {"id": "2fe1…"},
  "service": {"name": "auth-api", "environment": "prod"},
  "labels": {"geo": "EE", "risk_score": 72},
  "severity": "warning"
}
```

17) Glossary (brief)

Audit trail - an immutable sequence of records of a subject's actions.
WORM - write-once, read-many storage mode.
SOAR - automation of incident response through playbooks.
UEBA - user and entity behavior analytics.
OCSF/ECS/OTel - standards for log schemas and telemetry.


18) The bottom line

An audit and logging system is not a "log stack" but a managed program with a clear data schema, an immutable archive, correlation, and response playbooks. Following the principles in this article improves observability, speeds up investigations, and meets the key requirements of Operations and Compliance.
