Audit Trails and Access Traces
1) Purpose and scope
Purpose: to make user/service actions provable, keep investigations transparent, and demonstrate compliance with regulatory requirements and internal standards (GDPR/AML, contracts with PSP/KYC providers, ISO/PCI where applicable).
Coverage: all production systems, platform services (account, payments, anti-fraud, KYC/sanctions, RG), admin panels, API gateways, DWH/BI, infrastructure (K8s/cloud), and vendor integrations.
2) What to log (event classes)
1. Identification and access: login/logout, MFA, password/key change, SSO, "break-glass" access.
2. Administrative actions: changes to roles/rights, configurations, anti-fraud/sanctions rules, feature flags.
3. Operations with PII/financial data: reading/exporting/deleting, uploading, accessing KYC, viewing VIP profiles.
4. Transactions and money: cash-outs/deposits, cancellations, refunds, chargeback decisions.
5. Compliance/AML/KYC: screening results (sanctions/PEP/Adverse Media), decisions (TP/FP), EDD/STR/SAR.
6. Incidents and security: escalations, WAF/IDS rule changes, service isolation, secret rotation.
7. Integrations/vendors: API calls, errors, timeouts, exports, data deletion/return confirmations.
3) Mandatory event fields (minimum)
`event_id` (UUID), `ts_utc`, `ts_local`, `source_service`, `trace_id`/`span_id`
`actor_type` (user/service/vendor), `actor_id` (strong identifier), `actor_org` (if B2B)
`subject_type` (account/tx/document/dataset), `subject_id`
`action` (e.g., `READ_PII`, `EXPORT_DATA`, `ROLE_UPDATE`, `WITHDRAWAL_APPROVE`)
`result` (success/deny/error) and `reason`/`error_code`
`ip`, `device_fingerprint`, `geo` (country/region), `auth_context` (MFA/SSO)
`fields_accessed`/`scope` (when working with PII/financial data), with masking
`purpose`/`ticket_id` (reason: DSAR, incident, regulator request, operational task)
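The field list above can be expressed as a typed record. The following is a minimal sketch, not a reference schema: the class name `AuditEvent`, the `PII_ACTIONS` set and the rule that PII actions need a `purpose` or `ticket_id` at construction time are assumptions made for illustration, and only a subset of the fields is shown.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

ALLOWED_ACTOR_TYPES = {"user", "service", "vendor"}
ALLOWED_RESULTS = {"success", "deny", "error"}
# Hypothetical set of actions that touch PII and therefore need a stated reason.
PII_ACTIONS = {"READ_PII", "EXPORT_DATA"}


@dataclass(frozen=True)
class AuditEvent:
    source_service: str
    actor_type: str
    actor_id: str
    subject_type: str
    subject_id: str
    action: str
    result: str
    trace_id: str
    purpose: Optional[str] = None
    ticket_id: Optional[str] = None
    # Generated automatically so that callers cannot forget them.
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    ts_utc: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self) -> None:
        if self.actor_type not in ALLOWED_ACTOR_TYPES:
            raise ValueError(f"unknown actor_type: {self.actor_type!r}")
        if self.result not in ALLOWED_RESULTS:
            raise ValueError(f"unknown result: {self.result!r}")
        if self.action in PII_ACTIONS and not (self.purpose or self.ticket_id):
            raise ValueError("PII actions require purpose or ticket_id")

    def to_dict(self) -> dict:
        return asdict(self)
```

Rejecting an incomplete event at construction time keeps the missing-`purpose` case out of the stream entirely instead of relying on a later alert.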
4) Immutability and provability
WORM storage for the "golden" copy (immutable buckets/retention policies).
Cryptographic signature/hash chain: periodically sign batches of events and/or build a hash chain so that any modification becomes detectable.
Change log for schemas/rules: schemas and the logging policy are versioned; any edits go through CAB.
Two-tier storage: online index (search) + immutable archive.
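Hash chaining can be illustrated in a few lines. This is a sketch under stated assumptions: SHA-256 over a canonical JSON encoding, an all-zero genesis value, and the helper names `chain_hash`, `build_chain` and `first_broken_link` are all choices made for this example rather than anything prescribed by the policy.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first link


def chain_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys, no extra whitespace) keeps the hash stable.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    data = prev_hash.encode("ascii") + payload.encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def build_chain(events):
    """Return records where each hash covers the event and the previous hash."""
    prev = GENESIS
    chain = []
    for event in events:
        prev = chain_hash(prev, event)
        chain.append({"event": event, "hash": prev})
    return chain


def first_broken_link(chain):
    """Return the index of the first record that fails verification, or None."""
    prev = GENESIS
    for index, record in enumerate(chain):
        if chain_hash(prev, record["event"]) != record["hash"]:
            return index
        prev = record["hash"]
    return None
```

Because every hash depends on its predecessor, editing one stored event invalidates that record; silently fixing it would require recomputing every later hash, which the WORM copy of the chain prevents.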
5) Time synchronization and tracing
A single NTP/Chrony time source in all environments; in the logs, `ts_utc` is the source of truth.
Every log record carries `trace_id`/`span_id` for end-to-end request tracing (correlation across services, vendors and the front end).
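One way to make sure every record carries these fields is to stamp them in a single place. The sketch below assumes a context variable holds the current trace and that identifiers are random hex strings of 16 and 8 bytes; the function names `start_trace` and `stamp` are hypothetical.

```python
import contextvars
import secrets
from datetime import datetime, timezone

# Holds the trace identifier for the request currently being handled.
_current_trace = contextvars.ContextVar("current_trace", default=None)


def start_trace(incoming_trace_id=None):
    """Reuse the caller's trace_id when one arrives, otherwise create one."""
    trace_id = incoming_trace_id or secrets.token_hex(16)
    _current_trace.set(trace_id)
    return trace_id


def stamp(event: dict) -> dict:
    """Return a copy of the event with trace_id, span_id and ts_utc added."""
    return {
        **event,
        "trace_id": _current_trace.get(),
        "span_id": secrets.token_hex(8),
        "ts_utc": datetime.now(timezone.utc).isoformat(),
    }
```

Reusing the incoming identifier is what makes correlation across services possible; generating a fresh one at every hop would break the chain.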
6) Privacy and secrets
Prohibited: passwords, tokens, full PAN/CSC, full document numbers, raw biometrics.
Default masking: e-mail/phone/IBAN/PAN → tokens/partial display.
Pseudonymization: `user_id` → stable token in analytics; mapping back to the real ID happens only inside a protected perimeter.
DSAR compatibility: the ability to selectively extract logs by subject without revealing extraneous PII.
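Masking and pseudonymization might look like the sketch below. The partial-display formats (first character of the e-mail local part, last four PAN digits) and the 16-character token length are assumptions for illustration; the keyed hash uses HMAC-SHA-256 so that tokens stay stable while the key stays inside the protected perimeter.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def mask_pan(pan: str) -> str:
    """Show only the last four digits of a card number."""
    digits = "".join(ch for ch in pan if ch.isdigit())
    return "*" * (len(digits) - 4) + digits[-4:]


def pseudonymize(user_id: str, key: bytes) -> str:
    """Stable token for analytics; reversing it requires the lookup table
    or the key, both of which stay in the protected perimeter."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]
```

For example, `mask_pan("4111 1111 1111 1111")` returns `"************1111"`, and the same `user_id` with the same key always maps to the same token.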
7) Retention periods and tiers
8) Access and Control (RBAC/ABAC)
Audit log reading roles are separate from administration roles.
MFA and Just-in-Time access (break-glass) with automatic revocation and logging of reasons.
Least-privilege policy: access to PII/financial fields only when necessary and with the `purpose` recorded.
Export: allowlists of destinations and formats; mandatory signature/hash; export log.
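The access rules above can be sketched as a small decision function. Everything named here is hypothetical: the `pii_reader` role, the one-hour default lifetime of a break-glass grant, and the function names are assumptions chosen to show a mandatory `purpose` combined with a grant that expires on its own.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class BreakGlassGrant:
    actor_id: str
    reason: str            # logged justification, e.g. an incident ticket
    expires_at: datetime   # moment of automatic revocation


def grant_break_glass(actor_id: str, reason: str, now: datetime,
                      ttl: timedelta = timedelta(hours=1)) -> BreakGlassGrant:
    if not reason.strip():
        raise ValueError("break-glass access requires a logged reason")
    return BreakGlassGrant(actor_id, reason, now + ttl)


def authorize_pii_read(actor_id: str, roles: set, purpose: Optional[str],
                       grant: Optional[BreakGlassGrant],
                       now: datetime) -> Tuple[bool, str]:
    """Return (allowed, reason); the reason goes into the audit event."""
    if not purpose:
        return False, "missing purpose"
    if "pii_reader" in roles:
        return True, "role"
    if grant and grant.actor_id == actor_id and now < grant.expires_at:
        return True, "break-glass"
    return False, "no role or active grant"
```

Checking expiry at decision time means revocation needs no background job: an expired grant simply stops matching.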
9) SIEM/SOAR/ETL integration
The audit event stream feeds the SIEM for correlation (e.g., mass `READ_PII` + login from a new device).
SOAR playbooks: automatic tickets for policy violations (no `purpose`, abnormal volume, access outside the allowed window).
ETL/DWH: `audit_access`, `pii_exports`, `admin_changes` data marts with quality control and schema versioning.
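The correlation example (mass `READ_PII` combined with a new device) could be prototyped as below before being moved into a SIEM rule. The threshold of 50 reads, the shape of `known_devices` and the function name are assumptions; a production rule would also bound the time window.

```python
from collections import defaultdict


def detect_mass_pii_reads(events, known_devices, threshold=50):
    """Return actors with at least `threshold` READ_PII events in the batch
    where at least one read came from a device not seen before.

    `known_devices` maps actor_id -> set of device fingerprints (hypothetical
    lookup; in practice this would come from the IdP or a device registry).
    """
    read_counts = defaultdict(int)
    used_new_device = set()
    for event in events:
        if event["action"] != "READ_PII":
            continue
        actor = event["actor_id"]
        read_counts[actor] += 1
        if event["device_fingerprint"] not in known_devices.get(actor, set()):
            used_new_device.add(actor)
    return sorted(
        actor for actor, count in read_counts.items()
        if count >= threshold and actor in used_new_device
    )
```

Requiring both conditions keeps the alert quieter than either signal alone: bulk reads from a known workstation or a single read from a new phone do not fire.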
10) Data quality and validators
Schemas as code (JSON/Protobuf/Avro): required fields, types, dictionaries; CI validators.
Rejection and quarantine queue for events with schema errors; reject-rate metrics.
Deduplication/idempotency by `(event_id, trace_id, ts)`; retransmission control.
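A validator with a quarantine queue and idempotent ingestion might be sketched as follows. The required-field list is a reduced, assumed subset, the class name `Ingestor` is hypothetical, and `ts_utc` is used as the `ts` component of the deduplication key.

```python
REQUIRED_FIELDS = {
    "event_id": str, "ts_utc": str, "trace_id": str,
    "source_service": str, "actor_id": str, "action": str, "result": str,
}
RESULT_DICTIONARY = {"success", "deny", "error"}


def schema_errors(event: dict) -> list:
    """Return a list of human-readable schema violations (empty if valid)."""
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing:{name}")
        elif not isinstance(event[name], expected_type):
            errors.append(f"type:{name}")
    result = event.get("result")
    if isinstance(result, str) and result not in RESULT_DICTIONARY:
        errors.append("dictionary:result")
    return errors


class Ingestor:
    def __init__(self):
        self.accepted = []
        self.quarantine = []   # (event, errors) pairs awaiting review
        self.duplicates = 0
        self._seen = set()

    def ingest(self, event: dict) -> str:
        errors = schema_errors(event)
        if errors:
            self.quarantine.append((event, errors))
            return "quarantined"
        key = (event["event_id"], event["trace_id"], event["ts_utc"])
        if key in self._seen:          # retransmission: drop silently
            self.duplicates += 1
            return "duplicate"
        self._seen.add(key)
        self.accepted.append(event)
        return "accepted"

    def reject_rate(self) -> float:
        total = len(self.accepted) + len(self.quarantine) + self.duplicates
        return len(self.quarantine) / total if total else 0.0
```

Validation runs before deduplication so that a malformed event without an `event_id` is quarantined rather than crashing the key lookup.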
11) RACI
12) SOP: Data Access Investigation
1. Trigger: SIEM alert (abnormal `READ_PII`/export), complaint, signal from a vendor.
2. Artifact collection: export of events by `actor_id`/`subject_id`/`trace_id`, the `purpose` log, related logs (WAF/IdP).
3. Legitimacy check: presence of a legal basis (DSAR/incident/operational task), approvals, access windows.
4. Impact assessment: PII scope/categories, jurisdictions, risk to subjects.
5. Decision: incident bridge (for High/Critical), containment (access revocation, key rotation).
6. Report and CAPAs: causes, violated policies, measures (masking, training, RBAC changes), deadlines.
13) SOP: Data Export (Regulator/Partner/DSAR)
1. Request → verification of the legal basis and identity (for DSAR) → generation of a query to the DWH.
2. De-identification/minimization by default; PII is included only on legal grounds.
3. Export generation (CSV/JSON/Parquet) → signature/hash → entry in the export log (who/when/what/to whom/why).
4. Transfer via an approved channel (SFTP/secure link); copy retention period per policy.
5. Post-check: confirmation of receipt, deletion of temporary files.
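Steps 2 and 3 of this procedure can be sketched in code. The example assumes CSV output and a SHA-256 digest as the integrity value; the function names and the log entry layout are hypothetical, and a real deployment would add a proper signature on top of the hash.

```python
import csv
import hashlib
import io
from datetime import datetime, timezone


def build_export(rows, allowed_fields):
    """Serialize rows to CSV, keeping only the approved (minimized) columns."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=allowed_fields,
                            extrasaction="ignore")  # drop unapproved fields
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


def export_log_entry(data: bytes, actor_id: str, destination: str,
                     reason: str) -> dict:
    """Who / when / what / to whom / why, plus the hash of the exact bytes."""
    return {
        "actor_id": actor_id,
        "ts_utc": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "destination": destination,
        "reason": reason,
    }
```

Hashing the exact bytes that leave the system lets the recipient confirm receipt of the same file, and lets a later audit match a stray copy to its log entry.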
14) Metrics and KRIs/KPIs
Coverage: the share of critical systems sending audit events ≥ 95%.
DQ errors: events rejected by the validator ≤ 0.5% of the stream.
MTTD of stream loss: ≤ 15 min (alert on silence).
Abnormal accesses without `purpose`: = 0 (KRI).
Investigation response time: median ≤ 4 h, P95 ≤ 24 h.
Signed/hashed exports: 100%.
Retention: deletions/archives on time ≥ 99%.
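A few of these indicators reduce to simple arithmetic. The sketch below assumes a nearest-rank definition of P95 and takes the thresholds from the list above; the function names are illustrative.

```python
import statistics


def coverage(critical_systems: set, reporting_systems: set) -> float:
    """Share of critical systems that actually send audit events."""
    return len(critical_systems & reporting_systems) / len(critical_systems)


def p95(values) -> float:
    """Nearest-rank 95th percentile (integer arithmetic avoids float drift)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n)
    return ordered[rank - 1]


def response_time_ok(hours) -> bool:
    """Check the target: median <= 4 h and P95 <= 24 h."""
    return statistics.median(hours) <= 4 and p95(hours) <= 24
```

With response times of 1, 2, 3 and 30 hours the median is 2.5 h, but the nearest-rank P95 is 30 h, so the target is missed even though most investigations were fast.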
15) Vendor and sub-processor requirements
DPA/SLA: description of audit logs (schemas, retention periods, geography, export format), WORM/immutability, SLA for incident notifications.
Vendor access: named service accounts, logs of their actions, the option of selective audits.
Offboarding: key revocation, export/deletion of logs, closure act, confirmation of backup destruction.
16) Security and tamper protection
Separation of roles: source admin ≠ storage admin ≠ auditor.
Agent/collector signature, mTLS between components.
Anti-tamper controls: comparison of hashes, regular integrity checks, alerts for discrepancies.
Geo-replication of WORM copies and regular recovery tests.
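Collector signatures and hash comparison can be illustrated with a signed batch manifest. This sketch uses HMAC-SHA-256 with a shared key as a stand-in for whatever signature scheme is actually deployed (an asymmetric signature would avoid sharing the key with the verifier); the manifest layout and function names are assumptions.

```python
import hashlib
import hmac
import json


def sign_batch(events: list, key: bytes, collector_id: str) -> dict:
    """Produce a manifest binding the batch content to the collector."""
    body = json.dumps(events, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    message = f"{collector_id}:{digest}".encode("utf-8")
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"collector_id": collector_id, "sha256": digest,
            "signature": signature}


def verify_batch(events: list, manifest: dict, key: bytes) -> bool:
    """Recompute the manifest and compare signatures in constant time."""
    expected = sign_batch(events, key, manifest["collector_id"])
    return hmac.compare_digest(expected["signature"], manifest["signature"])
```

A scheduled integrity check can run `verify_batch` over archived batches and raise an alert on the first `False`.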
17) Common errors and anti-patterns
Logging sensitive values (PAN/secrets) → enable redaction middleware immediately.
Missing `purpose`/`ticket_id` when accessing PII.
Local exports "to the desktop" and sending by e-mail.
Lack of a single schema and validation → silently missing fields, correlation becomes impossible.
A single super account not tied to a person or service.
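For the first anti-pattern, a redaction layer can be as small as a logging filter. The sketch below uses Python's standard `logging.Filter`; the regular expressions are deliberately simple assumptions (any run of 13 to 19 digits is treated as a possible PAN, and only `password=`, `token=` and `secret=` pairs are caught), so they will produce some false positives and miss other formats.

```python
import logging
import re

# 13-19 digits, optionally separated by single spaces or hyphens.
PAN_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
SECRET_RE = re.compile(r"(?i)\b(password|token|secret)=\S+")


def redact(text: str) -> str:
    text = PAN_RE.sub("[REDACTED_PAN]", text)
    return SECRET_RE.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


class RedactionFilter(logging.Filter):
    """Rewrite each record's message before any handler sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())  # format first, then redact
        record.args = ()                          # message is already final
        return True
```

Attaching it with `handler.addFilter(RedactionFilter())` covers every logger that writes through that handler, which is easier to enforce than asking each call site to mask its own values.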
18) Checklists
18.1 Policy launch/review
- Schemas and dictionaries are approved; required fields included
- Masking and prohibitions on secrets are enabled
- NTP configured, `trace_id` everywhere
- Hot/Warm/Cold/WORM tiers are configured
- RBAC/ABAC and break-glass are designed
- SIEM/SOAR integrated, alerts tested
18.2 Monthly audit
- Export sample: signatures/logs correct
- Retention/deletions/Legal Hold checked
- DQ metrics OK, quarantine reviewed
- Vendor logs available/complete
19) Implementation Roadmap
Weeks 1-2: inventory of systems, agreement on schemas and mandatory fields, time and tracing setup.
Weeks 3-4: enabling masking, WORM layer, SIEM/SOAR integration, launching export logs.
Month 2: validator/alert automation, investigation playbooks, team training.
Month 3+: regular audits, integrity stress tests, storage tiering, vendor/contract audits.
TL;DR
Strong audit trails = complete, structured events + immutability (WORM) and signatures + PII masking + strict access and export logs + SIEM/SOAR integration. This speeds up investigations, reduces risk and makes compliance provable.