
Storing and deleting user data

1) Why you need a retention and disposition policy

The goal is to store only the data you need, for only as long as you need it, and to delete it securely once the processing purposes have been fulfilled. This reduces legal risk, attack surface and infrastructure cost, and simplifies audits (licensing, PSP partners, regulators).

Key principles:
  • A defined purpose and legal basis (contract, legal obligation, legitimate interest, consent).
  • Minimization and segregation (PII ↔ pseudonyms ↔ anonymous).
  • Predictable timing and provable removal procedures.
  • Continuous monitoring (logs, reports, metrics).

2) Data zones and architectural supports

Zone A - PII/sensitive: KYC, payment tokens, biometrics (where permitted). Encryption at rest, strict RBAC/ABAC, JIT access.
Zone B - Pseudonymized: stable tokens for analytics/ML; direct re-identification is prohibited.
Zone C - Anonymous aggregates: reporting/research; long retention is allowed.
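
The "stable tokens" of Zone B can be produced in several ways. The sketch below shows one common option, a keyed hash (HMAC-SHA256), as an illustration rather than the method this guide prescribes. The function name `pseudonymize` and the key `ZONE_B_KEY` are hypothetical; a real key would be kept in a secrets manager, outside Zone B.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Return a stable keyed token for a Zone A identifier.

    The same input and key always give the same token, so analytics can
    join on it, while the raw identifier is not recoverable without the key.
    """
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for illustration only; never hard-code a real key.
ZONE_B_KEY = b"example-key-do-not-use"

token_1 = pseudonymize("user-42", ZONE_B_KEY)
token_2 = pseudonymize("user-42", ZONE_B_KEY)
other_token = pseudonymize("user-43", ZONE_B_KEY)
```

Because the token depends on the key, destroying or rotating the key is one way to cut the link between Zone B tokens and Zone A identities.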

Supporting mechanisms:
  • Data Catalog/RoPA (record of processing activities), Retention Service (rules), Deletion Orchestrator (end-to-end deletion), WORM archive (audit/incidents).

3) Retention matrix: how to build it

Steps:

1. Map processing purposes ↔ legal bases ↔ data categories ↔ retention periods.

2. Define the trigger that starts the retention clock (events: account creation, last login, account closure, end of contract, final transaction).

3. Record what happens when the term ends: deletion, anonymization, or blocking (when a "freeze" is needed).

4. Specify owner and exceptions (AML/taxes/disputes/fraud).

Example (for wiki):

| Category | Purpose | Legal basis | Term | Start trigger | Method at end of term |
|---|---|---|---|---|---|
| Account (IDs, contact details) | Record keeping | Contract | Relationship period + 6 months | Account closure | Deletion |
| KYC (documents, selfie templates) | Identification/AML | Legal obligation | ≥5 years after the end of the relationship | Last transaction/account closure | Block → delete when the term expires |
| Payment tokens | Payments/withdrawals | Contract/legal obligation | 2 years after the last transaction | Last payment activity | Deletion |
| Security logs (logins/IP) | Security/anti-fraud | Legitimate interest | 12-24 months | Event record | Anonymization |
| Analytics (pseudo-ID) | Product analytics | Legitimate interest/consent | 13 months (common) | Event collection | Anonymization/deletion |
| Marketing (email/SMS/push) | Communications | Consent | While consent is in effect + 30 days | Consent withdrawal/expiration | Deletion |
| Incident cases | Compliance/investigation | Legal interest/obligation | 3-6 years | Case closure | Archive → delete |

💡 These values are reference points only; adjust them to local licensing rules and PSP contractual requirements.
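
A retention matrix becomes enforceable once it is expressed as data. The sketch below encodes three rows of the example matrix and computes when a record falls due. The names `RetentionRule`, `RULES`, `due_date` and `is_due` are hypothetical, and the month counts are the reference values from the example, not legal advice.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    category: str
    trigger: str   # event that starts the retention clock
    months: int    # retention term counted from the trigger
    method: str    # what happens at the end of the term


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Illustrative subset of the matrix; terms are reference points only.
RULES = {
    "account": RetentionRule("account", "account_closure", 6, "delete"),
    "payment_tokens": RetentionRule("payment_tokens", "last_payment_activity", 24, "delete"),
    "analytics": RetentionRule("analytics", "event_collection", 13, "anonymize"),
}


def due_date(category: str, trigger_date: date) -> date:
    """Date on which the record must be deleted or anonymized."""
    return add_months(trigger_date, RULES[category].months)


def is_due(category: str, trigger_date: date, today: date) -> bool:
    return today >= due_date(category, trigger_date)
```

For example, `due_date("account", date(2024, 8, 31))` gives `date(2025, 2, 28)`: six months after closure, clamped to the last day of February.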

4) Retention policy (skeleton)

1. Scope, roles (data owner, DPO, Security, Operations).
2. Definitions (personal data categories, zones, archive, backup, anonymization/pseudonymization).
3. Mapping of data to purposes, legal bases and retention periods (reference to the retention matrix).
4. Exception handling (legal hold, investigations, regulatory requests).
5. Access controls, encryption, audit of data exports.
6. Review procedure (quarterly, or whenever purposes or providers change).

5) Deletion and anonymization pipeline

Stages:
  • Mark-for-Deletion: mark the records and their dependencies; check for holds.
  • Grace Period: a buffer (e.g. 7-30 days) in which a mistaken deletion can be cancelled.
  • Soft Delete: logically hide the data from production services; stop mailings and processing.
  • Hard Delete/Anonymize: physical cleanup or irreversible anonymization in primary storage.
  • Cascade & Fan-out: propagate the deletion to derived stores (caches, search indexes, feature store, DWH, ML layers).
  • Backups: deferred cleanup according to the backup policy (see below).
  • Evidence: a deletion certificate (ID, classifier, time, systems), logged to WORM.
Technical rules:
  • Delete by subject key, traced through data lineage.
  • Idempotent tasks, retries, command deduplication.
  • SLA: most deletions complete within 30 days of the request (if applicable).
  • Handle "non-removable" fields: replace them with tokens or masks.
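
The stages and technical rules above can be pictured as a small state machine. The sketch below is a minimal in-memory illustration with hypothetical names (`Stage`, `DeletionPipeline`). It models command deduplication, hold checks and cancellation during the grace period, but not the timing of the grace period or real storage.

```python
from enum import Enum


class Stage(Enum):
    ACTIVE = "active"
    MARKED = "marked_for_deletion"
    SOFT_DELETED = "soft_deleted"
    HARD_DELETED = "hard_deleted"


NEXT_STAGE = {
    Stage.ACTIVE: Stage.MARKED,
    Stage.MARKED: Stage.SOFT_DELETED,
    Stage.SOFT_DELETED: Stage.HARD_DELETED,
}


class DeletionPipeline:
    def __init__(self):
        self.stage = {}             # subject_id -> current Stage
        self.seen_commands = set()  # command IDs already applied
        self.holds = set()          # subject IDs under legal hold

    def advance(self, command_id: str, subject_id: str) -> Stage:
        """Move a subject one stage forward; safe to retry."""
        current = self.stage.get(subject_id, Stage.ACTIVE)
        if command_id in self.seen_commands:
            return current  # duplicate or retried command: no effect
        if subject_id in self.holds:
            return current  # hold blocks progress; command may be retried later
        self.seen_commands.add(command_id)
        # HARD_DELETED has no successor, so it stays where it is.
        self.stage[subject_id] = NEXT_STAGE.get(current, current)
        return self.stage[subject_id]

    def cancel(self, subject_id: str) -> Stage:
        """Undo a mark during the grace period; later stages are not reversible."""
        if self.stage.get(subject_id) is Stage.MARKED:
            self.stage[subject_id] = Stage.ACTIVE
        return self.stage.get(subject_id, Stage.ACTIVE)
```

Keying each transition on a command ID is what makes retries harmless: a scheduler can resend the same command after a timeout without skipping a stage.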

6) Backups and replicas: what to do with copies

Immutable backups (ransomware-resistance) are stored under a separate policy; direct editing is not allowed.
A subject's data leaves the backups when the backup itself expires. Restoring a backup to the production environment is prohibited if it would lead to re-identification.
Document the backup retention window (for example, 30/60/90 days), the recovery scenarios, and the "sanitization" process applied during recovery (post-restore scripts that re-delete records already marked for deletion).
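
The "sanitization" step can be as simple as filtering restored rows against the list of subjects already deleted in production. The sketch below assumes rows are dictionaries with a `subject_id` field and that the set of deleted subject IDs is kept outside the backup, for instance in the evidence log. Both assumptions and all names are hypothetical.

```python
from datetime import date, timedelta


def backup_expired(created: date, window_days: int, today: date) -> bool:
    """True once the backup has reached the end of its retention window."""
    return today >= created + timedelta(days=window_days)


def sanitize_restored(rows, deleted_subject_ids):
    """Drop restored rows that belong to subjects already deleted in production.

    Returns the rows to keep and the number of rows removed, so the count
    can be recorded as evidence of the post-restore cleanup.
    """
    kept = [row for row in rows if row["subject_id"] not in deleted_subject_ids]
    return kept, len(rows) - len(kept)


# Rows as they might come back from a restored backup (made-up data).
restored = [
    {"subject_id": "u1", "email": "a@example.com"},
    {"subject_id": "u2", "email": "b@example.com"},
    {"subject_id": "u3", "email": "c@example.com"},
]
clean_rows, removed_count = sanitize_restored(restored, {"u2"})
```

Running the filter before the restored data is exposed to any service keeps a deleted subject from reappearing in production.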

7) Exceptions and "legal hold"

Sometimes deletion cannot be performed immediately (e.g. AML, tax audits, litigation). Procedure:
  • Place a legal hold, stating the reason, term and owner.
  • Block access to the data for any purpose other than the one specified.
  • Review holds periodically and lift them as soon as the legal basis no longer applies.
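
The procedure above maps onto a small register. In the sketch below, the `LegalHold` fields mirror the items a hold must state (reason, owner, term) plus the purposes for which access stays open. The class and function names are hypothetical, and treating several holds on one subject as a union of allowed purposes is a design assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LegalHold:
    subject_id: str
    reason: str                  # e.g. "AML investigation"
    owner: str                   # person or team responsible for the hold
    review_by: date              # date of the next periodic review
    allowed_purposes: frozenset  # purposes for which access remains open
    lifted: bool = False


def active_holds(holds, subject_id):
    return [h for h in holds if h.subject_id == subject_id and not h.lifted]


def deletion_blocked(holds, subject_id) -> bool:
    """Deletion must wait while any hold on the subject is active."""
    return bool(active_holds(holds, subject_id))


def access_allowed(holds, subject_id, purpose) -> bool:
    """Under a hold, only the purposes named in an active hold are permitted."""
    active = active_holds(holds, subject_id)
    if not active:
        return True  # no hold: ordinary access control applies elsewhere
    return any(purpose in h.allowed_purposes for h in active)


def due_for_review(holds, today: date):
    """Active holds whose review date has arrived."""
    return [h for h in holds if not h.lifted and h.review_by <= today]
```

A deletion job would call `deletion_blocked` before marking a subject, and a weekly job could send the result of `due_for_review` to the hold owners.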

8) Documentation and artifacts

Retention matrix (versioned).
Deletion procedure (SOP): steps, roles, SLAs, escalations.
Deletion Evidence Log (WORM): who/what/when/result.
Backups Policy: timelines, storage class, recovery tests.
Data Lineage Map: from primary tables to derived layers.
Exceptions/Legal Holds Register.

9) Metrics and quality control

Retention Adherence: percentage of records deleted on schedule.
Deletion SLA: median/95th percentile of the time from request/trigger to deletion.
Cascade Completion Rate: percentage of systems where deletion is complete.
Backups Window Compliance: percentage of backups deleted by their expiry date.
Access/Export Violations: unauthorized reads/exports.
DSR SLAs (if applicable): responses delivered within the deadlines.
Incident Rate: number of deletion failures and mismatches.
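
Several of these metrics reduce to simple arithmetic over deletion records. The sketch below shows one possible calculation. The function names are hypothetical, the percentile uses the nearest-rank definition (an assumption, since other definitions exist), and returning 100% when nothing was due is a convention chosen here.

```python
import math
from statistics import median


def retention_adherence(deleted_on_time: int, total_due: int) -> float:
    """Percentage of due records that were deleted on schedule."""
    if total_due == 0:
        return 100.0  # nothing was due, so nothing is overdue
    return 100.0 * deleted_on_time / total_due


def percentile_nearest_rank(values, pct: int):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def cascade_completion_rate(done_by_system: dict) -> float:
    """Percentage of downstream systems that confirmed the deletion."""
    if not done_by_system:
        return 100.0
    return 100.0 * sum(done_by_system.values()) / len(done_by_system)


# Days from request to completed deletion for five requests (made-up sample).
durations_days = [1, 2, 3, 4, 30]
sla_median = median(durations_days)
sla_p95 = percentile_nearest_rank(durations_days, 95)
```

The sample shows why both figures are reported: the median is 3 days, but the 95th percentile is 30 days, which exposes the one slow deletion that the median hides.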

10) Checklists (operating)

Before launching a feature

  • Purpose, legal basis of processing and storage zone (A/B/C) defined.
  • Added a row to the retention matrix (term, trigger, method).
  • Deletion Orchestrator (keys, cascades, idempotency) is configured.
  • Auditing enabled (WORM logs), RoPA updated.

Daily/Weekly

  • The deletion task scheduler ran without errors.
  • New legal holds registered; expired ones lifted.
  • Checked backup reports (create/expire).

Quarterly

  • A review of the retention matrix and exceptions.
  • Test restore from backup, including the "sanitization" scripts.
  • Reconciliation of metrics (SLA, Cascade, Violations), improvement plan.

11) Frequent mistakes and how to avoid them

Storing data "just in case" → bind every dataset strictly to a purpose; apply an automatic TTL per category.
No cascade → data remains in caches, indexes and the feature store; implement a single orchestrator for all systems.
Dev/Stage with personal data → use synthetic datasets or masking; scrub dumps automatically.
Backups outside the policy → define retention windows, prohibit unauthorized restores, test the sanitization.
Lack of evidence → WORM log, deletion certificates, regular reports.
Mixed legal bases → keep marketing, security and contract purposes separate; do not extend retention periods "just in case."

12) Example of user-initiated deletion (end-to-end scenario)

1. The user closes the account or submits a DSR for deletion.
2. Check for exceptions (AML, disputes) → if any apply, place a legal hold with restricted purposes.
3. If there is no hold: Mark-for-Deletion → 14-day grace period → Soft Delete.
4. Hard Delete/Anonymize in the transactional layer, then cascade to caches, indexes, DWH and the ML feature store.
5. Record the deletion in the Evidence Log; update the status in the profile and notify the user by email.
6. Cleaning from backups after the storage window expires.
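
Steps 4 and 5 of this scenario, the cascade and the evidence record, might look like the sketch below. The stores are plain dictionaries and the evidence log is an ordinary list standing in for a WORM store. All names (`cascade_delete`, `make_deleter`) are hypothetical.

```python
from datetime import datetime, timezone


def make_deleter(store: dict):
    """Build an idempotent delete function for an in-memory store."""
    def delete(subject_id: str) -> bool:
        store.pop(subject_id, None)  # an already-absent key is not an error
        return True
    return delete


def cascade_delete(subject_id: str, systems: dict, evidence_log: list) -> dict:
    """Delete a subject from every system and record the outcome.

    `systems` maps a system name to a callable that takes the subject ID
    and returns True on success. A failure in one system does not stop
    the others; it is recorded so the task can be retried.
    """
    results = {}
    for name, delete_fn in systems.items():
        try:
            results[name] = bool(delete_fn(subject_id))
        except Exception:
            results[name] = False
    evidence_log.append({
        "subject_id": subject_id,
        "systems": results,
        "completed": all(results.values()),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    })
    return results


# Made-up stand-ins for a cache and a search index.
cache = {"u1": {"email": "a@example.com"}}
search_index = {"u1": ["doc-1"]}
log = []
outcome = cascade_delete(
    "u1",
    {"cache": make_deleter(cache), "search_index": make_deleter(search_index)},
    log,
)
```

Since each per-system delete is idempotent, a run that ends with `completed` set to `False` can simply be repeated until every system confirms.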

13) Roles and Responsibilities (RACI)

Data Owner/Domain Lead - deadlines and goals; updating the retention matrix.
DPO/Privacy - compliance, advice on exceptions.
Security/CISO - encryption, access, audit, backups/recoveries.
Data Engineering - Deletion Orchestrator, lineage, cascades.
Support/Operations - DSR communications, status and SLAs.
Legal - legal holds, interaction with regulators/courts.

14) Templates for your wiki

Retention-Matrix.xlsx/.md (category → purpose → basis → term → method).
Deletion-SOP.md (step-by-step procedure with escalations).
Backups-Policy.md (windows, storage classes, recovery test plan).
Legal-Holds-Register.md (forms for placing and lifting holds).
Data-Lineage-Diagram (links from tables to derived stores).
Monthly-Privacy-Ops-Report.md (metrics, incidents, improvements).

15) Implementation Roadmap (6 steps)

1. Inventory: map of data and flows; mapping of purposes to legal bases.
2. Retention matrix: draft retention periods and owners; alignment with Legal/DPO.
3. Deletion Orchestrator: keys, cascades, backup sanitization, WORM logs.
4. Policies/procedures: Retention Policy, Deletion SOP, Backups Policy, Legal Hold.
5. Automation and monitoring: schedules, alerts, metrics dashboards.
6. Audits and training: quarterly review, deletion certificate templates, recovery drills.

Result

Effective data retention and disposition is a managed cycle: purpose → duration → control → secure disposition/anonymization → provability. Zone segregation, a retention matrix, cascading deletion (including backups), clearly defined exceptions and metrics turn privacy and compliance from a risk into a competitive advantage, without sacrificing product speed or UX quality.
