Fraud Detection

Anti-fraud is more than a "risk model." It is a closed loop: standardized events → features and graphs → rules/models → decision and action → explanation and appeals → impact measurement and drift control. What follows is a systematic guide applicable to payment and gaming platforms, marketplaces, and fintech services.

1) Threat map (what we protect)

Payment fraud: stolen cards, card testing, chargebacks, friendly fraud.
Account risks: account takeover, multi-accounting, bonus abuse, device farms.
KYC/AML: forged documents, straw persons (money mules), cash-out, sanctions/PEP risks.
Behavioral: bots, scripts, abnormal betting/transaction patterns.
Affiliate: traffic/referral fraud, incentivized low-quality deposits.

2) Signals and raw data

Device/network: device fingerprint, canvas/WebGL, emulators, IP/ASN/proxy/VPN, geo-velocity.
Payment: BIN/MCC/card country, 3DS/ECI, AVS/CVV results, velocity (by card/account/device), deviations from limits.
Behavior: form-filling speed, mouse/touch trajectories, dwell time, sequence of actions.
Social/graph: shared phones/e-mails/cards/addresses/devices, attributes in common with "bad" nodes.
KYC/documents: OCR quality, selfie matching, liveness, document date/source, blacklists/sanctions.
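As an illustration of one of the network signals above, geo-velocity can be derived from two consecutive geolocated observations of the same account. The following is a minimal sketch, assuming each observation is a hypothetical `(lat, lon, timestamp)` tuple and that IP geolocation is accurate enough to be useful; the function names are illustrative, not part of any platform API.

```python
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_velocity_kmh(prev, curr):
    """Implied travel speed between two (lat, lon, datetime) observations.

    Returns None when the timestamps are not strictly increasing, so the
    caller can handle out-of-order events separately.
    """
    hours = (curr[2] - prev[2]).total_seconds() / 3600.0
    if hours <= 0:
        return None
    return haversine_km(prev[0], prev[1], curr[0], curr[1]) / hours


# Usage: two logins one hour apart from distant cities (hypothetical data).
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
login_a = (50.45, 30.52, t0)                       # approx. Kyiv
login_b = (38.72, -9.14, t0 + timedelta(hours=1))  # approx. Lisbon
print(geo_velocity_kmh(login_a, login_b))
```

A speed far above what air travel allows suggests a proxy/VPN hop or a shared account rather than real movement, which is why this signal usually triggers step-up rather than an outright block.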

3) Feature store (point-in-time)

Time windows: 5m/1h/24h/7d for velocity features; exponential smoothing.
Aggregates by identity: user_id, phone, e-mail, card, device, IP/ASN.
Geo/time: country/region/time zone/local holiday profiles.
Graph features: degree/triangle count/PageRank, share of links to bad nodes, connected component.
KYC quality: OCR confidence, edit distance of names/addresses, IBAN/TIN validation.
Anti-leakage: strictly point-in-time, no future labels; online/offline parity.
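The point-in-time requirement can be made concrete with a small sketch: when computing velocity counts for an event, only events strictly before that event's timestamp are allowed into the window. The helper below is a minimal illustration, assuming a hypothetical in-memory, sorted list of timestamps per identity; a real feature store would back this with streaming aggregates.

```python
from bisect import bisect_left
from datetime import timedelta

# The window sizes listed in the text.
WINDOWS = {
    "5m": timedelta(minutes=5),
    "1h": timedelta(hours=1),
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
}


def velocity_features(history, as_of):
    """Count prior events per window, using only events strictly before as_of.

    history: sorted list of event datetimes for one identity (e.g. a card hash).
    as_of:   timestamp of the event being scored.
    """
    # bisect_left excludes events at or after as_of, so neither the scored
    # event itself nor anything from the future can leak into the feature.
    end = bisect_left(history, as_of)
    features = {}
    for name, width in WINDOWS.items():
        start = bisect_left(history, as_of - width)
        features[f"tx_{name}"] = end - start
    return features
```

Running the same function offline over historical events and online at scoring time is one simple way to keep the two paths in parity.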

4) Labeling and target variables

Targets: chargeback=1, confirmed_fraud=1, bonus_abuse=1.
Deferred-truth windows: labels arrive after a delay T (e.g., chargebacks); freeze the labeling period when training so that immature labels are not used.
Class distribution: strong imbalance (0.1-1% positives) → apply weighting/sampling carefully.
Surrogate labels: manual confirmations and appeals; store a confidence value with each.
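Two of the points above lend themselves to a short sketch: excluding events whose deferred-truth window has not yet elapsed, and weighting the rare class. The 90-day maturity period and the row layout (`event_time`, `label`) are assumptions for illustration only; the right window depends on the payment scheme's dispute rules.

```python
from datetime import timedelta


def mature_training_rows(rows, train_cutoff, maturity=timedelta(days=90)):
    """Keep rows whose label window has fully elapsed by train_cutoff.

    rows: iterable of dicts with 'event_time' (datetime) and 'label' (0/1).
    A recent event labelled 0 may still turn into a chargeback, so it is
    left out of training rather than treated as a confirmed negative.
    """
    return [r for r in rows if r["event_time"] + maturity <= train_cutoff]


def balanced_class_weights(labels):
    """Weights inversely proportional to class frequency: n / (2 * n_class)."""
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return {0: n / (2 * neg), 1: n / (2 * pos)}
```

With 1% positives, the positive class receives a weight of 50, which is why the text advises care: aggressive reweighting shifts predicted probabilities and makes later calibration necessary.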

5) Models and approaches

Rules (policy-as-code): whitelists/blacklists, velocity thresholds, geo-velocity, incompatible attributes. Fast, understandable, and a base for fail-safe operation.
Supervised: gradient boosting/random forests, logistic regression, tabular NNs with cost-sensitive loss.
Anomalies: Isolation Forest, LOF, robust z-score/seasonal decomposition, autoencoders.
Graph approaches: link prediction, GNN/DeepWalk embeddings, shared device/card rules.
Hybrids: cascade (rules → ML → graph), ensembles with different penalties for FP/FN.
Calibration: Platt/isotonic for probabilities; thresholds derived from the cost of errors.
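The link between calibration and thresholds can be sketched as follows. If the score p is a calibrated fraud probability, blocking is worthwhile when the expected loss from allowing, p · cost_fn, exceeds the expected loss from blocking a legitimate user, (1 − p) · cost_fp. Solving for p gives the threshold. The cost figures below are hypothetical.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which blocking has lower expected cost than allowing.

    cost_fp: cost of blocking a legitimate transaction (lost margin, support).
    cost_fn: cost of letting a fraudulent transaction through (e.g. the
             amount plus chargeback fees).
    Derived from p * cost_fn > (1 - p) * cost_fp; valid only if p is calibrated.
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


# Usage: a false block costs 5 units, a missed fraud costs 95 units.
tau = cost_threshold(cost_fp=5, cost_fn=95)
```

Because cost_fn often grows with the transaction amount, the same model can justify a lower threshold for large payments than for small ones.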

6) Quality metrics (focus on rare classes)

PR-AUC as the primary metric; ROC-AUC is secondary under class imbalance.
Recall@FPR≤x%, Precision@k, cost-sensitive utility.
Coverage and p95 latency for production scoring.
Fairness/harms: errors by country/device/payment-method segment.
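The two operating-point metrics above can be computed without any library. The following is a minimal sketch that sweeps the threshold from the highest score downward; tied scores are flagged together so the result corresponds to a threshold that can actually be deployed.

```python
def recall_at_fpr(scores, labels, max_fpr):
    """Highest recall achievable while keeping the false positive rate <= max_fpr."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every item sharing this score: a threshold cannot split ties.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            fp += 1 - pairs[j][1]
            j += 1
        if fp / n_neg > max_fpr:
            break
        best = tp / n_pos
        i = j
    return best


def precision_at_k(scores, labels, k):
    """Share of true fraud among the k highest-scored items."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = sorted(zip(scores, labels), key=lambda p: -p[0])[:k]
    return sum(label for _, label in top) / k
```

Precision@k maps naturally onto review capacity: if analysts can handle k cases per day, it estimates how many of those cases will be real fraud.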

7) Threshold policy and hysteresis

Separate the decision zones:
  • `score ≥ τ_block` → auto-block;
  • `τ_review ≤ score < τ_block` → manual review;
  • `score < τ_review` → allow.

Add hysteresis (different entry and exit thresholds) and a cool-down (minimum interval between re-evaluations) to avoid "flapping."
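A minimal sketch of such a policy follows. The exit thresholds (`tau_unblock`, `tau_clear`), the numeric values, and the 10-minute cool-down are assumptions chosen for illustration; the text only requires that entry and exit thresholds differ.

```python
from dataclasses import dataclass

_SEVERITY = {"allow": 0, "review": 1, "block": 2}


@dataclass
class HysteresisPolicy:
    tau_block: float = 0.95    # enter "block" at or above this score
    tau_unblock: float = 0.90  # stay in "block" until the score drops below this
    tau_review: float = 0.80   # enter "review" at or above this score
    tau_clear: float = 0.75    # stay in "review" until the score drops below this
    cooldown_s: float = 600.0  # minimum time before a decision may be relaxed

    def _zone(self, score, previous):
        if score >= self.tau_block:
            return "block"
        if previous == "block" and score >= self.tau_unblock:
            return "block"
        if score >= self.tau_review:
            return "review"
        if previous in ("block", "review") and score >= self.tau_clear:
            return "review"
        return "allow"

    def decide(self, score, previous="allow", seconds_since_change=None):
        """Return the new state given the score and the previous state."""
        proposed = self._zone(score, previous)
        relaxing = _SEVERITY[proposed] < _SEVERITY[previous]
        # Escalation is always immediate; relaxation waits for the cool-down.
        if (relaxing and seconds_since_change is not None
                and seconds_since_change < self.cooldown_s):
            return previous
        return proposed
```

A score oscillating between 0.91 and 0.96 then yields a stable "block" instead of alternating between block and review on every event.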

Decision table example

| Condition | Context | Action | Guardrails |
|---|---|---|---|
| `score ≥ 0.95` or `device in blacklist` | payment | Block | FPR ≤ 0.3%, SLA < 1 s |
| `0.8 ≤ score < 0.95` and `amount > Q90` | payment | Manual review | SLA 2 h |
| `geo-velocity > 1,000 km/h` and `no 3DS` | authentication | Step-up KYC/3DS | Complaints ≤ X |

8) Online loop: scoring and orchestration

Streaming: events via a bus; features from the online feature store; idempotency via `event_id`.
Latency: target p95 (for example, ≤ 100-300 ms per request).
Orchestrator: guaranteed delivery, retry/backoff, DLQ, rate limits across channels.
Action channels: 3DS/step-up, hold/limit, block, request for documents, ticket to the case manager, notification to the user.
Audit: end-to-end `correlation_id` across "signal → decision → action → outcome."
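The orchestration properties listed above (idempotency by event identifier, retry with backoff, and a dead-letter queue) can be sketched together in a few lines. This is a simplified in-memory illustration: the `IdempotentHandler` class is hypothetical, and the dictionary and list stand in for a durable result store and a real DLQ.

```python
import random
import time


class IdempotentHandler:
    """Runs an action at most once per event_id, retrying transient failures."""

    def __init__(self, action, max_attempts=4, base_delay_s=0.1, sleep=time.sleep):
        self._action = action
        self._max_attempts = max_attempts
        self._base_delay_s = base_delay_s
        self._sleep = sleep
        self._results = {}      # event_id -> result (stand-in for a durable store)
        self.dead_letters = []  # events that exhausted all retries (stand-in for a DLQ)

    def handle(self, event):
        event_id = event["event_id"]
        if event_id in self._results:
            # Duplicate delivery: return the stored result, do not act again.
            return self._results[event_id]
        for attempt in range(self._max_attempts):
            try:
                result = self._action(event)
            except Exception:
                if attempt == self._max_attempts - 1:
                    self.dead_letters.append(event)
                    return None
                # Exponential backoff with jitter to avoid synchronized retries.
                delay = self._base_delay_s * 2 ** attempt
                self._sleep(delay + random.uniform(0, delay))
            else:
                self._results[event_id] = result
                return result
        return None
```

Note that this only guarantees at-most-once recording of a successful result; if the action itself has side effects, the downstream channel also needs to accept the same `event_id` as an idempotency key.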

9) Human-in-the-loop and case management

Cases: aggregate incidents/evidence, show an explanation (top features/rules, graph neighbors).
Resolutions: auto-block/partial limit/request for additional KYC/closure.
Training: analysts' corrections flow back into the data (relabeling); active learning near the decision boundary.
SLA: P1/P2 priorities, response times, queues, load balancing.

10) Graph analysis in practice

Links: `user ↔ device ↔ card ↔ phone ↔ email ↔ IP`.
Patterns: "stars" of card testing, "components" of bonus abuse, shared proxies/VPNs.
Scoring nodes/edges: weighted PageRank, suspiciousness by the share of bad neighbors.
Prevention: quarantine new nodes if they join an "infected" component.
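The component-based quarantine idea can be sketched with a union-find structure over the identity links. The 20% bad-share cut-off and the node naming are assumptions for illustration; in practice the threshold would be tuned, and very large components (for example, those joined through a popular public proxy) need special handling.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into connected components using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def quarantine_candidates(edges, bad_nodes, min_bad_share=0.2):
    """Not-yet-flagged nodes in components with a high share of known-bad nodes.

    bad_nodes must be a set of node identifiers.
    """
    flagged = set()
    for component in connected_components(edges):
        share = len(component & bad_nodes) / len(component)
        if share >= min_bad_share:
            flagged |= component - bad_nodes
    return flagged
```

For example, a new user who registers from a device already linked to a confirmed fraudster lands in the same component and becomes a quarantine candidate before making a single payment.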

11) KYC/AML/sanctions and compliance

Matching: sanctions lists/PEP/adverse media; fuzzy search, name normalization/transliteration.
Documents: liveness/anti-spoofing, MRZ and visual security feature checks, geo-consistency.
Transaction monitoring: rules on amounts/thresholds/transfer chains, cash-out scenarios.
Governance: RLS/CLS, PII masking, decision log, explainability, and path of appeal.
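Name normalization and fuzzy search against a watchlist can be illustrated with the standard library alone. This is a rough sketch: it strips diacritics and ignores word order, but it does not transliterate between scripts, and the 0.85 threshold is an assumption. Production screening typically relies on dedicated matching engines.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name):
    """Lowercase, strip diacritics and punctuation, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.casefold())
    return " ".join(sorted(cleaned.split()))


def name_similarity(a, b):
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def screen(name, watchlist, threshold=0.85):
    """Return (entry, similarity) pairs at or above threshold, best match first."""
    hits = [(entry, name_similarity(name, entry)) for entry in watchlist]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])
```

Hits from a screen like this are candidates for review, not decisions: a fuzzy match should open a case with the matched entry attached so the analyst can confirm or dismiss it.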

12) Measuring impact (not only "accuracy")

Decision economics:
\[
EV = \text{Prevented damage} - \text{Cost of false blocks} - \text{Transaction costs}
\]

Policies/tests: A/B/quasi-experiments (DiD) for thresholds and rules; bandits to select a step-up method.
Guardrails: complaints/appeals, NPS, share of false blocks (FPR), latency.
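To show how the expected-value formula can be used to compare policies, here is a minimal worked sketch. All numbers are hypothetical, and modelling the third term as a flat cost per processed decision is an assumption made only for the example.

```python
def policy_ev(blocked_fraud_amounts, n_false_blocks, cost_per_false_block,
              n_decisions, cost_per_decision):
    """EV = prevented damage - cost of false blocks - transaction costs."""
    prevented_damage = sum(blocked_fraud_amounts)
    false_block_cost = n_false_blocks * cost_per_false_block
    transaction_costs = n_decisions * cost_per_decision
    return prevented_damage - false_block_cost - transaction_costs


# A stricter threshold stops more fraud but also blocks more legitimate users.
strict = policy_ev([100, 250, 400], n_false_blocks=10, cost_per_false_block=20,
                   n_decisions=100, cost_per_decision=0.5)
lenient = policy_ev([100, 250], n_false_blocks=3, cost_per_false_block=20,
                    n_decisions=100, cost_per_decision=0.5)
```

Here the strict policy comes out ahead (500 versus 240), but the ranking flips if a false block costs 60 instead of 20, which is why the cost inputs deserve as much scrutiny as the model.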

13) Monitoring, drift and SLO

Quality: PR-AUC/Recall@FPR over a sliding window; probability calibration.
Drift: PSI/KL on key features, share of "unknown" BIN/ASN, new device clusters.
Operations: p95 latency, share of timeouts, % of manual escalations, review backlog.
SLO: availability > 99.9%, Decision→Action p95 ≤ 2-5 s; a kill switch in case of data quality degradation.
Runbooks: surge in card testing, drop in 3DS, provider outage, log storm.
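The PSI drift check mentioned above compares the binned distribution of a feature in a reference period with its distribution in the current period. A minimal sketch follows, assuming fixed bin edges chosen on the reference data; the small `eps` floor for empty bins is a common workaround, not part of the definition.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples over fixed bin edges.

    edges: sorted interior cut points; they define len(edges) + 1 bins.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")

    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # The bin index is the number of cut points at or below the value.
            counts[sum(v >= e for e in edges)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e_shares, a_shares = shares(expected), shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))
```

A widely used rule of thumb treats values below about 0.1 as stable and values above about 0.25 as a significant shift, but these cut-offs are conventions and should be validated against the features at hand.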

14) Data and code architecture

Events: canonical schema (UTC, version, source), idempotency keys.
Feature Store: online/offline parity, point-in-time joins, versioned transformations.
Models: version registry, reproducible pipelines, validation before production, shadow launch.
Rules-as-Code: git repository, review/checklists, regression tests.
Explainability: SHAP/rule weights log, case samples for support training.

15) Security, privacy, ethics

PII minimization: tokenization/hashing of identifiers; separate secure stores.
Access: RLS/CLS and auditing of reads/exports; exports with tokens and expiry dates.
Fairness: test for error disparities by region/payment method; exclude impermissible attributes.
Transparency: reasons for decisions and a clear appeal process for the user.
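Tokenizing identifiers can be sketched with a keyed hash (HMAC-SHA256), so that features and graphs can join on a stable token without storing the raw e-mail or phone number. The key handling and the lowercase normalization are assumptions: the secret would normally live in a key management service, and case folding is appropriate for e-mail addresses but not for every identifier type.

```python
import hashlib
import hmac


def tokenize(value, key, namespace):
    """Deterministic keyed token for an identifier.

    value:     raw identifier, e.g. an e-mail address.
    key:       secret bytes; without it, tokens cannot be recomputed or
               brute-forced from a dictionary of known values.
    namespace: identifier type, so equal strings of different types
               (e.g. a phone and a card fragment) never collide.
    """
    normalized = value.strip().lower()
    message = f"{namespace}:{normalized}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()
```

A plain unsalted hash would not be enough here: phone numbers and e-mails have low entropy, so an attacker with the hash table could recover them by enumeration.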

16) Pseudo-SQL and recipes

Idempotent Transaction Log

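```sql
-- Upsert staged payments into the fact table; replaying the same batch is a no-op.
MERGE INTO fact_payments t
USING staging_payments s
ON t.txn_id = s.txn_id
-- Update only when the staged row is newer than the stored one.
WHEN MATCHED AND s.updated_at > t.updated_at THEN
  UPDATE SET status = s.status, amount = s.amount, updated_at = s.updated_at
-- status and updated_at are inserted too, so later updates have values to compare against.
WHEN NOT MATCHED THEN
  INSERT (txn_id, user_id, card_hash, amount, currency, status, event_time, created_at, updated_at)
  VALUES (s.txn_id, s.user_id, s.card_hash, s.amount, s.currency, s.status, s.event_time, NOW(), s.updated_at);
```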

Velocity features (24h window)

```sql
-- Per-user velocity features over the trailing 24 hours.
SELECT user_id,
       COUNT(*)                    AS tx_24h,
       SUM(amount)                 AS sum_24h,
       COUNT(DISTINCT card_hash)   AS uniq_cards_24h,
       COUNT(DISTINCT device_hash) AS uniq_devices_24h,
       MIN(event_time)             AS first_tx_24h,
       MAX(event_time)             AS last_tx_24h
FROM fact_payments
WHERE event_time >= NOW() - INTERVAL '24 hour'
GROUP BY user_id;
```

17) Anti-fraud launch checklist

  • Signals and schemas standardized, idempotency enabled
  • Feature Store with point-in-time correctness and online/offline parity
  • Labels built without leakage; deferred-truth windows taken into account
  • Threshold policy with hysteresis and step-up; SLA and guardrails set
  • Case management and human-in-the-loop set up; explainability available
  • Metrics: PR-AUC, Recall@FPR, cost-utility; fairness diagnostics
  • Drift/error monitoring, alerts, incident runbooks
  • Governance: model/rule versions, reviews, decision audits, KYC/AML compliance
  • A/B/DiD plan for thresholds/policies; safe fallback to rules

Summary

Strong anti-fraud is a hybrid of rules, models, and graphs in a controlled loop: high-quality signals and features → threshold policy with hysteresis → fast online scoring and orchestration of actions → human-in-the-loop and transparent appeals → impact metrics and drift control. By following this scheme, you reduce losses, limit harm from false blocks, and maintain the trust of users and regulators.
