GambleHub

Object Storage: MinIO, S3

Brief Summary

Object storage is a flat key space (bucket/object) accessible through the S3 API, with high durability and horizontal scale. MinIO provides S3 compatibility on-prem and in Kubernetes; Amazon S3 is the cloud benchmark with a rich ecosystem. Key decisions: the fault-tolerance scheme (replication/EC), security policy, storage classes and lifecycles, plus SLOs on latency/bandwidth and cost per 1 TB/month.

Architecture and principles

Units: bucket → object (key), metadata (ETag, versions, tags), ACLs/policies.
API: PUT/GET/DELETE, Multipart Upload, Presigned URLs, Copy, ListObjectsV2, S3 Select (server-side filtering), Notifications.
Consistency: modern S3 and MinIO provide strong read-after-write consistency.
Durability vs availability: achieved by replication/erasure coding distributed across nodes/zones/regions.

Product Use Cases

Media/content (art, previews, provider catalogs): cheap storage + CDN.
Logs/raw events/feature stores: cheap ingest, Parquet/JSON formats.
Backups/snapshots of databases and artifacts: versioning + Object Lock (WORM).
ML/analytics: datasets, models, checkpoints; presigned URLs for secure delivery.
Reporting/compliance: immutability and retention by policy.

Selection: S3 (cloud) vs MinIO (on-prem/K8s)

S3 (cloud):
  • Pros: operability, storage classes (Standard/IA/Glacier-like), built-in multi-zone, ecosystem.
  • Cons: cost of outgoing traffic, data localization requirements.
MinIO (on-prem/K8s):
  • Pros: control over data/geography/networks/cost, high performance on NVMe, multi-tenancy.
  • Cons: operations are on your side (upgrades, observability, drives/networks).

Fault tolerance and coding schemes

Replication (N copies): simple but capacity-inefficient.
Erasure Coding (EC k+m): splits an object into k data + m parity blocks; survives m failures and saves space compared to an N-way replica.
MinIO topology: erasure sets, nodes in pools; ≥ 4 nodes, with disks spread across servers/racks, are desirable.
Multi-zone/multi-site: replica by zone/region, active-active buckets with conflict resolution by version.
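The capacity trade-off between EC k+m and N-way replication can be sketched with simple arithmetic (a hypothetical helper, not part of MinIO's API):

```python
def ec_overhead(k: int, m: int) -> float:
    """Raw-to-usable capacity ratio for an erasure-coded stripe
    of k data + m parity blocks (tolerates m block failures)."""
    return (k + m) / k

def replica_overhead(n: int) -> float:
    """Raw-to-usable capacity ratio for n full copies
    (tolerates n - 1 copy failures)."""
    return float(n)

# EC 8+4 survives 4 failures at 1.5x raw capacity,
# while 3-way replication survives only 2 failures at 3.0x.
print(ec_overhead(8, 4))    # 1.5
print(replica_overhead(3))  # 3.0
```

This is why EC is the default choice for bulk data: comparable failure tolerance at half the raw capacity of triple replication.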

Security and access

Authentication and rights

Root/service users, IAM policies (JSON), STS for temporary keys (assumed roles).
Bucket policies: `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`, conditions by prefix/tag/source IP/Referer.

Example IAM policy (read-only access under the `public/` prefix):
json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": [
      "arn:aws:s3:::media-bucket",
      "arn:aws:s3:::media-bucket/public/*"
    ],
    "Condition": {"StringLike": {"s3:prefix": ["public/*"]}}
  }]
}

Encryption

SSE-S3: server-managed keys in the storage service.
SSE-KMS: keys in an external/embedded KMS (Vault, cloud KMS), with rotation control and audit.
SSE-C: the key is supplied by the client (for critical paths).
In-flight encryption: TLS, mTLS between services/gateways.

Immutability

Bucket versioning (protection against deletes/overwrites).
Object Lock (WORM): Governance/Compliance modes, `RetainUntilDate` and Legal Hold fields.

Lifecycle Policies and Storage Classes

Lifecycle: transitions to "warm/cold" classes, deletion of old versions, retention periods for previews/temporary files.
MinIO tiering: on-prem → cloud S3 class/external bucket; selection by prefix/tag.

Example lifecycle (delete noncurrent versions after 30 days, archive after 90, expire after 365):
xml
<LifecycleConfiguration>
  <Rule>
    <ID>archive-90</ID>
    <Status>Enabled</Status>
    <Filter><Prefix>logs/</Prefix></Filter>
    <NoncurrentVersionExpiration><NoncurrentDays>30</NoncurrentDays></NoncurrentVersionExpiration>
    <Transition><Days>90</Days><StorageClass>GLACIER</StorageClass></Transition>
    <Expiration><Days>365</Days></Expiration>
  </Rule>
</LifecycleConfiguration>
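The decision logic of such a rule can be modeled to sanity-check thresholds before deploying them (a hypothetical helper; thresholds mirror the example above):

```python
def lifecycle_action(age_days: int, noncurrent: bool,
                     noncurrent_expire: int = 30,
                     archive: int = 90,
                     expire: int = 365) -> str:
    """Return the action a lifecycle rule like the example
    would take for an object version of a given age."""
    if noncurrent and age_days >= noncurrent_expire:
        return "expire-noncurrent"   # delete old versions early
    if age_days >= expire:
        return "expire"              # final deletion
    if age_days >= archive:
        return "transition-archive"  # move to a cold tier
    return "keep"

print(lifecycle_action(100, noncurrent=False))  # transition-archive
```

Running such a model against a sample of real object ages gives a quick estimate of how much data each rule will touch.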

Replication and multisite

CRR/SRR: Cross/Same-Region, selective prefixes/tags.
Active-Active: bidirectional replication with versioning; define precedence/conflict-resolution rules explicitly.
Validation and lag: lag metrics, alerts for undelivered objects.

Notifications and integration (event-driven)

MinIO Bucket Notifications: Kafka, NATS, Webhook, AMQP, MQTT, Elasticsearch.
Triggers: `s3:ObjectCreated:*`, `s3:ObjectRemoved:*`, `s3:Replication:*`.
Patterns: automatic preview generation, ETL into the DWH, feature-store updates, signals to anti-fraud.

Example `mc` webhook configuration:
bash
mc event add my/media arn:minio:sqs::thumbs:webhook \
  --event put --prefix uploads/
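MinIO notification payloads follow the S3 event schema; a minimal consumer might extract bucket/key pairs like this (a sketch; the sample payload is abbreviated, real events carry more fields and URL-encoded keys):

```python
import json

def parse_object_events(payload: str) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3-style
    notification payload ({"Records": [...]})."""
    records = json.loads(payload).get("Records", [])
    return [(r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
            for r in records]

sample = ('{"Records":[{"eventName":"s3:ObjectCreated:Put",'
          '"s3":{"bucket":{"name":"media"},'
          '"object":{"key":"uploads/a.png"}}}]}')
print(parse_object_events(sample))  # [('media', 'uploads/a.png')]
```

A webhook endpoint or Kafka consumer would feed each payload through a parser like this before triggering preview generation or ETL.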

Performance Profiles

Latency: p95/p99 GET/PUT; the target for hot API paths is p95 GET ≤ 30-50 ms in a local data center.
Throughput: multipart PUT (parts of 8-64 MB), parallel downloads, pipelining.
Network: 25-100 GbE, jumbo MTU inside the fabric, RSS/RPS on the NIC, NUMA affinity.
Disks: NVMe for the hot working set, HDD for archive; MinIO requires disk symmetry within an erasure set.
Client tuning: increase the SDK's `max_concurrency`, reuse TCP connections, set correct timeouts and backoff.
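Part size matters because S3 multipart uploads are capped at 10,000 parts; picking a part size for a given object size can be sketched as (the 8 MB floor comes from the range above):

```python
MIN_PART = 8 * 2**20   # 8 MiB lower bound, per the tuning note above
MAX_PARTS = 10_000     # S3 multipart upload limit

def choose_part_size(object_size: int, min_part: int = MIN_PART) -> int:
    """Smallest power-of-two multiple of min_part that keeps
    the upload under the 10,000-part limit."""
    part = min_part
    while object_size > part * MAX_PARTS:
        part *= 2  # double the part size until the part count fits
    return part

# A 1 TiB object needs 128 MiB parts to stay under 10,000 parts.
print(choose_part_size(2**40) // 2**20)  # 128
```

SDKs (e.g. boto3's `TransferConfig`) do an equivalent calculation internally; tuning it explicitly helps when saturating 25-100 GbE links.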

Observability and alerting

MinIO/S3 metrics: operations (PUT/GET/DELETE/List), bytes, errors, latency, replication lag, healing.
Host/drives: SMART/temperature, I/O queues, drops/retransmit.

SLO (examples):
  • Bucket availability ≥ 99.95% over 30 days.
  • p95 GET ≤ 50 ms (local), p95 PUT ≤ 150 ms (multipart).
  • Replication success ≥ 99.9%, lag ≤ 60 s at p95.
  • Failed-disk recovery time ≤ 24 hours (healing must not starve production traffic).
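An availability SLO translates directly into a downtime budget; a quick arithmetic sketch:

```python
def downtime_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed total downtime (minutes) for a given availability
    SLO over a rolling window."""
    return (1 - slo) * window_days * 24 * 60

# 99.95% over 30 days leaves about 21.6 minutes of downtime.
print(round(downtime_budget_minutes(0.9995), 1))  # 21.6
```

Alerting should fire well before the budget is exhausted, e.g. when the burn rate over the last hour would consume the 30-day budget in under a few days.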

FinOps and Economics

Cost of 1 TB/month: disks + depreciation + power + network + operations (for on-prem).
Egress cost: plan caches/CDN, offload previews to the cloud.
Tiering/lifecycle: aggressive movement of cold data, compression/partitioning (Parquet).

Quotas and budgets: per-tenant limits on buckets/bytes/RPS, "$ per 1M requests" reports.

Spot/preemptible compute for ETL: if you run processing next to MinIO.

Deploy MinIO

Bare-metal (simplified EC cluster)

bash
minio server http://node{1...4}/export{1...8} \
  --console-address ":9001" --address ":9000"
Recommendations:
  • ≥ 4 nodes, 8-12 disks per node; identical disk size/speed.
  • Spread nodes across racks/power feeds/switches.
  • Reverse proxy/Ingress (TLS 1.2+/1.3, HSTS), mTLS for internal clients.

Kubernetes (Tenants)

MinIO Operator (CRD `Tenant`), StatefulSets with disks, PV/PVC, anti-affinity, topology spread.
Resources: CPU pools for network flows, high `ulimit` (FDs), dedicated storage classes (NVMe/HDD).
Updates: rolling, with healing/replication and SLO control.

`mc` tools (MinIO Client)

bash
mc alias set my https://minio.example KEY SECRET

# create a bucket, enable versioning and WORM
mc mb my/media
mc version enable my/media
mc retention set --default COMPLIANCE 365d my/media

# read-only policy for public/
mc anonymous set-json ./policy.json my/media

# replication to a cloud bucket
mc replicate add my/media --remote-bucket s3/backup \
  --replicate "delete,delete-marker"

# notifications to Kafka
mc event add my/media arn:minio:sqs::k1:kafka --event put,delete

Product Integration Patterns

Presigned URLs for upload/download without issuing keys directly.
Content validation: size/type limits, antivirus scanning triggered by notifications.
Metadata/tags: for lifecycle/archiving/moderation.
CDN in front of the object store: lower egress and latency for end users.
RAG/ML: storage of embeddings/shards, dataset manifests, model versions (a Model Registry over S3).

Safety and compliance

Audit logs: who/what/when (PUT/GET/DELETE), immutable logs in a separate WORM bucket.
Network controls: dedicated VLAN/VRF, security groups/ACLs, private endpoints.
KMS and key rotation: annual rotation policy, dual control on unseal.
PII/PCI: bucket segmentation, strict access policies (ABAC by data tags), Object Lock for reporting.

Launch checklist

  • Data classes selected: hot/warm/cold; RPO/RTO/SLO targets.
  • Erasure sets and node count designed; failure tests performed.
  • TLS/mTLS, KMS, IAM/STS, bucket policies and versioning.
  • Lifecycle/tiering and replication; Object Lock for critical buckets.
  • Notifications to Kafka/Webhook; antivirus/ETL/previews.
  • Monitoring (operations, replication lag, disks, network), alerts and dashboards.
  • Update/expansion plan (rolling), healing/rebalance runbook.
  • Per-tenant quotas/billing/reporting.

Common errors

Mixing NVMe and HDD in one erasure set → unpredictable latency.
No versioning/retention → risk of data loss/ransomware.
Multipart disabled/parts too small → low throughput.
Critical data buckets left unreplicated.
No DR/recovery tests and no egress cost control.

iGaming/fintech specific

Logs/raw events: Parquet + lifecycle (hot for 7-30 days, then archive/tiering).
Media content and providers: presigned GET, CDN, aggressive cache-control.
Wallet/database backups: versioning + WORM, regular DR exercises, an isolated account/cluster for replicas.
Anti-fraud/feature stores: low read latency (local MinIO), events to Kafka for computations.
Reporting and regulators: Object Lock (Compliance), immutable audit logs, clear retention policies.

Summary

S3-compatible object storage is a basic building block of a modern platform. The right EC/replication scheme, strict IAM/encryption/retention, a well-thought-out lifecycle/tiering setup, and notifications turn it into a reliable backing store for media, logs, backups, and ML data. With MinIO you get control and on-prem/K8s performance; with S3, the scale and ecosystem of the cloud. Capture everything in IaC, measure SLOs and cost, and the storage layer will be a predictable, safe, and economical foundation for your products.
