Kogort tahlili
Kogort tahlili
Kogort tahlili obyektlarni (odatda foydalanuvchilarni) yagona boshlanish hodisasi bo’yicha guruhlaydi va ular qanchalik faol va qimmatli bo’lib qolishini taqqoslaydi. Bunday yondashuv tizimdagi vaqt ta’sirini (mavsumlar, aksiyalar) kogorta yoshi ta’siridan (boshlangan kundan boshlab) ajratadi.
1) Bazaviy ta’riflar
Kohort (cohort): «tug’ilgan» voqeasi bilan birlashtirilgan ko’plab o’yinchilar - ro’yxatdan o’tish, birinchi depozit, birinchi o’yin, birinchi xarid.
Kalendar oʻqi (calendar time): haqiqiy sanalar (2025-10-01,...).
Kogorta yoshining o’qi (cohort age): «tug’ilgan» kundan boshlab kunlar/haftalar (D0, D1,...).
Ushlab qolish metrikasi: D1/D7/D30 (Exact va Rolling), WAU/MAU, Stickiness (DAU/MAU).
Monetizatsiya: ARPU/ARPPU, kümülativ LTV (D7/D30/D90).
Hisobga olish birligi: foydalanuvchi (user/master_id) - pasportga qayd qiling.
2) Kogortalarning turlari va ularni qachon tanlash
Acquisition-kogortlar: ro’yxatga olish/birinchi tashrif sanasi bo’yicha - jalb qilish va onbording kanallarini baholash.
Activation/Monetization-kogortlar: birinchi depozit/xarid bo’yicha - early-monetization va promo bahosi.
Feature-kogortlar: birinchi fich/oʻyin toifasidan foydalanish boʻyicha - relizlar effekti.
Behavior-kogortlar: RFM/boshlang’ich patterni bo’yicha (masalan, «tungi mobil»).
3) O’qlar va to’rlar: matritsaga qanday qarash kerak
Kogort matritsasi: satrlar - kogortlar (kalendar), ustunlar - yosh (D0... D90).
Mavsumiylik: mavsumiy effektlarni ajratish uchun diagonallarni (bir xil kalendar kuni) solishtiring.
Normallashtirish: nisbiy metriklar (CR, ulushlar) + kumulyativ (LTV), ikkalasini ko’rsating.
4) Kogorta va metrik pasport (template)
5) Psevdo-SQL: retention matritsasi (Exact Dn)
sql
WITH regs AS (
SELECT user_id, DATE_TRUNC('day', MIN(ts)) AS cohort_day
FROM event_register
GROUP BY 1
),
act AS (
SELECT user_id, DATE_TRUNC('day', ts) AS act_day
FROM event_activity
),
ages AS (
SELECT r. user_id, r. cohort_day, a. act_day,
(a. act_day - r. cohort_day) AS age_days
FROM regs r
JOIN act a ON a. user_id = r. user_id
),
exact AS (
SELECT cohort_day,
age_days,
COUNT(DISTINCT user_id) AS users_active
FROM ages
GROUP BY 1,2
),
coh_size AS (
SELECT cohort_day, COUNT(DISTINCT user_id) AS cohort_size
FROM regs GROUP BY 1
)
SELECT e. cohort_day,
e. age_days,
e. users_active::decimal / NULLIF(c. cohort_size,0) AS exact_retention
FROM exact e
JOIN coh_size c USING (cohort_day)
WHERE age_days IN (1,7,30,90)
ORDER BY cohort_day, age_days;
Rolling Dn (1... n kunlik faollik)
sql
WITH days AS (... as above...),
roll AS (
SELECT cohort_day,
CASE WHEN age_days BETWEEN 1 AND 7 THEN 7
WHEN age_days BETWEEN 1 AND 30 THEN 30 END AS bucket,
COUNT(DISTINCT user_id) AS any_active
FROM days
WHERE age_days BETWEEN 1 AND 30
GROUP BY 1,2
)
SELECT r. cohort_day, r. bucket AS Dn,
r. any_active::decimal / s. cohort_size AS rolling_retention
FROM roll r
JOIN (SELECT cohort_day, COUNT(DISTINCT user_id) cohort_size FROM regs GROUP BY 1) s USING (cohort_day)
ORDER BY cohort_day, Dn;
6) Kogort LTV va monetizatsiya
Jami LTV (Dn): Dn kogortidan foydalanuvchi uchun daromad summasi.
ARPU/ARPPU: foydalanuvchiga/Dn bo’yicha to’lovchiga daromad.
To’lovchilar%: Dn ga ≥ 1 to’lovli ulush.
sql
WITH reg AS (
SELECT user_id, DATE_TRUNC('day', MIN(ts)) AS cohort_day
FROM event_register GROUP BY 1
),
pay AS (
SELECT user_id, amount, DATE_TRUNC('day', ts) AS pay_day
FROM fact_payments
),
ltv AS (
SELECT r. cohort_day,
(pay_day - r. cohort_day) AS age_days,
SUM(amount) AS rev
FROM reg r JOIN pay p USING (user_id)
WHERE pay_day >= r. cohort_day
GROUP BY 1,2
),
cum AS (
SELECT cohort_day, age_days,
SUM(rev) OVER (PARTITION BY cohort_day ORDER BY age_days ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rev_cum
FROM ltv
)
SELECT c. cohort_day, c. age_days,
c. rev_cum::decimal / NULLIF(sz. cohort_size,0) AS ltv_per_user
FROM cum c
JOIN (SELECT cohort_day, COUNT(DISTINCT user_id) cohort_size FROM reg GROUP BY 1) sz USING (cohort_day)
WHERE age_days IN (7,30,90)
ORDER BY cohort_day, age_days;
7) Survival/hazard ushlab qolish uchun
Kaplan-Meier: omon qolish uchun mo’ljallanmagan egri chiziq (S (t)) - «oqmaganlar» ulushi.
Hazard-modellar (So’x/logit-kunlar): belgilarning (kanal, mamlakat, platforma, bonuslar, kontent) chiqib ketish xavfiga ta’siri.
Amaliyot: KMni segmentlar bo’yicha quramiz, keyin farqni hazard modeli bilan tushuntiramiz.
8) Mavsumiylik, TZ va kalendar
TZ: voqealarni UTCda saqlang, mahalliy TZ bozorida tahlil qiling; konsistent boʻling.
Kalendar: bayramlar/ish haqi/uchrashuvlar/relizlar - bayroqlarga o’xshaydi; o’xshash haftalarning kogortalarini solishtiring.
Harakatlanuvchi oyna: haftalik/oylik kogortalar uchun - bayramlar va hisobot davrlarining ko’payishi.
9) Segmentatsiya va atributsiya
Segmentlar: jalb qilish kanali, platforma/OS, geo, birinchi kontent, narx/limitlar, to’lov usuli.
Kogortaning atributlari: foydalanuvchini «kim olib keldi» - algoritmni (last non-direct, data-driven) tuzating.
LTV tortish: kanallar/segmentlar bo’yicha nafaqat CR, balki LTV (D30/D90) ni ham solishtiring.
10) Vizualizatsiya
Kogort matritsasining issiqlik kartasi (CR/LTV).
Trend chiziqlari kalendar boʻyicha D1/D7/D30.
Survival/Hazard grafiklar.
Bridge «LTV ni D30 ga o’zgartirdi»: to’lovchilar hissasi, chastota, o’rta chek.
11) Eksperimentlar va sabablar
A/B: onbording, tutoriallar, paywall, offerlar. Asosiy metrikasi - D7/D30 retention va LTV (D30).
Kvazieksperimentlar: DiD/bozorlarda sotish uchun sintetik nazorat.
Uplift modellari: Qayta faollashtirishda (Qini/AUUC, uplift @k) qaytish o’sishini nishonga oling.
12) Ekspluatatsiya va governans
Version:’RET _ D7 _ vN’,’LTV _ D30 _ vN’; aktivlik/valyutani aniqlash o’zgarganda changelog.
SLO yangilik: kundalik kogortalar - soat 06:00 gacha tayyorgarlik; ma’lumotlar muddati ≤ 1 soat
Sifati: voqealar coverage, dublikatlar ulushi, kogortdan tashqari bot/frod ulushi.
Foydalanish: RLS/CLS, PII niqoblash; eksport - faqat agregatlar.
Runbooks: D1 (onbording), D7 (kontent), hodisalar/identifikatsiyalarning buzilishi.
13) Tez-tez xatolar (anti-patternlar)
O’qlarning aralashishi: kohortning turli yoshlarini turli fasllarda tuzatishlarsiz solishtiriladi.
Rolling vs Exact: bir xil talqin qilinadi.
Birliklarni aralashtirish: bo’limdagi sessiyalar, bo’limdagi foydalanuvchilar.
«O’rtacha» yig’indisi: o’lchov/bo’limlarni yig’ish o’rniga.
TZ/kalendar ignori: D1 ning kun/bayram chegaralarida siljishi.
Bot/frod/QA filteri yoʻq.
Hisobga olinmagan restartlar: moslamasiz hisoblar split/merge.
14) Kogort hisobotini e’lon qilishdan oldin chek-varaq
- Tug’ilgan voqea, birlik, TZ, faollik oynalari aniqlandi
- Botlar/frod/QA chiqarib tashlandi; identifikatsiya qilingan (golden record)
- CR (Exact/Rolling) va LTV matritsalari D7/D30/D90
- Taqvim/bayramlar inobatga olingan; kanal/platforma/geo bo’yicha segmentlar
- Survival/hazard grafiklari va bridge LTV qoʻshilgan
- Metrika versiyalari va atributsiya algoritmi hujjatlashtirilgan
- Yangi SLO sozlash, coverage/dublikat/xato monitoringi
- Runbooks D1/D7 qulashi va hodisalarning uzilishi uchun tayyor
Jami
Kogort tahlili - bu ikkita o’q va intizom: o’rnatilgan «tug’ilgan vaqt», to’g’ri oynalar va TZ, ushlab turish matritsalari va LTV, segmentatsiya va o’zgarishlarni sababiy tekshirish. Bunday yondashuv nafaqat egri chiziqlarni kuzatishga, balki onbordingni qayerda tuzatish, qaysi kanallarni kattalashtirish, qaysi kontent va offerlar oʻyinchilarni uzoqroq ushlab turish va LTVni oshirish kabi qarorlar qabul qilishga yordam beradi.