GH GambleHub

Team rotation and shifts

1) Rotation objectives

Rotation is a systematic way to provide continuous coverage, predictable load, and rapid response without burnout or loss of context. Key objectives:
  • even distribution of pages and night hours;
  • guaranteed replacement in case of force majeure;
  • transparency of schedules, vacations, and restrictions;
  • adherence to SLA/compliance requirements and preservation of the audit trail.

2) Roles and coverage

P1 (Primary on-call): first response, triage, synchronization with the IC.
P2 (Secondary on-call): backup for overload/escalations.
IC-of-the-day/Duty Manager: leads SEV-1+ incidents and coordinates decisions.
Observer/Shadow: learns by shadowing a shift, receives no pages.

Recommendations:
  • avoid releases within ±30 minutes of a shift change;
  • for complex windows, keep two active slots (P1 + P2);
  • the IC has a dedicated shift and does not double as P1.
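
The release recommendation above can be enforced as a simple gate. A minimal sketch, assuming shift changes are known as a list of timestamps; the function name `release_allowed` and the sample times are illustrative, not part of any real tool:

```python
from datetime import datetime, timedelta

# Margin taken from the "±30 minutes" recommendation above.
FREEZE_MARGIN = timedelta(minutes=30)


def release_allowed(release_at: datetime, shift_changes: list[datetime],
                    margin: timedelta = FREEZE_MARGIN) -> bool:
    """Return False if the release falls within `margin` of any shift change."""
    return all(abs(release_at - change) > margin for change in shift_changes)


# Hypothetical shift changes at 08:00 and 16:00.
changes = [datetime(2024, 1, 15, 8, 0), datetime(2024, 1, 15, 16, 0)]
print(release_allowed(datetime(2024, 1, 15, 15, 45), changes))  # False
print(release_allowed(datetime(2024, 1, 15, 12, 0), changes))   # True
```

A release exactly 30 minutes from a shift change is treated as blocked here; that boundary choice is an assumption.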

3) Rotation models

24/7 with 8-hour shifts: morning/day/night (3 crews). Lowest fatigue, more handovers.
24/7 with 12-hour shifts: fewer handovers, but requires compensation and strict limits.
Follow-the-sun: regions hand off coverage across time zones; fewer late-night pages.
Follow-the-moon: night coverage is moved to the "far" region for load outside local prime time.
Week-on/Week-off: one week on-call, then a week without pages (for mature teams and low noise).

4) Fairness and sustainability rules

Night/weekend quotas: maximum N nights and M weekend shifts per person per period.
Balance of pages: if an engineer exceeds the target threshold for the period, pages are redistributed and the cause is remediated.
No solo shifts: night windows are always staffed by P1 + P2.
Unavailability windows: registered in advance (vacation/illness/training); the schedule is recalculated automatically.
Shadow periods: each new on-call engineer completes at least 2 shadow shifts.
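
Two of these rules, the night quota and the ban on solo night shifts, are easy to check mechanically. A minimal sketch, assuming a hypothetical `Shift` record and counting a night toward the quota for both P1 and P2:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Shift:
    day: str               # ISO date, e.g. "2024-01-15"
    is_night: bool
    p1: str
    p2: Optional[str]      # None models a solo shift


def fairness_violations(shifts: list[Shift], max_nights: int) -> list[str]:
    """Check the night quota (N per period) and the no-solo-night rule."""
    problems = []
    nights = Counter()
    for s in shifts:
        if not s.is_night:
            continue
        if s.p2 is None:
            problems.append(f"{s.day}: night shift has no P2")
        for person in (s.p1, s.p2):
            if person is not None:
                nights[person] += 1
    for person, count in sorted(nights.items()):
        if count > max_nights:
            problems.append(f"{person}: {count} nights exceeds quota {max_nights}")
    return problems
```

The weekend quota (M) could be checked the same way with an `is_weekend` flag; it is left out to keep the sketch short.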

5) Schedule planning and publishing

Planning horizon: 6-8 weeks, revised every 2 weeks.
Shared rotation calendar (public, read-only); each slot lists P1/P2/IC/Shadow and contacts.
Swaps are requested through a ticket and confirmed by the bridge bot.
Publication: at least 14 days in advance (T-14); any change comes with a team notification.

6) Handover procedures

Shift card (required fields): active incidents (ID/SEV/owner), next step/ETA, window risks (releases/migrations/quotas), SLO status, enabled degradation feature flags, status page/comms.
Hand-off checklist (outgoing engineer): the card has been updated, all verbal knowledge has been moved into tickets, timers for updates have been set, the P2 contact has been confirmed.
Acceptance checklist (incoming engineer): read the card, reviewed the dashboards for the last 2-4 hours, took ownership of the incidents, posted an echo (acknowledgment) message in the channel.
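
The required fields of the shift card map naturally onto a small data structure that a bot can validate before the hand-off. A minimal sketch; the class names, field names, and `handoff_blockers` are assumptions for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    incident_id: str
    sev: str
    owner: str
    next_step: str = ""
    eta: str = ""


@dataclass
class HandoverCard:
    incidents: list[Incident] = field(default_factory=list)
    window_risks: list[str] = field(default_factory=list)
    slo_status: str = ""
    degradation_flags: list[str] = field(default_factory=list)
    status_page: str = ""
    p2_contact_confirmed: bool = False


def handoff_blockers(card: HandoverCard) -> list[str]:
    """Return reasons the outgoing engineer cannot hand the shift over yet."""
    blockers = []
    for inc in card.incidents:
        if not inc.next_step or not inc.eta:
            blockers.append(f"{inc.incident_id}: missing next step or ETA")
    if not card.slo_status:
        blockers.append("SLO status not filled in")
    if not card.p2_contact_confirmed:
        blockers.append("P2 contact not confirmed")
    return blockers
```

An empty list means the card passes these checks; items such as "timers for updates have been set" would need a data source this sketch does not model.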

7) Fatigue management

Page limits per hour and/or per shift, with auto-escalation to P2 when exceeded.
Quiet hours for P2/P3-priority signals: only critical alerts page.
Post-incident rest: mandatory time off after heavy nights (SEV-1+).
Weekly alert review → noise reduction, rule editing.
Load monitoring: pages-per-person chart and team mood (shift NPS).
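
The first rule, a page limit with auto-escalation, can be sketched as a sliding one-hour window. This is an illustration under assumptions: the class name `PageRouter` and the limit value are hypothetical, and a real pager would apply its own escalation policy:

```python
from collections import deque
from datetime import datetime, timedelta


class PageRouter:
    """Route pages to P1 until the hourly limit is hit, then escalate to P2."""

    def __init__(self, max_pages_per_hour: int):
        self.max_pages_per_hour = max_pages_per_hour
        self._recent = deque()  # timestamps of pages delivered to P1

    def route(self, now: datetime) -> str:
        # Forget pages that are an hour old or older.
        window_start = now - timedelta(hours=1)
        while self._recent and self._recent[0] <= window_start:
            self._recent.popleft()
        if len(self._recent) >= self.max_pages_per_hour:
            return "P2"  # limit reached: auto-escalate
        self._recent.append(now)
        return "P1"
```

Only pages delivered to P1 count toward the limit here, so escalated pages do not extend P1's overload window.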

8) Safety and compliance

JIT/JEA access: on-call privileges are granted only for the duration of the shift window.
Audit trail: who was on duty and who performed which actions; immutable storage.
Duty with sensitive operations (PII/payments): separate shift and clearance level; personal devices disallowed, SSO + mTLS.
Legal/PR/Privacy contact points are marked on the shift card.

9) Automation

Calendar ↔ pager ↔ ChatOps: the bot publishes who is on call, supports '/swap', and builds the handover card from sources (dashboards, tickets, releases).
Readiness check at the beginning of the shift: pager sound, VPN/SSO, access, communication.
Document templates: SOP/Runbook for routines and incidents; automatic links to them in alerts.
Integration with releases: release annotations → temporary suppression of non-key alerts for the first 30 minutes.
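
The release integration amounts to a time-window filter on non-key alerts. A minimal sketch, assuming release annotations are available as timestamps and each alert carries an `is_key_alert` flag (both hypothetical):

```python
from datetime import datetime, timedelta

SUPPRESSION_WINDOW = timedelta(minutes=30)


def should_suppress(alert_at: datetime, is_key_alert: bool,
                    release_times: list[datetime]) -> bool:
    """Suppress non-key alerts raised in the first 30 minutes after a release."""
    if is_key_alert:
        return False  # key alerts always go through
    return any(timedelta(0) <= alert_at - released < SUPPRESSION_WINDOW
               for released in release_times)
```

Alerts raised before the release or after the window are never suppressed, so a release cannot hide a pre-existing problem.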

10) Rotation quality metrics

MTTA/MTTR around shift changes (±30 minutes from the handover).
Handover Defect Rate - the share of incidents whose context was lost at a shift change.
Alerts per on-call hour (median/95th percentile), % actionable.
Load per person - pages/person/week; variance between participants.
Missed/Late Updates - delays against the Comms SLA.
Swap rate and causes (fatigue/vacation/conflict).
Shift NPS (from a short survey) and its trend.
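
Several of these metrics are plain descriptive statistics. A minimal sketch using only the Python standard library; the function names and input shapes are assumptions, and the "inclusive" percentile method is one reasonable choice among several:

```python
import statistics


def alerts_per_hour_summary(alerts_per_hour: list[float]) -> dict:
    """Median and 95th percentile of alerts per on-call hour."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(alerts_per_hour, n=20, method="inclusive")[-1]
    return {"median": statistics.median(alerts_per_hour), "p95": p95}


def load_spread(pages_per_person: dict[str, int]) -> dict:
    """Mean and population variance of pages per person per week."""
    values = list(pages_per_person.values())
    return {"mean": statistics.fmean(values),
            "variance": statistics.pvariance(values)}


def handover_defect_rate(incidents_handed_over: int, lost_context: int) -> float:
    """Share of handed-over incidents whose context was lost."""
    if incidents_handed_over == 0:
        return 0.0
    return lost_context / incidents_handed_over
```

A high variance in `load_spread` is the signal for the redistribution rule in the fairness section.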

11) Schedule templates

A. 24/7, 8-hour shifts (3 crews)

Crew A: 08:00-16:00
Crew B: 16:00-00:00
Crew C: 00:00-08:00
Each crew: P1 + P2; IC on a separate schedule (day slot)
Rotation: A→B→C every week; weekends rotate round-robin
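
The weekly A→B→C rotation in template A can be generated with modular arithmetic. A minimal sketch; the direction of rotation (each group moves to the next, later slot every week) is an assumption, as are the names `SLOTS`, `CREWS`, and `crew_slots`:

```python
SLOTS = ["08:00-16:00", "16:00-00:00", "00:00-08:00"]
CREWS = ["A", "B", "C"]


def crew_slots(week: int) -> dict[str, str]:
    """Map each crew to its slot for a 0-based week number.

    Each crew advances one slot per week, so the cycle repeats every 3 weeks.
    """
    return {crew: SLOTS[(i + week) % len(SLOTS)] for i, crew in enumerate(CREWS)}


for week in range(3):
    print(week, crew_slots(week))
```

Because the mapping is a rotation of a fixed list, every slot is covered by exactly one group in any week.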

B. Follow-the-sun (3 regions)


EU: 07:00–15:00      AMER: 15:00–23:00      APAC: 23:00–07:00 (UTC)
Each region: P1 local, P2 neighboring
IC: from the active region; handover 15 minutes before the shift change
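
The follow-the-sun template can be turned into a lookup from UTC hour to the region on call. A minimal sketch using the windows from the template; treating the "neighboring" P2 as the region that takes over next is an assumption:

```python
# (start_hour, end_hour) in UTC, end exclusive; APAC wraps past midnight.
REGION_WINDOWS = {"EU": (7, 15), "AMER": (15, 23), "APAC": (23, 7)}
# Assumption: the "neighboring" region is the one whose window comes next.
NEXT_REGION = {"EU": "AMER", "AMER": "APAC", "APAC": "EU"}


def active_region(utc_hour: int) -> str:
    """Return the region whose window contains the given UTC hour (0-23)."""
    for region, (start, end) in REGION_WINDOWS.items():
        if start < end:
            if start <= utc_hour < end:
                return region
        elif utc_hour >= start or utc_hour < end:  # window wraps midnight
            return region
    raise ValueError(f"no region covers hour {utc_hour}")


def on_call(utc_hour: int) -> dict[str, str]:
    p1 = active_region(utc_hour)
    return {"P1": p1, "P2": NEXT_REGION[p1]}
```

For example, `on_call(23)` places P1 in APAC with EU as the secondary under these assumptions.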

C. Week-on/Week-off (low noise)

Week 1: Team X (P1/P2)
Week 2: Team Y
IC-of-the-day: shared by both teams
Limit: no more than 2 consecutive weeks for one person
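
The two-consecutive-weeks limit is straightforward to verify over a published plan. A minimal sketch, assuming the plan is a list with one set of on-call names per week (a hypothetical format):

```python
def max_consecutive_weeks(assignments: list[set[str]], person: str) -> int:
    """Longest run of consecutive weeks in which `person` is on call."""
    longest = current = 0
    for on_call_people in assignments:
        current = current + 1 if person in on_call_people else 0
        longest = max(longest, current)
    return longest


def over_limit(assignments: list[set[str]], limit: int = 2) -> list[str]:
    """Names of people scheduled for more than `limit` consecutive weeks."""
    people = set().union(*assignments) if assignments else set()
    return sorted(p for p in people
                  if max_consecutive_weeks(assignments, p) > limit)
```

Running `over_limit` before publication catches cases where swaps have quietly stacked three weeks onto one person.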

12) Checklists

Before publishing the schedule

  • 24/7 coverage without gaps, P1 + P2 in each slot.
  • Vacations/training/availability restrictions are taken into account.
  • The balance of nights/weekends is fair.
  • IC and Shadow assigned.
  • Auto-sync with pager/calendar is enabled.
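
The first item, coverage without gaps, can be checked automatically before publication. A minimal sketch, assuming each slot is a `(start, end)` pair of datetimes; the function name `coverage_gaps` is illustrative:

```python
from datetime import datetime


def coverage_gaps(slots: list[tuple[datetime, datetime]],
                  start: datetime, end: datetime) -> list[tuple[datetime, datetime]]:
    """Return the uncovered intervals inside [start, end)."""
    gaps = []
    cursor = start  # everything before `cursor` is known to be covered
    for slot_start, slot_end in sorted(slots):
        if slot_start > cursor:
            gaps.append((cursor, min(slot_start, end)))
        cursor = max(cursor, slot_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))
    return gaps
```

Running the check once for P1 slots and once for P2 slots covers the "P1 + P2 in each slot" part of the item.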

Shift started

  • P1/P2/IC confirmed presence (bot/chat).
  • Access, communication, dashboards checked.
  • Handover card received, echo sent.

Shift completed

  • The handover card has been updated and closed.
  • Incidents handed over with next step/ETA.
  • A short AAR was performed, improvements were recorded (if there were failures).

13) Anti-patterns

A solo P1 at night without backup.
Publishing the schedule only a week ahead, with no planning horizon or replacements.
Releases during a shift change without an IC and gates.
"Verbal" handovers without a card or tickets.
Zero compensation/time off after heavy nights.
No audit of swaps and the reasons for replacements.
Rotation without training: a new on-call engineer is thrown straight "into battle."

14) Implementation Roadmap (4-6 weeks)

1. Week 1: coverage inventory, model selection (24/7 or follow-the-sun), role assignment.
2. Week 2: launch of calendar + pager + bot, handover/SOP templates.
3. Week 3: pilot of 2-3 weekly cycles, collecting metrics (alerts/hour, MTTA around shift changes).
4. Week 4: alert review, tuning of noise and quotas, introduction of Shadow shifts.
5. Weeks 5-6: formalization of compensation/quiet hours, reports for management, automation of swaps.

15) The bottom line

Rotation is a process, not an Excel sheet: transparent schedules, roles, and handover cards; calendar and pager automation; fair fatigue rules and limits; quality metrics and regular reviews. With this approach, shifts become predictable, people stay resilient, and users and partners do not notice that the team changes by the hour.
