Reliability Guarantee

99.9% Uptime SLA — Backed by Credits

Every minute of downtime costs your business riders and revenue. Our infrastructure is engineered for continuous availability with auto-scaling, multi-region redundancy, and a 24/7 Network Operations Center — all backed by a contractual 99.9% uptime SLA with financial credits.

View SLA Terms | Book Infrastructure Demo

Uptime SLA: 99.9%
API Response Time: <200ms
NOC Monitoring: 24/7
Disaster Recovery RTO: <4h

Infrastructure

Multi-Region, Multi-AZ Architecture Built for Zero Downtime

Our platform runs on Amazon Web Services across multiple Availability Zones within each deployment region. Every critical component — application servers, databases, caching layers, and message queues — is deployed in an active-active or active-standby configuration across at least two AZs, ensuring that a single data center failure never takes your platform offline.

Database clusters use synchronous replication across AZs with automatic failover that completes in under 30 seconds. Application load balancers continuously health-check every instance and route traffic away from unhealthy nodes before they impact end users. The entire architecture is defined as infrastructure-as-code, allowing us to rebuild a complete environment from scratch in under 90 minutes.

Infrastructure Stack

Compute: Amazon EKS with auto-scaling node groups that scale from 3 to 50+ nodes based on real-time traffic demand.
Database: Amazon Aurora with multi-AZ synchronous replication, automated backups every 6 hours, and point-in-time recovery.
Caching: Amazon ElastiCache Redis cluster for sub-millisecond session and geolocation data retrieval.
CDN: Amazon CloudFront with 450+ edge locations globally for static assets, API acceleration, and DDoS mitigation.
Load Balancing: Application Load Balancers with health checks every 10 seconds, connection draining, and SSL termination.

Performance

Auto-Scaling Infrastructure That Handles Peak Demand

Ride-hailing traffic is inherently unpredictable: New Year's Eve surges, event traffic spikes, and weather-driven demand can increase your concurrent trips tenfold within minutes. Our auto-scaling architecture absorbs these spikes without manual intervention.

Horizontal Auto-Scaling

Application pods scale automatically based on CPU, memory, and request latency metrics. Scale-up events trigger in under 60 seconds to meet sudden demand spikes.
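As a rough illustration of how a scaling decision of this kind can be computed, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current replicas x current metric / target metric)) and takes the largest result across several metrics. The function names, the metric targets, and the 3 to 50 replica bounds are hypothetical placeholders chosen for the example, not the platform's actual configuration.

```python
import math
from typing import Dict, Tuple


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by one metric, using the HPA-style ratio formula."""
    ratio = current_value / target_value
    # Within the tolerance band, leave the replica count unchanged to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def scale_decision(current_replicas: int,
                   metrics: Dict[str, Tuple[float, float]],
                   min_replicas: int = 3,      # assumed lower bound
                   max_replicas: int = 50) -> int:  # assumed upper bound
    """Take the largest per-metric suggestion, then clamp it to the bounds.

    `metrics` maps a metric name to (current_value, target_value).
    """
    wanted = max(desired_replicas(current_replicas, current, target)
                 for current, target in metrics.values())
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    observed = {
        "cpu_percent": (90.0, 60.0),     # hypothetical targets
        "memory_percent": (50.0, 70.0),
        "latency_ms": (300.0, 150.0),
    }
    print(scale_decision(4, observed))
```

Taking the maximum across metrics means the most stressed signal drives the scale-up, so a latency spike can add pods even when CPU and memory look healthy.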

Sub-200ms API Latency

P95 API response time under 200 milliseconds globally. Critical ride-matching and location update APIs consistently respond in under 100ms at the 50th percentile.
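For readers less familiar with percentile targets, here is a minimal sketch of how P95 and P50 figures could be checked against a set of latency samples. It uses the nearest-rank percentile definition, which is one of several common conventions; the function names and the sample data are invented for illustration, and the monitoring stack may compute percentiles differently.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_targets(samples_ms: Sequence[float],
                          p95_limit_ms: float = 200.0,
                          p50_limit_ms: float = 100.0) -> bool:
    """True when P95 is under 200 ms and P50 is under 100 ms (limits taken from the text)."""
    return (percentile(samples_ms, 95) < p95_limit_ms
            and percentile(samples_ms, 50) < p50_limit_ms)


if __name__ == "__main__":
    # Hypothetical sample: 90 fast requests, 5 moderate, 5 slow outliers.
    latencies = [80.0] * 90 + [150.0] * 5 + [400.0] * 5
    print(percentile(latencies, 95), meets_latency_targets(latencies))
```

A P95 target tolerates a small tail of slow requests (here, the 5% at 400 ms) while still bounding what the large majority of calls experience.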

Global CDN

CloudFront CDN with 450+ edge locations ensures that static assets, app configurations, and map tiles load instantly for riders and drivers worldwide.

DDoS Protection

AWS Shield Advanced provides always-on network flow monitoring and automated DDoS mitigation that absorbs volumetric attacks without impacting legitimate traffic.

Real-Time Monitoring

Datadog-powered observability stack with custom dashboards tracking API latency, error rates, database performance, and WebSocket connection health in real time.

Load Testing

Quarterly load tests simulating 10x peak traffic volumes validate that auto-scaling policies activate correctly and that no degradation occurs under sustained high load.

Disaster Recovery

Tested Disaster Recovery with 4-Hour RTO

Our disaster recovery strategy follows a warm-standby model with cross-region database replication, automated infrastructure provisioning, and pre-staged application images. In the event of a complete regional failure, our DR procedures can restore full service to a secondary region within 4 hours (RTO) with a maximum data loss window of 6 hours (RPO).

We conduct full disaster recovery drills every quarter, simulating complete regional outages and validating that every component — from database restoration to DNS failover to SSL certificate provisioning — functions correctly under real recovery conditions. Each drill produces a detailed report with timing metrics and improvement recommendations.

Incident Communication

When incidents occur, you are never left in the dark. Our status page provides real-time updates, and affected clients receive proactive notifications via email and webhook within 15 minutes of incident detection. Post-incident, you receive a detailed root cause analysis and a corrective action plan within 5 business days.

Cross-Region Replication

Database writes are asynchronously replicated to a secondary AWS region every 5 minutes. In a regional outage, the replica can be promoted to primary within 30 minutes.

Automated Backups

Full database snapshots every 6 hours with 30-day retention. Point-in-time recovery available for any moment within the retention window at 5-minute granularity.

Infrastructure as Code

Entire environments defined in Terraform. A complete production replica can be provisioned from scratch in under 90 minutes in any supported AWS region.

DNS Failover

Route 53 health checks monitor the primary endpoint every 10 seconds. Automatic DNS failover to the DR region activates within 60 seconds of confirmed failure.
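To show how a 10-second check interval relates to a 60-second failover window, the sketch below models a health checker that marks an endpoint unhealthy after a number of consecutive failed checks. The failure threshold of 3 and the function name are assumptions made for this example; the text above does not state the threshold actually configured.

```python
from typing import Optional, Sequence


def seconds_to_mark_unhealthy(check_results: Sequence[bool],
                              interval_s: int = 10,
                              failure_threshold: int = 3) -> Optional[int]:
    """Return the elapsed time (s) at which the endpoint is marked unhealthy.

    Check number k (0-based) is assumed to run at k * interval_s seconds.
    Returns None if the threshold of consecutive failures is never reached.
    """
    consecutive_failures = 0
    for index, healthy in enumerate(check_results):
        # Any successful check resets the failure streak.
        consecutive_failures = 0 if healthy else consecutive_failures + 1
        if consecutive_failures >= failure_threshold:
            return index * interval_s
    return None


if __name__ == "__main__":
    # Two healthy checks, then the endpoint stops responding.
    print(seconds_to_mark_unhealthy([True, True, False, False, False]))
    # A single transient failure does not trigger failover.
    print(seconds_to_mark_unhealthy([True, False, True, True, True]))
```

Under these assumed settings an outage is confirmed roughly 20 to 30 seconds after it begins, which leaves the remainder of the 60-second window for the DNS change to reach resolvers. Requiring consecutive failures trades a little detection time for protection against failing over on a single dropped check.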

SLA Credits

Contractual SLA with Financial Accountability

We put our money where our uptime is. If we fail to meet our 99.9% monthly uptime commitment, you receive automatic service credits — no support tickets required.

99.9% - 99.5% Uptime

10% service credit applied automatically to your next billing cycle. Covers up to about 3.65 hours of cumulative downtime in a calendar month (0.5% of an average 730-hour month).

99.5% - 99.0% Uptime

25% service credit applied automatically. Covers between about 3.65 and 7.3 hours of cumulative monthly downtime. Includes priority incident review.

Below 99.0% Uptime

50% service credit plus executive incident review with CTO participation. Full root cause analysis delivered within 3 business days.

Uptime is calculated excluding scheduled maintenance windows (communicated 72 hours in advance) and force majeure events. Credits are applied automatically — no claim process required.
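The tier logic above can be sketched in a few lines. This is an illustrative reading of the published tiers, not the billing system's implementation: the function names are hypothetical, and the handling of exact boundary values (exactly 99.5% falling in the 10% tier, exactly 99.0% in the 25% tier) is an assumption, since only "below 99.0%" is stated explicitly.

```python
def sla_credit_percent(uptime_percent: float) -> int:
    """Map measured monthly uptime to a service credit percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    if uptime_percent >= 99.9:
        return 0   # SLA met, no credit due
    if uptime_percent >= 99.5:
        return 10  # assumed: exactly 99.5% lands in this tier
    if uptime_percent >= 99.0:
        return 25  # "below 99.0%" is the 50% tier, so exactly 99.0% lands here
    return 50


def credit_amount(monthly_fee: float, uptime_percent: float) -> float:
    """Credit applied to the next billing cycle, based on the monthly fee."""
    return monthly_fee * sla_credit_percent(uptime_percent) / 100.0


if __name__ == "__main__":
    # Hypothetical monthly fee of 1,000 and a month measured at 99.2% uptime.
    print(credit_amount(1000.0, 99.2))
```

The uptime passed in is assumed to be already adjusted for scheduled maintenance windows and force majeure events, as described above.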

Client Testimonial

Operators Trust Our Reliability

"We process over 8,000 rides per day across three cities. During Ramadan last year, our daily volume tripled almost overnight. The platform scaled without a single dropped request. In 18 months of operation, we have experienced exactly 12 minutes of total downtime — and we received a proactive notification before we even noticed."

VP of Technology

Regional Ride-Hailing Platform, Saudi Arabia — 2,400+ Drivers

FAQ

SLA & Uptime FAQ

What does 99.9% uptime mean in practice?

A 99.9% monthly uptime SLA allows a maximum of 43.8 minutes of unplanned downtime per month (0.1% of an average 730-hour month). In practice, our actual uptime over the past 12 months has averaged 99.97%, which translates to about 13 minutes of downtime per month. Scheduled maintenance windows are excluded from the calculation and are always communicated 72 hours in advance.
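The arithmetic behind an uptime percentage is easy to reproduce. The sketch below assumes an average month of 730 hours (8,760 hours divided by 12), which is the basis the 43.8-minute figure implies; the function name is made up for the example.

```python
# Assumption: an average month of 730 hours, i.e. 43,800 minutes.
MINUTES_PER_MONTH = 730 * 60


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Unplanned downtime, in minutes per month, permitted at a given uptime level."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_MONTH * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for target in (99.9, 99.97, 99.5, 99.0):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% uptime -> {minutes:.1f} min/month ({minutes / 60:.2f} h)")
```

The same function gives the downtime that corresponds to each credit tier boundary, for example 219 minutes (3.65 hours) at 99.5% and 438 minutes (7.3 hours) at 99.0%.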

How do SLA credits work?

SLA credits are applied automatically to your next billing cycle based on the measured uptime for the previous month. You do not need to file a claim or contact support. Credits are tiered: 10% for uptime between 99.5% and 99.9%, 25% for uptime between 99.0% and 99.5%, and 50% for uptime below 99.0%. Credits are calculated against your monthly hosting or managed service fee.

What is your disaster recovery strategy?

We maintain a warm-standby disaster recovery site in a separate AWS region with asynchronous database replication every 5 minutes. Our Recovery Time Objective (RTO) is under 4 hours, and our Recovery Point Objective (RPO) is 6 hours. We conduct full DR drills quarterly, simulating complete regional failures and validating end-to-end recovery procedures.

How will I be notified during an incident?

You receive proactive notifications within 15 minutes of incident detection via email and optional webhook. Our public status page provides real-time updates throughout the incident lifecycle. After resolution, you receive a post-incident report within 5 business days that includes root cause analysis, timeline, impact assessment, and corrective actions taken.

Can the platform handle sudden traffic spikes?

Yes. Our Kubernetes-based auto-scaling provisions additional compute capacity within 60 seconds of detecting increased demand. We test the platform at 10x peak load during quarterly load tests. Real-world events like New Year's Eve, major sporting events, and festival periods have triggered 5-8x normal traffic volumes without any service degradation.

Do you provide a real-time status page?

Yes. Every client has access to a real-time status page that displays current system status, historical uptime metrics, and active incident updates. You can also subscribe to email, SMS, or webhook notifications for status changes. The status page tracks uptime across all components independently — API, WebSocket, admin panel, payment processing, and mapping services.

Always On

Build on Infrastructure You Can Count On

Review our SLA terms, see our infrastructure architecture, or schedule a technical deep-dive with our platform engineering team.

Book an Infrastructure Demo | View Pricing & SLA