[Sandbox] Orchestration - GraphQL API outage - Incident details - Awell

Updates

Resolved
July 15, 2024 at 3:22 PM
Early this morning, our Sandbox environment experienced an outage caused by a new data replication feature intended to improve the availability of the database cluster. During a routine maintenance operation, the data replication configuration prevented a database server from shutting down gracefully.
The server eventually restarted but was unable to synchronize with the other servers. The resulting continuous loop of failed synchronization attempts eventually made the cluster unresponsive.
The Production environments are not at risk from this issue because they use a different data replication configuration. We have aligned the Sandbox configuration with the Production environments to eliminate the risk of this occurring again.
The issue has now been fully resolved. We apologize for any inconvenience caused and appreciate your understanding.
Monitoring
July 15, 2024 at 10:57 AM
The database backup has been successfully restored, and all services are back online. We will continue investigating the root cause of this issue and will post updates as we learn more.
Identified
July 15, 2024 at 10:38 AM
Our database cluster started experiencing issues around 4:15 AM UTC. Despite our best efforts, we were not able to restore it to a healthy state. To restore service, we decided to restore data from the latest backup. This operation is ongoing and should complete within the next hour. More information will be posted once service is restored.
Investigating
July 15, 2024 at 4:26 AM
[Sandbox] Orchestration - GraphQL API cannot be accessed at the moment. This incident was created by an automated monitoring service.
