Root Cause
Last Sunday, a key rotation was performed on the database cluster, but the Postgres operator rollout was not executed as part of it. Kubernetes later initiated the rollout, which triggered a failover of the database instances. During the failover, some users were unable to access Pipefy.
Resolution
Failover is a built-in feature of the database cluster, and it had to run to completion without interruption. Total downtime was 4 minutes.
Action Plan
No immediate action was taken during the incident, because the failover was expected behavior. However, the necessary actions should have been performed during the scheduled maintenance window on Sunday.