Slowness in Platform

Incident Report for Pipefy

Postmortem

Root Cause

Our platform experienced degraded performance because the environment could not scale to meet increasing resource demand. The root cause was a shortage of machines: our cloud provider, OCI, had a capacity shortage for the specific machine type we use, which is currently in high demand. As a result, we could not add capacity when it was needed, and this directly impacted our services.

Resolution

We resolved the issue by manually scaling the environment, adding a different type of machine to the cluster to improve performance.

Action Plan

To prevent similar issues in the future, we have implemented the following measures:

- Created additional fallback node pools to enhance resource availability and improve scalability.
- Configured a dedicated alert for resource scarcity to enable faster detection and proactive response.
- Submitted a request for a medium CPU usage reservation to our cloud provider.
- Filed a formal complaint with our Cloud representatives, urging them to address the capacity limitations impacting our services.
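
The idea behind the fallback node pools and the resource scarcity alert can be illustrated with a small sketch. This is not our actual configuration or any OCI API: the `NodePool` type, the pool names, the machine type labels, and the availability numbers are all hypothetical, chosen only to show how a scale-up request might spill over to fallback pools and when an alert would fire.

```python
from dataclasses import dataclass


@dataclass
class NodePool:
    name: str       # hypothetical pool name
    shape: str      # hypothetical machine type label
    available: int  # machines the provider can currently supply (illustrative)


def plan_scale_up(pools, needed):
    """Walk pools in priority order, taking machines until `needed` is met.

    Returns (plan, shortfall): `plan` is a list of (pool name, count) pairs,
    and `shortfall` is how many machines no pool could supply.
    """
    plan = []
    remaining = max(needed, 0)
    for pool in pools:
        if remaining == 0:
            break
        take = min(pool.available, remaining)
        if take > 0:
            plan.append((pool.name, take))
            remaining -= take
    return plan, remaining


def should_alert(shortfall):
    """Resource scarcity alert: fire when any requested capacity is unmet."""
    return shortfall > 0


# Illustrative scenario: the primary machine type is out of capacity,
# so the request is served by the fallback pools instead.
pools = [
    NodePool("primary", "shape-a", 0),
    NodePool("fallback-1", "shape-b", 3),
    NodePool("fallback-2", "shape-c", 5),
]
plan, shortfall = plan_scale_up(pools, needed=4)
print(plan, shortfall, should_alert(shortfall))
```

In this sketch the alert only fires when every pool, including the fallbacks, is exhausted. A real setup might also alert as soon as the primary pool cannot serve a request, so that the capacity problem is visible before the fallbacks run out.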

We are fully committed to enhancing platform stability and will continue to closely monitor the situation to ensure optimal performance.

Posted Jul 01, 2025 - 19:03 UTC

Resolved

This incident has been resolved.
Posted Jun 25, 2025 - 16:23 UTC

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Jun 25, 2025 - 14:07 UTC

Identified

The issue has been identified and a fix is being implemented.
Posted Jun 25, 2025 - 13:49 UTC

Investigating

We are currently investigating this issue.
Posted Jun 25, 2025 - 13:40 UTC
This incident affected: Application.