Improving platform resilience at Cloudflare through automation

2024-10-09

Edge, Engineering, Serverless, Developer Platform, Developers, Agile Developer Services, Go, Reliability, Speed & Reliability

We realized that we needed a way to automatically heal our platform from an operations perspective, so we designed and built a workflow orchestration platform to provide these self-healing capabilities across our global network. We explore how this has helped us reduce the impact of operational issues on our customers, and the rich variety of similar problems it has empowered us to solve....