Creates a new Git repository on your GitHub/GitLab account: Cloudflare will automatically clone the project and create a new repository on your account, so you can continue developing.
Automatically provisions resources the app needs: If your repository requires Cloudflare primitives like a Workers KV namespace, a D1 database, or an R2 bucket, Cloudflare will automatically provision them on your account and bind them to your Worker upon deployment.
Configures Workers Builds (CI/CD): Every new push to your production branch on your newly created repository will automatically build and deploy courtesy of Workers Builds.
There is nothing more frustrating than struggling to kick the tires on a new project because you don’t know where to start. Over the past couple of months, we’ve launched some improvements to getting started on Workers, including a gallery of Git-connected templates that help you kickstart your development journey.
But we think there’s another part of the story. Every day, we see new Workers applications being built and open-sourced by developers in the community, ranging from starter projects to mission-critical applications. These projects are designed to be shared, deployed, customized, and contributed to. But first and foremost, they must be simple to deploy.
If you’ve open-sourced a new Workers application before, you may have listed in your README the following in order to get others going with your repository:
“Clone this repo”
“Install these packages”
“Install Wrangler”
“Create this database”
“Paste the database ID back into your config file”
“Run this command to deploy”
“Push to a new Git repo”
“Set up CI”
And the list goes on the more complicated your application gets, deterring other developers and making your project feel intimidating to deploy. Now, your project can be up and running in one shot — which means more traction, more feedback, and more contributions.
We’re not just talking about building and sharing small starter apps but also complex pieces of software. If you’ve ever self-hosted your own instance of an application on a traditional cloud provider before, you’re likely familiar with the pain of tedious setup, operational overhead, or hidden costs of your infrastructure.
| Self-hosting with traditional cloud provider | Self-hosting with Cloudflare |
| --- | --- |
| Set up a VPC | ✅ Serverless |
| Install tools and dependencies | ✅ Highly-available global network |
| Set up and provision storage | ✅ Automatic provisioning of datastores like D1 databases and R2 buckets |
| Manually configure CI/CD pipeline to automate deployments | ✅ Built-in CI/CD workflow configured out of the box |
| Scramble to manually secure your environment if a runtime vulnerability is discovered | ✅ Automatic runtime updates to keep your environment secure |
| Configure autoscaling policies and manage idle servers | ✅ Scale automatically and only pay for what you use |
By making your open-source repository accessible with a Deploy to Cloudflare button, you can allow other developers to deploy their own instance of your app without requiring deep infrastructure expertise.
We’re inviting all Workers developers looking to open-source their project to add Deploy to Cloudflare buttons to their projects and help others get up and running faster. We’ve already started working with open-source app developers! Here are a few great examples to explore:
Fiberplane helps developers build, test and explore Hono APIs and AI Agents in an embeddable playground. This Developer Week, Fiberplane released a set of sample Worker applications built on the “HONC” stack (Hono, Drizzle ORM, D1 Database, and Cloudflare Workers) that you can use as the foundation for your own projects. With an easy one-click Deploy to Cloudflare, each application comes preconfigured with the open source Fiberplane API Playground, making it easy to generate OpenAPI docs, test your handlers, and explore your API, all within one embedded interface.
You can now build and deploy remote Model Context Protocol (MCP) servers on Cloudflare Workers! MCP servers provide a standardized way for AI agents to interact with services directly, enabling them to complete actions on users' behalf. Cloudflare's remote MCP server implementation supports authentication, allowing users to log in to their service from the agent to give it scoped permissions. This gives users the ability to interact with services without navigating dashboards or learning APIs: they simply tell their AI agent what they want to accomplish.
AI agents are intelligent systems capable of autonomously executing tasks by making real-time decisions about which tools to use and how to structure their workflows. Unlike traditional automation (which follows rigid, predefined steps), agents dynamically adapt their strategies based on context and evolving inputs. This template serves as a starting point for building AI-driven chat agents on Cloudflare's Agent platform. Powered by Cloudflare’s Agents SDK, it provides a solid foundation for creating interactive AI chat experiences with a modern UI and tool integration capabilities.
Be sure to make your Git repository public and add the following snippet including your Git repository URL.
[](https://deploy.workers.cloudflare.com/?url=<YOUR_GIT_REPO_URL>)
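As a rough illustration of the snippet above, the link is just the deploy endpoint with your repository URL appended as the `url` query parameter. The helper names `deploy_link` and `readme_snippet` and the link text are assumptions made for this sketch; only the endpoint and the `url` parameter come from the snippet itself.

```python
DEPLOY_ENDPOINT = "https://deploy.workers.cloudflare.com/"


def deploy_link(repo_url: str) -> str:
    """Return the deploy link for a public Git repository URL."""
    if not repo_url.startswith("https://"):
        raise ValueError("expected an https:// URL to a public Git repository")
    # The repository URL is appended as-is, mirroring the README snippet.
    return f"{DEPLOY_ENDPOINT}?url={repo_url}"


def readme_snippet(repo_url: str, label: str = "Deploy to Cloudflare") -> str:
    """Return a Markdown link for a README (the label is a placeholder)."""
    return f"[{label}]({deploy_link(repo_url)})"


# Hypothetical repository, used only for illustration.
print(readme_snippet("https://github.com/example/my-worker"))
```

Generating the line from the repository URL can help keep the README link in sync if the repository is renamed or forked.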
When another developer clicks your Deploy to Cloudflare button, Cloudflare will parse the Wrangler configuration file, provision any resources detected, and create a new repo on their account that’s updated with information about newly created resources. For example:
{
  "compatibility_date": "2024-04-03",

  "d1_databases": [
    {
      "binding": "MY_D1_DATABASE",

      // will be updated with newly created database ID
      "database_id": "1234567890abcdef1234567890abcdef"
    }
  ]
}
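To make the "updated with information about newly created resources" step concrete, here is a minimal sketch of the kind of rewrite involved. It is not Cloudflare's implementation: the function `apply_provisioned_ids` and the replacement ID are hypothetical, and the configuration is modeled as a plain dictionary.

```python
import copy


def apply_provisioned_ids(config: dict, new_ids: dict) -> dict:
    """Return a copy of the config with D1 database IDs replaced per binding.

    `new_ids` maps a binding name to the ID of the database that was just
    created on the deploying developer's account (hypothetical input).
    """
    updated = copy.deepcopy(config)
    for database in updated.get("d1_databases", []):
        binding = database.get("binding")
        if binding in new_ids:
            database["database_id"] = new_ids[binding]
    return updated


# The template author's config, as in the example above.
template = {
    "compatibility_date": "2024-04-03",
    "d1_databases": [
        {
            "binding": "MY_D1_DATABASE",
            "database_id": "1234567890abcdef1234567890abcdef",
        }
    ],
}

# Made-up ID standing in for the newly provisioned database.
deployed = apply_provisioned_ids(
    template, {"MY_D1_DATABASE": "00000000000000000000000000000000"}
)
print(deployed["d1_databases"][0]["database_id"])
```

The binding name stays the same, so the Worker's code does not need to change; only the resource ID differs between the original repository and the new one.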
Check out our documentation for more information on how to set up a deploy button for your application and best practices to ensure a successful deployment for other developers.
For new Cloudflare developers, keep an eye out for “Deploy to Cloudflare” buttons across the web, or simply paste the URL of any public GitHub or GitLab repository containing a Workers application into the Cloudflare dashboard to get started.
During Developer Week, tune in to our blog as we unveil new features and announcements — many including Deploy to Cloudflare buttons — so you can jump right in and start building!
"],"published_at":[0,"2025-04-08T14:00+01:00"],"updated_at":[0,"2025-04-08T13:00:02.729Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/38pbnujhmJ8qz7MTyQ5B6V/3ceafd1241de33f3a61bc2900be4c5b9/image1.png"],"tags":[1,[[0,{"id":[0,"2xCnBweKwOI3VXdYsGVbMe"],"name":[0,"Developer Week"],"slug":[0,"developer-week"]}],[0,{"id":[0,"6hbkItfupogJP3aRDAq6v8"],"name":[0,"Cloudflare Workers"],"slug":[0,"workers"]}],[0,{"id":[0,"4HIPcb68qM0e26fIxyfzwQ"],"name":[0,"Developers"],"slug":[0,"developers"]}],[0,{"id":[0,"3txfsA7N73yBL9g3VPBLL0"],"name":[0,"Open Source"],"slug":[0,"open-source"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Nevi Shah"],"slug":[0,"nevi"],"bio":[0,null],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/2WVp9J8BoRJaBMR7crkqWH/f7814ed0df05b50babb47c6ff5b936e5/nevi.png"],"location":[0,null],"website":[0,null],"twitter":[0,"@nevikashah"],"facebook":[0,null]}]]],"meta_description":[0,"You can now add a Deploy to Cloudflare button to your repository’s README when building a Workers application, making it simple for other developers to set up and deploy your project! 
"],"primary_author":[0,{}],"localeList":[0,{"name":[0,"blog-english-only"],"enUS":[0,"English for Locale"],"zhCN":[0,"No Page for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"No Page for Locale"],"frFR":[0,"No Page for Locale"],"deDE":[0,"No Page for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"No Page for Locale"],"koKR":[0,"No Page for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"No Page for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/deploy-workers-applications-in-seconds"],"metadata":[0,{"title":[0,"Skip the setup: deploy a Workers application in seconds"],"description":[0,"You can now add a Deploy to Cloudflare button to your repository’s README when building a Workers application, making it simple for other developers to set up and deploy your project! "],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/5apJrvxfcNveJr5PhvhOA1/bdc0287863c9baf2086ab9848aba1de3/Skip_the_setup-_deploy_a_Workers_application_in_seconds-OG.png"]}]}],[0,{"id":[0,"01zA7RtUKkhrUeINJ9AIS3"],"title":[0,"Open-sourcing OpenPubkey SSH (OPKSSH): integrating single sign-on with SSH"],"slug":[0,"open-sourcing-openpubkey-ssh-opkssh-integrating-single-sign-on-with-ssh"],"excerpt":[0,"OPKSSH (OpenPubkey SSH) is now open-sourced as part of the OpenPubkey project."],"featured":[0,false],"html":[0,"
OPKSSH makes it easy to SSH with single sign-on technologies like OpenID Connect, thereby removing the need to manually manage and configure SSH keys. It does this without adding a trusted party other than your identity provider (IdP).
A cornerstone of modern access control is single sign-on (SSO), where a user authenticates to an identity provider (IdP), and in response the IdP issues the user a token. The user can present this token to prove their identity, such as “Google says I am Alice”. SSO is the rare security technology that both increases convenience — users only need to sign in once to get access to many different systems — and increases security.
OpenID Connect (OIDC) is the main protocol used for SSO. As shown below, in OIDC the IdP, called an OpenID Provider (OP), issues the user an ID Token which contains identity claims about the user, such as “email is alice@example.com”. These claims are digitally signed by the OP, so anyone who receives the ID Token can check that it really was issued by the OP.
Unfortunately, while ID Tokens do include identity claims like name, organization, and email address, they do not include the user’s public key. This prevents them from being used to directly secure protocols like SSH or End-to-End Encrypted messaging.
Note that throughout this post we use the term OpenID Provider (OP) rather than IdP, as OP specifies the exact type of IdP we are using, i.e., an OpenID IdP. We use Google as an example OP, but OpenID Connect works with Google, Azure, Okta, etc.
Shows a user Alice signing in to Google using OpenID Connect and receiving an ID Token
OpenPubkey, shown below, adds public keys to ID Tokens. This enables ID Tokens to be used like certificates, e.g. “Google says alice@example.com is using public key 0x123.” We call an ID token that contains a public key a PK Token. The beauty of OpenPubkey is that, unlike other approaches, OpenPubkey does not require any changes to existing SSO protocols and supports any OpenID Connect compliant OP.
Shows a user Alice signing in to Google using OpenID Connect/OpenPubkey and then producing a PK Token
While OpenPubkey enables ID Tokens to be used as certificates, OPKSSH extends this functionality so that these ID Tokens can be used as SSH keys in the SSH protocol. This adds SSO authentication to SSH without requiring changes to the SSH protocol.
OPKSSH frees users and administrators from the need to manage long-lived SSH keys, making SSH more secure and more convenient.
“In many organizations – even very security-conscious organizations – there are many times more obsolete authorized keys than they have employees. Worse, authorized keys generally grant command-line shell access, which in itself is often considered privileged. We have found that in many organizations about 10% of the authorized keys grant root or administrator access. SSH keys never expire.”
- Challenges in Managing SSH Keys – and a Call for Solutions by Tatu Ylonen (Inventor of SSH)
In SSH, users generate a long-lived SSH public key and SSH private key. To enable a user to access a server, the user or the administrator of that server configures that server to trust that user’s public key. Users must protect the file containing their SSH private key. If the user loses this file, they are locked out. If they copy their SSH private key to multiple computers or back up the key, they increase the risk that the key will be compromised. When a private key is compromised or a user no longer needs access, the user or administrator must remove that public key from any servers it currently trusts. All of these problems create headaches for users and administrators.
OPKSSH overcomes these issues:
Improved security: OPKSSH replaces long-lived SSH keys with ephemeral SSH keys that are created on-demand by OPKSSH and expire when they are no longer needed. This reduces the risk a private key is compromised, and limits the time period where an attacker can use a compromised private key. By default, these OPKSSH public keys expire every 24 hours, but the expiration policy can be set in a configuration file.
Improved usability: Creating an SSH key is as easy as signing in to an OP. This means that a user can SSH from any computer with opkssh installed, even if they haven’t copied their SSH private key to that computer.
To generate their SSH key, the user simply runs opkssh login, and they can use ssh as they typically do.
Improved visibility: OPKSSH moves SSH from authorization by public key to authorization by identity. If Alice wants to give Bob access to a server, she doesn’t need to ask for his public key, she can just add Bob’s email address bob@example.com to the OPKSSH authorized users file, and he can sign in. This makes tracking who has access much easier, since administrators can see the email addresses of the authorized users.
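The two ideas above, short-lived keys and authorization by identity, can be sketched in a few lines. This is an illustration only, not OPKSSH's code or file format: the `AUTHORIZED_USERS` mapping from server account to email addresses and the function names are assumptions, and the 24-hour lifetime is the default mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Default lifetime described above; the real policy is configurable.
DEFAULT_KEY_LIFETIME = timedelta(hours=24)

# Hypothetical stand-in for an authorized users file:
# server account -> email addresses allowed to log in as that account.
AUTHORIZED_USERS = {
    "root": {"alice@example.com"},
    "dev": {"alice@example.com", "bob@example.com"},
}


def key_is_fresh(issued_at: datetime, now: datetime,
                 lifetime: timedelta = DEFAULT_KEY_LIFETIME) -> bool:
    """An ephemeral key is usable only within its lifetime."""
    return issued_at <= now < issued_at + lifetime


def may_log_in(email: str, account: str,
               issued_at: datetime, now: datetime) -> bool:
    """Authorize by identity (email), not by a stored public key."""
    return key_is_fresh(issued_at, now) and email in AUTHORIZED_USERS.get(account, set())


issued = datetime(2025, 3, 25, 9, 0, tzinfo=timezone.utc)
print(may_log_in("bob@example.com", "dev", issued, issued + timedelta(hours=1)))
print(may_log_in("bob@example.com", "dev", issued, issued + timedelta(hours=25)))
```

Granting Bob access means adding one email address to the policy; revoking it means removing that address, with no public keys to track down.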
OPKSSH does not require any code changes to the SSH server or client. The only change needed to SSH on the SSH server is to add two lines to the SSH config file. For convenience, we provide an installation script that does this automatically, as seen in the video below.
Shows a user Alice SSHing into a server with her PK Token inside her SSH public key. The server then verifies her SSH public key using the OpenPubkey verifier.
Let’s look at an example of Alice (alice@example.com) using OPKSSH to SSH into a server:
Alice runs opkssh login. This command automatically generates an ephemeral public key and private key for Alice. Then it runs the OpenPubkey protocol by opening a browser window and having Alice log in through her SSO provider, e.g., Google.
If Alice SSOs successfully, OPKSSH will now have a PK Token that commits to Alice’s ephemeral public key and Alice’s identity. Essentially, this PK Token says “alice@example.com authenticated her identity and her public key is 0x123…”.
OPKSSH then saves to Alice’s .ssh directory:
an SSH public key file that contains Alice’s PK Token
and an SSH private key set to Alice’s ephemeral private key.
When Alice attempts to SSH into a server, the SSH client will find the SSH public key file containing the PK Token in Alice’s .ssh directory, and it will send it to the SSH server to authenticate.
The SSH server forwards the received SSH public key to the OpenPubkey verifier installed on the SSH server. This is because the SSH server has been configured to use the OpenPubkey verifier via the AuthorizedKeysCommand.
The OpenPubkey verifier receives the SSH public key file and extracts the PK Token from it. It then verifies that the PK Token is unexpired, valid, signed by the OP and that the public key in the PK Token matches the public key field in the SSH public key file. Finally, it extracts the email address from the PK Token and checks if alice@example.com is allowed to SSH into this server.
Consider the problems we face in getting OpenPubkey to work with SSH without requiring any changes to the SSH protocol or software:
How do we get the PK Token from the user’s machine to the SSH server inside the SSH protocol?
We use the fact that SSH public keys can be SSH certificates, and that SSH certificates have an extension field that allows arbitrary data to be included in the certificate. Thus, we package the PK Token into an SSH certificate extension so that the PK Token will be transmitted inside the SSH public key as a normal part of the SSH protocol. This enables us to send the PK Token to the SSH server as additional data in the SSH certificate, and allows OPKSSH to work without any changes to the SSH client.
How do we check that the PK Token is valid once it arrives at the SSH server?
SSH servers support a configuration parameter called AuthorizedKeysCommand that allows us to use a custom program to determine if an SSH public key is authorized or not. Thus, we change the SSH server’s config file to use the OpenPubkey verifier instead of the SSH verifier by making the following two line change to sshd_config:
The OpenPubkey verifier will check that the PK Token is unexpired, valid and signed by the OP. It checks the user’s email address in the PK Token to determine if the user is authorized to access the server.
How do we ensure that the public key in the PK Token is actually the public key that secures the SSH session?
The OpenPubkey verifier also checks that the public key in the public key field in the SSH public key matches the user’s public key inside the PK Token. This works because the public key field in the SSH public key is the actual public key that secures the SSH session.
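The checks described in the three answers above can be summarized as a short decision procedure. The sketch below is a simplified model and not the OpenPubkey verifier: the `PKToken` and `SSHCertificate` types, their fields, and the `op_signature_ok` callback (standing in for real verification of the OP's signature) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple


@dataclass(frozen=True)
class PKToken:
    email: str          # identity claim from the OP
    public_key: bytes   # the user's ephemeral public key
    expires_at: float   # Unix seconds


@dataclass(frozen=True)
class SSHCertificate:
    public_key: bytes   # the key that actually secures the SSH session
    pk_token: PKToken   # carried in a certificate extension


def verify(cert: SSHCertificate, now: float, allowed_emails: Set[str],
           op_signature_ok: Callable[[PKToken], bool]) -> Tuple[bool, str]:
    """Apply the verifier's checks in order and report the first failure."""
    token = cert.pk_token
    if now >= token.expires_at:
        return False, "PK Token expired"
    if not op_signature_ok(token):  # placeholder for checking the OP's signature
        return False, "PK Token not signed by the OP"
    if token.public_key != cert.public_key:
        return False, "public key does not match the SSH public key"
    if token.email not in allowed_emails:
        return False, "user not authorized on this server"
    return True, "ok"


token = PKToken("alice@example.com", b"\x01\x23", expires_at=1_000.0)
cert = SSHCertificate(public_key=b"\x01\x23", pk_token=token)
print(verify(cert, now=500.0, allowed_emails={"alice@example.com"},
             op_signature_ok=lambda t: True))
```

The key-match check is what ties the identity in the PK Token to the session: a valid token presented alongside a different session key is rejected.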
We have open sourced OPKSSH under the Apache 2.0 license, and released it as openpubkey/opkssh on GitHub. While the OpenPubkey project has had code for using SSH with OpenPubkey since the early days of the project, this code was intended as a prototype and was missing many important features. With OPKSSH, SSH support in OpenPubkey is no longer a prototype and is now a complete feature. Cloudflare is not endorsing OPKSSH, but simply donating code to OPKSSH.
OPKSSH provides the following improvements to OpenPubkey:
There are a number of ways to get involved in OpenPubkey or OPKSSH. The project is organized through the OPKSSH GitHub. We are building an open and friendly community and welcome pull requests from anyone. If you are interested in contributing, see our contribution guide.
We run a community meeting every month which is open to everyone, and you can also find us over on the OpenSSF Slack in the #openpubkey channel.
"],"published_at":[0,"2025-03-25T13:00+00:00"],"updated_at":[0,"2025-03-26T14:28:34.974Z"],"feature_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/1SXiWOhmwfDs86m6u84i8l/7b6b0874f6f2f91964b87383349c7785/image2.png"],"tags":[1,[[0,{"id":[0,"3txfsA7N73yBL9g3VPBLL0"],"name":[0,"Open Source"],"slug":[0,"open-source"]}],[0,{"id":[0,"64Z8wlRoBi6qbWfgdpgCJl"],"name":[0,"SSH"],"slug":[0,"ssh"]}],[0,{"id":[0,"6qgGalxjft44m5oDkd3i1p"],"name":[0,"Single Sign On (SSO)"],"slug":[0,"sso"]}],[0,{"id":[0,"1QsJUMpv0QBSLiVZLLQJ3V"],"name":[0,"Cryptography"],"slug":[0,"cryptography"]}],[0,{"id":[0,"7FzaH9AEvtFLQN298eEwwU"],"name":[0,"Authentication"],"slug":[0,"authentication"]}],[0,{"id":[0,"1x7tpPmKIUCt19EDgM1Tsl"],"name":[0,"Research"],"slug":[0,"research"]}]]],"relatedTags":[0],"authors":[1,[[0,{"name":[0,"Ethan Heilman"],"slug":[0,"ethan-heilman"],"bio":[0],"profile_image":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/4O71DnT2dvNJsTmWv6PnQI/42a821c809de522be30aa15ea8477fc0/Ethan_Heilman.webp"],"location":[0],"website":[0],"twitter":[0],"facebook":[0]}]]],"meta_description":[0,"OPKSSH (OpenPubkey SSH) is now open-sourced as part of the OpenPubkey project. 
This enables users and organizations to configure SSH to work with single sign-on technologies like OpenID Connect, removing the need to manually manage & configure SSH keys without adding a trusted party other than your IdP."],"primary_author":[0,{}],"localeList":[0,{"name":[0,"blog-english-only"],"enUS":[0,"English for Locale"],"zhCN":[0,"No Page for Locale"],"zhHansCN":[0,"No Page for Locale"],"zhTW":[0,"No Page for Locale"],"frFR":[0,"No Page for Locale"],"deDE":[0,"No Page for Locale"],"itIT":[0,"No Page for Locale"],"jaJP":[0,"No Page for Locale"],"koKR":[0,"No Page for Locale"],"ptBR":[0,"No Page for Locale"],"esLA":[0,"No Page for Locale"],"esES":[0,"No Page for Locale"],"enAU":[0,"No Page for Locale"],"enCA":[0,"No Page for Locale"],"enIN":[0,"No Page for Locale"],"enGB":[0,"No Page for Locale"],"idID":[0,"No Page for Locale"],"ruRU":[0,"No Page for Locale"],"svSE":[0,"No Page for Locale"],"viVN":[0,"No Page for Locale"],"plPL":[0,"No Page for Locale"],"arAR":[0,"No Page for Locale"],"nlNL":[0,"No Page for Locale"],"thTH":[0,"No Page for Locale"],"trTR":[0,"No Page for Locale"],"heIL":[0,"No Page for Locale"],"lvLV":[0,"No Page for Locale"],"etEE":[0,"No Page for Locale"],"ltLT":[0,"No Page for Locale"]}],"url":[0,"https://blog.cloudflare.com/open-sourcing-openpubkey-ssh-opkssh-integrating-single-sign-on-with-ssh"],"metadata":[0,{"title":[0,"Open-sourcing OpenPubkey SSH (OPKSSH): integrating single sign-on with SSH"],"description":[0,"OPKSSH (OpenPubkey SSH) is now open-sourced as part of the OpenPubkey project. 
This enables users and organizations to configure SSH to work with single sign-on technologies like OpenID Connect, removing the need to manually manage & configure SSH keys without adding a trusted party other than your IdP."],"imgPreview":[0,"https://cf-assets.www.cloudflare.com/zkvhlag99gkb/3YFOkinUKPO3rl18jJKJnb/6fc2b6a7b4d51985155e34b6bd1f56e2/Open-sourcing_OpenPubkey_SSH__OPKSSH_-_integrating_single_sign-on_with_SSH-OG.png"]}]}],[0,{"id":[0,"6HAo0CAvmODAhYHnIF5Hbr"],"title":[0,"Open source all the way down: Upgrading our developer documentation"],"slug":[0,"open-source-all-the-way-down-upgrading-our-developer-documentation"],"excerpt":[0,"At Cloudflare, we treat developer content like an open source product. This collaborative approach enables global contributions to enhance quality and relevance for a wide range of users. This year,"],"featured":[0,false],"html":[0,"
At Cloudflare, we treat developer content like a product, where we take the user and their feedback into consideration. We are constantly iterating, testing, analyzing, and refining content. Inspired by agile practices, treating developer content like an open source product means we approach our documentation the same way an open source software project is created and maintained. Open source documentation empowers the developer community because it allows anyone, anywhere, to contribute content. By making both the content and the framework of the documentation site publicly accessible, we provide developers with the opportunity to not only improve the material itself but also understand and engage with the processes that govern how the documentation is built, approved, and maintained. This transparency fosters collaboration, learning, and innovation, enabling developers to contribute their expertise and learn from others in a shared, open environment. We also provide feedback to other open source products and plugins, giving back to the same community that supports us.
Building the best open source documentation experience
Great documentation empowers users to be successful with a new product as quickly as possible, showing them how to use the product and describing its benefits. Relevant, timely, and accurate content can save frustration, time, and money. Open source documentation adds a few more benefits, including building inclusive and supportive communities that help reduce the learning curve. We love being open source!
While the Cloudflare content team has scaled to deliver documentation alongside product launches, the open source documentation site itself was not scaling well. developers.cloudflare.com had outgrown the workflow for contributors, plus we were missing out on all the neat stuff created by developers in the community.
Just like a software product evaluation, we reviewed our business needs. We asked ourselves whether remaining open source was appropriate. Were there other tools we wanted to use? What benefits did we want to see in a year or in five years? Our biggest limitations, in addition to the contributor workflow challenges, seemed to be around scalability and high maintenance costs for user experience improvements.
After compiling our wishlist of new features to implement, we reaffirmed our commitment to open source. We valued the benefit of open source in both the content and the underlying framework of our documentation site. This commitment goes beyond technical considerations, because it's a fundamental aspect of our relationship with our community and our philosophy of transparency and collaboration. While the choice of an open source framework to build the site on might not be visible to many visitors, we recognized its significance for our community of developers and contributors. Our decision-making process was heavily influenced by two primary factors: first, whether the update would enhance the collaborative ecosystem, and second, how it would improve the overall documentation experience. This focus reflects that our open source principles, applied to both content and infrastructure, are essential for fostering innovation, ensuring quality through peer review, and building a more engaged and empowered user community.
Cloudflare developer documentation: A collaborative open source approach
Cloudflare’s developer documentation is open source on GitHub, with content supporting all of Cloudflare’s products. The underlying documentation engine has gone through a few iterations, with the first version of the site released in 2020. That first version provided dev-friendly features such as dark mode and proper code syntax.
In 2021, we introduced a new custom documentation engine, bringing significant improvements to the Cloudflare content experience. The benefits of the Gatsby to Hugo migration included:
Faster development flow: The development flow replicated production behavior, increasing iteration speed and confidence. Preview links via Cloudflare Pages were also introduced, so the content team and stakeholders could quickly review what content would look like in production.
Custom components: Introduced features like resources-by-selector which let us reference content throughout the repository and gave us the flexibility to expand checks and automations.
Structured changelog management: Implementation of structured YAML changelog entries which facilitated sharing with various platforms like RSS feeds, Developer Discord, and within the docs themselves.
Improved performance: Significant page load time improvements with the migration to HTML-first and almost instantaneous local builds.
These features were non-negotiable as part of our evaluation of whether to migrate. We knew that any update to the site had to maintain the functionality we’d established as core parts of the new experience.
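To illustrate why structured changelog entries are easy to share across platforms, here is a small sketch that renders entries as RSS items. It assumes the YAML has already been parsed into dictionaries; the field names (`title`, `date`, `description`), the function `changelog_to_rss`, and the example URL are hypothetical and not the schema the docs site uses.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def changelog_to_rss(feed_title: str, feed_link: str, entries: list) -> str:
    """Render structured changelog entries as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_title
    for entry in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(entry["date"])
        ET.SubElement(item, "description").text = entry["description"]
    return ET.tostring(rss, encoding="unicode")


# Entries as they might look after parsing a YAML changelog file.
entries = [
    {
        "title": "Example feature released",
        "date": datetime(2024, 1, 2, tzinfo=timezone.utc),
        "description": "A made-up entry used only for this sketch.",
    }
]
print(changelog_to_rss("Example changelog", "https://example.com/changelog", entries))
```

Because each entry is data rather than free-form prose, the same list could feed a chat notification or a docs page with a different renderer.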
2024 update: Say “hello, world!” to our new developer documentation, powered by Astro
After careful evaluation, we chose to migrate from Hugo to the Astro (and by extension, JavaScript) ecosystem. Astro fulfilled many items on our wishlist including:
Enhanced content organization: Improved tagging and better cross-referencing of related pages.
Extensibility: Support for user plugins like starlight-image-zoom for lightbox functionality.
Development experience: Type-checking at build time with astro check, along with syntax highlighting, Intellisense, diagnostic messages, and plugins for ESLint, Stylelint, and Prettier.
JavaScript/TypeScript support: Aligned the docs site framework with the preferred languages of many contributors, facilitating easier contribution.
CSS management: Introduction of Tailwind and scoped styles.
Starlight, Astro’s documentation theme, was a key factor in the decision. Its powerful component overrides and plugins system allowed us to leverage built-in components and base styling.
Content needed to be migrated quickly. With dozens of pull requests opened and merged each day, entering a code freeze for a week simply wasn’t feasible. This is where abstract syntax trees (ASTs) came into play: an AST captures only the structure of a Markdown document, not details like whitespace or indentation that would make a regular expression approach tricky.
With Hugo in 2021, we configured code block functionality like titles or line highlights with front matter inside the code block.
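The AST approach can be sketched as follows. This is not the migration script or the astray utility: the tree is a hand-written, mdast-style dictionary, the `title:` front matter key and the `title="..."` meta output are assumed formats, and `walk` and `lift_title` are names invented for this example.

```python
def walk(node: dict, visit) -> None:
    """Visit a node and then all of its descendants, depth first."""
    visit(node)
    for child in node.get("children", []):
        walk(child, visit)


def lift_title(node: dict) -> None:
    """Move a `title:` front matter line out of a code block into its meta."""
    if node.get("type") != "code":
        return
    lines = node["value"].split("\n")
    if lines[0] != "---" or "---" not in lines[1:]:
        return
    end = lines.index("---", 1)
    front = lines[1:end]
    if len(front) != 1 or not front[0].startswith("title: "):
        return  # leave anything this sketch does not understand untouched
    node["meta"] = 'title="{}"'.format(front[0][len("title: "):])
    node["value"] = "\n".join(lines[end + 1:])


tree = {
    "type": "root",
    "children": [
        # A paragraph that happens to contain "---" must not be rewritten.
        {"type": "paragraph", "children": [{"type": "text", "value": "---"}]},
        {"type": "code", "lang": "js", "meta": None,
         "value": "---\ntitle: Example\n---\nconsole.log(1);"},
    ],
}

walk(tree, lift_title)
print(tree["children"][1]["meta"])
print(tree["children"][1]["value"])
```

Because the transform only looks at nodes of type `code`, look-alike text elsewhere in the document is never touched, which is the main advantage over pattern matching on raw Markdown.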
When we migrated from Gatsby to Hugo in 2021, the pull request included 4,850 files and the migration took close to three weeks from planning to implementation. This time around, the migration was nearly twice as large, with 8,060 files changed. Our planning and migration took six weeks in total:
10 days: Evaluate platforms, vendors, and features
14 days: Migrate the components required by the documentation site
The migration resulted in a net removal of 19,624 lines of code from our maintenance burden.
While the number of files had grown substantially since our last major migration, our strategy was very similar to the 2021 migration. We used Markdown AST and astray, a utility to walk ASTs, created specifically for the previous migration!
A website migration like our move to Astro/Starlight is a complex process that requires time to plan, review, and coordinate, and our preparation paid off! Including our Cloudflare Community MVPs as part of the planning and review period proved incredibly helpful. They provided great guidance and feedback as we planned for the migration. We only needed one day of code freeze, and there were no rollbacks or major incidents. Visitors to the site never experienced downtime, and overall the migration was a major success.
During testing, we ran into several use cases that warranted using experimental Astro APIs. These APIs were always well documented, thanks to fantastic open source content from the Astro community. We were able to implement them quickly without impacting our release timeline.
We also ran into an edge case with build time performance due to the number of pages on our site (4000+). The Astro team was quick to triage the problem and begin investigation for a permanent fix. Their fast, helpful fixes made us truly grateful for the support from the Astro Discord server. A big thank you to the Astro/Starlight community!
Migrating developers.cloudflare.com to Astro/Starlight is just one example of the ways we prioritize world-class documentation and user experiences at Cloudflare. Our deep investment in documentation makes this a great place to work for technical writers, UX strategists, and many other content creators. Since adopting a content-like-a-product strategy in 2021, we have evolved to better serve the open source community by focusing on inclusivity and transparency, which ultimately leads to happier Cloudflare users.
We invite everyone to connect with us and explore these exciting new updates. Feel free to reach out if you’d like to speak with someone on the content team or share feedback about our documentation. You can share your thoughts or submit a pull request directly on the cloudflare-docs repository in GitHub.
Published 2025-01-08. Authors: Kim Jeske, Kian Newman-Hazel, Kody Jackson. Tags: Technical Writing, Open Source, Developer Documentation, Developers, Developer Platform.
Is this thing on? Using OpenBMC and ACPI power states for reliable server boot
Cloudflare’s global fleet benefits from being managed by open source firmware for the Baseboard Management Controller (BMC), OpenBMC. This has come with various challenges, some of which we discuss here with an explanation of how the open source nature of the firmware for the BMC enabled us to fix the issues and maintain a more stable fleet.
At Cloudflare, we provide a range of services through our global network of servers, located in 330 cities worldwide. When you interact with our long-standing application services, or newer services like Workers AI, you’re in contact with one of our fleet of thousands of servers which support those services.
These servers which provide Cloudflare services are managed by a Baseboard Management Controller (BMC). The BMC is a special purpose processor — different from the Central Processing Unit (CPU) of a server — whose sole purpose is ensuring a smooth operation of the server.
Regardless of the server vendor, each server has this BMC. The BMC runs independently of the CPU and has its own embedded operating system, usually referred to as firmware. At Cloudflare, we customize and deploy a server-specific version of the BMC firmware. The BMC firmware we deploy at Cloudflare is based on the Linux Foundation Project for BMCs, OpenBMC. OpenBMC is an open-sourced firmware stack designed to work across a variety of systems including enterprise, telco, and cloud-scale data centers. The open-source nature of OpenBMC gives us greater flexibility and ownership of this critical server subsystem, instead of the closed nature of proprietary firmware. This gives us transparency (which is important to us as a security company) and allows us faster time to develop custom features/fixes for the BMC firmware that we run on our entire fleet.
In this blog post, we are going to describe how we customized and extended the OpenBMC firmware to better monitor our servers’ boot-up processes to start more reliably and allow better diagnostics in the event that an issue happens during server boot-up.
Server systems consist of multiple complex subsystems that include the processors, memory, storage, networking, power supply, cooling, etc. When booting up the host of a server system, the power state of each subsystem of the server is changed in an asynchronous manner. This is done so that subsystems can initialize simultaneously, thereby improving the efficiency of the boot process. Though started asynchronously, these subsystems may interact with each other at different points of the boot sequence and rely on handshake/synchronization to exchange information. For example, during boot-up, the UEFI (Unified Extensible Firmware Interface), often referred to as the BIOS, configures the motherboard in a phase known as the Platform Initialization (PI) phase, during which the UEFI collects information from subsystems such as the CPUs, memory, etc. to initialize the motherboard with the right settings.
Figure 1: Server Boot Process
When the power state of the subsystems, handshakes, and synchronization are not properly managed, there may be race conditions that would result in failures during the boot process of the host. Cloudflare experienced some of these boot-related failures while rolling out open source firmware (OpenBMC) to the Baseboard Management Controllers (BMCs) of our servers.
Baseboard Management Controller (BMC) as a manager of the host
A BMC is a specialized microprocessor that is attached to the board of a host (server) to assist with remote management capabilities of the host. Servers usually sit in data centers and are often far away from the administrators, and this creates a challenge to maintain them at scale. This is where a BMC comes in, as the BMC serves as the interface that gives administrators the ability to securely and remotely access the servers and carry out management functions. The BMC does this by exposing various interfaces, including Intelligent Platform Management Interface (IPMI) and Redfish, for distributed management. In addition, the BMC receives data from various sensors/devices (e.g. temperature, power supply) connected to the server, and also the operating parameters of the server, such as the operating system state, and publishes the values on its IPMI and Redfish interfaces.
Figure 2: Block diagram of BMC in a server system.
At Cloudflare, we use the OpenBMC project for our Baseboard Management Controller (BMC).
Below are examples of management functions carried out on a server through the BMC. The interactions in the examples are done over ipmitool, a command line utility for interacting with systems that support IPMI.
    # Check the sensor readings of a server remotely (i.e. over a network)
    $ ipmitool <some authentication> <bmc ip> sdr
    PSU0_CURRENT_IN | 0.47 Amps | ok
    PSU0_CURRENT_OUT | 6 Amps | ok
    PSU0_FAN_0 | 6962 RPM | ok
    SYS_FAN | 13034 RPM | ok
    SYS_FAN1 | 11172 RPM | ok
    SYS_FAN2 | 11760 RPM | ok
    CPU_CORE_VR_POUT | 9.03 Watts | ok
    CPU_POWER | 76.95 Watts | ok
    CPU_SOC_VR_POUT | 12.98 Watts | ok
    DIMM_1_VR_POUT | 29.03 Watts | ok
    DIMM_2_VR_POUT | 27.97 Watts | ok
    CPU_CORE_MOSFET | 40 degrees C | ok
    CPU_TEMP | 50 degrees C | ok
    DIMM_MOSFET_1 | 36 degrees C | ok
    DIMM_MOSFET_2 | 39 degrees C | ok
    DIMM_TEMP_A1 | 34 degrees C | ok
    DIMM_TEMP_B1 | 33 degrees C | ok

    …

    # Check the power status of a server remotely (i.e. over a network)
    $ ipmitool <some authentication> <bmc ip> power status
    Chassis Power is off

    # Power on the server
    $ ipmitool <some authentication> <bmc ip> power on
    Chassis Power Control: On
Switching to OpenBMC firmware for our BMCs gives us more control over the software that powers our infrastructure. This has given us more flexibility, customizations, and an overall better uniform experience for managing our servers. Since OpenBMC is open source, we also leverage community fixes while upstreaming some of our own. Some of the advantages we have experienced with OpenBMC include a faster turnaround time to fixing issues, optimizations around thermal cooling, increased power efficiency and supporting AI inference.
While developing Cloudflare’s OpenBMC firmware, however, we ran into a number of boot problems.
Host not booting: When we send a request over IPMI for a host to power on (as in the example above, power on the server), ipmitool would indicate the power status of the host as ON, but we would not see any power going into the CPU nor any activity on the CPU. While ipmitool was correct about the power going into the chassis as ON, we had no information about the power state of the server from ipmitool, and we initially falsely assumed that since the chassis power was on, the rest of the server components should be ON. The System Event Log (SEL), which is responsible for displaying platform-specific events, was not giving us any useful information beyond indicating that the server was in a soft-off state (powered off), working state (operating system is loading and running), or that a “System Restart” of the host was initiated.
    # System Event Logs (SEL) showing the various power states of the server
    $ ipmitool sel elist | tail -n3
     4d | Pre-Init |0000011021| System ACPI Power State ACPI_STATUS | S5_G2: soft-off | Asserted
     4e | Pre-Init |0000011022| System ACPI Power State ACPI_STATUS | S0_G0: working | Asserted
     4f | Pre-Init |0000011023| System Boot Initiated RESTART_CAUSE | System Restart | Asserted
In the System Event Logs shown above, ACPI is the acronym for Advanced Configuration and Power Interface, a standard for power management on computing systems. In the ACPI soft-off state, the host is powered off (the motherboard is on standby power but CPU/host isn’t powered on); according to the ACPI specifications, this state is called S5_G2. (These states are discussed in more detail below.) In the ACPI working state, the host is booted and in a working state, also known in the ACPI specifications as status S0_G0 (which in our case happened to be false), and the third row indicates the cause of the restart was due to a System Restart. Most of the boot-related SEL events are sent from the UEFI to the BMC. The UEFI has been something of a black box to us, as we rely on our original equipment manufacturers (OEMs) to develop the UEFI firmware for us, and for the generation of servers with this issue, the UEFI firmware did not implement sending the boot progress of the host to the BMC.
One discrepancy we observed was the difference in the power status and the power going into the CPU, which we read with a sensor we call CPU_POWER.
    # Check power status
    $ ipmitool <some authentication> <bmc ip> power status
    Chassis Power is on
However, checking the power into the CPU shows that the CPU was not receiving any power.
    # Check power going into the CPU
    $ ipmitool <some authentication> <bmc ip> sdr | grep CPU_POWER
    CPU_POWER | 0 Watts | ok
The CPU_POWER being at 0 watts contradicts all the previous information that the host was powered up and working, when the host was actually completely shut down.
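The mismatch described above can be caught mechanically by cross-checking the two readings. The following is a minimal sketch, not Cloudflare's tooling: it only parses text in the format of the ipmitool output shown above, and the function names (`parse_sdr`, `chassis_power_is_on`, `host_power_mismatch`) and the 1-watt threshold are assumptions made for illustration.

```python
def parse_sdr(sdr_text):
    """Parse `ipmitool sdr` style lines into {sensor_name: (value, unit, status)}."""
    readings = {}
    for line in sdr_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # skip blank lines and anything that is not "name | reading | status"
        name, reading, status = parts
        value_str, _, unit = reading.partition(" ")
        try:
            value = float(value_str)
        except ValueError:
            continue  # non-numeric readings (for example a sensor with no reading)
        readings[name] = (value, unit, status)
    return readings


def chassis_power_is_on(power_status_text):
    """Interpret the output of `ipmitool power status`."""
    return power_status_text.strip().lower().endswith("is on")


def host_power_mismatch(power_status_text, sdr_text, min_cpu_watts=1.0):
    """Return True when the chassis reports ON but the CPU draws (almost) no power.

    min_cpu_watts is a hypothetical threshold chosen for this sketch.
    """
    cpu = parse_sdr(sdr_text).get("CPU_POWER")
    if cpu is None:
        return False  # no CPU_POWER sensor, nothing to compare against
    return chassis_power_is_on(power_status_text) and cpu[0] < min_cpu_watts


if __name__ == "__main__":
    status = "Chassis Power is on\n"
    sdr = "CPU_POWER | 0 Watts | ok\nCPU_TEMP | 50 degrees C | ok\n"
    print(host_power_mismatch(status, sdr))
```

A check like this only detects the symptom; it says nothing about why the CPU is unpowered, which is the gap the ACPI state tracking described later is meant to close.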
Missing Memory Modules: Our servers would randomly boot up with less memory than expected. Computers can boot up with less memory than installed due to a number of problems, such as a loose connection, hardware problem, or faulty memory. In our case, the cause was none of the usual suspects: the BMC and the UEFI were trying to read from the memory modules at the same time, leading to access contention. Memory modules usually contain a Serial Presence Detect (SPD) device, which the UEFI uses to dynamically detect the memory module. The SPD is usually accessed over an Inter-Integrated Circuit (I2C) bus, a low-speed, two-wire protocol that devices use to talk to each other. The BMC also reads the temperature of the memory modules over I2C. When the server is powered on, the UEFI, amongst other hardware initializations, initializes the memory modules that it can detect via each module’s SPD. The BMC could be trying to read the temperature of a memory module at the same time, over the same I2C bus. This simultaneous attempted read denies one of the parties access. When the UEFI is denied access to the SPD, it concludes that the memory module is not available and skips over it. Below is an example of the related I2C bus contention logs we saw in the journal of the BMC while the host was booting.
    kernel: aspeed-i2c-bus 1e78a300.i2c-bus: irq handled != irq. expected 0x00000021, but was 0x00000020
The above log indicates that the I2C bus at address 1e78a300 (which happens to be connected to the Serial Presence Detect of the memory modules) could not properly handle a signal, known as an interrupt request (IRQ). When this scenario plays out on the UEFI side, the UEFI is unable to detect the memory module.
Figure 3: I2C diagram showing I2C interconnection of the server’s memory modules (also known as DIMMs) with the BMC
Thermal telemetry: During the boot-up process of some of our servers, some temperature devices, such as the temperature sensors of the memory modules, would show up as failed, thereby causing some of the fans to enter a fail-safe Pulse Width Modulation (PWM) mode. PWM is a technique for encoding information delivered to electronic devices by adjusting the duty cycle of the signal, that is, the fraction of each period for which the signal is high. It is used in this case to control fan speed by adjusting the duty cycle of the control signal delivered to the fan. When a fan enters fail-safe mode, PWM is used to set the fan speed to a preset value, irrespective of what the optimized PWM setting should be, and this could negatively affect the cooling and power consumption of the server.
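As a worked illustration of PWM fan control: the quantity that is varied is the duty cycle, the fraction of each period for which the signal is high. The sketch below is illustrative only. The function names, the 40 µs period (a 25 kHz signal, a common choice for 4-wire fans) and the 100% fail-safe preset are assumptions, not values taken from our firmware.

```python
def duty_cycle_percent(high_time_us, period_us):
    """Duty cycle = time the signal is high / total period, as a percentage."""
    if period_us <= 0 or not 0 <= high_time_us <= period_us:
        raise ValueError("need period_us > 0 and 0 <= high_time_us <= period_us")
    return 100.0 * high_time_us / period_us


def select_fan_pwm(sensors_ok, optimized_pct, failsafe_pct=100.0):
    """Pick the PWM duty cycle for a fan zone.

    If any sensor feeding the zone has failed, the optimized value is ignored
    and the preset fail-safe value is used instead (failsafe_pct is hypothetical).
    """
    return optimized_pct if all(sensors_ok) else failsafe_pct


if __name__ == "__main__":
    # A 40 us period with the signal high for 10 us is a 25% duty cycle.
    print(duty_cycle_percent(10, 40))
    # One failed DIMM temperature sensor forces the zone to the fail-safe preset.
    print(select_fan_pwm([True, False], optimized_pct=35.0))
```

The second call shows why a sensor that merely looks failed during boot is costly: the zone jumps to the preset regardless of how cool the server actually is.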
In the process of studying the issues we faced relating to the boot-up process of the host, we learned how the power state of the subsystems within the chassis changes. Part of our learnings led us to investigate the Advanced Configuration and Power Interface (ACPI) and how the ACPI state of the host changed during the boot process.
Advanced Configuration and Power Interface (ACPI) is an open industry specification for power management used in desktop, mobile, workstation, and server systems. The ACPI Specification replaces previous power management methodologies such as Advanced Power Management (APM). ACPI provides the advantages of:
Allowing OS-directed power management (OSPM).
Having a standardized and robust interface for power management.
Sending system-level events, such as when the server power/sleep buttons are pressed.
Hardware and software support, such as a real-time clock (RTC) to schedule the server to wake up from sleep or to reduce the functionality of the CPU based on RTC ticks when there is a loss of power.
From the perspective of power management, ACPI enables an OS-driven conservation of energy by transitioning components which are not in active use to a lower power state, thereby reducing power consumption and contributing to more efficient power management.
The ACPI Specification defines four global “Gx” states, six sleeping “Sx” states, and four “Dx” device power states. These states are defined as follows:
| Gx | Name | Sx | Description |
|----|------|----|-------------|
| G0 | Working | S0 | The run state. In this state the machine is fully running. |
| G1 | Sleeping | S1 | A sleep state where the CPU will suspend activity but retain its contexts. |
| | | S2 | A sleep state where memory contexts are held, but CPU contexts are lost. CPU re-initialization is done by firmware. |
| | | S3 | A logically deeper sleep state than S2 where CPU re-initialization is done by device. Equates to Suspend to RAM. |
| | | S4 | A logically deeper sleep state than S3 in which DRAM context is not maintained and contexts are saved to disk. Can be implemented by either OS or firmware. |
| G2 | Soft off but PSU still supplies power | S5 | The soft off state. All activity will stop, and all contexts are lost. The Complex Programmable Logic Device (CPLD) responsible for power-up and power-down sequences of various components e.g. CPU, BMC is on standby power, but the CPU/host is off. |
| G3 | Mechanical off | | PSU does not supply power. The system is safe for disassembly. |

| Dx | Name | Description |
|----|------|-------------|
| D0 | Fully powered on | Hardware device is fully functional and operational. |
| D1 | Hardware device is partially powered down | Reduced functionality and can be quickly powered back to D0. |
| D2 | Hardware device is in a deeper lower power than D1 | Much more limited functionality and can only be slowly powered back to D0. |
| D3 | Hardware device is significantly powered down or off | Device is inactive with perhaps only the ability to be powered back on. |
The states that matter to us are:
S0_G0_D0: often referred to as the working state. Here we know our host system is running just fine.
S2_D2: Memory contexts are held, but CPU context is lost. We usually use this state to know when the host’s UEFI is performing platform firmware initialization.
S5_G2: Often referred to as the soft off state. Here we still have power going into the chassis, however, processor and DRAM context are not maintained, and the operating system power management of the host has no context.
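To make the three states concrete, here is a small sketch that names them in code and extracts the state tokens from System Event Log text in the format shown earlier. The enum, its description strings, and the function name `acpi_states_from_sel` are assumptions for illustration; the regular expression is written against the sample SEL lines in this post and may not cover other SEL formats.

```python
import re
from enum import Enum


class HostAcpiState(Enum):
    """The host states this post cares about, with short descriptions."""
    S5_G2 = "soft off: chassis has power, processor and DRAM context not maintained"
    S2_D2 = "memory context held, CPU context lost: UEFI platform initialization"
    S0_G0_D0 = "working: host is running"


# Matches the state token in SEL lines such as
# "... System ACPI Power State ACPI_STATUS | S5_G2: soft-off | Asserted"
_SEL_STATE = re.compile(r"\|\s*(S\d(?:_[GD]\d)+):")


def acpi_states_from_sel(sel_text):
    """Return the ACPI state tokens found in SEL text, in log order."""
    states = []
    for line in sel_text.splitlines():
        if "System ACPI Power State" not in line:
            continue  # ignore unrelated events such as "System Boot Initiated"
        match = _SEL_STATE.search(line)
        if match:
            states.append(match.group(1))
    return states
```

Note that the SEL reports the working state as `S0_G0`, while this post refers to it as S0_G0_D0 once the device state is included; the sketch returns the token exactly as logged rather than guessing a mapping.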
Since the issues we were experiencing were related to the power state changes of the host — when we asked the host to reboot or power on — we needed a way to track the various power state changes of the host as it went from power off to a complete working state. This would give us better management capabilities over the devices that were on the same power domain of the host during the boot process. Fortunately, the OpenBMC community already implemented an ACPI daemon, which we extended to serve our needs. We added an ACPI S2_D2 power state, in which memory contexts are held, but CPU context is lost, to the ACPI daemon running on the BMC to enable us to know when the host’s UEFI is performing firmware initialization, and also set up various management tasks for the different ACPI power states.
An example of a power management task we carry out using the S0_G0_D0 state is re-exporting our Voltage Regulator (VR) sensors when the host enters that state; this is done with a service file.
With this in place, a handler in phosphor-host-ipmid (ipmiSetACPIState) is responsible for setting the host’s ACPI state on the BMC. The host invokes it using the standard IPMI command with NetFn=0x06 and Cmd=0x06.
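For readers unfamiliar with this IPMI command, the sketch below builds the request bytes for NetFn=0x06, Cmd=0x06 (Set ACPI Power State). The byte layout reflects our reading of the IPMI v2.0 specification: one byte for the system power state and one for the device power state, with bit 7 meaning "apply this state" and 0x7F meaning "no change". Treat the enumeration values as assumptions to verify against the specification. Representing the S2_D2 state as system S2 plus device D2 is also an assumption, and the helper names are hypothetical.

```python
NETFN_APP = 0x06                 # IPMI "App" network function
CMD_SET_ACPI_POWER_STATE = 0x06

# Subset of the state enumerations, as we read them from the IPMI v2.0 specification.
SYSTEM_STATES = {"S0_G0": 0x00, "S1": 0x01, "S2": 0x02, "S3": 0x03, "S4": 0x04, "S5_G2": 0x05}
DEVICE_STATES = {"D0": 0x00, "D1": 0x01, "D2": 0x02, "D3": 0x03}
NO_CHANGE = 0x7F
SET_BIT = 0x80                   # bit 7: 1 = apply the state held in bits 6:0


def set_acpi_request(system=None, device=None):
    """Return [netfn, cmd, system_byte, device_byte] for a Set ACPI Power State request."""
    sys_byte = (SET_BIT | SYSTEM_STATES[system]) if system else NO_CHANGE
    dev_byte = (SET_BIT | DEVICE_STATES[device]) if device else NO_CHANGE
    return [NETFN_APP, CMD_SET_ACPI_POWER_STATE, sys_byte, dev_byte]


def to_ipmitool_raw(request):
    """Render the request as an `ipmitool raw` command line."""
    return "ipmitool raw " + " ".join(f"0x{b:02x}" for b in request)


if __name__ == "__main__":
    print(to_ipmitool_raw(set_acpi_request("S2", "D2")))
    print(to_ipmitool_raw(set_acpi_request("S0_G0", "D0")))
```

In production the host firmware sends this command in-band; the `ipmitool raw` rendering is only a convenient way to exercise the BMC-side handler by hand.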
In the event of an immediate power cycle (i.e. the host reboots without an operating system shutdown), the host is unable to send its S5_G2 state to the BMC. For this case, we created a patch to OpenBMC’s x86-power-control to let the BMC become aware that the host has entered the ACPI S5_G2 state (i.e. soft-off). When the host comes out of the power-off state, the UEFI performs the Power On Self Test (POST) and sends the S2_D2 state to the BMC, and after the UEFI has loaded the OS on the host, it notifies the BMC by sending the ACPI S0_G0_D0 state.
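The behaviour described above can be modelled as a small tracker on the BMC side. This is a conceptual sketch in Python, not the actual x86-power-control patch (which is part of OpenBMC's C++ code base). The class and method names are hypothetical, and "power good lost" stands in for whatever signal the BMC uses to notice that host power dropped.

```python
class AcpiStateTracker:
    """Tracks the host ACPI state as seen by the BMC."""

    def __init__(self):
        self.state = "S5_G2"          # assume the host starts powered off
        self.history = ["S5_G2"]

    def _record(self, state):
        if state != self.state:       # ignore repeated reports of the same state
            self.state = state
            self.history.append(state)

    def on_host_report(self, state):
        """The host (UEFI or OS) reported a state over IPMI."""
        self._record(state)

    def on_power_good_lost(self):
        """Host power dropped without a report, e.g. an immediate power cycle.

        The host cannot send S5_G2 itself, so the BMC records it on its behalf.
        """
        self._record("S5_G2")


if __name__ == "__main__":
    tracker = AcpiStateTracker()
    tracker.on_host_report("S2_D2")      # UEFI running POST
    tracker.on_host_report("S0_G0_D0")   # OS loaded
    tracker.on_power_good_lost()         # immediate power cycle
    tracker.on_host_report("S2_D2")
    print(tracker.history)
```

Without the `on_power_good_lost` path, the history would jump from S0_G0_D0 straight to S2_D2 and any task keyed on entering S5_G2 would never run.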
Going back to the boot-up issues we faced, we discovered that they were mostly caused by devices which were in the same power domain of the CPU, interfering with the UEFI/platform firmware initialization phase. Below is a high level description of the fixes we applied.
Servers not booting: After identifying the devices that were interfering with the POST stage of the firmware initialization, we used the host ACPI state to control when we set the appropriate power mode state for those devices so as not to cause POST to fail.
Memory modules missing: During the boot-up process, memory modules (DIMMs) are powered and initialized in the S2_D2 ACPI state. During this initialization process, UEFI firmware sends read commands to the Serial Presence Detect (SPD) on the DIMM to retrieve information for DIMM enumeration. At the same time, the BMC could be sending commands to read DIMM temperature sensors. This can cause SMBus collisions, which could cause either the DIMM temperature reading or the UEFI DIMM enumeration to fail. The latter case would cause the system to boot up with reduced DIMM capacity, which could be mistaken for a failing DIMM. After we discovered the race condition, we stopped the BMC from reading the DIMM temperature sensors during the S2_D2 ACPI state and set a fixed speed for the corresponding fans. This solution allows our UEFI to retrieve all the necessary DIMM subsystem information for enumeration, and our servers now boot up with the correct amount of memory.
Thermal telemetry: In S0_G0 power state, when sensors are not reporting values back to the BMC, the BMC assumes that devices may be overheating and puts the fan controller into fail-safe mode where fan speeds are ramped up to maximum speed. However, in S5_G2 state, some thermal sensors like CPU temperature, NIC temperature, etc. are not powered and not available. Our solution is to set these thermal sensors as non-functional in their exported configuration when in S5_G2 state and during the transition from S5_G2 state to S2_D2 state. Setting the affected devices as non-functional in their configuration, instead of waiting for thermal sensor read commands to error out, prevents the controller from entering the fail-safe mode.
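The memory and thermal fixes above both amount to a per-state policy: what the BMC may poll, which sensors it should expect to hear from, and how the fans are driven. A minimal sketch of such a policy table follows. The names (`ThermalPolicy`, `POLICY`, `should_enter_failsafe`), the sensor names in the usage example, and the choices made for S5_G2 beyond what the text states (no DIMM polling, fixed fan speed) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThermalPolicy:
    poll_dimm_temps: bool          # may the BMC read DIMM temperature sensors over I2C?
    host_sensors_functional: bool  # are host-powered sensors (CPU, NIC, ...) expected to report?
    fan_mode: str                  # "fixed" or "closed-loop"


POLICY = {
    "S5_G2": ThermalPolicy(False, False, "fixed"),       # host off: host-powered sensors are unpowered
    "S2_D2": ThermalPolicy(False, False, "fixed"),       # UEFI owns the SPD bus during enumeration
    "S0_G0_D0": ThermalPolicy(True, True, "closed-loop"),
}


def should_enter_failsafe(state, missing_sensors, host_powered_sensors):
    """Decide whether missing readings justify fail-safe fan mode in this ACPI state.

    A missing sensor is ignored only when it is host-powered and the current
    state marks host-powered sensors as non-functional.
    """
    policy = POLICY[state]
    for name in missing_sensors:
        if name in host_powered_sensors and not policy.host_sensors_functional:
            continue  # expected to be silent in this state
        return True
    return False


if __name__ == "__main__":
    host_powered = {"CPU_TEMP", "NIC_TEMP"}
    print(should_enter_failsafe("S5_G2", ["CPU_TEMP"], host_powered))
    print(should_enter_failsafe("S0_G0_D0", ["CPU_TEMP"], host_powered))
```

Keeping the policy in one table makes the intent auditable: a sensor going quiet is an alarm in the working state and expected behaviour while the host is off or still initializing.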
Aside from resolving issues, we have seen other benefits from implementing ACPI power states in our BMC firmware. An example is in the area of our automated firmware regression testing. Various parts of our tests require rebooting/power cycling the servers over a hundred times, during which we monitor the ACPI power state changes of our servers rather than using a boolean (running or not running, pingable or not pingable) to assert the status of our servers.
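As an illustration of why a state sequence is a stronger test signal than a boolean, the sketch below counts how many power cycles in an observed state log reached the working state through the expected sequence. The function name and the exact expected sequence are assumptions based on the states discussed in this post, not our actual test harness.

```python
EXPECTED_BOOT = ("S5_G2", "S2_D2", "S0_G0_D0")


def count_complete_boots(observed):
    """Count how many times the expected boot sequence appears, in order, in `observed`.

    Seeing S5_G2 again before the sequence completes restarts the match,
    so an interrupted boot is not counted.
    """
    complete = 0
    step = 0
    for state in observed:
        if state == EXPECTED_BOOT[step]:
            step += 1
            if step == len(EXPECTED_BOOT):
                complete += 1
                step = 0
        elif state == EXPECTED_BOOT[0]:
            step = 1  # a new power cycle started before the previous boot finished
    return complete


if __name__ == "__main__":
    log = ["S5_G2", "S2_D2", "S0_G0_D0",   # good boot
           "S5_G2", "S2_D2",               # stalled during firmware initialization
           "S5_G2", "S2_D2", "S0_G0_D0"]   # good boot
    print(count_complete_boots(log), "complete boots out of", log.count("S5_G2"), "power cycles")
```

A ping-based check would report the stalled cycle only as "not running"; the state log additionally shows that it stopped during firmware initialization.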
Also, it has given us the opportunity to learn more about the complex subsystems in a server system, and the various power modes of the different subsystems. This is an aspect that we are still actively learning about as we look to further optimize various aspects of the boot sequence of our servers.
Over time, implementing ACPI states is helping us ensure the following:
All components are enabled by the end of the boot sequence.
The BIOS and BMC are able to retrieve component information.
The BMC is aware when thermal sensors are in a non-functional state.
For better observability of the boot progress and “last state” of our systems, we have also started the process of adding the BootProgress object of the Redfish ComputerSystem Schema into our systems. This will give us an opportunity for pre-operating system (OS) boot observability and an easier debug starting point when the UEFI has issues (such as when the server isn’t coming on) during the server platform initialization.
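To show what this looks like from a client's point of view, the sketch below reads the `BootProgress` object from a Redfish ComputerSystem resource and reports whether the host got as far as the operating system. `LastState` and `LastStateTime` are property names from the DMTF ComputerSystem schema as we understand it; the sample payload, the resource path in it, and the helper names are hypothetical.

```python
import json

# LastState values that mean the firmware handed off to (or is running) the OS.
OS_REACHED = {"OSBootStarted", "OSRunning"}


def last_boot_state(computer_system_json):
    """Return (LastState, LastStateTime) from a ComputerSystem resource, or (None, None)."""
    system = json.loads(computer_system_json)
    progress = system.get("BootProgress") or {}
    return progress.get("LastState"), progress.get("LastStateTime")


def stuck_before_os(computer_system_json):
    """True when a boot state is reported and it is earlier than the OS hand-off."""
    state, _ = last_boot_state(computer_system_json)
    return state is not None and state not in OS_REACHED


if __name__ == "__main__":
    # Hypothetical response body; a real one would come from the BMC's Redfish service.
    sample = json.dumps({
        "@odata.id": "/redfish/v1/Systems/system",
        "PowerState": "On",
        "BootProgress": {
            "LastState": "MemoryInitializationStarted",
            "LastStateTime": "2024-10-22T14:00:00+00:00",
        },
    })
    print(last_boot_state(sample))
    print(stuck_before_os(sample))
```

Compared with the chassis power status, a `LastState` such as `MemoryInitializationStarted` points directly at the phase of platform initialization to start debugging from.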
With each passing day, Cloudflare’s OpenBMC team, which is made up of folks from different embedded backgrounds, learns about, experiments with, and deploys OpenBMC across our global fleet. This has been made possible by relying on the OpenBMC community’s contribution (as well as upstreaming some of our own contributions), and our interaction with our various vendors, thereby giving us the opportunity to make our systems more reliable, and giving us the ownership and responsibility of the firmware that powers the BMCs that manage our servers. If you are thinking of embracing open-source firmware in your BMC, we hope this blog post written by a team which started deploying OpenBMC less than 18 months ago has inspired you to give it a try.
For those who are interested in considering making the jump to open-source firmware, check it out here!
Published 2024-10-22. Authors: Nnamdi Ajah, Ryan Chow, Giovanni Pereira Zantedeschi. Tags: Infrastructure, Open Source, OpenBMC, Servers, Firmware.
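Cloudflare believes in the power of open source. It is more than just code: it embodies the spirit of collaboration, innovation, and shared knowledge that drives the Internet forward. Open source is the foundation on which the Internet thrives, allowing developers and creators from around the world to contribute to a greater whole.
But open source maintainers often struggle with the cost of hosting their projects and serving them to users around the world. Cloudflare has had the privilege of supporting incredible open source projects such as Git and the Linux Foundation through our open source program, and we have learned first-hand where we can be most helpful.
Today, we are introducing Project Alexandria, a streamlined and expanded open source program. The ancient city of Alexandria was famous for its rich library and for its lighthouse, one of the Seven Wonders of the Ancient World. The Lighthouse of Alexandria served as a beacon of culture and community, welcoming people from afar. We think Alexandria is a fitting metaphor: open source projects are a beacon for developers around the world and a source of knowledge that is key to building a better Internet.
The program provides recurring annual credits to more open source projects so that they can use Cloudflare products for free. In the past, we offered an upgrade to the Pro plan. Now we offer upgrades tailored to the size and needs of each project, with access to more products such as Workers and Pages. The goal of Project Alexandria is to make sure every open source software project can not only survive but thrive, with free access to Cloudflare's enhanced security, performance optimization, and developer tools.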
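Building a program that fits your needs
We know that open source projects have different needs. Some projects, such as package repositories, may prioritize storage and transfer costs. Others need help defending against DDoS attacks. Still others may need a robust developer platform for quickly building and deploying scalable, secure applications.
With the new program, we will work with your project to provide a range of benefits that fit its needs:
Increased request limits on Cloudflare Workers and Pages, so you can handle more traffic and scale your application globally.
Expanded R2 storage for builds and artifacts, so you have the space you need to store and access your project's assets efficiently.
Enhanced Zero Trust access, including Remote Browser Isolation, unlimited users, and longer activity log retention, giving you more insight into and control over your project's security.
Every open source project in the program also gets additional resources and support through a dedicated channel on the Cloudflare Discord server. And if there is something you think we could offer that we do not support today, we are happy to look for a way to make it happen.
Many open source projects run within the limits of Cloudflare's generous free tier. Our mission to help build a better Internet means that cost should not be a barrier to building, securing, and distributing open source packages worldwide, whatever the size of the project. Indie and small open source projects can continue to run for free without credits. Larger projects can use recurring annual credits, so they can keep reinvesting funds in innovation instead of spending them on infrastructure to store, secure, and deliver packages and websites.
We are committed to supporting projects that are not only innovative but also important to the continued growth and health of the Internet. The criteria for the program remain the same.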
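A notable example of Cloudflare use within the OpenJS Foundation is the Node.js CDN Worker. It is under active development by the Node.js Web Infrastructure and Build teams, and it aims to serve all of the Node.js release assets (binaries, documentation, and so on) offered on their website.
Aaron Snell explained that these release assets are currently served by a single static origin file server fronted by Cloudflare. This setup worked well until a few years ago, when problems began to appear with new releases. Each new release triggered a cache purge, which meant that every request for a release asset was a cache miss, so Cloudflare went straight to the static file server and overloaded it. Because Node.js publishes nightly builds, the problem occurs every day.
The CDN Worker is planned to fix this by using Cloudflare Workers and R2 to serve requests for release assets. That takes the load off the static file server entirely, improves the availability of Node.js downloads and documentation, and ultimately makes the process more sustainable in the long run.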
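As a rough illustration of that plan, the sketch below shows the core lookup a Worker of this kind might perform: map the request path to an object key and answer from object storage, so that a cache miss never reaches the origin file server. It is not the Node.js team's code. `ObjectStore` is a hypothetical in-memory stand-in for an R2 bucket, and the `/dist/` prefix, the key layout, and the sample file are assumptions made for the example.

```typescript
// Hypothetical stand-in for an R2 bucket: maps object keys to file contents.
// A real Worker would read from an R2 bucket binding instead.
type ObjectStore = Map<string, string>;

interface AssetResponse {
  status: number;
  body: string;
}

// Assumed URL layout: release assets live under /dist/ and the object key
// is the request path without its leading slash.
function pathToKey(path: string): string | null {
  if (!path.startsWith("/dist/") || path.includes("..")) {
    return null;
  }
  return path.slice(1);
}

// Answer a request from object storage only; no origin server is contacted.
function serveAsset(path: string, store: ObjectStore): AssetResponse {
  const key = pathToKey(path);
  if (key === null) {
    return { status: 400, body: "Bad request" };
  }
  const body = store.get(key);
  if (body === undefined) {
    return { status: 404, body: "Not found" };
  }
  return { status: 200, body };
}

// Example data, invented for this sketch.
const releaseStore: ObjectStore = new Map([
  ["dist/example/SHASUMS256.txt", "example checksums"],
]);
```

Because every response comes from the store, a cache purge after a new release only changes where the first request for an asset is answered from, not whether a single file server has to absorb it.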
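OpenTofu
OpenTofu has focused on building a free, open alternative to proprietary infrastructure-as-code platforms. One of its main challenges was keeping the registry reliable and scalable while keeping costs low. Cloudflare's R2 storage and caching services were a perfect fit, letting OpenTofu serve static files at scale without worrying about bandwidth or performance bottlenecks.
The OpenTofu team noted that it was very important to keep the cost of running the registry as low as possible, in human effort as well as bandwidth. At the same time, they needed to keep registry uptime close to 100% so that thousands of developers are not left unable to update their infrastructure because the registry is down.
The registry codebase, written in Go, pre-generates every possible response of the OpenTofu registry API and uploads the static files to an R2 bucket. By using R2, OpenTofu has been able to run the registry essentially for free, without having to worry about servers or scaling issues.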
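The pre-generation approach can be sketched roughly as follows. This is an illustration in TypeScript rather than the registry's actual Go code, and the `Provider` shape, the path layout, and the `buildStaticFiles` helper are all hypothetical. The point is that every response the API could return is computed ahead of time and keyed by the path it will be served from, which leaves only a static upload step.

```typescript
// Hypothetical input: each provider and the versions published for it.
interface Provider {
  namespace: string;
  name: string;
  versions: string[];
}

// Build every response up front, keyed by the path it will be served from.
// The path layout is an assumption made for illustration only.
function buildStaticFiles(providers: Provider[]): Map<string, string> {
  const files = new Map<string, string>();
  for (const p of providers) {
    const base = `v1/providers/${p.namespace}/${p.name}`;
    // One index document listing all versions of the provider.
    files.set(
      `${base}/versions.json`,
      JSON.stringify({ versions: p.versions.map((v) => ({ version: v })) })
    );
    // One document per version, so no request needs server-side logic.
    for (const v of p.versions) {
      files.set(`${base}/${v}.json`, JSON.stringify({ version: v }));
    }
  }
  return files;
}

// Example data, invented for this sketch.
const staticFiles = buildStaticFiles([
  { namespace: "example", name: "demo", versions: ["1.0.0", "1.1.0"] },
]);
// Each entry would then be uploaded to the bucket as a static object.
```

With this design the serving path has no application logic at all, which is what lets object storage and caching carry the whole load.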
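JuliaLang
JuliaLang recently joined Cloudflare's OSS sponsorship program, and we are delighted to support the core infrastructure that keeps its ecosystem running smoothly. At the heart of that support is enabling JuliaLang to deliver packages to its users seamlessly using Cloudflare's services.
According to Elliot Saba, JuliaLang had been using Amazon Lightsail as a cost-effective global CDN to serve packages to its users. As the user base grew, however, it would sometimes exceed its bandwidth limits, and traffic spikes overloaded the load balancer VMs, driving up cloud costs and degrading performance. JuliaLang now uses Cloudflare R2, and the speed and reliability of R2 object storage have outperformed the in-house solutions in its own data centers. And because there are no bandwidth charges, JuliaLang now gets faster, more reliable service for less than one-tenth of what it used to spend.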
How can we help?
If your project meets our criteria and you want to reduce costs and eliminate surprise bills, we invite you to apply! We want to help the next generation of open source projects make their mark on the Internet.
For more details and to apply, visit our new Project Alexandria page. If you know of other projects that could benefit from this program, please spread the word!