In recent months, we’ve seen a leap forward for closed-source image generation models with the rise of Google’s Nano Banana and OpenAI’s image generation models. Today, we’re happy to share that the open-weight contenders are back with the launch of Black Forest Labs’ FLUX.2 [dev], available to run on Cloudflare’s inference platform, Workers AI. You can read more about the model in detail in BFL’s launch blog post here.
We have been huge fans of Black Forest Labs’ FLUX image models since their earliest versions. Our hosted version of FLUX.1 [schnell] is one of the most popular models in our catalog, known for its photorealistic outputs and high-fidelity generations. When the opportunity came to host the licensed version of their new model, we jumped at it. FLUX.2 takes the best features of FLUX.1 and amps them up, generating even more realistic, grounded images with added customization support like JSON prompting.
Our Workers AI hosted version of FLUX.2 has some specific usage patterns: requests use multipart form data to support input images (up to four 512x512 images), and output images can be up to 4 megapixels. The multipart form data format lets users send multiple image inputs alongside the typical model parameters. Check out our developer docs changelog announcement to learn how to use the FLUX.2 model.
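To make this concrete before the multi-reference examples below, here is a minimal text-to-image sketch from inside a Worker, assuming a Workers AI binding named AI is configured in your project; the prompt and the width/height values are purely illustrative.

// Minimal sketch: text-to-image with no reference images.
export default {
  async fetch(request, env) {
    const form = new FormData();
    form.append("prompt", "a photorealistic Paris café at golden hour");
    form.append("width", "1024"); // optional, illustrative values
    form.append("height", "1024");

    const resp = await env.AI.run("@cf/black-forest-labs/flux-2-dev", {
      multipart: { body: form, contentType: "multipart/form-data" },
    });

    // The output is a base64-encoded image; decoding is shown later in this post.
    return Response.json(resp);
  },
};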
What makes FLUX.2 special? Physical world grounding, digital world assets, and multi-language support
The FLUX.2 model has a more robust understanding of the physical world, allowing you to turn abstract concepts into photorealistic reality. It excels at generating realistic image details, consistently delivering accurate hands, faces, fabrics, logos, and small objects that other models often miss. That same grounding produces life-like lighting, camera angles, and depth perception.
Figure 1. Image generated with FLUX.2 featuring accurate lighting, shadows, reflections and depth perception at a café in Paris.
This high-fidelity output makes it ideal for applications requiring superior image quality, such as creative photography, e-commerce product shots, marketing visuals, and interior design. Because it can understand context, tone, and trends, the model allows you to create engaging and editorial-quality digital assets from short prompts.
Aside from the physical world, the model can also generate high-quality digital assets, such as landing page designs or detailed infographics (see below for an example). It also understands multiple languages naturally, so combining these two features, we can get a beautiful landing page in French from a French prompt:
Générer une page web visuellement immersive pour un service de promenade de chiens. L'image principale doit dominer l'écran, montrant un chien exubérant courant dans un parc ensoleillé, avec des touches de vert vif (#2ECC71) intégrées subtilement dans le feuillage ou les accessoires du chien. Minimiser le texte pour un impact visuel maximal.

(Translation: Generate a visually immersive web page for a dog-walking service. The main image should dominate the screen, showing an exuberant dog running through a sunny park, with touches of bright green (#2ECC71) subtly integrated into the foliage or the dog's accessories. Minimize text for maximum visual impact.)
Character consistency – solving for stochastic drift
In the world of generative AI, getting a high-quality image is easy; getting the exact same character or product twice has always been the hard part. Generated images tend to drift away from the original source material, a phenomenon known as "stochastic drift". FLUX.2 offers multi-reference editing with state-of-the-art character consistency, ensuring identities, products, and styles remain consistent across generations.
Figure 2. Stochastic drift infographic (generated on FLUX.2)
One of FLUX.2’s breakthroughs is multi-reference image input, designed to solve this consistency challenge. You can change the background, lighting, or pose of an image without accidentally changing the face of your model or the design of your product. You can also reference other images or combine multiple images to create something new.
In code, Workers AI supports multi-reference images (up to four) through a multipart form-data upload. The image inputs are binary files, and the output is a base64-encoded image:
curl --request POST \
  --url 'https://api.cloudflare.com/client/v4/accounts/{ACCOUNT}/ai/run/@cf/black-forest-labs/flux-2-dev' \
  --header 'Authorization: Bearer {TOKEN}' \
  --header 'Content-Type: multipart/form-data' \
  --form 'prompt=take the subject of image 2 and style it like image 1' \
  --form input_image_0=@/Users/johndoe/Desktop/icedoutkeanu.png \
  --form input_image_1=@/Users/johndoe/Desktop/me.png \
  --form steps=25 \
  --form width=1024 \
  --form height=1024
We also support this through the Workers AI Binding:
// Fetch the reference image and convert the response body to a Blob
const image = await fetch("http://image-url");
const imageBlob = await image.blob();

// Build the multipart form with the reference image and the prompt
const form = new FormData();
form.append("input_image_0", imageBlob);
form.append("prompt", "a sunset with the dog in the original image");

const resp = await env.AI.run("@cf/black-forest-labs/flux-2-dev", {
  multipart: {
    body: form,
    contentType: "multipart/form-data",
  },
});
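Since the output comes back base64-encoded, a common follow-up is to decode it and serve it as binary. A sketch, assuming the response exposes the encoded image as an image field; confirm the exact field name in the model's schema:

// Decode the base64 output and return it as a PNG response.
// Assumption: the encoded image lives at `resp.image`.
const bytes = Uint8Array.from(atob(resp.image), (c) => c.charCodeAt(0));
return new Response(bytes, { headers: { "Content-Type": "image/png" } });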
Built for real-world use cases
The newest image model signals a shift toward functional business use cases, moving beyond simple image-quality improvements. FLUX.2 enables you to:
Create Ad Variations: Generate 50 different advertisements using the exact same actor, without their face morphing between frames.
Trust Your Product Shots: Drop your product on a model, or into a beach scene, a city street, or a studio table. The environment changes, but your product stays accurate.
Build Dynamic Editorials: Produce a full fashion spread where the model looks identical in every single shot, regardless of the angle.
Figure 3. Combining the oversized hoodie and sweatpants ad photo (generated with FLUX.2) with Cloudflare’s logo to create product renderings with consistent faces, fabrics, and scenery. Note: we prompted for white Cloudflare font instead of the original black font.
Granular controls — JSON prompting, HEX codes and more!
The FLUX.2 model makes another advancement by letting users control fine details in images through JSON prompting and exact hex color codes.
For example, you could send this JSON as a prompt (as part of the multipart form input), and the resulting image follows it exactly:
{
  "scene": "A bustling, neon-lit futuristic street market on an alien planet, rain slicking the metal ground",
  "subjects": [
    {
      "type": "Cyberpunk bounty hunter",
      "description": "Female, wearing black matte armor with glowing blue trim, holding a deactivated energy rifle, helmet under her arm, rain dripping off her synthetic hair",
      "pose": "Standing with a casual but watchful stance, leaning slightly against a glowing vendor stall",
      "position": "foreground"
    },
    {
      "type": "Merchant bot",
      "description": "Small, rusted, three-legged drone with multiple blinking red optical sensors, selling glowing synthetic fruit from a tray attached to its chassis",
      "pose": "Hovering slightly, offering an item to the viewer",
      "position": "midground"
    }
  ],
  "style": "noir sci-fi digital painting",
  "color_palette": [
    "deep indigo",
    "electric blue",
    "acid green"
  ],
  "lighting": "Low-key, dramatic, with primary light sources coming from neon signs and street lamps reflecting off wet surfaces",
  "mood": "Gritty, tense, and atmospheric",
  "background": "Towering, dark skyscrapers disappearing into the fog, with advertisements scrolling across their surfaces, flying vehicles (spinners) visible in the distance",
  "composition": "dynamic off-center",
  "camera": {
    "angle": "eye level",
    "distance": "medium close-up",
    "focus": "sharp on subject",
    "lens": "35mm",
    "f-number": "f/1.4",
    "ISO": 400
  },
  "effects": [
    "heavy rain effect",
    "subtle film grain",
    "neon light reflections",
    "mild chromatic aberration"
  ]
}
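Because the prompt travels as an ordinary form field, a structured prompt like the one above can simply be serialized to a string before it is appended. A sketch with the binding, where jsonPrompt stands in for the object shown above:

// Serialize the structured prompt into the regular `prompt` field.
// `jsonPrompt` is a placeholder for the JSON object from this section.
const form = new FormData();
form.append("prompt", JSON.stringify(jsonPrompt));

const resp = await env.AI.run("@cf/black-forest-labs/flux-2-dev", {
  multipart: { body: form, contentType: "multipart/form-data" },
});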
To take it further, we can ask the model to recolor the accent lighting to a Cloudflare orange by giving it a specific hex code like #F48120.
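One way to do that is a second pass that feeds the first render back in as a reference image along with the recolor instruction. A sketch, where previousImageBlob is a hypothetical Blob holding the first generation's decoded output:

// Hypothetical follow-up edit: reuse the first render as a reference image
// and request the recolor with an exact hex value.
const recolorForm = new FormData();
recolorForm.append("input_image_0", previousImageBlob); // assumed Blob from the prior output
recolorForm.append("prompt", "recolor the neon accent lighting to Cloudflare orange #F48120");

const recolored = await env.AI.run("@cf/black-forest-labs/flux-2-dev", {
  multipart: { body: recolorForm, contentType: "multipart/form-data" },
});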
The newest FLUX.2 [dev] model is now available on Workers AI — you can get started with the model through our developer docs or test it out on our multimodal playground.