The State of Generative Media 2026
A look at fal's deep-dive into the remarkable progress we’re seeing in generative media
Today, our friends at fal released the State of Generative Media Report, a deep-dive into the remarkable progress we’re seeing in generative media. fal has a particularly privileged vantage point: their inference engine serves over 600 models to millions of hobbyists, developers, and enterprises who collectively generate billions of assets. Naturally, we were eager to dig in.
Here are our top takeaways.
1) No one model to rule them all
The world of image and video models is remarkably fragmented, and intentionally so. fal’s report finds that enterprise production deployments use a median of 14 different models. This is a striking contrast to the LLM landscape, where a handful of major labs dominate: a recent report from a16z’s growth team illustrates just how concentrated that market is, with OpenAI, Google (Gemini), and Anthropic together commanding 89% of enterprise wallet share. In generative media, we see nothing like that.
This makes sense, and is actually something we’ve written about before. Even if a model is extraordinary at photorealistic images, or excels at anime aesthetics, or has strong physics simulation, that doesn’t mean you should use it for background removal, sound generation, or multi-shot narrative scenes. Each model tends to be strong in some areas and weaker in others.
The job of infrastructure (and a team like fal) expands accordingly: it’s not just about serving requests efficiently, but about supporting the rapid pace of new releases — rolling out new models every few weeks and providing day-0 support as the field moves faster than enterprise software typically does.
2) From inference to orchestration
Another reason dozens of models get used simultaneously is that producing a single polished asset is rarely a single inference call. In practice, developers chain multiple models together: generate an image, remove the background, upscale it, recolor it, apply a style-consistent LoRA. That chaining is how teams achieve the brand-level consistency and quality that one-shot prompting simply can’t deliver. The unit of work isn’t one model; it’s a workflow.
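To make that concrete, here is a minimal sketch of such a workflow in Python using fal’s fal-client package; the endpoint IDs, argument names, and response fields below are illustrative assumptions and may not match the live API exactly.

```python
# A minimal sketch of a chained generative workflow, assuming fal's Python
# client (`pip install fal-client`). Endpoint IDs and argument/response field
# names are illustrative and may differ from the live API.
import fal_client

def generate_product_asset(prompt: str) -> str:
    # Step 1: generate a base image with a fast text-to-image model.
    gen = fal_client.subscribe(
        "fal-ai/flux/dev",                      # illustrative endpoint ID
        arguments={"prompt": prompt},
    )
    image_url = gen["images"][0]["url"]         # assumed response shape

    # Step 2: remove the background so the asset drops onto any canvas.
    cutout = fal_client.subscribe(
        "fal-ai/imageutils/rembg",              # illustrative endpoint ID
        arguments={"image_url": image_url},
    )

    # Step 3: upscale the cutout to hero-asset resolution.
    upscaled = fal_client.subscribe(
        "fal-ai/esrgan",                        # illustrative endpoint ID
        arguments={"image_url": cutout["image"]["url"]},  # assumed shape
    )
    return upscaled["image"]["url"]

print(generate_product_asset("studio photo of a ceramic mug, soft lighting"))
```

Each step consumes the previous step’s output, which is exactly why latency, cost, and failure handling compound across the pipeline rather than per call.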
This has real implications for infrastructure: it’s not enough to serve individual models quickly. You need to orchestrate multi-step pipelines with low cumulative latency, manage dependencies between steps, and make it easy to swap in new models as the frontier moves. As we get into long-form video, these pipelines grow even more elaborate: producing even a short branded film means chaining together scene generation, camera motion control, character persistence across cuts, dialogue synthesis, sound design, and post-production effects, with each step relying on a specialized model and the output of the step before it.
This is where developer tooling becomes make-or-break. If every model in the pipeline has a different API shape, auth, error handling, and async behavior, your team spends more time on plumbing than product. You need a unified interface across models, workflow primitives to chain steps into a single callable pipeline, streaming for intermediate results, and queue management for long-running jobs. The orchestration layer matters as much as the models themselves, and it’s a big part of why infrastructure like fal’s is so critical to take models from prototyping to production.
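As a rough illustration of what those workflow primitives can look like, here is a tiny, hypothetical wrapper; the ModelStep and run_pipeline names are ours and do not correspond to any particular provider’s SDK.

```python
# A sketch of a unified call shape across models, plus a small workflow
# primitive to chain steps. ModelStep and run_pipeline are hypothetical,
# purely for illustration of the pattern described above.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ModelStep:
    name: str
    run: Callable[[dict[str, Any]], dict[str, Any]]  # uniform signature per step

def run_pipeline(steps: list[ModelStep], inputs: dict[str, Any]) -> dict[str, Any]:
    """Feed each step's output into the next, with per-step error context."""
    state = dict(inputs)
    for step in steps:
        try:
            state.update(step.run(state))
        except Exception as exc:
            raise RuntimeError(f"pipeline failed at step '{step.name}'") from exc
    return state

# Example usage with stubbed steps standing in for real model calls.
pipeline = [
    ModelStep("generate", lambda s: {"image_url": f"gen({s['prompt']})"}),
    ModelStep("remove_bg", lambda s: {"image_url": f"rembg({s['image_url']})"}),
]
print(run_pipeline(pipeline, {"prompt": "red sneaker on white"}))
```

The point is that swapping in a newly released model becomes a one-line change to a single step, rather than a rewrite of the whole pipeline.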
3) Not all pixels are worth the same
One thing we’ve observed is that builders have gotten remarkably savvy about model selection. The key insight: the right model depends on what you’re generating and at what scale.
If you’re producing huge volumes of small, utilitarian images (think product thumbnails or feed assets), you bias toward models that are fast and cheap, because the marginal value of perfection is low and the marginal cost compounds fast. Models like Flux are a natural fit here. Conversely, when you’re generating hero assets where polish is the priority (ad campaigns, logos, brand imagery), it makes sense to pay for something like Nano Banana Pro, because small imperfections will look unprofessional at that level of scrutiny.
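As a toy sketch of that selection logic, a routing rule might look something like the following; the asset categories, volume threshold, and model names are illustrative only.

```python
# A toy routing heuristic for the trade-off described above: fast, cheap models
# for high-volume utilitarian assets; premium models for hero assets.
# Tier names and thresholds are illustrative assumptions, not recommendations.
def pick_model(asset_type: str, monthly_volume: int) -> str:
    if asset_type in {"thumbnail", "feed_asset"} or monthly_volume > 100_000:
        return "flux"             # fast and cheap; imperfections cost little
    if asset_type in {"ad_campaign", "logo", "brand_imagery"}:
        return "nano-banana-pro"  # polish matters; pay for quality
    return "flux"                 # sensible default for everything else

assert pick_model("thumbnail", 500_000) == "flux"
assert pick_model("logo", 200) == "nano-banana-pro"
```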
But the cost calculus doesn’t stop at the model layer. Infrastructure matters too. In fal’s survey with Artificial Analysis, 58% of organizations identified cost optimization as their primary criterion when selecting model infrastructure, ahead of factors like model availability and generation speed. Competition is happening at two layers simultaneously: between infra providers racing to offer the most cost-effective run of a given model, and between models along the cost-quality frontier where the right choice depends on traffic scale and tolerance for imperfection.
4) Adoption is everywhere (but some industries are moving faster than others)
Generative media isn’t a niche anymore. Adoption is showing up across virtually every industry, from creative software to retail and commerce. Three verticals stand out: gaming, advertising, and e-commerce.
In gaming, studios are using generative models to prototype concept art, populate environments, and produce in-game assets at a pace traditional pipelines can’t touch. In advertising, the shift is dramatic: campaigns that once took weeks of production now spin up hundreds of personalized variations in hours. It’s changing the economics of creative testing and spawning entirely new startup categories. And in e-commerce, the case almost makes itself: when you need product shots, lifestyle imagery, and seasonal creative across thousands of SKUs, generative media turns what used to require a team of photographers, weeks of shoots, and long editing cycles into a few prompts and a library of production-ready assets.
5) What comes next?
2026 looks like another packed year for generative media. A few things we’re watching closely:
More capable and coherent video. Seedance 1.0 topped leaderboards in June 2025, and the previews of Seedance 2.0 are blowing us away. Other video model providers like Kling and Grok aren’t far behind – and we expect there’s more to come from Sora at OpenAI and Veo at DeepMind. The next generation of models will continue to push on multi-shot narrative consistency, character persistence across scenes, and controllability. The model releases in 2025 came every 4–6 weeks; there’s no reason to expect that pace to slow down.
World models going from prototype to product. This is arguably the most exciting frontier. In late 2025, Marble (from World Labs) showed that it’s now possible to generate persistent, interactive 3D environments from a single image or text prompt. Meanwhile, other efforts like Genie 3 (from DeepMind) are pushing forward on real-time video that users can explore like a game. We expect the applications of these world models in gaming, entertainment, simulation, and training autonomous systems will be enormous.
Open source gains ground. The closed vs. open source model debate is heating up in generative media. Enterprises are increasingly gravitating toward open-source models for production not because they’re cheaper, but because they’re customizable: when you need brand consistency, character persistence, or product fidelity across millions of generated assets, finetuning on your own data isn’t optional, it’s the whole game. Closed APIs generally don’t allow that, or offer it in very constrained ways. Meanwhile, open-source models like Flux and Qwen Image Edit closed the quality gap faster than anyone expected in 2025.