It's time for agentic video editing

2025 was the year of video. 2026 is the year we let agents edit it.

Jan 21, 2026

Before we start:
a16z is a remarkable collection of people. Today we’re highlighting a few of them as we celebrate some well-deserved promotions and recognition:

We’re thrilled to share that Alex Immerman has been promoted to General Partner on our Growth fund. You can read Alex’s most recent piece on a16z news here, on self-driving cars and their lifesaving potential. Congratulations, Alex!
Similarly, Matt Bornstein has been promoted to General Partner on our Infra fund. You can read Matt’s lovingly curated Sci-fi reading list here. Congratulations, Matt!
The whole Infra fund - led by Martin, Raghu, Jennifer and Matt - got a great profile in Bloomberg that you should read.

And now, on with the show. Here’s Justine:

2025 was the year of video. AI-generated ads went mainstream. Launch videos from seed stage startups got millions of views. Video podcasts and interviews exploded.

What you didn’t see was all the work behind the scenes. Cutting 90 minutes of footage into a three-minute short. Correcting lighting and audio in post-production because you couldn’t nail it in the shoot. Searching for the right music and sound effects.

A common rule of thumb in video production is that you’ll spend 80% of your time & energy on editing, and 20% on filming (or now, generating). Crafting compelling video is typically a long and tedious process – and few people have the “taste” to do it right. There’s a significant barrier to entry.

We now have the technology to hand over some of this work to AI agents, which can help us produce both filmed and generated content. Vision models can watch and comprehend massive amounts of video footage. Agents can analyze, plan, and use editing tools on your behalf. And we have enough training data to teach models what makes a video great.

Video agents will blow out the supply curve for quality video – the kind of content that requires days (or weeks) from professional video editors today. What Cursor did for coding, these agents will do for video production.

Why now?

There’s immense demand for agents that give anyone the skills (and taste) of a professional video editor. So why don’t these products already exist? A few recent developments have unlocked progress here:

Vision models can now process large amounts of video. You have to understand video before you can edit it. This is a non-trivial challenge - there’s a lot of information to process, even in a short clip. We’ve seen a lot of progress with recent LLMs like Gemini 3, GPT-5.2, Molmo 2, and Vidi2, which are inherently multimodal and have longer context windows. Gemini 3 can now process up to an hour of video! You can upload it as an input and ask the model to generate timestamped labels, find a specific moment, or just summarize what’s happening.

Models can now use tools. AI video editors need to be able to take action - not just describe what’s happening or suggest changes. We’re starting to see meaningful progress around LLMs as real agents that can use tools. One of my favorite examples of this is Claude using Blender (a notoriously tricky product that many humans haven’t mastered). You can imagine how this evolves as you give agents access to more tools.

Image and video generation models have improved. I’m a big believer that many video production pipelines will be hybrid - a mix of AI and filmed content. Imagine filming interviews for a documentary, but generating B-roll or historical footage with AI. Or using a motion transfer model to take a reference animation and apply it to a real character. For any of these things to work, models needed to reach a level of quality & consistency to be valuable. Now, that’s finally happening.
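
To make the "timestamped labels" idea concrete, here is a minimal sketch of what an editing tool might do once a vision model has described a video. The output format (one "MM:SS description" or "H:MM:SS description" per line) is my assumption for illustration, not a documented format of any model mentioned above, and the names `Label`, `parse_labels`, and `find_moment` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Label:
    start_s: int  # offset from the start of the video, in seconds
    text: str     # the model's description of what happens at that point


# Assumed format: "MM:SS description" or "H:MM:SS description", one per line.
LINE = re.compile(r"^\s*(?:(\d+):)?(\d{1,2}):(\d{2})\s+(.+?)\s*$")


def parse_labels(raw: str) -> list[Label]:
    """Turn the model's text output into labels sorted by start time."""
    labels = []
    for line in raw.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip anything that is not a timestamped line
        hours, minutes, seconds, text = m.groups()
        start = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
        labels.append(Label(start, text))
    return sorted(labels, key=lambda label: label.start_s)


def find_moment(labels: list[Label], query: str):
    """Return the earliest label whose text mentions the query, or None."""
    q = query.lower()
    return next((label for label in labels if q in label.text.lower()), None)
```

A real agent would probably ask the model to find the moment directly. The point is only that once video is reduced to structured, timestamped text, ordinary code (and ordinary editing tools) can act on it.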

What will these agents do?

A few examples of the types of tasks they’ll be able to tackle for us:

Process - whether you’re filming or generating a video, you’ll likely end up with a lot more footage than you need (sometimes by a factor of hundreds - imagine how many “takes” there are for each scene of a movie or TV show). It’s often a challenge to sort through all this footage, organize it, and decide what to use. Products like Eddie AI can take hours of uploaded video and do things like identifying A-roll vs. B-roll, processing multiple camera angles, and comparing takes.

Orchestrate - if we assume that many videos are going to include some element of AI in the future, we’re going to need agents that orchestrate all of the models. For example - imagine you want to add an AI animation to an educational video. You’ll need an agent that can generate the images, send them to a video model, and stitch the outputs together. Products like Glif are launching agents that coordinate between multiple models on a user’s behalf.

Polish - fixing the small details often ends up taking a video from good to great. But if you’re not a professional video editor, you may be overwhelmed by the flood of little tasks needed to polish a video. For example - you may want to adjust lighting between clips, clean noise out of the audio track, or take out filler words (“ummms” and “uhhhs”) during an interview. Products like Descript’s Underlord agent can take a video, make all these changes for you, and deliver the final version.

Adapt - when you make a good video, it often makes sense to adapt it for more reach. For example, you may want to cut a YouTube podcast into short clips with different aspect ratios to post on your X, Instagram, and TikTok accounts. Or even translate a video into other languages (and re-dub the speakers) to reach an international audience. Platforms like Overlap allow you to set up node workflows for these adaptation tasks.

Optimize - the ultimate goal isn’t just replacing manual tasks with AI. It’s building agents with taste that can make your videos better. There’s a reason people hire professional video editors: they make things look good. They spend years learning everything from how to hook viewers to pacing the storyline and using music to build an emotional reaction. There are thousands of micro-decisions there. YouTuber Emma Chamberlain famously said that she used to spend 30-40 hours editing a ~15 minute vlog.

What if an AI agent could watch your footage, ask about your objectives, and then craft a few draft versions of a video for you to iterate on? You review and direct - “The opening is too slow.” “Cut the middle section.” “Make the ending hit harder” - and the agent executes.
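
Two of the tasks above, Polish and Adapt, come down to fairly simple arithmetic once a model has done the understanding. The sketch below is illustrative only and is not how Descript, Overlap, or any other product named here works. It assumes a transcript with word-level timestamps is already available, and `keep_ranges`, `center_crop`, and the `FILLERS` list are hypothetical names I made up.

```python
FILLERS = {"um", "umm", "uh", "uhh", "er"}  # assumed filler list, extend as needed


def keep_ranges(words, fillers=FILLERS):
    """Polish: given (text, start_s, end_s) words, return the time ranges to keep.

    A new range starts at the beginning and after every removed filler word,
    so each gap between ranges is a cut the editor needs to make.
    """
    ranges = []
    after_cut = True
    for text, start, end in words:
        if text.lower().strip(".,!?") in fillers:
            after_cut = True
            continue
        if after_cut:
            ranges.append((start, end))
            after_cut = False
        else:
            ranges[-1] = (ranges[-1][0], end)  # extend the current range
    return ranges


def center_crop(width, height, target_w, target_h):
    """Adapt: largest centered crop (x, y, w, h) with aspect ratio target_w:target_h."""
    if width * target_h > height * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        w, h = height * target_w // target_h, height
    else:
        # Source is taller than (or matches) the target: keep full width.
        w, h = width, width * target_h // target_w
    return ((width - w) // 2, (height - h) // 2, w, h)
```

For example, `center_crop(1920, 1080, 9, 16)` gives the vertical slice of a landscape frame you would post to a short-form feed. A centered crop is the naive choice. An agent with taste would track the speaker and move the crop window, which is exactly the kind of judgment the Optimize step is about.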

Video has won. It’s how we learn, market, and connect. But the editing bottleneck keeps growing. More footage captured, more platforms to publish on, more formats required.

The good news is that the technology to solve this exists. Vision models, tool-using agents, and massive amounts of training data have all matured in the past year. The pieces are in place.

This means that AI editing agents will dramatically increase the quality of all video we see in the coming months and years, along with the rate at which it’s created.

2025 was the year of video. 2026 is the year we let agents edit it.

If you’re building anything in the AI video space, reach out on X at @venturetwins or email me at jmoore@a16z.com. I especially want to talk if you’re building agents that can help me edit my companion video to this blog post (see below!).

This newsletter is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. Furthermore, this content is not investment advice, nor is it intended for use by any investors or prospective investors in any a16z funds. This newsletter may link to other websites or contain other information obtained from third-party sources - a16z has not independently verified nor makes any representations about the current or enduring accuracy of such information. If this content includes third-party advertisements, a16z has not reviewed such advertisements and does not endorse any advertising content or related companies contained therein. Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z; visit https://a16z.com/investment-list/ for a full list of investments. Other important information can be found at a16z.com/disclosures. You’re receiving this newsletter since you opted in earlier; if you would like to opt out of future newsletters you may unsubscribe immediately.

A guest post by Justine Moore, Partner at a16z

Jacek Sidzina

Jan 21

Justine - couldn't agree more that video editing is ready for its Cursor moment. The tech is finally here: vision models, tool-using agents, massive training data.

But here's what I'm seeing from the trenches: the 20/80 rule (20% filming, 80% editing) is actually just the tip of the iceberg. Above that sits an even bigger creative layer - story development, concept iteration, voice-over scripting, audio design, team approvals, client feedback loops.

The real bottleneck isn't just editing execution - it's the orchestration of the entire creative workflow. We're building exactly that: a centralized platform that connects all these human & process challenges first, then plugs AI agents into the right moments. Because the best AI video tools will fail if they can't integrate into how teams actually work.

The agents are coming. But we need to nail the basics of workflow & collaboration to truly unlock their potential. That's what we're solving. Happy to connect and share insights.

Jesse Gurevich

Jan 21 (edited)

Great insights! I’m still looking for an app to go from script to quality video production. Any suggestions? Or do I need a few apps to accomplish this?
