The Agentic Discovery Problem
Media Discovery in an Agentic World: Why Legacy Metadata Alone Won't Save You

Ben Thompson recently published a Stratechery interview with Parag Agrawal, the former Twitter CEO and founder of Parallel, about valuing content on the agentic web.
The immediate implication is for web publishers. There is a bigger implication for media.
Start with these scenarios.
You ask your agent to find something to watch. Not a specific title. A direction. Something tense without violence. Smart but not slow. Some humor. Maybe something that feels like early Spielberg, but less nostalgic. Or a documentary that explains a complex topic without feeling like homework.
How does the agent figure that out?
Or, you remember hearing a podcast segment where Simon Willison talked about dark factories. You do not remember the podcast, the episode, or the timestamp. You just remember the topic and the person. You ask your agent to find the clip.
Can it?
Right now, probably not reliably.
And the reason is not model intelligence. It is that the content itself is not exposed in a way that lets an agent consistently understand what it is, what it contains, why it matters, and when it is the right match for a particular intent.
That is a real issue for media companies.
The problem is not that media assets lack metadata. They have titles, synopses, cast lists, genres, ratings, runtimes, artwork, reviews, IDs, platform descriptions, and plenty of other descriptive material. But that's not the same as exposing the semantics of a media asset in a predictable, publisher-controlled, machine-readable way that agents can reliably use for discovery.
A synopsis isn't enough. Sure, a synopsis can tell an agent what a film is nominally about. It usually cannot tell the agent what the film feels like, or reliably expose pacing, tone, emotional register, visual style, thematic depth, cultural context, audience fit, and scene-level moments. Most pointedly, it cannot reliably convey why this title is a better match than another title for a specific person in a specific moment. And as agents build deep models of individual taste, the logical next step beyond better matching is content that adapts: versions, formats, presentations, and eventually creative elements tailored to the person receiving them. That's a post for another day, but the foundation is the same: the agent has to understand the content atomically first.
The same problem applies to television, podcasts, news video, sports archives, documentaries, music, audiobooks, educational media, and brand libraries.
Much of that understanding exists somewhere: in marketing copy, reviews, IMDb, fan wikis, transcripts, social conversations, and the heads of the people who made, programmed, acquired, or marketed the content.
That scattered knowledge is not an agentic discovery strategy.
When agents search, compare, and recommend, do media companies really want them relying on whatever they can infer, or do they want to provide a richer, more authoritative picture of their own content?
Parallel's work points toward a world where agents become a major class of content users. That world is here. Cloudflare's latest data shows that automated bots now account for 57.5% of all HTTP requests to web content globally, versus 42.5% from humans, a threshold CEO Matthew Prince had not expected until 2027.
Agents will not browse a carousel, admire key art, or scroll a feed. They will evaluate options against an intent. For media companies, that raises a hard question: can an agent understand your catalog well enough to recommend it?
Media companies can influence discovery through trailers, artwork, editorial placement, paid promotion, metadata feeds, PR, platform relationships, and recommendation systems. Nonetheless, in an agent-mediated environment, the decision may happen somewhere else entirely.
If the agent making that decision cannot access a reliable semantic representation of the asset, the publisher has lost control over how the content is interpreted. A media company may own the asset, but not the agent's understanding of the asset. That gap matters because agentic discovery will not only rank what is popular. It will rank what is legible.
Agent-legible media means more than metadata. It means exposing the meaning, structure, rights, availability, and relevance signals around media assets in a form agents can reliably interpret.
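To make that concrete, here is a minimal sketch of what a publisher-authored descriptor could look like, using the podcast scenario from earlier. Everything in it is an assumption for illustration: the field names, the `AssetDescriptor` and `Segment` shapes, the `find_segments` helper, and the sample episode are all hypothetical, not an existing standard or any vendor's format. The lookup is exact string matching, which a real agent would replace with something far more forgiving.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class Segment:
    """A clip-level moment inside an asset (hypothetical shape)."""
    start_seconds: int
    end_seconds: int
    speakers: list[str]
    topics: list[str]


@dataclass
class AssetDescriptor:
    """Publisher-authored, machine-readable description of one asset."""
    asset_id: str
    title: str
    tone: list[str]          # meaning: what the asset feels like
    themes: list[str]        # meaning: what it is actually about
    territories: list[str]   # rights and availability
    segments: list[Segment] = field(default_factory=list)  # structure


def find_segments(catalog, speaker, topic):
    """Return (asset_id, segment) pairs where `speaker` discusses `topic`.

    Exact string matching only; this is a deliberate simplification.
    """
    hits = []
    for asset in catalog:
        for seg in asset.segments:
            if speaker in seg.speakers and topic in seg.topics:
                hits.append((asset.asset_id, seg))
    return hits


# Made-up sample data: one episode with two described segments.
catalog = [
    AssetDescriptor(
        asset_id="podcast-ep-001",
        title="Example Episode",
        tone=["conversational"],
        themes=["software", "automation"],
        territories=["US", "GB"],
        segments=[
            Segment(0, 600, ["Host"], ["introductions"]),
            Segment(600, 1500, ["Simon Willison"], ["dark factories"]),
        ],
    )
]

hits = find_segments(catalog, "Simon Willison", "dark factories")

# The same descriptor serialized as JSON, the form an agent would fetch.
print(json.dumps(asdict(catalog[0]), indent=2))
```

The point of the sketch is where the answer comes from. The agent finds the clip because the publisher described the segment, not because it guessed correctly from a title and a synopsis.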
This is not SEO for AI. That framing is too small. The question is: can an agent understand our content well enough to know when it is the right answer?
Right now, for most of the industry, the answer is no. When agents become a primary interface for discovery, that will not be a metadata problem.
It will be a distribution and a monetization problem.