I think an interesting point worth mentioning is that the basic unit of any business or work is a person, and the ability of any given person to bring their own context when interacting with agents and LLMs is key.
I built my personal context layer with MindStash, where I capture any new piece of information I find relevant (including this piece), as well as my thoughts. It puts everything into a knowledge graph I can talk to, where I can see all of the connections.
Even more relevant is a human context layer based on real psychological data, to help agents understand who they're interacting with and how to adapt. The number one pushback from users right now is that agents lack personality and feel bland.
https://technicallyentertaining.substack.com/p/why-your-1-billion-ai-agent-still?utm_source=app-post-stats-page&r=uzcdz&utm_medium=ios
Great article!
While building a text-to-SQL analytics agent on top of Metabase, we saw the same thing. Simple RAG over schema metadata worked for basic queries, but it often failed because the agent lacked context about real data values, business definitions, and how tables actually relate.
What improved results was letting the agent explore the database (inspect tables, test small queries, and learn over time) more like a human analyst would.
I shared some of the lessons from building it here:
https://medium.com/@sebastiancajamarca/building-the-analytics-agent-on-metabase-a-progress-report-44192f2d4468
Curious if others building data agents have run into the same challenges.
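For the curious, here is a minimal sketch of what those exploration tools might look like, using an in-memory SQLite database as a stand-in for the real warehouse (the table, columns, and sample data are invented for illustration):

```python
import sqlite3

# Stand-in database; in practice these tools would wrap the warehouse connection.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, status TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'paid', 19.9), (2, 'refunded', 5.0);
""")

def list_tables():
    """Tool 1: let the agent discover what tables exist."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
    return [r[0] for r in rows]

def sample_rows(table, n=3):
    """Tool 2: show real values, not just column names."""
    cur = conn.execute(f"SELECT * FROM {table} LIMIT {int(n)}")
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

def try_query(sql, limit=10):
    """Tool 3: run a small probe query and return rows or the error text,
    so the agent can iterate instead of failing silently."""
    try:
        return conn.execute(f"SELECT * FROM ({sql}) LIMIT {int(limit)}").fetchall()
    except sqlite3.Error as e:
        return f"error: {e}"

print(list_tables())             # ['orders']
print(sample_rows("orders", 1))  # real values, e.g. status='paid'
print(try_query("SELECT status, SUM(amount) FROM orders GROUP BY status"))
```

The point of the third tool is that an error string is itself useful context: the agent can read it and revise the query, the way a human analyst would.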
Agents fail without context regardless of scale. When context is local, free, and hierarchically organized rather than flat embeddings, different agents can pull different slices from the same knowledge base. A coding agent gets technical context, a data agent gets business definitions, a planning agent gets decision history. Same underlying structure, composed differently per session. For enterprise, apply RBAC and SOC 2 controls on top.
The context layer becomes a coordination mechanism across agents, not just a memory store for one. A promising early implementation of this is ByteRover: local and free, human-editable Markdown files, a hierarchical context tree, recall scored by importance and recency, and it works across tools.
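To make the "different slices, same structure" idea concrete, here is a hypothetical sketch (not ByteRover's actual format or API): a hierarchical store where each agent type pulls its own subtree, and recall is scored by importance with a recency decay:

```python
import time

# Hierarchical context store: path -> (text, importance).
# Paths and contents are invented for illustration.
STORE = {
    "eng/api/auth.md":      ("JWT tokens expire after 15 min", 0.9),
    "biz/metrics/churn.md": ("Churn = cancels / active, monthly", 0.8),
    "decisions/2024-q3.md": ("Chose Postgres over Dynamo for joins", 0.7),
}

SLICES = {  # which subtree each agent type pulls
    "coding":   ["eng/"],
    "data":     ["biz/"],
    "planning": ["decisions/"],
}

def recall(agent_type, now=None, half_life_days=30, timestamps=None):
    """Return this agent's slice, scored by importance * recency decay."""
    now = now or time.time()
    timestamps = timestamps or {}  # path -> last-updated unix time
    out = []
    for path, (text, importance) in STORE.items():
        if not any(path.startswith(prefix) for prefix in SLICES[agent_type]):
            continue  # outside this agent's slice
        age_days = (now - timestamps.get(path, now)) / 86400
        score = importance * 0.5 ** (age_days / half_life_days)
        out.append((score, path, text))
    return sorted(out, reverse=True)

for score, path, text in recall("data"):
    print(f"{score:.2f} {path}: {text}")
```

The same `STORE` serves every agent; only the slice and the scoring change per session, which is what makes it a coordination mechanism rather than per-agent memory.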
Or, you could just use the existing frameworks and guidance and masterfully build context-heavy skills to orchestrate very brittle workflows (like signing / certifying / notarizing 30 languages into a single package): https://blog.yen.chat/p/skills
Nice articulation of the context problem in data agents. Thanks.
They fail not because LLMs are weak but because they lack stable, maintained, purposeful context bridging messy data to real tasks. Context isn’t just semantics or schemas; it’s constraint, intent, tribal knowledge, and task logic that makes agent behaviour reliable.
That’s precisely where upstream human-declarative signal infrastructure matters: structured, constraint-aware human intent becomes the context layer agents can reliably reason over. Building better context isn’t just about ingesting data, it’s about anchoring it to declared human purpose and constraint before it becomes noise.
If you're thinking about scalable context layers, think about protocols for high-fidelity human signal; that's the missing substrate agents actually need.
That's what we're working on - https://s7tlabs.com/
Very legitimate assessment, applicable in many industries.
Great article, I liked it.
Data context explains what things mean.
Decision context explains why the organization behaves the way it does.
Most agent stacks today implement the first.
The second — a persistent memory of decisions and their causal dependencies — still feels largely unexplored.
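A decision-context store could be as simple as a log in which each entry links to the decisions it causally depends on. This is an illustrative sketch with invented decision IDs, not a product API:

```python
# Minimal decision-context log: each decision records its rationale and
# which earlier decisions it causally depends on.
decisions = {}

def record(decision_id, summary, rationale, depends_on=()):
    decisions[decision_id] = {
        "summary": summary,
        "rationale": rationale,
        "depends_on": list(depends_on),
    }

def why_chain(decision_id):
    """Walk causal dependencies so an agent can answer 'why do we do X?'."""
    chain, stack = [], [decision_id]
    while stack:
        d = stack.pop()
        if d in decisions and d not in chain:
            chain.append(d)
            stack.extend(decisions[d]["depends_on"])
    return [f"{d}: {decisions[d]['rationale']}" for d in chain]

record("D1", "Use Postgres", "Need relational joins")
record("D2", "Single region", "Postgres HA is simpler in one region", ["D1"])
record("D3", "EU data residency waiver", "Only one region exists", ["D2"])
print(why_chain("D3"))
```

Given only data context, an agent sees "one region" as an arbitrary fact; given the chain, it can see that changing D1 would ripple through D2 and D3.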
Great article!
While building a text-to-SQL analytics agent on top of Metabase, we saw the same thing. Simple RAG over schema metadata worked for basic queries, but it often failed because the agent lacked context about real data values, business definitions, and how tables actually relate.
Exactly what my team and I are building at Papermap.ai, great read
Thanks for putting this together. I think in enterprises the business context will likely live closer to the BI layer because:
1. Data or engineering teams usually won’t be interested in owning and managing it.
2. Sales, marketing, and other business teams evolve or redefine their metrics over time, so they’ll want ownership.
For smaller teams, however, this may all sit with a single data person who also understands the business context. That setup can help startups move much faster, which we’re already seeing.
Great minds think alike! https://nickcicero.substack.com/p/the-hidden-problem-with-ai-agents?r=3xns6&utm_medium=ios&triedRedirect=true&_src_ref=t.co
The a16z article asks: “Where will the context layer live?”
I develop ‘the specification’. The specification declares what data is processed, for what purpose, under what legal basis, for how long, and by whom. Governed apps are generated from the specification. A policy card is compiled from the specification. The MCP tools expose the specification to agents. The kernel enforces the specification at runtime. The wallet stores the data and the audit trail.
The future the a16z article gestures toward — truly autonomous agents operating safely across enterprise data — requires more than a context layer. It requires a governance layer that agents cannot circumvent, a wallet architecture that keeps data sovereign, and formal authority bounds that make it structurally impossible for an agent to exceed its mandate.
There is no separate context construction pipeline. There is no tribal knowledge that can go stale. The governance is the context. The specification generates everything.
∙ For humans: Specifications are readable documentation that actually controls the system
∙ For agents: Specifications are machine-executable policy that bounds their actions
∙ For compliance: Every action is auditable to a specification, legal basis, and formal rule
∙ For sovereignty: Data stays in wallets; apps connect to wallets, not the reverse
Sovereign by design. Compliant by construction. Governed by specification.
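As an illustration only (the field names and checks below are my own invention, not the author's actual system), a toy version of "the specification generates everything" might look like this: one declared spec rendered as human-readable documentation and enforced as machine policy at runtime:

```python
# Toy "specification as single source" sketch. Field names are illustrative,
# not a real standard or the commenter's actual schema.
SPEC = {
    "data": "email_address",
    "purpose": "billing",
    "legal_basis": "contract",
    "retention_days": 365,
    "processors": ["billing-agent"],
}

def render_doc(spec):
    """For humans: the spec *is* the documentation."""
    return (f"{spec['data']} is processed for {spec['purpose']} "
            f"under {spec['legal_basis']}, kept {spec['retention_days']} days, "
            f"by {', '.join(spec['processors'])}.")

def authorize(spec, agent, purpose):
    """For agents: a kernel-style check that bounds every access to the spec."""
    return agent in spec["processors"] and purpose == spec["purpose"]

print(render_doc(SPEC))
assert authorize(SPEC, "billing-agent", "billing")
assert not authorize(SPEC, "marketing-agent", "ads")  # exceeds mandate
```

The interesting property is that the documentation and the enforcement can never drift apart, because both are derived from the same `SPEC` object.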
The big overlooked data source for context, one that includes org context, decision traces, and human feedback all in one, is the AI agent/chat sessions of folks across the company. Make context extraction and disbursal agent-agnostic and you've got one of the richest context sources, especially when overlapped with org data from the CRM, warehouse, etc.
PS: our attempt to build a cross-agent context layer that overlaps with org data is available as a context CLI for agents: https://www.npmjs.com/package/@nex-ai/nex
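A stripped-down sketch of that extraction idea, with invented trigger phrases and session format (a real system would need far more robust classification than keyword matching):

```python
# Agent-agnostic context extraction from chat sessions: pull out candidate
# "durable facts" (decisions, definitions, corrections) from raw transcripts.
TRIGGERS = ("we decided", "is defined as", "actually it's", "going forward")

def extract_context(session):
    """session: list of {'role', 'text'} turns from any agent or chat tool."""
    facts = []
    for turn in session:
        text = turn["text"].lower()
        if any(t in text for t in TRIGGERS):
            facts.append({"source_role": turn["role"], "fact": turn["text"]})
    return facts

session = [
    {"role": "user", "text": "We decided churn excludes trial users."},
    {"role": "assistant", "text": "Got it, updating the query."},
]
print(extract_context(session))
```

Because the input is just role/text turns, the same extractor runs over sessions from any tool; the extracted facts can then be joined against CRM or warehouse entities downstream.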
It's not just "proper data"; it's also how context is assembled: how conflicting, stale, or unrelated data is processed to surface the relevant bits.
In some ways, this is not much different from around 2014, when "data was the new oil" and it was thought that dumping data into data lakes would automatically make it useful.
What actually happened was the rise of specialized compute and storage: Spark, document DBs, and KV stores, which made the underlying data swamps useful.
Shared on this earlier today:
https://www.linkedin.com/pulse/context-assembly-pipeline-harsh-chaudhary-xofie
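A toy illustration of that assembly step, resolving a conflict between a stale-but-authoritative snippet and a fresh-but-informal one (the source ranking and snippet format are invented assumptions):

```python
from datetime import datetime, timezone

# Context assembly sketch: given several candidate snippets for one concept,
# prefer more authoritative sources, breaking ties by freshness.
AUTHORITY = {"metrics_catalog": 3, "wiki": 2, "slack": 1}

def assemble(snippets):
    """snippets: [{'concept', 'text', 'source', 'updated'}] -> one text per concept."""
    best = {}
    for s in snippets:
        key = s["concept"]
        rank = (AUTHORITY.get(s["source"], 0), s["updated"])
        if key not in best or rank > best[key][0]:
            best[key] = (rank, s)
    return {k: v[1]["text"] for k, v in best.items()}

snippets = [
    {"concept": "churn", "text": "cancels / signups", "source": "slack",
     "updated": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"concept": "churn", "text": "cancels / active accounts",
     "source": "metrics_catalog",
     "updated": datetime(2023, 1, 1, tzinfo=timezone.utc)},
]
print(assemble(snippets))  # the catalog definition wins despite being older
```

The dumped-data-lake failure mode is exactly the absence of this step: both churn definitions land in the store, and whichever one the retriever happens to surface wins.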
Same pattern in a different slice: we see the “context layer” problem in engineering when code agents (Cursor, Claude, etc.) and humans collaborate on specs. Constraints and definitions drift, there's no single source of truth, and tribal knowledge lives in scattered markdown. We’re building Specularis.org to keep specs canonical and versioned for that workflow. The “living corpus” idea maps cleanly to specs as the context agents and humans share. Great post, thanks!