I think an interesting point worth mentioning is that the basic unit of any business or work is a person, and the ability of any given person to bring their own context when interacting with agents and LLMs is key.
I built my personal context layer with MindStash, where I capture any new piece of information I find relevant (including this piece), as well as my thoughts. It puts everything into a knowledge graph I can talk to, where I can see all of the connections.
Even more relevant is a human context layer based on real psychological data, to help agents understand who they're interacting with and how to adapt. The number one pushback from users right now is that agents lack personality and feel bland.
https://technicallyentertaining.substack.com/p/why-your-1-billion-ai-agent-still?utm_source=app-post-stats-page&r=uzcdz&utm_medium=ios
Great article!
While building a text-to-SQL analytics agent on top of Metabase, we saw the same thing. Simple RAG over schema metadata worked for basic queries, but it often failed because the agent lacked context about real data values, business definitions, and how tables actually relate.
What improved results was letting the agent explore the database (inspect tables, test small queries, and learn over time) more like a human analyst would.
I shared some of the lessons from building it here:
https://medium.com/@sebastiancajamarca/building-the-analytics-agent-on-metabase-a-progress-report-44192f2d4468
Curious if others building data agents have run into the same challenges.
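For the curious, here is a minimal sketch of what those exploration tools might look like, using an in-memory SQLite database as a stand-in for the real warehouse (the table, columns, and sample data are invented for illustration):

```python
import sqlite3

# Stand-in database; in practice these tools would wrap the warehouse connection.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, status TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'paid', 19.9), (2, 'refunded', 5.0);
""")

def list_tables():
    """Tool 1: let the agent discover what tables exist."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
    return [r[0] for r in rows]

def sample_rows(table, n=3):
    """Tool 2: show real values, not just column names."""
    cur = conn.execute(f"SELECT * FROM {table} LIMIT {int(n)}")
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]

def try_query(sql, limit=10):
    """Tool 3: run a small probe query and return rows or the error text,
    so the agent can iterate instead of failing silently."""
    try:
        return conn.execute(f"SELECT * FROM ({sql}) LIMIT {int(limit)}").fetchall()
    except sqlite3.Error as e:
        return f"error: {e}"

print(list_tables())             # ['orders']
print(sample_rows("orders", 1))  # real values, e.g. status='paid'
print(try_query("SELECT status, SUM(amount) FROM orders GROUP BY status"))
```

The point of the third tool is that an error string is itself useful context: the agent can read it and revise the query, the way a human analyst would.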
Agents fail without context regardless of scale. When context is local, free, and hierarchically organized rather than flat embeddings, different agents can pull different slices from the same knowledge base. A coding agent gets technical context, a data agent gets business definitions, a planning agent gets decision history. Same underlying structure, composed differently per session. For enterprise, apply RBAC and SOC 2 controls on top.
The context layer becomes a coordination mechanism across agents, not just a memory store for one. A promising early implementation of this is ByteRover: local and free, human-editable Markdown files, a hierarchical context tree, recall scored by importance and recency, and it works across tools.
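To make the "different slices, same structure" idea concrete, here is a hypothetical sketch (not ByteRover's actual format or API): a hierarchical store where each agent type pulls its own subtree, and recall is scored by importance with a recency decay:

```python
import time

# Hierarchical context store: path -> (text, importance).
# Paths and contents are invented for illustration.
STORE = {
    "eng/api/auth.md":      ("JWT tokens expire after 15 min", 0.9),
    "biz/metrics/churn.md": ("Churn = cancels / active, monthly", 0.8),
    "decisions/2024-q3.md": ("Chose Postgres over Dynamo for joins", 0.7),
}

SLICES = {  # which subtree each agent type pulls
    "coding":   ["eng/"],
    "data":     ["biz/"],
    "planning": ["decisions/"],
}

def recall(agent_type, now=None, half_life_days=30, timestamps=None):
    """Return this agent's slice, scored by importance * recency decay."""
    now = now or time.time()
    timestamps = timestamps or {}  # path -> last-updated unix time
    out = []
    for path, (text, importance) in STORE.items():
        if not any(path.startswith(prefix) for prefix in SLICES[agent_type]):
            continue  # outside this agent's slice
        age_days = (now - timestamps.get(path, now)) / 86400
        score = importance * 0.5 ** (age_days / half_life_days)
        out.append((score, path, text))
    return sorted(out, reverse=True)

for score, path, text in recall("data"):
    print(f"{score:.2f} {path}: {text}")
```

The same `STORE` serves every agent; only the slice and the scoring change per session, which is what makes it a coordination mechanism rather than per-agent memory.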
Or, you could just use the existing frameworks and guidance and masterfully build context-heavy skills to orchestrate very brittle workflows (like signing / certifying / notarizing 30 languages into a single package): https://blog.yen.chat/p/skills
Nice articulation of the context problem in data agents. Thanks.
They fail not because LLMs are weak but because they lack stable, maintained, purposeful context bridging messy data to real tasks. Context isn’t just semantics or schemas; it’s constraint, intent, tribal knowledge, and task logic that makes agent behaviour reliable.
That’s precisely where upstream human-declarative signal infrastructure matters: structured, constraint-aware human intent becomes the context layer agents can reliably reason over. Building better context isn’t just about ingesting data, it’s about anchoring it to declared human purpose and constraint before it becomes noise.
If you're thinking about scalable context layers, think about protocols for high-fidelity human signal; that's the missing substrate agents actually need.
That's what we're working on - https://s7tlabs.com/
Very legitimate assessment, applicable in many industries.
Great article, I liked it.
Data context explains what things mean.
Decision context explains why the organization behaves the way it does.
Most agent stacks today implement the first.
The second — a persistent memory of decisions and their causal dependencies — still feels largely unexplored.
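A decision-context store could be as simple as a log in which each entry links to the decisions it causally depends on. This is an illustrative sketch with invented decision IDs, not a product API:

```python
# Minimal decision-context log: each decision records its rationale and
# which earlier decisions it causally depends on.
decisions = {}

def record(decision_id, summary, rationale, depends_on=()):
    decisions[decision_id] = {
        "summary": summary,
        "rationale": rationale,
        "depends_on": list(depends_on),
    }

def why_chain(decision_id):
    """Walk causal dependencies so an agent can answer 'why do we do X?'."""
    chain, stack = [], [decision_id]
    while stack:
        d = stack.pop()
        if d in decisions and d not in chain:
            chain.append(d)
            stack.extend(decisions[d]["depends_on"])
    return [f"{d}: {decisions[d]['rationale']}" for d in chain]

record("D1", "Use Postgres", "Need relational joins")
record("D2", "Single region", "Postgres HA is simpler in one region", ["D1"])
record("D3", "EU data residency waiver", "Only one region exists", ["D2"])
print(why_chain("D3"))
```

Given only data context, an agent sees "one region" as an arbitrary fact; given the chain, it can see that changing D1 would ripple through D2 and D3.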
Great article!
While building a text-to-SQL analytics agent on top of Metabase, we saw the same thing. Simple RAG over schema metadata worked for basic queries, but it often failed because the agent lacked context about real data values, business definitions, and how tables actually relate.
Exactly what my team and I are building at Papermap.ai, great read
Thanks for putting this together. I think in enterprises the business context will likely live closer to the BI layer because:
1. Data or engineering teams usually won’t be interested in owning and managing it.
2. Sales, marketing, and other business teams evolve or redefine their metrics over time, so they’ll want ownership.
For smaller teams, however, this may all sit with a single data person who also understands the business context. That setup can help startups move much faster, which we’re already seeing.
Great minds think alike! https://nickcicero.substack.com/p/the-hidden-problem-with-ai-agents?r=3xns6&utm_medium=ios&triedRedirect=true&_src_ref=t.co
The a16z article asks: “Where will the context layer live?”
I develop ‘the specification’. The specification declares what data is processed, for what purpose, under what legal basis, for how long, and by whom. Governed apps are generated from the specification. A policy card is compiled from the specification. The MCP tools expose the specification to agents. The kernel enforces the specification at runtime. The wallet stores the data and the audit trail.
The future the a16z article gestures toward — truly autonomous agents operating safely across enterprise data — requires more than a context layer. It requires a governance layer that agents cannot circumvent, a wallet architecture that keeps data sovereign, and formal authority bounds that make it structurally impossible for an agent to exceed its mandate.
There is no separate context construction pipeline. There is no tribal knowledge that can go stale. The governance is the context. The specification generates everything.
∙ For humans: Specifications are readable documentation that actually controls the system
∙ For agents: Specifications are machine-executable policy that bounds their actions
∙ For compliance: Every action is auditable to a specification, legal basis, and formal rule
∙ For sovereignty: Data stays in wallets; apps connect to wallets, not the reverse
Sovereign by design. Compliant by construction. Governed by specification.
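As an illustration only (the field names and checks below are my own invention, not the author's actual system), a toy version of "the specification generates everything" might look like this: one declared spec rendered as human-readable documentation and enforced as machine policy at runtime:

```python
# Toy "specification as single source" sketch. Field names are illustrative,
# not a real standard or the commenter's actual schema.
SPEC = {
    "data": "email_address",
    "purpose": "billing",
    "legal_basis": "contract",
    "retention_days": 365,
    "processors": ["billing-agent"],
}

def render_doc(spec):
    """For humans: the spec *is* the documentation."""
    return (f"{spec['data']} is processed for {spec['purpose']} "
            f"under {spec['legal_basis']}, kept {spec['retention_days']} days, "
            f"by {', '.join(spec['processors'])}.")

def authorize(spec, agent, purpose):
    """For agents: a kernel-style check that bounds every access to the spec."""
    return agent in spec["processors"] and purpose == spec["purpose"]

print(render_doc(SPEC))
assert authorize(SPEC, "billing-agent", "billing")
assert not authorize(SPEC, "marketing-agent", "ads")  # exceeds mandate
```

The interesting property is that the documentation and the enforcement can never drift apart, because both are derived from the same `SPEC` object.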
The big overlooked data source for context, one that includes org context, decision traces, and human feedback all in one, is the AI agent/chat sessions of folks across the company. Make context extraction and disbursal agent-agnostic and you've got one of the richest context sources, especially when overlapped with org data from the CRM, warehouse, etc.
PS: our attempt to build a cross-agent context layer that overlaps with org data is available as a context CLI for agents: https://www.npmjs.com/package/@nex-ai/nex
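A stripped-down sketch of that extraction idea, with invented trigger phrases and session format (a real system would need far more robust classification than keyword matching):

```python
# Agent-agnostic context extraction from chat sessions: pull out candidate
# "durable facts" (decisions, definitions, corrections) from raw transcripts.
TRIGGERS = ("we decided", "is defined as", "actually it's", "going forward")

def extract_context(session):
    """session: list of {'role', 'text'} turns from any agent or chat tool."""
    facts = []
    for turn in session:
        text = turn["text"].lower()
        if any(t in text for t in TRIGGERS):
            facts.append({"source_role": turn["role"], "fact": turn["text"]})
    return facts

session = [
    {"role": "user", "text": "We decided churn excludes trial users."},
    {"role": "assistant", "text": "Got it, updating the query."},
]
print(extract_context(session))
```

Because the input is just role/text turns, the same extractor runs over sessions from any tool; the extracted facts can then be joined against CRM or warehouse entities downstream.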
It's not just "proper data"; it's also how context is assembled: how conflicting, stale, or unrelated data is processed to surface the relevant bits.
In some ways, this is not much different from around 2014, when "data was the new oil" and it was thought that dumping data into data lakes would automatically make it useful.
What actually happened was the rise of specialized compute and storage: Spark, document DBs, and KV stores, which made the underlying data swamps useful.
Shared on this earlier today:
https://www.linkedin.com/pulse/context-assembly-pipeline-harsh-chaudhary-xofie
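A toy illustration of that assembly step, resolving a conflict between a stale-but-authoritative snippet and a fresh-but-informal one (the source ranking and snippet format are invented assumptions):

```python
from datetime import datetime, timezone

# Context assembly sketch: given several candidate snippets for one concept,
# prefer more authoritative sources, breaking ties by freshness.
AUTHORITY = {"metrics_catalog": 3, "wiki": 2, "slack": 1}

def assemble(snippets):
    """snippets: [{'concept', 'text', 'source', 'updated'}] -> one text per concept."""
    best = {}
    for s in snippets:
        key = s["concept"]
        rank = (AUTHORITY.get(s["source"], 0), s["updated"])
        if key not in best or rank > best[key][0]:
            best[key] = (rank, s)
    return {k: v[1]["text"] for k, v in best.items()}

snippets = [
    {"concept": "churn", "text": "cancels / signups", "source": "slack",
     "updated": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"concept": "churn", "text": "cancels / active accounts",
     "source": "metrics_catalog",
     "updated": datetime(2023, 1, 1, tzinfo=timezone.utc)},
]
print(assemble(snippets))  # the catalog definition wins despite being older
```

The dumped-data-lake failure mode is exactly the absence of this step: both churn definitions land in the store, and whichever one the retriever happens to surface wins.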
Same pattern in a different slice: we see the “context layer” problem in engineering when code agents (Cursor, Claude, etc.) and humans collaborate on specs. Constraints and definitions drift, there's no single source of truth, and tribal knowledge lives in scattered markdown. We’re building Specularis.org to keep specs canonical and versioned for that workflow. The “living corpus” idea maps cleanly to specs as the context agents and humans share. Great post, thanks!