Context Graphs: The Term Everyone’s Using and Nobody’s Explaining

Context graphs are getting a lot of attention in Data+AI security. Yet ask most vendors to explain what a context graph is and you’ll get boxes and arrows. The term is everywhere; the substance is harder to find. In this blog, I break down what they actually are, how they’re built, and what separates a real one from a rebranded diagram.

What a Context Graph Actually Is

Worth stating upfront: context graph means different things in different fields. In AI development, a context graph is often a memory or reasoning layer built into an LLM application — a way for a model to track relationships between concepts across a conversation or workflow. That’s a legitimate use of the term, but it’s not what this post is about.

In Data+AI security, a context graph serves a different purpose: it’s a queryable layer that maps the relationships between identities, data, permissions, policies, and activity so you can govern and audit what AI systems can reach, what they’ve done, and whether any of it was authorized. The entities in the graph aren’t concepts — they’re users, service accounts, databases, models, and the access paths between them.

Every security tool tells you what happened. User A accessed Table B. Model X deployed to Production. That’s telemetry, and it’s the easy part. The harder question is why that access existed, whether anyone authorized it, and whether it’s still appropriate given how the environment has changed. A security context graph encodes those answers as a live, traversable structure: every entity connected to the decisions that authorized it, the policies that govern it, and the events that touched it.

For example:

  • User A → has access to → Database X → governed by → Policy: No PII access for non-HR roles
  • AI Agent C → authenticates via → Service Account S → last reviewed → 14 months ago → no approval on record
  • AI Model Y → was trained on → Database X → deployed in → Production App Z → accessible by → External users

Start at any node, walk the edges. You get provenance, accountability, and the full chain of why.
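The example chains above can be sketched as labeled, directed edges plus a simple walk. This is a minimal illustration, not Symmetry's implementation: the edge list, `build_adjacency`, and `walk` are hypothetical names, and the sketch assumes the "deployed in" edge belongs to the model rather than to the database.

```python
from collections import defaultdict

# Hypothetical edge list mirroring the example chains in the text.
# Each entry is (source node, edge label, target node).
EDGES = [
    ("User A", "has access to", "Database X"),
    ("Database X", "governed by", "Policy: No PII access for non-HR roles"),
    ("AI Agent C", "authenticates via", "Service Account S"),
    ("AI Model Y", "was trained on", "Database X"),
    ("AI Model Y", "deployed in", "Production App Z"),
    ("Production App Z", "accessible by", "External users"),
]


def build_adjacency(edges):
    """Index edges by source node so a walk can follow them in direction."""
    adjacency = defaultdict(list)
    for source, label, target in edges:
        adjacency[source].append((label, target))
    return adjacency


def walk(adjacency, start):
    """Return every (source, label, target) edge reachable from start."""
    seen, stack, path = set(), [start], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for label, target in adjacency.get(node, []):
            path.append((node, label, target))
            stack.append(target)
    return path


ADJACENCY = build_adjacency(EDGES)
for source, label, target in walk(ADJACENCY, "AI Model Y"):
    print(f"{source} -> {label} -> {target}")
```

Starting from the model, the walk reaches both the production app exposed to external users and the policy governing the training data, which is the "full chain of why" in miniature.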

Three Inputs, Three Agentic Problems

At Symmetry, we started from a single premise: you can’t secure data without understanding the relationships around it — who can reach it, through what paths, with what permissions, under what policies, and whether any of that was deliberately authorized. So we built a structure that encoded those relationships: entities, edges, attributes, policies as views, telemetry wired to the same layer. Everything we’ve shipped runs on top of it.

Over time, we realized that what we had built is a context graph suited to Data+AI security: not one for AI reasoning, but one for governing what AI can reach. It also provides a baseline for enabling AI to reason about all of that in one place.

Based on what we have learned, we believe a security context graph should be built from three inputs. Each one answers a different question about identity and its relationship to data, and each maps directly to a distinct problem that agentic AI creates.

Permissions: what an identity can do

Permissions are the current technical state of access: direct grants, inherited roles, group memberships, service account delegations. They capture what an identity can actually reach right now, independent of whether anyone intended that or whether it still makes sense.

In agentic AI, this is where the identity problem lives. Agents authenticate via shared service accounts, chain tool calls, and spawn sub-agents that inherit permissions from a parent process. A single entry in your logs might represent dozens of autonomous agents funneling through one service principal. An agent might have inherited broad access to production databases because the account it authenticates through was provisioned years ago and never revisited. The permissions are real; the deliberate authorization behind them often is not. The question permissions let you answer: given this agent acting right now, what is the full chain of identity delegation, and is any of it actually sanctioned?
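One way to picture the delegation question is to resolve every chain from an agent to a resource and list the links that have no approval on record. The sketch below is illustrative only: the `Link` type, the sample identities, and the `approved_by` field are assumptions, not part of any real product schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Link:
    source: str
    relation: str
    target: str
    approved_by: Optional[str] = None  # None means no approval on record


# Hypothetical delegation data: a sub-agent funnels through a shared
# service account that inherits database access via a group and a role.
LINKS = [
    Link("sub-agent-7", "spawned by", "agent-c"),
    Link("agent-c", "authenticates via", "svc-reporting"),
    Link("svc-reporting", "member of", "group-analytics", approved_by="it-ops"),
    Link("group-analytics", "has role", "role-db-reader", approved_by="data-owner"),
    Link("role-db-reader", "grants read on", "prod-customers-db"),
]


def delegation_chains(links, identity, resource):
    """Return every chain of links leading from identity to resource."""
    outgoing = {}
    for link in links:
        outgoing.setdefault(link.source, []).append(link)

    chains = []

    def extend(node, chain, visited):
        if node == resource:
            chains.append(chain)
            return
        for link in outgoing.get(node, []):
            if link.target not in visited:  # avoid looping on cycles
                extend(link.target, chain + [link], visited | {link.target})

    extend(identity, [], {identity})
    return chains


def unsanctioned(chain):
    """Links in a chain that nobody is recorded as having approved."""
    return [link for link in chain if link.approved_by is None]


for chain in delegation_chains(LINKS, "sub-agent-7", "prod-customers-db"):
    for link in unsanctioned(chain):
        print(f"no approval: {link.source} {link.relation} {link.target}")
```

The access is real at every hop, but three of the five links in this sample carry no approval, which is the gap between permission and deliberate authorization described above.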

Policies: what an identity should be able to do

Policies are the intended state: what access is authorized, under what conditions, for what purpose. At Symmetry, we define policies as graph views — named, saved queries run against the graph. A view can describe either side of the policy line: everything in compliance, or everything out of it. The view is just a query; you define what it looks for.

In agentic AI, this is where the data problem lives. A RAG pipeline pulls documents from SharePoint, chunks them, embeds them in a vector database, and serves them through a chatbot. The data has moved from a governed store to an embedding layer where the original permissions no longer apply. An agent with access to the chatbot can surface content the user behind it was never authorized to see. Policy views expose this: an out-of-policy view scoped to data classification will surface every model with a training or retrieval path to data it was never supposed to reach. Because the policy definition runs directly against the same graph as the permissions and the data, there’s no translation between systems — just a query.
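A policy-as-view can be sketched as a named query stored next to the graph it runs against. The code below is a simplified illustration under stated assumptions: the `Graph` and `View` types, the relation names, the view names, and the sample RAG pipeline are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)


@dataclass
class Graph:
    kinds: Dict[str, str]           # node -> kind of entity
    classification: Dict[str, str]  # node -> data classification label
    edges: List[Edge]


@dataclass
class View:
    name: str
    description: str
    query: Callable[[Graph], list]


# Relations assumed to carry data from a store toward a model.
DATA_PATH_RELATIONS = {"retrieves from", "embedded from", "trained on"}


def reachable(graph, start, relations):
    """Nodes reachable from start by following only the given relations."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for source, relation, target in graph.edges:
            if source == node and relation in relations and target not in seen:
                seen.add(target)
                frontier.append(target)
    return seen


def models_reaching(graph, label):
    """(model, data store) pairs where the model has a path to labeled data."""
    hits = []
    for node, kind in graph.kinds.items():
        if kind == "model":
            for target in reachable(graph, node, DATA_PATH_RELATIONS):
                if graph.classification.get(target) == label:
                    hits.append((node, target))
    return sorted(hits)


def models_not_reaching(graph, label):
    """Models with no training or retrieval path to labeled data."""
    flagged = {model for model, _ in models_reaching(graph, label)}
    return sorted(n for n, k in graph.kinds.items() if k == "model" and n not in flagged)


# A view can describe either side of the policy line.
VIEWS = {
    view.name: view
    for view in [
        View("models-reaching-restricted-data", "Out of policy",
             lambda g: models_reaching(g, "restricted")),
        View("models-without-restricted-paths", "In policy",
             lambda g: models_not_reaching(g, "restricted")),
    ]
}


def run_view(name, graph):
    return VIEWS[name].query(graph)


GRAPH = Graph(
    kinds={"support-chatbot": "model", "pricing-model": "model",
           "vector-index": "vector database",
           "sharepoint-hr-docs": "document store", "public-catalog": "dataset"},
    classification={"sharepoint-hr-docs": "restricted", "public-catalog": "public"},
    edges=[("support-chatbot", "retrieves from", "vector-index"),
           ("vector-index", "embedded from", "sharepoint-hr-docs"),
           ("pricing-model", "trained on", "public-catalog")],
)

print(run_view("models-reaching-restricted-data", GRAPH))
```

Because the view follows the retrieval edge through the embedding layer, the chatbot is flagged even though the vector index itself carries no classification label in this sample.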

Telemetry: what an identity has done

Telemetry is the historical record: API calls, data access events, authentication logs, model invocations. On its own it’s a firehose with no way to separate routine from anomalous. It becomes meaningful when evaluated against permissions and policies. An access event that is expected, permitted, and in policy is noise. The same event falling outside any one of those is a signal.

In agentic AI, this is where the operations problem lives. Even with correctly configured access, agents operate dynamically. An agent scoped to read from one database might, through a sequence of tool calls, end up writing to another. The graph of what should happen and the graph of what is happening diverge in real time. Telemetry wired to the same layer as permissions and policies is what makes that divergence visible and attributable.

A single traversal can cross all three: start with an anomalous operation, trace it to the agent that performed it, follow the access chain to the data it touched, evaluate the full path against the policies that should have governed it. Consider an AI agent that has permission to access a customer database, whose access violates the data classification policy, and that has been running queries undetected for six months. None of those three facts alone triggers an alert. Together, on a shared graph, they do.
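The idea of evaluating telemetry against the other two inputs can be sketched as a small classifier. This is a deliberately flat illustration: it models permissions and policy as sets of (identity, action, resource) tuples rather than graph traversals, and all names and sample events are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    identity: str
    action: str
    resource: str


# Hypothetical current technical state: what each identity can do.
PERMISSIONS = {
    ("agent-c", "read", "customers-db"),
    ("agent-c", "read", "orders-db"),
}

# Hypothetical intended state: what each identity should be able to do.
POLICY = {
    ("agent-c", "read", "orders-db"),
    ("agent-c", "write", "reports-db"),
}


def classify(event, permissions, policy):
    """Label one telemetry event by comparing it to permissions and policy."""
    key = tuple(event)
    permitted = key in permissions
    authorized = key in policy
    if permitted and authorized:
        return "routine"
    if permitted:
        return "permitted but out of policy"
    if authorized:
        return "authorized but no matching grant"
    return "neither permitted nor authorized"


EVENTS = [
    Event("agent-c", "read", "orders-db"),
    Event("agent-c", "read", "customers-db"),
    Event("agent-c", "write", "orders-db"),
]

for event in EVENTS:
    print(event, "->", classify(event, PERMISSIONS, POLICY))
```

The second event is the customer-database case from the paragraph above: the grant exists, so access control lets it through, and only the comparison against policy turns it into a signal.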

Regulatory frameworks like the EU AI Act increasingly require demonstrating what data trained a model, whether it was properly consented, and whether outputs comply with policy. With decision trails, evidence edges, and policies as graph views all in the same structure, that becomes a query rather than an investigation.

Nodes, Edges, and Attributes

Understanding what feeds the graph is one thing. Understanding how it’s structured is another. The graph is built from three elements: nodes, edges, and attributes. A graph is only as queryable as its structure allows. The difference between one that can answer ‘what is the full blast radius of this compromised credential’ and one that can’t usually comes down to whether the underlying structure was designed with that question in mind.

Nodes are entities: users, service accounts, AI agents, databases, models, roles. Edges are relationships: “has access to,” “was trained on,” “governed by,” “authenticated via.” Edges carry direction and properties: when the relationship was established, how it was granted, who approved it.
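As a rough sketch of this structure, nodes and edges can be modeled as small records, with direction and grant properties stored on the edge. The field names, sample data, and the choice of which relations count toward blast radius are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Node:
    id: str
    kind: str  # e.g. "user", "service account", "AI agent", "database"


@dataclass(frozen=True)
class Edge:
    source: str                 # edges are directed: source -> target
    relation: str
    target: str
    established: date           # when the relationship was established
    granted_via: str            # how it was granted
    approved_by: Optional[str]  # who approved it, if anyone


NODES = [
    Node("agent-c", "AI agent"),
    Node("svc-etl", "service account"),
    Node("warehouse-db", "database"),
    Node("billing-db", "database"),
    Node("model-y", "model"),
]

EDGES = [
    Edge("agent-c", "authenticated via", "svc-etl", date(2024, 3, 1), "client credentials", None),
    Edge("svc-etl", "has access to", "warehouse-db", date(2022, 6, 15), "inherited role", "data-owner"),
    Edge("svc-etl", "has access to", "billing-db", date(2021, 11, 2), "direct grant", None),
    Edge("model-y", "was trained on", "warehouse-db", date(2024, 1, 20), "training pipeline", "ml-lead"),
]

# Relations assumed to extend what a compromised identity can reach.
ACCESS_RELATIONS = frozenset({"authenticated via", "has access to"})


def blast_radius(edges, start, relations=ACCESS_RELATIONS):
    """Everything reachable from start by following access edges forward."""
    reached, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for edge in edges:
            if edge.source == node and edge.relation in relations and edge.target not in reached:
                reached.add(edge.target)
                frontier.append(edge.target)
    return reached


def unapproved(edges):
    """Edges with no approver on record."""
    return [edge for edge in edges if edge.approved_by is None]


print(sorted(blast_radius(EDGES, "agent-c")))
print([(e.source, e.target) for e in unapproved(EDGES)])
```

Direction matters here: the database has no outgoing access edges, so its blast radius is empty even though three other nodes point at it. Answering "which models depend on this database" would mean walking the "was trained on" edge in reverse.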

Attributes are labels on nodes that describe properties of that entity: data classification, sensitivity level, regulatory scope, cloud region, encryption status. They don’t create relationships, but they make the graph filterable. Show me every identity with access to any PII-tagged node. Show me every model with a training path to HIPAA-scoped data. Show me every service account touching confidential data not reviewed in 90 days. Attributes are what turn a graph traversal into a compliance query.

You can scope the entire graph on any attribute and get a complete picture. The graph doesn’t change; the lens does.
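The attribute-scoped questions above can be sketched as filters over node attributes combined with access edges. The attribute names (`classification`, `tags`, `last_reviewed`), the sample nodes, and the fixed review date are hypothetical choices for this example.

```python
from datetime import date, timedelta

# Hypothetical nodes with attributes; attributes create no relationships.
NODES = {
    "svc-backup": {"kind": "service account", "last_reviewed": date(2024, 1, 5)},
    "svc-etl": {"kind": "service account", "last_reviewed": date(2024, 5, 20)},
    "alice": {"kind": "user", "last_reviewed": date(2023, 1, 1)},
    "customers-db": {"kind": "database", "classification": "confidential", "tags": {"PII"}},
    "docs-bucket": {"kind": "bucket", "classification": "public", "tags": set()},
}

# Hypothetical "has access to" edges as (identity, resource) pairs.
ACCESS = [
    ("svc-backup", "customers-db"),
    ("svc-etl", "docs-bucket"),
    ("svc-etl", "customers-db"),
    ("alice", "customers-db"),
]


def identities_with_access_to_tag(nodes, access, tag):
    """Every identity with access to any node carrying the given tag."""
    return sorted({
        identity for identity, resource in access
        if tag in nodes[resource].get("tags", set())
    })


def stale_service_accounts(nodes, access, classification, today, max_age_days=90):
    """Service accounts touching classified data, not reviewed within the window."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted({
        identity for identity, resource in access
        if nodes[identity]["kind"] == "service account"
        and nodes[resource].get("classification") == classification
        and nodes[identity]["last_reviewed"] < cutoff
    })


TODAY = date(2024, 6, 30)  # fixed so the example is reproducible
print(identities_with_access_to_tag(NODES, ACCESS, "PII"))
print(stale_service_accounts(NODES, ACCESS, "confidential", TODAY))
```

Both queries run over the same nodes and edges; only the attribute being filtered on changes, which is the sense in which the graph stays fixed and the lens moves.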

What a Context Graph Is Not

It’s not a CMDB. A CMDB tracks explicitly configured assets and relationships. A context graph models dynamic and implicit ones, including relationships nobody configured intentionally, like an AI model’s indirect dependency on a sensitive dataset.

It’s not a policy engine. A policy engine sits outside the data and returns pass/fail against rules defined elsewhere. Policies here are graph views: named queries run against the same structure holding permissions, identities, and telemetry. The result is a traversal you can inspect, not a verdict from a separate system.

It’s not a data catalog. A catalog describes what exists. A context graph captures live relationships and updates continuously. A graph that doesn’t reflect a service account provisioned at 3 AM is a historical snapshot, not a memory layer.

It’s only as good as the data feeding it. Coverage gaps produce false confidence. That’s why connector coverage across cloud, SaaS, on-prem, mainframe, and air-gapped environments isn’t a feature — it’s a correctness requirement.

Symmetry’s Context Graph in Production

In production, the graph makes certain findings possible that simply aren’t visible any other way. Walking a model back through its training pipeline has surfaced LLM training data that mixed PHI with content from unvetted external sources — a control failure invisible to any tool that doesn’t trace the full path from model to data to classification. The same graph that finds that exposure also shows the dormant identities and unused data stores that shouldn’t exist at all; customers have used it to cut cloud assets by 25% not by guessing but by following the evidence. Whether the question is blast radius, training data provenance, policy drift, or incident reconstruction, the answer is a traversal — starting from any node, following the edges, filtered by whatever attributes define the scope of the problem.

Symmetry built the Data Access Graph out of DARPA-funded research at UT Austin, before the industry had settled on a name for what it was. The structure described in this post (queryable relationships between identities, data, permissions, policies as views, and telemetry on a shared layer) is what that graph has always been. It’s a specific kind of context graph: one built for securing data and AI systems, not for powering them. Since day one, it has run in air-gapped government agencies, yottabyte-scale enterprise estates, and sovereign environments where no data or metadata leaves the perimeter.

The graph isn’t a layer on top of the platform. It is the platform. And that distinction matters most when you’re building agentic workflows on top of it.

Because the graph already knows what every identity can reach, what it is authorized to reach, and what it has actually done, an agentic workflow built on Symmetry starts with that context already resolved. An agent doesn’t need to re-derive access boundaries or query separate systems for policy state. It traverses the same graph that security runs on. That means agentic automation, such as remediating over-privileged accounts, enforcing classification-based access, or investigating anomalies, can operate with the full fidelity of the security layer, not a simplified approximation of it. The graph that governs AI is the same graph that enables it.

About Symmetry Systems

Symmetry Systems is the Data+AI security company, providing organizations with the industry’s only comprehensive Data+AI Security Platform that discovers, classifies, protects, and monitors sensitive data. Born from award-winning DARPA-funded research at UT Austin, our AI-powered platform delivers comprehensive Data+AI security across all major cloud environments, SaaS applications, on-premise data stores, legacy systems, and air-gapped environments. Our “get everywhere” philosophy continuously expands connector coverage to secure data wherever it lives, including mainframes, legacy systems, and air-gapped environments.

By uniquely merging both identity and data context, Symmetry provides what other DSPM vendors cannot: complete visibility where data exposure meets agentic identities. Organizations use our platform to eliminate unnecessary data, remove excessive permissions, accelerate compliance and cloud migration, and reduce attack surfaces – while safely enabling agentic AI systems with the identity-aware data context they require.

Innovate with confidence with Symmetry Systems.
