What is a corporate ownership graph?

A corporate ownership graph is a data structure that maps ownership and control relationships between legal entities, allowing algorithms to calculate indirect stakes, resolve beneficial owners, and trace control through layered corporate structures. Here is how it works and why it matters.

A corporate ownership graph is a mathematical structure that represents companies, individuals, and other legal entities as nodes, and ownership or control relationships between them as directed edges. Each edge carries a weight: the ownership percentage that one entity holds in another. The graph as a whole encodes the full corporate structure of any set of entities it covers, allowing algorithms to traverse relationships and calculate indirect ownership stakes through any number of intermediate layers.
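As a minimal sketch of this structure, the graph can be held as a list of weighted directed edges and indexed by owner. The entity names and percentages below are invented for illustration, and `build_graph` and `direct_holdings` are hypothetical helpers rather than part of any particular product.

```python
from collections import defaultdict

# Hypothetical entities and stakes, invented for illustration only.
# Each edge reads: `owner` directly holds `fraction` of `owned`.
edges = [
    ("Alice", "HoldCo", 0.60),
    ("Bob", "HoldCo", 0.40),
    ("HoldCo", "OpCo", 0.75),
    ("Alice", "OpCo", 0.25),
]

def build_graph(edge_list):
    """Return an adjacency map of the form {owner: {owned: fraction}}."""
    graph = defaultdict(dict)
    for owner, owned, fraction in edge_list:
        if not 0.0 < fraction <= 1.0:
            raise ValueError(f"fraction out of range: {fraction}")
        graph[owner][owned] = fraction
    return dict(graph)

def direct_holdings(graph, owner):
    """Direct stakes held by `owner`; empty if it owns nothing."""
    return graph.get(owner, {})

graph = build_graph(edges)
print(direct_holdings(graph, "Alice"))
```

Only direct stakes are stored here. Alice's indirect stake in OpCo through HoldCo is not an edge; it has to be computed by traversing the graph.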

The graph representation is not an academic abstraction. It is the only practical way to resolve Ultimate Beneficial Ownership (UBO) at scale. When corporate structures span dozens of layers and hundreds of entities across multiple jurisdictions, manual chain-tracing becomes impractical. The graph provides the data structure; matrix algebra provides the computation.

How the UBO calculation works mathematically

The standard method for calculating indirect ownership in a corporate graph is based on the matrix inverse (I−W)⁻¹. W is the weight matrix, where W[i][j] represents the fraction of entity j that is directly owned by entity i. The inverse equals the series I + W + W² + W³ and so on, in which the k-th power of W collects every ownership path of length k. Subtracting the identity therefore leaves a matrix where each entry represents the total direct and indirect ownership stake of one entity in another, accounting for ownership through all intermediate paths simultaneously. The series converges, and the inverse exists, when the spectral radius of W is below 1.
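The calculation can be sketched in a few lines. The example below assumes the sum-over-all-paths reading of total ownership, computed as (I−W)⁻¹ − I, and uses exact fractions with a small hand-written Gauss-Jordan inversion so that it needs no third-party library. The three entities and their stakes are invented for illustration, and `invert` and `total_ownership` are hypothetical helper names.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def invert(matrix):
    """Gauss-Jordan inversion using exact rational arithmetic."""
    n = len(matrix)
    eye = identity(n)
    # Augment each row of the matrix with the matching row of I.
    aug = [[Fraction(x) for x in row] + eye[i] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def total_ownership(w):
    """Direct plus indirect stakes: (I - W)^-1 - I."""
    n = len(w)
    eye = identity(n)
    i_minus_w = [[eye[i][j] - Fraction(w[i][j]) for j in range(n)] for i in range(n)]
    inv = invert(i_minus_w)
    return [[inv[i][j] - eye[i][j] for j in range(n)] for i in range(n)]

# Illustrative structure: index 0 = Person, 1 = HoldCo, 2 = OpCo.
# w[i][j] is the fraction of entity j directly owned by entity i.
w = [
    [0, Fraction(3, 5), Fraction(1, 10)],  # Person: 60% of HoldCo, 10% of OpCo
    [0, 0, Fraction(1, 2)],                # HoldCo: 50% of OpCo
    [0, 0, 0],
]
total = total_ownership(w)
print(total[0][2])  # Person's total stake in OpCo: 1/10 direct + 3/5 * 1/2 indirect = 2/5
```

For graphs of realistic size a sparse linear solver would be the more usual choice than a dense inverse; the dense version is kept here only because it mirrors the formula directly.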

This method handles two properties of real corporate structures that naive chain-multiplication cannot. First, it handles multi-path ownership: if there are two independent routes from an ultimate owner to a subsidiary, the stakes from both paths are summed correctly. Second, it handles circular ownership: if Company A owns 10 percent of Company B and Company B owns 10 percent of Company A, the matrix equation resolves to a stable fixed point rather than an infinite loop.
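The circular case can be checked numerically. The sketch below takes the 10 percent cross-holding described above and adds an assumed outside shareholder, P, who owns 90 percent of Company A. Instead of inverting a matrix it iterates T ← W + W·T, which settles on the same fixed point as the matrix formula when the series converges. The function name, tolerance, and iteration cap are illustrative choices.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def total_ownership_iterative(w, tol=1e-12, max_iter=10_000):
    """Iterate T <- W + W @ T until the largest entry change is below tol."""
    n = len(w)
    t = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        wt = matmul(w, t)
        nxt = [[w[i][j] + wt[i][j] for j in range(n)] for i in range(n)]
        delta = max(abs(nxt[i][j] - t[i][j]) for i in range(n) for j in range(n))
        t = nxt
        if delta < tol:
            return t
    raise RuntimeError("no convergence; the ownership loop may be fully closed")

# Index 0 = P (assumed outside shareholder), 1 = Company A, 2 = Company B.
# w[i][j] is the fraction of entity j directly owned by entity i.
w = [
    [0.0, 0.9, 0.0],  # P owns 90% of A
    [0.0, 0.0, 0.1],  # A owns 10% of B
    [0.0, 0.1, 0.0],  # B owns 10% of A
]
t = total_ownership_iterative(w)
print(round(t[0][1], 6), round(t[0][2], 6))
```

Each trip around the loop multiplies the stake by 0.1 × 0.1 = 0.01, so the contributions form a geometric series. P's total stake in A settles at 0.9 / 0.99 and in B at 0.09 / 0.99, rather than growing without bound.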

Building this calculation correctly requires that the underlying graph be complete and accurate. A missing edge, a wrong ownership percentage, or an entity that is listed under different names in different source registries will produce an incorrect UBO result, often without any signal that something is wrong.

Why the data layer is as important as the algorithm

The mathematical sophistication of the UBO calculation is only valuable if the underlying graph data is trustworthy. Corporate ownership data comes from national registries: Companies House in the UK, SIRENE in France, Receita Federal in Brazil, the SEC and US state registries in the United States. Each of these registries is a primary source: the data has legal standing because it is derived from mandatory disclosures that companies are legally required to file.

The challenge is that these registries are fragmented, use different formats and entity identifiers, and are updated on different schedules. A multinational conglomerate may have separate entries in, say, six national registries with no common key linking them. Before the UBO calculation can run, the data pipeline must ingest all of these sources, normalise the formats, and resolve cross-border entity duplicates, typically using probabilistic record linkage methods such as Fellegi–Sunter matching.
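To make the record linkage step concrete, here is a minimal sketch of Fellegi–Sunter scoring. For each compared field, m is the probability that the field agrees when two records describe the same entity, and u is the probability that it agrees when they do not. Agreement adds log2(m/u) to the score and disagreement adds log2((1−m)/(1−u)). The field names, the m and u values, and the thresholds are all assumptions made up for this example; in practice they are estimated from data.

```python
import math

# Assumed (m, u) probabilities per field, invented for illustration.
FIELD_PARAMS = {
    "name": (0.95, 0.01),
    "postcode": (0.90, 0.20),
    "incorporation_year": (0.85, 0.05),
}

def normalise(value):
    """Lowercase and collapse whitespace so formatting noise is ignored."""
    return " ".join(str(value).lower().split())

def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum of per-field log-likelihood ratios (base 2)."""
    total = 0.0
    for field, (m, u) in params.items():
        if normalise(rec_a[field]) == normalise(rec_b[field]):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total

def classify(weight, upper=5.0, lower=0.0):
    """Hypothetical thresholds: above upper is a match, below lower is not."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"

# Hypothetical records for the same company from two registries.
rec_a = {"name": "Acme Holdings Ltd", "postcode": "EC1A 1BB", "incorporation_year": 1998}
rec_b = {"name": "ACME  Holdings Ltd", "postcode": "EC1A 1BB", "incorporation_year": 1998}
rec_c = {"name": "Zenith Trading SA", "postcode": "75001", "incorporation_year": 2011}

print(classify(match_weight(rec_a, rec_b)))
print(classify(match_weight(rec_a, rec_c)))
```

Exact comparison after normalisation is the simplest possible agreement test. Real pipelines usually replace it with fuzzy comparisons for names and addresses, and send the "possible match" band to manual review.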

Additionally, the graph must be bitemporal: for each ownership relationship it must track two time axes, the period during which the relationship was in effect and the date on which it was recorded or corrected in the data. Compliance investigations frequently require the ownership structure as it existed on a specific historical date, and sometimes as it was known on that date. A single snapshot of current ownership is insufficient for many regulatory and investigative purposes.
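One way to picture a bitemporal edge is as a record with two pairs of dates: one for when the stake was in effect in the real world, and one for when the database held that version of the fact. The sketch below is an assumed schema, with hypothetical field names and invented data, showing how a later correction leaves the earlier belief queryable.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class OwnershipFact:
    owner: str
    owned: str
    fraction: float
    valid_from: date                      # when the stake took effect
    valid_to: Optional[date]              # None means still in effect
    recorded_on: date                     # when this version entered the data
    superseded_on: Optional[date] = None  # when a later version replaced it

def as_of(facts, valid_on, known_on):
    """Edges in effect on `valid_on`, as the data stood on `known_on`."""
    result = []
    for f in facts:
        in_effect = f.valid_from <= valid_on and (
            f.valid_to is None or valid_on < f.valid_to)
        believed = f.recorded_on <= known_on and (
            f.superseded_on is None or known_on < f.superseded_on)
        if in_effect and believed:
            result.append(f)
    return result

# Invented example: a 40% stake is filed in 2020, then corrected to 45% in 2021.
facts = [
    OwnershipFact("X", "Y", 0.40, date(2020, 1, 1), None,
                  recorded_on=date(2020, 2, 1), superseded_on=date(2021, 6, 1)),
    OwnershipFact("X", "Y", 0.45, date(2020, 1, 1), None,
                  recorded_on=date(2021, 6, 1)),
]

before = as_of(facts, valid_on=date(2020, 6, 1), known_on=date(2020, 12, 31))
after = as_of(facts, valid_on=date(2020, 6, 1), known_on=date(2022, 1, 1))
print(before[0].fraction, after[0].fraction)
```

Both queries ask about the same historical date. They differ only in which version of the record they consult, which is what allows an investigator to reconstruct what a compliance team could have known at the time.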

Graph databases versus relational databases

Traditional relational databases store ownership data as tables: one for entities and one for relationships. Querying ownership chains requires recursive SQL queries or repeated self-joins, which can become extremely slow for deep structures or large datasets. For a graph with millions of nodes and hundreds of millions of edges, a relational approach hits performance limits quickly.
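For reference, this is roughly what the relational approach looks like. The sketch uses Python's built-in `sqlite3` module and a recursive common table expression to walk a chain downward from one owner. The table name, column names, and data are invented for illustration, and the depth cap is an assumed safeguard against circular holdings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ownership (owner TEXT, owned TEXT, fraction REAL);
    INSERT INTO ownership VALUES
        ('Person', 'HoldCo', 0.6),
        ('HoldCo', 'MidCo', 0.5),
        ('MidCo', 'OpCo', 1.0);
""")

# Each recursive step joins the chain found so far back onto the edge table
# and multiplies the stake along the path.
rows = conn.execute("""
    WITH RECURSIVE chain(entity, stake, depth) AS (
        SELECT owned, fraction, 1 FROM ownership WHERE owner = ?
        UNION ALL
        SELECT o.owned, c.stake * o.fraction, c.depth + 1
        FROM chain AS c
        JOIN ownership AS o ON o.owner = c.entity
        WHERE c.depth < 10  -- assumed cap so a cycle cannot recurse forever
    )
    SELECT entity, stake, depth FROM chain ORDER BY depth
""", ("Person",)).fetchall()

for entity, stake, depth in rows:
    print(entity, round(stake, 4), depth)
```

The query returns one row per path rather than one row per entity, so multi-path stakes would still need to be summed afterwards, and every extra layer adds another join against the full edge table.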

Purpose-built graph databases represent ownership relationships natively, allowing traversal queries that return full ownership chains in milliseconds. Combined with the mathematical UBO computation, this is what makes it possible to run automated KYC checks on corporate counterparties in real-time pipeline integrations, rather than as overnight batch processes.

Applications beyond compliance

Corporate ownership graphs have applications beyond AML and KYC. Quant funds use them to identify common ownership patterns, track corporate restructurings, and build alternative data signals around ownership changes. Investigative journalists use them to trace the ultimate controllers of assets implicated in financial crime or corruption. Corporate strategy teams use them to map competitive ownership, identify potential conflicts of interest, and monitor acquisition activity by tracking changes in ownership edges over time.

The common requirement across all of these use cases is the same: a graph that is accurate, complete, bitemporal, and provenance-linked to source filings. For the data infrastructure behind these use cases, see Briefed Atlas. For the regulatory context that drives much of this demand, see what Ultimate Beneficial Ownership means and what AML compliance requires.
