Why We Build Ontologies

Written By: Johan Gauffin – Head of Commercial at ForgeSight

This article in brief:

The problem
Most organizations still build data products to answer immediate reporting needs. Teams join, flatten, and aggregate data to deliver a single output for one dashboard, one workflow, or one decision. It feels efficient. But that convenience locks a single interpretation of the business into the architecture, making future change expensive.

The mistake
A report-shaped output is not a business model. Reports answer questions. Ontologies define the entities, relationships, states, and actions that make those questions answerable. When teams promote an aggregate to the canonical model, they bury business meaning inside a pipeline and lose the ability to adapt.

The better approach
An ontology models the business at the level it actually operates: purchase orders, shipments, materials, suppliers, receipts, delays, exceptions, approvals, and the links between them. It keeps assumptions explicit, preserves detail where it matters, and supports many use cases, not just one.

Why it matters
When requirements change, organizations built on report outputs have to reverse-engineer their own business logic. Organizations built on ontologies can extend, recombine, and operationalize the model without starting over. That is what creates real flexibility.

For many teams, the first sign of architectural trouble does not look like trouble at all.

A business stakeholder asks for visibility into an operational issue. The engineering team moves quickly. They pull data from several systems, join it, apply business logic, aggregate it into a usable structure, and deliver a dashboard or application that answers the question. It feels like progress. The output is useful. The project looks successful.

Then the business changes the question.

Now the team needs to explain delays at a finer level of detail. Or split one operational concept into three. Or connect the output to a workflow instead of a dashboard. Or trace the history behind a decision, not just show the latest status. Suddenly, what looked like a clean solution is hard to extend. The data has already been collapsed into a single interpretation. The model does not bend. It breaks.

This is one of the core reasons we build ontologies.

We build them because enterprises do not just need answers; they need solutions. They need durable meaning. They need a way to represent the business that survives beyond the first dashboard, workflow, or use case. They need a semantic model that remains useful as requirements shift, teams change, and new applications emerge.

That is what an ontology provides.

An ontology is not just a taxonomy, a schema, or a glossary. It is a structured model of the real-world entities and events the organization operates through, with the relationships, properties, and actions that connect them. It defines what the business is made of and how those parts interact. It is the layer where the organization decides what a purchase order is, what a shipment is, what a material is, how they relate, and which changes or states matter over time.
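To make that less abstract, here is a minimal Python sketch of a few supply chain entities modelled as typed objects with explicit links and one action. The class names, fields, sample values, and the `amend_due_date` action are illustrative assumptions, not the API of any particular platform.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Supplier:
    supplier_id: str
    name: str


@dataclass(frozen=True)
class Material:
    material_id: str
    description: str


@dataclass
class PurchaseOrder:
    po_id: str
    supplier: Supplier   # relationship: ordered from this supplier
    material: Material   # relationship: orders this material
    due_date: date       # property
    amendments: list = field(default_factory=list)  # record of past changes

    def amend_due_date(self, new_due: date, reason: str) -> None:
        """Action: change the commitment date and keep a record of why."""
        self.amendments.append((self.due_date, new_due, reason))
        self.due_date = new_due


@dataclass
class Shipment:
    shipment_id: str
    purchase_order: PurchaseOrder  # relationship: fulfills this order
    expected_arrival: date


# Hypothetical sample data.
acme = Supplier("S-1", "Acme Metals")
steel = Material("M-7", "Steel coil")
po = PurchaseOrder("PO-100", acme, steel, due_date=date(2024, 5, 1))
shipment = Shipment("SH-9", po, expected_arrival=date(2024, 5, 3))

po.amend_due_date(date(2024, 5, 10), reason="supplier capacity")
```

Nothing here is shaped by a dashboard. The shipment still knows which purchase order it fulfills, and the order still knows its supplier and its amendment history, so later questions can follow those links rather than parse a flattened row.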

That may sound abstract. In practice, it is one of the most concrete design choices a company makes.

Once that semantic layer exists, teams can build on it: dashboards, workflows, metrics, alerts, approvals, simulations, automation, and even AI applications. Without it, each new use case rebuilds a partial version of the business in its own logic. Meaning fragments. Definitions drift. Reuse declines. Complexity grows.

The result is an architecture that looks productive on the surface but is brittle underneath.

The Real Mistake: Treating a Report Like a Model

The most common failure is not bad engineering. It is category confusion.

Teams build a report-shaped object and then start treating it as the business model. It is easy to see why. The object is useful. It is business-readable. It combines information from multiple systems. It may even have a clean name and a polished interface. But none of that makes it a canonical representation of the domain.

A report is a view. A dashboard is a view. A denormalized table is a view. Each is designed to answer a specific set of questions at a specific level of detail for a specific audience.

An ontology is different. It is not optimized for one question. It is optimized to preserve the business concepts that enable many questions.

That distinction matters because aggregation is not just technical compression. It is semantic compression. When a team aggregates early, it decides which distinctions no longer matter. That may be fine for a downstream output. But it is risky in the canonical layer, because the distinctions discarded today are often the ones the business needs tomorrow.
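A small Python sketch can make semantic compression concrete. The records, field names, and delay causes below are hypothetical. The point is only that two different operational situations can collapse into the same report row.

```python
from collections import Counter

# Hypothetical shipment-level records: (purchase order, cause of delay, days late).
scenario_a = [
    ("PO-100", "supplier_capacity", 4),
    ("PO-100", "supplier_capacity", 2),
]
scenario_b = [
    ("PO-100", "customs_hold", 6),
]


def report_row(shipments):
    """Collapse shipment detail into the single row a dashboard might show."""
    po_id = shipments[0][0]
    total_days_late = sum(days for _, _, days in shipments)
    return {"po_id": po_id, "status": "DELAYED", "days_late": total_days_late}


def delays_by_cause(shipments):
    """A later question that needs the detail the report row discarded."""
    return Counter(cause for _, cause, _ in shipments)


# The two scenarios produce the same report row...
print(report_row(scenario_a) == report_row(scenario_b))
# ...but only the shipment-level records can say why the order is late.
print(delays_by_cause(scenario_a), delays_by_cause(scenario_b))
```

Once only the report row is kept, the cause of the delay and the number of shipments involved are gone. No downstream logic can recover them.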

Put differently, a report can be right for a use case and still be wrong as a model.

What This Looks Like in Practice

This is not a theoretical concern. It shows up in delivery every day.

Take a supply chain example. Suppose developers are asked to create visibility into procurement and delivery risk. Instead of modelling purchase orders, shipments, materials, suppliers, receipts, and delays as distinct objects with explicit relationships, they build a single aggregated object to power a single report. It joins the relevant data, applies transformations, calculates operational statuses, and outputs a convenient record for the dashboard.

Initially, that may work well. Leaders can see the report. Users can track exceptions. The use case is satisfied.

But then the business asks for something slightly different. Perhaps it wants to distinguish pre-due-date risk from overdue delivery. Perhaps it wants to analyze shipment-level changes separately from purchase-order amendments. Perhaps it wants to trace how a material substitution affected downstream timelines. Perhaps it wants to operationalize the logic in a workflow that routes approvals or escalations.

Now the problem appears. Because the team did not preserve the domain’s ontology, it cannot separate what was collapsed together. Purchase-order, shipment, material, and reporting logic are all fused into a single output. The new use case is forced to recover the business model from a presentation artifact.

That is expensive. It is slow. It is avoidable.

The takeaway is simple: when companies skip ontology work, they increase the cost of change.

The first use case may still ship quickly. But every use case after that becomes more fragile, more bespoke, and more dependent on hidden assumptions in prior pipelines. The organization starts to accumulate not just technical debt but also semantic debt.

Why Ontology Preserves Optionality

The strongest case for ontology-first design is not elegance or conceptual purity. It is optionality.

Businesses rarely know in advance every question they will need to answer. They do not know every workflow they will need to support. They do not know every operational distinction that will matter six months from now. That is why the canonical layer should preserve the business at the level where future recombination is possible.

A purchase order remains a purchase order. A shipment remains a shipment. A material remains a material. A supplier remains a supplier. Exceptions, approvals, amendments, and receipts can be modelled as events, states, or related objects rather than flattened into a single status field. Because the model preserves those distinctions, teams can create new views later without having to reconstruct the underlying world.
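As a hedged illustration, suppose the business later asks to distinguish pre-due-date risk from overdue delivery. If the shipment keeps its due date, expected arrival, and receipt as separate facts, that view is a short function added later. The `Shipment` fields, the `delivery_view` function, and the category labels below are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Shipment:
    shipment_id: str
    due_date: date
    expected_arrival: date
    received_on: Optional[date] = None  # filled in when a receipt is recorded


def delivery_view(shipment: Shipment, today: date) -> str:
    """A view added later; it reads preserved facts instead of one status flag."""
    if shipment.received_on is not None:
        return "received"
    if today > shipment.due_date:
        return "overdue"
    if shipment.expected_arrival > shipment.due_date:
        return "at_risk_before_due_date"
    return "on_track"


today = date(2024, 5, 5)
shipments = [
    Shipment("SH-1", due_date=date(2024, 5, 10), expected_arrival=date(2024, 5, 12)),
    Shipment("SH-2", due_date=date(2024, 5, 1), expected_arrival=date(2024, 5, 3)),
    Shipment("SH-3", date(2024, 5, 1), date(2024, 4, 30), received_on=date(2024, 4, 30)),
]
views = {s.shipment_id: delivery_view(s, today) for s in shipments}
```

Had the model stored only a single "DELAYED" flag, SH-1 and SH-2 would look identical, and the new distinction would require going back to the source systems.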

This is what optionality means in data architecture. It does not mean avoiding structure. It means choosing a structure that is durable in the face of change.

That is also why ontology work matters for digital twins. A digital twin is valuable only if it represents the real-world entities and processes the business cares about in a form that can be observed, queried, and acted upon. If that twin is built from report outputs instead of domain entities, it may look functional, but it will inherit the rigidity of whatever use case first shaped it.

A real ontology gives the twin its semantic backbone.

The Difference Between a Data Lake, a Report Output, and an Ontology

Organizations often confuse these layers because all three deal with data, but they solve different problems.

A data lake is primarily about storage and fidelity. It preserves inputs from many systems, often in raw or lightly processed form. That is critical, but it does not by itself tell the organization what those data mean in relation to one another.

A warehouse or report output is about consumption. It prepares data for performance, analytics, and usability. It is designed to efficiently answer recurring business questions. That is valuable, but by design it reflects a particular perspective on the business.

Ontology is about meaning. It defines the domain’s core entities, relationships, states, and actions in a reusable way.

Those layers should coexist. The mistake is not building dashboards, marts, or aggregates. The mistake is promoting those outputs into the semantic role the ontology should play.

The cleanest way to say it is this: the lake stores, the report presents, the ontology means.

And when organizations want a digital twin, it is the ontology that turns data into an operational model of the business rather than just a collection of synchronized outputs.
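The three layers can be sketched in a few lines of Python. The raw column names (`PO_NO`, `VENDOR`, `QTY`), the `PurchaseOrder` class, and the rollup function are hypothetical; the sketch only shows the direction of dependency, with the report derived from the ontology objects rather than the other way around.

```python
from collections import defaultdict
from dataclasses import dataclass

# Lake: raw rows kept as they arrived from a source system (hypothetical export).
raw_rows = [
    {"PO_NO": "PO-100", "VENDOR": "Acme Metals", "QTY": "40"},
    {"PO_NO": "PO-101", "VENDOR": "Acme Metals", "QTY": "15"},
    {"PO_NO": "PO-102", "VENDOR": "Borealis Steel", "QTY": "25"},
]


# Ontology: typed objects that state what a row means in business terms.
@dataclass(frozen=True)
class PurchaseOrder:
    po_id: str
    supplier_name: str
    quantity: int


def to_purchase_order(row: dict) -> PurchaseOrder:
    """Map one raw row onto the business concept it represents."""
    return PurchaseOrder(row["PO_NO"], row["VENDOR"], int(row["QTY"]))


# Report: one perspective, derived from the objects and free to change.
def quantity_by_supplier(orders) -> dict:
    totals = defaultdict(int)
    for order in orders:
        totals[order.supplier_name] += order.quantity
    return dict(totals)


orders = [to_purchase_order(row) for row in raw_rows]
report = quantity_by_supplier(orders)
```

If the business asks for a different rollup tomorrow, only the report function changes. The raw rows and the purchase order objects stay as they are.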

What Forward-Deployed Engineers Should Do Differently

For forward-deployed engineers, this has immediate implications.

The goal is not just to deliver the fastest answer to the current question. It is to leave the enterprise more legible after the implementation than before. That means building in ways that preserve business meaning, not burying it inside one-off transformations.

First, model real-world entities, not report artifacts. If the business reasons about orders, shipments, materials, suppliers, inventory positions, receipts, and exceptions, those should be first-class components of the ontology.

Second, keep relationships explicit. Much of the power of an ontology comes not from the nouns alone, but from how they connect. Which shipment fulfills which purchase order? Which material is delayed because of which supplier issue? Which amendment changed which commitment date? Those links are often what decision-makers actually need.

Third, treat aggregates as downstream products. Build them where they improve clarity and speed, but do not confuse them with the canonical model. A report should sit on top of the ontology, not replace it.

Fourth, separate stable identity from changing state. A shipment may change status many times. A purchase order may be amended. A specification may evolve. Those changes should enrich the model, not create a new canonical object every time the current view changes.

Fifth, make assumptions visible. Hidden business logic is one of the main reasons architectures become brittle. When definitions live inside transformations rather than the semantic model, teams forget what was assumed, why it was assumed, and how to revise it. Ontology work forces those assumptions into the open.
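Several of these practices fit in one short Python sketch: an explicit link from shipment to purchase order, a stable identity with an appended state history, and a business assumption given a name instead of being buried in a transformation. The three-day window, the status labels, and all class and function names are illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Assumption made visible: how early an open shipment counts as "at risk".
# The three-day window is an illustrative value, not a recommendation.
AT_RISK_WINDOW = timedelta(days=3)


@dataclass(frozen=True)
class StatusChange:
    on: date
    status: str


@dataclass
class Shipment:
    shipment_id: str  # stable identity
    po_id: str        # explicit link: fulfills this purchase order
    due_date: date
    history: list = field(default_factory=list)  # changing state, appended over time

    def record(self, on: date, status: str) -> None:
        self.history.append(StatusChange(on, status))

    @property
    def current_status(self) -> str:
        return self.history[-1].status if self.history else "created"


def is_at_risk(shipment: Shipment, today: date) -> bool:
    """Flag open shipments whose due date falls within AT_RISK_WINDOW of today."""
    is_open = shipment.current_status != "received"
    return is_open and today >= shipment.due_date - AT_RISK_WINDOW


def shipments_for(po_id: str, shipments) -> list:
    """Follow the link from a purchase order to the shipments that fulfill it."""
    return [s for s in shipments if s.po_id == po_id]


s1 = Shipment("SH-1", "PO-100", due_date=date(2024, 5, 10))
s1.record(date(2024, 5, 2), "dispatched")
s1.record(date(2024, 5, 6), "held_at_customs")
s2 = Shipment("SH-2", "PO-101", due_date=date(2024, 5, 20))

today = date(2024, 5, 8)
at_risk_ids = [s.shipment_id for s in (s1, s2) if is_at_risk(s, today)]
```

The shipment keeps one identity while its history grows, and anyone who disagrees with the risk rule can find it and change it in one place.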

This does not mean every implementation must start with a grand modelling exercise. It does mean teams should resist the temptation to encode their understanding of the business in a single pipeline just because it is expedient.

Why This Matters Even More Now

For years, poor semantic architecture created friction for analytics teams. Today it creates friction across the whole operating model.

Workflows need shared definitions. Applications need shared objects. Automation needs reliable state. AI systems need consistent entities and relationships if they are going to reason, retrieve, recommend, or act with any credibility in business contexts.

That raises the stakes.

An organization that relies on report-shaped outputs as its business model does not just make reporting harder. It makes automation harder. It makes application development harder. It makes cross-functional execution harder. It makes AI harder because the system lacks a durable representation of the business itself.

Ontology work is not documentation theatre. It is infrastructure.

It is how a company creates a shared semantic layer that many products and teams can rely on without each rebuilding their own version of reality. Ontologies are not overhead; they are leverage.

The Takeaway

Executives do not need ontologies because they are intellectually satisfying. They need them because the alternative is hidden fragmentation.

When semantic decisions are not made explicitly, they are still made. They are implicitly defined in dashboards, data pipelines, metric logic, source-specific schemas, and workflow code. The organization ends up with many partial models of itself, each useful locally but inconsistent globally.

That is why companies can be data-rich and still struggle to adapt.

The real strategic value of an ontology is that it makes the enterprise’s language durable enough to outlast any single use case. It gives teams a shared model they can extend rather than a growing pile of outputs to reconcile. Dashboards can change quickly because the underlying business concepts remain stable. Workflows can evolve without reinventing definitions. Applications and AI can sit on top of a model of the world, not guess at one from disconnected artifacts.

If your architecture is built around reports, every change request threatens the model. If it is built around ontologies, change happens on top of the model.

One approach creates outputs. The other creates adaptability.

And that is why we build ontologies.

