Why your data strategy has to precede your AI strategy

INDD Digital Practice · 4 March 2026 · 7 min read

Every week, another leadership team concludes that AI is now urgent enough to fund without a clear prior view on the data it will depend on. The pressure is real — competitive advantage in financial services, retail, logistics, and healthcare across APAC is shifting toward data-intensive capabilities, and the cost of falling behind does compound. But urgency without sequencing produces a specific failure mode we see repeatedly: AI systems that are technically functional and operationally untrustworthy, built on data foundations that were never designed to support them. The investment is real; the returns are not.

What "data readiness" actually means

Data readiness is not a technical certification or a binary state. It is a set of conditions that must hold for a specific use case. The conditions are: the relevant data exists in a form the use case requires; it is accessible at the latency the use case demands; its quality is understood, monitored, and sufficiently high; and there is clear ownership of its accuracy across its lifecycle.
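One way to make these conditions concrete is to record them per use case rather than per platform. The sketch below is illustrative only: the `ReadinessAssessment` class, its field names, and the two example use cases are our own assumptions, not a standard or a product feature.

```python
from dataclasses import dataclass


@dataclass
class ReadinessAssessment:
    """Hypothetical per-use-case record of the four readiness conditions."""
    use_case: str
    data_in_required_form: bool       # the data exists in the form the use case needs
    meets_latency_requirement: bool   # accessible at the latency the use case demands
    quality_understood: bool          # quality is understood, monitored, and high enough
    has_clear_owner: bool             # someone owns its accuracy across the lifecycle

    def unmet_conditions(self) -> list[str]:
        """Return the names of the conditions that do not yet hold."""
        checks = {
            "data_in_required_form": self.data_in_required_form,
            "meets_latency_requirement": self.meets_latency_requirement,
            "quality_understood": self.quality_understood,
            "has_clear_owner": self.has_clear_owner,
        }
        return [name for name, ok in checks.items() if not ok]

    def is_ready(self) -> bool:
        """A use case is ready only when every condition holds."""
        return not self.unmet_conditions()


# Illustrative, made-up assessments for two use cases on the same platform.
demand_forecast = ReadinessAssessment("demand forecasting", True, True, True, True)
churn_model = ReadinessAssessment("churn prediction", True, True, False, False)

print(demand_forecast.is_ready())      # True
print(churn_model.unmet_conditions())  # ['quality_understood', 'has_clear_owner']
```

The point of the structure is that readiness attaches to the use case: the same platform can pass for one and fail for another.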

Most organisations have pockets of data that satisfy these conditions for some use cases and not others. The diagnostic work — which should precede the AI strategy, not follow it — is to map those pockets honestly. A data warehouse or a lakehouse architecture does not solve this problem by its existence. The question is not whether a modern data platform has been deployed. It is whether the specific data required for the specific AI use cases you are pursuing is reliable enough, governed enough, and well-understood enough to produce outputs you would trust.

The governance gap is the most common limiting factor

Across the organisations we have assessed in APAC, the limiting constraint is almost never technology. It is ownership. Specifically: the absence of clearly defined ownership over the meaning of key business concepts.

When no single function owns the canonical definition of "customer" — whether a customer is a contract holder, an end user, a household, or a legal entity — AI systems trained on that data inherit the inconsistency. Models that perform well on test sets built from one system's definition fail unpredictably when deployed against operational data drawn from several. The same pattern applies to "transaction," "product," "employee," and every other concept that appears in multiple source systems with subtly different meanings.
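A small sketch shows how quickly the definitions diverge. The records, field names, and the mapping from each definition to a key field are invented for illustration, and treating each contract as one contract holder is a deliberate simplification.

```python
# Made-up records: each row is one contract, joined to the identifiers
# that different source systems might treat as "the customer".
records = [
    {"contract_id": "C1", "user_id": "U1", "household_id": "H1", "legal_entity_id": "E1"},
    {"contract_id": "C2", "user_id": "U2", "household_id": "H1", "legal_entity_id": "E1"},
    {"contract_id": "C3", "user_id": "U2", "household_id": "H1", "legal_entity_id": "E1"},
    {"contract_id": "C4", "user_id": "U3", "household_id": "H2", "legal_entity_id": "E1"},
]

# Hypothetical mapping from each candidate definition to the field that keys it.
DEFINITIONS = {
    "contract holder": "contract_id",
    "end user": "user_id",
    "household": "household_id",
    "legal entity": "legal_entity_id",
}


def count_customers(rows, key_field):
    """Count distinct customers under the definition keyed by key_field."""
    return len({row[key_field] for row in rows})


counts = {name: count_customers(records, field) for name, field in DEFINITIONS.items()}

for name, n in counts.items():
    print(f"{name}: {n}")
# contract holder: 4
# end user: 3
# household: 2
# legal entity: 1
```

Four rows yield four different answers to "how many customers do we have?". A model whose features or labels depend on that count will behave differently depending on which system supplied the data.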

Establishing a data governance framework — with defined stewards for each critical data domain, documented definitions, and an arbitration process for definitional disputes — is unglamorous work. It has no visible deliverable at the point of completion. But it is the single intervention most reliably predictive of whether AI investment produces returns or produces expensive demonstrations.

Sequencing the two investments

The practical implication is not that AI investment should wait until data infrastructure is perfect — it never will be. It is that the two investments need to be sequenced against each other with discipline.

A working approach: invest in AI deployment for use cases where data readiness is already high, and invest simultaneously in improving data readiness for use cases where the strategic value of AI is highest. These are parallel tracks. But the AI deployment track should be scoped to what the data track can support today — not to what the technology roadmap aspires to in eighteen months.

Organisations that invert this — that scope AI deployments optimistically and expect the data infrastructure to catch up — consistently find that the data problems surface at the worst possible moment: after production deployment, at scale, in front of customers. The cost of rework at that point is an order of magnitude higher than the cost of sequencing correctly from the start.
