Dimension management for the AI era.
The categorical surface of every business system has been managed reactively for thirty years. Agents are about to make that intolerable.
The dimensional data problem, in one paragraph.
Open the CRM and look at the Industry field on the customer accounts. There is a dropdown with twenty industries, and at the bottom there is “Other,” and next to “Other” there is a text field. Every account where the right industry was not on the list, where the rep was not sure, where the rep was in a hurry, ended up under “Other” with whatever the rep typed in the text field.
Three years of this and “Other” is the largest single category, and the text field beside it contains four hundred unique strings: dozens of spellings of the same five real industries, plus genuine new categories that should have been added to the dropdown years ago but never were.
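To make the scale of that text field concrete, here is a minimal sketch of how the free-text values beside “Other” might be triaged. It is an illustration under stated assumptions, not a description of how any particular product works: the canonical list, the function names, and the 0.8 threshold are all hypothetical, and the matching uses plain string similarity from the Python standard library.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical canonical list: stands in for the "five real industries".
CANONICAL = ["Healthcare", "Financial Services", "Manufacturing", "Retail", "Education"]


def normalise(text):
    """Lower-case, turn hyphens into spaces, and collapse whitespace."""
    return " ".join(text.lower().replace("-", " ").split())


def best_match(raw, canonical=CANONICAL):
    """Return (canonical label, similarity score in [0, 1]) for one raw string."""
    cleaned = normalise(raw)
    scored = [
        (SequenceMatcher(None, cleaned, normalise(label)).ratio(), label)
        for label in canonical
    ]
    score, label = max(scored)
    return label, score


def triage(raw_values, threshold=0.8):
    """Split raw strings into confident mappings and unresolved leftovers.

    The threshold is an assumed cut-off, not a recommended value.
    """
    mapped = Counter()      # canonical label -> number of raw strings mapped to it
    unresolved = Counter()  # normalised raw string -> count, for a human to review
    for raw in raw_values:
        label, score = best_match(raw)
        if score >= threshold:
            mapped[label] += 1
        else:
            unresolved[normalise(raw)] += 1
    return mapped, unresolved
```

Run over the four hundred strings, a pass like this would separate the spelling variants of known industries from the leftovers. The unresolved bucket is where candidates for genuinely new categories would surface, and deciding what to do with them is a judgement for a person, not for the similarity score.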
The product team trying to decide which verticals to target opens the segment analysis and finds that the largest customer cohort cannot be characterised. The marketing team building lookalike audiences discovers that the seed segments are unreliable. The model trained to predict churn by industry learns that “Other” is its own predictor of something, but nobody knows what.
This pattern, repeated across every categorical field in every operational system, is the dimensional data problem. Dimensional data is the most consequential class of data in any analytical system, and it is the least systematically managed.
Pick the door you came in for.
CleanDims is the infrastructure for dimension management at runtime.
A canonical reference for every dimension that matters. Stewards and a request workflow that treats changes as version-controlled infrastructure. Agents that read from synchronised caches and produce confidence-weighted output. The categorical surface, governed in production rather than in retrospect.
See the product
The discipline, written down.
The manifesto, the primer, the target state, and the four-layer taxonomy of what goes wrong. A growing library of practitioner artifacts. An interactive diagnostic that walks from a recognisable symptom down to its root cause. The category is bigger than any one product, and the practice deserves to be developed by everyone who will eventually depend on it.
Read the open source
Everything about dimensional data, from first principles.
A structured curriculum that starts at “what is a dimension” and arrives at “what good looks like.” A glossary that defines every term the discipline uses. Built for practitioners, accessible to anyone willing to read carefully.
Start learning
Built for three audiences. Roles in the work, not job titles.
- 01
Heads of analytics, data, and AI
Responsible for the systems that depend on dimensional data and have noticed that the dependence is fragile.
- 02
Practitioners
The data engineers, analysts, architects, and stewards who do the work and want the discipline written down.
- 03
Anyone new to the topic
Coming to dimensional data for the first time, and needing it explained from first principles before the rest of the conversation makes sense.