CleanDims deploys purpose-built AI agents to analyse, cluster, and propose canonical mappings for your categorical data. Every decision goes in front of a human expert before it reaches you.
Every system that accepts free-text input accumulates variation. Tags, labels, and segment names are each entered independently, each expressing the same concept differently. Over time, cross-system analysis becomes unreliable across every function that depends on that data.
Amazon Web Services, AWS, Amazon AWS, aws: four entries, one supplier, no reliable spend total.
Software Engineer, SWE, Sr. SDE: headcount and compensation reports diverge on the same role.
Tier-2. Sales uses Mid-Market. Marketing uses SMB. Revenue attribution cannot reconcile.
prod, PROD, prd, live: cost centre allocation splits across variants.
MSA, Master Service Agreement, Master Services Agreement: legal reporting draws from an incomplete set.
Select a function to see how categorical variance manifests in that team's data and what the canonical output looks like.
Purchase orders, invoices, expense reports, and ERP records reference the same suppliers under different names. Spend consolidation and vendor risk reports draw from incomplete sets because four entries refer to one supplier.
Compensation benchmarking, skills gap analysis, and org reporting depend on consistent title data. When ATS, HRIS, and payroll each record the same role differently, headcount by function or level becomes unreliable.
Campaign performance analysis requires grouping spend and conversion data across all executions. When the same campaign is tagged differently by each team member, aggregated reports are incomplete and budget tracking is unreliable.
Win/loss analysis requires consistent classification. When reps enter reasons freely, the same underlying cause surfaces as a dozen variants. Aggregated loss data cannot drive product, pricing, or positioning decisions.
Cost allocation, security inventories, and infrastructure monitoring all depend on consistent tagging. When environment and team tags are applied independently at provisioning, the same workload accumulates variants across providers and tools.
Feature attribution, NPS analysis, and roadmap reporting depend on consistent product area classification. When the same area is labelled differently across tools, work cannot be attributed and prioritisation draws from incomplete data.
Obligation tracking, renewal management, and risk reporting draw from incomplete datasets when the same agreement type carries four different labels across the contract management system, matter tracker, and ERP.
Ticket routing, SLA compliance, and issue trend analysis depend on consistent category classification. When each agent labels independently, volume by category and time-to-resolution by issue type both fragment across label variants.
Category spend management and supplier consolidation depend on categories being consistently applied. When the same spend type carries four different labels across procurement, finance, and expense systems, totals are unreliable and benchmarking is invalid.
Resolving dimensional variance requires pattern recognition, business context, and human judgement, applied in the right sequence, consistently, and at scale. CleanDims is built around that reality.
CleanDims runs purpose-built agents for each stage: frequency analysis, syntactic clustering, fuzzy matching, semantic grouping, and canonical proposal. Each agent does one job and does it well. Their outputs compound into a coherent, high-confidence mapping proposal.
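As a rough illustration of how the first stages could fit together, the sketch below chains frequency analysis, syntactic clustering, fuzzy matching, and a canonical proposal using only the Python standard library. It is not CleanDims' actual agent pipeline: the function names, the 0.85 similarity threshold, and the rule of proposing the most frequent variant as canonical are all assumptions made for the example. Semantic grouping is deliberately left out, which is why an abbreviation such as MSA stays unmapped here.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def normalise(value: str) -> str:
    """Syntactic key: lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", value.lower())
    return " ".join(cleaned.split())


def propose_mappings(raw_values, threshold=0.85):
    """Return {raw value: proposed canonical value} (hypothetical helper)."""
    # Frequency analysis: how often does each raw value occur?
    counts = Counter(raw_values)

    # Syntactic clustering: raw values sharing a normalised key belong together.
    clusters = {}
    for value in counts:
        clusters.setdefault(normalise(value), []).append(value)

    def cluster_size(key):
        return sum(counts[v] for v in clusters[key])

    # Fuzzy matching: merge clusters whose keys are near-identical,
    # visiting the largest clusters first so they become group leaders.
    groups = []  # list of (leader_key, member raw values)
    for key in sorted(clusters, key=cluster_size, reverse=True):
        for leader, members in groups:
            if SequenceMatcher(None, leader, key).ratio() >= threshold:
                members.extend(clusters[key])
                break
        else:
            groups.append((key, list(clusters[key])))

    # Canonical proposal (assumed rule): the most frequent raw variant wins.
    mapping = {}
    for _, members in groups:
        canonical = max(members, key=lambda v: counts[v])
        for value in members:
            mapping[value] = canonical
    return mapping


values = (
    ["Master Service Agreement"] * 3
    + ["Master Services Agreement", "master service agreement", "MSA"]
)
mapping = propose_mappings(values)
```

In this run the three spelled-out variants collapse onto "Master Service Agreement", while "MSA" remains its own group. Linking an abbreviation to its expansion is the kind of case that needs semantic grouping and, where ambiguous, human review.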
CleanDims never touches upstream data. The canonical registry holds a mapping from every raw value to its canonical form. Organisations join this registry against their existing data at query time. Source systems remain unchanged with full auditability.
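A query-time join of this kind might look like the following sketch, which uses an in-memory SQLite database to stand in for a warehouse. The table names (invoices, canonical_registry), column names, and amounts are invented for illustration and do not describe the delivered schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Source data, left exactly as the upstream system recorded it.
    CREATE TABLE invoices (supplier TEXT, amount REAL);
    INSERT INTO invoices VALUES
        ('Amazon Web Services', 1200.0),
        ('AWS', 800.0),
        ('Amazon AWS', 500.0),
        ('aws', 250.0);

    -- Hypothetical registry: one row per raw value.
    CREATE TABLE canonical_registry (
        raw_value TEXT PRIMARY KEY,
        canonical_value TEXT NOT NULL
    );
    INSERT INTO canonical_registry VALUES
        ('Amazon Web Services', 'Amazon Web Services'),
        ('AWS', 'Amazon Web Services'),
        ('Amazon AWS', 'Amazon Web Services'),
        ('aws', 'Amazon Web Services');
    """
)

# Join at query time; unmapped values fall back to their raw form.
rows = conn.execute(
    """
    SELECT COALESCE(r.canonical_value, i.supplier) AS canonical_supplier,
           SUM(i.amount) AS total_spend
    FROM invoices AS i
    LEFT JOIN canonical_registry AS r ON r.raw_value = i.supplier
    GROUP BY COALESCE(r.canonical_value, i.supplier)
    """
).fetchall()

# The source table still holds its four original spellings.
distinct_raw = conn.execute(
    "SELECT COUNT(DISTINCT supplier) FROM invoices"
).fetchone()[0]
```

The LEFT JOIN with COALESCE is one possible design choice: a value that has not yet been mapped still appears in the report under its raw name rather than silently dropping out of the total.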
Every mapping proposal is reviewed by a CleanDims subject matter expert before it enters your review queue. Ambiguous cases are escalated with a recommendation and full context. Agents accelerate the work. Humans are responsible for the decisions.
Successful agent sequences are saved as templates. A workflow that governed vendor names for a financial services client becomes the starting point for the next. Each engagement makes future similar work faster and more accurate, with higher confidence.
Without a central canonical registry, Finance, Sales, and Analytics each reconcile the same dimensions independently and diverge. CleanDims gives the whole organisation a single reference, maintained in one place, so the same work is never repeated.
Dimensional variance is not a side problem for us. It is the problem we have studied, seen repeat across industries, and built a methodology specifically to solve. That specialisation means faster processing, fewer escalations, and better canonical decisions.
All data transfer, document execution, review, and delivery run through the CleanDims platform. No integration, no installation, no onboarding. Direct data source connections are on the roadmap.
The Chaos Assessment is the starting point for every engagement. It produces a complete picture of categorical variance in a specific dataset before any commitment is made to a full standardisation. The fee is credited in full if an engagement proceeds.
All outputs from a full engagement are owned outright by the organisation. No ongoing dependency on CleanDims to maintain or operate them.
Complete mapping of all raw values to canonical form. Delivered in CSV, JSON, Parquet, or direct warehouse export. Every mapping includes the agent confidence score and the full decision trail.
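To make the shape of such a registry concrete, here is a minimal sketch of one possible record layout, serialised to JSON and CSV with the Python standard library. The field names (raw_value, canonical_value, confidence, decision_trail), the confidence figures, and the trail entries are illustrative assumptions, not the actual delivered format.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MappingRecord:
    """One registry row; every field name here is hypothetical."""
    raw_value: str
    canonical_value: str
    confidence: float
    decision_trail: list = field(default_factory=list)


records = [
    MappingRecord("PROD", "prod", 0.99, ["syntactic: case fold"]),
    MappingRecord("prd", "prod", 0.90, ["fuzzy: near match to 'prod'"]),
    MappingRecord("live", "prod", 0.70,
                  ["semantic: environment synonym", "expert: approved"]),
]

# JSON keeps the decision trail as a nested list.
json_output = json.dumps([asdict(r) for r in records], indent=2)

# CSV is flat, so the trail is joined into a single delimited cell.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["raw_value", "canonical_value", "confidence", "decision_trail"],
)
writer.writeheader()
for record in records:
    row = asdict(record)
    row["decision_trail"] = " | ".join(row["decision_trail"])
    writer.writerow(row)
csv_output = buffer.getvalue()
```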
Canonical values enriched with standard identifiers where applicable: UNSPSC codes, SIC/NAICS classifications, ISO codes.
Decision logic used during the engagement, documented so the organisation's team can classify new values without external input.
Process document covering ongoing taxonomy maintenance: new values, edge cases, periodic audits, and system onboarding.
Complete record of every mapping decision, including rationale for ambiguous cases. The audit trail for any classification later questioned.
Output loads into Snowflake, BigQuery, Databricks, Redshift, dbt, Looker, Tableau, and Power BI. No additional tooling required.
CleanDims works alongside the data tools already in use. The canonical output is a structured data file that loads into any tool that accepts tabular data. No migration and no new tooling required to receive or use it.
We publish what we know about categorical data variance: where it originates, why it persists, and how to resolve it. Free to read, no commitment required.
For anything not covered here, use the assessment form to get in touch.
Submit a sample dataset. See how CleanDims agents classify your specific dimension data before any commitment is made to a full engagement.
No commitment required to request an assessment.
This policy explains what information CleanDims collects, how it is used, and your rights in relation to it. CleanDims is a data standardisation service. References to "we", "us", or "CleanDims" refer to the team operating this service.
By submitting an assessment request or entering into an engagement, you acknowledge this privacy policy.
Contact and enquiry information. When you submit the assessment request form, we collect your name, organisation name, job title, and any description of your data challenge that you choose to provide. This information is used solely to respond to your enquiry and scope an engagement.
Client data submitted for processing. During an engagement, you submit dimensional data files for standardisation. This data is processed solely for the purpose of delivering the canonical mapping output. It is not used for any other purpose, not shared with third parties, and deleted on completion of the engagement unless a governance retainer is in place.
Website usage data. We may collect anonymous usage information to understand how visitors interact with our website. No personally identifiable information is collected through this process, and no data is sold to or shared with advertisers or third parties.
We do not use your information for marketing without your explicit consent. We do not sell, rent, or trade your personal information to any third party.
Contact and enquiry information is retained for 24 months following last contact. Engagement data (uploaded files, working documents) is deleted within 30 days of engagement close. Final deliverables are owned by the client and held on our side only if a retainer agreement is active. Financial and contractual records are retained in accordance with applicable legal obligations.
You have the right to access information we hold about you, request correction of inaccurate data, request deletion of your data, and object to processing. To exercise any of these rights, submit a request through our assessment form indicating it is a data rights request. We will respond within 30 days.
Our website uses only essential cookies necessary for the site to function. We do not use tracking cookies, advertising cookies, or third-party analytics cookies that collect personally identifiable information.
We may update this policy from time to time. The date shown at the top of this page reflects the most recent revision. Continued use of our services after an update constitutes acceptance of the revised policy.
For any questions about this privacy policy or to exercise your data rights, please use the assessment request form and indicate your request type in the message field.
These Terms of Service govern your use of the CleanDims website and services. By submitting an assessment request or proceeding with an engagement, you agree to these terms. If you are accepting on behalf of an organisation, you represent that you have authority to bind that organisation.
CleanDims provides categorical data standardisation services. This involves receiving dimensional data from clients, processing it using AI agents and human expert review, and returning a canonical mapping registry, governance documentation, and associated deliverables as described in the engagement scope.
The specific scope, timeline, deliverables, and fees for each engagement are agreed in writing before processing begins. Services described on this website are illustrative. Actual deliverables are defined by the signed engagement scope.
The Chaos Assessment is a one-week analysis of a sample dataset. The fee is $500 USD, payable prior to work commencing. The assessment fee is credited in full against the fee for a subsequent full engagement if one is agreed within 90 days. All assessment outputs are owned by the client regardless of whether a full engagement proceeds. The assessment fee is non-refundable once processing has begun.
Assessment fees are payable in advance. Full engagement fees are invoiced as agreed in the engagement scope, typically with a portion due at commencement and the remainder on delivery. Payment terms are 14 days from invoice date unless otherwise specified. CleanDims reserves the right to pause work on overdue accounts.
CleanDims retains its methodology. The operation library, agent pipeline architecture, template sequences, global dictionary infrastructure, and processing methodology are the proprietary intellectual property of CleanDims. Engagement deliverables do not transfer rights to the underlying methodology.
Client data. All data submitted by the client remains the property of the client. CleanDims acquires no rights to client data beyond what is necessary to perform the service.
Both parties agree to treat the other's confidential information with the same degree of care as their own confidential information, and in any event no less than reasonable care. CleanDims will not disclose client data, engagement details, or output contents to any third party without the client's prior written consent, except as required by applicable law.
CleanDims warrants that it will perform services with reasonable skill and care consistent with good industry practice. CleanDims does not warrant that its canonical outputs will be error-free or suitable for any specific downstream use. The client is responsible for reviewing all outputs before applying them to production systems. Shadow Mode review is provided precisely to enable this.
To the extent permitted by applicable law, all other warranties, express or implied, are excluded.
To the maximum extent permitted by applicable law, CleanDims' total liability for any claim arising from or relating to these terms or the services shall not exceed the total fees paid by the client in the three months preceding the event giving rise to the claim. CleanDims shall not be liable for any indirect, consequential, incidental, or special loss or damage, including loss of revenue, loss of data, or loss of business.
Either party may terminate an engagement by giving written notice if the other party materially breaches these terms and fails to remedy the breach within 14 days of notice. On termination, all client data will be deleted within 30 days, all deliverables completed to that point will be delivered to the client, and fees for work completed will remain payable.
We may update these terms from time to time. For active engagements, the terms in effect at the time of the signed engagement scope will govern that engagement. For new engagements or website use, the current terms apply.