Ab Initio Data Quality Fix May 2026

Ab Initio Data Quality: Why You Can’t Fix Rubbish Later

Stop polishing bad data. Start building it right from the first principle. ab initio data quality

You enforce quality at the point of creation or ingestion. If a record doesn’t meet the first principles of your domain (e.g., timestamp cannot be in the future; customer_id must match a regex), it is rejected immediately. The rule: Do not allow a known violation to enter your persistent storage. Ever. 2. The "Nullable Integer" Paradox Let’s look at a classic first-principles failure: Nulls in numeric fields. Ab Initio Data Quality: Why You Can’t Fix

Audit your warehouse. Pick one critical table. Enforce NOT NULL on every single column. If you truly need a missing value, use a sentinel row (e.g., id = 0 , name = "UNKNOWN" ). You will be shocked how many bugs disappear. If a record doesn’t meet the first principles

Use tools like pydantic (Python), Great Expectations (with expect_column_values_to_not_be_null set to fatal ), or dbt 's constraints (enforced, not just documented). If the contract fails, the pipe breaks. Loudly.