Running the Inspector¶
This page covers the practical workflow: how to run the inspector, what it produces, and how the rest of Dataface uses the cached output.
What the inspector does¶
The inspector profiles database tables and writes the results to
target/inspect.json. That artifact is then reused by:
- the dft inspect CLI (inspect dashboards and HTML output)
- MCP catalog() responses (AI schema context generation)
- compile-time fanout warnings
In practice, the inspector is both a profiling tool and a cache-builder for the rest of the product.
Common commands¶
Profile every table¶
When dft inspect is run without a subcommand, it profiles all discovered
tables and then performs post-processing across the catalog:
- relationship detection
- join multiplicity enrichment
- fanout risk scoring
- dbt description baking
Useful variants:
dft inspect --schema analytics
dft inspect --include 'stg_*'
dft inspect --exclude '_dbt_*'
dft inspect --connection ./warehouse.duckdb
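The variants above can be combined in one run to narrow the catalog; this sketch assumes the schema, include, and exclude filters compose:

```shell
# Profile only staging tables in the analytics schema,
# skipping dbt-internal tables (flag composition assumed).
dft inspect --schema analytics --include 'stg_*' --exclude '_dbt_*'
```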
Profile one table¶
Useful variants:
dft inspect table orders --schema analytics
dft inspect table orders --dialect postgres --connection 'postgresql://...'
dft inspect table orders --format json
dft inspect table orders --format html
dft inspect table orders --force
Profile a CSV file¶
CSV paths are loaded into an ephemeral DuckDB automatically, so do not pass
--connection or --dialect in this mode.
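A minimal sketch, assuming a CSV path can be passed where a table name would otherwise go (the path is illustrative):

```shell
# Dataface loads the file into an ephemeral DuckDB for profiling;
# no --connection or --dialect flags are needed here.
dft inspect table ./data/orders.csv
```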
Audit context readiness¶
This reads target/inspect.json and reports how complete the catalog is for AI
use, including coverage for:
- descriptions
- semantic typing
- grain detection
- relationship detection
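The audit runs as a dft inspect subcommand:

```shell
# Report catalog completeness from the cached artifact.
dft inspect audit
```

Because it only reads target/inspect.json, run it after a profiling pass so the coverage numbers reflect the current cache.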
Customize inspect dashboards¶
Template customization is documented separately in
Inspect Template Customization.
Connection behavior¶
If you run the inspector inside a dbt project and do not pass
--connection, Dataface tries to auto-detect the connection from
dbt_project.yml and profiles.yml.
If there is no dbt context, pass the connection details explicitly:
dft inspect table orders --dialect duckdb --connection ./warehouse.duckdb
dft inspect table orders --dialect postgres --connection 'postgresql://user:pass@host/db'
Output formats¶
Terminal¶
Default mode. Profiles the table, saves it to target/inspect.json, and prints
a compact summary in the terminal.
JSON¶
Returns the single-table JSON payload directly to stdout instead of saving a dashboard HTML page.
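Since the payload goes to stdout, it can be piped into standard tooling; this sketch assumes jq is installed:

```shell
# Pretty-print the single-table profile for scripting or diffing.
dft inspect table orders --format json | jq .
```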
HTML¶
Renders the inspect view as a Dataface dashboard using the same underlying profile artifact.
The cache artifact¶
The default output path is target/inspect.json.
This file is the shared contract between the inspector and downstream consumers. It stores:
- one entry per profiled table
- the raw profile output
- baked relationship metadata
- baked dbt descriptions
The artifact is incremental. Re-running the inspector updates only the tables that changed.
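A quick way to see which tables have cache entries; this assumes the artifact's top-level keys are the table entries, which is an assumption about the layout:

```shell
# List cached table entries in the shared artifact (jq assumed installed).
jq 'keys' target/inspect.json
```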
Staleness and --force¶
Single-table profiling uses a staleness check: if a table's cache entry is already up to date, Dataface skips the work.
To bypass the staleness check and re-profile anyway, pass --force.
This matters when:
- the underlying table changed but the local cache has not been refreshed yet
- you changed profiler logic and want to rebuild old profiles
- you want to regenerate downstream relationship or description metadata
When other surfaces will profile versus reuse cache¶
Not every consumer profiles tables automatically.
- dft inspect profiles and writes cache entries.
- catalog(table='orders') prefers cached data.
- catalog(table='orders') does not auto-profile on a cache miss.
- catalog(table='orders', force_refresh=True) profiles immediately.
- get_schema_context() prefers cached profiles, but falls back to live schema introspection when no cached profile exists.
That split is intentional:
- profiling can be expensive
- cached context is stable and reusable
- schema browsing should still work even before a full profiling run
Recommended workflow¶
For a project that is actively being explored:
- Run dft inspect to build the catalog cache.
- Run dft inspect audit to see where context is thin.
- Add or improve dbt descriptions in models/**/schema.yml.
- Re-run dft inspect so the cache bakes in updated descriptions and relationships.
- Use catalog() and AI features on top of the cached context.
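The exploration loop can be sketched as a single session (the edit step happens outside the CLI):

```shell
dft inspect         # build the catalog cache
dft inspect audit   # see where context is thin
# ...add or improve descriptions in models/**/schema.yml...
dft inspect         # re-bake updated descriptions and relationships
```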
If you only care about one model, use dft inspect table <name> first and
expand to the full catalog later.