Running the Inspector¶
This page covers the practical workflow: how to run the inspector, what it produces, and how the rest of Dataface uses the cached output.
What the inspector does¶
The inspector profiles database tables and writes the results to
target/inspect.json. That artifact is then reused by:
- the dft inspect CLI (inspect dashboards and HTML output)
- MCP catalog() responses (AI schema context generation)
- compile-time fanout warnings
In practice, the inspector is both a profiling tool and a cache-builder for the rest of the product.
Common commands¶
Profile every table¶
When dft inspect is run without a subcommand, it profiles all discovered
tables and then performs post-processing across the catalog:
- relationship detection
- join multiplicity enrichment
- fanout risk scoring
- dbt description baking
Useful variants:
dft inspect --schema analytics
dft inspect --include 'stg_*'
dft inspect --exclude '_dbt_*'
dft inspect --connection ./warehouse.duckdb
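The variants above can be combined in one run to narrow the catalog; this sketch assumes the schema, include, and exclude filters compose:

```shell
# Profile only staging tables in the analytics schema,
# skipping dbt-internal tables (flag composition assumed).
dft inspect --schema analytics --include 'stg_*' --exclude '_dbt_*'
```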
Profile one table¶
Useful variants:
dft inspect table orders --schema analytics
dft inspect table orders --dialect postgres --connection 'postgresql://...'
dft inspect table orders --format json
dft inspect table orders --format html
dft inspect table orders --force
Profile a CSV file¶
CSV paths are loaded into an ephemeral DuckDB automatically, so do not pass
--connection or --dialect in this mode.
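A minimal sketch, assuming a CSV path can be passed where a table name would otherwise go (the path is illustrative):

```shell
# Dataface loads the file into an ephemeral DuckDB for profiling;
# no --connection or --dialect flags are needed here.
dft inspect table ./data/orders.csv
```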
Audit context readiness¶
This reads target/inspect.json and reports how complete the catalog is for AI
use, including coverage for:
- descriptions
- semantic typing
- grain detection
- relationship detection
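The audit runs as a dft inspect subcommand:

```shell
# Report catalog completeness from the cached artifact.
dft inspect audit
```

Because it only reads target/inspect.json, run it after a profiling pass so the coverage numbers reflect the current cache.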
Customize inspect dashboards¶
Template customization is documented separately in
Inspect Template Customization.
Connection behavior¶
If you run the inspector inside a dbt project and do not pass
--connection, Dataface tries to auto-detect the connection from
dbt_project.yml and profiles.yml.
If there is no dbt context, pass the connection details explicitly:
dft inspect table orders --dialect duckdb --connection ./warehouse.duckdb
dft inspect table orders --dialect postgres --connection 'postgresql://user:pass@host/db'
Output formats¶
Terminal¶
Default mode. Profiles the table, saves it to target/inspect.json, and prints
a compact summary in the terminal.
JSON¶
Returns the single-table JSON payload directly to stdout instead of saving a dashboard HTML page.
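Since the payload goes to stdout, it can be piped into standard tooling; this sketch assumes jq is installed:

```shell
# Pretty-print the single-table profile for scripting or diffing.
dft inspect table orders --format json | jq .
```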
HTML¶
Renders the inspect view as a Dataface dashboard using the same underlying profile artifact.
The cache artifact¶
The default output path is target/inspect.json.
This file is the shared contract between the inspector and downstream consumers. It stores:
- one entry per profiled table
- the raw profile output
- baked relationship metadata
- baked dbt descriptions
The artifact is incremental. Re-running the inspector updates only the tables that changed.
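A quick way to see which tables have cache entries; this assumes the artifact's top-level keys are the table entries, which is an assumption about the layout:

```shell
# List cached table entries in the shared artifact (jq assumed installed).
jq 'keys' target/inspect.json
```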
Staleness and --force¶
Single-table profiling uses a staleness check: if a table's cache entry is already up to date, Dataface skips the work.
To bypass the staleness check and re-profile anyway, pass --force.
This matters when:
- the underlying table changed but the local cache has not been refreshed yet
- you changed profiler logic and want to rebuild old profiles
- you want to regenerate downstream relationship or description metadata
When other surfaces will profile versus reuse cache¶
Not every consumer profiles tables automatically.
- dft inspect profiles and writes cache entries.
- catalog(table='orders') prefers cached data.
- catalog(table='orders') does not auto-profile on a cache miss.
- catalog(table='orders', force_refresh=True) profiles immediately.
- get_schema_context() prefers cached profiles, but falls back to live schema introspection when no cached profile exists.
That split is intentional:
- profiling can be expensive
- cached context is stable and reusable
- schema browsing should still work even before a full profiling run
Recommended workflow¶
For a project that is actively being explored:
- Run dft inspect to build the catalog cache.
- Run dft inspect audit to see where context is thin.
- Add or improve dbt descriptions in models/**/schema.yml.
- Re-run dft inspect so the cache bakes in updated descriptions and relationships.
- Use catalog() and AI features on top of the cached context.
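The exploration loop can be sketched as a single session (the edit step happens outside the CLI):

```shell
dft inspect         # build the catalog cache
dft inspect audit   # see where context is thin
# ...add or improve descriptions in models/**/schema.yml...
dft inspect         # re-bake updated descriptions and relationships
```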
If you only care about one model, use dft inspect table <name> first and
expand to the full catalog later.