
Running the Inspector

This page covers the practical workflow: how to run the inspector, what it produces, and how the rest of Dataface uses the cached output.

What the inspector does

The inspector profiles database tables and writes the results to target/inspect.json. That artifact is then reused by:

  • the dft inspect CLI
  • inspect dashboards and HTML output
  • MCP catalog() responses
  • AI schema context generation
  • compile-time fanout warnings

In practice, the inspector is both a profiling tool and a cache-builder for the rest of the product.
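A downstream consumer reads this artifact rather than re-profiling. A minimal sketch of such a consumer, assuming a hypothetical `tables` top-level key (the real inspect.json layout may differ):

```python
import json
import pathlib
import tempfile

# Hypothetical artifact contents; the real inspect.json layout may differ.
sample = {
    "tables": {
        "orders": {"row_count": 1200, "columns": ["id", "total"]},
        "customers": {"row_count": 300, "columns": ["id", "name"]},
    }
}

def profiled_tables(path):
    """Return the table names recorded in an inspect.json-style artifact."""
    data = json.loads(pathlib.Path(path).read_text())
    return sorted(data.get("tables", {}))

with tempfile.TemporaryDirectory() as d:
    artifact = pathlib.Path(d) / "inspect.json"
    artifact.write_text(json.dumps(sample))
    tables = profiled_tables(artifact)

print(tables)  # ['customers', 'orders']
```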

Common commands

Profile every table

dft inspect

When dft inspect is run without a subcommand, it profiles all discovered tables and then performs post-processing across the catalog:

  • relationship detection
  • join multiplicity enrichment
  • fanout risk scoring
  • dbt description baking

Useful variants:

dft inspect --schema analytics
dft inspect --include 'stg_*'
dft inspect --exclude '_dbt_*'
dft inspect --connection ./warehouse.duckdb

Profile one table

dft inspect table orders

Useful variants:

dft inspect table orders --schema analytics
dft inspect table orders --dialect postgres --connection 'postgresql://...'
dft inspect table orders --format json
dft inspect table orders --format html
dft inspect table orders --force

Profile a CSV file

dft inspect table ./data/orders.csv

CSV paths are loaded into an ephemeral DuckDB database automatically. Do not pass --connection or --dialect in this mode.
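The ephemeral-database mode can be pictured with the following sketch. It uses the stdlib `sqlite3` so the example is self-contained; Dataface itself uses DuckDB, and the table name and columns here are made up:

```python
import csv
import pathlib
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as d:
    csv_path = pathlib.Path(d) / "orders.csv"
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "total"])
        writer.writerows([("1", "9.50"), ("2", "12.00")])

    # Ephemeral in-memory database: it exists only for this run,
    # so no --connection or --dialect is needed.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id TEXT, total TEXT)")
    with open(csv_path, newline="") as f:
        rows = csv.reader(f)
        next(rows)  # skip the header row
        con.executemany("INSERT INTO orders VALUES (?, ?)", rows)

    count = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(count)  # 2
```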

Audit context readiness

dft inspect audit

This reads target/inspect.json and reports how complete the catalog is for AI use, including coverage for:

  • descriptions
  • semantic typing
  • grain detection
  • relationship detection
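A coverage report along these lines can be sketched as below, assuming hypothetical per-table `description` and `grain` fields (the real artifact keys may differ):

```python
# Hypothetical inspect.json excerpt; real field names may differ.
catalog = {
    "orders": {"description": "One row per order", "grain": ["order_id"]},
    "events": {"description": None, "grain": None},
}

def coverage(catalog, field):
    """Fraction of tables with a non-empty value for `field`."""
    have = sum(1 for entry in catalog.values() if entry.get(field))
    return have / len(catalog)

print(f"descriptions: {coverage(catalog, 'description'):.0%}")  # descriptions: 50%
print(f"grain:        {coverage(catalog, 'grain'):.0%}")        # grain:        50%
```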

Customize inspect dashboards

dft inspect templates
dft inspect eject model
dft inspect validate-templates

Template customization is documented separately in Inspect Template Customization.

Connection behavior

If you run the inspector inside a dbt project and do not pass --connection, Dataface tries to auto-detect the connection from dbt_project.yml and profiles.yml.

If there is no dbt context, pass the connection details explicitly:

dft inspect table orders --dialect duckdb --connection ./warehouse.duckdb
dft inspect table orders --dialect postgres --connection 'postgresql://user:pass@host/db'

Output formats

Terminal

Default mode. Profiles the table, saves it to target/inspect.json, and prints a compact summary in the terminal.

dft inspect table orders

JSON

Writes the single-table JSON payload to stdout instead of saving an HTML dashboard page.

dft inspect table orders --format json

HTML

Renders the inspect view as a Dataface dashboard using the same underlying profile artifact.

dft inspect table orders --format html --theme light

The cache artifact

The default output path is:

target/inspect.json

This file is the shared contract between the inspector and downstream consumers. It stores:

  • one entry per profiled table
  • the raw profile output
  • baked relationship metadata
  • baked dbt descriptions

The artifact is incremental. Re-running the inspector updates only the tables that changed.
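The incremental update can be pictured as an upsert into the artifact. Everything here is illustrative (`upsert_profile` and the `tables`/`profiled_at` keys are not Dataface internals):

```python
import json
import pathlib
import tempfile
import time

def upsert_profile(path, table, profile):
    """Merge one table's profile into the artifact, leaving the rest intact."""
    path = pathlib.Path(path)
    data = json.loads(path.read_text()) if path.exists() else {"tables": {}}
    data["tables"][table] = {**profile, "profiled_at": time.time()}
    path.write_text(json.dumps(data))

with tempfile.TemporaryDirectory() as d:
    artifact = pathlib.Path(d) / "inspect.json"
    upsert_profile(artifact, "orders", {"row_count": 1200})
    upsert_profile(artifact, "customers", {"row_count": 300})
    upsert_profile(artifact, "orders", {"row_count": 1250})  # refresh one table
    final = json.loads(artifact.read_text())["tables"]

print(sorted(final))  # ['customers', 'orders']
```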

Staleness and --force

Single-table profiling uses a staleness check. If a table is already up to date, Dataface skips the work:

dft inspect table orders

To bypass the staleness check and re-profile anyway:

dft inspect table orders --force

This matters when:

  • the underlying table changed but the local cache has not been refreshed yet
  • you changed profiler logic and want to rebuild old profiles
  • you want to regenerate downstream relationship or description metadata
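The staleness rule reduces to a timestamp comparison. This sketch is illustrative, not Dataface's actual check:

```python
import time

def is_stale(profiled_at, source_updated_at):
    """A cached profile is stale once the source changed after it was profiled."""
    return source_updated_at > profiled_at

now = time.time()
fresh = is_stale(profiled_at=now, source_updated_at=now - 3600)
stale = is_stale(profiled_at=now - 3600, source_updated_at=now)
print(fresh, stale)  # False True
```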

When other surfaces profile versus reuse the cache

Not every consumer profiles tables automatically.

  • dft inspect profiles and writes cache entries.
  • catalog(table='orders') prefers cached data.
  • catalog(table='orders') does not auto-profile on a cache miss.
  • catalog(table='orders', force_refresh=True) profiles immediately.
  • get_schema_context() prefers cached profiles, but falls back to live schema introspection when no cached profile exists.

That split is intentional:

  • profiling can be expensive
  • cached context is stable and reusable
  • schema browsing should still work even before a full profiling run
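The policy can be sketched as follows. All function bodies here are illustrative stand-ins, not Dataface's real implementation:

```python
# All names and bodies here are illustrative stand-ins, not Dataface internals.
cache = {"orders": {"columns": ["id", "total"]}}

def live_introspect(table):
    # Stand-in for querying the warehouse's information schema.
    return {"columns": ["(from live schema)"]}

def profile(table):
    # Stand-in for a full (and potentially expensive) profiling run.
    entry = {"columns": ["id", "total"], "row_count": 1250}
    cache[table] = entry
    return entry

def catalog(table, force_refresh=False):
    """Prefer cached data; profile only on force_refresh, never on a miss."""
    if force_refresh:
        return profile(table)
    return cache.get(table)  # None on a cache miss: no auto-profiling

def get_schema_context(table):
    """Prefer the cached profile, but fall back to live introspection."""
    return cache.get(table) or live_introspect(table)

print(catalog("events"))  # None: cache miss, no auto-profile
print(get_schema_context("events")["columns"])  # falls back to live schema
```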

For a project that is actively being explored:

  1. Run dft inspect to build the catalog cache.
  2. Run dft inspect audit to see where context is thin.
  3. Add or improve dbt descriptions in models/**/schema.yml.
  4. Re-run dft inspect so the cache bakes in updated descriptions and relationships.
  5. Use catalog() and AI features on top of the cached context.

If you only care about one model, use dft inspect table <name> first and expand to the full catalog later.