pudl.validate.dbt
Wrap DBT invocations so we can get custom behavior.
Attributes
Classes
NodeContext | Associate a node's name with information describing what went wrong.
BuildResult | Combine overall result with any useful failure context.
Functions
_preserve_logging_propagation | Restore logging propagation settings after a dbt invocation.
install_dbt_deps | Ensure dbt package dependencies are installed in the project directory.
__get_failed_nodes | Get test node output from tests that failed.
__get_quantile_contexts | Run debug_quantile_constraints macro for failed quantile constraints.
__get_compiled_sql_contexts | Run the compiled SQL against duckdb to get failure contexts.
build_with_context | Run the DBT build and get failure information back.
dagster_to_dbt_selection | Translate dagster asset selection to dbt node selection.
Module Contents
- pudl.validate.dbt._preserve_logging_propagation()
Restore logging propagation settings after a dbt invocation.
Invoking dbt via dbtRunner triggers Dagster’s logging initialization, which resets logging.getLogger("dagster").propagate to False. This context manager saves and restores the setting so callers don’t experience unexpected side effects on the global logging configuration.
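The save-and-restore pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the module's actual implementation: the name `preserve_propagation` is hypothetical, and the assignment inside the `with` block stands in for the side effect of a real dbt invocation.

```python
import logging
from contextlib import contextmanager


@contextmanager
def preserve_propagation(logger_name: str = "dagster"):
    """Save a logger's ``propagate`` flag and restore it on exit (hypothetical helper)."""
    logger = logging.getLogger(logger_name)
    saved = logger.propagate
    try:
        yield logger
    finally:
        # Restore the flag even if the wrapped call raised.
        logger.propagate = saved


dagster_logger = logging.getLogger("dagster")
dagster_logger.propagate = True
with preserve_propagation("dagster"):
    # Stand-in for a dbt invocation that flips the flag as a side effect.
    dagster_logger.propagate = False
restored = dagster_logger.propagate
```

Restoring in a `finally` clause means the caller's logging configuration is put back even when the wrapped invocation fails.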
- class pudl.validate.dbt.NodeContext
Bases: NamedTuple
Associate a node’s name with information describing what went wrong.
- class pudl.validate.dbt.BuildResult
Bases: NamedTuple
Combine overall result with any useful failure context.
- failure_contexts: list[NodeContext]
- pudl.validate.dbt.install_dbt_deps(dbt: dbt.cli.main.dbtRunner | None = None) → dbt.cli.main.dbtRunner
Ensure dbt package dependencies are installed in the project directory.
- pudl.validate.dbt.__get_failed_nodes(results: dbt.artifacts.schemas.run.RunExecutionResult) → list[dbt.contracts.graph.nodes.GenericTestNode]
Get test node output from tests that failed.
- pudl.validate.dbt.__get_quantile_contexts(nodes: list[dbt.contracts.graph.nodes.GenericTestNode], dbt: dbt.cli.main.dbtRunner, dbt_dir: pathlib.Path) → list[NodeContext]
Run debug_quantile_constraints macro for failed quantile constraints.
This is a little tricky because the macro output is just logged to stdout, and not stored in the dbt.invoke result. So, for each node, we:
1. redirect stdout
2. run the macro based on node information
3. parse stdout to get the context
Also, if a node has multiple parents, we don’t know which table to pass into debug_quantile_constraints, so we just skip it.
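The redirect, run, parse sequence above can be sketched with `contextlib.redirect_stdout`. This is an illustration under stated assumptions rather than the module's code: `fake_macro` is a hypothetical stand-in for invoking the macro through dbt, and the `quantile=` line format is invented for the example.

```python
import io
from contextlib import redirect_stdout


def capture_stdout(fn, *args, **kwargs) -> str:
    """Run ``fn`` and return everything it printed to stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        fn(*args, **kwargs)
    return buffer.getvalue()


def fake_macro(table: str) -> None:
    # Hypothetical stand-in for a macro run that only logs its result.
    print(f"table: {table}")
    print("quantile=0.5, value=12.3")


output = capture_stdout(fake_macro, "example_table")
# Keep only the lines that carry the context we care about.
context_lines = [line for line in output.splitlines() if line.startswith("quantile")]
```

Because the captured text is unstructured, the parsing step has to rely on whatever line format the macro happens to log.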
- pudl.validate.dbt.__get_compiled_sql_contexts(nodes: list[dbt.contracts.graph.nodes.GenericTestNode]) → list[NodeContext]
Run the compiled SQL against duckdb to get failure contexts.
- pudl.validate.dbt.build_with_context(node_selection: str, dbt_target: str, node_exclusion: str | None = None) → BuildResult
Run the DBT build and get failure information back.
1. Run the DBT build using our selection, returning test failures.
2. Split the test failures by type: for most, we will just run the compiled SQL, but other tests, such as the weighted quantile tests, need extra handling.
3. Get contexts for the various test failure types.
4. Print out the test failure context.
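The "split by type" step can be pictured as a simple partition of the failed tests. The sketch below is hypothetical throughout: `FailedTest`, its `test_type` field, and the `weighted_quantile` label are illustrative names, not the dbt node classes or test names the module actually inspects.

```python
from dataclasses import dataclass


@dataclass
class FailedTest:
    """Hypothetical stand-in for a failed dbt test node."""

    name: str
    test_type: str


# Hypothetical label for tests that need the macro-based handling.
SPECIAL_TYPES = {"weighted_quantile"}


def split_failures(failures):
    """Partition failures into (generic, special) lists, preserving order."""
    special = [f for f in failures if f.test_type in SPECIAL_TYPES]
    generic = [f for f in failures if f.test_type not in SPECIAL_TYPES]
    return generic, special


failures = [
    FailedTest("not_null_plant_id", "not_null"),
    FailedTest("quantile_fuel_cost", "weighted_quantile"),
    FailedTest("unique_record_id", "unique"),
]
generic, special = split_failures(failures)
```

Each group can then be handed to the context-gathering routine suited to it, and the resulting contexts combined for reporting.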
- pudl.validate.dbt.dagster_to_dbt_selection(selection: str, defs: dagster.Definitions, manifest=None) → str
Translate dagster asset selection to dbt node selection.
We use the dbt manifest to determine which sources are defined in dbt so that we can map them to dagster assets. So, we need to generate a fresh dbt manifest via dbt parse whenever we run this function.
1. Turn asset selection into asset keys.
2. Turn asset keys into node names.
3. Turn node names into selection string.
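The last two steps can be sketched on plain strings. This is a simplified illustration, not the function's real logic: it assumes asset names are already resolved, that tables known to be dbt sources are selected with dbt's `source:<source_name>.<table>` syntax, and that the source name `pudl_source` is a placeholder. dbt treats space-separated selectors as a union.

```python
def to_dbt_selection(
    asset_names: list[str],
    source_tables: set[str],
    source_name: str = "pudl_source",  # hypothetical dbt source name
) -> str:
    """Map asset names to dbt node names and join them into one selector."""
    node_names = []
    for name in asset_names:
        if name in source_tables:
            # Tables defined as dbt sources use the source: selector method.
            node_names.append(f"source:{source_name}.{name}")
        else:
            node_names.append(name)
    # A space-separated list selects the union of the named nodes.
    return " ".join(node_names)


selection = to_dbt_selection(["table_a", "table_b"], source_tables={"table_a"})
```

In the real function, the set of source tables would come from the freshly parsed dbt manifest rather than being passed in by hand.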