pudl.validate.integrity#
Database integrity validation checks for PUDL data.
This module implements checks for structural database constraints such as foreign key relationships. These checks are applied after all data has been loaded into the database, since the parallel nature of the ETL pipeline means that foreign key constraints cannot be enforced during loading. As these checks are migrated into dbt, this module should shrink accordingly.
Attributes#
Exceptions#
ForeignKeyError | Raised when data in a database violates a foreign key constraint.
ForeignKeyErrors | Raised when data in a database violate multiple foreign key constraints.
Functions#
_get_fk_list | Retrieve a dataframe of foreign keys for a table.
check_foreign_keys | Check foreign key relationships in the database.
Module Contents#
- exception pudl.validate.integrity.ForeignKeyError(child_table: str, parent_table: str, foreign_key: str, rowids: list[int])[source]#
Bases: sqlalchemy.exc.SQLAlchemyError
Raised when data in a database violates a foreign key constraint.
- exception pudl.validate.integrity.ForeignKeyErrors(fk_errors: list[ForeignKeyError])[source]#
Bases: sqlalchemy.exc.SQLAlchemyError
Raised when data in a database violate multiple foreign key constraints.
- pudl.validate.integrity._get_fk_list(engine: sqlalchemy.Engine, table: str) → pandas.DataFrame[source]#
Retrieve a dataframe of foreign keys for a table.
Description of the foreign_key_list PRAGMA from the SQLite docs: ‘This pragma returns one row for each foreign key constraint created by a REFERENCES clause in the CREATE TABLE statement of table “table-name”.’
In practice, the PRAGMA returns one row for each field in a foreign key constraint, so a constraint spanning several fields produces several rows. This function collapses foreign keys with multiple fields into one record for readability.
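The collapsing step can be illustrated with a minimal sketch. It is not the PUDL implementation: it uses the standard library `sqlite3` module rather than a SQLAlchemy engine and pandas, and the helper name `collapse_fk_list`, the record keys, and the `parent`, `other`, and `child` tables are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def collapse_fk_list(conn: sqlite3.Connection, table: str) -> list[dict]:
    """Return one record per foreign key constraint on ``table`` (hypothetical helper)."""
    # foreign_key_list columns: id, seq, table, from, to, on_update, on_delete, match.
    # The table name is interpolated directly, so it must come from a trusted source.
    rows = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()

    # Rows that share an id belong to the same (possibly multi-field) constraint.
    grouped = defaultdict(list)
    for fk_id, seq, parent, col_from, col_to, *_ in rows:
        grouped[(fk_id, parent)].append((seq, col_from, col_to))

    records = []
    for (fk_id, parent), fields in sorted(grouped.items()):
        fields.sort()  # restore field order within the constraint using seq
        records.append(
            {
                "fk_id": fk_id,
                "parent_table": parent,
                # Assumes the parent columns are named explicitly in REFERENCES.
                "child_columns": ",".join(f[1] for f in fields),
                "parent_columns": ",".join(f[2] for f in fields),
            }
        )
    return records


# Hypothetical schema: one two-field foreign key and one single-field foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (a INTEGER, b INTEGER, PRIMARY KEY (a, b));
    CREATE TABLE other (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        a INTEGER,
        b INTEGER,
        other_id INTEGER,
        FOREIGN KEY (a, b) REFERENCES parent (a, b),
        FOREIGN KEY (other_id) REFERENCES other (id)
    );
    """
)
fk_records = collapse_fk_list(conn, "child")
```

Here the PRAGMA yields three rows for `child` (two for the composite key, one for `other_id`), while `fk_records` holds two entries, one per constraint.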
- pudl.validate.integrity.check_foreign_keys(engine: sqlalchemy.Engine)[source]#
Check foreign key relationships in the database.
The order in which assets are loaded into the database does not satisfy foreign key constraints, so the constraints can’t be enforced during loading. However, once all of the data has been loaded, we can check for foreign key failures using the foreign_key_check and foreign_key_list PRAGMAs.
You can learn more about the PRAGMAs in the SQLite docs.
- Raises:
ForeignKeyErrors – if data in the database violate foreign key constraints.
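As a rough illustration of the load-then-check pattern described above, the following sketch runs `PRAGMA foreign_key_check` after inserting data with enforcement turned off, then raises one aggregated exception. It is an assumption-laden stand-in, not the PUDL code: it uses `sqlite3` directly instead of a SQLAlchemy engine, and `FKViolation`, `FKViolations`, `check_fks`, and the `plants` and `generators` tables are hypothetical names.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FKViolation:
    """One violated constraint (hypothetical stand-in, not PUDL's ForeignKeyError)."""

    child_table: str
    parent_table: str
    fk_id: int
    rowids: list


class FKViolations(Exception):
    """Aggregate of all violated constraints (hypothetical stand-in)."""

    def __init__(self, violations: list):
        self.violations = violations
        super().__init__(f"{len(violations)} foreign key constraint(s) violated")


def check_fks(conn: sqlite3.Connection) -> None:
    """Raise FKViolations if any row breaks a declared foreign key."""
    # foreign_key_check columns: child table, rowid, parent table, constraint id.
    grouped = defaultdict(list)
    for child, rowid, parent, fk_id in conn.execute("PRAGMA foreign_key_check"):
        grouped[(child, parent, fk_id)].append(rowid)
    if grouped:
        raise FKViolations(
            [FKViolation(c, p, i, ids) for (c, p, i), ids in grouped.items()]
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    PRAGMA foreign_keys = OFF;  -- enforcement stays off while loading
    CREATE TABLE plants (plant_id INTEGER PRIMARY KEY);
    CREATE TABLE generators (
        generator_id INTEGER PRIMARY KEY,
        plant_id INTEGER REFERENCES plants (plant_id)
    );
    INSERT INTO plants VALUES (1);
    INSERT INTO generators VALUES (10, 1), (11, 2);  -- plant 2 does not exist
    """
)

try:
    check_fks(conn)
    caught = None
except FKViolations as err:
    caught = err
```

Grouping by child table, parent table, and constraint id keeps the report to one entry per broken relationship rather than one per offending row.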