pypeh is a lightweight ETL and data-ops toolkit for Personal Exposure and Health (PEH) data.
It helps you:
- work with PEH-model resources in Python
- load/transform/validate PEH study data
- support FAIR data workflows (findable, accessible, interoperable, reusable)
The toolkit is built to interact with the PEH model from PARC.
Core package:
uv pip install pypeh
With dataframe adapter extras (Polars-based workflows):
uv pip install "pypeh[dataframe-adapter]"

from pypeh import Session
# Start a session
session = Session()
# Load PEH model resources (e.g. YAML configs) into cache
session.load_persisted_cache(source="config")
# Load tabular data as a DatasetSeries using a DataImportConfig from cache
data_import_config = session.cache.get("<data_import_config_id>", "DataImportConfig")
dataset_series = session.load_tabular_dataset_series(
    source="my_data.xlsx",
    data_import_config=data_import_config,
)

From there you can use adapters for:
- validation
- enrichment (derived variables)
- aggregation
- export/persistence
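
Before handing data to those adapters, the loading steps above can be wrapped in a small helper. The following is a minimal sketch, not part of pypeh: the function name `load_dataset_series` and its parameters are hypothetical, and it only uses the `Session` calls shown in the quick start. It also assumes that `session.cache.get` returns `None` for an unknown id, which is not confirmed here; if pypeh raises instead, the explicit check is simply never reached.

```python
from typing import Any


def load_dataset_series(
    session: Any,
    data_path: str,
    config_id: str,
    config_source: str = "config",
) -> Any:
    """Load a DatasetSeries using the Session calls from the quick start.

    `session` is expected to be a `pypeh.Session`; it is typed as `Any`
    so this sketch does not require pypeh to be importable.
    """
    # Load PEH model resources (e.g. YAML configs) into the session cache.
    session.load_persisted_cache(source=config_source)

    # Look up the import configuration by id and type name.
    data_import_config = session.cache.get(config_id, "DataImportConfig")

    # Assumption: a missing id yields None rather than raising.
    if data_import_config is None:
        raise LookupError(f"No DataImportConfig with id {config_id!r} in cache")

    # Read the tabular file according to the import configuration.
    return session.load_tabular_dataset_series(
        source=data_path,
        data_import_config=data_import_config,
    )
```

With pypeh installed, this would be called as `load_dataset_series(Session(), "my_data.xlsx", "<data_import_config_id>")`. Passing the session in as an argument keeps the helper easy to exercise with a stand-in object in tests.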
To run the test suites:
make test-core
make test-dataframe
make test-rocrate