Data Management
PsDataManager is the central data store. It extends Python's dict, using composite tuple keys to store PsData objects.
Creating a Manager
from psPlotKit.data_manager.ps_data_manager import PsDataManager
dm = PsDataManager("my_sweep_results.h5")
Registering Data Keys
Before loading, tell the manager which keys to import:
dm.register_data_key(
    file_key="fs.costing.LCOW",  # key in the .h5 file
    return_key="LCOW",           # your short name
    units="USD/m**3",            # optional: convert to these units
)
dm.register_data_key(
    "fs.water_recovery",
    "recovery",
    assign_units="%",            # assign units without conversion
)
register_data_key Parameters
| Parameter | Description |
|---|---|
| file_key | Key path in the HDF5/JSON file |
| return_key | Short name used to reference the data |
| units | Convert imported data to these units |
| assign_units | Assign units without converting |
| conversion_factor | Manual scaling factor |
| directories | Restrict to specific directories |
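The difference between the three unit-related parameters can be illustrated with a small standalone sketch. It does not use psPlotKit: the Quantity class, the helper functions, and the hard-coded conversion table are all hypothetical, and the assumption that a manual factor leaves the unit label unchanged is ours, not something the library documents here.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a loaded data series; not psPlotKit's PsData.
@dataclass
class Quantity:
    values: list
    units: str


# Tiny hard-coded conversion table, for illustration only.
_FACTORS = {("kg/s", "g/s"): 1000.0, ("m**3/s", "L/s"): 1000.0}


def convert_units(q, new_units):
    """Rescale the values and change the unit label (the idea behind units=)."""
    factor = _FACTORS[(q.units, new_units)]
    return Quantity([v * factor for v in q.values], new_units)


def assign_units(q, new_units):
    """Change only the unit label; values stay as they are (assign_units=)."""
    return Quantity(list(q.values), new_units)


def scale(q, conversion_factor):
    """Multiply values by a manual factor; label assumed to stay unchanged."""
    return Quantity([v * conversion_factor for v in q.values], q.units)


converted = convert_units(Quantity([0.5, 1.0], "kg/s"), "g/s")
relabelled = assign_units(Quantity([50.0, 75.0], "dimensionless"), "%")
scaled = scale(Quantity([0.5, 0.75], "dimensionless"), 100)
```

In this sketch only convert_units changes both the numbers and the label; assign_units is the right choice when the stored numbers are already in the target units but carry the wrong (or no) label.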
Loading Data
load_data performs three steps:
- Import: reads data from files for all registered keys
- Check import status: verifies all keys were found (controllable via check_import_status)
- Evaluate expressions: computes any registered expressions (controllable via evaluate_expressions)
# Warn on missing keys instead of raising
dm.load_data(raise_error=False)
# Skip import checking
dm.load_data(check_import_status=False)
# Skip expression evaluation
dm.load_data(evaluate_expressions=False)
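To make the role of the three flags concrete, here is a minimal sketch of an import, check, evaluate pipeline in plain Python. It is not psPlotKit's implementation: load_data_sketch, the in-memory source dict, and the expressions mapping of names to callables are hypothetical stand-ins.

```python
def load_data_sketch(source, registered, expressions,
                     raise_error=True,
                     check_import_status=True,
                     evaluate_expressions=True):
    """Hypothetical three-step load: import, check, evaluate."""
    # Step 1: import every registered key that exists in the source.
    data = {ret: source[fkey] for fkey, ret in registered.items() if fkey in source}

    # Step 2: optionally verify that every registered key was found.
    if check_import_status:
        missing = [ret for ret in registered.values() if ret not in data]
        if missing:
            if raise_error:
                raise KeyError(f"keys not found: {missing}")
            print(f"warning: keys not found: {missing}")

    # Step 3: optionally compute derived values from the imported data.
    if evaluate_expressions:
        for name, func in expressions.items():
            data[name] = func(data)
    return data


source = {"fs.costing.LCOW": [0.5, 0.6], "fs.water_recovery": [50, 75]}
registered = {"fs.costing.LCOW": "LCOW", "fs.water_recovery": "recovery"}
expressions = {"recovery_fraction": lambda d: [v / 100 for v in d["recovery"]]}

loaded = load_data_sketch(source, registered, expressions)
```

With raise_error=False the sketch prints a warning and carries on with whatever was imported, which mirrors the "warn instead of raise" behaviour described above.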
Composite Tuple Keys
Data is stored under composite tuple keys built from directory labels and data keys:
- Single-directory files: ("LCOW",) or simply "LCOW"
- Multi-directory files: (("erd_type", "pressure_exchanger"), "membrane_cost", "LCOW")
Inspecting Data
dm.display() # all (directory, data_key) entries
dm.display_keys() # unique data keys only
dm.display_directories() # unique directory keys only
Accessing Data
Adding Computed Data
Selecting Data for Plotting
dm.select_data(["LCOW", "recovery"])
selected = dm.get_selected_data() # dict for plotters
dm.clear_selected_data() # reset selection
Selection Parameters
| Parameter | Description |
|---|---|
| selected_keys | List of key names to select |
| require_all_in_dir | Only include directories that contain all selected keys |
| exact_keys | Require exact key match |
| add_to_existing | Append to the current selection |
| return_all_if_non_found | Fall back to all data if no match is found |
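The effect of require_all_in_dir is easiest to see on a toy example. The sketch below is an assumption about the intended behaviour, written against a plain dict of (directory, data_key) entries; select_sketch and the stage_1/stage_2 labels are hypothetical and not part of psPlotKit.

```python
def select_sketch(store, selected_keys, require_all_in_dir=False):
    """Hypothetical selection over {(*directory, data_key): value} entries."""
    wanted = set(selected_keys)
    picked = {k: v for k, v in store.items() if k[-1] in wanted}
    if require_all_in_dir:
        # group the matched data keys by directory
        found = {}
        for key in picked:
            found.setdefault(key[:-1], set()).add(key[-1])
        # keep only directories where every requested key is present
        picked = {k: v for k, v in picked.items() if found[k[:-1]] == wanted}
    return picked


store = {
    ("stage_1", "LCOW"): [0.5, 0.6],
    ("stage_1", "recovery"): [50, 60],
    ("stage_2", "LCOW"): [0.7, 0.8],  # no "recovery" in this directory
}

loose = select_sketch(store, ["LCOW", "recovery"])
strict = select_sketch(store, ["LCOW", "recovery"], require_all_in_dir=True)
```

Here loose keeps all three entries, while strict drops stage_2 because it lacks recovery. That is useful when a plot needs both x and y data from every directory it draws.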
Reducing / Stacking Data
Combine data across directories by stacking data from directories that share stack_keys and then applying a reduction ("min", "max", or "unique").
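One plausible reading of stack-then-reduce is sketched below in plain Python. It assumes that stacking places one equal-length series per directory side by side, that "min" and "max" reduce across directories at each sweep point, and that "unique" collects the distinct values of the whole stack. These semantics are assumptions for illustration; stack_and_reduce is a hypothetical helper, not the library's method.

```python
def stack_and_reduce(series_by_dir, reduction):
    """Stack equal-length series (one per directory), then reduce them."""
    stacked = list(series_by_dir.values())   # rows: one per directory
    columns = list(zip(*stacked))            # columns: one per sweep point
    if reduction == "min":
        return [min(col) for col in columns]
    if reduction == "max":
        return [max(col) for col in columns]
    if reduction == "unique":
        return sorted({v for row in stacked for v in row})
    raise ValueError(f"unknown reduction: {reduction!r}")


lcow = {
    "pressure_exchanger": [0.45, 0.50, 0.60],
    "other_erd": [0.48, 0.47, 0.60],  # made-up second directory
}

best = stack_and_reduce(lcow, "min")
worst = stack_and_reduce(lcow, "max")
distinct = stack_and_reduce(lcow, "unique")
```

Under these assumptions, "min" gives the lowest LCOW achievable at each sweep point across all stacked directories.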
Normalizing Data
Evaluating Custom Functions
dm.eval_function(
    directory=dir_key,
    name="custom_calc",
    function=my_function,
    function_dict={"x": "LCOW", "y": "recovery"},
    units="dimensionless",
)
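The call above suggests that function_dict maps the function's argument names to data keys. Assuming that reading, the lookup-and-call step could look like the sketch below; eval_function_sketch, the body of my_function, and the sample data are hypothetical and stand in for whatever the manager does internally.

```python
def eval_function_sketch(data, function, function_dict):
    """Call function with keyword arguments looked up from data."""
    # {"x": "LCOW"} means: pass data["LCOW"] as the argument named x
    kwargs = {arg: data[data_key] for arg, data_key in function_dict.items()}
    return function(**kwargs)


def my_function(x, y):
    # example calculation: scale x by the fractional value of y
    return [xi * yi / 100 for xi, yi in zip(x, y)]


data = {"LCOW": [0.5, 1.0], "recovery": [50, 75]}
result = eval_function_sketch(data, my_function, {"x": "LCOW", "y": "recovery"})
```

Keeping the mapping separate from the function means the same my_function can be reused with differently named data keys.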
Exporting Data to CSV
You can export all loaded data to CSV files directly from the manager with export_data_to_csv.
The export behaviour depends on how many directories the manager contains:
- Single directory: writes one CSV file. If the path doesn't end in .csv, the extension is appended automatically (e.g. "results" → results.csv).
- Multiple directories: creates a folder and writes one CSV per directory. If the path ends in .csv, the extension is stripped to form the folder name (e.g. "results.csv" → results/).
Column headers are built from each data key's label and units (e.g. LCOW (USD/m**3)).
# single directory — creates results.csv
dm.export_data_to_csv("results")
# multiple directories — creates output/ folder with one CSV per directory
dm.export_data_to_csv("output")
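The path rules described above can be written down as a small standalone function. This is a sketch of the stated behaviour using pathlib, not the exporter's actual code; resolve_export_target and column_header are hypothetical names.

```python
from pathlib import Path


def resolve_export_target(path, n_directories):
    """Return the file or folder path implied by the export rules."""
    p = Path(path)
    if n_directories <= 1:
        # single directory: make sure the file name ends in .csv
        return p if p.suffix == ".csv" else p.with_name(p.name + ".csv")
    # multiple directories: strip .csv so the path names a folder
    return p.with_suffix("") if p.suffix == ".csv" else p


def column_header(label, units):
    """Build a header such as 'LCOW (USD/m**3)' from label and units."""
    return f"{label} ({units})"


single_file = resolve_export_target("results", n_directories=1)
folder = resolve_export_target("results.csv", n_directories=3)
header = column_header("LCOW", "USD/m**3")
```

The same path string can therefore yield either a file or a folder, depending only on how many directories the manager holds at export time.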
You can also use the PsDataExporter class directly for more control.