Expressions¶

psPlotKit provides an expression tree API for deriving new data from imported keys using Python arithmetic operators.

Overview¶

Instead of writing expression strings, you build expression trees with the ExpressionNode / ExpressionKeys API. Expressions are registered on the PsDataManager and can be evaluated:

Automatically when load_data() finishes, or
Automatically at registration time when data is already present (auto_evaluate_expressions = True, the default), or
Manually by calling evaluate_expressions() when you need full control over timing.

Quick Example¶

from psPlotKit.data_manager.ps_data_manager import PsDataManager

dm = PsDataManager("my_sweep_results.h5")
dm.register_data_key("fs.costing.LCOW", "LCOW")
dm.register_data_key("fs.water_recovery", "recovery")

ek = dm.get_expression_keys()
dm.register_expression(ek.LCOW / ek.recovery, return_key="cost_per_recovery")

dm.load_data()
# "cost_per_recovery" is now available in every directory

Building Expressions Step-by-Step¶

1. Obtain ExpressionKeys¶

ek = dm.get_expression_keys()

get_expression_keys() returns a live reference — any keys you add later via add_data() or register_data_key() are automatically visible on the same ek object. You never need to call get_expression_keys() a second time.

ek = dm.get_expression_keys()

dm.register_data_key("fs.pressure", "pressure")
# ek.pressure is already accessible — no need to re-fetch ek
dm.register_expression(ek.pressure * 0.001, return_key="pressure_kPa")

2. Access Keys¶

There are three ways to reference a key on the ExpressionKeys object:

Attribute access (simple string keys)¶

ek.LCOW        # returns an ExpressionNode for "LCOW"
ek.recovery    # returns an ExpressionNode for "recovery"

Attribute access (sanitised names)¶

Keys that contain special characters or are tuples are automatically converted into valid Python identifiers:

Original key	Safe attribute	Rule
`"LCOW (m**3)"`	`ek.LCOW_m_3`	Non-alphanumeric chars → `_`
`"Ca_2+"`	`ek.Ca_2`	Trailing `+` stripped
`("group", "a")`	`ek.group_a`	Tuple parts joined with `_`

Use ek.print_mapping() to log the full mapping of original keys to safe attribute names.

Item access (original key)¶

Item access always works with the original key — no sanitisation needed:

ek["Ca_2+"]             # string key with special characters
ek["LCOW (m**3)"]       # string key with parentheses
ek["group", "a"]        # tuple key (Python auto-packs the tuple)

Tip

When a key has special characters, item access is the safest and most readable approach.

3. Combine with Arithmetic Operators¶

Standard Python operators produce a new ExpressionNode tree:

ratio     = ek.LCOW / ek.recovery
scaled    = 100 * (ek.LCOW + ek.LCOW) ** 2 / ek.recovery
negated   = -ek.LCOW
with_const = ek.recovery + 0.05

4. Register the Expression¶

dm.register_expression(
    ek.LCOW / ek.recovery,
    return_key="cost_per_recovery",
)

# Optionally set units on the result
dm.register_expression(
    ek.LCOW * ek.recovery,
    return_key="weighted_metric",
    assign_units="USD",
)

# Or convert the result to specific units
dm.register_expression(
    ek.LCOW * 1000,
    return_key="LCOW_per_kgal",
    units="USD/kgal",
)

5. Evaluate¶

See Evaluation Timing below for full details.

Evaluation Timing¶

Automatic on `load_data()` (default workflow)¶

When you register expressions before calling load_data(), they are evaluated automatically at the end of the load:

dm.register_data_key("fs.costing.LCOW", "LCOW")
ek = dm.get_expression_keys()
dm.register_expression(ek.LCOW * 100, return_key="scaled_LCOW")

dm.load_data()       # evaluates all registered expressions
dm.get_data(dir_key, "scaled_LCOW")  # ready to use

Automatic on `register_expression()` (when data exists)¶

If data has already been loaded into the manager, calling register_expression() immediately evaluates all pending expressions so the result is available right away:

dm.load_data()

ek = dm.get_expression_keys()
dm.register_expression(ek.LCOW / ek.recovery, return_key="ratio")
# "ratio" is available immediately — no extra call needed

This behaviour is controlled by the auto_evaluate_expressions flag (default True).

Manual evaluation¶

Set auto_evaluate_expressions = False to take full control:

dm.auto_evaluate_expressions = False

ek = dm.get_expression_keys()
dm.register_expression(ek.LCOW + ek.recovery, return_key="sum_lr")
dm.register_expression(ek.LCOW - ek.recovery, return_key="diff_lr")

dm.load_data(evaluate_expressions=False)  # skip auto-eval in load_data too

# ... do other work ...

dm.evaluate_expressions()   # evaluate when you are ready

Key Name Sanitisation¶

How it works¶

ExpressionKeys converts every registered key into a valid Python identifier so it can be used as an attribute name. The rules are:

Tuple keys are joined into a single string with _ separators.
Characters that are not alphanumeric or _ are replaced with _.
Consecutive underscores are collapsed; leading/trailing underscores are stripped.
If the result starts with a digit, a leading _ is added.

Collision handling¶

When two different keys produce the same safe name (e.g. "Ca_2+" and "Ca_2" both collapse to Ca_2), numeric suffixes _1, _2, … are appended to both names to keep them unique.

Inspecting the mapping¶

ek = dm.get_expression_keys()
ek.print_mapping()

This logs one line per key at INFO level:

ExpressionKeys mapping (3 keys):
  'Ca_2' -> .Ca_2_1
  'Ca_2+' -> .Ca_2_2  (or ek['Ca_2+'])
  'LCOW' -> .LCOW

You can also enable a per-key warning at creation time:

ek = dm.get_expression_keys(warn_on_sanitize=True)

Iteration¶

ExpressionKeys is iterable and exposes len():

for key in ek:
    print(key)          # prints the original (unsanitised) key

len(ek)                 # number of registered keys

Supported Operators¶

Operator	Example	Description
`+`	`ek.a + ek.b`	Addition
`-`	`ek.a - ek.b`	Subtraction
`*`	`ek.a * ek.b`	Multiplication
`/`	`ek.a / ek.b`	Division
`**`	`ek.a ** 2`	Power
`-` (unary)	`-ek.a`	Negation

All operators work with:

Two ExpressionNode objects
An ExpressionNode and a numeric constant (either order)

Working with Tuple Keys¶

Tuple return-keys are common when data is grouped by parameter:

dm.add_data("dir_a", ("membrane", "cost"), cost_data)
dm.add_data("dir_a", ("membrane", "area"), area_data)

ek = dm.get_expression_keys()

# Attribute access — tuple parts joined with _
total = ek.membrane_cost * ek.membrane_area

# Or item access with the original tuple
total = ek["membrane", "cost"] * ek["membrane", "area"]

dm.register_expression(total, return_key="total_membrane_cost")

Working with Special-Character Keys¶

dm.add_data("dir_a", "Ca_2+", ca_data)
dm.add_data("dir_a", "SO4_2-", so4_data)

ek = dm.get_expression_keys()
ek.print_mapping()       # check the safe attribute names

# Item access is simplest for these keys
ratio = ek["Ca_2+"] / ek["SO4_2-"]
dm.register_expression(ratio, return_key="ca_so4_ratio")

How It Works Under the Hood¶

ExpressionKeys.__getattr__ / __getitem__ returns an ExpressionNode leaf referencing a data key (preserving the original key).
Arithmetic operators create new ExpressionNode internal nodes storing op, left, and right.
ExpressionNode.required_keys traverses the tree to collect all referenced data keys.
PsDataManager.evaluate_expressions() iterates over every directory, builds a {key → PsData} dict for the required keys, and calls ExpressionNode.evaluate(data_dict) which recursively evaluates the tree using PsData.data_with_units (unit propagation is handled by the quantities library).
The result is wrapped in a new PsData and stored under the expression's return_key.

Direct PsData Arithmetic¶

You can also perform arithmetic directly on PsData objects outside the expression system:

dir_key = dm.directory_keys[0]
lcow = dm.get_data(dir_key, "LCOW")
recovery = dm.get_data(dir_key, "recovery")

ratio = lcow / recovery
double = lcow + lcow
scaled = lcow * 100
inverted = 1.0 / recovery
squared = lcow ** 2
negative = -lcow

# Results carry descriptive keys
print(ratio.data_key)  # "(LCOW / recovery)"