Checkpoint loading broken for old files after TensorProductElement family change #4998

@sghelichkhani

The recent finat change (firedrakeproject/fiat@b0646c25) and the accompanying firedrake commit (1bd729a) changed how TensorProductElement reports its family. When all sub-elements share the same family, the TensorProductElement (TPE) now reports that shared family instead of "TensorProductElement". So TensorProductElement(CG2, CG2) on an extruded mesh now has .family() == "Lagrange" rather than "TensorProductElement".
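To make the behaviour change concrete, here is a minimal pure-Python sketch of the two reporting rules. It does not call finat or UFL: `tpe_family` and its `inherit_shared_family` flag are hypothetical names used only for illustration, and the mixed-family case is an assumption inferred from the description above.

```python
def tpe_family(sub_families, inherit_shared_family):
    """Toy model of the family string a tensor-product element reports.

    Hypothetical helper, not part of finat/UFL. `inherit_shared_family`
    switches between the old rule (False) and the new rule (True).
    """
    unique = set(sub_families)
    if inherit_shared_family and len(unique) == 1:
        # New rule: all sub-elements agree, so report their family.
        return next(iter(unique))
    # Old rule, and (assumed) the new rule when the families differ.
    return "TensorProductElement"


cg2_x_cg2 = ["Lagrange", "Lagrange"]  # stands in for TensorProductElement(CG2, CG2)

old_family = tpe_family(cg2_x_cg2, inherit_shared_family=False)
new_family = tpe_family(cg2_x_cg2, inherit_shared_family=True)
mixed_family = tpe_family(
    ["Lagrange", "Discontinuous Lagrange"], inherit_shared_family=True
)

print(old_family, "|", new_family, "|", mixed_family)
```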

This breaks loading checkpoint files saved before the change if they contain CG or DG functions on extruded meshes. In firedrake/embedding.py, get_embedding_element_for_checkpointing checks element.family() against native_elements_for_checkpointing. Old code saw "TensorProductElement" (not native), so these functions were stored with DG embedding. New code sees "Lagrange" (native), tries to load directly, but the file still has DG-embedded data. The assertion on line 1502 of checkpointing.py then fails because the recomputed embedding (CG) doesn't match what's actually stored (DG).
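The save/load mismatch can be sketched without Firedrake as follows. This is a simplified model, not the code in firedrake/embedding.py: `NATIVE_FAMILIES` is a hypothetical stand-in for native_elements_for_checkpointing, and `embedding_family` only mimics the native-or-DG decision described above.

```python
# Hypothetical stand-in for native_elements_for_checkpointing.
NATIVE_FAMILIES = {"Lagrange", "Discontinuous Lagrange"}


def embedding_family(reported_family):
    """Toy version of the embedding choice: native families are stored
    as-is, everything else is embedded in DG."""
    if reported_family in NATIVE_FAMILIES:
        return reported_family
    return "Discontinuous Lagrange"


# Save time (old code): the TPE reported "TensorProductElement".
stored = embedding_family("TensorProductElement")

# Load time (new code): the same element now reports "Lagrange".
recomputed = embedding_family("Lagrange")

try:
    # Mimics the consistency check that fails when loading old files.
    assert recomputed == stored, f"expected {recomputed!r}, file has {stored!r}"
    load_failed = False
except AssertionError as err:
    load_failed = True
    print("load would fail:", err)
```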

We hit this in G-ADOPT with several checkpoint files on extruded cylindrical meshes and had to regenerate all of them. I'd expect this to bite anyone who has old checkpoints from extruded mesh simulations.

For a fix: if PREFIX_EMBEDDED is present in the HDF5 group, the code could skip recomputing the expected embedding and just proceed with the project/interpolate step, rather than asserting against a freshly computed embedding element that no longer matches old files. Separately, get_embedding_element_for_checkpointing could be updated to handle TensorProductElement instances that now report inherited families, so the save path is also correct going forward.
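A rough sketch of the proposed load-side logic, again in plain Python rather than against the real checkpointing.py. The value of `PREFIX_EMBEDDED`, the function `choose_load_path`, and the idea that the stored embedding can be read back from the file are all assumptions made for illustration; the HDF5 group is modelled as a plain list of key names.

```python
PREFIX_EMBEDDED = "embedded"  # hypothetical value; the real constant may differ


def choose_load_path(group_keys, recomputed_embedding, stored_embedding):
    """Decide how to load a function from a (modelled) HDF5 group.

    If the group contains embedded data, trust what the file says and go
    through the project/interpolate step instead of asserting that a
    freshly computed embedding matches.
    """
    has_embedded = any(key.startswith(PREFIX_EMBEDDED) for key in group_keys)
    if has_embedded:
        return ("project_from_embedding", stored_embedding)
    return ("load_directly", recomputed_embedding)


# Old file: data was written with a DG embedding.
old_file = choose_load_path(
    ["embedded_f"],
    recomputed_embedding="Lagrange",
    stored_embedding="Discontinuous Lagrange",
)

# New file: the element was treated as native and written directly.
new_file = choose_load_path(
    ["f"],
    recomputed_embedding="Lagrange",
    stored_embedding=None,
)

print(old_file)
print(new_file)
```

With this branching, old files keep loading through their stored DG embedding, while files written after the change take the direct path.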
