Performance snapshot: benchmarks run on Derecho/GLADE.
Topo.set_from_dataset()
The cost of loading GEBCO bathymetry and regridding it onto a model grid scales with domain extent rather than destination resolution. Before regridding, the pipeline slices GEBCO to the model bounding box, so a larger geographic region means more source data to read and interpolate, regardless of how fine the destination grid is. A 5°×5° domain takes 11.1 s; a 40°×40° domain takes 16.0 min. All benchmark grids are generated at 0.1° resolution, so in these runs larger domains also produce larger destination grids. Memory usage averages ~18.9 GB.
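As a rough check on how the two quoted timings relate to domain size, the sketch below fits a power law through them. It assumes square domains at 0.1° spacing, as stated above, and treats two data points as enough for an indicative exponent, which is a strong simplification. The function name `scaling_exponent` is illustrative and not part of mom6_forge.

```python
import math


def scaling_exponent(size_a_deg, time_a_s, size_b_deg, time_b_s):
    """Exponent p in time ~ area**p, estimated from two (size, time) points."""
    area_ratio = (size_b_deg / size_a_deg) ** 2
    time_ratio = time_b_s / time_a_s
    return math.log(time_ratio) / math.log(area_ratio)


# Timings quoted above: 5°×5° in 11.1 s, 40°×40° in 16.0 min.
small_deg, small_s = 5, 11.1
large_deg, large_s = 40, 16.0 * 60

exponent = scaling_exponent(small_deg, small_s, large_deg, large_s)

# Destination points at 0.1° spacing, for context (assumes square grids).
points_small = round(small_deg / 0.1) ** 2
points_large = round(large_deg / 0.1) ** 2

print(f"area ratio: {(large_deg / small_deg) ** 2:.0f}x")
print(f"time ratio: {large_s / small_s:.1f}x")
print(f"exponent:   {exponent:.2f}")
print(f"destination points: {points_small} -> {points_large}")
```

The time ratio exceeds the area ratio, so the exponent comes out slightly above 1: between these two sizes the cost grows a little faster than the domain area.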
xe.Regridder() / raw ESMF
Computing interpolation weights (the one-time setup cost before any regridding can happen) scales with grid size. For a 300×300 source → 150×150 destination grid, bilinear weight generation takes 655 ms; scaling up to 1500×700 → 700×350 takes 10.6 s. Conservative interpolation takes roughly 1.6× longer than bilinear at the same grid size. Raw ESMF weight generation costs about the same (0.96× the xESMF time for the same grid pair), which indicates that xESMF's overhead is negligible: it is a thin Python wrapper around the same underlying ESMF library. The weights themselves occupy 225 MB–1.8 GB of resident memory (RSS).
When the destination is a boundary line of points rather than a full grid (the pattern used for open-boundary conditions), weight generation is substantially faster: 46 ms–6.2 s across the tested source grid sizes. This is because the destination has far fewer points than a full 2-D grid of similar extent.
regridder(ds)
Once weights are built, applying them to a data array is fast regardless of grid size. Using xESMF, a single timestep on a small grid takes 2 ms; 60 timesteps on the largest tested grid take 117 ms. The cost is dominated by the number of destination points and timesteps, not the source grid size. On average, nearest_s2d is 2.4× faster than bilinear during application.
Raw ESMF (no xarray) applies the same pre-computed weights considerably faster: 37 μs and 62 ms for the same two cases, roughly 46× and 1.9× faster respectively. The gap is largest for single timesteps, where xESMF's xarray Dataset → numpy conversion and index alignment dominate; it narrows for many timesteps on large grids, where the actual interpolation work takes over. In practice xESMF is used because it handles multi-variable, dask-backed arrays transparently; the overhead is a deliberate trade-off for API convenience.
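The split between building weights once and applying them many times can be illustrated with a small pure-Python sketch. It uses 1-D linear interpolation as a stand-in for bilinear regridding and stores the weights as (destination index, source index, weight) triples, which is conceptually similar to a sparse weight matrix. The function names are hypothetical, and this is not a description of how xESMF or ESMF work internally.

```python
from bisect import bisect_right


def build_linear_weights(src_x, dst_x):
    """One-time setup: sparse (dst_index, src_index, weight) triples.

    Assumes src_x is sorted ascending and every dst_x lies within its range.
    """
    weights = []
    for j, x in enumerate(dst_x):
        # Index of the source interval [src_x[i], src_x[i + 1]] containing x.
        i = min(bisect_right(src_x, x) - 1, len(src_x) - 2)
        frac = (x - src_x[i]) / (src_x[i + 1] - src_x[i])
        weights.append((j, i, 1.0 - frac))
        weights.append((j, i + 1, frac))
    return weights


def apply_weights(weights, field, n_dst):
    """Cheap repeated step: one multiply-add per stored weight."""
    out = [0.0] * n_dst
    for j, i, w in weights:
        out[j] += w * field[i]
    return out


src_x = [0.0, 1.0, 2.0, 3.0, 4.0]
dst_x = [0.5, 1.25, 3.0]

weights = build_linear_weights(src_x, dst_x)  # built once

# Reuse the same weights for every timestep.
timesteps = [
    [0.0, 10.0, 20.0, 30.0, 40.0],
    [5.0, 5.0, 5.0, 5.0, 5.0],
]
regridded = [apply_weights(weights, field, len(dst_x)) for field in timesteps]
print(regridded)
```

In this sketch the number of stored weights is two per destination point, so the work in `apply_weights` grows with destination points times timesteps and does not depend on the source grid length, which is consistent with the behaviour described above.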
gen_rof_maps()
gen_rof_maps() builds ESMF regridding weight files that map river runoff from a land-model (ROF) mesh onto the ocean (OCN) model mesh. These weight files are computed once and reused for every model run. The ROF source mesh is held constant (JRA55) across all pairs; only the OCN destination grid varies, from coarse (1/10°) through fine (1/40°) to a larger regional domain, so timing directly reflects how destination grid size drives cost.
Importing CrocoDash and mom6_forge is fast enough to be negligible in any workflow: CrocoDash.case loads in 7 ms, mom6_forge.topo in 4 ms, and mom6_forge.grid and mom6_forge.vgrid in 2 ms and 1 ms respectively.
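For readers who want to reproduce an import timing on their own machine, here is a minimal sketch that times a cold import in a fresh interpreter. It is an assumption about one reasonable way to measure this, not a description of how the numbers above were collected. The helper `time_cold_import` is hypothetical, and `json` is used as a standard-library stand-in so the example runs without CrocoDash installed.

```python
import subprocess
import sys


def time_cold_import(module_name):
    """Seconds to import module_name in a fresh interpreter (no cached modules)."""
    snippet = (
        "import time\n"
        "start = time.perf_counter()\n"
        f"import {module_name}\n"
        "print(time.perf_counter() - start)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())


# "json" is a stand-in; swap in e.g. "CrocoDash.case" where it is installed.
elapsed = time_cold_import("json")
print(f"json imported in {elapsed * 1e3:.2f} ms")
```

Running the import in a subprocess avoids measuring a module that is already cached in `sys.modules`. For a per-module breakdown, Python's built-in `-X importtime` option is an alternative.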
All checked data sources are accessible. Each method is validated on every benchmark run; the table below shows pass/fail status and how long each validation took.
xESMF and ESMF are external libraries — they are not part of CrocoDash or mom6_forge and their performance will not change from commit to commit on those repos. These benchmarks serve as a stable reference: they tell you how fast the underlying regridding engine is on this machine, independently of any CROC code changes. If these numbers change significantly between runs, suspect a different library version, different node type, or different CPU load — not a regression in CROC code.
| Product / Method | Working? | Link? | Validation time |
|---|---|---|---|
| gebco / get gebco data with python | Yes | Yes | — |
| gebco / get gebco data script | Yes | Yes | 6 ms |
| glofas / get global data with python | Yes | Yes | 443 ms |
| glofas / get processed global glofas script for cli | Yes | Yes | 5 ms |
| glorys / get glorys data from rda | Yes | Yes | 1.3 s |
| glorys / get glorys data from cds api | Yes | Yes | 1.1 min |
| glorys / get glorys data script for cli | Yes | Yes | 6 ms |
| mom6_output / get mom6 data | Yes | Yes | 35.7 s |
| seawifs / get global seawifs script for cli | Yes | Yes | 7 ms |
| seawifs / get processed global seawifs script for cli | Yes | Yes | 7 ms |