SeaSloth Benchmark Report

Summary

Bathymetry — `Topo.set_from_dataset()`

The cost of loading GEBCO bathymetry and regridding it onto a model grid scales with domain extent rather than destination resolution. Before regridding, the pipeline slices GEBCO to the model bounding box, so a larger geographic region means more source data to read and interpolate, regardless of how fine the destination grid is. A 5°×5° domain takes 11.1 s; a 40°×40° domain takes 16.0 min. All grids are generated at 0.1° resolution, so larger domains also produce larger destination grids, and these timings reflect the combined growth of source and destination. Memory usage averages ~18.9 GB.
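To make the scaling concrete, the sketch below counts source and destination cells for the two domains quoted above. It assumes the GEBCO source sits on a 15 arc-second grid (240 cells per degree), which this report does not state, and `cell_counts` is a hypothetical helper rather than part of mom6_forge.

```python
def cell_counts(extent_deg, src_cells_per_deg=240, dst_res_deg=0.1):
    """Return (source_cells, destination_cells) for a square domain.

    src_cells_per_deg=240 assumes a 15 arc-second GEBCO grid (an assumption,
    not stated in the report); dst_res_deg=0.1 is the resolution quoted above.
    """
    src_side = round(extent_deg * src_cells_per_deg)
    dst_side = round(extent_deg / dst_res_deg)
    return src_side ** 2, dst_side ** 2


small_src, small_dst = cell_counts(5)
large_src, large_dst = cell_counts(40)

# Both grids grow with area: 8x the side length means 64x the cells.
print(small_src, small_dst)    # 1440000 2500
print(large_src, large_dst)    # 92160000 160000
print(large_src // small_src)  # 64

# Wall-time ratio computed from the figures above (16.0 min vs 11.1 s).
print(round(16.0 * 60 / 11.1, 1))  # 86.5
```

Under these assumptions the measured time ratio (about 86×) is somewhat larger than the 64× growth in cell count, so the cost grows at least in proportion to domain area over this range.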

Regridding weights — `xe.Regridder()` / raw ESMF

Computing interpolation weights (the one-time setup cost before any regridding can happen) scales with grid size. For a 300×300 source → 150×150 destination grid, bilinear weight generation takes 655 ms; scaling up to 1500×700 → 700×350 takes 10.6 s. Conservative interpolation takes roughly 1.6× longer than bilinear at the same grid size. Raw ESMF weight generation costs about the same (0.96× relative to xESMF for the same grid pair), which indicates that xESMF's overhead is negligible: it is a thin Python wrapper around the same ESMF library. Process RSS during weight generation ranges from 225 MB to 1.8 GB.
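As a rough illustration of what "weights" means here, the following pure-Python sketch builds bilinear weights for a regular grid as sparse (destination, source, weight) triples. It is not ESMF's algorithm, which also handles curvilinear grids, spherical geometry and masking; `bilinear_weights` and the integer-coordinate source grid are assumptions made for the example.

```python
import math


def bilinear_weights(src_ny, src_nx, dst_points):
    """Build sparse bilinear weights as (dst_index, src_index, weight) triples.

    The source grid is assumed to be regular, with nodes at integer
    coordinates (rows 0..src_ny-1, columns 0..src_nx-1). dst_points is a
    list of (y, x) positions lying inside that grid.
    """
    triples = []
    for k, (y, x) in enumerate(dst_points):
        # Lower-left corner of the enclosing source cell, clamped so that
        # points on the upper edge still fall in a valid cell.
        j = min(int(math.floor(y)), src_ny - 2)
        i = min(int(math.floor(x)), src_nx - 2)
        fy, fx = y - j, x - i
        corners = [
            (j, i, (1 - fy) * (1 - fx)),
            (j, i + 1, (1 - fy) * fx),
            (j + 1, i, fy * (1 - fx)),
            (j + 1, i + 1, fy * fx),
        ]
        for jj, ii, w in corners:
            triples.append((k, jj * src_nx + ii, w))
    return triples


# 300x300 source -> 150x150 destination, the smallest case quoted above.
src_ny = src_nx = 300
dst_ny = dst_nx = 150
dst_points = [
    (r * (src_ny - 1) / (dst_ny - 1), c * (src_nx - 1) / (dst_nx - 1))
    for r in range(dst_ny)
    for c in range(dst_nx)
]
weights = bilinear_weights(src_ny, src_nx, dst_points)
print(len(weights))  # four weights per destination point: 90000
```

Each destination point contributes four entries whose weights sum to one, so the size of the weight set is fixed by the destination grid. In a real library the search for the enclosing source cell also depends on the source grid, which is one reason larger grid pairs take longer.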

OBC-style (locstream) weight generation

When the destination is a boundary line of points rather than a full grid (the pattern used for open-boundary conditions), weight generation is substantially faster: 46 ms–6.2 s across the tested source grid sizes. This is because the destination has far fewer points than a full 2-D grid of similar extent.
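The difference in destination size can be illustrated with simple counting. The sketch below compares the number of points in a full 2-D grid with the number on its four edges, using the destination shapes from the gridded benchmarks; treating the open boundary as the full grid perimeter is an assumption for illustration only.

```python
def full_grid_points(ny, nx):
    """Number of points in a full ny x nx destination grid."""
    return ny * nx


def boundary_points(ny, nx):
    """Points on the four edges of an ny x nx grid, corners counted once."""
    return 2 * (ny + nx) - 4


for ny, nx in [(150, 150), (400, 300), (700, 350)]:
    print((ny, nx), full_grid_points(ny, nx), boundary_points(ny, nx))
# (150, 150) 22500 596
# (400, 300) 120000 1396
# (700, 350) 245000 2096
```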

Applying pre-computed weights — `regridder(ds)`

Once weights are built, applying them to a data array is fast at every tested grid size. Using xESMF, a single timestep on a small grid takes 2 ms; 60 timesteps on the largest tested grid take 117 ms. The cost is dominated by the number of destination points and timesteps, not the source grid size. On average, nearest_s2d is 2.4× faster than bilinear during application. Raw ESMF (no xarray) applies the same pre-computed weights considerably faster: 37 μs and 62 ms for the same two cases, roughly 46× and 1.9× faster respectively. The gap is largest for single timesteps, where xESMF's xarray Dataset → numpy conversion and index alignment dominate; it narrows for many timesteps on large grids, where the actual interpolation work takes over. In practice xESMF is used because it handles multi-variable, dask-backed arrays transparently; the overhead is a deliberate trade-off for API convenience.
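Applying pre-computed weights amounts to a sparse matrix-vector product repeated for every timestep. The pure-Python sketch below shows that structure with a hand-written set of weights; `apply_weights` is a hypothetical helper, and real libraries use optimized sparse kernels rather than Python loops.

```python
def apply_weights(triples, n_dst, fields):
    """Apply sparse weights to a stack of flattened source fields.

    triples: iterable of (dst_index, src_index, weight)
    fields:  list of timesteps, each a flat list of source values
    Returns one flat destination list per timestep.
    """
    out = [[0.0] * n_dst for _ in fields]
    for t, src in enumerate(fields):
        dst = out[t]
        for d, s, w in triples:
            dst[d] += w * src[s]
    return out


# Two destination points interpolated from a 4-point source, two timesteps.
triples = [(0, 0, 0.5), (0, 1, 0.5), (1, 2, 0.25), (1, 3, 0.75)]
fields = [[1.0, 3.0, 10.0, 20.0], [2.0, 4.0, 0.0, 8.0]]
print(apply_weights(triples, 2, fields))  # [[2.0, 17.5], [3.0, 6.0]]
```

The work is proportional to the number of weights times the number of timesteps. A nearest-neighbour method needs only one weight per destination point, against four for bilinear, which is consistent with nearest_s2d being faster during application.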

Runoff mapping — `gen_rof_maps()`

`gen_rof_maps()` builds ESMF regridding weight files that map river runoff from a land-model (ROF) mesh onto the ocean (OCN) model mesh. These weight files are computed once and reused for every model run. The ROF source mesh is held constant (JRA55) across all pairs; only the OCN destination grid varies, from coarse (1/10°) through fine (1/40°) to a larger regional domain, so timing directly reflects how destination grid size drives cost.
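The idea of a nearest-neighbour runoff map can be sketched as follows. This is a brute-force illustration, not the algorithm used by `gen_rof_maps()`: `nearest_ocean_cell` is a hypothetical helper, and distance is measured as a squared difference in degrees, ignoring spherical geometry and longitude wrap-around.

```python
def nearest_ocean_cell(rof_points, ocn_points):
    """Map each runoff point to the index of its nearest ocean point.

    Points are (lat, lon) tuples. Distance is the squared difference in
    degrees, a simplification made for this example.
    """
    mapping = []
    for rlat, rlon in rof_points:
        best = min(
            range(len(ocn_points)),
            key=lambda i: (ocn_points[i][0] - rlat) ** 2
            + (ocn_points[i][1] - rlon) ** 2,
        )
        mapping.append(best)
    return mapping


rof = [(10.0, 20.0), (12.1, 24.8)]
ocn = [(8.0, 20.0), (10.4, 20.9), (12.0, 25.0)]
print(nearest_ocean_cell(rof, ocn))  # [1, 2]
```

With the runoff points fixed, the brute-force search cost grows linearly with the number of ocean points, which mirrors the setup above where only the destination grid changes. Production code would normally use a spatial index instead of scanning every candidate.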

Module import times

Importing CrocoDash and mom6_forge is fast enough to be negligible in any workflow: CrocoDash.case loads in 7 ms, mom6_forge.topo in 4 ms, and mom6_forge.grid / mom6_forge.vgrid in 2 ms / 1 ms.
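One way to measure an import time by hand is sketched below. It is an in-process approximation, not necessarily how the benchmark suite measures it; `time_import` is a hypothetical helper, and a standard-library module stands in for the CrocoDash and mom6_forge modules so the example runs anywhere.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return the seconds taken to import module_name in this process.

    The module is dropped from sys.modules first so that its body runs
    again. Dependencies that are already imported stay cached, so this
    tends to underestimate a cold import in a fresh interpreter.
    """
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# "colorsys" is a small standard-library stand-in; substitute for example
# "CrocoDash.case" in an environment where that package is installed.
elapsed = time_import("colorsys")
print(f"colorsys: {elapsed * 1e3:.2f} ms")
```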

Data source availability

All checked data sources are accessible. Each method is validated on every benchmark run; the table below shows pass/fail status and how long each validation took.
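A validation pass of this kind can be sketched as a small wrapper that runs a check, records whether it succeeded, and times it. The names below are hypothetical and the two lambdas are stand-ins for real data-access validators.

```python
import time


def run_validation(name, check):
    """Run one validation callable and record pass/fail plus elapsed time."""
    start = time.perf_counter()
    try:
        ok = bool(check())
    except Exception:
        ok = False  # any exception counts as a failed validation
    return {"name": name, "working": ok, "seconds": time.perf_counter() - start}


# Hypothetical checks standing in for real data-access validators.
checks = {
    "always_ok": lambda: True,
    "raises": lambda: 1 / 0,
}
results = [run_validation(name, check) for name, check in checks.items()]
for row in results:
    print(row)
```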

A note on xESMF and ESMF benchmarks

xESMF and ESMF are external libraries — they are not part of CrocoDash or mom6_forge and their performance will not change from commit to commit on those repos. These benchmarks serve as a stable reference: they tell you how fast the underlying regridding engine is on this machine, independently of any CROC code changes. If these numbers change significantly between runs, suspect a different library version, different node type, or different CPU load — not a regression in CROC code.
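Because a shift in these reference numbers usually points to an environment change, it can help to store library versions and host details next to each run. The sketch below does that with the standard library only; `environment_fingerprint` is a hypothetical helper, and the distribution names are assumptions that may differ between pip and conda installs.

```python
import platform
from importlib import metadata


def environment_fingerprint(packages=("xesmf", "esmpy", "numpy")):
    """Collect library versions and host details to store with a benchmark run.

    The distribution names are assumptions; adjust them to match how the
    packages are installed in your environment.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed under this name
    return {
        "packages": versions,
        "python": platform.python_version(),
        "machine": platform.machine(),
        "node": platform.node(),
    }


print(environment_fingerprint())
```

Comparing two fingerprints is then a quick first check before treating a timing change as meaningful.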

CrocoDash

DataAccessHealth — all products

Product / Method	Working?	Link?	Validation time
gebco / get gebco data with python	Yes	Yes	—
gebco / get gebco data script	Yes	Yes	6 ms
glofas / get global data with python	Yes	Yes	443 ms
glofas / get processed global glofas script for cli	Yes	Yes	5 ms
glorys / get glorys data from rda	Yes	Yes	1.3 s
glorys / get glorys data from cds api	Yes	Yes	1.1 min
glorys / get glorys data script for cli	Yes	Yes	6 ms
mom6_output / get mom6 data	Yes	Yes	35.7 s
seawifs / get global seawifs script for cli	Yes	Yes	7 ms
seawifs / get processed global seawifs script for cli	Yes	Yes	7 ms

CrocoDashImports.time_import

param[0]: CrocoDash.case, mom6_forge.grid, mom6_forge.topo, mom6_forge.vgrid

OBCRegridMerge.time_regrid_and_merge

param[0]: 5, 15, 30

DataAccessLinkCheck.time_link_check

param[0]: tpxo, cesminputdata, gebco, glofas, glorys, mom6_output, seawifs

esmf

ESMFRegridApply.time_apply

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: 1, 12, 60
param[3]: bilinear, nearest_s2d

ESMFRegridApply.track_rss_mb

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: 1, 12, 60
param[3]: bilinear, nearest_s2d

ESMFWeightsGenerate.time_generate_weights

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: bilinear, nearest_s2d

ESMFWeightsGenerate.track_rss_mb

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: bilinear, nearest_s2d

xESMF / ESMF

XESMFRegridApply.time_apply

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: 1, 12, 60
param[3]: bilinear, nearest_s2d

XESMFRegridApply.track_rss_mb

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: 1, 12, 60
param[3]: bilinear, nearest_s2d

XESMFRegridApplyLocstream.time_apply

param[0]: 300×300, 800×600, 1500×700
param[1]: 1000, 10000, 100000
param[2]: 1, 12, 60

XESMFWeightsGenerate.time_generate_weights

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: bilinear, conservative

XESMFWeightsGenerate.track_rss_mb

param[0]: 300×300, 800×600, 1500×700
param[1]: 150×150, 400×300, 700×350
param[2]: bilinear, conservative

XESMFWeightsGenerateLocstream.time_generate_weights

param[0]: 300×300, 800×600, 1500×700
param[1]: 1000, 10000, 100000
param[2]: bilinear, nearest_s2d

XESMFWeightsGenerateLocstream.track_rss_mb

param[0]: 300×300, 800×600, 1500×700
param[1]: 1000, 10000, 100000
param[2]: bilinear, nearest_s2d

mom6_forge

RunoffMappingNearestNeighbour.time_gen_rof_maps_nn

RunoffMappingSmoothed.time_gen_rof_maps_smoothed

TopoSetFromDataset.time_set_from_dataset

TopoSetFromDataset.track_rss_mb
