Model-Observation comparison with MOM6 and World Ocean Database

The goal of this notebook is to get familiar with the basics of using CrocoCamp to interpolate MOM6 output onto the space of observations stored in the World Ocean Database (WOD). We will use output from a MOM6 run already stored on NCAR’s HPC system, and WOD13 observation files already in obs_seq format and also stored on NCAR’s HPC system.

Installing CrocoCamp

If you don’t have CrocoCamp set up yet, here are the instructions to install it on NCAR’s HPC. If you have any issues or need to install it on a different machine, please contact enrico.milanese@whoi.edu.

Running the workflow with custom config options

This notebook uses the same MOM6 files as tutorials 1 and 2, but different observations and output paths. We can either generate a new config file using one from those tutorials as a reference (e.g. see config_tutorial_3.yaml), or generate a workflow from one of the old templates and override the paths that differ. While a config file is the recommended approach, as it helps keep better track of the workflow, overriding parameters can come in handy during exploratory phases. I’ll do the latter to demonstrate this option in CrocoCamp.
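To make the "template plus overrides" idea concrete, here is a minimal sketch in plain Python. It is not CrocoCamp's implementation: the `merge_config` helper and the placeholder paths are hypothetical, and only the key names are borrowed from the parameters used in this tutorial. Rejecting unknown keys is a design choice of the sketch, made so that a misspelled override is not silently ignored.

```python
# Hypothetical illustration of "template config + keyword overrides".
# merge_config and all paths below are placeholders, not CrocoCamp internals.

def merge_config(template: dict, **overrides) -> dict:
    """Return a copy of `template` with `overrides` applied on top."""
    unknown = set(overrides) - set(template)
    if unknown:
        # Fail early so that a misspelled key is not silently ignored
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    merged = dict(template)   # shallow copy, the template stays untouched
    merged.update(overrides)  # overrides win over template values
    return merged


# Stand-in for the values that would be read from a template config file
template = {
    "obs_seq_in_folder": "/path/to/tutorial_1/obs/",
    "output_folder": "/path/to/tutorial_1/out/",
    "parquet_folder": "/path/to/tutorial_1/parquet/",
}

# Override only the entries that differ for the new experiment
config = merge_config(
    template,
    obs_seq_in_folder="/path/to/tutorial_3/obs/",
    output_folder="/path/to/tutorial_3/out/",
)
print(config)
```

Entries that are not overridden (here `parquet_folder`) keep the template value, which is why it is worth checking that every output path has been redirected before running a new experiment.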

Remember that you need to be running the workflow on Casper, as the configuration settings point to DART’s installation on that machine.

from crococamp.workflows import WorkflowModelObs

# Create and run workflow to interpolate MOM6 model onto World Ocean Database obs space
workflow_WOD13 = WorkflowModelObs.from_config_file(
    'config_tutorial_1.yaml',
    obs_seq_in_folder='/glade/campaign/cgd/oce/projects/CROCODILE/workshops/2025/CrocoCamp/tutorial_3/in_WOD13/',
    output_folder='$WORK/crocodile_2025/CrocoCamp/tutorial_3/out_obs_seq_in/',
    input_nml_bck='$WORK/crocodile_2025/CrocoCamp/tutorial_3/input_bckp/',
    trimmed_obs_folder='$WORK/crocodile_2025/CrocoCamp/tutorial_3/out_trimmed_obs_seq_in/',
    parquet_folder='$WORK/crocodile_2025/CrocoCamp/tutorial_3/out_parquet/',
    tmp_folder='$WORK/crocodile_2025/CrocoCamp/tutorial_3/tmp/'
)
workflow_WOD13.run(clear_output=True)  # use clear_output=True to re-run and automatically clean all previous output
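The output paths above contain the environment variable `$WORK`. If you want to check where the files will land before running the workflow, the standard library can expand the variable for you. This is a small sketch that assumes nothing about how CrocoCamp resolves the paths internally; the fallback value assigned to `WORK` is a placeholder used only when the variable is not already defined.

```python
import os

# Use a placeholder only if WORK is not already set in the environment
os.environ.setdefault("WORK", "/tmp/work_placeholder")

raw_path = "$WORK/crocodile_2025/CrocoCamp/tutorial_3/out_parquet/"

# os.path.expandvars substitutes $NAME and ${NAME} with their values
expanded_path = os.path.expandvars(raw_path)
print(expanded_path)
```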

Displaying the interactive map

CrocoCamp generates a parquet dataset that contains the values of the WOD observations, the MOM6 model data interpolated onto the observation space, and some basic statistics.

You can load and explore the data using pandas (CrocoCamp also supports dask for large datasets).

# compute=True triggers computation of the dask dataframe and returns a pandas dataframe loaded in memory
good_model_obs_df = workflow_WOD13.get_good_model_obs_df(compute=True)
good_model_obs_df.head()  # display the first 5 rows of the dataframe
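Once the data is in a pandas dataframe, the usual pandas tools apply. As an example of the kind of exploration this enables, the sketch below computes the mean bias and the root-mean-square error of the model with respect to the observations, grouped by observation type. It runs on a small made-up dataframe: the column names `type`, `obs` and `model` and all the values are assumptions for illustration, so check `good_model_obs_df.columns` for the names actually present in your dataframe.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the model-observation dataframe.
# Column names and values are hypothetical, chosen only for illustration.
toy_df = pd.DataFrame({
    "type": ["TEMPERATURE", "TEMPERATURE", "SALINITY", "SALINITY"],
    "obs": [10.0, 12.0, 35.0, 34.0],
    "model": [11.0, 11.0, 35.5, 34.5],
})

# Model minus observation, one value per row
toy_df["diff"] = toy_df["model"] - toy_df["obs"]

# Mean bias and root-mean-square error for each observation type
summary = toy_df.groupby("type")["diff"].agg(
    bias="mean",
    rmse=lambda d: np.sqrt((d ** 2).mean()),
)
print(summary)
```

A bias close to zero together with a large RMSE, as for the toy temperature rows, indicates errors that cancel on average rather than a good match.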

Loading the interactive map is as simple as importing the widget and passing the dataframe to it:

from crococamp.viz import InteractiveWidgetMap
# Create an interactive map widget to visualize model-observation comparisons
# The widget provides controls for selecting variables, observation types, and time ranges
widget = InteractiveWidgetMap(good_model_obs_df)
widget.setup()