Model-Observation comparison: use your own MOM6 run#

This notebook is a stripped-down version of CrocoCamp’s Tutorial 1 notebook. You can use it as a starting point for a model-observation comparison between your own CESM-MOM6 run and CrocoLake or the World Ocean Database (WOD).

Installing CrocoCamp#

If you don’t have CrocoCamp set up yet, here are the instructions to install it on NCAR’s HPC. If you have any issues or need to install it on a different machine, please contact enrico.milanese@whoi.edu.

CrocoLake observation sequence files#

If you want to use CrocoLake, generate your own observation files by adapting the code below. Note that some fields (PRES0, PRES1, year0, month0, N) are intentionally left blank, so the cell raises an error until you fill them in.

crocolake_path = '/glade/campaign/cgd/oce/projects/CROCODILE/workshops/2025/CrocoCamp/CrocoLakePHY'

import datetime
import os
from convert_crocolake_obs import ObsSequence
basename = "myCL_obs_seq_"
outdir = "$WORK/crocodile_2025/CrocoCamp/my_project/in_CL/"
basename = os.path.expandvars(outdir+basename)
outdir = os.path.expandvars(outdir)
# create the output directory if it does not exist yet
os.makedirs(outdir, exist_ok=True)

# define horizontal region
LAT0 = 5
LAT1 = 60
LON0 = -100
LON1 = -30

# define depth range as pressure in dbar (fill in both values)
PRES0 = 
PRES1 =

# define variables to import from CrocoLake
selected_variables = [
    "DB_NAME",  # ARGO, GLODAP, SprayGliders, OleanderXBT, Saildrones
    "JULD", # this contains timestamp
    "LATITUDE",
    "LONGITUDE",
    "PRES",
    "TEMP",
    "PRES_QC",
    "TEMP_QC",
    "PRES_ERROR",
    "TEMP_ERROR",
    "PSAL",
    "PSAL_QC",
    "PSAL_ERROR"
]

# month and year are constant in our case; N is the number of days to convert
year0 = 
month0 = 
N =

# we loop to generate one file per day
for j in range(N):

    # set date range; using timedelta keeps date1 valid on the last day of the month
    day0 = 1+j
    date0 = datetime.datetime(year0, month0, day0, 0, 0, 0)
    date1 = date0 + datetime.timedelta(days=1)
    print(f"Converting obs between {date0} and {date1}")

    # this defines AND filters, i.e. we load each observation that has latitude within
    # the given range AND longitude within the given range, etc.
    # to exclude NaNs for a variable, impose a range on it
    and_filters = (
        ("LATITUDE",'>',LAT0),  ("LATITUDE",'<',LAT1),
        ("LONGITUDE",'>',LON0), ("LONGITUDE",'<',LON1),
        ("PRES",'>',PRES0), ("PRES",'<',PRES1),
        ("JULD",">",date0), ("JULD","<",date1)
    )

    # this adds OR conditions to and_filters, i.e. we load all observations that
    # satisfy the AND conditions above AND that have finite salinity OR temperature values
    db_filters = [
        list(and_filters) + [("PSAL", ">", -1e30), ("PSAL", "<", 1e30)],
        list(and_filters) + [("TEMP", ">", -1e30), ("TEMP", "<", 1e30)],
    ]

    # generate output filename
    obs_seq_out = basename + f".{year0}{month0:02d}{day0:02d}.out"

    # generate obs_seq.in file
    obsSeq = ObsSequence(
        crocolake_path,
        selected_variables,
        db_filters,
        obs_seq_out=obs_seq_out,
        loose=True
    )
    obsSeq.write_obs_seq()
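
The way the AND and OR conditions combine in db_filters can be checked on a few toy rows. The sketch below assumes that db_filters follows the convention used by pyarrow's parquet filters (the outer list is an OR of groups, each inner list is an AND of conditions); how ObsSequence applies them internally is not shown in this notebook. The helper row_matches and the toy rows are hypothetical and only illustrate the logic.

```python
import operator

# Comparison operators that can appear in the filter tuples
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


def row_matches(row, db_filters):
    """Return True if `row` satisfies every condition of at least one group."""
    return any(
        all(OPS[op](row[col], value) for col, op, value in group)
        for group in db_filters
    )


# Toy AND conditions (only latitude, to keep the example short)
and_filters = (("LATITUDE", ">", 5), ("LATITUDE", "<", 60))

# Same construction as in the cell above: finite salinity OR finite temperature
db_filters = [
    list(and_filters) + [("PSAL", ">", -1e30), ("PSAL", "<", 1e30)],
    list(and_filters) + [("TEMP", ">", -1e30), ("TEMP", "<", 1e30)],
]

nan = float("nan")
rows = [
    {"LATITUDE": 30.0, "PSAL": 35.1, "TEMP": nan},   # kept: salinity is finite
    {"LATITUDE": 30.0, "PSAL": nan, "TEMP": 12.3},   # kept: temperature is finite
    {"LATITUDE": 30.0, "PSAL": nan, "TEMP": nan},    # dropped: both are NaN
    {"LATITUDE": 70.0, "PSAL": 35.1, "TEMP": 12.3},  # dropped: outside latitude range
]

kept = [row_matches(row, db_filters) for row in rows]
print(kept)  # [True, True, False, False]
```

Any comparison with NaN evaluates to False, which is why imposing a very wide range on a variable is enough to exclude its missing values.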

The configuration file#

Remember to generate a new config file for your workflow:

!cp ../configs/config_template.yaml ../configs/config_my_workflow.yaml

and to customize the paths as needed.
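
Note that cp overwrites the target file, so re-running the cell above after customizing the paths would reset them to the template values. A minimal sketch of a non-overwriting copy in Python follows; the helper copy_template is hypothetical and not part of CrocoCamp.

```python
import shutil
from pathlib import Path


def copy_template(template, target):
    """Copy `template` to `target` unless `target` already exists.

    Returns True if the file was copied, False if an existing target was kept.
    """
    template, target = Path(template), Path(target)
    if target.exists():
        # keep the existing (possibly customized) config untouched
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(template, target)
    return True
```

With the paths used in this notebook, the call would be copy_template('../configs/config_template.yaml', '../configs/config_my_workflow.yaml').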

Running the workflow#

Generate and run the workflow from your config file. Remember that you need to be running this notebook on Casper, as the configuration file points to DART’s installation on that machine.

from crococamp.workflows import WorkflowModelObs

# Create and run workflow to interpolate the MOM6 model output onto the observation space
my_workflow = WorkflowModelObs.from_config_file('../configs/config_my_workflow.yaml')
my_workflow.run()  # use flag clear_output=True if you want to re-run it and automatically clean all previous output

Displaying the interactive map#

Load and explore the data using pandas or dask:

good_model_obs_df = my_workflow.get_good_model_obs_df(compute=True)  # compute=True triggers the computation of the dask dataframe, returning a pandas dataframe with data loaded in memory
good_model_obs_df.head()                                             # displays the first 5 rows of the dataframe

Load the interactive map:

from crococamp.viz import InteractiveWidgetMap
# Create an interactive map widget to visualize model-observation comparisons
# The widget provides controls for selecting variables, observation types, and time ranges
widget = InteractiveWidgetMap(good_model_obs_df)
widget.setup()