CrocoDash.extract_forcings.case_setup package

Submodules

CrocoDash.extract_forcings.case_setup.driver module

CrocoDash Forcing Extraction Driver

This module orchestrates the forcing extraction workflow for a CrocoDash case. It coordinates multiple forcing data sources (tides, runoff, BGC, etc.) and processes them into MOM6-compatible file formats.

The script can be run from the command line with various component flags to control which forcings are processed. It loads configuration from config.json and coordinates all extraction, regridding, and formatting operations.

OBC processing (--bc) uses Dask for the GET (download) step only. REGRID and MERGE always run sequentially in the main process, because xESMF/ESMF cannot initialize its parallel environment in subprocess workers on PBS/HPC systems (see CrocoDash.extract_forcings.obc).

By default everything runs without a cluster (sequential dask.compute). Pass --n-workers N to launch a LocalCluster that parallelises GET. For PBS clusters, add --pbs along with optional --queue, --walltime, --memory, --cores, and --resource-spec flags (requires dask-jobqueue). For full Python control (e.g. SLURM), create a client with make_pbs_cluster() and pass it to run_workflow() directly.
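The staging described above (parallel GET, then sequential REGRID and MERGE) can be illustrated with a minimal sketch. It is not the driver's implementation: it substitutes a standard-library thread pool for Dask, and the names get, regrid, merge, and run_obc are hypothetical stand-ins that only manipulate strings.

```python
from concurrent.futures import ThreadPoolExecutor


def get(segment):
    """Hypothetical stand-in for downloading one boundary segment."""
    return f"raw_{segment}"


def regrid(raw):
    """Hypothetical stand-in for regridding one downloaded piece."""
    return raw.replace("raw", "regridded")


def merge(pieces):
    """Hypothetical stand-in for merging regridded pieces into one product."""
    return "+".join(pieces)


def run_obc(segments, n_workers=None):
    # GET: fan out only when workers were requested, otherwise loop in order.
    if n_workers:
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            raw = list(pool.map(get, segments))  # map preserves input order
    else:
        raw = [get(s) for s in segments]
    # REGRID and MERGE: always sequential, in the calling process.
    regridded = [regrid(r) for r in raw]
    return merge(regridded)


result = run_obc(["north", "south", "east"], n_workers=2)
```

Because only the GET stage is fanned out, the result is the same with or without workers; the worker count changes how long the download stage takes, not what is produced.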

Typical CLI usage:

python driver.py --all                          # all components, sequential OBC
python driver.py --bc --n-workers 4             # OBC with 4 local workers
python driver.py --bc --n-workers 8 --pbs --queue regular --walltime 02:00:00   # OBC with 8 PBS jobs
python driver.py --bc --n-workers 8 --pbs --queue regular --visualize           # same, plus Dask dashboard link
python driver.py --tides --bgcic                # tides and BGC IC only
python driver.py --all --skip runoff            # all except runoff
python driver.py --ic --no-get                  # IC, skip raw data download

Typical Python usage (HPC power users):

from CrocoDash.extract_forcings.utils import make_pbs_cluster
from CrocoDash.extract_forcings.case_setup.driver import run_workflow

client = make_pbs_cluster(n_workers=8, queue="regular", walltime="02:00:00")
try:
    run_workflow(bc=True, ic=True, client=client, visualize=True)
finally:
    # The caller owns the client, so close it even if the workflow fails.
    client.close()

Note

On HPC systems (PBS/SLURM), the Dask dashboard runs on an internal compute node that is not directly reachable from your laptop. When --visualize is used with --pbs, the driver prints a ready-to-run SSH tunnel command. Run it on your laptop to forward the port, then open http://localhost:<port>/status in a browser:

ssh -L 8787:<compute-node>:8787 <login-node>
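As a rough illustration of how such a tunnel command is assembled, the sketch below builds the same string from a compute node, a login node, and a port. The function name ssh_tunnel_command and the host names are hypothetical; the driver's own formatting may differ.

```python
import shlex


def ssh_tunnel_command(compute_node, login_node, port=8787):
    """Build an `ssh -L` command forwarding a local port to the dashboard port."""
    forward = f"{port}:{compute_node}:{port}"
    parts = ["ssh", "-L", forward, login_node]
    # Quote each part so unusual host names cannot break the shell command.
    return " ".join(shlex.quote(part) for part in parts)


def dashboard_url(port=8787):
    """URL to open locally once the tunnel is running."""
    return f"http://localhost:{port}/status"


# Hypothetical host names, for illustration only.
command = ssh_tunnel_command("node0123", "login.example.org")
url = dashboard_url()
```

With the hypothetical hosts above, command is the string to run on the laptop and url is the address to open once the tunnel is up.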
CrocoDash.extract_forcings.case_setup.driver.parse_args()
CrocoDash.extract_forcings.case_setup.driver.process_bgcic()

Extract and copy BGC initial conditions from CESM MARBL inputdata.

CrocoDash.extract_forcings.case_setup.driver.process_bgcironforcing()
CrocoDash.extract_forcings.case_setup.driver.process_bgcrivernutrients()

Process river nutrient inputs for BGC.

CrocoDash.extract_forcings.case_setup.driver.process_chl()

Process satellite-derived chlorophyll data.

CrocoDash.extract_forcings.case_setup.driver.process_runoff()

Generate runoff mapping files and interpolation weights.

CrocoDash.extract_forcings.case_setup.driver.process_tides()

Extract and process tidal forcing from TPXO database.

CrocoDash.extract_forcings.case_setup.driver.resolve_components(args, cfg)

Resolve which components should run based on CLI flags and config availability.

This function takes the parsed command-line arguments and the configuration, then determines which forcing components should actually execute. It handles:
  • --all – Enable all components that exist in config

  • --skip – Disable specific components by name (case-insensitive)

  • Individual flags – Enable only specified components

  • Config validation – Skip components requested but not in config

The function modifies args in-place, setting each component flag to True/False based on the resolution logic.

Parameters:
  • args – Parsed command-line arguments (from parse_args())

  • cfg – Config object with .config dict of available components

Returns:

Modified args object with all component flags resolved
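The resolution rules listed above can be sketched as follows. This is an illustrative reconstruction, not the module's code: the function name resolve_components_sketch is hypothetical, the component names are taken from the run_workflow signature, and a plain set named available stands in for the keys of cfg.config (an assumption about how the config is keyed).

```python
from argparse import Namespace

# Component names as they appear in the run_workflow signature.
COMPONENTS = [
    "ic", "bc", "bgcic", "bgcironforcing",
    "tides", "chl", "runoff", "bgcrivernutrients",
]


def resolve_components_sketch(args, available):
    """Set every component flag on `args` to True or False, in place."""
    # --skip names are matched case-insensitively.
    skipped = {name.lower() for name in (getattr(args, "skip", None) or [])}
    run_all = bool(getattr(args, "all", False))
    for name in COMPONENTS:
        requested = run_all or bool(getattr(args, name, False))
        # A component runs only if requested, present in config, and not skipped.
        setattr(args, name, requested and name in available and name not in skipped)
    return args


# --all --skip Runoff, with only three components present in the config.
args = Namespace(all=True, skip=["Runoff"])
resolve_components_sketch(args, available={"tides", "runoff", "bc"})

# Individual flags: bgcic is requested but missing from the config.
only = Namespace(all=False, skip=None, tides=True, bgcic=True)
resolve_components_sketch(only, available={"tides"})
```

In the first call, tides and bc end up enabled, runoff is disabled by --skip, and everything absent from the config is disabled. In the second, only tides remains enabled.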

CrocoDash.extract_forcings.case_setup.driver.run_from_cli(args, cfg)

Execute the forcing extraction workflow based on CLI arguments.

Parameters:
  • args – Parsed and resolved command-line arguments

  • cfg – Config object from utils.Config(CONFIG_PATH)

CrocoDash.extract_forcings.case_setup.driver.run_workflow(ic=False, bc=False, bgcic=False, bgcironforcing=False, tides=False, chl=False, runoff=False, bgcrivernutrients=False, skip_get=False, skip_regrid=False, skip_merge=False, preview=False, cfg=None, client=None, n_workers=None, visualize=False, pbs=False)

Execute the forcing extraction workflow.

This is the shared core used by both run_from_cli and case.py’s process_forcings. Each boolean flag enables the corresponding component. Components run sequentially; parallelism is handled internally by individual components (e.g., OBC uses Dask).

Parameters:
  • ic – Run initial conditions

  • bc – Run boundary conditions (OBC; parallel internally via Dask)

  • bgcic – Run BGC initial conditions

  • bgcironforcing – Run BGC iron forcing

  • tides – Run tidal forcing

  • chl – Run chlorophyll processing

  • runoff – Run runoff mapping

  • bgcrivernutrients – Run BGC river nutrients (always runs after runoff)

  • skip_get – Skip raw data download step (OBC/IC)

  • skip_regrid – Skip regridding step (OBC)

  • skip_merge – Skip merge step (OBC)

  • preview – Preview task graph without executing

  • cfg – Config object; loaded from CONFIG_PATH if None

  • client – Dask distributed Client (power users). Caller owns lifecycle. Create one with make_pbs_cluster() or make_local_cluster().

  • n_workers – Spin up a LocalCluster with this many workers for GET. REGRID and MERGE always run sequentially in the main process. Ignored if client is already provided. If neither is set, OBC uses dask.compute with no cluster overhead.

  • visualize – If True and a Dask client is active, print the Dask dashboard link so progress can be monitored in a browser. When pbs=True, also prints a ready-to-run SSH tunnel command for reaching the dashboard from outside the cluster.

  • pbs – Set to True when the client was created with a PBS cluster. Only affects the extra SSH hint printed by visualize.
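The precedence among client, n_workers, and the no-cluster default, as documented for the parameters above, can be summarised in a small sketch. The function choose_get_mode and its return strings are hypothetical and exist only to make the ordering explicit; they are not part of the driver.

```python
def choose_get_mode(client=None, n_workers=None):
    """Return a label for how the OBC GET step would be executed."""
    if client is not None:
        # A caller-supplied client wins; n_workers is ignored in this case.
        return "caller-supplied client"
    if n_workers:
        # No client given: a LocalCluster is started for the GET step.
        return f"LocalCluster with {n_workers} workers"
    # Neither set: plain dask.compute with no cluster overhead.
    return "sequential dask.compute"


fake_client = object()  # stand-in for a dask.distributed Client

mode_default = choose_get_mode()
mode_local = choose_get_mode(n_workers=4)
mode_client = choose_get_mode(client=fake_client, n_workers=4)
```

In every mode only GET is affected; REGRID and MERGE still run sequentially in the main process.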

CrocoDash.extract_forcings.case_setup.driver.should_run(name, args, cfg)
CrocoDash.extract_forcings.case_setup.driver.test_driver()

Test that all the imports work.

Module contents