CrocoDash.extract_forcings.case_setup package#
Submodules#
CrocoDash.extract_forcings.case_setup.driver module#
CrocoDash Forcing Extraction Driver
This module orchestrates the forcing extraction workflow for a CrocoDash case. It coordinates multiple forcing data sources (tides, runoff, BGC, etc.) and processes them into MOM6-compatible file formats.
The script can be run from the command line with various component flags to control which forcings are processed. It loads configuration from config.json and coordinates all extraction, regridding, and formatting operations.
OBC processing (--bc) uses Dask for the GET (download) step only.
REGRID and MERGE always run sequentially in the main process, because
xESMF/ESMF cannot initialize its parallel environment in subprocess workers
on PBS/HPC systems (see CrocoDash.extract_forcings.obc).
By default everything runs without a cluster (sequential dask.compute). Pass
--n-workers N to launch a LocalCluster that parallelises GET. For PBS
clusters, add --pbs along with optional --queue, --walltime,
--memory, --cores, and --resource-spec flags (requires
dask-jobqueue). For full Python control (e.g. SLURM), create a client with
make_pbs_cluster() and pass it to
run_workflow() directly.
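The split described above (parallel GET, sequential REGRID and MERGE) can be sketched with the standard library alone. This is an illustration of the pattern, not the driver's implementation: ThreadPoolExecutor stands in for Dask, and get, regrid, merge, and run_obc are hypothetical names with placeholder bodies.

```python
from concurrent.futures import ThreadPoolExecutor


def get(segment):
    # Hypothetical stand-in for downloading one boundary segment's raw data.
    return f"raw_{segment}"


def regrid(raw):
    # Hypothetical stand-in for the regridding step (main process only).
    return raw.replace("raw", "regridded")


def merge(pieces):
    # Hypothetical stand-in for merging the regridded pieces into one output.
    return "+".join(pieces)


def run_obc(segments, n_workers=None):
    # GET: fan out to a worker pool only when workers were requested.
    if n_workers:
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            raw = list(pool.map(get, segments))  # map preserves input order
    else:
        raw = [get(s) for s in segments]

    # REGRID and MERGE: always sequential, in the calling process.
    regridded = [regrid(r) for r in raw]
    return merge(regridded)


result = run_obc(["north", "south", "east"], n_workers=2)
```

Whether or not a pool is used, the output is the same; only the GET step changes how it is scheduled.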
Typical CLI usage:
python driver.py --all # all components, sequential OBC
python driver.py --bc --n-workers 4 # OBC with 4 local workers
python driver.py --bc --n-workers 8 --pbs --queue regular --walltime 02:00:00 # OBC with 8 PBS jobs
python driver.py --bc --n-workers 8 --pbs --queue regular --visualize # same, plus Dask dashboard link
python driver.py --tides --bgcic # tides and BGC IC only
python driver.py --all --skip runoff # all except runoff
python driver.py --ic --no-get # IC, skip raw data download
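To show how these invocations might map onto parsed arguments, here is a minimal argparse sketch. It covers only a subset of the flags that appear in the usage lines above; the types, defaults, and the build_parser name are assumptions, and the real parse_args() may define them differently.

```python
import argparse


def build_parser():
    # Hypothetical parser covering a subset of the documented flags.
    parser = argparse.ArgumentParser(prog="driver.py")

    # Component flags: each one enables a forcing component.
    for name in ("all", "ic", "bc", "tides", "bgcic"):
        parser.add_argument(f"--{name}", action="store_true")

    # Components to exclude, e.g. "--all --skip runoff".
    parser.add_argument("--skip", nargs="*", default=[])

    # Step control and Dask options (types and defaults are assumed).
    parser.add_argument("--no-get", action="store_true")
    parser.add_argument("--n-workers", type=int, default=None)
    parser.add_argument("--pbs", action="store_true")
    parser.add_argument("--queue", default=None)
    parser.add_argument("--walltime", default=None)
    parser.add_argument("--visualize", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--bc", "--n-workers", "8", "--pbs", "--queue", "regular"]
)
```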
Typical Python usage (HPC power users):
from CrocoDash.extract_forcings.utils import make_pbs_cluster
from CrocoDash.extract_forcings.case_setup.driver import run_workflow
client = make_pbs_cluster(n_workers=8, queue="regular", walltime="02:00:00")
run_workflow(bc=True, ic=True, client=client, visualize=True)
client.close()
Note
On HPC systems (PBS/SLURM), the Dask dashboard runs on an internal compute
node that is not directly reachable from your laptop. When --visualize
is used with --pbs, the driver prints a ready-to-run SSH tunnel command.
Run it on your laptop to forward the port, then open http://localhost:<port>/status
in a browser:
ssh -L 8787:<compute-node>:8787 <login-node>
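As a sketch of how such a tunnel command could be assembled from a dashboard link, the helper below parses the host and port out of a URL and formats the ssh invocation. The function name, the host names, and the URL are hypothetical; the driver's own output may be produced differently.

```python
from urllib.parse import urlparse


def tunnel_from_dashboard_link(link, login_node):
    # Pull the compute node's host name and port out of the dashboard URL.
    parsed = urlparse(link)
    # Forward the same port on the local machine to the compute node.
    return f"ssh -L {parsed.port}:{parsed.hostname}:{parsed.port} {login_node}"


# Hypothetical host names, for illustration only.
command = tunnel_from_dashboard_link(
    "http://cn0123:8787/status", "login.example.org"
)
```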
- CrocoDash.extract_forcings.case_setup.driver.parse_args()#
- CrocoDash.extract_forcings.case_setup.driver.process_bgcic()#
Extract and copy BGC initial conditions from CESM MARBL inputdata.
- CrocoDash.extract_forcings.case_setup.driver.process_bgcironforcing()#
- CrocoDash.extract_forcings.case_setup.driver.process_bgcrivernutrients()#
Process river nutrient inputs for BGC.
- CrocoDash.extract_forcings.case_setup.driver.process_chl()#
Process satellite-derived chlorophyll data.
- CrocoDash.extract_forcings.case_setup.driver.process_runoff()#
Generate runoff mapping files and interpolation weights.
- CrocoDash.extract_forcings.case_setup.driver.process_tides()#
Extract and process tidal forcing from TPXO database.
- CrocoDash.extract_forcings.case_setup.driver.resolve_components(args, cfg)#
Resolve which components should run based on CLI flags and config availability.
This function takes the parsed command-line arguments and the configuration, then determines which forcing components should actually execute. It handles:
  - --all: Enable all components that exist in config
  - --skip: Disable specific components by name (case-insensitive)
  - Individual flags: Enable only specified components
  - Config validation: Skip components requested but not in config
The function modifies args in-place, setting each component flag to True/False based on the resolution logic.
- Parameters:
args – Parsed command-line arguments (from parse_args())
cfg – Config object with .config dict of available components
- Returns:
Modified args object with all component flags resolved
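The resolution rules above can be sketched as follows. This is an approximation under stated assumptions, not the actual function body: the available set stands in for the component names found in cfg.config, args.skip is assumed to be a list of names, and resolve is a hypothetical name.

```python
from types import SimpleNamespace

# Component names taken from the run_workflow signature.
COMPONENTS = (
    "ic", "bc", "bgcic", "bgcironforcing",
    "tides", "chl", "runoff", "bgcrivernutrients",
)


def resolve(args, available):
    # --skip matching is case-insensitive.
    skipped = {name.lower() for name in args.skip}
    for name in COMPONENTS:
        # A component is requested by --all or by its individual flag.
        requested = args.all or getattr(args, name, False)
        # It runs only if it is also present in the config and not skipped.
        enabled = requested and name in available and name not in skipped
        setattr(args, name, bool(enabled))
    return args  # the same object, modified in place


args = SimpleNamespace(all=True, skip=["Runoff"])
resolved = resolve(args, available={"tides", "runoff", "bc"})
```

Here tides and bc end up enabled, runoff is disabled by --skip, and every component missing from the config is disabled.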
- CrocoDash.extract_forcings.case_setup.driver.run_from_cli(args, cfg)#
Execute the forcing extraction workflow based on CLI arguments.
- Parameters:
args – Parsed and resolved command-line arguments
cfg – Config object from utils.Config(CONFIG_PATH)
- CrocoDash.extract_forcings.case_setup.driver.run_workflow(ic=False, bc=False, bgcic=False, bgcironforcing=False, tides=False, chl=False, runoff=False, bgcrivernutrients=False, skip_get=False, skip_regrid=False, skip_merge=False, preview=False, cfg=None, client=None, n_workers=None, visualize=False, pbs=False)#
Execute the forcing extraction workflow.
This is the shared core used by both run_from_cli and case.py’s process_forcings. Each boolean flag enables the corresponding component. Components run sequentially; parallelism is handled internally by individual components (e.g., OBC uses Dask).
- Parameters:
ic – Run initial conditions
bc – Run boundary conditions (OBC; parallel internally via Dask)
bgcic – Run BGC initial conditions
bgcironforcing – Run BGC iron forcing
tides – Run tidal forcing
chl – Run chlorophyll processing
runoff – Run runoff mapping
bgcrivernutrients – Run BGC river nutrients (always runs after runoff)
skip_get – Skip raw data download step (OBC/IC)
skip_regrid – Skip regridding step (OBC)
skip_merge – Skip merge step (OBC)
preview – Preview task graph without executing
cfg – Config object; loaded from CONFIG_PATH if None
client – Dask distributed Client (power users). Caller owns lifecycle. Create one with make_pbs_cluster() or make_local_cluster().
n_workers – Spin up a LocalCluster with this many workers for GET. REGRID and MERGE always run sequentially in the main process. Ignored if client is already provided. If neither is set, OBC uses dask.compute with no cluster overhead.
visualize – If True and a Dask client is active, print the Dask dashboard link so progress can be monitored in a browser. When pbs=True, also prints a ready-to-run SSH tunnel command for reaching the dashboard from outside the cluster.
pbs – Set to True when the client was created with a PBS cluster. Only affects the extra SSH hint printed by visualize.
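The precedence between client and n_workers described in the parameters above can be summarised in a small sketch. The function name and the returned labels are hypothetical, and the assumption that the workflow cleans up a cluster it started itself is not stated in the source.

```python
def choose_obc_execution(client=None, n_workers=None):
    """Return (mode, workflow_owns_cluster) for the OBC GET step."""
    if client is not None:
        # A caller-supplied client wins; n_workers is ignored and the
        # caller remains responsible for closing the client.
        return ("caller-client", False)
    if n_workers:
        # No client given: a LocalCluster with n_workers is started.
        # Assumption: the workflow would also shut it down afterwards.
        return ("local-cluster", True)
    # Neither set: plain dask.compute with no cluster overhead.
    return ("sequential", False)


mode, owns = choose_obc_execution(client=object(), n_workers=4)
```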
- CrocoDash.extract_forcings.case_setup.driver.should_run(name, args, cfg)#
- CrocoDash.extract_forcings.case_setup.driver.test_driver()#
Test that all the imports work.