CrocoDash.extract_forcings package#
Subpackages#
Submodules#
CrocoDash.extract_forcings.bgc module#
- CrocoDash.extract_forcings.bgc.process_bgc_ic(file_path, output_path)#
Copy BGC initial condition file
Parameters:
- file_path: str, path to the original BGC IC file
- output_path: str, path to save the processed BGC IC file
Returns:
- None
- CrocoDash.extract_forcings.bgc.process_bgc_iron_forcing(nx, ny, MARBL_FESEDFLUX_FILE, MARBL_FEVENTFLUX_FILE, inputdir)#
Create dummy iron forcing files for MARBL.
Parameters:
- nx: int, number of grid points in x-direction
- ny: int, number of grid points in y-direction
- MARBL_FESEDFLUX_FILE: str, filename for sediment flux input
- MARBL_FEVENTFLUX_FILE: str, filename for event flux input
- inputdir: str, directory to save the generated files
Returns:
- None
- CrocoDash.extract_forcings.bgc.process_river_nutrients(global_river_nutrients_filepath, ocn_grid, mapping_file, river_nutrients_nnsm_filepath)#
CrocoDash.extract_forcings.chlorophyll module#
- CrocoDash.extract_forcings.chlorophyll.process_chl(ocn_grid, ocn_topo, inputdir, chl_processed_filepath, output_filepath)#
CrocoDash.extract_forcings.get_dataset_piecewise module#
- CrocoDash.extract_forcings.get_dataset_piecewise.get_dataset_piecewise(product_name: str, function_name: str, product_information: dict, date_format: str, start_date: str, end_date: str, hgrid_path: str | Path, step_days: int, output_dir: str | Path, boundary_number_conversion: dict, run_initial_condition: bool = True, run_boundary_conditions: bool = True, preview: bool = False)#
Retrieves and saves data in piecewise chunks for each boundary over a date range.
- Parameters:
product_name (str) – The name of the data product to retrieve.
function_name (str) – The function to call for retrieving data.
date_format (str) – The date format string (e.g., “%Y-%m-%d”).
start_date (str) – The start date in the specified format.
end_date (str) – The end date in the specified format.
hgrid_path (str or Path) – Path to the hgrid file containing the regional grid.
step_days (int) – The number of days in each data chunk.
output_dir (str or Path) – The directory to save the output NetCDF files.
boundary_number_conversion (dict) – Dictionary mapping boundaries to their numerical identifiers.
run_initial_condition (bool) – Whether to run the initial condition. Defaults to True.
run_boundary_conditions (bool) – Whether to run the boundary conditions. Defaults to True.
preview (bool) – Whether to preview the run. Defaults to False.
- Raises:
ValueError – If the product or function is not found in the registry.
- Returns:
Saves the retrieved data to the specified output directory.
- Return type:
None
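The way start_date, end_date, date_format, and step_days interact can be illustrated with a short sketch. This is not CrocoDash's implementation: the helper name iter_date_chunks is hypothetical, and the sketch assumes that consecutive chunks share an endpoint and that the last chunk is shortened to stop at end_date.

```python
from datetime import datetime, timedelta


def iter_date_chunks(start_date, end_date, date_format, step_days):
    """Yield (chunk_start, chunk_end) datetime pairs covering the date range.

    Hypothetical helper for illustration only; the chunk boundaries used by
    get_dataset_piecewise itself may differ.
    """
    if step_days < 1:
        raise ValueError("step_days must be at least 1")
    current = datetime.strptime(start_date, date_format)
    end = datetime.strptime(end_date, date_format)
    while current < end:
        # Assumption: the final chunk is clipped so it never passes end_date.
        chunk_end = min(current + timedelta(days=step_days), end)
        yield current, chunk_end
        current = chunk_end


chunks = list(iter_date_chunks("2000-01-01", "2000-01-08", "%Y-%m-%d", 3))
for chunk_start, chunk_end in chunks:
    print(chunk_start.date(), "to", chunk_end.date())
```

With a seven-day range and step_days=3, the sketch produces two full chunks and one shorter final chunk.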
CrocoDash.extract_forcings.merge_piecewise_dataset module#
- CrocoDash.extract_forcings.merge_piecewise_dataset.merge_piecewise_dataset(folder: str | Path, input_dataset_regex: str, date_format: str, start_date: str, end_date: str, boundary_number_conversion: dict, output_folder: str | Path, run_initial_condition: bool = True, run_boundary_conditions: bool = True, preview: bool = False)#
Merges piecewise datasets from a folder into consolidated NetCDF files by boundary.
- Parameters:
folder (str or Path) – Path to the folder containing the regridded dataset files.
input_dataset_regex (str) – Regular expression pattern to match dataset files.
date_format (str) – Date format string used for parsing the dataset filenames.
start_date (str) – Start date in the specified format.
end_date (str) – End date in the specified format.
boundary_number_conversion (dict) – Dictionary mapping boundary segment numbers to their labels.
output_folder (str or Path) – Directory to save the merged NetCDF files.
run_initial_condition (bool) – Whether to run the initial condition. Defaults to True.
run_boundary_conditions (bool) – Whether to run the boundary conditions. Defaults to True.
preview (bool, optional) – Whether to run in preview mode without saving. Defaults to False.
- Raises:
ValueError – If a segment in boundary_number_conversion is not found in the dataset folder.
- Returns:
Saves the merged NetCDF files to the specified output folder.
- Return type:
None
CrocoDash.extract_forcings.regrid_dataset_piecewise module#
- CrocoDash.extract_forcings.regrid_dataset_piecewise.capture_fill_metadata(ds)#
Return a dict mapping variable names → {'_FillValue': …, 'missing_value': …}. Only attributes that exist are stored.
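A minimal sketch of the kind of mapping described above, written against stand-in objects rather than a real dataset. It is not the library's code. It assumes that each variable exposes attrs and encoding dictionaries (as xarray variables do), that encoding is checked before attrs, and that variables with neither key are left out of the result. The name capture_fill_metadata_sketch is hypothetical.

```python
from types import SimpleNamespace

FILL_KEYS = ("_FillValue", "missing_value")


def capture_fill_metadata_sketch(ds):
    """Collect fill-related metadata per variable, skipping missing keys."""
    captured = {}
    for name in ds:
        var = ds[name]
        # Assumption: look in encoding first, then fall back to attrs.
        sources = (getattr(var, "encoding", {}), getattr(var, "attrs", {}))
        found = {}
        for key in FILL_KEYS:
            for source in sources:
                if key in source:
                    found[key] = source[key]
                    break
        if found:
            captured[name] = found
    return captured


# Stand-in "dataset": a plain dict of objects with attrs/encoding dicts.
fake_ds = {
    "temp": SimpleNamespace(
        attrs={"units": "degC", "missing_value": -999.0},
        encoding={"_FillValue": 1e20},
    ),
    "salt": SimpleNamespace(attrs={"_FillValue": -1.0}, encoding={}),
    "depth": SimpleNamespace(attrs={"units": "m"}, encoding={}),
}
fill_info = capture_fill_metadata_sketch(fake_ds)
print(fill_info)
```

Here "depth" carries no fill metadata, so the sketch omits it entirely. Whether the real function omits such variables or stores an empty dict for them is not stated in the docstring.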
- CrocoDash.extract_forcings.regrid_dataset_piecewise.final_cleanliness_fill(var, x_dim, y_dim, z_dim=None)#
- CrocoDash.extract_forcings.regrid_dataset_piecewise.m6b_fill_missing_data_wrapper(ds, xdim, zdim, fill)#
- CrocoDash.extract_forcings.regrid_dataset_piecewise.regrid_dataset_piecewise(folder: str | Path, input_dataset_regex: str, date_format: str, start_date: str, end_date: str, hgrid_path: str | Path, bathymetry: str | Path, dataset_varnames: dict, output_folder: str | Path, boundary_number_conversion: dict, run_initial_condition: bool = True, run_boundary_conditions: bool = True, vgrid_path: str | Path = None, preview: bool = False)#
Find the required files, set up the necessary data, and regrid the dataset.
- Parameters:
folder (str or Path) – Path to the folder containing the dataset files.
input_dataset_regex (str) – Regular expression pattern to match dataset files.
date_format (str) – Date format string used to parse dates in filenames (e.g., “%Y%m%d”).
start_date (str) – Start date of the dataset range in YYYYMMDD format.
end_date (str) – End date of the dataset range in YYYYMMDD format.
hgrid_path (str or Path) – Path to the horizontal grid file used for regridding.
dataset_varnames (dict) – Mapping of variable names in the dataset to standardized names. Example: {"time": "time", "latitude": "yh", "longitude": "xh", "depth": "zl"}
output_folder (str or Path) – Path to the folder where the regridded dataset will be saved.
boundary_number_conversion (dict) – Dictionary mapping boundary names to numerical IDs. Example: {"north": 1, "east": 2, "south": 3, "west": 4}
run_initial_condition (bool) – Whether to run the initial condition. Defaults to True.
run_boundary_conditions (bool) – Whether to run the boundary conditions. Defaults to True.
vgrid_path (str or Path) – Path to the vertical coordinate file required for the initial condition. Defaults to None.
preview (bool) – Whether to preview the run of this function. Defaults to False.
- Returns:
The regridded dataset files are saved to the specified output_folder.
- Return type:
None
CrocoDash.extract_forcings.runoff module#
- CrocoDash.extract_forcings.runoff.generate_rof_ocn_map(rof_grid_name, rof_esmf_mesh_filepath, ocn_mesh_filepath, inputdir, grid_name, rmax, fold)#
Generate runoff to ocean mapping files if runoff is active in the compset.
CrocoDash.extract_forcings.tides module#
- CrocoDash.extract_forcings.tides.process_tides(ocn_topo, inputdir, supergrid_path, vgrid_path, tidal_constituents, boundaries, tpxo_elevation_filepath, tpxo_velocity_filepath)#
CrocoDash.extract_forcings.utils module#
- class CrocoDash.extract_forcings.utils.Config(config_path: str = 'config.json')#
Bases: object
- keys()#
- CrocoDash.extract_forcings.utils.check_date_continuity(boundary_file_list: dict)#
Check for overlaps or missing dates between consecutive files.
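A check of this kind can be sketched as follows. The sketch is illustrative rather than the library's implementation: it assumes boundary_file_list has the shape returned by parse_dataset_folder (boundary name mapped to a list of (start, end, path) tuples) and that each file is expected to start exactly where the previous one ended. The function name find_date_issues and the file paths are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def find_date_issues(boundary_file_list):
    """Return (boundary, kind, previous_path, next_path) for each problem."""
    issues = []
    for boundary, entries in boundary_file_list.items():
        ordered = sorted(entries, key=lambda entry: entry[0])
        for previous, following in zip(ordered, ordered[1:]):
            _, prev_end, prev_path = previous
            next_start, _, next_path = following
            if next_start < prev_end:
                issues.append((boundary, "overlap", prev_path, next_path))
            elif next_start > prev_end:
                # Assumption: any day not covered between files is a gap.
                issues.append((boundary, "gap", prev_path, next_path))
    return issues


files = {
    "north": [
        (datetime(2000, 1, 1), datetime(2000, 1, 4), Path("north_a.nc")),
        (datetime(2000, 1, 4), datetime(2000, 1, 7), Path("north_b.nc")),
        (datetime(2000, 1, 9), datetime(2000, 1, 12), Path("north_c.nc")),
    ],
    "east": [
        (datetime(2000, 1, 1), datetime(2000, 1, 4), Path("east_a.nc")),
        (datetime(2000, 1, 3), datetime(2000, 1, 6), Path("east_b.nc")),
    ],
}
issues = find_date_issues(files)
for boundary, kind, prev_path, next_path in issues:
    print(f"{boundary}: {kind} between {prev_path} and {next_path}")
```

In this sample the north boundary has a gap between its second and third files, and the east boundary has two files that overlap by one day.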
- CrocoDash.extract_forcings.utils.parse_dataset_folder(folder: str | Path, input_dataset_regex: str, date_format: str)#
Parse a folder to find and extract dataset file information based on a regex pattern.
- Parameters:
folder (str or Path) – Path to the folder containing the dataset files.
input_dataset_regex (str) – Regular expression pattern to match dataset filenames. Example: "(north|east|south|west)_unprocessed\.(\d{8})_(\d{8})\.nc"
date_format (str) – Date format string used to parse dates in filenames (e.g., “%Y%m%d”).
- Returns:
Dictionary mapping boundaries to a list of tuples with:
- Start date (datetime)
- End date (datetime)
- Full file path (Path)
Example: {"north": [(datetime(2000, 1, 1), datetime(2000, 1, 2), Path("/path/to/north_20000101_20000102.nc"))], "east": [(datetime(2000, 1, 3), datetime(2000, 1, 4), Path("/path/to/east_20000103_20000104.nc"))]}
- Return type:
dict
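The parsing described above can be approximated with the standard library. The following is a sketch under stated assumptions, not the function's actual source: it assumes the regex has exactly three capture groups (boundary, start date, end date), that the whole filename must match, and that entries are sorted by start date. The name parse_folder_sketch and the sample filenames are hypothetical.

```python
import re
import tempfile
from datetime import datetime
from pathlib import Path


def parse_folder_sketch(folder, input_dataset_regex, date_format):
    """Group matching files by boundary as (start, end, path) tuples."""
    pattern = re.compile(input_dataset_regex)
    found = {}
    for path in Path(folder).iterdir():
        match = pattern.fullmatch(path.name)
        if match is None:
            continue  # Ignore files that do not follow the naming scheme.
        boundary, start_text, end_text = match.groups()
        start = datetime.strptime(start_text, date_format)
        end = datetime.strptime(end_text, date_format)
        found.setdefault(boundary, []).append((start, end, path))
    for entries in found.values():
        entries.sort(key=lambda entry: entry[0])
    return found


# A raw string keeps the backslashes in \. and \d intact.
regex = r"(north|east|south|west)_unprocessed\.(\d{8})_(\d{8})\.nc"
sample_names = (
    "north_unprocessed.20000102_20000103.nc",
    "north_unprocessed.20000101_20000102.nc",
    "east_unprocessed.20000101_20000102.nc",
    "notes.txt",
)
with tempfile.TemporaryDirectory() as tmp:
    for name in sample_names:
        (Path(tmp) / name).touch()
    parsed = parse_folder_sketch(tmp, regex, "%Y%m%d")
    summary = {
        boundary: [(start, end, path.name) for start, end, path in entries]
        for boundary, entries in parsed.items()
    }
print(summary)
```

Passing the pattern as a raw string matters here: in an ordinary string literal, unrecognized escapes such as \d are error-prone and easy to lose when the text is copied or rendered.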