IMI configuration file
This page documents settings in the IMI configuration file (config.yml).
General
|
Name for this inversion; will be used for directory names and prefixes. |
|
Boolean for running the IMI on AWS ( |
|
Boolean for running the IMI as a batch job with |
|
Boolean for running in safe mode to prevent overwriting existing files. |
|
Boolean for uploading output directory to S3. If |
|
S3 path to upload files to (e.g. |
|
Files to upload from the IMI Output directory (e.g. |
|
Files to upload from the IMI Output directory (e.g. |
Period of interest
|
Beginning of the inversion period in |
|
End of the inversion period in |
|
Number of months for the spinup simulation. |
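As an illustration of how the spinup length relates to the inversion period, the sketch below computes the date on which a spinup simulation would begin. It assumes that the spinup ends where the inversion period starts and that dates are given as yyyymmdd strings; both are assumptions for this example, and the function name `spinup_start` is hypothetical rather than part of the IMI.

```python
import calendar
from datetime import date, datetime


def spinup_start(start_date: str, spinup_months: int) -> date:
    """Return the date `spinup_months` calendar months before `start_date`.

    Assumes `start_date` is a yyyymmdd string (an assumption; check the
    format your config.yml expects).
    """
    start = datetime.strptime(start_date, "%Y%m%d").date()
    # Work in "months since year 0" so that year rollover is handled.
    total = start.year * 12 + (start.month - 1) - spinup_months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    # Clamp the day for shorter target months (e.g. 31 March -> 28 February).
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


print(spinup_start("20180501", 1))  # 2018-04-01
```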
TROPOMI data type
|
Boolean for whether the blended TROPOMI+GOSAT data should be used ( |
Region of interest
|
Boolean for using the GEOS-Chem regional simulation. This should be set to |
|
Two-character region ID for using pre-cropped meteorology fields. Select |
|
Minimum longitude edge of the region of interest (only used if |
|
Maximum longitude edge of the region of interest (only used if |
|
Minimum latitude edge of the region of interest (only used if |
|
Maximum latitude edge of the region of interest (only used if |
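Before launching a run, it can help to sanity-check the four region edges. The following is a minimal sketch, not part of the IMI; the function name `check_bounds` and its argument names are hypothetical, and it assumes longitudes in [-180, 180] and latitudes in [-90, 90].

```python
def check_bounds(lon_min, lon_max, lat_min, lat_max):
    """Return a list of human-readable problems with the region edges."""
    problems = []
    # Each minimum edge must lie strictly below its maximum edge,
    # and both must fall inside the valid coordinate range.
    if not -180 <= lon_min < lon_max <= 180:
        problems.append("longitude edges must satisfy -180 <= min < max <= 180")
    if not -90 <= lat_min < lat_max <= 90:
        problems.append("latitude edges must satisfy -90 <= min < max <= 90")
    return problems


print(check_bounds(-105, -103, 31, 33))  # []
```

An empty list means the edges are self-consistent; it does not guarantee that meteorology fields are available for that domain.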
Kalman filter options
|
Boolean for running the IMI using a Kalman filter for continuous updates ( |
|
Number of days in each Kalman filter update cycle e.g. |
|
Fraction of original prior emissions to use in the prior for each Kalman filter update (e.g. |
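The two numeric Kalman filter settings above can be pictured with a short sketch. It splits an inversion period into update cycles of a fixed number of days, and blends the original prior with the previous posterior using the nudge fraction. The blending rule (fraction of the original prior plus the remainder from the previous posterior) and the handling of a shorter final cycle are assumptions made for illustration; the function names are hypothetical and the IMI's own implementation may differ.

```python
from datetime import date, timedelta


def update_cycles(start, end, freq_days):
    """Split [start, end) into consecutive windows of `freq_days` days.

    The final window is shortened if the period is not an exact multiple
    of `freq_days` (an assumption for this sketch).
    """
    if freq_days <= 0:
        raise ValueError("freq_days must be positive")
    cycles = []
    current = start
    while current < end:
        nxt = min(current + timedelta(days=freq_days), end)
        cycles.append((current, nxt))
        current = nxt
    return cycles


def nudged_prior(original_prior, previous_posterior, nudge):
    """Blend the original prior with the previous posterior, element-wise."""
    return [
        nudge * orig + (1.0 - nudge) * post
        for orig, post in zip(original_prior, previous_posterior)
    ]


cycles = update_cycles(date(2018, 5, 1), date(2018, 5, 22), 7)
print(len(cycles))  # 3
print(nudged_prior([10.0, 20.0], [14.0, 10.0], 0.5))  # [12.0, 15.0]
```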
State vector
|
Boolean for whether the IMI should automatically create a rectilinear state vector for the inversion. If |
|
Number of buffer elements (clusters of GEOS-Chem grid cells lying outside the region of interest) to add to the state vector of emissions being optimized in the inversion. Default value is |
|
Width of the buffer elements, in degrees; will not be used if |
|
Land-cover fraction below which to exclude GEOS-Chem grid cells from the state vector when creating the state vector file. Default value is |
|
Offshore GEOS-Chem grid cells with oil/gas emissions above this threshold will be included in the state vector. Default value is |
|
Boolean to optimize boundary conditions during the inversion. Must also include |
|
Boolean to optimize OH during the inversion. Must also include |
Point source datasets
|
Optional list of public datasets to use for visualization of point sources to be included in state vector clustering. Only available option is |
Clustering Options
For more information on using the clustering options, see the clustering options page.
|
Boolean for whether to reduce the dimension of the state vector from the native resolution version by clustering elements. If |
|
Boolean for whether to update the state vector clustering with each Kalman filter update. Note: |
|
Clustering method to use for state vector reduction (e.g. “kmeans” or “mini-batch-kmeans”). |
|
Number of elements in the reduced dimension state vector. This is only used if |
|
YAML list of coordinates, given as [lat, lon], that you would like to force as native resolution state vector elements. This is useful for ensuring hotspot locations are at the highest available resolution. |
Custom/pre-generated state vector
These settings are only used if CreateAutomaticRectilinearStateVectorFile is false. Use them to create a custom state vector file from a shapefile in conjunction with the statevector_from_shapefile.ipynb Jupyter notebook located at:
$ /home/ubuntu/integrated_methane_inversion/src/notebooks/statevector_from_shapefile.ipynb
|
Path to the custom or pre-generated state vector netCDF file. The file will be saved here if it is generated from a shapefile. |
|
Path to the shapefile. |
Note: To set up a remote Jupyter notebook, see the quick start guide section on visualizing results with Python.
Inversion
|
Error in the prior estimates (1-sigma; relative). Default is |
|
Error in the prior estimates (relative percent). Default is |
|
Error in the prior estimates (using ppb). Default is |
|
Observational error (1-sigma; absolute; ppb). Default value is |
|
Regularization parameter; typically between 0 and 1. Default value is |
|
Boolean for whether the Jacobian matrix has already been computed ( |
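For orientation, the error settings and the regularization parameter above can be read against a Bayesian cost function of the following standard form. This expression is given as an assumption for illustration; this page does not state the exact expression the IMI minimizes.

```latex
J(\mathbf{x}) = (\mathbf{x}-\mathbf{x}_A)^{T}\,\mathbf{S}_A^{-1}\,(\mathbf{x}-\mathbf{x}_A)
              + \gamma\,(\mathbf{y}-\mathbf{K}\mathbf{x})^{T}\,\mathbf{S}_O^{-1}\,(\mathbf{y}-\mathbf{K}\mathbf{x})
```

Here $\mathbf{x}$ is the state vector, $\mathbf{x}_A$ the prior estimate, $\mathbf{S}_A$ the prior error covariance (set by the prior error), $\mathbf{y}$ the observations, $\mathbf{K}$ the Jacobian matrix, $\mathbf{S}_O$ the observational error covariance (set by the observational error), and $\gamma$ the regularization parameter. In this form, a value of $\gamma$ below 1 reduces the weight given to the observations relative to the prior.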
Grid
|
Resolution for inversion. Options are |
|
Meteorology to use for the inversion. Options are |
Setup modules
These settings turn on/off (true / false) different steps for setting up the IMI.
|
Boolean to create a GEOS-Chem run directory and modify it with settings from |
|
Boolean to set up a run directory for the spinup simulation by copying the template run directory and modifying the start/end dates, restart file, and diagnostics. |
|
Boolean to set up run directories for N+1 simulations (one reference simulation, plus N sensitivity simulations for the N state vector elements) by copying the template run directory and modifying the start/end dates, restart file, and diagnostics. Output from these simulations will be used to construct the Jacobian. |
|
Boolean to set up the inversion directory containing scripts needed to perform the inverse analysis; inversion results will be saved here. |
|
Boolean to set up the run directory for the posterior simulation by copying the template run directory and modifying the start/end dates, restart file, and diagnostics. |
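The N+1 simulations described above (one reference plus N sensitivity simulations) lend themselves to a finite-difference construction of the Jacobian. The sketch below shows that general technique with plain Python lists. It assumes every sensitivity simulation perturbs exactly one state vector element by the same amount; the function name `finite_difference_jacobian` is hypothetical, and the IMI's own Jacobian code may be organized differently.

```python
def finite_difference_jacobian(reference, perturbed_runs, perturbation):
    """Build K with K[i][j] = (y_j[i] - y_ref[i]) / perturbation.

    reference      : simulated observations from the reference run (length m)
    perturbed_runs : N lists of length m, one per perturbed state element
    perturbation   : size of the perturbation applied to each element
    """
    n_obs = len(reference)
    jacobian = [[0.0] * len(perturbed_runs) for _ in range(n_obs)]
    for j, run in enumerate(perturbed_runs):
        if len(run) != n_obs:
            raise ValueError("every run must have one value per observation")
        for i in range(n_obs):
            # Column j holds the sensitivity of each observation to element j.
            jacobian[i][j] = (run[i] - reference[i]) / perturbation
    return jacobian


reference = [1800.0, 1810.0]
perturbed = [[1801.0, 1810.0], [1800.0, 1812.0]]
print(finite_difference_jacobian(reference, perturbed, 0.5))
# [[2.0, 0.0], [0.0, 4.0]]
```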
Run modules
These settings turn on/off (true / false) different steps for running the inversion.
|
Boolean to run the setup script ( |
|
Boolean to run the spinup simulation. |
|
Boolean to run the reference and sensitivity simulations. |
|
Boolean to run the inverse analysis code. |
|
Boolean to run the posterior simulation. |
IMI preview
|
Boolean to run the IMI preview ( |
|
Threshold for estimated DOFS below which the IMI should automatically exit with a warning after performing the preview. Default value |
SLURM Resource Allocation
These settings are used to allocate resources (CPUs and Memory) to the different simulations needed to run the inversion.
Note: some Python scripts are also deployed using SLURM and default to using the SimulationCPUs and SimulationMemory settings.
|
Maximum amount of time to allocate to each sbatch job (e.g. “0-6:00”). |
|
Number of cores to allocate to each in-series simulation. |
|
Amount of memory to allocate to each in-series simulation (in MB). |
|
Number of cores to allocate to each Jacobian simulation (run in parallel). |
|
Amount of memory to allocate to each Jacobian simulation (in MB). |
|
Name of the partition(s) you would like all SLURM jobs to run on (e.g. “debug,huce_intel,seas_compute”). |
|
The maximum number of Jacobian simulations to run simultaneously. The default is -1 (no limit), which submits all Jacobian simulations at once. If the value is greater than zero, the sbatch array statement is modified to include the “%” separator, which limits the number of simultaneously running tasks from the job array to the specified value. |
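The effect of the simultaneous-run limit on the sbatch array statement can be sketched as follows. The `%` separator is standard sbatch job array syntax; the function name `array_spec` and the 1-based index range are assumptions for this example and may not match how the IMI numbers its Jacobian jobs.

```python
def array_spec(n_jobs, max_simultaneous=-1):
    """Return the value for sbatch's --array option.

    A non-positive `max_simultaneous` means no limit, so no "%" is added.
    """
    spec = f"1-{n_jobs}"  # assumed 1-based indexing of the job array
    if max_simultaneous > 0:
        # "%M" caps the number of array tasks running at the same time.
        spec += f"%{max_simultaneous}"
    return spec


print(array_spec(100))      # 1-100
print(array_spec(100, 10))  # 1-100%10
```

The resulting string would be passed as `sbatch --array=1-100%10 ...` to run at most ten tasks at a time.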
Advanced settings: GEOS-Chem options
These settings are intended for advanced users who wish to modify additional GEOS-Chem options.
|
Value to perturb emissions by in each sensitivity simulation. Default value is |
|
Value to perturb OH by if using |
|
Number of ppb to perturb emissions by for domain edges (North, South, East, West) if using |
|
Boolean to apply emissions scale factors derived from a previous inversion. This file should be provided as a netCDF file and specified in HEMCO_Config.rc. Default value is |
|
Boolean to apply OH scale factors derived from a previous inversion. This file should be provided as a netCDF file and specified in HEMCO_Config.rc. Default value is |
|
Boolean to save out hourly diagnostics from GEOS-Chem. This output is used in satellite operators via post-processing. Default value is |
|
Boolean to save out the planeflight diagnostic in GEOS-Chem. This output may be used to compare GEOS-Chem against planeflight data. The path to those data must be specified in input.geos. See the planeflight diagnostic documentation for details. Default value is |
|
Boolean to turn on the GOSAT observation operator in GEOS-Chem. This will save out text files comparing GEOS-Chem to observations, but has to be manually incorporated into the IMI. Default value is |
|
Boolean to turn on the TCCON observation operator in GEOS-Chem. This will save out text files comparing GEOS-Chem to observations, but has to be manually incorporated into the IMI. Default value is |
|
Boolean to turn on the AIRS observation operator in GEOS-Chem. This will save out text files comparing GEOS-Chem to observations, but has to be manually incorporated into the IMI. Default value is |
Advanced settings: Local cluster
These settings are intended for advanced users who wish to run the IMI on a local cluster.
|
Path for IMI runs and output. |
|
Path to GEOS-Chem input data. |
|
Path to TROPOMI input data. |
|
Path to file containing Conda environment settings. |
|
Name of the Conda environment. |
|
Boolean for downloading an initial restart file from AWS S3. Default value is |
|
Path to initial GEOS-Chem restart file plus file prefix (e.g. |
|
Path to initial GEOS-Chem restart file plus file prefix (e.g. |
|
Path to GEOS-Chem boundary condition files (for regional simulations). |
|
Version of TROPOMI smoothed boundary conditions to use (e.g. |
|
Boolean to download missing GEOS-Chem data for the preview run. Default value is |
|
Boolean to download missing GEOS-Chem data for the spinup simulation. Default value is |
|
Boolean to download missing GEOS-Chem data for the production (i.e. Jacobian) simulations. Default value is |
|
Boolean to download missing GEOS-Chem data for the posterior simulation. Default value is |
|
Boolean to download missing GEOS-Chem data for the preview run. Default value is |
|
Boolean to download missing GEOS-Chem boundary condition files. Default value is |
Note for *DryRun options: If you are running on AWS, you will be charged if your EC2 instance is not in the us-east-1 region. If running on a local cluster, you must have the AWS CLI enabled, or you can modify the ./download_data.py commands in setup_imi.sh to use washu instead of aws. See the GEOS-Chem documentation for more details.