IMI configuration file
This page documents settings in the IMI configuration file (config.yml).
General
|
Name for this inversion; will be used for directory names and prefixes. |
|
Boolean for running the IMI as a batch job with |
|
Boolean for running in safe mode to prevent overwriting existing files. |
|
Boolean for uploading output directory to S3. If |
|
S3 path to upload files to (eg. |
|
Files to upload from the IMI Output directory (eg. |
Period of interest
|
Beginning of the inversion period in |
|
End of the inversion period in |
|
Number of months for the spinup simulation. |
TROPOMI data type
|
Boolean for whether the blended TROPOMI+GOSAT data should be used ( |
|
Boolean for whether to use observations over water ( |
Region of interest
|
Boolean for using the GEOS-Chem regional simulation. This should be set to |
|
Two-character region ID for using pre-cropped meteorology fields. Select |
|
Minimum longitude edge of the region of interest (only used if |
|
Maximum longitude edge of the region of interest (only used if |
|
Minimum latitude edge of the region of interest (only used if |
|
Maximum latitude edge of the region of interest (only used if |
Kalman filter options
|
Boolean for running the IMI using a Kalman filter for continuous updates ( |
|
Number of days in each Kalman filter update cycle eg. |
|
Fraction of original prior emissions to use in the prior for each Kalman filter update (eg. |
|
Option to automatically create |
|
Path to custom |
|
Optional variable to specify which Kalman period to start on, if restarting an inversion. Default is |
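The update-cycle setting above can be pictured as splitting the inversion period into consecutive windows. The following is only an illustrative sketch, not the IMI's own code: the function name `kalman_periods` is hypothetical, the dates are made-up examples, and truncating the last cycle at the end of the inversion period is an assumption.

```python
from datetime import date, timedelta


def kalman_periods(start, end, update_days):
    """Split [start, end) into consecutive cycles of `update_days` days.

    Hypothetical helper for illustration; a final, shorter cycle is kept
    and truncated at `end` (an assumption, not confirmed IMI behavior).
    """
    if update_days <= 0:
        raise ValueError("update_days must be positive")
    periods = []
    cycle_start = start
    while cycle_start < end:
        cycle_end = min(cycle_start + timedelta(days=update_days), end)
        periods.append((cycle_start, cycle_end))
        cycle_start = cycle_end
    return periods


# Example: a three-week inversion period with weekly updates.
periods = kalman_periods(date(2019, 1, 1), date(2019, 1, 22), 7)
for number, (p_start, p_end) in enumerate(periods, start=1):
    print(number, p_start.isoformat(), p_end.isoformat())
```

Each tuple holds the start (inclusive) and end (exclusive) of one update cycle.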
State vector
|
Boolean for whether the IMI should automatically create a rectilinear state vector for the inversion. If |
|
Number of buffer elements (clusters of GEOS-Chem grid cells lying outside the region of interest) to add to the state vector of emissions being optimized in the inversion. Default value is |
|
Width of the buffer elements, in degrees; will not be used if |
|
Land-cover fraction below which to exclude GEOS-Chem grid cells from the state vector when creating the state vector file. Default value is |
|
Offshore GEOS-Chem grid cells with oil/gas emissions above this threshold will be included in the state vector. Default value is |
|
Boolean to optimize boundary conditions during the inversion. Must also include |
|
Boolean to optimize OH during the inversion. Must also include |
Point source datasets
|
Optional list of public datasets to use for visualization of point sources to be included in state vector clustering. Only available option is |
Clustering Options
For more information on using the clustering options take a look at the clustering options page.
|
Boolean for whether to reduce the dimension of the statevector from the native resolution version by clustering elements. If |
|
Boolean for whether to update the statevector clustering with each Kalman Filter update. Note: |
|
Clustering method to use for state vector reduction. (eg. “kmeans” or “mini-batch-kmeans”) |
|
Maximum number of native resolution elements in a cluster. Default value is |
|
Aggregate DOFS that a cluster must have before being added to the grid. Making this value higher will smooth out the clustering. Default value is |
|
Number of elements in the reduced dimension state vector. This is only used if |
|
YAML list of coordinates ([lat, lon]) that you would like to force as native-resolution state vector elements. This is useful for ensuring hotspot locations are at the highest available resolution. |
Custom/pre-generated state vector
These settings are only used if CreateAutomaticRectilinearStateVectorFile is false. Use them, together with the statevector_from_shapefile.ipynb Jupyter notebook, to create a custom state vector file from a shapefile. The notebook is located at:
$ /home/ubuntu/integrated_methane_inversion/src/notebooks/statevector_from_shapefile.ipynb
|
Path to the custom or pre-generated state vector NetCDF file. The file will be saved here if it is generated from a shapefile. |
|
Path to the shapefile. |
Note: To set up a remote Jupyter notebook, see the "Visualize results with Python" section of the quick start guide.
Inversion
|
Boolean for whether to use a lognormal error distribution when calculating emissions in the domain of interest. Note: Normal error is used for buffer elements and boundary condition optimization. |
|
Vector of errors in the prior estimates (1-sigma; relative). Default is |
|
Vector of errors in the OH estimates (relative percent). Default is |
|
Vector of errors in the prior estimates (using ppb). Default is |
|
Vector of errors in the prior estimates for buffer elements (1-sigma; relative). Default is |
|
Vector of observational errors (1-sigma; absolute; ppb). Default value is |
|
Vector of regularization parameters; typically between 0 and 1. Default value is |
|
Boolean for whether the Jacobian matrix has already been computed ( |
|
Path to the reference run directory containing previously generated Jacobian. Only used if |
Grid
|
Resolution for inversion. Options are |
|
Meteorology to use for the inversion. Options are |
Setup modules
These settings turn on/off (true / false) different steps for setting up the IMI.
|
Boolean to run the setup script ( |
|
Boolean to create a GEOS-Chem run directory and modify it with settings from |
|
Boolean to set up a run directory for the spinup-simulation by copying the template run directory and modifying the start/end dates, restart file, and diagnostics. |
|
Boolean to set up run directories for N+1 simulations (one reference simulation, plus N sensitivity simulations for the N state vector elements) by copying the template run directory and modifying the start/end dates, restart file, and diagnostics. Output from these simulations will be used to construct the Jacobian. |
|
Boolean to set up the inversion directory containing scripts needed to perform the inverse analysis; inversion results will be saved here. |
|
Boolean to set up the run directory for the posterior simulation by copying the template run directory and modifying the start/end dates, restart file, and diagnostics. |
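To illustrate how output from one reference simulation and N sensitivity simulations can be turned into a Jacobian, here is a generic finite-difference sketch. It is not the IMI's implementation: the function name, the toy concentrations, and the perturbation size are all hypothetical, and each sensitivity run is assumed to perturb exactly one state vector element by the same amount.

```python
def finite_difference_jacobian(reference, perturbed_runs, perturbation):
    """Return the Jacobian as a list of rows, one row per observation.

    reference      : simulated observations from the reference run (length m)
    perturbed_runs : N lists of length m, one per sensitivity simulation
    perturbation   : perturbation applied to each state vector element
    """
    if perturbation == 0:
        raise ValueError("perturbation must be non-zero")
    n_obs = len(reference)
    jacobian = [[0.0] * len(perturbed_runs) for _ in range(n_obs)]
    for j, run in enumerate(perturbed_runs):
        if len(run) != n_obs:
            raise ValueError("each sensitivity run must match the reference length")
        for i in range(n_obs):
            # Column j: change in observation i per unit change in element j.
            jacobian[i][j] = (run[i] - reference[i]) / perturbation
    return jacobian


# Toy example: 2 observations, N = 3 state vector elements (N + 1 = 4 runs).
reference = [1800.0, 1810.0]
perturbed_runs = [[1802.0, 1810.0], [1800.0, 1815.0], [1801.0, 1811.0]]
K = finite_difference_jacobian(reference, perturbed_runs, perturbation=0.5)
print(K)
```

The resulting matrix has one row per observation and one column per state vector element.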
Run modules
These settings turn on/off (true / false) different steps for running the inversion.
|
Boolean to run a HEMCO standalone simulation to generate the prior emissions. |
|
Boolean to run the spin-up simulation. |
|
Boolean to run the reference and sensitivity simulations. |
|
Boolean to only re-run sensitivity simulations that have not yet completed successfully. This is useful for resuming an interrupted inversion. |
|
Boolean to specify whether the IMI should rerun all sensitivity simulations ( |
|
Boolean to run the inverse analysis code. |
|
Boolean to run the posterior simulation. |
IMI preview
|
Boolean to run the IMI preview ( |
|
Threshold for estimated DOFS below which the IMI should automatically exit with a warning after performing the preview.
Default value |
Job Resource Allocation
These settings are used to allocate resources (CPUs and Memory) to the different simulations needed to run the inversion.
Note: some Python scripts are also deployed using slurm and default to using the RequestedCPUs and RequestedMemory settings. If the inversion step requires more resources than the rest of the IMI workflow, the optional InversionCPUs and InversionMemory variables can be convenient.
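The fallback described above can be sketched as follows. This is a minimal illustration that assumes the configuration has already been loaded into a plain dictionary; the function name `inversion_resources` and the numeric values are hypothetical, and only the four setting names come from this page.

```python
def inversion_resources(config):
    """Return (cpus, memory_mb) for the inversion job.

    Falls back to RequestedCPUs / RequestedMemory when the optional
    InversionCPUs / InversionMemory settings are missing or empty.
    """
    cpus = config.get("InversionCPUs")
    if cpus is None:
        cpus = config["RequestedCPUs"]
    memory = config.get("InversionMemory")
    if memory is None:
        memory = config["RequestedMemory"]
    return cpus, memory


# Hypothetical values: only the inversion CPU count is overridden.
config = {"RequestedCPUs": 8, "RequestedMemory": 10000, "InversionCPUs": 32}
print(inversion_resources(config))
```

With this pattern, the general settings need to be raised only when every step of the workflow requires more resources.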
|
Number of cores to allocate to slurm jobs. |
|
Amount of memory to allocate to each in series simulation (in MB). |
|
Max amount of time to allocate to each sbatch job (eg. “0-6:00”) |
|
Optional Variable. Number of cores to allocate to the inversion job if different from |
|
Optional Variable. Amount of memory to allocate to inversion sbatch job (in MB) if different from |
|
Optional Variable. Max amount of time to allocate to inversion sbatch job (eg. “0-6:00”) if different from |
|
Name of the partition(s) you would like all slurm jobs to run on (eg. “debug,huce_cascade,seas_compute,etc”). |
|
The maximum number of Jacobian simulations to run simultaneously. The default is -1 (no limit), which submits all Jacobian simulations at once. If the value is greater than zero, the sbatch array statement is modified to include the “%” separator, which limits the number of simultaneously running tasks from the job array to the specified value. |
|
The number of tracers to use for each Jacobian simulation. A value of 1 will create and submit a Jacobian run for each state vector element. Specifying a value greater than 1 will combine state vector elements into fewer runs. The default value is 5 tracers per simulation. |
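As a rough illustration of how the last two settings interact, the sketch below estimates the number of Jacobian runs and builds a slurm job array range with the “%” limit. Both function names are hypothetical, the count excludes the reference simulation, and the 1-based array indexing is an assumption rather than a description of the IMI's scripts.

```python
def jacobian_run_count(n_elements, tracers_per_run):
    """Number of Jacobian runs needed when elements are grouped into runs."""
    if n_elements < 0 or tracers_per_run < 1:
        raise ValueError("need n_elements >= 0 and tracers_per_run >= 1")
    # Integer ceiling division: a partially filled run still counts as a run.
    return (n_elements + tracers_per_run - 1) // tracers_per_run


def sbatch_array_spec(n_runs, max_simultaneous=-1, first_index=1):
    """Build the index range for an `sbatch --array=` option.

    A positive `max_simultaneous` appends slurm's "%" separator, which
    caps the number of array tasks running at the same time.
    """
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    last_index = first_index + n_runs - 1
    spec = f"{first_index}-{last_index}"
    if max_simultaneous > 0:
        spec += f"%{max_simultaneous}"
    return spec


# Example: 103 state vector elements, 5 tracers per run, at most 10 at once.
n_runs = jacobian_run_count(103, 5)
print(n_runs, sbatch_array_spec(n_runs, max_simultaneous=10))
```

With a value of -1 (no limit), the range is emitted without the “%” suffix and all array tasks are eligible to start at once.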
Advanced settings: Observing System Simulation Experiment (OSSE)
These settings are intended for advanced users who wish to run an OSSE. This effectively runs the inversion using simulated pseudo-observations with a known prior emissions field. The IMI will generate synthetic observations by randomly perturbing the prior emissions and adding noise to the generated observations based on user specification.
Advanced settings: GEOS-Chem options
These settings are intended for advanced users who wish to modify additional GEOS-Chem options.
|
Target perturbation amount on the emissions in each sensitivity simulation. Default value is |
|
Value to perturb OH by if using |
|
Number of ppb to perturb emissions by for domain edges (North, South, East, West) if using |
|
Boolean to save out hourly diagnostics from GEOS-Chem. This output is used in satellite operators via post-processing. Default value is |
|
Boolean to save out the planeflight diagnostic in GEOS-Chem. This output may be used to compare GEOS-Chem against planeflight data. The path to those data must be specified in geoschem_config.yml. See the planeflight diagnostic documentation for details. Default value is |
|
Boolean to save out the ObsPack diagnostic in GEOS-Chem. This output may be used to compare GEOS-Chem against NOAA ObsPack data. The path to those data must be specified in geoschem_config.yml. See the ObsPack diagnostic documentation for details. Default value is |
|
Boolean to turn on the GOSAT observation operator in GEOS-Chem. This will save out text files comparing GEOS-Chem to observations, but has to be manually incorporated into the IMI. Default value is |
|
Boolean to turn on the TCCON observation operator in GEOS-Chem. This will save out text files comparing GEOS-Chem to observations, but has to be manually incorporated into the IMI. Default value is |
|
Boolean to turn on the AIRS observation operator in GEOS-Chem. This will save out text files comparing GEOS-Chem to observations, but has to be manually incorporated into the IMI. Default value is |
Advanced settings: Local cluster
These settings are intended for advanced users who wish to run the IMI on a local cluster.
|
Path for IMI runs and output. |
|
Path to GEOS-Chem input data. |
|
Path to TROPOMI input data. |
|
Path to file containing Conda environment settings. |
|
Name of conda environment. |
|
Boolean for downloading an initial restart file from AWS S3. Default value is |
|
Path to initial GEOS-Chem restart file plus file prefix (e.g. |
|
Path to initial GEOS-Chem restart file plus file prefix (e.g. |
|
Path to GEOS-Chem boundary condition files (for regional simulations). |
|
Version of TROPOMI smoothed boundary conditions to use (e.g. |
|
Boolean to download missing GEOS-Chem data for the preview run. Default value is |
|
Boolean to download missing GEOS-Chem data for the spinup simulation. Default value is |
|
Boolean to download missing GEOS-Chem data for the production (i.e. Jacobian) simulations. Default value is |
|
Boolean to download missing GEOS-Chem data for the posterior simulation. Default value is |
|
Boolean to download missing GEOS-Chem data for the preview run. Default value is |
|
Boolean to download missing GEOS-Chem boundary condition files. Default value is |