Constructing an inversion ensemble

You can use the IMI to create a low-cost ensemble of sensitivity inversions with different inversion parameters. This is because the Jacobian matrix computed in the first inversion can easily be reused for additional inversions. This can be useful for understanding the sensitivity of the inversion results to the choice of prior error, observation error, or other inversion parameters.

Automated ensemble generation

The simplest way to generate an ensemble is to run the IMI once with a configuration file that specifies a vector of values for each hyperparameter you want to vary (e.g. PriorError: [0.5, 0.75]). The IMI then runs one inversion for every combination of the hyperparameter values. Each ensemble member is saved to the inversion_results_ensemble.nc and gridded_posterior_ensemble.nc files. This method lets you generate an ensemble quickly, without manually rerunning the IMI with new run directories and configuration files. However, vectors are only supported for the following hyperparameters: PriorError, ObsError, Gamma, PriorErrorBCs, PriorErrorBufferElements, PriorErrorOH.
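To get a feel for how quickly the ensemble grows, the sketch below enumerates every combination of a few hyperparameter vectors. The values are hypothetical, and the member index printed here is only illustrative; it is not necessarily the order in which the IMI numbers its ensemble members.

```python
from itertools import product

# Hypothetical hyperparameter vectors, mirroring what might be set in the config file.
hyperparameters = {
    "PriorError": [0.5, 0.75],
    "ObsError": [10, 15],
    "Gamma": [0.1, 1.0],
}


def enumerate_members(vectors):
    """Return one dict per combination of hyperparameter values."""
    names = list(vectors)
    combos = product(*(vectors[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


members = enumerate_members(hyperparameters)
print(f"{len(members)} ensemble members")
for index, member in enumerate(members):
    # The index is illustrative only (assumption), not the IMI's own numbering.
    print(index, member)
```

Three vectors of two values each already give 2 x 2 x 2 = 8 inversions, so it is worth keeping each vector short.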

The ensemble result files (inversion_results_ensemble.nc and gridded_posterior_ensemble.nc) include an additional coordinate that lets you select the inversion results for an individual ensemble member. In Python, this can be done as follows:

python
import xarray as xr

ds = xr.open_dataset('inversion_results_ensemble.nc')

# select the inversion results for ensemble member 2
ensemble_member_2 = ds.sel(ensemble=2)

# print the hyperparameters used for ensemble member 2
params = ["prior_err", "obs_err", "gamma", "prior_err_bc", "prior_err_oh"]
for param in params:
    print(f"{param}: {ensemble_member_2[param]}")

The output used for the posterior simulation is saved to the inversion_results.nc and gridded_posterior.nc files. The data in these files represent the mean of the ensemble members.

Choosing ensemble members

The IMI generates every possible combination of the hyperparameters specified in the configuration file, regardless of whether some combinations are unrealistic. For example, if you tighten the prior error to a very low value (e.g. 0.01) while weighting the observations heavily via the regularization parameter (e.g. \(\gamma > 1\), set with Gamma), the inversion has little freedom to update the emissions and returns a result very close to the prior despite the heavy weighting of the observations. It is therefore important to choose carefully which ensemble members to include in the ensemble analysis. The visualization notebook includes several built-in visualizations that help identify which ensemble members are realistic. Once identified, unrealistic ensemble members can be removed from the uncertainty analysis.
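A one-element toy problem makes the effect of a very tight prior error concrete. The sketch below is not IMI code: it assumes a scalar state, a linear forward model y = k * x, and a cost function in which Gamma multiplies the observation term. All names and values are hypothetical.

```python
def scalar_posterior(x_prior, y_obs, k, prior_err, obs_err, gamma):
    """Analytical minimum of the scalar cost function (toy assumption):

    J(x) = (x - x_prior)^2 / prior_err^2 + gamma * (y_obs - k * x)^2 / obs_err^2
    """
    prior_weight = 1.0 / prior_err**2
    obs_weight = gamma * k**2 / obs_err**2
    numerator = prior_weight * x_prior + gamma * k * y_obs / obs_err**2
    return numerator / (prior_weight + obs_weight)


# Hypothetical setup: the prior scale factor is 1, the observation implies 2.
common = dict(x_prior=1.0, y_obs=2.0, k=1.0, obs_err=0.5)

# Moderate prior error: the solution moves halfway toward the observation.
loose = scalar_posterior(prior_err=0.5, gamma=1.0, **common)

# Very tight prior error: the solution stays near the prior even with gamma = 2.
tight = scalar_posterior(prior_err=0.01, gamma=2.0, **common)

print(f"loose prior: {loose:.4f}, tight prior: {tight:.4f}")
```

In this toy case the tight prior carries a weight of 1 / 0.01^2 = 10000, compared with 8 for the observation term, so doubling Gamma barely changes the answer.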

Another diagnostic for identifying unrealistic ensemble members is the chi-square metric, \(J_a / n\). This metric is saved in the inversion_results.nc file and measures how closely the inversion results match the value expected from the chi-square distribution (\(J_a / n \approx 1\)). Ideally, this value should be close to 1. If it is much greater or much less than 1, the inversion results are likely unrealistic and the ensemble member should be removed from the uncertainty analysis.
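As a rough illustration of what this metric measures, the sketch below computes \(J_a / n\) for a small set of posterior scale factors. It rests on several assumptions that are not stated on this page: that \(J_a\) is the prior term of the cost function evaluated at the posterior solution, that \(n\) is the number of state vector elements, that the prior error covariance is diagonal, and that the state vector holds scale factors with a prior value of 1. The function name and inputs are hypothetical.

```python
def normalized_prior_cost(posterior_scale_factors, prior_err, prior_value=1.0):
    """Return J_a / n for a diagonal prior error covariance (assumption).

    J_a = sum(((x_i - prior_value) / prior_err) ** 2) over all n elements.
    """
    n = len(posterior_scale_factors)
    j_a = sum(((x - prior_value) / prior_err) ** 2 for x in posterior_scale_factors)
    return j_a / n


# Hypothetical posterior that departs from the prior by about one prior error.
reasonable = normalized_prior_cost([1.5, 0.5, 1.25, 0.75], prior_err=0.5)

# Hypothetical posterior that barely moved from the prior.
stuck_at_prior = normalized_prior_cost([1.01, 0.99], prior_err=0.5)

print(f"reasonable: {reasonable:.3f}, stuck at prior: {stuck_at_prior:.4f}")
```

A value far below 1, as in the second case, suggests that the inversion changed the emissions much less than the stated prior error would allow.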

By default, the IMI filters the ensemble so that it only includes members with \(J_a / n\) between 0.5 and 2.0. This filtering can be modified or turned off by changing the filter_ens_members settings in the invert.py and lognormal_invert.py scripts.
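If you want to apply a different range yourself during analysis, the filter is simple to reproduce. The sketch below is a stand-alone illustration, not the implementation in invert.py. The function name and the example values are hypothetical, and treating the bounds as inclusive is an assumption.

```python
def filter_members(ja_over_n_by_member, lower=0.5, upper=2.0):
    """Return the ensemble member ids whose J_a / n lies within [lower, upper]."""
    return [
        member
        for member, value in ja_over_n_by_member.items()
        if lower <= value <= upper
    ]


# Hypothetical J_a / n values keyed by ensemble member id.
ja_over_n = {0: 0.02, 1: 0.8, 2: 1.4, 3: 3.1}

kept = filter_members(ja_over_n)
print("members kept:", kept)

# A wider, user-chosen range keeps more members.
kept_wide = filter_members(ja_over_n, lower=0.01, upper=5.0)
print("members kept with wider range:", kept_wide)
```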

The visualization notebook creates the following figures to analyze the ensemble spread and sensitivity of the inversion results to the hyperparameters:

In this example, the second figure shows that the inversion results are highly sensitive to the regularization parameter Gamma. As noted above, a Gamma value that is too low or too high can lead to unrealistic inversion results. Here, a Gamma value of 0.1 is too low, causing the inversion to return the prior emissions. In this case, it would be reasonable to remove the ensemble members with low Gamma from the uncertainty analysis.

Manual ensemble generation

In this scenario, you have already run your base inversion. You can then create an ensemble of inversions by setting up a new run directory that references the precomputed Jacobian matrix from the base run directory. This method is useful for generating an ensemble with inputs that cannot be specified as vectors in the configuration file. For example, you may want to run an ensemble of sensitivity inversions that swap out prior emission inventories or use lognormal instead of Gaussian prior error distributions.

Note: When applying state vector clustering, you cannot use the precomputed Jacobian if you swap out prior emission inventories. This is because the underlying distribution of emissions within individual state vector clusters may change, which requires a new Jacobian matrix to be computed.

See the Common configurations page for instructions on how to re-configure the IMI to use a pre-computed Jacobian in a new run directory.