Why is the Sobol sensitivity (S1) so large in my model?

I am running a model in Julia with 42 parameters (which are somewhat correlated with each other) and 6 outputs, and I am trying to use the gsa function from the GlobalSensitivity.jl package to compute the ST and S1 sensitivity indices. The ST values look OK, but there are a lot of huge f16 values (Float32 numbers on the order of 1e16) among the S1 values… I am using N = 10000 for the sample size, and

A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)

to sample the design matrices for the GSA… Attached is the S1 matrix. I am wondering why there are so many f16 values, for example for the 13th parameter and the last two parameters…

[Screenshot: S1 matrix]

Could you show a full example? The results look strange to me.

What do you mean by a full example? The code that calculates the sensitivity is embedded in other subroutines… Here is the function that does it:

function globalSensitivity(cost_function, method_options, p_bounds, ::GlobalSensitivitySobolDM; batch=true)
    # build the QMC sampler from its name and options
    sampler = getproperty(SindbadOptimization.GlobalSensitivity, Symbol(method_options.sampler))(; method_options.sampler_options...)
    samples = method_options.samples
    lb = first.(p_bounds)   # lower parameter bounds
    ub = last.(p_bounds)    # upper parameter bounds
    @debug samples
    # two independent design matrices for the Sobol estimator
    A, B = QuasiMonteCarlo.generate_design_matrices(samples, lb, ub, sampler)
    @debug size(A)
    @debug method_options
    results = gsa(cost_function, Sobol(; method_options.method_options...), A, B; method_options..., batch=batch)
    return results
end

results is the out_sensitivity.sensitivity in the screenshot… In this call I set N = 10000 and use a cost function that returns the loss as a matrix of size (number of outputs, number of evaluations) = (6, 10000*(42+2)).
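For context, a minimal sketch of the shape contract that a batched cost function has to satisfy for `gsa(...; batch=true)`: the parameters arrive as a (n_params, n_evals) matrix and the return value must be a (n_outputs, n_evals) matrix. The function name and the toy loss below are illustrative stand-ins, not the actual model from this thread.

```julia
# Sketch of a batched cost function for gsa with batch=true.
# `p` is a (n_params, n_evals) matrix; return a (n_outputs, n_evals)
# matrix with one column of losses per parameter set.
function batched_cost(p::AbstractMatrix)
    n_outputs = 6
    n_evals = size(p, 2)
    out = zeros(eltype(p), n_outputs, n_evals)
    for j in 1:n_evals
        θ = view(p, :, j)                 # one parameter set
        for i in 1:n_outputs
            out[i, j] = sum(abs2, θ) / i  # placeholder loss, not the real model
        end
    end
    return out
end
```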

It’s just one row. Possibly you have a bunch of zeros or something else going on? We would need a way to run the model to really diagnose it, but that one row is indeed odd.

No? The f12/f16 values (i.e., on the order of 1e12–1e16) appear in the 13th row, and also the last two rows…

It’s quite hard to share the model here… but could you please let me know how to diagnose it properly? Thanks! We tried narrowing the parameters down from 42 to only 10 (the 10 parameters of one small submodel within this big model), and then everything is fine… but with all 42 parameters, things are weird…

If you change to higher precision, does it go away? Are the values of that output really large?

What do you mean by “higher precision”? Do you mean increasing the number of samples?

Float64, BigFloat

How do I change it? The model does have an option to tune the output type… but I am not sure whether there is a similar option in the gsa function…

It uses the precision matching your call/outputs
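In other words (toy matrices below, not the real design matrices): since gsa computes in whatever element type the inputs carry, promoting A and B (and the model output) to Float64 switches the whole analysis to double precision. A sketch of the element-wise promotion:

```julia
# gsa has no separate precision option — it inherits the element type of
# its inputs, so promoting the design matrices changes the working precision.
A32 = rand(Float32, 42, 100)   # stand-in for a Float32 design matrix
A64 = Float64.(A32)            # element-wise promotion to Float64
```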

thanks, let me try it…

Thanks for your suggestion… I switched it from Float32 to Float64. However, the result is pretty similar…

However, I noticed that S2 is very big… there are other parameters (besides the 13th and the last two) that show large S2 values, for example the 3rd and 4th parameters in the last two columns of the S2 matrix…

Have you tried to look at the standard error estimates for the indices? I have seen situations like this before which arise due to a highly non-linear model and a lack of convergence in the indices.

Thanks for your reply! But how do I look at the standard error estimates for the indices? Do you mean the confidence interval, or something else?

If you increase the nboot keyword to around 2000, you can then look at the confidence intervals with res.S1_Conf_Int.
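As a side note, the resampling idea behind `nboot` can be shown in a self-contained toy (plain `Statistics`, toy data, nothing from GlobalSensitivity.jl): draw bootstrap resamples with replacement, recompute the statistic, and read off quantiles — roughly what the package does per index before storing the result in `S1_Conf_Int`.

```julia
using Random, Statistics

# Toy bootstrap confidence interval for a mean — the same resampling idea
# that nboot applies to the Sobol indices. Purely illustrative.
Random.seed!(1)
x = randn(200) .+ 0.5                                # stand-in "data"
nboot = 2000
boot = [mean(rand(x, length(x))) for _ in 1:nboot]   # resample with replacement
lo, hi = quantile(boot, [0.025, 0.975])              # 95% interval for the mean
```

A wide interval here plays the same role as a large entry in S1_Conf_Int: the statistic has not converged at this sample size.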

Thanks for your suggestion! With nboot = 2000, the S1_Conf_Int is:
[Screenshot: S1_Conf_Int values]


It seems the error is large…


Given the large error indicated by the confidence intervals, I guess the way to solve it is to increase the number of samples? Or do you have other suggestions… thanks!

With sensitivity indices, negative values indicate poor convergence of the SA, which is common with many parameters, many outputs, and large non-linear models. The few ways to address this (to my knowledge) are:

  • Increasing number of samples
  • Changing the SA method to one that only returns first- and total-order indices (if you’re OK without second order; these methods generally take fewer samples)
  • Training a surrogate model to make the generation of samples quicker
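On the first bullet, a toy check of what increasing the sample count buys (an ordinary Monte Carlo mean, not a Sobol index — just to show the roughly 1/sqrt(N) error decay the advice relies on):

```julia
using Random, Statistics

# Toy convergence check: Monte Carlo error shrinks roughly like 1/sqrt(N),
# which is why re-running the SA at increasing N and watching the indices
# stabilise is a practical convergence test. f is a stand-in, not the model
# from this thread.
Random.seed!(2)
f(x) = sin(2π * x)                 # integral over [0, 1] is exactly 0
err(N) = abs(mean(f, rand(N)))     # |MC estimate - true value|
errors = [err(N) for N in (10^3, 10^4, 10^5)]
```

The same pattern applies to the indices: recompute them at, say, N = 10000, 20000, 40000 and treat them as trustworthy only once they stop moving between runs.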