I used DiffEqSensitivity’s global sensitivity analysis with the Sobol method to quantify the influence of specific input parameters of a model I’m building. I’m already getting some exciting results from the first order indices, and am also interested in the second order effects.
I am however unsure how to interpret the second order results.
My model has 35 input variables and 4 output variables. The resulting second order indices matrix has a size of 595 x 4.
How do I know which entries in the second order indices matrix correspond to the interaction between which two input variables?
@Vaibhavdixit02 might want to comment here.
@Janssena the second order indices proceed as
1-2, 1-3, 1-4, …, 1-n, 2-3, 2-4, …, 2-n, …, i.e. for each parameter, starting from the first, we take the combination of that parameter with all subsequent parameters. This gives 35 choose 2 = 595 combinations in total. I hope this clarifies it?
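For reference, here is a small sketch of that ordering (in Python rather than Julia, purely to illustrate the combinatorics): the pairs appear exactly in `itertools.combinations` order, and the flat column index of a given pair (i, j) can be computed directly.

```python
from itertools import combinations
from math import comb

n = 35  # number of input parameters

# Second-order columns are ordered like combinations(1..n, 2):
# (1,2), (1,3), ..., (1,n), (2,3), ..., (n-1,n)
pairs = list(combinations(range(1, n + 1), 2))
assert len(pairs) == comb(n, 2) == 595

def pair_to_index(i, j, n):
    """Flat 0-based column index of the pair (i, j) with 1 <= i < j <= n."""
    # All pairs starting with 1..i-1 come first: (n-1) + (n-2) + ... + (n-i+1)
    before = sum(n - k for k in range(1, i))
    return before + (j - i - 1)

print(pairs[pair_to_index(1, 2, n)])    # (1, 2)   -> first column
print(pairs[pair_to_index(34, 35, n)])  # (34, 35) -> last column (index 594)
```

So the interaction of parameters 1 and 2 is the first column, and the pair (34, 35) is the last of the 595.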
@Vaibhavdixit02 food for thought: reshape and display it as a matrix?
Hmm, not sure what the reshape should be? Currently it is stored with rows for the dependent variables, and the columns hold the sensitivity of that variable to each parameter (or pair of parameters in the second order case), indexed in the order they are passed in. I think the OP might have the dimensions of their output the other way round (I checked to make sure I remembered correctly).
That is exactly what I meant, yeah! The shape is indeed 4×595. I think it would be clearest to have an input × input matrix per output variable. That makes it easy to look at all interactions for each input variable, and it can immediately be visualized using something like a heatmap.
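As a sketch of what I mean (in Python/NumPy rather than Julia, and with placeholder data), one output's 595-long second-order row can be scattered into a symmetric 35×35 matrix that a heatmap can consume directly:

```python
import numpy as np
from itertools import combinations

n = 35
# Placeholder for one output variable's 595 second-order indices
s2_row = np.random.rand(n * (n - 1) // 2)

# Fill a symmetric n x n matrix; the diagonal stays zero since
# second-order indices are only defined for distinct pairs
M = np.zeros((n, n))
for (i, j), s in zip(combinations(range(n), 2), s2_row):
    M[i, j] = M[j, i] = s

# M is now ready for e.g. matplotlib's plt.imshow(M)
```

Repeating this per row of the 4×595 result gives one interaction matrix per output variable.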
If you want to take a look at using a heatmap with the gsa output, there is an example at https://github.com/PumasAI/PumasTutorials.jl/blob/f97b04da047197643456c61b04deabc0b797892d/tutorials/pkpd/hcvgsa.jmd#L203-L212. It differs a bit since the dispatch for gsa in Pumas returns a separate struct, but it should hopefully give you enough of a hint to work it out. You can ping me if you get stuck.
I was wondering about the second order indices as well.
Using the MWE from the docs
using GlobalSensitivity, QuasiMonteCarlo

function ishi(X)
    A, B = 7, 0.1
    sin(X[1]) + A * sin(X[2])^2 + B * X[3]^4 * sin(X[1])
end

n = 600000
lb = -ones(4) * π
ub = ones(4) * π
sampler = SobolSample()
A, B = QuasiMonteCarlo.generate_design_matrices(n, lb, ub, sampler)

res1 = gsa(ishi, Sobol(order = [0, 1, 2]), A, B)
The sum over all 2nd order indices considerably exceeds 1. Are the 2nd order indices somehow cumulative?
Second order indices wouldn’t sum to 1; that’s only true for the first order. I am not sure what you mean by cumulative? The definition of the second order indices should make it clear why you can’t expect them to sum to 1: they are the sensitivities w.r.t. each pair of the parameters, x1&x2, x1&x3, …, x1&xn, x2&x3, x2&x4, …, xn-1&xn. I hope that makes sense.
The gsa function returns Sobol indices, and in the case of orthogonal inputs the sum of all indices (of all orders) should be equal to 1.
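For the Ishigami function the Sobol indices are actually known in closed form (with the usual a = 7, b = 0.1), so the sum-to-one property can be checked analytically; a quick sketch (in Python, just for the arithmetic):

```python
import math

a, b = 7.0, 0.1
pi4, pi8 = math.pi**4, math.pi**8

# Analytical variance decomposition of the Ishigami function
V   = a**2 / 8 + b * pi4 / 5 + b**2 * pi8 / 18 + 0.5  # total variance
V1  = 0.5 * (1 + b * pi4 / 5)**2                      # main effect of x1
V2  = a**2 / 8                                        # main effect of x2
V13 = 8 * b**2 * pi8 / 225                            # x1-x3 interaction
# V3 and all other interaction terms are exactly zero

S1, S2, S13 = V1 / V, V2 / V, V13 / V
print(round(S1, 4), round(S2, 4), round(S13, 4))  # 0.3139 0.4424 0.2437
print(S1 + S2 + S13)  # sums to 1 up to floating point
```

So the nonzero first order indices sum to about 0.756, and only adding the single nonzero second-order index S13 brings the total to 1.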
See also the ishigami example here.
using Distributions, DataStructures, GlobalSensitivityAnalysis

params = [Symbol("p$i") for i in 1:4]
lb = -ones(4) * π
ub = ones(4) * π
pbounds = [(l, u) for (l, u) in zip(lb, ub)]
data = SobolData(
    params = OrderedDict(p => Uniform(b...) for (p, b) in zip(params, pbounds)),
    N = 10000,
)
samples = GlobalSensitivityAnalysis.sample(data)

function ishi_batch(X)
    A, B = 7, 0.1
    @. sin(X[1, :]) + A * sin(X[2, :])^2 + B * X[3, :]^4 * sin(X[1, :])
end

y = ishi_batch(samples')
sob = analyze(data, y)
Negative values do not make sense, but they can usually be avoided by increasing the number of samples. The direction is correct; compare also with the results from here.
I don’t see the sum to 1 in the link you shared?
There seems to be a negative value in the link as well. Yes, it can be avoided by increasing the number of samples; when an index is so close to 0 it can sometimes turn up negative, and for all measures you can consider it effectively 0 in that case.
If you sum up all entries in the first order vector and the second order matrix, you end up around 1:
Is there an update on this? Shall I open an issue on GitHub linking to this discussion?