Proper Julia syntax for Kriging interpolation with Geostats.jl?

Hi, I’m trying to use Kriging to interpolate between data values in a discrete array.
The following code (and variations of it) give errors like, MethodError: no method matching EstimationProblem(::Set{Float64}, ::RegularGrid{Float64,1}, ::Set{Float64}), and then, UndefVarError: KrigProb not defined in top-level scope....

Any suggestions on syntax that would work? (And how to interpolate between the discrete data points defined below?) Thanks!

using GeoStats
using KrigingEstimators
KrigSampleDatX = Set([1.0, 2.0, 3.0, 4.0, 5.0])
KrigSampleDatVals = Set([1.23,5.35,8.64,9.23,7.76])
KrigSampleGrid = RegularGrid(100)
KrigProb = KrigingEstimators.EstimationProblem(KrigSampleDatX, KrigSampleGrid, KrigSampleDatVals)
KrigSolver = KrigingEstimators.Kriging()
KrigSoln = KrigingEstimators.solve(KrigProb,KrigSolver)

Hi @CosmoProf, you have two approaches to your problem:

Approach 1 (high-level API)

If you are doing interpolation in a geospatial application, then it is a good idea to use the data types in the GeoStats.jl stack to create a high-level problem with all the necessary elements and then call the Kriging solver to solve the problem.

First step consists of georeferencing your data:

using GeoStats

geodata = georef((y=[1.23,5.35,8.64,9.23,7.76],), [1.0 2.0 3.0 4.0 5.0])
5 PointSet{Float64,1}
  variables
    └─y (Float64)

The second step consists of creating the domain where you want to do the interpolation:

geodomain = RegularGrid((1.0,), (5.0,), dims=(100,))
100 RegularGrid{Float64,1}
  origin:  (1.0,)
  spacing: (0.04040404040404041,)

Finally, you define your problem by specifying which variables you want to interpolate:

 problem = EstimationProblem(geodata, geodomain, :y)
1D EstimationProblem
  data:      5 SpatialData{Float64,1}
  domain:    100 RegularGrid{Float64,1}
  variables: y (Float64)

You can solve the problem with the Kriging solver, and specify parameters for each variable:

solver = Kriging(:y => (variogram=GaussianVariogram(),))
solution = solve(problem, solver)
1D EstimationSolution
  domain: 100 RegularGrid{Float64,1}
  variables: y

You can plot the solution and the data to make sure the result meets yours expectations:

using Plots

plot(solution)
plot!(geodata)

Notice that this approach is particularly useful when you have interpolation in 2D and 3D domains. In your case, the second approach may be simpler.

Approach 2 (low-level API)

If all you need is access to a Kriging interpolator on a set of points, you can use the traditional fit/predict workflow in statistics. You don’t need to load the entire GeoStats.jl stack for that.

First you create your interpolator object:

using KrigingEstimators
using Variography

OK = OrdinaryKriging(GaussianVariogram())
OrdinaryKriging{GaussianVariogram{Float64,Distances.Euclidean}}(GaussianVariogram{Float64,Distances.Euclidean}
  range: Float64 1.0
  sill: Float64 1.0
  nugget: Float64 0.0
  distance: Distances.Euclidean
)

Second, you fit it to the data:

krig = fit(OK, [1.0 2.0 3.0 4.0 5.0], [1.23,5.35,8.64,9.23,7.76])

And finally you predict at new locations:

μ, σ² = predict(krig, [1.0])
(1.23, 0.0)

The second result is the conditional variance, which should be zero at interpolation points.

6 Likes

Thanks juliohm, these code examples are very useful!

Probably low-level API is what I’m looking for, but I’ll try both, and see if I have any more questions.

Most likely I’ll be using Universal Kriging (the data points are sampled on a non-constant function, not from a distribution with constant mean.) If there are complexities beyond just changing the word OrdinaryKriging to UniversalKriging, it would be helpful to find out. Thanks again!

You are welcome @CosmoProf. You can always check the documentation of the objects by typing:

?UniversalKriging

in the Julia prompt after the KrigingEstimators package is imported.

Ok, so Kriging interpolation seems to be incredibly inconsistent! For some data sets it’s great; for others, it’s terrible. Using the same commands (given below), for two different data sets, I’ve attached snapshots of the interpolations for each case. Anyone have any idea what’s going on??

KrigTestX = [dat1 dat2 ...]
KrigTestVals = [dat1, dat2, ...]
KrigFitterUnivDeg3 = fit(UniversalKriging(GaussianVariogram(),3,1), 
                                          KrigTestX, KrigTestVals)
KrigPlot_x_Vals = first(KrigTestX):0.001:last(KrigTestX)
KrigPlot_Interp_Vals_Univ3 = Array{Float64,1}(undef, length(KrigPlot_x_Vals))
for i in 1:length(KrigPlot_x_Vals)
    KrigPlot_Interp_Vals_Univ3[i] = 
               predict(KrigFitterUnivDeg3, [KrigPlot_x_Vals[i]])[1]
end
scatter(KrigTestX',KrigTestVals,color="black");
   plot!(KrigPlot_x_Vals,KrigPlot_Interp_Vals_Univ3, color="blue")

UnivKrigDeg3--Hitting_the_Points UnivKrigDeg3--Missing_the_Points

Hi @CosmoProf , the Kriging method requires you to specify an appropriate variogram model for the data. The variogram model dictates the spatial correlation, and in your example you are using a GaussianVariogram() with default parameters regardless of the data. Notice how in the first plot your x values range from 30 to 40, and in your second plot the values range from 1.5 to 1.25. There is at least a 10x change of scale that you are not taking into account.

Please consider reading the tutorials in GeoStatsTutorials to learn more. The GeoStats.jl documentation is also your friend. I try to provide didactic material in both places.

2 Likes

I’ve actually tried a number of times to search through the documentation, and it has some good info, but some stuff (like your low-level API example above) I couldn’t find anywhere. Most importantly I haven’t been able to find any summary of the options for the Variogram method, just some occasional examples of “GaussianVariogram” and “SphericalVariogram” and “range”, without elucidation.

(The farthest I’ve gotten are [DocStrings]((Docstrings · KrigingEstimators.jl), which doesn’t list the Variogram options, just gives a few examples; and KrigingEstimators, which is apparently a dead end.)

Is there a more explicit documentation page that I’ve missed?

Also, I won’t know the data properties in advance, so the interpolation must be done without human intervention each time. (Interpolation is done by an automated “walker” through the data which interpolates thousands of times in a row; the data environment changes.) Can the Variogram parameters (range? etc.?) be determined directly from the data in each array of samples? In particular, I’m comparing the Kriging interpolation performance vs. Cubic Splines; the spline interpolator acts directly on the data without needing prior knowledge of the data properties.

Is just entering, fit(UniversalKriging(Variogram,3,1), instead of fit(UniversalKriging(GaussianVariogram(),3,1), some super-automated way of letting the GeoStats figure out the best Variogram model for the data? Is that what I’m looking for? (Thanks…)

Everything in the GeoStats.jl stack has docstrings. Type ?GaussianVariogram whenever you want to learn more about the options. If the docstrings aren’t clear, please open an issue and we will try to clarify it further.

Variograms are not in the KrigingEstimators.jl module, they are in the Variography.jl module.

I am assuming you went to the GeoStatsTutorials repository and had a chance to read the VariogramModeling tutorial for example.

Whenever you make a comparison, you need to make a statement about the metrics used in the comparison. For instance, if the metric of success in your use case is related to the absence of user intervention, then Kriging may not be the best candidate. The GeoStatsTutorials provide examples of fitting variogram models to data before doing Kriging interpolation, however these examples assume that you have enough data, and that this data is well-distributed spatially, among other things. You need to have very clear in your mind what you are trying to achieve and what are the requirements for a viable solution.

I am afraid my answer won’t satisfy your immediate interests, but I am here to reiterate the importance of understanding the methods you are trying to use. There are good resources out there to learn more about Kriging, geostatistics, and other modern techniques, but you will need to devote time to reading them carefully.

If you don’t have this time right now because you are busy with other tasks, I recommend sitting down with pen and paper and listing the requirements. At least, you will be able to prune methods from the giant pool of statistical methods out there.

1 Like