Repeated samples prevent building a Kriging surrogate model


I’m using my own data to build a Kriging model. There are three columns of input variables and one column for the output variable.

For input, there are:

  1. day_of_year: the date converted to the day of the year. The data cover more than ten years, so values in this column repeat
  2. air_temperature: the air temperature recorded on that day
  3. water_depth: the water depth of the sensor

For output, I have:

  1. water_temperature: water temperature detected by the sensor

When I try to build a Kriging surrogate model, I get the following error message:

There exists a repetition in the samples, cannot build Kriging.

I tried removing duplicate rows before creating the surrogate model, but it did not help.

df_train = df_train[:, [:day_of_year, :env_temp_mean, :ts_dpth, :ts_temp]]
df_train = unique(df_train)

Does this mean that no individual input column may contain repeated values, rather than no input row being repeated? Why does the Kriging model have this restriction?

Kriging is undefined when the distance between two sample points is zero, because the method involves operations such as dividing by that distance. So something has to make sure there are no repeated points: either the library itself, or a check on the user’s inputs.
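
One possible reason the `unique(df_train)` call above did not help: it compares all four columns, including the output, so two rows with identical inputs but different water temperatures both survive. Below is a minimal sketch of that situation in plain Julia, under several assumptions: the data are made up, the Gaussian correlation function with `theta = 1` is just a common Kriging kernel and not necessarily the one your library uses, and `dedupe_inputs` is a hypothetical helper. It averages the outputs of rows that share the same inputs, which is only one of several reasonable ways to resolve duplicates.

```julia
using LinearAlgebra  # standard library, used only for rank()
using Statistics     # standard library, used for mean()

# Gaussian correlation between two points (assumed kernel, theta is arbitrary).
corr(a, b; theta = 1.0) = exp(-theta * sum(abs2, a .- b))

# Correlation matrix of a set of sample points.
corr_matrix(xs) = [corr(a, b) for a in xs, b in xs]

# Hypothetical helper: collapse rows that share the same inputs,
# averaging their outputs and keeping the first-seen order.
function dedupe_inputs(xs, ys)
    groups = Dict{eltype(xs), Vector{Float64}}()
    order = eltype(xs)[]
    for (x, y) in zip(xs, ys)
        if !haskey(groups, x)
            groups[x] = Float64[]
            push!(order, x)
        end
        push!(groups[x], y)
    end
    return order, [mean(groups[x]) for x in order]
end

# Made-up rows: (day_of_year, air_temperature, water_depth) and water_temperature.
xs = [(10.0, 5.0, 1.0), (10.0, 5.0, 1.0), (11.0, 6.0, 1.0)]
ys = [4.0, 4.4, 4.9]

# Rows 1 and 2 differ only in the output, so they are "unique" as full rows,
# yet their inputs coincide and the correlation matrix is rank deficient.
R = corr_matrix(xs)
println("rank before: ", rank(R), " of ", length(xs))

xs_u, ys_u = dedupe_inputs(xs, ys)
R_u = corr_matrix(xs_u)
println("rank after:  ", rank(R_u), " of ", length(xs_u))
```

With DataFrames.jl, the analogous step would be to deduplicate on the input columns only, for example `unique(df_train, [:day_of_year, :env_temp_mean, :ts_dpth])`, which keeps the first row for each input combination instead of averaging. Note that repeated values within a single column, such as the same day of year in different years, are fine as long as the full input row differs.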