I have some data relative to a certain administrative division, let’s say forest volumes per region.
I need to translate this data in a second definition of “regions”, and sometimes I need to do the opposite.
I have the region definition as raster ASCII grid files, so they are substantially a matrix of ids (integers) of the old and new regions.
I implemented a function (see code below) that optionally takes a third layer to consider the weight is this repartition (in my case it would be the forest cover) and return a matrix of “conversion factors” and the “inverted” ones. My problem is however that when I test it with fake data and I go back, and forward I don’t obtain the original data. Why is it? I do have a loss of information so that I can’t recover the original data?
Here is a MWE script:
julia> using Pkg, Pipe, CSV, HTTP, DataFrames
julia> old_regions_url = "https://lef-nancy.org/files/index.php/s/LLWsWwMsDLez5pJ/download" # 100KB
"https://lef-nancy.org/files/index.php/s/LLWsWwMsDLez5pJ/download"
julia> new_regions_url = "https://lef-nancy.org/files/index.php/s/zZkPxw69JQgxoC8/download" # 100KB
"https://lef-nancy.org/files/index.php/s/zZkPxw69JQgxoC8/download"
julia> old_regions = @pipe HTTP.get(old_regions_url).body |>
CSV.File(_, delim=' ', datarow=7, header=false, ignorerepeated=true, missingstring="-9999") |>
DataFrame |>
Matrix;
julia> new_regions = @pipe HTTP.get(new_regions_url).body |>
CSV.File(_, delim=' ', datarow=7, header=false, ignorerepeated=true, missingstring="-9999") |>
DataFrame |>
Matrix;
julia> function compute_partitioning_factors(from_layer, dest_layer,part_layer=ones(size(from_layer)))
from_list = unique(skipmissing(from_layer))
dest_list = unique(skipmissing(dest_layer))
partitioned_area = [sum(skipmissing(part_layer[isequal.(from_layer,fr_el) .&& isequal.(dest_layer,dest_el)])) for fr_el in from_list, dest_el in dest_list]
partitioning_factors = partitioned_area ./ sum(partitioned_area,dims=2)
inverted_partitioning_factors = partitioned_area ./ sum(partitioned_area,dims=1)
return partitioning_factors, inverted_partitioning_factors
end
compute_partitioning_factors (generic function with 2 methods)
julia> partitioning_factors, inverted_partitioning_factors = compute_partitioning_factors(old_regions, new_regions); # without considering the forest cover, just the regional area
julia> partitioning_factors
12×13 Matrix{Float64}:
0.462863 0.0 0.537137 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.997178 0.000940734 0.0 0.00188147 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0021322 0.995736 0.0 0.0 0.0 0.0 0.0021322 0.0 0.0 0.0 0.0
0.0 0.00140647 0.0 0.0 0.998594 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.389881 0.0 0.0 0.00297619 0.0 0.0 0.605655 0.0 0.0014881 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.464208 0.0 0.0 0.0021692 0.0 0.0 0.0 0.533623 0.0
0.0 0.0 0.0 0.00293255 0.0 0.0 0.271261 0.0 0.725806 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.00132802 0.00531208 0.0 0.992032 0.00132802 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00110865 0.0 0.998891 0.0 0.0
0.0 0.0 0.0 0.0 0.00141844 0.0 0.0 0.997163 0.0 0.0 0.0 0.00141844 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
julia> inverted_partitioning_factors
12×13 Matrix{Float64}:
1.0 0.0 0.996008 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.801209 0.00199601 0.0 0.00174978 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.00199601 0.995736 0.0 0.0 0.0 0.0 0.00201207 0.0 0.0 0.0 0.0
0.0 0.000755858 0.0 0.0 0.621172 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.198035 0.0 0.0 0.00174978 0.0 0.0 0.364695 0.0 0.0013369 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.374453 0.0 0.0 0.00179211 0.0 0.0 0.0 0.997972 0.0
0.0 0.0 0.0 0.00426439 0.0 0.0 0.994624 0.0 0.995976 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.00537634 0.00358423 0.0 0.998663 0.00110865 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.00201207 0.0 0.998891 0.0 0.0
0.0 0.0 0.0 0.0 0.000874891 0.0 0.0 0.629928 0.0 0.0 0.0 0.0020284 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
julia> testorig = [1,10,100,1000,1,10,100,1000,1,10,100,1000]
12-element Vector{Int64}:
1
10
100
1000
1
10
100
1000
1
10
100
1000
julia> reparted = (testorig' * partitioning_factors)'
13-element Vector{Float64}:
0.4628632938643703
15.277057271539391
0.7597636600637447
102.50611208723872
1045.2047750859506
1.0
271.262325088697
105.99509150047489
726.0307577036091
1.0069128248909125
9.990241546747232
53.50409993692404
1000.0
julia> testorig2 = inverted_partitioning_factors * reparted # These are somehow similar but different than the original data !
12-element Vector{Float64}:
1.219593965265186
14.070517713047593
103.5313287782654
649.2638572678067
1.0
43.51153038770419
444.96579344842854
993.350156524107
2.8549526600424606
11.439992365183173
67.79227689014971
1000.0
# When I call the function with the inverted new/old region definition I can find back the "inverted" partitioning factors:
julia> partitioning_factors2, inverted_partitioning_factors2 = compute_partitioning_factors(new_regions,old_regions);
julia> partitioning_factors2' == inverted_partitioning_factors
true