When plotting a choropleth map, how to "read" the appropriate value of the topojson file and the dataframe?

Aizzaac · February 13, 2020, 10:19pm

Hi

@oheil
@davidanthoff

I want to plot a choropleth map using a topojson (I converted the .shp file to a topojson) and a dataframe.

The topojson has the “polygons”, “coordinates”, and the names of the districts.
The dataframe has many columns. But the useful ones for this case are: “count” (this is the # of IPs) and “featureId” (this has the names of the districts).

What these files have in common is the names of the districts.

Here are some images to help you understand:

TOPOJSON FILE

DATAFRAME

This is my code:

@vlplot(width=1000, height=800) +
@vlplot(
    mark={
        :geoshape
    },
    data={
        values=JSON.parsefile("/home/juliana/roc/Data/Topojson/disa_ONT_region.topojson"),
        format={
            type=:topojson,
            feature=:disa_ONT_region
        }
    },
    transform=[{
        lookup=:featureId,
        from={
            data=df_splunk,
            key=:featureId,
            fields=["count"]
        }
    }],
    color={
        "rate:q",
        scale={domain=[0, 0.15], scheme=:reds},
        legend={title="IPs Rate in Ontario - BST"}
    },
    projection={
        type=:albersUsa
    }
)

In the “transform” block, I am calling my dataframe (df_splunk).
I want “VegaLite” to use the column “featureId” (names of districts) and the “count” (number of IPs) to visualize a choropleth map. But I have not succeeded yet.

I know that in “lookup” I have to put the field to look for in the dataframe. And that “field” is what both files have in common (in this case the names of the districts).

But in what part am I telling VegaLite to use the names of the districts from the topojson file? I guess it is “properties” or “DA_ID” in the topojson file.

So, what am I doing wrong?

Thank you

Here some of the documentation and tutorials that I have checked:

https://www.queryverse.org/VegaLite.jl/dev/examples/examples_maps/

https://www.flirtwithjulia.com/2019/02/18/Choropleth-Map-With-VegaLite.jl.html

oheil · February 14, 2020, 12:46pm

Could you try the following:

df_splunk2=DataFrame(DA_ID=df_splunk[!,:featureid],count=Int.(df_splunk[!,count]))

@vlplot(
    width=1000,
    height=800,
    :geoshape,
    data={
        values=JSON.parsefile("/home/juliana/roc/Data/Topojson/disa_ONT_region.topojson"),
        format={
            type=:topojson,
            feature=:disa_ONT_region
        }
    },
    transform=[{
        lookup=:DA_ID,
        from={
            data=df_splunk2,
            key=:DA_ID,
            fields=["count"]
        }
    }],
    color={
        "rate:q",
        scale={domain=[0, 0.15], scheme=:reds},
        legend={title="IPs Rate in Ontario - BST"}
    },
    projection={
        type=:albersUsa
    }
)

The idea is to create and use a new DataFrame whose column names match the topojson data and whose count column is an Integer rather than a String.
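A minimal sketch of the string-to-integer step in plain Julia, assuming the counts were read in as numeric strings (the sample values are made up for illustration). There is no `Int` constructor that accepts a `String`, so `parse` is used here; `tryparse` returns `nothing` for malformed entries instead of throwing.

```julia
# Hypothetical sample of a "count" column that was read in as strings.
raw_counts = ["12", "7", "130"]

# parse converts each numeric string to an Int, element by element.
counts = parse.(Int, raw_counts)

# tryparse yields nothing for entries that are not valid integers.
maybe_counts = tryparse.(Int, ["12", "n/a"])

println(counts)
println(maybe_counts)
```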

Aizzaac · February 14, 2020, 2:55pm

@oheil

When I use your code to rename the column and convert it to Int, I get an error. So I modified it a bit:

# Change column 2 from String to Int
df_splunk[!, 2] .= parse.(Int, df_splunk[!, 2])
eltype.(eachcol(df_splunk))

# Rename column "featureId" to "DA_ID"
rename!(df_splunk, Symbol("featureId") => Symbol("DA_ID"))

But the plot is empty. I can only see the colorbar.

oheil · February 14, 2020, 3:13pm

Ah, yes, I see my error now. Your code to modify the dataframe is fine.

Have you ever seen your map data?
(Actually, I am having major problems creating even a simple map from topojson data in a file.)
What do you see if you just plot the map with:

@vlplot(
    width=500, height=300,
    mark={
        :geoshape,
        fill=:lightgray,
        stroke=:white
    },
    data={
        values=JSON.parsefile("/home/juliana/roc/Data/Topojson/disa_ONT_region.topojson"),
        format={
            type=:topojson,
            feature=:disa_ONT_region
        }
    },
    projection={type=:albersUsa}
)

I want to be sure that this part of your plot is working.

oheil · February 14, 2020, 3:32pm

And another typo: the color field must be "count:q" rather than "rate:q".

Aizzaac · February 14, 2020, 3:33pm

It works. There is no problem with that. Check!

oheil · February 14, 2020, 3:34pm

Great. Now check the color line; see my post above.

Aizzaac · February 14, 2020, 3:39pm

I fixed the typo. I missed that completely!
But the map is just red. Probably I have to change the scale of the “count” column (check the image I get when using python).

Do you know how to multiply the whole column “count” by a log?

JULIA

PYTHON
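On the question of applying a log to the whole column, here is a minimal sketch in plain Julia, assuming the counts are non-negative integers (the values below are hypothetical). Broadcasting with a dot applies the function to every element; `log1p` computes log(1 + x), which avoids `-Inf` for zero counts.

```julia
# Hypothetical counts; zeros are possible, so 1 is added before taking the log.
counts = [0, 9, 99, 999]

# Natural log of (1 + x), applied element-wise to the whole vector.
log_counts = log1p.(counts)

# Base-10 variant of the same idea.
log10_counts = log10.(counts .+ 1)

println(log_counts)
println(log10_counts)
```

With a DataFrame, the same broadcast would apply to the column vector, for example `log1p.(df_splunk[!, :count])`.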

Aizzaac · February 14, 2020, 3:42pm

Okay, I do not think it is the scale, because when I move the mouse over the image it always says: 0

oheil · February 14, 2020, 3:49pm

What happens if you just say:

color="count:q"

no scale and no title?

Aizzaac · February 14, 2020, 3:50pm

I guess it is using a default color.

Aizzaac · February 14, 2020, 3:54pm

In Python, DA_ID is of type “object”.

oheil · February 14, 2020, 3:55pm

“count” can be a bad name, because it is an aggregate method in VegaLite.
Could you try by renaming “count” to something arbitrary like e.g. “myIPcounts” ?

Aizzaac · February 14, 2020, 4:25pm

I did. Nothing happens. But the color bar displays NaN.

oheil · February 14, 2020, 6:28pm

Is there some chance that you could provide us with the json file?

Aizzaac · February 14, 2020, 6:48pm

The topojson? Yes. I have to compress it. It is big! 492.0 kB

oheil · February 14, 2020, 6:49pm

Better some download service like dropbox.

oheil · February 14, 2020, 7:06pm

Arrived. Could you please delete your post above? My email should not stay publicly visible. Thanks.

Aizzaac · February 14, 2020, 7:06pm

Okay. It is deleted.

oheil · February 14, 2020, 8:19pm

I didn’t find an elegant transformation for the problem that DA_ID is nested one level down, inside properties. So I manipulated the topojson data:

using VegaLite, DataFrames, JSON

# I renamed the file to .json so that I can open it in a browser and get a formatted, clickable view of the data
# filename = "C:\\Users\\oheil\\Desktop\\disa_ONT_region.json"
filename = "/home/juliana/roc/Data/Topojson/disa_ONT_region.topojson"
json = JSON.parsefile(filename)

The following code just creates a dummy DataFrame with random integer counts below 1000 and the DA_IDs in a column named id; it stands in for your df_splunk:

DA_IDs=[ json["objects"]["disa_ONT_region"]["geometries"][key]["properties"]["DA_ID"] for key in keys(json["objects"]["disa_ONT_region"]["geometries"]) ]
counts=Int.(floor.(1000*rand(Float64,length(DA_IDs))))
df_splunk=DataFrame(id=DA_IDs,count=counts)

The following code manipulates the already read in json file by adding a field id at the proper level:

for key in keys(json["objects"]["disa_ONT_region"]["geometries"])
	json["objects"]["disa_ONT_region"]["geometries"][key]["id"]=json["objects"]["disa_ONT_region"]["geometries"][key]["properties"]["DA_ID"]
end
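To make the nesting concrete, here is a minimal sketch on a toy stand-in for the parsed file, using plain Julia dictionaries. Only the fields relevant to the join are included, and the DA_ID values are invented for illustration; the real file also carries arcs and coordinates.

```julia
# Toy stand-in for a parsed TopoJSON file (invented DA_ID values).
topo = Dict(
    "objects" => Dict(
        "disa_ONT_region" => Dict(
            "geometries" => Any[
                Dict{String,Any}("type" => "Polygon",
                                 "properties" => Dict("DA_ID" => "35010001")),
                Dict{String,Any}("type" => "Polygon",
                                 "properties" => Dict("DA_ID" => "35010002")),
            ],
        ),
    ),
)

geometries = topo["objects"]["disa_ONT_region"]["geometries"]

# Before the loop, the join key exists only inside "properties".
had_id_before = haskey(geometries[1], "id")

# Copy properties.DA_ID up to a top-level "id" on each geometry.
for g in geometries
    g["id"] = g["properties"]["DA_ID"]
end

println(had_id_before)
println(geometries[1]["id"])
```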

Now the choropleth is working as expected by connecting the two id fields in the two data sets:

@vlplot(
    :geoshape,
    width=500, height=300,
    data={
        values=json,
        format={
            type=:topojson,
            feature=:disa_ONT_region
        }
    },
    transform=[{
        lookup=:id,
        from={
            data=df_splunk,
            key=:id,
            fields=["count"]
        }
    }],
    color={
        "count:q",
        scale={scheme=:reds},
        legend={title="IPs Rate in Ontario - BST"}
    },
    projection={
        type=:albersUsa
    }
)

I omitted the domain option so that the color range is scaled automatically.
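If the map still comes out in a single color or the legend shows NaN, one possible cause is that the keys in the two data sets do not match, for instance because one side holds strings and the other integers. A minimal sketch of such a check in plain Julia follows; the id values and the helper name `as_key` are hypothetical.

```julia
# Hypothetical keys: ids taken from the map geometries and from the DataFrame column.
map_ids = ["35010001", "35010002", "35010003"]
df_ids = [35010001, 35010003, 35019999]   # read in as integers, not strings

# Compare on a common representation so "35010001" and 35010001 count as the same key.
as_key(x) = string(x)
map_keys = Set(as_key.(map_ids))
df_keys = Set(as_key.(df_ids))

# Regions that would get no value from the lookup, and rows that match no region.
missing_in_df = sort(collect(setdiff(map_keys, df_keys)))
missing_in_map = sort(collect(setdiff(df_keys, map_keys)))

println("regions without data: ", missing_in_df)
println("data rows without a region: ", missing_in_map)
```

If both lists are empty only after converting to strings, the key columns likely need to be brought to the same type before plotting.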
