Hi,
I have plotted a choropleth map using GeoPandas (Python) and VegaLite.jl (Julia).
I want to benchmark the two (ease of use, speed, etc.).
I am mostly interested in speed, so maybe something like @time, time_ns(), or tic()/toc().
What would you recommend? I am working in a Jupyter notebook.
My code looks like this:
JULIA
## Loading the libraries
using VegaLite, DataFrames, JSON

## Renaming the join column so it matches the "id" key used in the lookup
rename!(df_splunk, :featureId => :id)

## Parsing the TopoJSON file and copying each geometry's DA_ID property into its "id"
disa_ONT_region = JSON.parsefile("/home/juliana/roc/Data/Topojson/disa_ONT_region.topojson")
for geometry in disa_ONT_region["objects"]["disa_ONT_region"]["geometries"]
    geometry["id"] = geometry["properties"]["DA_ID"]
end

@vlplot(
    :geoshape,
    width=500, height=300,
    data={
        values=disa_ONT_region,
        format={
            type=:topojson,
            feature=:disa_ONT_region
        }
    },
    transform=[{
        lookup=:id,
        from={
            data=df_splunk,
            key=:id,
            fields=["count"]
        }
    }],
    color={
        "count:q",
        scale={scheme=:reds},
        legend={title="IPs Rate in Ontario - BST"}
    },
    projection={
        type=:albersUsa
    }
)
PYTHON
##GEOPANDAS
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas as gpd
from shapely.geometry import Point, Polygon
## Path to the .shp file
sf_path = "/home/juliana/roc/Data/ESRI/disa_ONT_region.shp"
#print(sf_path)
## Reading the .shp file into a GeoDataFrame
sf = gpd.read_file(sf_path)
#sf = gpd.read_file(sf_path, encoding='utf-8')
#sf.plot()
## Joining dataframes: df_splunk (2 columns) + sf_Q (3 columns) = new_df
df_splunk=df_splunk.iloc[:, 0:2]
#df_splunk
sf_Q=sf.iloc[:, 0:3]
#sf_Q
new_df=sf_Q.merge(df_splunk, how='left', left_on='DA_ID', right_on='featureId')
new_df
## Plotting the choropleth map
## https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html
ax = new_df.plot(figsize=(10, 6), column='count', cmap='Blues', legend=False)
plt.title("Number of IPs by district - Ontario")
ax.set_axis_off()
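
For the Python side, a minimal timing sketch could look like the following. It uses only the standard library (time.perf_counter) and repeats the measurement several times, reporting the first run separately from the best and median runs. That split matters for a fair comparison, because the first call in Julia includes compilation time, so a single @time on either side can be misleading. The names time_call, summarize, and build_map_placeholder are hypothetical helpers made up for this sketch, and the placeholder workload only stands in for the real read/merge/plot steps.

```python
import statistics
import time


def time_call(func, repeats=5):
    """Run func() `repeats` times; return the wall-clock time of each run in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return (first, best, median) so that a slow first run stays visible."""
    return timings[0], min(timings), statistics.median(timings)


def build_map_placeholder():
    # Hypothetical stand-in workload so the sketch runs on its own.
    # In the notebook this would wrap the read_file / merge / plot steps.
    sum(i * i for i in range(100_000))


timings = time_call(build_map_placeholder, repeats=5)
first, best, median = summarize(timings)
print(f"first run: {first:.4f} s, best: {best:.4f} s, median: {median:.4f} s")
```

In the notebook, each stage could be timed on its own, for example time_call(lambda: gpd.read_file(sf_path)), so that file loading, the merge, and the plotting call are compared separately against the Julia equivalents. As far as I know, with the inline Matplotlib backend the figure is only rendered when the cell finishes, so timing the plot() call alone may not include rendering. IPython's %timeit magic and, on the Julia side, @btime from BenchmarkTools.jl are ready-made alternatives to a hand-written loop like this one.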
Thank you