I’m working with Phil Price over at Andrew Gelman’s blog https://statmodeling.stat.columbia.edu/ to do a comparison between using R and using Julia for basic data analysis stuff.
We’ve got a few simple tasks and are implementing them in both languages to show comparisons. In fact, it started because I expressed frustration with the Tidyverse and its reliance on Fexprs but it’s probably turning out to show essentially that for simple things both ecosystems let you do the simple stuff relatively easily and without fuss… which is fine. I’m OK with just “Julia is very usable if you’re familiar with R”. There is one example simulation based task where it’ll highlight the speed differences, and the existence of good general purpose computing tools such as data structures (the Julia implementation will use a “Set” whereas probably R would be done with vectors and a lot of copying)
However, R has a lot more time to develop specialized tools for visualizations. One task is to visualize some COVID data from the CDC. In R Phil is plotting a map of the US with each county colored. I don’t know how to do that in StatsPlots. Is there a simple way? Yes I know that Julia has other plotting tools, but I don’t want to complexify the comparison by highlighting the fact that there are like 5 viable plotting libraries (Plots/StatsPlots, VegaLite, Makie, Gadfly, PGFPlots, etc etc)
So, how can I plot the counties of the US on a map using StatsPlots/Plots ? If this is not directly possible, what is the primary 1 or 2 tools that would be recommended to plot very simple map plots of the US such as this one that Phil generated (showing that the dataset we were looking at had mostly missing data, grey, and we needed to find a better data source)
I guess for the purposes of this thread I’ll accept any and all solutions, and then will try to choose the one that seems to provide the least resistance to present it
Nice the GMT version seems simple enough and uses a “plot” method which seems straightforward as an interface. the vegalite interface is fine for someone who’s familiar with vegalite and Julia and etc, but will I think be off-putting as an introduction to mapping on Julia.
One question: where do I get the shapefile for the US counties?
which since I’m not familiar with shapefiles leaves me with question how to actually use it?
Installing GMT.jl results in:
Failed to precompile GMT [5752ebe1-31b9-557e-87aa-f909b540aa54] to /var/local/dlakelan/dotjulia/compiled/v1.7/GMT/jl_6mcLAj.
ERROR: Caught signal number 11 (Segmentation fault) at
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x25)[0x7f0af0c2c945]
[0x557263ef6588]
Stack backtrace:
/lib/x86_64-linux-gnu/libgmt.so.6(sig_handler_unix+0xf4)[0x7f0af0e9b344]
/lib/x86_64-linux-gnu/libc.so.6(+0x3c910)[0x7f0af0bde910]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x25)[0x7f0af0c2c945]
/lib/x86_64-linux-gnu/libproj.so.22(_ZN5osgeo4proj6common13UnitOfMeasureD1Ev+0x6f)[0x7f0aec85197f]
/lib/x86_64-linux-gnu/libc.so.6(__cxa_finalize+0xc6)[0x7f0af0be1556]
/lib/x86_64-linux-gnu/libproj.so.19(+0xb8083)[0x7f0ae8c87083]
Building Conda → `/var/local/dlakelan/dotjulia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/6cdc8832ba11c7695f494c9d9a1c31e90959ce0f/build.log`
Building GMT ──→ `/var/local/dlakelan/dotjulia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/d1068159f18828ec1831efd34f1880f84a792b98/build.log`
ERROR: Caught signal number 11 (Segmentation fault) at
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x25)[0x7f5df1963945]
[0x555d3716e552]
Stack backtrace:
/lib/x86_64-linux-gnu/libgmt.so.6(sig_handler_unix+0xf4)[0x7f5df1bd2344]
/lib/x86_64-linux-gnu/libc.so.6(+0x3c910)[0x7f5df1915910]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x25)[0x7f5df1963945]
/lib/x86_64-linux-gnu/libproj.so.22(_ZN5osgeo4proj6common13UnitOfMeasureD1Ev+0x6f)[0x7f5ded58897f]
/lib/x86_64-linux-gnu/libc.so.6(__cxa_finalize+0xc6)[0x7f5df1918556]
/lib/x86_64-linux-gnu/libproj.so.19(+0xb8083)[0x7f5de99be083]
ERROR: LoadError: failed process: Process(`gmt --show-library`, ProcessSignaled(11)) [0]
when I run gmt --show-library it gives the same segfault/backtrace. This is using the current Debian/testing version of GMT, anyone else having this issue?
I reported a bug to Debian and they helped me figure out what was wrong (another package was keeping an older version of libproj installed) after appropriate upgrades and removing the old library gmt works on the command line, so I’ll make some effort to see if I can do something with the GMT.jl solution thanks much!
@rafael.guerra I have the shape file from the above mentioned county information, and would like to just see a map of all counties. The linked example produces the Choropleth after manipulating various things, but for just a first peek, I tried:
@dlakelan Glad you were able to figure out the installing problem because I was clueless.
Regarding the plots, you must first understand that GMT works a bit differently from the other plotting packages (but, don’t they always do among each other?). In GMT, the plotting command adds a layer to the PostScript file. This means we never have the entire plot in memory but that we keep adding layers at need. We only display the figure when it’s finalized, and we do that by using the option show=true. The intermediate layers are added by using the ! notation. i.e. plot(...); plot!(...); plot!(..., show=true)
since you didn’t provided the show option there is nothing to be … shown.
I tried that shapefile and there is something wrong somewhere (likely in the GMT.jl side) when I try to plot it all because many polygons are not shown, but it worked nicely whem imposing a map limit
And if you want to follow that example with the Portuguese COVID map, I have to say that I should have updated it while I remembered better how to. Since then an attribute field has been added to the GMTdataset type and the function make_zvals_vec() was greatly improved to allow fetching polygons by attribute fields.
Great. Thanks for this help. I’d be interested in adding a GMT related tutorial to my archive of data analysis tutorials, both so I can learn more about mapping with GMT, and so others can benefit. Would you be interested in collaborating on that?
Well, yes. Tutorials and examples is a lagging area in GMT.jl
Yes please. GMT and GMT.jl are great tools for mapping, but I do find them very complex. I know you can really do amazing plots and analysis with them, but as a novice they feel overwhelming. I’ve used them, would love to learn more, and tutorials would be a great tool for that.
Great, I’ll PM you, and anyone who wants an invite to the discussion PM me and I’ll add you to the discussion. @alejandromerchan do you want to be on that PM?
I did some mapping in Julia a while aga, on a smaller scale (California counties) using VegaLite. Just like you, I had experience with R and I used a project I had as an excuse to learn more Julia. I haven’t done mapping since, so I don’t know how much things have changed. You’re welcome to loop me in, I gladly help if I can.
I did not!
I actually just used Makie.jl as GeoMakie.jl was not where I needed it to be.
I can see if I can find my example using a Shapefile plus Makie if that would be of interest @alejandromerchan