Scraping an HTML table from a website

I am trying to obtain COVID-19 data from a website. The website has the data I want, but it is only available as an HTML table.

I am looking for Julia tools to scrape the information from the HTML table. Something like this

Is there an existing package in Julia for this?

I am thinking about writing my own module for this, but I wanted to ask here first.

https://github.com/JuliaWeb/Gumbo.jl

https://github.com/Algocircle/Cascadia.jl

Yes, the combination of Gumbo and Cascadia is what you need: Gumbo parses the HTML and Cascadia selects elements with CSS selectors. There is some example code in the README for Cascadia, and some more examples in this Stack Overflow post: "Extracting and Constructing Tables from HTML Files using Julia".
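For what it's worth, here is a minimal sketch of how the two packages might fit together for a table. The HTML string, the `cases` id, and the values in it are made up for illustration; a real page will need its own selector, and you would fetch it first (for example with HTTP.jl via `String(HTTP.get(url).body)`).

```julia
using Gumbo      # parsehtml
using Cascadia   # Selector, eachmatch, nodeText

# Made-up sample page standing in for the real website (assumption).
html = """
<html><body>
<table id="cases">
  <tr><th>Region</th><th>Cases</th></tr>
  <tr><td>A</td><td>10</td></tr>
  <tr><td>B</td><td>20</td></tr>
</table>
</body></html>
"""

doc = parsehtml(html)

# Pick the first table matching the hypothetical id "cases".
table = first(eachmatch(Selector("table#cases"), doc.root))

# Column names come from the <th> cells.
header = [String(strip(nodeText(c))) for c in eachmatch(Selector("th"), table)]

# Each <tr> that contains <td> cells becomes one row of strings.
rows = Vector{Vector{String}}()
for tr in eachmatch(Selector("tr"), table)
    cells = [String(strip(nodeText(c))) for c in eachmatch(Selector("td"), tr)]
    isempty(cells) || push!(rows, cells)   # skip the header row, which has no <td>
end

println(header)
foreach(println, rows)
```

All cells come back as strings, so numeric columns still need `parse(Int, ...)` or similar before analysis. From there it should be straightforward to build a DataFrame if you want one.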
