[ANN] Harbest.jl - Simple web scraping with Julia

Jose_Diaz · December 18, 2022, 10:44pm

Hello people!

I’m making a package, mostly as a personal tool, but I just published it.

It’s kind of a port of rvest from R, but I plan to extend it well beyond that with new functions.

Currently, I see it mostly as syntactic sugar, in the sense that it combines the functionality of HTTP, Cascadia and Gumbo under a different syntax.

You can see the documentation here

You can install it using Pkg.add("Harbest")

I’ll be adding better documentation and new, improved functions very soon!

rmsmsgood · December 19, 2022, 4:33am

Awesome. Is there any possibility of automatically downloading the images on a page in the future?

Jose_Diaz · December 22, 2022, 7:45am

Yes!
I’ll think about a nice implementation for that.
Thanks!
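
Until something like that exists, here is a minimal sketch of one way to do it with Gumbo and Cascadia directly. This is not Harbest's API: the markup, the `image_sources` and `download_images` names, and the `https://example.com` URLs are all made up for illustration.

```julia
using Gumbo      # HTML parsing
using Cascadia   # CSS selectors
using Downloads  # standard-library downloader

# Hypothetical markup standing in for a downloaded page.
page = """
<html><body>
  <img src="https://example.com/a.png">
  <img alt="no source here">
  <img src="https://example.com/b.jpg">
</body></html>
"""

# Collect the `src` attribute of every <img> element that has one.
function image_sources(html::AbstractString)
    doc = parsehtml(html)
    nodes = eachmatch(Selector("img[src]"), doc.root)
    return [String(n.attributes["src"]) for n in nodes]
end

# Hypothetical helper: save each URL into `dir` under its file name.
# Assumes absolute URLs; relative ones would need resolving first.
function download_images(urls, dir)
    mkpath(dir)
    return [Downloads.download(u, joinpath(dir, basename(u))) for u in urls]
end

srcs = image_sources(page)
```

Calling `download_images(srcs, "images")` would then fetch the files. The sketch keeps extraction and downloading separate so the list of URLs can be inspected or filtered before anything hits the network.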

Albert_Zevelev · December 24, 2022, 1:02am

Can your package pull ski resort data?
I tried here:

Jose_Diaz · December 24, 2022, 2:07am

Yes! Easily

For example, with the following code, you’ll get the lifts, terrain and trails open (from the first link you sent)

using Harbest

html = read_html("https://www.parkcitymountain.com/the-mountain/mountain-conditions/terrain-and-lift-status.aspx")

data = html_elements(html, ".c118__number1--v1")

# data[1] is the number of open lifts; data[2] and data[3] are the rest
lifts_open = html_text3(data[1]) ## "38"
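
For comparison, here is a rough sketch of the same kind of lookup written against Gumbo and Cascadia directly, with HTTP for fetching. It runs on a small inline page so it works offline; the markup, the `.status-number` class, and the `fetch_page` helper are assumptions for illustration, not the real page or Harbest internals.

```julia
using HTTP       # fetching pages
using Gumbo      # HTML parsing
using Cascadia   # CSS selectors

# Hypothetical helper: download a page and return its body as a String.
# Not called below, so the example stays offline.
fetch_page(url) = String(HTTP.get(url).body)

# Hypothetical markup standing in for the downloaded page.
page = """
<html><body>
  <span class="status-number">38</span>
  <span class="status-number">120</span>
  <span class="status-number">300</span>
</body></html>
"""

doc = parsehtml(page)                                    # Gumbo: build the tree
nodes = eachmatch(Selector(".status-number"), doc.root)  # Cascadia: select nodes
figures = [String(strip(Cascadia.nodeText(n))) for n in nodes]

lifts_open = figures[1]
```

On a live site, `page` would be replaced with `fetch_page(url)`.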

avik · December 25, 2022, 7:06pm

Nice. Let me know if you find any issues with Cascadia that block you. I’ve long thought that some syntactic sugar on top of Cascadia/Gumbo would be useful. I’ve considered extending Cascadia with it, but I’m happy to see it in a separate package.

Regards

Avik
