YouTube searches with HTTP.jl

Hello, folks,

I am currently experimenting with the HTTP.jl library in Julia and trying to evaluate the YouTube search suggestions.
It works so far, but I would like to get more search suggestions than just the first 20. What is the best way to do that?
At the moment I do something like this:

using HTTP

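# Return the ranges of all occurrences of `search` in `txt`.
function getPositions(search::String, txt::String)
    vpos = UnitRange{Int}[]
    i = 1
    while (pos = findnext(search, txt, i)) !== nothing
        push!(vpos, pos)
        i = nextind(txt, last(pos))
    end
    return vpos
end

# For each anchor range in `anchors`, return the first occurrence of `search` after it.
function getPositions(search::String, txt::String, anchors::Vector{UnitRange{Int}})
    vpos = UnitRange{Int}[]
    for v in anchors
        pos = findnext(search, txt, nextind(txt, last(v)))
        pos !== nothing && push!(vpos, pos)
    end
    return vpos
end

# For each range ending at an opening quote, return the range of the text up to the closing quote.
function posText(vRange::Vector{UnitRange{Int}}, txt::String)
    vpos = UnitRange{Int}[]
    for rng in vRange
        start = nextind(txt, last(rng))
        if (pos = findnext("\"", txt, start)) !== nothing
            # prevind keeps the end index valid when the text ends in a multi-byte character
            push!(vpos, start:prevind(txt, first(pos)))
        end
    end
    return vpos
end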

r = HTTP.request("GET", "https://www.youtube.com/results?search_query=julia+programming")

println("Status : ", r.status)

txt = String(r.body)

open("youtube.txt", "w") do io
    print(io, txt)
end

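# Locate each result block, then the first title="..." and a href="..." after it.
v = getPositions("div class=\"yt-lockup-content", txt)
vt = getPositions("title=\"", txt, v)
vt = posText(vt, txt)

vhref = getPositions("a href=\"", txt, v)
vhref = posText(vhref, txt)

# zip stops at the shorter vector, so a missing title or link cannot cause a BoundsError.
for (n, (tpos, hpos)) in enumerate(zip(vt, vhref))
    playLink = "https://www.youtube.com" * txt[hpos]
    linkText = txt[tpos]
    println("($n) : $linkText")
    println("($n) : $playLink")
end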

This gives me all the HTML and scripts, which I parse to find the titles and the video links. I don’t know much about web design or JavaScript, but is there a link to the next search suggestions that I have overlooked?

best regards

Michael

YouTube builds its pages with JavaScript, which means the raw response does not contain the document you actually see in the browser. To interact with JavaScript-driven pages you would need a WebDriver such as Selenium. Julia doesn’t have a maintained WebDriver package at the moment (only a few unmaintained projects). For YouTube in particular, you could use their API, which should be doable with HTTP.jl.
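
A minimal sketch of the API route, assuming you have a key for the YouTube Data API v3 (the `api_key` argument is a placeholder). The helper names `search_params`, `search_page`, `next_page_token` and `search_all` are made up for illustration. The search endpoint returns JSON, and a response can carry a `nextPageToken` that you send back as `pageToken` to get the next batch. The token is pulled out with a regex here only to avoid an extra dependency; a JSON package would be the more robust choice.

```julia
using HTTP

# Search endpoint of the YouTube Data API v3.
const SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"

# Build the query parameters for one page of results (the API allows at most 50 per page).
function search_params(query, api_key; max_results=50, page_token=nothing)
    params = Dict{String,String}(
        "part" => "snippet",
        "type" => "video",
        "q" => query,
        "maxResults" => string(max_results),
        "key" => api_key,
    )
    page_token === nothing || (params["pageToken"] = page_token)
    return params
end

# Fetch one page and return the raw JSON text.
function search_page(query, api_key; kwargs...)
    r = HTTP.get(SEARCH_ENDPOINT; query=search_params(query, api_key; kwargs...))
    return String(r.body)
end

# Extract the token for the next page, or `nothing` if there is none.
function next_page_token(body::AbstractString)
    m = match(r"\"nextPageToken\"\s*:\s*\"([^\"]+)\"", body)
    return m === nothing ? nothing : String(m[1])
end

# Collect the raw JSON of up to `pages` consecutive result pages.
function search_all(query, api_key; pages=3)
    bodies = String[]
    token = nothing
    for _ in 1:pages
        body = search_page(query, api_key; page_token=token)
        push!(bodies, body)
        token = next_page_token(body)
        token === nothing && break
    end
    return bodies
end
```

Something like `search_all("julia programming", "YOUR_API_KEY")` would then return the JSON of the first few pages, which you can parse for titles and video IDs.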

Okay, thanks. I will probably need access to the Google YouTube API, which means I would have to sign up there for a developer ID and so on.

best regards

Michael

Depending on what you intend to do, calling out to an existing utility like youtube-dl might do the trick.
There are more specialized command-line tools too.
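
A small sketch of calling youtube-dl from Julia, assuming the `youtube-dl` executable is on your PATH. The function names `thumbnail_cmd` and `download_thumbnails` are made up for illustration. It uses youtube-dl's `ytsearchN:` pseudo-URL to take the first N search hits and only writes their thumbnails, without downloading the videos.

```julia
# Build the youtube-dl command for the first `n` search hits of `query`.
# --skip-download avoids fetching the videos, --write-thumbnail saves the thumbnail images.
function thumbnail_cmd(query::AbstractString, n::Integer)
    target = "ytsearch$(n):$(query)"
    return `youtube-dl --skip-download --write-thumbnail $target`
end

# Run the command in the current directory; throws if youtube-dl exits with an error.
download_thumbnails(query, n) = run(thumbnail_cmd(query, n))
```

Building the command separately from running it makes it easy to print and check before anything is downloaded, e.g. `println(thumbnail_cmd("julia programming", 5))`.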


Yeah, that looks great. I only need something that gives me a thumbnail list of the newest videos from the channels I like, before I have to go to YouTube itself.

https://stackoverflow.com/questions/39606419/how-can-i-download-just-thumbnails-using-youtube-dl
+
https://askubuntu.com/questions/643286/can-i-download-videos-from-a-youtube-search-query-using-youtube-dl
+
???
=
Profit?

Nice examples. I’ll give it a try. It’s only for private use.

I have now tested youtube-dl, but by chance I saw on the Ask Ubuntu forum that you can simply append a page parameter after the search text and thereby get more suggestions, similar to scrolling down on YouTube itself.

https://www.youtube.com/results?search_query=julia+programming&page=1

Then simply change the page number to go further; that simulates scrolling down on YouTube to get more results.
This also works with my example solution from above.
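
For completeness, a rough sketch of the paging loop, assuming the `page` parameter behaves as described above. The names `results_url` and `fetch_pages` are made up for illustration, and the URL builder assumes the query contains only letters, digits and spaces.

```julia
using HTTP

# Build the results URL for a given page, following the URL pattern shown above.
# Assumes the query contains only letters, digits and spaces.
results_url(query::AbstractString, page::Integer) =
    "https://www.youtube.com/results?search_query=" *
    replace(query, ' ' => '+') * "&page=$page"

# Fetch several result pages and return their HTML as a vector of strings.
function fetch_pages(query::AbstractString, pages)
    htmls = String[]
    for page in pages
        r = HTTP.get(results_url(query, page))
        push!(htmls, String(r.body))
        sleep(1)  # pause between requests so the server is not hammered
    end
    return htmls
end
```

Each string returned by `fetch_pages("julia programming", 1:3)` can then be passed through the same `getPositions` / `posText` parsing as the single page in my first post.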