Hello
I’m trying to download files from different URLs. In this application, every website has a link called “Formulário de Referência” (Portuguese for “Reference Form”).
Given these two websites:
sites = ["http://www.dynamo.com.br/pt/empresa/dig",
         "http://www.riogestao.com.br/download.php"]
From each site, I want to download the file behind the “Formulário de Referência” link.
When I use HTTP.jl and Gumbo.jl like this:
using HTTP, Gumbo
for site in sites
    r = HTTP.request("GET", site)
    doc = parsehtml(String(r.body))  # parse the response body into an HTML tree
end
Inside doc there is a link like this:
<a href="/uploads/c570e13449bc0c4035eea186994a6896de1ae4bc.pdf" target="_blank">
  Formulário de Referência
</a>
or this:
<a href="download/pdf/Formulario de Referencia.pdf" target="_blank">
  <img alt="Documento PDF" height="16" border="0" src="img/icon_pdf.gif" width="20"></img>
  - FORMULÁRIO DE REFERÊNCIA
</a>
I would like to know whether there is a way to get the download URLs using these packages.
The URLs would be:
- http://www.riogestao.com.br/download/pdf/Formulario%20de%20Referencia.pdf
- http://www.dynamo.com.br/uploads/c542dcf85cd96179e80abf109154716a4399e06b.pdf
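To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, not a tested solution. It assumes the AbstractTrees.jl and URIs.jl packages in addition to Gumbo.jl. The names `node_text` and `reference_form_urls` are hypothetical helpers invented for illustration. It walks the parsed tree, keeps every `<a>` whose label contains the phrase (case-insensitively), and resolves its `href` against the page URL. As a simplification, only spaces in the `href` are percent-encoded.

```julia
using Gumbo          # parsehtml, HTMLElement, HTMLText, tag, attrs
using AbstractTrees  # PreOrderDFS, to walk the parsed tree
using URIs           # URI and URIs.resolvereference, to resolve relative links

# Concatenate all text nested inside a node (covers a label that follows an <img>).
function node_text(node)
    node isa HTMLText && return node.text
    node isa HTMLElement || return ""
    return join((node_text(child) for child in node.children), " ")
end

# Hypothetical helper: absolute URLs of every <a> whose label contains `label`.
function reference_form_urls(html::AbstractString, base_url::AbstractString;
                             label::AbstractString = "formulário de referência")
    doc = parsehtml(html)
    urls = String[]
    for elem in PreOrderDFS(doc.root)
        # Skip everything that is not an <a> element.
        elem isa HTMLElement && tag(elem) == :a || continue
        href = get(attrs(elem), "href", nothing)
        href === nothing && continue
        # Collapse whitespace and lowercase so "FORMULÁRIO DE REFERÊNCIA" also matches.
        link_text = lowercase(join(split(node_text(elem)), " "))
        occursin(label, link_text) || continue
        # Simplification: only spaces are percent-encoded before resolving.
        encoded = replace(String(href), " " => "%20")
        push!(urls, string(URIs.resolvereference(URI(base_url), URI(encoded))))
    end
    return urls
end
```

Inside the loop above this would be called as `reference_form_urls(String(r.body), site)`, and each returned URL could then presumably be passed to `HTTP.download` or the standard library's `Downloads.download`.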
Thanks!