Join parts of a URL - joinpath?

Hello,

I’d like to build a URL like “http://www.domain.com/get/data” from a Vector:

parts = ["http://www.domain.com", "get", "data"]

At first sight, on a *nix system I would do

julia> joinpath(parts...)
"http://www.domain.com/get/data"

but I think joinpath is system dependent, i.e. on Windows it may use \ instead of /.

Is there a function which always uses “/” (even on a Windows system)?

With Python, we have os.path.join and urllib.parse.urljoin
But I haven’t found a join... function in https://github.com/JuliaWeb/URIParser.jl

Kind regards

I’m aware of

join(parts, "/")

but I wonder if a more “high level” function exists

Does https://github.com/JuliaWeb/URIParser.jl have anything useful to you?

I don’t see in URIParser.jl a function to concatenate parts of an url.

Seems to me that this code from the README might be helpful, but this might not be everything you need:

Additionally, there is a method for taking the parts of the URI individually, as well as a convenience method taking host and path which constructs a valid HTTP URL:

julia> URI("hdfs","hdfshost",9000,"/root/folder/file.csv","","","user:password")
URI(hdfs://user:password@hdfshost:9000/root/folder/file.csv)

julia> URI("google.com","/some/path")
URI(http://google.com:80/some/path)

oh yes… I was looking at a function in utils.jl
Thank you

Glad I could help. My apologies - I did not notice that you had already checked the package. I could’ve saved you an iteration.

But that’s not perfect…
How can I do this if "/root/folder/file.csv" is stored as a Vector like ["root", "folder", "file.csv"]?

The URI takes a standard path, so joinpath should do what you need in *nix: that is, you can pass in the output of joinpath:

a = ["/","foo", "bar", "baz"]
URI("zzz.com", joinpath(a...))

This might not work on Windows, though, given that joinpath will likely use backslashes there.

There might be a more elegant way of doing this as well.
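
For the Vector case asked about above, one option that should not depend on the platform is to build the path with join instead of joinpath. This is only a minimal sketch using Base functions; the variable names are illustrative and not taken from any package:

```julia
# Path segments, as in the question above
segments = ["root", "folder", "file.csv"]

# join always inserts the separator it is given, whatever the operating system,
# so the result uses "/" on Windows too
path = "/" * join(segments, "/")   # "/root/folder/file.csv"
```

The resulting string could then presumably be passed as the path argument of the URI constructor from the README excerpt above.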

How about a long-hand?

reduce(*, ["root", "folder", "file.csv"] .* "/")[1:end - 1] ==
       "root/folder/file.csv"

joinpath("http://www.domain.com", "foo", "bar", "baz")

works fine even on Windows, it returns

"http://www.domain.com/foo/bar/baz"

URL joining is a bit more complicated, for example:

joinurl("https://example.com/", "http://foo.bar/") == "http://foo.bar/"
joinurl("https://example.com/a/b", "..") == "https://example.com/a"
joinurl("https://example.com/a/b", "c") == "https://example.com/a/b/c"
joinurl("https://example.com/a/b", "/c") == "https://example.com/c"
joinurl("https://example.com/a/b?x=1", "c") == "https://example.com/a/b/c"

For my use case joinpath is enough, but you are right @sirex, URL joining is more complicated. Unfortunately I haven’t found a Julia package which provides such a function.

With Python

$ ipython
Python 3.6.6 | packaged by conda-forge | (default, Jul 26 2018, 09:55:02)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.0.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from urllib.parse import urljoin as joinurl

In [2]: joinurl("https://example.com/", "http://foo.bar/")
Out[2]: 'http://foo.bar/'

In [3]: joinurl("https://example.com/a/b", "..")
Out[3]: 'https://example.com/'

In [4]: joinurl("https://example.com/a/b", "c")
Out[4]: 'https://example.com/a/c'

In [5]: joinurl("https://example.com/a/b", "/c")
Out[5]: 'https://example.com/c'

In [6]: joinurl("https://example.com/a/b?x=1", "c")
Out[6]: 'https://example.com/a/c'

In [7]: joinurl("a/b/c", "d/e")
Out[7]: 'a/b/d/e'

Issue opened at https://github.com/JuliaWeb/Roadmap/issues/19

Not any more:

joinpath("http://www.domain.com", "foo", "bar", "baz")

"http://www.domain.com\\foo\\bar\\baz"

In Julia version 1.6.0.

This was never a good way to join parts of a URL and it just happened to work on UNIX systems where the path separator is /. It would always have produced what you’re seeing on Windows and if it did something else, that’s pretty much an accident. Maybe some UNC drive stuff? Who knows. Don’t use joinpath for this. The entire reason for joinpath to exist is so that it can use platform-specific path separators and handle absolute and relative paths. Neither of those is an issue for URLs. Just join the parts with literal slashes between.
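
A minimal sketch of "join the parts with literal slashes", assuming the parts are plain strings that may carry stray leading or trailing slashes. The name join_url_parts is hypothetical, not a function from Base or any package:

```julia
# Hypothetical helper: join URL pieces with "/" on every platform.
# Slashes at the start and end of each piece are stripped so none are doubled;
# the "//" inside "http://" is untouched because it is not at either end.
join_url_parts(parts) = join((strip(p, '/') for p in parts), "/")

url = join_url_parts(["http://www.domain.com/", "/get/", "data"])
# url == "http://www.domain.com/get/data"
```

This is plain concatenation only: it does not resolve ".." or absolute references, and it also drops a leading slash on the first piece, so it is no substitute for real URL reference resolution.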

See URIs.resolvereference

julia> using URIs

julia> u = resolvereference("http://example.org/foo/bar/", "/baz/")
URI("http://example.org/baz/")