What's the logic/rationale behind `parse(Int, some_str)` vs allowing `Int(some_str)`?

I am sure there is a rationale.

But we ran into some issues at work: we developed a casting operation, but “casting” from int to string is not allowed.

So I brought up this example in Julia. I am not sure what the rationale is for not allowing Int(str::String) = parse(Int, str)

Please educate me.


but “casting” from int to string is not allowed.

Because that’s not actually casting?


It would be interesting to know the rationale. My thinking when I first encountered it was that Int is a constructor, so it should always return an object of type Int. But since a string can be malformed, the call may throw an exception (or return nothing if tryparse is used), which makes the design less coherent.

I don’t really know much about parsing, so I’m just trying to reason my way through this, based on what seems intuitive (since no one properly knowledgeable has weighed in yet).

I believe you mean ‘convert’, not ‘cast’. They are different things: convert calculates an equivalent value, creating new data in the process, while casting is like reinterpret and changes the interpretation without changing the underlying bits, which is also why it is really efficient.

You can actually convert a character to an integer:

jl> Int('A')
65

jl> Int('3')
51

Not what you expected? Character numerals are not encoded as their corresponding integer values. Based on this, how do you turn “98765” into the number 98765? Strings are somewhat analogous to vectors, so what if you do

jl> Int.(collect("98765"))
5-element Vector{Int64}:
 57
 56
 55
 54
 53

I don’t know what to do with this. Obviously, it’s much harder to parse strings to numbers than to convert, which should be straightforward and quick. ‘Parsing strings’ means scanning them for meaning, and is a comparatively complicated process. Even turning a string that you already know expresses a valid integer or float into a number means iterating along it, calculating the value of each character, and combining those into a value that has meaning.
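
To make that last step concrete, here is a minimal sketch of what “iterating along it and combining” could look like for plain decimal digits. The name digits_to_int is made up for illustration and is not part of Base; the real parse also handles signs, whitespace, bases and overflow, which this ignores.

```julia
# Hypothetical helper, not part of Base: turn a string of ASCII digits into an Int.
# Assumes no sign, no whitespace, and a value small enough not to overflow.
function digits_to_int(s::AbstractString)
    isempty(s) && throw(ArgumentError("empty string"))
    n = 0
    for c in s
        '0' <= c <= '9' || throw(ArgumentError("not a decimal digit: $(repr(c))"))
        n = 10n + (c - '0')   # c - '0' is the numeric value of the digit
    end
    return n
end

digits_to_int("98765")  # 98765
```

Even this toy version has to validate every character and can fail, which is already more than a conversion between number types has to do.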

And this is complicated by various ways of expressing numbers in writing. How would you go about “converting” these strings to numbers:

"25.4e-3"
"0.0254"
"3654"
"0x00000e46"  # this is actually the same as the line above
"3654 + 0im"

The mapping isn’t one-to-one, and you need to know how the encoding works. (Contrary to a type tag, like UInt8 which tells you how to interpret the data, a string doesn’t tell you if you are looking at an Int, a Float32, a Complex{Rational}, or so on. You have to parse and interpret.)
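
As a small illustration of “you have to parse and interpret”: with parse, the caller supplies the target type, because the text alone does not carry it. This only uses parse from Base on strings like the ones listed above.

```julia
parse(Float64, "25.4e-3")   # 0.0254
parse(Float64, "0.0254")    # 0.0254, same value from a different spelling
parse(Int, "3654")          # 3654
parse(Float64, "3654")      # 3654.0, same text but a different type
parse(UInt16, "3654")       # 0x0e46
```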

Going the other way is also hard, and we don’t call it “converting a number to a string”, we call it “printing”. There is a string function that turns 7 into “7”, but that is really printing, and String(7) does not work.
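
A quick sketch of the “printing” direction, using only functions from Base:

```julia
string(7)                # "7"
string(255, base = 16)   # "ff", so the output format is a choice as well
sprint(print, 7)         # "7", which is roughly what string does under the hood

# String(7) throws a MethodError: there is no String constructor for integers.
```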


In ancient Julia you could do this. It was deprecated in Julia 0.4:

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.7 (2016-09-18 16:17 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-pc-linux-gnu

julia> int("3")
WARNING: int(s::AbstractString) is deprecated, use parse(Int,s) instead.
 in depwarn at deprecated.jl:73
 in int at deprecated.jl:50
while loading no file, in expression starting on line 0
3

The deprecation was introduced with reference to https://github.com/JuliaLang/julia/issues/1470 but that issue doesn’t seem to discuss the parsing aspects, so that had probably been decided earlier. My recollection is that parsing a string was considered fundamentally different from conversion from numerical values.

The Int constructor is no stranger to exceptions.

julia> Int(0.5)
ERROR: InexactError: Int64(0.5)

After reading this issue, I can’t stop thinking about a broadcasting version of these “conversion” functions.

Float64.([1, 1.5, "2.3", "4"])

# 1.0 1.5 2.3 4.0

That could be real nice, but of course here be dragons.
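
For what it’s worth, something close to this can be had today without touching Float64 itself, by broadcasting a small helper. This is only a sketch, and tofloat is a made-up name, not something in Base:

```julia
# Hypothetical helper (not part of Base): numbers are converted, strings are parsed.
tofloat(x::Real) = Float64(x)
tofloat(s::AbstractString) = parse(Float64, s)

tofloat.([1, 1.5, "2.3", "4"])  # 1.0, 1.5, 2.3, 4.0
```

Defining your own function keeps the parsing step explicit and avoids adding methods to a Base type you don’t own.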


Conversion is appropriate between types with different ways of representing the same kinds of values. The Julia values 123, 0x7b and 123.0 are all different ways of representing the numerical value 123, so it’s legitimate for convert to allow converting between them. The strings "123", "7b" and "123.0" are not ways of representing that numerical value even though you can decide to interpret the strings as having that meaning. You could also interpret "123" as representing the value 83 if you have reason to believe that the number is written in octal. Likewise, "7b" could represent 123 if it’s written in base 16 using the conventional hexadecimal digits, but it could also be an invalid decimal input or it could represent 95 in base 12. The point is that strings don’t have value as numbers, they have to be interpreted and that process of interpreting and decoding a string into a numeric value is called parsing.
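
The interpretations mentioned above can be checked with the base keyword of parse and with tryparse, both from Base:

```julia
parse(Int, "123")             # 123, decimal is the default
parse(Int, "123", base = 8)   # 83
parse(Int, "7b", base = 16)   # 123
parse(Int, "7b", base = 12)   # 95
tryparse(Int, "7b")           # nothing, not a valid decimal integer
```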


Is there any rule of thumb that draws the line between construction, conversion and interpretation?

I mean, this argument looks reasonable when we are talking about numbers, but what about other structures? For example, I can have a structure for a database connection:

struct DBConn
  dbname::String
  host::String
  port::Int
  login::String
  password::String
end

and one can make a constructor which accepts all the necessary data as a connection string:

dbconn = DBConn(conn_str = "scott:tiger@localhost:5432/mydatabase")

It looks rather convenient, but am I correct that this is bad style? And is the proper way to initialize a connection from a connection string to write a parse method?

dbconn = parse(DBConn, "scott:tiger@localhost:5432/mydatabase")

Constructors can accept whatever they want; there are really no rules. What would be super sketchy is having a convert(::Type{DBConn}, ::AbstractString) method.
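
For the parse variant from the question, a minimal sketch could look like the following. The login:password@host:port/dbname layout and the regex are assumptions taken from the example string, not a general connection-string grammar.

```julia
struct DBConn
    dbname::String
    host::String
    port::Int
    login::String
    password::String
end

# Extending Base.parse for a type you own is not type piracy.
# Assumed format: login:password@host:port/dbname
function Base.parse(::Type{DBConn}, s::AbstractString)
    m = match(r"^([^:@]+):([^@]*)@([^:/]+):(\d+)/(.+)$", s)
    m === nothing && throw(ArgumentError("invalid connection string: $(repr(s))"))
    login, password, host, port, dbname = m.captures
    return DBConn(dbname, host, parse(Int, port), login, password)
end

dbconn = parse(DBConn, "scott:tiger@localhost:5432/mydatabase")
dbconn.port  # 5432
```

Unlike a convert method, this is never called implicitly, so a malformed string can only fail where parse is written out.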


One exception would be BigFloat, which accepts a string argument for parsing.


That’s a little different since that’s the only way to accurately construct a specific BigFloat value, which is why that method exists.
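
A short illustration of why the string method matters for BigFloat, using only Base: a Float64 literal has already been rounded before BigFloat ever sees it.

```julia
a = BigFloat("0.1")   # parsed from the decimal text at BigFloat precision
b = BigFloat(0.1)     # the Float64 nearest to 0.1, widened exactly

a == b     # false
b == 0.1   # true, b carries the Float64 rounding error along
```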


I understand from your response above how to delineate between convert and constructors. How do you think about constructors versus parse?

Construction tends to be about wrapping a bunch of values, whereas parsing is about taking a textual representation and interpreting it. There’s no bright line between these, though, and constructors can really do anything they want. In the case of simple numerical values, though, it seems appropriate to call out that parsing is slow, unreliable (it can fail), and ambiguous (you might have the wrong base, and even different digits can be used), whereas turning a UInt8 representing the number 123 into an Int is fast, reliable and unambiguous, and thus appropriate for conversion or construction.
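
A small sketch of that contrast, using only Base functions:

```julia
x = 0x7b            # a UInt8 holding the value 123

convert(Int, x)     # 123, cannot fail for any UInt8
Int(x)              # 123, same result via the constructor

tryparse(Int, "123")       # 123
tryparse(Int, "12three")   # nothing, since parsing can fail
tryparse(Int, "")          # nothing
```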


I would note that integers are a bit weird here because they’re not a bundle of values, they’re atomic. Strings and floats are like that too. You don’t make one of them from a bundle of values, you convert a value of one numeric type to a different numeric type. In fact, I pretty much never use integer constructors for that reason: I only ever use parse to create an integer value in the first place and convert to go between different representations of integers. I suspect that people are mainly interested in the Int(x) syntax because it’s terse, but I would write that as convert(Int, x).
