Why is converting from a type to a symbol so slow?

Tortar · August 25, 2023, 10:32pm

I was experimenting with the sort function where I gave to the by keyword a type related to each element of the array, but types can’t be compared with each others. So I tried to convert the type to a symbol, but it was very slow so I tried to benchmark the conversion and I got

julia> using BenchmarkTools

julia> Symbol(Int64)
:Int64

julia> @benchmark Symbol($Int64)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.120 μs …   7.030 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.160 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.230 μs ± 225.560 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂█▆▆▄▂▁     ▂▄▂▂        ▁                          ▁        ▂
  ████████▆▅▃▁█████████▆▆████▇▆▅▃▅▄▅██▇▇▇▅▅▄▄▄▁▄▄▄▁▇██▇▆▅▄▆██ █
  1.12 μs      Histogram: log(frequency) by time      2.09 μs <

 Memory estimate: 224 bytes, allocs estimate: 6.

I would have expected some nanoseconds, but instead it takes more than a microsecond: is there a way to make the conversion faster?

stevengj · August 26, 2023, 12:40am

Symbol(x) for generic x falls back to printing x to a string, then converting the string to a symbol. This is going to be slow (allocating a new string on each call, among other things).

If you just want an arbitrary ordering, you can compare by the objectid of the type. For example:

julia> sort(Any[3, 4.2, 6, 5.6, "x", 7, "y"], by=objectid∘typeof)
7-element Vector{Any}:
 3
 6
 7
 4.2
 5.6
  "x"
  "y"

Of course, if you have a mixed-type container, such as an array holding elements of different types, you are inherently looking at sub-par Julia performance to start with.

cstjean · August 26, 2023, 2:16am

Note that you can also get the name this way:

julia> T = Int64
Int64

julia> @btime $T.name.name
  101.645 ns (0 allocations: 0 bytes)
:Int64

though with parametric types, it may not be what you want. Also, needless to say, this is not part of the public Julia API and may break…

Tortar · August 26, 2023, 1:41pm

@cstjean What about changing Base with what you propose?

julia> using BenchmarkTools

julia> a = Int64
Int64

julia> @benchmark Symbol($a)
BenchmarkTools.Trial: 10000 samples with 153 evaluations.
 Range (min … max):  681.046 ns …   8.088 μs  ┊ GC (min … max): 0.00% … 89.31%
 Time  (median):     700.654 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   721.068 ns ± 191.343 ns  ┊ GC (mean ± σ):  0.66% ±  2.35%

     ▅█▇▅▃▁▂▃▅▄▃▁      ▁▁▂▁▂▁▁▁                                 ▂
  ▃▂▆████████████▇▇▆███████████▇█▇▆▆▆▆▅▄▅▄▄▄▆▄▄▅▅▄▅▄▃▅▄▅▅▄▅▄▂▅▄ █
  681 ns        Histogram: log(frequency) by time        897 ns <

 Memory estimate: 224 bytes, allocs estimate: 6.

julia> Symbol(x::DataType) = isempty(x.parameters) ? x.name.name : Base.Symbol(x)
Symbol

julia> @benchmark Symbol($a)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
 Range (min … max):  65.169 ns … 120.225 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     68.233 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   68.597 ns ±   2.725 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                         █▁  ▁
  ▁▁▁▁▂▁▁▂▁▂▁▂▂▃▂▂▃▃▇▅▆▇████▇█▄▃▃▂▃▂▂▁▂▂▂▂▂▄▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  65.2 ns         Histogram: frequency by time         72.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Is the idea sound?

stevengj · August 26, 2023, 2:18pm

One the one hand, it’s pretty painless to add optimized methods to Base, and of course internal methods can depend on internal implementation details.

On the other hand, converting types to symbols is ordinarily something that would only occur during metaprogramming, in which context performance is usually not so much of a concern. If it is performance-sensitive for you, this might be an indication that you should re-think how you are using Julia — the cornerstone of making Julia fast is to make sure that types are inferred at compile-time in critical code, and it seems like you are violating that maxim.

bertschi · August 26, 2023, 2:30pm

Why would you convert to symbol for sorting. In my understanding, a symbol is an just opaque name and should merely be compared for equality. Was surprised that an ordering relation is defined on symbols at all, but it probably makes sense as nameof returns a symbol (which arguably is not its name).
In any case, if you want to order types by name, just use nameof – which by the way uses t.name.name on data types internally already.

stevengj · August 26, 2023, 2:33pm

nameof may not do what you want for parameterized types, since nameof returns the same symbol for distinct concrete types:

julia> nameof(Complex{Int})
:Complex

julia> nameof(Complex{Float64})
:Complex

julia> Symbol(Complex{Int})
Symbol("Complex{Int64}")

julia> Symbol(Complex{Float64})
:ComplexF64

Notice that Symbol(x), because it falls back to print and therefore show, uses abbreviated aliases like ComplexF64, which may not be what you want for sorting. To omit the type aliases, you need to call show directly with :compact=>false in the IOContext:

julia> sprint(show, ComplexF64; context=:compact=>false)
"Complex{Float64}"

julia> sprint(show, ComplexF64; context=:compact=>true)
"ComplexF64"

i.e. if you wanted to sort by the full type names, you could use by=x -> sprint(show, typeof(x); context=:compact=>false).

Of course, since this involves allocating a string for every element to be sorted (same as Symbol(T) under the hood), it won’t be blazingly fast. But, as I commented above, if you need sorting by type name to be blazingly fast then you might want to re-think your whole approach.

Tortar · August 26, 2023, 9:18pm

yes, I know that heterogeneous containers are not very performant and indeed I think that the current approach is not perfect because it is too type unstable but I thought that this conversion could have been useful in some other context, but I agree that it could be used in metaprogramming for the most part where performance is not very important, nonetheless if it possible to have an optimized version even for parametric types than I would consider improving the current Base method, though it’s maybe impossible to keep the current behaviour if Symbol(Complex{Float64}) returns an alias

Topic		Replies	Views
When should a function accept a symbol as an argument? New to Julia question , metaprogramming , design	40	5532	July 26, 2020
Performance of printing a type name Performance	3	298	August 8, 2022
Anoymous Functions Slow in sort Performance sort	3	856	August 22, 2018
Get NamedTuple element with String key. Why is this slow? Performance	3	644	March 30, 2022
WIP: faster string sort Internals & Design strings , sort	92	12401	February 8, 2018

Why is converting from a type to a symbol so slow?

Related topics