Slow DateTime() function - how to improve?


#1

Here is a piece of code that runs very slowly:

function test2(n)
    dtstr = "01.01.2016 10:42"
    dt = DateTime(2016)
    for i = 1:n
        dt = DateTime(dtstr, "dd.mm.yyyy HH:MM")
    end
    dt
end

@time test2(50000)

20.269838 seconds (10.45 M allocations: 646.281 MB, 1.36% gc time)


#2

I think there’s work in progress to speed this up. See the following:


#3

Thanks! In the manual I found that a DateFormat object, built once up front, can also be passed to DateTime. This is about five times faster:

function test3(n)
    dtstr = "01.01.2016 10:42"
    dt = DateTime(2016)
    df = Dates.DateFormat("dd.mm.yyyy HH:MM")
    for i = 1:n
        dt = DateTime(dtstr, df)
    end
    dt
end

@time test3(50000)

4.018116 seconds (3.20 M allocations: 119.101 MB, 1.31% gc time)

2016-01-01T10:42:00
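
The same idea can be applied to a whole collection of strings. This is only a minimal sketch, assuming Julia 1.x, where `Dates` is a standard library that has to be loaded with `using Dates` (on 0.5 it lives in `Base.Dates`). The helper name `parse_all` is hypothetical and not part of any library:

```julia
using Dates

# Build the format object once, outside any loop, so the format string
# is not re-interpreted on every call.
const DF = DateFormat("dd.mm.yyyy HH:MM")

# Hypothetical helper: parse every string with the shared format object.
parse_all(strs) = [DateTime(s, DF) for s in strs]

dts = parse_all(["01.01.2016 10:42", "31.12.2016 23:59"])
```

On Julia 0.6 and later, the `dateformat"dd.mm.yyyy HH:MM"` string macro is another way to get a format object that is constructed only once.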


#4

Side note: to benchmark your code, you should check out the BenchmarkTools.jl package:

julia> using BenchmarkTools

julia> dtstr="01.01.2016 10:42"
"01.01.2016 10:42"

julia> @benchmark DateTime($dtstr, "dd.mm.yyyy HH:MM")
BenchmarkTools.Trial: 
  memory estimate:  7.47 kb
  allocs estimate:  182
  --------------
  minimum time:     69.865 μs (0.00% GC)
  median time:      72.531 μs (0.00% GC)
  mean time:        77.509 μs (2.01% GC)
  maximum time:     6.822 ms (78.13% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> df = Dates.DateFormat("dd.mm.yyyy HH:MM")
Base.Dates.DateFormat(Base.Dates.Slot[Base.Dates.DelimitedSlot{Base.Dates.Day}(Base.Dates.Day,'d',2,"."),Base.Dates.DelimitedSlot{Base.Dates.Month}(Base.Dates.Month,'m',2,"."),Base.Dates.DelimitedSlot{Base.Dates.Year}(Base.Dates.Year,'y',4," "),Base.Dates.DelimitedSlot{Base.Dates.Hour}(Base.Dates.Hour,'H',2,":"),Base.Dates.DelimitedSlot{Base.Dates.Minute}(Base.Dates.Minute,'M',2,r"(?=\s|$)")],"","english")

julia> @benchmark DateTime($dtstr, $df)
BenchmarkTools.Trial: 
  memory estimate:  1.86 kb
  allocs estimate:  57
  --------------
  minimum time:     16.038 μs (0.00% GC)
  median time:      16.813 μs (0.00% GC)
  mean time:        17.314 μs (0.00% GC)
  maximum time:     103.815 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

#5

@benchmark is a good tool! I also tried some hand-made code for parsing an ISO-style DateTime format, and it is (in version 0.5) about 20 times faster:

function datetime_iso(dtiso::String)
    year = parse(Int,dtiso[1:4])
    month = parse(Int,dtiso[6:7])
    day = parse(Int,dtiso[9:10])
    hour = parse(Int,dtiso[12:13])
    minute = parse(Int,dtiso[15:16])
    DateTime(year, month, day, hour, minute)
end

@benchmark datetime_iso("2005-10-31 23:59")
BenchmarkTools.Trial: 
  memory estimate:  560 bytes
  allocs estimate:  10
  --------------
  minimum time:     707.306 ns (0.00% GC)
  median time:      779.382 ns (0.00% GC)
  mean time:        861.055 ns (7.48% GC)
  maximum time:     19.558 μs (94.63% GC)
  --------------
  samples:          10000
  evals/sample:     157
  time tolerance:   5.00%
  memory tolerance: 1.00%
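
The fixed-position parser above throws on malformed input. A possible variant that returns `nothing` instead is sketched below, assuming Julia 1.1 or later (for `isnothing`) and input of the exact form "yyyy-mm-dd HH:MM". The name `try_datetime_iso` is hypothetical, and this version has not been benchmarked:

```julia
using Dates

# Hypothetical variant of datetime_iso: returns `nothing` instead of throwing
# when the input does not look like "yyyy-mm-dd HH:MM".
function try_datetime_iso(s::AbstractString)
    # Fixed-position parsing assumes ASCII input of exactly 16 characters.
    (isascii(s) && length(s) == 16) || return nothing
    # Check the separators at their expected positions.
    (s[5] == '-' && s[8] == '-' && s[11] == ' ' && s[14] == ':') || return nothing
    # tryparse returns `nothing` for a field that is not an integer.
    fields = (tryparse(Int, SubString(s, 1, 4)),
              tryparse(Int, SubString(s, 6, 7)),
              tryparse(Int, SubString(s, 9, 10)),
              tryparse(Int, SubString(s, 12, 13)),
              tryparse(Int, SubString(s, 15, 16)))
    any(isnothing, fields) && return nothing
    try
        return DateTime(fields...)
    catch err
        # DateTime throws ArgumentError for out-of-range values such as month 13.
        err isa ArgumentError || rethrow()
        return nothing
    end
end
```

For example, `try_datetime_iso("2005-10-31 23:59")` gives the same DateTime as `datetime_iso`, while an out-of-range month or a string of the wrong length yields `nothing`.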



#6

Yes, Shashi and Curtis’s work on speeding up DateTime parsing has been merged and will be released with 0.6. You can check out some of the (official) benchmarks used in the PR referenced above; in general, parsing became something on the order of 1000x faster, with the ISO format essentially compiling down to your implementation above. Yay for faster DateTime parsing!