Parsing DateTimes in Julia 0.6


Short version: what’s the Julia 0.6 way to parse a DateTime from a String when the String has extra non-date info at the end, e.g. "2017-6-21 other text here"?

Since upgrading to Julia 0.6, I’ve run into trouble with parsing DateTimes from strings. The strings look like 07-DEC-13 03.00.00.000000000 PM. In Julia 0.5.2, I can parse these fine:

julia> s = "07-DEC-13 03.00.00.000000000 PM";

julia> tf = Dates.DateFormat("d-u-y H.M.S.");

julia> DateTime(s, tf)
0013-12-07T03:00:00

But in Julia 0.6, an error is thrown because the string has extra characters following the specified date format:

julia> s = "07-DEC-13 03.00.00.000000000 PM";

julia> tf = Dates.DateFormat("d-u-y H.M.S.");

julia> DateTime(s, tf)
ERROR: ArgumentError: Found extra characters at the end of date time string
Stacktrace:
 [1] macro expansion at .\dates\parse.jl:103 [inlined]
 [2] tryparsenext_core(::String, ::Int64, ::Int64, ::DateFormat{Symbol("d-u-y H.M.S."),Tuple{Base.Dates.DatePart{'d'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'u'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'H'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'M'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'S'},Base.Dates.Delim{Char,1}}}, ::Bool) at .\dates\parse.jl:39
 [3] macro expansion at .\dates\parse.jl:153 [inlined]
 [4] tryparsenext_internal(::Type{DateTime}, ::String, ::Int64, ::Int64, ::DateFormat{Symbol("d-u-y H.M.S."),Tuple{Base.Dates.DatePart{'d'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'u'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'H'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'M'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'S'},Base.Dates.Delim{Char,1}}}, ::Bool) at .\dates\parse.jl:129
 [5] parse(::Type{DateTime}, ::String, ::DateFormat{Symbol("d-u-y H.M.S."),Tuple{Base.Dates.DatePart{'d'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'u'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'H'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'M'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'S'},Base.Dates.Delim{Char,1}}}) at .\dates\parse.jl:270
 [6] DateTime(::String, ::DateFormat{Symbol("d-u-y H.M.S."),Tuple{Base.Dates.DatePart{'d'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'u'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'H'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'M'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'S'},Base.Dates.Delim{Char,1}}}) at .\dates\io.jl:422

Okay, so in this case I can just split each input string manually at the last period (.), parse the first part as a DateTime, check whether the second part contains AM or PM, and so on. But this feels suspiciously close to parsing by hand, which presumably is not the intended approach, especially given all the parsing mechanisms that already exist.
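For concreteness, here is a rough sketch of that manual workaround. The helper name parse_trailing_ampm is my own and not part of Dates, and the regex assumes every string ends in fractional seconds followed by AM or PM. It uses only the DateTime(String, DateFormat) constructor shown above, so treat it as an illustration rather than the recommended approach:

```julia
using Dates  # Julia 0.7 and later; on 0.6 drop this line, Dates is reachable from Base

# Hypothetical helper, not part of Dates: trims the fractional seconds and the
# AM/PM marker, parses the remainder, then applies the 12-hour shift by hand.
function parse_trailing_ampm(s::AbstractString)
    # Greedy (.*) backtracks to the last '.', so capture 1 is everything before it.
    m = match(r"^(.*)\.\d+\s+(AM|PM)$", s)
    m === nothing && throw(ArgumentError("unexpected timestamp layout: $s"))
    dt = DateTime(m.captures[1], Dates.DateFormat("d-u-y H.M.S"))
    h = Dates.hour(dt)
    if m.captures[2] == "PM" && h < 12
        dt += Dates.Hour(12)      # 03 PM becomes 15
    elseif m.captures[2] == "AM" && h == 12
        dt -= Dates.Hour(12)      # 12 AM becomes 00
    end
    return dt                     # two-digit year is left as-is (13 stays year 13)
end

parse_trailing_ampm("07-DEC-13 03.00.00.000000000 PM")
```

Note that the fractional seconds are discarded entirely here, which is fine for my data (they are always zero) but would lose information otherwise.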

So I checked the release notes, which include the following: “Parsing string dates from a Dates.DateFormat object has been deprecated as part of a larger effort toward faster, more extensible date parsing (#20952).” But the documentation still suggests using a DateFormat object to parse the string, and the only deprecation I can find in PR 20952 is for Dates.parse.

And in fact, Dates.parse does something quite unintuitive (to me), which is not the parsing I’d expect:

julia> Dates.parse("2017-6-21", dateformat"y-m-d")
WARNING: `Dates.parse(x::AbstractString, df::DateFormat)` is deprecated, use `sort!(filter!(el -> isa(el, Dates.Period), Dates.parse_components(x, df), rev=true, lt=Dates.periodisless)`  instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at .\deprecated.jl:70
 [2] parse(::String, ::DateFormat{Symbol("y-m-d"),Tuple{Base.Dates.DatePart{'y'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'m'},Base.Dates.Delim{Char,1},Base.Dates.DatePart{'d'}}}) at .\deprecated.jl:1310
 [3] eval(::Module, ::Any) at .\boot.jl:235
 [4] eval_user_input(::Any, ::Base.REPL.REPLBackend) at .\REPL.jl:66
 [5] macro expansion at .\REPL.jl:97 [inlined]
 [6] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at .\event.jl:73
while loading no file, in expression starting on line 0
3-element Array{Any,1}:
 2017 years
 6 months
 21 days

In other words, Dates.parse does not return a date parsed according to the DateFormat anyway; it returns an array of Periods. I also have a very high prior that the suggested sort!(filter!(... is not intended to be the general-purpose way to parse a DateTime from a string.

So two questions:

  1. Is it a bug that the constructor DateTime(String, DateFormat) now fails on trailing characters not part of the specified DateFormat, and if not a bug, how should such strings generally be dealt with?
  2. What is the proper way to parse a DateTime, and is it a documentation error that the release notes make a cryptic reference to DateFormat-based parsing being deprecated while the docs continue to suggest it?
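
To make question 1 concrete: for the short-version example, the only workaround I can think of is to trim the trailing text myself before calling the constructor. A minimal sketch, where leading_date is a hypothetical helper name of my own and the regex assumes the date always starts the string in y-m-d order. I do not know whether this is the intended approach:

```julia
using Dates  # Julia 0.7 and later; on 0.6 drop this line, Dates is reachable from Base

# Hypothetical helper: keep only the leading y-m-d run and parse just that part.
function leading_date(s::AbstractString)
    m = match(r"^\d{1,4}-\d{1,2}-\d{1,2}", s)
    m === nothing && throw(ArgumentError("no leading date in: $s"))
    return Date(m.match, Dates.DateFormat("y-m-d"))
end

leading_date("2017-6-21 other text here")
```

This works, but it duplicates the format in a regex, which is exactly the kind of hand parsing I was hoping to avoid.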