HFT apps are certainly not using `DateTime`; I’m not sure it’s an appropriate use case to even pretend to support with the current design.
I don’t know if I missed it, but has anyone suggested `DateTime{T}`?
Yes, that has been suggested (multiple times), and no, that is not an option; it makes existing code like `Vector{DateTime}` or

```julia
struct Foo
    dt::DateTime
end
```

type unstable. Not to mention that it’d be just as much work as adding a `HighPrecisionDateTime` without those problems.
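To make the instability concrete, here is a minimal sketch (using a hypothetical `ParametricDateTime`, not an existing type): a parametric type used without its parameter is abstract, so containers and fields typed that way can no longer be inferred concretely.

```julia
# Hypothetical sketch, not Dates code: a parametric stand-in for DateTime.
struct ParametricDateTime{T<:Integer}
    instant::T
end

struct Foo2
    dt::ParametricDateTime   # no parameter given, so the field is abstractly typed
end

isconcretetype(ParametricDateTime)    # false
isconcretetype(fieldtype(Foo2, :dt))  # false: access to Foo2.dt is type-unstable
```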
I suggested that on GitHub, where @bvdmitri noted the same issue as @CameronBieganek. A solution would be to define

```julia
const DateTime = ParametricDateTime{Int64}
```

Not only could one use this for the precision issue, but one could also use `ParametricDateTime{SafeInt64}` to perform checked arithmetic, if one were worried about overflow.
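For readers unfamiliar with the distinction, here is what checked arithmetic buys you, shown with `Base.Checked` directly (a `SafeInt64`, as in SaferIntegers.jl, wraps the same mechanism in a type):

```julia
using Base.Checked: checked_add

typemax(Int64) + 1              # silently wraps around to typemin(Int64)
checked_add(typemax(Int64), 1)  # throws OverflowError instead of wrapping
```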
You’d have to pull that change through all of the existing code, modify `UTInstant` as well, and add a `CheckedInt` to Base. That seems like much more work than just adding a distinct new type.
Yes, if all we’re going to do is add `PrecisionDateTime`, then I agree we probably don’t need to make it parametric; that will, however, force a singular choice one way or another about checked arithmetic.
On that note, is there anyone here with the expertise to weigh in on whether the performance of `DateTime` arithmetic matters? This uncertainty is holding up a decision about whether we should enforce checked arithmetic when adding or subtracting periods from a `DateTime`. Specifically, in my tests switching to checked arithmetic is about 2x slower for “normal” code, and 5.5x (AVX2) or 11x (AVX512) slower if you’re using SIMD vectorization. (Interestingly, plain `Int128` is similar to checked `Int64`, and checked `Int128` is another 2x on top of all this.) But this matters only if `DateTime` arithmetic is the performance bottleneck. I have no idea whether that ever happens in practice.
Could there be an arithmetic equivalent of `@inbounds` that users can apply to a SafeInteger add?
It’d be interesting to see how you tested this. I’d imagine adding the same offset to a vector of `DateTime`s is less common, and I’m unsure how common adding distinct offsets to the same `DateTime` is.
Just a simple loop-based manual implementation of `sum` for lists. Not intended to mimic a specific `DateTime` workload.
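Roughly like this, for the record (a sketch of that kind of test, not the exact code; the function names are illustrative):

```julia
using BenchmarkTools
using Base.Checked: checked_add

function sum_wrapping(xs::Vector{Int64})
    s = zero(Int64)
    @inbounds @simd for x in xs   # wrapping `+` allows SIMD vectorization
        s += x
    end
    return s
end

function sum_checked(xs::Vector{Int64})
    s = zero(Int64)
    @inbounds for x in xs         # checked_add may throw, which blocks SIMD
        s = checked_add(s, x)
    end
    return s
end

xs = rand(Int64(0):Int64(10^9), 10^6)
@btime sum_wrapping($xs)
@btime sum_checked($xs)
```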
Again, segmented time encoding (some additional reading) would address some of the issues mentioned in the previous posts. Such a segmented time code might look as follows:

```julia
struct DsTimeCode           # day-segmented time code, medium resolution
                            # (can represent leap-second times)
    day2000::Int32          # day2000 = 0 is 2000-01-01
    hμsec::UInt32           # hectomicro (10^-4) seconds of day
end

struct HrTimeCode           # high-resolution time code
    dstimecode::DsTimeCode  # day-segmented time code, medium resolution
    hfsec::UInt64           # dekayokto (10^-23) seconds of dstimecode
end
```
A simpler version, wasting a few bits, is:

```julia
struct DsTimeCode           # day-segmented time code, medium resolution
                            # (can represent leap-second times)
    day2000::Int32          # day2000 = 0 is 2000-01-01
    msec::Int32             # milliseconds of day
end

struct HrTimeCode           # high-resolution time code
    dstimecode::DsTimeCode  # day-segmented time code, medium resolution
    zsec::Int64             # zepto (10^-21) seconds of dstimecode
end
```
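Arithmetic on a segmented code has to recombine the segments; for example, a millisecond difference could look like this (a sketch assuming the simpler layout above and ignoring leap seconds; `msec_diff` is an illustrative name):

```julia
# Difference of two day-segmented time codes, in milliseconds.
# Widen to Int64 first so distant dates cannot overflow Int32.
function msec_diff(a::DsTimeCode, b::DsTimeCode)
    day_ms = (Int64(a.day2000) - Int64(b.day2000)) * 86_400_000
    return day_ms + (Int64(a.msec) - Int64(b.msec))
end
```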
I have been using something like this for some time, but did not publish it as a package and have not timed it extensively, sorry.
I was part of this discussion on Slack. I’m an astronomer and worked in the Time Services Department, now the Precise Time Department, at the US Naval Observatory. They maintain the Master Clock (*).

The astronomical time standard is Julian days, which is usually split between integer days and floating-point fractional days. Seconds are also used, likewise split between integer seconds and floating-point fractional seconds. Nanosecond time resolution is common in astronomy, mainly because of pulsar timing research. Datasets exist with better than nanosecond resolution over half a century; that is a relative precision better than 10^-18. Atomic clocks are regularly measured to picosecond precision, and the international time standard is currently measured to 10^-14 seconds. In the near future (~10 years), optical clocks will have a precision of ~10^-21 seconds. For a few thousand dollars, you can buy a card that measures time to <1 picosecond (see the GuideTech website). My point is that nanosecond and better precision is here and will only get more precise in the near future.

After reading through this discussion, you have convinced me never to use the DateTime module, because it does not appear to adhere to the internationally approved time standards. That’s unfortunate, because a standard time library that adheres to approved standards would be beneficial to Julia. Sorry for being a Debby Downer.
(*) The Master Clock is actually an ensemble of over 200 atomic clocks separated into 5 Master Clocks. Each Master Clock is composed of short-timescale hydrogen masers and medium-timescale cesium clocks. They are all synchronized by 6 long-timescale rubidium fountain clocks that are accurate to <1 second over the lifetime of the Universe, i.e., ~14 billion years.
“Expertise” may be a strong word, but I work in a domain where timestamp arithmetic performance does matter (HFT). And as I mentioned before, I feel quite sure that there must be no (serious) `DateTime` users in this domain due to other aspects of its design, not its performance. So from that perspective, I wholeheartedly support checked arithmetic.
While pondering this issue a bit more, the problem that I see is that DateTime is mixing functionality with input/output. I have always found this to eventually lead to problems. How the values are stored internally should not be dictated by how they are read in or written out; IO is the responsibility of the user. If it takes 128 bits to implement the functionality accurately and precisely, then so be it.

Most users are working with time series of a few thousand values, and probably not more than a few million at most; a few billion values would be very rare. Hence the amount of memory needed is not huge. In other words, precision is more likely to be a problem than compactness of the data. If compactness is an issue, then the times can be stored as offsets to a reference time or in a similar data structure. For astronomical software, I will focus on ensuring that precision is paramount.
I see `DateTime` as a kind of integer, so I want `DateTime` to behave like `Int`, and I also want a “floating-point” companion to `DateTime`, just as `Int` has `Float64` as a companion.
Reading a `DateTime` from a String is problematic:

```julia
d = parse(DateTime, "2023-07-01T12:34:56.123456")  # -> error
```
For `Int`, you solve this problem by

```julia
f = parse(Float64, "1234.5678")
i = round(Int, f)
```
So, it would be nice if we had a floating-point-like companion:

```julia
fd = parse(FloatingDateTime, "2023-07-01T12:34:56.123456")
d = round(DateTime, fd)
```
The “microsecond” problem is the same. `Microsecond` is to `DateTime` what `Float64` is to `Int`:

```julia
f = 3 + 0.6        # -> Float64
i = Int(f)         # error
i = Int(round(f))  # fine
```
Then, I would like

```julia
fd = DateTime(...) + Nanosecond(...)  # -> floating-point DateTime
d = DateTime(fd)         # error
d = DateTime(round(fd))  # fine
```
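For concreteness, a minimal sketch of what such a companion could look like (the name `FloatingDateTime`, the epoch, and the `Float64`-seconds storage are all illustrative assumptions, not an existing or officially proposed design):

```julia
using Dates

# Hypothetical floating-point datetime: Float64 seconds since 2000-01-01.
# Like any float, its resolution degrades as you move away from the epoch.
struct FloatingDateTime
    sec2000::Float64
end

const FDT_EPOCH = DateTime(2000, 1, 1)

# Round to the nearest representable (millisecond-precision) DateTime.
Base.round(::Type{DateTime}, fd::FloatingDateTime) =
    FDT_EPOCH + Millisecond(round(Int64, fd.sec2000 * 1000))

fd = FloatingDateTime(86_400.1234567)  # one day plus ~123.4567 ms past the epoch
round(DateTime, fd)                    # -> 2000-01-02T00:00:00.123
```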
I don’t think the issue is a lack of adherence to international standards. The Dates module does aim to adhere to an international standard, however, it’s one for representing, communicating, and doing simple calendar arithmetic on Gregorian calendar dates and 24-hour time of day (ISO 8601), not one for precision timekeeping (these are really quite unrelated concerns). It can be useful for things like timestamps in database records, but probably not for astronomy and GNSS, and this is arguably the correct tradeoff for a standard library.
Specifically, the documentation states that Dates assumes Universal Time (UT), which is basically solar time, meaning that the lengths of days and seconds are nonuniform due to variations in the earth’s rotation. This is convenient because it means each day has exactly 86400 seconds and you don’t have to fuss with leap seconds in DateTime arithmetic, but it also means that even millisecond precision is a bit nonsensical, and locating events with nanosecond precision on a timeline that’s uniform across centuries is way beyond the scope of the module.
The mystery here is really why the module supports sub-millisecond intervals at all, given the other design decisions. Perhaps the solution is to deprecate `Microsecond` and `Nanosecond` and point users to appropriate packages?
That is not the only reason for wanting increased precision though. Some might want that, but many others (including myself) would just want to support the timescales that are in use today. Many things operate at a sub-millisecond frequency today. I personally do not care if I can measure something down to the nanosecond on Sep 1 1502. But right now (I mean literally right as I type), many datasets in various fields of work are being populated with data timestamped to the nanosecond.
This is really the big point in my opinion. If I have a CSV file from someone with ISO timestamps to the nanosecond, I want to be able to read them in, and then discover the time difference between the first one and the tenth one (or whatever).
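As a stopgap today, one can split off the sub-second digits and track them separately; a rough sketch (the helper `parse_ns` is illustrative, not an existing API, and assumes a well-formed ISO timestamp with a fractional part):

```julia
using Dates

# Split an ISO timestamp into a whole-second DateTime plus a Nanosecond remainder.
function parse_ns(s::AbstractString)
    base, frac = split(s, '.')
    dt = DateTime(base)                    # whole-second part
    ns = parse(Int64, rpad(frac, 9, '0'))  # pad to 9 digits -> nanoseconds
    return dt, Nanosecond(ns)
end

a = parse_ns("2023-07-01T12:34:56.123456789")
b = parse_ns("2023-07-01T12:34:58.000000001")
# nanosecond difference between the two timestamps:
Δ = Nanosecond(Dates.value(b[1] - a[1]) * 1_000_000) + (b[2] - a[2])
```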
That’s fair, just note that it would add some implementation complexity and maintenance burden in that your module now needs to know about leap seconds (assuming UTC timestamps).
Let’s continue the discussion on UT vs UTC over in Universal Time vs UTC Time in Dates and try to keep this topic on the higher precision arithmetic. My split wasn’t perfect here, but I tried to preserve as much as I could. In doing so, I unfortunately ran into some Discourse bugs, which placed some posts completely out of order. Please ping me if you see anything that’s still out-of-place or nonsensical in its (possibly new) context, but I think I got it now.
> the problem that I see is that DateTime is mixing functionality with input/output. I have always found this to eventually lead to problems. How the values are stored internally should not be dictated by how they are read in or written out
I may not be understanding your concern, but that doesn’t appear to be a problem for the Dates stdlib. `DateTime` is represented in milliseconds, and all arithmetic operations use this representation directly. Parsing strings into `DateTime`s and printing `DateTime`s are completely separate concerns. Can you be a bit more concrete about your concern?
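To illustrate the separation with standard Dates calls:

```julia
using Dates

dt = DateTime(2023, 7, 1, 12, 34, 56)
Dates.value(dt)                       # the internal millisecond count
dt + Millisecond(1)                   # arithmetic acts on that count directly
Dates.format(dt, "yyyy/mm/dd HH:MM")  # formatting/printing is a separate layer
```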