I am happy to announce AbnormalReturns.jl, which, as far as I am aware, is the fastest method to calculate abnormal returns.
Abnormal returns are common in finance and economics event studies, where you subtract some benchmark (such as the average market return on that day or a predicted firm return based on a regression) from the firm’s actual stock market return.
In general, calculating abnormal returns requires an estimation window, over which the benchmark model is fitted, and an event window, over which the abnormal returns are measured.
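To make the two windows concrete, here is a minimal sketch of a market-model abnormal return in plain Julia. It does not use this package's API; the data are synthetic and noise-free so the result is easy to check, and the window lengths and variable names are assumptions chosen only for illustration.

```julia
using LinearAlgebra

# Synthetic daily returns (illustrative only, not real market data).
n_days = 160
mkt_ret = [0.01 * sin(0.7 * t) for t in 1:n_days]
firm_ret = 0.001 .+ 1.5 .* mkt_ret

# Hypothetical windows: 150 days for estimation, the next 10 for the event.
est_window = 1:150
event_window = 151:160

# Artificial shock of 2% per day during the event window.
firm_ret[event_window] .+= 0.02

# Fit firm_ret = alpha + beta * mkt_ret by OLS over the estimation window.
X = hcat(ones(length(est_window)), mkt_ret[est_window])
alpha, beta = X \ firm_ret[est_window]

# Abnormal return = actual return minus the return the fitted model predicts.
predicted = alpha .+ beta .* mkt_ret[event_window]
abnormal = firm_ret[event_window] .- predicted

# Cumulative abnormal return over the event window.
car = sum(abnormal)
```

Because the synthetic firm return is an exact linear function of the market return outside the event window, the fitted alpha and beta recover 0.001 and 1.5, and the abnormal returns recover the injected shock.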
Estimation windows vary in length, but are typically 120-150 business days. Scaled up to a large number of events, that is a lot of data for the computer to slice and estimate.
For some studies, there are large numbers of firm events. For example, there have been over 600,000 firm earnings announcements since 1990, and each would require its own slice, regression, and aggregation.
The most common way to do this is to do a range join, then use groupby and combine. If there are a lot of firm events, in-memory datasets (Pandas, DataFrames.jl, etc.) might not be usable because of the size of the joined table (100 million+ observations with 6-7 columns of data). Therefore, a lot of people use SAS, which I find slow and which requires a license.
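The range-join approach can be sketched in plain Julia as follows. This is a toy illustration with hypothetical data and field names, not how any particular library implements it; it only shows why the joined table grows to roughly (number of events) x (window length) rows, with overlapping windows copying the same return rows more than once.

```julia
using Statistics

# Toy data (hypothetical): daily returns for two firms, days numbered 1..300.
returns = [(firm = f, day = d, ret = 0.001 * f + 0.0001 * d)
           for f in 1:2 for d in 1:300]

# Hypothetical events, each with its own 150-day estimation window.
events = [(firm = 1, event_id = 1, est_start = 1, est_end = 150),
          (firm = 1, event_id = 2, est_start = 100, est_end = 249),
          (firm = 2, event_id = 3, est_start = 51, est_end = 200)]

# Range join: keep every return row that falls inside an event's window.
joined = [(event_id = e.event_id, ret = r.ret)
          for e in events for r in returns
          if r.firm == e.firm && e.est_start <= r.day <= e.est_end]

# Group by event and combine (here, the mean return within each window).
mean_by_event = Dict(
    e.event_id => mean(j.ret for j in joined if j.event_id == e.event_id)
    for e in events)
```

Three events with 150-day windows already produce 450 joined rows from 600 source rows, and days 100-150 of firm 1 appear twice. At 600,000 events and similar window lengths, the joined table reaches on the order of 90 million rows.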
The most time consuming part of calculating abnormal returns is typically the range join or the regression.
In this package, the initial data setup is fast (less than 20 seconds for 100 million firm observations), and there is no need to repeat this process. Estimating regressions is also fast. In a benchmark of 1 million firm events with about 250 observations each, all regressions completed in under 3 seconds on a Ryzen 5 3600.
As another comparison, I took the 600,000+ firm earnings announcements and compared this package to a common SAS macro for calculating abnormal returns. SAS took over 30 minutes to complete the calculations, while this package took 30 seconds (which included loading the data into the data structure, so further calculations would take a trivial amount of time).
Overall, this package is considerably faster than these alternatives and does not require large amounts of RAM to store the data. The total space in RAM is usually smaller than the firm return data in a table object.
In addition to running regressions and calculating abnormal returns, this package provides access to common statistics in this field. For example, it provides easy access to the “alpha” and “beta” of a regression.
Another feature I am very happy to include is lag and lead terms in the regression. This is technically a part of StatsModels, but I am not aware of a package that actually implements it. For example:
@formula(firm_ret ~ 1 + lag(mkt_ret))
works as expected, with no requirement to create the column in the dataset beforehand.
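For readers unfamiliar with lag terms, the following is a hand-rolled sketch of what a one-period lag amounts to in a regression like the one above. It is not the package's or StatsModels' implementation; the synthetic data, the variable names, and the choice to drop the first row (which has no lagged value) are assumptions made for illustration.

```julia
using LinearAlgebra

# Synthetic, noise-free data: today's firm return depends on yesterday's
# market return (intercept 0.002, slope 0.8). Illustrative values only.
mkt_ret = [0.01 * sin(0.7 * t) for t in 1:100]
firm_ret = vcat(0.0, 0.002 .+ 0.8 .* mkt_ret[1:end-1])

# A one-period lag shifts the series down one row; row 1 has no predecessor.
lag_mkt = vcat(missing, mkt_ret[1:end-1])

# Drop rows where the lagged value is missing, then fit by OLS.
keep = .!ismissing.(lag_mkt)
X = hcat(ones(count(keep)), Float64.(lag_mkt[keep]))
intercept, slope = X \ firm_ret[keep]
```

The formula syntax removes the need for the manual shifting and row filtering shown here, which is the convenience described above.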