TCNs are hyped as the best new approach to ML forecasting of financial time series since LSTMs, with claims of better performance and cheaper computation. Has anyone managed to implement TCNs in Julia? (In theory, it should be possible in Flux using a clever mix of CNNs and a few other things.) I would like to run some tests to see for myself, but would defer to others with superior knowledge of the details.
The reference implementation https://github.com/locuslab/TCN/blob/master/TCN/tcn.py works well.
Here’s a translation of it to Flux:
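using Flux

# Residual block: a main convolutional chain plus a shortcut, summed and passed through σ.
struct TcnBlock
    chain::Chain
    shortcut
    σ
end
Flux.@functor TcnBlock  # lets Flux see the parameters of both chain and shortcut
(block::TcnBlock)(input) = block.σ.(block.chain(input) + block.shortcut(input))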
function tcn_block(a::Int, b::Int;
                   dilation::Int,
                   kernel_size=3,
                   batchnorm=n->BatchNorm(n, relu),
                   dropout=0.2)
    main = Chain(
        Conv((1, kernel_size), a=>b; dilation = dilation, pad = Flux.SamePad()),
        batchnorm(b),
        Dropout(dropout),
        Conv((1, kernel_size), b=>b; dilation = dilation, pad = Flux.SamePad()),
        batchnorm(b),
        Dropout(dropout),
    )
    shortcut = Conv((1,1), a=>b)  # 1x1 convolution so channel counts match for the sum
    return TcnBlock(main, shortcut, relu)
end
function tcn(channels; kernel_size=3, batchnorm=n->BatchNorm(n, relu), dropout=0.2)
    return Chain((tcn_block(a, b; dilation=2^(i-1), kernel_size=kernel_size, batchnorm=batchnorm, dropout=dropout)
                  for (i, (a,b)) in enumerate(zip(channels[1:end-1], channels[2:end])))...)
end
the_tcn = tcn([5, 10, 10, 10, 5])
the_tcn(rand(Float32, 1, 81, 5, 32))
This Flux version doesn’t do any chomping, so the filtering isn’t causal: it extends equally far into the future and the past. Just grab the model output shifted to the right by the right amount and it’ll be ‘causal’. The shift depends on your kernel_size and the length of channels. For odd kernel_size, each of the two convolutions in block i reaches (kernel_size - 1)/2 * 2^(i-1) steps ahead, so the total should be (kernel_size - 1) * (2^(length(channels)-1) - 1), e.g. 30 steps for the example above.
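One way to pin down the shift is to add up how far each convolution reaches into the future. The sketch below does that in plain Julia, assuming an odd kernel_size, symmetric "same" padding, two convolutions per block and dilation 2^(i-1) in block i, as in the code above. The names `tcn_lookahead` and `causal_view` are made up for illustration and are not part of Flux; edge effects from zero padding are ignored.

```julia
# Number of future time steps that can influence one output of the
# non-causal network (assumptions: odd kernel, symmetric padding,
# two convolutions per block, dilation 2^(i-1) in block i).
function tcn_lookahead(n_blocks::Integer; kernel_size::Integer=3)
    isodd(kernel_size) || throw(ArgumentError("kernel_size must be odd"))
    half = (kernel_size - 1) ÷ 2
    total = 0
    for i in 1:n_blocks
        dilation = 2^(i - 1)
        total += 2 * half * dilation   # two convolutions per block
    end
    return total
end

# y[t] uses inputs up to time t + lookahead, so treat it as only being
# available at time t + lookahead, i.e. shift the output to the right.
function causal_view(y::AbstractVector, lookahead::Integer)
    T = length(y)
    0 <= lookahead < T || throw(ArgumentError("lookahead must be in 0:length(y)-1"))
    return (times = (lookahead + 1):T, values = y[1:(T - lookahead)])
end

channels = [5, 10, 10, 10, 5]
shift = tcn_lookahead(length(channels) - 1; kernel_size = 3)
```

With these assumptions `shift` is 30 for the four-block example, and `causal_view(y, shift)` pairs each output value with the earliest time at which all of its inputs have been observed.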
Bugfix edit: added some boilerplate so that Flux sees the TcnBlock shortcut params.
Thank you so much for this! That is very helpful indeed.
It is interesting to compare the Flux and Python code. The Flux version is obviously much more succinct and elegant. How does performance compare, I wonder?
Anyway, I am eager to try this out on an example.
Thanks again!
@compleat if you have an example of using TCN on a real finance dataset, it would be super helpful if you could share w/ the Julia community!
Hi, I was thinking the same thing, perhaps just one simple example for daily returns of the S&P or FTSE as forecast by lags. I would like to try this using Kolia’s code, but I might need some help. [Also, I didn’t understand what he wrote at the end above]
Anyway, thanks for the suggestion. If I get anything running, I will post it.
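For the lagged-returns example suggested above, a possible starting point is to slice a return series into windows with the (1, time, channels, batch) layout used in the example call earlier in the thread. This is only a sketch in plain Julia: `lagged_windows` is a hypothetical helper, and the price series is simulated placeholder data, not real S&P or FTSE prices.

```julia
# Build inputs and one-step-ahead targets from a series of returns.
# x has layout (1, n_lags, 1, n_samples): one feature channel, one window per sample.
function lagged_windows(returns::AbstractVector{<:Real}, n_lags::Integer)
    n = length(returns) - n_lags
    n > 0 || throw(ArgumentError("series is too short for n_lags"))
    x = Array{Float32}(undef, 1, n_lags, 1, n)
    y = Vector{Float32}(undef, n)
    for j in 1:n
        x[1, :, 1, j] = returns[j:(j + n_lags - 1)]  # the n_lags most recent returns
        y[j] = returns[j + n_lags]                   # the return that follows them
    end
    return x, y
end

# Placeholder data: a simulated price path, NOT real market data.
prices = 100 .* exp.(cumsum(0.01 .* randn(500)))
returns = diff(log.(prices))          # 499 daily log returns
x, y = lagged_windows(returns, 81)
size(x)                               # (1, 81, 1, 418)
```

Because `x` has a single channel, it would need a network whose first channel count is 1, for instance something like `tcn([1, 10, 10, 1])` with the constructor posted above (an untested assumption on my part).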