Temporal convolutional networks (TCNs) are hyped as the best new approach to ML forecasting of financial time series since LSTMs, and are claimed to perform better and to be cheaper to compute. Has anyone managed to implement TCNs in Julia? [In theory, it should be possible in Flux using a clever mix of CNNs and a few other things.] I would like to run some tests to see for myself, but would defer to others with superior knowledge for the details.
The reference implementation https://github.com/locuslab/TCN/blob/master/TCN/tcn.py works well.
Here’s a translation of it to Flux:

    using Flux

    # Residual block: σ.(chain(x) + shortcut(x))
    struct TcnBlock
        chain::Chain
        shortcut
        σ
    end
    # Lets Flux see the block's parameters (newer Flux versions offer Flux.@layer for this)
    Flux.@functor TcnBlock

    (block::TcnBlock)(input) = block.σ.(block.chain(input) + block.shortcut(input))

    function tcn_block(a::Int, b::Int; dilation::Int, kernel_size=3, batchnorm=n->BatchNorm(n, relu), dropout=0.2)
        main = Chain(
            Conv((1, kernel_size), a=>b; dilation = dilation, pad = Flux.SamePad()),
            batchnorm(b),
            Dropout(dropout),
            Conv((1, kernel_size), b=>b; dilation = dilation, pad = Flux.SamePad()),
            batchnorm(b),
            Dropout(dropout),
        )
        # 1x1 convolution so the skip connection matches the output channel count
        shortcut = Conv((1,1), a=>b)
        return TcnBlock(main, shortcut, relu)
    end

    # One block per consecutive pair in `channels`; block i uses dilation 2^(i-1)
    function tcn(channels; kernel_size=3, batchnorm=n->BatchNorm(n, relu), dropout=0.2)
        return Chain((tcn_block(a, b; dilation=2^(i-1), kernel_size=kernel_size, batchnorm=batchnorm, dropout=dropout)
                      for (i, (a,b)) in enumerate(zip(channels[1:end-1], channels[2:end])))...)
    end

    the_tcn = tcn([5, 10, 10, 10, 5])
    # Input layout: (1, time steps, channels, batch)
    the_tcn(rand(Float32, 1, 81, 5, 32))
This Flux version doesn’t do any chomping, so the filtering isn’t causal: each output depends on inputs equally far into the future as into the past. Just grab the model output shifted to the right by the right amount and it’ll be ‘causal’. The shift depends on your kernel_size and the length of channels. Since each block applies two convolutions with dilation 2^(i-1), it should be (kernel_size - 1) * (2^(length(channels)-1) - 1) for odd kernel_sizes.
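To check the size of that shift, here is a minimal plain-Julia sketch (no Flux) that counts the look-ahead from the layer structure and confirms it on a toy stack of dilated convolutions. It assumes odd kernel sizes, symmetric zero padding, and two convolutions per block as in the code above. `lookahead`, `dilated_same_conv` and `toy_tcn` are illustrative names, not part of Flux, and the toy stack uses all-ones weights with no batchnorm, dropout or shortcut.

```julia
# Look-ahead in time steps, assuming block i uses dilation 2^(i-1)
# and contains two symmetric ("same"-padded) convolutions.
function lookahead(channels; kernel_size::Int=3)
    @assert isodd(kernel_size) "symmetric padding needs an odd kernel size"
    total = 0
    for i in 1:(length(channels) - 1)
        dilation = 2^(i - 1)
        # each convolution reaches dilation*(kernel_size-1)/2 steps ahead
        total += 2 * (dilation * (kernel_size - 1) ÷ 2)
    end
    return total
end

# Dilated convolution on a vector with symmetric zero padding.
function dilated_same_conv(x::Vector{Float64}, w::Vector{Float64}, dilation::Int)
    k = length(w)
    half = (k - 1) ÷ 2
    T = length(x)
    y = zeros(T)
    for t in 1:T, j in 1:k
        s = t + (j - 1 - half) * dilation
        if 1 <= s <= T
            y[t] += w[j] * x[s]
        end
    end
    return y
end

# Toy stand-in for the TCN: two all-ones convolutions per block.
function toy_tcn(x::Vector{Float64}, nblocks::Int; kernel_size::Int=3)
    w = ones(kernel_size)
    y = copy(x)
    for i in 1:nblocks, _ in 1:2
        y = dilated_same_conv(y, w, 2^(i - 1))
    end
    return y
end

R = lookahead([5, 10, 10, 10, 5])   # look-ahead for the example network

# Output at step 20 reacts to an impulse R steps ahead, but not R+1 steps ahead.
inside = zeros(81);  inside[20 + R] = 1.0
outside = zeros(81); outside[20 + R + 1] = 1.0
reacts_inside = toy_tcn(inside, 4)[20] != 0.0
reacts_outside = toy_tcn(outside, 4)[20] != 0.0
```

With the output shifted right by `R`, the value that lands at time step `s` is the one originally computed at `s - R`, so it only uses inputs up to `s`. The first `R` shifted positions have no output.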
Bugfix edit: added some boilerplate so that Flux sees the TcnBlock shortcut params.
Thank you so much for this! That is very helpful indeed.
It is interesting to compare the Flux and Python code. The Flux version is obviously much more succinct and elegant. How does performance compare, I wonder?
Anyway, I am eager to try this out on an example.
@compleat if you have an example of using TCN on a real finance dataset, it would be super helpful if you could share w/ the Julia community!
Hi, I was thinking the same thing, perhaps just one simple example for daily returns of the S&P or FTSE as forecast by lags. I would like to try this using Kolia’s code, but I might need some help. [Also, I didn’t understand what he wrote at the end above]
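As a possible starting point for that kind of test, here is a rough sketch of turning a price series into windows of lagged daily log returns, laid out as (1, time steps, channels, batch) like the example input in the code above. The prices are a synthetic random walk standing in for an index series, not real S&P or FTSE data, and `log_returns` and `lagged_windows` are hypothetical helper names.

```julia
using Random

# Synthetic random-walk prices (NOT real market data), for illustration only.
Random.seed!(1)
prices = 100 .* exp.(cumsum(0.01 .* randn(300)))

# Daily log returns: r[t] = log(p[t+1] / p[t])
log_returns(p) = diff(log.(p))

# Windows of `nlags` consecutive returns, each paired with the next return
# as the forecast target. X has layout (1, nlags, 1 channel, number of windows).
function lagged_windows(r::AbstractVector, nlags::Int)
    n = length(r) - nlags
    X = zeros(Float32, 1, nlags, 1, n)
    y = zeros(Float32, n)
    for i in 1:n
        X[1, :, 1, i] .= r[i:i+nlags-1]
        y[i] = r[i+nlags]
    end
    return X, y
end

r = log_returns(prices)
X, y = lagged_windows(r, 20)
```

Since every window contains only returns that precede its target, a model fed these windows cannot see the value it is asked to forecast, even if the filtering inside the window is not causal. The first entry of `channels` would need to be 1 to match the single input channel, and how to reduce the TCN output to one forecast per window is left open here.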
Anyway, thanks for the suggestion. If I get anything running, I will post it.