TCN (Temporal Convolutional Networks) in Julia?

TCNs are hyped as the best new approach to ML forecasting of financial time series since LSTMs, and have been claimed to perform better and to be cheaper to compute. Has anyone managed to implement TCNs in Julia? [In theory, it should be possible in Flux using a clever mix of CNNs and a few other things.] I would like to run some tests to see for myself, but would defer to others with superior knowledge on the details.

The reference implementation https://github.com/locuslab/TCN/blob/master/TCN/tcn.py works well.

Here’s a translation of it to Flux:

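using Flux

# Residual block: applies σ elementwise to chain(x) + shortcut(x)
struct TcnBlock
    chain::Chain
    shortcut
    σ
end

# Register the struct with Flux so the parameters of both `chain` and `shortcut` are seen
Flux.@functor TcnBlock

(block::TcnBlock)(input) = block.σ.(block.chain(input) + block.shortcut(input))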

function tcn_block(a::Int, b::Int;
                          dilation::Int,
                          kernel_size=3,
                          batchnorm=n->BatchNorm(n, relu),
                          dropout=0.2)
    main = Chain(
        Conv((1, kernel_size), a=>b; dilation = dilation, pad = Flux.SamePad()),
        batchnorm(b),
        Dropout(dropout),
        Conv((1, kernel_size), b=>b; dilation = dilation, pad = Flux.SamePad()),
        batchnorm(b),
        Dropout(dropout),
    )
    # 1x1 convolution so the skip connection has the same number of channels as `main`
    shortcut = Conv((1,1), a=>b)
    return TcnBlock(main, shortcut, relu)
end

function tcn(channels; kernel_size=3, batchnorm=n->BatchNorm(n, relu), dropout=0.2)
    return Chain((tcn_block(a, b; dilation=2^(i-1), kernel_size=kernel_size, batchnorm=batchnorm, dropout=dropout)
                  for (i, (a,b)) in enumerate(zip(channels[1:end-1], channels[2:end])))...)
end

the_tcn = tcn([5, 10, 10, 10, 5])

the_tcn(rand(Float32, 1, 81, 5, 32))

This Flux version doesn’t do any chomping, so the filtering isn’t causal: it extends equally far into the future and into the past. Just grab the model output shifted to the right by the one-sided reach of the filter stack, and it’ll be ‘causal’. The shift depends on your kernel_size and the length of channels. Each block has two convolutions with dilation 2^(i-1), so for odd kernel_sizes it works out to (kernel_size - 1) * (2^(length(channels)-1) - 1), which is 30 time steps for the_tcn above.
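To make the shift concrete, here is a minimal sketch in plain Julia (no Flux needed) that computes the one-sided reach and cross-checks it by pushing a unit impulse through the same dilation schedule. The names lookahead, same_dilated_conv and measured_reach are hypothetical helpers made up for this illustration; they are not part of Flux or of the reference implementation. The sketch assumes an odd kernel_size, two convolutions per block and dilation 2^(i-1) in block i, as in the code above, and it ignores BatchNorm, Dropout and the 1x1 shortcut because those do not mix time steps.

```julia
# One-sided reach (in time steps) of the symmetric, "same"-padded stack.
# A single convolution with an odd kernel reaches
# (kernel_size - 1) * dilation ÷ 2 steps to each side; each block has two.
function lookahead(n_blocks::Int; kernel_size::Int=3)
    isodd(kernel_size) || throw(ArgumentError("kernel_size must be odd"))
    return sum(2 * ((kernel_size - 1) * 2^(i - 1) ÷ 2) for i in 1:n_blocks)
end

# Hypothetical helper: 1-D dilated convolution with zero "same" padding and
# all-ones weights. It only traces which outputs an input can influence.
function same_dilated_conv(x::Vector{Float64}, kernel_size::Int, dilation::Int)
    half = (kernel_size - 1) ÷ 2
    y = zeros(length(x))
    for t in eachindex(x), j in -half:half
        s = t + j * dilation
        if 1 <= s <= length(x)
            y[t] += x[s]
        end
    end
    return y
end

# Push a unit impulse through the dilation schedule and measure how far it
# spreads to the left and to the right of its starting position.
function measured_reach(n_blocks::Int; kernel_size::Int=3, len::Int=201)
    x = zeros(len)
    centre = (len + 1) ÷ 2
    x[centre] = 1.0
    for i in 1:n_blocks, _ in 1:2      # two convolutions per block
        x = same_dilated_conv(x, kernel_size, 2^(i - 1))
    end
    touched = findall(!iszero, x)
    return centre - first(touched), last(touched) - centre
end

shift = lookahead(4)        # four blocks, as in tcn([5, 10, 10, 10, 5]); gives 30
reach = measured_reach(4)   # (30, 30): 30 steps into the past and 30 into the future
```

With the default settings, output position t of the_tcn therefore depends on inputs t-30 through t+30, so reading the output at position t-30 as the prediction for time t uses no input later than t.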

Bugfix edit: added some boilerplate so that Flux sees the TcnBlock shortcut params.

Thank you so much for this! That is very helpful indeed.
It is interesting to compare the Flux and Python code. The Flux version is obviously much more succinct and elegant. How does performance compare, I wonder?

Anyway, I am eager to try this out on an example.

Thanks again!

@compleat if you have an example of using a TCN on a real finance dataset, it would be super helpful if you could share it with the Julia community!

Hi, I was thinking the same thing: perhaps just one simple example forecasting daily returns of the S&P or FTSE from their lags. I would like to try this using Kolia’s code, but I might need some help. [Also, I didn’t understand what he wrote at the end above.]

Anyway, thanks for the suggestion. If I get anything running, I will post it.
