Hi, I am new to Transformers.jl and trying to follow the tutorial (Tutorial · Transformers.jl). I wonder where I can find more details about this call:
t = decoder_trf(e, m, attention_mask, cross_attention_mask)
In particular, how do I modify the above so that a causal mask is applied to the decoder input (to avoid peeking ahead)? Many thanks!
You don’t need to do that manually. The TransformerDecoderBlock constructor creates a CausalMultiheadQKVAttenOp for the self attention, which already does the causal masking. The main purpose of attention_mask in the decoder is to pass something like a LengthMask so that padding does not affect the computation.
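A minimal sketch of what that looks like (the sequence lengths are made up, e / m / decoder_trf are the tutorial's variables, and I'm assuming LengthMask is imported from NeuralAttentionlib as below):

```julia
using NeuralAttentionlib: LengthMask

# e: decoder input embeddings, m: encoder output, as in the tutorial call above.
# Causal masking of the self attention is applied internally by
# CausalMultiheadQKVAttenOp, so attention_mask only needs to mark the padding.
attention_mask = LengthMask([7, 5, 3])        # un-padded length of each decoder sequence in the batch
cross_attention_mask = LengthMask([9, 6, 4])  # un-padded length of each encoder sequence

t = decoder_trf(e, m, attention_mask, cross_attention_mask)
```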
Yes, but to pass your own attention mask from the input, you would probably need to call the inner-most constructor with MultiheadQKVAttenOp. You can look at NeuralAttentionlib for more kinds of masks.
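The mask part might look roughly like this (a sketch, not tested; it assumes the decoder block was built with a plain MultiheadQKVAttenOp as described above, so the causal behaviour has to come from the mask you pass in, and that CausalMask / LengthMask are imported from NeuralAttentionlib):

```julia
using NeuralAttentionlib: CausalMask, LengthMask

# combine a causal mask with a padding (length) mask;
# NeuralAttentionlib masks compose with & and |
my_attention_mask = CausalMask() & LengthMask([7, 5, 3])

# pass the combined mask through the decoder call as before
# (no causal mask is applied internally when using MultiheadQKVAttenOp)
t = decoder_trf(e, m, my_attention_mask, cross_attention_mask)
```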