Basic example for using Transformers.jl for sequential autoencoding

I am interested in using Transformers.jl for sequence-to-sequence autoencoding, and I am hoping for help with a minimum working example that builds and trains a suitable transformer-based sequence-to-sequence autoencoder with Transformers.jl on a synthetic dataset of sequences of varying length.

For example, consider the following dataset, which consists of sequences of random samples from a Gaussian distribution:

# Set parameters for synthetic sequence data generation
elem_dim = 5; # The number of dimensions in a sequence's element
mean_seq_length = 10; # Desired average length of sequences
std_seq_length = 3; # Standard deviation in length of sequences
num_seqs = 100; # Number of sequences to generate

# Each sequence is a vector of elem_dim-dimensional elements; lengths are
# drawn from a Gaussian, rounded, and clamped to be at least 1
seqs = [
    [randn(elem_dim) for j in 1:max(1, round(Int, std_seq_length*randn() + mean_seq_length))]
    for i in 1:num_seqs
]
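
To make question 1 below concrete: my rough guess (which may well be wrong) is that the sequences have to be padded to a common length and stacked into a dense array following the usual Flux convention of (features, time, batch), together with a mask marking the real positions. A base-Julia sketch of that assumption:

max_len = maximum(length.(seqs))

# Pad with zeros into a (elem_dim, max_len, num_seqs) array
padded = zeros(Float32, elem_dim, max_len, num_seqs)
mask = falses(max_len, num_seqs) # true where a real (non-padded) element exists

for (i, s) in enumerate(seqs)
    for (j, x) in enumerate(s)
        padded[:, j, i] .= x
        mask[j, i] = true
    end
end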

Given data of this format, I am interested in the answers to the following questions:

  1. How does the variable seqs need to be formatted in order to be used as input to a transformer built with Transformers.jl? (Is the padded-array guess above on the right track?)
  2. What is the smallest, most basic transformer-based architecture for sequence-to-sequence autoencoding of data of this type? (A tentative sketch of what I have in mind follows this list.)
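
For question 2, here is the kind of minimal encoder-decoder model I have in mind, written against what I understand to be the Transformers.Basic API. The Transformer, TransformerDecoder, and PositionEmbedding constructors, their argument meanings, and the calling conventions below are all my assumptions from skimming the documentation, so please correct anything that is off:

using Flux
using Transformers
using Transformers.Basic # assumed to provide Transformer, TransformerDecoder, PositionEmbedding

hidden = 32 # model dimension (arbitrary choice)

input_proj = Dense(elem_dim, hidden) # lift 5-dim elements to the model dimension
pos_emb = PositionEmbedding(hidden)  # assumed positional-embedding layer

# One encoder layer and one decoder layer: 4 heads, 8 dims per head,
# 64-unit feed-forward (argument order is my guess from the README)
encoder = Transformer(hidden, 4, 8, 64)
decoder = TransformerDecoder(hidden, 4, 8, 64)

output_proj = Dense(hidden, elem_dim) # map back to the original element dimension

function autoencode(x) # x assumed to be a (elem_dim, seq_len, batch) array
    h = input_proj(x)
    h = h .+ pos_emb(h)
    enc = encoder(h)
    dec = decoder(h, enc) # decoder attends to the encoder output
    return output_proj(dec)
end

# Naive reconstruction loss on a padded batch (masking of the padded
# positions is omitted here for simplicity)
loss(x) = Flux.mse(autoencode(x), x)

In particular, I am unsure whether the decoder should receive the same embedded input or some shifted version of it, and how the mask from the padding sketch above should be passed in.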

Thank you very much for your time and help. I apologize that this question is quite basic, but I haven’t been able to find a suitable answer online.