How do I make my custom (potentially stupid) model in Julia?

Tarny_GG_Channie · October 1, 2023, 1:31pm

Julia is said to be an expressive ML language where you do the obvious thing: write the math expression and it works. But… how does that work exactly? To try it out, I want to implement my own (potentially stupid) idea for a language model.
This idea is based on three basic assumptions.

1. Parallel computation between all nodes is good (like Transformer).
2. Low complexity is good (like RNN to LSTM).
3. Attention is good (from GRU to Transformer and so on).
So, I propose a (potentially stupid) idea based on binary attention.
The main connections are inspired by WaveNet, with strides of 1, 2, 4, 8, …

Each connection is a binary attention.

[Image: binary attention diagram]

● If the left side is empty, set the sigmoid result to 1, which means attending fully to the side that is present. If the right side is empty, set the sigmoid result to 0.
● Go in both directions and use this set of binary attention blocks to weight the result’s value. This means that all tokens have attended both forward and backward!
● Share weights at each layer. Do this until every token has input from all tokens of the previous layers.
● Repeat this process N times.
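
As a rough illustration of the steps above, here is a minimal sketch in plain Julia (no Flux). Several things in it are assumptions, since the diagram is not reproduced here: the gate is taken to be `sigmoid(wl⋅left + wr⋅right + b)`, the output is `g*right + (1-g)*left`, the forward and backward passes are combined by a simple average, and one set of weights is shared by all layers. The names `BinaryAttention`, `attend`, `dilated_pass`, and `binary_attention_block` are made up for this example.

```julia
using LinearAlgebra: dot

# Logistic sigmoid.
sigmoid(z) = 1 / (1 + exp(-z))

# Hypothetical parameter container, shared by every connection (assumption).
struct BinaryAttention
    wl::Vector{Float64}  # scores the left input
    wr::Vector{Float64}  # scores the right input
    b::Float64           # bias of the gate
end

# One binary attention connection. `nothing` stands for an empty side.
function attend(m::BinaryAttention, left, right)
    left === nothing && return right    # left empty: gate = 1, keep right
    right === nothing && return left    # right empty: gate = 0, keep left
    g = sigmoid(dot(m.wl, left) + dot(m.wr, right) + m.b)
    return g .* right .+ (1 - g) .* left
end

# One dilated pass over a d×T matrix X (one column per token).
# forward = true pairs token t with token t - stride,
# forward = false pairs token t with token t + stride.
function dilated_pass(m::BinaryAttention, X, stride; forward=true)
    T = size(X, 2)
    col(t) = 1 <= t <= T ? X[:, t] : nothing   # out of range means empty
    cols = [forward ? attend(m, col(t - stride), col(t)) :
                      attend(m, col(t), col(t + stride)) for t in 1:T]
    return reduce(hcat, cols)
end

# Strides 1, 2, 4, ... until every token can have received input from every token.
# Averaging the two directions is an assumption of this sketch.
function binary_attention_block(m::BinaryAttention, X)
    T = size(X, 2)
    stride = 1
    while stride < T
        fwd = dilated_pass(m, X, stride; forward=true)
        bwd = dilated_pass(m, X, stride; forward=false)
        X = (fwd .+ bwd) ./ 2
        stride *= 2
    end
    return X
end

# Usage: 2 features, 5 tokens, arbitrary weights.
m = BinaryAttention([0.1, -0.2], [0.3, 0.0], 0.0)
X = [1.0 2.0 3.0 4.0 5.0; 0.0 1.0 0.0 1.0 0.0]
Y = binary_attention_block(m, X)   # same size as X
```

Repeating the process N times would then just mean calling `binary_attention_block` N times in a loop. The sketch avoids in-place array writes on purpose, since those tend to get in the way of automatic differentiation later on.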

So, with this stupid idea, I want to try implementing it. How do I go about it?

tchebycheff · October 1, 2023, 2:29pm

Custom layers in Flux
