Using ChatGPT to get code for training a neural net

I recently discovered an amazing tool, ChatGPT :wink: I have found it pretty useful for preparing class notes and translating them into other languages. I was also aware of discussions here about how well it does at generating Julia code, so I decided to try it out myself and see the state of things. I'm pretty amazed at how well it works.

First, I asked it to write Julia code to fit a TCN (temporal convolutional network).

The code I got back was

using Flux
using MLDatasets  # For loading time series data, you can replace this with your dataset loading code

# Define the TCN model
function build_tcn(input_size, output_size, num_layers, num_channels, kernel_size)
    model = Chain(
        Conv(1, num_channels[1], kernel_size, pad=(kernel_size - 1) ÷ 2, relu),
        [Chain(Conv(num_channels[i - 1], num_channels[i], kernel_size, pad=(kernel_size - 1) ÷ 2, relu)) for i in 2:num_layers]...,
        Conv(num_channels[end], output_size, 1),
        GlobalMeanPool(),
    )
    return model
end

# Define a function for training
function train_tcn(x, y, model, loss, opt, epochs)
    data = [(x, y)]
    for epoch in 1:epochs
        Flux.train!(loss, params(model), data, opt)
        println("Epoch $epoch, Loss: $(loss(x, y))")
    end
end

# Example usage:
# Replace this with your dataset loading code
input_size = 1
output_size = 1
num_layers = 4
num_channels = [32, 64, 64, 32]
kernel_size = 3

# Create a TCN model
model = build_tcn(input_size, output_size, num_layers, num_channels, kernel_size)

# Define loss and optimizer
loss(x, y) = Flux.mse(model(x), y)
opt = ADAM(0.001)

# Generate some random data for demonstration (replace with your dataset)
n_samples = 100
x = randn(Float32, input_size, n_samples)
y = randn(Float32, output_size, n_samples)

# Train the TCN
epochs = 100
train_tcn(x, y, model, loss, opt, epochs)

which looks pretty good, but doesn't quite run: it complains about the Conv layer. The call is a Python-ism: Flux's Conv expects a tuple of filter sizes and an in => out channel pair, while ChatGPT passes bare integers, PyTorch-style (a sketch of what the corrected builder might look like is at the end of this post). Rather than chase that down, I just replaced the convolutional stack with a simple MLP, by hand. With one other by-hand fix, replacing params with Flux.params, the code ends up as

using Flux
using MLDatasets  # For loading time series data, you can replace this with your dataset loading code

# Define the model (the TCN layers replaced by a simple MLP;
# the TCN hyperparameter arguments are kept but unused)
function build_tcn(input_size, output_size, num_layers, num_channels, kernel_size)
    model = Chain(
        Dense(input_size, 10, relu),
        Dense(10, output_size)
    )
    return model
end

# Define a function for training
function train_tcn(x, y, model, loss, opt, epochs)
    data = [(x, y)]
    for epoch in 1:epochs
        Flux.train!(loss, Flux.params(model), data, opt)
        println("Epoch $epoch, Loss: $(loss(x, y))")
    end
end

# Example usage:
# Replace this with your dataset loading code
input_size = 1
output_size = 1
num_layers = 4
num_channels = [32, 64, 64, 32]
kernel_size = 3

# Create a TCN model
model = build_tcn(input_size, output_size, num_layers, num_channels, kernel_size)

# Define loss and optimizer
loss(x, y) = Flux.mse(model(x), y)
opt = ADAM(0.001)

# Generate some random data for demonstration (replace with your dataset)
n_samples = 100
x = randn(Float32, input_size, n_samples)
y = randn(Float32, output_size, n_samples)

# Train the TCN
epochs = 100
train_tcn(x, y, model, loss, opt, epochs)

and, lo and behold, this runs!

julia> include("tcn.jl")
Epoch 1, Loss: 1.1102945
Epoch 2, Loss: 1.1054251
Epoch 3, Loss: 1.1007055
Epoch 4, Loss: 1.0960717
Epoch 5, Loss: 1.0915318
Epoch 6, Loss: 1.0870918
Epoch 7, Loss: 1.0827636
Epoch 8, Loss: 1.0785398
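
For the record, here is my guess at what ChatGPT was aiming for, written with Flux's actual Conv signature (a tuple of filter sizes plus an in => out channel pair, with pad as a keyword). This is an untested sketch: the input would also have to become a 3-d (length, channels, batch) array rather than the matrix used above, and a real TCN would add dilated causal convolutions on top, which the generated code never had anyway.

using Flux

# Hypothetical fix (untested): Flux's Conv takes a tuple of filter
# sizes and an in => out channel pair, not bare integers.
function build_tcn(input_size, output_size, num_layers, num_channels, kernel_size)
    layers = Any[Conv((kernel_size,), input_size => num_channels[1], relu; pad=SamePad())]
    for i in 2:num_layers
        push!(layers, Conv((kernel_size,), num_channels[i-1] => num_channels[i], relu; pad=SamePad()))
    end
    push!(layers, Conv((1,), num_channels[end] => output_size))  # 1x1 conv maps channels to outputs
    push!(layers, GlobalMeanPool())  # -> (1, output_size, batch)
    push!(layers, Flux.flatten)      # -> (output_size, batch)
    return Chain(layers...)
end

# e.g. a batch of 100 length-50 univariate series:
# x = randn(Float32, 50, input_size, 100)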

I have to say, I'm pretty impressed. It seems that, in the future, the most valuable form of knowledge will be knowing the right questions to ask. We will all have to become like Socrates.