How to convert a JPEG image (from a mobile phone) to a standard Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8, 8}}, 2} (the type of each of the 60k images from Flux.Data.MNIST)?

For an exercise I want to let students draw their own digits, scan them with a mobile phone, and have an MNIST-trained Flux NN classify them.

I see that the individual MNIST images are of type Matrix{ColorTypes.Gray{FixedPointNumbers.N0f8}} (an alias for Array{ColorTypes.Gray{FixedPointNumbers.Normed{UInt8, 8}}, 2}).

How do I transform my NxN greyscale mynumber.jpg into the same form as the MNIST images?

I solved it (I think; I still need to run the actual classifier) with:

using Images, FileIO, ImageTransformations
img_path = "./data/test5.jpg"
img  = load(img_path)
img2 = Gray.(img)              # convert to greyscale
img3 = imresize(img2, (28,28)) # resize to the MNIST dimensions
img4 = 1.0 .- img3             # invert colours: MNIST digits are white on black
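
To get exactly the MNIST element type (the inversion above yields Gray{Float64}), a final convert should do it; a minimal, untested sketch:

img5 = convert(Matrix{Gray{N0f8}}, img4) # now the same type as the Flux.Data.MNIST images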

A bit OT: why are the images loaded using MLDatasets mirrored and rotated compared to those loaded using Flux.Data.MNIST?

using Flux, Flux.Data.MNIST
imgs = MNIST.images()
firstImg = imgs[1]

[image: firstImg rendered upright]

using MLDatasets 
train_x, train_y = MLDatasets.MNIST.traindata()
firstimg_MLD   = convert(Matrix{Gray{N0f8}},train_x[:,:,1])

[image: firstimg_MLD rendered mirrored/rotated]

How do I get them back to the "normal" orientation (although I think this is not a big issue for classification with a NN, as the rotation/mirroring would simply be learned)?
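
As far as I understand, MLDatasets returns the raw data in (width, height, index) order, so each [:,:,i] slice is the transpose of what Images.jl expects, which shows up as the mirror+rotation. A sketch of undoing it with permutedims:

# swap the first two axes of a single image (i.e. transpose it)
firstimg_fixed = convert(Matrix{Gray{N0f8}}, permutedims(train_x[:, :, 1]))
# or fix the whole tensor at once
train_x_fixed  = permutedims(train_x, (2, 1, 3))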

For those interested, this is the complete script, together with a grid for letting students write their own digits… it works quite well :slight_smile:

(the actual classification script is from various tutorials, mainly this one)

using Pkg
cd(".")
Pkg.activate(".")
#Pkg.add("Flux")
#Pkg.add("MLDatasets")
#Pkg.add("BetaML")
#Pkg.add("Images")
#Pkg.add("FileIO")
#Pkg.add("ImageTransformations")
#Pkg.add("MLDatasets")

using DelimitedFiles
using Statistics
using Flux
using Flux.Data: DataLoader
using Flux: onehotbatch, onecold, crossentropy
using Flux: @epochs
using MLDatasets # For loading the training data
using Images, FileIO, ImageTransformations # For loading the actual images


# Training of the model

x_train, y_train = MLDatasets.MNIST.traindata()
x_train          = permutedims(x_train,(2,1,3)) # swap width/height axes for the correct image orientation
x_train_imgs     = convert(Array{Gray{N0f8},3},deepcopy(x_train))
x_train          = convert(Array{Float32,3},x_train)
x_train          = reshape(x_train,(28,28,1,60000))

y_train          = onehotbatch(y_train, 0:9)
train_data       = DataLoader((x_train, y_train), batchsize=128)
model = Chain(
    # 28x28 => 14x14
    Conv((5, 5), 1=>8, pad=2, stride=2, relu),
    # 14x14 => 7x7
    Conv((3, 3), 8=>16, pad=1, stride=2, relu),
    # 7x7 => 4x4
    Conv((3, 3), 16=>32, pad=1, stride=2, relu),
    # 4x4 => 2x2
    Conv((3, 3), 32=>32, pad=1, stride=2, relu),
    # Average pooling on each width x height feature map
    GlobalMeanPool(),
    flatten,
    Dense(32, 10),
    softmax
)
accuracy(ŷ, y) = mean(onecold(ŷ) .== onecold(y))
loss(x, y)     = Flux.crossentropy(model(x), y)
# learning rate
opt = Descent(0.1)
#opt = Flux.ADAM()
ps = Flux.params(model)

number_epochs = 10
@epochs number_epochs Flux.train!(loss, ps, train_data, opt)

accuracy(model(x_train), y_train) # 0.981


# Loading imgs
# Set pixels below `threshold` to pure black when their whole neighbourhood
# (within `radius`) is also below the threshold, to remove background noise
function cleanImg!(img, threshold=0.3, radius=0)
    (R, C) = size(img)
    for c in 1:C, r in 1:R
        if img[r, c] <= threshold
            neighbours = @view img[max(1, r-radius):min(R, r+radius),
                                   max(1, c-radius):min(C, c+radius)]
            if all(p -> p <= threshold, neighbours)
                img[r, c] = Gray(0.0)
            end
        end
    end
    return img
end
imgs_y    = convert(Array{Int64,1}, dropdims(readdlm("./data/img_labels.txt"), dims=2)) # true labels
imgs_path = ["./data/test$(i).png" for i in 1:24]
imgs = load.(imgs_path)
imgs = [Gray.(i) for i in imgs]             # to greyscale
imgs = [imresize(i, (28,28)) for i in imgs] # to the MNIST dimensions
imgs = [1.0 .- i for i in imgs]             # invert: white digit on black
imgs = cleanImg!.(imgs, 0.3, 1)             # remove background noise
imgs = cat(imgs..., dims=3)
imgs = Float32.(imgs)                       # Gray -> Float32, as for the training data
imgs = reshape(imgs, (28,28,1,size(imgs,3)))

# Doing the actual classification

imgs_est = model(imgs)

imgs_ŷ = onecold(imgs_est, 0:9)

probs = maximum(imgs_est, dims=1) # confidence of each prediction

mean(imgs_ŷ .== imgs_y)           # share of hand-drawn digits classified correctly

Grid: [image: the grid template for students to write their digits on]


FWIW, you can avoid almost all of that conversion logic by using https://juliaml.github.io/MLDatasets.jl/stable/datasets/MNIST/#MLDatasets.MNIST.traintensor.
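
For example (a sketch based on the linked docs; traintensor accepts an optional element type):

x_train = MLDatasets.MNIST.traintensor(Float32) # 28×28×60000 Array{Float32,3}, values already in [0,1]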

…hmmm… even with some "basic" preprocessing (GIMP → Levels → reducing the max input level) and the "cleaning" in the script, I can't get above 60% accuracy… OK, it's better than the 10% of random guessing, but it remains quite unsatisfactory… I don't think the problem is in the NN model itself, but rather in the preprocessing: the hand-drawn digits need to look more like the ones the model was trained on. Any ideas?
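
One known difference that might matter: the original MNIST digits were size-normalised to fit a 20×20 box (preserving aspect ratio) and then centred by centre of mass in the 28×28 field, while imresize on the whole photo keeps whatever margins and offset the photo had. A rough, untested sketch of applying the same normalisation (mnistify is a made-up helper name and the ink threshold is a guess):

using Images, ImageTransformations

function mnistify(img) # img: white digit on black, Gray values in [0,1]
    A = Float64.(img)
    ink = findall(A .> 0.1)                       # pixels that belong to the digit
    isempty(ink) && return imresize(img, (28, 28))
    rs, cs = getindex.(ink, 1), getindex.(ink, 2)
    crop = A[minimum(rs):maximum(rs), minimum(cs):maximum(cs)]
    s = 20 / maximum(size(crop))                  # fit the longest side into 20 px
    small = imresize(crop, max.(1, round.(Int, size(crop) .* s)))
    # paste into a 28×28 canvas so the centre of mass lands near (14,14)
    canvas = zeros(28, 28)
    total = sum(small)
    rc = sum((1:size(small, 1)) .* vec(sum(small, dims=2))) / total
    cc = sum((1:size(small, 2)) .* vec(sum(small, dims=1))) / total
    r0 = clamp(round(Int, 14 - rc), 0, 28 - size(small, 1))
    c0 = clamp(round(Int, 14 - cc), 0, 28 - size(small, 2))
    canvas[r0+1:r0+size(small, 1), c0+1:c0+size(small, 2)] = small
    return Gray.(canvas)
end

# e.g., replacing the imresize/invert steps in the script above:
# imgs = [mnistify(1.0 .- Gray.(i)) for i in load.(imgs_path)]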