Is there any replacement in Julia for ImageDataGenerator?

Hi,

I was trying to rewrite some Python code for cat vs. dog classification in Julia. The Python code contains the following:

from keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255
train = ImageDataGenerator(rescale=1./255)
test = ImageDataGenerator(rescale=1./255)

train_generator = train.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')

I tried converting this to Julia with using Keras: ImageDataGenerator, but it gave me an error.
So I tried PyCall instead and did the following:

using PyCall
py"""
from keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255
train = ImageDataGenerator(rescale=1./255)
test = ImageDataGenerator(rescale=1./255)
train_generator = train.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')


"""

But I’m getting the following error:

PyError ($(Expr(:escape, :(ccall(#= /home/g2-test/.julia/packages/PyCall/kAhnQ/src/pyeval.jl:38 =# @pysym(:PyEval_EvalCode), PyPtr, (PyPtr, PyPtr, PyPtr), o, globals, locals))))) <class 'NameError'>
NameError("name 'train_dir' is not defined")
  File "/home/g2-test/.julia/packages/PyCall/kAhnQ/src/pyeval.jl", line 7, in <module>
    pynamespace(m::Module) =


Stacktrace:
 [1] pyerr_check at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/exception.jl:60 [inlined]
 [2] pyerr_check at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/exception.jl:64 [inlined]
 [3] _handle_error(::String) at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/exception.jl:81
 [4] macro expansion at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/exception.jl:95 [inlined]
 [5] #120 at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/pyeval.jl:38 [inlined]
 [6] disable_sigint at ./c.jl:446 [inlined]
 [7] pyeval_(::String, ::PyDict{String,PyObject,true}, ::PyDict{String,PyObject,true}, ::Int64, ::String) at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/pyeval.jl:37
 [8] top-level scope at /home/g2-test/.julia/packages/PyCall/kAhnQ/src/pyeval.jl:230
 [9] top-level scope at In[50]:2

How can I solve this? Is there any other way?

What is your train_dir?

train_dir has two folders, cats and dogs, each containing around 1500 images.

Yeah, but you’re not defining it anywhere, are you?

Yes. I’m defining it.

train_dir = '/home/g2-test/ASHNA/classification/dog vs cat/dataset/training_set/'
classes = ['cat', 'dog']

It’s not in the code you’re showing.

This is the complete code:

using Flux
using Images
path = "/home/g2-test/ASHNA/classification/dog vs cat/dataset"
train_dir = "/home/g2-test/ASHNA/classification/dog vs cat/dataset/training_set/"
classes = ["cat", "dog"]
test_dir = "/home/g2-test/ASHNA/classification/dog vs cat/dataset/test_set/"
model = Chain(
              Conv((3, 3), (150 => 148), pad=(1,1), relu),
              MaxPool((2, 2)),
              Conv((3, 3), (74=>72), pad=(1, 1), relu),
              MaxPool((2, 2)),
              Conv((3, 3), (36=>34), pad=(1, 1), relu),
              MaxPool((2, 2)),
              Conv((3, 3), (17=>15), pad=(1, 1), relu),
              MaxPool((2, 2)),
              x -> reshape(x, :, size(x, 4)),
              Dense(6272, 512),
              Dense(512, 1),
              softmax)
using BSON: @save
@save "cat_vs_dog_model.bson" model
using PyCall
py"""
from keras.preprocessing.image import ImageDataGenerator
# All images will be rescaled by 1./255
train = ImageDataGenerator(rescale=1./255)
test = ImageDataGenerator(rescale=1./255)
train_generator = train.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')


"""

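You can think of whatever is inside py""" as running in a separate Python REPL, so it won’t know about anything you define in Julia. That is why Python raises the NameError for train_dir. Try repeating those definitions (train_dir and test_dir) inside the py""" block.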
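If you would rather keep the paths on the Julia side, another option is to define a Python function inside py""" and call it from Julia with the path as an argument. This is only a minimal sketch: it assumes Keras is importable from the Python that PyCall uses, and make_generator is a hypothetical helper name, not something from Keras or PyCall. The paths and generator settings are the ones from the script above.

```julia
using PyCall

# Paths defined in Julia, as in the script above.
train_dir = "/home/g2-test/ASHNA/classification/dog vs cat/dataset/training_set/"
test_dir = "/home/g2-test/ASHNA/classification/dog vs cat/dataset/test_set/"

py"""
from keras.preprocessing.image import ImageDataGenerator

def make_generator(directory):
    # Hypothetical helper: same settings as the original Python code
    # (rescale by 1/255, resize to 150x150, batches of 20, binary labels).
    datagen = ImageDataGenerator(rescale=1./255)
    return datagen.flow_from_directory(
        directory,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')
"""

# The Julia strings are converted to Python str when the function is called,
# so the Python code never needs to see the Julia variables themselves.
train_generator = py"make_generator"(train_dir)
validation_generator = py"make_generator"(test_dir)
```

This way nothing inside the Python block refers to a name that only exists in Julia, which is what caused the NameError.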

Try

using PyCall
ImageDataGenerator = pyimport("keras").preprocessing.image.ImageDataGenerator
train = ImageDataGenerator(rescale = 1/255)
train_generator = train.flow_from_directory(train_dir, target_size = (150, 150), batch_size = 20, class_mode = "binary")

I have no idea how easily train_generator can be hooked up to Flux, but at least you have it available on the Julia side of PyCall.
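As a rough sketch of one piece of that hookup: Keras yields image batches laid out as (batch, height, width, channels), while Flux convolution layers expect (width, height, channels, batch), so each batch would need its dimensions reordered. The helper name to_flux_batch is hypothetical, and the random arrays below only stand in for a real batch. I am assuming PyCall hands the NumPy batch over as an ordinary 4-dimensional Julia array.

```julia
# Hypothetical helper: reorder a Keras-style batch (N, H, W, C) into the
# (W, H, C, N) layout used by Flux, and turn the labels into a 1 x N row.
function to_flux_batch(x::AbstractArray{<:Real,4}, y::AbstractVector{<:Real})
    x_whcn = permutedims(Float32.(x), (3, 2, 4, 1))
    y_row = reshape(Float32.(y), 1, :)
    return x_whcn, y_row
end

# Stand-in data with the batch shape used in this thread
# (20 RGB images of 150x150 and 20 binary labels).
x = rand(Float32, 20, 150, 150, 3)
y = Float32.(rand(0:1, 20))

xb, yb = to_flux_batch(x, y)
println(size(xb))
println(size(yb))
```

With the real generator, one batch could probably be fetched with pybuiltin("next")(train_generator), which should give back the (images, labels) pair to pass to this helper. I have not tested that part against Keras.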
