I would like to reproduce in Julia some PyTorch transfer-learning experiments that use ResNet18 on Kaggle's Dogs vs. Cats dataset (direct download from fast.ai).
I would like some feedback on how I’m importing the data.
I'm using the Images package to import the images as CHW arrays. This should match what the code I'm trying to reproduce does by calling torchvision.datasets.ImageFolder() with torchvision.transforms.ToTensor() as a parameter.
ResNet18 takes 224×224 images as input, so I'm also using Images.PaddedView to crop (or zero-pad) them to that size.
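To make the comparison with ToTensor() concrete, here is a minimal sketch of the array layout I expect, using a small synthetic image instead of a real file. ToTensor() returns floating-point values in [0, 1], while channelview on a loaded JPEG keeps the 8-bit fixed-point type, so the Float32 conversion at the end is my assumption about how to mirror it. It is not part of my loading code.

```julia
using Images

# Small synthetic 4×6 RGB image (a stand-in for a loaded JPEG; values are arbitrary).
img = [RGB{N0f8}(r / 4, c / 6, 0.5) for r in 1:4, c in 1:6]

# channelview exposes the color channels as a leading dimension: (C, H, W).
chw = channelview(img)

# Assumption: convert the 8-bit fixed-point values to Float32 in [0, 1]
# to match the dtype that ToTensor() produces.
chw_f32 = Float32.(chw)

println(size(chw_f32))    # (3, 4, 6)
println(eltype(chw_f32))  # Float32
```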
Here’s my code:
using Images
cd(expanduser("~/data/dogscats/"))  # Julia does not expand "~" on its own
# Storing data as an array of tuples
img_example = load("train/dogs/dog.1933.jpg")
data_elem_example = (img = copy(channelview(img_example)), class = "dog", filename = "dog.1933.jpg")
data_elem_type = typeof(data_elem_example)
train_set = Array{data_elem_type}(undef,0)
valid_set = Array{data_elem_type}(undef,0)
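# Center-crop (or zero-pad) an image to new_size × new_size.
function crop_center(new_size::Integer, img::AbstractMatrix{RGB{N0f8}})
    n_rows, n_cols = size(img)  # size(img) is (height, width)
    # The last argument of PaddedView is the output index at which img[1, 1]
    # is placed, so a zero offset corresponds to index 1 (hence the + 1).
    row_start = fld(new_size - n_rows, 2) + 1
    col_start = fld(new_size - n_cols, 2) + 1
    out_dims = (new_size, new_size)
    return copy(PaddedView(zero(eltype(img)), img, out_dims, (row_start, col_start)))
end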
for s in ["train", "valid"]
    target_set = s == "train" ? train_set : valid_set
    # Paths are relative to the directory entered with cd above.
    for (root, dirs, files) in walkdir(s)
        for file in files
            img_path = joinpath(root, file)
            img_cropped = crop_center(224, load(img_path))
            CHW_img = copy(channelview(img_cropped))
            class = splitpath(root)[end]  # name of the containing folder, e.g. "dogs"
            push!(target_set, (img = CHW_img, class = class, filename = file))
        end
    end
end
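To sanity-check the centering arithmetic independently of PaddedView, here is a minimal sketch in plain Julia that center-crops or zero-pads a matrix by direct indexing. The helper name center_crop_pad is hypothetical, and the zero fill and the floor-based rounding for odd differences are my assumptions. I have not checked that they match what the PyTorch pipeline does.

```julia
# Hypothetical helper: center-crop or zero-pad a matrix to n × n using plain indexing.
function center_crop_pad(a::AbstractMatrix, n::Integer)
    out = fill(zero(eltype(a)), n, n)
    rows, cols = size(a)
    # Offset of the source relative to the output (negative when cropping).
    r_off = fld(n - rows, 2)
    c_off = fld(n - cols, 2)
    # Output indices that overlap the source.
    r_out = max(1, 1 + r_off):min(n, rows + r_off)
    c_out = max(1, 1 + c_off):min(n, cols + c_off)
    out[r_out, c_out] = a[r_out .- r_off, c_out .- c_off]
    return out
end

big = reshape(1:36, 6, 6)    # larger than the target: gets cropped
small = reshape(1:6, 2, 3)   # smaller than the target: gets padded

cropped = center_crop_pad(big, 4)
padded = center_crop_pad(small, 4)

println(cropped == big[2:5, 2:5])      # true
println(padded[2:3, 1:3] == small)     # true
```

Comparing the output of this helper with crop_center on a few real images should show whether the two agree on where the image ends up.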
Thanks in advance for your time and suggestions.