I am new to Julia and a bit confused about how to implement a custom image loader. First I generated 10 sample images and saved them in a temp folder. Then I defined an “ImageFolder” type, but I got stuck on how to use numobs and getobs, since the documentation is not very clear to me. This is what I have so far:
using FileIO, ImageIO, ImageCore, MLUtils
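struct ImageFolder
    files::Vector{String}
    function ImageFolder(dir::String)
        return new(readdir(dir; join = true))
    end
end
# numobs must return an integer count, not the file list itself
MLUtils.numobs(data::ImageFolder) = length(data.files)
# placeholder: returns random data with the right batch size instead of loading images
MLUtils.getobs(data::ImageFolder, idx::AbstractVector{<:Integer}) = rand(3, 4, length(idx))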
dir = tempname()
mkpath(dir)
@info "Writing random images to $(dir)"
for i in 1:10
    save(joinpath(dir, "$i.jpeg"), colorview(RGB, rand(3, 10, 10)))
end
I would appreciate any help. Thank you in advance…
I tried to implement numobs and getobs so that MLUtils.getobs(data::ImageFolder, idx::AbstractVector{<:Integer}) returns an Array{Float32,4} of size d1×d2×d3×n, where d1 and d2 are the dimensions of the images and n is equal to length(idx).
Thank you for the answer. Nevertheless, I would like to use numobs and getobs, but I could not find any comprehensive tutorials. I will see if I can come up with something.
What’s the issue with the above referenced length and getindex approach?
These are aligned with the MLUtils DataLoader extension methodology and provide performance as good as it gets, AFAIK.
getobs should only be implemented for types where there is a difference between getobs and Base.getindex (such as multi-dimensional arrays).
Same story for numobs. So the usage above is correct. You are of course free to override it instead, but you’ll miss out on the nice [...] indexing syntax and other features Julia provides for types with getindex/length defined.
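For reference, here is a minimal sketch of the length/getindex route applied to the ImageFolder type from the question. It assumes every file in the folder is a loadable image, and it relies on the MLUtils fallbacks (numobs falls back to length, getobs falls back to getindex). The batch method simply returns a vector of images, which is one possible choice and not the only one.

```julia
using FileIO, ImageIO, ImageCore, MLUtils

struct ImageFolder
    files::Vector{String}
    ImageFolder(dir::String) = new(readdir(dir; join = true))
end

# Standard Base interface; MLUtils picks these up through its fallbacks.
Base.length(data::ImageFolder) = length(data.files)
Base.getindex(data::ImageFolder, i::Integer) = load(data.files[i])
Base.getindex(data::ImageFolder, idx::AbstractVector{<:Integer}) = [data[i] for i in idx]

# Write a few random test images, as in the question.
dir = tempname()
mkpath(dir)
for i in 1:10
    save(joinpath(dir, "$i.jpeg"), colorview(RGB, rand(3, 10, 10)))
end

data = ImageFolder(dir)
numobs(data)        # same as length(data)
img = data[1]       # a 10×10 matrix of RGB pixels
batch = data[2:3]   # a vector holding two images

# DataLoader works without any extra methods.
firstbatch = first(DataLoader(data; batchsize = 4))
```

With this in place you get the usual indexing syntax for free, and DataLoader batches by calling data[idx] with a vector of indices.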
Hi all, yes, I am aware that getobs is only needed where there is a difference. The thing is that I am required to use getobs and numobs, but anyway I will try to implement the code with both. Thanks for the input :)
If you implement getobs and numobs like you would implement getindex and length, then you’re likely 90% of the way there. If you feel like the docs are missing some important detail on either, feel free to file an issue or PR.
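If you do want to go through numobs and getobs directly, a sketch could look like the following. It is not the only way to do it and makes a few assumptions: all images in the folder have the same size, RGB is the wanted color type, and the batch layout is (height, width, channels, n), which matches the d1×d2×d3×n shape asked for above.

```julia
using FileIO, ImageIO, ImageCore, MLUtils

struct ImageFolder
    files::Vector{String}
    ImageFolder(dir::String) = new(readdir(dir; join = true))
end

# Number of observations = number of files in the folder.
MLUtils.numobs(data::ImageFolder) = length(data.files)

# Single observation: load one image, convert the pixels to RGB{Float32},
# expand the color channels, and move them to the last dimension.
function MLUtils.getobs(data::ImageFolder, i::Integer)
    img = RGB{Float32}.(load(data.files[i]))
    return permutedims(channelview(img), (2, 3, 1))  # (height, width, 3)
end

# Batch of observations: stack the single observations along a 4th dimension.
# Assumes that all images have the same size.
function MLUtils.getobs(data::ImageFolder, idx::AbstractVector{<:Integer})
    return cat((getobs(data, i) for i in idx)...; dims = 4)
end

# Write a few random test images, as in the question.
dir = tempname()
mkpath(dir)
for i in 1:10
    save(joinpath(dir, "$i.jpeg"), colorview(RGB, rand(3, 10, 10)))
end

data = ImageFolder(dir)
batch = getobs(data, [1, 2, 3])   # Array{Float32,4} of size (10, 10, 3, 3)
```

A DataLoader(data; batchsize = 4) would then call the vector method once per batch. Using cat with splatting is fine for small batches; for large ones, preallocating the output array is probably the better design.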
Hi ATR, given a list of files, you can also use mapobs to turn those into a data container of images:
using MLUtils, FileIO
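# DIR is the folder that contains the images; join = true returns full paths.
# readdir has no recursive keyword; use walkdir if you need to descend into subfolders.
files = filter(endswith(".jpeg"), readdir(DIR; join = true))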
images = MLUtils.mapobs(FileIO.load, files)
Then you can get individual images like this:
MLUtils.getobs(images, 1)
One more note: where you’re creating some test images, you’re creating arrays with size (3, 10, 10) like one would have in Python to represent a 2D image. In Julia however, images are usually represented as 2-dimensional arrays with an element type like RGB that includes a complete pixel value (e.g. 3 colors).
To convert such a 2D image to a 3D array with the color channels expanded, you can use Images.channelview:
using Images
imagetensors = MLUtils.mapobs(Images.channelview, images)
getobs(imagetensors, 1)
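To make the layout difference concrete, here is a small sketch that needs only ImageCore. It builds an image the same way as the test images in the question and shows the sizes before and after channelview. The final permutedims step is an assumption about what the downstream code wants (channels last); skip it if channels first is fine for you.

```julia
using ImageCore

# A 10×10 image whose elements are complete RGB pixels.
img = colorview(RGB, rand(Float32, 3, 10, 10))
size(img)    # (10, 10)

# Expand the pixels into a plain numeric array, channels first.
chw = channelview(img)
size(chw)    # (3, 10, 10)

# Optional: move the channels to the last dimension.
hwc = permutedims(chw, (2, 3, 1))
size(hwc)    # (10, 10, 3)
```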