I am using Julia 0.6.2 and have written the following function. (Data generation added on request.)
```julia
# Generate testdata
function fillpath(path, nbfiles)
    cd(path)
    mkdir("juliatest")
    cd("juliatest")
    for i in ["0","1"]
        mkdir(i)
        cd(i)
        for j in 1:nbfiles
            touch("$j.dat")
        end
        cd("..")
    end
end

fillpath("/tmp", 10000)

"""
    files, subdirs = subdirlabeledfiles(path)

Return a Vector files containing the filenames in all subdirectories and a Vector subdirs
containing the name of the subdirectory for each file that can be used as a label.
"""
function subdirlabeledfiles(path)
    # Get all subdirs of path
    subdirs = filter(x -> isdir(joinpath(path,x)), readdir(path))
    # Get files in all subdirs
    subdirspaths = joinpath.(path, subdirs) # Get absolute paths of subdirs
    files = [filter(isfile, joinpath.(subdir, readdir(subdir))) for subdir in subdirspaths]
    # Get a list naming the subdir for each file
    subdirs = (fill(subdir, length(files[i])) for (i, subdir) in enumerate(subdirs)) # does not work
    # subdirs = [fill(subdir, length(files[i])) for (i, subdir) in enumerate(subdirs)] # does work
    # Flatten the results
    files = vcat(files...)
    subdirs = vcat(subdirs...)
    return files, subdirs
end

X, Y = subdirlabeledfiles("/tmp/juliatest")
length(Y)
```
The idea: I give a path and the function returns a vector containing all files within all subdirectories and an additional vector indicating in which subdirectory the given file was. It is intended to be used to load datasets where e.g. images for different labels are in different subdirectories. Probably not the most elegant way of doing this, but anyway.
Now to the problem: when I use a generator expression in the line marked with `# does not work`, the vector I get for subdirs is much too short. Instead of 20000 entries I get 51 entries. When I change it to generate an Array with [ ] it works correctly.
Executing the working code with [ ] gives a length of 20000 which is expected. Running it with ( ) gives a length of 51. The vector contains the correct entries (first “0”, then “1”), there are just not enough entries.
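A smaller sketch that seems to show the same discrepancy without touching the file system. The function name `lazy_vs_eager` and the toy data are made up for illustration and are not part of my real code. It assumes what matters is that `files` is rebound to the flattened vector before the generator is splatted, so `length(files[i])` is evaluated late and measures a single filename instead of a list of filenames.

```julia
# Hypothetical reduced example; names and data are illustrative only.
function lazy_vs_eager()
    files  = [["a", "b", "c"], ["d", "e", "f"]]   # two "subdirs" with three "files" each
    labels = ["0", "1"]

    # Generator: the body is not run here, only when it is iterated later.
    lazy  = (fill(l, length(files[i])) for (i, l) in enumerate(labels))
    # Comprehension: the body runs immediately, while files is still nested.
    eager = [fill(l, length(files[i])) for (i, l) in enumerate(labels)]

    # Rebind files to the flattened vector, as in subdirlabeledfiles.
    files = vcat(files...)

    # Splatting the generator runs its body now, so files[i] is a String
    # and length(files[i]) is the number of characters in that String.
    return length(vcat(lazy...)), length(vcat(eager...))
end

lazy_vs_eager()   # returns (2, 6)
```

If this reduction is faithful, the 51 entries in my real run would be the summed character counts of the first two file paths, not a count of files.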
Is there something I am missing? Is this intended behaviour? Is this a bug?