On array comprehensions with `if` statements

Array comprehensions are great for constructing multidimensional arrays:

  0. The default case
julia> [x * y for x in 1:10, y in 1:10 ]

10×10 Array{Int64,2}:
  1   2   3   4   5   6   7   8   9   10
  2   4   6   8  10  12  14  16  18   20
  3   6   9  12  15  18  21  24  27   30
  4   8  12  16  20  24  28  32  36   40
  5  10  15  20  25  30  35  40  45   50
  6  12  18  24  30  36  42  48  54   60
  7  14  21  28  35  42  49  56  63   70
  8  16  24  32  40  48  56  64  72   80
  9  18  27  36  45  54  63  72  81   90
 10  20  30  40  50  60  70  80  90  100

but I noticed some inconsistencies:

  1. Separating for loops does not construct a multidimensional array
julia> [x * y for x in 1:10 for y in 1:10 ]

100-element Array{Int64,1}:

  2. Adding an if statement gives a 1d Array.
julia> [
          x * y 
          for x in 1:10, y in 1:10 if y > 5
       ]

50-element Array{Int64,1}:

  3. Adding separate for loops AND if statements
julia> [
           x * y 
           for x in 1:10  if x > 4 
           for y in 1:10  if y > 5
       ]

30-element Array{Int64,1}:

The ones that make sense to me are the default case and case 2, because the subsetting of x and y is arbitrary.

  • The reasoning for why case 1 results in a 1d Array is not clear to me, and
  • I would propose, as a new feature, that case 3 should result in a 2d array with properly subset rows and columns.

It was a design choice. More than one `for` always returns a 1d array.

I think the rules are consistent and quite reasonable:

  • multiple for statements will always return a vector
  • including an if clause will always return a vector
  • a single for with multiple comma separated loops will return a multidimensional array
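
These three rules can be checked directly by looking at the size of each form. A minimal sketch, using a 3×4 grid instead of 10×10 to keep it short (the variable names are only for illustration):

```julia
# Comma-separated loops in a single `for`: one dimension per iterator.
grid = [x * y for x in 1:3, y in 1:4]
@show size(grid)      # (3, 4)

# Two separate `for` clauses: flattened, the last loop varies fastest.
flat = [x * y for x in 1:3 for y in 1:4]
@show size(flat)      # (12,)

# An `if` clause drops elements, so the result is a plain vector.
filtered = [x * y for x in 1:3, y in 1:4 if y > 2]
@show size(filtered)  # (6,)

# The flat version runs x in the outer loop, so it matches the transposed grid.
@show flat == vec(permutedims(grid))  # true
```

The last line also shows that the two-`for` form is not just the flattened grid: the element order differs, because the comma form varies x fastest while the nested form varies y fastest.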

To construct a multidimensional array with conditionals, just do this:

julia> [x>4 && y>5 ? x * y : missing for x in 1:10, y in 1:10]
10×10 Array{Union{Missing, Int64},2}:
 missing  missing  missing  missing  missing    missing    missing    missing    missing     missing
 missing  missing  missing  missing  missing    missing    missing    missing    missing     missing
 missing  missing  missing  missing  missing    missing    missing    missing    missing     missing
 missing  missing  missing  missing  missing    missing    missing    missing    missing     missing
 missing  missing  missing  missing  missing  30         35         40         45          50
 missing  missing  missing  missing  missing  36         42         48         54          60
 missing  missing  missing  missing  missing  42         49         56         63          70
 missing  missing  missing  missing  missing  48         56         64         72          80
 missing  missing  missing  missing  missing  54         63         72         81          90
 missing  missing  missing  missing  missing  60         70         80         90         100
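
If it helps, here is a small sketch of how such an array can be used afterwards. It only relies on `ismissing` and `skipmissing` from Base; the name `A` is just for illustration:

```julia
A = [x > 4 && y > 5 ? x * y : missing for x in 1:10, y in 1:10]

# The shape is preserved; the filtered-out cells are `missing`.
@show size(A)              # (10, 10)
@show count(ismissing, A)  # 70

# `skipmissing` iterates over the remaining 30 values only.
@show sum(skipmissing(A))  # 1800
```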
  • consistent
  • reasonable
  • intuitive

This was meant as a proposal to make them better. I edited the wording of the initial post to make this clearer.

Ad 3. (just some examples for thinking about the problem)

It is simple if all rows have the same length:

julia> [x * y for x in [i for i in 1:10  if i > 4] , y in [i for i in 1:10 if i > 5]]

6×5 Array{Int64,2}:
 30  35  40  45   50
 36  42  48  54   60
 42  49  56  63   70
 48  56  64  72   80
 54  63  72  81   90
 60  70  80  90  100
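
Another way to get the same rectangular subset, sketched here as one possible alternative, is to build the full table first and then select rows and columns with Boolean masks (the names `full` and `sub` are only for illustration):

```julia
xs, ys = 1:10, 1:10
full = [x * y for x in xs, y in ys]

# Boolean masks select the rows with x > 4 and the columns with y > 5.
sub = full[xs .> 4, ys .> 5]
@show size(sub)  # (6, 5)

# Same result as filtering the ranges before the comprehension.
@show sub == [x * y for x in filter(x -> x > 4, xs), y in filter(y -> y > 5, ys)]  # true
```

This only works because each condition depends on a single variable, so the surviving elements still form a rectangle.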

julia> [[x * y for y in 1:10  if y > 5] for x in 1:10  if x > 4]
6-element Array{Array{Int64,1},1}:
 [30, 35, 40, 45, 50] 
 [36, 42, 48, 54, 60] 
 [42, 49, 56, 63, 70] 
 [48, 56, 64, 72, 80] 
 [54, 63, 72, 81, 90] 
 [60, 70, 80, 90, 100]

But you could mix x and y in the `if` of a nested `for` loop!

julia> [[x * y for y in 1:10  if y+x < 15] for x in 1:10  if x > 4]
6-element Array{Array{Int64,1},1}:
 [5, 10, 15, 20, 25, 30, 35, 40, 45]
 [6, 12, 18, 24, 30, 36, 42, 48]    
 [7, 14, 21, 28, 35, 42, 49]        
 [8, 16, 24, 32, 40, 48]            
 [9, 18, 27, 36, 45]                
 [10, 20, 30, 40]   

The result is not an m×n matrix here!

The first method (above) doesn’t allow this:

julia> [x * y for x in [i for i in 1:10  if i > 4] , y in [i for i in 1:10 if i+x > 5]]
ERROR: UndefVarError: x not defined
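
As far as I understand it, the error comes from the comma form setting up all of its iterators before the loop runs, roughly like a generator over `Iterators.product`, so the range for y cannot see x. A small sketch of that reading, contrasted with the nested form (small ranges chosen only for illustration):

```julia
# Comma form: roughly a generator over the product of two independent iterators.
a = [x * y for x in 1:3, y in 1:2]
b = collect(x * y for (x, y) in Iterators.product(1:3, 1:2))
@show a == b  # true

# Separate `for` clauses nest, so the inner range may depend on x,
# but the result is a flat vector because the rows can differ in length.
c = [x * y for x in 1:3 for y in 1:x]
@show c  # [1, 2, 4, 3, 6, 9]
```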

Ad 1.

Think about using local variables. You could do this:

julia> [
         begin 
           x=i+i+i #=think some complicated expression here =#
           x+x
         end 
         for i in 1:3
       ]
3-element Array{Int64,1}:
  6
 12
 18

With the current behavior you could simplify it to:

julia> [
         x+x
         for i in 1:3 
           for x in [i+i+i #=think some complicated expression here =#]
       ]
3-element Array{Int64,1}:
  6
 12
 18

You could understand the expression (call it a hack if you want :wink:)

for x in [val]

as a special way of binding the value of val to the local variable x.
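
A variation on the same trick, offered only as a sketch: a one-element tuple can stand in for the one-element array, which should avoid allocating a temporary array on every iteration. The expression `3i` stands in for the complicated expression:

```julia
# Binding through a one-element array, as in the example above.
v1 = [x + x for i in 1:3 for x in [3i]]

# Same idea with a one-element tuple instead of an array.
v2 = [x + x for i in 1:3 for x in (3i,)]

@show v1 == v2 == [6, 12, 18]  # true
```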

I know it is a matter of personal preference :slight_smile: I just want to add another POV on the problem!

That solved my problem, I suspected that I was missing something, thanks!