Current behavior of fill and the like..?

Seif_Shebl · April 1, 2022, 8:22pm

fill is very useful for creating arrays that have the same value at every location. However, if this value is a mutable object, fill places that very same object at every location. This can be confusing, especially for beginners: mutating a single location silently modifies all locations.

julia> a = fill([1 2; 3 4], 3)
3-element Vector{Matrix{Int64}}:
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

julia> a[1][1] = 123;

julia> a
3-element Vector{Matrix{Int64}}:
 [123 2; 3 4]
 [123 2; 3 4]
 [123 2; 3 4]

repeat, albeit a different function with a different goal, can also have the same behavior.

julia> b = repeat([[1 2; 3 4]], 3)
3-element Vector{Matrix{Int64}}:
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

julia> b[1][1] = 123;

julia> b
3-element Vector{Matrix{Int64}}:
 [123 2; 3 4]
 [123 2; 3 4]
 [123 2; 3 4]

IMO, this makes fill and repeat less useful in practice. Also, I’m not sure why Julia doesn’t copy the value at all locations as MATLAB’s repmat does? Of course, for immutable structures this issue doesn’t exist. I know one can use array comprehensions to create such arrays but it’s not as intuitive/compact. Any ideas?
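For readers who want to check this themselves, here is a minimal sketch (the variable names are only illustrative) that uses `===` to confirm that every slot of a filled array holds the same object, and that immutable values such as tuples are not affected in the same way:

```julia
# Every slot produced by fill refers to one and the same matrix.
a = fill([1 2; 3 4], 3)
same_object = a[1] === a[2] && a[2] === a[3]   # true: identical objects

# With an immutable value there is nothing to mutate in place;
# assigning to a slot replaces only that slot.
t = fill((1, 2), 3)
t[1] = (9, 9)
others_unchanged = t[2] == (1, 2) && t[3] == (1, 2)   # true
```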

stillyslalom · April 1, 2022, 8:29pm

julia> a = map(copy, fill([1 2; 3 4], 3))
3-element Vector{Matrix{Int64}}:
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

julia> a[1][1] = 123
123

julia> a
3-element Vector{Matrix{Int64}}:
 [123 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

Seif_Shebl · April 1, 2022, 8:33pm

Yes, one can use map of course, but it is not better than a comprehension anyway.

julia> a = [[1 2; 3 4] for i=1:3]
3-element Vector{Matrix{Int64}}:
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

julia> a[1][1] = 123;

julia> a
3-element Vector{Matrix{Int64}}:
 [123 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

rdeits · April 1, 2022, 8:35pm

This is a fundamental difference between Julia and Matlab. Matlab copies everything silently under the hood (function arguments, = assignments, etc.). Julia never does this, so fill is just consistent with the way everything else in Julia works. Changing fill to automatically call copy (or should it be deepcopy?) would make it totally inconsistent with the rest of Julia, and I think that’s a bad thing.
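As a small illustration of that consistency (a sketch with made-up names, not anything specific to fill): neither assignment nor argument passing copies an array, and a copy only happens when it is asked for explicitly.

```julia
a = [1, 2, 3]

b = a            # binds a second name to the same array, no copy
b[1] = 99        # visible through `a` as well

# Illustrative helper: mutates its argument in place.
bump!(v) = (v[1] += 1; v)
bump!(a)         # the caller's array is modified

c = copy(a)      # an explicit copy is independent
c[1] = 0
```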

On the other hand, I agree that this is a very common trap for new users. It’s so common that I wonder if it would be better to rename fill to something else for Julia 2.0 as a way to gently discourage new Julia users from expecting it to behave just like it does in Matlab. It’s a useful function in Julia, but it’s not useful for the thing that most Matlab users seem to expect it to be useful for.

stillyslalom · April 1, 2022, 8:48pm

Adding a keyword argument like fill(x, dims; copy = false) would be semver-compliant for inclusion in a v1.x release so long as it doesn’t change default behavior, right? And putting the default behavior right in the function signature might do a better job of warning new users than expecting them to read the full docstring.
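To make the idea concrete, here is a rough sketch of what such a signature could look like. Note that `fill_with_copy` is a hypothetical helper written for illustration; Base's fill does not accept a `copy` keyword.

```julia
# Hypothetical helper, not part of Base: `fill` itself has no `copy` keyword.
function fill_with_copy(x, dims::Int...; copy::Bool = false)
    if copy
        # The keyword shadows the function name here, so call Base.copy explicitly.
        return [Base.copy(x) for _ in CartesianIndices(dims)]
    else
        return fill(x, dims...)   # default: same behavior as fill
    end
end

shared = fill_with_copy([1 2; 3 4], 3)                 # all slots alias one matrix
separate = fill_with_copy([1 2; 3 4], 3; copy = true)  # one fresh copy per slot
separate[1][1] = 123                                   # only the first matrix changes
```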

Henrique_Becker · April 1, 2022, 10:36pm

Why should it? Do you want it to check whether the value is mutable and make this decision for the programmer? The short answer is that copying would be confusing to many programmers (Java behaves the same way as Julia here; it is simply how you expect things to behave once you have the correct memory model in mind), and it would be terrible for performance if done silently.

DNF · April 1, 2022, 11:25pm

repmat doesn’t make arrays of arrays, it just repeats the values of the input array into a larger array:

>> repmat(rand(2,3), 2, 2)
ans =
    0.2785    0.9575    0.1576    0.2785    0.9575    0.1576
    0.5469    0.9649    0.9706    0.5469    0.9649    0.9706
    0.2785    0.9575    0.1576    0.2785    0.9575    0.1576
    0.5469    0.9649    0.9706    0.5469    0.9649    0.9706

I’m actually not aware of any functionality like fill in Matlab. In fact, you cannot even have arrays of arrays, except with the special ‘cell array’ type.

The way Matlab arrays work isn’t really comparable to Julia’s arrays, I think.

repmat is similar to Julia’s repeat function, though:

julia> repeat(rand(2,3), 2,2)
4×6 Matrix{Float64}:
 0.135393  0.416019  0.238718  0.135393  0.416019  0.238718
 0.765527  0.546136  0.629576  0.765527  0.546136  0.629576
 0.135393  0.416019  0.238718  0.135393  0.416019  0.238718
 0.765527  0.546136  0.629576  0.765527  0.546136  0.629576

Seif_Shebl · April 1, 2022, 11:30pm

I admit that the current behavior of not copying is consistent with Julia’s design; arrays are not copied by default. But as you said, this is a common trap for many new users, and there should be a way to prevent this misunderstanding. At the same time, the functionality of expanding a vector into given dims is very useful in practice; see how many questions on this Discourse come from the same mistake of referring to the same object at all locations.
Historically, repmat was superseded by repeat in 2018, so that repeat now works for both strings and arrays besides scalars. I think we should now have a means to prevent usage of fill with arrays and at the same time provide a convenience method that works for 2D and nD arrays. Something similar to this would be very useful (maybe bring repmat back, or use expand or any more expressive name):

expand([1 2; 3 4], 3)
3-element Vector{Matrix{Int64}}:
 [1 2; 3 4]
 [1 2; 3 4]
 [1 2; 3 4]

lmiq · April 1, 2022, 11:33pm

Another difficulty is that the argument is evaluated in the scope of the caller, before fill ever sees it. So what should

fill(rand(2), 10)

return? Any alternative would be confusing (the current behavior is confusing, but consistent nonetheless).
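A quick sketch of the difference, with illustrative variable names: in the fill call, rand(2) runs once and its result is stored everywhere, whereas a comprehension re-evaluates the expression for every element.

```julia
# rand(2) is evaluated once, before fill is called; all ten slots share the result.
v = fill(rand(2), 10)
all_same = all(x -> x === v[1], v)   # true

# A comprehension evaluates rand(2) once per element, giving ten distinct vectors.
w = [rand(2) for _ in 1:10]
distinct = w[1] !== w[2]             # true
```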

DNF · April 1, 2022, 11:36pm

Maybe there should be a fillcopy function?

Seif_Shebl · April 1, 2022, 11:38pm

Sure, MATLAB doesn’t even have arrays of arrays, but since Julia does, it makes sense for that functionality to work for arrays of arrays in Julia. repmat was deprecated by Jeff in 2018 in favor of repeat; we might think of bringing it back for filling with arrays, or choose a better name for a new method, say expand, multicopy, copydims, etc.

CameronBieganek · April 1, 2022, 11:38pm

You won’t get much sympathy about this from core devs. My most downvoted GitHub issue:

https://github.com/JuliaLang/julia/issues/41209

CameronBieganek · April 1, 2022, 11:49pm

Having a copy = false keyword argument is not ideal, because if you set copy = true you’ll still have the same problem with arrays of arrays of arrays. So what you really want is a deepcopy. But I believe I’ve heard it said that deepcopy in Julia is not very well defined, or shouldn’t exist, or something like that. Can’t find a link now.
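To illustrate the nesting problem, here is a small sketch (variable names are only for illustration): copy duplicates the outer array but keeps sharing the inner ones, while deepcopy also duplicates the inner arrays.

```julia
nested = [[1, 2], [3, 4]]

# copy is shallow: each element of `shallow` is a new outer vector,
# but the inner vectors are still the ones from `nested`.
shallow = [copy(nested) for _ in 1:2]

# deepcopy recursively duplicates the inner vectors too.
deep = [deepcopy(nested) for _ in 1:2]

shallow[1][1][1] = 99   # also changes shallow[2][1][1] and nested[1][1]
deep[1][1][1] = 7       # changes only deep[1]
```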

So the only viable options are:

1. Use map or a comprehension.
2. Add fillf, which takes a function as the first argument.
3. Add @fill.

At least one result of the above-linked GitHub issue is that the documentation for fill will be improved in Julia version 1.8.
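For what it's worth, the fillf option from the list above could be sketched roughly as follows. The name `fillf` comes from this thread; it is hypothetical and not an existing Base function, and this one-liner is only one possible way to write it.

```julia
# Hypothetical function (not in Base): calls `f` once per element,
# so every slot gets its own freshly created value.
fillf(f, dims::Int...) = [f() for _ in CartesianIndices(dims)]

a = fillf(() -> [1 2; 3 4], 3)
a[1][1] = 123          # only the first matrix changes

m = fillf(() -> zeros(2), 2, 3)   # also works for several dimensions
```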

DNF · April 1, 2022, 11:52pm

I don’t follow. Bring back what functionality?

As far as I know, Matlab’s repmat functionality exists in Julia’s repeat. I don’t think there is any parallel to what you are looking for in Matlab, nor in previous Julia functions.

Seif_Shebl · April 2, 2022, 12:00am

Thanks for pointing to the related issue. From a quick skim through the discussion there, it seems that providing a @fill macro that means [<expr> for _ = 1:n] might be a viable solution.
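A minimal sketch of such a macro, assuming only the one-dimensional form described above; `@fill` is hypothetical here and does not exist in Base:

```julia
# Hypothetical macro (not in Base): expands to a comprehension, so the
# expression is re-evaluated for every element instead of being shared.
macro fill(ex, n)
    return :([$(esc(ex)) for i in 1:$(esc(n))])
end

a = @fill([1 2; 3 4], 3)
a[1][1] = 123          # the other two matrices are untouched

r = @fill(rand(2), 4)  # four independently generated vectors
```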

Seif_Shebl · April 2, 2022, 12:08am

I meant re-using the name repmat, since it carries the suffix mat, so that it now works with arrays, similar to what was suggested in @CameronBieganek’s 3 options above (fillf).

Yes, similar to this.

But after I skimmed through that issue, I tend to agree on a macro @fill.

CameronBieganek · April 2, 2022, 12:12am

Here’s a link to the recommendation to not use deepcopy (aside from interactive use):

https://github.com/JuliaLang/julia/issues/42796#issuecomment-951232853

Henrique_Becker · April 2, 2022, 1:30am

I don’t really agree with Jeff on this one, but I think I understand the perspective he is coming from. In package code, you will rarely want to generically deep copy something. However, I do not see why this should hinder better naming of the functions (the PR’s goal). I mean, this seems like an excessively “training wheels”/“protecting the programmers from themselves” take that is uncommon in Julia’s design: avoiding giving a function a better name just because it may lead people to discover a slower function that is not what they need 90% of the time.
