I have to say that in many (most?) C-like languages, the difference between variable == constant and variable = constant is losing your data. This is not very Julia-specific. Some programmers in the past advocated so-called Yoda conditions, which basically means: never write variable == constant if you can write constant == variable. That way a missing = gives you a compile-time error (in most C-like languages).
Not that I think your cautionary tale is not useful (the existence of something like Yoda conditions corroborates it), but it is a somewhat more general problem, and one of the few things that make me wonder whether the = (assign) vs == (test) syntax was a major mistake by early language designers.
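For what it is worth, the Yoda ordering seems to carry over to the broadcast typo in this thread. A rough sketch, assuming a recent Julia 1.x: with the literal on the left, a dropped `=` turns into a broadcast into a number, which throws instead of overwriting the array. The helper name `yoda_typo_throws` is made up for this illustration, and the exact exception type is not something I would rely on.

```julia
x = [1, 2, 3]

# Intended test, written Yoda-style with the constant on the left.
mask = 1 .== x

# Same expression with one `=` missing. The destination is now a literal,
# which cannot be written to, so the broadcast fails before touching `v`.
function yoda_typo_throws(v)
    try
        1 .= v
        return false
    catch
        return true
    end
end

threw = yoda_typo_throws(x)

# For contrast: the non-Yoda typo silently overwrites its left-hand side.
z = copy(x)
z .= 1
```

This only helps when one side of the comparison is a literal; with two arrays, either order overwrites something.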
The question in my mind is: Is there a good reason not to disallow assignment inside an indexing expression, such as y=x[x.=1]?
I find it hard to think of a case where this would be intended.
Hmm, maybe it is my inner C programmer talking, but vector[a+=1] is something I have written on occasion and am fond of.
To be fair, I remember seeing a Julia issue thread discussing the fact that they would need a deprecation cycle to use the begin keyword like vector[begin] == first(vector), just because there was code out there with things like vector[begin ... multiple lines of computation ... end] = x that would break with the parser change, XD.
Technically, .= is not assignment (that is =) but broadcasting. That said, both are valid expressions so one can use their values (even if that would be considered bad style by some people in some contexts).
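A small sketch of what "using their values" means here, with made-up variable names: plain assignment evaluates to the assigned value, while `.=` evaluates to the destination array itself, so the typo from the original question ends up indexing `x` with its own, freshly overwritten contents.

```julia
x = [5, 6, 7]
a = 0

v1 = (a = 2)             # plain assignment evaluates to the assigned value
v2 = (x .= 1)            # broadcast assignment evaluates to the destination

same_array = v2 === x    # no copy is made; x itself was overwritten
y = x[v2]                # integer indexing with [1, 1, 1]: x[1] three times
```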
Generally, even with the best intentions, the parser can’t really protect you from typos like this. The typo could just as well be hidden inside a function, e.g.
mask(x) = x .= 1 # I meant x .== 1
x[mask(x)] # ouch
What do you mean by “gone”?
Presumably, if this data was the result of a long and expensive computation and you are now analyzing it interactively, you loaded it from some file, so you can just reload it from that file.
If it was a result of a cheap computation you can just re-run the computation.
(I agree that it can be annoying if “cheap” here means ~15 minutes, but still I wouldn’t call this “gone” as in rm *)
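A minimal sketch of that workflow, assuming the Serialization standard library is an acceptable storage format for you (it is not meant for long-term archival across Julia versions). The file name and the `rand` stand-in for the expensive result are placeholders.

```julia
using Serialization  # standard library

result = rand(1:10_000, 10, 10)              # stand-in for an expensive result

path = joinpath(mktempdir(), "result.jls")   # placeholder file name
serialize(path, result)

# Interactive analysis works on a freshly loaded copy...
working = deserialize(path)
working .= 1                                 # ...so the typo only ruins this copy

# ...and recovery is a reload rather than a recomputation.
restored = deserialize(path)
```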
In general, there is a trade-off between compact syntax and typos doing something unexpected. E.g., the following could be a valid use case (if somewhat contrived, and also bad style):
a = rand(Bool, 50) # want a .| b for flags
b = rand(Bool, 50)
x = rand(Int, 50)
y = rand(Int, 50)
x[a .= a .| b] # save result in a
y[a] # reuse
Note that this can also be used to reuse storage for the computed result, which can sometimes be beneficial. E.g., you can easily transform f(x .== y) to f(cache .= (x .== y)) to save on allocations in some loops.
julia> A = rand(1:10000, 10, 10);
julia> function f(A)
           x = 0
           for i in 1:10000
               x += sum(A[A .> i])
           end
           return x
       end
f (generic function with 1 method)
julia> f(A); @time f(A);
  0.003767 seconds (30.00 k allocations: 5.840 MiB)
julia> function g(A)
           cache = similar(A, Bool)
           x = 0
           for i in 1:10000
               x += sum(A[cache .= (A .> i)])
           end
           return x
       end
g (generic function with 1 method)
julia> g(A); @time g(A);
  0.002144 seconds (10.00 k allocations: 4.467 MiB)
Here it didn’t eliminate the allocations entirely, since indexing (and even a view with logical indices) allocates on its own, but reusing a cache like this can be helpful at times. That’s actually the primary reason why this is supported at all.
If you are using a DataFrame you can use the @where macro from DataFramesMeta:
y = @where(df, :x .== 1) # @where(df,:x.=1) errors
I actually make this typo quite a lot when using @where.
(I also accidentally overwrote a dataframe several times today by running the wrong Jupyter notebook cell, and re-creating it took a few minutes every time. I guess that is my punishment for making the comment above.)
The first linter I have ever tried (for any language) flagged it, but for the wrong reason:
(@v1.5) pkg> add https://github.com/tonyhffong/Lint.jl
lintstr("x=[1;2;3]; y=x[x .= 1]")
1-element Array{LintMessage,1}:
none:23 E321 .: use of undeclared symbol
It seems like it doesn’t understand broadcasting (but at least the linter has been updated to run on Julia 1.0; it just needs to be improved).
It doesn’t really matter whether code is flagged for the right or the wrong reason, or if the linter flags a bit too much. But it probably flags way too much, e.g. the Yoda conditions. And I’m not up to speed on the linter situation in Atom (where one is available) or VSCode (I think there is none yet, but that editor seems to be the future). Microsoft has some new linter support where code gets underlined as you write it.
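On the earlier question of flagging assignment inside an indexing expression: such a check would not need to understand broadcasting, only the parsed expression tree. A rough sketch of the idea, assuming `Meta.parse` output as in Julia 1.x; `assigns_inside_index` is a hypothetical helper written for this post, not something Lint.jl or any editor plugin provides.

```julia
# Hypothetical lint rule: report `=` or `.=` appearing inside `a[...]`.
function assigns_inside_index(ex, inside_ref::Bool=false)
    ex isa Expr || return false          # symbols and literals cannot assign
    if inside_ref && ex.head in (:(=), Symbol(".="))
        return true
    end
    if ex.head === :ref
        # args[1] is the indexed object; the remaining args are the indices.
        return assigns_inside_index(ex.args[1], inside_ref) ||
               any(a -> assigns_inside_index(a, true), ex.args[2:end])
    end
    return any(a -> assigns_inside_index(a, inside_ref), ex.args)
end

typo     = assigns_inside_index(Meta.parse("y = x[x .= 1]"))
intended = assigns_inside_index(Meta.parse("y = x[x .== 1]"))
cached   = assigns_inside_index(Meta.parse("s = sum(A[cache .= (A .> i)])"))
```

Note that it also reports the deliberate cache-reuse pattern from above, and it cannot see a typo hidden inside a function like mask(x), so this only makes sense as an opt-in warning and not as a parser error.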