So this is the current behaviour of the filter function when given a boolean-returning function and a collection (taken from the documentation):
```julia
julia> a = 1:10
1:10

julia> filter(isodd, a)
5-element Array{Int64,1}:
 1
 3
 5
 7
 9
```
I would argue that this is counter-intuitive, at least to me, and I would like to discuss why to see what others think.
So my problem with it is that a filter, by my intuition, removes by definition: something is filtered out from the whole. So to my mind, the function call filter(isodd, a) is a filter that applies to the elements of a that are odd. This means that when the odd numbers are filtered, the function should, in my mind, return the even numbers.
Am I alone in this opinion? And as it is a breaking change, is there even a point in discussing this? I am thinking it could be changed for 2.0, so it seems like a worthwhile discussion, and I am interested in hearing other opinions on this.
This change would be massively breaking and contrary to every other programming language with a filter function, and it would in no way improve the functional power of the function. No, this will not be considered.
Not really. A filter usually separates something into two parts. It is up to the user to decide what is kept (it could be both; they just need to be separated). E.g., an aggregate grading sieve would keep all parts, all of which are needed for the analysis.
The lesson is that one should not rely on intuition for these things — just read the docs if you are unsure.
This trips me up often as well because I think of the function filtering things out, but as others have said, there’s already a traditional meaning of this higher order function in other languages to consider. What might be more plausible is introducing a reject verb and maybe a corresponding select, although that already has many other meanings.
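For anyone who wants to try the reject idea locally, here is a minimal sketch. The name reject is hypothetical and not part of Base; it simply wraps filter with the predicate negated, so it returns the elements for which the predicate is false.

```julia
# Hypothetical helper, not part of Base: the complement of `filter`.
# Keeps the elements of `xs` for which `pred` returns false.
reject(pred, xs) = filter(x -> !pred(x), xs)

evens = reject(isodd, 1:10)   # the even numbers from 1:10
odds  = filter(isodd, 1:10)   # Base behaviour: keeps what matches
```

Since it is a one-liner over filter, it works for whatever collections filter already accepts, and it does not change the meaning of existing code.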
Depends on the context. In signal processing, a filter rejects part of the signal – there’s no way to get it back. One says, for example, “I need to filter this interference” to mean “reject the interference”. You can use filters to decompose a signal into its components, but you need more than one filter (for example, in a sound equalizer).
Coming from an EE, not computing, background, the behavior of filter in programming languages is counter-intuitive to me too, but I’ve gotten used to it.
Although I completely agree with the sentiment that this should not be changed for all the reasons mentioned above, it’s worth noting that even in common use like “coffee filter”, not just EE use, “filter” often implies the removal. So confusion is warranted for any relatively new programmers, though they’d experience this regardless of language choice.
But it’s just something to get used to since, like others have said, this is the convention in all programming languages.
For what it’s worth, in the domain of manufacturing, filtration is an operation which separates a mixed liquid/solid stream into a liquid stream and a solid stream. We call the liquid stream the “filtrate”, and the solids form a “cake”. Usually the filtrate is the product you’re most interested in, but not always. If we treat the filter function as an analogy to this kind of filtration, it might make sense to have a keyword argument that controls which stream is returned (this would also allow getting both if wanted). Something like product=:filtrate, product=:cake, or product=:both.
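As a rough sketch of what such a keyword could look like in user code: the function name filtration is hypothetical, and mapping "passes the predicate" to the filtrate is an assumption made here to stay consistent with what filter keeps today. Nothing like this exists in Base.

```julia
# Hypothetical function, not part of Base. Assumption: elements that satisfy
# `pred` are the "filtrate" (what `filter` returns today), the rest are the "cake".
function filtration(pred, xs; product::Symbol = :filtrate)
    filtrate = filter(pred, xs)           # elements for which pred is true
    cake = filter(x -> !pred(x), xs)      # elements for which pred is false
    if product === :filtrate
        return filtrate
    elseif product === :cake
        return cake
    elseif product === :both
        # Return both streams as a named tuple
        return (filtrate = filtrate, cake = cake)
    else
        throw(ArgumentError("product must be :filtrate, :cake, or :both"))
    end
end

streams = filtration(isodd, 1:10; product = :both)
streams.filtrate   # the odd numbers
streams.cake       # the even numbers
```

This version evaluates the predicate twice per element for simplicity; a single pass that pushes into two vectors would avoid that.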
Personally, I don’t think this is very beneficial, but I thought I would throw it out there.
Sure, there are lots of intuitive interpretations that go either way. The point is that there isn’t a single one, so people should just read the docs.
The only unambiguous approach I can imagine is to spell it out, cf COMMON-LISP:REMOVE-IF-NOT & friends.
The Wikipedia page for filter has a nice table summarizing syntax in various languages. I guess that filter in particular comes from the ML family historically, but I am not sure. In any case, Julia’s usage seems to be the common one in programming.
And as it happens, some of the most common filters, like lowpass filters, specify by name what is to be kept, just like programming languages do with their predicate.
Thanks for the enlightening replies. So it seems like I am not alone in my intuition, but as the change would be massively breaking and there is a lot of precedent for this implementation, things are fine for now.
If I keep having to think hard to have this make sense, I will implement my own filterout function, as suggested by @anon37204545. Thanks for everyone’s time and opinions.
Indeed – and the filter’s output is “the filtered signal”. I guess that’s why I never had as much trouble with filter as defined in programming as the OP. The point I was trying to make is that, in many contexts, a filter does not “separate” something into two parts which you get to keep, which was @Tamas_Papp’s definition.
That is a fine band-aid, and I have considered it. The original problem, however, was that I currently feel the need to do mental gymnastics to understand the code I write well, and this just feels like adding a flip to those gymnastics… Then it is better to just accept the current way the function works, IMO.
A filter can remove, and a filter can keep. A dust filter removes dust, an air filter lets through air. A bandpass filter passes through some frequency bands. A bandstop filter removes them.
The specification depends on whether it’s easier to enumerate the removed or kept parts. Neither is more intuitive or obvious.