What's your simple thing that you'd like to be easier than it is?

I’m interested in a current overview of simple expressions that often come up but tend to be more verbose or difficult than they could theoretically be. These things are scattered around many threads, but I think an overview would be cool. From a language design point of view, it interests me what trade-offs are made and what becomes easy or difficult in the process. Please share your own, without deviating too much into discussing each one, since that would make the examples less visible.

I’ll start:

Map returning multiple arrays

I often need something like this hypothetical multimap function:

ones, twos = multimap(1:100) do i
    (1, 2)
end

Instead, I think you have to mess around with zip, or create one array first and then deconstruct it, or something like that.
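
For reference, the "create one array first and then deconstruct it" route can be sketched roughly like this. The names tuples, ones_vec and twos_vec are made up for illustration, and this is only one possible workaround, not a proposed multimap:

```julia
# Build a single array of tuples first ...
tuples = map(1:100) do i
    (1, 2)
end

# ... then split it by position using broadcasting.
ones_vec = first.(tuples)   # first element of every tuple
twos_vec = last.(tuples)    # last element of every tuple
```

This allocates the intermediate array of tuples, which is part of why a direct multimap would be nicer.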

Conditional application of functions

I often do this

result = some_condition ? value : some_function(value)

and if there’s two steps or more involved it’s annoying to repeat value. I don’t even have a suggestion for a syntax that would be nice but the idea is always “some value with f applied, but only if some condition holds” and this comes up constantly.

Map with skipped results

I like map but often I wish I could easily exclude some return values. I guess you can do something with reduce or just filter afterwards, but my intuitive wish would be to use continue like in a for-loop.

filtered_results = map(values) do v
    # computations...
    # then skip values where some condition doesn't hold
    some_condition && continue
    return result
end
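
One workaround that stays close to this shape is to return nothing for the skipped entries and filter them out afterwards. This is only a sketch: the input range and the "skip odd squares" condition are invented for illustration, and it assumes nothing is never a legitimate result.

```julia
inputs = 1:10

# Return `nothing` where a `continue` would be wanted ...
mapped = map(inputs) do v
    squared = v^2
    isodd(squared) && return nothing
    return squared
end

# ... and drop those placeholders in a second pass.
filtered_results = filter(!isnothing, mapped)
```

The intermediate array has a Union element type, so it is less tidy than a real continue would be.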

This is something I’ve definitely been missing. I guess an unzip function would solve it?

You can do it like this

julia> (true ? identity : length)("string")
"string"

julia> (false ? identity : length)("string")
6

i.e. (condition ? identity : some_function)(value). If you don’t like identity's name, you can also use x -> x instead.

Just note that when performance matters, this can incur some overhead.
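
A small helper that branches on the condition directly, instead of first choosing a function as a value, might look like the sketch below. applyif is a hypothetical name, not something in Base, and note that it applies f when the condition is true. Whether it actually avoids the overhead is an assumption that would need benchmarking at the real call site.

```julia
# Hypothetical helper: apply `f` to `x` only when `cond` is true,
# otherwise pass `x` through unchanged.
applyif(f, cond::Bool, x) = cond ? f(x) : x

applyif(length, true, "string")    # 6
applyif(length, false, "string")   # "string"

# With do-syntax, a multi-step transformation only mentions the value once.
result = applyif(true, [3, 1, 2]) do v
    sorted = sort(v)
    reverse(sorted)
end
```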

I am not sure about this — a lot of seemingly “simple” and “obvious” constructs are not implemented because they are actually difficult to implement or integrate into Julia.

I often want to search for target values in a collection, like with searchsorted etc., but the collection is not sorted, so I have to use findall, findfirst, etc., which do not accept the target value directly (except for true).

When this is done repeatedly in the same code, I find myself writing a wrapper like:

findfirstequal(target, collection) = findfirst(x -> x==target, collection)
findfirst(isequal(target), collection)

:wink:

findfirst(==(target), collection)
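
The two spellings are not always interchangeable, because == and isequal treat some values differently (NaN is the usual example). A quick illustration, with a made-up collection:

```julia
collection = [1.0, NaN, 3.0]

findfirst(==(3.0), collection)        # 3
findfirst(==(NaN), collection)        # nothing, because NaN == NaN is false
findfirst(isequal(NaN), collection)   # 2, because isequal(NaN, NaN) is true
```

Either way, findfirst returns nothing when no element matches, so the result usually needs a check before it is used as an index.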

OP this is a good list. You should try and turn these into a package or a PR to Base.

This is a good idea. I always come across these things, but I just move on without making a note. I should make a note like this sometime.

The multimap issue is something I spent a lot of time on. I often return named tuples now and need to unpack them afterwards. Maybe the unzip solution isn’t so bad though:

unzip(xs) = ntuple(i -> [x[i] for x in xs], length(first(xs)))

a,b = (i -> (1,2)).(1:100) |> unzip
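
For the named-tuple case, a variant that keeps the field names could look like this. unzip_named is a hypothetical name, and the sketch assumes every element has the same fields as the first one:

```julia
# Turn a vector of named tuples into a named tuple of vectors.
function unzip_named(rows)
    fields = keys(first(rows))   # e.g. (:a, :b)
    columns = map(field -> [row[field] for row in rows], fields)
    return NamedTuple{fields}(columns)
end

rows = [(a = i, b = 2i) for i in 1:3]
cols = unzip_named(rows)
cols.a   # [1, 2, 3]
cols.b   # [2, 4, 6]
```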

About findfirst, I find it annoying that it doesn’t accept generators:

findfirst(x>1 for x in 1:5)   #wrong
findfirst([x>1 for x in 1:5]) #right :(

It’s also hard to remember which functions accept generators and which do not (maybe there’s a good reason behind it, but it seems a bit arbitrary).

I think a lot of them are pretty accidental, and should just be fixed unless there is a compelling reason against it.

Why would it be supposed to accept generators? x>1 for x in 1:5 creates only one argument (an iterator), and findfirst needs two (a function and an iterator).

But maybe you thought about this?

julia> findfirst(x -> x>1, 1:5)
2

That works, but I think the one-argument form (taking a collection as input) is more common and convenient, especially since you’ll often use comprehensions in that kind of code, so it’s very natural to copy the inside of the comprehension and put it in another function.

Maybe the issue is that generators are not enumerable in the right way, e.g. maximum works but not argmax.

For generators that produce Bools, this can be implemented in a very straightforward way.

# This would actually work for any iterator, except that it
# violates arbitrary indexing, for which `pairs` is used instead
function Base.findfirst(gen::Base.Generator)
    T = Base.@default_eltype(gen)
    @assert T == Bool "findfirst can't iterate over a generator of non-booleans. Got return type $T"
    for (i, x) in enumerate(gen)
        x && return i
    end
    return nothing  # no true element found
end

The reason this doesn’t already “just work” is actually because generators don’t implement pairs (or keys) which is the method find* relies on. Since generators don’t have getindex defined on them anyhow, I don’t think it’s a problem to use enumerate.

Actually, defining the following in Base makes findfirst and findlast work as expected, without the above definition:

Base.pairs(gen::Base.Generator) = enumerate(gen)

julia> findlast(x>2 for x in 1:5)
5

julia> findfirst(x>2 for x in 1:5)
3

findnext is harder, since it requires indexing, but it can likely be done as well.
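
Outside of Base, adding methods to Base functions for Base types is type piracy, so in user code the same enumerate loop could live under its own name instead. firsttrue below is a hypothetical helper, sketched on the assumption that the iterator yields Bools:

```julia
# Hypothetical helper: position of the first `true` in any iterator of Bools,
# counting from 1 via `enumerate` (no indexing required).
function firsttrue(itr)
    for (i, x) in enumerate(itr)
        x && return i
    end
    return nothing   # mirrors findfirst when nothing matches
end

firsttrue(x > 2 for x in 1:5)    # 3
firsttrue(x > 9 for x in 1:5)    # nothing
```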

julia> X = [1,2,3,4,5]
5-element Vector{Int64}:
 1
 2
 3
 4
 5

julia> [x^2 for x in X if x > 2]
3-element Vector{Int64}:
  9
 16
 25

I think array comprehensions are underutilized on average. They’re super super useful.

The do syntax is nice with map for more complicated expressions though. But I agree this is useful.

The dimensionality issue when taking sum or mean. Writing dropdims(sum(x, dims=2), dims=2) is so cumbersome. I wish sum would just take a kwarg to drop dimensions instead.
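
To make the issue concrete, here is a small sketch on a made-up 2×3 matrix, along with two shorter spellings that happen to give the same vector in the matrix case. They are not general replacements for dropdims on higher-dimensional arrays.

```julia
x = [1 2 3; 4 5 6]

kept = sum(x, dims=2)               # 2×1 Matrix: the reduced dimension is kept
dropped = dropdims(kept, dims=2)    # 2-element Vector: [6, 15]

# Shorter alternatives for the matrix case only:
via_vec = vec(sum(x, dims=2))       # flatten the 2×1 result
via_rows = map(sum, eachrow(x))     # sum each row directly
```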

You can define a workaround like the following to avoid repeating dims:

julia> mapdim(f::F, array; dims::D) where {F,D} = dropdims(f(array, dims=dims), dims=dims)

julia> @btime mapdim(sum, v, dims=2) setup=(v=rand(512, 512));
  30.600 μs (10 allocations: 4.33 KiB)

julia> @btime dropdims(sum(v, dims=2), dims=2) setup=(v=rand(512, 512))
  30.800 μs (10 allocations: 4.33 KiB)

Curiously, mapslices is still pretty slow:

julia> @btime dropdims(mapslices(sum, v, dims=2), dims=2) setup=(v=rand(512,512));
  1.395 ms (4142 allocations: 114.38 KiB)