Detect argument is an iterable or a "scalar"

ryofurue · July 10, 2023, 1:53pm

I keep finding myself wanting to branch on whether an argument is a collection (iterable?) or a scalar (single value). What’s the idiomatic way?

A toy example would be

function func(xs)
  if is_iterable(xs) # <- How to do this?
    for (i,x) in pairs(xs)
      print("working on the $(i)th item . . . "); flush(stdout)
      dosomethingon(x)
    end
    println()
  else
    dosomethingon(xs)
  end
end

So far, the best I can think of is this inelegant solution

function func(xs)
  if length(xs) > 1
     # iterable
  else
     # scalar
  end
end
func(3:10) # iterable
func([3]) # one-element array =~ scalar

jling · July 10, 2023, 2:05pm

you can’t and you shouldn’t.

Maybe this is an XY problem? Can you explain why you need to do this with the SAME method?

normal pattern in Julia is probably something like:

dosomething(x::Number) = ...

function dosomething(xs)
   for (i,x) in pairs(xs)
      dosomething(x)
   end
end

thautwarm · July 10, 2023, 2:28pm

It is not doable. That limitation, however, points at the right thing: what if someone treats as iterables what you treat as scalars? Julia code is loaded incrementally: Sam loads package A and Tom loads package B later. What if A and B are meant (by Sam and Tom) to make different assumptions about what counts as a scalar or an iterable?

Interestingly, Julia core may not distinguish scalars in the way you’re used to:

julia> x = 1
1

julia> for i in x
           println(i)
       end
1

I’d suggest you define isiterable for your own “application domain”:

MyPackage.isiterable(::MyType) = true
MyPackage.isiterable(::Number) = false
MyPackage.isiterable(::AbstractArray) = true
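
A minimal sketch of how such a domain-specific trait could drive the branch from the original question. The names process and the returned labels are hypothetical, and the fallback method for Any and the AbstractString method are assumptions added for illustration. In a real package these definitions would live in the package's own namespace, as in the MyPackage lines above.

```julia
# Domain-specific trait: this code decides what counts as a collection.
isiterable(::Any) = false             # assumed default: treat as a single value
isiterable(::AbstractArray) = true
isiterable(::Tuple) = true
isiterable(::AbstractString) = false  # assumed: strings are single items here

# Hypothetical stand-in for the real work; returns one label per item.
function process(xs)
    if isiterable(xs)
        return ["item $i: $x" for (i, x) in pairs(xs)]
    else
        return ["single: $xs"]
    end
end

process(3:5)    # one label per element
process(4.2)    # one label
process("abc")  # one label: the string is not split into characters
```

Because the trait belongs to the application, it can disagree with Julia's own notion of iterability (here a String is deliberately not iterated) without affecting anyone else's code.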

mikmoore · July 10, 2023, 3:25pm

You should describe what you’re trying to accomplish with more context for a complete answer. Often, when I’m looking for something like this what I really want is a recursive reduction of the input:

countnumbers(x) = sum(countnumbers,x) # recursive definition
countnumbers(x::Number) = 1 # base case
countnumbers(x::AbstractChar) = 0 # special case

countnumbers([[1,2,3],4,(5,6),"seven"]) == 6

But depending on how many base/special cases you need to define, this may not be practical.

ryofurue · July 10, 2023, 4:20pm

using Plots
using StrFormat
function func(xs, . . . )
  # . . . here comes a lot of preparation to plot figures . . .
  if is_iterable(xs)
     for (i, x) in pairs(xs)
       p = plot_a_figure_with_x(x, . . . )
       savefig(p, f"myfig-p\%3.3d(i).png") # -> myfig-p001.png, myfig-p002.png, . . .
     end
  else
     p = plot_a_figure_with_x(xs, . . . )
     savefig(p, "myfig.png")
   end
end
func(x1:delx:x2, . . .) # iterable
func(x, . . .) # scalar

This isn’t a contrived example. I’m just trying to write a program exactly like this.

Can you explain why you need to do this with the SAME method?

It’s just convenient because the two cases share a lot of code. If I were to use two functions instead of one, I would need to separate out the preparation code (see the above code) into another function and call it from the two functions. It would be absolutely doable, but it’d be unnecessary complexity for my particular case at hand. The following solution would be much simpler.

If what I asked for is impossible, I’d just add a Boolean flag to indicate whether the argument is iterable or not:

function func(; xss, . . . )
  (xs, iterable) = xss
  if iterable . . .
  . . .
end
func(xss = (xs, true), . . . )
func(xss = (y, false), . . . )

A simple problem, a simple solution.

I just (incorrectly) imagined that Julia’s type hierarchy had something that indicates “iterability” (just like all the Real types implement the larger-than operator whereas the Complex{T} types don’t).

rocco_sprmnt21 · July 10, 2023, 5:00pm

maybe the isbits() function can do what you ask

mikmoore · July 10, 2023, 6:04pm

I might try something like the following:

function savemyplots(xs::AbstractVector{<:Real}, . . .)
  p = plot_a_figure_with_x(xs, . . . )
  savefig(p, "myfig.png")
end

function savemyplots(xs_collection, . . .)
  for (i, x) in pairs(xs_collection)
    p = plot_a_figure_with_x(x, . . . )
    savefig(p, f"myfig-p\%3.3d(i).png") # -> myfig-p001.png, myfig-p002.png, . . .
  end
end

function func(xs, . . . )
  # . . . here comes a lot of preparation to plot figures . . .
  savemyplots(xs)
end

Replace AbstractVector{<:Real} with whatever input type you want to recognize as a single plot-able element. Everything else will be iterated as a collection of plots to make. If you want multiple specific types to be recognized as single-plot types, then you might need a helper function or trait (like the isiterable suggestion above), and you can use that to forward to the single or multiple version of your plotting.

That said, it seems that handling this dispatch at the level of func might be better. I would imagine that the preparatory work also depends on whether you plan to produce one or many plots.

Your option of adding a flag is totally reasonable too. However, at that point I would consider whether you really need func to be the name for both the single and multiple cases. It may be less confusing to handle each under a differently-named function.

As others have said, we don’t have a universal trait (much less a branch in the type hierarchy) to determine the iterability of an object. Whether something should be iterated or treated as a scalar is context-dependent so we could never hope to answer the question accurately and decisively.

For example, should a String be treated as a monolithic object or as an ordered collection of characters? Should a vector be treated as a single object (perhaps representing a single point in N-dimensional space?) or as a collection of numbers? It makes no sense to sort the (x,y,z) coordinates of a point but it’s entirely reasonable to sort a list of prices.

DNF · July 10, 2023, 7:08pm

I don’t think I have ever written a piece of code where I don’t ‘know’ in the code whether a particular object is iterable or not.

Just write separate methods for scalar and AbstractVector inputs, where the latter handles preprocessing separately.

ryofurue · July 11, 2023, 5:08am

Thank you for your analysis! It’s a very accurate analysis of my problem. There, I think this is the crux of the problem:

That said, it seems that handling this dispatch at the level of func might be better. I would imagine that the preparatory work also depends on whether you plan to produce one or many plots.

No, that’s not the case and that is the point! I wrote the original code for a single plot. Then I’m trying to extend it for multiple plots, just to loop over the values of x, changing the filename from myfig.png to myfig-p001.png, . . . That’s the only difference.

My problem is that simple and the solutions everybody proposes are overkill.

I’m not against your solutions in general. I would write such code as everybody proposes if the two cases are different enough. I’m not saying that branching on whether the value is scalar or not is a superior solution in general. In most cases, dispatch is the superior solution.

If I write two functions to handle each case separately, I would have to pass a lot of arguments to them from my preparatory code. I don’t think that extra complexity is worth it for my problem at hand.

If my code further extends and if the two cases (collection vs scalar) become different enough, then two separate functions would start to make sense. I don’t think, however, that it makes sense to do so just to avoid branching on whether the value is a collection or a scalar.

we don’t have a universal trait (much less a branch in the type hierarchy) to determine the iterability of an object. Whether something should be iterated or treated as a scalar is context-dependent so we could never hope to answer the question accurately and decisively.

Okay.

For example, should a String be treated as a monolithic object or as an ordered collection of characters? Should a vector be treated as a single object (perhaps representing a single point in N-dimensional space?) or as a collection of numbers? It makes no sense to sort the (x, y, z) coordinates of a point but it’s entirely reasonable to sort a list of prices.

You explain why we cannot determine, in the current Julia, whether a value can be considered iterable or not. But in the preceding paragraph you mentioned “trait”, and that approach isn’t impossible in principle. For example, the language designer could declare Vector to be a subtype of Iterable. Then, any Vector would have to behave like an Iterable in a context where an Iterable is expected. The language designer could decide that String is not an Iterable; then a String would not work as an Iterable and you would have to write for c in iter(a_string) to extract each character. And so on and so forth.

But, I’m sure that the Julia designers had good reasons not to take this approach.

DNF · July 11, 2023, 6:27am

Two points:

1. Are you using collections that are not AbstractArrays? Can’t you write your own isiterable function which just tests for AbstractArray and Tuple and AbstractDict? This was suggested by @thautwarm, but you haven’t addressed it, afaict.
2. You can share code between functions or methods, like this

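_myprepcode() = ...

# collection: run the shared preparation, then loop over the elements
function savemyplots(xs::AbstractArray, ...)
    _myprepcode()
    for (i, x) in pairs(xs)
       ...
    end
end

# single value: same preparation, one plot
function savemyplots(x::Number, ...)
    _myprepcode()
    ...
end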

bertschi · July 11, 2023, 6:59am

Just my two cents …

If that is your only use here and you are fine with the definition of iterable that Julia has, i.e.,

for x in [1,2,3]  # will iterate over elements
for x in 1          # will also iterate over single element
for x in :a         # will fail with a method error

you can just wrap your iteration part of the code into a try catch block and run the single plot code when catching a method error.
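
A rough sketch of that try/catch idea, under the assumption that "iterable" means "iterate(xs) does not throw a MethodError". The helper name foreach_or_single is hypothetical. Only the first iterate call is probed inside the try, so a MethodError raised by the per-item work itself is not mistaken for "not iterable".

```julia
# Hypothetical helper: apply `f` to each element of `xs` if `xs` can be
# iterated, otherwise apply `f` to `xs` itself.
function foreach_or_single(f, xs)
    iterable = try
        iterate(xs)                       # probe only the first iteration step
        true
    catch err
        err isa MethodError || rethrow()  # unrelated errors are passed on
        false
    end
    if iterable
        foreach(f, xs)
    else
        f(xs)
    end
    return nothing
end

seen = Any[]
foreach_or_single(x -> push!(seen, x), [1, 2, 3])  # iterates the elements
foreach_or_single(x -> push!(seen, x), 1)          # a number iterates once
foreach_or_single(x -> push!(seen, x), :a)         # a Symbol falls back to the single case
```

The probe calls iterate(xs) once and discards the result, so this sketch is not suitable for stateful iterators where taking the first element has side effects.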

Note that Julia does have several notions of iteration, i.e., for loops run over iterables whereas broadcast has its own definition of broadcastable. These notions do not always agree, e.g., [identity(x) for x in "abc"] vs identity.("abc"). Thus, depending on your exact requirements, defining your own trait might be necessary.
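
The disagreement between the two notions can be seen directly in a small example. The variable names are only for illustration.

```julia
# Iteration treats a String as a collection of characters ...
chars = [identity(c) for c in "abc"]     # a Vector{Char} with three elements

# ... while broadcasting treats the same String as a single value.
whole = identity.("abc")                 # the String itself

# A tuple is a collection under both notions.
doubled_iter  = [2x for x in (1, 2, 3)]  # a Vector
doubled_bcast = 2 .* (1, 2, 3)           # a Tuple
```

So a check built on iteration and a check built on broadcasting would classify a String differently, which is one reason a single universal answer is hard to give.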

Using dispatch instead of branching is also common in OOP languages and very light weight in Julia. In your case, just define the functions within your larger function doing the preparation and call them right away. Then, you don’t need to pass any arguments as they have access to the surrounding scope.
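
A sketch of that last suggestion, with hypothetical names (plan_figures, emit, prefix) and with the plotting replaced by collecting the file names that would be written. The local emit methods are closures, so they see the prepared variables without any of them being passed as arguments.

```julia
# Hypothetical stand-in for `func`; returns the file names it would write.
function plan_figures(xs; prefix = "myfig")
    # ... shared preparation; these locals are visible to the closures below ...
    ext = ".png"
    names = String[]

    # Local methods: dispatch picks one, and neither needs extra arguments.
    emit(x::Number) = push!(names, prefix * ext)
    function emit(xs::AbstractVector)
        for i in eachindex(xs)
            push!(names, prefix * "-p" * lpad(i, 3, '0') * ext)
        end
        return names
    end

    emit(xs)
    return names
end

plan_figures(2.5)          # single value: one file name
plan_figures(0.0:0.5:1.0)  # collection: one numbered file name per element
```

Which types count as "single" and which as "collection" is still decided by the method signatures, here Number and AbstractVector as an assumption.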

Benny · July 11, 2023, 10:27am

As people have said, the idiomatic way is to use separate methods for your iterables versus your non-iterables. You shouldn’t be referring to non-iterables as scalars because some scalars, like Numbers, are actually iterable; they just have no dimensions and a single iteration.

However, this is actually possible to do because the iteration interface starts simply at iterate(xs). To check whether a function can be called on arguments, you can use applicable(iterate, xs). This is a bad idea as a condition because this is some runtime work you can’t optimize away, but it’s a bit better than try-catching every MethodError. Bear in mind that iterate(xs) isn’t the only call that runs when xs is iterated, so using this as an isiterable assumes that whoever wrote the first called method also wrote the rest.
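
For illustration, a short sketch of what applicable(iterate, xs) reports for a few values. The wrapper name can_iterate is hypothetical.

```julia
# Hypothetical wrapper around the check described above.
can_iterate(x) = applicable(iterate, x)

can_iterate([1, 2, 3])  # true
can_iterate("abc")      # true: a String iterates over its characters
can_iterate(1)          # true: a Number iterates once over itself
can_iterate(:a)         # false: there is no iterate method for Symbol
```

Note that numbers pass the check, so on its own it does not separate the "scalar" case from the "collection" case in the original question.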

Henrique_Becker · July 11, 2023, 10:57am

No, it cannot.

julia> isbits(1)
true

julia> isbits((1, 2, 3))
true

Why would you suggest that?

HenriDeh · July 11, 2023, 11:06am

I join the others in saying that it is not a good idea, both performance-wise and for the readability of your code. If this does not matter to you, there is however a way to achieve what you asked, using the Base.hasmethod function to check whether iterate has a method for the type of xs. Something like hasmethod(iterate, Tuple{typeof(xs)})

Henrique_Becker · July 11, 2023, 12:37pm

This will not work because number types like Int are iterable:

julia> 2[1]
2

julia> first(2)
2

julia> hasmethod(iterate, (Int,))
true

So you will not detect a scalar this way.

rocco_sprmnt21 · July 11, 2023, 12:38pm

I had tried the following tests and conjectured that the function could distinguish the cases the OP cares about.

julia> isbitstype(Int)
true
julia> isbitstype(Float64)
true
julia> isbitstype(Vector)
false
julia> isbitstype(Tuple)
false

But evidently I don’t know how the type system works in Julia.

HenriDeh · July 11, 2023, 12:43pm

My bad

Henrique_Becker · July 11, 2023, 12:43pm

Tuples can be isbits or not; it depends on whether they contain only isbits types.

But more than that, some scalar values are not isbits: isbits(big(1)) returns false.

GunnarFarneback · July 11, 2023, 3:09pm

Just Tuple is not even a concrete type.

julia> isconcretetype(Tuple)
false

rocco_sprmnt21 · July 11, 2023, 3:43pm

If this remark means that isbitstype(T) can be true only for some concrete types, it might be useful to add it in the help of the function, together with some further examples of the type isbitstype(Tuple{Int,Int}) == true

help?> isbitstype
search: isbitstype

  isbitstype(T)

  Return true if type T is a "plain data" type, meaning it is immutable and contains no references to other values, only primitive
  types and other isbitstype types. Typical examples are numeric types such as UInt8, Float64, and Complex{Float64}. This category   
  of types is significant since they are valid as type parameters, may not track isdefined / isassigned status, and have a defined   
  layout that is compatible with C.
