Function to check all arguments are the same and return their value

proposal

#1

I often encounter a pattern where I have several ways of getting a value I need, all of which should be equivalent. The most typical case is getting some array dimension.

function f(data, x, y, z)
    nx = length(x)
    size(data, 1) == nx || error("Dimensions must match")
    ...
end

In these cases it would be nice to have a small function in Base that does this in one step:

    nx = same(size(data, 1), length(x))

It’s just a small papercut and I don’t know if it would be something worth considering, but for me it would make the language just a bit more comfortable to use since I end up writing code as above in many of my functions. It would reduce the mental overhead of picking one value for the assignment (vs the other ones for the assertion) and encourage checking for consistency since it can be done in one statement rather than two.

It would be especially useful in cases where you want to use a value directly without assigning it to a variable first, but have several ways of getting it that should be equivalent, e.g.:

for i = 1:same(size(data, 1), length(x))
    ...
end

It would also be somewhat analogous to type parameters, where this sort of consistency check can be done implicitly by using the same parameter name for multiple types.

I know I can easily write this function myself, but I probably wouldn’t add it to most code since it’s such a small improvement. I would, however, use it all the time if it were in Base. A possible complication is when the arguments have different types (as in same(1, 1.0)); maybe it would have to promote the values first?
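
To make the mixed-type question concrete, here is a minimal sketch of the promoting variant. The name `same_promoted` is hypothetical (nothing like it exists in Base); it only assumes that `promote` can bring the arguments to a common type.

```julia
# Hypothetical helper, not part of Base: promote all arguments to a common
# type first, so the returned value has the same type whichever one is picked.
function same_promoted(args...)
    isempty(args) && throw(ArgumentError("at least one argument is required"))
    vals = promote(args...)
    first_val = first(vals)
    all(v -> v == first_val, vals) || error("arguments differ: $(args)")
    return first_val
end

n = same_promoted(1, 1.0)   # both arguments are promoted to Float64
```

With this approach the result of `same_promoted(1, 1.0)` is a `Float64`, which may or may not be what a caller expecting an array dimension wants.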

Any thoughts?


#2

LinearAlgebra.checksquare implements a very special case of this pattern, namely

  1. get various attributes of objects,
  2. check that they are equivalent/conformable,
  3. return the one that is relevant for further use.

I am in two minds about this approach: it does make code shorter, but it conflates calculations with checking. There are suggestions for disabling @assert and similar checks with runtime flags, and I am not sure a function that both checks and returns a value would interact well with them. I also like the informative error messages from, e.g., ArgCheck.@argcheck.
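
For reference, a short sketch of how LinearAlgebra.checksquare combines the check with the returned value. The matrices `A` and `B` are made up for illustration.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
n = LinearAlgebra.checksquare(A)   # checks that A is square and returns its dimension

B = ones(2, 3)
# A non-square argument throws a DimensionMismatch instead of returning a value.
err = try
    LinearAlgebra.checksquare(B)
    nothing
catch e
    e
end
```

Here the check and the computation of `n` happen in the same call, which is exactly the conflation described above.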


#3

I think this can be quite useful, to reduce the clutter and verbosity of repeatedly verifying arguments. But I wonder if it doesn’t better belong in one’s own project? The reason is that I’ve often used constructs like this myself, and I find that they tend to be quite problem-specific: I want to specify the error message, I usually want to include what the compared values actually are in the error message, and I want to throw a specific type of Exception (e.g. DimensionMismatch), not just generic error. And what should same return, the first or second argument? Just because they’re equal doesn’t mean it’s the same object.

Is it really such a burden to add it to your code? In its simplest form, it’d just be a one-liner:

verify_equal(a,b) = a == b ? a : error("dimension mismatch")
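
A slightly more problem-specific variant along the lines described above might look like the sketch below. The name `verify_equal_dims` and the message format are illustrative assumptions, not an existing API.

```julia
# Hypothetical project-local helper: throws a specific exception type and
# includes the compared values in the message.
function verify_equal_dims(a::Integer, b::Integer)
    a == b || throw(DimensionMismatch("dimensions must match: got $a and $b"))
    return a
end

data = zeros(4, 2)
x = collect(1:4)
nx = verify_equal_dims(size(data, 1), length(x))
```

Restricting the arguments to `Integer` sidesteps the question of which argument to return, at the cost of generality.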

#4

I like the idea—it is a really common pattern.


#5

I agree that introducing a version of this function to Base only makes sense if there’s a sufficiently general version that works for most use cases. I’m not sure that’s the case, for example with regard to which of the arguments should be returned. Maybe the equality check should be strict enough that it doesn’t matter?

However, I do think it’s a big difference whether a function is part of Base or not. Defining your own functions always has some overhead. You have to think about whether it’s worth it, you have to think about the naming, you have to think about what sort of equality you want to enforce… Defining a lot of little functions can make it hard for others to follow your code because they have to look up all those definitions too when reading the code. I therefore try to stick to “vanilla” Julia whenever possible. I also tend to write a lot of prototypes in Jupyter notebooks when writing new code, so I’d end up redefining these functions over and over again (or more realistically: just not using them at all).


#6

But if you put it in a package, all this overhead is a fixed cost.

If by “vanilla” Julia you mean Base + standard libraries, you may be missing quite a bit of functionality.

Having no (performance or syntactic) overhead for user-defined functions was a key design principle in Julia from the very beginning. The language is intended to be extended, and makes this very, very convenient.

The best way to develop something like same is to

  1. put it in your projects as @bennedich suggested,
  2. if you find repeating patterns, wrap it up in a package with unit tests, document it, and release it,
  3. polish it based on user feedback if necessary.

#7

Well, I see the question of when to extract common pieces of functionality into functions within your project, shared packages, and functionality that is part of the language as one of the hard design problems of programming. Every time you extract functionality this way, you also add a layer of abstraction/indirection – sometimes that’s worth it, sometimes it isn’t. I love the power of Julia functions and packages, and I use both a lot, but for something small like this I think the benefit of using Base-only functionality familiar to everyone looking at the code is greater than the cost so I would stick to the more verbose code I’m using currently.

It’s similar to functions like all and any – they are pretty much just wrappers performing a reduction, but they make the code more clear and I like that they’re part of Base.
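
As a rough illustration of that comparison, the following sketch writes `all` and `any` as explicit reductions with `mapreduce`. The input vector is made up for the example.

```julia
xs = [2, 4, 6]

all_even = all(iseven, xs)
all_even_reduced = mapreduce(iseven, &, xs; init = true)

any_odd = any(isodd, xs)
any_odd_reduced = mapreduce(isodd, |, xs; init = false)
```

The results agree, although `all` and `any` can stop at the first decisive element while the plain reduction visits every element.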


#8

I don’t understand this perspective at all. I consider extracting functionality into separate functions one of the cornerstones of good coding style, making it easier to follow your code since the function is given a descriptive enough name that readers hopefully don’t feel the need to inspect the source code, and the surrounding code then becomes more concise and focused on its task. And you also open up for code reuse and unit testing of individual functionality.

(Note: I consider choosing descriptive names very important to achieve self-documenting code. Don’t worry so much about the length of the function name. In your example I would go with something like verify_equal or assert_equal instead of same. IMO same is even misleading here if you use == and not === to check for equality.)


#9

I didn’t mean to imply that functions (even short ones) should be avoided, it’s just my experience that every abstraction comes with a cost that may or may not be worth it. I see good coding style not as a set of fixed principles but as good intuitions about which principle should be followed to which degree. Every “best practice” (comments, extracting common functionality, unit testing, descriptive variable/function names) can become counterproductive if overdone. In the case of a function like same (or all and any) I think I prefer the style of not defining your own function if it’s not part of the language but I think either way is fine.

I agree that same is not a great name, I just couldn’t think of a better one. I don’t think verify_equal is that great either, because it places a lot of focus on the comparison and I would rather place the focus on the value that is returned. I wish there was a good short word that means “any one of these because it shouldn’t matter which one, they are all the same”.

For assignments, my ideal syntax would probably be nx = size(data, 3) = length(x), which is similar to mathematical notation, but that would be too big of a change of the language for such a small feature (and probably impossible anyway).


#10

Oh, and you’re right, it should be === for the comparison in every case I can think of. That should also remove the ambiguity of which argument should be returned.

So the function would be something like this (possibly with a better name and a better error):

same(x, xs...) = all(map(xi -> xi === x, xs)) ? x : error()

Another use case that I would use this for is when a function takes two composite types as argument that both have a field that should be the same and I access this field, something like same(a.grid, b.grid).dx.
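
A sketch of how that could look in practice, using the `===` comparison. The `Grid` and `Field` types are hypothetical and exist only for this example, and the error message is an assumption.

```julia
# Variadic helper as sketched in this post, with a more descriptive error.
same(x, xs...) = all(xi -> xi === x, xs) ? x : error("arguments are not identical: $((x, xs...))")

# Hypothetical types for illustration only.
struct Grid
    dx::Float64
end

struct Field
    grid::Grid
    values::Vector{Float64}
end

g = Grid(0.1)
a = Field(g, zeros(3))
b = Field(g, ones(3))

dx = same(a.grid, b.grid).dx
```

Note that with `===`, immutable values such as `Grid` are compared by content, while mutable objects (e.g. two separately allocated arrays with equal elements) only pass if they are the very same object. It also means `same(1, 1.0)` is an error rather than a promotion.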


#11

This is an unusual point of view; as @bennedich said, abstractions are introduced precisely to economize on mental costs, by hiding the implementation behind the intent. E.g., A \ b is much, much easier to grok than a bunch of loops doing LU and substitution.

Of course, as with every other tool, they can be abused or overdone, and it takes time to build the experience to use them well.
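
To illustrate the A \ b example, here is a small sketch comparing the one-line solve with the explicit factorization steps it hides, assuming a general dense square matrix. The values of `A` and `b` are made up.

```julia
using LinearAlgebra

A = [4.0 3.0; 6.0 3.0]
b = [10.0, 12.0]

# The abstraction: one operator expressing the intent.
x = A \ b

# Roughly what it stands for: LU factorization with row pivoting,
# where F.L * F.U == A[F.p, :], followed by two triangular solves.
F = lu(A)
y = LowerTriangular(F.L) \ b[F.p]    # forward substitution
x_manual = UpperTriangular(F.U) \ y  # back substitution
```

Both routes give the same solution; the first simply says what is wanted.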


#12

@mfsch and @StefanKarpinski, perhaps the right Base naming you’re looking for is something that, rather than sounds like an equivalence check, presumes equivalence:

nx = equivalently(size(data, 1), length(x))

(taken from OP)


#13

I like that name! A short word would be nice, but it’s better to have it a bit longer and more clear I think.