Seeking an "isvector" function

chadagreene · April 27, 2022, 10:40pm

I’m seeking a function similar to Matlab’s isvector, to determine whether a variable is one dimensional.

These are the cases where `isvector(a)` should return `true`:

a = [1 2 3 4]
a = [1, 2, 3, 4] 
a = [1 2 3 4]'
a = [1, 2, 3, 4]'

This case should return `false`:

a = [1 2; 3 4]

I’ve tried

sum(size(a).==1)

but that fails for a = [1, 2, 3, 4].

This seems like a very simple problem. Is there a straightforward way to solve it?

fredrikekre · April 27, 2022, 10:54pm

You can check a isa AbstractVector.

However, it feels a bit like an XY problem. I have never (or very rarely) had to check for something like this – normally you would utilize multiple dispatch instead, and have one method for ::AbstractVector and another one for ::AbstractMatrix, for example. Since this is not really possible in Matlab, you have to use isvector etc. there, but in Julia there are usually better ways.
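As a minimal sketch of the dispatch approach described above (the function name `describe` and its return strings are hypothetical, chosen only for illustration), one method can be written per array type and Julia picks the right one from the argument:

```julia
# Hypothetical function: one method per argument type, no shape checks needed.
describe(x::AbstractVector) = "vector of length $(length(x))"
describe(x::AbstractMatrix) = "matrix of size $(size(x))"

describe([1, 2, 3])    # dispatches to the AbstractVector method
describe([1 2 3])      # a 1×3 Matrix, so the AbstractMatrix method is used
describe([1, 2, 3]')   # an adjoint of a vector is also an AbstractMatrix
```

Note that under this scheme a row `[1 2 3]` is treated as a matrix, not as a vector, which is the distinction the rest of the thread discusses.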

Mason · April 27, 2022, 10:54pm

Agreed with @fredrikekre.

However, if forced to solve the problem as stated, I’d probably do

~~isvector(v) = prod(size(v)) == length(v)~~

~~This will only be true in the case where size(v) is only 1s except one entry which should be the length.~~

Nope, that’s not right, look at @stevengj’s solution

stevengj · April 27, 2022, 10:57pm

It sounds like you want

is1d(a) = count(>(1), size(a)) ≤ 1

but I agree with @fredrikekre that something seems odd here. Under what circumstances in Julia would you want code to accept [1,2,3,4] and [1 2 3 4] but not other arrays? Unlike Matlab, we have a “true” 1d array type in AbstractVector (as opposed to 1-column matrices).
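For reference, here is how that one-liner behaves on the cases from the original question. It counts the dimensions whose size is greater than 1 and accepts the array if there is at most one such dimension. The last two calls are extra cases not mentioned in the question, included only to show the edge behaviour:

```julia
is1d(a) = count(>(1), size(a)) ≤ 1

is1d([1 2 3 4])        # true:  size (1, 4)
is1d([1, 2, 3, 4])     # true:  size (4,)
is1d([1 2 3 4]')       # true:  size (4, 1)
is1d([1, 2, 3, 4]')    # true:  size (1, 4)
is1d([1 2; 3 4])       # false: size (2, 2)

# Edge cases: anything with no dimension larger than 1 also passes.
is1d(fill(1, 1, 1))    # true:  a 1×1 matrix
is1d(zeros(1, 5, 1))   # true:  a 1×5×1 array
```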

chadagreene · April 27, 2022, 11:14pm

@stevengj Your is1d solution is exactly what I’m looking for. However, your skepticism about what I’m trying to do suggests there might be some fundamental concept I need to learn.

Here’s the case: I’m writing a function that will perform different calculations, based on whether the user enters:

a single point,
a set of points with explicitly defined lat, lon locations, or
a grid of points defined by a 1D lon row vector and a 1D lat column vector (sorry for the Matlab terminology).

Here’s what I have so far:

function myfunction(lat,lon)

    if length(lat) == 1 && length(lon) == 1
        # calculate for a single point

    elseif isequal(size(lat),size(lon))    
        # arbitrary array of points, or fully-defined grid of points 

    elseif length(lat) > 1 && length(lon) > 1 && is1d(lat) && is1d(lon)
        # two 1D vectors of different size define a grid

    else 
        error("Inputs do not make sense.")
    end
end

Is there a different way I should be going about this?

mbaz · April 27, 2022, 11:26pm

What you’re doing works, but multiple dispatch offers a different way to do it. For example:

function myfunction(lat::T1, lon::T2) where {T1 <: Real, T2 <: Real}
    # calculate for a single point
end

is called whenever lat and lon are real numbers. This is a “method” of the function myfunction. You would define other methods for the other two cases.

stevengj · April 28, 2022, 12:12am

Much simpler to write

function myfunction(lat::Real, lon::Real)

But in any case, the point remains that you should just write different methods to do different things for different argument types.

Note that if you just want to “vectorize” your function over a bunch of points, you can just use dot calls, e.g. myfunction.(lat', lon) to generate a matrix from a grid of points defined by a row vector lat' and a column vector lon.

joa-quim · April 28, 2022, 12:34am

GMT.jl has one with that name.
https://github.com/GenericMappingTools/GMT.jl/blob/master/src/common_options.jl#L3225

I need it to let users pass input options to GMT modules as either [x1, x2, x3] or [x1 x2 x3].

stevengj · April 28, 2022, 12:44am

If you’re going to allow those two options, why not allow any iterable container?

liuyxpp · April 28, 2022, 2:14am

Maybe I’m missing something, but why not

is1d(a) = ndims(a) == 1

Edit: I now understand that it is possible to have an n×1 or 1×n matrix, or a higher-dimensional array…
But I would still suggest working with plain vectors where possible.

stevengj · April 28, 2022, 2:24am

Because row vectors are treated as 1-row matrices with ndims == 2, and @chadagreene wanted to distinguish them.

julia> ndims([1 2 3])
2

julia> ndims([1,2,3])
1

julia> ndims([1,2,3]')
2

DNF · April 28, 2022, 5:19am

Agreed, this is most likely a job for broadcasting. But based on @chadagreene’s description it should be

myfunction.(lat, lon)

In fact, even Matlab now has some limited broadcasting capabilities (implicit, though), so even there you can avoid the excruciating dimensions checks sometimes.

@joa-quim: that looks a bit like it could be redesigned. Can’t you just wrap the input in vec? In Matlab you would do input = input(:).
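A small sketch of the `vec` idea, assuming the goal is simply to accept both `[x1, x2, x3]` and `[x1 x2 x3]` and continue with a plain vector (the helper name `as_vector` is hypothetical):

```julia
# Hypothetical helper: flatten any array input to a 1-d vector.
as_vector(x::AbstractArray) = vec(x)

row = [1 2 3]      # 1×3 Matrix
col = [1, 2, 3]    # 3-element Vector

as_vector(row) == col          # same elements, now one-dimensional
ndims(as_vector(row)) == 1
as_vector(col) == col          # a vector passes through unchanged
```

Note that `vec` also flattens a genuine matrix such as `[1 2; 3 4]` (in column-major order), so this only makes sense when any shape of input is acceptable.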

DNF · April 28, 2022, 6:34am

I should add that it is deeply ingrained in the Matlab psyche (I should know) to pass various arrays into a function, and then wrangle with their shapes, trying to figure out what to return. This is done in order to exploit ‘vectorization’ in the innermost, ‘built-in’, functions, and to get good performance.

You don’t need to do that in Julia, you can write your code for scalar inputs, and then leave all the trouble with matching of shapes and expansion of dimensions to the broadcasting machinery:

function myfunction(lat::Real,lon::Real)
    # calculations for a single point
end

And then call it with myfunction.(latarray, lonarray) (no need to explicitly use multiple dispatch and create multiple methods), which will handle all shapes and dimensions you want, as long as they ‘make sense’ and fit each other. This tends to make your code simpler, more general, and more robust.
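To make this concrete, here is a sketch with a stand-in scalar function (`pointvalue` and its formula are hypothetical placeholders for the real per-point calculation). The three input cases from the original question fall out of broadcasting without any shape checks:

```julia
# Hypothetical scalar kernel: written for one point only.
pointvalue(lat::Real, lon::Real) = lat + lon / 1000

lat = [10.0, 20.0, 30.0]      # 3-element column vector
lon = [1.0 2.0 3.0 4.0]       # 1×4 row

single = pointvalue(10.0, 1.0)               # one point: a scalar
paired = pointvalue.(lat, [1.0, 2.0, 3.0])   # equal-size inputs: 3-element Vector
grid   = pointvalue.(lat, lon)               # column and row: 3×4 Matrix
```

Shapes that cannot be broadcast together, such as a 2-element and a 3-element vector, raise a `DimensionMismatch`, which plays the role of the "Inputs do not make sense" branch.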

joa-quim · April 28, 2022, 6:09pm

Maybe, but …

https://github.com/GenericMappingTools/GMT.jl/blob/master/src/common_options.jl#L140

parse_RIr(d::Dict, cmd::String, O::Bool=false, del::Bool=true) = parse_R(d, cmd, O, del, true)
function parse_R(d::Dict, cmd::String, O::Bool=false, del::Bool=true, RIr::Bool=false)::Tuple{String, String}
    # Build the option -R string. Make it simply -R if overlay mode (-O) and no new -R is fished here
    # The RIr option is to assign also the -I and -r when R was given a GMTgrid|image value. This is a
    # workaround for a GMT bug that ignores this behaviour when from externals.

    (show_kwargs[1]) && return (print_kwarg_opts([:R :region :limits], "GMTgrid | NamedTuple |Tuple | Array | String"), "")

    opt_R::String = ""
    val, symb = find_in_dict(d, [:R :region :limits :region_llur :limits_llur :limits_diag :region_diag], del)
    if (val !== nothing)
        opt_R = build_opt_R(val, symb)
    elseif (IamModern[1])
        return cmd, ""
    end

    if (opt_R == "")    # See if we got the region as tuples of xlim, ylim [zlim]
        R::String = "";    c = 0
        if (((val = find_in_dict(d, [:xlim :xlimits])[1]) !== nothing) && isa(val, Tuple) && length(val) == 2)
            R = @sprintf(" -R%.15g/%.15g", val[1], val[2])
            c += 2

This function tests for the allowed types that can be provided to build GMT’s -R (BoundingBox) option. There I’m not even using isvector as I should but instead checking the VMr type (a Union{Vector{<:Real}, Matrix{<:Real}}) thus leaving open a path to an uncaught usage error. So yes, checking for vector sensu lato seems useful to me.

tomerarnon · April 29, 2022, 8:51am

But even in that example you can see that a Vector or Matrix with 1 dimension of size >1 aren’t the only vector-like things. You have a check on the same line for a Tuple…

Probably, the best bet in this example, as with nearly every case involving type checking, is to use multiple dispatch. Anything which you (as the designer of this interface) deem “close enough to a vector” should behave like a vector, which usually means converting it to a vector and passing it to another method to handle.

That could look something like

using Printf  # provides @sprintf, used in the helper below

build_opt_R(val) = build_opt_R(val, Symbol())
build_opt_R(val, symb::Symbol) = "" # default case, anything unexpected goes here

function build_opt_R(val::Union{String, Symbol}, symb::Symbol)
    r = string(val)
    if     (r == "global")     R = " -Rd"
    elseif (r == "global360")  R = " -Rg"
    elseif (r == "same")       R = " -R"
    else                       R = " -R" * r
    end
    return R
end

# helper function. Pretty sure the else case does the same as arg2str...
function _some_name(val, symb)
    R = symb ∈ (:region_llur, :limits_llur, :limits_diag, :region_diag) ?
        " -R" * @sprintf("%.15g/%.15g/%.15g/%.15g", val[1], val[3], val[2], val[4]) :
        " -R" * @sprintf("%.15g/%.15g/%.15g/%.15g", val[1], val[2], val[3], val[4])
    return R
end

# most restrictive/specific case. Only tuples of real numbers of length 4 or 6 are allowed
build_opt_R(val::Union{NTuple{4, Real}, NTuple{6, Real}}, symb::Symbol)  = _some_name(val, symb)*"+r"
# for any array, try turning it into a tuple. If it's an acceptable kind, 
# it'll go through the method above. Otherwise it'll go to the default case.
# Worth noting this is a type unstable conversion, but this doesn't seem like it 
# needs to be performant. If it does, reverse the cases so that Vector
# is the base case, and Tuple and AbstractArray both forward to it with vec!
build_opt_R(val::AbstractArray, symb::Symbol)  = build_opt_R(Tuple(val), symb) # can be widened pretty easily to include any iterable if you want. 
build_opt_R(val::GDtype, symb::Symbol)         = _some_name(val[1].bbox, symb)
build_opt_R(val::GMTdataset, symb::Symbol)     = _some_name(val.ds_bbox, symb)
build_opt_R(val::GItype, symb::Symbol)         = _some_name(val.range[1:4], Symbol()) # I'm cheating here with symb to copy the behavior in the original

Pretty sure this does the same thing, but I haven’t run it through your test suite. Also, apologies for _some_name… it’s late
Of course, I don’t mean to suggest you should always allow all AbstractArrays of any dimension all over your code, and probably the above is way too permissive. Just an example.
It’s just as common to just force the caller to explicitly convert from a matrix (or whatever) to a vector and not bother with any of this.

joa-quim · April 29, 2022, 7:18pm

Thanks for your reply and for the work of breaking the type-testing function into multiple dispatched methods. I do use multiple dispatch in several parts of the code, but in examples like this, why is it better? In the end, the binary code will have to keep references to 9 functions instead of just 1. Apply this multiplying factor to other similar cases. Why is keeping a hash table with many more function names and their entry points across the entire package better than being more savvy and using only one function to do the same job?

tomerarnon · May 1, 2022, 7:49pm

It’s not inherently better, but it does have some advantages. For example, it is more extensible. If you want to add behavior for a new type, it’s easy to add another method. As the author of the package, it could seem trivial to just add another if-else condition, but if I am a user, I can’t modify the source code without forking the repo. However, I can easily add another method to add the behavior I want. In general, that’s what makes Julia packages much more easily composable than packages in other languages. Even as the author though, you may find it easier to add new behaviors (or adjust existing ones) with new methods rather than new branches. There is also the advantage that I only have one method above that does any real “work”. All of the other methods just forward to it. That is a common pattern, and usually makes it easier to debug.
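The extensibility point can be sketched as follows. Everything here is hypothetical and is not GMT.jl's actual API: `region_string` stands in for a package function, and `BoundingBox` for a user type the package author never anticipated.

```julia
# Stand-in for a package function: a fallback plus one method that does the work.
region_string(val) = ""                                        # fallback
region_string(val::NTuple{4, Real}) = " -R" * join(val, "/")   # the "real work"

# A user-defined type, declared outside the package.
struct BoundingBox
    west::Float64
    east::Float64
    south::Float64
    north::Float64
end

# The user adds a method that forwards to the existing one; no fork required.
region_string(b::BoundingBox) = region_string((b.west, b.east, b.south, b.north))

region_string((0, 10, -5, 5))              # " -R0/10/-5/5"
region_string(BoundingBox(0, 10, -5, 5))   # " -R0.0/10.0/-5.0/5.0"
region_string("unexpected")                # "" via the fallback
```

When extending a function that lives in another package, the method definition has to name the owning module (for example `SomePackage.region_string(b::BoundingBox) = ...`) or import the function first.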

I never think about the number of function references or anything like that, and I believe that in practice that never ends up mattering, so feel free to take my opinion with a grain of salt. However, I think this idea of a multiplying factor may be based on a misunderstanding. Your original function isn’t just one function either. When it is called with a Tuple, a special case of it is compiled for that tuple type. When it is called with a string, a special case is compiled for a String, etc. Although there is only one method defined, there will be many MethodInstances that get compiled (unless you specify @nospecialize on the arguments I suppose). So while there is only one method in the if-else approach, the number of method-instances will be exactly the same, so I don’t think there are any practical savings anyway.

ellocco · July 1, 2022, 8:42am

I would like to define a function with three methods, but I can’t get it to work.
How can I merge the three different functions
into one function with three methods?

# distinguish between vector and matrix
function MyLib_isvectorA(_val::Vector{<:Number})
    return count(>(1), size(_val)) ≤ 1
end  
  
function MyLib_isvectorB(_val::Array{<:Number})
    return count(>(1), size(_val)) ≤ 1
end  

function MyLib_isvectorC(_val::Number)
    return count(>(1), size(_val)) ≤ 1
end

albheim · July 1, 2022, 9:00am

This works for me, but I might be missing what you are actually asking for here?

julia> a(x::Vector{<:Number}) = 1
a (generic function with 1 method)

julia> a(x::Array{<:Number}) = 2
a (generic function with 2 methods)

julia> a(x::Number) = 3
a (generic function with 3 methods)

julia> a([3;])
1

julia> a([3;;])
2

julia> a(3)
3

ellocco · July 1, 2022, 9:30am

Thanks for the fast response!
How can I define MyLib_isvector() so that it accepts three input types:
a) Vector{<:Number}
b) Matrix{<:Number}
c) a scalar of any type <:Number
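One possible sketch, following the pattern shown in the previous reply: give all three definitions the same name, and Julia treats them as three methods of one generic function. The return values for the `Vector` and `Number` methods are written as `true` on the assumption that the behaviour of the original three functions should be kept, since `count(>(1), size(x)) ≤ 1` always holds for a vector and for a scalar.

```julia
# Three methods of a single generic function (name taken from the question).
MyLib_isvector(v::Vector{<:Number}) = true                       # a Vector is always 1-d
MyLib_isvector(a::Array{<:Number}) = count(>(1), size(a)) ≤ 1    # matrices and higher-dimensional arrays
MyLib_isvector(x::Number) = true                                 # size(x) == (), so the count is 0

MyLib_isvector([1, 2, 3])     # true
MyLib_isvector([1 2 3])       # true
MyLib_isvector([1 2; 3 4])    # false
MyLib_isvector(3.5)           # true
```

Because all three bodies are equivalent to the same expression, a single method with the signature `x::Union{Number, Array{<:Number}}` would also do the job.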
