Type assertion?

I’m playing around with a function where the argument should be a vector of same-sized vectors, i.e.,

v = [v1,v2,v3,...,vN]

where v1, …, vN are vectors of scalar elements, all of the same size.

How can I test for this in the function call? I tried with:

function myfunc(v::Vector{Vector{Number}})
...
end

but that didn’t work. [And it wouldn’t even have checked that the elements of v were vectors of the same size…]

Questions:

  • Is there a way to specify the structure of the argument I expect?
  • Would it be better to just forget about asserting the argument type, let misuse of the function lead to an error message, and “force” the user (i.e., myself) to re-read the documentation?

It sounds like you should delegate this “check” to the dispatch system by e.g. creating a type with your desired property (equally sized subarrays) and forcing the user to use those. The consistency check can be done inside the constructors of the type.


Vectors don’t have their sizes as part of their types so you cannot type annotate this.
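As for why the Vector{Vector{Number}} annotation did not match: parametric types in Julia are invariant, so a Vector{Float64} is not a Vector{Number}. A minimal sketch of a signature that does accept vectors of numeric vectors follows; the function name inner_count is made up for illustration, and the signature still says nothing about the inner lengths.

```julia
# Parametric types are invariant: Float64 <: Number does not imply
# Vector{Float64} <: Vector{Number}.
invariant = Vector{Float64} <: Vector{Number}      # false
covariant = Vector{Float64} <: Vector{<:Number}    # true

# Hypothetical function (the name is an assumption) that accepts any
# vector of vectors of numbers. It cannot express "same inner length".
inner_count(v::AbstractVector{<:AbstractVector{<:Number}}) = length(v)

n_equal   = inner_count([[1.0, 2.0], [3.0, 4.0]])  # matches
n_unequal = inner_count([[1, 2], [3, 4, 5]])       # also matches; lengths differ
```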

In some cases you could take a Matrix instead of a Vector, since all the columns in a Matrix have the same number of rows, but that might not be applicable in your case.
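A rough sketch of the Matrix route, assuming the data starts out as a vector of vectors; the names samples, ncomponents, and series are made up for illustration.

```julia
samples = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]

# Stack the inner vectors as columns; this throws if their lengths differ.
M = reduce(hcat, samples)          # 2×3 Matrix{Float64}

# Hypothetical function taking the matrix form. Equal column lengths are
# guaranteed by the Matrix itself, so no extra check is needed.
ncomponents(M::AbstractMatrix{<:Number}) = size(M, 1)

nc = ncomponents(M)

# Each row of M holds one component across all samples.
series = [collect(row) for row in eachrow(M)]
```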


What I meant is a type which keeps e.g. a flat array with all the values and the number of elements N in their fields.
A constructor for this type would then look something like Vector(vectors...) = ..., which checks the lengths for equality and sets N and the concatenated values.
Finally, a getindex method is defined, which does the magic slicing behind the scenes and returns the sub-arrays.

It’s just a rough idea… :wink:

edit: just realised I was doing too much with awkward arrays in Python today, just storing the Vector of Vectors is perfectly fine. In the constructor then the mentioned check… then use that as type in the function argument.
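One way this idea could look, as a sketch only: the type name EqualLengthVectors and the helper inner_length are assumptions, not anything from an existing package.

```julia
# Hypothetical wrapper type; the name is an assumption for illustration.
struct EqualLengthVectors{T<:Number}
    data::Vector{Vector{T}}
    # Inner constructor: refuse to build the object if the lengths differ.
    function EqualLengthVectors(data::Vector{Vector{T}}) where {T<:Number}
        isempty(data) || all(x -> length(x) == length(first(data)), data) ||
            throw(ArgumentError("all inner vectors must have the same length"))
        return new{T}(data)
    end
end

# Functions can dispatch on the wrapper and rely on the invariant.
inner_length(v::EqualLengthVectors) = isempty(v.data) ? 0 : length(first(v.data))

ok = EqualLengthVectors([[1, 2], [3, 4]])
n = inner_length(ok)
# EqualLengthVectors([[1, 2], [3]]) would throw an ArgumentError.
```

The check runs once, at construction, so any function that takes an EqualLengthVectors argument does not need to repeat it.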

Thanks! I will think it over tomorrow… I haven’t tried to define my own types yet.

Defining your own types is one of the most fun things about this language :wink:


What I am looking into is just a convenience function for converting back and forth between sample-based data and time series data, i.e.,

$$
\begin{gathered}
[[y_1(t_1),\ldots,y_{n_y}(t_1)],\ldots,[y_1(t_N),\ldots,y_{n_y}(t_N)]] \\
\updownarrow \\
[[y_1(t_1),\ldots,y_1(t_N)],\ldots,[y_{n_y}(t_1),\ldots,y_{n_y}(t_N)]]
\end{gathered}
$$

The following function does it:

function vec2vec(v)
    return [getindex.(v,i) for i in 1:lastindex(v[1])]
end

OK – not an earth shattering function; it is so simple that perhaps it is not even worth making a function out of it. But it is convenient: e.g., if sol is the solution of a differential equation solver, vec2vec(sol[:]) would transform the solution to something that is easier to plot — for the cases where I don’t want to use the built-in interpolation.
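A small worked example of the conversion, using made-up data (three samples of two components); applying the function twice returns the original layout.

```julia
# Same definition as above, repeated so the example is self-contained.
function vec2vec(v)
    return [getindex.(v, i) for i in 1:lastindex(v[1])]
end

# Sample-based: one inner vector per time point (illustrative values).
samples = [[1, 10], [2, 20], [3, 30]]

# Time-series-based: one inner vector per component.
series = vec2vec(samples)       # [[1, 2, 3], [10, 20, 30]]

# Converting back recovers the sample-based layout.
roundtrip = vec2vec(series)
```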

So, this function only works (properly) if the elements of the vector are vectors of equal size… so I was thinking of adding type assertion to make sure that v (argument of vec2vec) had the right type before I start to operate on the data.

You can create a dead simple function and an assertion which shows a meaningful message to the user. Here is an example with @assert

equal_lengths(v) = all(x->length(x)==length(first(v)), v[2:end])

function vec2vec(v)
    @assert equal_lengths(v)
    return [getindex.(v,i) for i in 1:lastindex(v[1])]
end

and here instead of @assert an error:

function vec2vec(v)
    equal_lengths(v) || error("The vectors must have equal lengths.")
    return [getindex.(v,i) for i in 1:lastindex(v[1])]
end

I’m trying to understand your equal_lengths() function… (since I haven’t used it before):

An equivalent function would be the following, right?

equal_lengths(v) = all( [ length(x) == length(first(v)) for x in v[2:end] ] )

Your version appears to be a little more efficient according to BenchmarkTools. Why?

Also, is there an advantage in using first(v) over v[1]?

Your version is allocating a new array unnecessarily.

Some custom arrays might not start with 1.
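To make the allocation point concrete, here is a sketch of three ways to write the same check on an illustrative v. Only the first one builds a temporary array.

```julia
v = [zeros(3) for _ in 1:4]

# Array comprehension: builds a temporary Vector{Bool}, then reduces it.
a = all([length(x) == length(first(v)) for x in v])

# Generator (no square brackets): elements are tested lazily, nothing is stored.
b = all(length(x) == length(first(v)) for x in v)

# Predicate form: like the generator, it stops at the first `false`.
c = all(x -> length(x) == length(first(v)), v)
```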


Ah, ok…

  • My version actually creates a vector of length length(v)-1 filled with Boolean values and, once it is created, checks whether all values are true, whereas tamasgal’s version does not create this vector but instead checks iteratively?
  • OK – I get that some custom arrays may not start at 1. But if they don’t start at 1, does it then make sense to test against v[2:end]? What if the array starts at 0… won’t I then miss testing against v[1]?

Yes, you probably would. From Julia 1.4 on, you can use something like

all(x->length(x)==length(first(v)), v[(begin + 1):end])

For versions before that, consider firstindex(v) + 1.


Correct me if I am wrong, but the benefits of starting from begin + 1 are outweighed by the fact that v[begin + 1:end] allocates. So one should either use @view or just test all the vectors (OK, you get one extra comparison that is trivially true; what’s wrong with that?)

julia> using BenchmarkTools

julia> v = [zeros(100) for _ in 1:200];
julia> @btime all(x->length(x)==length(first($v)), $v[2:end])
357.779 ns (1 allocation: 1.77 KiB)

julia> @btime all(x->length(x)==length(first($v)), $v)
137.869 ns (0 allocations: 0 bytes)

julia> @btime all(x->length(x)==length(first($v)), @view($v[2:end]))
171.277 ns (0 allocations: 0 bytes)

Views and indexing are (should be) orthogonal, in the sense that you should be able to modify the example above by adding a @view. That said, it looks like @view was not updated for #33946.

Incidentally, I think that introducing micro-optimizations like this, especially in First steps, may just be derailing the topic for no good reason. Also, 2:end is broken for generalized indexing, as noted by @BLI. If you really want views at this point, just use

view(v, (firstindex(v) + 1):lastindex(v))

or something equivalent.
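Put together, a version of the check that makes no assumption about the first index might look like the sketch below. The early return for empty input is an added assumption about the desired behaviour.

```julia
# Sketch of an index-agnostic check; treating an empty `v` as valid is an assumption.
function equal_lengths(v)
    isempty(v) && return true
    n = length(first(v))
    # Non-allocating view of everything after the first element.
    rest = view(v, (firstindex(v) + 1):lastindex(v))
    return all(x -> length(x) == n, rest)
end

same  = equal_lengths([[1, 2], [3, 4]])
mixed = equal_lengths([[1, 2], [3]])
```

Iterating over all of v instead of the view would also work; it costs one comparison of the first element with itself.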

I’ve noticed that some of you use the construct $v instead of v sometimes. It seems to have a big impact:

julia> @btime all(x->length(x)==length(first($v)), v)
  128.941 ns (1 allocation: 16 bytes)
true

julia> @btime all(x->length(x)==length(first($v)), $v)
  7.500 ns (0 allocations: 0 bytes)

[I used a different `v` vector than you], but

  • What does the $ syntax actually mean here (not interpolation, I guess…)?, and
  • When should it be used and when not?

https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/doc/manual.md#interpolating-values-into-benchmark-expressions


The reference talks about $ “interpolating values into benchmark expressions”.

  • Does this imply that this interpolation is only used within the context of benchmarking, and that for regular code in actual use (e.g., when not prefaced by a macro such as @btime), it should not be used?
  • In other words, this interpolation is only inserted in order to get a best possible idea of the performance?

No. Interpolation is a syntactic feature of Julia, used for various things. Here the BenchmarkTools macros kind of hijack use of it for a specific purpose to provide a compact syntax for something that is useful.

Why it is needed is explained in the link above, the syntax is orthogonal to that; it’s just a syntax and has nothing more to do with interpolation in Julia for other purposes. You could design your own macros that pick out a $ and do something else to it.
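A small sketch of both points, using only Base Julia. The helper contains_dollar and the macro @has_dollar are invented for illustration and have nothing to do with how BenchmarkTools is implemented.

```julia
# 1. `$` inside a quoted expression splices in a value.
x = 3
ex_symbol = :(1 + x)     # keeps the symbol `x`
ex_value  = :(1 + $x)    # contains the literal 3
result = eval(ex_value)

# 2. A macro receives `$` nodes unprocessed and may treat them however it likes.
# Hypothetical helper: does an expression contain a `$` node anywhere?
contains_dollar(ex) = ex isa Expr && (ex.head === :$ || any(contains_dollar, ex.args))

# Hypothetical macro that only reports whether its argument used `$`.
macro has_dollar(ex)
    return contains_dollar(ex)
end

with_dollar    = @has_dollar f($v)   # `f` and `v` are never evaluated
without_dollar = @has_dollar f(v)
```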


If each individual vector in your vector is reasonably small, then this is a perfect situation to use “static vectors” from the StaticArrays.jl package. “Static” here means that the length of the vector is encoded in the type:

julia> using StaticArrays

julia> v1 = SVector(3, 4)
2-element SArray{Tuple{2},Int64,1,2} with indices SOneTo(2):
 3
 4

julia> v2 = SVector(5, 6)
2-element SArray{Tuple{2},Int64,1,2} with indices SOneTo(2):
 5
 6

julia> v = [v1, v2]
2-element Array{SArray{Tuple{2},Int64,1,2},1}:
 [3, 4]
 [5, 6]

julia> f(v::Vector{SVector{N,T}}) where {N,T} = N
f (generic function with 1 method)

julia> f(v)
2

“this interpolation” was meant to refer to the use of $ within benchmarking. I know interpolation is a standard feature for strings.

In any case, when I ran the command without the BenchmarkTools.jl macro (@btime), it threw an error. This means that this use of interpolation does not work outside of the BenchmarkTools.jl macros; e.g.,

@time all(x->length(x)==length(first($v)), $v)

fails.

So when I create a function for actual use (and not for benchmarking), I need to remove the interpolation symbol in the code.