# Extract an `AbstractVector{T}` from an `AbstractVector{Union{T, Missing}}`

I know that the elements at a certain subset of the indices of an `AbstractVector{Union{T, Missing}}` are not missing. I want to efficiently manipulate that subset.

How can I extract `x::AbstractVector{T}` from `y::AbstractVector{Union{T, Missing}}` such that `x[i] === y[i]` whenever `y[i] !== missing`, with minimal runtime overhead both at creation and at use?

I am okay with undefined behavior if I attempt to access a missing element.

You’ll probably have something like

``````
using BenchmarkTools

# Copy the entries at indices `is` from `avecm` into the preallocated `avec`.
function extract!(avec::AbstractVector{T}, avecm::AbstractVector{Union{T, Missing}}, is) where {T}
    for i in is
        avec[i] = avecm[i]
    end
    return avec
end

# Allocating variant; entries of the result outside `is` are left uninitialized.
function extract(avecm::AbstractVector{Union{T, Missing}}, is)::AbstractVector{T} where {T}
    avec = similar(avecm, T, length(avecm))
    return extract!(avec, avecm, is)
end

@btime extract!(vec, vecm, is) setup=(
    vecm = Vector{Union{Int, Missing}}(undef, 10);
    is = [1, 3, 5, 7, 9];
    vecm[is] .= is;
    vec = similar(vecm, Int, length(vecm));
)

@btime extract(vecm, is) setup=(
    vecm = Vector{Union{Int, Missing}}(undef, 10);
    is = [1, 3, 5, 7, 9];
    vecm[is] .= is;
)
``````

as a baseline already, yielding

``````
  8.000 ns (0 allocations: 0 bytes)
  34.340 ns (1 allocation: 144 bytes)
``````

This is nice, but I’m looking for something substantially faster. I would like to avoid allocating and copying data altogether, so that the runtime is constant with respect to input length. I think this should be possible, at least for `Vector{Union{T, Missing}}` with an isbits `T`, because such a vector is stored internally as a block of data plus a block of type-tag bytes (one per element) recording whether each element is missing. I just want to get a handle on the internal data block.

I’m looking for something like `x = reinterpret(T, y)`.
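Whether such an internal data block exists at all depends on the element type. As a rough check, and assuming the internal, unexported helper `Base.isbitsunion` (which may change between Julia versions), one can ask whether a union is stored inline. The name `has_inline_data` below is hypothetical and used only for illustration:

```julia
# Hypothetical helper: true when elements of the union type `E` are stored
# inline (data block plus type tags) rather than as boxed pointers.
# `Base.isbitsunion` is internal to Base, so treat this as an assumption.
has_inline_data(E::Type) = Base.isbitsunion(E)

inline_int = has_inline_data(Union{Int, Missing})     # Int is an isbits type
inline_str = has_inline_data(Union{String, Missing})  # String is not isbits

println(inline_int)
println(inline_str)
```

Only in the inline case is there a contiguous block of `T` values that a reinterpret-style trick could alias.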

For `Array`s, I think this might be safe:

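``````
using BenchmarkTools

# Alias the data block of `avecm` as an `Array{T}` without copying.
# Assumes `T` is an isbits type; the result does not keep `avecm` alive.
without_missing(avecm::Array{Union{T, Missing}}) where {T} =
    unsafe_wrap(Array, reinterpret(Ptr{T}, pointer(avecm)), size(avecm))

@btime without_missing(v) setup=(
    v = Vector{Union{Int, Missing}}(undef, 100000);
    is = 1:90000;
    v[is] .= is;
);
# 35.259 ns (2 allocations: 64 bytes)
``````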
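One caveat on "safe": an array created by `unsafe_wrap` does not keep the original vector alive, so the original has to stay rooted for as long as the alias is in use. A minimal sketch of one way to scope that, assuming an isbits `T`; the helper name `with_data` is hypothetical:

```julia
# Hypothetical helper: call `f` on a zero-copy Vector{T} alias of `y`,
# keeping `y` rooted for the duration of the call via GC.@preserve.
function with_data(f, y::Array{Union{T, Missing}}) where {T}
    # Assumption: only isbits element types have an inline data block to alias.
    isbitstype(T) || throw(ArgumentError("element type $T is not isbits"))
    GC.@preserve y begin
        x = unsafe_wrap(Array, reinterpret(Ptr{T}, pointer(y)), size(y))
        return f(x)
    end
end

y = Vector{Union{Int, Missing}}(undef, 4)
y[1:3] .= 1:3
y[4] = missing

# Only indices known to be non-missing are read through the alias.
total = with_data(x -> x[1] + x[2] + x[3], y)
println(total)
```

The alias must not escape the closure. Writing through it changes the data block but not the type tags, so an element that was `missing` in `y` stays `missing` there.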

But for general `AbstractArray`s, I still don’t know.
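For the general `AbstractVector` case, one possible direction is a lazy wrapper instead of pointer tricks. The sketch below is only an illustration: `NonMissingView` is a hypothetical type, not something in Base or a package. It costs nothing to create beyond the wrapper itself, but each read still pays for a type assertion, and reading a missing element throws a `TypeError` rather than being undefined behavior.

```julia
# Hypothetical wrapper: presents a vector with eltype Union{T, Missing}
# as an AbstractVector{T} without copying any data.
struct NonMissingView{T, A<:AbstractVector} <: AbstractVector{T}
    parent::A
end

# `nonmissingtype` (exported from Base) strips `Missing` from the eltype.
NonMissingView(y::AbstractVector) =
    NonMissingView{nonmissingtype(eltype(y)), typeof(y)}(y)

Base.size(v::NonMissingView) = size(v.parent)

# Reads assert the element is a `T`; a missing element raises a TypeError.
Base.@propagate_inbounds Base.getindex(v::NonMissingView{T}, i::Int) where {T} =
    v.parent[i]::T

# Writes go straight through to the wrapped vector.
Base.@propagate_inbounds function Base.setindex!(v::NonMissingView, val, i::Int)
    v.parent[i] = val
    return v
end

y = Union{Int, Missing}[1, missing, 3]
x = NonMissingView(y)
println(x[1] + x[3])
```

Because it relies only on `size`, `getindex`, and `setindex!` of the wrapped vector, this should work for any `AbstractVector`, at the price of the per-access check.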
