How to deal with inconsistent objects due to mutation

This is a bit of a basic philosophical question about type and API design. I would love the input of more experienced coders, since I suspect I might be approaching my problem wrong.

I have a type in essence similar to:

struct Wrapper
    v::Vector{Int}
    length::Int
end

The field length should always equal length(v). This is a toy example: in the real code, length stands for several extra fields that must stay consistent with v. Moreover, the API requires that the user can access v, so I export an accessor vector(p::Wrapper) = p.v

The problem is that I cannot prevent the user from calling push!(vector(p), 1), which breaks the internal consistency of p. I see a number of options, all of which are very bad for performance. Is there a standard approach to this problem in Julia?
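A minimal sketch of the failure mode described in the question, using the toy Wrapper type and the vector accessor as given there:

```julia
struct Wrapper
    v::Vector{Int}
    length::Int
end

vector(p::Wrapper) = p.v

p = Wrapper([1, 2, 3], 3)
push!(vector(p), 1)   # mutates the same vector that p.v refers to

# The stored field no longer matches the data it describes.
consistent = p.length == length(p.v)   # false: 3 vs 4
```

Making the struct immutable does not help here: immutability fixes which vector p.v refers to, not the contents of that vector.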

My approach would be to not allow the user to access v. Or, to put it generally: do not expose mutable fields to the user if the field’s value is subject to invariants.

You can consider making Wrapper an AbstractVector{Int} and implementing a few basic mutating methods for it, making sure in each implementation that the two fields stay synchronized.
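A rough sketch of that suggestion, under some assumptions: the type is made mutable so the cached field can be updated, and the names SyncedWrapper and total are hypothetical (total stands in for whatever derived data must agree with v).

```julia
# Hypothetical example type: `total` caches sum(v) and must stay in sync with `v`.
mutable struct SyncedWrapper <: AbstractVector{Int}
    v::Vector{Int}
    total::Int
end

# Copy the input so no outside reference to the stored vector exists.
SyncedWrapper(v::Vector{Int}) = SyncedWrapper(copy(v), sum(v))

# Minimal read-only AbstractVector interface.
Base.size(w::SyncedWrapper) = size(w.v)
Base.getindex(w::SyncedWrapper, i::Int) = w.v[i]
Base.IndexStyle(::Type{SyncedWrapper}) = IndexLinear()

# Mutating methods update the vector and the cached field together.
function Base.setindex!(w::SyncedWrapper, x, i::Int)
    xi = convert(Int, x)
    w.total += xi - w.v[i]
    w.v[i] = xi
    return w
end

function Base.push!(w::SyncedWrapper, x)
    xi = convert(Int, x)
    push!(w.v, xi)
    w.total += xi
    return w
end

w = SyncedWrapper([1, 2, 3])
push!(w, 4)
w[1] = 10
```

Users then mutate through the wrapper itself (push!(w, 4), w[1] = 10), so every change passes through code that maintains the invariant. The cost is that each mutating method you want to support has to be written by hand.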

What if the user needs read access to v? Shouldn’t we have some form of ImmutableVector wrapper in Julia?

When would the user need that? Surely they can just access it through Wrapper's interface?

Another alternative could be to create a wrapper for v for which push! and other resizing operations are not allowed, such as:

julia> struct MyVec{T} <: AbstractVector{T}
           v::Vector{T}
           length::Int
       end

julia> Base.length(v::MyVec) = v.length

julia> Base.size(v::MyVec) = (v.length,)

julia> Base.getindex(v::MyVec, i::Int) = getindex(v.v, i)

julia> Base.push!(v::MyVec, val) = error("Cannot push to v")

julia> v = MyVec{Int}([1,2,3], 3)
3-element MyVec{Int64}:
 1
 2
 3

julia> push!(v,1)
ERROR: Cannot push to v
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] push!(v::MyVec{Int64}, val::Int64)
   @ Main ./REPL[5]:1
 [3] top-level scope
   @ REPL[18]:1


What if the user needs read access to v? Shouldn’t we have some form of ImmutableVector wrapper in Julia?

If the vectors aren’t too large you could use StaticArrays.jl instead.

Thanks everyone! Ok, so the consensus seems to be to not provide direct API access to any mutable field.
I’d like to note that other standard types, e.g. SparseMatrixCSC, have the same problem. They do provide direct API access to mutable fields (nonzeros for example), and tinkering with them can wreck your SparseMatrix object. It seems to me like it is a genuine problem that we don’t have a very good solution for, would you agree?

EDIT: in other words, it seems to me Julia has more mechanisms for performance than safety

I’m not sure it’s a problem, really, as long as there is a social convention that the fields of a struct are private and should not be changed or relied on, unless explicitly documented.

It’s nice that users can mess with internal types if they want to and are willing to bear the risk. It allows extension of other people’s types.

Edit: But yes, I agree, Julia’s approach of telling people to not mess with internal fields as opposed to forcing people by actually making them inaccessible does prioritize performance (and extensibility) over safety. I think it’s nice still.

How do “safe-by-design” languages deal with this? I am familiar with the functional approach that forbids in-place mutations of anything, but that can be unrealistic for performance. Are there not other approaches whereby one can lock a field (make it immutable to the user) somehow?

(Perhaps what I need is to write such a Lock wrapper for my case…)
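One way such a wrapper might look, as a sketch only: ReadOnlyVector is a hypothetical name, not a type from Base or any package. It forwards reads to the wrapped vector without copying and simply does not define setindex! or any resizing method.

```julia
# Hypothetical read-only view: reads are forwarded, no write methods are defined.
struct ReadOnlyVector{T,V<:AbstractVector{T}} <: AbstractVector{T}
    parent::V
end

Base.size(r::ReadOnlyVector) = size(r.parent)
Base.getindex(r::ReadOnlyVector, i::Int) = r.parent[i]
Base.IndexStyle(::Type{<:ReadOnlyVector}) = IndexLinear()

data = [1, 2, 3]
r = ReadOnlyVector(data)   # shares memory with `data`, no copy

second = r[2]
total = sum(r)             # generic AbstractVector code works on the view

# Writes and resizes through the view fail because no method was defined.
write_failed = try
    r[1] = 0
    false
catch
    true
end

push_failed = try
    push!(r, 4)
    false
catch
    true
end
```

An accessor such as vector(p) could then return ReadOnlyVector(p.v) instead of p.v. This only blocks accidental mutation through the view: the parent field is still reachable, and whoever holds the original vector can still change it.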

Note that there’s no way to prevent a sufficiently motivated user from mutating the internals of any object. You can try making it harder (e.g. with the push!/setindex! overloads suggested above), but ultimately you just need to properly document what users are allowed to do.
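To illustrate the point, here is a small sketch (Sealed is a made-up type) that blocks ordinary field access by overloading getproperty. The built-in getfield still bypasses it:

```julia
# Hypothetical type that tries to hide its field from `s.v` syntax.
struct Sealed
    v::Vector{Int}
end

Base.getproperty(::Sealed, name::Symbol) = error("field $name is private")

s = Sealed([1, 2, 3])

# Ordinary dot access is now rejected...
dot_access_failed = try
    s.v
    false
catch
    true
end

# ...but getfield ignores the getproperty overload and returns the raw vector.
raw = getfield(s, :v)
push!(raw, 4)
```

So overloads like this raise the bar and signal intent, but they are not a security boundary.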

In some languages you have to explicitly mark types and/or fields as public, otherwise they will not be visible outside the module they are defined in.

If a field should be immutable, then you should use an immutable type for that field.

I think this is the right answer for my specific case. I want to use an AbstractVector type that (1) is immutable and (2) does not need to know its length in advance (unlike StaticArrays). We don’t have such an ImmutableArray type in Base, right?

I assume the main goal is to avoid allocating multiple copies of the data. That’s why you want to mutate the array.

You could do this:

using StaticArrays

struct MyVector{N}
    v::SizedVector{N}
    n::Int
end

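# Extend Base.push! instead of defining a new, unrelated push! function.
# This returns a new MyVector{N+1}; the old wrapper `m` must not be used afterwards.
Base.push!(m::MyVector{N}, x) where {N} =
    MyVector(SizedVector{N+1}(push!(m.v.data, x)), N+1)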

The SizedVector type is protected from resizing by its implementation.

EDIT: Of course this is a bit absurd. You would probably also have an inner constructor and probably not have n as a field at all…

You are right, there are no immutable arrays in Base yet. See WIP: ImmutableArrays (using EA in Base) by ianatol · Pull Request #44381 · JuliaLang/julia · GitHub. There was some hope it would be in Julia 1.8, but it’s been pushed. Hopefully in 1.9 or 1.10, so sometime in 2022.

This immutable array business is quite tantalizing, but also a bit mysterious. It could be good for parallelism, automatic differentiation, safety, ‘reasoning about’ stuff, etc. as far as I’ve heard.

But how does it actually work? How can it be efficient, and how could it impact the Julia ecosystem? Has anyone mused about any of these questions anywhere?

Honestly, I would go in the completely opposite direction and not store the other fields at all. Instead, compute them on the fly from the vector, so they can never disagree with it. But maybe this only works for the toy example.
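For the toy example, that idea could be sketched as follows. LazyWrapper is a hypothetical name, and length is exposed as a computed property rather than a stored field:

```julia
# Hypothetical variant of the toy type: only the vector is stored.
struct LazyWrapper
    v::Vector{Int}
end

# `p.length` is computed from the vector on every access, so it cannot go stale.
function Base.getproperty(p::LazyWrapper, name::Symbol)
    name === :length ? length(getfield(p, :v)) : getfield(p, name)
end
Base.propertynames(::LazyWrapper) = (:v, :length)

p = LazyWrapper([1, 2, 3])
before = p.length
push!(p.v, 4)        # direct mutation by the user is now harmless
after = p.length
```

This trades storage for recomputation, so it only pays off when the derived quantities are cheap to compute.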

My 2 cents: you can’t have it all (i.e. allowing access to internal data and keeping consistency).

If you’re not keen on compiling per vector size, then you need your wrapper to hold a pointer to something on the heap. For pointing to an immutable vector of variable size, you could annotate v with an abstract type like v::NTuple{N, Int} where N or v::SVector{N,Int} where N (from StaticArrays.jl).

If keeping the length the same is all you really need, you could also make v an N×1 Matrix. That gets around the push!/insert!/deleteat!/pop! methods, which are defined for vectors but not for matrices.
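A quick sketch of the N×1 Matrix idea, using only Base:

```julia
# Store the data as a 3×1 Matrix instead of a Vector.
m = reshape([1, 2, 3], :, 1)

m[2, 1] = 20          # element mutation still works
dims = size(m)        # (3, 1)

# Resizing does not: push! has no method for a Matrix.
push_failed = try
    push!(m, 4)
    false
catch err
    err isa MethodError
end
```

Note that this only protects the length. Element values can still be overwritten, so it helps only when the invariants depend on the size alone.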

Thanks for all the suggestions! I think that, until we have ImmutableArrays, I will follow what I feel is the standard Julian approach, i.e. to put an emphasis on performance, and place on the user the responsibility of not messing with object properties in undocumented ways. Computing (instead of storing) the invariants is not an option in my case. And as @pfitzseb mentioned, any attempt at ensuring consistency is never going to be bulletproof without explicit language support.