Mutable struct vs closure

Possibly, but Julia does not offer what you call traditional encapsulation, a mechanism designed to explicitly hide state (or make it difficult to access).

Instead, I try to stick to the following implicit rules:

  1. only methods of the same module should modify fields of a structure,

  2. reading fields is borderline: if it needs to be done frequently, it should go through an accessor, or direct field access should be documented as the recommended API.
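
As a rough illustration of these two rules, here is a minimal sketch. The module name `Counters`, the type `Counter`, and the functions `increment!` and `value` are hypothetical names made up for this example; nothing in the language enforces the convention.

```julia
module Counters

export Counter, increment!, value

mutable struct Counter
    count::Int
end

"Increase the counter by one and return the new value."
function increment!(c::Counter)
    # Rule 1: only code inside this module writes to the field.
    c.count += 1
    return c.count
end

"Return the current value. This accessor is the documented way to read it."
value(c::Counter) = c.count

end # module

using .Counters

c = Counter(0)
increment!(c)
current = value(c)   # rule 2: read through the accessor, not `c.count`
```

Callers can still write `c.count = 100` from outside the module; the sketch only makes the intended API obvious.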

If you are really paranoid, you could redefine

Base.getproperty(::MyObject, ::Symbol) = error("no access to fields")

or similar. But you always have Base.getfield to circumvent this. There is really no way to protect people if they want to do something stupid.
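
To make that concrete, here is a small sketch. `MyObject` with a single field `secret` is an assumed definition for illustration; the thread does not say what the type contains.

```julia
# Hypothetical type for illustration; `secret` is an assumed field name.
struct MyObject
    secret::Int
end

# Block the `obj.field` syntax for this type.
Base.getproperty(::MyObject, ::Symbol) = error("no access to fields")

obj = MyObject(42)

# `obj.secret` now throws instead of returning the field.
blocked = try
    obj.secret
    false
catch err
    err isa ErrorException
end

# `getfield` does not go through `getproperty`, so it still works.
leaked = getfield(obj, :secret)
```

Note that methods inside the module also lose the dot syntax for this type and have to use `getfield` themselves.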

Very interesting question!
I often encapsulate my states like the following:

function _add(x, c)
    new_x = x + c
    # Return a closure that remembers new_x, together with the new value.
    return (new_c -> _add(new_x, new_c)), new_x
end
function init_add(x)
    return c -> _add(x, c)
end
add = init_add(2)
add, a = add(3) # a == 5
add, a = add(4) # a == 9

Every _add function prepares and returns a new _add function with the current states initialized.
This does not have the problem of Core.Box, does it?
And the good thing is that the user does not have to care about the variables inside.
What do you think of this approach?
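
One way to check the Core.Box question is to look at the field types of the closure objects, since a closure is a struct whose fields are its captured variables. The sketch below compares three variants; `counter_boxed` and `counter_ref` are hypothetical names added for comparison. Boxing is an implementation detail of the current lowering, so treat the result as an observation, not a guarantee.

```julia
# Reassigning a captured variable: this is the case where the
# lowering is known to wrap the variable in a Core.Box.
function counter_boxed()
    n = 0
    return () -> (n += 1)
end

# Mutating through a Ref: the captured binding is never reassigned.
function counter_ref()
    n = Ref(0)
    return () -> (n[] += 1)
end

# Functional style from the post above: every call returns a fresh closure.
function _add(x, c)
    new_x = x + c
    return (new_c -> _add(new_x, new_c)), new_x
end

boxed = counter_boxed()
with_ref = counter_ref()
add, _ = _add(2, 3)

# Inspect what each closure actually stores.
boxed_fields = fieldtypes(typeof(boxed))
ref_fields = fieldtypes(typeof(with_ref))
add_fields = fieldtypes(typeof(add))
```

On the Julia 1.x versions I am aware of, only the first variant should report `Core.Box`; the functional `_add` style captures a plain `Int` because `new_x` is assigned exactly once.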

It's a good idea. And it would be nice for it to be enforced by the language syntax, I think.

Hm... an interesting approach. Could you tell me what this approach is called?

Anyway, it seems odd to me that every time we call add(), it returns a new one and effectively discards the old one...

Just like for loops are not first class?

The lowering doesn't define the language. It is an implementation of the language.

You may be interested in the discussion here:

https://github.com/JuliaLang/julia/issues/12064

but this is probably something a linter could also do.

I made this approach up myself. I don't know if it has a name. It works fairly well in my implementations, e.g.:

time_update = init_kalman_filter(x, P)
measurement_update = time_update(F, Q)
time_update, x, P = measurement_update(H, R)
#...

It works like a charm.

Another trick I find amusing to make it harder to access fields is using field names that start with the comment character "#". In your case this is how I'd do it:

julia> @eval mutable struct Counter
           $(Symbol("#count"))::Int
       end

julia> @eval function add1!(x::Counter)
           x.$(Symbol("#count")) += 1
       end
add1! (generic function with 1 method)

julia> @eval function minus1!(x::Counter)
           x.$(Symbol("#count")) -= 1
       end
minus1! (generic function with 1 method)

julia> @eval Base.getindex(x::Counter) = x.$(Symbol("#count"))

julia> c = Counter(3)
Counter(3)

julia> add1!(c)
4

julia> c[]
4

julia> minus1!(c)
3

julia> c[]
3

julia> c.#count = 2


ERROR: syntax: incomplete: premature end of input

julia> c[] = 2
ERROR: MethodError: no method matching setindex!(::Counter, ::Int64)
Stacktrace:
 [1] top-level scope at none:0

That's all rather convoluted just to make a piece of data private, and you can still access it anyway:

julia> getfield(c,Symbol("#count"))
3

It sounds like what people really want is the ability to make something private in a certain scope.

For example, it would be great to declare a private variable in the module instead of defining a struct or closure for it.

Yes, in Julia, at the moment, there is no such thing as an inaccessible field. This is just one more trick among others to keep the way of changing the struct's value out of the public API.

So it is really just a matter of making it harder to accidentally change the field. If one is savvy enough to find out how to do that, then, supposedly, one knows what one is doing...

It seems even C++ class private members can be accessed, if someone really wants to.

Yes. Marking something as "private" in languages like Julia or C++ is just a documentation hint for your downstream users. Dedicated users will point a disassembler at your binary and just poke at the memory. Then, ten years down the line, some poor sod will be stuck supporting backwards compatibility for these hacks.

If you want to enforce encapsulation, then you must run on a virtual machine that does this for you, like the JVM. There is a plethora of languages targeting the JVM besides Java.

Julia targets the hardware, without an intermediate enforcement layer. Since dedicated users will access your internal fields anyway, what is the additional gain above clearly documenting that the field is internal, both by actual docs and by using suggestive names?

I'm not sure what the OP had in mind, but this reminds me of Let Over Lambda by Doug Hoyte. One of the patterns advertised in the book is called "let over two lambdas", and it illustrates the parallel between closures and objects. It looks like this (in Common Lisp):

(let ((counter 0))
  (values
    (lambda () (incf counter))
    (lambda () (decf counter))))
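
For readers who do not read Lisp, a rough Julia equivalent might look like the sketch below. `make_counter` is a hypothetical name, and storing the state in a `Ref` instead of a reassigned local variable is a choice made for this sketch, not something taken from the book.

```julia
# Two closures sharing one private piece of state, as in the Lisp snippet.
function make_counter()
    counter = Ref(0)   # shared state, reachable only through the closures
    increment = () -> (counter[] += 1)
    decrement = () -> (counter[] -= 1)
    return increment, decrement
end

increment, decrement = make_counter()
a = increment()   # counter goes 0 -> 1
b = increment()   # counter goes 1 -> 2
c = decrement()   # counter goes 2 -> 1
```

As with struct fields, the state is not truly hidden: `getfield(increment, :counter)` would still reach it.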

It depends what you want to do. If you just want to update in place, as Simon said, do

import Base: push!

push!(x::Composite, i) = push!(x.values, i)

(You will also need to update smallest and largest.)

Note that even though Composite is an immutable struct, mutable objects inside the struct are still mutable.

Also, if you make a "copy constructor" as

julia> Composite(x::Composite) = Composite(x.values)

then you do not allocate the vector, but rather reuse it:

julia> push!(x.values, 10)
3-element Array{Int64,1}:
  1
  2
 10

julia> y
Composite(1, 2, [1, 2, 10])

i.e. y has also changed.

If you don't want this then indeed you have to allocate a new vector (e.g. with copy).
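
A self-contained sketch of the difference follows. The definition of `Composite` is an assumption: the field names `smallest`, `largest` and `values` are inferred from the printed output `Composite(1, 2, [1, 2, 10])`, and the one-argument constructor is hypothetical.

```julia
# Assumed layout, inferred from the printed output in the post above.
struct Composite
    smallest::Int
    largest::Int
    values::Vector{Int}
end

# Hypothetical convenience constructor that derives the summary fields.
Composite(values::Vector{Int}) = Composite(minimum(values), maximum(values), values)

x = Composite([1, 2])
shared = Composite(x.values)             # reuses the same vector
independent = Composite(copy(x.values))  # allocates its own vector

# Mutating the vector inside an immutable struct is allowed.
push!(x.values, 10)

aliased = shared.values === x.values            # same object in memory
shared_length = length(shared.values)           # sees the pushed element
independent_length = length(independent.values) # unaffected by the push
stale_largest = x.largest                       # summary field was not updated
```

The last line shows why `smallest` and `largest` need separate handling: pushing into `values` leaves them out of date, and an immutable struct cannot simply overwrite them.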

Can someone explain to me what kind of safety is actually at issue? Is the concern malicious code, or accidentally doing something you didn't mean to do (or a fear users will accidentally do something unintended)?

If the latter, it seems like @Tamas_Papp's point about a clear API is ideal. It took me a while to figure out when I first started with Julia, but it's been a long time since I accessed a field directly (I did it because the accessor didn't exist, and the package maintainers responded to an issue and added one).

I guess the usual concern is a large team of programmers, where some members could do something quick & dirty by exposing internals, which could then lead to a bug. Internal style guides and code review protect against this to some extent, and as others have pointed out, there is no protection against a sufficiently determined individual who is bent on shooting themselves in the foot.

But some languages, eg C++, offer facilities for this that at least raise the cost of accessing internals, and some people miss them in Julia.

I agree that encapsulation vs flexibility is an important discussion. Yes, you can get around encapsulation in many other languages even when they were designed for it. In Java, for example, you can access and modify private fields using reflection. I think this argument misses the point a bit: IMO the point isn't that it's possible, but that it's so easy to do in Julia, and that it seems tolerated, or even encouraged. I doubt you'll find anyone seriously suggesting a workaround based on reflection to access private members on a Java forum, and any such code would likely not pass code review. But I've lost count of how many posts I've seen on this forum suggesting this, or packages doing it. Sometimes they are accompanied by warnings, but I'm not sure how much that helps.

A recent example is this thread where a new user asked for help on indexing, and was provided one "proper" solution, and one shorter solution using internal methods. No one objected. As we can see in a later post, the approach using internal methods was chosen as a solution.

Some more examples here, here, here, here, here, here, here.

As for risks/safety, it's not about malicious code IMO, but very well-intentioned code that becomes unmanageable over time as you reach a large code base with many authors. Joshua Bloch (author of Effective Java, and many core Java features) writes the following:

"Minimize the accessibility of classes and members"

The single most important factor that distinguishes a well-designed component from a poorly designed one is the degree to which the component hides its internal data and other implementation details from other components. A well-designed component hides all its implementation details, cleanly separating its API from its implementation. Components then communicate only through their APIs and are oblivious to each others' inner workings. This concept, known as information hiding or encapsulation, is a fundamental tenet of software design.

Information hiding is important for many reasons, most of which stem from the fact that it decouples the components that comprise a system, allowing them to be developed, tested, optimized, used, understood, and modified in isolation. This speeds up system development because components can be developed in parallel. It eases the burden of maintenance because components can be understood more quickly and debugged or replaced with little fear of harming other components. While information hiding does not, in and of itself, cause good performance, it enables effective performance tuning: once a system is complete and profiling has determined which components are causing performance problems, those components can be optimized without affecting the correctness of others. Information hiding increases software reuse because components that aren't tightly coupled often prove useful in other contexts besides the ones for which they were developed. Finally, information hiding decreases the risk in building large systems because individual components may prove successful even if the system does not.

I just realized this is not true: I'm still doing m.captures to get captures from a regular expression match, and this is how it's written in the documentation.
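
For reference, a minimal example of that documented field access. The pattern and input string are made up for illustration.

```julia
# `match` returns a RegexMatch (or `nothing` if there is no match).
m = match(r"(\d+)-(\d+)", "10-20")

# Direct field access, as shown in the manual.
captures = m.captures

# Indexing the match is an alternative that avoids naming the field.
first_capture = m[1]
```

Here the field is part of the public API, which is the "documented as the recommended API" case mentioned earlier in the thread.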

There was one person who pushed back on the _ind2sub solution: you did. And I was grateful you did. I probably would have said something if you hadn't.

I think some of this comes from the legacy of how Julia has been developed. Many of us started working with Julia pre-1.0 when it was more of a necessity to get your hands dirty because there weren't any other options. That should probably be tempered a bit more now, especially for newer users. There are still cases in those examples you link to where there simply isn't an official solution. In such cases, I think trying to work them out in the thread makes sense, gives the original poster a (hopefully) temporary fix, and gives us fodder for further improving the language. We should probably do better at converting these into GitHub issues (and ideally PRs).

I guess this concise statement from the Perl manual also applies to Julia:

"Perl doesn't have an infatuation with enforced privacy. It would prefer that you stayed out of its living room because you weren't invited, not because it has a shotgun."
- Larry Wall

I feel reminded of the questions that asked how to "inherit" correctly in Julia, where the answer people suggested as most Julian was composition plus overloaded accessor functions. But that leads me to the problem of asking "which functions does a third-party module need to operate on my struct?" And that's a tough question to answer for something like a dataframe.

Nor do I have any guarantees (formal or informal) that a future release of dataframe won't break things for me. My impression is that the issue here is that there is no programmatic way to express API expectations.

It seems to me it would be extremely useful to have something like abstract interfaces (I'm not from CS, so my terminology is probably wrong): basically a set of function signatures operating on or with the type. A module declaring that it implements an interface for a concrete type would then be declaring that it has implemented that set of functions. And a module could export an interface that it expects for its structs.

The tooling could then also use this information to show me what the interface functions are that I can use on a type.

I suspect that this could all be done as a package; alas, it's beyond my time/abilities.

If I have a composite type

struct foo
    b::bar
end

And I have an interface interf for bar, I could have a macro that implements the interface for foo automatically.

@implement interf b
struct foo
    b::bar
end

Which would just write a bunch of functions:
interf.func(f::foo) = interf.func(f.b)

This would not enforce privacy, but it would communicate what I expect to be private and what I expect to be public, and where you can substitute your behaviour safely.
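
The forwarding part can already be approximated with a plain `@eval` loop. The sketch below is not the proposed `@implement` macro, only the code such a macro might generate; `Bar`, `Foo`, `total` and `nitems` are hypothetical names chosen for this example.

```julia
struct Bar
    data::Vector{Int}
end

# The "interface": the functions a third party is expected to call on a Bar.
total(b::Bar) = sum(b.data)
nitems(b::Bar) = length(b.data)

struct Foo
    b::Bar
end

# Forward each interface function from Foo to its wrapped Bar.
for f in (:total, :nitems)
    @eval $f(x::Foo) = $f(x.b)
end

foo = Foo(Bar([1, 2, 3]))
foo_total = total(foo)
foo_nitems = nitems(foo)
```

Keeping the list of forwarded names in one place also doubles as lightweight documentation of what is meant to be public.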