So, my intention was to ask if you saw this as a binary choice. Thanks for answering.
I think part of the problem is that the ecosystem is currently in a "frustrated" state: on the one hand, thanks to the flexibility of the language it's possible to define all sorts of new array types that come with non-standard (at least compared to the old 1-based standard) indexing, but on the other hand the type hierarchy / type system hasn't kept up to precisely express what types are supported by a package.
This then leads to the by now well discussed problems. Right now the ecosystem is sitting at an unstable point and it could go either way: descend into chaos, or evolve the type system and move to a better place that is not only very generic for array types but also correct/safe.
Or alternatively,
for t in axes(x,1)[begin+3:end], i in axes(x,2)[begin+1:end]
The axes may be indexed as arrays as well.
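A minimal runnable sketch of that pattern (x here is a hypothetical array; the slice sizes are arbitrary), just to show the sliced axes remain valid indices wherever the array's indexing starts:

x = rand(6, 5)  # swap in an OffsetArray here to check the offset case

for t in axes(x, 1)[begin+3:end], i in axes(x, 2)[begin+1:end]
    # skips the first 3 rows and the first column, regardless of where indexing starts
    @assert checkbounds(Bool, x, t, i)
end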
Tuples and NamedTuples are the two I want to see as part of an abstraction.
Here's an example for finding the middle index of an odd length or the middle 2 indices of an even length. Here's the (afaik) correct way:
julia> function middle(x::UnitRange{Int})
# divide a distance, make room for reference point
step = (length(x)+1)÷2 - 1
first(x) + step, last(x) - step
end
middle (generic function with 1 method)
julia> middle.((1:10, 0:9, -1:8, 1:9, 0:8, -1:7))
((5, 6), (4, 5), (3, 4), (5, 5), (4, 4), (3, 3))
When we're not using generic methods, length(x) and last(x) would both look like n for 1-based indexing; the returned value would look like 1+((n+1)÷2-1), n-((n+1)÷2-1). So what happens if we mix them up?
julia> function wrongmiddle(x::UnitRange{Int})
# dividing a reference point makes no sense
step = (last(x)+1)÷2 - 1
first(x) + step, length(x) - step
end
wrongmiddle (generic function with 1 method)
julia> wrongmiddle.((1:10, 0:9, -1:8, 1:9, 0:8, -1:7))
((5, 6), (4, 6), (2, 7), (5, 5), (3, 6), (2, 6))
Incidentally, I kinda dislike begin end in indexing brackets; something derived from first length last would fit the corresponding methods' names better.
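For what it's worth, begin/end in indexing brackets lower to the existing firstindex/lastindex functions, so the same slice can already be written either way; a small sketch:

x = -3:2:7   # any AbstractVector works here

x[begin+1:end-1] == x[firstindex(x)+1:lastindex(x)-1]   # true: begin/end lower to these calls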
I opened a GitHub issue for this a while back but haven't had time to come back to it. Hopefully someone who understands the nuance of all these iteration options can explain them in a comprehensive doc page.
Is Quaternions.jl a poison pill? How about Unitful.jl?
Not all code operating on Number types is correctly structured to handle non-commutative numbers like quaternions. And not all code operating on Real types is correctly structured to handle dimensionful types à la Unitful. Does that mean that all generic numeric code in Julia is broken, or that all numeric code should be restricted to concrete types like Float64 that are tested? The latter restriction would be a huge blow to the flexibility of the Julia ecosystem as new number types, from DoubleDouble to Measurement and Interval, come along.
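As a concrete illustration of the non-commutativity point (assuming Quaternions.jl's Quaternion(w, x, y, z) constructor), any generic method that silently swaps a*b for b*a gives wrong answers here:

using Quaternions

qi = Quaternion(0.0, 1.0, 0.0, 0.0)   # the quaternion unit i
qj = Quaternion(0.0, 0.0, 1.0, 0.0)   # the quaternion unit j

qi * qj == qj * qi   # false: i*j = k, but j*i = -k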
Whenever you define a new subtype that stretches the conception of the parent type, it's a challenge to generic code and it's likely that many combinations will not work. But it's also an opportunity to extend the generality of the ecosystem and improve its robustness. For example, the LinearAlgebra stdlib has been slowly extended over the years to have more support for non-commutative and dimensionful types, though this is by no means complete.
Nor is it terrible, in my opinion, to tell people to test things when they combine independent packages defining unusual new types for fundamental things like containers and numbers, and to expect that exotic combinations won't always work (in which cases they should file issues and PRs).
Good point. It is probably impossible to write generic-enough code to cover every possible way a type may be extended in the future, but at the same time, many extensions probably will just work. Maybe this sort of correctness verification can be covered in documentation rather than with explicit type union function signatures. Just a simple page that lists all external types that package unit tests have been written for. If you use other types with this package, then you are responsible for writing your own unit tests, or better yet, submitting PRs to the package.
I would also point out that there's no way anyone could have imagined all the things that can be done with arrays and numbers. These abstractions are hard to get right even with experience and impossible to fully anticipate in advance. The approach Julia has taken is to let people explore and organically react when things don't quite fit together. We are getting to a point where we now have some better notions of what it means to be an array or a number. The interface of arrays is documented here. There is no interface for numbers; the concept is too general: there are no methods one must implement for all things that are numbers. What then is the point of Julia's Number type? It simply serves as a way to opt into a bunch of generic fallback method definitions, such as "automatic" promotion for arithmetic operations like + and * (without requiring you to define those), and also some definitions that assume that numbers are value types, rather than containers.
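A minimal sketch of what that opt-in buys you, using a hypothetical MyNum wrapper: with just convert and promote_rule defined, mixed arithmetic works through Base's generic promotion fallback for Number:

struct MyNum <: Number   # hypothetical value-type number
    val::Float64
end

Base.convert(::Type{Float64}, x::MyNum) = x.val
Base.promote_rule(::Type{MyNum}, ::Type{Float64}) = Float64

MyNum(2.0) + 1.0   # 3.0, with no +(::MyNum, ::Float64) method defined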
It may be useful to allow interfaces to be formalized and checked automatically, but if we had tried to do that from the beginning with something like arrays, we would have gotten it wrong and the explosive growth we've seen of useful and strange array types would have been stifled before it ever began. I would also note that it's often quite useful to partially implement an interface: I may want to implement something array-like, but I don't need all of the functionality that some arrays provide. How do I know if my implementation is complete? If my code works, then it's complete. Of course, that's not fully satisfying when writing code that will be used by many other people who may want more features, but partial interface implementations are very useful for exploration.
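For instance, the sketch below (a hypothetical SquaresVector) implements only size and getindex from the AbstractArray interface, yet iteration, reductions, display, and broadcasting all come along for free:

struct SquaresVector <: AbstractVector{Int}
    n::Int
end

Base.size(v::SquaresVector) = (v.n,)
Base.getindex(v::SquaresVector, i::Int) = i^2

sum(SquaresVector(4))       # 30, via generic AbstractArray fallbacks
collect(SquaresVector(3))   # [1, 4, 9]

# setindex! was never defined, so mutating methods will (rightly) fail:
# SquaresVector(3)[1] = 5   # errors; the partial interface is still useful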
I'd instead say, if you're going to do it, commit to it. A lot of people point to SciML as why they "go generic", but they don't adopt our practices. Just look at a recent PR:
There are 40 test groups that each take on average 30 minutes. That's about 20 hours of tests that are run. Do we support Unitful?
Do we support abstract arrays that don't have indexing defined?
How are big floats doing in terms of numerical convergence?
And the list just keeps going. With SciML we commit to it: there's huge test coverage and we consider anything that goes wrong with any generic handling an issue. What I see as the issue is the lack of commitment: there are groups that will only test on Array but have generic codes, and there are groups that don't answer issues about generic codes within a day. That shouldn't be done.
And anyone who really commits to generic coding will have to use ArrayInterface.jl, period.
There's just so many details that you cannot query from the Base interface:
https://juliaarrays.github.io/ArrayInterface.jl/dev/api/
Nothing wrong with the Base interface though; this had to be learned over time in a way that can quickly evolve with the growing AbstractArray ecosystem. But if you aren't using that package, then either there are issues with your generic codes that are dead obvious and easy to identify, or there's a re-implementation of that in the package with a whole lot of Requires.jl (I only know of one case that's the latter). Yes, that's a strong statement, but those primitives were all made for a reason and I can give you counterexamples from all over the ecosystem. For example, how many of you can name off the top of your head a commonly used array type for which eltype(x) !== typeof(x[1])?
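(One family of answers, though maybe not the specific case being hinted at here: any array with a non-concrete element type already breaks that assumption.)

x = Real[1, 2.0]   # Vector{Real}, a non-concrete eltype

eltype(x)                     # Real
typeof(x[1])                  # Int64 (on a 64-bit machine)
eltype(x) !== typeof(x[1])    # true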
We need a similar effort for Numbers, but we just haven't gotten there yet because there's still a lot to do with AbstractArrays.
Though the vision is promising, I think ArrayInterface needs to fix issues like ismutable wrong for FillArrays · Issue #77 · JuliaArrays/ArrayInterface.jl · GitHub first before many libraries can make good use of it. Otherwise one is forced into the same Requires dance as before, just with another level of indirection (e.g. replacing @require XYArrays... with @require ArrayInterfaceXYArrays...).
Well yeah, figuring out the full interface is hard and evolving. But it at least gets many, many more of these cases right than someone trying to roll it themselves, because there have been hundreds of these edge cases over the years.
Yeah, this is an important point. That advice is poorly written: the "should be used chiefly for dispatch" doesn't explain much and is somewhat tautological. If I had to shoot for a better articulation:
I don't think dispatch constraints "on their own" should be relied on as the de facto implementation mechanism to enforce an interface boundary, even though it seems they are commonly applied this way in the Julia world. As a result, principles like Style Guide · JuMP crop up to combat the effects (which I don't disagree with as a design principle for certain packages/situations, but I think maybe it's overkill in the other direction).
It'd be nice if I had an alternative catch-all recommendation for what to do instead, but I don't.
It's almost too easy sometimes to poorly pun on Julia's subtyping algorithm to try to enforce behavioral constraints. I think you CAN successfully do that in situations where a type's/package's design is amenable to it (see below), but I'm not sure that's the case in all situations, and in some situations it's kind of a leaky mechanism for this purpose.
For example, I do think it's okay to enforce interface boundaries via dispatch constraints in situations where you're defining behaviors atop a more strictly defined compositional interface (e.g. a wrapper type A(x) that explicitly surfaces the "A-like" behaviors atop data x). That case probably meets the poorly-phrased definition of being "used chiefly for dispatch". IMO method dispatch makes it quite pleasant to implement these cases, as long as you don't run into promotion-related woes for n-ary methods, which is sometimes the case in a multiple-dispatch-driven system... Usually if I hit that point I end up wishing there were some Haskell-y norms/capabilities I could lean on to resolve things to the intended behavior, vs. hitting method ambiguities.
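A rough sketch of that wrapper pattern with hypothetical names: Sorted(x) surfaces "sorted-vector" behaviors atop the wrapped data x, and the fast path dispatches on the wrapper rather than on a loose behavioral promise:

struct Sorted{T,V<:AbstractVector{T}} <: AbstractVector{T}
    data::V   # assumed sorted; the wrapper is the interface boundary
end

Base.size(s::Sorted) = size(s.data)
Base.getindex(s::Sorted, i::Int) = s.data[i]

# Only the wrapper opts into the O(log n) path; other vectors keep the generic one.
has(s::Sorted, v) = !isempty(searchsorted(s.data, v))
has(x::AbstractVector, v) = any(==(v), x)

has(Sorted([1, 3, 5, 9]), 5)   # true, via binary search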
Of course, and there has been a lot of thought and effort put into getting those cases right. My point was that as a package author who wants to be a good citizen of the ecosystem and follow these practices, here is how things break down:
- I need to know if an array type can be accumulated into in-place, and I don't want to use Requires on the half dozen or so types that I know won't support this.
- Oh cool, ArrayInterface advertises a function for this.
- Wait, this function doesn't work for most of the types I care about. No matter, we can file an issue.
- Somebody already filed an issue 2 years ago, but it's not clear there is consensus on whether it's considered a problem, let alone how to fix it.
- So I'm back to needing Requires, except now if I want to use ArrayInterface I have to pirate its own methods.
Perhaps the broader point here is that interfaces need buy-in, and if they aren't getting that it is worth analyzing why. How much of it is technical issues like the example above vs. concerns around maintenance timelines when something breaks vs. other non-technical factors? Is it just a matter of getting the word out or is some negotiation required? etc.
This applies to the other (implementer's) side of the interface boundary too: when the reaction to an interface varies from "this seems unstable" (I know it's out of date, but there has been no follow-up) to "can't you handle this?" to unresponsive to non-existent, what incentive is there to consume this interface when we've been told repeatedly that Requires is a no-go?
I expanded my prior low-effort attempt via an edit.
julia> foo(x::BaseAndCoreArrays) = x[1:length(x)]
foo (generic function with 1 method)
julia> foo([1,2,3])
3-element Vector{Int64}:
1
2
3
"you probably want" is the crux of the issue here. There might be more than one type of Julia user. I would rather someone restrict their dispatching and be correct than accept an AbstractArray and make incorrect assumptions.
Is this the best way to restrict dispatching? Probably not, but it is fairly simple. Traits are probably the way to go.
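A minimal Holy-traits sketch (all names hypothetical) of what that could look like, instead of restricting dispatch to a hand-maintained type Union:

abstract type IndexKind end
struct KnownOneBased <: IndexKind end
struct UntestedIndexing <: IndexKind end

# Opt-in: only types that have actually been tested declare the trait.
IndexKind(::Type{<:Array}) = KnownOneBased()
IndexKind(::Type) = UntestedIndexing()

foo(x) = foo(IndexKind(typeof(x)), x)
foo(::KnownOneBased, x) = x[1:length(x)]
foo(::UntestedIndexing, x) = error("foo has not been tested with $(typeof(x))")

foo([1, 2, 3])   # works; an OffsetArray would hit the explicit error instead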
I definitely disagree that Julia 2.0 should have special loops for certain kinds of arrays or indexing. I think we could create a Base method to obtain a one-based view of an AbstractArray. That could be done before Julia 2.0.
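For reference, something close to this already exists as a package function (not in Base): OffsetArrays.no_offset_view returns a 1-based view of an array's data. A small sketch, assuming OffsetArrays.jl:

using OffsetArrays

x = OffsetArray([10, 20, 30], -1:1)   # indices -1:1
y = OffsetArrays.no_offset_view(x)    # 1-based view of the same underlying data

y[1] == x[-1]   # true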
I agree that "avoid[ing] restricted functions to the tested surface area" is the community preference, especially when applied to library code. However, I do think we should give users tools to either test their code broadly or restrict their dispatch to "known" types if they so wish. Perhaps some users do not really want to write methods for AbstractArray{T,N} but are reaching for it because they just wanted to support UnitRange and SubArray?
A practical option here would be to insert OneBasedArray <: AbstractArray (or Abstract1Array?) in the type hierarchy. That would be considered non-breaking according to ColPrac guidelines and would allow people to program to abstract arrays assuming 1-based indexing. But I'm not sure it's really worth it: if you're writing code that's really generic enough to apply to all kinds of different arrays, then using begin instead of 1 seems like not that big a deal.
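For completeness, the difference in question is just this (hypothetical names):

firstelt_1based(x)  = x[1]        # breaks on offset arrays
firstelt_generic(x) = x[begin]    # works for any AbstractArray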
Let me know when the Numbers are to get attention.
It is not that big a deal, or rather it would not be, had it not been such kindling in the userverse.
This may be more of a pragmatic decision, to allow any member of the Community to respond "Not any longer, all our Arrays subtype OneBasedArrays".
There is value in a pithy response that makes sense to most everyone.
And it would circumvent some future silliness when arrays become sentient.
ArrayInterface.jl moves fast. Like:
No, we already solved this. See
https://github.com/JuliaArrays/ArrayInterface.jl/tree/master/lib
Of course, it's not a nice solution; we'd prefer that StaticArrays.jl define the right functions, but there is ArrayInterfaceStaticArrays.jl and there's no Requires.jl required.
Solved. It doesn't use Requires.jl.
Look, it's not even in the Project.toml.
That was on a personal repo where absolutely no ArrayInterface.jl devs were pinged. If this is needed, we can make an ArrayInterfaceBlockArrays.jl and hold it until it gets upstreamed. No Requires required.
Sure, but all of those issues were handled, right? No Requires.jl, load time is in the low microseconds (so less than 1% of the package), etc. Seems like that's all done? We can re-bump that.
In the meantime, the working code lives in https://github.com/JuliaArrays/ArrayInterface.jl/tree/master/lib/ArrayInterfaceOffsetArrays, which is a registered package, so you can use it today.
That is merged and completed: StaticArraysCore, ArrayInterfaceStaticArrays, and ArrayInterfaceStaticArraysCore.
Point by point, ArrayInterface already handles all of those cases by subpackaging. Is that nice? No, it would be nicer if all AbstractArrays actually defined their interface. But SciML needs this in order for generic code to work, so we're shouldering the effort for now.
It can start anytime, but from this post you can see why ArrayInterface has been such a big project: we're implementing the interface for every AbstractArray we can find. That's also why we can tell you about all of the weird edge cases, though.
I should end this by saying I know there is a caveat here that, almost by definition, ArrayInterface.jl is still not super stable and is a fast-moving package, which is terrible to have as a low-level interface that everyone depends on. So while I wish this would go into every array type, I would also agree that we're probably at least a year away from really being stable enough for that. And actually, I think this is the kind of thing that really needs to stabilize and head into Base. Once this interface is more set in stone, I want to do a PR to Base that adds "these are 20 traits that we know help generic codes", and then add that to the AbstractArray page. I don't think we can say it's a good part of the language until it's all the way up there.
Until then, it's the bandaid that SciML needs to maintain so everyone else is free to know that it cannot break for that reason.
I think I was imprecise. ArrayInterface removing Requires from its own dependencies was a massive effort and very much appreciated. What I was referring to is that one still needs Requires to import ArrayInterfaceStaticArrays etc. to avoid taking a dependency on the underlying array package.
One alternative is just adding all the ArrayInterface* bits one needs as direct deps. That's fine for one or two types, but it quickly gets out of hand, especially when you're only calling a subset of the interface. The other alternative is making users import the right subpackages themselves, but that is unlikely to fly.
That's totally understandable. My ask (with ismutable / ismutable wrong for FillArrays · Issue #77 · JuliaArrays/ArrayInterface.jl · GitHub as the motivating example) is that interfaces favour being conservative, and maybe asking users to install a subpackage, instead of generating false positives about what they support (with or without subpackages). There will always be new array packages that haven't yet opted into the interface, and this approach would allow us as interface consumers to avoid pirating both ArrayInterface and the array packages whenever the former falls back to a too-optimistic code path and blows up.