How do you actually find out *how* to use an interface in Julia, like the concept of iterator "end"?

search “java iterator”

boom. done. You have a direct path to all the iterators and their construction and usage.

c++ iterator

search “julia iterator”

using the JL internal search yields a mess (discussed shortly)

I found the

next = iterate(iter)
while next !== nothing
    (i, state) = next
    # body
    next = iterate(iter, state)
end

example in substack after 20 minutes of poking around in JL documentation. Even so, I had to fiddle with search terms in order to find the question and answer with the usage example you presented.

discussion:

My chief difficulty is that I need to know in advance what the topic is filed under.

If you go to the base documentation

and type “iterators”, then click on the hopefully named Base.Iterators, all the reader is presented with is “there’s a module named Iterators”

fine.

The side panel is dominated by the Essentials section, which contains a long list of things that are not what I need. Below it, obscured under the mass of Essentials, is Collections and Data Structures, followed by Mathematics and Numbers, none of which is the topic I am searching for. As a search result, that side panel quickly looks irrelevant to the search I just made.

The distinguishing factor in the first two language examples is that I am presented with information that allows me to do the thing I need to do.

In JL the usage is explained in passing

Further, if I had known to search for iterate(), I would still have to infer the initialization as

iterate(thing)

and know the unpacking convention

(i, state) = next

AND iterate(thing, state) means next item

The actual documentation for iterate() leaves almost everything unsaid.
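For anyone landing here with the same question, this is a minimal sketch of the whole convention in one place. `Countdown`, `with_for` and `with_while` are hypothetical names made up for illustration; only `iterate`, `length`, `eltype` and `collect` come from Base.

```julia
# Hypothetical type for illustration: counts down from `start` to 1.
struct Countdown
    start::Int
end

# One-argument form: called once to begin. Returns (item, state) or `nothing`.
Base.iterate(c::Countdown) = c.start < 1 ? nothing : (c.start, c.start)

# Two-argument form: called with the previous state to get the next item.
Base.iterate(c::Countdown, state) = state <= 1 ? nothing : (state - 1, state - 1)

# Optional extras so that `collect` can preallocate a Vector{Int}.
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

# A `for` loop calls `iterate` for you ...
function with_for(c)
    seen = Int[]
    for i in c
        push!(seen, i)
    end
    return seen
end

# ... and behaves like this explicit `while` loop.
function with_while(c)
    seen = Int[]
    next = iterate(c)
    while next !== nothing
        (i, state) = next
        push!(seen, i)
        next = iterate(c, state)
    end
    return seen
end

with_for(Countdown(3))    # [3, 2, 1]
with_while(Countdown(3))  # [3, 2, 1]
collect(Countdown(3))     # [3, 2, 1]
```

The "end" of the iterator is simply `iterate` returning `nothing`; there is no separate end object to compare against.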

If my discussion here sounds a bit frustrated and cranky, it’s because it shouldn’t be this hard. Hopefully it is a useful explanation of why I found, and still find, everything relatively more challenging to learn about in Julia. The smart-aleck way I would describe it is that JL documentation is orthogonal to almost all my expectations. Compared with every other language I have worked with, the chief advantage and beauty of Java was that the language designers decided to include a very direct template for class documentation, covering the fields, the methods, and the arguments of the methods, along with an entire system that generated a useful documentation tree, so people could understand abstractions down to their implementations, all provided in one place. I should know: I worked with some of the principal people who developed the language. They put a lot of thought into making the programming environment of Java deliberately accessible to what they would call the average programmer. They made it pretty darn clear and pretty darn easy, aided by the structure of the language itself, which is quite explanatory in a very concise way.

3 Likes

Probably because you should start with reading the manual first. (Just “Manual”, not the standard library, the devdocs, etc). Searching on demand for every single bit of information is going to get frustrating because you are not familiar with key concepts, so you lack a foundation.

It is like learning linear algebra by trying to search for “SVD” first.

4 Likes

I personally find the Java documentation particularly obtuse for newcomers as I illustrated above, but I know my way around it from experience. In particular, I know Java and C++ are object oriented so I might expect an abstract base class or formal interface.

Part of the difference in expectation here is that Julia does not implement an object oriented programming paradigm. We have a multiple dispatch paradigm, which shifts the conceptual emphasis to verbs (e.g. iterate) rather than nouns (e.g. Iterator). The approach here is in fact orthogonal, and that is reflected in the documentation.
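To make the verb-versus-noun point concrete, here is a small sketch. `TwoInts` is a hypothetical type invented for this example; `applicable`, `iterate` and `sum` are from Base. The idea is that a value counts as iterable because a method of the verb `iterate` accepts it, not because it inherits from some `Iterator` class.

```julia
# No `Iterator` base class is involved: we just ask whether the verb applies.
on_vector = applicable(iterate, [1, 2, 3])   # true
on_string = applicable(iterate, "abc")       # true
on_range  = applicable(iterate, 1:3)         # true
on_symbol = applicable(iterate, :sym)        # false, Symbol has no iterate method

# Hypothetical type: it joins the interface by adding methods to the verb.
struct TwoInts
    a::Int
    b::Int
end

# The default argument defines both iterate(p) and iterate(p, state).
function Base.iterate(p::TwoInts, state=1)
    state == 1 && return (p.a, 2)
    state == 2 && return (p.b, 3)
    return nothing
end

on_twoints = applicable(iterate, TwoInts(1, 2))  # true
total = sum(TwoInts(1, 2))                       # 3, generic code now works
```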

I think part of your frustration originates from having particular experience with other languages and expecting Julia to be the same. I work in an environment that uses both Java and Python heavily among other languages. While I appreciate that JavaDoc follows a particular pattern that I am familiar with, I see my junior colleagues struggle with this pattern. Thus, I’m not particularly convinced that Java is the best example here. It’s helpful to you and me because we are familiar with the language, but others who are not familiar with Java would struggle with it just as you are struggling with the Julia documentation.

Perhaps one thing to take away from JavaDoc is having some regular structure to Julia docstrings. What pattern should arguments follow in implementing methods? What return type is expected? Is this function part of a larger interface?

From your elaboration, I see that your main interface here is to use search. Alternatives might be using the Table of Contents or using an LLM. This is helpful and thus I can formulate some concrete action items.

  1. We have a module called Base.Iterators that people may land upon when searching for items. Modules should contain some documentation and link to the Iteration Interface section: julia/doc/src/base/iterators.md at master · JuliaLang/julia · GitHub
  2. While Iteration Interface is referred to in the section introduction above where Base.iterate is documented, it is not referred to within the docstring for iterate itself. We should refer to interfaces in the docstrings of functions belonging to those interfaces: julia/base/essentials.jl at 966d0af0fdffc727eb240e2e4c908fdd46697e57 · JuliaLang/julia · GitHub
  3. Search should emphasize headings in the Essentials and Manual sections. Search should also expand nouns to their analogous verbs (e.g. “Iterator” => “iterate” or “iteration”).
  4. We may need an explicit comparison between object orientation and multiple dispatch in the manual perhaps with an emphasis on interfaces.

Thank you. Let me see if we can make progress on some of these action items.

9 Likes

Maybe, part of these frustrations are simply because of how search works? I regularly find search in julia docs (and other Documenter docs)… underwhelming.

4 Likes

(and you’re welcome!)

> I think part of your frustration originates from having particular experience with other languages and expecting Julia to be the same.

You are reflecting what others have expressed about the distinction between JL and other languages, but I think this is a distraction. Even if I were trying to look up something in C++ or Java, I rarely care about the object or interface. Moreover, my general experience with looking things up for other languages, including Python, JavaScript, and so on, is that search makes learning about these languages discoverable.

> Perhaps one thing to take away from JavaDoc is having some regular structure to Julia docstrings. What pattern should arguments follow in implementing methods? What return type is expected? Is this function part of a larger interface?

This is (or was) the essence of the Java philosophy

The value of the javadoc system was to make describing the contract easy and automating links to constructs fulfilling such contracts. Endlessly helpful for the years I spent in that domain.

Design by Contract is meaningful regardless of OO or functional etc.

==
Finding stuff that I need:

  1. I am not going to use an LLM. It’s bad form to assume people will. DuckDuckGo’s search assistant offers generally functional outcomes, but not refined ones. LLMs scrape substack. Much of what is referred to is years old. (see below)
  2. TOC is good for a discourse on a topic, not so good for “how do I get X done” queries
  3. your other suggestions are quite good

Let me continue with the use case of an individual with a glancing familiarity with JL

Let’s assume I have a task and search based on how I think about it

find element in a collection julia
vs
find elements in a collection julia
vs
find an element in a collection julia
vs
find if an element in a collection julia

(try them and see what hits come up for you)

The range of options for finding stuff in collections is a bit more sophisticated than just findall() or in().
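As a rough map of those options, here is a sketch using only Base functions on a made-up vector `xs`. It is not exhaustive, and which one you want depends on whether you need a yes/no answer, an index, or the elements themselves.

```julia
xs = [4, 8, 15, 16, 23, 42]   # made-up sample data

has_15      = 15 in xs                 # true: membership test
any_big     = any(>(40), xs)           # true: does any element match?
first_15    = findfirst(==(15), xs)    # 3: index of the first match
first_huge  = findfirst(>(100), xs)    # nothing: no match
even_idx    = findall(iseven, xs)      # [1, 2, 4, 6]: indices of all matches
even_vals   = filter(iseven, xs)       # [4, 8, 16, 42]: the matching elements
n_odd       = count(isodd, xs)         # 2: how many match
next_even   = findnext(iseven, xs, 3)  # 4: first match at or after index 3
where_each  = indexin([16, 99], xs)    # [4, nothing]: position of each value
```

Note that the `find*` family returns indices (or `nothing`), while `filter` returns values, which is exactly the kind of distinction a task-oriented search does not surface.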

The documentation doesn’t link to DataFrames or structs, or to DataFramesMeta, for example. It really should. But I don’t know how practical any of that would be. javadoc does it by traversing a class’ imports and package.

In summary:

The biggest obstacle is that the theoretical and generic higher-order modules and interfaces dominate search results, both from search engines and in the documentation system.

LLMs draw on answers from years ago. Traditional search suffers from the same problem.

Julia’s documentation system doesn’t flow between the theoretical and the concrete. This is a roadblock to discovery.

I think one of the problems with the Julia documentation is that dozens of docstrings are crammed onto the same page, which incentivizes short docstrings. Each docstring should have its own page.

2 Likes

PHP has this convention where a function has its own page:

I’m not sure I would want each docstring to have its own page. It might be useful for each function to have a page with a summary of methods grouped by module though.

Yeah, all the docstrings in a module for a single generic function can be on the same page, but we should have a separate page for each function.

The general principle is that a generic function should only have one meaning, i.e. docstring, for each arity, so ideally other modules that extend a function shouldn’t have any new docstrings unless the new method adds a new arity.

1 Like

It seems we need a convention for a function synopsis and the following detailed description. The synopsis can occur on the module page.

Maybe the first line should be a one sentence description followed by an empty line. The next paragraph provides more details.
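A sketch of what that could look like, using a hypothetical function `clamp01` made up for this example. As far as I know this layout (indented signature, one-sentence summary, blank line, details, then an `# Arguments` section) is close to what the manual's Documentation chapter already recommends, so the proposal is mostly about applying it consistently.

```julia
"""
    clamp01(x::Real)

Clamp `x` to the closed interval [0, 1].

Values below 0 become 0 and values above 1 become 1. Anything in between is
returned unchanged. The result has the same type as `x`.

# Arguments
- `x::Real`: the value to clamp.

See also [`clamp`](@ref).
"""
clamp01(x::Real) = clamp(x, zero(x), one(x))

clamp01(1.5)    # 1.0
clamp01(-2)     # 0
clamp01(0.25)   # 0.25
```

With that shape, a module page could show only the signature and the first sentence as the synopsis, and link to the full text.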

This would be very welcome!

Also, at the risk of branching into something that’s off topic, one of the beauties and strengths of the language is the set of constructs that allow you to do things with arrays or data frames, and for the life of me I can’t remember the term and I have no way to rediscover what that word is.

If I use a keyword search with “array” in it, it doesn’t get at the thing I can’t name and am trying to discover. The word was something like insights or introspection or something to do with arrays where you can do interesting things like the ‘…’ operation maybe. Don’t feel compelled to answer me for this particular question but I’m just trying to convey a flavor or a sense of things from a newbie view.

If there’s one thing working in OO’s favor, it is our natural tendency to think of the things we can do with an object in the world along with its attributes. If I have a data array, I’d like to know the set of verbs I can apply to it. It does seem that, with enough people somewhere to write the code, one could go through the function definitions and pull out, at least to some degree, the set of verbs that have been created to work with a certain kind of object. Just food for thought.

1 Like

This one feels rather different to me because you are asking less about API/concrete functions and more about fundamentals of the language. I agree that the search function does not help you if you don’t know the term you are searching for, but I think that’s just not what it’s made for (as an aside: LLMs are generally GREAT for this - guessing terms from vague descriptions). The discussion before was on the level of implementation, i.e. how would you implement broadcasting for your own type (I think that the specific documentation for implementing broadcasting is also not very detailed/clear but it is workable).

However I think the Julia Manual does a rather good job at teaching these concepts. For your specific example: I think you are either looking for array comprehensions or broadcasting. Both of these are indeed explained if you open the page “Single- and Multi-dimensional Arrays” of the manual (though I agree that this is a rather long page and finding the concept you are looking for likely requires scrolling a good bit - but also you might stumble upon things you never thought about before).

Is your suggestion that there should be a better overview of the concepts? But even then a search function likely won’t help find the right concept from a vague query.
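In case it helps to put names to syntax, here is a quick sketch of the candidate terms on a made-up vector `xs`. I am guessing at which one was meant; the `...` in the question is usually called splatting.

```julia
xs = [1, 2, 3]   # made-up sample data

# Array comprehension: build an array from an expression and a loop.
squares = [x^2 for x in xs]               # [1, 4, 9]
evens   = [x for x in 1:10 if iseven(x)]  # [2, 4, 6, 8, 10]

# Broadcasting: the dot applies a function or operator elementwise.
roots   = sqrt.([1.0, 4.0, 9.0])          # [1.0, 2.0, 3.0]
shifted = xs .+ 10                        # [11, 12, 13]
table   = xs .* xs'                       # 3x3 multiplication table

# Splatting: `...` spreads a collection into separate arguments.
largest = max(xs...)                      # 3, same as max(1, 2, 3)
```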

1 Like

The REPL has functionality for this, though it can be a bit verbose because of the duck-typing nature of julia:

julia> a = rand(10); b = rand(10);

julia>  ?(a,b)<TAB> (with or without ending paren, different list)
+(A::Array, Bs::Array...) @ Base arraymath.jl:12
-(A::AbstractArray, B::AbstractArray) @ Base arraymath.jl:6
/(A::AbstractVecOrMat, B::AbstractVecOrMat) @ LinearAlgebra ~/.julia/juliaup/julia-1.12.3+0.x64.linux.gnu/share/julia/stdlib/v1.12/LinearAlgebra/src/generic.jl:1264
(::Colon)(start::T, stop::T) where T @ Base range.jl:7
==(A::AbstractArray, B::AbstractArray) @ Base abstractarray.jl:3029
... and many more
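A programmatic counterpart to the tab completion above is `methodswith` from the InteractiveUtils standard library. The sketch below uses a hypothetical type `Celsius` and two made-up functions; note that it only finds methods whose signatures name the type explicitly.

```julia
using InteractiveUtils  # loaded automatically in the REPL, needed in scripts

# Hypothetical type and functions for illustration.
struct Celsius
    degrees::Float64
end
to_fahrenheit(t::Celsius) = t.degrees * 9 / 5 + 32
is_freezing(t::Celsius) = t.degrees <= 0

# Methods in loaded modules with an argument declared as `Celsius`.
verbs = methodswith(Celsius)

# The same query restricted to a single function.
just_one = methodswith(Celsius, to_fahrenheit)

# Also include methods declared on supertypes (often a very long list).
inherited = methodswith(Vector{Float64}; supertypes=true)
```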

However, the duck typing makes documentation of possible methods somewhat hard:

julia> struct MyInts; a::Vector{Int}; end
julia> Base.iterate(m::MyInts) = Base.iterate(m.a)
julia> Base.iterate(m::MyInts, state) = Base.iterate(m.a, state)
julia> m = MyInts(rand(Int,10));
julia> sum(m)
-500966851580788704
julia> @which sum(m)
sum(a; kw...)
     @ Base reduce.jl:553

So, my funny vector can be summed by Base.sum, but I may not have intended my vector for summing, just for my own weird stuff, so I never thought of documenting sum for it. Nor any of the other methods which may happen to work. Even more so if my new type is a subtype of something, like of AbstractVector{Int}, or DenseVecOrMat{Int}.

One could of course hope that all abstract types had a complete list of methods which should work, or which are implemented for the abstract type, but due to the duck typing this is difficult to achieve. In some experimental package I have an index type, so that any AbstractVector can be indexed by a bitmask, and the indexing operation results in an iterator over the marked elements. There is no obvious place to document this behaviour, perhaps in getindex, but who will look there?

In short, I think complete documentation is very hard in julia, though I sometimes miss the strict rules of R’s CRAN, where every argument of every public function in every package must be documented, otherwise it won’t be on CRAN.

People already linked to the docs on this specific question. But one way you could’ve found it is to make it throw an error and see what the error says. What types does it mention? What lines of code does it point to? For example, I just did this:

struct MyType end
x = MyType()
x[1:end]
ERROR: MethodError: no method matching lastindex(::MyType)
The function `lastindex` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  lastindex(::Any, ::Any)
   @ Base abstractarray.jl:427

So now I know to search for lastindex.
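Following that trail, here is a minimal sketch of what makes `end` work for a custom type. `Readings` is a hypothetical wrapper invented for this example; the point is that `x[end]` calls `lastindex(x)` and `x[begin]` calls `firstindex(x)`.

```julia
# Hypothetical wrapper type for illustration.
struct Readings
    values::Vector{Float64}
end

# Indexing itself, forwarded to the wrapped vector.
Base.getindex(r::Readings, i) = r.values[i]

# `begin` and `end` inside brackets call these two functions.
Base.firstindex(r::Readings) = firstindex(r.values)
Base.lastindex(r::Readings) = lastindex(r.values)

r = Readings([1.5, 2.5, 3.5])
last_one  = r[end]      # 3.5
tail      = r[2:end]    # [2.5, 3.5]
first_one = r[begin]    # 1.5
```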

3 Likes