RFC: Improved Search for Documentation

documentation

#1

I’ve been working on improved search functionality for the docs using Documenter.jl and Algolia [1].

A working example is available here:

https://haampie.github.io/julia-docs-search-ui

It seems to work incredibly well with minimal configuration. Try, for instance, hard queries like +, @., *, I, im.

For now you can filter on definitions of functions and types, text sections of the docs, and code snippets. By default, section titles are most important, followed by type definitions, paragraphs, and code snippets, in that order. Ties in relevancy are broken by how far down the page the item occurs: further up is more important.
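To make the ordering above concrete, here is a rough sketch in plain Python. It is not Algolia configuration and not the code behind the demo; the `Record` fields, the kind names and the sample hits are all made up for illustration. It only shows the two-level sort: kind of item first, position on the page second.

```python
from dataclasses import dataclass

# Hypothetical priority per record kind (lower sorts first), following the
# order described above: section titles, definitions, paragraphs, code snippets.
KIND_PRIORITY = {"title": 0, "definition": 1, "paragraph": 2, "code": 3}


@dataclass
class Record:
    kind: str      # one of the keys in KIND_PRIORITY (assumed names)
    position: int  # 0-based position of the item on its page
    text: str


def rank(records):
    """Sort matching records by kind priority, then by page position."""
    return sorted(records, key=lambda r: (KIND_PRIORITY[r.kind], r.position))


# Made-up hits for a query such as "*".
hits = [
    Record("code", 2, "A * B"),
    Record("paragraph", 7, "Matrix multiplication ..."),
    Record("title", 5, "Multiplication"),
    Record("paragraph", 1, "The * operator ..."),
]
ranked = rank(hits)
```

The section title comes first even though it sits further down its page than some other hits; position only decides between the two paragraphs.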

I’m sure relevancy and the results in general could be further improved, so please let me know what you find!

[1] https://community.algolia.com/docsearch/ (this is their offer to open source projects, but I used Documenter.jl instead of their standard web scraping application)


#2

Yes, this seems great, and we badly need better doc search! Thanks for looking at this. I guess the next question is how to make this our standard search.


#3

This looks great. Which documentation is searched though? It is not the latest, is it? I was looking for “adjoint”, but no luck…


#4

Very nice! I did notice that the back-button behavior isn’t ideal yet.


#5

@PetrKryslUCSD: It’s only the stable docs at the moment.

@tkoolen: agreed, maybe it should change the URL to the current search parameters, yet leave the browsing history unchanged.


#6

Great that you’re looking into this.

Currently the search is done entirely in the client using lunr.js. The Base docs are by far the biggest of these, and therefore the most painful performance-wise. Building the index takes several seconds. In Documenter.jl/#560, there is some discussion on improving this, such as prebuilding the index at the cost of a bigger download, or building it in a web worker. But especially for the Base docs, it might be better to look at other options such as Algolia, also because doing all this work on the client is heavy, especially on mobile.
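As a rough illustration of the prebuilding trade-off, here is a minimal sketch in Python. It is a toy inverted index, not lunr.js and not its serialization format; the page URLs and texts are made up. The idea is only that the index is built once at docs build time and shipped as JSON, so the client loads it instead of rebuilding it.

```python
import json
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase token to the sorted list of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(url)
    return {token: sorted(urls) for token, urls in index.items()}


# Hypothetical pages; a real build would walk the rendered docs.
pages = {
    "base/math.html": "Mathematical operators and functions",
    "base/arrays.html": "Array functions and broadcasting",
}

# Build step: done once, the JSON payload is shipped with the docs.
payload = json.dumps(build_index(pages))

# Client side: parse the payload instead of indexing every page again.
index = json.loads(payload)
```

The cost is the one mentioned above: the payload grows with the size of the docs, so the saved indexing time is paid for with a bigger download.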

On Algolia’s pricing page I see that the free community plan covers up to 10K records and 100K operations, and if you apply for the free for open source essential plan, you can get more: the standard volume of the AOS plan is 100k records and 200k operations monthly. At least for the Base docs, applying for this plan would probably be wise. No idea how quickly this runs out though, and if it does, will search just stop working altogether?

What you are showing is both a new search provider and a new search result UI. How tightly are these coupled? Can we also use your UI with the existing search, or vice versa?


#7

No idea how quickly this runs out though, and if it does, will search just stop working altogether?

I guess it’s enough, since many other open source projects already use it. Right now there are ~6000 records per version of the docs, so we would need the AOS plan if we were to index all of the docs.
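A quick back-of-the-envelope check of the record limits, as a sketch in Python. The plan limits are the figures quoted earlier in this thread and may be out of date, and ~6000 records per version is only an estimate. Operations are not considered here.

```python
RECORDS_PER_VERSION = 6000  # approximate figure from this thread

# Record limits as quoted earlier in the thread (assumed, may have changed).
PLAN_RECORD_LIMITS = {"community": 10_000, "open source (AOS)": 100_000}


def versions_that_fit(record_limit, per_version=RECORDS_PER_VERSION):
    """Number of whole doc versions that fit within a plan's record limit."""
    return record_limit // per_version


fits = {plan: versions_that_fit(limit) for plan, limit in PLAN_RECORD_LIMITS.items()}
for plan, count in fits.items():
    print(f"{plan}: room for {count} version(s) of the docs")
```

With these numbers the community plan holds a single version, while the AOS plan leaves room for roughly sixteen.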

What you are showing is both a new search provider and a new search result UI. How tightly are these coupled? Can we also use your UI with the existing search, or vice versa?

Highlights and excerpts are created on the Algolia servers, so that part is ‘tightly coupled’. Right now I’m using standard components that Algolia offers to render parts of the view (pagination & filters), but that was just for quick prototyping. Other than that, there are just some reusable components styled with Bootstrap v4.0.0-beta.3.

With some indirection it is certainly possible to completely separate the application state from the views.

I’m not sure whether pagination, highlighting, excerpting and filtering are feasible in the existing search, but the way results are rendered could of course be copied.


#8

I enjoyed your outstanding search page a lot, but unfortunately it seems to be dysfunctional at the moment. Do you have any updates on this project? Is there some component that needs contributors?