Any advice on Stereo Matching in Julia?

I am looking to do stereo matching in Julia, basically to find the depth in a scene by using two images. I am at an early stage and would appreciate some advice on where to look for information, and which libraries are most relevant. I have looked at JuliaImages, obviously, but what else is there? Is OpenCV with Julia bindings a good choice, or are there better libraries for this particular problem?

Based on this file, OpenCV has several algorithms; the latest, STEREO_HH4, was added last August: Update the stereo sample: · opencv/opencv@6d1f7c2 · GitHub

STEREO_3WAY was added back in 2015: Adding new HAL-accelerated MODE_SGBM_3WAY · opencv/opencv@aea4157 · GitHub

The latest developments do not seem to be in OpenCV, though; many papers were submitted in November or December of last year to: The KITTI Vision Benchmark Suite

These OpenCV algorithms all seem to be old (variants tuned for speed rather than accuracy?), so are they outdated?
Even so, SGBM is described favorably here, in Thesis - Stereo Vision Prototype — Nigel Tve:

Stereo semi-global block matching (SGBM) is a strong choice when it comes to gathering depth information of the surroundings.

You could wrap/use any of the code I link to, or reimplement in Julia (I did not find any Julia-only code for this).
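To give an idea of what a from-scratch reimplementation involves, here is a minimal sketch of plain brute-force block matching with a sum-of-absolute-differences cost. It is not SGBM and not any of the linked methods, just the simplest baseline. It is written in Python only for brevity; the loops should translate almost line for line to Julia. It assumes rectified grayscale images stored as lists of rows, and the names `sad_disparity`, `depth_from_disparity`, `max_disp`, `half_win`, `focal_px` and `baseline_m` are made up for illustration.

```python
def sad_disparity(left, right, max_disp, half_win=1):
    """Brute-force block matching on rectified grayscale images.

    left, right: lists of rows of numbers with identical dimensions (assumed).
    Returns a disparity map of the same size; border pixels are left at 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(half_win, height - half_win):
        for x in range(half_win, width - half_win):
            best_cost, best_d = None, 0
            # A point at column x in the left image is expected at column
            # x - d in the right image; cap d so the window stays in bounds.
            for d in range(0, min(max_disp, x - half_win) + 1):
                cost = 0
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                # Keep the smallest disparity among equally good candidates.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity


def depth_from_disparity(d, focal_px, baseline_m):
    """Pinhole relation Z = f * B / d; returns None when disparity is zero."""
    return focal_px * baseline_m / d if d > 0 else None
```

The depth conversion assumes a calibrated, rectified pair with the focal length in pixels and the baseline in metres. A real pipeline would add cost aggregation, sub-pixel refinement and a left-right consistency check, which is roughly where SGBM and the learned methods differ from this baseline.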

I started by looking at paperswithcode.com for “stereo matching” (some papers below belong primarily to other categories), but note that there are also other stereo-related sub-categories.

What I quickly found, before the links above, maybe in the wrong order (and maybe not all relevant):

The implementation of improved version AANet+ (stronger performance & slightly faster speed) is also included in this repo.

Another real-time method:

the proposed method can achieve accurate depth estimation in real-time inference. Experimental results demonstrated that the proposed method processed stereo image pairs with resolution 1242×375 at 12-33 fps on an NVIDIA Jetson TX2 module and achieved competitive accuracy in depth estimation. The code is available at GitHub - JiaRenChang/RealtimeStereo: Attention-Aware Feature Aggregation for Real-time Stereo Matching on Edge Devices (ACCV, 2020)

Not sure whether the state-of-the-art claims below still apply, since these are older:

KITTI: Results competitive to SOTA, while being real-time (8x faster than SOTA). SOTA among published real-time algorithms.

Despite its simplicity, our probabilistic method achieves state-of-the-art results for both optical flow and stereo matching on established benchmarks.
[1803.08669] Pyramid Stereo Matching Network

See also: JuliaImages Projects – Summer of Code

I wasn’t aware Uber was also into NLP: GitHub - uber-research/PPLM: Plug and Play Language Model implementation. Allows to steer topic and attributes of GPT-2 models. see blog:

My NLP investigation led me to Arxiv-nlp at: https://transformer.huggingface.co/ where I tried to auto-complete: “What’s the state-of-the art stereo matching”


Wow, thanks! I don’t need specific help with the algorithms per se, as I have a colleague who seems to have a handle on this. The main challenge has been to uncover what Julia code already exists for this and, if no complete solution exists, what sort of building blocks or libraries are most useful beyond Images.jl.