Showing a captured video sequence with annotations

question
images
glvisualize
plotting

#1

Hi ,
For debugging an object tracking system, I need to be able to display captured images
and to easily annotate those images with text.

I am looking for a solution that does not depend on OpenCV.

I don’t care about colorants or color spaces such as RGB or HSV; I am showing 8-bit gray-level
or 32/64-bit floating-point images (like imagesc in Matlab).

I would be satisfied with 25-30 fps.

Much obliged for any comment or help,
Thanks


#2

Have you looked at @sdanisch GLPlot/GLVisualize? It should be possible to rig something up with signals to put an annotation overlay over the video.

-V


#3

That should be quite straightforward… How do you want to do the annotation?
Scrubbing through the video, stop, click on something and write some text?


#4

Hi @sdanisch , thanks for taking the time to answer this.

I get a 1200x1600 matrix of UInt8 every 15 ms, which is the output of a camera.
For each image there is a list of several interest points (segmentation centroids) along with strings (“1”, “2”, “3”, etc.), where each string should be displayed
at its respective point.
I also want to display some more information at a fixed position.

I guess I could potentially achieve this using PyPlot and matplotlib, but it turns out that for these sizes the garbage
collection kills performance … maybe because it’s happening on both sides … I don’t know exactly; anyway, it is not usable that way.


#5

Why do you think it’s garbage collection? PyPlot isn’t known to be especially fast at displaying images :frowning:

But that should be super straight forward. Let me point you to an example in a minute!


#6

here you go: https://gist.github.com/SimonDanisch/df4b3617737eb90095f92951baa6a687#file-annotated_video-jl !
This example shows how to best create different screens to show something beneath in a fixed location:
https://github.com/JuliaGL/GLVisualize.jl/blob/master/examples/introduction/screens.jl
(not such a nice high-level API yet, admittedly!)

It’s a bit slower than expected, I need to figure out why. Hope it’s still fast enough for you!


#7

Sorry @sdanisch, but that is still very far from my description… and not very constructive.
Your example shows how to display text in a certain area of a window; this is needed, not exactly what I meant but still useful.
Now how do I update
the text every frame?

And, more importantly:
How do I display the next frame?

try the following code:

using PyPlot
A = zeros(UInt8,1024,1024);
A[500:700,500:600] = 255
P = imshow(A)
an = annotate( "ZOOP!",xy=(100,100),color = (1,0,0,1))

# now let's replace the displayed data and the annotation text
A = zeros(UInt8,1024,1024);
A[500:2:700,500:2:600] = 255
P[:set_array](A)
an[:set_text]("ZOOP! Changed!!")

can this be done using GLVisualize?


#8

Have you run the code?
I’m not sure what you mean… My code is doing exactly that?
I’ve updated the example to also update the label string per frame:


#9

I definitely think GLVisualize is the best solution here, but in case it’s hard to get working…see the ImageView README about the annotation framework.


#10

Thanks @sdanisch, yes, I ran both codes; your latest code is exactly what I specified …
The first thing I notice is that, in order to render a frame, it has to be an array of a color type. As a long-time image processing engineer … I find that infuriating :slight_smile: I am dealing with A/D output … color is just the representation at the end.

There should be a one liner to view any matrix … if it has 3 layers then interpret that as RGB.

Performance though is around 1-2 fps, why is that? EDIT: (because of the sin/cos of a 1200x1600 array inside the loop)

Please don’t interpret my temperament as criticism; I value your work and have used it before
to learn OpenGL and Julia.

I suggest the following high-level API:
plot(x,y) => create a new figure and plot y versus x; return a dictionary of String=>Signal for values that can be pushed interactively
plot!(x,y) => same as above, but reuse the last used window
imagesc(M) => show a scaled matrix; return a dictionary of signals
imagesc!(M) => you get my drift…
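
A minimal sketch of the scaling step that an imagesc-style call implies, using only Base Julia. The name scale01 is hypothetical and is not part of GLVisualize, GLPlot, or Images.jl; it only shows how an arbitrary real-valued matrix could be mapped linearly onto [0, 1] before being handed to whatever display routine is used.

```julia
# Hypothetical helper (assumption, not an existing API): linearly map any
# real-valued matrix onto [0, 1], similar to how imagesc scales data.
function scale01(M::AbstractMatrix{<:Real})
    lo, hi = Float64.(extrema(M))
    # A constant image has no range to stretch; return zeros instead of NaN.
    lo == hi && return zeros(Float32, size(M))
    return Float32.((M .- lo) ./ (hi - lo))
end

frame = rand(UInt8, 1200, 1600)   # stand-in for one camera frame
gray = scale01(frame)             # Float32 values in [0, 1]
```

The result can then be wrapped in a gray color type for display, which keeps the raw A/D data untouched and treats color purely as the final representation.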

Thanks @tim.holy I will give ImageView a look.
By the way I tried to use Images.jl but the slow performance of connected components for large matrices forced me to resort to openCV.


#11

There should be a one liner to view any matrix

GLVisualize is more of a low-level graphics API, so I disagree :wink:

But feel free to develop such a higher level API. If you’re happy with it, it’d be great to open a PR to incorporate it into GLVisualize (or maybe rather GLPlot, which is supposed to offer more higher level abstractions).
I’m slowly working on a higher level API myself, but I need to prioritize lower-level improvements.

Did you get the speed you need after the EDIT? There is no fundamental reason why the GLVisualize code shouldn’t yield optimal performance. If it doesn’t, it should be considered a performance bug! Feel free to open an issue about it.


#12

There should be a one liner to view any matrix … if it has 3 layers then interpret that as RGB.

Only in a world where no one works with images that are actually 3 dimensional. Been there, tried that with the old Images, and it’s not pretty, especially given the importance of type-stability to Julia (and the fact that we have this nice type system). Julia as a whole is moving to the notion “array element = pixel”, no matter whether it’s grayscale or color. In my experience, most people seem to like this once they get used to it.

To “convert,” you can define a one-line helper function: matimage(A) = colorview(RGB, permuteddimsview(A, (3,1,2))). (The last is assuming that your array has color in the 3rd dimension.) This doesn’t copy any data, so it’s efficient.
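
To illustrate the "doesn’t copy any data" point with Base Julia only: the sketch below uses PermutedDimsArray, which, as far as I know, is the lazy wrapper that permuteddimsview builds on. The array sizes are made up for the example, and the colorview step is left out because it needs the Images/ImageCore packages.

```julia
# Base-only illustration: a lazily permuted view shares memory with the original.
A = zeros(UInt8, 4, 5, 3)             # rows x cols x channels
V = PermutedDimsArray(A, (3, 1, 2))   # channels x rows x cols, no copy made

A[2, 3, 1] = 0xff                     # write through the original array ...
seen = V[1, 2, 3]                     # ... and read the same element via the view

V[2, 1, 1] = 0x07                     # writes through the view reach A as well
```

Because both objects address the same memory, the conversion costs nothing per frame, which matters at a 15 ms frame interval.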

By the way I tried to use Images.jl but the slow performance of connected components for large matrices forced me to resort to openCV.

Interesting. My install of OpenCV.jl isn’t working right now, so can I ask you to post a performance comparison? Here’s what I get:

julia> A = rand(Bool, 1000, 1000);

julia> label = label_components(A);

julia> @time label_components(A);
  0.030611 seconds (404 allocations: 10.609 MB)

or even better:

julia> using BenchmarkTools

julia> @benchmark label_components($A) seconds=1
BenchmarkTools.Trial: 
  memory estimate:  10.59 MiB
  allocs estimate:  40
  --------------
  minimum time:     18.784 ms (0.00% GC)
  median time:      19.239 ms (0.00% GC)
  mean time:        19.662 ms (0.95% GC)
  maximum time:     27.636 ms (0.00% GC)
  --------------
  samples:          51
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%

#13

I can’t resist pointing out that’s 6 times faster than Matlab’s bwconncomp on a similar random image.


#14

Thanks for answering Tim

Only in a world where no one works with images that are actually 3 dimensional.

Actually I do work with 3D data; I think it is quite easy to distinguish MxNx3 where M,N >> 3 from
3D data. Anyway I am not religious; OpenCV also went the way of array element = pixel, and I have written
the transformation to and from representations many times.

I still think it is needed to support array of float/int/etc naturally … when I research new ideas I almost always
work in single plane images, and on a last step check if color information added any value.

For the performance comparison:
I use a gray-level image with several small circles … I threshold at 250, then run connected components and centroid extraction:

My DLL using OpenCV: 2.5 ms

Matlab:

>> tic;bwconncomp(cdata>250);toc
Elapsed time is 0.005663 seconds.
>> 

Julia:
about 3 ms for in-place thresholding,
11 ms for labeling,
and another 3 ms for extracting centroids

I have a 15 ms window; the image is 1200x1600 UInt8.
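
For readers who want to reproduce the pipeline described above (threshold, connected components, centroids), here is a minimal Base-Julia sketch. It is not the Images.jl or OpenCV algorithm and not the code that was timed; blob_centroids is a hypothetical name, it uses a simple stack-based flood fill with 4-connectivity, and the test image with two rectangles is invented for the example.

```julia
# Sketch: threshold a gray image and return the centroid of every
# 4-connected blob, found with a stack-based flood fill.
function blob_centroids(img::AbstractMatrix{<:Real}, thresh)
    rows, cols = size(img)
    visited = falses(rows, cols)
    centroids = Tuple{Float64,Float64}[]
    stack = Tuple{Int,Int}[]
    for c in 1:cols, r in 1:rows            # r is the inner loop (column-major)
        (visited[r, c] || img[r, c] <= thresh) && continue
        visited[r, c] = true
        push!(stack, (r, c))
        sum_r = 0; sum_c = 0; n = 0
        while !isempty(stack)
            i, j = pop!(stack)
            sum_r += i; sum_c += j; n += 1
            for (di, dj) in ((-1, 0), (1, 0), (0, -1), (0, 1))
                ni, nj = i + di, j + dj
                if 1 <= ni <= rows && 1 <= nj <= cols &&
                   !visited[ni, nj] && img[ni, nj] > thresh
                    visited[ni, nj] = true
                    push!(stack, (ni, nj))
                end
            end
        end
        push!(centroids, (sum_r / n, sum_c / n))   # (row, column) centroid
    end
    return centroids
end

img = zeros(UInt8, 1200, 1600)
img[100:110, 200:210] .= 0xff     # first synthetic blob
img[500:504, 900:908] .= 0xff     # second synthetic blob
cents = blob_centroids(img, 250)
```

Thresholding, labeling and centroid accumulation happen in one pass here, so no label image is allocated; whether that fits in a 15 ms budget would have to be measured on real frames.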


#15

Actually I do work with 3D data; I think it is quite easy to distinguish MxNx3 where M,N >> 3 from
3D data. Anyway I am not religious; OpenCV also went the way of array element = pixel, and I have written
the transformation to and from representations many times.

That is how we used to do it, but we got bug reports because of it. What happens when you’ve snipped out a thin slab of an image that happens to have 3 slices in it?

I still think it is needed to support array of float/int/etc naturally … when I research new ideas I almost always
work in single plane images, and on a last step check if color information added any value.

Oh, I agree. The color interpretation should only be for visualization; all of the algorithms in Images.jl should work for arrays of numbers. If not, please do file an issue.

EDIT: but of course, it’s still “one array element = one pixel”. That rule won’t change.

Performance…

Interesting. Performance does seem to depend on the contents of the image, but I’m still getting better performance in Julia than Matlab. Here’s how I generated some fake data:

julia> img = falses(1200,1600);

julia> function makebox!(img, cntr, w=5)
           inds = (cntr[1]-w:cntr[1]+w, cntr[2]-w:cntr[2]+w)
           inds = map(intersect, indices(img), inds)
           img[inds...] = true
       end
makebox! (generic function with 2 methods)

julia> for i = 1:20; makebox!(img, (rand(1:size(img,1)), rand(1:size(img,2)))); end

Even if I first do imgu = 0xff*img and then run @benchmark label_components($imgu .> 250), for me we’re still faster than Matlab (though not by as large a factor). Out of curiosity, how many cores do you have? If Matlab is multithreaded (we are not) and if you have many more cores than I do (I have only 2), then maybe that could explain it?


#16

I don’t care so much for a comparison with Matlab, as going back to Matlab is not an option for me.
I am interested in a comparison with C and would be happy even with a 2x slowdown.

For the thresholding part, multi-threaded Julia on my computer (quad core, I think) dropped to 0.6 ms.

I tried a few variants of algorithms for speeding up connected components when I know that the number of blobs is relatively small … one such algorithm is run-length encoding or, in more familiar terms, treating the thresholded image as a sparse matrix. I also tried skipping the labeling and computing centroids directly … I squeezed a few ms out, but did not converge to an elegant enough solution to brag about in forums :slight_smile:
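
A sketch of what multi-threaded in-place thresholding could look like, under the assumption that Julia is started with several threads (for example `julia -t 4`). This is not the code that produced the 0.6 ms figure; threshold! is a hypothetical name, and the output is a Matrix{Bool} rather than a BitMatrix because concurrent writes to neighboring bits of a BitMatrix are not safe.

```julia
# Sketch: fill a preallocated Bool mask with img .> thresh, one column per task.
function threshold!(mask::Matrix{Bool}, img::Matrix, thresh)
    size(mask) == size(img) || throw(DimensionMismatch("mask and img must match"))
    Threads.@threads for c in 1:size(img, 2)   # columns are contiguous in memory
        @inbounds for r in 1:size(img, 1)
            mask[r, c] = img[r, c] > thresh
        end
    end
    return mask
end

img = rand(UInt8, 1200, 1600)             # stand-in for one camera frame
mask = Matrix{Bool}(undef, size(img))     # allocated once, reused every frame
threshold!(mask, img, 250)
```

Reusing the same mask for every frame avoids per-frame allocation, which keeps the garbage collector out of the 15 ms loop.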


#17

By the way, I added the function to GLVisualize and created a slightly simpler example:


#18

I think GLVisualize is on the right track… I came up with a similar design (limited to the functionality I need) of
mixing reactive programming, OpenGL and automatic type conversions to achieve a “dynamic” OpenGL experience…

In my design there is also a direct and dynamic (auto type conversion) way to assign variables directly in the shader… maybe this is too low level to expose in GLVisualize.
@sdanisch, if you want, I can post the shader-accessing bit in a gist … my code will make you go “no, no, no, this is not how you write this …” but may trigger you to do it “right”.

Still, extensive documentation is needed … not just for each function, but also for the design and internal plumbing.


#19

Yes please post it :wink:


#20

Shaders
Usage is roughly as follows. You need an active OpenGL context (GLFW window):

S = [("glass.vert",GL_VERTEX_SHADER),("glass.frag",GL_FRAGMENT_SHADER)]
prog = GLprogram("directoryOfShaders",S)

prog["Vpos"] = rand(4,100) #set a vertex attribute(vec4) within the shader .. implicitly create a buffer and transfer data
prog["uniformVar"] = 1 #set a uniform variable inside that program
prog["_indices"] = rand(0:99,3,100) #create indices for 100 triangles randomly for indexed drawing

If you are using Atom/Juno, examine the structure of prog (structure(prog)) to see a list of uniform variables and vertex attribute variables.

This is the first thing I used Julia for … so it is supplied as is, in the hope that you may benefit from it.