
Pixymon video stream while using Pan/Tilt Demo?

"I’m guessing the video you are referring to was being displayed in Pixymon, and then Rich just used a screen recorder to capture it. "

I’m confused about how this could be captured with Pixymon, since the app stops showing the video stream when tracking is active. Is there another version of Pixymon that keeps streaming video while tracking is active, one that isn’t publicly available?

Hi James,
You would like to display full processed video while tracking something in the pan/tilt demo; I understand.

The KS video shows a couple of seconds of a cobbled-together demo where you can see live, processed video and the ball being tracked. This was done in cooked mode, using calls from the PC host to control the servos. The feature didn’t work very well: the performance was kinda sucky (25 Hz, lots of latency), and it required some custom PD gains. It was removed. It might find its way back, but it needs work.
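To make the PD part concrete, here’s a rough sketch of what a host-side loop like that can look like. The gains, ranges, and the setServoPosition() call are made up for illustration; this is not the actual Pixymon or demo code.

```cpp
// Minimal proportional-derivative (PD) loop for one pan/tilt axis.
// Purely illustrative: gains, ranges, and setServoPosition() are hypothetical.
#include <cstdio>

struct PDLoop {
    float kp;        // proportional gain (hand-tuned)
    float kd;        // derivative gain (hand-tuned)
    float prevError; // error from the previous update
    float command;   // current servo position, e.g. 0..1000
};

// error = object center - frame center, in pixels; dt = time between updates
void pdUpdate(PDLoop& loop, float error, float dt) {
    float derivative = (error - loop.prevError) / dt;
    loop.command += loop.kp * error + loop.kd * derivative;
    if (loop.command < 0.0f)    loop.command = 0.0f;
    if (loop.command > 1000.0f) loop.command = 1000.0f;
    loop.prevError = error;
}

int main() {
    PDLoop pan = {0.4f, 0.1f, 0.0f, 500.0f}; // placeholder gains, centered servo
    const float frameCenterX = 320.0f;       // 640-pixel-wide frame

    // In the cooked-mode demo the PC polled Pixy for the tracked block each
    // frame (~25 Hz) and sent a new servo position back to the pan/tilt unit.
    float objectX = 400.0f;                  // pretend detection
    pdUpdate(pan, objectX - frameCenterX, 1.0f / 25.0f);
    printf("new pan command: %.1f\n", pan.command);
    // setServoPosition(PAN, pan.command);   // hypothetical host call
    return 0;
}
```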

Here’s some technical backstory:

So what is cooked mode exactly? In cooked mode, raw pixels are grabbed by Pixy and sent to the PC, and all the processing takes place on the PC side. It’s a simulation. It uses the same code that runs on Pixy’s firmware, so it’s a very accurate simulation — but it’s “cooked”. It’s great for debugging… and we use it for algorithm development (much easier to test ideas on the PC side!)
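If it helps to picture the "same code, two places" idea, here’s a toy sketch. None of these names come from the actual Pixy source; it just shows the pattern of one processing routine being fed either by the camera’s pixel stream or by raw frames replayed on the PC.

```cpp
// Sketch of the shared-code idea behind cooked mode: one processing routine,
// two callers. Names are invented; this is not the actual Pixy source layout.
#include <cstdint>
#include <cstddef>
#include <vector>

// The core algorithm: consume a row of raw pixels, update detection state.
// On the camera this is fed by the pixel pipeline as pixels arrive; in
// cooked mode it is fed by raw frames shipped to the PC over USB.
void processRow(const uint8_t* row, size_t width /*, DetectionState& state */) {
    // ... color lookup, run-length extraction, connected components ...
    (void)row; (void)width;
}

// PC-side "cooked" harness: replay a captured frame row by row, then blend
// the detection results over the raw frame for display.
void simulateFrame(const std::vector<uint8_t>& frame, size_t width, size_t height) {
    for (size_t y = 0; y < height; ++y)
        processRow(frame.data() + y * width, width);
}
```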

Why not have Pixy send back a processed video overlay on top of the raw video and have it all nicely blended, like in cooked mode?

In order for Pixy to do what it does at 50 fps (hundreds of objects, 7 color signatures), it uses lots of memory (lookup tables, queues, etc.). Pixy doesn’t have enough memory to capture a full frame of raw pixels and still have enough left over for the processing. And since Pixy only sends what it detects, there’s no need to keep all those raw pixels in memory.

More backstory… if you’re interested

Pixy has 264K bytes of memory. That’s barely enough to capture a full frame at 8 bits per pixel. So how does Pixy process an entire frame (640x400)? It uses a pixel pipeline: it processes pixels as they come in. Pixy is actually able to identify objects at the top of the frame before the entire frame has been sent to the onboard processor. The average latency is 10 ms (half a frame). A traditional vision system (grab frame, process frame, grab frame, process frame) has a latency of at least one frame period (20 ms in our case). So there’s a big advantage to doing it this way…
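Just to put the arithmetic in one place (these are the same numbers as above, nothing new):

```cpp
// Back-of-the-envelope numbers from the explanation above.
#include <cstdio>

int main() {
    const int width = 640, height = 400;        // frame size
    const int bytesPerPixel = 1;                // 8 bits per pixel
    const int frameBytes = width * height * bytesPerPixel;  // 256,000 bytes
    const int ramBytes = 264 * 1024;            // "264K bytes", treating K as 1024

    const double fps = 50.0;
    const double framePeriodMs = 1000.0 / fps;          // 20 ms per frame
    const double pipelineLatencyMs = framePeriodMs / 2; // ~10 ms average

    printf("frame buffer: %d bytes, RAM: %d bytes, left over: %d bytes\n",
           frameBytes, ramBytes, ramBytes - frameBytes);
    printf("frame period: %.0f ms, avg pixel-pipeline latency: %.0f ms\n",
           framePeriodMs, pipelineLatencyMs);
    return 0;
}
```

So a single raw frame eats nearly all of the RAM, which is why the pixel pipeline (and not keeping the frame around) matters.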

So what about latency? Isn’t update rate more important?

Latency is the ugly stepsister to update rate (aka framerate). Framerate gets all the attention… but latency will f**k your tracking algorithm up if you don’t give it some love too. At least in robotics, where there are control loops, latency and update rate are more or less equally important.
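Here’s a generic way to see why latency matters in a control loop (a common textbook-style mitigation, nothing specific to Pixy’s firmware): the position you act on is already 10 ms old, so a latency-aware loop can extrapolate it forward using the measured velocity.

```cpp
// Why latency matters in a control loop: the measurement you act on is stale.
// One common, generic mitigation is to extrapolate the target forward by the
// known latency using its measured velocity. Numbers are illustrative.
#include <cstdio>

int main() {
    const double latencyS = 0.010;   // 10 ms sensing latency
    const double dt = 0.020;         // 20 ms between updates (50 fps)

    double prevX = 300.0, measuredX = 310.0;    // pixel positions, one frame apart
    double velocity = (measuredX - prevX) / dt; // px/s, estimated from measurements

    // A naive controller acts on the stale measurement; a latency-aware one
    // predicts where the target is *now*.
    double predictedX = measuredX + velocity * latencyS;

    printf("stale position: %.1f px, latency-compensated estimate: %.1f px\n",
           measuredX, predictedX);
    return 0;
}
```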

Some algorithms (like face detection) require a traditional approach (grab, process, grab…). For face detection, and possibly other algorithms, we’re going to grab 320x200 frames to deal with the memory constraint.

Thanks for taking the time to write this detailed explanation of frame latency. I was not aware that the Pixy did not have the traditional one-frame latency.

It sounds like the best workflow would be to have the Pixy attached to a separate camera and calibrate the two image views. To avoid confusion for other people who watch the Kickstarter video, I would recommend not including shots in future videos that don’t represent the functionality of the device. You may even want to invest in writing a section on the site showing how to synchronize a Pixy with an external camera, as I imagine this is a popular use case for many Pixy owners. Overall, I am very happy with the Pixy! Thanks again!