
1296×976 versus 316×208: A Zoom-In Function?


I see that the actual resolution of the imaging chip is 1296X976 but when the Pixy2 is doing color tracking it uses only 316X208 resolution.

My first question is how does it down-grade the image from that high resolution to that very low resolution? Is it choosing every 4th pixel in every 4th row? Is it averaging pixels in some 4X4 set of pixels? Or what?

My other question is this: Could there be a function included, when driving it from an Arduino or similar processor, that allows the Pixy2 to be told WHICH 316X208 section of the overall 1296X976 array to be used for its image processing? I can envision a situation where the controlling program KNOWS approximately where in the field of view the desired object resides, and such a function call could effectively tell the Pixy2 to ZOOM IN on just that section of the image for processing. In fact, in my particular application that’s exactly what I need.


Hello Roger,
The downscaling of the image is done by averaging the pixels, not by cropping the image. This way you still get the same field of view, as well as reduced noise.
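Roughly, block averaging versus simple decimation looks like this (a simplified sketch: the 4×4 block size, grayscale values, and integer math are my assumptions here, and the actual firmware's scale factors and Bayer handling differ):

```python
# Sketch: downscale a 2D image by averaging each block of pixels,
# versus simply keeping every Nth pixel (decimation).
def downscale_average(img, block=4):
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h - h % block, block):
        row = []
        for bx in range(0, w - w % block, block):
            total = sum(img[y][x]
                        for y in range(by, by + block)
                        for x in range(bx, bx + block))
            row.append(total // (block * block))  # block mean
        out.append(row)
    return out

# Decimation for comparison: keeps all the noise of single pixels.
def downscale_decimate(img, step=4):
    return [row[::step] for row in img[::step]]
```

Averaging smooths single-pixel noise spikes that decimation would pass straight through, which is why it also reduces noise.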

Your suggestion about changing the resolution and field of view is interesting. It may be possible by manipulating various registers on the camera chip, but it would require a modification of the firmware. The short answer is that it's not currently supported, but it might be possible. I'll add it to our list of potential future features. 🙂




Thank you for replying to my post.

I would think that averaging the pixels could easily produce RGB values in the downscaled image that no longer match the color signatures it's looking for, even if a few of the original pixels were very good matches. Granted, arbitrarily selecting one pixel (e.g. the central one) from each set of to-be-downscaled pixels isn't a very good approach either. Either way, I'd think you'd get a lot of missed signature matches.
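To put some purely made-up numbers on that concern: a single strongly colored pixel averaged into a block of background pixels gets largely washed out.

```python
# Hypothetical illustration: one strongly red pixel averaged with
# fifteen gray background pixels in a 4x4 block.
def average_block(pixels):
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) // n for c in range(3))

block = [(200, 20, 20)] + [(80, 80, 80)] * 15
print(average_block(block))  # → (87, 76, 76): the red is nearly gone
```

A signature tuned to the original red would likely miss the averaged result.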

As for my suggestion about allowing a specified (un-downscaled) area of the image to be used: I've been looking through the firmware source code to try to get a good handle on how it all works, in hopes that I might actually be able to contribute that capability to the project. I could really, really use it for the project I want to do with the Pixy2. I've got over 40 years' experience as a software engineer, so it's just possible that I might be able to do it. But the source code doesn't give a really good idea of what's actually going on inside the firmware. Is there any documentation that gives a good overview of how it all works? For example, I could not find the place in the firmware where it actually does the downscaling, or where it acquires the pixel data from the image chip.

And in general, is there any documentation that describes how the signature matching works? My initial testing of the device suggests that it's pretty sensitive to variations in the brightness of the object: perhaps it's looking for specific RGB values, so that if a target pixel is brighter or dimmer, but essentially the same color, it misses it as a match.

Some time ago, before the Pixy products were available, I tried doing basically the same thing that Pixy does, using just an Arduino and a camera. The "signatures" I used essentially captured the RGB values RELATIVE to one another, so that even if a target pixel was lighter or darker it could still be recognized as a match. So I'm curious how it's actually done in the Pixy products. Is there any documentation on that?
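The idea was something like this (a reconstructed sketch, not my original Arduino code; the 5% tolerance is arbitrary): compare chromaticity (each channel divided by the channel sum) instead of raw RGB, so overall brightness cancels out.

```python
# Brightness-relative matching: normalize each channel by R+G+B.
def chromaticity(r, g, b):
    s = (r + g + b) or 1  # avoid division by zero on pure black
    return (r / s, g / s, b / s)

def matches(pixel, sig, tol=0.05):
    return all(abs(a - b) <= tol
               for a, b in zip(chromaticity(*pixel), chromaticity(*sig)))

# A darker version of the same hue still matches the signature:
print(matches((200, 40, 40), (100, 20, 20)))  # True
```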

  • Roger Garrett


Hello Roger,
I’m sorry that the source code isn’t well documented. It’s been on our list of things to do, but it keeps getting pushed down.

Regarding the basic method: Pixy converts RGB into a 2-dimensional space, (R−G)/Y and (B−G)/Y, where Y = R+G+B. This 2-space is the detection space from which the signatures are constructed.
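In rough Python terms (my own floating-point illustration of the formula; the firmware itself uses fixed-point arithmetic), the conversion looks like this. Note that dividing by Y makes the coordinates insensitive to overall brightness:

```python
# Pixy's detection space as described above:
# Y = R + G + B, u = (R - G) / Y, v = (B - G) / Y
def detection_space(r, g, b):
    y = r + g + b
    return ((r - g) / y, (b - g) / y)

print(detection_space(120, 60, 30))   # roughly (0.286, -0.143)
print(detection_space(240, 120, 60))  # same point: brightness cancels
```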

The code you’re referencing is mostly written in assembly.

  • frame_m0.c handles clock sync and grabbing raw frames, including the downscaling.
  • rls_m0.c handles the initial component differencing and a lookup to see if a given pixel could potentially be part of a signature.
  • Blocks.cpp handles the rest: dividing by Y, final thresholding, and connected components.

(I’m just talking about the CCC algorithm here.)

I’ve probed around the code quite a bit. It’s not well documented, agreed.



Thank you for all the info.

As a software engineer I’ve encountered lots and lots of code that is, shall we say, lacking in documentation. It often doesn’t seem to be a priority or a requirement. The engineer rightfully sees his job as getting it working as quickly as possible, but the supervisor should also be concerned about maintenance. In so many cases, though, the supervisor is also an engineer and abides more by the “get it done” philosophy than the “get it done right” philosophy. I learned early on to include documentation right in the code, so that if someone else ever needs to look at it, understand it, or fix it, all the info is right there in the source. Even now, as a retired engineer writing code that only I will ever see, I continue to do in-line documentation.


I’ve been looking at the documentation for the Aptina MT9M114, the image sensor and processing chip used on the Pixy, and it sure looks like it’s fairly straightforward to specify how much, and which part, of the image to return. So it looks like it ought to be fairly straightforward to add a few methods to the Arduino (and other) APIs that would, essentially, “zoom in” on an area of interest. I’m thinking that would be a welcome addition to the Pixy’s functionality.
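The windowing arithmetic itself is simple. Here's a hypothetical helper sketching what such an API call might compute (the function name, and how the result would map onto actual MT9M114 registers, are my inventions; the real register names and semantics would have to come from the datasheet):

```python
# Given a desired center point in full-frame coordinates, compute a
# 316x208 window clamped inside the 1296x976 active array.
SENSOR_W, SENSOR_H = 1296, 976
WIN_W, WIN_H = 316, 208

def window_for_center(cx, cy):
    x = min(max(cx - WIN_W // 2, 0), SENSOR_W - WIN_W)
    y = min(max(cy - WIN_H // 2, 0), SENSOR_H - WIN_H)
    return x, y, WIN_W, WIN_H

print(window_for_center(648, 488))  # → (490, 384, 316, 208), centered
print(window_for_center(0, 0))      # → (0, 0, 316, 208), clamped to corner
```

The controlling program would pick the center from a previous detection, and the firmware would program the sensor's window registers accordingly.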

I’ll continue to look over the code (thanks for pointing out which files are the relevant ones) and see what I can come up with.

  • Roger