C++ impl of DivQuant clustering

Nov 11, 2015

DivQuant is an incredibly fast clustering algorithm originally written by M. Emre Celebi. The original C source code was ported to templated C++ to produce a fast implementation optimized for mobile devices.

In my previous post, the subject of color sorting was covered using a different C++ clustering implementation. In this post, the same clustering logic is described except that a significantly faster implementation is provided. This DivQuant implementation is actually 5x faster than the GVM implementation and it uses minimal memory at runtime.

The remainder of this post shows the same image and pixel sorting examples shown in my previous post.

What is pixel sorting? The best way to explain is to just show some examples. This example image is serrano.png, it is a well known example image in the graphics community.

    Sun Image

The serrano image is sparse meaning that unique pixels are not densely packed in terms of the 3D colorspace. If one were to extract all the pixels and order them by int order that would look like the following:

    Sun Image

There is clearly some sort of relationship between the pixels in int sorted order, but the usefulness is not directly obvious. Clustering pixel values as a means to sort by color would be more useful.

Clustering as a 3D sort method

The really tricky part of sorting by RGB pixel color is that the RGB values represent a colorspace in 3 dimensions. Sorting the pixels is a rearrangement in 2D in a way that captures the 3D relationships between color components. The DivQuant example project contains an XCode example project that is able to read a PNG image and generate a cluster sorted output. For the serrano image, the pixels broken into clusters would look like the following:

    Sun Image

The image above shows serrano pixels clustered into N = 256 clusters. Each cluster contains the other pixels closest to the cluster centers sorted from darker to lighter pixels. These cluster sorted pixels can be combined together to form a final sorted output like the following:

    Sun Image

The sorted order basically flattens out each cluster so that the final output image is generally sorted from dark to light. Pixel colors nearer to each other in 3D space are near each other in the 2D sorted order. Note for example how the light blues appear in the same line in the output image.

Lenna example

This next example shows the Lenna.png image, another well known image in the graphics community. DivQuant clusters the pixels in this image in under a second.

    Sun Image

When this image is processed into sorted clusters, the output is the following:

    Sun Image

With clusters combined back into a simple 2D image, the sorted pixels are the following:

    Sun Image

How is color sorting useful?

Sorting pixels based on clustering in 3D space has a number of applications including image segmentation, data compression, and image filtering. But without a very fast clustering implementation, one would have a tough time even getting started. This DivQuant in C++ example provides a high performance Open-source implementation with very low memory requirements.