# Panderings in median calculations

Generating a mean or average value is computationally simple and memory efficient. Generating the median, however, is expensive and bloaty. Since it literally took a day of running on my computer to generate the median for a big astrophotography shoot, I was wondering if there weren’t a faster way, so here is the result of a small experiment I ran at the nudging of a good friend.

# Algorithmic differences

## Mean

The reason that means are easy is that you can use a single accumulator for each pixel and one counter for overhead. The average of each pixel the sum of that pixel value in every source image divided by the number of source images. Imaging programs can calculate the values just about as fast as they can read in the source files.

## Median

Selecting the median involves sorting all of the pixel values from each of the source images and choosing the one in the middle. For each pixel, the computer needs to keep track of the pixel values for every source image and then do costly sorting routines before obtaining the median.

# Testing in ImageMagick

ImageMagick is my favorite imaging program by far. It’s chock full of profressional tools that usually seem to outperform the likes of Photoshop or the GIMP. Generating a mean or median is pretty simple too.

```convert input*.jpg -evaluate-sequence mean mean.jpg
convert input*.jpg -evaluate-sequence median median.jpg
```

These are the commands I used in my trial. I took ten source images of the sky at 5472×3648 resolution with 16 bit color channels. These were straight out of the RAW sources from my camera. My theory was that by using different image formats or different methods I could significantly reduce the time required to generate the median.

## File formats

The source images are TIFF exports at 16 bits per channel from LightRoom of the RAW images. I chose TIFF because I wanted no loss in the source data.

JPEG was the next obvious choice to test against. It’s known for its speed, not only algorithmically, but because it has excellent hardware support. There’s practically no standard image format that can be decoded faster than JPEG.

Non-standard formats, however, might be able to beat JPEG. Here, non-standard basically means that people don’t share these over the Internet or transfer imagery this way (compare to PNG, JPEG, WebP, TIFF, BMP, etc…) PNM is a simplified format that essentially records the raw data with few headers and little structure. It’s fast because it required almost no parsing.

Finally, I was intrigued to learn about ImageMagick’s native .mpc format, which is a directly pageable pixel cache saved on disk. It’s only meant for in-transit work because there’s no assumed compatability across compiles or platforms. Similar to PNM, the data format is simplified and is nothing more than taking the in-memory structures of the image and saving those to disk. This requires absolutely no parsing.

## Methods

Only two methods piqued my interest. The basic method is to pass in all of the images at once into `convert` and let it do its things. The second method involves splitting the image into tiles and operating independently on those tiles. The goal is reduced memory usage to allow ImageMagick to do more work in RAM without resorting to the much slower repeated disk access, even on my fast Apple SSD.

Tiling the image is a bit more complicated and took a `bash` script to do properly. The tiles were chosen to be rows twenty-four pixels high, yielding 152 rows that were 5472 pixels wide and were generated based on the following command.

```stream -map rgb -storage-type quantum \
-extract 5472x24+0+\$ROW input-\$N.tif - |\
convert -depth 16 -size 5472x24 rgb:- input-\$N-row-\$ROW.tif
```

# Results

The results of this experiment were surprising, but they confirmed a suspicion I had: ImageMagick is smart and realy good at what it does. None of the methods I found made a big impact on performance. In contrast to being a let-down, I believe what this tells us is that ImageMagick is tuned for the median to handle large inputs and limit memory use.

Format / Method Size of images User time
TIFF 104 MB 5m 47s
JPEG 12 MB 5m 55s
PNM 114 MB 5m 45s
MPC 152 MB 6m 26s
TIFF tiles 770 KB 5m 27s

It was surprising that the tiling didn’t noticeably speed up the process. It was surprising that TIFF images outperformed JPEG images (which are only 8 bit per channel, by the way). I only ran the tile test against TIFF images because they were seemingly fast enough, and after seeing the results from the other tests, I didn’t think it necessary to repeat with the other formats. JPEG is clearly a winner on storage space but TIFF wasn’t the performance elephant I expected it to be. In other words, you don’t end up trading out performance for flexibility and accuracy when using TIFF instead of JPEG as an intermediate format.

N.B. ImageMagick does give the caveat that the MPC format is especially useful when repeatedly reading and parsing an image, which this experiment doesn’t do. The fact that it scored slower and involved larger files is a poor reflection of its intended use cases.

# Remember the mean?

Oh yeah, this post started out with a comparison? How did the previous tests perform when calculating the mean? I’m glad you asked.

Format / Method User time Speedup
TIFF 9.7s 35.8x
JPEG 7.1s 50.0x
PNM 3.6s 95.8x
MPC 5.5s 59.3x
TIFF tiles 4.4s 74.3x

Quite the speedup. Here you can see the TIFF images starting to fall behind the performance curve.

In the end, I learned that PNM is a darn fast format and it even supports high bit depth images. I also learned that I’m just stuck if I want to caclulate the median of a huge stack of images. My day-long calculation, for instance, was composed of 518 source images at 20 megapixels each.

Now theoretically, this doesn’t seem like it should take so long. Consider that we are sorting twenty million arrays, each 518 elements long. The quicksort algorithm has approximately `n log(n)` complexity, which equates to `20e6 · C · 518 · 24.25 ≅ 93e9·C` where `C` is the time it takes to perform one sort iteration. My 2.7 GHz processor can do one hundred billion operations in 37 seconds, so unless it takes ten minutes to sort 518 integers (actually, it’s forty minutes total but we have three color channels) we have some major work being done that isn’t calculating the median – most likely disk access. Correspondingly, the actual CPU utilization was low during these calculations, leading me to suspect IO bottlenecks even more.

If indeed we have an IO bottleneck, it should be pointed out that tiling gains another important advantage. We can generate the tile medians in parallel, using a `Makefile` for easy concurrency. Since I’m not a `make` guru, I scripted together a pretty sloppy `Makefile` to generate the tiles and the stripes and medians of those stripes. Note that I’m reporting the real time here instead of the user time because the user time on multiple cores is longer than the actual real time.

Format / Method Real time Speedup
TIFF tiles in parallel 1m 49s 3.18x
MPC tiles in parallel 1m 50s 3.51x

The additional test for MPC files here was just in case reading all of these files several times would bring in the parsing-time speedup, but apparently they didn’t. The 3x+ speedup corresponds roughly as expected for my quad-core machine – parallel builds with make for the win.

# Conclusion

ImageMagick does a better job handling memory and file formats better than I can. There wasn’t enough gain switching to any other format to justify the switch – they just complicate the build process. For any reasonable gains in these situations, the necessary step is to split up your images or your process into independent groups that can be run in parallel.

photography