[KPhotoAlbum] Rebuilding thumbnails is not an exercise for the faint of heart

Robert Krawitz rlk at alum.mit.edu
Sun May 13 05:27:57 CEST 2018


I rebuilt all of my thumbnails today.  My thumbnail cache had gotten
polluted from some of the work I had been doing, and I was also
curious how long it would take.  It actually took about 11.5 hours for
2.5 TB of images (270,303 total).  That works out to about 60 MB/sec
and resulted in 94+ thumbnail cache files, or about 30 GB of
thumbnails.
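
As a sanity check on those totals, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 1e12 bytes), which the message does not specify, and the variable names are purely illustrative.

```python
# Figures quoted in the message above.
# Assumption: decimal units (1 TB = 1e12 bytes, 1 GB = 1e9 bytes).
total_bytes = 2.5e12        # 2.5 TB of images
total_images = 270_303
elapsed_s = 11.5 * 3600     # 11.5 hours of wall-clock time
thumb_bytes = 30e9          # about 30 GB of thumbnails

mb_per_s = total_bytes / elapsed_s / 1e6
images_per_s = total_images / elapsed_s
avg_image_mb = total_bytes / total_images / 1e6
avg_thumb_kb = thumb_bytes / total_images / 1e3

print(f"{mb_per_s:.1f} MB/s, {images_per_s:.1f} images/s")
print(f"{avg_image_mb:.1f} MB per image, {avg_thumb_kb:.0f} KB per thumbnail")
```

The averages cover the whole run, so they smooth over any difference between small and large files (and videos).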

I'm not quite sure yet just what to make of this.  The I/O rate is
less than what I would expect, but kpa was not CPU-bound either: it
used only about 15.5 CPU hours over the 11.5 hours, despite having
three threads processing images (and videos).  iostat was showing
about 40-50 MB/sec when I was watching
it, with about 100 IO/sec, both of which are a bit low for the class
of drive I'm using.
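
The CPU and iostat figures can be put side by side with a short Python sketch. It assumes the three worker threads were the only significant CPU consumers and treats the iostat readings as steady rates, neither of which the message confirms.

```python
# Figures quoted in the message above.
cpu_hours = 15.5
wall_hours = 11.5
threads = 3

# Average number of cores kept busy, and the share of what
# three fully busy threads could have used.
avg_busy_cores = cpu_hours / wall_hours
cpu_utilization = avg_busy_cores / threads

# Average transfer size implied by the iostat readings
# (40-50 MB/sec at about 100 IO/sec).
io_per_s = 100
io_size_kb = [mb * 1e6 / io_per_s / 1e3 for mb in (40, 50)]

print(f"{avg_busy_cores:.2f} cores busy on average "
      f"({cpu_utilization:.0%} of {threads} threads)")
print(f"{io_size_kb[0]:.0f}-{io_size_kb[1]:.0f} KB per I/O")
```

With the threads busy less than half the time on average and the drive below its expected rate, neither resource looks saturated on these numbers alone.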

Another way of looking at it is that each of the 94 cache files
contains thumbnails for about 2875 shots, and in the later stages
(when kpa was processing fairly uniform 10 MB files), it took 10-11
minutes per cache file.  So call that about 270/minute, or 4-5
files/second.
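
The per-cache-file estimate can be rechecked with a few lines of Python. This is only a sketch of the arithmetic, using the 10 MB file size and the 10-11 minute range quoted above; the names are illustrative.

```python
# Figures quoted in the message above.
total_images = 270_303
cache_files = 94
file_mb = 10                # "fairly uniform 10 MB files"

images_per_cache = total_images / cache_files

rates = {}
for minutes in (10, 11):
    per_minute = images_per_cache / minutes
    per_second = per_minute / 60
    # Implied read rate if every file is about 10 MB.
    rates[minutes] = (per_minute, per_second, per_second * file_mb)
    print(f"{minutes} min/cache file: {per_minute:.0f} files/min, "
          f"{per_second:.1f} files/s, {per_second * file_mb:.0f} MB/s")
```

At 10 MB per file, the implied read rate falls inside the 40-50 MB/sec range that iostat was showing.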

I have no desire to go through this again.  I guess I could take about
100GB of images, stash them on my SSD (SATA, not PCIe), and see what
happens with a throwaway database.  That would at least give me a
better idea whether the limit is CPU or I/O.
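
One rough way to tell a CPU limit from an I/O limit in such an experiment is to compare CPU time with wall-clock time for the same piece of work. The Python sketch below is an illustration only, not anything kpa does: `cpu_fraction` and `read_files` are hypothetical helpers, and `read_files` only reads the bytes without decoding or scaling anything.

```python
import time


def cpu_fraction(work):
    """Run work() and return (wall_seconds, cpu_seconds, cpu/wall).

    A ratio near 1.0 per busy thread suggests a CPU limit; a ratio
    well below that suggests the process is mostly waiting, e.g. on I/O.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # user + system CPU time of this process
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu, (cpu / wall if wall > 0 else 0.0)


def read_files(paths, chunk=1 << 20):
    """Read every file in 1 MiB chunks and return the total byte count."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                total += len(block)
    return total
```

Something like `cpu_fraction(lambda: read_files(sample_paths))`, with `sample_paths` a list of image files, would give a read-only baseline for a drive. Files read recently may be served from the operating system's page cache rather than the disk, so a second pass over the same sample can look much faster than the first.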

I did notice that the thumbnail generator opens and closes the
thumbnail cache file for each thumbnail processed.  That did not
appear to translate into extra I/O ops; my images are distributed
across two drives, and during a part when it was processing images
almost entirely on the second drive, I saw only short bursts of I/O to
the thumbnail cache rather than a steady background rate.

But it does look to me like thumbnail building may be slower than it
should be.  Certainly there should be a warning issued when someone
requests rebuilding all of the thumbnails; accidentally doing that
might be very time consuming.  It doesn't render the program useless
while it's going on, but it is certainly a big inconvenience.
-- 
Robert Krawitz                                     <rlk at alum.mit.edu>

***  MIT Engineers   A Proud Tradition   http://mitathletics.com  ***
Member of the League for Programming Freedom  --  http://ProgFree.org
Project lead for Gutenprint   --    http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton

