[KPhotoAlbum] Search performance

Robert Krawitz rlk at alum.mit.edu
Fri Oct 19 00:39:47 CEST 2018


On Thu, 18 Oct 2018 23:02:41 +0200, Andreas Schleth wrote:
> On 18.10.18 at 21:26, Johannes Zarl-Zierl wrote:
> ...
>> I should have mentioned in the first mail what I mean by "canonical file
>> format". I've no problems with storing the data into a persistent database for
>> caching.
>> But I still think that the index.xml format has good properties (resistant
>> against file corruption, easy/robust versioning, readable and writable "by
>> hand"). Also, many people use kphotoalbum on different machines in different
>> versions - with the XML format, you can easily pull that off as long as you
>> take some care.
>
> Yes, yes, yes!
>
> E.g.: I still use an old KPA 4.2 with all the glorious KIPI plugins
> to shift timestamps (when someone gives me pictures with the
> date/time off, so that I can sync them with my own images). This
> works nicely with index files otherwise used with the latest git
> master.

I don't really disagree, just note that this is going to be the
limiting factor in startup and save performance.
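
For illustration, a rough sketch in Python (not KPA code) of how a
cache could sit next to a canonical index.xml: the cache is trusted
only while the size and mtime recorded in it still match index.xml.
The function name `load_index`, the JSON cache file and the flat
`<image .../>` layout are assumptions made up for this example; a real
cache would more likely be a proper database.

```python
import json
import os
import xml.etree.ElementTree as ET


def load_index(xml_path, cache_path):
    """Return (images, cache_hit); index.xml always stays authoritative."""
    st = os.stat(xml_path)
    stamp = [st.st_mtime_ns, st.st_size]  # cheap fingerprint of index.xml

    # Fast path: use the cache only if it was built from this exact file.
    try:
        with open(cache_path, encoding="utf-8") as f:
            cached = json.load(f)
        if cached["stamp"] == stamp:
            return cached["images"], True
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing, unreadable or stale-format cache: fall back to XML

    # Slow path: parse the canonical XML, then rebuild the cache from it.
    # Assumption: one <image> element per file, metadata in its attributes.
    root = ET.parse(xml_path).getroot()
    images = [dict(el.attrib) for el in root.iter("image")]
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"stamp": stamp, "images": images}, f)
    return images, False
```

With something like this, the first start after any change to index.xml
(by hand, or by another kpa version) still pays for the full parse;
only an unchanged file takes the fast path.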

>> If we take the caching approach, we should be able to eat our cake (index.xml
>> format, fast queries) and still have half of it (usually fast loading with
>> "slow" saving to index.xml).
>>
> I somewhat doubt that a large number of images really makes loading
> much slower. There are other factors too, such as (maybe) total tree
> size or type and size of media.

I've measured it :-)

> My image databases all load fairly quickly - all around 30 to 40k images:

I have 275K images; it currently takes about 12 seconds to start up.
That's long enough to be annoying if I want to quickly check some
images.

> as at wshome5:~/eigene_Bilder> time kphotoalbum -c index.xml
> real    0m8,219s
> user    0m5,219s
> sys     0m0,448s
> (open & close without save / tree size: 141,449,556 kB / 35457 images / index 31 MB)
>
> My movie database with only around 1k clips and movies takes "forever" to load:
>
> as at wshome5:~/Filme> time kphotoalbum -c index.xml
> real    0m40,944s
> user    0m8,874s
> sys     0m4,718s
> (open & close without saving / tree size: 1,568,340,336 kB / 1100 films / index 1,7 MB)
>
> This big difference tells me (I did not look into the code) that
> looking at a few large files takes KPA much longer than looking at
> many smaller ones...

What version of kpa are you using, and on what CPU?

There *shouldn't* be any difference in startup time depending upon
storage or file type *unless* you have search for new files on startup
turned on, in which case kpa has to scan the image directory tree at
startup.  I can't judge that without knowing more about your
filesystem structure.  I'm very surprised by your results, unless
you're running an old version of kpa.
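
To show why that option matters, here is a minimal Python sketch of
what a search for new files amounts to.  It is not the actual kpa code;
`find_new_files` and the set of known relative paths are hypothetical
names for this example.  The point is that the cost depends on the
number of directories and files in the tree, each of which has to be
listed over the network on NFS, rather than on what is in index.xml.

```python
import os


def find_new_files(root, known):
    """Return files under `root` whose relative path is not in `known`.

    `known` is a set of '/'-separated paths relative to `root`, standing
    in for the files already recorded in the database.
    """
    new = []
    # os.walk lists every directory in the tree exactly once.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = rel.replace(os.sep, "/")
            if rel not in known:
                new.append(rel)
    return sorted(new)
```

Even when nothing is new, the whole tree still gets walked before the
answer is known.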

> All my files sit on a NFS share (spinning rust) via GB Ethernet.

NFS is not a good storage back end for KPA or anything else that works
with a lot of files.  The scout thread I implemented in kpa 5.4 should
help when actually loading new files.
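
A back-of-the-envelope check on the timings quoted above, written as a
small Python sketch so the arithmetic is reproducible.  The helper
names are made up for this example.  Subtracting user and sys time
from real time gives the wall-clock time the process did not spend on
the CPU; that is only a rough indicator (with several threads it
underestimates the waiting), and it does not say what was being waited
on.

```python
def parse_seconds(text):
    """Convert `time` output such as '0m40,944s' (decimal comma) to seconds."""
    minutes, rest = text.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s").replace(",", "."))


def off_cpu(real, user, system):
    """Wall-clock seconds not covered by user + sys CPU time."""
    return parse_seconds(real) - (parse_seconds(user) + parse_seconds(system))


# Figures copied from the two runs quoted above.
photos = off_cpu("0m8,219s", "0m5,219s", "0m0,448s")
movies = off_cpu("0m40,944s", "0m8,874s", "0m4,718s")
print(f"photos: {photos:.1f}s off-CPU, movies: {movies:.1f}s off-CPU")
# prints: photos: 2.6s off-CPU, movies: 27.4s off-CPU
```

So roughly two thirds of the movie database's 41 seconds were spent
off the CPU, which is at least consistent with waiting on storage.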
-- 
Robert Krawitz                                     <rlk at alum.mit.edu>

***  MIT Engineers   A Proud Tradition   http://mitathletics.com  ***
Member of the League for Programming Freedom  --  http://ProgFree.org
Project lead for Gutenprint   --    http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton

