[KPhotoAlbum] Search performance

Andreas Schleth schleth_es at web.de
Thu Oct 18 23:02:41 CEST 2018


Hi everybody,

On 18.10.18 at 21:26, Johannes Zarl-Zierl wrote:
...
> I should have mentioned in the first mail what I mean by "canonical file
> format". I've no problems with storing the data into a persistent database for
> caching.
> But I still think that the index.xml format has good properties (resistant
> against file corruption, easy/robust versioning, readable and writable "by
> hand"). Also, many people use kphotoalbum on different machines in different
> versions - with the XML format, you can easily pull that off as long as you
> take some care.

Yes, yes, yes!

E.g., I still use an old KPA 4.2 with all the glorious KIPI plugins to
shift timestamps (when someone gives me pictures whose date/time is off,
I sync them with my own images). This works nicely with index files that
are otherwise used with the latest git master.

And I occasionally tweak the database manually. E.g., setting the time to
somewhere between Jan 1st and Dec 31st makes the image show up in at
least 2 consecutive years. Changing this to Jan 2nd and Dec 30th takes
just two commands in vim.
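
For readers who prefer a script over vim, the two substitutions could look roughly like the sketch below. It is only an illustration: the sample XML, the element name `image` and the attribute names `startDate` and `endDate` are assumptions about the index.xml layout, and `shrink_year_ranges` is a made-up helper name.

```python
import re

# Hypothetical excerpt. The element and attribute names (image, startDate,
# endDate) are assumptions about index.xml, not taken from this message.
SAMPLE = '''<images>
  <image file="a.jpg" startDate="2004-01-01T00:00:00" endDate="2004-12-31T23:59:59"/>
  <image file="b.jpg" startDate="2005-06-12T10:15:00" endDate="2005-06-12T10:15:00"/>
</images>'''


def shrink_year_ranges(xml_text):
    """Move Jan 1st start dates to Jan 2nd and Dec 31st end dates to Dec 30th."""
    # First substitution: start of the range, Jan 1st becomes Jan 2nd.
    text = re.sub(r'(startDate="\d{4})-01-01T', r'\1-01-02T', xml_text)
    # Second substitution: end of the range, Dec 31st becomes Dec 30th.
    text = re.sub(r'(endDate="\d{4})-12-31T', r'\1-12-30T', text)
    return text


result = shrink_year_ranges(SAMPLE)
print(result)
```

Like a global substitution in vim, this also touches images that really were taken on Jan 1st or Dec 31st, so it is best run on a copy of the index file and checked by hand.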

Even though I am usually a bit critical of XML because it is a bit chatty
(lots of text in names and attributes), it has the great benefit of
being very robust. Robustness must come first, then code that future
maintainers can understand, then performance. We are talking about
data that we want to keep for (many) years to come. My own databases
date back to 2004/2005, when Blackie himself twiddled with the code.
This is at least 2 generations of maintainers back.

Thus, everybody involved did a really terrific job in keeping the file 
format stable and backwards compatible over so long a time frame.

> If we take the caching approach, we should be able to eat our cake (index.xml
> format, fast queries) and still have half of it (usually fast loading with
> "slow" saving to index.xml).
>
I somewhat doubt that a large number of images really makes loading much 
slower. There are other factors too, such as (maybe) total tree size or 
type and size of media.

My image databases all load fairly quickly - all around 30 to 40k images:

as@wshome5:~/eigene_Bilder> time kphotoalbum -c index.xml
real    0m8,219s
user    0m5,219s
sys     0m0,448s
(open & close without save / tree size: 141,449,556 kB / 35457 images / 
index 31 MB)

My movie database with only around 1k clips and movies takes "forever" 
to load:

as@wshome5:~/Filme> time kphotoalbum -c index.xml
real    0m40,944s
user    0m8,874s
sys     0m4,718s
(open & close without saving / tree size: 1,568,340,336 kB / 1100 films 
/ index 1,7 MB)

This big difference suggests to me (I did not look into the code) that
KPA takes much longer to look at a few large files than at many smaller
ones. The index size cannot explain it: the movie index is only 1.7 MB,
versus 31 MB for the images.
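
The two measurements above can be broken down a bit further with a small script. This is only a rough sketch: `parse_seconds` and `summarize` are made-up helper names, and treating "real minus (user + sys)" as time spent waiting (e.g. on NFS reads) is an assumption that only holds roughly for a mostly single-threaded startup.

```python
def parse_seconds(value):
    """Convert a bash `time` value such as '0m8,219s' to seconds."""
    minutes, rest = value.split("m")
    # The locale used above prints a decimal comma, so normalize it first.
    return int(minutes) * 60 + float(rest.rstrip("s").replace(",", "."))


def summarize(real, user, system, items):
    """Split wall-clock time into CPU time and (assumed) waiting time."""
    real_s = parse_seconds(real)
    cpu_s = parse_seconds(user) + parse_seconds(system)
    return {
        "real_s": real_s,
        "cpu_s": cpu_s,
        "wait_s": real_s - cpu_s,  # assumption: not on CPU means waiting
        "ms_per_item": 1000 * real_s / items,
    }


# Figures copied from the two `time` runs above.
images = summarize("0m8,219s", "0m5,219s", "0m0,448s", 35457)
films = summarize("0m40,944s", "0m8,874s", "0m4,718s", 1100)

for name, row in (("images", images), ("films", films)):
    print(f"{name}: real {row['real_s']:.1f} s, cpu {row['cpu_s']:.1f} s, "
          f"waiting {row['wait_s']:.1f} s, {row['ms_per_item']:.2f} ms per item")
```

Under that assumption, the movie database spends roughly two thirds of its load time off the CPU, and each film costs far more wall-clock time than each image, which fits the guess that reading a few large files is the slow part.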

All my files sit on an NFS share (spinning rust) accessed via Gigabit Ethernet.


Just my thoughts.

Best regards & thanks for keeping the project alive!

Andreas

