[KimDaBa] Re: frontend to help classify images (fwd)
Jesper K. Pedersen
blackie at klaralvdalens-datakonsult.se
Mon May 24 11:36:28 CEST 2004
Please send questions to the kimdaba mailing list, so other people get a
chance to help me answer user questions.
On Friday 21 May 2004 06:25, John at Vicherek.com wrote:
| Hi Jesper,
| In response to the below email I was recommended to take
| a look at your kimdaba - it's quite impressive !
| It seems that it is quite close in many aspects to what
| I'm looking for.
| For example, you already have the "hierarchical"
| classification, i.e. Locations you can specify hierarchy of
| country, state, city and then select at any level of
| I was wondering whether there is a mode in which I use
| flag an image with Yes/No as I run through 5000 images, e.g.
| when SPACE is pressed, the image is flagged as No and next
| is displayed, when ENTER is pressed, the image is flagged as
| Yes and next is displayed.
This is something I've been thinking of myself (my context was that my
girlfriend wanted some of our images printed, so she needed some way to mark
some of them).
I haven't really found a good way yet; I would like it to be a bit more
general than just marking yes/no.
The obvious thing would be that you could set some options for each image,
thus your yes above would translate into "add keyword yes".
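To make the "add keyword" idea concrete, here is a minimal sketch of how keys could map to keywords. Everything in it (KEY_ACTIONS, flag_images, the key names) is made up for illustration and is not KimDaBa code or a planned design.

```python
# Hypothetical sketch: none of these names exist in KimDaBa.
# Each key maps to the keyword that gets added to the current image.
KEY_ACTIONS = {
    "SPACE": "no",   # flag as No, then show the next image
    "ENTER": "yes",  # flag as Yes, then show the next image
}


def flag_images(images, keypresses):
    """Pair each image with one keypress and record the mapped keyword.

    A key with no mapping leaves that image without a keyword.
    """
    keywords = {name: set() for name in images}
    for name, key in zip(images, keypresses):
        keyword = KEY_ACTIONS.get(key)
        if keyword is not None:
            keywords[name].add(keyword)
    return keywords


result = flag_images(
    ["img1.jpg", "img2.jpg", "img3.jpg"],
    ["ENTER", "SPACE", "ENTER"],
)
```

Because the mapping is just a table, it could hold any keyword, which is the "more general than yes/no" part.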
I'd like to put this on the TODO list, but I'm afraid to say that I'm
overworked for the next release with getting KIPI working.
And as always, if you need a feature desperately for real business, you always
have the option of paying me to implement it.
| What would it take to develop such a mode ?
| If you could comment please on my email below.
I'm afraid I don't have time to read the longer part, so let me know if my
answer above didn't completely answer what you need.
| Thanks so much for writing kimdaba,
| ---------- Forwarded message ----------
| Date: Sat, 8 May 2004 14:02:32 -0400 (EDT)
| From: h121 at ied.com
| To: gimp-user at lists.xcf.berkeley.edu
| Subject: frontend to help classify images
| I'm looking for a front-end tool to help me process
| (qualify and classify / catalog) about 10,000 scanned
| I've promised someone to process a bunch (10,000) of
| images, and am realizing it might not be as simple as I
| I'm soliciting, in this forum of image processing experts,
| experiences and suggestions for possible solutions. I am a
| complete newbie into image processing, with zero experience
| (I know Gimp exists), so please don't think that I know what
| I want. I only have an idea of what the ideal target should
| roughly look like.
| Before I start, first is a meta-question - is this a good
| forum to ask this at ? In which forum(s) would it be good to
| ask these types of questions ? (I've noticed that gimp-perl
| has on average only 1.5 posts/mth)
| I'm looking for some tool(s) that would help me qualify
| and classify / catalog a bunch of images. I can easily build
| myself the database structures I need in MySQL or
| PostgreSQL. I'm having trouble finding the frontend tool
| which would allow me to view (and manipulate a little) the
| image and would be an effective data entry tool for these
| qualifications / classifications. I'm not so concerned at
| this point about the viewing of the images once they are all
| processed, although I imagine that the same tool might be
| also used for viewing these images at the end. I would
| prefer to run it all on linux, but would settle for Windows
| some integrating is needed. (I also know C++ and Java, but
| would prefer not to use them for this, if possible.)
| I have about 4,000 photos w/EXIF info, but I'll write
| about those at the bottom of this post.
| More importantly, I have about 7,000 B&W (dithered) scans
| of docs of various contrasts (sometimes light gray), all of
| them are text (no photos). I currently have them all (99%)
| in pixel format (png and pbm), about 1% are jpegs. None are
| multipage scans, all are single-image scans.
| I need to classify them in several "dimensions", but
| elements / attributes of those dimensions may vary based on
| the type of content the document carries;
| I need to build a searchable database, so I can find them
| by specifying a criteria in one or more dimensions. E.g.
| "all expense docs from `Botanical Gardens' involving period
| June 23, 2003 to July 23, 2003", and a set of 140 image
| files would fall out for display / browsing.
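A search across several dimensions like the one described above could look roughly like the following. This is only a sketch: it uses Python's built-in sqlite3 so it runs on its own (the post mentions MySQL or PostgreSQL), and the table name, column names, and sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical schema: one row per scan, with a few of the dimensions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scans ("
    " file TEXT, kind TEXT, entity TEXT,"
    " period_from TEXT, period_to TEXT)"
)
conn.executemany(
    "INSERT INTO scans VALUES (?, ?, ?, ?, ?)",
    [
        ("a.png", "expense", "Botanical Gardens", "2003-06-25", "2003-07-01"),
        ("b.png", "income", "Botanical Gardens", "2003-06-25", "2003-07-01"),
        ("c.png", "expense", "Botanical Gardens", "2003-08-01", "2003-08-05"),
    ],
)


def find_scans(conn, kind, entity, start, end):
    """Return files of the given kind and entity whose period overlaps
    the range start..end (ISO dates compare correctly as strings)."""
    rows = conn.execute(
        "SELECT file FROM scans"
        " WHERE kind = ? AND entity = ?"
        " AND period_from <= ? AND period_to >= ?"
        " ORDER BY file",
        (kind, entity, end, start),
    )
    return [row[0] for row in rows]


matches = find_scans(
    conn, "expense", "Botanical Gardens", "2003-06-23", "2003-07-23"
)
```

Here "involving period" is read as "the document's period overlaps the requested one"; a stricter "fully inside" test would only change the two date comparisons.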
| I would really hope to have a frontend which would be
| fully controllable via kbd, just because kbd is so much
| faster to use than mouse (for most things (*1))
| Key  Meaning
| a    "This is another page from the same doc. Write it into the DB and
|      display next scan."
| b    "This page is blank - doesn't contain any information" 6.1
| n    "This scan is a page of another doc. Close the previous logical
|      doc."
| d    "Add this page to a doc that has been created before"
| s    "Start a new doc."
| f    "This page pertains to finances." 6.2
| c    "This page pertains to finances / income." 6.2.1
| e    "This page pertains to finances / expense." 6.2.2
| l    "This page pertains to legal." 6.3
| i    "This page pertains to info." 6.4
| (*1) - mouse comes in handy for only two actions: see G4 and
| G7 below
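The key table above amounts to a dispatch table plus a little state for grouping pages into documents. A minimal sketch follows, assuming each scan receives a short string of keypresses; FLAG_KEYS and classify are hypothetical names, and the "d" key (attach to an earlier doc) is left out to keep it short.

```python
# Hypothetical mapping from flag keys to the flags they set.
# "c" and "e" also set the parent "finances" flag.
FLAG_KEYS = {
    "b": ("blank",),
    "f": ("finances",),
    "c": ("finances", "income"),
    "e": ("finances", "expense"),
    "l": ("legal",),
    "i": ("info",),
}


def classify(scans, keys_per_scan):
    """Group scans into logical docs and attach flags to each page.

    "s" and "n" open a new doc; "a" (or no structural key) keeps the
    page in the current doc. The first page always opens a doc.
    """
    docs = []
    for scan, pressed in zip(scans, keys_per_scan):
        flags = set()
        new_doc = not docs
        for key in pressed:
            if key in ("s", "n"):
                new_doc = True
            flags.update(FLAG_KEYS.get(key, ()))
        if new_doc:
            docs.append([])
        docs[-1].append((scan, flags))
    return docs


docs = classify(["p1.png", "p2.png", "p3.png"], ["se", "a", "nl"])
```

In this run the first two scans end up in one doc and the third starts a second doc flagged as legal.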
| So I guess I would be looking for a "graphical engine", or
| "display engine" capable of (hopefully fast) display and
| manipulation of images. Separate zoomed window for fine
| navigation would be a nice extra. It would be nice if it
| would have combo boxes for choosing / adding items (see
| dimension 4 below), where the selection of items narrows
| down as you type lookup codes / starting letters of the
| entities. ( see point 4 below )
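The narrowing combo box described above boils down to a prefix filter over the entity list. A small sketch, assuming case-insensitive matching on starting letters only (lookup codes would need a second field); the function name and sample entities are invented.

```python
def narrow(entities, typed):
    """Return the entities that start with what has been typed so far,
    ignoring case, in alphabetical order."""
    prefix = typed.casefold()
    return sorted(e for e in entities if e.casefold().startswith(prefix))


entities = ["Botanical Gardens", "Bank", "City Hall", "Botany Club"]
after_b = narrow(entities, "b")
after_bo = narrow(entities, "bo")
```

Each extra letter typed calls narrow again with the longer prefix, so the list shrinks as typing continues.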
| If worse comes to worst, I would settle for this whole
| manipulation would be terribly slow, probably ugly, and I
| would hate if I had to use MS's Explorer's extensions :-(
| Not to mention that I have no idea how could I do 8x-zoom
| don't really allow for easy image panning.
| Below is what I think my wishlist should be. But then
| again, I'm new to image processing ...
| This is what I imagine the graphical engine should be able to do:
| G1 fit-to-window
| G2 fit-width-to-window
| G3 1-to-1 pixel zoom
| G4 8-to-1 pixel zoom (in a smaller window - see G7)
| G5 mouse movement in the above three items moves the image,
| so whole page could be quickly visually scanned for defects
| G6 ability to specify areas (mostly rectangular, possibly
| occasionally rotated) of an image [ this would tango with
| the system feature 2.5 below - ability to treat these areas
| as separate scans (as pieces of different documents) ]
| G7 fine-navigation: nice extra: when Control key or
| something is pressed, a fine-navigation (8x zoomed) window
| pops up on the side, and mouse movement is 8x finer - allows
| for specifying fine rotation angle (1.7.2) by means of
| clicking on two points which *should* be in a straight
| horizontal or vertical line on the original
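The two-click idea in G7 reduces to a little trigonometry. A sketch under these assumptions: points are (x, y) pixel coordinates, the result is the measured skew in degrees rounded to the 0.1 degree step from 1.7.2, and near-vertical lines are measured against the vertical axis. The function name is made up, and the sign convention depends on whether y grows downwards in the viewer.

```python
import math


def skew_angle(p1, p2):
    """Angle in degrees by which the line p1 -> p2 deviates from the
    nearest horizontal or vertical axis, rounded to 0.1 degree."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    angle = math.degrees(math.atan2(dy, dx))
    # Fold into [-45, 45) so that a nearly vertical line reports its
    # deviation from vertical rather than something close to 90.
    angle = (angle + 45.0) % 90.0 - 45.0
    return round(angle, 1)


nearly_horizontal = skew_angle((0, 0), (1000, 10))
nearly_vertical = skew_angle((0, 0), (10, 1000))
```

Rotating the scan by the negative of the returned value would straighten the line; the coarse X*90 degree rotation from 1.7.1 stays a separate step.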
| G8 another nice extra: "increase contrast" algorithm - in a
| B/W or dithered picture: draw a 2 or 3-pixel wide line
| between pixels that are less than distance X apart (this
| will enhance). This is just my formulation of what a
| "contrast enhancing" algorithm should do. Or another
| algorithm with similar effect: if a pixel has another pixel
| less than distance X away, turn other pixels black in its 2
| or 3-pixel diameter.
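One possible reading of the second formulation in G8, sketched in plain Python: a black pixel that has another black pixel closer than max_dist gets its 3x3 neighbourhood blackened, while isolated specks are left alone. The image representation (rows of 0/1, 1 = black), the name thicken, and the default distance are assumptions; the all-pairs search is quadratic in the number of black pixels, so it is only meant to show the logic.

```python
def thicken(image, max_dist=3):
    """Return a copy of a 0/1 image in which every black pixel that has
    another black pixel within max_dist has its 3x3 area blackened."""
    height, width = len(image), len(image[0])
    black = [(y, x) for y in range(height) for x in range(width)
             if image[y][x]]
    out = [row[:] for row in image]
    for (y, x) in black:
        has_neighbour = any(
            (y2, x2) != (y, x)
            and (y2 - y) ** 2 + (x2 - x) ** 2 < max_dist ** 2
            for (y2, x2) in black
        )
        if not has_neighbour:
            continue  # isolated pixel, probably noise: leave it alone
        for ny in range(max(0, y - 1), min(height, y + 2)):
            for nx in range(max(0, x - 1), min(width, x + 2)):
                out[ny][nx] = 1
    return out


faint = [[0] * 5 for _ in range(5)]
faint[2][1] = 1
faint[2][3] = 1
enhanced = thicken(faint)
```

This is close to what image libraries call dilation, restricted to pixels that already have a nearby neighbour.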
| The dimensions would be:
| 1 picture quality dimension:
| 1.2 resolution : 300 ? 600 ? other ?
| 1.3 lineart or dithered ?
| 1.4 legible scan ?
| 1.5 the whole page is scanned ? or are parts / edges missing?
| 1.6 needs re-scan ?
| 1.7 needs post-processing ?
| 1.7.1 rotation by X*90 degrees
| 1.7.2 rotation by Y*0.1 degrees
| 1.7.3 increasing "contrast" (difficult with B&W/dithered pics)
| 2 document structure dimension: (2.1 to 2.3 erased)
| 2.4 which scan is the chapter title page, if any ?
| 2.5 if one scan contains more than one logical document,
| how does the scan divide into areas containing them ?
| 2.6 which library does it belong to ?
| 2.7 which shelf within library does it belong to ?
| 2.8 which volume of books on that shelf does it belong to ?
| 2.9 which book in that volume does it belong to ?
| 2.10 which chapter in that book does it belong to ?
| 2.11 which page of the chapter is it ?
| 2.12 which side of that page is it ?
| 3 time dimension:
| 3.1 date & time
| 3.2 period (from date to date)
| 3.3 expiry date
| 3.4 other date
| 4 entities dimension:
| 4.1 from which entity ? [ choose from / add to list of entities ]
| 4.2 to which entity ? [ choose from / add to list of entities ]
| 4.3 publishing entity ? [ choose from / add to list of entities ]
| 4.4 from which address ? [ choose from / add to list of addresses ]
| 4.5 to which address ? [ choose from / add to list of addresses ]
| 5 values:
| 5.1 ID1
| 5.2 ID2
| 5.3 title
| 5.4 subject
| 5.5 value1
| 5.6 value2
| 5.7 value3
| 6 flag:
| 6.1 blank page ?
| 6.2 financial ?
| 6.2.1 expense ?
| 6.2.2 income ?
| 6.3 legal ?
| 6.4 informational ?
| 6.5 expired ?
| 7 ownership / responsibility for this doc:
| 7.1 Jack's group
| 7.1.1 Jack
| 7.1.2 Peter
| 7.2 Mary's group
| 7.2.1 Mary
| 7.2.2 Dennis
| Then I have about 4,000 JPEG color pics, most of them
| w/EXIF data.
| With these, there may be additional qualification, plus
| some from above may not qualify
| 1.8 rating of quality of composition (capturing the intended subject)
| 1.9 rating of technical quality
| 1.9.1 focused
| 1.9.2 not shaken (when tripod not used)
| 1.9.3 proper lighting / timing / contrast
| and then sorting them into categories :
| 8. category
| 8.1 trees
| 8.1.1 indoor
| 8.1.2 outdoor
| 8.2 bushes
| 8.3 tools
Jesper K. Pedersen | Klarälvdalens Datakonsult
Senior Software Engineer | www.klaralvdalens-datakonsult.se
Peder Skrams Gade 27 3. tv. |
6700 Esbjerg | Platform-independent
Denmark | software solutions