
Signature
"It's better to have rocked and lost than never to have rocked at
all." -John Flansburgh
>I have used applications which search through image files, look at the
>picture, cross reference and generate a list of images which it considers
>similar. Sometimes it would be way off, but it seemed like for the most
>part there were a lot more false positives than false negatives.
I have often wanted a search engine that could find pictures "similar"
to a given one.
Here is an idea for a reasonably simple though slow algorithm.
You take, say, an 8x8 grid square and overlay it over the image. You then
look for the most complicated square. I define "complicated" as the
square with the most distinct colours. For tie breaking, you sum the
contrast between all adjacent pixel pairs.
You then shift the grid one pixel right and repeat. Then you repeat,
shifting the grid down, until you have covered all possible grid
positions over the image. Eventually you will discover the most
complicated grid square. The pixels of this square, considered as one
large binary number, are what you index images by. Feel free to
optimise the algorithm.
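The steps above can be sketched in Java roughly as follows. This is only
a sketch, not code from the post: the class name BusiestBlock, its method
names, the per-channel absolute difference used as "contrast", and the
choice to count only horizontal and vertical neighbours are all
assumptions. It is brute force, so expect it to be slow on large images.

```java
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the "most complicated square" search. */
final class BusiestBlock {
    static final int N = 8; // side of the square, as suggested in the post

    /** Distinct RGB values in the N x N square with top-left corner (x0, y0). */
    static int distinctColours(BufferedImage img, int x0, int y0) {
        Set<Integer> seen = new HashSet<>();
        for (int y = y0; y < y0 + N; y++)
            for (int x = x0; x < x0 + N; x++)
                seen.add(img.getRGB(x, y) & 0xFFFFFF); // ignore alpha
        return seen.size();
    }

    /** Assumed contrast measure: sum of absolute R, G and B differences. */
    static int diff(int a, int b) {
        int d = 0;
        for (int shift = 0; shift <= 16; shift += 8)
            d += Math.abs(((a >> shift) & 0xFF) - ((b >> shift) & 0xFF));
        return d;
    }

    /** Tie-breaker: total contrast over adjacent pixel pairs inside the square. */
    static long contrast(BufferedImage img, int x0, int y0) {
        long sum = 0;
        for (int y = y0; y < y0 + N; y++)
            for (int x = x0; x < x0 + N; x++) {
                int p = img.getRGB(x, y);
                if (x + 1 < x0 + N) sum += diff(p, img.getRGB(x + 1, y));
                if (y + 1 < y0 + N) sum += diff(p, img.getRGB(x, y + 1));
            }
        return sum;
    }

    /** Returns {x, y} of the winning square; the image must be at least N x N. */
    static int[] find(BufferedImage img) {
        int bestX = 0, bestY = 0, bestColours = -1;
        long bestContrast = -1;
        for (int y = 0; y + N <= img.getHeight(); y++)
            for (int x = 0; x + N <= img.getWidth(); x++) {
                int c = distinctColours(img, x, y);
                if (c < bestColours) continue;       // cannot win
                long k = contrast(img, x, y);
                if (c > bestColours || k > bestContrast) {
                    bestColours = c; bestContrast = k; bestX = x; bestY = y;
                }
            }
        return new int[] {bestX, bestY};
    }

    /** Index key: the square's 64 RGB values written out as one hex string. */
    static String key(BufferedImage img, int x0, int y0) {
        StringBuilder sb = new StringBuilder(N * N * 6);
        for (int y = y0; y < y0 + N; y++)
            for (int x = x0; x < x0 + N; x++)
                sb.append(String.format("%06x", img.getRGB(x, y) & 0xFFFFFF));
        return sb.toString();
    }
}
```

Because the winning square is chosen by content rather than position, a
cropped copy that still contains it should normally produce the same key.
Any edit or re-encoding that changes pixel values will not.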
This is mainly to help you find duplicates of an image that have been
cropped, for example to track down copyright violations. It won't find
similar images with doctored contrast, colours, or scaling; ditto
images that have been changed from jpg to png etc.
And unfortunately, it won't help you to find pictures of blue spotted
tree frogs.

Signature
Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.
Luc The Perverse - 23 Oct 2005 03:59 GMT
>>I have used applications which search through image files, look at the
>>picture, cross reference and generate a list of images which it considers
[quoted text clipped - 24 lines]
> And unfortunately, it won't help you to find pictures of blue spotted
> tree frogs.
You lost me with the blue spotted tree frogs part.
You've got me thinking though. I think edge detection may be the key.
If there isn't something out there that does this already (which I don't
believe) then there should be!
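Edge detection is usually started with something like a Sobel filter. A
minimal sketch in Java, assuming greyscale conversion with integer
Rec. 601 weights and |gx| + |gy| as the gradient magnitude. The class and
method names are made up for illustration, and nothing here says how to
turn the resulting edge map into a similarity measure.

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch of a Sobel edge map. */
final class SobelEdges {
    /** Approximate luminance in 0..255 (integer Rec. 601 weights). */
    static int grey(int rgb) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        return (299 * r + 587 * g + 114 * b) / 1000;
    }

    /** Gradient magnitude |gx| + |gy| per pixel; border pixels are left at 0. */
    static int[][] magnitude(BufferedImage img) {
        int w = img.getWidth(), h = img.getHeight();
        int[][] lum = new int[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                lum[y][x] = grey(img.getRGB(x, y));

        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++) {
                // horizontal gradient: right column minus left column
                int gx = (lum[y - 1][x + 1] + 2 * lum[y][x + 1] + lum[y + 1][x + 1])
                       - (lum[y - 1][x - 1] + 2 * lum[y][x - 1] + lum[y + 1][x - 1]);
                // vertical gradient: bottom row minus top row
                int gy = (lum[y + 1][x - 1] + 2 * lum[y + 1][x] + lum[y + 1][x + 1])
                       - (lum[y - 1][x - 1] + 2 * lum[y - 1][x] + lum[y - 1][x + 1]);
                out[y][x] = Math.abs(gx) + Math.abs(gy);
            }
        return out;
    }
}
```

Large values in the returned array mark pixels where brightness changes
sharply; flat regions come out as 0.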

Andrew Thompson - 23 Oct 2005 04:09 GMT
..
>>And unfortunately, it won't help you to find pictures of blue spotted
>>tree frogs.
>
> You lost me with the blue spotted tree frogs part.
Such 'pixel comparison' methods cannot determine high level information.
- 'Blue'(ish/predominantly) - maybe.
- 'Spotted' - much harder.
- Tree frogs - "I've cracked machine vision! Where's my Nobel prize?"
Roedy Green - 23 Oct 2005 04:19 GMT
>Such 'pixel comparison' methods cannot determine high level information.
>- 'Blue'(ish/predominantly) - maybe.
>- 'Spotted' - much harder.
>- Tree frogs - "I've cracked machine vision! Where's my Nobel prize?"
Does there exist some standard for encoding picture content inside the
image in a way that Google, for example, could find photos of George
Bush with Harriet Miers in the 1970s in Albania? Or
installing a xxxx cartridge in a yyyy printer?

Andrew Thompson - 23 Oct 2005 04:30 GMT
>>Such 'pixel comparison' methods cannot determine high level information.
>>- 'Blue'(ish/predominantly) - maybe.
[quoted text clipped - 4 lines]
> image in a way that Google for example could find photos of George
> Bush with Harriet Miers in the 1970s in Albania.
No. JPGs (as well as a variety of other image formats) have
the capacity to store extra information in images (mostly related
to the specifics of the 'shot' - F-stop, timing..), and some can also
store the type of meaningful information you are referring to.
Unfortunately, it seems that there is little standardisation or
commonality in the format of this information, even
for single image types, let alone image types in general.
I was just thinking of the process that Google uses to
pull up images before I saw your post, actually, and was
about to point out the problem becomes a lot simpler with
meaningful file names like ..
'blue_spotted_tree_frog.jpg'
;-)
> ...or
> Installing a xxxx cartridge in a yyyy printer?
....huh? Are we still talking about images?
Roedy Green - 23 Oct 2005 11:05 GMT
>> Installing a xxxx cartridge in a yyyy printer?
>
>....huh? Are we still talking about images?
a diagram.

Roedy Green - 23 Oct 2005 13:16 GMT
>Such 'pixel comparison' methods cannot determine high level information.
>- 'Blue'(ish/predominantly) - maybe.
>- 'Spotted' - much harder.
>- Tree frogs - "I've cracked machine vision! Where's my Nobel prize?"
I really enjoyed that post. It is a joy to see someone pack so much
into so few words.

Andrew Thompson - 23 Oct 2005 13:47 GMT
>>Such 'pixel comparison' methods cannot determine high level information.
>>- 'Blue'(ish/predominantly) - maybe.
[quoted text clipped - 3 lines]
> I really enjoyed that post. It is a joy to see someone pack so much
> into so few words.
I was thinking much the same of your original statement!
[ ..and as an added bonus, I like frogs. :-) ]