Java Forum / General / October 2005

Multimedia

Luc The Perverse - 23 Oct 2005 01:23 GMT
I have used applications which search through image files, look at the
pictures, cross reference and generate a list of images which they consider
similar.  Sometimes they would be way off, but it seemed like for the most
part there were a lot more false positives than false negatives.

I was wondering if there is open source technology to do this.  (Or similar
operations for Audio and video.)

It was trivial generating a client which could search for duplicate file
sizes and then run a checksum on the files to see if they match.
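For reference, the size-then-checksum approach described above might look roughly like this. It is only a sketch: the class and method names are made up for illustration, and SHA-256 is an assumed choice of checksum (the post does not say which one was used).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the original client.
class DuplicateFinder {

    /** Hex-encoded SHA-256 digest of a file's contents, read in chunks. */
    static String checksum(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Groups byte-identical files: bucket by size first, checksum only files that share a size. */
    static List<List<Path>> findDuplicates(List<Path> files)
            throws IOException, NoSuchAlgorithmException {
        Map<Long, List<Path>> bySize = new HashMap<>();
        for (Path f : files) {
            bySize.computeIfAbsent(Files.size(f), k -> new ArrayList<>()).add(f);
        }
        List<List<Path>> duplicates = new ArrayList<>();
        for (List<Path> sameSize : bySize.values()) {
            if (sameSize.size() < 2) {
                continue; // a unique size cannot have a duplicate
            }
            Map<String, List<Path>> byHash = new HashMap<>();
            for (Path f : sameSize) {
                byHash.computeIfAbsent(checksum(f), k -> new ArrayList<>()).add(f);
            }
            for (List<Path> group : byHash.values()) {
                if (group.size() > 1) {
                    duplicates.add(group);
                }
            }
        }
        return duplicates;
    }
}
```

Comparing sizes first means most files never need to be read at all; only files that share a size are hashed.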

Signature

"It's better to have rocked and lost than never to have rocked at
all." -John Flansburgh

Roedy Green - 23 Oct 2005 01:59 GMT
>I have used applications which search through image files, look at the
>picture, cross reference and generate a list of images which it considers
>similar.  Sometimes it would be way off, but it seemed like for the most
>part there were a lot more false positives than false negatives.

I have often wanted a search engine that could find pictures "similar"
to a given one.  

Here is an idea for a reasonably simple though slow algorithm.

You take, say, an 8x8 grid square and overlay it over the image. You then
look for the most complicated square. I define "complicated" as the
square with the most distinct colours.  For tie breaking, you sum the
contrast between all adjacent pixel pairs.

You then shift the grid one pixel right and repeat. Then you repeat
shifting the grid down, until you have covered all possible grids over
the image.  Eventually you will discover the most complicated grid
square.  This square considered as a binary number is what you index
images by.  Feel free to optimise the algorithm.
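A rough, unoptimised sketch of the idea above, for anyone who wants to experiment. Several things here are assumptions rather than part of the description: the image is taken as an int[][] of packed RGB values (for example filled from BufferedImage.getRGB), "contrast" is measured as the sum of per-channel absolute differences, and the class and method names are invented for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical names; brute force over every window position.
class ComplexSquareIndexer {
    static final int SIZE = 8;

    /** Number of distinct colours in the SIZE x SIZE window whose top-left corner is (x0, y0). */
    static int distinctColours(int[][] px, int x0, int y0) {
        Set<Integer> seen = new HashSet<>();
        for (int y = y0; y < y0 + SIZE; y++) {
            for (int x = x0; x < x0 + SIZE; x++) {
                seen.add(px[y][x]);
            }
        }
        return seen.size();
    }

    /** Assumed contrast measure: sum of absolute differences over the R, G and B channels. */
    static int diff(int a, int b) {
        return Math.abs(((a >> 16) & 0xFF) - ((b >> 16) & 0xFF))
             + Math.abs(((a >> 8) & 0xFF) - ((b >> 8) & 0xFF))
             + Math.abs((a & 0xFF) - (b & 0xFF));
    }

    /** Tie-breaker: total contrast between horizontally and vertically adjacent pixels in the window. */
    static long contrast(int[][] px, int x0, int y0) {
        long sum = 0;
        for (int y = y0; y < y0 + SIZE; y++) {
            for (int x = x0; x < x0 + SIZE; x++) {
                if (x + 1 < x0 + SIZE) sum += diff(px[y][x], px[y][x + 1]);
                if (y + 1 < y0 + SIZE) sum += diff(px[y][x], px[y + 1][x]);
            }
        }
        return sum;
    }

    /** Returns {x, y} of the most complicated window, or null if the image is smaller than the window. */
    static int[] mostComplicated(int[][] px) {
        int h = px.length;
        int w = (h == 0) ? 0 : px[0].length;
        int[] best = null;
        int bestColours = -1;
        long bestContrast = -1;
        for (int y = 0; y + SIZE <= h; y++) {
            for (int x = 0; x + SIZE <= w; x++) {
                int colours = distinctColours(px, x, y);
                if (colours < bestColours) continue;
                long c = contrast(px, x, y);
                if (colours > bestColours || c > bestContrast) {
                    best = new int[] {x, y};
                    bestColours = colours;
                    bestContrast = c;
                }
            }
        }
        return best;
    }

    /** The winning square's 64 pixels as one long hex string, usable as an index key. */
    static String key(int[][] px) {
        int[] at = mostComplicated(px);
        if (at == null) return null;
        StringBuilder sb = new StringBuilder();
        for (int y = at[1]; y < at[1] + SIZE; y++) {
            for (int x = at[0]; x < at[0] + SIZE; x++) {
                sb.append(String.format("%06x", px[y][x] & 0xFFFFFF));
            }
        }
        return sb.toString();
    }
}
```

Two images produce the same key only if both contain the same winning square pixel for pixel, which is why a cropped copy can still match as long as the crop keeps that square.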

This is mainly to help you find copyright violations, i.e. duplicate
images that have been cropped. It won't find similar images with doctored
contrast, colours, or scaling, ditto images that have been changed
from jpg to png etc.

And unfortunately, it won't help you to find pictures of blue spotted
tree frogs.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.

Luc The Perverse - 23 Oct 2005 03:59 GMT
>>I have used applications which search through image files, look at the
>>picture, cross reference and generate a list of images which it considers
[quoted text clipped - 24 lines]
> And unfortunately, it won't help you to find pictures of blue spotted
> tree frogs.

You lost me with the blue spotted tree frogs part.

You've got me thinking though.  I think edge detection may be the key.

If there isn't something out there that does this already (which I don't
believe) then there should be!
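To make the edge detection idea a bit more concrete, here is a minimal sketch using the Sobel operator. Sobel is just one common choice and is my assumption here, as are the greyscale int[][] input and the class name; nothing in the thread settles on a particular method.

```java
// Hypothetical helper: Sobel gradient magnitude on a greyscale image.
class SobelEdges {

    /**
     * Returns |gx| + |gy| for every interior pixel of grey[y][x].
     * Border pixels are left at 0 because the 3x3 kernel does not fit there.
     */
    static int[][] magnitude(int[][] grey) {
        int h = grey.length;
        int w = (h == 0) ? 0 : grey[0].length;
        int[][] out = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                // Horizontal gradient: right column minus left column, centre row weighted 2.
                int gx = -grey[y - 1][x - 1] + grey[y - 1][x + 1]
                       - 2 * grey[y][x - 1] + 2 * grey[y][x + 1]
                       - grey[y + 1][x - 1] + grey[y + 1][x + 1];
                // Vertical gradient: bottom row minus top row, centre column weighted 2.
                int gy = -grey[y - 1][x - 1] - 2 * grey[y - 1][x] - grey[y - 1][x + 1]
                       + grey[y + 1][x - 1] + 2 * grey[y + 1][x] + grey[y + 1][x + 1];
                out[y][x] = Math.abs(gx) + Math.abs(gy);
            }
        }
        return out;
    }
}
```

An edge map like this responds to outlines rather than to absolute colour values, which may make it less sensitive to uniform brightness shifts than raw pixel comparison.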

Signature

"It's better to have rocked and lost than never to have rocked at
all." -John Flansburgh

Andrew Thompson - 23 Oct 2005 04:09 GMT
..
>>And unfortunately, it won't help you to find pictures of blue spotted
>>tree frogs.
>
> You lost me with the blue spotted tree frogs part.

Such 'pixel comparison' methods cannot determine high level information.
- 'Blue'(ish/predominantly) - maybe.
- 'Spotted' - much harder.
- Tree frogs - "I've cracked machine vision!  Where's my Nobel prize?"

Roedy Green - 23 Oct 2005 04:19 GMT
>Such 'pixel comparison' methods cannot determine high level information.
>- 'Blue'(ish/predominantly) - maybe.
>- 'Spotted' - much harder.
>- Tree frogs - "I've cracked machine vision!  Where's my Nobel prize?"

Does there exist some standard for encoding picture content inside the
image in a way that Google, for example, could find photos of "George
Bush with Harriet Miers in the 1970s in Albania", or
"Installing a xxxx cartridge in a yyyy printer"?

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.

Andrew Thompson - 23 Oct 2005 04:30 GMT
>>Such 'pixel comparison' methods cannot determine high level information.
>>- 'Blue'(ish/predominantly) - maybe.
[quoted text clipped - 4 lines]
> image in a way that Google for example could find photos of George
> Bush  with Harriet Miers  in the 1970s in Albania.

No.  JPGs (as well as a variety of other image formats) have
the capacity to store extra information in images (mostly related
to the specifics of the 'shot' - F-Stop, timing..), and some can also
store the type of meaningful information you are referring to.

Unfortunately, it seems that there is little standardisation or
commonality in the format of this information even
for single image types, let alone image types in general.

I was just thinking of the process that Google uses to
pull up images before I saw your post, actually, and was
about to point out the problem becomes a lot simpler with
meaningful file names like ..

  'blue_spotted_tree_frog.jpg'

;-)
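As a toy illustration of searching by meaningful file names, the sketch below splits a file name into keywords and checks a query against them. It is not how Google does it; the class and method names are invented for the example.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for keyword matching on file names.
class FilenameSearch {

    /** Splits text into lower-case alphanumeric words. */
    static Set<String> tokenize(String text) {
        Set<String> words = new LinkedHashSet<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    /** Keywords of a file name, with the extension (if any) stripped first. */
    static Set<String> keywords(String filename) {
        int dot = filename.lastIndexOf('.');
        String base = (dot > 0) ? filename.substring(0, dot) : filename;
        return tokenize(base);
    }

    /** True if every word of the query appears among the file name's keywords. */
    static boolean matches(String filename, String query) {
        return keywords(filename).containsAll(tokenize(query));
    }
}
```

With this, matches("blue_spotted_tree_frog.jpg", "blue frog") is true, while a camera-style name such as "IMG_0042.jpg" matches nothing useful.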

>  ...or
> Installing a xxxx cartridge in a yyyy printer?

....huh?  Are we still talking about images?

Roedy Green - 23 Oct 2005 11:05 GMT
>> Installing a xxxx cartridge in a yyyy printer?
>
>....huh?  Are we still talking about images?

A diagram.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.

Roedy Green - 23 Oct 2005 13:16 GMT
>Such 'pixel comparison' methods cannot determine high level information.
>- 'Blue'(ish/predominantly) - maybe.
>- 'Spotted' - much harder.
>- Tree frogs - "I've cracked machine vision!  Where's my Nobel prize?"

I really enjoyed that post.  It is a joy to see someone pack so much
into so few words.

Signature

Canadian Mind Products, Roedy Green.
http://mindprod.com Again taking new Java programming contracts.

Andrew Thompson - 23 Oct 2005 13:47 GMT
>>Such 'pixel comparison' methods cannot determine high level information.
>>- 'Blue'(ish/predominantly) - maybe.
[quoted text clipped - 3 lines]
> I really enjoyed that post.  It is a joy to see someone pack so much
> into so few words.

I was thinking much the same of your original statement!

[  ..and as an added bonus, I like frogs.  :-)  ]

