Image comparison tool

  • bcutting@gmail.com

    Image comparison tool

    I am looking for a way to take a large number of images and find
    matches among them. These images may not be exact replicas. Images
    may have been resized, cropped, faded, color corrected, etc.

    Approach 1
    Programmatically extract the information (such as eigenvectors or
    eigenspaces) and store it in a database. Then apply a comparison
    algorithm between the database entries to find like images.

    Approach 2
    Store all the images and run the comparison tool against the individual
    images

    Approach 1 is preferred since I believe the comparison could be
    performed much more rapidly once the comparison information has been
    extracted. However, this requires a library capable of independent
    extraction and comparison.

    Does anybody have any suggestions for a DLL that can perform the
    above tasks?

    Any suggestions on how to store the information in the database? More
    specifically, what would the schema look like?

    Any help is appreciated.

  • Peter Bromberg [C# MVP]

    #2
    RE: Image comparison tool

    bcutting,
    There is an author at codeproject.com who has at least one article on motion
    detection algorithms that should be very close.
    Peter

    --
    Co-founder, Eggheadcafe.com developer portal

    • Nils

      #3
      Re: Image comparison tool

      If you want to extract shape info or some other kind of metric, you must
      first know exactly what you are looking for. E.g. for face recognition,
      eigenvectors ("eigenfaces") are used; for other types of recognition,
      feature points are used, etc.

      If you have no a-priori info, you can still compare the thumbnails (create
      mini-thumbnails of the same size for each image), and use something like a
      Hamming distance to find the matches. If you convert them all to the same
      normalized grayscale images, you can also detect slight colour mismatches.
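
A minimal sketch of the thumbnail idea above, in Python. It assumes each image has already been reduced to a same-size grayscale thumbnail, given here as a flat list of 0..255 values (decoding and resizing are left out). Reducing every pixel to one bit, brighter or darker than the thumbnail's mean, is an assumption made so that a true Hamming distance applies; the post does not say how the compared values are derived. The function names are hypothetical.

```python
def to_bits(thumb):
    """Turn a flat list of grayscale values into a list of 0/1 bits.

    Assumption: a pixel becomes 1 when it is brighter than the mean.
    """
    mean = sum(thumb) / len(thumb)
    return [1 if v > mean else 0 for v in thumb]


def hamming(bits_a, bits_b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("thumbnails must have the same size")
    return sum(x != y for x, y in zip(bits_a, bits_b))


# Tiny 2x2 "thumbnails": b is a uniformly brighter copy of a, c is inverted.
a = [10, 200, 30, 220]
b = [40, 230, 60, 250]
c = [200, 10, 220, 30]

print(hamming(to_bits(a), to_bits(b)))  # 0: brightness shift is ignored
print(hamming(to_bits(a), to_bits(c)))  # 4: every bit differs
```

Because each thumbnail is compared against its own mean, a uniform brightness change does not alter the bits, which is roughly the effect the normalization step is meant to achieve.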

      I wrote exactly this quite a few years back, and it is still available in
      the form of a shareware image browser with a special "Similar Images"
      filter. It works quite well if one wants to find similar images in large
      databases (up to e.g. 20,000 files).

      Info here:
      http://www.abc-view.com/articles/article3.html

      I termed the information I store "image metrics"; they are nothing more
      than a smart, wavelet-like way of storing the mini-thumbnails. Since the
      metrics are small (a couple of hundred bytes each), they can be
      kept in memory, which speeds up the comparison process enormously.

      The procedure to find duplicates consists of:
      1. Calculate an image metric for each image
      2. Compare the list, using Hamming distance with a smart similarity sorter
      3. Output the list to a thumbnail viewer, sorted by similarity, showing only
      similar images as colour-coded groups.

      Step 1 can take quite some time, but the image database can be stored, so this
      only needs to be done once.
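
One way to keep the metrics between runs, and a possible answer to the schema question in the original post, is a single table with one row per image. This is only a sketch using Python's built-in sqlite3 module; the table name, column names, and helper functions are all hypothetical and not taken from the software described above.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the metric store. Table and column names are illustrative."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS image_metric (
               file_path TEXT PRIMARY KEY,  -- where the image lives
               file_size INTEGER NOT NULL,  -- cheap check for changed files
               metric    BLOB NOT NULL      -- the packed metric bytes
           )"""
    )
    return db


def save_metric(db, file_path, file_size, metric):
    """Insert or overwrite the metric (a list of 0..255 ints) for one image."""
    db.execute(
        "INSERT OR REPLACE INTO image_metric VALUES (?, ?, ?)",
        (file_path, file_size, bytes(metric)),
    )
    db.commit()


def load_all(db):
    """Load every stored metric into memory as {path: list of ints}."""
    rows = db.execute("SELECT file_path, metric FROM image_metric")
    return {path: list(blob) for path, blob in rows}


db = open_store()
save_metric(db, "a.jpg", 1234, [0, 128, 255])
save_metric(db, "b.jpg", 5678, [1, 127, 254])
metrics = load_all(db)
print(sorted(metrics))  # ['a.jpg', 'b.jpg']
```

Since each metric is only a few hundred bytes, loading the whole table into memory before comparing is cheap, so the database serves only as a cache for the slow extraction step.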

      Hope that helps,

      Nils Haeck

      • Bob

        #4
        Re: Image comparison tool

        Nils

        That sounds interesting. Have you published anything on the algorithm
        for computing the similarity metric?

        Bob


        Nils wrote:
        >I wrote exactly this quite a few years back, and it is still available in
        >the form of a shareware image browser having a special "Similar Images"
        >filter. It works quite well if one wants to find similar images in large
        >databases (up to e.g. 20,000 files).
        >
        >Info here:
        >http://www.abc-view.com/articles/article3.html


        • Terry

          #5
          Re: Image comparison tool

          On 21 Jul 2006 09:51:10 -0700, bcutting@gmail.com wrote:
          >I am looking for a way to take a large number of images and find
          >matches among them. These images may not be exact replicas. Images
          >may have been resized, cropped, faded, color corrected, etc.
          >
          >Approach 1
          >Programmatically extract the information (such as Eigen Vectors/Eigen
          >Spaces) and store them in a database. Then apply a comparison
          >algorithm between the database entries to find like images.
          >
          There is a program that does this, called DupDetector. I've used it,
          and it's pretty good at finding similar images. It's freeware.

          However, it is (was) made and distributed by Prismatic Software, at
          www.prismatic.com, and when you go there now, you get a page saying it
          closed as of 5/28/06. You might still be able to find it someplace
          else by googling.

          Also, it's an executable, not a library that you can use, nor was
          source code released as I recall, so it may not be of any use to you.

          Terry


          • mark.thomas.7@gmail.com

            #6
            Re: Image comparison tool

            Thumbs Plus (shareware, or it used to be) has done this for years.
            It's reasonably effective, and maybe they will share their secrets.


            • Nils

              #7
              Re: Image comparison tool

              Hi Bob,

              No, I haven't published anything on the algorithm except for the brief
              description of how the software works on the webpage mentioned. However,
              it's not rocket science :)

              People who have to do a comparison can simply try out the software (30-use
              functional trial, sales price $29). Software engineers/developers wanting to
              make use of the software can always buy the source code from my company. I
              have already sold it to a few companies creating image cataloguers and image
              processing software. I think anyone could write such a thing themselves;
              however, it might make sense to buy it to save yourself a few weeks of work.

              Here is the basic idea with these assumptions:

              a) We only compare the grayscale version
              b) We are not interested in aspect ratio

              1. Start with an image of dimensions WxH
              2. Scale it down to a thumbnail of 16x16 pixels, grayscale only, 256 levels
              3. Normalize the thumbnail (so it contains values 0..255 instead of e.g.
              25..230)
              4. Create sub-thumbnails of 8x8, 4x4, 2x2 and 1x1
              5. Store these thumbnails such that 1x1 comes first, then 2x2, etc.
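
The five steps above can be sketched in Python as follows. This is an illustration under assumptions, not the code of the software discussed: the input is taken to be an already downscaled 16x16 grayscale thumbnail given as a flat list of 256 values, and each sub-thumbnail is built by averaging 2x2 blocks of the level before it, which the post does not specify. All function names are hypothetical.

```python
def normalize(pixels):
    """Stretch values so the darkest pixel maps to 0 and the brightest to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:                      # flat image: nothing to stretch
        return [0] * len(pixels)
    return [round((v - lo) * 255 / (hi - lo)) for v in pixels]


def halve(pixels, size):
    """Shrink a size x size image to half size by averaging 2x2 blocks.

    Assumption: the post does not say how sub-thumbnails are computed.
    """
    half = size // 2
    out = []
    for y in range(half):
        for x in range(half):
            block = (pixels[(2 * y) * size + 2 * x]
                     + pixels[(2 * y) * size + 2 * x + 1]
                     + pixels[(2 * y + 1) * size + 2 * x]
                     + pixels[(2 * y + 1) * size + 2 * x + 1])
            out.append(block // 4)
    return out


def build_metric(thumb16):
    """Build the 341-value metric from a 16x16 grayscale thumbnail (flat list)."""
    if len(thumb16) != 256:
        raise ValueError("expected 16x16 = 256 grayscale values")
    levels = [normalize(thumb16)]
    size = 16
    while size > 1:
        levels.append(halve(levels[-1], size))
        size //= 2
    metric = []
    for level in reversed(levels):    # 1x1 first, then 2x2, ... 16x16 last
        metric.extend(level)
    return metric


# A synthetic gradient stands in for a real downscaled image.
thumb = [25 + (x + y) * 6 for y in range(16) for x in range(16)]
metric = build_metric(thumb)
print(len(metric))  # 341
```

Storing the coarsest level first means the earliest values of the metric summarize the whole image, which is what lets a comparison give up early on clearly different images.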

              Now the comparison. Realise that when comparing two images, we are only
              interested in images that are close. So if there's a big difference between
              them, we can abort the comparison quite soon.

              Comparing two images with metrics A and B, each metric consisting of N values:

              A = {a1..aN}, N = 341 (1 + 4 + 16 + 64 + 256 = 341)
              B = {b1..bN}, N = 341

              Weighting: w1..wN, where

              w1 = 256
              w2..w5 = 64
              w6..w21 = 16
              w22..w85 = 4
              w86..w341 = 1

              The comparison value between A and B is

              Cp = sum over i = 1..p of ( max(0, abs(ai - bi) - 1) * wi ), where p can be 1..N

              Note the term max(0, abs(ai - bi) - 1): We compare two pixels, and use the
              "difference - 1", because often a difference of 1 occurs through
              resampling/normalization.

              When comparing two metrics we define a threshold T, so we can stop the
              comparison as soon as Cp > T, and just store the value Cp (p <= N) up to that point.
              If T is low enough we can often stop the comparison after comparing just one
              byte!
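
A short Python sketch of the weighted comparison with early abort described above. The weights follow the five pyramid levels (1, 4, 16, 64 and 256 values); the function name, the return shape, and the threshold of 500 in the example are illustrative choices, not values from the post.

```python
# Weights for positions 1..341: one weight per pyramid level, coarsest first.
WEIGHTS = [256] * 1 + [64] * 4 + [16] * 16 + [4] * 64 + [1] * 256


def compare(a, b, threshold):
    """Return (Cp, p): the weighted difference and how many values were compared.

    Stops as soon as the running sum exceeds the threshold.
    """
    total = 0
    for p, (ai, bi, wi) in enumerate(zip(a, b, WEIGHTS), start=1):
        total += max(0, abs(ai - bi) - 1) * wi
        if total > threshold:
            return total, p           # too different: abort early
    return total, len(WEIGHTS)


same = [100] * 341
near = [101] * 341                    # off by one everywhere: tolerated
far = [110] + [100] * 340             # differs by 10 in the 1x1 value

print(compare(same, near, threshold=500))  # (0, 341)
print(compare(same, far, threshold=500))   # (2304, 1)
```

The second call shows the one-byte abort: a difference of 10 in the first value contributes (10 - 1) * 256 = 2304, which already exceeds the threshold.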

              Now, when comparing a large list of metrics, one can simply sort them, then
              take one as start S, put a sliding window {-T/256, T/256} on the sorted list
              around S and compare all the metrics in that group with S, to find any
              metrics matching S.

              This way we can build a new list, beginning with S, then the one most
              closely matching it, then the closest match to that one, etc. Each
              time we remove the metric from the original list. In the end we have a
              list of images sorted by similarity.
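
A sketch of the sliding-window step in Python. It rests on an assumption about the description above: that the window applies to the first value of each metric (the 1x1 thumbnail, weight 256), so that any metric whose first value lies outside the window would exceed T on that value alone. The window is widened by one to account for the "difference - 1" tolerance. The function names are hypothetical, and the greedy chaining of closest matches is not shown.

```python
from bisect import bisect_left, bisect_right

WEIGHTS = [256] + [64] * 4 + [16] * 16 + [4] * 64 + [1] * 256


def compare(a, b, threshold):
    """Weighted difference with early abort (same formula as in the post)."""
    total = 0
    for ai, bi, wi in zip(a, b, WEIGHTS):
        total += max(0, abs(ai - bi) - 1) * wi
        if total > threshold:
            break
    return total


def matches(sorted_metrics, start, threshold):
    """Return indices of metrics within the threshold of sorted_metrics[start].

    Only entries whose first value lies in the window around the start's
    first value are compared; all others would exceed the threshold anyway.
    """
    first_values = [m[0] for m in sorted_metrics]
    reach = threshold // 256 + 1      # +1 covers the "difference - 1" tolerance
    centre = first_values[start]
    lo = bisect_left(first_values, centre - reach)
    hi = bisect_right(first_values, centre + reach)
    return [i for i in range(lo, hi)
            if i != start
            and compare(sorted_metrics[start], sorted_metrics[i], threshold) <= threshold]


# Sorting lists of ints orders them by first value, then second, and so on.
ms = sorted([
    [180] * 341,
    [100] * 341,
    [102] + [100] * 340,
    [101] * 341,
])
print(matches(ms, 0, 500))  # [1, 2]
```

In a real run the list of first values would be computed once rather than on every call; it is rebuilt here only to keep the sketch short.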

              The algorithm is still O(n^2) in the number of images n, but it nevertheless
              cuts out a large portion of the work compared to fully comparing every pair.

              There are some specialities not mentioned here (for colours, for aspect
              ratio, etc), but this is the general principle.

              Note: I looked into a lot of different techniques (Fourier transforms,
              Gabor wavelets, feature point extraction, etc.), but more complex is not
              always better. In this case, simplicity seems to win.

              Hope that helps,

              Nils Haeck

              • Bob

                #8
                Re: Image comparison tool

                Hi Nils

                Yes that does help. Thanks for the explanation.

                Bob
