Parsons Code is a very simple but effective way of "indexing" tunes so that they can be searched. It consists of a string of Us, Ds and Rs, indicating whether successive notes of the melody go up, go down or are repeated. It is very simple to create the Parsons Code of a tune (manually or automatically), and very simple and effective to search for tunes using the first few characters of their Parsons Code.
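The up/down/repeat rule is simple enough to sketch in a few lines of Python (representing pitches as MIDI note numbers is my choice here, not part of the scheme itself):

```python
def parsons_code(pitches):
    """Build a Parsons Code string from a sequence of note pitches.

    Each successive pair of notes contributes one letter:
    U = up, D = down, R = repeated.
    """
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            code.append("U")
        elif cur < prev:
            code.append("D")
        else:
            code.append("R")
    return "".join(code)

# Opening of "Ode to Joy" (E E F G G F E D) as MIDI note numbers:
print(parsons_code([64, 64, 65, 67, 67, 65, 64, 62]))  # RUURDDD
```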
I'm also thinking about algorithms such as BLAST and BLAT, which can search vast databases of DNA and protein sequences and find those which are similar (often only subtly) to a chosen sequence. Anyway.
Images, and searching thereof. I was trying to find a simple way to search for "images very much like this one". Of course, I can search for key words or names in the image description, or I can search by date or by location etc, but there's no easy way to ask "where has this image been used?".
So, here's a suggestion for a way to automatically index images so as to make them searchable. I am sure this is a well-trodden path, so if anyone who knows the field tells me this is old hat, I will delete.
We need to allow for the fact that copies of an image may have been cropped, scaled, or had their contrast, brightness etc tweaked (either intentionally, or as part of a filetype conversion).
So, here's what we do. First, convert the image to greyscale if it isn't already. Then draw (in silico, as it were) 100 horizontal lines across the picture. Those near the middle will be more closely spaced (because this is the area least likely to have been cropped, the area least displaced by scaling, and also probably the most informative part of the image).
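The exact spacing rule isn't pinned down above; one possible sketch (the power-curve mapping is my own invention, purely illustrative) would map 100 evenly spaced values through a curve that compresses toward the centre:

```python
def line_rows(height, n=100, bias=2.0):
    """Choose n row indices, spaced more densely near the vertical middle.

    Maps n evenly spaced values t in [-1, 1] through |t|**bias (keeping
    the sign), so values near t = 0 (the middle of the image) get
    squeezed together. bias > 1 controls how strongly rows cluster.
    """
    rows = []
    for i in range(n):
        t = -1.0 + 2.0 * i / (n - 1)          # evenly spaced in [-1, 1]
        y = 0.5 * (1.0 + (abs(t) ** bias) * (1 if t >= 0 else -1))
        rows.append(round(y * (height - 1)))
    return rows

rows = line_rows(600)
```

For a 600-pixel-tall image this gives rows from 0 to 599, with gaps of only a pixel or so near row 300 and a dozen pixels at the edges. (Near the middle some rows may coincide for small images; a real implementation might deduplicate.)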
Next, get the "profile" (grey-scale value) of the image across each of these lines. Hence, each line becomes a "mountain range" plotting the variations in tone across the image on that line.
Next, for each profile, select the greyscale threshold which the profile crosses most often (ie, if the threshold is too high, only the peaks cross it; if too low, only the troughs; we want the threshold at the level where the profile crosses back and forth as many times as possible).
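The "most crossed" threshold can be found by brute force, assuming a profile is just a list of integer grey values along one of the horizontal lines (values exactly on the threshold are skipped, which is one possible convention, not specified above):

```python
def best_threshold(profile):
    """Return the (threshold, crossings) pair where the profile crosses
    the threshold most often.

    Tries every grey level in the profile's range and counts how many
    times consecutive values switch from one side of the level to the
    other. Values exactly equal to the level are ignored.
    """
    best_t, best_crossings = 0, -1
    for t in range(min(profile), max(profile) + 1):
        crossings = 0
        prev_side = None
        for v in profile:
            if v == t:
                continue              # sitting on the line: no side
            side = v > t
            if prev_side is not None and side != prev_side:
                crossings += 1
            prev_side = side
        if crossings > best_crossings:
            best_t, best_crossings = t, crossings
    return best_t, best_crossings
```

Brute force over at most 256 candidate levels per line is cheap; a cleverer version could sort the values once instead.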
Next, measure the distance (in pixels) between successive "crossing points" along the profile (ie, between the points where it crosses the threshold). The result is a series of numbers. Next, calculate the *ratio* of each number to the one after it. (This takes care of any scaling which the picture may have undergone.) The result is, for each of the hundred lines, a series of numbers. These 100 series of numbers (which can be concatenated) are the index of the picture.
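Continuing the same sketch, the crossing distances and their successive ratios (the scale-cancelling step) might look like this:

```python
def crossing_points(profile, threshold):
    """x positions where the profile crosses the threshold."""
    xs = []
    prev_side = None
    for x, v in enumerate(profile):
        if v == threshold:
            continue
        side = v > threshold
        if prev_side is not None and side != prev_side:
            xs.append(x)
        prev_side = side
    return xs

def interval_ratios(profile, threshold):
    """Ratios of successive gaps between crossing points.

    Dividing each gap by the next makes the series scale-invariant:
    resizing the image multiplies every gap by the same factor,
    which cancels in the ratio.
    """
    xs = crossing_points(profile, threshold)
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    return [a / b for a, b in zip(gaps, gaps[1:])]
```

The full index is then just these ratio lists computed for each of the 100 lines and concatenated.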
The point about greyscaling, setting the threshold to the "most crossed" level, and taking ratios of successive pairs of intervals, is that these values should be the least susceptible to alteration as a result of scaling, contrast changes etc. In other words, a given picture should give at least reasonably similar indices even after it has been cropped, scaled, lightened etc.
Now, to search for other instances of an image, the software calculates the index of the sought image, and then looks for images on the webternet whose indices match to at least some extent. This would work best if all images (apart from the one you're trying to match) were indexed automatically (so the software just trawled the web for partial matches to indices), but it could also possibly work by calculating the indices on the fly.
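"Match to at least some extent" could be made concrete as a sliding-window comparison between two ratio series; the window length and tolerance here (min_run, tol) are illustrative parameters of mine, not from the scheme:

```python
def matches(query, candidate, min_run=5, tol=0.1):
    """True if any run of min_run consecutive query ratios appears
    somewhere in the candidate index, with each ratio within a
    relative tolerance.

    A crude O(n*m) comparison; a real system would use an indexed
    nearest-neighbour search over many candidates instead.
    """
    for i in range(len(query) - min_run + 1):
        window = query[i:i + min_run]
        for j in range(len(candidate) - min_run + 1):
            if all(abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)
                   for a, b in zip(window, candidate[j:j + min_run])):
                return True
    return False
```

The tolerance is what makes the search "sloppy" in the sense discussed under FLAWS below: it trades some speed and precision for robustness against small contrast and conversion artefacts.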
***FLAWS***
There are many flaws, and this would be only a crude (but fast and simple) solution to image searching. For example, some changes in contrast, saturation etc will alter the index values despite the normalisation steps built in to the procedure. Sloppy searching (eg, allowing a percentage of variation in the index values) would partly overcome this, but would be a bit slower.
Likewise, if the image is cropped, the "100 lines" will not pass through equivalent parts of the image, and hence won't match. However, by concentrating lines toward the middle, it's likely that a line in one image will be close enough to a line in a cropped copy of the image that it will still work and give a reasonable match.
Remember, we're not looking for identical indices - we're just looking to see if a small part of one image's index matches a small part of another image's index. The results will not be perfect, but would be useful (just like current tag-based searches).
I'm sure I ought to have used the words "dimensionless" and
"invariant", and possibly "isomorphic".