Typeface Target Tag (TTT)

OCR Scanner Marker

Scanning printed text, applying accurate optical character recognition, and making sense of the resulting input data can require precision hardware mechanics, intricate electronic circuitry, and sophisticated database software. The process is also limited by the requirement of physical contact between matter and machine. Few would argue that this process could not use some improvement.

On the other hand, a UPC barcode can be read with a whisk of a box through the air, and that process is also called 'scanning.' The discrepancies between the machinery in these two types of 'scanning' systems are significant, but the crux of their function is the same: look at the lines and suss out what they say. With the stroke of a pen, we can close this technology gap.

"-TTT-" is shorthand for "Typeface Target Tag," a simple symbology to optimize the transfer of printed text to digital data. A unique set of graphic characters could be established to assist an artificial intelligence in the extraction and processing of information collected from hard copy sources. In a blurb, this idea is "an upgrade on the concept of 'X marks the spot.'"

+..TT-E:p..+ In the preceding character string, the "+" symbols frame and align another set of symbols that carries information about the written text that follows. In this example, "TT-E:p" stands for 'type text - English:prose.' Digits scrawled on a post-it beginning with "+..HW-#..+" would be identified as a "hand writing number." The job of a "-TTT-" is to show a machine where to look, how to see, and what to understand.
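
The tag layout described above can be sketched as a small parser. This is only an illustration: the grammar (a medium code, a hyphen, a content code, then optional colon-separated qualifiers, all framed by "+.." and "..+") is inferred from the two examples in the post, and the lookup tables, the `Tag` type, and `parse_tag` are hypothetical names, not part of any defined -TTT- standard.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Hypothetical code tables, filled in only from the two examples in the post.
MEDIUM = {"TT": "type text", "HW": "hand writing"}
CONTENT = {"E": "English", "#": "number"}
QUALIFIER = {"p": "prose"}

# Assumed grammar: +..MEDIUM-CONTENT[:QUALIFIER...]..+
TAG_RE = re.compile(r"\+\.\.([A-Z]+)-([^:.\s]+)((?::[^:.\s]+)*)\.\.\+")


class Tag(NamedTuple):
    medium: str
    content: str
    qualifiers: Tuple[str, ...]


def parse_tag(text: str) -> Optional[Tag]:
    """Return the first -TTT- marker found in text, or None if there is none."""
    match = TAG_RE.search(text)
    if match is None:
        return None
    medium, content, raw_qualifiers = match.groups()
    codes = [code for code in raw_qualifiers.split(":") if code]
    # Unknown codes are passed through unchanged rather than rejected.
    return Tag(
        MEDIUM.get(medium, medium),
        CONTENT.get(content, content),
        tuple(QUALIFIER.get(code, code) for code in codes),
    )


if __name__ == "__main__":
    print(parse_tag("+..TT-E:p..+ In the preceding character string"))
    print(parse_tag("+..HW-#..+ 555 0100"))
```

Passing unknown codes through untouched is a deliberate choice in this sketch, since the post leaves the full vocabulary of codes open.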

+..TT-E:p..+ A -TTT- in the position of the previous character set would function universally as an alignment point for optical scanning, a typeface tag for character recognition, and a metadata marker for context categorization. Fixing constants and finding variable values simplifies the solution of any analytical equation. The idea is to embed enough information in a single spot that the information the marker is attached to can be input accurately with less effort.

In practice, the process would proceed with this plan:
1) Light beams extending from external devices trained on a surface pinpoint a -TTT- and instantly adjust to focus on this area.
2) Elements in the -TTT- design provide a reference for calculating the size, angle, and position of a limited planar frame for a scanning system to search for variation.
3) Any typeface data embedded in the -TTT- is extracted by the OCR software and used to interpret the text frame.
4) Significant features of the -TTT- symbol structure signal cybernetic systems to expect a specific type of content to be sent through their subroutines soon, and steer the sorting and storage of those signals accordingly.
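
Step 2 of the plan above is the most mechanical, so here is a minimal sketch of it. It assumes the scanner has already located the centers of the two "+" marks in image coordinates and knows the width at which the tag is nominally printed; `frame_from_marks` and its parameters are hypothetical names chosen for illustration, and the post does not specify how the marks would be detected.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def frame_from_marks(left: Point, right: Point, nominal_width: float) -> Dict[str, object]:
    """Estimate position, angle, and scale of the text frame from the two '+' marks.

    left, right: detected centers of the leading and trailing '+' (image pixels).
    nominal_width: assumed distance between the marks at scale 1.0.
    """
    if nominal_width <= 0:
        raise ValueError("nominal_width must be positive")
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    distance = math.hypot(dx, dy)
    return {
        "origin": left,                                   # position of the frame
        "angle_deg": math.degrees(math.atan2(dy, dx)),    # rotation of the baseline
        "scale": distance / nominal_width,                # size relative to nominal
    }


if __name__ == "__main__":
    # Marks twice as far apart as nominal, on a level baseline.
    print(frame_from_marks((0.0, 0.0), (100.0, 0.0), 50.0))
```

Two reference points are enough to recover rotation and uniform scale in the plane of the page; correcting for perspective tilt would need more marks than the post describes.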

Using 'cross hairs' for precision spatial orientation adjustments has foundations in technology as ancient as the sextant and as modern as motion capture. Changing a typeface's font, size, and style confounds computer chips as a norm. Finding context in the characters is challenging, but conquest in this quest is profound. -TTT- is the answer for it all.

Cube, Feb 15 2006

       +..TT-E:p:h..+ Handwriting   

       Which is English:prose:handwriting followed by the handwritten word "Handwriting" that the scanner can use to calibrate for the handwritten text that follows.
wagster, Feb 15 2006

       Couldn’t get this, sorry. Reading too many bakery ideas, my attention span has been destroyed.
ldischler, Feb 15 2006

       This Idea is plain enough. -TTT- represents the contrast of a sample upon its medium, and conveys descriptors such as pixelation to the scanning and OCR software, much like a 3-D barcode carries several layers of information. Examples of OCR challenges are:   

       3-D #1: handwriting on paper
3-D #2: type on paper
3-D #3: image on paper
3-D #4: image on other media
reensure, Feb 15 2006

       I thought this was like pi, but the imaginary number version.
phundug, Feb 15 2006

       Am I understanding this right: the TTT is something printed with the text to aid later OCR?   

       In that case, why not simply print a microscopic bar code on the edges of each letter? Or, indeed, a single bar code containing the text of the whole page? Then the OCR system would just have to pick up the bar code.
DrCurry, Feb 15 2006

       You might be able to design a font that had some bar-code features built into it. But there is also another coding system out there that uses a rectangle full of small black-and-white squares, to represent data. I think it might be better to design a font using that system than a bar-code system, because the characters in the font would look more "normal".
Vernon, Feb 15 2006

       Unnecessary: the OCR program that came with my latest scanner (/printer/fax/copier) already formats the scanned text in Word the way it looks on the page.
DrCurry, Feb 15 2006

       As does my older model, [DrCurry], but mine only produces images from handwriting or from indecipherable text. I then have to resize, and in most cases retype the overlying image into a text line.   

       Oh, and stray "punctuation" strewn about the page? Could be much better.
reensure, Feb 16 2006

       Given that the OCR can't read my handwriting in the first place, how would it read the handwritten tags (and likely inaccurate alignment crosshairs) I add later?
nick_n_uit, Feb 16 2006


