Computer: Word Processor: Spelling
Spellchecj by tupo   (+5, -1)
Prioritise spelling suggestions based on key proximity

Consider this typo:

"ince"

Spelling correction suggestions could be "Inch, Inca, Since, Nice, Once, etc..."

I propose a keyboard-proximity filter (based on the system's current designated keyboard layout) to order them in the most likely typo order. O is very close to I on the keyboard, so:

Since, Nice, Once are probably the most likely words.

Inca and Inch are much less likely to be typed by accident, and accidents form 90%+ of my spelling errors.

Add to that theme with some intra-sentence grammar checking and common word tagging, and spellchecking could be much more useful.

(How often does one really intend to use the words "tot he" in a sentence? It is more likely to be "to the", for example. Some systems already auto-correct that, though.)
-- not_only_but_also, Aug 17 2009

US Patent 6,801,190 http://www.google.c...AAAAEBAJ&dq=6801190
[jutta, Aug 17 2009]

Context-sensitive spell check in Microsoft Office 2007 http://blogs.msdn.c...6/06/05/617653.aspx
[jutta, Aug 17 2009]

Context-sensitive spell check in Google Wave http://googlesystem...-spell-checker.html
[jutta, Aug 17 2009]

Wikipedia: Damerau-Levenshtein distance http://en.wikipedia...evenshtein_distance
Edit distance with bells on. [jutta, Aug 17 2009]

Wikipedia: Needleman-Wunsch algorithm http://en.wikipedia...an-Wunsch_algorithm
This very clearly needs to be worked into a popular "dance craze" song, "Do the Levenshtein-Damerau Needleman-Wunsch". [jutta, Aug 17 2009]

All good ideas, and patented and implemented in a few systems. A patent- or literature-search for "spell checking algorithms" might be in order.

The keyboard proximity thing is implemented, if one bothers with it, as a "confusion matrix" that, given two keys, tells you how likely they are to be confused. When computing the edit distance between two words (see Levenshtein distance), instead of assigning equal probability to each substitution error, the confusion matrix is used to look up the probability of this specific error.
-- jutta, Aug 17 2009
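
A rough sketch of the confusion-matrix idea described above, applied to the "ince" example from the post. Everything named here is an assumption made for illustration: the hard-coded QWERTY rows and stagger offsets, the flat 0.5 cost for neighbouring keys, and the function names substitution_cost, weighted_distance and rank_suggestions. A real spellchecker would read the active layout from the system and would more likely use confusion probabilities gathered from actual typing errors. Insertions, deletions and transpositions keep a cost of 1.0, so only substitutions are affected by key proximity.

```python
import math

# Hypothetical layout table for this sketch only (US QWERTY letters).
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
# Approximate horizontal stagger of each row, measured in key widths.
ROW_OFFSETS = [0.0, 0.25, 0.75]

# Map each letter to an (x, y) position on the keyboard.
KEY_POS = {
    ch: (col + ROW_OFFSETS[row], float(row))
    for row, keys in enumerate(QWERTY_ROWS)
    for col, ch in enumerate(keys)
}


def substitution_cost(a, b):
    """Cost of typing `a` when `b` was intended (a stand-in confusion matrix)."""
    if a == b:
        return 0.0
    if a not in KEY_POS or b not in KEY_POS:
        return 1.0  # unknown keys get the ordinary Levenshtein cost
    (x1, y1), (x2, y2) = KEY_POS[a], KEY_POS[b]
    # Assumed rule: neighbouring keys are cheap to confuse, distant keys are not.
    return 0.5 if math.hypot(x1 - x2, y1 - y2) < 1.5 else 1.0


def weighted_distance(typed, candidate):
    """Edit distance with adjacent transpositions and proximity-weighted substitutions."""
    typed, candidate = typed.lower(), candidate.lower()
    n, m = len(typed), len(candidate)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)
    for j in range(1, m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # extra character typed
                d[i][j - 1] + 1.0,  # character left out
                d[i - 1][j - 1] + substitution_cost(typed[i - 1], candidate[j - 1]),
            )
            # Two neighbouring characters typed in the wrong order.
            if (i > 1 and j > 1
                    and typed[i - 1] == candidate[j - 2]
                    and typed[i - 2] == candidate[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1.0)
    return d[n][m]


def rank_suggestions(typed, candidates):
    """Order candidate corrections from most to least likely under this cost model."""
    return sorted(candidates, key=lambda word: weighted_distance(typed, word))


print(rank_suggestions("ince", ["Inch", "Inca", "Since", "Nice", "Once"]))
```

Under these assumed costs "Once" sorts first, because O sits next to I, while "Inch" and "Inca" pay full price for substituting a distant key. "Since" and "Nice" are not helped by key proximity at all (one is a dropped letter, the other a swap), so ranking them above "Inch" and "Inca" would need separate, lower costs for omissions and transpositions.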


It seems intuitive.
-- normzone, Aug 17 2009


//Consider this typo:

"ince"//

Well, duh... plainly obvious you mean Vince
-- vincevincevince, Aug 17 2009


//Levenshtein// - so that's what it's called - I once wrote a program that was intended to act as an "engine" for ALL card games, from snap through Gin Rummy to any/all variants of Poker, with each ruleset defined as a (relatively easy to edit) xml file - the tricky part came during draw/replace scenarios, trying to get the machine to try to decide whether it had a good/bad enough hand to draw a card (and decide which one to burn in the process), and there are lots of routines that reference the Hamming distance between a given hand, and a target one (e.g. four of a kind, or a series of hearts, or a numeric sequence) that the program might have "wanted" - I'm now going to have to go back and rename some of my methods to use the word "Levenshtein".
-- zen_tom, Aug 17 2009


//confusion matrix// - I'm pretty sure I can implement that myself without any algorithms.
-- wagster, Aug 17 2009


I no. Pathetik isn't it.
-- wagster, Aug 17 2009


//I think spellcheckers should be programmed to deliberately fail every so many words, or even insert barely-noticeable typos whilst typing that won't show up on the finished-product spellcheck.//

I find that happens already, as some errors form another word.

examples:
your/you're
lose/loose
discrete/discreet

(The first to are quite common on the net, and widely reviled.)
-- Loris, Aug 18 2009


One typo I often come across is a 'dyslexic' (no offense to dyslexic people) error - hitting the (theoretically) correct key with the wrong hand (e.g. putting 'k' when you needed 'd').
<Pet peeve> People getting 'than' and 'then' mixed up! Grrr!</pp>
-- neutrinos_shadow, Aug 18 2009
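
The wrong-hand slip described above could be treated as one more kind of likely confusion. A small sketch, assuming a US QWERTY layout and touch-typing finger positions; the MIRROR table and the is_mirror_slip name are made up for this example and do not come from any existing spellchecker.

```python
# Left-hand keys, row by row, paired with the right-hand key typed by the
# same finger of the other hand (assumed US QWERTY, touch-typing positions).
LEFT_ROWS = ["qwert", "asdfg", "zxcvb"]
RIGHT_ROWS = ["poiuy", ";lkjh", "/.,mn"]

MIRROR = {}
for left, right in zip(LEFT_ROWS, RIGHT_ROWS):
    for l_key, r_key in zip(left, right):
        MIRROR[l_key] = r_key
        MIRROR[r_key] = l_key


def is_mirror_slip(typed_char, intended_char):
    """True if typed_char is the other hand's counterpart of intended_char."""
    return MIRROR.get(intended_char) == typed_char


print(is_mirror_slip("k", "d"))  # the example from the comment above
```

A substitution that passes this check could be given a reduced cost in an edit-distance calculation, in the same way as a substitution between neighbouring keys.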


//I can't believe editors, who used to have to earn their pay by proofreading, can cheat...//

Yeah!! And what about those lazy sailors who use GPS to navigate..?
-- shudderprose, Aug 19 2009


//The first to are quite common//

Was that "to" intentional? Yeah. Must've been.
-- theleopard, Aug 19 2009


I know someone who frequently misuses "of" - as in must of, could of, should of etc - I don't have the heart to tell them.
-- zen_tom, Aug 19 2009


You must learn to Give In To Your Hate, [zen].
-- 8th of 7, Aug 19 2009


//Was that "to" intentional? Yeah. Must've been.//

It was now.
-- Loris, Aug 19 2009


