h a l f b a k e r y
It might be better to just get another gerbil.

word extrapolator
Because if "cat" is a word, "cats" probably is too.
  (+3, -6)


When using Microsoft Word, one can, if a word is flagged as misspelt, choose to add it to an internal dictionary of words that have been misdiagnosed as wrong.

However, if one proceeds to use another form of this word, either in plural, genitive or something else, more often than not this is also flagged as incorrect, necessitating one to add the variant to the dictionary as well.

I propose that, when a word is added, the computer also adds other likely forms of that word. If this proves too error-prone, the add-to-dictionary option could be split into separate choices such as "Add noun to dictionary", "Add verb to dictionary", etc.

It need not work flawlessly, but I dare say that not being told off every time I use a new form of whatever word I am using would be very convenient nevertheless.
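A minimal sketch of what "adding other likely forms" might look like, assuming naive English suffix rules. The names `likely_forms`, `add_to_dictionary` and `custom_dictionary` are hypothetical and have nothing to do with how Word actually stores its custom dictionary.

```python
# Hypothetical custom dictionary: just a set of accepted spellings.
custom_dictionary = set()


def _add_s(word):
    """Naive rule for the -s form (noun plural or third-person verb)."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    if len(word) > 1 and word.endswith("y") and word[-2] not in "aeiou":
        return word[:-1] + "ies"
    return word + "s"


def likely_forms(word, kind="noun"):
    """Guess other forms of a newly added word. Irregular words will be wrong."""
    forms = {word}
    if kind == "noun":
        plural = _add_s(word)
        # plural, singular possessive, plural possessive
        forms.update({plural, word + "'s", plural + "'"})
    elif kind == "verb":
        forms.add(_add_s(word))
        if word.endswith("e"):
            forms.update({word + "d", word[:-1] + "ing"})
        else:
            forms.update({word + "ed", word + "ing"})
    return forms


def add_to_dictionary(word, kind="noun"):
    """The 'Add noun to dictionary' / 'Add verb to dictionary' options."""
    custom_dictionary.update(likely_forms(word, kind))
```

Calling `add_to_dictionary("gerbil")` would accept "gerbil", "gerbils", "gerbil's" and "gerbils'" in one go. The rules are deliberately crude, so an irregular noun such as "alumnus" comes out as "alumnuses".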


dbmag9, Feb 03 2008

"Stemming" http://en.wikipedia.org/wiki/Stemming
I don't think this is used in Word, but various search engines utilise algorithms that operate a bit like this. [zen_tom, Feb 03 2008]







       Would it mistakenly let you use "gruntled", "underwhelmed", "chalant", "consolate", "wieldy", "descript", "kempt", "shevelled", "maculate", "flappable", "plussed", "cognito", "communicado", "ruly", "ruthful", "pareil", "concerting", "domitable", "corrigible", "committal", "capacitated", "sipid", "petuous", "promptu"... The list goes on.

UnaBubba, Feb 03 2008
  

       "couth"

(though the Scottish-born Mrs AWOL tells me that so-and-so may be un-pejoratively described as "couthy")
  

       I'd rather have to "add to dictionary" than risk a misspelling. (-) I have a hard enough time with words that I accidentally approved and now need removed.

MisterQED, Feb 03 2008
  

       I'm with MisterQED. I suspect the alumnuss of major universitys will be with me on this one. Some of these problems could be circumvented, but only if the software knowed a lot about the whies and wherefores of English.

MaxwellBuchanan, Feb 03 2008
  

       Which does not seem to be much of a priority for Microsoft now, let alone making their embedded dictionary complicated enough to understand concatenations.

UnaBubba, Feb 03 2008
  

       What would be nice - and please somebody scream "baked" - is a dictionary that recognised all the major variants of English at the same time, rather than focusing on just one of the US/UK/Aus/Can variations (or any others missing from the list). I've had to home-brew this in the past with a merger.   

       (Well sayed MB.)

boysparks, Feb 03 2008
  

       I think that's called the OED, [boysparks].

UnaBubba, Feb 03 2008
  

       Sorry, should have been clearer. I meant the kind of dictionary used for spell-checking by word processors and browsers.   

       As an example, my version of MS Word allows 'Language' to be selected as English (US) or English (UK) but not both simultaneously.   

       It probably sounds daft to want such a feature, but as someone who works with more than one variant, it would be a neat touch to have the facility to accept either spelling, as well as to be able to adjust all spellings to match a particular version; e.g. "Anglicise This".   

       Although, no doubt if it does exist, that feature ironically reads as "Anglicize This".
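
       A rough sketch of the "accept either spelling" and "Anglicise This" features, assuming each variant is simply a set of words. The word lists, the `US_TO_UK` mapping and the function names are made up for illustration and are tiny stand-ins for real spell-check dictionaries.

```python
# Hypothetical, tiny stand-ins for the per-variant spell-check word lists.
UK_WORDS = {"the", "colour", "centre", "anglicise"}
US_WORDS = {"the", "color", "center", "anglicize"}

# Hypothetical mapping used by the "Anglicise This" command.
US_TO_UK = {"color": "colour", "center": "centre", "anglicize": "anglicise"}


def is_accepted(word, variants=(UK_WORDS, US_WORDS)):
    """Accept a word if any selected variant of English contains it."""
    return any(word.lower() in words for words in variants)


def anglicise(text):
    """Rewrite known US spellings as UK ones (whitespace-separated words only)."""
    return " ".join(US_TO_UK.get(word, word) for word in text.split())
```

       With both lists selected, `is_accepted("colour")` and `is_accepted("color")` both pass, while `anglicise("the color center")` gives "the colour centre". Punctuation and capitalisation are ignored here to keep the sketch short.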

boysparks, Feb 04 2008
  

       I find that selecting English (UK) or English (Aus) still gives me Americanis(z)ed auto-corrections of words I know to be spelled a certain way in the common usage of those countries. It's possibly the single most annoying "feature" of MS Office.   

       I leave spellchecker turned off. It helps with my blood pressure control.

UnaBubba, Feb 04 2008
  

       Agreed.   

       Back to the idea posted, perhaps the application could allow the user to   

       (1) Add new word only   

       or   

       (2) Add word stem and extensions   

       Option (2) would display a suggested stem and its extensions in a column of user-editable fields. Each field would be labelled ("Stem", "Plural", "Past Tense", and so on), with a common word and its extensions displayed in a parallel column to clarify the intent. Unwanted fields could just be blanked out by the user.   

       Problem is that a word not already in the dictionary is likely to be a proper noun, abbreviation, or slang, and as such is unlikely to 'play by the rules'. So it's probably not any quicker to do this than to just separately spell-check all variations used in the document.
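
       Option (2) could be sketched like this, assuming the dialog is just a set of labelled text fields that the user can edit or blank before confirming. `suggest_fields` and `commit_fields` are hypothetical names, and the suggestions use deliberately simple rules.

```python
def suggest_fields(word):
    """Pre-fill the labelled fields with naive guesses for the user to edit."""
    past = word + "d" if word.endswith("e") else word + "ed"
    return {"Stem": word, "Plural": word + "s", "Past Tense": past}


def commit_fields(fields, dictionary):
    """Add every field the user left non-blank to the dictionary."""
    for value in fields.values():
        value = value.strip()
        if value:
            dictionary.add(value)
    return dictionary


# Example: the user corrects one suggestion and blanks out another.
fields = suggest_fields("blog")
fields["Past Tense"] = "blogged"   # user fixes the naive guess "bloged"
fields["Plural"] = ""              # user does not want the plural added
accepted = commit_fields(fields, set())
```

       After this, `accepted` holds only "blog" and "blogged".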

boysparks, Feb 04 2008
  

       Later: [+] an evolutionary computing approach to this, breeding and selecting algorithms, might produce half-decent results for very little effort. There's lots of training data already out there (i.e. the dictionary) and a clearly defined fitness function.
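
       A toy sketch of that approach, assuming each "algorithm" is a short ordered list of suffix rules and the fitness function is the number of training plurals it gets right. The training pairs, the gene pool and every name below are invented for illustration; a real run would use a full dictionary.

```python
import random

# Hypothetical training data: (singular, plural) pairs from "the dictionary".
TRAINING = [("cat", "cats"), ("box", "boxes"), ("city", "cities"),
            ("dish", "dishes"), ("dog", "dogs"), ("pony", "ponies")]

# Gene pool of rules: (ending to match, characters to strip, suffix to add).
GENE_POOL = [("", 0, "s"), ("", 0, "es"), ("x", 0, "es"),
             ("sh", 0, "es"), ("y", 1, "ies"), ("y", 0, "s")]


def apply_rules(rules, word):
    """Apply the first rule whose ending matches the word."""
    for ending, strip, suffix in rules:
        if word.endswith(ending):
            return word[:len(word) - strip] + suffix
    return word


def fitness(rules):
    """Count how many training plurals this rule list gets right."""
    return sum(apply_rules(rules, s) == p for s, p in TRAINING)


def mutate(rules, rng):
    """Copy a rule list and replace one randomly chosen rule."""
    child = list(rules)
    child[rng.randrange(len(child))] = rng.choice(GENE_POOL)
    return child


def evolve(generations=50, pop_size=20, rule_count=4, seed=0):
    """Keep the fitter half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [[rng.choice(GENE_POOL) for _ in range(rule_count)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[:pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness), history
```

       Because the fitter half always survives, the best score in `history` never goes down from one generation to the next. Whether it reaches a perfect score depends on the seed and the gene pool.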

boysparks, Feb 04 2008
  

       I think there have been a number of attempts to create neural networks that scan through a set of training documents and determine, through contextual positioning and some amount of stem-awareness, a stochastic view of how well constructed English is formed.

zen_tom, Feb 04 2008
  

       I'm thinking more of a competitive co-evolutionary approach that just focuses on correctly identifying stem and extensions for a given word.   

       The competitive and co-evolutionary aspect would arise through the evolution of two species: the extension-deducing algorithms and the training sets.   

       The idea is that as the algorithms are selected over time for their ability to correctly deduce extensions for the training sets, so too are the training sets selected for their 'toughness'. Effectively, an arms race. There are a few checks that need to be built into such a model to prevent niching, but it's all very do-able.

boysparks, Feb 04 2008
  
      
  


 