Computer: Word Processing
Etymology markup for word processors   (+72)

One of the great strengths of the English language is that it happily incorporates words from almost every other language on the planet (e.g. no one bothers to make up words that mean 'ombudsman' or 'quisling' - we just adopt the Swedish and Norwegian words which already mean these things). This means that an English speaker has a fantastic number of words to choose from (about 400,000 - about five times as many as French) and also that the rich lineage of these words can be used as a tool to communicate more effectively.

So, to start with, this idea is for a word processor which makes explicit the source (etymology) of the words you use. Much as MS-Word checks the spelling and grammar of your document, it should also analyse the etymology of the words you are using. It might show, for example, each word highlighted in a different colour depending on whether it has Latin or Anglo-Saxon roots, and then subtly different colours for French, Spanish, Portuguese, Old French, German, Old High German, etc. You'll be able to see at a glance whether you're using mostly Latin words (which tend to be longer, sound more refined and have more nuanced meanings) or Anglo-Saxon words (which are shorter and sound more direct and truthful - a trick used by speechwriters, particularly Churchill).
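As a rough sketch of the highlighting step, assuming a tiny hand-made lexicon stands in for a real etymological dictionary: the names ORIGINS, COLOURS, tag_words and origin_share are invented for illustration, and the colour choices are arbitrary.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> broad origin. A real tool would need a
# full etymological dictionary; these few entries are only for illustration.
ORIGINS = {
    "begin": "anglo-saxon",
    "help": "anglo-saxon",
    "freedom": "anglo-saxon",
    "commence": "latin",
    "assist": "latin",
    "liberty": "latin",
}

# Hypothetical colour scheme, one colour per origin.
COLOURS = {"anglo-saxon": "blue", "latin": "red", "unknown": "grey"}


def tag_words(text):
    """Return a (word, origin, colour) tuple for each word in the text."""
    tagged = []
    for word in re.findall(r"[A-Za-z']+", text):
        origin = ORIGINS.get(word.lower(), "unknown")
        tagged.append((word, origin, COLOURS[origin]))
    return tagged


def origin_share(text):
    """Return the fraction of words from each origin (the at-a-glance view)."""
    tagged = tag_words(text)
    total = len(tagged)
    counts = Counter(origin for _, origin, _ in tagged)
    return {origin: n / total for origin, n in counts.items()} if total else {}
```

Calling origin_share on a draft would give the overall Latin versus Anglo-Saxon balance, while tag_words supplies the per-word colours for the display.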

A more sophisticated version of this would use a clever matrix of etymology-ranked synonyms to enable you to adjust a slider to make (by automated replacement of words with their synonyms) your text more related to one language and less to another. This could be useful when writing English text to be read by someone with a non-English first language - e.g. a Spaniard may find it easier to read English which uses a high proportion of Latin, Spanish Mexican and Portuguese-derived words. You could also use this tool to boost the representation of languages under-represented in English, just for fun (so "beer tent" would become "alcohol kiosk", both words with Arabic roots).
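One way the slider might work, as a minimal sketch only: the synonym table, the function name shift_text and the "swap a fraction of candidates, left to right" rule are all assumptions made for illustration, not a worked-out design.

```python
import re

# Hypothetical synonym sets, each mapping an origin to one word.
SYNONYMS = [
    {"anglo-saxon": "begin", "latin": "commence"},
    {"anglo-saxon": "help", "latin": "assist"},
    {"anglo-saxon": "freedom", "latin": "liberty"},
]

# Index every word to the synonym set that contains it.
LOOKUP = {word: group for group in SYNONYMS for word in group.values()}


def shift_text(text, target, slider):
    """Swap words for their `target`-origin synonyms.

    `slider` runs from 0.0 (change nothing) to 1.0 (swap every word that
    has a synonym of the target origin). Values in between swap roughly
    that fraction of the candidates, taken left to right.
    """
    # Split into alternating word and non-word tokens so that joining the
    # tokens back together reproduces spacing and punctuation exactly.
    tokens = re.findall(r"[A-Za-z']+|[^A-Za-z']+", text)
    candidates = [
        i for i, tok in enumerate(tokens)
        if tok.lower() in LOOKUP
        and target in LOOKUP[tok.lower()]
        and LOOKUP[tok.lower()][target] != tok.lower()
    ]
    budget = round(slider * len(candidates))
    for i in candidates[:budget]:
        new = LOOKUP[tokens[i].lower()][target]
        # Keep a leading capital if the original word had one.
        tokens[i] = new.capitalize() if tokens[i][0].isupper() else new
    return "".join(tokens)
```

For example, shift_text("We begin to help.", "latin", 1.0) gives "We commence to assist.", and a slider value of 0.5 swaps only the first of the two candidates. A real version would also have to deal with inflected forms and with synonyms that are not truly interchangeable.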
-- hippo, Apr 24 2007

Million Words Hoax http://languagelog....penn.edu/nll/?p=972
[calum, Sep 10 2009]

//then subtly different colours for French, Spanish, Portuguese, Old French, German, Old High German, etc.//

... and gibberish, of course.
-- Jinbish, Apr 24 2007


Very {<astute, french><savvy, spanish><clever, dutch/low german>}. Bun!
-- placid_turmoil, Apr 24 2007


Wow!
-- phundug, Apr 24 2007


Wish I'd thought of this one! [+]
-- pertinax, Apr 24 2007


Actually Merriam-Webster suggests the "-ship" suffix derives from the Old English "scieppan", meaning "to shape", but your annotation reminded me of words like "television" which are an invented mixture of Ancient Greek ("tele" - "across", "at a distance", etc.) and Latin ("visere" - "to see") and which would be interesting to try and represent in this markup.
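A possible way to represent such mixed-origin words, sketched on the assumption that the <word, language> notation from [placid_turmoil]'s annotation is simply reused per word-part; the function names render and parse are made up for illustration.

```python
import re


def render(segments):
    """Render (part, language) pairs in the <part, language> style."""
    return "".join(f"<{part}, {lang}>" for part, lang in segments)


def parse(markup):
    """Recover the (part, language) pairs from rendered markup."""
    return re.findall(r"<([^,<>]+),\s*([^<>]+)>", markup)


# "television" split into its Greek and Latin parts.
television = [("tele", "greek"), ("vision", "latin")]
```

Here render(television) produces "<tele, greek><vision, latin>", and joining the first element of each pair gives the plain word back.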
-- hippo, Apr 24 2007


Great idea. Especially the slider tool. Bunnage. [+]
-- theleopard, Apr 24 2007


This is a really lovely idea.
-- calum, Apr 24 2007


This is really a worthy idea. I would pay for this.
-- nomocrow, Apr 25 2007


Excellent writer's tool, would have been very useful with my last book. [+]
-- nuclear hobo, Apr 25 2007


//I wonder how many vocal sounds (vocabulary) can be traced back to out of africa times.//

Not many.

The vocabulary that can be traced back as far as proto-Indo-European runs to a few thousand words (exact numbers disputable). No-one really knows where that was spoken, but probably not in Africa, and no-one really knows what it sounded like (go on, somebody sing me a primitive labio-velar). Any attempts to connect that vocabulary with African language-groups such as Cushitic are, as I understand it, highly speculative (verging on half-baked?).
-- pertinax, Apr 25 2007


I was going to think this up in the next couple of weeks. (+)
Maybe a fortnight.
-- 2 fries shy of a happy meal, Apr 25 2007


what [calum] said
-- po, Apr 25 2007


Splendid notion.

As an English copywriter working in the Middle East, using words of Arabic origin would help me a great deal.
-- marklar, Apr 25 2007


Bun (etymology obscure).
-- DrBob, Apr 25 2007


+ what everyone said, in other languages, too.
a bun by any other name is still a bun
-- xandram, Apr 25 2007


holy smarty pants I like it!
-- twitch, Apr 25 2007


Look at all those buns. You still got it, [hippo]. This product could be offered by the Oxford English Dictionary, since they have that whole etymology thing sorted out.
-- bungston, Apr 26 2007


For automatic replacements, warning box:

"beer tent => alcohol kiosk" Connotation has changed from 'party' to 'commercial establishment' and Tone has changed from 'colloquial' to 'politically correct'. Continue with changes?
-- Ketchupybread, Apr 26 2007


[bungston] Yes - or with a fast enough web connection it may be possible to scrape the etymology information from an online dictionary as you type.
-- hippo, Apr 28 2007


That would be a simple way to use the online OED. You need a license to see it anyway. This software could be part of the deluxe license.

This program might also be able to convert all words in a given text to their oldest known spelling. You would need a font with those Old English characters.
-- bungston, Apr 28 2007


[+]

Also, I think you mean "Mexican Spanish."
-- discontinuuity, Apr 30 2007


Oops - a missed comma. I meant "Spanish, Mexican" (last para of idea).
-- hippo, Apr 30 2007


Mmmm, yes, I can envisage a sort of fancy visual web structure branching out from a selected word showing geographically and historically related words.
-- hippo, Sep 10 2009


//There are now more than 1,000,000 words in the English language.//

Well that's what people say but I reckon that most of 'em are just the same letters mixed up in a different order.
-- DrBob, Sep 10 2009


//world linguistics orchard// Brilliant extension of the tree paradigm, I keep finding myself resorting to describing nodular things in terms of leaves, twigs, branches etc - never once have I considered expanding out into the Orchard.

[later] Further thinking about this suggests usage of "Forests" and "Jungles" (I think Forests were experimented with in Microsoft Active Directory, but I can't be sure). But I do like the suggestion of the fruit-bearing, husbanded, ordered collection of trees that Orchard suggests. A Jungle might then describe a superposition of multiple Orchards, occupying the same space, potentially (eugh) interacting with one another. However, there is the unpleasant notion of Monkeys which, in terms of strict taxonomies, is never a good idea. Forests are a little tamer, but may still, instead of containing monkeys, harbour Outlaws, and you're back to the same issue - No, Orchards is ideal.
-- zen_tom, Sep 10 2009


The difference in etymology between English English and American English (how come my Chrome spell checking stopped working? and how do I get it back?! ok -- Settings / Language / English / Enable Spell Checking checkbox) is that in English English you take a photograph and then enlarge it, while in American English you shoot a picture and then blow it up.
-- pashute, Jul 02 2013


Splendid [+]
-- bs0u0155, Dec 11 2014


Brilliant! (Latin / Italian / French) [+]
-- csea, Dec 11 2014


