h a l f b a k e r y
I didn't say you were on to something, I said you were on something.




Site-level Dictionary Microservice


In a lot of places on the internet, there are gatherings of people who talk about a general veneer of pretty much everything, equally, with no particular focus.

However, elsewhere, in proper places where there is a focus on a topic, people discuss stuff focusing on that topic. Within the domain of the topic there are often many domain-attached words, phrases, product names, acronyms, made-up words, intentional corruptions of words, and general esoterica that isn't general to everywhere else.

In typing stuff into text fields, something somewhere either at browser level or OS level will try and dictate, like a dictator, how your spelling should be orthoganolised, and frequently, on a phone or tablet, not let you get past without accepting the executive commands of the dictatorship. This is not only annoying, but basically I disown the result of the web interaction. I know what I typed, I know what I typed was correct, that's all I need to do - do the correct thing. If the system I am typing in wants to do something incorrect to what I have correctly executed, that's not my doing, that's something else, which isn't me, so I didn't do that. If you want to read what I actually wrote, you should have been here when I did it.

It occurs to me that the web app itself at the site level could present a microservice that contains a lot of topic-specific words that occur a lot in that forum or discussion area. Product names, for example, would be a candidate, as they're often corruptions of sensible words. The ideal would be that it requires no human intervention to build the dictionary, but on the other hand, you wouldn't want it polluting itself with habitual mis-spellings, so if there's any statistical aggregation, it'd have to be able to tell the difference between a lot of people in one place getting it incorrect, and an intentional topic-level corruption.
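One way the statistical aggregation might be sketched, purely as an illustration: accept an unknown word only if several distinct authors use it, and reject it if it sits close to an ordinary dictionary word that the same site uses at least as often (which suggests a habitual mis-spelling rather than an intentional corruption). The function name, the thresholds and the "compare against the nearby correct word" heuristic are all assumptions made for this example, not part of the idea as posted.

```python
import difflib
import re
from collections import Counter, defaultdict

WORD_RE = re.compile(r"[a-z][a-z'-]*")


def build_site_dictionary(posts, base_words, min_authors=3):
    """Return the set of site-specific terms found in `posts`.

    posts      -- iterable of (author, text) pairs (hypothetical input shape)
    base_words -- an ordinary general-purpose word list
    """
    base_set = {w.lower() for w in base_words}
    base_list = sorted(base_set)

    counts = Counter()
    authors = defaultdict(set)
    for author, text in posts:
        for word in WORD_RE.findall(text.lower()):
            counts[word] += 1
            authors[word].add(author)

    accepted = set()
    for word, n in counts.items():
        # Ordinary words and one-person quirks are not site vocabulary.
        if word in base_set or len(authors[word]) < min_authors:
            continue
        # Assumed heuristic: if a similar dictionary word is used on this
        # site at least as often, treat the candidate as a mis-spelling.
        near = difflib.get_close_matches(word, base_list, n=1, cutoff=0.8)
        if near and counts[near[0]] >= n:
            continue
        accepted.add(word)
    return accepted
```

Under this heuristic a product name such as "flickr" survives on a site that never writes "flicker", while "rediculous" is dropped wherever "ridiculous" is at least as common. It would still be fooled by a forum where the wrong spelling outnumbers the right one.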

With correct integration, the browser presents the microservice to the thing that is doing the correcting, to override, with greater specificity, the latent dictionarial tendencies.
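The site-side half of that could be as small as the sketch below, which publishes the word list as JSON using only the Python standard library. The path `/site-dictionary.json`, the payload shape and the sample terms are hypothetical. No existing browser or OS spell-checker is assumed to consume such an endpoint; that integration is the part that remains to be baked.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample vocabulary; a real site would generate this list.
SITE_TERMS = ["halfbakery", "resveratrol", "croissant"]


def dictionary_payload(terms, site="example-forum"):
    """Encode the site's word list as a JSON document (assumed format)."""
    doc = {"site": site, "words": sorted(set(terms))}
    return json.dumps(doc).encode("utf-8")


class DictionaryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the one hypothetical path is served.
        if self.path != "/site-dictionary.json":
            self.send_error(404)
            return
        body = dictionary_payload(SITE_TERMS)
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), DictionaryHandler).serve_forever()
```

Calling `serve()` and fetching `http://127.0.0.1:8000/site-dictionary.json` would return the list; whatever does the correcting would then need to treat those words as valid.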

Ian Tindale, May 29 2017

What is a dictionary? https://en.wikipedia.org/wiki/Dictionary
[popbottle, May 30 2017]


       That would be very useful indeed - I've been pulling my hair out over the last year and a half trying to get computers to read text and generate topical summaries for pointy-haired bosses that are demonstrably and verifiably "good" (the summaries, not the bosses).

       What started off as an exercise in linear algebra quickly became bogged down in jargon/language/domain specific wordlists to be hoisted into or out of the process depending on the context. What I like about the sound of this idea is that it could be used to generate and publish those wordlists up front, rather than me having to infer them.   

       And by infer them, I mean take a list of all the words used in a given corpus, run them against a dictionary to find the matches, then take the unmatched set, and try to figure out whether they are spelling mistakes, neologisms, jargon, in-jokes or are otherwise being used "on purpose". That's achievable, with a little human curation, but very difficult to explain to a pointy haired one who wants to know what you've been doing for the past 6 months.
zen_tom, May 29 2017
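The inference step described in the annotation above (list every word in a corpus, match against a dictionary, keep the unmatched set) might look roughly like this. It is a minimal sketch: the function name and the tokenising regex are assumptions, and it only produces the candidate list. Deciding which candidates are mis-spellings, neologisms, jargon or in-jokes is still left to human curation.

```python
import re
from collections import Counter


def unmatched_words(corpus_texts, dictionary_words):
    """Return (word, count) pairs for corpus words absent from the dictionary.

    Most frequent first, ties broken alphabetically, so a curator can
    work down the list from the top.
    """
    known = {w.lower() for w in dictionary_words}
    counts = Counter(
        word
        for text in corpus_texts
        for word in re.findall(r"[a-z][a-z'-]*", text.lower())
    )
    candidates = [(w, n) for w, n in counts.items() if w not in known]
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))
```

For example, `unmatched_words(["Resveratrol is good", "resveratrol and rediculous claims"], ["is", "good", "and", "claims"])` yields `[("resveratrol", 2), ("rediculous", 1)]`: one piece of jargon and one mis-spelling, indistinguishable until someone looks at them.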

       http://www.longevity.org sort of has this. If you hover your mouse over a word like "resveratrol" it makes a mini hover box describing resveratrol. The topics are not gathered automatically as far as I know.
beanangel, May 30 2017

       I prefer the idea that you originally typed.
pertinax, Jun 01 2017

       Bayesian derived tag cloud.
bigsleep, Jun 02 2017

       No, you can't have that, it'd be guaranteed to be incorrect due to cretins. Off this topic, but to illustrate, if you went by what cretins type into the internet, you'd think that "loose" was a legitimate way of expressing a lack or a loss of something rather than expressing how it is floppy or dangly or how it rattles around. You'd think there's such a word as "rediculous", and why not purpleiculous or yellowiculous. You'd think that when someone "defiantly" should do something or buy something, it indicates that they shouldn't have, or wouldn't have, or were forced not to, but they went ahead and stubbornly did it anyway. This is because there's a lot of cretins on the internet.
Ian Tindale, Jun 03 2017

       //guaranteed to be incorrect due to cretins//   

pertinax, Jun 03 2017

