h a l f b a k e r y
"Bun is such a sad word, is it not?" -- Watt, "Waiting for Godot"

Bayesian Categorization
Taming the "Category, pick one" menu
 


There are something like 1500 idea categories on the Halfbakery, which makes it really hard to pick the right one from the menu on the "new idea" page.

I think that a little pattern-matching software could automatically pick categories for ideas, or at least make plausible suggestions.

The idea is to use what's called a "naive Bayesian classifier." This is a fairly simple bit of software that would extract features from an idea and, using probabilities gleaned from a training set, estimate for each possible category the probability that the idea belongs in it. It could display the top ten (say) as hints on the "new idea" page.
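
To make the classifier description concrete, here is a minimal sketch in Python. It is an illustration, not the Halfbakery's code: the class name, the bag-of-words features, the add-one smoothing, and the toy category names in the example are all assumptions chosen to keep it short.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Hypothetical multinomial naive Bayes over word features."""

    def __init__(self):
        self.category_counts = Counter()         # ideas seen per category
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.vocabulary = set()

    @staticmethod
    def features(text):
        # Assumed feature set: lowercase words, punctuation dropped.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, category):
        self.category_counts[category] += 1
        for word in self.features(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def suggest(self, text, top_n=10):
        """Return the top_n categories, most probable first."""
        total_ideas = sum(self.category_counts.values())
        vocab_size = max(len(self.vocabulary), 1)
        words = self.features(text)
        scores = {}
        for category, n_ideas in self.category_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus log likelihood of each word, with add-one
            # smoothing so unseen words do not zero out a category.
            log_p = math.log(n_ideas / total_ideas)
            for word in words:
                log_p += math.log((counts[word] + 1) / (total_words + vocab_size))
            scores[category] = log_p
        return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy training set; the ideas and category names are made up.
nb = NaiveBayesCategorizer()
nb.train("wasabi flavoured chewing gum", "food: condiment")
nb.train("mustard dispenser for hot dogs", "food: condiment")
nb.train("bicycle with square wheels", "vehicle: bicycle")
nb.train("folding bicycle helmet", "vehicle: bicycle")
print(nb.suggest("wasabi nasal spray", top_n=2))
```

Working in log probabilities avoids underflow when an idea has hundreds of words. The scores are only good for ranking; turning them into honest probabilities would need a normalization step that this sketch leaves out.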

For a training set, we can just use the current Halfbakery database.

The main open question is what the feature set ought to be. Obvious candidates are words in the text of the idea, or better, their thesaurus categories. (Word pairs or triples might work even better.)
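
As a rough illustration of the word, word-pair, and word-triple candidates, here is one way the feature extraction might look. The function name and the tokenizing rule are assumptions, and the thesaurus-category variant is left out because it would need an external thesaurus.

```python
import re


def extract_features(text, max_n=3):
    """Return single words, word pairs, and word triples from the text.

    Hypothetical helper: max_n=1 gives plain words, max_n=2 adds pairs,
    max_n=3 adds triples.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    features = []
    for n in range(1, max_n + 1):
        # Slide a window of n consecutive words across the text.
        for i in range(len(words) - n + 1):
            features.append(" ".join(words[i:i + n]))
    return features


print(extract_features("Wasabi Nasal Spray"))
```

Pairs and triples capture phrases like "nasal spray" that single words miss, but they also multiply the number of distinct features, so with a training set the size of one site's database many of them would be seen only once.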


td, Dec 17 2003

Bayesian classifier in Python http://www.divmod.org/Reverend/
Sample code. [td, Oct 04 2004]

Classifying spam http://www.paulgraham.com/spam.html
This article, which is about statistically recognizing spam, has a good description of the naive Bayesian classifier buried in it. [td, Oct 04 2004]



       I find the best way to get a category is simply to search for closely-related ideas. This also has other benefits.

kropotkin, Dec 17 2003
  

       Searching for something closely-related works poorly for sufficiently weird ideas. This idea came to mind because searching wasn't helping me find an apposite category for Wasabi Nasal Spray.   

       You can think of this idea as a (fairly sophisticated) search feature that works on the text of the submitted idea.

td, Dec 17 2003
  

       Heh! "People who viewed this idea also viewed..."

UnaBubba, Dec 17 2003
  

       Tom certainly does have a point though. Manually picking a category is nigh impossible these days.

waugsqueke, Dec 18 2003
  

       Not impossible, but tedious certainly. Fortunately I've got around the problem by not having any new ideas.

DrBob, Dec 18 2003
  

       I find picking the category to be half the fun, sometimes. Although often people disagree with my decision.

Loris, Dec 18 2003
  

       This smacks of Windows XP to me.   

       I really don't like it when computers try to be more intelligent than they are capable of being.

phundug, Dec 18 2003
  


 