 h a l f b a k e r y "Bun is such a sad word, is it not?" -- Watt, "Waiting for Godot"
There are something like 1500 idea categories on the Halfbakery, which makes it really hard to pick the right one from the menu on the "new idea" page.
I think that a little pattern-matching software could automatically pick categories for ideas, or at least make plausible suggestions.
The idea is to use what's called a "naive Bayesian classifier." This is a fairly simple bit of software that would extract features from ideas and, using probabilities gleaned from a training set, assign a probability that the idea belongs in each possible category. It could display the top ten (say) as hints on the "new idea" page.
For a training set, we can just use the current Halfbakery database.
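To make that concrete, here is a rough sketch of how such a classifier might look. It is only an illustration under some assumptions of mine: features are plain lowercased words, unseen words are handled with add-one smoothing, and the names (NaiveBayes, train, suggest, features) are made up for this example rather than taken from any existing library or from the Halfbakery itself.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Lowercased words of the idea text (an assumed, very simple feature set)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Hypothetical minimal naive Bayesian classifier with add-one smoothing."""

    def __init__(self):
        self.category_counts = Counter()            # ideas seen per category
        self.feature_counts = defaultdict(Counter)  # category -> feature -> count
        self.vocabulary = set()                     # every feature seen in training

    def train(self, text, category):
        """Record one already-categorised idea from the training set."""
        self.category_counts[category] += 1
        for f in features(text):
            self.feature_counts[category][f] += 1
            self.vocabulary.add(f)

    def suggest(self, text, top=10):
        """Return up to `top` (category, probability) pairs, most probable first."""
        total_ideas = sum(self.category_counts.values())
        if total_ideas == 0:
            return []
        words = features(text)
        scores = {}
        for category, n in self.category_counts.items():
            counts = self.feature_counts[category]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Work in log space so long ideas do not underflow to zero.
            score = math.log(n / total_ideas)
            for f in words:
                score += math.log((counts[f] + 1) / denom)
            scores[category] = score
        # Turn log scores back into probabilities that sum to 1.
        best = max(scores.values())
        weights = {c: math.exp(s - best) for c, s in scores.items()}
        norm = sum(weights.values())
        ranked = sorted(weights, key=weights.get, reverse=True)[:top]
        return [(c, weights[c] / norm) for c in ranked]
```

Training would mean calling train() once for every existing idea with its current category; the "new idea" page would then call suggest() on the draft text and show the returned categories as hints.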
The main open question is what the feature set ought to be. Obvious candidates are words in the text of the idea, or better, their thesaurus categories. (Word pairs or triples might work even better.)
Bayesian classifier in Python
http://www.divmod.org/Reverend/ Sample code. [td, Oct 04 2004]
Classifying spam
http://www.paulgraham.com/spam.html This article, which is about statistically recognizing spam, has a good description of the naive Bayesian classifier buried in it. [td, Oct 04 2004]
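As for the feature question raised in the idea text, single words, word pairs and word triples are easy enough to extract. The sketch below is just one way to do it; the function name ngram_features and its max_n parameter are invented for this example, and thesaurus categories are left out because they would need an external word list.

```python
import re


def ngram_features(text, max_n=2):
    """Return single words plus runs of up to `max_n` consecutive words."""
    words = re.findall(r"[a-z']+", text.lower())
    feats = []
    for n in range(1, max_n + 1):
        # Slide a window of n words across the text.
        for i in range(len(words) - n + 1):
            feats.append(" ".join(words[i:i + n]))
    return feats
```

With max_n=2 this yields words and word pairs; max_n=3 adds triples. The resulting list could be fed to any classifier that accepts a list of features per idea.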
Annotation:
I find the best way to get a category is simply to search for closely related ideas. This also has other benefits.
Searching for something closely related works poorly for sufficiently weird ideas. This idea came to mind because searching wasn't helping me find an apposite category for Wasabi Nasal Spray.
You can think of this idea as a (fairly sophisticated) search feature that works on the text of the submitted idea.
Heh! "People who viewed this idea also viewed..."
Tom certainly does have a point though. Manually picking a category is nigh impossible these days.
Not impossible, but tedious certainly. Fortunately I've got around the problem by not having any new ideas.
I find picking the category to be half the fun, sometimes. Although often people disagree with my decision.
This smacks of Windows XP to me.
I really don't like it when computers try to be more intelligent than they are capable of being.