There's a new kind of spam. I don't know what it's called, yet, but you can see it if you search for "vitamin chocolate" on google and jump ahead to, say, page 10. You'll start seeing pages and pages of nonsensical search results from sites generated only to attract search queries. [Later: This example
no longer works. Yay general improvements in technology!]
They're clearly produced by the same or similar software. The pages all have similar general titles: "Foo - info on Foo" or "section relating to Foo" or "information about Foo"; and next to the quotes with the search term in them, you'll find a general cacophony of other high-scoring search terms.
Google has a way of specifying conditions for exclusion of results (prefix them with a -), and it has the site: operator that lets me refer to results on a specific site. So, once I've learned that a site X is run by thieving bastards that only want to trick idiots into accessing their links, I can use -site:X to exclude it from my results.
That works, but doing it again and again is tiresome.
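(As a minimal sketch of scripting that workaround: keep a personal list of spam domains and append one -site: exclusion per domain to every query. The domain list below is purely illustrative.)

# Sketch: build a query with one -site: exclusion per blacklisted domain.
SPAM_DOMAINS = ["applelot.com", "freeglen.com", "furytea.com"]

def build_query(terms, spam_domains=SPAM_DOMAINS):
    """Return the query with a -site: exclusion for every blacklisted domain."""
    exclusions = " ".join("-site:" + d for d in spam_domains)
    return (terms + " " + exclusions).strip()

print(build_query('"vitamin chocolate"'))
# "vitamin chocolate" -site:applelot.com -site:freeglen.com -site:furytea.com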
What I would like to see, somewhere, is a way of automatically filtering all my search results so that they never, ever return anything from domains (and their subdomains) that I've identified as spam domains. And give users a way of sharing such spam-search-trap blacklists with each other (e.g., by being able to load them from other URLs, in a nice, public XML format).
That way, google doesn't have to interfere with the users or invent new ways of making their algorithm detect nonsensical results, yet the users have some way of retaliating against this ridiculous attention-grabbing.
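To make that concrete, a rough sketch in Python, assuming a made-up XML blacklist format (the element name is invented for illustration, not any real standard): fetch a shared list from a URL, then drop any result whose host is a listed domain or one of its subdomains.

import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

def load_blacklist(url):
    """Fetch a shared blacklist such as:
    <blacklist><domain>applelot.com</domain><domain>freeglen.com</domain></blacklist>"""
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    return {d.text.strip().lower() for d in root.iter("domain") if d.text}

def is_spam(result_url, blacklist):
    """True if the result's host is a blacklisted domain or a subdomain of one."""
    host = (urlparse(result_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blacklist)

def filter_results(result_urls, blacklist):
    return [u for u in result_urls if not is_spam(u, blacklist)]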
[Thanks for the "report spam" link, tsuka. I didn't know you could do that. I've reported mine. I still think there's room for the private industry here.]
Google 'personalized' search (beta)
http://labs.google.com/personalized/ [krelnik, Oct 04 2004, last modified Oct 05 2004]
Something funny about all those domain names... Many are two dictionary words: applelot.com, freeglen.com, furytea.com, etc. Can something be done to filter, knowing that these domains are likely machine generated?
Certainly there are legitimate websites with two dictionary words (e.g. halfbakery), but maybe there could be some process for determining the validity of the good sites when their domain name appears machine generated.
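As a rough sketch of that check, assuming a local word list such as /usr/share/dict/words: flag a domain when the label before the TLD splits cleanly into two dictionary words. Note that it flags halfbakery too, which is exactly the false-positive worry.

def load_words(path="/usr/share/dict/words"):
    # Any dictionary word list will do; this path is a common Unix default.
    with open(path) as f:
        return {w.strip().lower() for w in f if len(w.strip()) >= 3}

def looks_machine_generated(domain, words):
    """True if the label before the TLD splits into exactly two dictionary words."""
    label = domain.lower().split(".")[0]
    return any(label[:i] in words and label[i:] in words
               for i in range(3, len(label) - 2))

words = load_words()
for d in ("applelot.com", "freeglen.com", "furytea.com", "halfbakery.com"):
    print(d, looks_machine_generated(d, words))   # halfbakery is a false positive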
I just would like to try to automate the process as much as possible, to avoid someone "identifying" as a spam domain one which is legitimate. I know it would happen, for the same reason my CD player shows me that "the Raod to you" is playing.
But now that I re-read your last paragraph...
You'll report them to Google? That won't do any good. You can't stay ahead of those people that way. Google just has to learn to filter out those bait-and-switch sites. One way would be a faster robot cycle time tied to a blacklist.
Perhaps, if it were as easy as reporting spam in Yahoo. But even then, spammers could retaliate by reporting real sites, making it impossible to weed out the spam without human intervention.
This might be an interesting feature for the "personalized search" that is in beta test at Google right now. It remembers a lot of other things about what you do and do not want to see in searches. See link.
This is an excellent idea.
One slow, non-automated approach is for Google to add a "never see results from this domain again" checkbox next to each search result. Store it in your personalized Google profile, referenced by a cookie (or account #).
Maybe there'd be a nice way for users to share that subsection of their profile.
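A small sketch of how the per-profile block list and that sharing could fit together; the profile id stands in for whatever the cookie or account # would supply, and the export reuses the toy XML format sketched above.

from urllib.parse import urlparse

blocked = {}   # profile id -> set of blocked domains

def block_domain(profile_id, domain):
    # The "never see results from this domain again" checkbox would call this.
    blocked.setdefault(profile_id, set()).add(domain.lower())

def visible_results(profile_id, result_urls):
    deny = blocked.get(profile_id, set())
    def hidden(url):
        host = (urlparse(url).hostname or "").lower()
        return any(host == d or host.endswith("." + d) for d in deny)
    return [u for u in result_urls if not hidden(u)]

def export_blacklist(profile_id):
    # Shareable form of that subsection of the profile, in the toy XML format above.
    entries = "".join("<domain>%s</domain>" % d for d in sorted(blocked.get(profile_id, set())))
    return "<blacklist>%s</blacklist>" % entries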
The way to make this work but prevent malicious individuals from subverting the system is by way of votes. Every vote would move the search result farther down in the results page. I do not think a result should ever disappear completely, just be ranked very low. Of course a very dedicated malicious one could vote multiple times for [zanzibar]'s address.
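A sketch of that vote-based demotion, with an invented penalty weight: each spam vote subtracts a little from a result's score, so heavily reported results sink toward the bottom but are never removed.

def rerank(results, spam_votes, penalty=0.1):
    # results: list of (url, relevance_score); spam_votes: url -> number of spam votes.
    def adjusted(item):
        url, score = item
        return score - penalty * spam_votes.get(url, 0)
    return sorted(results, key=adjusted, reverse=True)

results = [("http://applelot.com/foo", 0.9), ("http://example.org/recipe", 0.8)]
print(rerank(results, {"http://applelot.com/foo": 5}))
# five votes drop the first result below the legitimate one, but it is still listed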
Buy Vitamin Chocolate at Amazon.com: www.amazon.com/search?no_results_found
Buy Vitamin Chocolate at Barnes & Noble: www.barnesandnoble.com/query_&search=vitam....
I'd place this idea into a "not only, but also" class of interest.
Meaning, I'd like an agent version to reside within my email filtering module; it would follow up with an autoupdate to my personal killfile, and cc: notifications to any and all relevant ISP loci.
While we're doing killfiles, I'd also like one that hides any results that were on recent searches of mine.
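A tiny sketch of that, assuming the killfile simply remembers every URL already shown (a real agent would expire old entries):

recent = set()   # URLs shown in recent searches

def remember(result_urls):
    recent.update(result_urls)

def unseen_only(result_urls):
    return [u for u in result_urls if u not in recent]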
This would be extremely useful in searching for photos or images.