h a l f b a k e r y
Breakfast of runners-up.
(It's hard to believe this isn't baked, but...)
When browsing the web, especially the "dark" (unindexed) web, it can be useful to know whether people or subjects of interest are mentioned on a page after it's loaded. When visiting many pages just hoping to find something about an important subject, it can be too time-consuming to even scan them all, much less read them.
What is needed is a browser feature or extension that automatically searches each loaded page for multiple terms. There could be several classes of terms, each treated with an appropriate set of rules, such as:
* General subjects: Keywords, indicator phrases, ...
* Places: Known spelling variations, name changes during history, ...
* Surnames: Known spelling variations, Soundex and Double Metaphone, ...
* Use Soundex and Double Metaphone
* Compensation for typo (typographical), scanno (OCR) and spelling errors.
* De-hyphenate hyphenated words.
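As a rough illustration of the surname rule, here is a minimal sketch of American Soundex in Python. The function name `soundex` is just a label chosen for this example, and Double Metaphone is not covered. Names that get the same code would be treated as possible spelling variations of each other.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name (sketch)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = codes.get(ch, "")  # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts
    return (result + "000")[:4]  # pad or truncate to letter + 3 digits


# Spelling variations from the same family collapse to one code.
for a, b in (("Stuart", "Stewart"), ("Curtis", "Curtus"),
             ("Bernstein", "Burnstein"), ("McClintock", "MacClintock")):
    print(a, soundex(a), b, soundex(b))
```

Soundex is coarse, so a real tool would probably use it only to propose candidates and then apply the stricter criteria to them.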
Each search term should have specifiable criteria such as:
* Exact match
* Match case
* "Whole word only"
* At the start of word "x"
* At the end of word "x"
* Inside word "x"
* Plus all forms of the word
* Plus synonyms
* Regular expression
* Before/after word "x"
* Near (specify word count)
* On the same line as word "x"
* Not near word "x"
* Not before/after word "x"
* Not inside word "x"
* On a different line than word "x"
* Numeric value or range
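A few of these criteria could be sketched as follows in Python. The `Term` class, its field names and the `near` function are all made up for this illustration, and `near` handles single-word terms only. It is meant to show that "match case", "whole word only", "regular expression" and "near (word count)" are each a small amount of logic, not to suggest how a real extension would be built.

```python
import re
from dataclasses import dataclass


@dataclass
class Term:
    """One search term plus a few of its criteria (hypothetical structure)."""
    text: str
    match_case: bool = False
    whole_word: bool = False
    is_regex: bool = False

    def pattern(self) -> re.Pattern:
        body = self.text if self.is_regex else re.escape(self.text)
        if self.whole_word:
            body = rf"\b(?:{body})\b"
        flags = 0 if self.match_case else re.IGNORECASE
        return re.compile(body, flags)


def word_positions(term: Term, words: list[str]) -> list[int]:
    """Indexes of the words in which the term is found."""
    pat = term.pattern()
    return [i for i, w in enumerate(words) if pat.search(w)]


def near(text: str, a: Term, b: Term, max_words: int) -> bool:
    """True if a and b occur within max_words words of each other."""
    words = re.findall(r"\w+", text)
    pos_a = word_positions(a, words)
    pos_b = word_positions(b, words)
    return any(abs(i - j) <= max_words for i in pos_a for j in pos_b)


sentence = "George Thorogood is a singer from Delaware"
print(near(sentence, Term("Thorogood"), Term("singer", whole_word=True), 3))
print(near(sentence, Term("Thorogood"), Term("singer", whole_word=True), 2))
```

The "not near" and "before/after" criteria would follow the same pattern, comparing the word positions with a different condition.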
The feature should have a well-designed GUI to help non-techies enter terms easily.
It would be nice if there were a website where people could share complex searches so others could add them to their collections with a single click.
Edgar McClintock likes:
* George Thoroughgood (singer)
* Atlanta Rhythm Section (musical group)
* Ozark Mountain Daredevils (musical group)
* Western movies and TV shows
Edgar has relatives and ancestors with the surnames: McClintock, Curtis, Bernstein and Stuart.
Edgar's search terms (in Google-like syntax) might include:
* singer OR artist George Thoroughgood OR Thorogood OR Thoroughbred
* band OR group Atlanta OR Atlantic Rhythm OR Rythm OR Rhythem Section
* McClintock OR MClintock OR McClintok OR MacClintock
* Curtis OR Curtus
* Bernstein OR Bernstine OR Burnstein
* Stuart OR Stewart
...and so on...
Each manually entered term would have typos, scannos and common misspellings added automatically, to improve the chance of finding every possible match.
It should be noted that typos and scannos can be "computed" and therefore would not require a database of all possible errors of these types.
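To show what "computed" might mean, here is a small Python sketch that generates variants from a correct term. The function names and the `OCR_CONFUSIONS` table are assumptions made for this example. The table holds only a few commonly cited OCR mix-ups, and a real list would be longer and tuned to the source material. Substituted and inserted letters are left out to keep the variant set small.

```python
# Illustrative OCR look-alike pairs (an assumption, not an exhaustive list).
OCR_CONFUSIONS = [("rn", "m"), ("cl", "d"), ("l", "1"), ("O", "0"), ("vv", "w")]


def typo_variants(word: str) -> set[str]:
    """Variants from one dropped letter, one doubled letter, or one swap."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    doubles = {a + b[0] + b for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    variants = deletes | doubles | swaps
    variants.discard(word)  # e.g. swapping two identical letters
    return variants


def scanno_variants(word: str) -> set[str]:
    """Variants from replacing one look-alike sequence, in either direction."""
    out = set()
    for good, bad in OCR_CONFUSIONS:
        for src, dst in ((good, bad), (bad, good)):
            start = word.find(src)
            while start != -1:
                out.add(word[:start] + dst + word[start + len(src):])
                start = word.find(src, start + 1)
    out.discard(word)
    return out


print(sorted(typo_variants("Curtis")))
print(sorted(scanno_variants("Bernstein")))
```

Common misspellings such as Stuart/Stewart are a different matter. They are not simple slips of the keyboard or scanner, so they would still need a phonetic rule or a shared list.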
Invisible Web at wikipedia.org
DarkNet = Deep Web = Deepnet = the invisible Web = Undernet = the hidden Web [Alvin, Nov 13 2011]
I am very familiar with "Ctrl+F", but it only searches for a single, simple term at a time, and never automatically. I have many projects which require finding all occurrences of numerous terms and their variations on each of a great many pages.
Expandable sections such as those on Wikipedia hopefully just have the HTML loaded, but hidden. If so, they could be searched as they are, or perhaps expanded programmatically.
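If the hidden text really is in the loaded HTML, searching it is straightforward. Here is a minimal Python sketch using the standard library's `html.parser`. The `TextExtractor` class and `page_text` function are names invented for this example. It simply ignores CSS visibility and collects all text outside `script` and `style` elements. Content that a page fetches only when a section is expanded would not be found this way.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects page text whether or not CSS hides it (sketch)."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def page_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


sample = ('<p>Visible</p><div style="display:none">Curtis family</div>'
          '<script>var x = "Stuart";</script>')
print(page_text(sample))
```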
||Just curious as to how you browse the dark unindexed web, and how you know you are there when you are there.
I'm not sure it's always possible to tell for sure when one is on the dark web. I mainly just guess whether the pages I visit would have been composed manually or dynamically generated from external data.
See the link.
Used to be that every search came with a [cache] feature which would highlight individual words on a loaded page.
I don't know why that feature disappeared, but I miss it. Placing specific phrases in quotation marks, along with the highlighted words, made it very easy to intuit just which string of words would call up what I wanted to see within the first few hits, if not the first hit, most times.
||Makes me want to write a country'n'western song about it.