I remember something I saw but can't find it. I want to search the full text of the pages referred to by the website list in my browsing history. (Or in any list of websites for that...)
grep
https://en.wikipedia.org/wiki/Grep "a command-line utility for searching plain-text data sets" [8th of 7, Dec 08 2020]
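Something along these lines can be hacked together with a short script. A minimal sketch in Python, assuming a Firefox profile (history lives in the places.sqlite database, table moz_places) and a crude regex match against the raw HTML of each page; the database path, the 200-visit limit and the search phrase are all placeholders, Chrome keeps an equivalent SQLite file with a different schema, and any page that has changed or vanished since you visited it won't match.

    # Sketch: re-fetch recently visited pages and search their text.
    # Assumes a copy of Firefox's places.sqlite (the live file may be locked).
    import re, sqlite3, urllib.request

    HISTORY_DB = "places.sqlite"            # copied out of the browser profile
    PHRASE = re.compile(r"half-remembered phrase", re.IGNORECASE)

    con = sqlite3.connect(HISTORY_DB)
    urls = [row[0] for row in con.execute(
        "SELECT url FROM moz_places ORDER BY last_visit_date DESC LIMIT 200")]

    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                text = resp.read().decode("utf-8", errors="replace")
        except Exception:
            continue                        # dead links, paywalls, bitrot
        if PHRASE.search(text):
            print(url)

Searching the raw HTML is rough and ready; stripping tags first, or searching the browser's own cache instead of re-fetching, would be obvious refinements.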
Annotations:
I think you can run a search against your Google account search history - but for that, you need to have fairly reliably run all your searches under your Google username, which might put some people off. It seems to be an option here in Chrome; it might not extend to other browsers, I'm not sure.

How do I do that? I don't mean searching for a phrase in the history list, I mean searching for the phrase in the websites themselves, the ones my history list refers to.

My bad - just tried it, and the secondary search only works on webpage _titles_ that appear in your history, not the original full text. Sorry, as you were.

If your device doesn't support grep, then you need to get one with a proper operating system.

I'm in the habit of saving any interesting pages I visit; it conserves bandwidth, defends against bitrot, and incidentally solves [Pashute]'s problem.
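In case anyone wants to pick up the same habit, a rough Python sketch of the keep-it-first approach; the archive location and the file-naming scheme are just assumptions.

    # Sketch: fetch a URL and file it under a dated directory so it can be
    # searched later. Archive root and naming are arbitrary choices.
    import datetime, pathlib, re, sys, urllib.request

    ARCHIVE = pathlib.Path.home() / "web-archive"

    def keep(url):
        day = datetime.date.today().isoformat()
        name = re.sub(r"[^A-Za-z0-9.-]+", "_", url)[:150] + ".html"
        dest = ARCHIVE / day
        dest.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(url, timeout=15) as resp:
            (dest / name).write_bytes(resp.read())
        return dest / name

    if __name__ == "__main__":
        print(keep(sys.argv[1]))

Once the pages are on disk, something like grep -ril "phrase" ~/web-archive does the finding.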
As Socrates said, Ya cain't grep what ya hain't kep'

I too am a webpage hoarder. Do you think we should form a support group?

I've been archiving news on a daily basis for the last 3 years - being able to enter a search term and see how it appears over time can sometimes be illuminating - but the process relies heavily on an ever-diminishing set of free-to-access RSS feeds. Interestingly, the idea of data as an asset has caused a number of information sources to charge for access to their data. Reuters turned off their free-to-access news feed, probably to encourage interested parties to pay for access to their curated repository as a commercial asset. A "seen-it-stored-it" facility would save a lot of fiddling about.
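For what it's worth, the daily-archive part needn't be much more than this Python sketch, which pulls one RSS 2.0 feed, stamps each item with the date it was seen and appends it to a flat file; the feed URL and the tab-separated store are placeholders, and a real setup would run it from a scheduler once a day per feed.

    # Sketch: append today's items from an RSS feed to a searchable flat file.
    import datetime, urllib.request, xml.etree.ElementTree as ET

    FEED = "https://example.com/news/rss.xml"    # placeholder feed URL
    STORE = "news-archive.tsv"

    with urllib.request.urlopen(FEED, timeout=15) as resp:
        root = ET.fromstring(resp.read())

    today = datetime.date.today().isoformat()
    with open(STORE, "a", encoding="utf-8") as out:
        for item in root.iter("item"):           # RSS 2.0 <item> elements
            title = (item.findtext("title") or "").strip()
            link = (item.findtext("link") or "").strip()
            out.write(f"{today}\t{title}\t{link}\n")

A later search over the file then shows when a term first appears and how often it recurs.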
I was about to point you to the Colindale newspaper archive, which I have used happily in the past, [zen_tom], but it seems the news from there is not good.

// I too am a webpage hoarder. //

More deserving of pity than condemnation ...

// Do you think we should form a support group? //

What, like trestles, or a brick plinth or something?

There's a TV program about going round to people's houses and "de-cluttering" them, with the aid of a skip.

Fortunately, a relatively short burst of 7.62mm automatic fire (1 round in 5 tracer) aimed just above head height is enough to send them scurrying away.

Maybe you need someone like them, but for your offline archive? The trick with the automatic fire will probably work just as well if you don't...

I can see it coming: the next YouTube advertising trend will be "Declutter the Computer" experts.

[pertinax] it's a pain - the only way to (non-commercially) get good content seems to be to scrape it manually from ever-changing websites, and the hard part is automatically noticing when the format of those sites changes so you can recode the scrapers. It's disheartening to clean up a dataset and find gaps in the middle when something stopped working due to a redesign.
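One way to take some of the sting out of that is a canary check after each run, so a redesign shows up as a warning the same day rather than as a gap discovered months later. A sketch in Python, where the scraped items and the required fields are entirely hypothetical stand-ins for whatever a real site-specific scraper returns.

    # Sketch: sanity-check a scraper's output and warn if a site redesign
    # has broken it (too few items, or expected fields coming back empty).
    import sys

    MIN_ITEMS = 10
    REQUIRED_FIELDS = {"headline", "url", "published"}

    def check(items):
        problems = []
        if len(items) < MIN_ITEMS:
            problems.append("only %d items scraped, expected at least %d"
                            % (len(items), MIN_ITEMS))
        for field in REQUIRED_FIELDS:
            missing = sum(1 for it in items if not it.get(field))
            if missing:
                problems.append("%d items missing '%s'" % (missing, field))
        return problems

    if __name__ == "__main__":
        # stand-in for the output of a real, site-specific scraper
        items = [{"headline": "example", "url": "https://example.com",
                  "published": "2020-12-08"}]
        problems = check(items)
        for p in problems:
            print("SCRAPER WARNING:", p, file=sys.stderr)
        sys.exit(1 if problems else 0)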
[pashute] have you come across `youtube-dl`? It's a unix utility for downloading content from YouTube. I've yet to find a tangible use for it, but it is handy, and it does mean you can pull content down and watch it offline, at leisure, without the imposition of advertising. It can only be a matter of time before it gets disabled/blocked.
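For the record, youtube-dl can also be driven from Python rather than the shell; a minimal sketch, assuming the youtube_dl package is installed, with the output filename template and the "watch-later.txt" list of URLs both made up for illustration.

    # Sketch: download a list of URLs for offline viewing via the
    # youtube_dl package (the command-line tool works just as well).
    import youtube_dl

    def pull(urls):
        opts = {"outtmpl": "%(title)s-%(id)s.%(ext)s"}   # output filename template
        with youtube_dl.YoutubeDL(opts) as ydl:
            ydl.download(urls)

    if __name__ == "__main__":
        with open("watch-later.txt", encoding="utf-8") as f:
            pull([line.strip() for line in f if line.strip()])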