I remember something I saw but can't find it. I want to search in the texts referred to by the website list in my browsing history.
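A rough sketch of what is being asked for, in Python, assuming the history has already been exported to a plain list of URLs (the export step is not shown and differs per browser). The names `search_pages`, `fetch_html` and `html_to_text` are made up for illustration. It re-downloads each page and looks for the phrase in the visible text, so it only finds pages that still exist and still say what they said when you saw them.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Reduce an HTML document to its visible text with collapsed whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def fetch_html(url, timeout=10):
    """Download one page. The User-Agent string is an arbitrary placeholder."""
    req = Request(url, headers={"User-Agent": "history-search-sketch"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def search_pages(urls, phrase, fetch=fetch_html):
    """Return the URLs whose visible text contains the phrase (case-insensitive)."""
    needle = " ".join(phrase.lower().split())
    hits = []
    for url in urls:
        try:
            text = html_to_text(fetch(url))
        except (OSError, ValueError):
            continue  # dead link, timeout, or malformed URL: skip it
        if needle in text.lower():
            hits.append(url)
    return hits
```

Something like `search_pages(urls_from_history, "the phrase I half remember")` would then return the matching addresses. The `fetch` parameter is there so a local cache of saved pages could be swapped in instead of the live web.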
(Or in any list of websites for that...) -- pashute, Dec 08 2020
grep https://en.wikipedia.org/wiki/Grep "a command-line utility for searching plain-text data sets" [8th of 7, Dec 08 2020]
I think you can run a search against your Google account search history - but for that, you need to have fairly reliably run all your searches under your Google username - which might put off some. It seems to be an option here in Chrome; it might not extend to other browsers, I'm not sure. -- zen_tom, Dec 08 2020
How do I do that? I don't mean searching for a phrase in the history list, I mean to search for the phrase in the websites themselves, those that my history list refers to. -- pashute, Dec 08 2020
My bad - just tried it and the secondary search only works on webpage _titles_ that appear in your history, not the original full-text. Sorry, as you were. -- zen_tom, Dec 08 2020
grep.
<link>
If your device doesn't support grep, then you need to get one with a proper operating system. -- 8th of 7, Dec 08 2020
I'm in the habit of saving any interesting pages I visit; it conserves bandwidth, defends against bitrot, and incidentally solves [Pashute]'s problem.
As Socrates said, Ya cain't grep what ya hain't kep' -- spidermother, Dec 09 2020
I too am a webpage hoarder. Do you think we should form a support group? -- pertinax, Dec 09 2020
I've been archiving news on a daily basis for the last 3 years - being able to enter a search term and see how it appears over time can sometimes be illuminating - but the process relies heavily on an ever-diminishing set of free-to-access RSS feeds. Interestingly, the idea of data as an asset has caused a number of information sources to charge for access to their data. Reuters turned off their free-to-access news feed, probably to encourage interested parties to pay for access to their curated repository as a commercial asset. A "seen-it-stored-it" facility would save a lot of fiddling about. -- zen_tom, Dec 09 2020
I was about to point you to the Colindale newspaper archive, which I have used happily in the past, [zen_tom], but it seems the news from there is not good. -- pertinax, Dec 09 2020
// I too am a webpage hoarder. //
More deserving of pity than condemnation ...
// Do you think we should form a support group? //
What, like trestles, or a brick plinth or something ?
There's a TV program about going round to people's houses and "de-cluttering" them, with the aid of a skip.
Fortunately, a relatively short burst of 7.62mm automatic fire (1 round in 5 tracer) aimed just above head height is enough to send them scurrying away.
Maybe you need someone like them, but for your offline archive ? The trick with the automatic fire will probably work just as well if you don't... -- 8th of 7, Dec 09 2020
I can see it coming: the next YouTube advertisement trend, with Declutter the Computer experts. -- pashute, Dec 10 2020
[pertinax] it's a pain, the only way to (non-commercially) get good content seems to be to manually scrape it from ever-changing websites - the hard part is automatically monitoring for when the format of those sites changes, so you can recode the scrapers. It's disheartening to clean up a dataset and find gaps in the middle when something stopped working due to a redesign.
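For what it's worth, the monitoring part can be roughed out in Python along these lines. This is only a heuristic sketch: `looks_broken` and `find_gaps` are hypothetical names, the record format (a list of dicts per scrape) and the 50% threshold are assumptions, and a redesign that still yields plausible-looking text would slip straight past it.

```python
from datetime import date, timedelta


def looks_broken(records, required_fields, min_records=1, max_empty_ratio=0.5):
    """Heuristic: True when a scrape result suggests the page layout changed.

    A scrape is suspicious if it returns too few records, or if too many
    records have a blank or missing value in any required field.
    """
    if len(records) < max(min_records, 1):
        return True
    empty = sum(
        1
        for rec in records
        if any(not str(rec.get(field) or "").strip() for field in required_fields)
    )
    return empty / len(records) > max_empty_ratio


def find_gaps(archived_days):
    """Return (first_missing, last_missing) date ranges inside the archive."""
    days = sorted(set(archived_days))
    gaps = []
    for prev, nxt in zip(days, days[1:]):
        if (nxt - prev).days > 1:
            gaps.append((prev + timedelta(days=1), nxt - timedelta(days=1)))
    return gaps
```

Running `looks_broken` straight after each day's scrape gives a chance to raise the alarm on the day of the redesign, while `find_gaps` over the stored dates shows after the fact which days went missing.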
[pashute] have you come across `youtube-dl`? It's a Unix utility for downloading content from YouTube - I'm yet to find a tangible use for it, but it is handy, and does mean you can pull content down and watch offline, at leisure, without the imposition of advertising. It can only be a matter of time before it gets disabled/blocked. -- zen_tom, Dec 10 2020