halfbakery
If you look up an idea on the Halfbakery which isn't very recent, you'll probably find that some of the links don't work: the page has moved, or the server doesn't exist anymore. Whatever the case, you can't access whatever was linked.
To remedy this, I suggest that each link have a little "(cache)"
link next to it, before the linker's name. This would bring you to a barebones page asking whether you want to view the Google cache, the Archive.org copy, or a Coralized version of the page for slow-loading links.
Google, as many of you know, keeps a simple text-only cache of a webpage. While the cache isn't the most complete, Google's servers are fast and reliable.
Archive.org offers the "Internet Wayback Machine," a system which lets you view how a page has changed over time. The archive is extensive and dates back to 1996, it seems.
Coral, also known as the New York University Distribution Network, is a free service with which any site can be given virtually infinite bandwidth. Just append ".nyud.net:8090" to the end of a domain, and the page is cached; anyone else who visits the Coralized page within a short amount of time sees the cached version, saving the original server's bandwidth. This is a great solution for slow-loading pages.
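To make the proposal concrete, here is how the three fallback URLs described above might be constructed for a dead link. This is a minimal Python sketch: the Coral host-rewriting rule and the Google "cache:" operator are taken from the descriptions above, the Wayback wildcard-date URL form is Archive.org's, and the helper names are my own.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def google_cache_url(url):
    # Google's cache is reached via the "cache:" search operator.
    return "http://www.google.com/search?q=cache:" + quote(url, safe="")

def wayback_url(url):
    # The Wayback Machine accepts a raw URL after a wildcard date.
    return "http://web.archive.org/web/*/" + url

def coralize(url):
    # Append ".nyud.net:8090" to the hostname; keep the rest of the URL intact.
    parts = urlsplit(url)
    host = parts.hostname + ".nyud.net:8090"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

print(coralize("http://www.halfbakery.com/idea/Cached_20Links"))
# -> http://www.halfbakery.com.nyud.net:8090/idea/Cached_20Links
```

The "(cache)" page from the idea would simply present these three URLs as choices.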
(?) Coral
http://www.scs.cs.nyu.edu/coral/ New York University Distribution Network [rgovostes, Oct 06 2004]
Coralized Idea
http://www.halfbake...idea/Cached_20Links This idea, mirrored by Coral. [rgovostes, Oct 06 2004]
Origins of The Wayback Machine
http://en.wikipedia..._growth_and_storage It started with Alexa Internet's automated crawlers... [Spacecoyote, Nov 24 2014]
|
|
It should be noted that the Halfbakery's robots.txt file forbids Archive.org from crawling the site and archiving ideas. Or not: |
|
|
That was true for a while, but is currently incorrect. I don't think it was correct at the time you made that annotation. |
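[Whether Archive.org's crawler is actually blocked can be checked against the site's robots.txt; Python's urllib.robotparser evaluates the rules directly. The rules below are hypothetical, for illustration only; substitute the site's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents -- substitute the site's real file.
rules = [
    "User-agent: ia_archiver",   # Archive.org / Alexa's crawler
    "Disallow: /",               # this would block archiving entirely
]

rp = RobotFileParser()
rp.parse(rules)
print(rp.can_fetch("ia_archiver", "http://www.halfbakery.com/idea/Cached_20Links"))
# -> False: under these rules, ia_archiver may not crawl the URL
```

With no such entry present, can_fetch returns True and the ideas get archived.]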
|
|
This idea would establish a complicated mechanism (and interdependencies between the halfbakery and several other enterprises, some of them commercial) to do something that
someone intelligent enough to click on that "cached" link can probably do themselves, if they're sufficiently curious. |
|
|
"Coral" is a cool idea, but has nothing to do with the problem you're trying to solve, right? I mean, our links tend to die of old age, not of overload. |
|
|
[By the way, I just tried to follow your link to the "Coralized Idea": |
|
|
Error 408 Request Time-out |
|
|
Server CoralWebPrx/0.1 (See http://www.scs.cs.nyu.edu/coral/) at 128.148.34.3:8090] |
|
|
Thank you for the correction - I thought I had tried looking an idea up on Archive.org before and it told me about the robots.txt file, but I did not confirm this before annotating. |
|
|
Coral is very handy for sites which are slow-loading or have been temporarily knocked offline. It has frequently been used on Slashdot.org to neutralize the "Slashdot effect," i.e. to provide a mirror of a page which isn't available due to heavy traffic. |
|
|
Strange that the link does not work for you. It loads nearly instantly for me. |
|
|
It repeatedly fails for me. We're probably geographically distant and are being redirected to two different proxies - yours works, mine doesn't. Sucks that the demo doesn't work for me (the "cnn.com" demo they have on their webpage works, though), but isn't it cool how you can tell from the errors how it's structured...? |
|
|
I have a user-configured verb in my copy of IE named "cache". If a URL fails to load, I just go up to the address bar, type "cache" and a space to the left of the URL, and hit return. It loads the page from the Google cache. |
|
|
This is pretty easy to set up in IE; it's just a registry entry. Put the following in a file CACHE.REG and then double-click it: |
|
|
REGEDIT4

; Defines an address-bar search verb named "cache" for Internet Explorer.
[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\SearchUrl\cache]
; %s is replaced by whatever follows the verb in the address bar.
@="http://www.google.com/search?q=cache:%s"
; Character substitutions applied to %s before the search URL is built:
" "="+"
"#"="%23"
"&"="%26"
"?"="%3F"
"+"="%2B"
"="="%3D"
|
|
|
If you have the Google toolbar loaded, you can do the same thing from within the toolbar box, just put a colon ":" between "cache" and the URL instead of a space. |
|
|
Yeah, the Google Toolbar also has an "info" button (it's an i in a blue circle) which among other things offers a cached version of the page. I've also seen plugins for Mozilla which add an Archive.org lookup button to the toolbar. You can also find simple bookmarklets which Coralize the page you're currently looking at. |
|
|
Still, most users don't have things like these installed. |
|
|
// This idea would establish a complicated
mechanism (and interdependencies between the
halfbakery and several other enterprises, some of
them commercial) to do something that someone
intelligent enough to click on that "cached" link
can probably do themselves, if they're sufficiently
curious. // |
|
|
I know that you can request that archive.org cache a
page. Does it automatically browse the internet
and cache pages that no one requests? If not, it
seems like it would be useful for the
halfbakery to automatically send a cache request
whenever someone links a page (or possibly wait a
week, so ideas or links that get deleted or flagged
right away don't get cached). My first thought
was that this would be taking advantage of
archive.org, but really they want to cache
anything that anyone finds interesting or
useful. That is true of most of the links
people post here. |
|
|
Of course you might want to ask if they would
appreciate that before you actually implemented
it. |
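[The scheme above (queue a link, wait a week, then ask Archive.org to save it) can be sketched in a few lines. The save endpoint here is the Wayback Machine's on-demand URL as I understand it, the one-week delay is the figure proposed above, and the request is deliberately built but not sent; actually sending it, and rate-limiting, are left out.

```python
import time
from urllib.request import Request

SAVE_ENDPOINT = "https://web.archive.org/save/"  # Wayback Machine on-demand save
DELAY = 7 * 24 * 3600                            # wait a week, per the idea above

def due_for_archiving(posted_at, now=None):
    # True once a link has survived a week without being deleted or flagged.
    now = time.time() if now is None else now
    return now - posted_at >= DELAY

def save_request(url):
    # Build (but don't send) the archive request for a surviving link.
    return Request(SAVE_ENDPOINT + url)

req = save_request("http://www.halfbakery.com/idea/Cached_20Links")
print(req.full_url)
```
]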
|
|
//Does it automatically browse the internet and
cache pages that no one requests?// |
|
| |