halfbakery
Archive location as well as content
The Internet Archive's Wayback Machine is great for scraping content and saving it, and also for serving that content through a new archive.org URL, the tail end of which reproduces the original source URL.
Proposed is a high-level deal with the domain registrars and authorities such that an abandoned website can be scraped and its URL redirected. I'm thinking that a similar user interface might be used as at present, with a timeline floating at the top of the page showing previous captures of that URL. But the original domain would also be "captured", and so the original link from 15 years ago would still work.
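As a rough sketch of how the Archive's side of such a deal might work, assuming the abandoned domain's name servers have been pointed at an Archive-run host: the handler below rebuilds the originally requested URL from the Host header and path, then bounces the visitor to the Wayback Machine's existing web.archive.org/web/<timestamp>/<url> scheme, which already carries the floating capture timeline. The class name and the partial "2009" timestamp are illustrative only.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    WAYBACK = "https://web.archive.org/web/2009/"  # partial timestamp: capture closest to 2009

    class ArchiveRedirect(BaseHTTPRequestHandler):
        def do_GET(self):
            # Rebuild the URL the visitor actually asked for, e.g. http://oldurl.com/page.html
            original = "http://" + self.headers.get("Host", "") + self.path
            self.send_response(302)
            self.send_header("Location", WAYBACK + original)
            self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 8080), ArchiveRedirect).serve_forever()

Serving the capture in place rather than redirecting would keep the visitor at the original URL, as proposed; the 302 is just the simplest version to sketch.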
What about domains which continue in use but are radically changed in their content? Well, in theory the same thing could happen: when you enter an obsolete URL, instead of getting a 404 page (or, worse, a silent redirect to the home page) you get hopped to the archive scrape and timeline while still being at the original URL.
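Some of this hop can already be approximated client-side with the Wayback Machine's public Availability API (archive.org/wayback/available), which returns the capture closest to a given URL. A minimal sketch, with made-up function names:

    import json
    import urllib.error
    import urllib.parse
    import urllib.request

    def closest_capture(url: str) -> str | None:
        # Ask the Availability API for the capture nearest to this URL.
        query = urllib.parse.urlencode({"url": url})
        with urllib.request.urlopen("https://archive.org/wayback/available?" + query) as resp:
            data = json.load(resp)
        snap = data.get("archived_snapshots", {}).get("closest")
        return snap["url"] if snap and snap.get("available") else None

    def resolve(url: str) -> str:
        # Try the live page first; on a 404, hop to the archived capture instead.
        try:
            urllib.request.urlopen(urllib.request.Request(url, method="HEAD"))
            return url
        except urllib.error.HTTPError as err:
            archived = closest_capture(url) if err.code == 404 else None
            if archived is None:
                raise
            return archived

A browser extension running this check on every 404 would get much of the behaviour described above without any registrar involvement.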
What about individual URLs which are still active but have changed radically? Well in that case it would be up to the site owner whether to allow the Archive to put a floating timeline along the top.
Perhaps at least some of this functionality could be done client-side. But I would still like to see obsolete domains captured and vested in the Archive, so they can be reunited with their scraped content.
[pocmloc, Mar 22 2021]
||So you're proposing that when you try and open www.oldURL.com, instead of going to a "page not found", it automatically goes to www.archive.org/oldURL.com? How does this mechanism know that the original request has failed?
||The name servers point oldurl.com to the archive server
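In zone-file terms the handover might look something like this; the name-server host and address below are placeholders, not the Archive's real infrastructure:

    ; oldurl.com after being vested in the Archive (hypothetical delegation)
    oldurl.com.      IN  NS     ns1.archive.example.  ; placeholder name server
    oldurl.com.      IN  A      192.0.2.10            ; placeholder Archive web front end
    www.oldurl.com.  IN  CNAME  oldurl.com.

Any request for the old domain then lands on an Archive front end, which can serve the scrape at the original URL.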
||Came across a kind of half-baked implementation of this: I clicked on a link to visit a website, and I got a Cloudflare landing page which told me that the site was offline, and after a short pause it served me an archived version of the page with a header banner explaining this.
||If you could map base64 onto UTF-8, and were allowed to encode your URL as an arbitrarily long UTF-8 string, you could embed the contents of your entire web page as a base64 representation within the URL. It might lead to less-than-optimal human recognition for a given URL, but it would make concrete the link between index and underlying content. Since any conceivable web page can be encoded as a base64 number, you might not actually need to host your website anymore, since the whole thing would be reproducible from its (admittedly enormous) URL.
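For small pages, data: URLs (RFC 2397) already do a version of this: the whole document is baked into the address and needs no host at all. A sketch of the round trip, with made-up page content:

    import base64

    html = "<!doctype html><title>Self-hosted</title><p>This page lives in its URL.</p>"
    url = "data:text/html;base64," + base64.b64encode(html.encode("utf-8")).decode("ascii")
    print(url)  # paste into a browser's address bar and the page renders

    # The "hosting" is fully reversible: content comes straight back out of the index.
    assert base64.b64decode(url.split(",", 1)[1]).decode("utf-8") == html

Browsers cap URL length at different points, so in practice this only scales to small pages.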
||Great idea [z] but not quite it
||I remember [Vernon] posting images here in the link thingy using this method.
||Would website owners have to opt in to this service, or would they be able to opt out?