h a l f b a k e r y
Oh yeah? Well, eureka too.


Archive based disaster recovery service.
  (+4, -7)

We provide a last-ditch disaster recovery service for websites using archives from Google, Wayback and other online services. Sites crash on a daily basis, some seriously, and an alarming number of these are not backed up. However, our experienced team can ensure that these days nothing is lost forever. Using our proprietary software, they can recreate your site word for word, even if the server and all the backup tapes were destroyed in a fire. While it may not be possible to retrieve more recent material, recovery levels of 99% are not ususual, which would save you 99% of the man-hours and cost of rebuilding in case of disaster.

Phoenix.com, the internet safety-net.

Cost-effective solutions for a dangerous world.


Of course I can barely code basic HTML, but I'm sure others could use this idea... Jutta? Egbert?

wagster, Oct 13 2004

Warrick http://warrick.cs.odu.edu/
Tool to do what you want. YMMV. [gtoal, Oct 07 2007]


       have you asked phoenix?
po, Oct 13 2004

       Phoenix has obviously been around a while. You got that link up while the hb search page on my machine spent about five minutes looking for all instances of "Phoenix"; there are a few there. Will drop him a line and tell him to headhunt Jutta.
wagster, Oct 13 2004

       There is an existing phoenix.com that has nothing to do with this idea. Would you mind naming your idea something else before it is misinterpreted as referring to the real company?   

       So, your "idea" is to do the same thing that I did, only ... for money?   

       How inventive.   

       Yes, I thought about it. Figure it's too obvious to be worth posting. Still figure that.
jutta, Oct 13 2004

       Funny that he never counted himself among those who regularly destroyed their accounts.
RayfordSteele, Oct 13 2004

       I've been thinking about a peer backup system. One that uses techniques similar to the movie download sites -- they use it to prevent identification, I would use it for privacy and data protection.
theircompetitor, Oct 14 2004

       Jutta's efforts are indeed worth a mint. [Jutta], if you accept donations or have a favored charity please let me know. (Until then I’ll send mine willy-nilly.)   

       I seriously doubt that proprietary software could replace her. I am no programmer, but it must require one to (re)build a custom database from overlapping incomplete collections of unstructured data. Even with a service doing grunt work feeding 99% of plain HTML or custom parsed data, 99% work reduction seems unrealistically high.   

       The burden of work must be in the sophisticated stuff that I don't really understand.
Laughs Last, Oct 14 2004

       I'm sure 99% is unrealistically high, it was more ad-speak than anything. Many sites will be largely uncached anyway for security reasons, especially commercial ones I should imagine.
wagster, Oct 14 2004

       Well, imagine my surprise...   

       We actually use what you might call a peering system where I work. Backups migrate to a central system in-house, then offsite - automatically - overnight. It's not very sophisticated, but it works very well. There's no reason why multiple archives couldn't be created, bandwidth being the central issue. For true peering, two parties could agree to allow each other the use of a predetermined amount of storage.   

       If anyone's interested, I'll see if I can put together a polished version. In all fairness, I've been inhumanly busy lately, so it might take a while to get around to it. Also, there's no facility for security, so you're on your own there.
phoenix, Oct 23 2004

       //recovery levels of 99% are not ususual//   

       Umm, shouldn't that say unusual? This idea could actually work, but it would be expensive and would need a LOT of memory. After all, you probably can't store the entire Internet (or at least most of the really important, English language ones) on a few USB drives.
Shadow Phoenix, Sep 30 2007

       Of course it would work - the majority of this site was recreated in exactly this way about three years ago. The technique was thought up by smarter users than me - I just shamelessly cashed in on the idea by suggesting it be turned into a commercial service, which is why I got those well-deserved bones.   

       Please don't bump this again - I might just get more...
wagster, Sep 30 2007

       I actually did this last year. Got most of my web site back from various repositories, although I'd shot myself in the foot a little by having had a robots.txt file that banned the Microsoft spider.   

       A couple of problems: spidering Google to get your own pages back - they have both a rate limit on what you can fetch, and an absolute limit on the number of results they'll present.   

       Also the Alexa/Wayback Machine archive - there's a big gap between the last three months that Alexa will give you, and the historical pages that archive.org will return. I didn't manage to get a lot of my stuff back for six months while I waited for it to work its way through the invisible pipeline from Alexa to Archive.org.   

       There is a tool called Warrick out there that attempts to do multi-service recovery for you but it didn't do much for me. I ended up writing my own.
gtoal, Sep 30 2007
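
A minimal sketch of the "various repositories" step described above, assuming each repository has already been scraped into a list of captures. The `Snapshot` type, its fields, and `merge_snapshots` are hypothetical names for illustration; nothing here talks to a real archive or search cache. It simply keeps the most recent capture of each URL, whichever repository it came from.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable


@dataclass(frozen=True)
class Snapshot:
    """One archived copy of one page (hypothetical record layout)."""
    url: str
    source: str      # e.g. "wayback" or "search-cache"; labels are illustrative
    captured: date   # when the repository captured the page
    html: str


def merge_snapshots(snapshots: Iterable[Snapshot]) -> Dict[str, Snapshot]:
    """Return the newest snapshot for each URL across all repositories."""
    newest: Dict[str, Snapshot] = {}
    for snap in snapshots:
        current = newest.get(snap.url)
        # Keep this capture if it is the first or the most recent seen so far.
        if current is None or snap.captured > current.captured:
            newest[snap.url] = snap
    return newest
```

Pages that appear in only one repository pass through unchanged, so a gap in one archive can be covered by another.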

       Congratulations on piecing your stuff back together! The way I dealt with Google's 1000-result limit back then was by throwing in (different) additional search terms that I knew would only appear in smaller subsets of the site. (Specific usernames, for example.) I never hit the rate limit.
jutta, Oct 07 2007
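
The extra-search-term trick above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was actually used: `search` is a hypothetical callable standing in for whatever returns result URLs for a query, and the `site:` query format is assumed. Each narrow query stays under the cap, and the union of all of them covers the site.

```python
from typing import Callable, Iterable, List, Set

RESULT_CAP = 1000  # per-query result limit mentioned in the thread


def recover_urls(
    site: str,
    extra_terms: Iterable[str],
    search: Callable[[str], List[str]],
) -> Set[str]:
    """Union the results of many narrow queries against one site.

    `search` is a hypothetical stand-in, not a real search-engine API.
    """
    found: Set[str] = set()
    for term in extra_terms:
        # Each term (a username, say) should match only a small subset of pages.
        query = f"site:{site} {term}"
        results = search(query)[:RESULT_CAP]  # never more than the cap comes back
        found.update(results)
    return found
```

Coverage depends entirely on the choice of terms: a page that contains none of them is never returned.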

       I once lost my mind, but I think I got it back. De-Boned by one.
xenzag, Oct 07 2007

       // The way I dealt with Google's 1000-result limit back then was by throwing in (different) additional search terms that I knew would only appear in smaller subsets of the site. (Specific usernames, for example.) // - that's sort of how I did it -- by using the 'allinurl:' tag and walking the hierarchy. Slowly.   

       Note that the Warrick tool is no longer a downloadable utility that you run on your own system but is now a web-based service. So I guess the original proposal is now baked.
gtoal, Oct 08 2007
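
Walking the hierarchy, as described above, might look something like this sketch. It assumes the site owner already knows the directory layout (the hypothetical `children` mapping) and that `search` is a stand-in callable returning URLs for a query; only the 'allinurl:' prefix comes from the thread. A directory is queried as a whole, and only when the result count hits the cap is it split into its subdirectories.

```python
from typing import Callable, List, Mapping, Sequence, Set

RESULT_CAP = 1000  # per-query result limit mentioned in the thread


def walk_hierarchy(
    prefix: str,
    children: Mapping[str, Sequence[str]],
    search: Callable[[str], List[str]],
    cap: int = RESULT_CAP,
) -> Set[str]:
    """Collect URLs under `prefix`, descending only where results are capped.

    `children` maps a directory prefix to its known subdirectory prefixes;
    both it and `search` are assumptions made for this illustration.
    """
    results = search(f"allinurl:{prefix}")
    found: Set[str] = set(results)
    if len(results) >= cap:
        # The list was probably truncated, so retry with narrower prefixes.
        for child in children.get(prefix, []):
            found |= walk_hierarchy(child, children, search, cap)
    return found
```

A capped directory with many pages of its own and no subdirectories can still lose results, which is where the extra-search-term approach would have to take over.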

