h a l f b a k e r y
Assume a hemispherical cow.


distributed reliability database

Collect information on what works and what doesn't.
 

Users run an agent that reports an inventory of the hardware in their system (disk drives, add-on cards, etc.) every so often, so there's a record of how long each component has been working. If equipment fails in a way the agent can't detect, the user reports that manually. The agent also records usage patterns, monitors the case temperature (most computers have several temperature sensors), and so on.
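As a rough illustration of what one periodic report from such an agent might carry, here is a minimal Python sketch. Every name in it (`Component`, `Report`, `build_report`, the field names, the JSON encoding) is an assumption made for the example; the idea itself doesn't specify a format, and a real agent would have to read the inventory and sensors from the operating system rather than take them as arguments.

```python
import json
import time
from dataclasses import dataclass, field, asdict


@dataclass
class Component:
    kind: str    # e.g. "disk" or "add-on card"
    model: str   # hypothetical model string, e.g. "Foo"
    serial: str  # distinguishes two units of the same model


@dataclass
class Report:
    machine_id: str
    timestamp: float        # seconds since the epoch
    components: list        # list of Component
    case_temp_c: float      # one reading; real machines expose several sensors
    hours_in_use: float     # crude stand-in for "usage patterns"
    manual_failures: list = field(default_factory=list)  # serials the user reported dead


def build_report(machine_id, components, case_temp_c, hours_in_use,
                 manual_failures=(), now=None):
    """Bundle one snapshot of the machine into a Report."""
    return Report(
        machine_id=machine_id,
        timestamp=time.time() if now is None else now,
        components=list(components),
        case_temp_c=case_temp_c,
        hours_in_use=hours_in_use,
        manual_failures=list(manual_failures),
    )


def to_json(report):
    """Serialize a Report for sending to the (hypothetical) central server."""
    return json.dumps(asdict(report), sort_keys=True)
```

Sending the full inventory each time, rather than only changes, means the server can work out how long each component has been in service purely from the sequence of reports.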

A central server collects all this and builds a big index of how long various products last before failure, the effects of light or heavy usage and temperature on failure rate, whether certain equipment tends to cause other equipment to fail, which operating systems have a higher "mean keystrokes between crashes" rate, and various other statistics of note.
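One simple way the server-side index could be computed is sketched below in Python, assuming the reports have already been reduced to one `(model, hours_observed, failed)` tuple per physical unit. The function name `summarize` and that tuple layout are assumptions for illustration. The estimate is the pooled one: total observed unit-hours divided by the number of failures, so units that are still running contribute their hours too.

```python
from collections import defaultdict


def summarize(observations):
    """Pool per-unit observations into per-model statistics.

    observations: iterable of (model, hours_observed, failed) tuples,
    one per physical unit. `failed` is True if the unit has died.
    """
    hours = defaultdict(float)
    failures = defaultdict(int)
    for model, hours_observed, failed in observations:
        hours[model] += hours_observed
        if failed:
            failures[model] += 1

    summary = {}
    for model, unit_hours in hours.items():
        n_failed = failures[model]
        summary[model] = {
            "unit_hours": unit_hours,
            "failures": n_failed,
            # Undefined until at least one failure has been seen.
            "hours_per_failure": unit_hours / n_failed if n_failed else None,
        }
    return summary
```

For example, two units of model Foo observed for 1000 and 500 hours, with one failure between them, give 1500 hours per failure. Breaking the same figures down by temperature or usage would only need a finer key than `model`.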

That way you can shut up the "well *I've* been using a (model Foo) for two years and I've never had a problem" folks once and for all. It would also be a lot more useful than those lame uptime contests.

egnor, Jan 22 2003







       Unfortunately many of the most interesting data points will be forever trapped on computers that just died for some reason and therefore cannot "phone in".
krelnik, Jan 22 2003
  

       Well, the idea is that the user would phone in. Even if they don't, the server would notice that the machine had stopped reporting in...
egnor, Jan 22 2003
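The "stopped reporting in" check could look something like the Python sketch below. The function name, the `last_seen` mapping and the three-missed-reports threshold are all assumptions chosen for the example. Note that silence only marks a machine as a candidate failure: it may just as well have been switched off, retired or disconnected, so the manual report would still be needed to say what actually died.

```python
def silent_machines(last_seen, now, report_interval_s, missed_reports=3):
    """Return IDs of machines that have missed several expected reports.

    last_seen: dict mapping machine ID to the timestamp of its last report.
    now, report_interval_s: seconds, on the same clock as the timestamps.
    """
    cutoff = now - report_interval_s * missed_reports
    return sorted(m for m, seen in last_seen.items() if seen < cutoff)
```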
  

       A better MTBF stat? (Mean Time Between Failures)
Shz, Jan 22 2003
  

       Croissant, especially if I can drill down through the stats, find the people with the same hardware as me and see what works on their systems.
st3f, Jan 23 2003
  

       Hardware performance also depends on software. Shouldn't this kind of agent be integrated with antivirus..?
Inyuki, Jan 23 2003
  

       Kind-of baked, at least in the software world. Whenever a program crashes, Windows (XP) asks me if I'd like to send an error report. Of course, I have no idea what Microsoft does with the millions (billions?) of error reports they receive each day from failing Windows programs.
mgangemi, Jan 23 2003
  

       mgangemi, the idea's about hardware.
waugsqueke, Jan 23 2003
  

       Not entirely: operating systems are mentioned.
krelnik, Jan 23 2003
  

       Software fault reporting (as done by Windows or Mozilla) is not irrelevant, though it's a different technique (and has its own failure problems; it won't catch the truly catastrophic failures, or any failure that impacts the network).   

       I know the Mozilla project uses its crash-report data to compile a list of "top places in the code where crashes occur" as an aid to developer effort. I presume Microsoft does something similar. It's not to report a problem that nobody else saw, it's to prioritize known problems.
egnor, Jan 23 2003
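The "top places in the code where crashes occur" list amounts to counting crash reports by location and sorting. A minimal Python sketch follows, assuming each report has already been reduced to a single signature string such as a function name. This is not Mozilla's or Microsoft's actual pipeline, and the function name `top_crash_sites` is made up for the example.

```python
from collections import Counter


def top_crash_sites(signatures, n=3):
    """Return the n most frequent crash signatures as (signature, count) pairs."""
    return Counter(signatures).most_common(n)
```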
  
      
  


 
