Distributed File System (Lite)

Storage solution for 2 to 50 networked computers. "Don't keep all your eggs in one basket"

This idea is inspired by the excellent Coda and AFS distributed file systems. A distributed file system has no central physical location and can be spread across N peers (read-write replication in Coda, read-only replication in AFS). This offers huge advantages when it comes to performance, redundancy and backup. The downside is that both AFS and Coda require a high degree of planning and knowledge to deploy. The biggest reason is that these file systems are designed for very heavy-duty use in environments of tens of thousands of users, where security, compatibility, performance, flexibility and scalability all play a huge role (and complexity creeps in whether we like it or not).

My idea is to simplify this excellent concept and make it practical for networks of 2 to about 50 computers on a single LAN.

- Each computer has an N GB hard drive or partition available to the cluster.
- All computers are on a single high-speed LAN with a predictable 100+ Mbps connection speed.
- Each file has a master location.
- Each file has N spare locations.
- The minimum number of spare locations per file is set by the admin (common sense says the minimum should be 1, so that if one copy fails the second survives; for more mission-critical environments this could be increased).
- Available disk space is reported as (sum of all drives on the LAN) - (used space) - (used spare space).
- To the user the drive appears as one huge drive.
- Files get replicated transparently by caching, ensuring that at least N spares exist somewhere on the network at all times.
- If you are interested in historical spares (what the file looked like a week ago) you can also specify those settings and have the system keep those copies for you.
- The user wouldn't know a thing. They would just see one huge drive (or a not-so-huge one if the spare count was set very high and you are keeping a lot of history). A rough sketch of the placement and space-reporting logic is below.
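
A rough sketch of what that policy could look like (Python, purely illustrative - the Node / choose_spares / reported_free_gb names are made up for this post, not anything Coda or AFS actually expose):

from dataclasses import dataclass

@dataclass
class Node:
    name: str
    capacity_gb: float    # space this machine donates to the cluster
    used_gb: float = 0.0  # space taken by master copies on this machine

    @property
    def free_gb(self):
        return self.capacity_gb - self.used_gb

MIN_SPARES = 1  # admin-set minimum number of spare copies per file

def choose_spares(nodes, master, n_spares=MIN_SPARES):
    # place spares on the machines (other than the master) with the most free space
    others = sorted((n for n in nodes if n is not master),
                    key=lambda n: n.free_gb, reverse=True)
    return others[:n_spares]

def reported_free_gb(nodes, spare_gb):
    # available space = (all drives on the LAN) - (used space) - (used spare space)
    return sum(n.capacity_gb for n in nodes) - sum(n.used_gb for n in nodes) - spare_gb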

And yes ... underneath, this is exactly what Coda and AFS do - that's why I give them big credit. But what Coda and AFS fail to do is make the process completely transparent to the admin. By scaling the solution down to a small network, the complexity of the technical issues involved is also reduced, and the administration overhead disappears because the decisions can be automated based on a specified policy. But note that the key benefits are not lost ... you still have:

- A 100% transparent backup system (all you specify is how many spares you want to keep).
- Improved performance (frequently used files are kept locally).
- Improved organization (users don't have to remember which of tens of drives a file sits on ... there is only one drive where everything sits).
- Space efficiency. Most companies I do administration for have 100+ GB hard drives on the workstations but only ~60 GB on the server. The workstations are extremely underutilized: they go to the server for most of their data and store only about 10 GB of OS themselves, leaving 90+ GB idle. For a small company with 10 workstations, that means they could sell the file server and have a huge 900 GB of non-redundant storage (backed up manually), or 450 GB of single-spare redundant storage (like a giant network RAID 1). If you then wanted daily history as well, that would eat into the total a little, depending on the delta size (it would store only the changes and rebuild a full snapshot on request ... an rsync-like idea). By the time you were satisfied with the number of spares and the history settings, you might still end up with the 60 GB you had before ... BUT
... you could do all this without having to lift a finger. If machines were stolen, crashed, or destroyed in a natural disaster, you would be 100% fine as long as you had more spares than the number of lost machines. It would also be extremely scalable ... want an extra 400 GB of redundant space with 2 spares? Add 3 x 400 GB anywhere on the LAN and you are done. (The space arithmetic is sketched below.)
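
To make the arithmetic above concrete (the numbers are the ones from the example; the helper function is just illustration):

def usable_gb(raw_gb, spares):
    # every file is stored once plus `spares` extra copies,
    # so usable space is the raw pool divided by (1 + spares)
    return raw_gb / (1 + spares)

raw = 10 * 90                    # 10 workstations x 90 GB idle each = 900 GB pooled
print(usable_gb(raw, spares=0))  # 900.0 - no redundancy
print(usable_gb(raw, spares=1))  # 450.0 - one spare per file, like a network RAID 1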

ixnaum, Oct 17 2005

OpenAFS (Andrew File System) http://www.openafs.org/
My inspiration #1 [ixnaum, Oct 17 2005]

Coda FS http://www.coda.cs.cmu.edu/
My inspiration #2 [ixnaum, Oct 17 2005]


       Perfect. I currently have three networked machines at home, all with different data on them. Some of them have back-ups of some of the others' data - it's a bit haphazard. I am going to purchase a big Network Attached Storage drive to back them all up onto, but this is a really neat solution. One problem - how does it cope with a machine going offline?
wagster, Oct 17 2005
  

       Well, if your LAST copy goes offline then the file is offline ... but if it's not an actual crash, the file could be replicated to another machine before that machine goes offline. That's why it would be good to set the number of spares as high as your disk space can handle (higher redundancy and failure tolerance).
ixnaum, Oct 17 2005
  

       I have investigated and been baffled by rsync. It sounds like a very clever solution but I don't think I know enough to implement it myself. Once again my techie credentials fail when real networking or programming knowledge is called for.
wagster, Oct 17 2005
  

       rsync is nice, but has major drawbacks   

       - in practice you can't realistically sync more than about every 15 minutes (on any typical-size drive), and it eats up quite a lot of CPU walking the filesystem to determine which files have changed
- it is designed more for static files (e.g. web pages) than for rapidly changing content (e.g. databases, mailboxes)
- if you change the replica, the changes will not be propagated back (you'll have two separate copies until the next sync overwrites your changes on the replica). A typical scheduled pull looks something like the sketch below.
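
What I mean by "run it on a schedule", as a rough sketch (assumes rsync is installed; the paths and host name are made up):

import subprocess, time

SRC = "/srv/share/"               # hypothetical directory to replicate
DEST = "backup-box:/srv/share/"   # hypothetical replica host:path

while True:
    # -a preserves permissions/timestamps, -z compresses, --delete mirrors deletions
    subprocess.run(["rsync", "-az", "--delete", SRC, DEST])
    time.sleep(15 * 60)           # roughly the practical floor mentioned above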
ixnaum, Oct 17 2005
  

       Ah - it runs in the background. Can't have that on the audio machine. Besides, once a week should be enough backup for me.   

       I don't know enough to compare the various systems on offer here, I'll leave that to others.
wagster, Oct 17 2005
  

       Maybe mine takes too much CPU (10%) because I run it through ssh for security ... that would explain it.   

       But the other problems stand. You have to run rsync on a schedule ... yes, it's infinitely more efficient than copying when the files haven't changed since the last sync, but it will never be as efficient as AFS or Coda. Especially when we talk about large files that change many times a minute ... or even many times per second - in that scenario rsync ends up doing exactly the same thing as copying ... and on top of that there is a good chance you copy the db file in an inconsistent, corrupted state (just as if you had copied the live file by hand).
ixnaum, Oct 17 2005
  

       so we MUST be talking about something different then :-) ... your rsync seems way better than mine ... I'm talking about the one at http://samba.org/rsync/   

       from man page: "rsync - faster, flexible replacement for rcp" ... this matches my long experience with using rsync exactly ...   

       So how do you make your rsync run continuously without scheduling the sync? Sure, there is a daemon mode for the server ... but you do have to schedule the client to trigger the sync, don't you?
ixnaum, Oct 17 2005
  

       Your wish is my command [ixnaum].   

       I'm on a research project on this topic. I can't say anything about it, but it's basically looking into the management of a user's personal content over a number of mobile devices... The project is in its infancy, but I'll come back in 3 years and let you know how I get on!
Jinbish, Oct 18 2005
  

       Three years! My files are a mess *now*!   

       The way it looks to me this instant, I think most of us are going to end up with most of our data (files and otherwise) on free or cheap remote public servers, accessed online. It's certainly the way I am working right now, forced to split my computing between my home PC and various PCs I have to share with others.
DrCurry, Oct 18 2005
  

       // going to end up with most of our data (files and otherwise) on free or cheap remote public servers//   

       That's great .. but I don't trust cheap or free remote servers :-) .. I worked with one briefly (they try the best they can) but I wouldn't want to store my valuable data there ... maybe some crappy data that I want to lose eventually.
ixnaum, Oct 18 2005
  

       Sorry DrC! But you are right about the remote servers. I think that as fixed-line broadband becomes more available, more people will be running servers at home (server-in-a-box type of stuff) and streaming their data/files to wherever they are.   

       So the burden falls on the communication methods and the ability of the data to be matched to the end device.
Jinbish, Oct 18 2005
  

       //The project is in its infancy, but I'll come back in 3 years and let you know how I get on!//   

       .... 6 years later (double the time you asked for). I really hope it's good news, [Jinbish]. I still need this.
ixnaum, Jun 01 2011
  
      