halfbakery: A riddle wrapped in a mystery inside a rich, flaky crust
Media FileSystem
Filesystem for Media and Other Large Files: one "cluster" = one physical track
Synopsis:
Map complete native HD tracks to the OS definition of a "cluster". This is specifically for very large files whose most common usage is read-only and sequential, i.e. partition backups, media files, install zips, etc. It can be implemented as one or more partitions on a drive, or as the entire drive.
Background:
HDs are low-level formatted at the factory into "tracks" (concentric circles) and "native sectors" (512-byte pieces of each track). Since the outer tracks are longer and have more room, they hold more native sectors.
High-level formatting writes an OS filesystem logically onto the native sectors, tracks and platters. It takes a number of contiguous native sectors and makes "clusters" out of them; the cluster is the basic unit it deals with. Clusters can cross track boundaries.
NTFS, for instance, imposes a cluster size of 512 bytes to 64KB (4KB is the usual default). So... how many audio or video files do you have that are 512 bytes in length, or even 64KB? Or that it would make any kind of sense to break up into portions that small? Just as importantly, how often do you listen to or view 512 bytes from a media file? At CD quality (176,400 bytes per second) that's about three thousandths of a second of a .wav file. Audio and video recordings are usually played sequentially from end to end (or in large chunks).
A modern disk drive might have an average of 800KB/track over the radius of the disk. That's *much* more in line with modern media files, which are typically measured in tens or hundreds of MB per file.
Proposal:
A filesystem for partitions that are dedicated to large, sequentially accessed files such as A/V files, CD images, large zip/rar files, etc. It takes advantage of the physical structure of an HDD to optimize data transfers. Data is read, written, stored, requested and transferred as complete physical tracks on the disk drive, i.e. one physical track is treated by the filesystem as one "cluster".
This isn't a "read-ahead" scheme (though of course you can still do that as well): physical tracks are treated in the same way that clusters currently are.
"Wastage", even with large numbers of files as small as 12MB (which is the average of most of my MP3'd album tracks), is about 3 percent: larger files mean less waste.
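The 3 percent figure follows from the usual slack-space reasoning: on average the last track of a file is half empty, so the expected waste is about half a track per file. A rough sketch of that arithmetic, assuming every track is a uniform 800KB (real drives vary by zone) and binary units; the function name is illustrative only:

```python
KB = 1024
MB = 1024 * KB
TRACK = 800 * KB  # assumption: uniform track size, taken from the average quoted above


def slack_fraction(file_bytes: int, track_bytes: int = TRACK) -> float:
    """Fraction of the allocated space left unused when a file occupies whole tracks."""
    tracks = -(-file_bytes // track_bytes)  # ceiling division
    allocated = tracks * track_bytes
    return (allocated - file_bytes) / allocated


def expected_waste(file_bytes: int, track_bytes: int = TRACK) -> float:
    """Average waste relative to file size, assuming the last track is half full on average."""
    return (track_bytes / 2) / file_bytes


print(f"expected waste, 12MB files:  {expected_waste(12 * MB):.1%}")
print(f"expected waste, 250MB files: {expected_waste(250 * MB):.2%}")
print(f"one exact 12MB file:         {slack_fraction(12 * MB):.1%}")
```

Any individual file can land above or below the average, depending on how much of its final track it fills.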
Of course the filesystem driver has to know how long each individual track is, but this is handled at the buffer level. The whole thing is transparent to users and programmers, except for the amazing lack of clatter from the disk drive and the faster access.
Whoops almost forgot...
<60GB partition: 16-bit addressing
<15TB partition: 24-bit addressing
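As a quick check on these limits, here is a sketch that multiplies the number of addressable tracks by an assumed average track size. The 800KB average comes from the Background section; the decimal units and the function name are assumptions for illustration. With these inputs the limits come out nearer 52GB and 13TB, so the figures above correspond to an average track of roughly 900KB.

```python
KB = 1000  # assumption: decimal units, as drive capacities are usually quoted
AVG_TRACK_BYTES = 800 * KB  # average track size from the Background section


def max_partition_bytes(address_bits: int, track_bytes: int = AVG_TRACK_BYTES) -> int:
    """Largest partition addressable when each address names one whole track."""
    return (2 ** address_bits) * track_bytes


for bits in (16, 24):
    size = max_partition_bytes(bits)
    print(f"{bits}-bit: {2 ** bits:,} tracks, about {size / 1e9:,.0f} GB")
```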
Annotation:
What problem are you trying to solve? Hard disks are already fast enough to stream full-screen video off the disk, never mind MP3. And the bottleneck isn't generally seek time anyway, it's the bandwidth between the disk and the motherboard.
Read-ahead buffers on the drive controller are a much better solution than going back to the bad old days when the OS needed to know the specifics of the hard disk geometry.
"or that it would make any kind of sense to break up into portions that small ?"
The cluster is the smallest unit of storage allowed by the OS. It's relevant because if you use 64k clusters and store a 1k text file, you're still taking up 64k of your disk.
If you deal with mostly large files or have a great deal of disk capacity, it makes sense to use larger clusters. In fact, Windows has a limit to the number of clusters it supports and will force you to use larger clusters on larger hard drives. Otherwise, it's best to compromise between the minimum amount of space a file will consume and disk access speeds.
P.S. In Windows, you can set the cluster size when you format the partition.
I don't believe I got 4 bones, but from the comments nobody actually read or understood it; my explanatory skills are exemplary as usual, but a bit more verbiage might be called for: I *don't* mean let's put all the computer's files onto a filesystem which has 600KB-1.2MB cluster sizes, just the partition(s) that are used for media storage, CD/DVD images and archive files.
[gravelpit] why should I have a problem to solve? I'm just matching the physical data format as closely as possible to the real data format (which in this case means taking whole tracks for large files). But as long as you asked, why exactly do you want to have to index 20,000 cluster allocations for a 10MB file instead of 12? Why do you want all those seeks (even if they are just logical seeks) for files that are read sequentially? And point out where I mention something that could have been construed as "old days", bad or otherwise... my tossing out of the "new" LBA scheme (what is that, circa 1987 or something?) for a system that takes modern file sizes in stride?
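The allocation counts quoted here can be checked with a ceiling division. A minimal sketch, assuming binary units and a uniform 800KB track; the function name is made up for illustration:

```python
KB = 1024
MB = 1024 * KB


def allocation_units(file_bytes: int, unit_bytes: int) -> int:
    """Number of clusters (or whole tracks) needed to hold a file."""
    return -(-file_bytes // unit_bytes)  # ceiling division


file_size = 10 * MB
for label, unit in (("512-byte clusters", 512),
                    ("64KB clusters", 64 * KB),
                    ("800KB tracks", 800 * KB)):
    print(f"{label}: {allocation_units(file_size, unit):,} allocations")
```

With these assumptions a 10MB file needs 20,480 of the smallest clusters and 13 whole tracks, which is in line with the rounded "20,000" and "12" above.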
[phoenix] the parts of your anno that made sense have nothing to do with this post. (As mentioned twice) this is for a media and archive-file partition, not the partition you have your webcache on, or stupid windows and thousands of 1k files that are never used.
[edit] changed the first paragraph to look a little less like I'm advocating putting 1k files onto 800KB clusters which I'm not.
//Make the lowest size addressable unit the physical track on a hard disk// Are those the short tracks near the spindle, or the long ones out near the circumference?
You're going to keep running into the same problem, regardless of what type of data you're talking about. If you had an OS that allowed 800k clusters and you store a file that's 900k, it's taking up 1600k of disk space and you won't have improved your access time at all, your exemplary explanatory skills notwithstanding. The larger cluster sizes are (generally) for larger hard drives (not larger file types) because the number of clusters per partition has an upper limit.
[WeirdGreenLiquid] yes. With a little more creative programming than I was willing to commit to the post, you can fill the clusters even more economically, i.e. if you have a 1.2MB file, use contiguous 400K tracks near the spindle; for a 3.6MB file, dump it out in the boonies on the 1.2MB tracks at the outer edge; if it's a 250MB file, put it near the outer edge and take the 0.2 percent space wastage.
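One way to picture that "creative programming" is a helper that reports, for each zone of the disk, how many tracks a file would take and how much of the last track it would waste. This is only a sketch: the three zones and their track sizes are hypothetical values lifted from the examples above, units are decimal, and the choice among equally good zones is left to whatever placement policy the filesystem would use.

```python
from typing import List, NamedTuple

KB = 1000
MB = 1000 * KB

# Hypothetical zones: track size grows from the spindle to the outer edge.
ZONES = {"inner": 400 * KB, "middle": 800 * KB, "outer": 1200 * KB}


class Fit(NamedTuple):
    zone: str
    tracks: int
    wasted_bytes: int


def fits(file_bytes: int) -> List[Fit]:
    """Tracks needed and bytes wasted if the file were stored entirely in each zone."""
    result = []
    for zone, track_bytes in ZONES.items():
        tracks = -(-file_bytes // track_bytes)  # ceiling division
        result.append(Fit(zone, tracks, tracks * track_bytes - file_bytes))
    return result


def least_waste(file_bytes: int) -> List[Fit]:
    """All zones tied for the smallest waste; picking among them is a policy decision."""
    options = fits(file_bytes)
    best = min(option.wasted_bytes for option in options)
    return [option for option in options if option.wasted_bytes == best]


for size in (1_200_000, 3_600_000, 250 * MB + 123_456):
    print(size, least_waste(size))
```

Round file sizes such as 1.2MB and 3.6MB fit several zones with no waste at all, which is why the annotation's preference (small files near the spindle, large ones at the edge) has to come from policy rather than from the waste figure alone.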
[FeatherDuster] it took some slogging, but I found out that modern HDDs can do about 200k tracks per inch. So the 1TB 6-platter HDD I was looking at has, say, 1.5M tracks. You could connect 10 of them to make one giant disk drive and *still* only need 24-bit addressing for the whole thing. Got a smaller partition (20-60GB, depending on where on the drive the partition is)? Then you only need 16-bit addressing.
Dude, make the whole friggin' platter one big cluster for all I care, I'm just explaining the rationale behind many small clusters. Honestly, you're not saving yourself a lot of time compared to the trouble of creating the file system. Especially if you're going to have it dynamically allocate drive space based on file size.
(As an aside, what if you open a 1.2M file, modify it and save it back as a 1.6M file - does the file system use four 400K clusters, one 1.2M cluster and one 400K cluster, or two 1.2M clusters?)
(P.S. what drives have 200,000 tracks per inch?)
As pointed out, current drives are 'fast enough' even for video. This idea might be useful for a media/archive optimised hard disk, with large capacity but much lower RPM and slower seek times, which would be quieter and use less power and perhaps be cheaper.
@spidermother: Tada! You just invented the DVD.
I had to go through 3 HD mfrs' websites before I found one that actually listed a TPI (193-207 kTPI for different models in one of their OEM class ranges, 190GB-1TB).
[spidermother] that would be my "DataBall" post: 2 track spiral, variable speed.
Hey, I'm just trying to provide a consumer niche for HD mfrs who will otherwise be going titsup when Flash drives hit a reasonable price/reliability point.