h a l f b a k e r y
The halfway house for at-risk ideas


in-place concatenation

Concatenation of files by rearranging file allocation instead of copying data.
  (+8)

Suppose you have a big file (or several) split into multiple smaller ones (for crude distributed storage or something similar), and you need to put them back together. Normally you'd either create a new file and copy the contents of each partial file into it, or append the other parts to the first one. But all that copying is redundant, at least if you'd delete the parts afterwards anyway.

I'd rather have a program that stitches the parts together by reassigning the clusters taken by the split parts to another file in the correct order. Of course, the size of every part except the last must be an exact multiple of the cluster size, or there would be garbage or zeros at the seams. But it's often the case that the split size is a nice round number.
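
To make the idea concrete, here is a minimal sketch in Python of a toy volume where a file is nothing more than an ordered list of cluster numbers. Everything in it (the `ToyVolume` class, the 4096-byte `CLUSTER_SIZE`, the method names) is hypothetical and made up for illustration; a real file system would need its own allocation structures and a privileged API to do this.

```python
CLUSTER_SIZE = 4096  # assumed cluster size in bytes, chosen for illustration


class ToyVolume:
    """Toy model of a volume: each file is an ordered list of cluster numbers."""

    def __init__(self):
        self.clusters = {}  # cluster number -> cluster contents (bytes)
        self.files = {}     # file name -> (list of cluster numbers, size in bytes)
        self._next = 0      # next unused cluster number

    def write_file(self, name, data):
        """Store data the ordinary way, copying it into fresh clusters."""
        chain = []
        for off in range(0, len(data), CLUSTER_SIZE):
            chunk = data[off:off + CLUSTER_SIZE]
            # The last cluster is zero-padded up to the cluster size.
            self.clusters[self._next] = chunk.ljust(CLUSTER_SIZE, b"\0")
            chain.append(self._next)
            self._next += 1
        self.files[name] = (chain, len(data))

    def read_file(self, name):
        chain, size = self.files[name]
        return b"".join(self.clusters[c] for c in chain)[:size]

    def concat_in_place(self, target, parts):
        """Join parts into target by reassigning clusters; no data is copied."""
        chain, size = [], 0
        for i, part in enumerate(parts):
            part_chain, part_size = self.files[part]
            # Every part but the last must end exactly on a cluster boundary,
            # otherwise padding would show up at the seam.
            if i < len(parts) - 1 and part_size % CLUSTER_SIZE != 0:
                raise ValueError(f"{part} is not a multiple of the cluster size")
            chain.extend(part_chain)
            size += part_size
        for part in parts:
            del self.files[part]  # the parts stop existing as separate files
        self.files[target] = (chain, size)
```

In this model, `concat_in_place("whole", ["a", "b"])` only touches the `files` table; the `clusters` dictionary is left exactly as it was, which is the whole point.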

orbik, May 18 2010







       [+], but only if the program can do the opposite, as well. That is, split apart a big file into many smaller ones, without copying the clusters.
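
       The reverse operation can be sketched the same way. This is a rough illustration only, assuming a file is represented as a list of cluster numbers plus a byte size; `split_chain` and `CLUSTER_SIZE` are hypothetical names, not any real file system's API.

```python
CLUSTER_SIZE = 4096  # assumed cluster size in bytes, chosen for illustration


def split_chain(chain, size, part_size):
    """Split a file, given as (cluster list, byte size), into parts.

    Each part reuses the original cluster numbers, so no data is copied.
    part_size must be a multiple of the cluster size so that every cut
    falls on a cluster boundary.
    """
    if part_size <= 0 or part_size % CLUSTER_SIZE != 0:
        raise ValueError("part size must be a positive multiple of the cluster size")
    clusters_per_part = part_size // CLUSTER_SIZE
    parts = []
    for start in range(0, len(chain), clusters_per_part):
        sub_chain = chain[start:start + clusters_per_part]
        # The final part may be shorter than part_size.
        sub_size = min(part_size, size - start * CLUSTER_SIZE)
        parts.append((sub_chain, sub_size))
    return parts
```

       For a five-cluster file cut into 8192-byte parts, this yields two full parts of two clusters each and a short last part holding the remaining cluster.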
goldbb, May 19 2010
  

       Wouldn't this depend on the file system's method of indexing clusters? An offset method wouldn't be able to jump to a brand new cluster index.
wjt, May 19 2010
  

       Okay, but is this really a problem for anyone?
phoenix, May 19 2010
  

       This general concept (though mostly in reverse--treating larger files as combinations of smaller ones) would mainly be useful if it could be combined with hard linking and copy-on-write semantics. In that circumstance, it could be very useful if file formats were set up to take advantage of it.   

       For example, a person might have a number of video-editing projects which use portions of some large video files. If a project uses 5 minutes out of a 90-minute file, it would be wasteful to make a separate copy of the 5 minutes, but it would also be wasteful to have to haul around the whole 90-minute file. Best would be to have the system keep a link to the appropriate part of the 90-minute file, with the ability to grab just the appropriate part of the file when copying it to another medium.   

       BTW, on a related note, it may be helpful to have a file format which stores a hash value with each chunk of the file, and have a system maintain an index of the hashes of chunks of data. In that way, when importing a large file, the system could check whether any of the chunks already exist and--if so--avoid storing them redundantly.
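
       A minimal sketch of such a hash index, in Python, might look like the following. The `ChunkStore` class, the fixed 4096-byte `CHUNK_SIZE`, and the choice of SHA-256 are all assumptions made for illustration; the only library call used is `hashlib.sha256` from the standard library.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes, chosen for illustration


class ChunkStore:
    """Toy store that keeps each distinct chunk once, indexed by its hash."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk contents (bytes)

    def import_file(self, data):
        """Store data; return (list of chunk digests, number of new chunks)."""
        recipe = []
        new_chunks = 0
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks not already indexed take up new space.
                self.chunks[digest] = chunk
                new_chunks += 1
            recipe.append(digest)
        return recipe, new_chunks

    def export_file(self, recipe):
        """Rebuild a file from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in recipe)
```

       Importing a second file that shares a chunk with an earlier one stores only the chunks that are new, while both files can still be rebuilt in full from their digest lists.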
supercat, May 19 2010
  

       In some YouTube videos, the part that changes most, or rather changes significantly, is the audio. If the video is not exactly in the right order from beginning to end, it doesn't matter. If the video is, say, "How to paint your face to look like a comic book character", getting a few bits out of sequence is not important. If the file transfer protocol allows minor noise, the overall speed might improve a bit.   

       Does copyright imply that a perfect copy is made?
popbottle, Jun 12 2016
  

       Copyright covers derivative works as well as verbatim copies.
notexactly, Jul 03 2016
  


 
