halfbakery: The best idea since raw toast.
It seems to me that a lot of space is needlessly used up in MP3 files, because the songs that people encode contain repeated sections. At the moment, these repeated sections are encoded and stored multiple times, which seems inefficient.
What I envisage is a kind of cross between the MP3 file format and the old-style "tracker" or "MOD" file formats, which allowed music to be pieced together from samples.
The new format would store many short stretches of MP3 audio in a single file, together with instructions on which segments to play in which order. Segments could be played more than once.
The obvious space saving comes when you encode a song with many separate verses but a chorus that is repeated throughout. The chorus could be encoded only once and simply played at the appropriate times.
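As a rough sketch of what such a file might hold (the class and field names here are made up for illustration, and the byte strings merely stand in for stretches of MP3 data that are assumed to be decodable on their own, which real MP3 frames sharing a bit reservoir may not be):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentedTrack:
    """Hypothetical container: unique audio segments plus a play order."""
    segments: List[bytes]   # each entry stands in for one stretch of MP3 data
    play_order: List[int]   # indices into `segments`; repeats are allowed

    def stored_size(self) -> int:
        # Bytes kept on disk: every segment is stored exactly once.
        return sum(len(s) for s in self.segments)

    def played_size(self) -> int:
        # Bytes a conventional file would need for the same playback.
        return sum(len(self.segments[i]) for i in self.play_order)

    def render(self) -> bytes:
        # Concatenate the segments in play order, as a player would feed them out.
        return b"".join(self.segments[i] for i in self.play_order)


# Placeholder data: two verses and one chorus that is played three times.
verse1, chorus, verse2 = b"a" * 300, b"c" * 500, b"b" * 300
song = SegmentedTrack([verse1, chorus, verse2], [0, 1, 2, 1, 1])
saving = 1 - song.stored_size() / song.played_size()
```

With these made-up sizes the file stores 1100 bytes instead of 2100, because the chorus is kept once rather than three times.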
Maybe I'm getting too old, but it also seems that most chart music these days is incredibly repetitive, and so would provide even more opportunity for saving space.
As for encoding, the first stage would be for someone to manually identify repeated sections within an existing MP3. However, it would be possible to automate the process by performing a Fourier transform on the data (to convert it to frequencies), and then running a sliding-window algorithm to find repetitions. This could be made easier if the algorithm searched for regular low-frequency pulses (i.e. the beat of the song), and then attempted to find repetitions at multiples of the beat interval.
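The automated search could look something like the sketch below. It is only an illustration under simplifying assumptions: the function names are invented, the transform is a naive DFT on fixed-size frames, only magnitudes are compared (phase is ignored), and a repeat is only found if it starts on a frame boundary.

```python
import cmath
import math
from typing import List, Optional, Tuple


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Naive O(n^2) DFT; returns the magnitudes of the first n // 2 bins."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def find_repeat(samples: List[float], frame_len: int, window_frames: int,
                tolerance: float = 1e-6) -> Optional[Tuple[int, int]]:
    """Slide a window of `window_frames` spectra over the track and return
    (first_start, repeat_start), in frames, for the first window that recurs."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    spectra = [magnitude_spectrum(f) for f in frames]

    def distance(a, b):
        # Largest per-bin difference between two runs of spectra.
        return max(abs(x - y) for sa, sb in zip(a, b) for x, y in zip(sa, sb))

    for start in range(len(spectra) - 2 * window_frames + 1):
        window = spectra[start:start + window_frames]
        for later in range(start + window_frames, len(spectra) - window_frames + 1):
            if distance(window, spectra[later:later + window_frames]) <= tolerance:
                return start, later
    return None


def tone(k: int, n: int = 16) -> List[float]:
    """One frame of a sine wave with k cycles per frame (toy test signal)."""
    return [math.sin(2 * math.pi * k * t / n) for t in range(n)]


# Toy track: frames A B C A B, so the two-frame pattern "A B" recurs at frame 3.
track = tone(2) + tone(5) + tone(3) + tone(2) + tone(5)
match = find_repeat(track, frame_len=16, window_frames=2)
```

On real music the tolerance would have to be much looser, and the brute-force search would be far too slow without the beat-based hint.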
ScreamTracker for DOS
http://www.united-t...e/screamtracker.htm Ahh, memories. [jester, Mar 18 2002]
MOD/XM/S3M/IT Trackers
http://www.soundtra...without_frames.html ...many without a 64k sample limitation [jester, Mar 18 2002]
This is an interesting idea. MPEG-2 and MPEG-4 both include prediction tools for audio encoding (in the AAC audio coding format, part of MPEG-2 and the basis of high-quality MPEG-4 audio). This means that if the signal at a given frequency doesn't change much between blocks, then only the difference between blocks needs to be encoded. However, the blocks used are about 23 ms long for CD-quality audio (44.1 kHz sampling rate), so prediction won't pick up repetition between bars of music.
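To put numbers on that mismatch, here is a quick back-of-the-envelope calculation. The 1024-sample block size is an assumption chosen because it matches the 23 ms figure, and the tempo of 120 beats per minute in 4/4 time is purely illustrative.

```python
SAMPLE_RATE_HZ = 44_100   # CD-quality sampling rate, as stated above
BLOCK_SAMPLES = 1024      # assumption: block size consistent with "about 23 ms"
TEMPO_BPM = 120           # assumption: illustrative tempo
BEATS_PER_BAR = 4         # assumption: 4/4 time

block_ms = 1000 * BLOCK_SAMPLES / SAMPLE_RATE_HZ      # duration of one block
bar_seconds = BEATS_PER_BAR * 60 / TEMPO_BPM          # duration of one bar
blocks_per_bar = bar_seconds * 1000 / block_ms        # blocks spanned by one bar

print(f"block: {block_ms:.1f} ms, bar: {bar_seconds:.1f} s, "
      f"blocks per bar: {blocks_per_bar:.0f}")
```

Under those assumptions a single bar spans roughly 86 blocks, so a predictor that only looks back a couple of blocks never sees the previous bar.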
The biggest arguments against extending the length of time over which prediction is performed are: (1) The large amount of music data that would have to be stored while decoding, one or more seconds' worth. This isn't crucial in PC-based decoding, but is much more important in portable devices. (2) The computation time required to search for matching data, resulting in considerable encoder complexity. This might present a real problem if you want to encode files on your own PC, and for real-time streaming audio. AAC's prediction algorithm already takes 40% of total decode time, and that only uses the previous two 23 ms blocks.
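For a feel of the storage involved in point (1), here is a rough estimate. It assumes the history is kept as decoded 16-bit stereo PCM and that a block is 1024 samples; a real decoder might keep its history in some other form, so treat the figures as orders of magnitude only.

```python
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2              # assumption: stereo
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM
BLOCK_SAMPLES = 1024      # assumption: block size consistent with "23 ms"


def history_bytes(seconds: float) -> int:
    """Bytes needed to hold `seconds` of decoded audio as history."""
    return round(seconds * SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE)


two_blocks = 2 * BLOCK_SAMPLES * CHANNELS * BYTES_PER_SAMPLE  # current look-back
one_second = history_bytes(1.0)
growth = one_second / two_blocks
```

Going from two blocks to one second of history takes the buffer from about 8 KB to about 172 KB, roughly a twentyfold increase.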
I don't know if any current codec (coder/decoder) uses this technique; Microsoft WMA doesn't seem to, and it's pretty much the state of the art.
It wouldn't work well on larger sections such as choruses, though, as they normally change subtly but meaningfully each time they're sung (e.g. getting louder and more impassioned).
If the vocal track (where such a thing exists) were to be encoded separately from the backing track, there could be a huge saving with this new-fangled beat-type music, which seems to consist of the same two bars repeated ad nauseam.
[pottedstu] - As far as I can see, this idea would place no extra burden on the player device, and so it would be feasible to implement it on small MP3 players. The player would merely have to play different sections of the file out of order, much like when you drag the progress bar in any graphical MP3 player.
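In code, the player side might be no more than a seek and a read per play-list entry. This is a minimal sketch: the function name and the (offset, length) play-list layout are invented for illustration, and it assumes each byte range can be handed to the decoder independently.

```python
import io
from typing import BinaryIO, Iterable, Iterator, Tuple


def play_out_of_order(stream: BinaryIO,
                      play_list: Iterable[Tuple[int, int]]) -> Iterator[bytes]:
    """Yield the byte ranges named in `play_list`, in order.
    Each entry is a hypothetical (offset, length) pair into the file."""
    for offset, length in play_list:
        stream.seek(offset)          # same operation as dragging the progress bar
        yield stream.read(length)    # this chunk would be handed to the decoder


# Placeholder "file": three stored sections; the chorus is played twice.
audio_file = io.BytesIO(b"VERSE1CHORUSVERSE2")
play_list = [(0, 6), (6, 6), (12, 6), (6, 6)]
played = b"".join(play_out_of_order(audio_file, play_list))
```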
I admit it would place a much larger burden on the encoder, but that's the price you pay for decreased file size. The encoder could use hints like the timing of the beats to work out likely repeat times: divide the song up into four-beat chunks, assuming four beats per bar, and search that way. It would work on most chart songs, but not on something like Queen's Bohemian Rhapsody!
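A minimal sketch of that bar-based search follows. It assumes the tempo is already known and constant, that bars line up exactly with sample boundaries, and that repeated bars are bit-for-bit identical so a plain dictionary lookup finds them; the function names are made up.

```python
from typing import List, Sequence, Tuple


def bar_length_samples(bpm: float, sample_rate: int = 44_100,
                       beats_per_bar: int = 4) -> int:
    """Number of samples in one bar at the given tempo."""
    return round(beats_per_bar * 60 / bpm * sample_rate)


def dedupe_bars(samples: Sequence[int],
                bar_len: int) -> Tuple[List[tuple], List[int]]:
    """Split `samples` into bars; return the unique bars and the play order."""
    unique: List[tuple] = []
    index_of = {}                     # bar contents -> position in `unique`
    order: List[int] = []
    for start in range(0, len(samples), bar_len):
        bar = tuple(samples[start:start + bar_len])
        if bar not in index_of:       # first time this exact bar is seen
            index_of[bar] = len(unique)
            unique.append(bar)
        order.append(index_of[bar])
    return unique, order


# Toy data with 4-sample "bars": the first bar recurs twice more.
toy = [1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4]
unique_bars, play_order = dedupe_bars(toy, bar_len=4)
```

Exact matching is the easy case; bars that differ slightly would need a similarity measure instead of a dictionary lookup.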
I also agree with your point about choruses for a lot of songs, but the choruses in today's chart songs would probably be bit-for-bit identical, what with digital mastering and mixing of these manufactured boy bands! Bleepy techno tunes especially would be about three to four seconds of actual MP3 data followed by the instruction "repeat for four minutes."
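Taking that techno example at face value, the saving is easy to estimate. The 128 kbit/s bitrate is an assumption picked for illustration, and the size of the "repeat" instruction itself is ignored.

```python
BITRATE_BPS = 128_000       # assumption: constant 128 kbit/s MP3
LOOP_SECONDS = 4            # "three to four seconds of actual MP3 data"
TRACK_SECONDS = 4 * 60      # "repeat for four minutes"

bytes_per_second = BITRATE_BPS // 8
loop_bytes = LOOP_SECONDS * bytes_per_second      # data actually stored
full_bytes = TRACK_SECONDS * bytes_per_second     # conventional MP3 size
ratio = full_bytes / loop_bytes
```

Under those assumptions the looped file is about 64 KB against roughly 3.8 MB for the conventional one, a factor of 60.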
Am I getting too cynical? Bah, kids these days, their music all sounds the same...
[pmillerchip]: I admit my concern about memory was more with streaming media players, where extra storage space would be required. But with an MP3 file, if you have to take parts of the stream from different sections and combine them, that requires a lot more decoding work.
I don't want to be too negative, since taking advantage of repeated patterns is an obvious way to compress audio that is under-used in current systems. I imagine it would be possible to devise an audio format that allowed for easy repetition, though.
Incidentally, MPEG-4 includes various tools for synthetic audio, i.e. generating sounds (kind of like MIDI, I think, using both pre-defined instruments and unique-to-the-file wavetables) and applying parametric effects to synthesised and existing sounds, rather than just compressing existing audio streams. So you may be able to do something similar using the facilities of MPEG-4: transmit blocks, and then use the SA (Structured Audio) coder, which includes something between a mark-up language and a full programming language, to reconstruct the full track. However, the synthetic audio features aren't usually implemented in current decoders. Many of the features of MPEG-4 are like this, designed to facilitate future methods of content generation, so it's possible we could see them used.
Sounds like you're looking for a MOD, XM, S3M, or IT file, with the caveat that at least one of the samples will be rather large (the vocal track). Many of the older trackers (MOD composers) limit sample size to ~64k. However, many of the newer ones remove this limitation, and it should be fairly easy to merge MP3 encoding into the existing specs, if this mutation doesn't already exist somewhere.