
 

misc: yet another lossless audio codec...
cr88192

2008-01-14, 3:56 am

well, this was an idea I beat together.

comment could be useful, if anyone has much to comment on.


basically, it is an "update" of the WaveZ0 format, which was developed
primarily for the reason of encoding audio data in my voice synth (this is a
generalization, and as such should be applicable to a wider range of audio
encoders).

the primary difference with this format (from WaveZ0) is in terms of the
header structure (this version has more compact chunk headers, and extension
headers).


primary goals:
be much simpler than, for example, FLAC or WavPack;
focus more on the format, rather than on the implementation (unlike, for
example APE or WavPack);
tolerable compression (maximal compression is not the goal).

as such, the format is non-adaptive (this is for reasons of both simplicity
and implementation neutrality).

even being non-adaptive, in my testing (ok, this was with WaveZ0...) it has
still performed fairly well (getting between about 40% and 70% size
reduction on average for diphones).


intended usage domain:
presumably for storing chunks of audio data, such as diphones, speech, or
sound effects.

not much point in focusing on, for example, lossless music compression
(already well served by technologies like FLAC and APE).


----
WaveZ1

Stream will consist of a number of headers, and chunks of compressed data.

WZ1Head {
FOURCC 'WAZ1'; //block magic
u32 size; //block size (bytes, includes headers)
u32 len; //length (samples) for this block
u32 enc; //block encoding
};

The size field gives the total size of the block (including any headers,
aka, the minimum valid value is >16).
This will allow skipping from one block to the next (absent a forwards scan
for the next magic or decoding to figure the size).
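
A minimal sketch of how a reader might hop between blocks using the size
field. The byte order is an assumption (little-endian; the description above
does not state one), and the names WZ1_HEAD, pack_block and iter_blocks are
hypothetical:

```python
import struct

# Assumption: all four header fields are little-endian.
WZ1_HEAD = struct.Struct("<4sIII")  # magic, size, len, enc (16 bytes total)


def pack_block(length, enc, body):
    """Build one block: the 16-byte header followed by body bytes
    (extension headers plus compressed data)."""
    size = WZ1_HEAD.size + len(body)  # size counts the header itself
    return WZ1_HEAD.pack(b"WAZ1", size, length, enc) + body


def iter_blocks(buf):
    """Yield (len, enc, body) per block, skipping ahead by the size field."""
    pos = 0
    while pos + WZ1_HEAD.size <= len(buf):
        magic, size, length, enc = WZ1_HEAD.unpack_from(buf, pos)
        # The text above requires size > 16, since size includes the header.
        if magic != b"WAZ1" or size <= WZ1_HEAD.size or pos + size > len(buf):
            raise ValueError("bad block at offset %d" % pos)
        yield length, enc, buf[pos + WZ1_HEAD.size:pos + size]
        pos += size
```

Nothing inside a block has to be decoded to find the next one, which is the
point of carrying the size in the header.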

The len field tells how many (abstract) samples exist in the block.

The enc field is a bit-packed member, encoding the channels, rate, and bits.

Bits:
0, 0=8 bits, 1=16 bits;
1..3, 0=mono, 1=stereo, 2=joint-stereo, 3..7=reserved;
4..7, rate:
0= resv, 1=11025, 2=22050, 3=44100,
4= 8000, 5=16000, 6=32000, 7=48000,
8=64000, 9=88200, 10=96000,
11..15=reserved;
8..13: 0..23=rice suffix bits, 24..63=reserved;
14..17: predictor:
0=zero (0),
1=last sample (a),
2=linear mid-point (a+(a-b)/2),
3=linear (2a-b),
4=diff-linear (2a-(2b-c)),
5..15=reserved;
18..31: reserved.
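
The bit layout above could be unpacked roughly as follows. This is a sketch
that assumes bit 0 is the least significant bit of enc; decode_enc and the
table names are hypothetical:

```python
# Tables copied from the field description above.
RATES = {1: 11025, 2: 22050, 3: 44100, 4: 8000, 5: 16000,
         6: 32000, 7: 48000, 8: 64000, 9: 88200, 10: 96000}
CHANNEL_MODES = {0: "mono", 1: "stereo", 2: "joint-stereo"}


def decode_enc(enc):
    """Split the packed enc field into its members.

    Assumption: bit 0 is the least significant bit of the 32-bit value.
    """
    bits16 = enc & 0x1            # bit 0
    chan = (enc >> 1) & 0x7       # bits 1..3
    rate = (enc >> 4) & 0xF       # bits 4..7
    rice_k = (enc >> 8) & 0x3F    # bits 8..13
    pred = (enc >> 14) & 0xF      # bits 14..17
    reserved = enc >> 18          # bits 18..31
    if (chan not in CHANNEL_MODES or rate not in RATES
            or rice_k > 23 or pred > 4 or reserved):
        raise ValueError("reserved value in enc field")
    return {
        "bits": 16 if bits16 else 8,
        "channels": CHANNEL_MODES[chan],
        "rate": RATES[rate],
        "rice_k": rice_k,
        "predictor": pred,
    }
```

Rejecting reserved values outright is one possible policy; a more lenient
decoder could skip such blocks instead.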

Following this header, would be extension headers.

WZ1ExtHead {
byte mrk; //bits 0..5=id, 6=optional, 7=size
if(mrk&0x80)u24 sz;
else byte sz;
byte data[sz];
};

These will allow adding extended features (such as customizable filters or
more modes). This list will be terminated by an empty header, encoded as
mrk=sz=0.

Markers 0x01 and 0x41 will be used for implementation headers, which will
have a FOURCC to identify them.
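
A sketch of walking the extension header list. The byte order of the u24
size is an assumption (little-endian), and parse_ext_headers is a
hypothetical name:

```python
def parse_ext_headers(buf, pos=0):
    """Read extension headers starting at pos.

    Returns (headers, next_pos), where headers is a list of
    (id, optional, data) tuples and next_pos is the offset of the first
    byte after the terminating empty header (mrk=sz=0).
    """
    headers = []
    while True:
        mrk = buf[pos]
        pos += 1
        if mrk & 0x80:
            # Bit 7 set: 24-bit size. Little-endian is an assumption.
            sz = int.from_bytes(buf[pos:pos + 3], "little")
            pos += 3
        else:
            sz = buf[pos]
            pos += 1
        if mrk == 0 and sz == 0:
            return headers, pos
        if pos + sz > len(buf):
            raise ValueError("extension header runs past end of buffer")
        headers.append((mrk & 0x3F, bool(mrk & 0x40), bytes(buf[pos:pos + sz])))
        pos += sz
```

With this split, markers 0x01 and 0x41 both carry id 1 and differ only in
the optional bit.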



Compressed audio data will follow after these headers.

Predictions may be made, and the differences from the prediction will be
encoded.
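
The five fixed predictors could look something like this. It is a sketch
with two assumptions not stated above: the history starts out as zeros, and
the division in the mid-point predictor rounds down. Here a is the previous
sample, b the one before it, and c the one before that:

```python
def predict(mode, a, b, c):
    """Predict the next sample from the three previous ones."""
    if mode == 0:
        return 0                      # zero
    if mode == 1:
        return a                      # last sample
    if mode == 2:
        return a + (a - b) // 2       # linear mid-point (floor: assumption)
    if mode == 3:
        return 2 * a - b              # linear
    if mode == 4:
        return 2 * a - (2 * b - c)    # diff-linear
    raise ValueError("reserved predictor")


def residuals(samples, mode):
    """Encoder side: differences between each sample and its prediction."""
    a = b = c = 0  # assumption: history is zero at the start of a block
    out = []
    for s in samples:
        out.append(s - predict(mode, a, b, c))
        a, b, c = s, a, b
    return out


def reconstruct(res, mode):
    """Decoder side: rebuild samples by adding residuals to predictions."""
    a = b = c = 0
    out = []
    for r in res:
        s = predict(mode, a, b, c) + r
        out.append(s)
        a, b, c = s, a, b
    return out
```

Since the decoder only ever predicts from samples it has already rebuilt,
the round trip is exact for any mode, as long as both sides round the same
way.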

The actual prediction differences will be coded using Rice Codes.
Bit packing will be low-high, and the Rice prefix will consist of 1s
terminated by a 0.

Note that sign will be coded by folding the sign into the LSB:
(0=0, 1=-1, 2=1, 3=-2, 4=2, ...).
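
A sketch of the Rice coding with the sign folding above. It reads
"low-high" as filling each byte from bit 0 upward, and it assumes the
prefix comes first and the suffix bits follow least-significant-first;
neither detail is spelled out above. All names here are hypothetical:

```python
def fold(v):
    """Fold the sign into the LSB: 0->0, -1->1, 1->2, -2->3, 2->4, ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def unfold(u):
    return -((u + 1) >> 1) if u & 1 else u >> 1


class BitWriter:
    """Packs bits low-to-high (bit 0 of the first byte is written first)."""
    def __init__(self):
        self.acc = 0
        self.n = 0

    def put(self, value, count):
        self.acc |= (value & ((1 << count) - 1)) << self.n
        self.n += count

    def to_bytes(self):
        return self.acc.to_bytes((self.n + 7) // 8, "little")


class BitReader:
    def __init__(self, data):
        self.acc = int.from_bytes(data, "little")
        self.pos = 0

    def get(self, count):
        v = (self.acc >> self.pos) & ((1 << count) - 1)
        self.pos += count
        return v


def rice_encode(values, k):
    w = BitWriter()
    for v in values:
        u = fold(v)
        q = u >> k
        w.put((1 << q) - 1, q)  # prefix: q one-bits...
        w.put(0, 1)             # ...terminated by a zero
        w.put(u, k)             # suffix: low k bits of the folded value
    return w.to_bytes()


def rice_decode(data, count, k):
    r = BitReader(data)
    out = []
    for _ in range(count):
        q = 0
        while r.get(1):
            q += 1
        out.append(unfold((q << k) | r.get(k)))
    return out
```

The decoder needs the sample count from the block header, since the last
byte may contain padding bits.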


For stereo and joint-stereo modes, the coded samples will be interleaved
(first left, then right).

Joint-Stereo will differ from Stereo, in that samples will be coded as
such: L'=(L+R)/2, R'=L-R.
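
For the joint-stereo transform to stay lossless, the bit dropped by the /2
has to be recoverable. A sketch of one way to do that, assuming the division
rounds down (the rounding is not specified above): L+R and L-R always have
the same parity, so the lost bit can be taken from R'. The function names
are hypothetical:

```python
def js_encode(left, right):
    """L' = (L+R)/2 with floor rounding (an assumption), R' = L-R."""
    return (left + right) >> 1, left - right


def js_decode(mid, side):
    """Invert js_encode exactly."""
    total = 2 * mid + (side & 1)  # L+R, with the dropped bit restored
    left = (total + side) >> 1    # ((L+R) + (L-R)) / 2 = L
    return left, left - side
```

This relies on Python's `>>` and `&` behaving as floor and two's-complement
operations on negative integers; a C version would need an arithmetic right
shift to match.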


Industrial One

2008-01-14, 6:57 pm

And just who in the hell would want to losslessly compress VOICE? Are
we talking 44.1 stereo? Voice is about the only data I can tolerate at
16 kbps mono.
cr88192

2008-01-14, 6:57 pm


"Industrial One" <industrial_one@hotmail.com> wrote in message
news:614906d7-dac2-4238-adea-f89a9a686470@i29g2000prf.googlegroups.com...
> And just who in the hell would want to losslessly compress VOICE? Are
> we talking 44.1 stereo? Voice is about the only data I can tolerate at
> 16 kbps mono.


actually, for the voice, most of what I was compressing was 16kbps mono...
but, a 50 or 60% size reduction, and a packaging format, pays off (actually,
the main payoff is the packaging format as, for example, 10k 1-2KiB files
has more than a little overhead on a FS with a 16KiB or so disk-block
size...).

also, it is good to losslessly compress the voice, since we will not be
simply listening to the voice, but, rather, the voice is used as input data
and is heavily processed by tools (making lossy compression, and especially
DCT-based formats, generally unacceptable).


nothing would really stop the format from working on music though, only that
the usual music related tasks (ripping, encoding, decoding, and playback)
are already fairly well handled by formats like APE, WavPack, and FLAC.

then again, the problems with these formats are known:
APE is more or less undocumented, and it seems the exact fileformat is
implementation-version specific;
WavPack is poorly documented, and though not that complex seeming, is not
ideally simple either (also, it has overly large block headers, and makes
use of adaptive coding...).
FLAC is fairly well documented, but not terribly simple IMO.

in all of these cases, it ends up causing, more or less, a
specific-implementation dependency.


my idea is to make the format very simplistic and minimalistic, and avoid
dynamic adaptation (either of the entropy coding, or the predictors), making
it very easy to effectively decode (absent having to worry so much if the
new implementations' adaptation techniques exactly match the old
implementations', ...).

thus, both the predictor and rice-suffix length are encoded in the header,
and do not change for a given chunk.
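
as a rough sketch of what this leaves to the encoder: it can simply try each
fixed predictor and each suffix length on the chunk and keep the cheapest
pair. this is only an illustration, not a description of my actual encoder;
it assumes zero initial history and floor rounding in the mid-point
predictor, and choose_params is a made-up name:

```python
def predict(mode, a, b, c):
    """The five fixed predictors (a = previous sample, b and c older)."""
    if mode == 0:
        return 0
    if mode == 1:
        return a
    if mode == 2:
        return a + (a - b) // 2   # rounding is an assumption
    if mode == 3:
        return 2 * a - b
    return 2 * a - (2 * b - c)


def rice_bits(residual, k):
    """Bits one residual costs: unary prefix + terminator + k suffix bits."""
    u = 2 * residual if residual >= 0 else -2 * residual - 1
    return (u >> k) + 1 + k


def choose_params(samples, max_k=23):
    """Return (total_bits, predictor, k) with the smallest total_bits."""
    best = None
    for mode in range(5):
        a = b = c = 0  # assumption: history starts at zero
        res = []
        for s in samples:
            res.append(s - predict(mode, a, b, c))
            a, b, c = s, a, b
        for k in range(max_k + 1):
            cost = sum(rice_bits(r, k) for r in res)
            if best is None or cost < best[0]:
                best = (cost, mode, k)
    return best
```

the decoder never repeats this search, it just reads the two numbers from
the header.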

new fixed predictors, could be given fixed predictor numbers, and possibly,
an auxiliary predictor (such as a weighted predictor), could be added by
extension headers (for now, I am not including a weighted predictor, but I
could if it were a 'worthwhile' feature, another would be a small
vector-based predictor, which could implement customized linear-predictors).

I left out a quadratic or cubic predictor, since IME these are rarely
effective predictors (and, then, usually only for pure-tones and similar).

so, the current list, is ones that often do fairly well in practice.


of course, in a general sense, this whole idea could be pointless anyways...


Industrial One

2008-01-14, 9:57 pm

On Jan 14, 4:07 pm, "cr88192" <cr88...@hotmail.com> wrote:
> "Industrial One" <industrial_...@hotmail.com> wrote in message
>
> news:614906d7-dac2-4238-adea-f89a9a686470@i29g2000prf.googlegroups.com...
>
>
> actually, for the voice, most of what I was compressing was 16kbps mono...
> but, a 50 or 60% size reduction, and a packaging format, pays off (actually,
> the main payoff is the packaging format as, for example, 10k 1-2KiB files
> has more than a little overhead on a FS with a 16KiB or so disk-block
> size...).


You're saying you could losslessly reduce the equivalent quality of 16
kbps mono to... 2 kbps? And I don't follow the part about the
overhead, you're talking about the header and metadata taking up a
notable portion of the file, and that your format will be more
efficient and try to make most of the file contain the actual data
without too much header info?

> also, it is good to losslessly compress the voice, since we will not be
> simply listening to the voice, but, rather, the voice is used as input data
> and is heavily processed by tools (making lossy compression, and especially
> DCT-based formats, generally unacceptable).


Ehh... what? If we're not simply listening to the voice, then what are
we listening to? If processing it heavily harms it so much, it can
also be heavily processed with lossless tools, not making a
difference. And loss in voice is more acceptable than in music where
detail is crucial otherwise it doesn't feature the same mental
stimulation. Voice... for the most part is useless and can be
discarded, but not to the point where you can't make out the words or
the general traits of the person's voice.

> nothing would really stop the format from working on music though, only that
> the usual music related tasks (ripping, encoding, decoding, and playback)
> are already fairly well handled by formats like APE, WavPack, and FLAC.


Ain't that hard. Analog2Digital conversion commences, we got us'ns
a .WAV, and we do whatever the hell we want to it after. Thing is,
WAVs don't usually compress more than half.

> then again, the problems with these formats are known:
> APE is more or less undocumented, and it seems the exact fileformat is
> implementation-version specific;
> WavPack is poorly documented, and though not that complex seeming, is not
> ideally simple either (also, it has overly large block headers, and makes
> use of adaptive coding...).
> FLAC is fairly well documented, but not terribly simple IMO.
>
> in all of these cases, it ends up causing, more or less, a
> specific-implementation dependency.


Those sound like some XXXXin' unprofessional asspie-written apps
lackin' in flexibility. 'Glad I never bother with lossless audio
encoding.

> my idea is to make the format very simplistic and minimalistic, and avoid
> dynamic adaptation (either of the entropy coding, or the predictors), making
> it very easy to effectively decode (absent having to worry so much if the
> new implementations' adaptation techniques exactly match the old
> implementations', ...).


It seems most of those other programs are in the Alpha stage. It isn't
wise to put a program into serious personal use until a final is
released. If you're gonna create a new app, be sure to be more
creative with more optimal techniques and edit to your hearts content
until you see fit. And encourage moron netscroungers to back up their
shit before operating.

> thus, both the predictor and rice-suffix length are encoded in the header,
> and do not change for a given chunk.


Doesn't this apply to already-existing ones?

> new fixed predictors, could be given fixed predictor numbers, and possibly,
> an auxiliary predictor (such as a weighted predictor), could be added by
> extension headers (for now, I am not including a weighted predictor, but I
> could if it were a 'worthwhile' feature, another would be a small
> vector-based predictor, which could implement customized linear-predictors).
>
> I left out a quadratic or cubic predictor, since IME these are rarely
> effective predictors (and, then, usually only for pure-tones and similar).
>
> so, the current list, is ones that often do fairly well in practice.


Looks pretty complicated in advance, like you're combining to optimize
every known method into one to achieve the ideal encoding the target
requires. But it also looks to me it'll take a while... a number of
times the length of the track. How much do you expect to losslessly
reduce? That is, most can reduce by half. If you can do 10% better,
would 5x the wait be worth it?

> of course, in a general sense, this whole idea could be pointless anyways...


I'd say. No lossless codec can be truly optimal until one is invented
that can recognize, separate and deduce the melody, vocals, and
percussion mixed into one track. That way, a single verse only has to
be encoded once and addressed when it repeats again with this time
different ascending bass notes, etc. Obviously, that won't happen for
another couple of decades (assuming it's even possible.) But, try and
think of MIDIs, some sound very professional, and are only like 10 KB
when compressed. You can convert back to WAV, but that WAV will still
only compress down to half, regardless that the same track can be fit
into 10 KB. No A.I can recognize and engineer that kinda shit to that
level.
cr88192

2008-01-14, 9:57 pm


"Industrial One" <industrial_one@hotmail.com> wrote in message
news:614906d7-dac2-4238-adea-f89a9a686470@i29g2000prf.googlegroups.com...
> And just who in the hell would want to losslessly compress VOICE? Are
> we talking 44.1 stereo? Voice is about the only data I can tolerate at
> 16 kbps mono.


alternatively, looking back at the FLAC documentation, I realize that the
difference in complexity between my idea and FLAC is minor (the main
difference is in terms of the fiddly header structure and additional
metadata employed in FLAC).

I may well be better off beating together a simplified FLAC encoder/decoder,
and using that in my projects (rather than implementing a generalized
version of an originally overly-simplistic codec...).

or such...


cr88192

2008-01-15, 7:56 am


"Industrial One" <industrial_one@hotmail.com> wrote in message
news:72884bf8-b8c6-49eb-bfe4-0e0c223adc65@n22g2000prh.googlegroups.com...
> On Jan 14, 4:07 pm, "cr88192" <cr88...@hotmail.com> wrote:
>
> You're saying you could losslessly reduce the equivalent quality of 16
> kbps mono to... 2 kbps? And I don't follow the part about the
> overhead, you're talking about the header and metadata taking up a
> notable portion of the file, and that your format will be more
> efficient and try to make most of the file contain the actual data
> without too much header info?
>


I was being braindead there, I meant, 16 kHz...


as for the second part, no:
with typical file systems, such as NTFS or FAT32, there is a certain amount
of overhead caused by the filesystem itself. if you have 10000 1KiB files,
ideally they would take 10MiB, but, they actually take up 160MiB on disk.

even doing something simple, like packing them end to end in a container
format, greatly reduces the disk overhead...
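
the arithmetic, as a small sketch (the 16KiB cluster size is the figure I
mentioned earlier; on_disk is just a made-up helper that rounds a file up to
whole clusters):

```python
def on_disk(size, cluster):
    """Bytes a file occupies when rounded up to whole clusters."""
    return -(-size // cluster) * cluster  # ceiling division


MIB = 1024 * 1024
count, size, cluster = 10000, 1024, 16 * 1024

loose = count * on_disk(size, cluster)   # every file wastes most of a cluster
packed = on_disk(count * size, cluster)  # one container, files end to end

print("loose:  %.1f MiB" % (loose / MIB))
print("packed: %.1f MiB" % (packed / MIB))
```

so the "160MiB" and "10MiB" figures are round numbers: about 156MiB as
loose files versus a little under 10MiB packed, before any compression.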

however, when one is packing all these files into a container, it also makes
sense to compress them, so that this container file, rather than taking
10MiB, may only take 3 or 4MiB.


>
> Ehh... what? If we're not simply listening to the voice, then what are
> we listening to? If processing it heavily harms it so much, it can
> also be heavily processed with lossless tools, not making a
> difference. And loss in voice is more acceptable than in music where
> detail is crucial otherwise it doesn't feature the same mental
> stimulation. Voice... for the most part is useless and can be
> discarded, but not to the point where you can't make out the words or
> the general traits of the person's voice.
>


in general, what we listen to, is the output from the program.
the voice itself is input data, and often consists of a very large number of
tiny pieces.
the program may use this data, doing whatever (taking these pieces,
searching them, distorting them, and reassembling them into an output
stream), which the user may then hear.

but, lossless matters, or failing that, LPC is far preferable to DCT,
because of the nature of the audio (DCT noise + formant or doppler shifting
= bad...).


>
> Ain't that hard. Analog2Digital conversion commences, we got us'ns
> a .WAV, and we do whatever the hell we want to it after. Thing is,
> WAVs don't usually compress more than half.
>


yeah...


>
> Those sound like some XXXXin' unprofessional asspie-written apps
> lackin' in flexibility. 'Glad I never bother with lossless audio
> encoding.
>


my case, I have spent all damn day trying to write a working FLAC decoder
(of these, FLAC is generally best supported and most widely used).

mostly works, but there are still some issues (periodic blatant decoding
errors, and every so often the LPC predictor seems to like totally blow up
making sections with loud hisses and pops).


debugging is lame, and FLAC is still a bit of an awkward format to work
with.
it would be nice if the documentation were a little better, as I am left
often having to guess and trial and error things until they work (one issue:
the documentation is not exactly clear as to what parts of the format are
bit-aligned, and what parts are byte-aligned, and I have already discovered
that the spec contains at least a few errors...).


>
> It seems most of those other programs are in the Alpha stage. It isn't
> wise to put a program into serious personal use until a final is
> released. If you're gonna create a new app, be sure to be more
> creative with more optimal techniques and edit to your hearts content
> until you see fit. And encourage moron netscroungers to back up their
> shit before operating.
>


I opted with FLAC, since it is, at least, the best of the bunch...


>
> Doesn't this apply to already-existing ones?
>


FLAC, yes.

APE and WavPack, no...
they seem to like using dynamic adaptation for the various parameters.


>
> Looks pretty complicated in advance, like you're combining to optimize
> every known method into one to achieve the ideal encoding the target
> requires. But it also looks to me it'll take a while... a number of
> times the length of the track. How much do you expect to losslessly
> reduce? That is, most can reduce by half. If you can do 10% better,
> would 5x the wait be worth it?
>


my codecs tend to be pretty fast.

actually, I discovered that the 'vector-based predictor' idea, is actually
commonly called FIR.


>
> I'd say. No lossless codec can be truly optimal until one is invented
> that can recognize, separate and deduce the melody, vocals, and
> percussion mixed into one track. That way, a single verse only has to
> be encoded once and addressed when it repeats again with this time
> different ascending bass notes, etc. Obviously, that won't happen for
> another couple of decades (assuming it's even possible.) But, try and
> think of MIDIs, some sound very professional, and are only like 10 KB
> when compressed. You can convert back to WAV, but that WAV will still
> only compress down to half, regardless that the same track can be fit
> into 10 KB. No A.I can recognize and engineer that kinda shit to that
> level.


yeah.


I was considering some messing with music synthesis as well, figuring my
current voice synthesis crap could probably also be made to handle music
synthesis...


Industrial One

2008-01-17, 9:57 pm

On Jan 15, 3:14 am, "cr88192" <cr88...@hotmail.com> wrote:
> "Industrial One" <industrial_...@hotmail.com> wrote in message
>
> news:72884bf8-b8c6-49eb-bfe4-0e0c223adc65@n22g2000prh.googlegroups.com...
>
>
>
>
>
>
> I was being braindead there, I meant, 16 kHz...


Eh heh. I was about to crack an autistic "braindead" joke there but
felt like it would be too harsh. Nothing personal, though, we're just
theorizing compression here, xD.

> as for the second part, no:
> with typical file systems, such as NTFS or FAT32, there is a certain amount
> of overhead caused by the filesystem itself. if you have 10000 1KiB files,
> ideally they would take 10MiB, but, they actually take up 160MiB on disk.


How did you get 160 MB? Did you mean 40? 'Cuz last time I checked,
files are treated as a series of 4096-byte clusters, so if a file
contained 1 KB of actual contents, it would take up 4 on the disk, the
other 3 devoted to slack space.

How do you figure on changing that? What filesystem do you plan on
utilizing this on, exactly? Besides, if you're smart, you would pack
the 10,000 waves into an archive so they're compressed + combined into
one file that won't hog up the MFT.

> even doing something simple, like packing them end to end in a container
> format, greatly reduces the disk overhead...


Yup.

> however, when one is packing all these files into a container, it also makes
> sense to compress them, so that this container file, rather than taking
> 10MiB, may only take 3 or 4MiB.


Way less if solid compression is used.

> in general, what we listen to, is the output from the program.
> the voice itself is input data, and often consists of a very large number of
> tiny pieces.
> the program may use this data, doing whatever (taking these pieces,
> searching them, distorting them, and reassembling them into an output
> stream), which the user may then hear.


Hear what? The output or the artifacts from lossy compression in the
output?

> but, lossless matters, or failing that, LPC is far preferable to DCT,
> because of the nature of the audio (DCT noise + formant or doppler shifting
> = bad...).


Not really. I don't notice a difference -- but since voice data is
less significant to me, I can tolerate compression where artifacts
become visible, as long as the words are perceivable and I can tell
whether a man, woman, chink, limey, nigger or sand-nigger was uttering
them. 'Salright.

> my case, I have spent all damn day trying to write a working FLAC decoder
> (of these, FLAC is generally best supported and most widely used).


Doesn't one already exist? Or the author never bothered to include a
decompressor?

> mostly works, but there are still some issues (periodic blatant decoding
> errors, and every so often the LPC predictor seems to like totally blow up
> making sections with loud hisses and pops).


Those sound more like Analog2Digital conversion errors. But if a
faulty decoder is the cause, where the hell is the official one?


Looks to me the FLAC codec is not complete. If you wanna make a
complete one, it doesn't matter *how* it encodes, as long as it can be
correctly decoded and the interface is user friendly where one will
actually know what the XXXX he is doing and not guess what some
meaningless command/button in irreversible m33t-pulling-sp34k is
supposed to be.

Also, you should make the codec adaptable so outputs from older
formats can still be played back, edited and re-converted.
> FLAC, yes.


Basically, you're aiming to upgrade FLAC?

> my codecs tend to be pretty fast.


With all those different techniques you plan to implement, it sounds
like it would take a while. More than, say, MP3?

> I was considering some messing with music synthesis as well, figuring my
> current voice synthesis crap could probably also be made to handle music
> synthesis...


It'll be met with limited success. Most wav2midi programs fail (unless
a single audio layer i.e only the melody without the bass, with no
background is being inputted.) I tried converting a MIDI into a wav
with a midi2wav app and then seeing if I can convert the wav back into
the same MIDI with a wav2midi program... it failed.
cr88192

2008-01-17, 9:57 pm


"Industrial One" <industrial_one@hotmail.com> wrote in message
news:2d42dd1e-487c-4839-a927-4a8b8ef88a9d@s19g2000prg.googlegroups.com...
> On Jan 15, 3:14 am, "cr88192" <cr88...@hotmail.com> wrote:
>
> Eh heh. I was about to crack an autistic "braindead" joke there but
> felt like it would be too harsh. Nothing personal, though, we're just
> theorizing compression here, xD.
>


hmm...

actually, I am autistic...
just of the 'aspergers' variety...

I think I also have issues with BP-2, and when I was younger, TLE.


>
> How did you get 160 MB? Did you mean 40? 'Cuz last time I checked,
> files are treated as a series of 4096-byte clusters, so if a file
> contained 1 KB of actual contents, it would take up 4 on the disk, the
> other 3 devoted to slack space.
>


it is variable, and depends largely on drive size and other factors.
ly, a lot of my drives are large enough, that they end up with 16kB
clusters...


> How do you figure on changing that? What filesystem do you plan on
> utilizing this on, exactly? Besides, if you're smart, you would pack
> the 10,000 waves into an archive so they're compressed + combined into
> one file that won't hog up the MFT.
>


that is in part, why I compress and package them...


>
> Yup.
>
>
> Way less if solid compression is used.
>


for data, yes, for lossless audio compression, not particularly.
also note that these files need random access ability, which does not mix
well with solid archiving.


>
> Hear what? The output or the artifacts from lossy compression in the
> output?
>


the output.

but, if you use lossy compression, especially MDCT based methods, then there
is a problem. these kinds of codecs tend to do a lot of spectral-tweaking on
the audio, that if played back directly is not very noticeable, but is a
bad thing if one is doing acoustic tweaking.

it turns out, for this kind of tweaking, one wants fairly high-quality
input.


actually, it is worse:
for a lot of lower quality samples I have, they were recorded with a 60Hz
tone present (EM interference between power grid and recording technology I
think, one needs to use a laptop with the lights off and away from other
electronic devices to avoid it I think).

but, this tone is a minor issue normally, except if one uses DCT. the
DCT-based codec is then more inclined to preserve the 60Hz tone than the
rest of the signal...


>
> Not really. I don't notice a difference -- but since voice data is
> less significant to me, I can tolerate compression where artifacts
> become visible, as long as the words are perceivable and I can tell
> whether a man, woman, chink, limey, nigger or sand-nigger was uttering
> them. 'Salright.
>


but, are you doing formant or doppler shifting?

formant shifting:
where one takes the voice and adjusts the tone apart from the tempo (2
methods I know of, one based on hackish use of linear expansion/compaction,
looping, and blending, and another based on the DCT).

doppler shifting:
where the audio stream is connected to a potentially moving source, and
adjusts frequency based on relative velocity.

also, there are cases where things like walls, ... may be used in
calculating echoes, ...


in these cases, spectral artifacts are a notable concern, whereas the
artifacts from lossy LPC, even if difficult to get to be perceptually
lossless, tend to hold up better in these cases.

lossless compression is still the best course though...


>
> Doesn't one already exist? Or the author never bothered to include a
> decompressor?
>


FLAC exists, yes, but some of us don't like library dependencies, or having
to haul around some inordinately large amount of code (or having to try to
get it to build on windows...).

for example, for pretty much every format I use, I write my own decoders and
encoders, which are typically much smaller and lighter weight than their
associated libraries.

for example, I have pretty much an entire decoder (with some extra inflation
due to trying to debug some things, aka, certain omissions in the spec) in
< 1kloc.

with some selective editing, I could get it smaller...


>
> Those sound more like Analog2Digital conversion errors. But if a
> faulty decoder is the cause, where the hell is the official one?
>


a lot of battling with bugs and spec issues...

one has to use fixed point in the LPC/FIR calculations (apparently, if one
is lazy and just uses floats, it will screw up, ...).
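
roughly what I mean by fixed point, as a sketch and not my actual decoder:
the coefficients are integers, the products are summed in an integer
accumulator, and the sum is shifted right to get the prediction. encoder and
decoder then agree bit-for-bit, which floats do not guarantee. the names
lpc_predict and lpc_restore are made up, and the coefficient order here
(coefs[0] applies to the most recent sample) is an assumption of the sketch:

```python
def lpc_predict(history, coefs, shift):
    """Integer FIR prediction of the next sample.

    history[-1] is the most recent sample and coefs[0] applies to it.
    """
    acc = 0
    for j, q in enumerate(coefs):
        acc += q * history[-1 - j]
    return acc >> shift  # arithmetic shift, rounds toward minus infinity


def lpc_restore(warmup, residuals, coefs, shift):
    """Rebuild samples from verbatim warm-up samples plus residuals."""
    out = list(warmup)
    for r in residuals:
        out.append(lpc_predict(out, coefs, shift) + r)
    return out
```

for example, coefs=[32, -16] with shift=4 is the plain linear predictor
(2a-b), expressed in this form.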

other problems were due to the ongoing battle of understanding a few of the
more fine details in the format.

as such, I now have a mostly working decoder, but it still has a few
remaining issues (in particular, the one major remaining issue, is that
periodically the LPC predictor will mess up, and generate small sections
with louder than expected output...).


>
> Looks to me the FLAC codec is not complete. If you wanna make a
> complete one, it doesn't matter *how* it encodes, as long as it can be
> correctly decoded and the interface is user friendly where one will
> actually know what the XXXX he is doing and not guess what some
> meaningless command/button in irreversible m33t-pulling-sp34k is
> supposed to be.
>
> Also, you should make the codec adaptable so outputs from older
> formats can still be played back, edited and re-converted.
>


yes, in any case, I would probably make the format simpler and cleaner...


>
> Basically, you're aiming to upgrade FLAC?
>


actually, I decided to change my plans, and implement a clone of the flac
library (still compatible with the existing FLAC library).

first off, this means getting a working decoder, so that I understand at
least the bitstream.
a simple encoder would be easy, but a more involved one would be a little
more work.


>
> With all those different techniques you plan to implement, it sounds
> like it would take a while. More than, say, MP3?
>


FLAC does a lot more, in general.
one has a few fixed predictors, and a generic mechanism (aka: a FIR
predictor).

one basically just iterates a few times and works through the best set of
options.
however, this is fairly fast, since most linear operations are O(n), but the
DCT is O(n^2), (though, a special case of O(n log2 n) is possible with some
amount of specialized coding/factorization).

as such, an encoder can beat through a few trial and error runs before
the MDCT could even complete.

decoding tends to be fairly fast, since the decoder will just use the
indicated options.
a simple loop doing simple arithmetic tends to be a lot faster than doing
the IMDCT...


>
> It'll be met with limited success. Most wav2midi programs fail (unless
> a single audio layer i.e only the melody without the bass, with no
> background is being inputted.) I tried converting a MIDI into a wav
> with a midi2wav app and then seeing if I can convert the wav back into
> the same MIDI with a wav2midi program... it failed.


no, not wav to midi.


no, I meant, using something analogous to the internals of a "text to
speech" engine, as a "commands to music" engine (aka: something similar to
what is done with midi playback...).

or such...


earlcolby.pottinger@sympatico.ca

2008-01-19, 6:56 pm

Do you really need to compress the samples? From reading the thread
it looks like just archiving all the samples into one file already
gives you big savings. Compression should be used to solve a need,
not just a goal in itself if your main interest is in the processing
of the sound samples.
cr88192

2008-01-19, 6:56 pm


<earlcolby.pottinger@sympatico.ca> wrote in message
news:97d0d0ec-cd49-4582-b06f-dc909c4915a0@m34g2000hsb.googlegroups.com...
> Do you really need to compress the samples? From reading the thread
> it looks like just archiving all the samples into one file already
> gives you big savings. Compression should be used to solve a need,
> not just a goal in itself if your main interest is in the processing
> of the sound samples.


yeah, archiving does save a lot, and compression, a little more...

well, as can be noted in other parts of the thread, I wrote a FLAC decoder,
and now, most of an encoder as well.

not like it will have that much specific use for this subproject, but it may
have uses in other subprojects, or in my main project (replacing another of
my customized LPC codecs with another, more standardized one).

it works I guess...



Copyright 2008 codecomments.com