
Another lossy audio compression experimentation


newcoleco


I did a little experimentation: nothing concrete with a real project, but clearly plausible on any vintage system. The idea is to keep only the most significant parts of a wavelet transform, which can be applied to digitized sound and video.

 

My main objective was to try an extreme case of compression while dealing with old processors like the Zilog Z80 (no hardware multiplication and no floating-point values). I wrote a paper (available as a PDF file) that briefly explains what a one-dimensional Haar wavelet transform is and why I think it can be used for lossy compression adapted to vintage systems. Maybe someone will find it interesting enough to investigate the idea and push it into cool projects that use extreme lossy compression. With that in mind, I've decided to publish my paper.

 

Along with it I've published an audio test file that lets you hear the effect of discarding the less significant wavelets from the transform (usually white noise and low-volume frequencies). The compression ratio can go beyond LPC (Linear Predictive Coding), but the quality is far worse because it's not based on speech compression principles.
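For anyone who wants to experiment, here is a minimal sketch of the idea in Python (used only as notation, not as Z80 code). It assumes an unnormalized integer Haar transform built from pairwise sums and differences, so no multiplication or floating point is needed to decode. The function names and the rule for ranking coefficients (squared value divided by the number of samples the coefficient covers) are my own assumptions for illustration and are not taken from the paper.

```python
def haar_forward(samples):
    """Integer Haar transform using pairwise sums and differences.

    The length must be a power of two. Output layout: index 0 is the
    overall sum, followed by the differences from coarsest to finest.
    """
    data = list(samples)
    n = len(data)
    while n > 1:
        half = n // 2
        sums = [data[2 * i] + data[2 * i + 1] for i in range(half)]
        diffs = [data[2 * i] - data[2 * i + 1] for i in range(half)]
        data[:n] = sums + diffs
        n = half
    return data


def keep_largest(coeffs, keep):
    """Zero all but the `keep` most significant coefficients (the lossy step)."""
    n = len(coeffs)

    def energy(i):
        # Number of original samples covered by coefficient i.
        support = n if i == 0 else n >> (i.bit_length() - 1)
        # Assumed ranking rule: energy contribution of the coefficient.
        return coeffs[i] ** 2 / support

    kept = set(sorted(range(n), key=energy, reverse=True)[:keep])
    return [c if i in kept else 0 for i, c in enumerate(coeffs)]


def haar_inverse(coeffs):
    """Rebuild samples using only additions, subtractions and 1-bit shifts."""
    data = list(coeffs)
    n = 1
    while n < len(data):
        out = []
        for s, d in zip(data[:n], data[n:2 * n]):
            out.append((s + d) >> 1)  # first half of the pair
            out.append((s - d) >> 1)  # second half of the pair
        data[:2 * n] = out
        n *= 2
    return data


original = [4, 6, 10, 12]
coeffs = haar_forward(original)   # [32, -12, -2, -2]
lossy = keep_largest(coeffs, 2)   # [32, -12, 0, 0]
print(haar_inverse(coeffs))       # [4, 6, 10, 12], exact
print(haar_inverse(lossy))        # [5, 5, 11, 11], smoothed
```

Dropping the two small differences halves the number of values to store, at the cost of flattening the fine detail inside each pair of samples.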

 

http://newcoleco.dev-fr.org/p4202/2010-04-19-haar-wavelets-lossy-compression-for-vintage-processors.html


  • 1 year later...

How would you go about playing back the sample? As an example, the Mattel Aquarius only has a speaker that can be on or off, so we have to use PWM, or sample at 1 bit and play that back. How would you go about implementing your compression?


Quick answer: Encoding a complex digital sound and reducing it to a 1-bit signal is already lossy compression. You don't need Haar wavelets, unless you want to torture yourself trying to make the result make sense or to get any real compression at all.

 

 

I'm glad that at least someone here commented on my message about the possible application of wavelet compression on 8-bit systems.

 

Wavelet transforms are used these days as an alternative for lossy data compression. The most common usage is for pictures, as in the JPEG2000 format. The idea is to apply mathematics to your data in order to transform it into numbers that show what is obviously important to encode and what can be ignored, depending on the level of detail you want to keep in the final result.

 

If the original data is like noise, then lossy compression will either be inefficient or produce unwanted visible or audible artefacts. But in a nice picture (jpeg) or piece of music (mp3), the data sequence is mostly smooth, with variations that follow a certain harmony and with regions of pretty much the same colors or tones, which makes the data compressible with wavelets.

 

In my paper (pdf file) I talk about the Haar wavelet because its square shape makes it the best candidate for possible applications on 8-bit systems, including digital sound compression. However, it's not a no-brainer solution: it may or may not work for you, depending on the capabilities of the system and how you deal with them.

 

In your case, if the speaker can only be muted or not (1 bit: 0 or 1), so that no volume variation is possible to simulate a "smooth" wave, then a wavelet method of data compression will not work for you, because even the Haar wavelet implies at least 3 possible states (-1, 0, 1). And if you try to use wavelets to compress multiple 1-bit values packed as 8-bit data, you'll get a result that will not fit what you expected. And considering that the data is already encoded as 1-bit values, you'll either not know how to encode the transformed data in order to save space (compression), or get a result that is even less interesting to use (too much lossy compression makes no sense).
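A tiny illustration of the three-state point, as a sketch only: it assumes a single Haar step written as a pairwise sum and difference (the helper name `haar_step` is made up for this example) and enumerates every possible pair of 1-bit samples.

```python
from itertools import product


def haar_step(a, b):
    """One Haar step on a pair of samples: (sum, difference)."""
    return a + b, a - b


# Every possible pair of 1-bit samples.
results = [haar_step(a, b) for a, b in product((0, 1), repeat=2)]

sums = sorted({s for s, _ in results})
diffs = sorted({d for _, d in results})

print(sums)   # [0, 1, 2]
print(diffs)  # [-1, 0, 1]
```

Both outputs take three distinct values, so neither fits in a single bit any more: the transformed data is wider than the 1-bit input before any compression has happened.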


So from a practical side, how would this work out on the ColecoVision? Can you give a rough estimate of how much ROM and RAM the player would require for playing a sample?

Considering the speech example with 75% compression: how much ROM space would the sample itself take? Are we talking about a few kilobytes, or would it be a lot more?

 

Also, do you think the player would require all available Z80 CPU power, or would it leave enough room for other tasks?


Ok, hear me out. Right now, I can create audio using Pulse Width Modulation at 4 bits. (http://www.atariage.com/forums/topic/180604-digitized-sound-on-aquarius/) Because I had to do shifts to lower the bit depth, I think I used up too many cycles reducing the 8-bit sample I had stored. I couldn't find a program to reduce a sample to less than 8-bit depth... Anyway, I'm getting off topic. My thought is this: I think I have enough cycles on the Aquarius (Z80 ~ 4 MHz) to do PWM at 6 bits (or at least 5 bits), but I still can't store any real amount of audio. (http://en.wikipedia.org/wiki/PC_speaker)

 

What I am thinking is that if I took a 6-bit audio file and used your compression technique, maybe I'd have enough cycles to decode the wavelets, encode the result as a 1-bit PWM stream and play it back. I had thought about trying to implement GSM or something like that, but that looked too processor intensive.
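To make the last step concrete, here is a rough Python model of turning decoded samples into a 1-bit PWM stream. It is only a sketch: the names `pwm_frame` and `pwm_stream` are hypothetical, each sample is assumed to occupy 2^bits equal time slots, and nothing here models the real Aquarius speaker port or Z80 cycle timing.

```python
def pwm_frame(sample, bits=6):
    """Return one PWM period for a sample: `sample` slots on, the rest off."""
    slots = 1 << bits  # 64 slots per sample at 6 bits
    if not 0 <= sample < slots:
        raise ValueError("sample does not fit in the given bit depth")
    return [1] * sample + [0] * (slots - sample)


def pwm_stream(samples, bits=6):
    """Concatenate the PWM periods of all samples into one 1-bit stream."""
    stream = []
    for sample in samples:
        stream.extend(pwm_frame(sample, bits))
    return stream


frame = pwm_frame(20)
print(len(frame), sum(frame))  # 64 20
```

The duty cycle carries the amplitude, so at 6 bits the speaker bit has to be serviced 64 times per sample. That slot rate, rather than the wavelet decoding itself, may well be what decides how much CPU time is left over.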


I can't remember exactly what I had in mind back then.

 

If the decompression routine is well optimized, my guess was that you can keep the +1 and -1 computations very simple and decode the compressed data almost like a stream, avoiding the need for extra RAM to decompress the data before playing the result. So the ROM and RAM requirements really depend on how you implement what I've proposed in the paper.
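One way such a stream-style decoder could look, as a sketch under my own assumptions rather than the routine from the paper: the coefficients are taken to be stored as an overall sum followed by pairwise differences from coarsest to finest, the block length is a power of two, and `decode_sample` is a made-up name. Each output sample is rebuilt straight from the stored coefficients with one add or subtract and one shift per level, so no decompression buffer is needed.

```python
def decode_sample(coeffs, t):
    """Rebuild sample number t directly from the stored coefficients.

    coeffs[0] is the sum of the whole block; coeffs[1:] are differences
    laid out like a binary heap (children of index i are 2*i and 2*i + 1).
    """
    levels = len(coeffs).bit_length() - 1  # log2 of the block length
    s = coeffs[0]
    i = 1
    for level in range(levels):
        d = coeffs[i]
        # Bits of t, most significant first, select the half at each level.
        if (t >> (levels - 1 - level)) & 1 == 0:
            s = (s + d) >> 1   # +1 branch: first half
            i = 2 * i
        else:
            s = (s - d) >> 1   # -1 branch: second half
            i = 2 * i + 1
    return s


def decode_stream(coeffs):
    """Yield the samples one at a time, in playback order."""
    for t in range(len(coeffs)):
        yield decode_sample(coeffs, t)


print(list(decode_stream([32, -12, -2, -2])))  # [4, 6, 10, 12]
print(list(decode_stream([32, -12, 0, 0])))    # [5, 5, 11, 11]
```

In this layout the per-sample cost grows with the number of levels (log2 of the block length), so shorter blocks trade compression for CPU time. A real Z80 routine would probably cache the partial sums between consecutive samples instead of walking down from the top each time.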

 

I've never done the coding to test my idea on the ColecoVision. I made my audio examples by compressing and decompressing audio samples, based on what it might sound like in an ideal world.
