View Full Version : Encoding Spoken Audio


maikii
04-04-2004, 03:56 AM
I have a "Book on CD" that I would like to transfer to my PPC so I can listen to it there.

It is on 10 CDs, so could take up a lot of space. But I figure spoken word would need much less quality than music, so could be compressed much more, and could be monaural, as there is really no need for stereo to hear one person reading a book.

What format have people found to work well for encoding spoken word, to get a small file size, etc.?

I guess Audible.com does a lot of that. They use their own format, right? (I've never tried their service.) Is it similar to MP3, or WMA, or something else entirely? What bit rate is it? Monaural? With their "Audible Manager" software, could one rip a CD to the "Audible" format? (I would guess no, as that would compete with buying their content.)

Any suggestions regarding encoding spoken word audio, what programs are good to use for that purpose, etc., would be appreciated.

Jason Dunn
04-04-2004, 04:37 AM
Your best course of action here would be to use the Windows Media Player to rip one track from the CD at different quality levels. First, go into TOOLS > OPTIONS > COPY MUSIC then make sure the quality level is up fairly high - I'd say 160 kbps. Now you have a WMA file. Download and install the Windows Media Encoder. Use the Web Server > Voice Quality Audio (19 kbps) and see how that quality level sounds to you. If it's poor, try again but use the FM Quality Audio (37 kbps) option.

maikii
04-04-2004, 11:05 PM
Your best course of action here would be to use the Windows Media Player to rip one track from the CD at different quality levels. First, go into TOOLS > OPTIONS > COPY MUSIC then make sure the quality level is up fairly high - I'd say 160 kbps. Now you have a WMA file. Download and install the Windows Media Encoder. Use the Web Server > Voice Quality Audio (19 kbps) and see how that quality level sounds to you. If it's poor, try again but use the FM Quality Audio (37 kbps) option.

Well, the 160 kbps would certainly be way too high a quality for a monaural voice recording. I don't even use that high for music--I use 64k WMA.

The WME option might work. It would be a longer and more complicated process though, as WME doesn't rip files from the CD, and only processes one file at a time. I would have to rip the files first with another program. (Perhaps that's what you were referring to above with the 160 kbps. If I were going to rip first with another program and compress later with WME, it might make more sense to rip the files as uncompressed .wav, although I might specify monaural for that also.) After ripping, I would have to compress each .wav file separately with WME, as it doesn't look to me like WME will encode a group of files, unlike most other encoding programs. (And the sound on each of these CDs is divided into 3-minute tracks, so there are many of them.) (The version of WME I have on my computer says version 9.00.00.2180. Is there a newer version?)

I was hoping for a simpler way to do it.

Anyone reading who has encoded an audio book? How did you do it?
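
For the idea of ripping to uncompressed .wav and reducing it to monaural before encoding, here is a minimal Python sketch, not something anyone in this thread used. It assumes the ripped file is ordinary 16-bit stereo PCM and simply averages the left and right channels. The function name is hypothetical.

```python
import array
import sys
import wave


def downmix_to_mono(src_path: str, dst_path: str) -> None:
    """Average the two channels of a 16-bit stereo PCM WAV into one mono channel."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo PCM WAV file")
        rate = src.getframerate()
        samples = array.array("h")  # signed 16-bit integers
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV data is little-endian; swap on big-endian machines before doing math.
    if sys.byteorder == "big":
        samples.byteswap()

    # Samples are interleaved as L, R, L, R, ...; average each pair.
    left, right = samples[0::2], samples[1::2]
    mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))

    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)  # sample rate is left unchanged
        dst.writeframes(mono.tobytes())
```

Calling something like downmix_to_mono("track01.wav", "track01_mono.wav") (file names made up for illustration) halves the size of the intermediate file. It does not change the sample rate, so any reduction to 8 kHz would still be left to the encoder.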

Ripper014
04-04-2004, 11:34 PM
I personally like to use dBpoweramp Music Converter... it is a free application that I used to convert some MP3 radio broadcasts of LOTR... they are about an hour each... and I have them down to about 15-17 MB per episode. I am using Ogg format at 64 kbps and 44100 Hz... and I find it more than acceptable...

I am also using GSPlayer on my PocketPC for playback... the original skin is not the best... but the player is a good one. If you do a search at Brighthand you will find some additional skins for the player.

Jason Dunn
04-05-2004, 01:43 AM
Well, the 160 kbps would certainly be way too high a quality for a monaural voice recording. I don't even use that high for music--I use 64k WMA.

I know that. ;-) WME won't rip, so you first need to rip the tracks into a format that WME can then encode. I was suggesting a high bitrate so there would be lots of data for WME to work with.

Anyway, I tried to help - good luck!

maikii
04-05-2004, 06:12 AM
I think I might have come upon a solution. I only did it with one test file so far, so will report again when I have encoded the whole audio book. It combines ideas from both Jason and Ripper.

It turns out WM9 has a great codec for encoding voice: the Windows Media Audio 9 Voice codec. It is different from the regular WMA codec. It is optimized for speech, so one can get a much smaller file while preserving decent sound for speech. (I'm not sure if Jason was referring to this codec when he mentioned using the WME voice setting.)

Although that voice codec is automatically installed with the other WM9 codecs, it appeared that the only program that would encode with it was WME. (Neither WMP9 nor any other ripping and/or encoding program I had offered that codec.) As I mentioned in the previous post, doing 10 CDs with a total of over 200 tracks would be quite cumbersome using WME, first ripping with another program, etc.

Ripper's post mentioned "dBpoweramp Music Converter". I downloaded and installed that, including various codec packages for it. Great free program. I highly recommend it, from the little I've seen so far. It does support the WMA 9 Voice codec, so with it I can rip and encode a CD at a time with that codec. I experimented with one 3-minute track. At the very lowest setting (4 kbps, 8 kHz, mono, CBR), the sound quality wasn't acceptable to me. But when I went up to the next setting, 5 kbps (everything else the same), the sound was quite acceptable to me. The file size for the converted track was 98 KB.

Actually, in comparing the sound of this with the original track on the CD, this highly compressed version sounded better than the original. This is due to the fact that this CD audio book was not recorded well. There is excessive sibilance and other noise. In some of my experiments with compression this noise was increased. However, with the experiment above, there was less of this noise than in the original, perhaps due to a reduction of some of the high frequencies where the noise resides.
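
As a rough sanity check on the file size mentioned above, the arithmetic for a constant-bitrate file can be sketched in Python. This is only an estimate: it assumes a track of exactly 3 minutes and ignores container overhead, and the function name is made up for illustration.

```python
def encoded_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Approximate payload size of a constant-bitrate stream (no container overhead)."""
    # kilobits per second -> bits per second -> total bits -> bytes
    return bitrate_kbps * 1000 * seconds / 8


track_seconds = 3 * 60  # assumption: the track is exactly 3 minutes long

for kbps in (4, 5, 64):
    size_kb = encoded_size_bytes(kbps, track_seconds) / 1024
    print(f"{kbps:>2} kbps -> roughly {size_kb:.0f} KB")
```

At 5 kbps this works out to about 110 KB for a full 3 minutes, which is in the same ballpark as the 98 KB reported for a track of roughly that length.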

maikii
04-08-2004, 11:08 PM
In case anyone is curious, I encoded all 10 CDs, each holding close to 70 minutes (so close to 700 minutes total), and the total space is about 30 MB for all 10 CDs, close to the space that one CD takes with 64 kbps WMA music encoding.
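
Those totals are consistent with simple bitrate arithmetic. The sketch below assumes exactly 70 minutes per CD, a constant 5 kbps for the voice files, and no container overhead, so it is an approximation and not a measurement. The names are hypothetical.

```python
# CD audio: 44.1 kHz sample rate, 16-bit (2-byte) samples, 2 channels.
CD_BYTES_PER_SECOND = 44_100 * 2 * 2


def cbr_megabytes(bitrate_kbps: float, minutes: float) -> float:
    """Approximate size in MB (10**6 bytes) of a constant-bitrate stream."""
    return bitrate_kbps * 1000 * minutes * 60 / 8 / 1e6


book_minutes = 10 * 70  # assumption: ten CDs of exactly 70 minutes each

voice_mb = cbr_megabytes(5, book_minutes)   # whole book with the voice codec
music_cd_mb = cbr_megabytes(64, 70)         # a single CD at 64 kbps
raw_mb = CD_BYTES_PER_SECOND * book_minutes * 60 / 1e6  # uncompressed CD audio

print(f"whole book at 5 kbps: {voice_mb:.1f} MB")
print(f"one CD at 64 kbps:    {music_cd_mb:.1f} MB")
print(f"uncompressed audio:   {raw_mb:.0f} MB, about {raw_mb / voice_mb:.0f}x larger")
```

Under these assumptions the whole book comes to roughly 26 MB and one CD at 64 kbps to roughly 34 MB, which matches the "about 30 MB" comparison above.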

Sounds fine on the iPAQ. However, when I tried it on a portable music player I just got, the Creative Labs Muvo 4GB, it wouldn't play. Apparently that player doesn't support low-bitrate WMAs. So I'll listen to it on the iPAQ.