The Ur-Quan Masters Discussion Forum

The Ur-Quan Masters Re-Release => Starbase Café => Topic started by: Daktaklakpak on May 06, 2006, 10:03:15 pm



Title: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 06, 2006, 10:03:15 pm
I am looking for some information about the audio format used for Star Control 3 voice clips contained in the game's sound file archive named "STAR001.VOC". There appear to be about 3500 references to single "WAV" format files within the archive, but upon extracting and loading them into a sound editor, they all come up as 8-bit 22,050 Hz Mono. This is fine for the sound effects, intro audio, and credits audio (150 out of ~3500 files), but on the in-game voice files, there exists a large amount of static, though you can tell that there is indeed speech within the noise. I've tried forcing different sample rates and byte ordering on the files to no avail, so I am assuming it's some kind of compressed format, though I've tried loading them as most common compressed formats with no results either.

So if you know what I'm talking about, please let me know what format these files are. If not, perhaps you could point me in the right direction to someone who might be able to provide this information.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: meep-eep on May 07, 2006, 02:22:11 am
.voc is the "Creative Voice" format. It is the wave format used by Creative Labs when .wav wasn't the (de facto) standard yet. I got a wav2voc program with my Sound Blaster Pro 2.0 back in the day, but no voc2wav.
You should be able to find a converter somewhere on the web.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 07, 2006, 08:51:09 pm
Thanks for trying to help, meep-eep. :) I'll try to be more precise:

Reiteration: STAR001.VOC appears to be an archive of 3,592 WAV format files.

Clarification: In this case, the extension does not denote the type of file; it is simply the name that the programmers of Star Control 3 used for this archive. The information I need is related to the format of the WAV files contained within the STAR001.VOC archive that are not Uncompressed PCM data. 150 of the files (sound effects and intro/cutscene/credits audio) are standard PCM data (22,050 Hz 8-bit Mono), while the remaining 3,442 files are identified by their headers as being WAV data, but neither I, nor any of my software, have been able to determine the exact format.

Errata: The software being utilized is:

    MRIP for extracting the WAV files from STAR001.VOC
    SoundForge & Awave Studio for loading and analyzing the format of those WAV files


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 07, 2006, 09:01:22 pm
Another Clarification: WAV format files usually contain PCM data. So, WAV data = PCM data. My previous response may not have been clear to those unfamiliar with digital audio engineering. Sorry for that.  :D


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: meep-eep on May 07, 2006, 09:48:35 pm
Creative Voice Files contain headers very much like .wav files, if I recall correctly, and they can contain multiple samples. Are you sure your .voc file is not really a Creative Voice File?
The contents of the samples may be adpcm encoded. That was very common in the days before .mp3.
If you send one of the supposed .wav files, I'll see what I can find out.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 07, 2006, 09:57:02 pm
meep-eep: I did indeed try to load STAR001.VOC as a single Creative VOC, no luck. Also tried various ADPCM encodings on the individual WAV files with no results.

How do you recommend I send you one of the files for inspection?


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 07, 2006, 10:10:07 pm
meep-eep: Here is a link to one of the mysterious files. Let me know when you have it.

http://www.oblyvaeon.com/TEMP/WAV00000.zip

Thanks for your help.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: meep-eep on May 07, 2006, 11:33:22 pm
Got it. It doesn't look right. The 4 bytes after "RIFF" are supposed to be 8 bytes short of the total file length.
Perhaps your extraction program isn't quite working right.
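
For anyone who wants to run the same check on their own extracted files, here is a minimal sketch in C of the rule described above: the little-endian 32-bit field after "RIFF" should equal the file length minus 8. The function names (`riff_read_le32`, `riff_size_matches`) are made up for illustration, and the caller is assumed to have already read the whole file into memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t riff_read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if buf starts with "RIFF" and the size field equals
   file_len - 8, 0 if the field disagrees, and -1 if there is no
   RIFF signature at all. */
int riff_size_matches(const unsigned char *buf, size_t file_len) {
    if (file_len < 8 || memcmp(buf, "RIFF", 4) != 0)
        return -1;
    return (size_t)riff_read_le32(buf + 4) == file_len - 8;
}
```

A result of 0 on a file that otherwise looks like a WAV points to either a bad extraction or a header that describes something other than the bytes actually stored.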


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 08, 2006, 12:53:24 am
meep-eep: Right. The 4 bytes should be the length of the audio data, which can actually be a little less than the file length, since some WAV formats contain metadata at the end of the file. (or the beginning in FACT chunks, extended formatting, etc.)

These don't seem to have any metadata, and although most of the files do contain too much data appended to the end of the file (including the RIFF header of the next file in the archive), it is mostly zeroed out. Even with this tacked on, the WAV data should still play properly, followed by static if the extra data is garbage. The header in these is probably the problem, since it also specifies that these files should be 22,050 Hz 8-Bit Mono, which they are clearly not. If you lower the sample rate of the files to 11,025 Hz, you can make out the speech a little better. The one I put up a link to, WAV00000.WAV, sounds like it starts out "I offer you peace..." under the static. That's why I'm trying to get past this screwed up header to the data format itself; there is definitely something there.

As for the program, I put it in a sort of 'blind mode', which just searches for a specified 'header' (series of bytes) 'WAVEfmt' offset by -8 bytes, or just 'RIFF', then reads everything until the next 'header' is found and writes that part to a new file, then on to the next file, with no processing. It does have a default 'smart mode' which checks the data/header against a known format, but that only finds the 150 'good' WAVs in the archive, which the 'blind mode' does find as well, with no corruption.
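
The 'blind mode' idea is simple enough to sketch in C. This is not MRIP's actual code (which is not shown anywhere in this thread), only an illustration of splitting a buffer at every "RIFF" signature; `next_riff` and `count_riff_chunks` are hypothetical names, and the archive is assumed to be loaded into memory.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the next "RIFF" signature at or after `start`,
   or `len` if there is none. */
size_t next_riff(const unsigned char *buf, size_t len, size_t start) {
    size_t i;
    for (i = start; i + 4 <= len; i++)
        if (memcmp(buf + i, "RIFF", 4) == 0)
            return i;
    return len;
}

/* Count the pieces a blind split would produce: each piece runs from
   one signature up to the next signature (or the end of the buffer). */
size_t count_riff_chunks(const unsigned char *buf, size_t len) {
    size_t n = 0;
    size_t pos = next_riff(buf, len, 0);
    while (pos < len) {
        n++;
        pos = next_riff(buf, len, pos + 4);
    }
    return n;
}
```

Because each piece simply runs to the next signature, any padding between subfiles ends up tacked onto the end of the preceding piece, which matches the trailing data described above.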


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 08, 2006, 08:13:20 pm
STAR001.VOC seems to be a lot of WAV files concatenated together, like Dak says, but most of them seem to have bogus headers. The header says 8 bit PCM at 22 kHz, but that is definitely wrong. Given the time period, I'd guess the data is actually in some ADPCM variant, but so far I haven't found a decoder that produces a listenable result.

Star Control 3 uses Miles Sound System 3.5b, but tracking down specs for it is quite hard (the current version is completely different).


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 09, 2006, 05:51:02 am
Novus: You calling me a Dak?! Haha.

I have been researching Miles version 3.x actually, but I don't know if that will point to a specific format. Doesn't look like it so far, unfortunately. I agree about ADPCM being a likely format, since when you force most ADPCM formats to load as uncompressed PCM (Unsigned Little-endian 8-bit in this case) they do sound similar to the files in STAR001.VOC.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 09, 2006, 12:24:32 pm
Quote
Novus: You calling me a Dak?! Haha.
Yeah, I know, "Daktaklakpak is short form". ;)

Quote
I have been researching Miles version 3.x actually, but I don't know if that will point to a specific format. Doesn't look like it so far, unfortunately. I agree about ADPCM being a likely format, since when you force most ADPCM formats to load as uncompressed PCM (Unsigned Little-endian 8-bit in this case) they do sound similar to the files in STAR001.VOC.
If you take one fixed-length encoding (or close to fixed length) and play it as another, you'll probably be able to discern similar patterns amongst the static. Unfortunately, this makes it harder to guess the format.

I've had more luck with the structure of the archive. The first two bytes seem to be the number of subfiles, followed by that many offsets to the subfiles (32-bit offsets from file start; some are zero). Each subfile seems to start with 20 00 and then four bytes that seem to be the real file size followed by what appears to be a WAV. The WAV file header seems to contain information about the unpacked file, which means that e.g. the file length in the WAV header is way too large. Each subfile is padded up to the next 4K multiple (for quick disk access?).

Assuming the file size in the WAV header is the unpacked file size, this would suggest a codec with a variable compression factor of something like 3 to 4. This rules out most popular fixed-rate ADPCM codecs (e.g. Creative ADPCM).
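
The compression factor estimate can be reproduced per subfile with a few lines of C. This sketch assumes the layout guessed at above (two marker bytes 20 00, a 32-bit little-endian stored size, then the RIFF header whose size field plus 8 is taken as the unpacked size). The names `arc_le32`, `pad_to_4k` and `packing_ratio` are invented here, and whether the stored size counts the WAV header is also an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t arc_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Round a size up to the next multiple of 4096 (the observed padding). */
uint32_t pad_to_4k(uint32_t n) {
    return (n + 4095u) & ~(uint32_t)4095;
}

/* Ratio of the size claimed by the WAV header to the stored size.
   `sub` must point at the start of a subfile (at least 14 bytes).
   Returns 0.0 if the bytes do not match the assumed layout. */
double packing_ratio(const unsigned char *sub) {
    uint32_t stored, claimed;
    if (sub[0] != 0x20 || sub[1] != 0x00 || memcmp(sub + 6, "RIFF", 4) != 0)
        return 0.0;
    stored = arc_le32(sub + 2);
    claimed = arc_le32(sub + 10) + 8u;
    return stored ? (double)claimed / (double)stored : 0.0;
}
```

If the ratio varies from subfile to subfile rather than sitting at exactly 2, 3 or 4, that would support the variable-rate reading over a fixed-rate ADPCM codec.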


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 09, 2006, 07:28:05 pm
Novus: That's about as far as I've gotten with the archive itself as well. One theory I had was that STAR001.VOC is a pseudo-compressed archive (or contains pseudo-compressed files), such that the 'pseudo' part would be referring to the WAV data itself being compressed while the headers become accurate for the files when Star Control 3 (or Miles) decompresses them at runtime, in which case we're at a dead-end if they used a proprietary compression algorithm. Pretty much what you said with regards to the WAV header referring to the unpacked size. It sounds like you've played with this file before :) But at any rate, I'm still hoping it is not a proprietary or in-house derived format. This is my 3rd attempt over the years to crack this thing, heh.

On a side note: I am trying to contact some of the old Legend programmers to see if they will let anything slip, though they are hard to track down.  I found one on the Gamasutra (sp?) staff, but no response yet. Also tried to contact a user (Kohr-Ah Death) on this forum who claims to know the format in this post: http://uqm.stack.nl/forum/index.php?topic=2748.0 Maybe he's had the answer all along...


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 09, 2006, 07:37:34 pm
Quote
Also tried to contact a user (Kohr-Ah Death) on this forum who claims to know the format in this post: http://uqm.stack.nl/forum/index.php?topic=2748.0 Maybe he's had the answer all along...
Somehow, I suspect Kohr-Ah Death noticed they were WAVs, but the extractor went crazy for the reasons outlined above.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 10, 2006, 05:25:10 am
From reading K.Death's post, it sounds like he could not extract all the files; he never posted a follow-up to tell us if he did finally get them. So, for those who want to work on this, you can get all 3,592 files out of STAR001.VOC with Multi-Ripper (MRIP) v2.80 by specifying the /N switch on the command line. If you don't use the /N switch, it will only extract the 150 'good' files mentioned earlier (the intro/cutscene/credits and sound effects audio).

Syntax:

MRIP STAR001.VOC /N

You can use the default Windows WAVE search, or for a cleaner result use:

User Defined Pattern: RIFF
File Extension: WAV
Offset: 0


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 11, 2006, 12:03:12 pm
Studying the DOSBox output from SC3 seems to confirm that the speech is 8-bit PCM mono at 22 kHz (DOSBox output at 16 bits/44 kHz (with linear interpolation) has every other sample half way between its neighbours and 22 kHz output exhibits about 240 distinct signal levels). If that part of the header information checks out, the uncompressed length is probably correct, too.

Further analysis of the DOSBox output supports the theory of an ADPCM variant; prime numbers (e.g. 53) are heavily underrepresented in the differences between two successive samples (the fact that 1 or 2 instances show up may be an artefact of the playback process in SC3 or DOSBox). LucasArts VIMA (http://wiki.multimedia.cx/index.php?title=VIMA) seems to be similar in principle.
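
The two statistics used in this analysis (how many distinct signal levels occur, and how often each difference between successive samples shows up) are easy to compute. A rough C sketch follows, assuming unsigned 8-bit mono samples already in memory; `diff_histogram` and `distinct_levels` are illustrative names, not taken from any existing tool.

```c
#include <stddef.h>

/* Fill hist so that hist[d + 255] counts how often two successive
   8-bit samples differ by d, for d in -255..255. */
void diff_histogram(const unsigned char *s, size_t n, unsigned long hist[511]) {
    size_t i;
    for (i = 0; i < 511; i++)
        hist[i] = 0;
    for (i = 1; i < n; i++)
        hist[(int)s[i] - (int)s[i - 1] + 255]++;
}

/* Count how many different sample values occur in the data. */
int distinct_levels(const unsigned char *s, size_t n) {
    unsigned char seen[256] = {0};
    int count = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        if (!seen[s[i]]) {
            seen[s[i]] = 1;
            count++;
        }
    }
    return count;
}
```

Differences that almost never occur, such as the underrepresented primes mentioned above, show up as near-empty bins in the histogram.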


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 12, 2006, 07:59:27 pm
Novus: Did you manage to get a pure digital sampling of the audio output? I'm assuming not since you mentioned artifacts, but I wonder if it would be possible with DOSBox or some kind of signal analysis software. Thinking about this reminded me of a program called "Virtual Audio Cable" (VAC) which lets you reroute a digital audio signal from the output of one program to the input of another. That's assuming you can actually select the output and input drivers in both programs, though. I've been using Virtual PC (Microsoft version) with MS-DOS 6.22 installed, but there seems to be no way to get a recording without resorting to sampling the sound card's main outs (you can't select an output driver like VAC), which puts us in the analog domain. I thought about trying an optical out, but I'm not sure that would be enough to get a pure signal since we're emulating DOS and relying on two layers of drivers for output (emulation driver passed to the host driver). But if you could select an output driver with DOSBox, that would give us an excellent option for analyzing the signal. I haven't had a chance to play with DOSBox yet, so I'll have to look into that. Thanks for the idea. :)

Another possibility for a pure signal would be VDMSound, which operates as a driver emulating DOS sound hardware within the native OS, and has a "Wave Writer" option. Though every time I try to enable that option with Star Control 3, it refuses to work on my system. It just opens a file for writing, but never writes anything to it, heh.

I suppose, if nothing else, I could get one of these options working (or even go analog) and meticulously go through and record all the audio I want; but this, of course, doesn't answer the question of the format. It becomes academic at this point, but hey, that's what I'm here for, right? ;D


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: meep-eep on May 12, 2006, 08:12:19 pm
If recapturing the audio is an acceptable solution, then you could use DOSBox with SDL to dump the audio.
Set the SDL_AUDIODRIVER environment variable to "disk" and set SDL_DISKAUDIOFILE to the name of the file to dump the (raw) data.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 12, 2006, 08:18:30 pm
meep-eep: Thanks for the information. I was just now reading about the SDL environment variables on the DOSBox Wiki. I'll have to try that since VDMSound is not being kind to me on this game.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 12, 2006, 09:27:25 pm
Quote
Novus: Did you manage to get a pure digital sampling of the audio output? I'm assuming not since you mentioned artifacts, but I wonder if it would be possible with DOSBox or some kind of signal analysis software.
The problem is that DOSBox is not dumping the raw data sent to the sound card; it's resampling everything to the output frequency, applying mixer settings, adding OPL output (unless disabled) and so on (as well as accurately emulating any gaps in the signal). Setting the DOSBox output frequency to the Sound Blaster's approximation of 22050 Hz (22222 Hz, I think) and turning up the PCM volume to max ought to fix the resampling and scaling problems, but doesn't affect any timing-related problems.
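
The 22222 Hz figure is consistent with the way the pre-SB16 Sound Blaster DSP sets its mono playback rate, namely through an 8-bit time constant of 256 - 1000000/rate. Here is a small sketch of that arithmetic in C, assuming plain truncating division (a real driver may round differently) and using made-up function names. Whether SC3's driver actually programs the card this way is not established here.

```c
/* Time constant the DSP would be given for a requested mono rate. */
unsigned sb_time_constant(unsigned rate) {
    return 256u - 1000000u / rate;
}

/* Playback rate the card actually produces for a given time constant. */
unsigned sb_actual_rate(unsigned time_constant) {
    return 1000000u / (256u - time_constant);
}
```

For a requested 22050 Hz this gives a time constant of 211, and feeding 211 back in gives 22222 Hz.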

Of course, since DOSBox is open source (just like UQM), it's possible to hack it to output the sound data sent to the emulated SB16 instead of the emulated SB16's output.

Recording from DOSBox is dead easy; just make sure you have a "capture" directory (or whatever your DOSBox configuration calls it) and press CTRL-F6 to record (this has the advantages over meep-eep's method that you can start/stop recording while running and hear what's playing). The only problem is that DOSBox is emulating the sound a bit too realistically, as mentioned above.

Quote
I suppose, if nothing else, I could get one of these options working (or even go analog) and meticulously go through and record all the audio I want; but this, of course, doesn't answer the question of the format. It becomes academic at this point, but hey, that's what I'm here for, right? ;D
I guess it's the intellectual challenge that's the attraction for me, not the end result. If this were Star Control 2, the end result alone would probably be worth it, but that's been done already.

It would be easier if I had both the compressed version of a sound and a good recording of the same sound. The compressed sounds are a bit hard to identify, especially since it was quite a while since I actually played SC3 last, so I'm having a hard time finding the corresponding sound in game. As far as I can tell, silence tends to be encoded as repetitions of 7f 7f... which seems to correspond to silence showing up as sequences of 0, 1, 0, 1... That looks like 4-bit ADPCM with the sign bit in the MSB of each nibble to me and some sort of inverted magnitude part. However, I'm just guessing at this point. Quick and silly hypothesis: Creative VOC data with all the bits inverted.

Edit: Actually, that hypothesis might not be as silly as it initially sounds; Creative's Sound Blasters (and therefore also VOC files) support 3 different ADPCM formats, with compression ratios of 1:4 (2-bit), 1:3 ("2.6-bit") and 1:2 (4-bit). AIL/Miles Sound System has had VOC support since the beginning as far as I can tell, which ought to include compressed VOC files. Inverting all bits would be a good example of the "encryption" game programmers like to do to standard file formats (e.g. LucasArts archives make a lot more sense if you XOR everything with 0x69). Writing a decoder and seeing whether it works might be a good idea.
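
To make the hypothesis concrete, here is a sketch in C of how a byte would split into two 4-bit codes under it, with an optional inversion of all bits first. Everything about the layout is a guess: the nibble order (high nibble first), the sign bit in the MSB of each nibble, and a 3-bit magnitude. The type and function names are invented for illustration, and this is not a decoder, since the step-size handling is unknown.

```c
/* One 4-bit code: a sign (+1 or -1) and a 3-bit magnitude (0..7). */
typedef struct {
    int sign;
    int magnitude;
} nibble_code;

/* Interpret the low four bits of `nib` as sign bit plus magnitude. */
nibble_code split_nibble(unsigned nib) {
    nibble_code c;
    c.sign = (nib & 0x8u) ? -1 : 1;
    c.magnitude = (int)(nib & 0x7u);
    return c;
}

/* Split a byte into two codes, high nibble first (an assumption).
   If `invert` is nonzero, all eight bits are flipped beforehand. */
void split_byte(unsigned char b, int invert, nibble_code out[2]) {
    if (invert)
        b = (unsigned char)~b;
    out[0] = split_nibble((unsigned)b >> 4);
    out[1] = split_nibble((unsigned)b & 0xFu);
}
```

Under this reading, 7f taken as-is gives two codes of magnitude 7 with opposite signs, while 7f inverted (80) gives two codes of magnitude 0 with opposite signs, which at least looks compatible with silence decoding to a tiny alternation.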


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 12, 2006, 11:32:18 pm
Novus: Ok, here is a link to one of the extracted (compressed) files as well as the DOSBox output of that file, reduced to 22,050 Hz 8-bit Mono (Left channel):

http://www.oblyvaeon.com/TEMP/WAV03545.zip

The DOSBox output was 22,050 Hz 16-bit Stereo, but it looks like it only contained 8 bits worth of data anyway, and each channel contained the same data, other than the stereo image being out of phase by 1 millisecond. It is interesting to note (I mentioned this earlier with regards to the intelligibility of the audio) that downsampling the compressed file to 11,025 Hz makes the audible portion of the file closely match the length that the file would be if properly played / uncompressed. This suggests a 1:2 format. How closely they match in this case depends on human error while pressing CTRL-F6 in DOSBox :)
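
The reduction described here (keep the left channel, drop to 8 bits, and check that nothing is lost in the process) can be sketched in C as follows. It assumes interleaved signed 16-bit frames in left/right order and unsigned 8-bit output, as in ordinary PCM WAV files; `left_to_u8` is a hypothetical name, not a function from any audio editor.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert interleaved 16-bit stereo frames to unsigned 8-bit mono by
   keeping the high byte of each left-channel sample.
   Returns 1 if every left sample had a zero low byte (so the recording
   really held only 8 bits of data), 0 otherwise. */
int left_to_u8(const int16_t *frames, size_t nframes, unsigned char *out) {
    int only_8_bits = 1;
    size_t i;
    for (i = 0; i < nframes; i++) {
        /* Shift signed -32768..32767 into unsigned 0..65535. */
        unsigned v = (unsigned)((int)frames[2 * i] + 32768);
        if (v & 0xFFu)
            only_8_bits = 0;
        out[i] = (unsigned char)(v >> 8);
    }
    return only_8_bits;
}
```

A return value of 0 would mean the capture carries more than 8 bits of information, for example because of volume scaling or resampling in the emulator.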

Can't get the quotes to work on this forum, hmm, but regarding the bit inversion / XOR: yeah, I've encountered this before as well. Even used it myself on some "archives" attached to the end of executable files. The problem here being, what did they do this time? Haha, but well, I could try a few simple swaps for fun anyway.

Side note: I guess I was wrong about intro/cutscene audio being among the 150 'good' files I could extract, because this file happens to be audio from the first segment of the intro, heh. Should have given them a good listen again. It's been a while since I actually played the game all the way through as well.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 13, 2006, 12:44:02 am
Well, isn't this interesting... I was going over the 150 'good' files for clues, and noticed that they are a mixed bag of 8 AND 16-bit 22,050 Hz Mono files. Don't know if this will have any impact on the format we are seeking, but thought it was worth mentioning. This should make it harder to figure out, yay! ::)


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 13, 2006, 01:35:17 am
Report: Initial bit inversion yielded no results. Tried all bits and even bits, similar to A-Law and mu-Law types, even tried odd bits. I only tried it on a couple files so far and have no reference to check these against, so don't let that stop anyone else from trying the same thing. After observing the mixed bag earlier, who knows? Maybe they used a slightly different encoding on each file haha, wouldn't that be fun...


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 13, 2006, 11:01:42 am
Quote
Novus: Ok, here is a link to one of the extracted (compressed) files as well as the DOSBox output of that file, reduced to 22,050 Hz 8-bit Mono (Left channel):

http://www.oblyvaeon.com/TEMP/WAV03545.zip
Thanks, that ought to help.

Quote
The DOSBox output was 22,050 Hz 16-bit Stereo, but it looks like it only contained 8 bits worth of data anyway, and each channel contained the same data, other than the stereo image being out of phase by 1 millisecond. It is interesting to note (I mentioned this earlier with regards to the intelligibility of the audio) that downsampling the compressed file to 11,025 Hz makes the audible portion of the file closely match the length that the file would be if properly played / uncompressed. This suggests a 1:2 format. How closely they match in this case depends on human error while pressing CTRL-F6 in DOSBox :)
Those WAV headers may be left over from some previous version of the samples, in which case trying to deduce the compression ratio from them would give plausible but incorrect results. Using the DOSBox output as reference for this is a much better idea.

Quote
Can't get the quotes to work on this forum, hmm, but regarding the bit inversion / XOR: yeah, I've encountered this before as well.
The "insert quote" links in the "Post reply" page have been broken for so long I forgot they existed. The "Quote" button on the main thread view works fine, but only quotes one post.

If you want to obfuscate your data but not sacrifice performance, you have to use a fast algorithm. XOR is quick. That makes it common. However, in that case, I would have expected the entire file to be XORed, not just the data.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 17, 2006, 02:20:28 am
Had another chance to play with the files today. Did a quick and dirty XOR on WAV00000.WAV using every single-byte key from 0-255 (included 0 as a check of the program's integrity). Just got through listening to all 256 files, and guess what: no results. Er, no GOOD results :)

Oh well, I'm posting some BASIC code if anyone wants to modify it to use a larger key. For example, maybe there is a key somewhere in the file data, like those 3-4 bytes before each WAVE header. There are also quite a few other not-so-random looking bytes near the end of the 'padding' area of each WAVE file. Might play with it some more if I can get some free time.

Remember, this code is quick and dirty; no error trapping or file creation confirmation, etc. Use at your own risk. I don't know if this board supports code snippets, so everything following this line is code:

Code:
'Brute force XOR decrypt (single byte key)
'
'Notes:
'  First output file is XOR 0 (output file = input file), verifies program is working properly
'  Modified to leave a 44 byte WAVE header intact

DEFINT A-Z

DIM CurrentByte AS STRING * 1
DIM NewByte AS STRING * 1

INPUT "File to process: ", infile$
OPEN infile$ FOR BINARY AS #1

FOR XorLoop = 0 TO 255

  outfile$ = "XOR_" + LTRIM$(RTRIM$(STR$(XorLoop))) + ".WAV" 'format (text) output file name
  PRINT "Output file: ", outfile$ 'show what file we are currently operating on
  OPEN outfile$ FOR BINARY AS #2

  FOR HeaderLoop = 1 TO 44 'leave header intact
    GET #1, HeaderLoop, CurrentByte
    PUT #2, HeaderLoop, CurrentByte
  NEXT HeaderLoop
 
  FOR ReadLoop& = 45 TO LOF(1) 'XOR remaining bytes with current key
    GET #1, ReadLoop&, CurrentByte
    NewByte = CHR$(ASC(CurrentByte) XOR XorLoop)
    PUT #2, ReadLoop&, NewByte
  NEXT ReadLoop&

  CLOSE #2

NEXT XorLoop

CLOSE
END


Edit: ^^^ Hey look, a code snippet! Heh, thanks Novus.
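
For the "larger key" variation suggested above, a repeating multi-byte XOR is the obvious next step. Below is a minimal C sketch that works on a buffer in memory and leaves a header of `skip` bytes untouched (44 in the BASIC version). The function name and the idea of feeding it the bytes found before each WAVE header are assumptions, not something known about the game.

```c
#include <stddef.h>

/* XOR data[skip..n) with a repeating key; the first `skip` bytes are
   left as they are. Does nothing if the key is empty. */
void xor_repeating(unsigned char *data, size_t n, size_t skip,
                   const unsigned char *key, size_t keylen) {
    size_t i;
    if (keylen == 0)
        return;
    for (i = skip; i < n; i++)
        data[i] ^= key[(i - skip) % keylen];
}
```

With a one-byte key this does the same job as one pass of the BASIC loop, and since XOR is its own inverse, running it twice with the same key restores the original data.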


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 17, 2006, 09:27:53 am
I did some poking around with DOSBox. Whatever compression SC3 is using, it's not using the Sound Blaster's ADPCM decoder to play it. Also, the intro speech seems to be using all 16 bits, while the dialogue is in 8 bit, meaning that they are probably in different, unknown, compressed formats. ???

I managed to find a copy of the source code to Miles Audio Interface Library (AIL) 2.14 on John Miles's homepage (http://www.qsl.net/ke5fx/), but the only ADPCM support there seems to rely on the SB's decoding hardware.

On the plus side, I did manage to extract the subfiles apparently cleanly without any trailing garbage using the following:

Code:
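#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Print an error message and quit. */
static void abend(const char *s) {
    fprintf(stderr, "%s\n", s);
    exit(1);
}

/* Note: the index fields are read in host byte order, so this assumes a
   little-endian machine. */
int main(int argc, char *argv[]) {
    uint16_t files;     /* number of entries in the offset index */
    uint32_t *offset;   /* 32-bit offsets from the start of the archive */
    uint32_t size;      /* stored size of the current subfile */
    FILE *in, *out;
    char *buffer = NULL, *grown;
    char filename[200];
    int a;

    if (argc < 2)
        abend("Usage: extract <archive>");
    if ((in = fopen(argv[1], "rb")) == NULL)
        abend("Input file open failed.");

    if (fread(&files, sizeof files, 1, in) < 1)
        abend("Number of files read failed.");
    if ((offset = malloc(sizeof *offset * (files ? files : 1))) == NULL)
        abend("Out of memory.");
    fprintf(stderr, "Offsets in index: %d\n", files);
    if (fread(offset, sizeof *offset, files, in) < files)
        abend("Offsets read failed.");

    for (a = 0; a < files; a++) {
        if (!offset[a])
            continue;   /* zero offset: unused index slot */

        /* Skip the two-byte 20 00 marker, then read the stored size. */
        if (fseek(in, (long)offset[a] + 2, SEEK_SET) != 0 ||
            fread(&size, sizeof size, 1, in) < 1)
            abend("Subfile header read failed.");
        fprintf(stderr, "File %d: %lu bytes\n", a, (unsigned long)size);

        if ((grown = realloc(buffer, size ? size : 1)) == NULL)
            abend("Out of memory.");
        buffer = grown;
        if (fread(buffer, 1, size, in) < size)
            abend("Subfile data read failed.");

        sprintf(filename, "out%d.wav", a);
        if ((out = fopen(filename, "wb")) == NULL)
            abend("Output file open failed.");
        if (fwrite(buffer, 1, size, out) < size)
            abend("Output file write failed.");
        fclose(out);
    }

    fclose(in);
    free(buffer);
    free(offset);
    return 0;
}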


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Slylendro on May 19, 2006, 06:25:43 pm
A few years ago, I tried to decompress that VOC file too, but had no success extracting the clips without the noise in the background.
It's too bad there are no sounds and no quotes about Star Control 3 out there on the net besides playing the whole game.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 21, 2006, 07:23:32 pm
Quote
I did some poking around with DOSBox. Whatever compression SC3 is using, it's not using the Sound Blaster's ADPCM decoder to play it.
...
I managed to find a copy of the source code to Miles Audio Interface Library (AIL) 2.14 on John Miles's homepage (http://www.qsl.net/ke5fx/), but the only ADPCM support there seems to rely on the SB's decoding hardware.

That makes sense, considering that if they did use SB hardware to decode, the audio wouldn't work with any of the other supported sound cards. :P It would have to be decoded before passing data to the sound card driver(s). I'm going over some decompiled ASM code now, but I think the best route at this point will be with a low level debugger, such as SoftICE. I might fall back to a real DOS box for this.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Novus on May 22, 2006, 06:57:17 pm
Quote
That makes sense, considering that if they did use SB hardware to decode, the audio wouldn't work with any of the other supported sound cards. :P It would have to be decoded before passing data to the sound card driver(s).
True, although getting hardware acceleration for decompression on most computers would be a major bonus. However, it seems Creative never bothered to document their compression format properly (possibly to make their cards harder to clone), which would make it sensible for Miles or Legend to do a custom job.

Quote
I'm going over some decompiled ASM code now, but I think the best route at this point will be with a low level debugger, such as SoftICE. I might fall back to a real DOS box for this.
DOSBox includes a low-level debugger (http://vogons.zetafleet.com/viewtopic.php?t=3944&sid=6d0fcdca406212967c18f5f1b3bac838) as well, BTW.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Daktaklakpak on May 27, 2006, 09:11:54 pm
Another interesting development...

Since Novus pointed it out, I decided to play around with DOSBox's built-in debugger, and using a HEAVY_DEBUG build, I obtained the following (excerpt from the debugger):

Code:
    462876: EXEC:Execute sc3.EXE 0
    462876: FILES:file open command 0 file sc3.EXE
    464196: FILES:file open command 0 file C:\SC3.EXE
    464407: FILES:file open command 0 file C:\SC3.ETX
    465760: BIOS:INT15:Unknown call BFDE
    465811: DOSMISC:DOS:Multiplex Unhandled call 1687
    465834: BIOS:INT15:Unknown call BFDE
    470291: BIOS:INT15:Unknown call BF01
    470349: BIOS:INT15:Function 0x88 Remaining 0000 kb
   1503533: FILES:file open command 0 file C:\SC3.EXE
   9000353: FILES:file open command 0 file LEGEND.INI
   9023854: PIT:PIT 0 Timer at 18.21 Hz mode 3
   9190030: PIT:PIT 0 Timer at 99.99 Hz mode 3
   9192092: FILES:Special file open command 10 file AUTORUN.LOC
   9222210: FILES:file open command 0 file MDI.INI
   9248830: FILES:file open command 0 file MPU401.MDI
   9258415: FILES:file open command 0 file MPU401.MDI
   9266178: FILES:file open command 0 file MPU401.MDI
   9281409: MISC:MPU-401:Reset FF
   9281907: MISC:MPU-401:Set UART mode 3F
  10903521: PIT:PIT 0 Timer at 120.00 Hz mode 3
  10904754: FILES:file open command 0 file GM.XMI
  10992615: FILES:file open command 0 file DIG.INI
  11016330: FILES:file open command 0 file SB16.DIG
  11026586: FILES:file open command 0 file SB16.DIG
  11035030: FILES:file open command 0 file SB16.DIG
  11051512: SBLASTER:DSP:Reset
  11051718: PIC:1 mask EC
  11051722: PIC:0 mask 78
  11052043: SBLASTER:Short transfer scheduling IRQ in 0.023 milliseconds
  11052043: SBLASTER:DMA unmasked,starting output, auto 0 block 1
  11052043: SBLASTER:DMA Transfer:16-bits PCM Stereo Single-Cycle freq 22050 rate 44100 size 1
  11052069: IO:Writing 04 to port 00ED
  11052268: SBLASTER:Single cycle transfer ended
  11052268: SBLASTER:Raising IRQ
  11136859: PIC:1 mask EC
  11136863: PIC:0 mask F8
  11140162: PIC:1 mask EC
  11140166: PIC:0 mask 78
  11140181: SBLASTER:MIXER:Read from unhandled index 32
  11140187: SBLASTER:MIXER:Read from unhandled index 33
  11140205: SBLASTER:MIXER:Read from unhandled index 32
  11140208: SBLASTER:MIXER:Write 0 to unhandled index 32
  11140213: SBLASTER:MIXER:Read from unhandled index 33
  11140216: SBLASTER:MIXER:Write 0 to unhandled index 33
  11143464: PIT:PIT 0 Timer at 200.00 Hz mode 3
  11155239: FILES:file open command 0 file STAR001.VOC
  11187488: INT10:Function 6F not supported
  11188096: INT10:Function 5F not supported
  11188338: IO:Read from port 03CD
  11188342: IO:Writing 55 to port 03CD
  11188737: INT10:Function 12:Call 80 not handled
  11378620: INT10:Set Video Mode 101
  11378620: VGA:Blinking 0
  11378620: MOUSE:Unhandled videomode 69 on reset
  11387796: FILES:file open command 0 file STAR000.PIC
  11398032: MOUSE:Unhandled videomode 69 on reset
  11404208: FILES:file open command 0 file STAR000.FNT
  11421087: FILES:file open command 0 file SC3STR.DAT
  11442600: MOUSE:Define Hortizontal range min:0 max:639
  11443241: MOUSE:Define Vertical range min:0 max:479
  11449370: FILES:file open command 0 file .\Q\STAR909.Q
  11858361: VGA:H total 100, V Total 525
  11858361: VGA:H D End 80, V D End 480
  11858361: VGA:Width 640, Height 480, fps 70.007141
  11858361: VGA:normal width, normal height aspect 1.000000
  13196077: FILES:file open command 0 file .\Q\STAR923.Q
  13497437: SBLASTER:DMA unmasked,starting output, auto 1 block 2047
  13497437: SBLASTER:DMA Transfer:16-bits PCM Stereo Auto-Init freq 22050 rate 44100 size 1024
  13708984: SBLASTER:Raising IRQ
  13939022: SBLASTER:Raising IRQ
  14168409: SBLASTER:Raising IRQ
...
  19784215: SBLASTER:Raising IRQ
  20004240: SBLASTER:Raising IRQ
  20234289: SBLASTER:Raising IRQ
  30696035: FILES:file open command 0 file STAR200.PIC
  35391734: FILES:file open command 0 file STAR220.PIC
  35771776: FILES:file open command 0 file STAR400.PIC
  37563673: FILES:file open command 0 file STAR000.PIC
  41174719: FILES:file open command 0 file STAR800.PIC
  41195646: FILES:file open command 2 file RESTART.DAT
  69015087: SBLASTER:DMA unmasked,starting output, auto 1 block 2047
  69224460: SBLASTER:Raising IRQ
  69444582: SBLASTER:Raising IRQ
  69674713: SBLASTER:Raising IRQ
  69904075: SBLASTER:Raising IRQ
  70481783: FILES:file open command 0 file STAR200.PIC
  75455670: FILES:file open command 0 file STAR220.PIC
  75863420: FILES:file open command 0 file STAR400.PIC
  77734434: FILES:file open command 0 file STAR000.PIC
  81504589: FILES:file open command 0 file STAR800.PIC
  81529665: FILES:file open command 0 file RESTART.DAT
 101593877: FILES:file open command 0 file .\Q\STAR905.Q
 102020600: SBLASTER:DMA unmasked,starting output, auto 1 block 2047
 102231354: SBLASTER:Raising IRQ
 102471391: SBLASTER:Raising IRQ
 102701423: SBLASTER:Raising IRQ
...

The ... indicates more lines of the same (i.e. the Sound Blaster raising an IRQ, which in this case just means audio is playing). What I found was that Star Control 3's Q files, which are opened before a video plays (STAR909.Q, STAR923.Q, STAR905.Q), also contain WAVE headers. STAR905.Q seems to be the first intro video; earlier in this discussion I posted a link to its audio portion as extracted from STAR001.VOC. I poked around in some of the other Q files, and not all of them contain WAVE headers. The ones that do will extract properly with MRIP, but the data they contain is useless as-is. BAH! More riddles.
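
For anyone who wants to check their own Q files, here is a rough Python sketch of how such embedded headers could be located. It is not what MRIP or the game does; it simply assumes each embedded clip begins with a canonical 'RIFF' ... 'WAVE' header immediately followed by a 'fmt ' chunk. The names find_wave_headers and make_header, and the synthetic test data, are made up for illustration.

```python
import struct


def find_wave_headers(blob):
    """Return (offset, format_tag, channels, sample_rate, bits) per RIFF/WAVE header found."""
    hits = []
    pos = blob.find(b"RIFF")
    while pos != -1:
        # Assumed layout: 'RIFF', u32 size, 'WAVE', 'fmt ', u32 fmt size, 16 bytes of fmt fields.
        if blob[pos + 8:pos + 16] == b"WAVEfmt " and pos + 36 <= len(blob):
            tag, chans, rate, _avg, _align, bits = struct.unpack_from("<HHIIHH", blob, pos + 20)
            hits.append((pos, tag, chans, rate, bits))
        pos = blob.find(b"RIFF", pos + 4)
    return hits


def make_header(rate, bits, channels=1, data_len=0):
    """Build a 44-byte PCM WAV header, used here only to produce synthetic test input."""
    block_align = channels * bits // 8
    fmt = struct.pack("<HHIIHH", 1, channels, rate, rate * block_align, block_align, bits)
    return (b"RIFF" + struct.pack("<I", 36 + data_len) + b"WAVE"
            + b"fmt " + struct.pack("<I", 16) + fmt
            + b"data" + struct.pack("<I", data_len))


if __name__ == "__main__":
    # Two fake headers separated by filler bytes stand in for a real Q file.
    blob = b"\x00" * 10 + make_header(22050, 8) + b"junk" + make_header(11025, 16)
    for hit in find_wave_headers(blob):
        print(hit)
```

To run it against a real file, read the file in binary mode and pass the bytes to find_wave_headers. A hit only says that a header is present; it says nothing about how the data after it is encoded.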

What we do now is watch the ASM instructions to see what SC3.EXE does with this data. While we can see the code accessing the sound driver (SB16.DIG) in the listing above, I can't tell if, or when, DOSBox's debugger is displaying the instructions executed by the driver, though we can probably assume that most of the "SBLASTER:" entries are coming from it. This may not be particularly relevant to the decoding process anyway, but I thought it was worth mentioning. I do know that SoftICE will tell you when it switches to a driver's code as soon as the driver is called, though I haven't had the time to set up a box for this yet. With luck, the calls will be documented (right, haha) and our solution will be found. I'm not even going to try to estimate the probability of this, heh.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Bisqwit on September 24, 2013, 12:50:33 am
Well, the previous three messages in this thread are all spam (promoting a product without really checking whether it actually solves anything discussed in the thread), and the last relevant post was from 2006.

Once upon a time, in 1998, I also worked on this problem, similarly without much success at cracking the speech compression.

Here's my Turbo Pascal program (from 1998) that extracts the wav files from the voc file, and reports some statistics on how much they were compressed.
Code:
{$M 2048,0,68096}
Uses Objects, Strings, WinDos, Crt;

Const
  AppTxt: Array[0..63]Of Char = '<APP BYTES>';

Var
  s, s2: Array[0..63]Of Char;
  Buf: Array[0..16383]Of Char;
  Bread: Word;

  f, t: TBufStream;

  Count: Word;
  a, b: Word;

  Size2, Size3, Total, Size, Posi: LongInt;

  Ratio2, Ratio3, Ratio: Real;
  Best2, Best3, Signa: Word;

  C10: Word; C10A, C10B: Real;
  C11: Word; C11A, C11B: Real;

Var
  wFormatTag,
  nChannels: Word;
  nSamplesPerSec,
  nAvgBytesPerSec: LongInt;
  nBlockAlign,
  Bits: Word;

Begin
  f.Init('e:\pelit\sc3\star001.voc', stOpenRead, $FE00);

  { The archive starts with a 16-bit clip count }
  f.Read(Count, 2);

  WriteLn('Voice clip count: ', Count);

  Total := 0;

  Ratio2 := 0;
  Ratio3 := 100;

  For a := 2900 To Count-1 Do
  Begin
    { The 32-bit offset of entry a is at 2 + a*4; Size2 receives the next entry's offset }
    f.Seek(2 + a*4);
    f.Read(Posi, 4);
    f.Read(Size2, 4);
    If Posi = 0 Then Continue;

    { Repeat f.Read(Size2, 4)Until Size2 <> 0;
      Dec(Size2, Posi);
    }
    If Size2>0 Then Dec(Size2, Posi);
    If(Size2 < 30000)Or(a >= 3000)Then
    Begin
      f.Seek(Posi);
      f.Read(Signa, 2);
      f.Read(Size, 4)
    End
    Else
      Size := Size2;

    If KeyPressed Then Break;

    Write(a:4,': ',Size:5,' bytes'#13);
    {If(Size >= 30000)And(Signa <> 0)Then Continue;}

    If a < 3000 Then
    Begin
      f.Seek(Posi);
      f.Read(Signa, 2);
      f.Read(Size, 4)
    End;

    Str(a+1:4, s2);
    strcpy(s, 'd:\sc3_wav\sc3_');
    For b := 0 To 3 Do If s2[b]=' ' Then s2[b] := '0';
    strcat(s, s2);

    If Signa=32 Then strcat(s, '.wac')
    Else If Signa= 0 Then strcat(s, '.wav')
    Else
      strcat(s, '.___');

    Inc(Total, Size);

    TextAttr := 15;
    Write(s,': ');

    { Skip clips that were already extracted with the expected size }
    t.Init(s, stOpenRead, 512);
    If(t.Status = stOk)And(t.GetSize = Size)Then
    Begin
      t.Done;
      Continue
    End;
    t.Done;
    t.Init(s, stCreate, 512);
    t.Done;
    t.Init(s, stOpen, 2048);
    t.Seek(0);
    Size2 := Size;

    Write(#8#8' Size:',Size:7);

    C10 := 0;
    C11 := 0;

    { Copy the clip to disk in buffer-sized pieces, counting two byte values on the way }
    Repeat
      BRead := SizeOf(Buf);
      If BRead > Size Then BRead := Size;

      f.Read(Buf, BRead);

      For b := 0 To BRead-1 Do
      Begin
        If Buf[b] = '' Then Inc(C10);
        If Buf[b] = '' Then Inc(C11)
      End;

      t.Write(Buf, BRead);

      Dec(Size, BRead)
    Until Size <= 0;

    t.CopyFrom(f, Size);

    Size3 := t.GetPos;
    { t.Write(AppTxt, strlen(AppTxt));
      t.CopyFrom(f, Size2 - Size - 6);}

    WriteLn('   Free: ',DiskFree(Ord(UpCase(s[0]))-Ord('A')+1));

    TextAttr := 7;

    If(Signa = 32)Or(Signa=0)Then
    Begin
      t.Seek(20);      {'RIFF', size, 'WAVE', 'fmt ', size}
      t.Read(wFormatTag,     2);
      t.Read(nChannels,      2);
      t.Read(nSamplesPerSec, 4);
      t.Read(nAvgBytesPerSec,4);
      t.Read(nBlockAlign,    2);
      t.Read(Bits, 2);
      t.Read(Size, 4); {'data'}
      Size2 := Size3;
      t.Read(Size, 4);
      TextAttr := 3; Write(' FORMAT ');
      TextAttr := 9; Write(wFormatTag);
      TextAttr := 3; Write(': ');
      TextAttr := 9;
      If nChannels=1 Then Write('MONO')
      Else If nChannels=2 Then Write('STEREO')
      Else Write('Chans: ',nChannels);

      TextAttr := 9; Write(' ',nSamplesPerSec/1000:0:2,' kHz ',Bits);
      TextAttr := 3; WriteLn(' bits');

      TextAttr := 3; Write(' Size: ');
      TextAttr := 9; Write(Size:6);
      TextAttr := 3; Write(' File: ');
      TextAttr := 9; Write(Size2:6);
      TextAttr := 3; Write(' Ratio: ');
      { Stored size as a percentage of the data size declared in the WAV header }
      Ratio := Size2*100/Size;
      TextAttr := 9; Write(Ratio:0:1);
      If Ratio > Ratio2 Then Begin Ratio2 := Ratio; Best2 := a; C10A := C10*100.0/Size; C11A := C11*100.0/Size End;
      If Ratio < Ratio3 Then Begin Ratio3 := Ratio; Best3 := a; C10B := C10*100.0/Size; C11B := C11*100.0/Size End;
      TextAttr := 3; WriteLn('% of original');
      If Signa=0 Then WriteLn('*** NOT COMPRESSED, NOTE!')
    End;

    t.Done
  End;

  TextAttr := 7;
  WriteLn('LEAST COMPRESSED ITEM WAS NUMBER',Best2:5,' (',Ratio2:0:1,'% OF ORIGINAL, :',C10A:0:4,'%, :',C11A:0:4,'%)');
  WriteLn(' MOST COMPRESSED ITEM WAS NUMBER',Best3:5,' (',Ratio3:0:1,'% OF ORIGINAL, :',C10B:0:4,'%, :',C11B:0:4,'%)');

  f.Done
End.

From what I deduced back then, the compressed files still contained bits of recognizable audio, which suggests that either the data is not entirely in one format or the compression includes LZ-type elements.
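
For readers without a Turbo Pascal setup, the index walk in the Pascal program above can be sketched in Python roughly as follows. This only mirrors what that program reads: a 16-bit clip count, a table of 32-bit offsets starting at byte 2, and at each offset a 16-bit signature (32 for compressed, 0 for uncompressed) followed by a 32-bit size and the WAV data. It is not a verified description of the archive format, and it does not decode anything. The function names are made up for illustration, and stored_ratio additionally assumes a plain 44-byte WAV header with the data-chunk size at byte 40.

```python
import io
import struct


def read_entries(f):
    """Yield (index, offset, signature, size) for each non-empty slot in the index."""
    (count,) = struct.unpack("<H", f.read(2))
    offsets = struct.unpack("<%dI" % count, f.read(4 * count))
    for i, offset in enumerate(offsets):
        if offset == 0:
            continue  # the Pascal program skips zero offsets too
        f.seek(offset)
        signature, size = struct.unpack("<HI", f.read(6))
        yield i, offset, signature, size


def extension_for(signature):
    """Same naming rule as the Pascal program: 32 -> .wac, 0 -> .wav, anything else -> .___"""
    return {32: ".wac", 0: ".wav"}.get(signature, ".___")


def stored_ratio(payload):
    """Stored size as a percentage of the data size declared in the WAV header."""
    (declared,) = struct.unpack_from("<I", payload, 40)  # assumes a 44-byte header
    return len(payload) * 100.0 / declared


if __name__ == "__main__":
    # Synthetic two-slot archive (slot 0 empty) so the sketch runs without the game data.
    clip = b"not really audio"
    archive = struct.pack("<HII", 2, 0, 10) + struct.pack("<HI", 32, len(clip)) + clip
    for index, offset, signature, size in read_entries(io.BytesIO(archive)):
        print(index, offset, extension_for(signature), size)
```

With the real archive you would open STAR001.VOC in binary mode and pass the file object to read_entries. The payload of each entry is the size bytes that follow its 6-byte entry header.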


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: meep-eep on June 13, 2015, 10:19:36 pm
For future readers: the 'previous three messages in this thread' that Bisqwit referred to no longer exist.
I have removed them, along with the later spam postings.


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Scalare on October 20, 2016, 04:45:19 pm
Quote from: meep-eep on June 13, 2015, 10:19:36 pm
For future readers; the 'previous three messages in this thread' that Bisqwit referred to, no longer exist.
I have removed them, along with the later spam postings.

Their permission to exist has been revoked!


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Death 999 on October 27, 2016, 04:42:02 am
If they can keep the writing up, they're in good shape - only like one line of dialog and it's a memetic-level hit!


Title: Re: Star Control 3 STAR001.VOC archive WAV file format?
Post by: Wcarr_ on December 24, 2016, 04:02:00 pm
I don't have the slightest idea! I am good at programming (kind of), but all of that still makes no sense to me.