Ask HN: Recovering an audio disk ripped in an unknown format
3 points by ohnoesjmr on Dec 29, 2018 | 7 comments
Hello HN,

Wondering if some experts can help me.

Years ago, I went to a car audiophile meetup somewhere in North London, and while listening to one of the guys' setups I heard a very interesting set of songs on a CD. I took down his details, later got in touch, and asked him to clone the CD and post it to me for a few, so I could have it for myself. When it arrived, I thought I should rip the CD in case I lost it, as it was a great selection of songs for benchmarking audio system quality in general. I recall being told it was one of the European Mobile Music Association (EMMA) test disks, but I don't think that's likely, because those usually have a lot of chatter explaining to the judges which test chapter they are currently on.

Anyway, six or seven years later, I got a new stereo at home, and I thought it would be great to hear what that set of songs sounds like on the new system.

I dug up the image file in my Google Drive, which claims to be in MDF format, tried to mount it, and to my disappointment it did not work. Tried renaming it to .iso, .bin, .cue, .nrg, yet none of these seem to work.

I don't recall exactly what I used to rip it, but I suspect it was Nero Burning ROM or something similar.

The only recollection I have is that one of the songs on the CD (number 2 or 3) was Club For Five - Brothers in Arms (https://www.youtube.com/watch?v=g55JlxWYo8Q)

This is where I am looking for expert help: perhaps someone who knows enough about audio encoding can recover the songs from the disk image.

Disk image available at:

https://drive.google.com/file/d/0B8u8uTexrWiSSjJYX1gxTllEamM/view?usp=sharing



A quick look in a hex editor shows a familiar little-endian 16-bit PCM format: 2352-byte chunks of audio data, each followed by 96 bytes of non-PCM data, repeating every 2448 bytes. This is characteristic of raw CD digital audio (CD-DA) sectors stored with sub-channel data.
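The block layout described above implies some simple arithmetic that can be used to sanity-check the file before converting it. The sketch below is only an illustration: the function names are made up for this example, and it assumes the whole file consists of 2448-byte blocks with no extra header or footer. It relies on the standard CD-DA rate of 75 sectors per second (2352 bytes is 588 stereo 16-bit samples, and 588 x 75 = 44100).

```c
#include <stdint.h>

/* Sizes taken from the comment above: 2352 audio bytes + 96 subchannel bytes.
   SECTORS_PER_SECOND is the standard CD-DA sector rate. */
enum {
  AUDIO_BYTES = 2352,
  SUBCHANNEL_BYTES = 96,
  BLOCK_BYTES = AUDIO_BYTES + SUBCHANNEL_BYTES,
  SECTORS_PER_SECOND = 75
};

/* Number of whole 2448-byte blocks in a file of the given size. */
uint64_t block_count(uint64_t file_size) {
  return file_size / BLOCK_BYTES;
}

/* Nonzero if the size is an exact multiple of the block size.
   Assumption: the image has no extra header or footer. */
int is_block_aligned(uint64_t file_size) {
  return file_size % BLOCK_BYTES == 0;
}

/* Approximate playing time in whole seconds. */
uint64_t playing_seconds(uint64_t file_size) {
  return block_count(file_size) / SECTORS_PER_SECOND;
}
```

If `is_block_aligned` returns zero for the real file size, the 2448-byte guess is probably wrong, or the image carries extra data somewhere.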

I think .MDF files started with the Alcohol 120% program and are typically the raw data from a CD/DVD, with the track information in a corresponding .MDS file.

Not sure of a free/open program to directly convert to a familiar format, so I made this C program:

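  #include <stdio.h>
  #define BLOCK_SIZE 2448  /* 2352 bytes of audio + 96 bytes of subchannel */
  #define DATA_SIZE 2352   /* audio payload at the start of each block */
  int main(void) {
    char buf[BLOCK_SIZE];
    /* fread() returns 1 only when a complete block was read, so short
       reads are handled and a truncated trailing block is dropped */
    while (fread(buf, BLOCK_SIZE, 1, stdin) == 1) {
      /* keep the audio, skip the 96 subchannel bytes */
      if (fwrite(buf, DATA_SIZE, 1, stdout) != 1) return 1;
    }
    return 0;
  }
Compile it and convert to raw CDDA audio with: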

  cc -o convert convert.c
  ./convert < "Audio Disk.mdf" > out.cdda
Then use 'sox' (open source audio processing program) to convert CDDA to .wav :

  sox -c 2 -b 16 -r 44100 --endian little out.cdda out.wav
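If sox is not at hand, the same result can be had by prepending a 44-byte WAV header to the raw audio, since the samples are already little-endian 16-bit stereo PCM at 44100 Hz. The following is a minimal sketch of building that header; `build_wav_header` is a name invented for this example, and it assumes the audio data is smaller than 4 GiB (always true for a CD).

```c
#include <stdint.h>
#include <string.h>

/* Store values in little-endian order regardless of host byte order. */
static void put_le16(unsigned char *p, uint16_t v) {
  p[0] = (unsigned char)(v & 0xff);
  p[1] = (unsigned char)(v >> 8);
}

static void put_le32(unsigned char *p, uint32_t v) {
  p[0] = (unsigned char)(v & 0xff);
  p[1] = (unsigned char)((v >> 8) & 0xff);
  p[2] = (unsigned char)((v >> 16) & 0xff);
  p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Fill hdr with a canonical PCM WAV header for CD audio:
   2 channels, 16 bits per sample, 44100 Hz.
   data_bytes is the size of the raw audio that will follow. */
void build_wav_header(unsigned char hdr[44], uint32_t data_bytes) {
  const uint16_t channels = 2, bits = 16;
  const uint32_t rate = 44100;
  const uint16_t block_align = (uint16_t)(channels * bits / 8);

  memcpy(hdr, "RIFF", 4);
  put_le32(hdr + 4, 36 + data_bytes);      /* RIFF chunk size */
  memcpy(hdr + 8, "WAVEfmt ", 8);
  put_le32(hdr + 16, 16);                  /* fmt chunk size */
  put_le16(hdr + 20, 1);                   /* format tag 1 = PCM */
  put_le16(hdr + 22, channels);
  put_le32(hdr + 24, rate);
  put_le32(hdr + 28, rate * block_align);  /* bytes per second */
  put_le16(hdr + 32, block_align);
  put_le16(hdr + 34, bits);
  memcpy(hdr + 36, "data", 4);
  put_le32(hdr + 40, data_bytes);
}
```

Writing these 44 bytes followed by the contents of out.cdda should give a file that ordinary players accept.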
I'm not sure if the track length information is recoverable.

Update: mdf2iso also seems to work fine for converting the .mdf to raw .cdda format: mdf2iso "Audio Disk.mdf" out.cdda


On further thought, the subcode 'P' channel can be used to identify the start of each track, and I verified that it is present in the .mdf file. However, I didn't find a way to convince mdf2iso to output a TOC/CUE, or to convince cdda2wav or cdparanoia to read from a file rather than a CD drive. So I made a utility to split a CDDA .mdf file into WAV files: https://github.com/matja/mdf2wav

The .mdf file probably also contains the subcode 'Q' channel, which can be used to identify the CD and get the track names using cddb (cdda2wav/cdparanoia can do this, but apparently only from a physical CD drive).
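As a rough illustration of the P-channel idea (this is not the code from mdf2wav, and the function names are invented here), the sketch below reads the P flag of one sector. It assumes the 96 subchannel bytes are stored in the raw interleaved layout, where each byte carries one bit for each of the channels P to W and P is the most significant bit. If the image stores the channels de-interleaved instead, the first 12 bytes would hold P and this code would not apply. On a standard disc, P is set during the pause before a track and clear during the track, so a 1-to-0 transition marks a track start.

```c
#include <stddef.h>

enum { SUBCHANNEL_BYTES = 96 };

/* Return 1 if the P flag is set for this sector, 0 otherwise.
   Assumption: raw interleaved subchannel, P in bit 7 of every byte.
   A majority vote tolerates a few corrupted bits. */
int p_flag(const unsigned char sub[SUBCHANNEL_BYTES]) {
  int ones = 0;
  for (size_t i = 0; i < SUBCHANNEL_BYTES; i++) {
    ones += (sub[i] >> 7) & 1;
  }
  return ones > SUBCHANNEL_BYTES / 2;
}

/* A track starts where P falls from 1 (pause) to 0 (audio). */
int is_track_start(int prev_p, int cur_p) {
  return prev_p == 1 && cur_p == 0;
}
```

Calling `p_flag` on the last 96 bytes of every 2448-byte block and recording the sector numbers where `is_track_start` fires would give approximate split points.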
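To make the Q-channel remark concrete, here is a hedged sketch of pulling the track number out of one sector's subchannel data. It assumes the same raw interleaved layout (Q in bit 6 of each of the 96 bytes), which has not been confirmed for this particular file, and the function names are hypothetical. In a position ("mode 1") Q frame the low nibble of the first byte is the ADR value 1 and the second byte is the track number in BCD; the trailing two CRC bytes are not checked here.

```c
#include <string.h>

/* Collect bit 6 (Q) of each raw subchannel byte into 12 bytes, MSB first.
   Assumption: raw interleaved subchannel layout. */
void extract_q(const unsigned char sub[96], unsigned char q[12]) {
  memset(q, 0, 12);
  for (int i = 0; i < 96; i++) {
    if (sub[i] & 0x40) {
      q[i / 8] |= (unsigned char)(0x80 >> (i % 8));
    }
  }
}

/* Decode a two-digit BCD byte, e.g. 0x12 -> 12. */
int bcd_to_int(unsigned char b) {
  return (b >> 4) * 10 + (b & 0x0f);
}

/* Track number for a mode-1 Q frame in the program area, or -1 for
   other frame types, lead-in (0x00) and lead-out (0xAA).
   The CRC in q[10..11] is deliberately not verified in this sketch. */
int q_track_number(const unsigned char q[12]) {
  if ((q[0] & 0x0f) != 1) return -1;
  if (q[1] == 0x00 || q[1] == 0xAA) return -1;
  return bcd_to_int(q[1]);
}
```

A change in the returned track number from one sector to the next is another way to find track boundaries, and the absolute times in the same frames could be used to rebuild a table of contents for a cddb lookup.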


You are just reading blocks of 2448 bytes and writing blocks of 2352 bytes. Does this just remove some kind of "footer"/crc/whatever?


Yes, in the .mdf format there is usually subchannel information after every sector. For Red Book (CDDA) audio, each sector is 2352 bytes of audio data followed by 96 bytes of subchannel data, so this is just skipping the subchannel data.


Most formats include some kind of header. Try to look at the first 8-16 bytes of the file, and hope that Google can find a site about them.


Yeah, first few hundred bytes are all zeroes.


First 8 non-zero bytes?

Last 8 bytes? (IIRC the .zip file has the "header" at the end. I doubt this is a .zip file, but a small number of formats have the header at the end.)

Have you tried listening to the file as a raw wave? (perhaps with 1/2/4 bytes per sample and 1/2 channels)
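Finding the first non-zero bytes is easy to script. A minimal sketch, with a function name made up for this example, that scans a buffer already read from the file:

```c
#include <stddef.h>

/* Offset of the first non-zero byte in buf, or -1 if all len bytes are zero. */
long first_nonzero(const unsigned char *buf, size_t len) {
  for (size_t i = 0; i < len; i++) {
    if (buf[i] != 0) return (long)i;
  }
  return -1;
}
```

For raw PCM, a long run of leading zeroes may simply be digital silence rather than padding before a header, so an offset that is large and followed by noisy-looking data hints at headerless audio.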



