
Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Wed Oct 21, 2009 4:23 pm
by Feltzkrone
EDIT: Okay, there are still many things left to clarify - it seems the guide proposal is partly just wrong, as Fireball points out here. I haven't fixed it yet, but I will rewrite parts of it (or maybe start all over again, as Rocknroms doesn't find it easy to read) once everything is clear.

---- Original text starting here ----

After having read what's going on in the thread "data inside pregap?", I believe we badly need a proper step-by-step ripping guide for producing correct dumps of CDs which have a pregap in front of Track 2.

The guide aims at producing dumps whose Track 2 binary file contains the pregap data sectors in scrambled form, as taken from the CD. From what I have understood so far, this is how F1ReB4LL would like such CDs to be preserved. Here is my proposal for such a guide. It involves the following tools:
Exact Audio Copy, IsoBuster, CDToImg (D8 hack), Px_D8 and HxD (any other hex editor properly used should be OK, too)

This proposal is built upon a notional example with easy-to-follow values. I don't know if this is the preferred style of explaining things, but at least it is mine.

It would be nice if technically versed persons such as F1ReB4LL read this guide, confirmed that it's correct and judged whether it is easy to follow. If something is described wrongly, please post corrections, and please post additions if you think that something important is missing.


---- "Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap" - proposal in original form ----


1. Getting knowledge of how the pregap is made up

We have a CD-ROM with one data track and two audio tracks. It has the following TOC.

Code:

Track 1 - Data  - LBA 0
Track 2 - Audio - LBA 10500, Pregap 00:03.00 (225 sectors)
Track 3 - Audio - LBA 21000, Pregap 00:02.00 (150 sectors)

We first try to determine the combined offset using the sector viewer of either IsoBuster or CDTool. While browsing sectors, we always keep in mind that whenever we see unexpected data or get read errors, we should go some sectors forward and then back to the wanted sector - or first go backward some sectors and then forward to the sector in question.

We subtract 225 (as the pregap is 03.00) from 10500 and seek to sector 10275. The sector looks like a normal data sector (easily recognized by its sync mark - 00 FF FF FF...). So we go forward in the sector viewer to find the first sector that partially contains scrambled bytes, and we finally find 408 scrambled bytes at sector 10350 (the last 408 bytes of the previous data sector in scrambled form).

At this point we know the following: The first 75 sectors of pregap are data sectors, the remaining 150 sectors are audio sectors. The combined offset which will be used to read the audio tracks is +408 bytes / 4 bytes per sample = +102 samples.
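
The arithmetic of this step can be sketched in Python as follows. This is only an illustration of the notional example; the function names are made up for the sketch and do not belong to any of the tools mentioned.

```python
SECTOR_SIZE = 2352       # raw bytes per CD sector
BYTES_PER_SAMPLE = 4     # 16-bit stereo audio sample


def combined_offset_samples(scrambled_bytes: int) -> int:
    """Convert the number of scrambled bytes seen in the first
    'audio' sector into a combined offset in samples."""
    if scrambled_bytes % BYTES_PER_SAMPLE:
        raise ValueError("byte count is not a whole number of samples")
    return scrambled_bytes // BYTES_PER_SAMPLE


def pregap_layout(track_lba: int, pregap_sectors: int, first_scrambled_lba: int):
    """Return (pregap start LBA, data sectors, audio sectors) of the pregap."""
    pregap_start = track_lba - pregap_sectors
    data_sectors = first_scrambled_lba - pregap_start
    audio_sectors = pregap_sectors - data_sectors
    return pregap_start, data_sectors, audio_sectors


print(combined_offset_samples(408))        # 102
print(pregap_layout(10500, 225, 10350))    # (10275, 75, 150)
```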


2. Doing things as they are normally done

Both audio tracks are read using EAC and the combined read offset of +102. We keep in mind that the first 75 sectors of Track 2 dump must be considered as containing undefined garbage. After that we dump the data track using IsoBuster. We remove the last 225 sectors from Track 1 dump using HxD. Number of bytes to remove is 225 sectors * 2352 bytes per sector = 529200 bytes.

For that purpose we open the Track 1 binary file in HxD, jump to its end by pressing Ctrl+End and choose Edit / Mark block from the menu. In the dialog we first switch from hex to dec so our input is interpreted as decimal values. We then take the value shown as the start address (which is actually the size of the file), subtract the number of bytes to remove from it, enter the result as start address and hit OK. Pressing the Del key removes the block, and we can now save the file.
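
For those who prefer a script over the hex editor, the same truncation could look roughly like this. It is a minimal sketch, not a tested replacement for the HxD procedure; the function name is hypothetical, and it should be run on a copy of the Track 1 file because it modifies the file in place.

```python
import os

SECTOR_SIZE = 2352  # raw bytes per CD sector


def truncate_trailing_sectors(path: str, sectors: int) -> int:
    """Remove the last `sectors` raw sectors from the file and
    return the new file size in bytes."""
    cut = sectors * SECTOR_SIZE
    size = os.path.getsize(path)
    if cut > size:
        raise ValueError("file is shorter than the block to remove")
    new_size = size - cut
    with open(path, "r+b") as f:
        f.truncate(new_size)
    return new_size


# Example call (file name is an assumption):
# truncate_trailing_sectors("Track 1.bin", 225)   # removes 529200 bytes
```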

At this point we have already partially finished the dump: The data track dump is correct and all audio track dumps are correct, but Track 2 is still not proper due to the mixed-mode pregap.


3a. Getting a dump of Track 2 as is present on CD: Deleting the current pregap data in Track 2 dump

The first thing we want to do is getting rid of the undefined pregap data contained in Track 2 dump made with EAC. We start by using HxD (or any other hex editor) and opening the Track 2 binary file. We then calculate the length of the pregap in bytes as follows:
Pregap length = number of pregap sectors * 2352 bytes per sector

On this particular disc the formula works as follows:
Pregap length = 225 sectors * 2352 bytes per sector = 529200 bytes

In HxD we now mark the block (Edit / Mark block) using 0 as the start address and the calculated pregap length as the length (after having selected decimal value format in the dialog). We now delete the block (by pressing del key) and don't close or save the file yet - just leave it open and modified as is for now.


3b. Getting a dump of Track 2 as is present on CD: Extracting real pregap data from CD

This step involves the usage of the D8 hack of CDToImg, Px_D8 and a drive capable of processing D8 vendor read commands (such as real Plextor models).

We use Px_D8 to determine the combined offset for the drive used (which is different to the one used in the previous steps, but doesn't matter at all). The combined offset is +30 samples = 120 bytes (or 0x78 in hex). Then we use CDToImg and make a full dump (we choose dump.raw as filename) which contains all data as is present on disc, i.e. all data sectors in this dump are present in scrambled form.

We now calculate the start address of the block/range containing the pregap data in the raw dump as follows:
Start address = (Track 2 LBA - number of pregap sectors) * 2352 bytes per sector + combined offset in bytes

In the current case this leads to the following value:
Start address = (10500 - 225) * 2352 + 120 = 24166920
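
The same formula as a small Python sketch, assuming (as the formula itself does) that the raw dump starts at LBA 0. The function name is made up for illustration.

```python
SECTOR_SIZE = 2352       # raw bytes per CD sector
BYTES_PER_SAMPLE = 4     # 16-bit stereo audio sample


def pregap_start_address(track_lba: int, pregap_sectors: int, offset_samples: int) -> int:
    """Byte address of the first pregap byte inside the raw full-disc dump."""
    offset_bytes = offset_samples * BYTES_PER_SAMPLE
    return (track_lba - pregap_sectors) * SECTOR_SIZE + offset_bytes


print(pregap_start_address(10500, 225, 30))   # 24166920
```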

We now use HxD to copy the wanted block to clipboard. We open the raw dump file (dump.raw) and again mark the wanted block/range using Edit / Mark block. This time we use the calculated start address as start address for this dialog and reuse the previously (step 3a) calculated pregap length as length. Again, before entering the values we check that decimal number format is selected.

Now in HxD we copy the selected block of dump.raw, then change to the still opened Track 2 binary file and insert the copied block right before the very first byte of it. We are now ready to save Track 2 binary file and so we do.
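
Steps 3a and 3b together amount to replacing the first 529200 bytes of the Track 2 file with a block cut out of dump.raw. A rough Python sketch of that splice is shown below; the function name and the idea of writing to a separate output file are assumptions of the sketch, not part of the guide.

```python
import shutil


def replace_pregap(track2_path: str, raw_dump_path: str, out_path: str,
                   start_address: int, pregap_bytes: int) -> None:
    """Write a new Track 2 file: pregap taken from the raw dump,
    followed by everything after the old pregap of the EAC dump."""
    with open(raw_dump_path, "rb") as raw:
        raw.seek(start_address)
        pregap = raw.read(pregap_bytes)
    if len(pregap) != pregap_bytes:
        raise ValueError("raw dump is too short for the requested block")

    with open(track2_path, "rb") as src, open(out_path, "wb") as dst:
        src.seek(pregap_bytes)          # step 3a: skip the old pregap
        dst.write(pregap)               # step 3b: insert the real pregap
        shutil.copyfileobj(src, dst)    # copy the rest of Track 2


# Example call with the values of this guide (file names are assumptions):
# replace_pregap("Track 2.bin", "dump.raw", "Track 2 fixed.bin", 24166920, 529200)
```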


4. Recapitulation of what we have done
  • we extracted all tracks as usual using IsoBuster and EAC
  • we truncated the data track as usual because we had a pregap at Track 2 (audio)
  • we made a full disc dump with CDToImg
  • we removed the (wrongly read) pregap data from Track 2 binary file
  • we then extracted the correct pregap data from the full disc dump
  • we inserted the correct pregap before the remaining Track 2 binary data
In effect we replaced the bad pregap data in Track 2 (which has been read by EAC) with proper pregap data which has been read using CDToImg. That proper pregap data contains several data sectors in scrambled form (as on CD).
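
To check that the inserted pregap really starts with scrambled data sectors, a sector can be descrambled and its header compared with the expected address. The sketch below uses the standard CD-ROM scrambler (ECMA-130 Annex B: 15-bit shift register, polynomial x^15 + x + 1, preset to 1, applied to bytes 12 to 2351); the helper names are made up for illustration.

```python
SECTOR_SIZE = 2352
SYNC = bytes([0x00] + [0xFF] * 10 + [0x00])


def scramble_table(length: int = SECTOR_SIZE - 12) -> bytes:
    """Generate the scrambler byte sequence (starts 01 80 00 60 ...)."""
    reg, out = 0x0001, bytearray()
    for _ in range(length):
        byte = 0
        for bit in range(8):
            byte |= (reg & 1) << bit
            feedback = (reg ^ (reg >> 1)) & 1
            reg = (reg >> 1) | (feedback << 14)
        out.append(byte)
    return bytes(out)


TABLE = scramble_table()


def toggle_scrambling(sector: bytes) -> bytes:
    """Scramble or descramble one raw sector (the operation is its own inverse)."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a 2352-byte raw sector")
    body = bytes(b ^ k for b, k in zip(sector[12:], TABLE))
    return sector[:12] + body


def expected_header(lba: int) -> bytes:
    """BCD minute/second/frame address stored right after the sync mark."""
    minutes, rest = divmod(lba + 150, 75 * 60)
    seconds, frames = divmod(rest, 75)
    bcd = lambda n: ((n // 10) << 4) | (n % 10)
    return bytes([bcd(minutes), bcd(seconds), bcd(frames)])
```

For the example disc, the first 2352 bytes of the rebuilt Track 2 file should begin with the sync mark, and after toggle_scrambling() bytes 12 to 14 should equal expected_header(10275), i.e. 02 19 00.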


5. Handling EAC pregap detection oddities (*not sure if this should be included)

It might happen that EAC is unable to detect the correct pregap length and just returns 00:00.00. In exactly that case we need to use CDTool to browse the sectors, as (unlike IsoBuster) it is able to return subchannel data when viewing sector contents. When using the sector browser in CDTool, we always have to choose the subchannel reading mode "P-W RAW Interleaved (Code 001b)" and select "Deinterleave subch data".

At first we have to determine the drive's subchannel read offsets for both data and audio sectors. So we start the sector viewer and view sector 75 (a confirmed data sector). In the subchannel data area (the bottom data area) we look at the first four digits of the second line and see "0102", which should be read as 1 second, 2 sectors. We now know that the subchannel data the drive reports is 2 sectors off, i.e. we have a subchannel offset of +2, referred to in the following steps as DSO (data subchannel offset).

Next we take Track 2 LBA from TOC and add 75 (=> 10575) then view that sector. We now read "0101" so we know that the drive has a subchannel offset of +1 when reading audio sectors, in the following steps referred to as ASO for audio subchannel offset. Note that this offset generally can be different to the DSO (and is in this example).

Generally, DSO and ASO may also be negative. In this case you will find e.g. "0074" as a value in the second row of subchannel data, which reads as 0 seconds, 74 sectors and represents a subchannel offset of -1.

We now go to sector (Track 2 LBA - ASO - 2), in this case (10500 - 1 - 2 = 10497). Again we look at the second row in subchannel data and should find either "0001" or "0002" - and we find "0001". If additionally the last two digits of the first row in subchannel data are "00", there is a pregap on disc which EAC was unable to detect. In general, as long as you see the "00" at the end of the first line, it is an indicator for being in pregap.
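
How the DSO and ASO values follow from the four digits can be sketched as below. The sketch assumes, as in the example, that the viewed sector lies exactly 75 sectors after the start of its track, so the expected reading is "0100"; the function names are hypothetical.

```python
FRAMES_PER_SECOND = 75


def relative_frames(field: str) -> int:
    """Convert a four-digit 'SSFF' reading (seconds, sectors) into sectors."""
    seconds, frames = int(field[:2]), int(field[2:])
    return seconds * FRAMES_PER_SECOND + frames


def subchannel_offset(observed: str, expected_frames: int = 75) -> int:
    """Offset of the reported subchannel data relative to the viewed sector."""
    return relative_frames(observed) - expected_frames


print(subchannel_offset("0102"))   # 2   (DSO in the example)
print(subchannel_offset("0101"))   # 1   (ASO in the example)
print(subchannel_offset("0074"))   # -1  (a negative offset)
```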

Pregaps usually have lengths of 2, 3 or 4 seconds, in some cases these lengths are one, two or three sectors smaller or greater. These possible lengths should cover 99% of all cases, so instead of going backward sector by sector in the sector viewer, we go backward by 75 sectors. By the way: We do count the number of times we go backward 75 sectors. We now see "0101" which tells us that we still are in pregap and are one second ahead. As we want to find the beginning of the pregap we repeat this step.

Again we go back 75 sectors resulting in "0200" and we see a sync mark in the sector data in the text area above. We are still in the pregap, but now the pregap contains data sectors, so the formerly calculated DSO applies instead of ASO. That's why we get "0200" and not the expected value of "0201". To compensate for this, we calculate the difference from ASO to DSO as follows: +1 - (+2) = -1. That's the number of sectors we will go forward. In this case the number is negative, so we go one sector backward.

We now get "5357" which has nothing in common with the previously seen values. Additionally we don't see "00" anymore at the end of the first row, so it's an indicator for having left pregap => now being before pregap. We have gone backward 75 sectors for three times. That is 75 * 3 = 225 sectors.

The final step is moving forward (and counting) sector by sector until the last value in the first row becomes "00" again. We have to do this 2 times and now find "00" and in the second row "0274". Now we calculate the pregap length...

1. We have gone three times 75 backward => 225 sectors.
2. We have gone two times one sector forward => - 2 sectors => 223 sectors.
3. We started at Track 2 LBA - ASO - 2, so we skipped the last 2 sectors of pregap => + 2 sectors => 225 sectors

Just for proof we check our calculation by using the "0274" value that can be seen in subchannel data. It shall be read as 2 seconds, 74 sectors. Our first value was "0001" and guess what: We build the difference which is 2 seconds, 73 sectors. Then we add two sectors as we skipped the last 2 sectors of pregap and the circle closes => 3 seconds, 0 sectors.
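
Both ways of arriving at the pregap length can be written down as a short Python sketch. The function names are made up; the numbers are the ones from the walk-through above.

```python
FRAMES_PER_SECOND = 75


def frames(field: str) -> int:
    """Convert a four-digit 'SSFF' reading into sectors."""
    return int(field[:2]) * FRAMES_PER_SECOND + int(field[2:])


def pregap_from_steps(back_jumps: int, jump_size: int,
                      forward_steps: int, skipped_tail: int) -> int:
    """Pregap length from counting the moves in the sector viewer."""
    return back_jumps * jump_size - forward_steps + skipped_tail


def pregap_from_readings(first: str, last: str, skipped_tail: int) -> int:
    """Pregap length from the first and last subchannel readings."""
    return frames(last) - frames(first) + skipped_tail


print(pregap_from_steps(3, 75, 2, 2))             # 225
print(pregap_from_readings("0001", "0274", 2))    # 225
```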

Finally we go one sector forward and we should notice all those "00000..." values at the beginning of the first row in subchannel data change to "FFFFF...". That's the P-Channel indicating the end of the data track has been entered. The P-Channel usually switches to "FFFFF..." exactly one sector after the beginning of a pregap. So another indicator for having correctly determined the pregap length.

In this case we figured out that the pregap length is 03.00. As EAC left out the pregap completely when reading Track 2 we just skip step 3a and instead we have to manipulate the CUE file made with EAC and will replace "INDEX 01 00:00:00" with "INDEX 00 00:00:00 [CRLF] INDEX 01 00:03:00".

Please note that the remaining steps have to be performed as described. Step 5 is not meant as an addition to the regular dumping guide!

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Thu Oct 22, 2009 10:07 am
by Rocknroms
Your explanation is quite confusing, sorry. Moreover we have some other tools that can simplify the job.
Also about point 5, if you are referring to 2 tracks disc (1 data 1 audio) you don't have to do all this job to get real pregap.
I'll test what you say on point 5 as soon as possible to fit it in exceptions.

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Thu Oct 22, 2009 11:09 am
by F1ReB4LL
I repeat, there are CDs, where those sectors are marked as data in the subs (track is audio, sectors are data AND marked as data in the subs) - I insist on leaving them descrambled and this case suggests different handling (in this case you should extract the data part of the gap from clonecd dump skipping the 1st track - first 176400 bytes, first 352800 bytes or first 529200 bytes, etc., because data sectors rarely fill the _whole_ gap, then, you should extract the audio part of the gap and the track itself, skipping [1st_track_size + combined_offset_size_in_bytes + data_part_of_the_gap_size], then glue the both parts). Of course, subchannels analyzing is necessary in both cases.
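
One possible reading of the byte arithmetic described in this post, as a Python sketch. The function name and the example values (taken from the notional disc of the first post) are assumptions; which size to use for the data part has to come from subchannel analysis.

```python
SECTOR_SIZE = 2352  # raw bytes per CD sector


def gap_split_ranges(track1_bytes: int, data_gap_sectors: int,
                     combined_offset_bytes: int):
    """Return (data part start, data part length, audio part start)
    as byte positions inside the CloneCD image."""
    data_len = data_gap_sectors * SECTOR_SIZE
    data_start = track1_bytes
    audio_start = track1_bytes + combined_offset_bytes + data_len
    return data_start, data_len, audio_start


# Notional disc: Track 1 is 10275 sectors, 75 data sectors in the gap, +408 bytes
print(gap_split_ranges(10275 * SECTOR_SIZE, 75, 408))
# (24166800, 176400, 24343608)
```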

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Thu Oct 22, 2009 12:38 pm
by Feltzkrone
F1ReB4LL wrote:I repeat, there are CDs, where those sectors are marked as data in the subs (track is audio, sectors are data AND marked as data in the subs) - I insist on leaving them descrambled and this case suggests different handling
That's why I have asked you implicitly to review the guide: to clarify things. It doesn't matter to me if I (or somebody else) needs to rewrite big parts of it - it doesn't matter if a first proposal might be wrong. All I would like to see is a proper guide as the result, may it take months, I don't care... So let's use this thread as a discussion platform in order to clarify things and simply write down how to detect different cases, what analyses have to be made and finally how to store data in which form in the image. Once done I'll give it another try.

Ok, I understand that things will have to be done differently depending on how the data sectors are marked (as either data or audio), so I hope that I now got it right: If those data sectors are marked as data, keep them unscrambled in Track 2 binary. Is that correct now?

And what if those data sectors are marked as audio? (All following questions refer to audio-marked data sectors...)
1) Are they still in scrambled form on CD or might that depend on the mastering?
2) Does any drive unscramble them automatically if they are scrambled (= drive ignores that they are marked as audio)?
3) If they are not automatically unscrambled, does the factory write offset apply when reading them with READ CD commands, i.e. data is shifted when read, i.e. sync marks are not at the beginning of the returned sector data?
4) How should the data be kept in the image? Scrambled or unscrambled, or depending on certain circumstances?

When you are saying that subchannel data analyzing is necessary in both cases, isn't it that subchannel analysis is first necessary to distinguish both cases from each other and after that (again) necessary to figure out the number of sectors that actually are marked as data?

I'm convinced now that for those cases subchannel data inspection generally must be done. I still find it annoying to rely on CloneCD .sub files as their quality (remember Manic Karts) heavily depends on the quality of drives, i.e. as soon as a drive has changing subchannel data offsets when switching from data to audio sectors the resulting .sub file is crap when produced with CloneCD. But actually all you'd need is a drive just capable of reading pregap sectors.

How about a tool that just automates pregap (including its subchannel data) analyzing and prints the results - similar to Px_D8 which just prints the combined offset?

EDIT: If I may ask in all innocence - are F1ReB4LL and Rocknroms the only ones willing to discuss and clarify more complex cases like these or are other members, moderators and admins just very busy at the moment? (Don't get me wrong - no offense!)

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Thu Oct 22, 2009 2:06 pm
by F1ReB4LL
First of all, no need to use the bold symbols - I've marked some words from my post to notify all the newbies, that your guide isn't perfect and it's not that easy.
Feltzkrone wrote:And what if those data sectors are marked as audio? (All following questions refer to audio-marked data sectors...)
1) Are they still in scrambled form on CD or might that depend on the mastering?
Sure. Unscrambled sectors are very rare, but it happens sometimes (see [SS] Sakura Tsuushin entry, for example)
Feltzkrone wrote:2) Does any drive unscramble them automatically if they are scrambled (= drive ignores that they are marked as audio)?
When you dump a first data track, data sectors from pregap on its end will be descrambled (same for CloneCD dumps, etc.). Not sure what happens when you extract them as audio, though. EAC tweaks the first gap, excluding those sectors. PR extracts them totally wrong (not scrambled, not unscrambled, but screwed).
Feltzkrone wrote:3) If they are not automatically unscrambled, does the factory write offset apply when reading them with READ CD commands, i.e. data is shifted when read, i.e. sync marks are not at the beginning of the returned sector data?
Usual READ CD command should return them unscrambled, because they "belong" to the previous (data) track, according to the drive's firmware's logic, I've already explained this. Next track "officially" starts from the 01 index according to the TOC, 00 index belongs to the same track according to the subs, but following the TOC ignores this.
Feltzkrone wrote:4) How should the data be kept in the image? Scrambled or unscrambled, or depending on certain circumstances?
I repeat: in my opinion, all the sectors should be scrambled (even the data tracks), because that's how they are stored on CD. But in the current situation data sectors marked as audio should be scrambled, data sectors marked as data should be unscrambled. But in any case there should be a proper comment in the dump's entry describing all the abnormalities.
Feltzkrone wrote:When you are saying that subchannel data analyzing is necessary in both cases, isn't it that subchannel analysis is first necessary to distinguish both cases from each other and after that (again) necessary to figure out the number of sectors that actually are marked as data?
Yes, you should find the proper gaps in subs at first, then you should check the mode for the gap sectors (audio/data), then you should count a number of data sectors in the gap. Btw, some of the sectors of the gap may be marked as data in the subs and some - as audio (don't have any examples yet, but I can't exclude a possibility of this).
Feltzkrone wrote:How about a tool that just automates pregap (including its subchannel data) analyzing and prints the results - similar to Px_D8 which just prints the combined offset?
What results, exactly?
Feltzkrone wrote:EDIT: If I may ask in all innocence - are F1ReB4LL and Rocknroms the only ones willing to discuss and clarify more complex cases like these or are other members, moderators and admins just very busy at the moment? (Don't get me wrong - no offense!)
Jackal is also able to do some research, Dremora in some rare cases... Themabus is more on the hardware side (Saturn rings tests, etc.). Don't remember anyone else.

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Thu Oct 22, 2009 3:29 pm
by Feltzkrone
F1ReB4LL wrote:
Feltzkrone wrote:How about a tool that just automates pregap (including its subchannel data) analyzing and prints the results - similar to Px_D8 which just prints the combined offset?
What results, exactly?

The tool would analyze Track 2 pregap in case of Mixed-Mode CDs and produce an output like this (example):

Code:

Subchannel offset data ... +2 sectors
Subchannel offset audio .. +1 sectors
Q-Channel pregap length .. 225 sectors (-00:02.74 to -00:00.00)
P-Channel flags .......... 225 sectors (-00:02.73 to +00:00.00)
Q/P data matches ......... yes
Combined offset .......... +8 samples

+----------------------------------------------------------------------------+
| Pregap layout                                                              |
+---------+--------+-------+-------------------------------------------------+
|  Offset | Length | Type  | Main channel analysis                           |
+---------+--------+-------+-------------------------------------------------+
|    -225 |     75 | Data  | Data (sync found, aligned)                      |
|    -150 |      1 | Audio | 32 scrambled bytes from data sector             |
|    -149 |    149 | Audio | Audio (sync not found)                          |
+---------+--------+-------+-------------------------------------------------+

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Thu Oct 22, 2009 3:48 pm
by F1ReB4LL
Writing a basic proper dumping tool won't take much longer time, so don't see much point in this.

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Fri Oct 23, 2009 11:05 am
by Feltzkrone
F1ReB4LL wrote:Writing a basic proper dumping tool won't take much longer time, so don't see much point in this.
If this really was the case, why don't we already have one?


The current guide forces you to use three different tools:
  • IsoBuster
  • ExactAudioCopy
  • Resize
Of course it would make sense to have one tool instead of these three which handles 99% of all cases automatically. Combined offset detection as suggested in the guide is no hocus pocus, generally none of those three do anything which couldn't be automated using one tool. But at the moment we don't have anything like that.

I only wanted to provide a tool which would detect special cases of pregap layouts and simplify the task of determining the combined offset. What's bad about it? Just that we still don't have an all-in-one tool then?

Don't get me wrong, but if such a tool is really wanted you are one of those few people who will have to contribute their knowledge. Coding (half-)blindly might produce a usable all-in-one dumping tool but it won't be perfect then and will cause bad dumps, that's why knowledge is needed to be spread. For example by posting background info on how certain things found on a CD should be detected and interpreted, how errors in audio extraction could be detected and compensated (i.e. figuring out what is so 'magical' about EAC). Once all the info needed for coding such a tool is available, we can actually code it and the resulting tool will be reliable. So in my opinion it actually is a lot more work to code such a dumping tool compared to just coding a more or less simple pregap reader.

If you don't mind I'd like to alter the topic subject for something like "Technical discussion for a future dumping tool" and - as the new subject denotes - we could talk about technical backgrounds, tricks and abnormalities here - that the tool should be able to handle properly.

Also we would have to discuss if the tool still should rely on CUE/BIN as with this format mixed-mode pregaps cannot be preserved properly. Subchannel data would have to be preserved as well where it is non-standard. Apart from that I wouldn't mind discarding Sync and ECC/EDC info in data sectors where they are built the standard way - finally giving an image of the CD which contains user data and abnormalities, from which (if needed) a full and clean CCD/IMG/SUB as well as a clean CUE/BIN can be reconstructed. What do you think about that?

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Fri Oct 23, 2009 11:47 am
by F1ReB4LL
Feltzkrone wrote:If this really was the case, why don't we already have one?
No one to code. I don't have enough time and no volunteers at all.
Feltzkrone wrote:I only wanted to provide a tool which would detect special cases of pregap layouts and simplify the task of determining the combined offset. What's bad about it? Just that we still don't have an all-in-one tool then?
You offer to write a tool, which should be able to read some sectors, read the subs, correct the subs, analyze the subs, count the data offset, count the subs offset - man, you just need to read all the sectors instead of some gap and split the tracks - voila, a good dump. I only wanted to provide -- you're welcome to provide any tool
Feltzkrone wrote:Don't get me wrong, but if such a tool is really wanted you are one of those few people who will have to contribute their knowledge. Coding (half-)blindly might produce a usable all-in-one dumping tool but it won't be perfect then and will cause bad dumps, that's why knowledge is needed to be spread.
Of course, I've already offered this (my ideas/algorithms, me as a tester, but someone else as a coder), no one is interested.
Feltzkrone wrote:For example by posting background info on how certain things found on a CD should be detected and interpreted, how errors in audio extraction could be detected and compensated (i.e. figuring out what is so 'magical' about EAC).
Feltzkrone wrote:If you don't mind I'd like to alter the topic subject for something like "Technical discussion for a future dumping tool" and - as the new subject denotes - we could talk about technical backgrounds, tricks and abnormalities here - that the tool should be able to handle properly.
No one is interested.
Feltzkrone wrote:Also we would have to discuss if the tool still should rely on CUE/BIN as with this format mixed-mode pregaps cannot be preserved properly. Subchannel data would have to be preserved as well where it is non-standard. Apart from that I wouldn't mind discarding Sync and ECC/EDC info in data sectors where they are built the standard way - finally giving an image of the CD which contains user data and abnormalities, from which (if needed) a full and clean CCD/IMG/SUB as well as a clean CUE/BIN can be reconstructed. What do you think about that?
Yes, with the .sub dumps it's possible to add the checksum for a single .img file, ccd can be generated from the .sub file (in 99% cases, at least - to cover the 100% we should also dump TOC into a standalone file and generate ccd and cue based on both TOC and .sub dumps, not only .sub). This would make many stupid people happy, who think, that our split dumps are bad and clonecd ones are perfect.

Re: Ripping guide for Mixed-Mode CDs with data sectors in Track 2 pregap

Posted: Fri Oct 23, 2009 12:01 pm
by Feltzkrone
It's fine that you are willing to provide background info as soon as needed. And here is your volunteer - but with the following drawbacks (at least in the eyes of most coders):
- GUI tool will be coded in Delphi 7 (alternatively commandline-based Java with native DLL usage)
- It will first use SPTI only, so it probably won't work on Windows versions < 2000
- I will only be able to code 1-2 hours per day (employed as programmer, sometimes just being fed up with programming and weekends are there to care for other things)

If you don't mind these drawbacks I'm willing to start, although with the feeling of reinventing the wheel, but as none of the wheels available is a perfect circle I won't care.


EDIT:
F1ReB4LL wrote:This would make many stupid people happy, who think, that our split dumps are bad and clonecd ones are perfect.
From what I have read so far from those people, they simply don't trust anything other than CloneCD or Alcohol 120%, just because only those can handle the newest forms of copy protection etc. If they learned that we would be discarding certain data present on the CD, they would immediately lose interest. But they don't get the point that tools like ECM discard data, too - in their eyes it's just a stronger form of compression.