1 2008-03-24 07:51:56 (edited by pnkiller78 2008-03-24 08:19:54)

Ok, I've been playing around a bit with some Sega Saturn images that I had lying on my hard disk, mainly from TOSEC torrents..
And I noticed something curious about some games that are in fact the same game, but from different regions.

For example, these two:
VIRTUA FIGHTER 2 (JAP)
VIRTUA FIGHTER 2 (PAL)

Both games have different checksums for the audio tracks but the sizes are the same. That looked strange to me; I thought the music should match across regions, like on PSX.
So I did a test using the psxt001z track method, which rebuilds a track from an image (I used a TOSEC image as the source). I rebuilt the audio tracks: for the PAL version the offset correction was +18 (the same write offset as in the DB). After that I tried to rebuild the JAP version, and guess what, the program was able to find the track, but this time with an offset correction of -306...
So I was wondering why the audio tracks don't match on both discs when in fact the music in both versions is the same. Could it be an error by one of the dumpers, or is this the way it was supposed to be?
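The kind of search described above can be sketched roughly as follows. This is only an illustration, not psxt001z's actual code: `find_sample_shift`, `max_shift` and `probe_len` are hypothetical names, both tracks are assumed to be raw 16-bit stereo PCM already loaded into memory, and the sign convention is this sketch's own.

```python
SAMPLE_BYTES = 4  # one 16-bit stereo sample


def find_sample_shift(reference, candidate, max_shift=2000, probe_len=4096):
    """Return s such that candidate sample i equals reference sample i + s.

    Returns None if no shift within +/- max_shift samples matches.
    """
    # Take a probe from the middle of the candidate, aligned to a sample.
    mid = (len(candidate) // 2) // SAMPLE_BYTES * SAMPLE_BYTES
    probe = candidate[mid:mid + probe_len]
    if not any(probe):
        return None  # a silent probe cannot be located unambiguously
    # Try small shifts first, in both directions.
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        start = mid + shift * SAMPLE_BYTES
        if start < 0 or start + len(probe) > len(reference):
            continue
        if reference[start:start + len(probe)] == probe:
            return shift
    return None
```

If two dumps of the same music differ only by a shift, this returns that shift in samples; if it returns None, the difference is either larger than `max_shift` or not a pure shift.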

The same behavior was true for these two other games
ALIEN TRILOGY (PAL)
ALIEN TRILOGY (USA)

What do you guys think about it?

EDIT:
Sorry, I meant to post this in the General Discussion forum; if an admin can move it, that would be good.

moderator edit: changed topic title to reflect contents

2 2008-03-24 11:13:16

I noticed this earlier.. the only systems where the visible write offset really makes sense are PSX and Dreamcast.. in all the other systems, after correction,  the audio doesn't start exactly at the pregap, data gets moved into the next track's pregap, etc..

It would be nice to know if the d8 method gives the same offset correction on these discs, but I suppose it does.. there are probably multiple offset corrections needed, but we can only detect 2 of them (read+write)..

An 'intelligent checksum' would make more sense for these systems imho.. as Eidolon explained back then, you'll be ignoring the offset completely instead of assuming that our method of offset correction is the right one (I have no doubt that it is for PSX and Dreamcast, but I can't be sure about the other systems as a lot of dupe games don't have matching audio).. at least we could verify the integrity of the audio data then WITH matching checksums..

Maybe it would be possible to write an app that can determine the most likely original offset by comparing several dumps, for instance saturn ones? based on the following assumptions:

- No data gets moved outside of the track, unless the data starts exactly at pregap.
- If data gets moved outside of the track with the default offset correction, the amount of bytes outside of the track could indicate the additional offset correction needed (there are several psx discs where the last audio track has audio data up until the last byte.. if these bytes are also the last bytes on the cd, this means no data is cut off and it's a hint on what the proper offset correction should be).
- All tracks on a disc need the same offset correction (is the offset difference on saturn discs the same for all tracks? I remember that with IBM PC this usually isn't the case due to different mastering, different gap sizes etc)... this means that if we want matching audio offsets for all audio tracks, the pregaps should be ignored because the same games (but different system, version, region, etc) sometimes have different gaps?
- It is likely that one or more tracks start exactly at the pregap.
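A rough sketch of how the first and third assumptions could be turned into a check, assuming each track is available as raw 16-bit stereo PCM bytes and that "data" means non-zero samples. The function names are hypothetical; this only computes which extra shifts would keep every track's audible data inside its own boundaries.

```python
SAMPLE_BYTES = 4  # one 16-bit stereo sample


def silence_margins(track):
    """Return (leading, trailing) counts of all-zero samples in a track."""
    lead = (len(track) - len(track.lstrip(b"\x00"))) // SAMPLE_BYTES
    trail = (len(track) - len(track.rstrip(b"\x00"))) // SAMPLE_BYTES
    return lead, trail


def allowed_shift_range(tracks):
    """Range (lo, hi) of extra shifts, in samples, valid for ALL tracks.

    A positive shift moves the audio later. A shift is allowed only if it
    does not push non-silent samples across a track boundary, and the same
    shift is applied to every track (assumption 3 in the list above).
    """
    margins = [silence_margins(t) for t in tracks]
    lo = -min(lead for lead, _ in margins)
    hi = min(trail for _, trail in margins)
    return lo, hi
```

A narrow range (or one that excludes 0) would hint that the default correction moves data outside a track; a wide range means this test alone cannot pin down a reference.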

It's clear that offsets DO exist.. and that for PSX and DC they can be used as values to put the audio data back into the original position (usually exactly @ byte 352800 for one or more tracks on the discs).. I'm not sure if our method makes sense on the other systems though.
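For reference, byte 352800 is simply the end of a 2-second pregap: 2 s x 75 sectors/s x 2352 bytes/sector. A small sketch of that check, assuming the track file starts with its 2-second pregap and that audio data means non-zero samples (the helper names are made up for this example):

```python
SECTOR_BYTES = 2352
SECTORS_PER_SECOND = 75
SAMPLE_BYTES = 4  # one 16-bit stereo sample
PREGAP_BYTES = 2 * SECTORS_PER_SECOND * SECTOR_BYTES  # 352800


def first_audio_byte(track):
    """Byte position of the first non-silent sample, or None if all silent."""
    stripped = track.lstrip(b"\x00")
    if not stripped:
        return None
    first_nonzero = len(track) - len(stripped)
    # Round down to the start of the sample containing that byte.
    return first_nonzero // SAMPLE_BYTES * SAMPLE_BYTES


def starts_at_pregap(track):
    """True if the audio begins exactly where the 2-second pregap ends."""
    return first_audio_byte(track) == PREGAP_BYTES
```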

Are there any discs in the database that were released with multiple write offsets and where correcting both gives identical checksums for both discs (excluding dummy tracks)? Example: Doom for PSX.

3 2008-03-24 11:40:18

ps. of course it's possible that the detected offset of this disc is just wrong: http://redump.org/disc/1460/ maybe d8 gives a -306 offset.. I'm not at all comfortable with all the +0 write offsets .. Maybe the correct write offset just isn't showing.. I also have some IBM discs where the data track with d8 only shows the read offset and not the write offset (yet the old detection method gives me a write offset different than +0 on the audio tracks).. maybe these +0 discs just don't show the write offset using any method. How can we fix this?

Same for these discs: http://redump.org/disc/2883/ http://redump.org/disc/1706/ .. same gaps, same track sizes, +901 offset difference.. both discs show a +0 write offset.. I will see if there's another way to determine the correct write offset value (for instance the amount of data that is moved outside of the tracks).

In other words: how to find a reference that isn't there? :P

4 2008-03-24 12:33:55 (edited by themabus 2008-03-25 02:22:20)

isn't offset a difference between data / subcode frames? and as i understand it, the drive would realign the data track because sync makes it possible, and this is where it manifests - data tracks are misplaced relative to the audio stream. so when you read the whole cd content as a continuous stream of data, like when you fake toc with an audio (like for dc / saturn rings etc.), everything should be in place (each subsequent sector) because sync is ignored - so no data is lost or added in between; instead the whole stream is now offset relative to subcode. and then it's possible to get the offset from the data track, like the drive would do, but instead of the data track only we realign the whole stream and separate each track, like you described in the GD guide. i could not imagine a better way of dumping - it's as raw as it gets imo. but this data should not be different from ours, since what we do is realign the data track relative to audio. so what more offsets can there be? i would guess everything else is from mastering.

edit: attempt to make myself more clear

5 2008-03-24 12:43:00

You mean reading the entire disc in d8, then cut off the bytes in the first sector before the sync starts, then cut off the data track (it has to be descrambled anyway)?.. this should give the exact same results as our current method (and this is also how I dumped the dreamcast discs). Perhaps I'm just looking for a way to avoid the offset differences from mastering and finding the true reference.
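The "cut off the bytes before the sync starts" step could look roughly like this, assuming the raw d8 dump is in memory and begins inside or just before the data track. The 12-byte sync pattern (00, ten FF bytes, 00) is the sector sync field from ECMA-130; `trim_to_first_sync` is a hypothetical helper, not part of any of the tools mentioned here.

```python
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # 12-byte sector sync field
SAMPLE_BYTES = 4  # one 16-bit stereo sample


def trim_to_first_sync(raw):
    """Return (bytes_cut, stream starting at the first sync pattern)."""
    pos = raw.find(SYNC)
    if pos < 0:
        raise ValueError("no sector sync found in stream")
    return pos, raw[pos:]


def bytes_to_samples(byte_count):
    """Express a byte displacement in audio samples (4 bytes each)."""
    return byte_count // SAMPLE_BYTES
```

The number of bytes cut only says where the first sync sits in the stream as read; relating it to a write offset still needs the drive's read offset and the sector address stored in the (descrambled) header.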

Here's a tool that can extract all sectors in d8 mode: http://vigi.dremora.com/cdtoimg.rar

6 2008-03-24 13:06:51

yes, that's what i meant :)
i guess it would solve a lot of difficulties - dumping that way, like: huge audio pregaps, bad sectors, mode 0 sectors, $00 sectors, data pregap after audio (PCE), data missing from audio pregap, maybe more... (i guess it's maybe what PerfectRip does? but i think Gigadeath had problems with audio-data gaps at least)
thank you very much for the program!
about 'intelligent checksums', i agree, but i imagine it on a different level, not replacing file CRCs, so it would show all those matching tracks and possibilities to render one version into another, but file level crcs are still there.

7 2008-03-24 13:17:50

The descramble_cdda tool to descramble the data track (if you just select the complete d8 dump file and start descrambling it will stop extracting once it reached the proper data track size): http://vigi.dremora.com/dctools.zip
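For anyone curious what descrambling involves: this is a minimal sketch of the ECMA-130 Annex B scrambler (15-bit shift register, polynomial x^15 + x + 1, preset to 1), not the source of descramble_cdda. Bytes 12 to 2351 of each sector are XORed with the register output; the 12 sync bytes are left alone. The function names are invented for this example.

```python
SECTOR_BYTES = 2352
SYNC_BYTES = 12


def build_scramble_table():
    """Generate the 2340-byte XOR sequence for bytes 12..2351 of a sector."""
    table = bytearray(SECTOR_BYTES - SYNC_BYTES)
    reg = 0x0001  # 15-bit register, preset value
    for i in range(len(table)):
        byte = 0
        for bit in range(8):
            byte |= (reg & 1) << bit  # output is the register's LSB
            feedback = (reg & 1) ^ ((reg >> 1) & 1)
            reg = (reg >> 1) | (feedback << 14)
        table[i] = byte
    return bytes(table)


SCRAMBLE_TABLE = build_scramble_table()


def descramble_sector(sector):
    """XOR a raw 2352-byte sector with the table (same call scrambles)."""
    if len(sector) != SECTOR_BYTES:
        raise ValueError("expected a 2352-byte raw sector")
    out = bytearray(sector)
    for i, t in enumerate(SCRAMBLE_TABLE):
        out[SYNC_BYTES + i] ^= t
    return bytes(out)
```

Because the operation is a plain XOR, applying it twice gives back the original sector, which is a handy sanity check.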

I tried comparing the amount of bytes in the pregap of my Alone in the Dark dump but it's different on each track, so there's no way to detect a reference.. :(

8 2008-03-24 13:27:52

thank you, i had those from GD-Rom thread :)

9 2008-03-24 15:39:19

Without going into the technical questions:

1: There are games with an identical region and identical offsets whose audio tracks are nevertheless shifted relative to each other by 2 samples. http://redump.org/disc/1431/ http://redump.org/disc/2767/ This proves that versions from different regions will not always match, since even versions of the same game don't always match. So far I could find only one SS game whose audio tracks match another version: Wipeout (PAL) http://redump.org/disc/3036/ . 8 of its audio tracks are identical to those of the PSX Wipeout.

2: About write offset = 0 in EAC. A lot of SS games have an identical second audio track, and it matches on all of those CDs except the ones with offset = 0. This is direct proof that offset = 0 is not the true value.
Example, the 10-second second track of Japanese SS discs:
ID in DB,Name,MD5,Track Number,Offset
1629,"Winning Post 2: Program '96","4e0043e3be0aee92409d0add922ba839",2,"0"
2345,"Bakuretsu Hunter","5abdd123f1f721875d8eade888da2550",2,"0"
1449,"Enemy Zero","7c0f9b105165b013366b534b6f8be141",2,"0"
2281,"Jissen Pachi-Slot Hisshouhou! 3","92fd90a37d298edb02e81b7137274384",2,"0"
1450,"Enemy Zero","bcfae1cfecac5be5f71bcf8666bf8a4d",2,"0"
1448,"Enemy Zero","bcfae1cfecac5be5f71bcf8666bf8a4d",2,"0"
2279,"Enemy Zero","e03343a8c9f1853d8af2f154b67e2dac",2,"+666"
1439,"DX Jinsei Game","e03343a8c9f1853d8af2f154b67e2dac",2,"+684"
2343,"Tower, The","e03343a8c9f1853d8af2f154b67e2dac",2,"+684"
1610,"AI Shougi","e03343a8c9f1853d8af2f154b67e2dac",2,"+390"
2273,"Enemy Zero","e03343a8c9f1853d8af2f154b67e2dac",2,"+390"
2278,"Enemy Zero","e03343a8c9f1853d8af2f154b67e2dac",2,"+390"
1459,"Street Fighter Zero 2","e03343a8c9f1853d8af2f154b67e2dac",2,"+1260"
2280,"Enemy Zero","e03343a8c9f1853d8af2f154b67e2dac",2,"+390"
1451,"Enemy Zero","e9b72983c72cbeb7aa0d58bfacc10e1d",2,"0"


As the example shows, after correction the audio track becomes identical for every offset except offset = 0 !!!
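The same observation can be checked mechanically by grouping the rows above by MD5. A small sketch using exactly the listed data (the variable and function names are just for this example):

```python
from collections import defaultdict

SHARED = "e03343a8c9f1853d8af2f154b67e2dac"
SAME_0 = "bcfae1cfecac5be5f71bcf8666bf8a4d"

# (ID in DB, name, MD5 of track 2, write offset) copied from the list above.
ROWS = [
    (1629, "Winning Post 2: Program '96", "4e0043e3be0aee92409d0add922ba839", "0"),
    (2345, "Bakuretsu Hunter", "5abdd123f1f721875d8eade888da2550", "0"),
    (1449, "Enemy Zero", "7c0f9b105165b013366b534b6f8be141", "0"),
    (2281, "Jissen Pachi-Slot Hisshouhou! 3", "92fd90a37d298edb02e81b7137274384", "0"),
    (1450, "Enemy Zero", SAME_0, "0"),
    (1448, "Enemy Zero", SAME_0, "0"),
    (2279, "Enemy Zero", SHARED, "+666"),
    (1439, "DX Jinsei Game", SHARED, "+684"),
    (2343, "Tower, The", SHARED, "+684"),
    (1610, "AI Shougi", SHARED, "+390"),
    (2273, "Enemy Zero", SHARED, "+390"),
    (2278, "Enemy Zero", SHARED, "+390"),
    (1459, "Street Fighter Zero 2", SHARED, "+1260"),
    (2280, "Enemy Zero", SHARED, "+390"),
    (1451, "Enemy Zero", "e9b72983c72cbeb7aa0d58bfacc10e1d", "0"),
]


def offsets_by_md5(rows):
    """Map each MD5 to the list of write offsets of the discs that have it."""
    groups = defaultdict(list)
    for _disc_id, _name, md5, offset in rows:
        groups[md5].append(offset)
    return dict(groups)


if __name__ == "__main__":
    for md5, offsets in offsets_by_md5(ROWS).items():
        print(md5, sorted(set(offsets)))
```

Every disc with a non-zero offset lands in the same MD5 group, while the offset-0 discs are spread over several different hashes.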

10 2008-03-24 17:17:41

If the difference on this track is only an offset difference, can't you use that track to determine the 'correct' offset correction?

11 2008-03-24 17:55:35

Vigi, and how do we know that the resulting offset is the true one? :) I discussed it with Dremora. Changes were probably introduced into the audio track when the master copy was manufactured (for example: http://redump.org/disc/1431/ http://redump.org/disc/2767/). In other words, we need to look for a method of determining the offset correctly, instead of adjusting the audio track to fit the standard.

I can calculate it if the result will help us find out the truth.

12 2008-03-25 11:59:56

did some more reading (that i should have done long ago, would have saved me a lot of worries) and i think i finally understand now why offsets are sample limited. it all looks fairly obvious now.
ECMA-130:
16   F1-Frames
Each Scrambled Sector shall be mapped onto a series of consecutive frames. Each frame consists of 24 8-bit bytes,
numbered from 0 to 23. Byte 0 of the Sector shall be placed in byte 4n of a frame, where n is 0, 1, 2, 3, 4 or 5.
Consecutive bytes of the Sector are placed in consecutive bytes of the frames. Byte 2 351 of a Sector is immediately
followed by byte 0 of the next Sector.

18   Control Bytes - F3-Frames and Sections
A single byte called Control byte is added as first byte to each F2-Frame of 32 bytes. This yields a new F3-Frame of
33 bytes.
The Control byte shall be obtained from a table of 98 bytes as defined in clause 22. The information in the Control
bytes is mainly used for addressing purposes. The bytes in the table are added to 98 consecutive F2-Frames, coming
out of the CIRC encoder, byte 0 of the table first, byte 97 last. This operation yields groups of 98 F3-Frames of 33
bytes each, called Sections. These Sections are asynchronous with the Sectors, i.e. there is no prescribed relation
between the number of the F1-Frame in which the first byte of a Sector is placed and the number of the F3-Frame in
which the first Control byte of the table is placed. Each Section has its own table with Control bytes.

so as i see it, subcode byte may be displaced from data by 24 byte (6 samples) steps (F1 frame). but not only that - sector's starting position (sync) may be 0 to 5 samples off in the first frame. so for example offset +13 = 2 frames (sector/subcode  difference) + 1 sample (sync/frame difference - since subcode byte is fixed to frame, this contributes to offset).
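The arithmetic in that reading is a plain division by 6 (24 bytes per F1-Frame / 4 bytes per stereo sample). A tiny sketch, where `split_offset` is a made-up helper and the handling of negative offsets (remainder kept in 0..5) is an assumption of this example rather than anything stated in ECMA-130:

```python
F1_FRAME_BYTES = 24
SAMPLE_BYTES = 4  # one 16-bit stereo sample
SAMPLES_PER_F1_FRAME = F1_FRAME_BYTES // SAMPLE_BYTES  # 6


def split_offset(offset_samples):
    """Split an offset into (whole F1-Frames, leftover samples 0..5)."""
    return divmod(offset_samples, SAMPLES_PER_F1_FRAME)


if __name__ == "__main__":
    for offset in (13, 18, -306):
        frames, samples = split_offset(offset)
        print(offset, "=", frames, "frames +", samples, "samples")
```

With this split, +13 gives 2 frames + 1 sample as in the example above, while +18 and -306 from the first post come out as whole frames (3 and -51) with no leftover samples.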
so, i think we have whole PMA stream as it was before it got written on a CD (except for a 1st track's pregap, and it may hold data actually (eg. Dreamweb UK PC release), but not that it changes anything, i guess, but interesting to know nevertheless). so if there are differences left it would be mastering then but we should not fix that on file level, imho.
so for intelligent checksums i would imagine a different level of abstraction for an audio stream that would also include those audio gaps that follow/precede data tracks, and so checksums would be calculated on this data. LBA-pause would be targeted (+/- about 10 or so sectors maybe) and if there's a large amount of silence it's excluded (back until the last meaningful sample and forth until the first) and this point is a break in the stream. if silence can not be found in the position between two tracks - there is no break and thus fewer audio streams than audio tracks.
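A much simplified sketch of the idea, assuming one audio stream held in memory as raw 16-bit stereo PCM: it only trims whole silent samples from both ends before hashing, and does not attempt the search around LBA-pause or the joining of tracks into streams described above. The function names are hypothetical, and CRC32 is used just because file CRCs are what the db already stores.

```python
import zlib

SAMPLE_BYTES = 4  # one 16-bit stereo sample


def trim_silence(audio):
    """Drop whole all-zero samples from the start and end of a stream."""
    lead = (len(audio) - len(audio.lstrip(b"\x00"))) // SAMPLE_BYTES * SAMPLE_BYTES
    trail = (len(audio) - len(audio.rstrip(b"\x00"))) // SAMPLE_BYTES * SAMPLE_BYTES
    return audio[lead:len(audio) - trail]


def content_crc32(audio):
    """CRC32 of the audio with leading/trailing silence ignored."""
    return zlib.crc32(trim_silence(audio)) & 0xFFFFFFFF
```

Two dumps of the same audio that differ only in how many silent samples surround it get the same `content_crc32`, while their ordinary file CRCs can stay in the db untouched next to it.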
it's like Eidolon suggested, i guess, but we should not drop file CRCs it's a backbone of this db - file still is a core element, everything else is just interpretation or additional (less vital) info.