
findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Tue Jan 08, 2013 10:25 pm
by V.
Howdy,

I am not sure if what I made is new, but searching around didn't turn up anything.
In any case, after not finding anything useful, I made a new crc tool to find (in a fast way) a matching crc in a dump with some offset.
This was whipped up in a few hours, so it is still kind of rough and not idiotproof (no software of mine is idiotproof, actually).

Usecase:
Finding an offset in a dump of specific audiotracks to match against the redump database.
If all tracks match with an offset to redump, it is further confirmation that the dump was successful.
This is NOT meant to find an actual drive offset, nor to be used to match up a dump in order to validate an entry!
Use the proper redump guides to dump discs! Different pressings of audio discs have factory offsets, which is mainly what this tool detects.

README:

Code:

What:
 This is a not yet idiotproof version of findcrcs.
 It is to be used for finding a block of data which matches a specific crc.

How:
 findcrcs <file> <size of window> <crc> [more crcs...]

 File is a big file which should or may contain the searched for data.
 Size of window is the size of the block of data to find.
 Crc is the crc to find in the file (may be more than 1, but all will be matched on the window size).

 If a match is found it will print out an md5sum of the matched block for further inspection.
 For best results, add some (1MB or so) zero bytes padding around the file first.
 In a future version, this might be a selectable option of this program.

Why:
 Useful for finding audio offsets in disk images together with the redump.org database.

Warning:
 This software is not yet idiotproof!
 - It does not check arguments for validity yet (especially size of window and crc's.)
 - No padding option yet.
   If matching audiodata, you should pad the combined audiotracks with zero bytes at the start and end.

Compiling:
 Use "make" on any linux/unix/bsd console nearby, or if you must, an msys or cygwin environment.
 You need to use a relatively recent gcc (4.5.0+ ish I guess).
 This software uses crcutil-1.0 for providing fast crc calculations.
 crcutil is made by Andrew Kadatch and Bob Jenkins and can be found on http://code.google.com/p/crcutil/
 Do not contact them for support on findcrcs.
 The Makefile will try to pull in version 1.0 through wget if it is not supplied yet.

 Also, this program makes use of the MD5 implementation of Alexander Peslyak.
 This is found at http://openwall.info/wiki/people/solar/software/public-domain-source-code/md5
 A small casting patch was made to support g++, this small patch is released under the same license as the original md5.c file.

Contact:
 At the moment, see the redump.org forum thread where you got this.

-V.
Disclaimer:
I write my tools mainly for myself to use in a specific way.
If however someone has some issues using this, or has some suggestions, I MIGHT be able to change, fix or add things, but only if time and effort are permitting (which is not usually the case).

Source: http://winaoe.org/findcrcs-0.2.tar.gz
Win32 binary: http://winaoe.org/findcrcs-0.2-bin-win32.zip
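
To illustrate the padding advice in the README: if the start of a track fell off the front of the dump, the window can only match once zero bytes are put back in front of the file. Below is a minimal Python sketch of that idea, not the tool's own code. The function name `find_with_padding` and the toy data are made up for illustration, and the search is a plain brute-force loop that is only practical for tiny inputs. It reports offsets relative to the unpadded file, so a hit in the front padding shows up as a negative offset.

```python
import zlib


def find_with_padding(data: bytes, window: int, target_crc: int, padding: int):
    """Brute-force CRC-32 search over a zero-padded copy of `data`.

    Returned offsets are relative to the ORIGINAL data, so a window that
    starts inside the front padding gets a negative offset.
    """
    padded = bytes(padding) + data + bytes(padding)
    hits = []
    for pos in range(len(padded) - window + 1):
        if zlib.crc32(padded[pos:pos + window]) == target_crc:
            hits.append(pos - padding)
    return hits


# Toy example (hypothetical data): the reference track starts with 3 zero
# bytes of silence that are missing from the front of the dump.
track = bytes(3) + b"audio-samples"
dump = track[3:] + b"next-track"

hits = find_with_padding(dump, len(track), zlib.crc32(track), padding=8)
print(hits)  # the list is expected to contain -3
```

Without the padding, no window of the right size inside `dump` contains the full track, so the search would come up empty.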

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Wed Jan 09, 2013 4:26 am
by Jackal
Hi,

thx for this useful tool. We already had the psxt001z --track option and there was another tool (by themabus?), but from what I recall those weren't really suitable for full images and only worked well on individual tracks.
V. wrote: "For the moment linux only, but I am considering a windows version (cli only)"
This means your target audience for the moment will only be a fraction of what it could be. Maybe release a cygwin build (with the necessary dll files included) to get the windows folks going?

Regards

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Wed Jan 09, 2013 5:32 am
by gaijin
psxt001z is a bit outdated and slow; it uses one core and steps 4 bytes at a time.

themabus' fff.exe is much better: it uses multiple cores and can step 1 byte at a time.

He also has an interesting tool, recombine: one bad image + another version of the same bad image = a good image or good tracks.

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Wed Jan 09, 2013 2:34 pm
by V.
I was not aware of those 2 programs, so I tested them against mine.

First off, I ported the thing to windows/MinGW, so there is no need for cygwin dll's.
I updated the initial post with a source and binary release of v0.2.

Benchmarking the 3 programs was done on a combined bin (data + audio) of Moto Racer 1.
Image is 574,551,264 bytes big (around 75% of a full CD).
The target crc to find is 9c8f607e with a track size of 46,435,536 (track 6 of this listing: https://redump.info/disc/18266/)

first off: findcrcs

Code:

~/findcrcs-0.2$ time findcrcs.exe "Moto Racer 1.bin" 46435536 9c8f607e
348061372  9c8f607e  7ca7c0881f28f2684623b3a2ae53e95b

real    0m5.743s
user    0m0.000s
sys     0m0.047s
Found with the correct md5 at offset / index 348061372.
This means bytes 348061372 to 348061372+46435536 correspond to track 6 in the dump information.

Done in 5.743 seconds.
In these 5.743 seconds, a total of 528,115,728 crcs were checked, which works out to around 90,000,000 crcs per second.
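
For readers wondering how a crc at every single byte offset can be that cheap: one known technique is a rolling CRC, where sliding the window by one byte costs two table lookups instead of re-reading the whole window. The README only says crcutil is used for fast crc calculations, so whether findcrcs works exactly like this is an assumption. The Python sketch below shows the general technique for standard CRC-32 (the zlib variant); the name `find_crc_windows` is made up for illustration.

```python
import zlib


def _make_table():
    # Standard reflected CRC-32 byte table (polynomial 0xEDB88320).
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


_TABLE = _make_table()


def find_crc_windows(data: bytes, window: int, targets):
    """Return (offset, crc) for every window whose CRC-32 is in `targets`."""
    n = len(data)
    if window <= 0 or window > n:
        return []
    # CRC-32 is affine: crc32(M) = raw(M) ^ crc32(zeros of the same length),
    # where raw() is the linear CRC with zero init and no final xor.
    k = zlib.crc32(bytes(window))
    raw_targets = {t ^ k for t in targets}
    # Removal table: raw CRC of byte b followed by `window` zero bytes.
    # Built from 8 single-bit bytes, then combined by linearity.
    k1 = zlib.crc32(bytes(window + 1))
    basis = [zlib.crc32(bytes([1 << i]) + bytes(window)) ^ k1 for i in range(8)]
    out = []
    for b in range(256):
        v = 0
        for i in range(8):
            if (b >> i) & 1:
                v ^= basis[i]
        out.append(v)

    raw = zlib.crc32(data[:window]) ^ k  # raw CRC of the first window
    hits = []
    pos = 0
    while True:
        if raw in raw_targets:
            hits.append((pos, raw ^ k))
        if pos + window >= n:
            break
        # Slide by one byte: push the new byte in, cancel the oldest byte.
        raw = _TABLE[(raw ^ data[pos + window]) & 0xFF] ^ (raw >> 8)
        raw ^= out[data[pos]]
        pos += 1
    return hits


demo = b"....header...." + b"TRACK-SIX-AUDIO" + b"....trailer...."
print(find_crc_windows(demo, 15, {zlib.crc32(b"TRACK-SIX-AUDIO")}))
```

Any crc hit is only a candidate, which is why printing an md5 of the matched block is useful as a second check.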


Next up: psxt001z

This one was not doable for a complete search of the whole image; it took way too long.
So instead I let it search -1000 to +1000 around index 348061372 (found above to be correct) to get at least some idea of the speed.

Code:

~/findcrcs-0.2$ time psxt001z.exe --track "Moto Racer 1.bin" 348060372 46435536 9c8f607e
psxt001z by Dremora, v0.21 beta 1

File: Moto Racer 1.bin
Start: 348060372
Size: 46435536
CRC-32: 9c8f607e

Offset correction 0 bytes, 0 samples, CRC-32 dbf4a270
Offset correction 0 bytes, 0 samples, CRC-32 dbf4a270
Offset correction 4 bytes, 1 samples, CRC-32 aaeba36a
Offset correction -4 bytes, -1 samples, CRC-32 4abb692b
...
...
Offset correction -992 bytes, -248 samples, CRC-32 a87ee8ab
Offset correction 996 bytes, 249 samples, CRC-32 3607c430
Offset correction -996 bytes, -249 samples, CRC-32 97e367cd
Offset correction 1000 bytes, 250 samples, CRC-32 9c8f607e

DONE!

Offset correction: 1000 bytes / 250 samples

real    2m27.556s
user    0m0.031s
sys     0m0.000s
This was a search of around 500 crcs (it does steps of 4, as also mentioned by gaijin).
That means it does around 3.5 crcs per second.
findcrcs beats this by a factor of around 25,000,000.


Lastly: fff
This is faster than psxt001z, but still way too slow to do a full image.
So, again, it gets a 2000 byte range (-1000 to +1000), with the default of a 4 byte step.
It can do fewer bytes per step, but 500 crcs gives us a clear comparison to psxt001z.

Code:

~/findcrcs-0.2$ time fff.exe -offset=348060372 -size=46435536 -crc=0x9c8f607e "Moto Racer 1.bin"
FindFileFragment @20100709 / themabus@inbox.lv
----------------------------------------------
Input: Moto Racer 1.bin
Offset: 348060372
Size: 46435536
CRC: 9c8f607e
Shift: both
Step: 4
Range: 20000

Offset correction 0 bytes, CRC-32 dbf4a270
Offset correction 4 bytes, CRC-32 aaeba36a
Offset correction -4 bytes, CRC-32 4abb692b
Offset correction 8 bytes, CRC-32 7ed53192
...
...
Offset correction -992 bytes, CRC-32 a87ee8ab
Offset correction 996 bytes, CRC-32 3607c430
Offset correction -996 bytes, CRC-32 97e367cd
Offset correction 1000 bytes, CRC-32 9c8f607e

Fragment found!

real    0m26.102s
user    0m0.015s
sys     0m0.015s
This was 500 crcs in 26.102 seconds, meaning around 19 crcs per second.
That is an increase over psxt001z by a factor of around 5, but still around 4,700,000 times slower than findcrcs.
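
For anyone who wants to redo the arithmetic, here is a small Python sketch that recomputes the rates from the "real" times in the three runs above. The count of 500 crcs for psxt001z and fff is the approximate figure from the post, not an exact count, and the variable names are just for illustration. With unrounded inputs the ratios come out somewhat higher than the rounded ones quoted above, but in the same ballpark.

```python
# Figures copied from the three benchmark runs in this post.
image_size = 574_551_264   # bytes in "Moto Racer 1.bin"
window = 46_435_536        # size of track 6

runs = {
    # tool: (crcs checked, wall-clock seconds)
    "findcrcs": (image_size - window, 5.743),
    "psxt001z": (500, 2 * 60 + 27.556),   # approx. 500 crcs, 2m27.556s
    "fff": (500, 26.102),                 # approx. 500 crcs
}

rates = {tool: count / seconds for tool, (count, seconds) in runs.items()}

for tool, rate in rates.items():
    speedup = rates["findcrcs"] / rate
    print(f"{tool:9s} {rate:16,.1f} crcs/s   findcrcs is {speedup:,.0f}x as fast")
```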


So.... yeah....


Anything above an offset of 10,000 is not really doable with fff and out of the question for psxt001z.
10,000 is not even that much of a factory offset between different cd pressings, so I think findcrcs has a use.

In any case, as said, I updated the initial post for a 0.2 update and windows binary release.
Enjoy.

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Fri Jan 11, 2013 3:25 pm
by camb702
Excellent tool, V.

Is there any chance of adding an option to output any found fragments to a new file(s)?
e.g.
findcrcs <file> <size of window> <crc> <outfile>

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Fri Jan 11, 2013 5:54 pm
by V.
Thanks.
I used a bit of shellscript for that, but I guess that on windows it would be easier to have that be done by the tool itself.
I'll see what I can do in the next revision; I'll probably add a "slice" tool instead of having it be done by findcrcs itself.
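
In the meantime, cutting a found fragment out of the image is simple enough to script. The Python sketch below is one way to do it and is not the shellscript mentioned above; the function name `extract_window` and the toy file contents are made up for illustration. It copies `size` bytes starting at `offset` into a new file and returns the CRC-32 and MD5 of what it wrote, so the result can be compared against the line findcrcs printed.

```python
import hashlib
import os
import tempfile
import zlib


def extract_window(path, offset, size, out_path, chunk=1 << 20):
    """Copy `size` bytes starting at `offset` from `path` to `out_path`.

    Returns (crc32, md5 hex digest) of the copied bytes.
    """
    crc = 0
    md5 = hashlib.md5()
    with open(path, "rb") as src, open(out_path, "wb") as dst:
        src.seek(offset)
        remaining = size
        while remaining > 0:
            block = src.read(min(chunk, remaining))
            if not block:
                raise EOFError("window runs past the end of the input file")
            dst.write(block)
            crc = zlib.crc32(block, crc)
            md5.update(block)
            remaining -= len(block)
    return crc, md5.hexdigest()


# Toy demonstration on a throwaway file (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "image.bin")
    track = os.path.join(tmp, "track.bin")
    with open(image, "wb") as f:
        f.write(b"DATA-TRACK" + b"audio track bytes" + b"more audio")
    crc, md5 = extract_window(image, 10, 17, track)
    with open(track, "rb") as f:
        extracted = f.read()

print(f"{crc:08x}  {md5}")
```

For a real image, the offset and size would be the first column of the findcrcs output and the window size given on the command line.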

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Wed Jan 16, 2013 2:05 pm
by pablogm123
Tested. This tool is really impressive.


This is the test I have performed:

I have extracted, using IsoBuster (no offset correction, no error detection for audio...), a full image of this audio disc I dumped in the past. Then I ran findcrcs.exe "image.bin" 95020800 dd0e562e. After 10-11 seconds (I own a budget CPU), findcrcs found two fragments with a matching CRC32 (I assume due to a hash collision), the second one with the expected MD5 hash.

G:\>findcrcs.exe "image.bin" 95020800 dd0e562e
120031802  dd0e562e  87ac7985fcc46286efc5c0876f723e5e
619521696  dd0e562e  8855ffc1921ec4e5d7536272ced3d989

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Thu Jan 17, 2013 7:23 am
by camb702
Thanks, V.

Agree with pablogm123: match finding is very fast.

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Thu Jan 18, 2018 8:32 pm
by HwitVlf
This scanner works well enough that I made a simple GUI front-end for my own use. It uses info from the Redump database and extracts tracks that are found.

In case it helps anyone else, it is HERE. Source code (Autoit) included.

EDIT Link Updated to v7

Re: findcrcs (fast crc finding tool, v0.2, windows support added)

Posted: Fri Jan 19, 2018 4:36 pm
by rosewood
I can't get your GUI to work, it always throws the error "Track Information is not formatted correctly."

So I compiled v0.3 of findcrcs for win64; you can download it here: findcrcs-0.3-bin-win64.7z
This new version also supports extraction from the command line:

Code:

Usage: findcrcs [OPTION]... [--] <FILE> <WINDOWSIZE> <CRC> [MD5] [CRC [MD5]...]

Find the offset of CRCs in FILE with a window size of WINDOWSIZE.
Outputs the crc, offset and md5 of a found segment.
If an MD5 is given, it will only output or extract on a matching md5 hash.

  -e              extract the found segments with the md5 hash as filename
  -f EXTRACTFILE  use EXTRACTFILE as file to extract to
                  implies -e and -q
  -p PADDING      use PADDING amount of zero bytes around the input file
                  this can result in a negative offset in the results
                  if used with -s only an end padding will be added
  -q              quit processing after finding a match and optionally
                  extracting that match
  -s SEEDFILE     get an initial crc from SEEDFILE
                  if used with -e, the SEEDFILE will be joined with the found
                  segment