Merged Compression Tests

cHrI8l3
Posts: 314
Joined: Mon Jun 08, 2026 1:26 am

Merged Compression Tests

Post by cHrI8l3 »

Job: merge different versions of the same game
Goal: find the method that requires the least user effort for compression and decompression (and ideally doesn't use too much system memory)

Configurations used:
[image not preserved]

Test #1:
Game: [PSX] Final Fantasy IX (Disc 1) (8 versions)
Versions order used for merging: U v1.0, U v1.1, E, F, G, I, S, J
Results table:
[image not preserved]
Brief summary:
- 5,5gb - uncompressed
- 2,8gb - PackIso
- 379mb - ECM+Split:100mb+Rep:200mb+LZMA:32mb in ~10 minutes (~250mb for decompression!)
- 375mb - ECM+Rep:1gb+LZMA:128mb in ~25 minutes (~1gb for decompression)
- 354mb - ECM+Rep:1gb+NanoZip:1.5gb in ~40 minutes (~1.5gb for decompression)
- 344mb - ECM+Split:100mb+Rep:1gb+NanoZip:1gb in ~30 minutes (~1gb for decompression)

One version of game & ImageDiff for others:
- 358mb - 7-Zip:192mb
- 344mb - NanoZip:1.5gb

Some notes:
- times were measured on a dual-core 2.5ghz machine, and accuracy is about 90%
- it will be possible to use ECM inside FreeArc, but we need to wait for the next, more stable version - that will reduce the whole process to one command

Basic idea of splitting:
- split data into parts, e.g. v1.bin.001, v1.bin.002, v2.bin.001, v2.bin.002
- add all parts to the archive sorted by extension and then by name: v1.bin.001, v2.bin.001, v1.bin.002, v2.bin.002
- apply the repetition filter with a dictionary at least twice as large as the part size (so if you have 100mb parts, you need at least a 200mb dictionary for it to work well)
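
The splitting steps above can be sketched in Python. This is only an illustration, not the tooling used for the test: the function names `split_file`, `archive_order` and `min_rep_dictionary` are made up for this example, and the actual archiving (ECM, Rep, LZMA) is left out.

```python
import os

def split_file(path, part_size):
    """Split `path` into numbered parts (path.001, path.002, ...).

    Reads one whole part into memory at a time, which is fine for a sketch.
    """
    parts = []
    index = 1
    with open(path, "rb") as src:
        while True:
            chunk = src.read(part_size)
            if not chunk:
                break
            part_path = f"{path}.{index:03d}"
            with open(part_path, "wb") as dst:
                dst.write(chunk)
            parts.append(part_path)
            index += 1
    return parts

def archive_order(part_paths):
    """Sort parts by extension first (.001, .002, ...), then by name,
    so the same part of every version ends up next to each other."""
    def key(path):
        base, ext = os.path.splitext(os.path.basename(path))
        return (ext, base)
    return sorted(part_paths, key=key)

def min_rep_dictionary(part_size):
    """Rule of thumb from the post: dictionary at least twice the part size."""
    return 2 * part_size
```

Calling `archive_order` on v1.bin.001, v1.bin.002, v2.bin.001, v2.bin.002 gives the order from the second bullet: v1.bin.001, v2.bin.001, v1.bin.002, v2.bin.002.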

Conclusions:
- no need to store ImageDiffs when merging with the repetition filter is much more convenient
- with splitting you can achieve amazing results at the cost of convenience...
- buy more RAM - there's never too much when it comes to compression


How much RAM do you have in your machine?

- < 256MB: 0% (0)
- 256MB - 512MB: 3% (1)
- 512MB - 1GB: 13% (5)
- 1GB - 2GB: 44% (17)
- 2GB - 3GB: 18% (7)
- > 3GB: 23% (9)

Votes: 39
Last edited by cHrI8l3 on Fri Apr 03, 2009 8:25 am, edited 1 time in total.
themabus
Posts: 741
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by themabus »

- 358mb with 7-Zip (one ecm'ed version & ImageDiffs)
- 344mb with NanoZip (one ecm'ed version & ImageDiffs)
it (NanoZip) would decompress slower than 7z, though
right?
ECMa130 @20091225 :: friidump 0.5.3 :: PSXstuff :: SaturnPrograms :: MyStupidPrograms2 :: [url=http://www.mediafire.com/?q1mbksntoje]MyStupidPrograms[/url]
cHrI8l3
Posts: 314
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by cHrI8l3 »

it (NanoZip) would decompress slower than 7z, though
right?
yes, it's a symmetrical algorithm and decompression takes pretty much the same time as compression (and the same memory too..)
Sotho Tal Ker
Posts: 267
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by Sotho Tal Ker »

You did not try PAQ?
cHrI8l3
Posts: 314
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by cHrI8l3 »

You did not try PAQ?
no way, PAQs are disqualified because I don't have all day to wait for each one to finish
Last edited by cHrI8l3 on Thu Apr 02, 2009 3:57 pm, edited 1 time in total.
themabus
Posts: 741
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by themabus »

- 375mb with ECM+FreeArc/LZMA in ~30 minutes
- 354mb with ECM+FreeArc/NanoZip in ~40 minutes
so it's 30-40 minutes to decompress those?

it's still too much imho
maybe for archiving, when kept to oneself - accessed infrequently and space is really an issue
but otherwise (if such archives would be distributed)
those minutes multiplied with thousands of CDs and thousands of copies would turn into years
cHrI8l3
Posts: 314
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by cHrI8l3 »

so it's 30-40 minutes to decompress those?

it's still too much imho
maybe for archiving, when kept to oneself - accessed infrequently and space is really an issue
but otherwise (if such archives would be distributed)
I agree, for distribution a better method will need to be found, one which at least consumes less memory... I have in mind splitting all discs into e.g. 100mb parts and then merging with proper sorting.. with 100mb parts it would be enough to have 150-200mb of RAM for the repetition filter - I'll test this method soon
those minutes multiplied with thousands of CDs and thousands of copies would turn into years
c'mon man, I've recently been doing a massive recompression of about 1700 discs and it took barely 2 weeks with ultra 7-zip on 2x2.5ghz, and compressing with the same method inside FreeArc would be about 20% faster
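
For scale, the figures quoted above work out to roughly 12 minutes per disc. A quick check in Python, assuming the machine ran around the clock for the full two weeks (the post does not say so):

```python
discs = 1700   # "about 1700 discs" from the post
days = 14      # "barely 2 weeks"

# Assumption: compression ran 24 hours a day for all 14 days.
minutes_per_disc = days * 24 * 60 / discs
print(f"{minutes_per_disc:.1f} minutes per disc")
```

If the machine was idle part of the time, the real per-disc figure is lower.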
Last edited by cHrI8l3 on Fri Apr 03, 2009 4:08 am, edited 1 time in total.
topkat
Posts: 28
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by topkat »

I had some thoughts about space-saving as well. Since there are quite a lot of redundancies across various dumps, merging à la MAME's parent/clone relationships came to mind. Finally I did a test run with Ace/Air Combat.

First I renamed each track to its crc32 hash, e.g. 'cfbe8182.bin'. Then I looked at a recent MAME dat and created a corresponding merged dat file for the mentioned game. After that I altered the cue files to reflect the crc filenames. With its 60+ tracks per version it was quite an insane task. As a last step, I fired up clrmame and rebuilt a merged set with the created dat file. After torrentzip, the final file size was around 1/3 of the individual archives. Quite nice, I think.
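
The renaming and duplicate-hunting step could be automated along these lines. This is a rough Python sketch, not what was actually used: `crc32_of_file` and `merged_set` are hypothetical helper names, and writing the dat file and the altered cue files is not covered.

```python
import zlib

def crc32_of_file(path, block_size=1 << 20):
    """Return the CRC32 of a file as 8 lowercase hex digits."""
    crc = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            crc = zlib.crc32(block, crc)
    return f"{crc & 0xFFFFFFFF:08x}"

def merged_set(track_paths):
    """Map each crc-based filename (e.g. 'cfbe8182.bin') to every
    original track that has exactly that content."""
    merged = {}
    for path in track_paths:
        name = crc32_of_file(path) + ".bin"
        merged.setdefault(name, []).append(path)
    return merged
```

Any entry in the result that lists more than one original path is a track shared between versions, so it only needs to be stored once in the merged set.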

Pro:

- Good compression ratio
- Shareable archives due to torrentzip's consistent file hashes
- Compression duration is quite small

Contra:

- Manual editing/creating of merged dats
- Altered/New cue-files needed
- Manually hunting down identical files across game versions can be an insane task
- No emulator/frontend support yet?!
- Hard to tell files apart with only the crc-filenames

Questions:

- What to do with files identical across multiple games (eg. the Capcom dummy track)
- How to handle multi-disc games?

Sadly, due to the huge amount of manual editing, I lost interest and deleted all my work. If anyone is interested, I could redo an example...
Last edited by topkat on Fri Apr 03, 2009 5:49 am, edited 1 time in total.
cHrI8l3
Posts: 314
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by cHrI8l3 »

topkat, the idea is effective but it's definitely too much work editing those files, everything must be done manually, and you're losing the original file naming and cues, personally I would not want such a mess...

a good method should be done with one command, not change anything in the original files, and decompress files to their original state, and I have some new ideas about that...

Updates:
- low-memory methods using split have been added to test (config "Arc 6" needs only ~250mb for decompression)
- approximate compression time added to the results

However... there is still some manual work with splitting... it would be nice to have it done automatically before compression and auto-joined after decompression
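
The auto-join half of that wish is simple enough to sketch in Python. This is only an illustration with a made-up function name (`join_parts`), assuming parts are named base.001, base.002, ... with no gaps in the numbering:

```python
import os
import shutil

def join_parts(base_path):
    """Concatenate base_path.001, base_path.002, ... back into base_path.

    Returns the number of parts joined. Stops at the first missing number.
    """
    if not os.path.exists(f"{base_path}.001"):
        raise FileNotFoundError(f"no parts found for {base_path}")
    count = 0
    with open(base_path, "wb") as dst:
        while True:
            part = f"{base_path}.{count + 1:03d}"
            if not os.path.exists(part):
                break
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst)
            count += 1
    return count
```

Run once per image after extraction, this restores the original .bin; deleting the parts afterwards is left to the caller.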
Last edited by cHrI8l3 on Fri Apr 03, 2009 7:22 am, edited 1 time in total.
topkat
Posts: 28
Joined: Mon Jun 08, 2026 1:26 am

Re: Merged Compression Tests

Post by topkat »

cHrI8l3 wrote: topkat, the idea is effective but it's definitely too much work editing those files, everything must be done manually, and you're losing the original file naming and cues, personally I would not want such a mess...
Yup, that's why I trashed that idea

Still, a nice side effect was being able to easily audit sets with clrmame without first (de)compressing back and forth. Hopefully someone will come up with a solution with the best of both worlds...
Last edited by topkat on Fri Apr 03, 2009 9:18 am, edited 1 time in total.