maximum compression for backups

aimeeandbeatles

Every month I do a backup of my WordPress content folder. This consists mostly of images, PDFs, and text-based files, with a few other types scattered about.

Also, when I make scans, I keep the original bitmaps, which are rather large files.

Currently I am using the 7z format at the Ultra level to compress these. (You may tell me "don't bother, hard drive space is cheap," but it seems wasteful to spend hard drive space on backups when nice lossless FLACs could be taking up that space instead.)

I'm not worried about speed or CPU usage when extracting these, as I don't need them on the fly.

Is there a better format I should use?
 
What format are the bitmaps stored in? If these are stored in an efficient format (e.g., PNG if you want lossless), I'm not sure the backup format should matter too much.
 
Well, the edited files (usually resized to about 30-40%, cropped, and rotated) are in PNG. However, I keep the original bitmap (BMP) files around for backup purposes.

I'll PM you to show you what I mean.
 
No -- as I said, the PNGs are cropped, rotated, etc. I like to keep the original scans around.
 
In PM you said something about 7z being worse at compressing PNGs. That is, the size of the 7z archive when using BMP is smaller than when using PNG?

What happens if you compress the images on their own - is the 7z size smaller than the total of the PNG files (without compressing with 7z)?

I guess it's a case that 7z is actually better than PNG, at least with your images (see http://sourceforge.net/projects/sevenzip/forums/forum/45797/topic/1561858 for some discussion on this effect). In that case, my guess is that starting from uncompressed BMPs will give the best results: if 7z clearly has a harder job with input that is already compressed, then you can't do better than an uncompressed format like BMP. Do check that your BMPs don't use run-length encoding, though, which I believe BMP supports.
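One way to check the run-length-encoding point is to read the compression field straight from the BMP header. This is a minimal sketch, assuming the file uses the common BITMAPINFOHEADER layout (a 14-byte file header followed by a DIB header of at least 40 bytes); the function names are made up for illustration.

```python
import struct

# Values of the compression field in a BITMAPINFOHEADER.
BMP_COMPRESSION = {
    0: "BI_RGB (uncompressed)",
    1: "BI_RLE8 (run-length encoded)",
    2: "BI_RLE4 (run-length encoded)",
    3: "BI_BITFIELDS (uncompressed)",
}


def bmp_compression(header: bytes) -> str:
    """Name the compression method given the first 34 bytes of a BMP file."""
    if len(header) < 34 or header[:2] != b"BM":
        raise ValueError("not a BMP file")
    # The DIB header starts at offset 14; its first field is its own size.
    dib_size = struct.unpack_from("<I", header, 14)[0]
    if dib_size < 40:
        raise ValueError("old-style header without a compression field")
    # The compression field sits at offset 30 (little-endian, 4 bytes).
    code = struct.unpack_from("<I", header, 30)[0]
    return BMP_COMPRESSION.get(code, f"other ({code})")


def bmp_file_compression(path: str) -> str:
    """Hypothetical convenience wrapper: inspect a BMP file on disk."""
    with open(path, "rb") as f:
        return bmp_compression(f.read(34))
```

Calling `bmp_file_compression` on one of the scans should report `BI_RGB` if the scanner software writes plain uncompressed bitmaps.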

But what sort of difference in archive size are you talking about? If you're worried about disk space, there's also the fact that your original WordPress folder is taking up space, and I imagine it is a lot bigger with BMPs than it would be with PNGs. You may be better off saving space there with PNGs if the extra archive size is not that much.
 
Scan to PNG.

No option for it. I mentioned this repeatedly before.

My scanned files are kept separate from the WordPress backups.

But here's my process for the scanning:

a) Scan it at a relatively high resolution. More than I need.
b) Convert them all to PNG, keeping the original bitmaps.
c) Stick the bitmaps in an archive to get them out of the way.
d) Resize the new PNG files (usually anywhere from 30-60% of the original size) and rotate and crop if needed. (I couldn't find a rotate option on my scanner software and the crop is not detailed enough so I worry about cropping too much.)
e) When I'm ready to upload them to the site, I create a PDF file (I usually use FlateDecode or High-Quality JPEG, depending on which gives the better size/quality ratio), create copies of the PNG images and rename them for the site, crop them down further (sometimes you only need one little box) and run them all through PNGGauntlet. Sometimes I also decrease the color depth.

So if you were to compare an image on my site to the original "archival" (no pun) PNG image, the website version would be smaller.

I keep the original bitmaps around because they're not resized or anything.

Now, suppose I were compressing a bunch of bitmaps with an original size of about 100 MB. If I compressed the bitmaps without doing anything else, I would get maybe 50 MB. If I converted them to PNG images before compressing them, the archive would be a few megabytes larger (I don't remember the exact difference).
 
I don't think anyone can know if compressing a format other than BMP would give a better result - you'd have to try it and see.

But if 7-Zip is managing to beat PNG, and it also does better when compressing the original uncompressed BMP files, then I don't see how starting from any other format would do better.

The only other thing to try might be lossless JPEG 2000, to see if that does better than 7zip.
 
From what I understand, compression happens by squishing together repeating strings. When you turn a bitmap into a PNG, it's already squished together and thus 7-Zip wouldn't be able to squish it as efficiently as it would squish the original bitmaps.
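This can be tried out with Python's standard library. It is only a sketch under a few assumptions: the sample is synthetic noise rather than a real scan, `zlib` stands in for PNG's DEFLATE stage (PNG also applies a filtering step that is skipped here), and `lzma` writes an .xz stream rather than a 7z archive, although both use the LZMA family of compressors. `size_report` is a hypothetical helper name.

```python
import lzma
import random
import zlib


def size_report(raw: bytes) -> dict:
    """Compare compressing raw data directly with compressing it after DEFLATE."""
    deflated = zlib.compress(raw, 9)  # stand-in for PNG's internal compression
    return {
        "raw": len(raw),
        "deflate": len(deflated),
        "lzma of raw": len(lzma.compress(raw)),
        "lzma of deflate": len(lzma.compress(deflated)),
    }


# Synthetic stand-in for scanned paper: mostly white with a little noise.
# This is an assumption for illustration, not real bitmap data.
rng = random.Random(0)
sample = bytes(rng.choice(b"\xf8\xfa\xfc\xfe\xff\xff\xff\xff") for _ in range(200_000))

for label, size in size_report(sample).items():
    print(f"{label:>16}: {size} bytes")
```

Comparing the last two lines of output shows how much the second compressor gains from seeing the raw bytes instead of the already-deflated ones. The numbers for real scans will differ, so it is worth running `size_report` on the bytes of an actual bitmap.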

Anyways, the PNG files are resized and cropped and rotated. The bitmaps are exactly what came off the scanner. No editing whatsoever. I keep them in case, for some reason, something happens that would normally require me to re-scan and I don't want to do that.

Also, remember that my website is a mixture of different file types. I just download the entire folder and then stick it in an archive. I tried zip, tar/gz, bzip2, and 7z. 7z got the smallest size. (I don't have WinRAR and am not planning to buy or illegally download it, so don't suggest RAR to me.) I didn't try wim or xz, which are also in 7-Zip, as I don't know much about those formats.
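For anyone who wants to repeat that kind of comparison without creating each archive by hand, here is a rough sketch using only Python's standard library. It bundles a folder into an in-memory tar and reports the compressed size for three compressor families. The function names are hypothetical, the whole folder is held in memory (so this suits modest folders only), and the xz result only approximates a .7z archive: both rely on LZMA-family compression, but the containers and default settings differ.

```python
import bz2
import io
import lzma
import tarfile
import zlib


def tar_bytes(folder: str) -> bytes:
    """Bundle a folder into an uncompressed tar archive held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(folder, arcname=".")
    return buf.getvalue()


def compare_codecs(data: bytes, xz_preset: int = 6) -> dict:
    """Return the compressed size in bytes for each compressor family."""
    return {
        "deflate (zip/gzip family)": len(zlib.compress(data, 9)),
        "bzip2": len(bz2.compress(data, 9)),
        "xz (LZMA2)": len(lzma.compress(data, preset=xz_preset)),
    }


# Example use (the folder name is a placeholder):
#   sizes = compare_codecs(tar_bytes("wp-content"))
#   for name, size in sorted(sizes.items(), key=lambda kv: kv[1]):
#       print(f"{name}: {size} bytes")
```

Raising `xz_preset` to 9 uses a larger dictionary at the cost of much more memory and time, which is usually acceptable for a monthly backup.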
 
You should try converting to PNG and saving those as your backup. Then you'll have original and resized PNGs. This should be more efficient than using 7z, because PNG is designed for images.
 