I do my backups with rsync from the root directory, as is right and proper. One of my machines has a dozen git repositories scattered around its file system. This machine's backup target is at the other end of a not-fast connection.
And, git's data model seems to be "I'm going to continuously make 100% changes to multiple 50+ MB files, to ensure that there's nothing incremental about your incremental backups."
Is there any way to tune git's file usage to be less egregious? Do "git gc" and "git repack" make this worse, or better?
You're about to suggest that I not back up my git repositories, but just do a dozen different git checkouts on the backup-target machine instead. No. Let's just leave it at "no" so that I don't have to explain the several different ways in which that suggestion is stupid.
Kinda missing CVS right now.
Update: Unless I missed something, there have been 3½ suggestions here (rough commands are sketched after the list):
1. Turn off pack files and gc entirely, which will cause small loose-object files to accumulate for every future change, and will eventually make git slow. gc.auto=0, gc.autoPackLimit=0.
2. Set the maximum pack size to a smaller number, so that no single pack file gets too large, and subsequent layers of diffs get bundled into smaller pack files. pack.packSizeLimit.
3. Dissenting opinion on #2: That doesn't do what you think it does; it just slices a single large pack file into N different files containing the same bits, so you haven't saved anything.
4. If you already have one gigantic pack file, create a .keep file next to it. gc/repack will then leave that pack alone, so it stays byte-identical for rsync, and new pack files only contain the objects added since, and are thus smaller.
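For concreteness, here's roughly where those knobs live. This is a sketch only, run from inside each repository, assuming a normal non-bare repo with the default .git layout; the 64m figure is an arbitrary placeholder, not a recommendation.

    # Option 1: stop git from gc'ing and repacking on its own
    # (loose object files will accumulate instead).
    git config gc.auto 0
    git config gc.autoPackLimit 0

    # Option 2: cap how big any single pack file can get.
    git config pack.packSizeLimit 64m

    # Option 4: mark the existing big pack(s) as kept, so gc/repack
    # leaves them alone and future packs hold only newer objects.
    for pack in .git/objects/pack/pack-*.pack; do
        touch "${pack%.pack}.keep"
    done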
I guess option #4 is the only practical one?