Until recently, the largest offender of disk space usage with respect to BDB-backed Subversion repositories was the log files in which Berkeley DB performs its pre-writes before modifying the actual database files. These files capture all the actions taken along the route of changing the database from one state to another—while the database files, at any given time, reflect a particular state, the log files contain all the many changes along the way between states. Thus, they can grow and accumulate quite rapidly.
Fortunately, beginning with the 4.2 release of Berkeley
DB, the database environment has the ability to remove its
own unused log files automatically. Any
repositories created using an svnadmin
which is compiled against Berkeley DB version 4.2 or greater
will be configured for this automatic log file removal. If
you don't want this feature enabled, simply pass the
--bdb-log-keep option to the
svnadmin create command. If you forget
to do this, or change your mind at a later time, simply edit
DB_CONFIG file found in your
db directory, comment out
the line which contains the
DB_LOG_AUTOREMOVE directive, and then run
svnadmin recover on your repository to
force the configuration changes to take effect. See the section called “Berkeley DB Configuration” for more information about
Without some sort of automatic log file removal in place, log files will accumulate as you use your repository. This is actually somewhat of a feature of the database system—you should be able to recreate your entire database using nothing but the log files, so these files can be useful for catastrophic database recovery. But typically, you'll want to archive the log files that are no longer in use by Berkeley DB, and then remove them from disk to conserve space. Use the svnadmin list-unused-dblogs command to list the unused log files:
$ svnadmin list-unused-dblogs /var/svn/repos /var/svn/repos/log.0000000031 /var/svn/repos/log.0000000032 /var/svn/repos/log.0000000033 … $ rm `svnadmin list-unused-dblogs /var/svn/repos` ## disk space reclaimed!
BDB-backed repositories whose log files are used as part of a backup or disaster recovery plan should not make use of the log file autoremoval feature. Reconstruction of a repository's data from log files can only be accomplished when all the log files are available. If some of the log files are removed from disk before the backup system has a chance to copy them elsewhere, the incomplete set of backed-up log files is essentially useless.