fsfs-reshard.py

While not an official member of the Subversion toolchain, the fsfs-reshard.py script (found in the tools/server-side directory of the Subversion source distribution) is a useful performance tuning tool for administrators of FSFS-backed Subversion repositories. FSFS repositories contain files which describe the changes made in a single revision, and files which contain the revision properties associated with a single revision. Repositories created in versions of Subversion prior to 1.5 keep these files in two directories—one for each type of file. As new revisions are committed to the repository, Subversion drops more files into these two directories, and over time, the number of these files in each directory can grow to be quite large. This has been observed to cause perfomance problems on certain network-based filesystems.

Subversion 1.5 creates FSFS-backed repositories using a slightly modified layout in which the contents of these two directories are sharded, or scattered across several subdirectories. This can greatly reduce the time it takes the system to locate any one of these files, and therefore increases the overall performance of Subversion when reading from the repository. The number of subdirectories used to house these files is configurable, though, and that's where fsfs-reshard.py comes in. This script reshuffles the repository's file structure into a new arrangement which reflects the requested number of sharding subdirecties. This is especially useful for converting an older Subversion repository into the new Subversion 1.5 sharded layout (which Subversion will not automatically do for you), or for fine-tuning an already-sharded repository.