Repository Layer

When referring to Subversion's Repository Layer, we're generally talking about two basic concepts: the versioned filesystem implementation (accessed via libsvn_fs, and supported by its libsvn_fs_base and libsvn_fs_fs plugins), and the repository logic that wraps it (as implemented in libsvn_repos). These libraries provide the storage and reporting mechanisms for the various revisions of your version-controlled data. This layer is connected to the Client Layer via the Repository Access Layer, and is, from the perspective of the Subversion user, the stuff at the “other end of the line.”

The Subversion Filesystem is not a kernel-level filesystem that one would install in an operating system (such as Linux ext2 or NTFS), but a virtual filesystem. Rather than storing “files” and “directories” as real files and directories (as in, the kind you can navigate through using your favorite shell program), it uses one of two available abstract storage backends: either a Berkeley DB database environment, or a flat-file representation. (To learn more about the two repository backends, see the section called “Choosing a Data Store”.) There has even been considerable interest by the development community in giving future releases of Subversion the ability to use other backend database systems, perhaps through a mechanism such as Open Database Connectivity (ODBC). In fact, Google did something similar to this before launching the Google Code Project Hosting service: it announced in mid-2006 that members of its Open Source team had written a new proprietary Subversion filesystem plugin which used Google's ultra-scalable Bigtable database for its storage.

The filesystem API exported by libsvn_fs contains the kinds of functionality you would expect from any other filesystem API—you can create and remove files and directories, copy and move them around, modify file contents, and so on. It also has features that are not quite as common, such as the ability to add, modify, and remove metadata (“properties”) on each file or directory. Furthermore, the Subversion Filesystem is a versioning filesystem, which means that as you make changes to your directory tree, Subversion remembers what your tree looked like before those changes. And before the previous changes. And the previous ones. And so on, all the way back through versioning time to (and just beyond) the moment you first started adding things to the filesystem.
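To make the idea of a versioning filesystem with per-node properties concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the libsvn_fs API: the class ToyVersionedTree and its methods are hypothetical names invented for illustration, and the model assumes that revision 0 is an empty root directory.

```python
import copy


class ToyVersionedTree:
    """Toy model (hypothetical, not libsvn_fs): every set of changes
    produces a new snapshot, and all older snapshots are kept."""

    def __init__(self):
        # Assumption: revision 0 holds nothing but an empty root directory.
        self._snapshots = [{"/": {"kind": "dir", "props": {}}}]

    def youngest(self):
        """Number of the most recent snapshot."""
        return len(self._snapshots) - 1

    def snapshot(self, rev):
        """Return a copy of the tree as it looked in snapshot `rev`."""
        return copy.deepcopy(self._snapshots[rev])

    def apply(self, edit):
        """Copy the latest tree, let `edit` change the copy, and store it."""
        tree = copy.deepcopy(self._snapshots[-1])
        edit(tree)
        self._snapshots.append(tree)
        return self.youngest()


def add_readme(tree):
    tree["/README"] = {"kind": "file", "contents": "hello\n", "props": {}}


def edit_readme(tree):
    # Change both the file contents and a piece of metadata (a property).
    tree["/README"]["contents"] = "hello, world\n"
    tree["/README"]["props"]["svn:eol-style"] = "native"


fs = ToyVersionedTree()
r1 = fs.apply(add_readme)
r2 = fs.apply(edit_readme)

# The older snapshots are still there, untouched by the later edit.
for rev in range(fs.youngest() + 1):
    print(rev, fs.snapshot(rev).get("/README"))
```

Copying the whole tree for every change is wasteful and is done here only to keep the sketch short; it says nothing about how the real storage backends represent revisions.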

All the modifications you make to your tree are done within the context of a Subversion commit transaction. The following is a simplified general routine for modifying your filesystem:

  1. Begin a Subversion commit transaction.

  2. Make your changes (adds, deletes, property modifications, etc.).

  3. Commit your transaction.

Once you have committed your transaction, your filesystem modifications are permanently stored as historical artifacts. Each of these cycles generates a single new revision of your tree, and each revision is forever accessible as an immutable snapshot of “the way things were.”
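The begin, change, commit cycle described above can be sketched as follows. This is a toy Python model under stated assumptions, not the libsvn_fs API: ToyFS, ToyTxn, and their methods are hypothetical names, a tree is reduced to a mapping from path to contents, and concurrent transactions are ignored entirely.

```python
from types import MappingProxyType


class ToyTxn:
    """Hypothetical transaction: a private, mutable copy of a base tree."""

    def __init__(self, fs, base_rev):
        self._fs = fs
        self.base_rev = base_rev
        self.tree = dict(fs.revision(base_rev))

    def put(self, path, contents):
        self.tree[path] = contents

    def delete(self, path):
        del self.tree[path]

    def commit(self):
        return self._fs._store(self.tree)


class ToyFS:
    """Hypothetical filesystem: an append-only list of read-only trees."""

    def __init__(self):
        # Assumption: revision 0 is an empty tree.
        self._revisions = [MappingProxyType({})]

    def youngest(self):
        return len(self._revisions) - 1

    def revision(self, rev):
        return self._revisions[rev]

    def begin_txn(self):
        return ToyTxn(self, self.youngest())

    def _store(self, tree):
        # Freeze a copy of the tree so the new revision cannot be altered.
        self._revisions.append(MappingProxyType(dict(tree)))
        return self.youngest()


fs = ToyFS()
txn = fs.begin_txn()                    # 1. begin a transaction
txn.put("/trunk/README", "first draft")  # 2. make changes
youngest_before_commit = fs.youngest()  # nothing is visible yet
rev = txn.commit()                      # 3. commit: one new revision
print(youngest_before_commit, rev, dict(fs.revision(rev)))
```

Note that the uncommitted change is invisible to other readers of the toy filesystem until the commit step, and that the committed tree is wrapped in a read-only view to mimic the immutability of a revision.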

Most of the functionality provided by the filesystem interface deals with actions that occur on individual filesystem paths. That is, from outside of the filesystem, the primary mechanism for describing and accessing the individual revisions of files and directories comes through the use of path strings like /foo/bar, just as if you were addressing files and directories through your favorite shell program. You add new files and directories by passing their paths-to-be to the right API functions. You query for information about them by the same mechanism.

Unlike most filesystems, though, a path alone is not enough information to identify a file or directory in Subversion. Think of a directory tree as a two-dimensional system, where a node's siblings represent a sort of left-and-right motion, and descending into subdirectories a downward motion. Figure 8.1, “Files and directories in two dimensions” shows a typical representation of a tree as exactly that.

Figure 8.1. Files and directories in two dimensions

The difference here is that the Subversion filesystem has a nifty third dimension that most filesystems do not have: Time! [51] In the filesystem interface, nearly every function that has a path argument also expects a root argument. This svn_fs_root_t argument describes either a revision or a Subversion transaction (which is simply a revision-in-the-making), and provides that third-dimensional context needed to understand the difference between /foo/bar in revision 32, and the same path as it exists in revision 98. Figure 8.2, “Versioning time—the third dimension!” shows revision history as an added dimension to the Subversion filesystem universe.

Figure 8.2. Versioning time—the third dimension!
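The pairing of a root with a path can be illustrated with a small sketch. The code below is a toy Python model, not the real interface: ToyRoot only stands in for the role that svn_fs_root_t plays, read_file is a hypothetical helper, and the tree contents are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRoot:
    """Hypothetical stand-in for a root: it fixes *which* tree
    (a revision or a transaction) a path is resolved in."""
    label: str
    tree: dict  # path -> file contents


def read_file(root, path):
    """Resolve `path` within `root`; a path means nothing without a root."""
    try:
        return root.tree[path]
    except KeyError:
        raise FileNotFoundError(f"{path} does not exist in {root.label}") from None


# Two revision roots and one transaction root (a revision-in-the-making).
rev32 = ToyRoot("revision 32", {"/foo/bar": "old text"})
rev98 = ToyRoot("revision 98", {"/foo/bar": "new text", "/foo/baz": "added later"})
txn = ToyRoot("transaction based on revision 98",
              {**rev98.tree, "/foo/bar": "uncommitted edit"})

# The same path yields different answers depending on the root.
for root in (rev32, rev98, txn):
    print(root.label, "->", read_file(root, "/foo/bar"))
```

Asking the toy model for /foo/baz under the revision 32 root raises FileNotFoundError, because in that tree the path has not been created yet.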

As we mentioned earlier, the libsvn_fs API looks and feels like any other filesystem, except that it has this wonderful versioning capability. It was designed to be usable by any program interested in a versioning filesystem. Not coincidentally, Subversion itself is interested in that functionality. But while the filesystem API should be sufficient for basic file and directory versioning support, Subversion wants more—and that is where libsvn_repos comes in.

The Subversion repository library (libsvn_repos) sits (logically speaking) atop the libsvn_fs API, providing additional functionality beyond that of the underlying versioned filesystem logic. It does not completely wrap each and every filesystem function; only certain major steps in the general cycle of filesystem activity are wrapped by the repository interface. These include the creation and commit of Subversion transactions, and the modification of revision properties. These particular events are wrapped by the repository layer because they have hooks associated with them. A repository hook system is not strictly related to implementing a versioning filesystem, so it lives in the repository wrapper library.
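The layering described here, a repository wrapper that runs hooks around a filesystem step, can be sketched as follows. This is a toy Python model, not libsvn_repos: ToyFS, ToyRepos, HookRejected, and the callable-based hooks are all hypothetical, chosen only to show why the hook logic sits in the wrapper rather than in the filesystem itself.

```python
class ToyFS:
    """Hypothetical bare filesystem: it stores trees and knows nothing of hooks."""

    def __init__(self):
        self.revisions = [{}]  # assumption: revision 0 is an empty tree

    def commit(self, tree):
        self.revisions.append(dict(tree))
        return len(self.revisions) - 1


class HookRejected(Exception):
    """Raised by the toy wrapper when a pre-commit hook vetoes a commit."""


class ToyRepos:
    """Hypothetical repository wrapper: it wraps only the commit step
    and runs optional hooks before and after it."""

    def __init__(self, fs, pre_commit=None, post_commit=None):
        self.fs = fs
        self.pre_commit = pre_commit
        self.post_commit = post_commit

    def commit(self, tree, log_message):
        if self.pre_commit and not self.pre_commit(tree, log_message):
            raise HookRejected("pre-commit hook rejected the commit")
        rev = self.fs.commit(tree)
        if self.post_commit:
            self.post_commit(rev)
        return rev


notified = []  # revisions reported by the post-commit hook

repos = ToyRepos(
    ToyFS(),
    pre_commit=lambda tree, log: bool(log.strip()),  # require a log message
    post_commit=notified.append,
)

rev = repos.commit({"/README": "hello"}, "Add README")
try:
    repos.commit({"/README": "oops"}, "   ")
    rejected = False
except HookRejected:
    rejected = True

print(rev, rejected, notified)
```

Because the hooks live in the wrapper, code that talks to the toy filesystem directly bypasses them, which mirrors the reason the text gives for wrapping these particular steps at the repository level.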

The hooks mechanism is but one of the reasons for the abstraction of a separate repository library from the rest of the filesystem code. The libsvn_repos API provides several other important utilities to Subversion. These include the abilities to:

  • create, open, destroy, and perform recovery steps on a Subversion repository and the filesystem included in that repository.

  • describe the differences between two filesystem trees.

  • query for the commit log messages associated with all (or some) of the revisions in which a set of files was modified in the filesystem.

  • generate a human-readable “dump” of the filesystem, a complete representation of the revisions in the filesystem.

  • parse that dump format, loading the dumped revisions into a different Subversion repository.
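Two of the utilities listed above, describing the differences between two trees and finding the log messages of revisions that touched a set of paths, can be illustrated with a short sketch. It is a toy Python model, not the libsvn_repos API: diff_trees and log_for_paths are hypothetical helpers, trees are plain path-to-contents mappings, and the sample history is invented.

```python
def diff_trees(old, new):
    """Describe how `new` differs from `old` (both map path -> contents)."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }


def log_for_paths(revisions, paths):
    """Return (revision, log message) for each revision that changed any of `paths`."""
    wanted = set(paths)
    hits = []
    for rev in range(1, len(revisions)):
        changes = diff_trees(revisions[rev - 1]["tree"], revisions[rev]["tree"])
        touched = set(changes["added"]) | set(changes["deleted"]) | set(changes["modified"])
        if touched & wanted:
            hits.append((rev, revisions[rev]["log"]))
    return hits


# Invented sample history; index in the list is the revision number.
revisions = [
    {"log": "", "tree": {}},
    {"log": "Add README", "tree": {"/README": "v1"}},
    {"log": "Add Makefile", "tree": {"/README": "v1", "/Makefile": "all:"}},
    {"log": "Update README, drop Makefile", "tree": {"/README": "v2"}},
]

changes_2_to_3 = diff_trees(revisions[2]["tree"], revisions[3]["tree"])
readme_log = log_for_paths(revisions, ["/README"])
makefile_log = log_for_paths(revisions, ["/Makefile"])

print(changes_2_to_3)
print(readme_log)
print(makefile_log)
```

The sketch compares whole trees on every query, which is fine for a handful of toy revisions but is only meant to show what kind of question each utility answers.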

As Subversion continues to evolve, the repository library will grow with the filesystem library to offer increased functionality and configurable option support.

[51] We understand that this may come as a shock to sci-fi fans who have long been under the impression that Time was actually the fourth dimension, and we apologize for any emotional trauma induced by our assertion of a different theory.