Introduction
Tom Lord's Arch is a revision control system, or in plain terms "Yet Another CVS". While it might be perceived as that on the surface, the fact is that Arch tries to solve a much more difficult problem than CVS does.
One of the main reasons Tom Lord started coding Arch was to create a distributed revision control system, in the sense that everyone controls the "primary" version of a project, namely their own. Rather than having a central repository that all users access and modify, each user has her own repository, her own personal version of the project.
While this may seem a bold idea at first, it brings several advantages to the table. First and foremost, a central repository means a single point of failure, a single point of control and a single point of modification (add more "single point of ..." if you like). Besides the single-point-of-failure argument, there is the point that most software (open source software in particular) is developed in parallel. In a single-repository versioning system I would have to develop my new feature either in another branch or in a single commit. Adding a new feature in a single commit is possible, but quite a daunting task.
Arch solves this problem by letting you commit all you want to your personal repository. When the feature is done, you can send the patch to other people who might want to use it (or rather, other developers might have added your repository to their list of repositories and synchronise with your tree). The point is that Arch makes this easy, almost trivial, to do, and the users of your code will be able to see all the commits you made and where the patch came from.
Branching is also possible. In CVS, branching is difficult at the very least, but in Arch it is as natural as patching. Merging of different branches can normally be handled automatically, and all versioning information is preserved.
I might add that for a laptop user like me, it is wonderful to be able to work on code and commit to it while being offline.
Another point is that the Linux kernel team also uses a distributed revision control system for developing the Linux kernel: BitKeeper. The only downside is that it is closed source.
Besides being distributed, Arch also fixes the design flaws that CVS users have lived with for so long:
Atomic commits
All commits are atomic, that is, either the whole patch is committed or none of it is. No more half revisions in the repository. This is also helped by having your own personal repository.
Commit-based versioning
Each commit creates a new version of the whole tree (Arch refers to this as a patch-level): one version increase per commit, not "each file is versioned by itself with some incomprehensible 1.xx version number" as CVS seems to like. A version is more like a "state" of the project tree.
With that rather lengthy, and not the most understandable, discussion of the design decisions behind Arch out of the way, it's on to using the damn thing.
Oh, and installation is as easy as:
apt-get install tla
If this doesn't make sense to you, then switch to a proper operating system, for Pete's sake.
Identifying yourself
For Arch (and other people) to be able to distinguish your patches from other people's patches, you need to introduce yourself to tla:
$ tla my-id "My Name <my_email@address.org>"
You can reassure yourself that Arch understood this by running the command again without an argument, which prints the stored id:
$ tla my-id
My Name <my_email@address.org>
Creating your own archive
Before starting to use other people's archives, it is good to define your own repository, so that you have a place to store your own revisions. I use ~/.arch-archives as my directory for storing archives.
$ mkdir ~/.arch-archives
$ tla make-archive my_email@address.org--2004 ~/.arch-archives/my_email@address.org
$ tla register-archive my_email@address.org--2004 ~/.arch-archives/my_email@address.org
If this is an archive you would like others to access through HTTP, you should add the -l option to tla make-archive; this tells Arch to create the .listing files and keep them up to date. Another option I urge you to use is -s, which tells Arch to sign your archive using GnuPG and allows the users of your code to verify that the code they get from your archive has not been tampered with.
You might be a bit confused about the --2004 part. Arch archives can take up quite a lot of space, so the convention is to put the year in the archive name; when an archive grows too large, you create a new archive with another year indicator and move the most current revision to it. Don't worry too much about it.
To ease your use of Arch, it is nice to set your default archive to your own archive. Every Arch command for which you don't specify an archive will then work with the default archive.
$ tla my-default-archive my_email@address.org--2004
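The steps in this section can be collected into one small shell function. This is only a sketch: the name setup_archive and its arguments are made up for illustration, it assumes tla is on your PATH, and it leaves out the register-archive step on the assumption that make-archive already registers the archive it creates (add it back if your tla needs it). Any extra arguments, such as the -l and -s options described above, are passed straight to tla make-archive.

```shell
#!/bin/sh
# Hypothetical helper, not part of tla: create a yearly archive under
# ~/.arch-archives and make it the default archive.
# Usage: setup_archive EMAIL YEAR [make-archive options, e.g. -l -s]
setup_archive() {
    [ $# -ge 2 ] || { echo "usage: setup_archive EMAIL YEAR [options]" >&2; return 2; }
    email=$1
    year=$2
    shift 2
    name="${email}--${year}"         # e.g. my_email@address.org--2004
    root="$HOME/.arch-archives"      # the directory used in the text above
    mkdir -p "$root" &&
    tla make-archive "$@" "$name" "$root/$email" &&
    tla my-default-archive "$name"
}
```

For example, setup_archive my_email@address.org 2004 -l -s would create a signed archive with .listing files and select it as the default.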
Arch's naming scheme
An archive consists of categories, categories consist of branches, branches consist of versions and versions consist of patch-levels. This is quite a handful to swallow at once, so let's try to clear up each term:
category
A category is basically just a project. The name may seem ill chosen, but it is the name Tom Lord gave it (a name that induces some respect) and no one has opted to change it, yet. You could of course just create your own version.
branch
A branch is just like a branch in any other revision control system. I use trunk as the name of the current development branch, but the Arch community seems to like devo or dev better. Choose whichever you like.
version
A version is a version of a project; how this could be useful is entirely up to you. I could see a use for it in declaring each released version of a project a separate version, but then again, a branch could just as easily do that.
patch-level
Think "revision" instead. Each commit is given a new revision name. The initial import of the source is called base-0, and subsequent commits are named patch-1, patch-2 and so forth.
Using these terms we can define a unique revision of a tree, e.g. the newest revision of the vertigo trunk as:
vertigo--trunk--0--patch-231
Or if you want to make it a globally unique name:
halfdan@halfdans.net--2004/vertigo--trunk--0--patch-231
This makes it quite easy to discuss new patches in e.g. email correspondence, even though development is distributed.
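To make the structure of such a name concrete, here is a minimal sketch that takes a name apart again using plain shell parameter expansion. The function name split_revision is made up for illustration, and the sketch assumes that the category, branch and version do not themselves contain "--".

```shell
#!/bin/sh
# Hypothetical helper: split ARCHIVE/CATEGORY--BRANCH--VERSION--PATCHLEVEL
# into its parts. The ARCHIVE/ prefix is optional.
split_revision() {
    full=$1
    case $full in
        */*) archive=${full%%/*}; rest=${full#*/} ;;   # split at the first "/"
        *)   archive=;            rest=$full ;;
    esac
    patch=${rest##*--}       # last "--"-separated field, e.g. patch-231
    rest=${rest%--*}
    version=${rest##*--}
    rest=${rest%--*}
    branch=${rest##*--}
    category=${rest%--*}
    printf 'archive=%s\ncategory=%s\nbranch=%s\nversion=%s\npatch-level=%s\n' \
        "$archive" "$category" "$branch" "$version" "$patch"
}
```

Calling split_revision halfdan@halfdans.net--2004/vertigo--trunk--0--patch-231 prints one part per line: the archive, then vertigo, trunk, 0 and patch-231.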
Examining an archive
Now that you have grasped Arch's naming scheme, you can start exploring an archive. Four Arch commands will do that: tla categories, tla branches, tla versions and tla revisions. Their use should be almost self-explanatory, but a small example is probably in order:
$ tla categories
....
vertigo
...
$ tla branches vertigo
vertigo--trunk
$ tla versions vertigo--trunk
vertigo--trunk--0
$ tla revisions vertigo--trunk--0
base-0
patch-1
patch-2
...
patch-231
patch-232
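The four commands chain together naturally, since the output of each is valid input for the next. As a rough sketch, the function below (latest_revisions is a made-up name) walks the default archive and prints the newest revision of every version. It assumes that tla revisions lists revisions oldest first, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the newest revision of every version in the
# default archive, using only the four listing commands shown above.
latest_revisions() {
    tla categories | while read -r category; do
        # </dev/null keeps the inner commands from eating the outer loop's input
        tla branches "$category" </dev/null | while read -r branch; do
            tla versions "$branch" </dev/null | while read -r version; do
                latest=$(tla revisions "$version" </dev/null | tail -n 1)
                printf '%s--%s\n' "$version" "$latest"
            done
        done
    done
}
```

The output is one full revision name per line, in the same category--branch--version--patch-level form used in the previous section.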
Starting a new project
Ok, you are now ready to create your own project. Let's assume that we want to write yet another tetris clone. We want to register the project name tetris, a branch called trunk and a version called 0:
$ tla archive-setup tetris--trunk--0
That's it: the category, branch and version have now been created. Next you should put some files into the first revision, or at least commit the base revision. Go to the directory that will serve as the initial base.
$ cd my-tetris-clone
First you initialise that directory as an Arch directory, and tell Arch what project--branch--version it should be:
$ tla init-tree tetris--trunk--0
You should now observe that a tree has been created, and you are almost ready to do an import of the initial base. But first you need a log file. Arch will automatically open your $EDITOR for this, and go on with the import after you've created the log. So without further explanation:
$ tla import
That's it. Later on, when you want to commit something to the archive, just do:
$ tla commit
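Put together, starting a project is three tla commands in a fixed order. The sketch below wraps them in a function; start_project is a made-up name, and it assumes your default archive is already set as described earlier. It stops at the first command that fails.

```shell
#!/bin/sh
# Hypothetical helper: put an existing directory under Arch control.
# Usage: start_project DIRECTORY CATEGORY--BRANCH--VERSION
start_project() {
    dir=$1
    version=$2
    tla archive-setup "$version" &&
    (
        # the subshell keeps the cd from affecting the caller
        cd "$dir" &&
        tla init-tree "$version" &&
        tla import        # asks for the log message, as described above
    )
}
```

For the example in this section that would be start_project my-tetris-clone tetris--trunk--0.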
But how do you add files? This is described in the next section.
Managing files
Arch has a peculiar way of controlling files: every file has two names, a file name and an id. This concept is quite hard to grasp at first, so don't think too much about it. The reason for having an id is to be able to track the file if it changes name or is moved about the project tree. Arch automatically assigns reasonable ids, so you don't have to worry about them. To add a file to a repository, use the tla add command:
$ tla add main.c
The file is now added, and will be committed to the archive on the next tla commit. You can also remove files:
$ tla rm main.c
And move files:
$ tla move main.c init.c
You don't have to worry about whether some other user of your code has renamed a file to something else. Because files are identified by their ids, Arch will recognise that your patch for init.c should be applied to another user's main.c, unless your patch also contains the information to rename the file.
The general Arch phrase is: don't think about it, Arch will take care of the hard things.
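If you do want to think about it, a toy model may help. The sketch below is not how tla stores ids; the inventories, the ids and the function path_for_id are all invented for illustration. Each tree is described as lines of "ID PATH", and a patch names its target by id, so every tree resolves the same id to its own file name.

```shell
#!/bin/sh
# Toy model only: look up the path that a given id has in one tree.
# $1 = id to look up, $2 = inventory text with one "ID PATH" pair per line
path_for_id() {
    printf '%s\n' "$2" | awk -v id="$1" '$1 == id { print $2 }'
}

# My tree after "tla move main.c init.c"; the id stays the same.
my_tree='id-0001 init.c
id-0002 Makefile'

# Another user's tree, where the file still has its old name.
their_tree='id-0001 main.c
id-0002 Makefile'
```

Here path_for_id id-0001 "$my_tree" gives init.c while path_for_id id-0001 "$their_tree" gives main.c, which is why a patch made against my init.c can still find the right file in the other tree.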
Logs
Contrary to other revision control systems, Arch lets you edit the log file for the next commit as you go. This allows for much better log messages: when you have made a change in the code, write the log entry immediately, and when you are ready to commit, the log message is already in perfect condition, with all the changes you made described in it. This takes some getting used to, but Arch still supports the CVS way of thinking, so choose what fits your purpose.
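One way to work like this is with the tla make-log command, which is not covered elsewhere in this text. The sketch below assumes that make-log prints the path of the log file for the current tree and that it is run from inside a tree set up with init-tree; how it behaves when a log file already exists may depend on your tla version. The function name edit_log is made up.

```shell
#!/bin/sh
# Hypothetical helper: open the pending log message in your editor.
# Assumes "tla make-log" prints the path of the log file.
edit_log() {
    log=$(tla make-log) || return 1
    # $EDITOR is left unquoted so that values like "emacs -nw" still work
    ${EDITOR:-vi} "$log"
}
```

Run edit_log after each change worth mentioning, and tla commit when the work is done.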
Tagging
Not written yet
Registering other people's archives
Ok, Arch knows who you are, you have your own archive and you are familiar with Arch's category--branch--version naming scheme. Now you are ready to get some code from other guys. Easy:
$ tla register-archive halfdan@halfdans.net--2004 http://dev.infonet.dk/~rto/arch
That wasn't hard, was it? Now you can muck about with my archive using the standard Arch commands. One thing to remember is to add the -A <archive> option to all the commands to tell Arch to use my archive rather than your own default archive. Or you could just set my archive as your default archive using:
$ tla my-default-archive halfdan@halfdans.net--2004
And changing it back to your archive is just as easy.
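If you only want a quick look, the -A route saves you from switching default archives back and forth. A minimal sketch, with browse_archive as a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: register someone else's archive and list its
# categories without changing your default archive.
# Usage: browse_archive ARCHIVE-NAME LOCATION
browse_archive() {
    tla register-archive "$1" "$2" &&
    tla categories -A "$1"
}
```

With the archive from this section that would be browse_archive halfdan@halfdans.net--2004 http://dev.infonet.dk/~rto/arch. Because of the &&, nothing is listed if the registration step fails.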