After restructuring a Mercurial repository, I wanted to split it into two separate projects while keeping the history clean.

I chose to rewrite the history of the old repository so that the commit which moved the files into a separate directory structure is treated as a removal of the original files, and all later history touching these files is ignored. At the same time I wanted to create a new repository that begins at that very commit and records the subsequent history of those files.

Disclaimer

Rewriting project history is, in a way, "the nuclear option" of distributed version control systems (VCS). A repository with modified history is, as far as the VCS is concerned, a completely different project. If a repository is shared among several parties and one of them decides to rewrite history, that person's repository will no longer be compatible with any changes the others commit after the point where the rewrite begins.

Do not rewrite repository history without understanding the role of history for the system in question.

However, rules are often more like guidelines, so let us proceed.

Initial project state

The directory include/Backend used to contain a project-specific backend, version controlled together with the frontend in the root directory. Because of interest in using the same backend in another project, I started to separate it cleanly from the rest of the project code before moving it into the vendor/FooBackend directory. After a few commits it became clear that things would be much smoother if FooBackend was kept in its own repository, since it had become a clearly separated library used in multiple contexts.

hg convert

Enter hg convert, an extension that ships with Mercurial (see the documentation on how to enable extensions). Its main purpose is to convert projects between different VCS formats, but through its --filemap switch it can also be helpful in converting a Mercurial repository into a new Mercurial repository.

Create a file map, new-repo.filemap, that keeps only vendor/FooBackend and moves its contents to the repository root:

include vendor/FooBackend
rename vendor/FooBackend .

Create another file map rewrite-old-repo.filemap:

exclude vendor/FooBackend
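
Both file maps can also be written straight from the shell. A small sketch that creates them in the current directory, using the file names and directives from above:

```shell
# Keep only vendor/FooBackend and move its contents to the repository root
cat > new-repo.filemap <<'EOF'
include vendor/FooBackend
rename vendor/FooBackend .
EOF

# Drop vendor/FooBackend and everything below it
cat > rewrite-old-repo.filemap <<'EOF'
exclude vendor/FooBackend
EOF
```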

Create the new repository:

hg convert /path/to/current/repo /path/to/new/repo --filemap new-repo.filemap

The new repository is now finished. Its working directory is still empty, but an hg update will bring its contents up to speed.

Create the rewritten repository:

hg convert /path/to/current/repo /path/to/rewritten/repo --filemap rewrite-old-repo.filemap
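
Before replacing anything, it can be worth checking that both conversions did what the file maps asked for. The helper below is a hypothetical sketch (files_under is not a Mercurial command): it counts the files tracked at the tip of a repository whose path lies below a given directory. For both the new and the rewritten repository, the count for vendor/FooBackend should come out as 0.

```shell
# files_under REPO DIR
# Print the number of files tracked at the tip of REPO below DIR/.
files_under() {
    hg -R "$1" manifest -r tip |
        awk -v prefix="$2/" 'index($0, prefix) == 1 { n++ } END { print n + 0 }'
}

# Example:
#   files_under /path/to/rewritten/repo vendor/FooBackend
#   files_under /path/to/new/repo vendor/FooBackend
```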

Retain customized configuration

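Customized .hg/hgrc settings have to be carried over to the rewritten repository by hand. Unless there are some very exotic changes, it should suffice to do something like the following (/path/to/old/repo is the original project directory, called /path/to/current/repo above):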

cp /path/to/old/repo/.hg/hgrc /path/to/rewritten/repo/.hg
rm -r /path/to/old/repo/.hg # If you run this without having a backup available, you should rethink your computer habits :-)
mv /path/to/rewritten/repo/.hg /path/to/old/repo
rmdir /path/to/rewritten/repo
hg update --cwd /path/to/old/repo

To recap and clarify: this will

  1. save the old .hg/hgrc file by copying it into the rewritten repository
  2. remove the complete Mercurial-specific directory structure from the old project
  3. replace it with the rewritten history
  4. remove the now empty rewritten repository directory
  5. update Mercurial's internal state so that it knows the current revision is present in the working directory (this will be a no-op, except for updating the internal state)

As mentioned in the code comment: be defensive and keep a backup copy in case things go pear-shaped.
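
One way to take that backup is to archive the whole project directory, .hg included, before running the commands above. A minimal sketch; the function name backup_repo and the archive name are arbitrary choices for this example:

```shell
# backup_repo DIR
# Write DIR (working files and .hg) to DIR.backup.tar.gz next to it.
# DIR must be given without a trailing slash.
backup_repo() {
    tar -czf "$1.backup.tar.gz" -C "$(dirname "$1")" "$(basename "$1")"
}

# Example:
#   backup_repo /path/to/old/repo
```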

If the .hg/hgrc modifications in the initial project directory are also relevant for the newly created split-off repository, just copy the file there:

cp /path/to/old/repo/.hg/hgrc /path/to/new/repo/.hg

Clean up distributed copies

Now, if this repository is present at multiple sites, the smoothest transition in my eyes is to use the strip extension at each site to remove as many commits as needed to reach the state before the split, and then do a new pull and update. If it is not obvious where the split occurred, you could also replace the entire repository with a clone of the modified version, but be wary of files outside version control that you might want to keep from the old directory (and once again remember custom .hg/hgrc files).
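
On each remaining copy, the strip-and-pull route could look roughly like the sketch below. It assumes a Mercurial version that bundles the strip extension; resync_clone and its three arguments are names invented for this example. The second argument is the earliest changeset that no longer exists in the rewritten history; strip removes it together with all of its descendants.

```shell
# resync_clone CLONE FIRST_OBSOLETE_REV SOURCE
#   CLONE:              local copy that still carries the old history
#   FIRST_OBSOLETE_REV: earliest changeset missing from the rewritten history
#   SOURCE:             path or URL of the rewritten repository
resync_clone() {
    # Remove the obsolete changeset and all of its descendants
    hg -R "$1" --config extensions.strip= strip -r "$2" || return
    # Fetch the rewritten changesets
    hg -R "$1" pull "$3" || return
    # Check out the new tip
    hg -R "$1" update
}

# Example:
#   resync_clone /path/to/some/clone REV /path/to/old/repo
```

Stopping at the first failing step is deliberate: if strip refuses to run, for instance because of uncommitted changes, nothing is pulled on top of the old history.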

Areas of application

Rewriting history in shared repositories comes with many problematic consequences. Doing so more or less requires that all parties adopt the new structure immediately. Otherwise you end up with, in the best case, spaghetti history that will take some effort to merge and, in the worst case, fully incompatible repositories and lost work.

For personal projects living at one or two controlled locations, it should be less problematic, and it can help keep repositories clean and logically consistent.