After restructuring a Mercurial repository, I wanted to split it into two separate projects while keeping the history clean.
I chose to rewrite the history of the old repository so that the commit that moved the files into a separate directory structure is treated as a removal of the original files, and any later history touching these files is ignored. At the same time I wanted to create a new repository that begins at that very commit and records the subsequent file-specific history.
Disclaimer
Rewriting project history is, in a way, "the nuclear option" when it comes to distributed version control systems (VCS). A repository with modified history is, as far as the VCS is concerned, a completely different project. If a repository is shared among several parties and one of them decides to rewrite history, that party's repository will no longer be compatible with changes the others make after the point where history was rewritten.
Do not rewrite repository history without understanding the role of history for the system in question.
However, rules are often more like guidelines, so let us proceed.
Initial project state
The directory include/Backend used to contain a project-specific backend, version controlled together with the frontend in the root directory. Due to interest in using the same backend in another project, I started to separate it cleanly from the rest of the project code before moving it into the vendor/FooBackend directory. After a few commits it became clear that things would go much more smoothly if FooBackend were kept in its own repository, since it had become a clearly separated library used in multiple contexts.
hg convert
Enter hg convert, an extension that ships with Mercurial (see the documentation on how to enable extensions). Its main purpose is to convert projects between different VCS formats, but through its --filemap switch it can also be helpful in converting a Mercurial repository into a new Mercurial repository.
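As a minimal sketch of enabling the extension, the lines below append an [extensions] entry to a configuration file. The hgrc_file variable is a placeholder introduced for this example and defaults to a scratch file; point it at ~/.hgrc or at a repository's .hg/hgrc to make the setting take effect.

```shell
# hgrc_file is a placeholder; it defaults to a scratch file so that no real
# configuration is modified by accident.
hgrc_file="${hgrc_file:-$(mktemp)}"

# An empty value after "convert =" tells Mercurial to load the bundled extension.
cat >> "$hgrc_file" <<'EOF'
[extensions]
convert =
EOF

# Show the resulting configuration file.
cat "$hgrc_file"
```

For a one-off run, the extension can also be enabled by passing --config extensions.convert= to hg, which leaves the configuration files alone.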
Create a file map new-repo.filemap such as:
include vendor/FooBackend
rename vendor/FooBackend .
Create another file map rewrite-old-repo.filemap:
exclude vendor/FooBackend
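The two file maps above can also be written straight from a shell. This is only a sketch: workdir is a hypothetical scratch location, and the file names match the ones used in this post.

```shell
# workdir is a hypothetical scratch location for the two file maps.
workdir="${workdir:-$(mktemp -d)}"

# New repository: keep only vendor/FooBackend and move it to the repository root.
cat > "$workdir/new-repo.filemap" <<'EOF'
include vendor/FooBackend
rename vendor/FooBackend .
EOF

# Rewritten old repository: drop everything under vendor/FooBackend.
cat > "$workdir/rewrite-old-repo.filemap" <<'EOF'
exclude vendor/FooBackend
EOF

# List the generated file maps.
ls "$workdir"
```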
Create the new repository:
hg convert /path/to/current/repo /path/to/new/repo --filemap new-repo.filemap
The new repository is now finished. Its working directory is empty, but an hg update will bring its contents up to speed.
Create the modified repository:
hg convert /path/to/current/repo /path/to/rewritten/repo --filemap rewrite-old-repo.filemap
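Before replacing anything, it can be worth a quick look at what the two conversions produced. The helper below is a sketch; summarize_repo is a hypothetical name, while hg log, hg manifest and the global -R option are standard Mercurial.

```shell
# Print the latest changesets and the tracked files at tip of a repository,
# without touching its working directory.
summarize_repo() {
    if [ ! -d "$1/.hg" ]; then
        echo "not a Mercurial repository: $1" >&2
        return 1
    fi
    # The three most recent changesets, one line each.
    hg log -R "$1" --limit 3 --template '{rev}:{node|short} {desc|firstline}\n'
    # The first few files tracked in the tip revision.
    hg manifest -R "$1" -r tip | head -n 10
}
```

Calling summarize_repo /path/to/new/repo should list only files that used to live under vendor/FooBackend, and summarize_repo /path/to/rewritten/repo should no longer list any of them.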
Retain customized configuration
Care might have to be taken to bring customized .hg/hgrc changes over to the rewritten repository. Unless there are some very exotic changes, it should suffice to do something like:
cp /path/to/old/repo/.hg/hgrc /path/to/rewritten/repo/.hg
rm -r /path/to/old/repo/.hg # If you run this without having a backup available, you should rethink your computer habits :-)
mv /path/to/rewritten/repo/.hg /path/to/old/repo
rmdir /path/to/rewritten/repo
hg update --cwd /path/to/old/repo
To recap and clarify: this will
- save the old .hg/hgrc file
- remove the complete Mercurial-specific directory structure from the old project
- replace it with the rewritten history
- remove the now empty rewritten repository directory
- update the internal Mercurial state so that it knows that the current revision is present in the working directory (this will be a no-op, except for updating the internal state)
As mentioned in the code comment: be defensive and keep a backup copy in case things go pear-shaped.
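One way to take such a backup is to archive the .hg directory before removing it. The following is a sketch under my own assumptions: backup_hg_dir is a hypothetical helper, and the demonstration runs on a throwaway directory rather than on a real repository.

```shell
# Archive the .hg directory of a repository into a gzipped tarball.
backup_hg_dir() {
    if [ ! -d "$1/.hg" ]; then
        echo "no .hg directory in $1" >&2
        return 1
    fi
    # -C makes the paths inside the archive relative to the repository root.
    tar -czf "$2" -C "$1" .hg
}

# Demonstration on a throwaway directory with a fake .hg/hgrc.
demo=$(mktemp -d)
mkdir -p "$demo/repo/.hg"
echo "[ui]" > "$demo/repo/.hg/hgrc"
backup_hg_dir "$demo/repo" "$demo/hg-backup.tar.gz"

# List the archive contents to confirm that hgrc was captured.
tar -tzf "$demo/hg-backup.tar.gz"
```

For the real thing, the call would be something like backup_hg_dir /path/to/old/repo /path/to/backup.tar.gz, run before the rm -r step.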
If the .hg/hgrc modifications in the initial project directory are also relevant for the newly created split-off repository, just copy the file there:
cp /path/to/old/repo/.hg/hgrc /path/to/new/repo/.hg
Clean up distributed copies
Now, if this repository is present at multiple sites, the smoothest transition in my eyes is to use the strip extension to remove as many commits as needed to reach the state before the split, and then do a new pull and update. If it is not obvious where the split occurred, you could also replace the entire repository with a clone of the modified version, but be wary of files not in the VCS that you might want to keep in the old directory (and once again remember custom .hg/hgrc files).
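The strip, pull and update sequence could be sketched as follows. resync_copy is a hypothetical helper, and its second argument (the first changeset to discard) is something you have to look up yourself, for example with hg log. The sketch assumes the copy has no uncommitted changes, since strip refuses to run otherwise unless told how to handle them.

```shell
# Bring an outdated copy in line with the rewritten repository.
#   $1: path to the outdated copy
#   $2: first changeset to discard (placeholder, look it up with hg log)
#   $3: path or URL of the rewritten repository
resync_copy() {
    if [ ! -d "$1/.hg" ]; then
        echo "not a Mercurial repository: $1" >&2
        return 1
    fi
    # strip removes the given changeset together with all of its descendants.
    hg --config extensions.strip= -R "$1" strip -r "$2" &&
    # Fetch the rewritten history and check out its latest revision.
    hg -R "$1" pull "$3" &&
    hg -R "$1" update
}
```

By default strip keeps a backup bundle of the removed changesets inside the copy's .hg directory, which gives a way back if the wrong revision was picked.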
Areas of application
Rewriting history in shared repositories comes with many problematic consequences. Doing so more or less requires that all parties adopt the new structure immediately; otherwise they end up with, in the best case, spaghetti history that takes some effort to merge and, in the worst case, fully incompatible repositories and lost work.
For personal projects living at one or two controlled locations, it should be less problematic, and can help in keeping clean and logically consistent repositories.