Keeping your commit history while migrating from SVN to Git

I had to finally get off my arse and migrate my old SVN repos over to Git(hub) when CloudForge decided to close shop. It was very nice of them to host my junk for free all these years and also give us plenty of notice about turning off their services.

But alas, the migration path. I had always envisioned this to be painful and tedious. Luckily it was neither, thanks to the wonderful work by the people who made git svn. In the past we weren't fortunate enough to have such tooling, so when Google Code shut down we just had to give things up and lost a whole lot of commit history when migrating.

Overview

  1. Export names of users
  2. Convert the SVN repo to a Git repo
  3. Push source to its new home

Steps to migrate

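  • Install SVN, Git and Git SVN (the command below assumes a Debian/Ubuntu style system using apt-get; use your own package manager otherwise)
sudo apt-get install subversion git git-svn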
  • Check out your SVN repo
svn checkout <url>

  • "cd <project-path>"
  • Export names of users
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > users.txt ; cat users.txt

  • You'll see each line in users.txt looks something like "username = username <username>"
  • Open users.txt in your editor and change each line to the format of "username = Full Name <email@address.com>"
  • Save the file and return to the command line
  • Now to use Git SVN to convert the repo to a Git one. Don't worry, it creates the new repo in a completely separate folder.
git svn clone --no-metadata --authors-file=users.txt $(svn info | grep "^URL:" | cut -d : -f 2-) git_migration ; cd git_migration

  • Now you have a perfectly good Git clone of your SVN repo ready to push to a new home
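
To make the users.txt step a bit more concrete, here's a rough sketch that runs the same awk extraction over some canned "svn log -q" output and then fills in one author with sed. The usernames, full name and email address are all made up for illustration, and the set_author helper is just something I've named for this example.

```shell
#!/usr/bin/env bash
# Same awk extraction as above, wrapped in a function that reads
# "svn log -q" output from stdin so it can be tried without a real repo.
extract_authors() {
  awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u
}

# Made-up sample of what "svn log -q" prints (hypothetical users and dates).
sample_log() {
  cat <<'EOF'
------------------------------------------------------------------------
r3 | twig | 2015-03-01 10:00:00 +1100 (Sun, 01 Mar 2015)
------------------------------------------------------------------------
r2 | bob | 2015-02-01 09:00:00 +1100 (Sun, 01 Feb 2015)
------------------------------------------------------------------------
r1 | twig | 2015-01-01 08:00:00 +1100 (Thu, 01 Jan 2015)
------------------------------------------------------------------------
EOF
}

# Hypothetical helper: rewrite one line into "username = Full Name <email>".
# Arguments: svn username, full name, email address.
set_author() {
  sed "s|^$1 = .*|$1 = $2 <$3>|"
}

sample_log | extract_authors | set_author twig "Twig Example" "twig@example.com"
```

For a handful of users, editing the file by hand is quicker. Something like set_author only starts to pay off when you're migrating a pile of repos with the same committers.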

Extra info

A breakdown of what the "git svn clone" command does:

  • "git svn clone" runs git-svn on the current SVN repo and clones it to a new Git repo
  • "--no-metadata" leaves the SVN revision info out of the new Git commit messages. Don't need it since it's a one way trip
  • "--authors-file" is the file used to map commit author details
  • The next bit "$(svn info | grep | cut ...)" simply reads the SVN repo URL from the current project so we don't have to manually edit the command for each project
  • "git_migration" is the path where we want to put the new Git repo
  • "cd git_migration" simply changes directory from the SVN checkout into the new Git repo
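
If you're curious what the "$(svn info | grep | cut ...)" bit actually spits out, here's a small sketch that feeds the same pipeline some canned "svn info" output. The URL and paths are placeholders I made up. Note that cut leaves a leading space on the URL, which is harmless because the unquoted "$(...)" gets split into words by the shell.

```shell
#!/usr/bin/env bash
# Made-up "svn info" output; the URL and paths are placeholders.
sample_info() {
  cat <<'EOF'
Path: .
Working Copy Root Path: /home/me/myproject
URL: https://svn.example.com/myproject/trunk
Relative URL: ^/myproject/trunk
Repository Root: https://svn.example.com/myproject
EOF
}

# Same pipeline as in the clone command: keep the "URL:" line,
# then keep everything after the first colon.
extract_url() {
  grep "^URL:" | cut -d : -f 2-
}

raw="$(sample_info | extract_url)"
echo "[$raw]"    # the brackets show the leading space left behind by cut

# Unquoted expansion lets the shell drop that leading space,
# which is what happens inside the git svn clone command.
set -- $raw
echo "[$1]"
```

This relies on "svn info" printing an English "URL:" label, so it may need tweaking if your SVN client is running in another locale.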

Pushing to Git

  • Go to your favourite Git host (GitHub, GitLab, Bitbucket, etc.) and create a new repo for your project. If you're using an existing one, just don't push in the last step unless you know what you're doing.
  • Now is a good point to make sure you can access Git via SSH. I will not be covering it here.
  • Now add a new remote to your local Git repo
git remote add origin <url>
  • Assuming you created a completely empty repo, you can simply "git push origin master" and call it a day.
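
If you're not sure whether the new repo really is empty, a guard along these lines can save you from a surprise. It's only a sketch: remote_is_empty and push_if_empty are names I made up, and it assumes the remote is called "origin" and the branch is "master" as in the steps above.

```shell
#!/usr/bin/env bash
# Succeeds if the remote has no branches yet, i.e. it looks completely empty.
# Returns 2 if the remote can't be reached at all.
# Argument: remote name (defaults to "origin").
remote_is_empty() {
  local heads
  heads="$(git ls-remote --heads "${1:-origin}")" || return 2
  [ -z "$heads" ]
}

# Only do the plain push when there is nothing on the other side to clobber.
push_if_empty() {
  if remote_is_empty origin; then
    git push origin master
  else
    echo "origin already has branches (or is unreachable), not pushing" >&2
    return 1
  fi
}
```

Run push_if_empty from inside the git_migration folder after adding the remote. If it refuses, the non-empty case is the one that applies to you.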

In case of a non-empty new Git repo...

If your new Git repo has existing commits, you can do one of two things.
  1. Force push your master branch to overwrite the existing content by using "git push origin master --force". Triple check you want to do this before doing it! There is no easy way to undo this unless someone else has a checkout of it.
  2. Use git rebase to shift the new commits from "origin/master" onto the end of your master branch, and THEN force push to preserve the new files.
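
Option 2 could look roughly like the sketch below. It assumes the remote is "origin", both sides use "master", and your working tree is clean. The function name and the scratch branch name "remote_commits" are made up. It deliberately stops short of the force push so you can check the log first.

```shell
#!/usr/bin/env bash
# Replay whatever is already on origin/master on top of the migrated master,
# then fast-forward master to the result.
# Returns non-zero if any step fails, e.g. when the rebase hits conflicts.
append_remote_commits() {
  git fetch origin || return 1
  # Work on a scratch branch so master stays untouched if the rebase fails.
  git checkout --no-track -b remote_commits origin/master || return 1
  git rebase master || return 1
  git checkout master || return 1
  git merge --ff-only remote_commits || return 1
  git branch -d remote_commits
}

# After checking "git log", the last step is still the force push:
#   git push origin master --force
```

If the rebase stops on a conflict you'll be left on the scratch branch mid-rebase, so either fix it and "git rebase --continue", or "git rebase --abort" and go back to master.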

Merging SVN and Git repo commit histories

In my case, some projects started off in SVN and I made a clean cut switch-over to Git. Now that it's all in Git, it was only right to combine the two separate commit histories into one.

In this scenario, perform the steps below BEFORE pushing to Git. We'll need to juggle some stuff around first.

Depending on which repository is older, you'll need to pick the matching variant of the commands below as you go.

  • So following on from the "git remote add origin <url>" command above...
  • "git fetch origin" to pull in information about the target repo
  • Create a new branch called "combined" from whichever source is newer
    • "git checkout -b combined origin/master" if your SVN repo is older
    • Otherwise "git checkout -b combined git-svn"
  • Work on the "combined" branch as it's safe to stuff things up since we still have the original branch in case anything goes wrong.
  • Now we need to use git rebase from the "combined" branch onto the older source. Think of it as appending the newer source onto the end of the older source.
    • "git rebase --committer-date-is-author-date git-svn" if SVN repo is older
    • Otherwise "git rebase --committer-date-is-author-date origin/master"
  • Rebase will replay a series of commits to combine the history, resulting in new commit IDs for the replayed commits. You may need to resolve some conflicts at the start to iron out some small discrepancies.
    • "git rebase --continue" to keep going
    • "git rebase --abort" to give up
  • Once finished, check the log for your "combined" branch using either "git log" or tig
    • Confirm that your "combined" branch is working as expected before performing the next step!
    • Check that the commit messages are ok from the rebase point
    • Check that the commit dates are correct and haven't been rewritten to the current date.
    • Check that usernames are correct.
  • If ALL is well, then you're ready to delete your local master branch and rewrite it using the new "combined" branch.
  • Proceed with caution; I will not be held responsible for mishaps if you are following this blindly.
    • git branch -D master
    • git checkout -b master
    • git push origin master --force
  • Your master branch has now been rewritten to include the old SVN commits.
  • Know that if anyone branched off your repo prior to this point, they will need to rebase their changes off the new "origin/master" branch in order to merge their changes in.
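
For the "check the log" step, something like the sketch below makes the date and username checks quicker than scrolling through "git log". The function names are made up, and the date check is only a rough heuristic: it flags any commit whose committer date differs from its author date, which can also be true for commits that were amended or rebased long before the migration.

```shell
#!/usr/bin/env bash
# Print commits on a branch whose committer date differs from the author date.
# Empty output suggests the rebase kept the original dates.
# Argument: branch name (defaults to "combined").
list_rewritten_dates() {
  git log --format='%h %at %ct %s' "${1:-combined}" | awk '$2 != $3'
}

# Quick eyeball view: short hash, author date, author, subject.
# Argument: branch name (defaults to "combined").
show_history() {
  git log --format='%h  %ad  %an <%ae>  %s' --date=short "${1:-combined}"
}
```

Run both from inside the repo before deleting master. If list_rewritten_dates prints a long run of commits starting at the rebase point, the "--committer-date-is-author-date" flag probably got left off.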

Copyright © Twig's Tech Tips