Showing posts with label git. Show all posts

Keeping your commit history while migrating from SVN to Git

I had to finally get off my arse and migrate my old SVN repos over to Git(hub) when CloudForge decided to close shop. It was very nice of them to host my junk for free all these years and also give us plenty of notice about turning off their services.

But alas, the migration path. I had always envisioned this to be painful and tedious. Luckily it was neither, thanks to the wonderful work of the people who made git svn. In the past we weren't fortunate enough to have such tooling: when Google Code shut down we just had to give things up, and lost a whole lot of commit history when migrating.

Overview

  1. Export names of users
  2. Convert the SVN repo to a Git repo
  3. Push source to its new home

Steps to migrate

  • Install SVN, Git and Git SVN
sudo apt-get install subversion git git-svn
  • Check out your SVN repo
svn checkout <url>

  • "cd <project-path>"
  • Export names of users
svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > users.txt ; cat users.txt

  • You'll see the SVN format will be something like "username = username <username>"
  • Change it to the format of "username = Full Name <email@address.com>"
  • Save the file and return to the command line
  • Now to use Git SVN to convert the repo to a Git one. Don't worry, it creates the new repo in a completely separate folder.
git svn clone --no-metadata --authors-file=users.txt $(svn info | grep "^URL:" | cut -d : -f 2-) git_migration ; cd git_migration

  • Now you have a perfectly good Git clone of your SVN repo ready to push to a new home
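If the clone trips over the author mapping, it can help to check users.txt for entries that were never edited. The snippet below is only a sketch: check_authors is a hypothetical helper name, the sample entries are made up, and the pattern assumes every line should look like "username = Full Name <email@address.com>".

```shell
# check_authors (hypothetical helper): print every line of an authors file
# that does not look like "username = Full Name <email@address.com>".
# Exit status follows grep: 0 if such lines were found, 1 if there were none.
check_authors() {
    grep -vE '^[^=]+ = .+ <[^<>@ ]+@[^<>@ ]+>$' "$1"
}

# Example run against a throwaway file with one edited and one unedited entry
tmp_authors=$(mktemp)
printf '%s\n' \
    'twig = Twig Nguyen <twig@example.com>' \
    'bob = bob <bob>' > "$tmp_authors"
bad_lines=$(check_authors "$tmp_authors")
echo "Needs editing: $bad_lines"
rm -f "$tmp_authors"
```

On a real project you would call it as "check_authors users.txt" from the SVN checkout.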

Extra info

A breakdown of what the "git svn clone" command does:

  • "git svn clone" runs git-svn on the current SVN repo and clones it to a new Git repo
  • "--no-metadata" excludes the SVN commit IDs from the new Git commits. We don't need them since it's a one-way trip
  • "--authors-file" is the file used to map commit author details
  • Next bit "$(svn info | grep | cut ...)" simply reads the SVN repo URL from the current project so we don't have to manually edit the command for each project
  • "git_migration" is the path we want to put the new Git repo
  • "cd git_migration" simply changes the path out from SVN repo to the new Git repo
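To see what the "svn info | grep | cut" part is doing, here it is run against canned text instead of a real checkout. The URL and the variable names are made up for illustration.

```shell
# Canned output standing in for a real "svn info" call; the URL is made up.
sample_info='Path: .
URL: https://svn.example.com/myproject/trunk
Repository Root: https://svn.example.com/myproject'

# Keep only the URL line, then drop everything up to the first colon.
# "-f 2-" keeps the later colons, so "https://" survives intact.
raw_url=$(printf '%s\n' "$sample_info" | grep "^URL:" | cut -d : -f 2-)
echo "[$raw_url]"

# The value still has a leading space. Left unquoted, the shell drops it
# during word splitting, which is why the one-liner can pass it to git svn.
set -- $raw_url
clean_url=$1
echo "[$clean_url]"
```

The first echo shows the leading space inside the brackets, the second shows it gone.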

Pushing to Git

  • Go to your favourite Git host (GitHub, GitLab, Bitbucket, etc.) and create a new repo for your project. If using an existing one, just don't push in the last step unless you know what you're doing.
  • Now is a good point to make sure you can access Git via SSH. I will not be covering it here.
  • Now add a new remote to your local Git repo
git remote add origin <url>
  • Assuming you created a completely empty repo, you can simply just "git push origin master" and call it a day.

In case of a non-empty new Git repo...

If your new Git repo has existing commits, you can do one of two things.
  1. Force push your master branch to override the existing content by using "git push origin master --force". Triple check that you want to do this before doing it! There is no easy way to undo it unless someone else still has a checkout of the old content.
  2. Use git rebase to shift the new commits from "origin/master" onto the end of your master branch, and THEN force push to preserve the new files.

Merging SVN and Git repo commit histories

In my case, some projects started off in SVN and I made a clean cut switch-over to Git. Now that it's all in Git, it was only right to combine the two separate commit histories into one.

In this scenario, perform the steps below BEFORE pushing to Git. We'll need to juggle some stuff around first.

Now depending on which repository is older, you may need to manually replace some stuff below as you go.

  • So following on from the "git remote add origin <url>" command above...
  • "git fetch origin" to pull in information about the target repo
  • Create a new branch called "combined" using the given source branch
    • "git checkout -b combined origin/master" if your SVN repo is older
    • Otherwise "git checkout -b combined git-svn"
  • Work on the "combined" branch as it's safe to stuff things up since we still have the original branch in case anything goes wrong.
  • Now we need to use git rebase from the "combined" branch onto the older source. Think of it as appending the newer source onto the end of the older source.
    • "git rebase --committer-date-is-author-date git-svn" if SVN repo is older
    • Otherwise "git rebase --committer-date-is-author-date origin/master"
  • Rebase will replay a series of commits to combine the history, resulting in new commit IDs for the replayed commits. You may need to resolve some conflicts at the start to iron out some small discrepancies.
    • "git rebase --continue" to keep going
    • "git rebase --abort" to give up
  • Once finished, check the log for your "combined" branch using either "git log" or tig
    • Confirm that your "combined" branch is working as expected before performing the next step!
    • Check that the commit messages are ok from the rebase point
    • Check that the commit dates are correct and haven't been rewritten to the current date.
    • Check that usernames are correct.
  • If ALL is well, then you're ready to delete your local master branch and rewrite it using the new "combined" branch.
  • Proceed with caution; I will not be held responsible for mishaps if you are following this blindly.
    • git branch -D master
    • git checkout -b master
    • git push origin master --force
  • Your master branch has now been rewritten to include the old SVN commits.
  • Know that if anyone branched off your repo prior to this point, they will need to rebase their changes off the new "origin/master" branch in order to merge their changes in.
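For the date and username checks on the "combined" branch, a custom log format can put the author date, committer date and author side by side. This is only a sketch: show_history is a hypothetical helper name, while %h, %ad, %cd, %an, %ae and %s are standard git log format placeholders.

```shell
# show_history (hypothetical helper): one line per commit showing the short
# hash, author date, committer date, author and subject. Extra arguments
# (a branch name, -n 20, ...) are passed straight through to git log.
show_history() {
    git log --date=short --format='%h | %ad | %cd | %an <%ae> | %s' "$@"
}
```

Run it as "show_history combined". If the rebase kept the dates, the two date columns should agree on the replayed commits, and the author column should show the names from your users file.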


Git: Post-merge hook to detect when certain files have changed and display a notification

When certain files in a project change, the project needs to be rebuilt/restarted/recompiled. Using the following git hook, you can keep an eye out for changes in certain files then display a reminder message to perform an update or just make it run the command automatically.

It's just a simple bash script that's executed by git after a successful merge.

Use check_run() to run the script automatically, or check_warn() if you want it to only be noisy about it.

Store this in your_project/.git/hooks/ as "post-merge".
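A minimal sketch of what such a hook could look like. It is not necessarily the original script: the bodies of check_run() and check_warn() are my assumption of how they behave, and the file names and commands in the example rules are placeholders.

```shell
#!/bin/sh
# Sketch of a post-merge hook. ORIG_HEAD is where the branch pointed before
# the merge, so this lists every file the merge touched.
changed_files=$(git diff-tree -r --name-only --no-commit-id ORIG_HEAD HEAD 2>/dev/null || true)

# check_run <pattern> <command>: run the command if a changed file matches.
check_run() {
    if printf '%s\n' "$changed_files" | grep -q -- "$1"; then
        eval "$2"
    fi
}

# check_warn <pattern> <message>: only print a reminder.
check_warn() {
    if printf '%s\n' "$changed_files" | grep -q -- "$1"; then
        echo "post-merge: $1 changed. $2"
    fi
}

# Example rules; the file names and commands are placeholders.
check_run  'package.json'     'echo "would run: npm install"'
check_warn 'requirements.txt' 'Remember to reinstall the requirements.'
```

Remember to make the file executable with "chmod +x", otherwise git will skip it.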

Automate things. Less room for errors and mistakes.

Git: List and remove all remote branches which have already been merged

Sometimes you need to do some spring cleaning on your remote branches (such as GitHub).

To hell with doing it manually!

Fortunately it's an easy fix for us lazy sods.

List all remote branches which have been merged into master

git branch -r --merged | grep origin | grep -v '>' | grep -v master | xargs -L1 | awk '{sub(/origin\//,"");print}'

Remove all remote branches which have been merged into master

git branch -r --merged | grep origin | grep -v '>' | grep -v master | xargs -L1 | awk '{sub(/origin\//,"");print}' | xargs git push origin --delete
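Before deleting anything, it may be worth seeing what the filter part of the pipeline lets through. Here it is fed canned "git branch -r --merged" output; the branch names are made up.

```shell
# Canned "git branch -r --merged" output; the branch names are made up.
sample_branches='  origin/HEAD -> origin/master
  origin/master
  origin/feature-login
  origin/bugfix-123'

# Same filters as the one-liner: keep origin branches, drop the HEAD pointer
# line, drop master, trim the indentation, then strip the "origin/" prefix.
merged=$(printf '%s\n' "$sample_branches" \
    | grep origin \
    | grep -v '>' \
    | grep -v master \
    | xargs -L1 \
    | awk '{sub(/origin\//,"");print}')
echo "$merged"
```

Only "feature-login" and "bugfix-123" come out the other end. Note that "grep -v master" also hides any branch with "master" somewhere in its name, so those would never be listed or deleted.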


It just works.


Migrating from Google Code to Github and keeping revision history

This was a tricky one, and I ran into a few problems trying to move DCX over from GoogleCode to Github.

  • Usernames on Google Code are emails. When you bring them over, your commit messages are littered with potentially private email addresses that your committers don't want to share with the world (well, it's mainly spambots that they don't want to share with).
  • Windows. If you're trying to do this on Windows, forget it. Save yourself the hassle and grab Linuxmint, Ubuntu or whichever flavour you prefer and run it in Virtualbox. It won't cost you anything and takes less than an hour to download and install.

The time spent setting up Linux will be well worth it compared to trying to finish ANY of this on Windows.

Seriously, don't fight Windows. It wasn't made to work there.

Setting up

Find your terminal and make yourself a folder to work with. This is where all your magic will happen.

mkdir code

cd code

We'll need to install a few things. Enter your password when prompted.

sudo apt-get install subversion

sudo apt-get install git

sudo apt-get install git-svn

sudo apt-get install tig

I don't think I need to explain why subversion and git are necessary.

git-svn is the "in-between" software which imports SVN data into a format git can understand. It also arranges the SVN branches into git branches.

The tool tig is VERY useful for checking your changes as they stand after pulling the repository into your computer (and before pushing it onto Github).

Generating a list of usernames

Check out a copy of your GoogleCode repository in a folder called "googlecode" (replacing DCX with whatever your project name is)

svn checkout https://dcx.googlecode.com/svn/ googlecode

cd googlecode

This one is straight from John Albin's bag of tricks and it's a real winner!

svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > ../authors.txt

cd ..

It fetches all the usernames from commit messages, sorts, removes duplicates and saves them to a file called "authors.txt".
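To make that a bit more concrete, here is the same awk and sort pipeline run on canned "svn log -q" output instead of a live repository. The committers and dates are made up.

```shell
# Canned "svn log -q" output; the committers and dates are made up.
sample_log='------------------------------------------------------------------------
r3 | alice@example.com | 2012-05-03 09:15:42 +1000 (Thu, 03 May 2012)
------------------------------------------------------------------------
r2 | bob@example.com | 2012-05-02 18:40:10 +1000 (Wed, 02 May 2012)
------------------------------------------------------------------------
r1 | alice@example.com | 2012-05-01 11:02:33 +1000 (Tue, 01 May 2012)
------------------------------------------------------------------------'

# Only lines starting with "r" are revisions. The second "|" field is the
# username; trim its padding, print it in authors-file form, then de-duplicate.
authors=$(printf '%s\n' "$sample_log" \
    | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' \
    | sort -u)
echo "$authors"
```

Three revisions by two people collapse into two lines, one per committer.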

Open up this file (using vim or a text editor of any sort) and you should see something like this:

twig@blogspot.com = twig@blogspot.com <twig@blogspot.com>

Keep the left column as is, but feel free to change the right column.

twig@blogspot.com = Twig Nguyen <twig@whatever.com>

The name and email on the right will be used in the git history. Once all the changes are finished, save and close.

Pulling the repository onto your computer

Because Google doesn't allow shell access to the SVN repository, you can't simply dump and copy the files out.

What we have to do is use git-svn and pull them out using SVN and save it in git.

git svn clone https://dcx.googlecode.com/svn/ -A authors.txt --stdlayout gitsvn

Depending on how big your repository is, this may take a while.

Once it's done, take a look at how it's set up so far by typing:

cd gitsvn

git status

# On branch master
nothing to commit, working directory clean

The "trunk" is now called "master" branch in git. This was your stable channel. Type "tig" to see what's going on, and if the usernames were migrated correctly. Select the commit (arrow keys) and open the commit details with the Enter button.

If you're happy with how it is, time to push it into github! Make sure that your account has been set up properly by typing:

ssh -T git@github.com

Sidenote:

If you run into errors, please follow the information on these two pages to help troubleshoot your woes.

/sidenote

Once it's up and running, you can link github to your local git repository.

git remote add origin git@github.com:youraccount/yourproject.git

git fetch origin

git merge origin/master

git push origin master

To summarise what just happened: we added a "remote" location pointing to your github project and called it "origin" as this is your new home. You fetch the current details about it and then merge the origin's master branch into your master branch, melding the gitsvn repository with the github repository.

As soon as that's done, we push all our existing files from gitsvn into the empty repository at origin (github). This again should take some time.

For most people, this should be enough.

SVN branches aren't done yet!

If you use SVN branches, then they're not in github yet!

Now for the mind bending bit and training your brain a little about Subversion branches and git branches.

SVN branches.... yeah, it can get messy.

To switch to another branch such as "dcxutf":

git checkout dcxutf

tig (optional)

git push origin dcxutf

Switching the branch means that git will delete any files specific to "master" and revive any files from "dcxutf", along with any changes to files mutual to both branches. It's quick and easy.

Use tig again to check if revision history and changes were migrated properly. Then finally push the branch upstream into origin (github).

Repeat this process for any branches you wish to keep.

Cleaning up

Make sure that everything has been pushed into github. Browse your project page a little and you can see things appear instantly.

Once you're done, feel free to remove anything from the "code" folder you made in the very first step.


Git: Revert commit that has been pushed to origin

To bump off an accidental commit to origin, you'll have to sync your branch with the remote one.

Once that's done, we pop off the latest commit locally.

git reset HEAD^

The ^ means we're going back to the commit before "HEAD". Optionally you can use the option --hard to delete any changes made in that commit.

Now that we're back one commit, force it onto the remote master branch with +

git push origin +master

And that's it!

Chow Yun Fat approves of this method.


Git: Uninstall git-cheetah from menu

It's pretty neat to have but it wasn't the best user interface for me.

It was bloating my context menu with far too many options across my whole computer so I had to tranquilise it.

To do that, just open up a command prompt or the "Run" window (with Windows+R).

Assuming that you installed Windows Git to C:\Program Files\Git\

Type in:

cd %PROGRAMFILES%\git\git-cheetah && regsvr32 /u git_shell_ext.dll

And menu be-gone!

Now go and install something more intuitive like TortoiseGit.


GitHub: Committing code to your public repository without "Unknown" author name in commits

GitHub was easy to get up and running, but a few little quirks here and there got me.

Assuming you've:

  • Already created your repository in GitHub
  • Created an account in GitHub
  • Set up git for Windows and using git-bash

Create an SSH key

This bit allows you to check out and commit code securely to github.

To check if it already exists, type in:

cd ~/.ssh

If it works, skip to the next step.

If you get "No such file or directory", then you'll have to create one.

ssh-keygen -t rsa -C "your_email@youremail.com"

The email doesn't have to be your github email.

Let GitHub know your public key

Type in:

notepad ~/.ssh/id_rsa.pub

Copy everything from that file into the clipboard.

On the GitHub site click:

  • Account Settings
  • SSH Public Keys
  • Add another public key
  • Paste in everything from that file and give it a name of your computer (so you remember which computer can commit to the repo)

Test it out by typing (don't change it to your repo)

ssh -T git@github.com

Link your git to your github account

This step will link your github account to your commits.

On the site, go to:

  • Account Settings (top menu)
  • Account Settings (on the left menu)
  • Copy your API token.

Back in the git-bash console:

git config --global github.user your_github_username
git config --global github.token 0123456789yourf0123456789token

Okay, this time you enter in your github username. Replace the token with your token accordingly.

Set up your user details

To ensure that your commits show up correctly with your name, make sure you set your author name. Otherwise your commits will show up with "Unknown" as author name like here.

git config --global user.name "Firstname Lastname"
git config --global user.email "your_email@youremail.com"

Again, this doesn't have to be your github email.

Your name doesn't have to match your github username either.

Getting the code

Finally eh? Don't worry, all of the stuff above you only need to do once.

Go to your project page and just below the description, make sure "SSH" is highlighted.

Now click on the little clipboard icon to copy that URL (or you could do it manually)

Go to your coding folder (the one that holds all of your projects) and type:

git clone <paste> [optional_checkout_name]

In my example it'd be:

cd ~/android/projects/
git clone git@github.com:twig/Android-File-Dialog.git FileExplorer

That's because I wanted the directory name "FileExplorer" rather than "Android-File-Dialog" in my projects folder.

Commitment issues?

Make some changes to the code (like whitespaces or edit the readme) and type:

git add whatever_you_edited.txt
git commit -m"Test commit message"

That has committed the code locally.

To push it off to github, you'll have to type:

git push

It'll take whatever commits you've made and dump it onto github.

Refresh your project page to check that:

  1. You've committed code properly (yes, it DOES show up straight away!)
  2. Your author name is showing up correctly.
  3. You haven't broken anything ;)

Happy coding!


git: How to push a tag to remote origin

If you've created a tag on the local repository and want to push it to the origin repo, it's quite easy!

To update all the tags, just type:

git push --tags

Otherwise, type the following to update a specific tag:

git push origin stable

Voilà!


git: Ignore dirty or untracked submodules in diff or status commands

Git 1.7.0 added a feature which marks a submodule as dirty if it contains modified or untracked files.

--- a/article
+++ b/article
@@ -1 +1 @@
-Subproject commit aba7c80124b0ac07299f224f1bf7ddd4c9a095e3
+Subproject commit aba7c80124b0ac07299f224f1bf7ddd4c9a095e3-dirty

...

diff --git a/south b/south
--- a/south
+++ b/south
@@ -1 +1 @@
-Subproject commit 6512510da9a408b178730f38fcac664483451ab0
+Subproject commit 6512510da9a408b178730f38fcac664483451ab0-dirty

Personally, I find it annoying because the files there are just patch files or something I'm saving for later.

git status

# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#   (commit or discard the untracked or modified content in submodules)
#
#       modified:   article (modified content)
#       modified:   south (untracked content)

So you can see that "article" has files that have been modified, while "south" has files which are untracked.

To ignore this, you have to add the --ignore-submodules flags to your diff and status commands.

git status --ignore-submodules=dirty
git diff --ignore-submodules=dirty


Git: Cherry-picking and removing git commits

So I dun goofed one of my workmate's git branches and had to remove one commit from the history.

cherry-pick

You may have noticed that this was buried under a whole stack of other commits...

Normally, if you've made an accidental commit and realised straight away, you can just do:

git reset HEAD^

This will pop off the latest commit but leave all of your changes to the files intact.

Because this commit is buried under a whole plethora of changes from other people, we need to do something else.

Using "git rebase", we can shift things around and keep the other unrelated changes in the history.

(From the git rebase help)

      A---B---C topic
     /
D---E---F---G master

Becomes:

              A´--B´--C´ topic 
             /
D---E---F---G master

It's a bit hairy, but here's how you can pull it off:

  • Checkout "master" branch for the project
  • Pull everything in line with:

git submodule update --init

  • Create a clean branch using:

git checkout -b 0000_fixed

  • Checkout your broken branch

git checkout 0000_broken

  • Update init to pull everything in line

git submodule update --init

  • Create a copy of the broken branch

git checkout -b 0000_temp

  • Rebase towards the clean target branch

git rebase -i 0000_fixed

  • Now you get a chance to cherry-pick the commits. Remove any commits you don't want. This step should be easy, assuming you've entered in good commit messages.
  • Save and exit.
  • git rebase will now attempt to automatically shift your commits.
  • If it worked, you'll see this:

Successfully rebased and updated refs/heads/0000_temp

  • If not, you'll have some fun conflicts to deal with

Automatic cherry-pick failed.  After resolving the conflicts,
mark the corrected paths with 'git add <paths>', and
run 'git rebase --continue'
Could not apply xxxxxxx... progress commit

  • Treat this like a normal merge/conflict scenario.
  • Do "git submodule update --init" again
  • Go into each conflicted file and resolve the conflicts
  • Go into each conflicted submodule and checkout your working branch
  • Merge in the previous head so you get the incoming changes
  • Go back to the project
  • Re-commit the conflict resolutions (with the previous commit comment if possible)
  • Type:

git rebase --continue

  • Rinse and repeat until successful.

Let's just hope you never have to do this again.


git: Cherry-pick code fragments from base revision (partial revert)

If you've made a mistake on part of a file and want to only revert portions of it rather than the whole file, you can do that using:

git checkout -p the_filename.ext

The -p flag allows you to step through each code chunk and actively select or ignore changes you want to undo since the last commit.


git: Spring cleaning and garbage cleanup on trees and branches

After using a repository for a while, it's a good idea to clean it up as it accumulates junk.

Using "du -hs" to see how large the folder was, I found that my tree was 166M.

Git gc will do the magic for us.

Cleanup unnecessary files and optimize the local repository

Runs a number of housekeeping tasks within the current repository, such as compressing file revisions (to reduce disk space and increase performance) and removing unreachable objects which may have been created from prior invocations of git-add.

 

Users are encouraged to run this task on a regular basis within each repository to maintain good disk space utilization and good operating performance.

So using:

git gc

and

git submodule foreach git gc

Git will go through and clean up the tree. Eventually everything was reduced down to 111M.

Other repositories were 78M and 100M. They were cleaned up to be 56M and 77M respectively.

Git: Do a "git status" on each submodule using foreach

Strangely enough, it's fine doing a git branch on each submodule using "git submodule foreach git branch".

However, if you try "git status" it won't be happy. It'll spit out this error after the first submodule.

Stopping at 'firstmodule'; script returned non-zero status.

Instead, use:

git submodule foreach "git status || true"

This will go through each submodule and perform a status, without any errors.
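The trick is just about exit statuses: foreach gives up as soon as a command reports a non-zero status, and "|| true" swallows it. The sketch below shows the effect without needing any submodules. report_status is a hypothetical helper, and "false" stands in for the command that was failing.

```shell
# report_status (hypothetical helper): run a command line in a child shell,
# roughly as foreach does, and report the exit status it ends with.
report_status() {
    status=0
    sh -c "$1" >/dev/null 2>&1 || status=$?
    echo "$1 -> exit status $status"
}

# "false" stands in for the command that was failing.
first=$(report_status 'false')
# The same failure, masked by "|| true".
second=$(report_status 'false || true')
echo "$first"
echo "$second"
```

The first line reports a non-zero status, which is what makes foreach stop. The second reports 0, so foreach carries on to the next submodule.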

Now go forth and celebrate your success!


git: Show history with a helpful branching tree

Tracking down branching ancestry in revision control is horrible, to say the least. At least git has a nice little feature which lets you change the log output into a nicer layout.

git log --graph

To look at a specific revision:

git log --graph commitid

This will give a nice branched log format to help you track down commits.


git: Show difference between commits

git diff commitid1..commitid2

git: How to remove a submodule

You'll need to do 3 things in order to remove the submodule.

In your current working path:

  • Remove any entries for your submodule in ".gitmodules"
  • Remove any entries for your submodule in ".git/config"

That's the easy bit. Lastly, you'll need to type:

git rm --cached submodule_name

Wish there was an easier way...


Git: Recursively check submodules for diffs

Git is pretty good to work with, but when it comes to merging submodules, sometimes the project dies in the ass because someone (ie. me) forgot to commit a submodule.

An example of the project dying in the ass.

To check for nested differences, use the following command:

git submodule foreach --recursive git diff --name-status

It'll go through each submodule it finds and check for a diff, spitting out the name of the files that have changed.

Entering 'submoduleA'
M       models.py
Entering 'submoduleB'
M       models.py
Entering 'submoduleC'
Entering 'submoduleD'

Remove the "--name-status" flag to show the actual diffs.

Git: Uncommit revision but keep changes

After accidentally committing some stuff to the git repository, I had some trouble finding a way to "uncommit". Luckily a few notes on the net pointed towards the reset command.

git reset HEAD^

This will leave your changes intact, but bump down the revision of your branch.


Git: Shelving changes and bringing them back later

To stash minor changes away on the shelf for later retrieval, use:

git stash

To display a list of changes you've got on the shelf, use:

git stash list

And to retrieve the changes, use:

git stash apply stash@{N}

Where N is the number of the change you want to retrieve, as shown by "git stash list".

[ Source - git manual ]

Git: Revert a single file

Use "git checkout -- filename" to revert a file. The "--" ensures that it recognises filename as a path and not a branch.


 
Copyright © Twig's Tech Tips