Wednesday, January 8, 2014

How to get statistics about your contributions on a GitHub organization

GitHub, the web-based hosting service for software development projects, lets you easily check statistics on your contributions to specific projects.

You can check, for instance, the statistics on collective.cover just by visiting the following page:

GitHub gives you a count of commits and a visual representation of them over time.

If you want to check only the number of commits by author, you could use the git-shortlog command:

The -s option prints a summary with only the commit count per author, and the sort command orders that output numerically in reverse (-nr), so the most active authors come first.
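The same listing can be approximated from Python. This is only a minimal sketch: it assumes git is on the PATH and that the script runs inside a repository, and the helper names parse_shortlog and commits_by_author are made up for illustration.

```python
import subprocess


def parse_shortlog(output):
    """Parse `git shortlog -s` output into (count, author) tuples, largest first."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line looks like "   123<TAB>Author Name".
        count, author = line.strip().split(None, 1)
        rows.append((int(count), author))
    # Roughly what piping the output through `sort -nr` does.
    return sorted(rows, key=lambda row: row[0], reverse=True)


def commits_by_author(repo_path="."):
    """Run git shortlog in repo_path (assumes git is installed)."""
    # HEAD is passed explicitly: without a revision, shortlog reads from
    # standard input when it is not attached to a terminal.
    result = subprocess.run(
        ["git", "shortlog", "-s", "HEAD"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_shortlog(result.stdout)
```

Calling commits_by_author() in a checkout returns the per-author counts as a list you can print or process further.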

You could improve this listing by using the .mailmap feature to merge commits made by the same author under different names or email addresses.

Imagine now that you want to get a list of all your contributions to a specific organization, let's say, the Plone collective.

Enter the GitHub API and a Python wrapper for it.

As of today, the Plone collective has more than 1.1k repositories, so we need to use authenticated access to the API to avoid running out of requests (the rate limit allows up to 60 requests per hour for unauthenticated requests and 5,000 for authenticated ones). In this specific example we used 1,171 requests to the API.
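To see why authentication matters here, this small sketch does the arithmetic with the figures quoted above. The function name hourly_windows is invented for illustration, and it assumes the quota simply resets once per hour.

```python
import math

REQUESTS_NEEDED = 1171       # requests used in this example
UNAUTHENTICATED_LIMIT = 60   # requests per hour
AUTHENTICATED_LIMIT = 5000   # requests per hour


def hourly_windows(requests, limit_per_hour):
    """Number of one-hour rate-limit windows needed to issue `requests` calls."""
    return math.ceil(requests / limit_per_hour)


print(hourly_windows(REQUESTS_NEEDED, UNAUTHENTICATED_LIMIT))  # 20
print(hourly_windows(REQUESTS_NEEDED, AUTHENTICATED_LIMIT))    # 1
```

Without authentication the same run would have to be spread over about 20 hourly windows; with it, everything fits in a single one.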

Using the wrapper, we iterate over all of the organization's repositories and get information about each one only if my user name is listed among its contributors. We get the results as a list of tuples (repo, total).
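The loop described above could look roughly like the following. It is a sketch, not the script used for this post: instead of a wrapper library it talks to the GitHub REST API directly with the standard library, using the "list organization repositories" and "list repository contributors" endpoints. The function names, the fetch parameter and the token argument (a personal access token) are assumptions made for illustration.

```python
import json
import urllib.request

API = "https://api.github.com"
PER_PAGE = 100  # largest page size the API accepts


def get_json(url, token):
    """Fetch one page from the GitHub REST API using token authentication."""
    request = urllib.request.Request(url, headers={"Authorization": "token " + token})
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
    # An empty body (the API may answer that way for an empty repository)
    # is treated as "no items".
    return json.loads(body) if body else []


def paginate(url, token, fetch=get_json):
    """Yield every item of a paginated listing, one request per page."""
    page = 1
    while True:
        items = fetch("%s?per_page=%d&page=%d" % (url, PER_PAGE, page), token)
        for item in items:
            yield item
        if len(items) < PER_PAGE:  # a short page is the last one
            break
        page += 1


def contributions(org, login, token, fetch=get_json):
    """Return (repo, total) tuples for the repositories `login` contributed to."""
    results = []
    for repo in paginate("%s/orgs/%s/repos" % (API, org), token, fetch):
        url = "%s/repos/%s/%s/contributors" % (API, org, repo["name"])
        for contributor in paginate(url, token, fetch):
            if contributor["login"] == login:
                results.append((repo["name"], contributor["contributions"]))
                break  # found the user, skip the rest of this repository
    return results
```

The fetch parameter exists only so the network call can be swapped out, for example with a stub while experimenting; in normal use you would call contributions("collective", "your-login", token) and leave it alone.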

After that, we sort the list in reverse order using the total commits as the key and compute the overall sum of commits.
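A minimal sketch of that last step, using made-up (repo, total) tuples in place of the real results:

```python
# Hypothetical (repo, total) tuples standing in for the real results.
results = [("example.alpha", 12), ("example.beta", 240), ("example.gamma", 3)]

# Sort in reverse order using the total commits as the key.
ranked = sorted(results, key=lambda item: item[1], reverse=True)

# Sum of all commits across the repositories.
total_commits = sum(total for _, total in ranked)

for repo, total in ranked:
    print("%-20s %5d" % (repo, total))
print("%d repositories, %d commits" % (len(ranked), total_commits))
```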

As you can see, I have contributed to 74 repositories on the Plone collective, making 2,975 commits in total.

Not bad, is it? :-)