Every so often we need to find out what activity has taken place in a particular repository over the last couple of months. It may be for an audit, or to analyse how actively a project is being developed. This raises the big question: how do we get the commit details? Do I need to do it manually? Sounds tedious!
Thankfully, GitHub's API comes to our rescue. The API gives us programmatic access to perform many tasks, and we can use it to get the required statistics for our repository.
The following are the two easy steps to get the details:
- Get a GitHub API token with proper permissions
- Access the particular route of the GitHub API to get the required statistics.
Explanation:
Get a GitHub API token with proper permissions: All information for getting a personal access token for quick access to the GitHub API is available in link.
Note: It's recommended to grant a token only the access required for its particular task.
Access the particular route of the GitHub API to get the required statistics: Now we will focus on creating the actual logic for getting the commits of a particular repository.
- The GitHub API has a handy feature that returns the last year of commit activity for a particular repo. If you want more information on this, please visit the GitHub API documentation.
So before starting our code, we have some prerequisites:
Set 'GIT_TOKEN' as an environment variable, with the token value from the first step as its value.
Run "python -m pip install requests" in your terminal.
- After installing the requests module we can proceed directly, as all the other modules used in our code ship with Python by default. I am using Python version 3.7.9.
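As a minimal sketch of how the first prerequisite might be consumed in code, the helpers below read the token from the 'GIT_TOKEN' environment variable and fail early if it is missing. The names `read_token` and `build_headers` are hypothetical, chosen only for illustration; they are not part of any library.

```python
import os


def read_token(env_var="GIT_TOKEN"):
    """Return the token stored in the given environment variable."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending an empty token
        raise RuntimeError(f"Environment variable {env_var!r} is not set")
    return token


def build_headers(token):
    """Build the Authorization header used for GitHub API requests."""
    return {"Authorization": f"token {token}"}
```

Reading the token from the environment keeps it out of the source file, so the script can be shared without leaking the credential.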
Import the modules shown below:
import requests
import os
from pprint import pprint
from datetime import datetime
You will see shortly why we have imported these modules.
- Now we first need to check the response of the route below: GET /repos/{owner}/{repo}/stats/commit_activity
owner: owner of the repository; repo: repository name. You can check it with the help of 'postman', or you can simply do the following: create a temporary 'temp.py' file and enter the code below in it:
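import os
from pprint import pprint

import requests

# Read the token from the 'GIT_TOKEN' environment variable set earlier
token = os.getenv("GIT_TOKEN")
# Replace these values with your own repository details
owner = "gkorgtest"
repo = "testrepo"
query_url = f"https://api.github.com/repos/{owner}/{repo}/stats/commit_activity"
headers = {"Authorization": f"token {token}"}
r = requests.get(query_url, headers=headers)
testdata = r.json()
pprint(testdata)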
We will get output like the one below (abridged here):
{'days': [0, 1, 0, 1, 2, 1, 0], 'total': 5, 'week': 1593907200},
{'days': [0, 0, 0, 0, 0, 0, 0], 'total': 0, 'week': 1605398400},
{'days': [0, 0, 0, 0, 0, 0, 0], 'total': 0, 'week': 1606003200},
{'days': [0, 0, 0, 1, 0, 0, 0], 'total': 1, 'week': 1606608000},
{'days': [0, 0, 0, 0, 0, 0, 0], 'total': 0, 'week': 1607212800}]
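One caveat worth knowing: GitHub's statistics routes can answer with a 202 status while the statistics are still being computed, in which case the body is not a list like the one above and the request should be repeated after a short wait. The following is a minimal sketch of such a retry loop. The name `fetch_stats`, its parameters, and the default retry count and delay are assumptions made for illustration.

```python
import time


def fetch_stats(get_response, retries=5, delay=2.0):
    """Call get_response() until it returns status 200 or retries run out.

    get_response is any zero-argument callable returning an object with
    a .status_code attribute and a .json() method, for example:
        lambda: requests.get(query_url, headers=headers)
    """
    for _ in range(retries):
        response = get_response()
        if response.status_code == 200:
            return response.json()
        if response.status_code != 202:
            # Anything other than "still computing" is treated as an error
            raise RuntimeError(f"Unexpected status: {response.status_code}")
        time.sleep(delay)  # statistics not ready yet, wait and try again
    raise TimeoutError("Statistics were not ready after retrying")
```

Passing the request in as a callable keeps the retry logic independent of the HTTP library, which also makes it easy to try out without network access.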
So now we know that the API response is a list of dictionaries. The 'week' value in each dictionary is a Unix timestamp marking the start of that week, so we need to convert it to a human-readable format. This is where the 'datetime' module helps us.
We will create two functions: one for converting a Unix timestamp to a human-readable format, and another for getting the monthly commits of a particular repository over the last year.
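A minimal sketch of the first function, the timestamp conversion, could look like this. The name `to_readable` and the output format are assumptions made for illustration; the timestamp is interpreted as UTC so the result does not depend on the local time zone of the machine.

```python
from datetime import datetime, timezone


def to_readable(unix_ts):
    """Convert a Unix timestamp to a string such as '05-July-2020'."""
    moment = datetime.fromtimestamp(unix_ts, tz=timezone.utc)
    return moment.strftime("%d-%B-%Y")


# 'week' value taken from the sample response
print(to_readable(1593907200))  # 05-July-2020
```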
Below is the function for getting the monthly commits of a particular repository over the last year.
The core logic is that it checks whether the 'month-year' key is already present in the dictionary. If not, it assigns the commits as the value for that key; if it is already present, it adds the commits to the existing value for that 'month-year'.
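The logic described above can be sketched as follows. This is a minimal version under stated assumptions, not necessarily the exact code from the gist: the names `month_year` and `monthly_commits` are hypothetical, each week's 'total' is attributed to the month in which the week starts, and timestamps are read as UTC.

```python
from datetime import datetime, timezone


def month_year(unix_ts):
    """Convert a Unix timestamp to a 'Month-Year' key such as 'July-2020'."""
    return datetime.fromtimestamp(unix_ts, tz=timezone.utc).strftime("%B-%Y")


def monthly_commits(weeks):
    """Sum the weekly 'total' values per 'Month-Year' key."""
    commits = {}
    for week in weeks:
        key = month_year(week["week"])
        if key not in commits:
            # First week seen for this month: start the count
            commits[key] = week["total"]
        else:
            # Month already present: add this week's commits
            commits[key] += week["total"]
    return commits


# A few entries taken from the sample response
sample = [
    {"days": [0, 1, 0, 1, 2, 1, 0], "total": 5, "week": 1593907200},
    {"days": [0, 0, 0, 0, 0, 0, 0], "total": 0, "week": 1605398400},
    {"days": [0, 0, 0, 1, 0, 0, 0], "total": 1, "week": 1606608000},
    {"days": [0, 0, 0, 0, 0, 0, 0], "total": 0, "week": 1607212800},
]
print(monthly_commits(sample))
# {'July-2020': 5, 'November-2020': 1, 'December-2020': 0}
```

Because a week is counted under the month in which it starts, a commit made late in a week that spans two months lands in the earlier month. Summing the 'days' values individually would be more precise, at the cost of a slightly longer loop.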
The output of the code will be as follows:
{'April-2020': 0,
'August-2020': 0,
'December-2019': 0,
'December-2020': 1,
'February-2020': 0,
'January-2020': 0,
'July-2020': 0,
'June-2020': 0,
'March-2020': 0,
'May-2020': 0,
'November-2020': 0,
'October-2020': 0,
'September-2020': 0}
That's it, guys. For the complete code, please refer to the gist link. I have used 'gkorgtest' as the owner and 'testrepo' as the repo for this code; you can replace the values as per your requirement.