Git and GitHub
Learning Objectives
- Understand why version control is essential for reproducible research
- Learn basic Git commands and workflows
- Learn how to collaborate using GitHub
- Develop best practices for version control in scientific computing
Why Version Control?
Version control is essential for:
- Tracking Changes: Keep a complete history of your project
- Collaboration: Work effectively with others
- Backup: Never lose your work
- Reproducibility: Return to any previous state of your project
- Documentation: Understand why changes were made
Git Basics
If you are using VS Code, you can check the tutorial here.
Setting Up Git
First time setup:
Initialize a repository only when you create a project
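A minimal sketch of both steps, assuming placeholder identity values (`Your Name`, `you@example.com`) and a hypothetical project directory named `my_project`; substitute your own values:

```shell
# One-time setup: identify yourself in every commit you make.
# "Your Name" and the e-mail address are placeholders; use your own.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Per-project setup: run once, when you create the project.
# "my_project" is a hypothetical directory name.
mkdir my_project
cd my_project
git init
```

The `--global` settings are stored once per user account, while `git init` is repeated for each new project.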
Basic Git Workflow
- Check Status
- Stage Changes
- Commit Changes
- View History
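The four steps can be sketched as follows. This is an illustration rather than a prescribed recipe: it runs in a throwaway repository, and the file name `analysis.py`, the identity values, and the commit message are all made up for the example.

```shell
# Work in a throwaway repository so the example is safe to run anywhere.
cd "$(mktemp -d)"
git init
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

echo "print('hello')" > analysis.py        # hypothetical project file

git status                                 # 1. Check status: analysis.py is untracked
git add analysis.py                        # 2. Stage changes
git commit -m "Add analysis script"        # 3. Commit changes
git log --oneline                          # 4. View history
```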
Best Practices for Commits
- Commit Often: Make small, logical commits
- Avoid Half-Done Work: Only commit logically completed tasks. If you forgot something, stage it and use `git commit --amend --no-edit` to add it to your last commit, but only if you haven't shared that commit with others.
- Commit Related Changes: Group related changes into one commit
- Write Clear Messages: Use descriptive commit messages
- One Change Per Commit: Each commit should represent one logical change
- Test Before Committing: Ideally the code in the `main` branch should always work. Ensure code works before committing
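As an illustration of the `--amend` advice above, the sketch below forgets a file and then folds it into the previous commit. The file names and commit message are hypothetical, and the repository is a throwaway one.

```shell
cd "$(mktemp -d)"
git init
git config user.name "Example User"        # placeholder identity, local to this repo
git config user.email "user@example.com"

echo "step 1" > notes.txt
echo "step 2" > results.txt
git add notes.txt
git commit -m "Add notes and results"      # oops: results.txt was never staged

git add results.txt                        # stage the forgotten file
git commit --amend --no-edit               # fold it into the previous commit

git log --oneline                          # history still shows a single commit
git show --stat --oneline HEAD             # the commit now lists both files
```

Amending replaces the old commit with a new one, which is why it is only safe before the commit has been pushed or shared.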
GitHub
You can read the official GitHub documentation for the suggested workflow for collaborating on GitHub.
Tools like GitHub Desktop and the VS Code GitHub integration can help you manage your repositories more conveniently. You can also create your own workflows using tools like GNU Make together with the GitHub command-line tool `gh`.
GitHub beyond repository
GitHub is not just a place to host your code. It is also a social network for developers. You can follow others, see what they are working on, star their repositories, and collaborate with them.
GitHub also serves as your resume and can help you get a job. Imagine an employer visiting your GitHub profile and seeing all the 1k-star projects you have worked on.
Setting Up GitHub
- Create a GitHub account at github.com
- Set up SSH keys for secure authentication:
- Add the public key to your GitHub account
- (Optional) Install the GitHub command-line tool `gh`
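For the SSH key step, a minimal sketch might look like the following. It assumes an Ed25519 key and, purely so the commands run without prompts, writes the key pair to a scratch directory with an empty passphrase; the e-mail address is a placeholder.

```shell
# Demo only: scratch location and empty passphrase keep this non-interactive.
# For a real key, drop -f and -N, accept the default location
# (~/.ssh/id_ed25519), and choose a passphrase when prompted.
keydir="$(mktemp -d)"
ssh-keygen -t ed25519 -C "you@example.com" -f "$keydir/id_ed25519" -N "" -q

# The .pub file is the public key: paste its contents into GitHub under
# Settings > SSH and GPG keys > New SSH key. Never share the private key.
cat "$keydir/id_ed25519.pub"

# Once a real key is registered, the connection can be checked with:
#   ssh -T git@github.com
```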
GitHub Collaboration Workflow
You can follow the VS Code tutorial here for the GitHub collaboration workflow in VS Code. We suggest the following workflow for software development on GitHub. You may use Make to automate your workflow instead of memorizing all the commands.
Here is a visualization of the workflow in terms of the commit status of the remote GitHub repository, the local Git repository, and the local disk. Cross-reference it with the text instructions below for better understanding.
Initial Repository Setup and Branch Creation
- Clone the Repository
  - Use `git clone` to download the remote repository to your local machine. (If the GitHub repository does not belong to you or your collaborators, fork it to your own account first and clone your fork.)
- Create a New Working Branch
  - Every new feature should be developed in a new branch
  - Execute `git checkout -b my_feature` to create and switch to a new branch named `my_feature`
  - This creates a new branch starting at the current commit. Your files are unchanged, and new commits are recorded on `my_feature` instead of `main`
Code Development and Local Changes
- Modify Local Code
  - Make necessary changes or additions to source files on your local hard drive
- Review Code Changes
  - Use `git diff` to inspect modifications made to the codebase
- Commit Local Changes
  - Use `git commit -a -m "Descriptive message about changes"` to record your changes in the local Git repository. The `-a` flag stages every modified tracked file; new files must first be added with `git add`
- Push Branch to Remote Repository
  - Execute `git push origin my_feature` to upload the local `my_feature` branch to GitHub
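The steps from cloning through pushing can be sketched end to end. To keep the example runnable offline, a local bare repository stands in for GitHub; with a real project you would clone its SSH or HTTPS URL and skip the seeding step. All paths, file names, and commit messages are illustrative.

```shell
# A local bare repository plays the role of the GitHub remote (assumption).
work="$(mktemp -d)"
git init --bare "$work/remote.git"

# Clone the repository
git clone "$work/remote.git" "$work/project"
cd "$work/project"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Seed main with a first commit (a real project already has history)
echo "# Project" > README.md
git add README.md
git commit -m "Initial commit"
git branch -M main
git push origin main

# Create a new working branch
git checkout -b my_feature

# Modify local code, then review the changes
echo "Usage notes" >> README.md
git diff

# Commit local changes (-a stages the modified tracked file)
git commit -a -m "Add usage notes to README"

# Push the branch to the remote repository
git push origin my_feature
```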
Handling Remote Repository Updates
- Switch to Main Branch
  - Use `git checkout main` to return to the primary branch
  - Warning: Commit (or stash) your work before switching branches. Git carries uncommitted changes over to the branch you switch to, or refuses to switch if they would be overwritten.
- Pull Latest Changes
  - Execute `git pull origin main` (or `git pull origin master` if the default branch is named `master`) to update your local repository with remote modifications
- Return to Working Branch
  - Switch back to the `my_feature` branch using `git checkout my_feature`
- Rebase Working Branch
  - Use `git rebase main` to replay your branch's commits on top of the updated main branch
  - Note: Conflicts may arise and require you to choose manually which code to keep
  - Understand the difference between `git pull` and `git rebase` here.
- Force Push Updated Branch
  - Execute `git push -f origin my_feature` to update the remote branch with the rebased commits. A force push is needed because rebasing rewrites the branch's history
- Pull Request
  - After you finish adding all the new features in the branch, create a pull request on GitHub
- Merge Request
  - The project owner can use "squash and merge" in the GitHub pull request to consolidate commits
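The update-and-rebase steps can be sketched as below. As an assumption made for the sake of a runnable example, a local bare repository stands in for GitHub and a second clone simulates a collaborator pushing to `main`; the file names and messages are invented.

```shell
# --- Setup: a stand-in remote, your clone, and a pushed feature branch ---
work="$(mktemp -d)"
git init --bare "$work/remote.git"
git clone "$work/remote.git" "$work/project"
cd "$work/project"
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Project" > README.md
git add README.md
git commit -m "Initial commit"
git branch -M main
git push origin main

git checkout -b my_feature
echo "feature" > feature.txt
git add feature.txt
git commit -m "Add feature"
git push origin my_feature

# --- Meanwhile, a collaborator (simulated by a second clone) updates main ---
git clone -b main "$work/remote.git" "$work/colleague"
(
  cd "$work/colleague"
  git config user.name "Colleague"
  git config user.email "colleague@example.com"
  echo "docs" > docs.txt
  git add docs.txt
  git commit -m "Add docs"
  git push origin main
)

# --- The workflow itself ---
git checkout main               # switch to main branch
git pull origin main            # pull latest changes
git checkout my_feature         # return to working branch
git rebase main                 # replay "Add feature" on top of the new main
git push -f origin my_feature   # history was rewritten, so a force push is needed
```

On a shared branch, `git push --force-with-lease` is a more cautious alternative to `-f`, since it refuses to overwrite remote commits you have not fetched.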
Tip: Always rebase
- Always rebase your branch on the main branch before creating a pull request.
- This will make your pull request history cleaner and easier to review.
- With your feature branch checked out, you can simply use `git pull origin main --rebase` to rebase your branch on the GitHub main branch.
Advanced Git Usage
Resolving Merge Conflicts
We recommend using VS Code to resolve conflicts; see the tutorial here. You can also watch this video tutorial.
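For readers who prefer the command line, the following sketch produces a conflict on purpose and resolves it by hand. It is one possible approach, not the recommended editor-based one; the file `config.txt` and its contents are hypothetical.

```shell
cd "$(mktemp -d)"
git init
git config user.name "Example User"
git config user.email "user@example.com"

echo "threshold = 1" > config.txt
git add config.txt
git commit -m "Add config"
git branch -M main

# Two branches change the same line in different ways
git checkout -b my_feature
echo "threshold = 2" > config.txt
git commit -a -m "Set threshold to 2"

git checkout main
echo "threshold = 3" > config.txt
git commit -a -m "Set threshold to 3"

# The merge stops with a conflict and a non-zero exit status, hence "|| true"
git merge my_feature || true
cat config.txt                  # shows the <<<<<<<, =======, >>>>>>> markers

# Resolve: rewrite the file with the content you want, then stage and commit
echo "threshold = 2" > config.txt
git add config.txt
git commit -m "Merge my_feature, keeping threshold = 2"
```

During a rebase the procedure is the same up to `git add`, after which `git rebase --continue` takes the place of the final commit.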
.gitignore
Create a `.gitignore` file to exclude:
- Large data files
- Sensitive information
- Generated files
- System files
Example `.gitignore`:
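A possible starting point, written as a shell sketch so its effect can be checked in a throwaway repository. Every pattern and file name here is illustrative and should be adapted to your project.

```shell
cd "$(mktemp -d)"
git init

# Write an example .gitignore, one group per category listed above
cat > .gitignore <<'EOF'
# Large data files
data/raw/
*.h5

# Sensitive information
.env

# Generated files
__pycache__/
*.log

# System files
.DS_Store
EOF

# Create some files to see the effect
mkdir -p data/raw
touch data/raw/big.csv .env run.log .DS_Store analysis.py

git status --short            # lists only .gitignore and analysis.py
git check-ignore -v .env      # reports which pattern matched
```

Note that `.gitignore` only affects untracked files; a file that is already committed stays tracked until it is removed from the index with `git rm --cached`.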