Course materials and documentation for DS2002
The goal of this activity is to familiarize you with version control using Git and GitHub. These tools are essential for tracking changes in your code, collaborating with others, managing project history, and contributing to open-source projects.
If the initial examples feel like a breeze, challenge yourself with activities in the Advanced Concepts section and explore the resource links at the end of this post.
At your table, select one person to set up a new repository on GitHub. Work through these steps:
main branch.
cd
git clone https://github.com/CREATOR_USERNAME/REPO_NAME.git
cd REPO_NAME
Replace CREATOR_USERNAME and REPO_NAME with the actual GitHub username and repository name.
Important: Make sure you are not inside an existing Git repository when running the git clone command. You don’t want to create nested Git repositories.
Create a file named after yourself (e.g., alice.txt, bob.txt). Each team member should commit and push their files to the GitHub repository:
echo "Hello from Alice" > alice.txt
git add alice.txt
git commit -m "Add alice.txt"
git push origin main
git pull origin main --no-rebase
(The --no-rebase flag explicitly asks Git to merge, which avoids the "divergent branches" prompt in newer Git versions.)
So far, so good. Let’s take it to the next level!
When collaborating, team members may work in parallel on local copies of the same file. The copies diverge, and the resulting version conflicts need to be resolved. Let’s simulate such a scenario.
Create collision.txt in your local repository. The file should contain a single line with your first name and favorite animal, separated by a comma. Track, commit, and push it to the remote repo on GitHub:
echo "Alice, cat" > collision.txt
git add collision.txt
git commit -m "Add collision.txt"
git push origin main
The early bird gets the worm: If you are the first person to push the collision.txt file, you’re in luck—the push should go through without a hitch. However, the others will encounter an error message like this:
! [rejected] main -> main (fetch first)
error: failed to push some refs to 'https://github.com/YOUR_USERNAME/REPO_NAME.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
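One way to see why a push like this is rejected is to fetch and then count the commits on each side. The following is a minimal local sketch, not part of the activity itself: a bare repository stands in for GitHub, and the alice and bob clones, their identities, and the README content are assumptions made up for illustration.

```shell
set -eu
sandbox=$(mktemp -d)   # throwaway directory; nothing here touches GitHub
cd "$sandbox"

# A bare repository stands in for the shared GitHub repo.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Alice pushes an initial commit (like GitHub's README), then Bob clones.
git clone -q remote.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any Git version
git config user.name "Alice"
git config user.email "alice@example.com"
echo "# Team repo" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q origin main
cd "$sandbox"
git clone -q remote.git bob
git -C bob config user.name "Bob"
git -C bob config user.email "bob@example.com"

# Alice pushes a commit that Bob does not have yet.
cd "$sandbox/alice"
echo "Alice, cat" > collision.txt
git add collision.txt
git commit -q -m "Add collision.txt"
git push -q origin main

# Bob commits locally, then fetches (download only, nothing is merged).
cd "$sandbox/bob"
echo "Bob, dog" > collision.txt
git add collision.txt
git commit -q -m "Add collision.txt"
git fetch -q origin

# Commits only on Bob's main (left) and only on origin/main (right).
counts=$(git rev-list --left-right --count main...origin/main)
echo "$counts"
```

A nonzero second number means the remote holds commits that are missing locally, which is the situation the rejection message describes.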
To resolve the conflict:
Starting with the group member next to the first person who successfully pushed, go clockwise and perform the following steps:
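git pull origin main --no-rebase
(The --no-rebase flag explicitly asks Git to merge, which avoids the "divergent branches" prompt in newer Git versions.)
This starts a merge of the remote changes into your local branch.
Because both versions add collision.txt with different content, Git cannot merge them automatically: it pauses and reports a conflict. VS Code (or your editor) will highlight the conflicting lines in collision.txt.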
Alice, cat
Bob, dog
Carol, bird
git add collision.txt
git commit
This completes the merge commit.
git push origin main
collision.txt file on GitHub.
Congratulations, you did it! You are ready for Lab 02.
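To rehearse the whole conflict cycle without involving teammates, the steps can be replayed in a throwaway directory. This is a sketch under assumptions: a bare repository plays the role of GitHub, the alice and bob clones and their identities are invented, and the pull uses --no-rebase, the documented git pull option for requesting a merge.

```shell
set -eu
sandbox=$(mktemp -d)   # throwaway directory; nothing here touches GitHub
cd "$sandbox"

# A bare repository stands in for the shared GitHub repo.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Alice pushes an initial commit, then Bob clones.
git clone -q remote.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo "# Team repo" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q origin main
cd "$sandbox"
git clone -q remote.git bob
git -C bob config user.name "Bob"
git -C bob config user.email "bob@example.com"

# Alice is the early bird: her collision.txt reaches the remote first.
cd "$sandbox/alice"
echo "Alice, cat" > collision.txt
git add collision.txt
git commit -q -m "Add collision.txt"
git push -q origin main

# Bob commits his own version, then pulls. The pull stops with a conflict
# and exits non-zero, so "|| true" keeps the script running.
cd "$sandbox/bob"
echo "Bob, dog" > collision.txt
git add collision.txt
git commit -q -m "Add collision.txt"
git pull -q origin main --no-rebase || true
cat collision.txt      # shows the <<<<<<<, =======, >>>>>>> conflict markers

# Resolve by keeping both lines, finish the merge commit, and push.
printf 'Alice, cat\nBob, dog\n' > collision.txt
git add collision.txt
git commit -q --no-edit
git push -q origin main
```

Here the resolution is written with printf so the script runs unattended; in the activity you would edit the file by hand and remove the conflict markers yourself.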
Read git in Data Science for a brief introduction.
Then work through the Creating and Managing Git Repositories Exercises. These exercises will cover:
git branch
This shows all local branches. The current branch is marked with an asterisk (*).
git switch -c feature-branch
The -c flag creates a new branch and switches to it immediately. Alternatively, you can create a branch first with git branch feature-branch and then switch to it with git switch feature-branch.
# be safe, make sure you are not losing anything
git add .
git commit -m "committing everything before getting files from other branches"
# now it is safe to switch
git switch main
This switches you to the main branch. Make sure you’ve committed or stashed any changes before switching branches.
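The branch commands above can be tried safely in a scratch repository. The sketch below is illustrative only: the demo directory, the file names, and the Student identity are assumptions, and git switch needs Git 2.23 or newer.

```shell
set -eu
sandbox=$(mktemp -d)   # throwaway location for experimenting
cd "$sandbox"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any Git version
git config user.name "Student"
git config user.email "student@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Create a branch, commit a file that exists only there, then switch back.
git switch -q -c feature-branch
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "Add feature.txt"
git switch -q main

git branch   # lists both branches; the asterisk marks main
ls           # feature.txt is not listed: it lives only on feature-branch
```

Because the work was committed before switching, nothing is lost: switching back to feature-branch brings feature.txt back into the working directory.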
Pull requests (PRs) are a way to propose changes to a repository. When you create a pull request, you’re asking the repository maintainer to review and merge your changes into the main branch. Pull requests allow for code review, discussion, and collaboration before changes are integrated into the project.
git switch -c my-feature
echo "## Features" >> README.md
echo "- Feature 1" >> README.md
git add README.md
git commit -m "Add features section to README"
git push -u origin my-feature
The -u flag sets up tracking between your local branch and the remote branch, so future git push and git pull commands know which remote branch to use.
git switch main
git pull origin main --no-rebase
git branch -d my-feature
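To check what the -u flag actually recorded, you can ask Git which remote branch a local branch tracks. The sketch below runs entirely locally under stated assumptions: a bare repository stands in for GitHub, and the work directory and Student identity are made up. Opening and merging the pull request itself happens on GitHub and is not simulated here.

```shell
set -eu
sandbox=$(mktemp -d)   # throwaway directory; nothing here touches GitHub
cd "$sandbox"

# A bare repository stands in for the GitHub remote.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main
git clone -q remote.git work 2>/dev/null
cd work
git symbolic-ref HEAD refs/heads/main
git config user.name "Student"
git config user.email "student@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q origin main

# Same feature-branch steps as in the text, pushed with -u.
git switch -q -c my-feature
echo "## Features" >> README.md
git add README.md
git commit -q -m "Add features section to README"
git push -q -u origin my-feature

# Ask Git which remote branch my-feature now tracks.
upstream=$(git rev-parse --abbrev-ref 'my-feature@{upstream}')
echo "$upstream"
```

The last command prints origin/my-feature, which is why a plain git push or git pull on that branch needs no extra arguments afterwards.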
For an additional challenge, work through the scenario in the Advanced Git Demo.
You may already have a project set up in a directory on your computer (or in a codespace), but it’s not set up as a Git repository yet. The following steps show you how to initialize it and connect it to GitHub.
cd # go to your home directory, or any other directory that is NOT inside an existing repo
mkdir my-git-project
cd my-git-project
git init
ls -la .git
You should see a .git directory containing the repository metadata.
Note: This repository only exists in your local environment; it is not on GitHub yet.
# Install GitHub CLI if not already installed
# Then create the repository:
gh repo create my-git-project --public --source=. --remote=origin --push
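This single command creates the GitHub repository, sets it as the origin remote, and pushes your code. Commit at least one file first, since the --push option needs a commit to send.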
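Both the clone and init instructions warn against nesting repositories. One way to check where you stand is git rev-parse --is-inside-work-tree, which succeeds only inside a Git working tree. The inside_repo helper and the temporary directory below are assumptions for illustration, not part of the course material.

```shell
set -eu
sandbox=$(mktemp -d)   # throwaway directory, normally outside any repository
cd "$sandbox"

# Succeeds (exit status 0) only when the current directory is inside a Git working tree.
inside_repo() {
  git rev-parse --is-inside-work-tree >/dev/null 2>&1
}

if inside_repo; then state_before="inside a repo"; else state_before="not inside a repo"; fi
echo "Before git init: $state_before"

mkdir my-git-project
cd my-git-project
git init -q

if inside_repo; then state_after="inside a repo"; else state_after="not inside a repo"; fi
echo "After git init: $state_after"
```

If the first check already reports that you are inside a repo, move to another directory before running git clone or git init.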
If you want to explore additional Git features, review the Advanced git tutorial.
GitHub allows you to create new repositories from templates, which can include pre-configured files, workflows, and settings. This is useful for starting projects with best practices already in place.
The course repository includes a template URL for creating repositories with security best practices. Here’s how to use it:
Step 1: Get the template URL
The template URL is located in github-new-repo-from-template.txt in this directory (practice/03-git/). The URL format is:
https://github.com/new?owner=YOUR_USERNAME&template_name=secure-repository-supply-chain&template_owner=skills&name=YOUR_REPO_NAME&visibility=public
Step 2: Customize the URL
Replace the placeholders:
YOUR_USERNAME - Your GitHub username or organization name
YOUR_REPO_NAME - The name you want for your new repository
visibility=public - Change to visibility=private if you want a private repository
Step 3: Create the repository
Example:
If your username is johndoe and you want to create a repo called my-secure-project:
https://github.com/new?owner=johndoe&template_name=secure-repository-supply-chain&template_owner=skills&name=my-secure-project&visibility=public
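If you prefer not to edit the URL by hand, it can be assembled from shell variables. This is a small sketch; the variable names owner, repo, and visibility are chosen here for illustration, and the values are the example ones from the text.

```shell
owner="johndoe"              # your GitHub username or organization
repo="my-secure-project"     # name for the new repository
visibility="public"          # or "private"

# Keep the URL in double quotes: an unquoted & would be interpreted by the shell.
url="https://github.com/new?owner=${owner}&template_name=secure-repository-supply-chain&template_owner=skills&name=${repo}&visibility=${visibility}"
echo "$url"
```

Paste the printed URL into your browser to open GitHub's repository creation page with the fields filled in.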
What you get:
The “secure-repository-supply-chain” template from GitHub Skills includes:
Alternative: Using GitHub’s Web Interface
You can also create a repository from a template using GitHub’s web interface: