Git & Chisubmit

While you will be able to do your CAPP 30121 labs on your Virtual Machine, you should plan to complete this lab in the Computer Science Instructional Lab (CSIL).

This lab assumes that you attended the Linux workshop and set up your CAPP 30121 repository in CSIL. It also assumes that, in your capp30121-aut-19-username directory, you replaced “Firstname Lastname” in the file lab1/test.txt with your name and changed “World” in the file lab1/hello_world.py to something else (“Gustav”, for example).

If you did not attend the Linux workshop, and thus have not set up your CAPP 30121 repository, please follow these instructions before you start working through this material. Also, once you have your repository, please make the required change to lab1/test.txt.

Objectives

  1. Learn the basics of git

  2. Learn how to use chisubmit, our submission software

Git

Git is a version control system that is widely used for developing software in a group. It maintains files and a history of all the changes applied to them. You will each have a personal Git repository that is hosted on a central server. The server stores the project files and all changes to those files that have been uploaded to the repository.

We have created accounts and repositories for each of you on a CS department Git server. We will seed your repositories with templates and files that you need for labs and programming assignments. Also, we will be able to see any changes you upload to your repository, which allows us to provide help remotely, grade your programming assignments, and provide feedback.

Git tracks every version of a file or directory using commits. When you have made changes to one or more files, you can logically group those changes into a “commit” that gets added to your repository. You can think of commits as “checkpoints” in your work, representing the work you’ve done since the previous checkpoint. This mechanism makes it possible to look at and even revert to older versions of a file by going back to your code as it was when you “checkpointed” it with a commit.

When using Git, your basic working cycle will be:

  • Log into a CS machine (or your VM)

  • Change to your capp30121-aut-19-username directory

  • Download updates from the Git server (we will add files to your repository throughout the quarter). In Git, this operation is called pulling from the server.

  • Work on your files

  • Create a commit with any changes you have made

  • Upload the commit to the Git server. In Git, this operation is called pushing to the server.

The course staff does not have access to any files stored in your home directory or files on your laptop. All we can access are files that have been pushed to the Git server, so remember to always push your latest commits when you’re done or when you ask a question on Piazza that will require us to look at your code.
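
The cycle above can be tried safely in a throwaway sandbox before you touch your real repository. The following is a minimal sketch, not part of the course setup: it assumes a local bare repository created with mktemp as a stand-in for the Git server, a placeholder identity, and a sandbox directory named work. In your real repository the server, identity, and branch are already configured, so only the five numbered commands apply.

```shell
# Sandbox only: a local bare repository stands in for the course Git server.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"
git clone -q "$sandbox/server.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master        # use the branch name from this lab
git config user.name "Example Student"         # placeholder identity
git config user.email "student@example.com"

# Seed the stand-in server with one commit so there is something to pull.
echo 'print("Hello, World!")' > hello_world.py
git add hello_world.py
git commit -q -m "Initial version of hello_world.py"
git push -q -u origin master

# The working cycle itself:
git pull -q                                        # 1. download updates
echo 'print("Hello, Gustav!")' > hello_world.py    # 2. work on your files
git add hello_world.py                             # 3. choose what to commit
git commit -q -m "Greet Gustav instead of World"   # 4. create the commit
git push -q                                        # 5. upload the commit

# Both sides should now agree on the latest commit.
local_head=$(git rev-parse HEAD)
server_head=$(git --git-dir="$sandbox/server.git" rev-parse master)
echo "local:  $local_head"
echo "server: $server_head"
```

If the two hashes printed at the end match, the stand-in server has exactly what your working copy has.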

Please navigate to your capp30121-aut-19-username/lab1 directory using cd. username should always be substituted by your CNetID.

Creating a commit

Creating a commit is a two-step process. First, you have to indicate what files you want to include in your commit. Let’s say we want to create a commit that only includes the hello_world.py file that you modified as part of the Linux lab. We can specify this operation explicitly using the git add command from the Linux command-line:

$ git add hello_world.py

(Recall that we use $ to indicate the Linux command-line prompt. It is not part of the command.)

There are various shortcuts that will allow you to add all of the files in a directory, such as git add . or git add --all. Using these commands is poor practice, because you can easily end up adding files that you did not intend. Instead, it is better to add files explicitly (as shown above) when you create them and then use the following command:

$ git add -u

when you want to add any previously-added file that has changed since your last commit.

To create the commit, use the git commit command. This command will take all the files you added with git add and will bundle them into a commit:

$ git commit -m"Made some changes to hello_world.py"

The text after the -m is a short message that describes the changes you have made since your last commit. Common examples of commit messages might be “Finished part 1 of the homework” or “Finished lab 1”.
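
To see the difference between adding a file explicitly and using git add -u, you can experiment in a scratch repository. This sketch assumes a temporary directory, a placeholder identity, and a hypothetical extra file named notes.txt; none of these are part of your course repository.

```shell
# Scratch repository in a temporary directory (not your course repository).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"

# Add a new file explicitly and commit it.
echo 'print("Hello, World!")' > hello_world.py
git add hello_world.py
git commit -q -m "Added hello_world.py"

# Change the tracked file and create a file Git has never seen.
echo 'print("Hello, Gustav!")' > hello_world.py
echo 'scratch notes' > notes.txt              # hypothetical untracked file

# git add -u stages only files Git already tracks.
git add -u
staged=$(git diff --cached --name-only)
echo "Staged for the next commit: $staged"

git commit -q -m "Made some changes to hello_world.py"

# notes.txt was left alone and is still untracked.
untracked=$(git ls-files --others --exclude-standard)
echo "Still untracked: $untracked"
```

Only hello_world.py ends up in the second commit, which is the behavior that makes git add -u safer than git add . for everyday use.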

Note

If you forget the -m"Comment" at the end then Git will think that you forgot to specify a commit message. It will graciously open up a default editor so that you can enter such a message. On the CS machines this editor is vim. To leave vim without entering a message, press Esc and then ZZ (shift-z twice); Git will abort the commit because the message is empty. Now try git commit again and don’t forget the -m"Comment".

Once you run the above command, you will see something like the following output:

[master 99232df] Made some changes to hello_world.py
1 file changed, 1 insertion(+), 1 deletion(-)

You’ve created a commit, but you’re not done yet: you haven’t uploaded it to the server. Forgetting this step is a very common mistake, so don’t forget to upload your changes. You must use the git push command for your changes to actually be uploaded to the Git server. If you don’t push your commit, the instructors and graders will not be able to see your code. Simply run the following command from the Linux command-line:

$ git push

You should see something like this output:

Counting objects: 7, done.
Delta compression using up to 16 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 452 bytes, done.
Total 4 (delta 1), reused 0 (delta 0)
To git@git-dev.cs.uchicago.edu:capp30121-aut-19/username.git
   c8432e4..99232df  master -> master

You can ignore most of those messages. The important thing is to not see any warnings or error messages.

You can verify that our Git server correctly received your commit by visiting the following page:

https://mit.cs.uchicago.edu/capp30121-aut-19/username

where username should be replaced with your CNetID.

This URL takes you to the web frontend of our Git server (please note that you will have to log in using your CNetID and password). More specifically, the above URL will show you the contents of your repository, exactly as it appears on the Git server. You can click on “Files” to see your repository’s files, and on “Commits” to see the latest commits uploaded to the server. If you see a commit titled “Made some changes to hello_world.py”, then your commit was successfully uploaded.

In general, if you’re concerned about whether the graders are seeing the right version of your code, you can just go to the above URL. Whatever is shown on that page is what the graders will see. If you wrote some code, and it doesn’t show up in the above URL, make sure you didn’t forget to add your files, create a commit, and push the most recent commit to the server.
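
As a complement to the web page, you can also ask Git from the command line whether you have commits that have not been pushed yet. The sketch below assumes that your remote is named origin and your branch is master, and it uses a local bare repository as a stand-in for the server so that it can be run anywhere.

```shell
# Sandbox only: a local bare repository stands in for the course Git server.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"
git clone -q "$sandbox/server.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"

echo 'Firstname Lastname' > test.txt
git add test.txt
git commit -q -m "Added test.txt"
git push -q -u origin master

# Create a commit but do not push it yet.
echo 'Gustav Larsson' > test.txt
git add -u
git commit -q -m "Added my name as Author in test.txt"

# Count the commits that exist locally but not on the server.
before_push=$(git rev-list --count origin/master..HEAD)
echo "Unpushed commits before git push: $before_push"

git push -q

after_push=$(git rev-list --count origin/master..HEAD)
echo "Unpushed commits after git push: $after_push"
```

A count greater than zero means the graders cannot see your latest work yet.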

Pulling changes from “upstream”

When we distribute new homework assignments or lab materials, we will do so through Git. These files are located in a separate repository on our Git server, which we call the “upstream” repository. The setup script you ran earlier already configured your Git repository so you can easily download any new files we upload to the upstream repository. To download these changes, run this command from inside the capp30121-aut-19-username directory:

$ git pull upstream master

Run this command now. We moved a file between the first and second Linux workshops. You might see output like this, if you attended the first workshop:

Updating e73ccd2..5cc9de1
Fast-forward
grader.py => common/grader.py | 0
1 file changed, 0 insertions(+), 0 deletions(-)
rename grader.py => common/grader.py (100%)

Or like this, if you attended one of the later workshops:

From git-dev.cs.uchicago.edu:capp30121-aut-19/capp30121-aut-19
 * branch            master     -> FETCH_HEAD
Already up-to-date.

When you pull from “upstream”, Git automatically downloads any new files or changes that have been committed to “upstream” and updates the files in your repository. If you have made local changes to files that have changed upstream, Git will attempt to merge these changes.

After you’ve pulled from upstream, any new files or changes will only be downloaded to your local copy of capp30121-aut-19-username. As with any other changes to your code, you need to run git push to upload them to the Git server (you don’t need to do a git commit to prepare a commit, though; git pull already takes care of this task).
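
The relationship between “upstream”, your local copy, and your own repository on the server can be sketched with three throwaway repositories. Everything here is an assumption made for illustration: the two bare repositories stand in for the real servers, the staff and work directories and the lab1.txt and pa1.txt files are hypothetical, and the setup script has already done the equivalent of git remote add for you in your real repository.

```shell
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"   # stand-in for the upstream repository
git init -q --bare "$sandbox/student.git"    # stand-in for your repository on the server

# Course staff seed upstream with a first file.
git clone -q "$sandbox/upstream.git" "$sandbox/staff" 2>/dev/null
cd "$sandbox/staff"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Staff"         # placeholder identity
git config user.email "staff@example.com"
echo "lab 1 materials" > lab1.txt
git add lab1.txt
git commit -q -m "Add lab 1 materials"
git push -q origin master

# Your local copy knows about two remotes: origin and upstream.
git clone -q "$sandbox/student.git" "$sandbox/work" 2>/dev/null
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"
git remote add upstream "$sandbox/upstream.git"
git pull -q upstream master
git push -q -u origin master

# Staff distribute a new assignment through upstream.
cd "$sandbox/staff"
echo "pa1 materials" > pa1.txt
git add pa1.txt
git commit -q -m "Add pa1 materials"
git push -q origin master

# You pick it up, then upload it to your own repository.
cd "$sandbox/work"
git pull -q upstream master                  # new file arrives in the local copy only
git push -q                                  # now it is on your server repository too

student_head=$(git --git-dir="$sandbox/student.git" rev-parse master)
upstream_head=$(git --git-dir="$sandbox/upstream.git" rev-parse master)
echo "student repository: $student_head"
echo "upstream:           $upstream_head"
```

Note that no git commit was needed between the pull and the push, because the pulled commits already exist.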

Note

Every time you work on your code, you should run git pull upstream master in your capp30121-aut-19-username directory before you do anything else. Sometimes, the instructors notice typos or errors in the code provided for a programming assignment, and they’ll commit fixes to upstream. By running git pull upstream master, you can make sure that those fixes propagate to your code too.

Pulling your changes from the server

If you have done work and pushed it to the server from a lab computer and now wish to work on your VM (or vice versa), you will need to pull these changes from the server to your VM. To download these changes, run this command from inside the capp30121-aut-19-username directory on your VM:

$ git pull

It is important that you commit and push your changes after every session and that you pull from both upstream (git pull upstream master) and your own repository on the server (git pull) before you start to do any work.

Note

Your output may vary from our sample output slightly. Do not worry about the difference unless you see an error message or a warning message.

git add revisited and git status

So far, we’ve created a single commit with a single file that we had already supplied in the lab1 directory. If you create new files, Git will not consider them a part of the repository. You need to add them to your repository explicitly. For example, let’s create a copy of hello_world.py:

$ cp hello_world.py hello_universe.py

Is hello_universe.py part of your repository? You can use the following command to ask Git for a summary of the files it is tracking:

$ git status

This command should output something like this:

# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   test.txt
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   hello_universe.py
no changes added to commit (use "git add" to track).

The exact output may vary depending on how far along you got in the Linux workshop. However, the important thing is that there are two types of files listed here:

  • Changes not staged for commit: This is a list of files that Git knows about and that have been modified since your last commit, but which have not been added (with git add).

  • Untracked files: This is a list of files that Git has found in the same directory as your repository, but which Git isn’t keeping track of.

    You may see some automatically generated files in your Untracked files section. Files that start with a pound sign (#) or end with a tilde should not be added to your repository. Files that end with a tilde are backup files created by some editors that are intended to help you restore your files if your computer crashes. In general, files that are automatically generated should not be committed to your repository. Other people should be able to generate their own versions, if necessary.

To add a previously untracked file to your repository, you can just use git add (unlike the previous commands, don’t actually run this just yet; you will be doing a similar exercise later on):

$ git add hello_universe.py

If you re-ran git status you would see something like this:

# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   new file:   hello_universe.py
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   test.txt

Notice how there is now a new category of files: Changes to be committed. Adding hello_universe.py not only added the file to your repository, it also staged it into the next commit (which, remember, won’t happen until you actually run git commit).

The git status command reports the status on the local copy of the full repository. If you wish to look at the status of a smaller part of the repository (the directory you are working in for example), you can add a path name to the status command. For example:

$ git status .

reports the status of the current directory (a single dot is the path used to refer to the current directory).
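
The categories described in this section can be reproduced in a scratch repository. This sketch assumes a temporary directory and a placeholder identity, and it uses git status --short, a compact form of the same report in which “ M” marks a modified file that is not staged, “??” marks an untracked file, and “A ” marks a new file that is staged.

```shell
# Scratch repository in a temporary directory (not your course repository).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"

echo 'Firstname Lastname' > test.txt
echo 'print("Hello, World!")' > hello_world.py
git add test.txt hello_world.py
git commit -q -m "Initial files"

# Modify a tracked file and create a new, untracked one.
echo 'Gustav Larsson' > test.txt
cp hello_world.py hello_universe.py

before=$(git status --short)
printf 'Before git add:\n%s\n' "$before"

# Stage the new file and look again.
git add hello_universe.py
after=$(git status --short)
printf 'After git add:\n%s\n' "$after"

# The same report, restricted to the current directory.
git status --short .
```

After the git add, hello_universe.py moves from the untracked category to the staged category, while test.txt stays modified but not staged.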

Unstaging, discarding changes, and removing files

Take a closer look at the git status output above. Git is providing you hints in case you want to undo some of your work.

For example, you can use git reset hello_universe.py to unstage the file. Doing so reverses git add hello_universe.py so you can create a commit only of changes to other files. This is good practice if you think the changes you made to hello_universe.py don’t logically go in the commit you are about to make.

Another useful git command is git checkout. This command will undo modifications to files. If you again look at the above git status output, you will see in the last line that test.txt was modified. To undo any changes to the file, type git checkout test.txt. This command will revert the file content to match the last commit you made in your repository’s history.

Finally, if you would like to remove a file from your repository, use git rm test.txt. This command deletes the file from your directory and stages the deletion for your next commit, combining the result of doing rm test.txt and git add test.txt.
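
The three undo operations in this section can be tried together in a scratch repository, where nothing you care about can be lost. This sketch assumes a temporary directory and a placeholder identity. It removes hello_world.py rather than test.txt only so that each command acts on a different file.

```shell
# Scratch repository in a temporary directory (not your course repository).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"

echo 'Firstname Lastname' > test.txt
echo 'print("Hello, World!")' > hello_world.py
git add test.txt hello_world.py
git commit -q -m "Initial files"

# Set up some work to undo.
echo 'Gustav Larsson' > test.txt              # edit a tracked file
cp hello_world.py hello_universe.py
git add hello_universe.py                     # stage a new file

git reset -q hello_universe.py    # unstage: the file stays on disk, untracked again
git checkout -- test.txt          # discard the edit: content matches the last commit
git rm -q hello_world.py          # delete the file and stage the deletion

final_status=$(git status --short)
printf '%s\n' "$final_status"
cat test.txt
```

In the final report, “D ” marks the staged deletion of hello_world.py and “??” shows that hello_universe.py is untracked again.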

Looking at the commits log

Once you have made multiple commits, you can see these commits, their dates, commit messages, author, and an SHA-1 hash (a value used by git to uniquely identify the commit) by typing git log. This command will open a scrollable interface (using the up/down arrow keys) that you can get out of by hitting q.
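
When the full log is more than you need, a shorter view can be handy. The sketch below assumes a scratch repository and a placeholder identity. It uses git log --oneline to print one line per commit and git --no-pager to print directly to the terminal instead of opening the scrollable interface.

```shell
# Scratch repository in a temporary directory (not your course repository).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # placeholder identity
git config user.email "student@example.com"

echo 'Firstname Lastname' > test.txt
git add test.txt
git commit -q -m "Added test.txt"

echo 'Gustav Larsson' > test.txt
git add -u
git commit -q -m "Added my name as Author in test.txt"

# One line per commit, newest first: abbreviated hash and message.
git --no-pager log --oneline

# The full entry (hash, author, date, message) for the most recent commit only.
git --no-pager log -1

latest_message=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "Latest message: $latest_message"
echo "Number of commits: $commit_count"
```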

Exercises

  1. You have already changed the test.txt file in your directory. Verify this by using the command git status. You should see it under Changes not staged for commit.

  2. Use git add and git commit to create a commit that includes only the test.txt file. A good commit message would be “Added my name as Author in test.txt”.

  3. Upload your work to the server using git push.

  4. Verify that this file was sent by again using the command git status. You should see that the file test.txt is no longer listed.

  5. If you have not already done so, use cp to make a copy of hello_world.py named hello_universe.py.

  6. If you run git status, hello_universe.py should show up under Untracked files. Add it to the repository using git add.

  7. Run git status again. Is hello_universe.py in a different category of files now?

  8. Although we have added this file, we have not yet created a commit. Create a commit and push it to the server.

  9. Run git status a final time to verify that hello_universe.py was committed (if so, you should not see it in any category of files)

  10. Run git push to upload your changes to the server.

We strongly recommend that you add, commit, and push changed files as often as possible, especially if you have finished some work and are about to log off a computer. (We often refer to the process of adding/committing/pushing as “checking in” your code or “syncing” your repository.) This way the latest files are accessible from any other computer where your repository is set up.

chisubmit

You will be using a locally-developed system named chisubmit to submit your programming assignments. The set-up script that you ran earlier set you up to use chisubmit in addition to initializing your Git repository.

All chisubmit commands should be run from within your capp30121-aut-19-username directory.

chisubmit has commands for managing assignments. Here are descriptions and sample runs of some of the more useful commands. You can run these commands as you read through this section.

chisubmit student assignment list: lists upcoming programming assignments and their deadlines.

$ chisubmit student assignment list

pa0  2019-10-07 16:00:00-05:00  PA #0
pa1  2019-10-12 16:00:00-05:00  PA #1

chisubmit student assignment show-deadline <assignment name>: lists deadline information for the specified programming assignment.

$ chisubmit student assignment show-deadline pa0
PA #0

      Now: 2019-09-25 15:04:18-05:00
 Deadline: 2019-10-07 16:00:00-05:00

The deadline has not yet passed
You have 12 days, 0 hours, 55 minutes, 42 seconds left

chisubmit student assignment register <assignment name>: registers a student for a specific assignment. You will do this step once per assignment.

$ chisubmit student assignment register pa0
Your registration for pa0 (Programming Assignment 0) is complete.

chisubmit student assignment submit <assignment name>: submits the latest commit in your repository for the specified programming assignment.

$ chisubmit student assignment submit pa0

SUBMISSION FOR ASSIGNMENT pa0 (Programming Assignment 0)
--------------------------------------------------------

This is an INDIVIDUAL submission for Gustav Martin Larsson

The latest commit in your repository is the following:

     Commit: eeed8efa66a13c0b04c587acdda43fbe75c9b99b
       Date: 2019-09-25 14:48:16-05:00
    Message: Added log for testing purposes
     Author: Gustav Martin Larsson <larsson@cs.uchicago.edu>

PLEASE VERIFY THIS IS THE EXACT COMMIT YOU WANT TO SUBMIT

You currently have 3 extensions

You are going to use 0 extensions on this submission.

You will have 0 extensions left after this submission.

Are you sure you want to continue? (y/n):  y

Your submission has been completed.

chisubmit has many other commands, including commands for canceling registrations, canceling submissions, etc. You can find detailed instructions on these and other commands in the chisubmit documentation.

Merge conflicts

You need to have installed the CS Virtual Machine (VM) on your laptop to do this section of the lab. If you do not have a laptop or do not have one with you, you can just skip this section.

The beauty of Git specifically, and version control in general, is that you can share repositories with other people and you can work on the code separately. Merge conflicts arise when different copies of the repository get changed in incompatible ways. Unfortunately, this complication can arise even if you are the only one working on your repository! You just need to work on your code using multiple machines.

Let’s work through an example. Do the following steps using your CSIL machine:

  1. Change the line in hello_world.py from:

print("Hello, World!")

to:

print("Hello, Chicago!")

  2. Add and commit these changes.

  3. Do not push these changes at this time.

And then switch to your VM on your laptop and do the following:

  1. Run git pull inside your course directory to pick up the most recent copy from the server.

  2. Change the line in hello_world.py from:

print("Hello, World!")

to:

print("Hello, New York!")

(Notice that the change from “World” to “Chicago” did not make it to your VM because you did not do a push!)

  3. Add, commit, and push these changes to the server.

And then, finally, switch back to your CSIL machine and run git pull to pick up the most recent version from the server. This command will fail with an error like the following:

$ git pull
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 4 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (4/4), done.
From mit.cs.uchicago.edu:capp30121-aut-19/larsson
   62c72de..b70ae2a  master     -> origin/master
Auto-merging lab1/hello_world.py
CONFLICT (content): Merge conflict in lab1/hello_world.py
Automatic merge failed; fix conflicts and then commit the result.

Git will not be able to reconcile your local changes with the version on the server automatically, so it will update hello_world.py to reflect the conflicts. The file hello_world.py will look like this:

<<<<<<< HEAD
print("Hello, Chicago!")
=======
print("Hello, New York!")
>>>>>>> b70ae2a739c7775189b284be04ae568568ac3c62

The lines between <<<<<<< HEAD and ======= contain the code as it exists on the CSIL machine, whereas the lines between ======= and >>>>>>> b70ae2a739c7775189b284be04ae568568ac3c62 contain the code from the server. In general, a failed merge can yield a mix of merged and unmerged blocks.

You should resolve these conflicts by choosing among the offending lines and removing the conflict markers (<<<<<<< HEAD, =======, and >>>>>>> b70ae2a739c7775189b284be04ae568568ac3c62). Use Sublime Text to edit the file. Once you are done, add, commit, and push the updated files. You must resolve the conflict before you will be able to push or pull again.
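
If you do not have your VM with you, the whole scenario can be reproduced on one machine. The following is a sketch under stated assumptions: a local bare repository stands in for the server, two clones named csil and vm stand in for the two machines, make_clone is a hypothetical helper defined only for this example, and the identity is a placeholder. It also sets pull.rebase to false so that git pull merges, which is the behavior this section describes; some newer Git versions otherwise stop and ask how to reconcile the two histories.

```shell
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"      # stand-in for the Git server
git --git-dir="$sandbox/server.git" symbolic-ref HEAD refs/heads/master

make_clone() {   # hypothetical helper: clone the stand-in server into $sandbox/<name>
    git clone -q "$sandbox/server.git" "$sandbox/$1" 2>/dev/null
    cd "$sandbox/$1"
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Example Student"    # placeholder identity
    git config user.email "student@example.com"
    git config pull.rebase false              # merge on pull, as in this lab
}

make_clone csil
echo 'print("Hello, World!")' > hello_world.py
git add hello_world.py
git commit -q -m "Initial hello_world.py"
git push -q -u origin master
make_clone vm

# "CSIL machine": commit a change but do not push it.
cd "$sandbox/csil"
echo 'print("Hello, Chicago!")' > hello_world.py
git add -u
git commit -q -m "Greet Chicago"

# "VM": commit a different change to the same line and push it.
cd "$sandbox/vm"
echo 'print("Hello, New York!")' > hello_world.py
git add -u
git commit -q -m "Greet New York"
git push -q

# Back on "CSIL": the pull stops with a merge conflict.
cd "$sandbox/csil"
git pull -q || echo "git pull reported a conflict, as expected"
cat hello_world.py                            # shows the conflict markers

# Resolve by keeping one version, then add, commit, and push.
echo 'print("Hello, Chicago!")' > hello_world.py
git add hello_world.py
git commit -q -m "Merge changes from VM, keeping the Chicago greeting"
git push -q

remaining_markers=$(grep -c '^<<<<<<<' hello_world.py)
echo "Conflict markers left in the file: $remaining_markers"
```

Here the conflict is resolved by overwriting the file with the chosen line. In practice you would open the file in your editor and delete the markers and the lines you do not want.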

Cleaning Up

Use git status to check that you have left the local copies of your repository on both your CSIL machine and your VM in a clean state. In particular, make sure you have resolved any merge conflicts that were caused by the previous exercise and added (or removed) any extra files that you created during the Linux workshop. And then, if necessary, add/commit/push your files to the server.

Summary

You should always run git pull and git pull upstream master before you start working. When you are done working, you should always add, commit, and push your code to the server. Keeping the server and your local copies of your repositories in sync will save you a lot of grief over the course of the term.