Homework #3¶

Due: Thursday, October 21st at 11:59pm

This homework is intended to serve as an introduction to mutual exclusion and parallel performance.

Getting started¶

For each assignment, a Git repository will be created for you on GitHub. However, before that repository can be created for you, you need to have a GitHub account. If you do not yet have one, you can get an account here: https://github.com/join.

To actually get your private repository, you will need this invitation URL:

HW3 (Posted on Ed)

When you click on an invitation URL, you will have to complete the following steps:

You will need to select your CNetID from a list. This will allow us to know what student is associated with each GitHub account. This step is only done for the very first invitation you accept.
You must click “Accept this assignment” or your repository will not actually be created.
After accepting the assignment, Github will take a few minutes to create your repository. You should receive an email from Github when your repository is ready. Normally, it’s ready within seconds and you can just refresh the page.
You now need to clone your repository (i.e., download it to your machine).
- Make sure you’ve set up SSH access on your GitHub account.
- For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per project). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs52060-aut21/hw3-GITHUB-USERNAME.git.
- If you do not know how to use git clone to clone your repository then follow this guide that Github provides: Cloning a Repository

If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.

Short Answer Questions¶

Each assignment may include a section dedicated to answering a few short answer questions. Please turn in a readable text file named saqs (e.g., saqs.pdf). Store this file inside the hw3/saqs directory. You will place your answers to the following questions in this document. The file is not required to be a PDF but any text document that is easy to open is fine.

SAQ 1¶

Running your application on two processors yields a speedup of $S_2$. Use Amdahl’s Law to derive a formula for $S_N$, the speedup on n processors, in terms of $n$ and $S_2$.

SAQ 2¶

The L-exclusion problem is a variant of the starvation-free mutual exclusion problem. We make two changes: as many as L threads may be in the critical section at the same time, and fewer than L threads might fail (by halting) in the critical section.

An implementation must satisfy the following conditions:

L-Exclusion: At any time, at most L threads are in the critical section.
L-Starvation-Freedom: As long as fewer than L threads are in the critical section, then some thread that wants to enter the critical section will eventually succeed (even if some threads in the critical section have halted).

Modify the n-process Bakery mutual exclusion algorithm to turn it into an L-exclusion algorithm. Do not consider atomic operations in your answer. You can provide a pseudo-code solution or written solution.

SAQ 3¶

Lets say two goroutines are accessing a shared boolean slice intSlice = make([]bool,10)). Also assume, these goroutines are running on two different processors and have this slice inside their individual L1 caches. The first goroutine is only writing to the first 5 indices in the slice and the second goroutine is only reading from the last 5 indices. They will never cross their boundaries.

Question: Does the goroutines operating this way affect the performance caching and retrieving cached data? Explain.

Programming Questions¶

For this assignment you are only allowed to use the following Go concurrent constructs:

go statement
sync.Mutex and its associated methods.
sync/atomic package. You may use any of the atomic operations.
sync.WaitGroup and its associated methods.

As the course progresses, you will be able to use other constructs provided by Go such as channels, conditional variables, etc. I want you to learn the basics and build up from there. This way you’ll have a good picture of how parallel programs and their constructs are built from each other. If you are unsure about whether you are able to use a language feature then please ask on Ed before using it.

Problem 1: Spin Locks¶

In lecture, we discussed a few implementations of spin locks. Your task for this problem is to implement these locks in Go using the atomic package. You will place your implementations in a package called ppsync (i.e., the Parallel Programming Synchronization package).

You will implement the following locking algorithms:

TAS Lock (TASLock): Already implemented for you. This implementation is located in hw3/ppsync/taslock.go.
TTAS Lock (TTASLock): Create and place this implementation inside the hw3/ppsync/ttaslock.go.
Anderson Lock (ALock): You need to initialize the lock with the number of threads. How you do so is up to you. Create and place this implementation inside hw3/ppsync/alock.go.
Exponential Backoff Lock (EBLock): The MIN_DELAY and MAX_DELAY values will be specified on the command line in milliseconds. Double the delay for each failed attempt at acquiring the lock. You may also use the time and math/rand packages. Create and place this implementation inside hw3/ppsync/eblock.go.

Each locking algorithm must implement the sync.Locker interface and include an allocation function that returns a pointer instance of the lock. The actual return type must be a sync.Locker. For example,

func NewTTASLock() sync.Locker{
       ...
}

Use the taslock.go as a blueprint for the rest of the locks.

Problem 2: Counter Experiment¶

You will implement the same experiment discussed in lecture:

Ask the user to specify three command-line arguments command, incrementAmount, and threads. The command states which lock to use when the goroutines access a shared counter. incrememntAmount is an integer value that represents the number of times a goroutine will increment a shared counter variable. Finally, threads specifies the number of goroutines to spawn.
Here are the values the user can specify for the command argument:
- tas : Runs the program using a TASLock lock.
- ttas : Runs the program using a TTASLock lock.
- eb MIN_DELAY MAX_DELAY: Runs the program using a EBLock lock. MIN_DELAY and MAX_DELAY are both numeric values that represent time in milliseconds. You may choose the numeric type for these values based on your implementation. For the test cases, we will always provide them as integers; however, you can convert them to any numeric type for your implementation.
- a: Runs the program using an ALock lock.
- sync: Runs the program using a sync.Mutex as the lock.
All gorountines must be spawned all at once (i.e., via our normnal mechanism of using a for-loop) and execute the same function. In this function, each goroutine will increment a shared counter incrementAmount number of times using a for-loop. For each iteration of the for-loop, a goroutine will do the following:
```
1. Acquire the lock.
2. Increment the shared counter by one.
3. Release the lock.
```
The only output of the program should be the final counter value. This value is printed by the main goroutine after waiting for all the goroutines to exit their functions. Do not print anything else out to the command line.

You can assume that we will provide you with the correct number of arguments, in the correct order, with their specified types. You do not need to do error handling or print out a usage statement. However, here the usage statement for this program:

Usage: experiment command incrementAmount threads
command:
      tas - Runs the program using a TASLock lock.
      ttas - Runs the program using a TTASLock lock.
      eb MIN_DELAY MAX_DELAY - Runs the program using a EBLock lock. MIN_DELAY and MAX_DELAY are both numeric values that represent time in milliseconds.
      a - Runs the program using an ALock lock.
      sync - Runs the syc.Mutex lock.
incrementAmount: the number of times the goroutines should increment the counter
threads: the number of threads to spawn

Sample Runs ($: is just mimicking the command line)

$: go run hw3/experiment ttas 100 4
400
$: go run hw3/experiment a 100 2
200
$: go run hw3/experiment eb 32 1024 100 2
200

The tests for this problem is inside experiment_test.go. Please go back to homework #1 if you are having trouble running tests.

Problem 3: COVID Wrangling (Sequential Version)¶

Inside the hw3, do the following:

Copy over your covid directory with your solution from homework 2 into the hw3 directory. You will need a working version for this assignment. Please talk with Professor Samuels or a TA if you still need help getting your covid solution working.
Grab the new covid data files: hw3-data.zip. Place the data directory in hw3/covid directory. The data directory has been reduced in size to make it more manageable when we test the performance of the covid program. These are older and smaller data files and may not match the output from homework #2; however, we care about performance and not the counts. Do not add these to your repository.
Update this file with a sequential version, where it does the same process of the parallel version. The sequential version reads through all 500 files and calculates the total number of cases, tests, and deaths for a zip code in a given month. It does this process without spawning any threads to help with the calculation. Make sure not to use any concurrent mechanisms in this implementation.
The program runs the sequential version when the user enters in 0 for the threads command line argument. For example:
```
$ go run hw3/covid.go 0 60603 5 2020
2,48,0
```

Make sure you run the sequential version a few times to make sure it matches the output of your parallel version:

$: go run hw3/covid 0 60603 5 2020
2,48,0
$: go run hw3/covid 2 60603 5 2020
2,48,0
$: go run hw3/covid 4 60603 5 2020
2,48,0

Problem 4: Speedup Graph (Covid Program)¶

In this problem, you will create a speedup graph that compares your sequential covid program to the parallel version. You can use the the unix command time to time the execution of your program. The time command will print out the real-time (i.e., wall-time), user-time, and system time. You will want to use the real time measurement:

$: time ls > output

real 0m0.005s
user 0m0.002s
sys  0m0.003s

Remember, the formula for speedup

\[Speedup = \frac{\text{wall-clock time of serial execution}}{\text{wall-clock time of parallel execution}}\]

The steps to produce your speedup graph is as follows:

Make sure you close down all non-essential applications when performing your timing tests! You want to have make the cores more available to your program.
Run the sequential version of your program 5 times in a row and record the average real time. The zipcode = 60603, month=5, and year=2020.

Note

Why do we have to run the program 5 times in a row?

A process may create a cache on its first execution; therefore, running faster subsequently on additional executions.
Other processes may cause the command to be starved of CPU or I/O time.
There might be random interrupt that causes the timing to be an outlier.

Run the parallel version of your program 5 times in a row and record the average time for when threads==2, threads==4, threads==8, and threads==12. The zipcode == 60603, month=5 and year=2020 for all parallel timings. This means you will have 20 parallel timings but you will take the average of the timings for each thread subset. For example:
```
$: time go run hw3/covid 2 60603 5 2020
real    0m2.503s
user    0m5.227s
sys     0m0.514s
$: time go run hw3/covid 2 60603 5 2020
real    0m2.453s
user    0m5.227s
sys     0m0.514s
$: time go run hw3/covid 2 60603 5 2020
real    0m2.153s
user    0m5.227s
sys     0m0.514s
```
The above snippet ran the parallel version three times for when threads==2; however, you will run it 5 times in a row and take the average time from the real output.
Using those timings, produce a speedup graph. The y-axis will list the speedup measurement and the x-axis will list the number of worker threads. Similar to the graph shown below. Make make sure to title the graph, and label each axis. Make sure to adjust your y-axis range so that we can accurately see the values. That is, if most of your values fall between a range of [0,1] then don’t make your speedup range [0,14]. Here’s an example graph that I ran on my older Intel machine:

You do not need to have the same timings/speedups in your graph. However, you do need to have speedups for all threads. If you do not then you need to go back to your parallel version and make changes to speed up the implementation. One thing you can do is make sure you are limiting the amount of time you are spending in critical sections. Try to do as much work using local variables as possible. We will verify your graph by running your program on the CS servers.
Name your graph speedup and place it in your hw3/covid directory.

Note

I highly recommend you automate this process because in future assignments you will need to produce a script that will automatically generate your speedup graph. However, this is not required for this assignment. I used Python to generate my graph but you could also write a bash script or use a different language to automate this process.

Grading¶

Programming assignments will be graded according to a general rubric. Specifically, we will assign points for completeness, correctness, design, and style. (For more details on the categories, see our Assignment Rubric page.)

The exact weights for each category will vary from one assignment to another. For this assignment, the weights will be:

Completeness (experiment.go): 40%
Correctness: 10%
Design & Style: 10%
Short Answer Questions (5% each): 15%
Problem 3 and Problem 4: 25%

Obtaining your test score¶

The completeness part of your score will be determined using automated tests. To get your score for the automated tests, simply run the following from the Terminal. (Remember to leave out the $ prompt when you type the command.)

$ cd grader
$ go run hw3/grader hw3

This should print total score after running all test cases inside the individual problems. This printout will not show the tests you failed. You must run the problem’s individual test file to see the failures.

Design, Style and Cleaning up¶

Before you submit your final solution, you should, remove

any Printf statements that you added for debugging purposes and
all in-line comments of the form: “YOUR CODE HERE” and “TODO …”
Think about your function decomposition. No code duplication. This homework assignment is relatively small so this shouldn’t be a major problem but could be in certain problems.

Go does not have a strict style guide. However, use your best judgment from prior programming experience about style. Did you use good variable names? Do you have any lines that are too long, etc.

As you clean up, you should periodically save your file and run your code through the tests to make sure that you have not broken it in the process.

Submission¶

Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) in the “Homework #3” assignment page via two ways,

Uploading from Github directly (recommended way): You can link your Github account to your Gradescope account and upload the correct repository based on the homework assignment. When you submit your homework, a pop window will appear. Click on “Github” and then “Connect to Github” to connect your Github account to Gradescope. Once you connect (you will only need to do this once), then you can select the repsotiory you wish to upload and the branch (which should always be “main” or “master”) for this course.
Uploading via a Zip file: You can also upload a zip file of the homework directory. Please make sure you upload the entire directory and keep the initial structure the same as the starter code; otherwise, you run the risk of not passing the automated tests.

Note

For either option, you must upload the entire directory structure; otherwise, your automated test grade will not run correctly and you will be penalized if we have to manually run the tests. Going with the first option will do this automatically for you. You can always add additional directories and files (and even files/directories inside the stater directories) but the default directory/file structure must not change.

Depending on the assignment, once you submit your work, an “autograder” will run. This autograder should produce the same test results as when you run the code yourself; if it doesn’t, please let us know so we can look into it. A few other notes:

You are allowed to make as many submissions as you want before the deadline.
Please make sure you have read and understood our Late Submission Policy.
Your completeness score is determined solely based on the automated tests, but we may adjust your score if you attempt to pass tests by rote (e.g., by writing code that hard-codes the expected output for each possible test input).
Gradescope will report the test score it obtains when running your code. If there is a discrepancy between the score you get when running our grader script, and the score reported by Gradescope, please let us know so we can take a look at it.