Polling places¶

Due: Friday, November 12, 4:30pm CST

You may work alone or in a pair on this assignment.

The goal of this assignment is to give you practice designing and implementing classes and methods, and working with class instances.

Getting started¶

If you are going to work individually, follow the invitation URL provided on Ed Discussion and, when prompted for a team name, simply enter your CNetID. Because of a quirk in how GitHub manages team assignments, it may complain that your CNetID is already “taken” if you did PA #2 individually (this does not mean anyone is usurping your CNetID; GitHub simply requires that you sign up with a different name, because you already used your CNetID in PA 2). If this happens, simply add a -4 to your CNetID. For example, CNetID jrandom would become jrandom-4. Make sure to use this modified CNetID in place of your CNetID in all the instructions below.

If you are going to work in a pair, the process involves a couple of extra steps. The process is described in our Coursework Basics page. Please head over to that page, and return here once you’ve created your team repository. We have made a small, but important, change to the way teams are named, so please have a look at the Coursework Basics page even if you already worked in a team for PA #2.

Next, you will need to initialize your repository. If you are working in a pair only one of you should complete these steps. If you repeat these steps more than once, you may break your repository.

First, create a TEAM variable in the terminal that contains either your CNetID (if you’re working individually) or your team name (if you’re working in a pair):

TEAM=replace_me_with_your_cnetid_or_team_name

(remember you can double-check whether the variable is properly set by running echo $TEAM)

Finally, run these commands (if you don’t recall what some of these commands do, they are explained at the end of the Git Tutorial):

cd ~/capp30121
mkdir pa4-$TEAM
cd pa4-$TEAM
git init
git remote add origin git@github.com:uchicago-CAPP30121-aut-2021/pa4-$TEAM.git
git remote add upstream git@github.com:uchicago-CAPP30121-aut-2021/pa4-initial-code.git
git pull upstream main
git branch -M main
git push -u origin main

You will find the files you need for the programming assignment directly in the root of your repository, including a README.txt file that explains what each file is. Make sure you read that file.

If you are working in a pair, the person who did not initialize the repository should setup the TEAM variable (as shown above) and then clone the repository like this:

cd ~/capp30121
git clone git@github.com:uchicago-CAPP30121-aut-2021/pa4-$TEAM.git
cd pa4-$TEAM
git remote add upstream git@github.com:uchicago-CAPP30121-aut-2021/pa4-initial-code.git

If you are working in a pair, you should also take a moment to read the working_pair section of the Coursework Basics page.

Introduction¶

Every two years, many people around the United States wait in long lines to vote. Some of these voters get impatient and leave without voting, which is bad for democracy.

In this assignment, you will write code to simulate the flow of voters through polling places and analyze the interplay between different thresholds for voter impatience and the number of voting booths assigned to a precinct.

We will view this problem as an instance of the more general problem of simulating an $M$/$M$/$N$ queue, where:

the first $M$ indicates that arrivals (voters arriving at the polls) obey a Markov process;
the second $M$ indicates that the service times obey a Markov process (the amount of time a voter takes to complete a ballot); and finally
the $N$ indicates that there are $N$ servers (the number of voting booths).

We will tell you how to structure your implementation at a high level, but you will be responsible for the design details.

Representing Voters¶

In this assignment you will write a Voter class to model voters. This class will include:

the time the voter arrives at the polls,
the voting duration, that is, the amount of time the voter takes to vote,
whether the voter is of the impatient sort (that is, will they leave if they have to wait too long),
the time the voter is assigned to a voting booth (aka, the start time), assuming they vote,
the time the voter leaves the voting booth (aka, the departure time) assuming they vote, and
whether or not they voted.

For simplicity, we will model time as a floating point value, representing the number of minutes since the polls opened. So, if we say a voter arrived at the polls at time 60.5, that means they arrived an hour and thirty seconds after the polls opened.

We will know the voter’s arrival time and voting duration when we construct the Voter instance representing a voter. Both the arrival time and the voting duration for the voter should be initialized to their proper values when the voter instance is constructed.

The start time for a voter, which is not available at construction time, should be initialized to None when the voter instance is constructed. This value should be reset to the correct value when the voter is assigned to a voting booth and starts to vote.

Similarly, a voter’s departure time cannot be determined when the voter is constructed. It too should be initialized to None when the voter instance is constructed and should be reset to the appropriate time when the voter is assigned to a voting booth.

The voter’s has voted flag should be initialized to False and set to True only when e determined whether or not the voter actually votes.

Your implementation must include at a minimum a constructor and public attributes named:

arrival_time,
voting_duration,
is_impatient,
start_time,
departure_time, and
has_voted

Our utility function for printing voters, described below, relies on the existence of these attributes.

In addition to a constructor, this class could have a method to set the voter’s start and departure times, and their has voted flag, all of which can be set at the same time. Another reasonable design is to separate the work of handling these updates into multiple methods. Whether you use one method for these updates or more than one is up to you.

Precincts¶

Cities are broken up into voting precincts. You will complete the definition of a Precinct class that represents a single precinct, and which will be responsible for simulating the behaviour of voters in that precinct.

A precinct is open for some number of hours and has some maximum number of voters who can vote in the precinct (you can think of this as the number of voters who are registered to vote in that precinct). We will assume that all registered voters will go to the polls on election day, but not all arrive in time to vote. Also, some of the voters will arrive on time, but are impatient and will leave if they have to wait too long.

In addition to these values, a precinct has three values–the arrival rate, the voting duration rate, and the probability that a voter will be impatient–that are used to generate voters. We discuss this process below.

Precinct Configuration¶

The configuration for a precinct is specified using a dictionary containing this information:

The name of the precinct (name)
Number of hours the polls are open (hours_open)
Number of voters assigned to the precinct (num_voters)
The rate at which voters arrive (arrival_rate)
The rate at which voters complete their voting (voting_duration_rate).
The probability a voter is impatient (impatience_prob)

The significance of the rates and the probability will be explained later.

For example:

{
 "name": "Downtown",
 "hours_open": 1,
 "num_voters": 5,
 "voting_duration_rate": 0.1,
 "arrival_rate": 0.11,
 "impatience_prob": 0.10
}

Information about a precinct is stored in a JSON file that contains a dictionary with two items: the key "precinct" maps to a precinct configuration as specified above and the key "seed" maps to a random seed. Similar to PA1, the random seed will be used to ensure that a simulation involving randomness produces the same result every time it runs (which will allow us to test whether your code produces the expected results). We recommend reviewing PA1 (and how it used random seeds) if you’re unsure of how random seeds are used in the context of running a simulation.

We have provided a function, load_precinct, in util.py that takes the name of a JSON precinct file and returns two values: a precinct configuration (as specified above) and the random seed.

Our testing code, and our code for running simulate.py from the command line already call load_precinct, so you should not make any calls to load_precinct in the code you’ll add to simulate.py. However, it can be a useful function when testing your code. In particular, you should take a moment to familiarize yourself with the data stored in these JSON files, and how to access each value once you’ve loaded the file with load_precinct. We suggest you start by looking at data/precinct-0.json, and loading that file from IPython:

In [1]: %load_ext autoreload

In [2]: %autoreload 2

In [3]: import util

In [4]: precinct, seed = util.load_precinct("data/precinct-0.json")

In [5]: precinct
Out[5]:
{'name': 'Downtown',
  'hours_open': 1,
  'num_voters': 5,
  'voting_duration_rate': 0.1,
  'arrival_rate': 0.11,
  'impatience_prob": 0.1}

In [6]: seed
Out[6]: 1468604453

In [7]: precinct["num_voters"]
Out[7]: 5

In [8]: precinct["arrival_rate"]
Out[8]: 0.11

In [9]: precinct["impatience_prob"]
Out[9]: 0.1

`Precinct` class¶

Notice how having a Precinct class allows us to have multiple instances of the Precinct class in our program, each with its own set of parameters. This design will make it easier for us to simulate how voters behave given different settings for the parameters.

We have provided a skeleton for the Precinct class that includes the constructor and a simulate method to simulate a single voting day for that precinct. While you are allowed to add additional methods to this class, you cannot modify the parameter list of the constructor or the simulate method.

Please note that we will consider each Precinct to be independent from each other: voters cannot go from one precinct to another, and there is no information shared between the precincts.

Generating Voters¶

One of the main responsibilities of the Precinct class is to generate the voters. When we generate a voter, we set their arrival time, so let’s take a closer look at how those arrival times are generated.

The arrival time of our voters will follow a Poisson process. From Wikipedia: a Poisson process, named after the French mathematician Siméon-Denis Poisson (1781-1840), is a stochastic process in which the time to the next event is independent of past events. We can use the following fact to generate a voter sample: if the number of arrivals in the time interval $[0, t]$ follows the Poisson distribution with mean $t \cdot \lambda$, then the gap between arrivals times follows the exponential distribution with rate parameter $\lambda$.

This means that to generate a new voter, we need to know the arrival time of the previously generated voter. Our voter simulation code should keep track of a “current time” representing when the most recent voter arrived. We will keep track of time in minutes starting at time zero. To generate a voter, you will need to generate an arrival time, a voting duration, and a boolean flag to indicate whether they are impatient.

Arrival time

To generate the arrival time of the next voter, you will generate an inter-arrival time ($\textit{gap}$) from the exponential distribution using the specified arrival rate as $\lambda$ and add it to the current time ($t$). You’ll then move time forward by updating the time from $t$ to $t + \textit{gap}$. Notice how $\textit{gap}$ represents the time that elapses between the arrival of the previous voter and the arrival of the next voter.

For example, let’s say we want to generate a voter in a precinct, and the random number generator returns 5.7 for that voter’s gap. Since this is the first voter we generate, their arrival time will be 5.7, and we advance the precinct’s current time from 0 to 5.7. We then generate another voter, and the random number generator returns 2.4 for the voter’s gap. This means that this second voter arrives 2.4 minutes after the previous voter, so the second voter’s arrival time will be 8.1 (5.7 + 2.4) and we advance the precinct’s current time to 8.1.

Voting Duration

When we generate a voter, we also need to assign a voting duration to that voter. The voting duration also follows a Poisson process so you will draw the voter’s voting duration from the exponential distribution using the specified voting duration rate as $\lambda$.

Is the voter impatient?

Finally, we need to determine whether the voter is impatient. We will do this by flipping a weighted coin, with the weight determined by the precinct’s impatience probability.

Task 0: Implement voter generation¶

Implement the Voter class. And, in the provided Precinct class, fill in the constructor, and write a private method for generating a list of voters according to the configuration of the precinct.

In other words, instead of carrying out a full simulation of voters moving through the polling place, you will just generate instances of the Voter class for the voters who will be able to vote in the precinct, and return a list of these instances.

Doing this partial implementation of a voting simulation will ensure that you’ve correctly implemented the voter generation logic we described earlier. The code for generating the voters in your Precinct should be fairly simple (our implementation is only 15 lines long, not including comments).

When writing your code, it is important that you take into account the following notes.

Repeatability

For testing purposes, it is important that you draw the gap, determine the voter’s voting duration, and flip the impatience coin in the same order as our code. To reduce the likelihood that the testing process fails simply because the values were drawn in the wrong order, we have provided a function, gen_voter_parameters, in util.py that takes the precinct’s arrival rate, voting duration rate, and impatience probability and returns a tuple with the gap as a float, the voting duration as a float, and the impatience flag as a boolean that indicates whether the voter is impatient (in that order). We recommend familiarizing yourself with this function to understand how arrival gaps, voting duration times, and impatience flags are determined.

Generating a list of voters

Your voter list generation method should be a private method (that is, one with name that starts with __), because its main purpose will be as a helper function to the simulate method you will write later. We recommend that it take a seed for the random number as input. The number of voters generated will be determined by one of two stopping conditions: (1) you have generated the maximum number of the voters specified for the precinct, or (2) you generate a voter who arrives after closing time. In the second case, the late voter should be discarded and no further voters should be generated. We will refer to voters who arrive before the polls close as valid.

Testing¶

Because your voter list generation method is private, you cannot call it directly from outside the Precinct class, even for testing. However, since its purpose is as a helper function for the simulate method, you can do the following: in the simulate method, call your voter list generation method, and return the resulting list of voters. That way, you can test your voter list generation method by testing the simulate method.

So that you can test your implementation manually, we have provided two sets of voter generation parameters. The first set (see data/precinct-0.json) includes the following:

number of voters in the precinct: 5
number of booths in the precinct: 1
hours open: 1
arrival rate: 0.11 (corresponds to mean time between voters of approximately 9 minutes)
voting duration rate: 0.1 (corresponds to a mean voting duration of 10 minutes)
impatience probability: 0.1
seed for random number generator: 1468604453

and corresponds to a voter population in which all the voters arrive before the polls close and one voter, the first, is impatient.

The random seed

Remember that the precincts file includes a seed value (returned by load_precincts). This seed must be passed to simulate (using the seed parameter) and you must use the random.seed(seed) function to set the seed inside your method that generates the list of voters. The seed only needs to be set once for each list of voters that you generate.

Remember that PA1 includes a longer discussion of random seeds, and their significance in testing simulations.

We have also provided two functions that will be useful to you as you test this part your code. The first, a function named run_simulation in util_tests, takes the name of a precinct file and two optional parameters–the number of voting booths assigned to the precinct and the impatience threshold. This function loads the precinct data from the file, creates an instance of the Precinct class using that data, and then returns the result of calling simulate on the precinct.

The second function, print_voters in util.py takes a list of Voter instances and an optional file name. This function will either print a representation of the voters to screen or to a file, depending on whether a filename is specified (you likely won’t have to write to a file yourselves, but it is something that we’ve found useful on occasion).

You can use these functions to call your voter list generation method and then to print and check the voters that get returned.

In [10]: import util, util_tests

In [11]: voters = util_tests.run_simulation("data/precinct-0.json")

In [12]: util.print_voters(voters)
Arrival        Voting             Is             Start         Departure             Has
Time           Duration           Impatient      Time          Time                  Voted
------------------------------------------------------------------------------------------
 1.92           7.46                T             None          None                  F

20.17           5.13                F             None          None                  F

21.17           1.78                F             None          None                  F

23.48          10.08                F             None          None                  F

54.75          18.02                F             None          None                  F

Notice how the Start Time and the Departure Time are initially None because you are not yet calculating the time when the voters start voting (they may need to wait in line once they arrive at the precinct, and we describe this in more detail in the next task) nor are you handling impatient voters. Notice that all the voters start out not having voted (that is, Has Voted is False).

The second set of voter generation parameters, data/precinct-1.json, is the same except the arrival rate is set to 0.04, which corresponds to a mean gap of 25 minutes between arrivals:

In [6]: voters = util_tests.run_simulation("data/precinct-1.json")

In [7]: util.print_voters(voters)
Arrival        Voting             Is             Start         Departure             Has
Time           Duration           Impatient      Time          Time                  Voted
------------------------------------------------------------------------------------------
 5.27           7.46                T             None          None                  F

55.48           5.13                T             None          None                  F

58.22           1.78                F             None          None                  F

Notice how, in the above set, only three voters get to vote. The fourth voter’s projected arrival time turns out to be 64.57, which means they would arrive after the polls closed since this precinct is only open for one hour.

Once you’ve done some manual testing, you can run additional automated tests with py.test:

$ py.test -xvk voter

(Recall that we use $ to indicate the Linux command-line prompt. You should not include it when you run this command.)

As usual, remember that you can find more detailed instructions on how manual and automated testing work in the Testing Your Code page.

Task 1: Simulate an election day¶

In this task, you will be simulating an election for a single precinct. You will do this by completing the implementation of the simulate method in Precinct.

Simulating a Precinct¶

The simulation itself will be similar to the M/D/1 queue example from class. In the M/D/1 example, we have a single server, such as a toll booth, that serves a queue of clients, such as cars, that arrive with a specified distribution and have a fixed service time. In this assignment, your simulation will need to simulate multiple servers (voting booths) and variable service times (voting durations), but the structure of your simulation will be similar.

The biggest difference is that you will be using our VotingBooths class (described below) to model the “multiple servers”. This class will be responsible for determining whether a voting booth is available and, if not, when a booth will become available. This information will be necessary to determine when a given voter will start voting (since it depends on when they can access a booth), and to ensure that the number of booths in use will never be no more than the number of booths assigned to the precinct.

A voter who is impatient will ask how long they have to wait. If it is longer than a specified impatience threshold, they will leave without voting. We’ll start by looking at an example that uses an impatience threshold that is long enough that everyone, even the impatient voters, will stay and vote.

Example¶

Let’s walk through an example. The file data/precinct-2.json contains the following precinct:

{
 "name": "Little Rodentia",
 "hours_open": 1,
 "num_voters": 10,
 "voting_duration_rate": 0.1,
 "arrival_rate": 0.16666666666666666,
 "impatience_prob": 0.30
}

And the seed 1438018945.

The precinct has 10 voters who arrive at the precinct once every 6 minutes on average. Voters take 10 minutes on average to vote. And two of the voters (the fourth voter and the eighth voter) are impatient.

As you work on your implementation, it can be useful to print out the voters returned by your simulate method, even if you have not yet filled in the start and departure times of the voters:

In [15]: voters = util_tests.run_simulation("data/precinct-2.json")

In [16]: util.print_voters(voters)
Arrival        Voting             Is             Start         Departure             Has
Time           Duration           Impatient      Time          Time                  Voted
------------------------------------------------------------------------------------------
13.38           4.37                F             None          None                  F

15.58           0.12                F             None          None                  F

17.92           4.15                F             None          None                  F

18.98          11.31                T             None          None                  F

26.54           5.51                F             None          None                  F

29.31           0.71                F             None          None                  F

43.15           5.38                F             None          None                  F

44.97           4.09                T             None          None                  F

49.22           9.52                F             None          None                  F

49.90          18.05                F             None          None                  F

Let’s consider what happens when we simulate this precinct with two voting booths and a large value (say, 1000) for the impatience threshold.

Our first voter first voter arrives at 13.38, enters a voting booth immediately at time 13.38, and finishes voting at 17.75 (13.38 + 4.37). (All times in this example have been rounded to two digits for clarity; your code should not round times.)

The second voter also starts voting immediately, because a booth is open, and finishes voting at time 15.70 (15.58 + 0.12).

The third voter arrives at time 17.92 and has a voting duration of 4.15 minutes. Since the first voter departed at 17.75 and the second voter departed at 15.70, this voter can start voting immediately, and finishes voting at 22.07.

The fourth voter arrives at 18.98, has a voting duration of 11.31 minutes, and is impatient. Only one booth is occupied when the fourth voter arrives, so voter can start voting immediately, and finishes voting at 30.29. The voter will vote because their wait time is zero, which is less than the impatience threshold.

The fifth voter arrives at time 26.54, after the third voter has departed, and starts voting immediately. This voter departs at 32.05.

The sixth voter arrives at 29.31 and has a voting duration of 0.71 minutes. Both booths are occupied when the sixth voter arrives, so this voter must wait until a booth becomes free, which happens when the fourth voter finishes at time 30.29. Thus, unlike the previous voters, the sixth voter’s start time (30.29) will be different from their arrival time (29.31). Their finish time will be 31.00. Notice that this voter departs before the fifth voter.

And so on.

All ten voters arrive before the polls close and the two impatient voters get to vote quickly and so all ten get to vote. Please note that (as typical in the US) if a voter arrives before the polls close, they still get to vote even if they will not start voting until after the polls have closed. In this example, the polls close at time 60 (the precinct dictionary specifies the number of hours the polls are open, while all other times are measured in minutes).

Here’s the result of our simulation:

In [20]: voters = util_tests.run_simulation("data/precinct-2.json", 2, 1000)

In [21]: util.print_voters(voters)
Arrival        Voting             Is             Start         Departure             Has
Time           Duration           Impatient      Time          Time                  Voted
------------------------------------------------------------------------------------------
13.38           4.37                F             13.38         17.75                 T

15.58           0.12                F             15.58         15.70                 T

17.92           4.15                F             17.92         22.07                 T

18.98          11.31                T             18.98         30.29                 T

26.54           5.51                F             26.54         32.05                 T

29.31           0.71                F             30.29         31.00                 T

43.15           5.38                F             43.15         48.54                 T

44.97           4.09                T             44.97         49.06                 T

49.22           9.52                F             49.22         58.74                 T

49.90          18.05                F             49.90         67.94                 T

Notice that we provided values for the two optional parameters of util_tests.run_simulation: num_booths is two and impatience_threshold is 1000.

Now, let’s look at a precinct with one voting booth where everyone is very impatient. That is, the impatience probability is 1.0 and voters are not willing to wait more than three minutes. That is, if their wait time is (strictly) greater than three, they will leave without voting.

In [30]: voters = util_tests.run_simulation("data/precinct-6.json", 1, 3)

In [31]: util.print_voters(voters)
Arrival        Voting             Is             Start         Departure             Has
Time           Duration           Impatient      Time          Time                  Voted
------------------------------------------------------------------------------------------
13.38           4.37                T             13.38         17.75                 T

15.58           0.12                T             17.75         17.86                 T

17.92           4.15                T             17.92         22.07                 T

18.98          11.31                T             None          None                  F

26.54           5.51                T             26.54         32.05                 T

29.31           0.71                T             32.05         32.76                 T

43.15           5.38                T             43.15         48.54                 T

44.97           4.09                T             None          None                  F

49.22           9.52                T             49.22         58.74                 T

49.90          18.05                T             None          None                  F

Notice that everyone is impatient as expected. The second voter arrives while the first voter is still voting, so they have to wait. Their projected waiting time, that is, the time between when they arrive and when a booth will be available is 2.17. Since their projected wait time is less than three minutes, the voter waits and starts voting when the first voter finishes.

The third voter does not arrive at the polls until after the second voter has finished, so they vote right away. This voter will finish at time 22.07.

The fourth voter arrives while the third voter is still voting. Their projected wait time is more than three minutes (22.07 - 18.98 > 3) and so, they leaving the polling place without voting. Note that their departure time, which tells us when a voter left the voting booth, remains set at None.

The fifth voter arrives at time 26.54. Since the third voter finished at time 22.07 and the fourth voter left without voting, this voter can start voting immediately.

And so, on.

Managing the Voting Booths¶

We have supplied a class VotingBooths to model the voting booths. This class allows you to add and remove voters from booths and to track the state of the booths.

VotingBooths class has a constructor that takes the number of booths and initializes the data structures used to manage the booths.

Priority Queues

The underlying data structure used to represent the booths in the VotingBooths class is a priority queue. You do not need to understand priority queues to use the VotingBooths class. But in case you are curious: a minimum priority queue is a data structure that includes operations for (1) adding items along with a priority to the queue (where the priority is just a number), (2) removing the item with the lowest priority value from the queue, and (3) determining whether the queue is empty. Removing the elements with the lowest priority value may seem counter-intuitive but, in Computer Science, a lower value usually denotes a higher priority (think of it as a ranking: if you’re #1, your ranked above someone who is ranked #2).

The class also provides a collection of methods:

is_booth_available: which returns True if there is at least one unoccupied booth and False otherwise.
is_some_booth_occupied: which returns True if there is at least one occupied booth and False otherwise.
time_next_free: returns a float with the time that the next booth will be free next. This method should only be called when at least one booth is occupied.
enter_booth: takes an instance of the Voter class and adds the voter to a booth. This method should only be called when at least one booth is open and it also requires the start_time attribute of the Voter to be set.
exit_booth: removes the next voter to depart from their booth and returns the voter and their departure_time. This method should only be called when at least one booth is occupied.

When simulating an election day, you are responsible for ensuring that all booths are empty at the end of the day.

Your Precinct code may use as many of these methods as needed. Your code may not access the underlying data directly. In fact our implementation of VotingBooths uses a private attribute (created by adding two underscores at the start of the attribute’s name) for the data to prevent you accessing it directly.

Returning the voters¶

Your simulate function must return a list of all the Voter objects. The start_time, departure_time, and has_voted attributes should be updated for the voters who voted. Furthermore, for a given precinct, the list of voters should be sorted by the voters’ arrival time. Since your Precinct should generate voters in the order in which they arrive, producing a list of voters sorted by arrival time should not require any additional steps. If you do find that you need to sort the list explicitly, it is likely that your solution is headed in the wrong direction; come to office hours or ask for help on Ed if that is the case.

Testing¶

As we mentioned earlier, the function util_tests.run_simulation has two optional arguments: num_booths allows the caller to supply the number of voting booths to use in the simulation, while impatience_threshold allows the caller to specify the impatience threshold. Here is an example use:

In [40]: voters = util_tests.run_simulation("data/precinct-6.json", num_booths=2, impatience_threshold=1)

In [41]: util.print_voters(voters)
Arrival        Voting             Is             Start         Departure             Has
Time           Duration           Impatient      Time          Time                  Voted
------------------------------------------------------------------------------------------
13.38           4.37                T             13.38         17.75                 T

15.58           0.12                T             15.58         15.70                 T

17.92           4.15                T             17.92         22.07                 T

18.98          11.31                T             18.98         30.29                 T

26.54           5.51                T             26.54         32.05                 T

29.31           0.71                T             30.29         31.00                 T

43.15           5.38                T             43.15         48.54                 T

44.97           4.09                T             44.97         49.06                 T

49.22           9.52                T             49.22         58.74                 T

49.90          18.05                T             49.90         67.94                 T

We have provided CSV files with the expected results for different precincts with a variety of numbers of voting booths and thresholds. The filenames all have the form: data/precint-N_VB_T.csv, where precinct-N specifies the precinct files to use, VB is the number of voting booths, and T is the impatience threshold. So, for example, the file precinct-7_5_20.csv was generated using the precinct from data/precinct-7.json along with five voting booths, and an impatience threshold of 20.

We also have provided automated tests that will test your implementation of the simulate method. You can run these tests like this:

$ py.test -xv test_simulate_election_day.py

Finally, we have provided a main function in simulate.py that allows you to run simulations from the command-line. If you run simulate.py with just the name of a configuration file, it will print a summary of the simulation results using that precinct with one voting machine and an impatience threshold of 1000. For example:

$ python3 simulate.py data/precinct-0.json

PRECINCT 'Downtown'
- 5 voters voted.
- Polls closed at 60 and last voter departed at 72.77
- 0 voters left without voting

You can use the --print-voters options to, instead, print all the voters:

$ python3 simulate.py data/precinct-0.json --print-voters
Arrival        Voting             Is             Start         Departure          Has
Time           Duration           Impatient      Time          Time               Voted
---------------------------------------------------------------------------------------
 1.92           7.46                T              1.92          9.37              T

20.17           5.13                F             20.17         25.31              T

21.17           1.78                F             25.31         27.08              T

23.48          10.08                F             27.08         37.16              T

54.75          18.02                F             54.75         72.77              T

You can use the --num-booths flag to set the number of booths available to the precinct. (The default is 1) and you can use the --impatience-threshold flag to the impatience threshold (the default is 1000).

Here’s a sample run of the very impatient precinct with one booth and a threshold of three:

$ python3 simulate.py data/precinct-6.json --impatience-threshold 3
- 10 voters voted
- Polls closed at 60 and last *voter* departed at 58.74.
- 3 voters left without voting
  including the last person to arrive at the polls

If we keep the impatience threshold the same and add a booth, then none of the voters leave without voting:

$ python3 simulate.py data/precinct-6.json --num-booths 2 --impatience-threshold 3
Precinct Impatience Island
- 10 voters voted
- Polls closed at 60 and last voter departed at 67.94.
- 0 voters left without voting

Finding the median of a set of trial runs¶

In the final pair of tasks, you will need to run a series of trials, collect the results from the trials, and then report the median result. One easy way to find the median is to store the results in a list (let’s call it lst), sort the list using the built-in sorted function (sorted(lst)), and then return the middle element of the sorted list (sorted(lst)[len(lst) // 2]). Our test code for the last two tasks assumes you use this approach to compute the median.

Task 2: Finding the impatience threshold¶

At this point, we have the ability to simulate precincts under a variety of parameters. One interesting parameter to tweak is the threshold for how long impatient voters are willing to wait (i.e., the impatience threshold).

Let’s start by thinking about a specific set of voters for a given precinct and a fixed number of voting booths. How might we find a value for the impatience threshold that would be large enough to ensure that every voter would stay and vote. One way to do this task, is to simulate election day for the precinct using the same seed (to guarantee that we are simulating the same set voters) and the same bank of voting booths, but with different values for the impatience threshold. In our case, we’ll start with an impatience threshold of 1 and then keep incrementing it by 10 until we find a value for the threshold that is large enough to guarantee that even the impatient voters in the batch stay to vote.

Here’s an example of this process run manually using the impatient precinct ("data/precinct-6.json")

In [50] import simulate

In [51]: p, seed = util.load_precinct("data/precinct-6.json")

In [52]: voting_booths = simulate.VotingBooths(1)

In [53]: precinct = simulate.Precinct(p["name"],
    ...:                              p["hours_open"],
    ...:                              p["num_voters"],
    ...:                              p["arrival_rate"],
    ...:                              p["voting_duration_rate"],
    ...:                              p["impatience_prob"])
    ...:
    ...:

In [54]: voters = precinct.simulate(seed, voting_booths, 1)

In [55]: [v.has_voted for v in voters]
Out[55]: [True, False, True, False, True, False, True, False, True, False]

In [56]: voting_booths.is_some_booth_occupied()
Out[56]: False

In [57]: voters = precinct.simulate(seed, voting_booths, 11)

In [58]: [v.has_voted for v in voters]
Out[58]: [True, True, True, True, True, True, True, True, True, False]

In [59]: voting_booths.is_some_booth_occupied()
Out[59]: False

In [60]: voters = precinct.simulate(seed, voting_booths, 21)

In [61]: [v.has_voted for v in voters]
Out[61]: [True, True, True, True, True, True, True, True, True, True]

In [62]: voting_booths.is_some_booth_occupied()
Out[62]: False

Notice that we used the same precinct, seed, and bank of voting booths for all three tests. The only thing that varied was the impatience threshold. This makes sense because our goal is to find a threshold suitable for a specific set of voters and a specific set of voting booths.

Also, note that there is a tradeoff between computation time and accuracy. The threshold we identified (21) overshoots the true value. A value closer to 12.3 is sufficient for this specific set of voters.

If we ran another trial of this process using the same precinct and set of voting booths but with seed+1 as the seed, we’d end up with a threshold of 71.

To avoid having the quirks of a specific set of voters mislead us, we will run through this trial process for different sets of voters (that is, using different seeds) and return the median threshold identified across the different trials.

Your task is to write a function find_impatience_threshold that, given a seed, a Precinct, a number of booths, and a number of trials returns the median threshold from the specified number of trials. You may find it helpful to write a separate helper function that runs a single trial and returns the computed threshold for that seed (set of voters). What parameters do you need for the helper function? What should it return?

You will need to use a different seed for each trial. Start with the specified initial seed and add 1 for each new trial. (That is, the ith trial should use seed+i as the seed.)

Note that the number of booths to simulate is a parameter to the function. If you implemented your simulation method properly, you should be able to construct a single instance of the VotingBooths class and use it for all the simulations.

Testing

We have provided a function util_tests.run_threshold_test to help you test your work in ipython3. This function takes the name of a precinct file, the number of booths to use, and the number of trials to run. It loads the precinct into a Precinct object and calls find_impatience_threshold and returns the result.

In [60]: util_tests.run_threshold_test("data/precinct-6.json", 1, 100)
Out[60]: 41

In [61]: util_tests.run_threshold_test("data/precinct-6.json", 2, 100)
Out[61]: 11

You can find the expected results for many combinations of precincts, numbers of voting booths, and numbers of trials in the file data/impatience_threshold.csv.

The automated tests for this task are split into two groups: tests that will run quickly and test that will run slowly. To skip the slow tests, run:

$ py.test -xv test_impatience_threshold.py --skip-slow

To include in the slow tests run:

$ py.test -xv test_impatience_threshold.py

Our implementation took 31.4 seconds to run the full test suite for this task. And under 50 seconds to run all the tests for all the tasks.

Task 3: Finding a suitable number of booths¶

Ideally, we would like to have enough booths on hand to allow everyone, even the impatient voters, to vote in all situations. Planning for all situations, though, can be wasteful. Lots of booths would sit idle most of the time. Instead, we will determine a number of booths that will be sufficient for many, but not all, situations.

Your task is to implement the function find_voting_booths_needed, which takes a starting seed, a precinct, an impatience threshold, and a number of trials to run and returns this “good enough” number of booths. The specific value required is defined below.

As in the previous task, a single trial will consist of finding the number of booths needed to ensure that everyone votes given a specified threshold and a specific set of voters (determined by the seed). Start with 1 booth and increment by 1 each time until you find a number of booths that is sufficient to ensure that all the voters stay to vote.

Also similar to the previous task, you should start with the specified seed and increment it by one for each trial (i.e., the ith trial should use seed+i).

Your function should return the median number of booths computed across the trials (determined as noted above), which we have decided is good enough.

Testing

As with the previous function, we have provided a test function utils_test.run_find_vb_test to simplify the process of testing your code manually. It takes the name of a precinct file, the impatience threshold, and the number of trials to run as arguments. It loads the precinct file into a Precinct and then calls your find_voting_booths_needed function and returns the results:

In [70]: util_tests.run_find_vb_test("data/precinct-6.json", 5, 1)
Out[70]: 2

In [71]: util_tests.run_find_vb_test("data/precinct-6.json", 5, 10)
Out[71]: 3

In [72]: util_tests.run_find_vb_test("data/precinct-6.json", 5, 100)
Out[72]: 3

The file data/voting_booths_needed.csv contains the expected results for a variety of precinct, impatience thresholds, and numbers of trials.

You can run the automated tests for this function like this from the Linux command-line:

$ py.test -xv test_voting_booths_needed.py

Grading¶

The assignment will be graded according to the Rubric for Programming Assignments. Make sure to read that page carefully; the remainder of this section explains aspects of the grading that are specific to this assignment.

In particular, your completeness score will be determined solely on the basis of the automated tests, which provide a measure of how many of the tasks you have completed:

Grade	Percent tests passed
Exemplary	at least 95%
Satisfactory	at least 80%
Needs Improvement	at least 50%
Ungradable	less than 50%

For example, if your implementation passes 85% of the tests, you will earn an S (satisfactory) score for completeness.

The code quality score will be based on the general criteria in the Rubric for Programming Assignments but, in this programming assignment, we will also be paying special attention to the following:

Appropriate design for the Voter class Did you design one or more methods that are apporpriate for updating voters once they vote?
Appropriate design for the Precinct class Did you design an appropriate helper method for generating voters?
Appropriate use of the VotingBooths class Does your implementation leverage the VotingBooths class to manage the voters properly or did you recreate some of the functionality of this class in your own code?
Appropriate use of helper functions Did you design a good method for creating voters? Did you write well-designed helper methods for the parameter sweeps?

While these are the main things we care about in this assignment, please remember that it is not possible for us to give you an exhaustive list of every single thing that could affect your code quality score (and that thinking in those terms is generally counterproductive to learning how to program; see our How to Learn in this Class page for more details).

In general, avoiding all of the above issues will increase the likelihood of getting an E; if your code has a few (but not most) of the above issues, that will likely result in an S; if your code suffers from most of those issues, it will likely get an N. That said, to reiterate, we could also find other issues in your code that we did not anticipate, but that end up affecting your code quality score. When that happens, we encourage you to see these as opportunities to improve your code in future assignments (or as specific things to change if you decide to resubmit the assignment).

And don’t forget that style also matters! You could avoid all of the above issues, and still not get an E because of style issues in your code. Make sure to review the general guidelines in the Rubric for Programming Assignments, as well as our Style Guide.

Cleaning up¶

Before you submit your final solution, you should, remove

any print statements that you added for debugging purposes and
all in-line comments of the form: “YOUR CODE HERE” and “REPLACE …”

Also, check your code against the style guide. Did you use good variable names? Do you have any lines that are too long, etc.

Make sure you have included header comments, that is, the triple-quote strings that describe the purpose, inputs, and return values of each function, for every function you have written.

As you clean up, you should periodically save your file and run your code through the tests to make sure that you have not broken it in the process.

Submission¶

You will be submitting your work through Gradescope (linked from our Canvas site). The process will be the same as with previous coursework: Gradescope will fetch your files directly from your PA4 repository on GitHub, so it is important that you remember to commit and push your work! You should also get into the habit of making partial submissions as you make progress on the assignment; remember that you’re allowed to make as many submissions as you want before the deadline.

If you are working in a pair, only one of you has to make a submission

To submit your work, go to the “Gradescope” section on our Canvas site. Then, click on “Programming Assignment #4”.

Under “Repository”, make sure to select your uchicago-CAPP30121-aut-2021/pa4-$TEAM.git repository. Under “Branch”, just select “main”.

Finally, click on “Upload”. An autograder will run, and will report back a score. Please note that this autograder runs the exact same tests (and the exact same grading script) described in Testing Your Code. If there is a discrepancy between the tests when you run them on your computer, and when you submit your code to Gradescope, please let us know.

If you are working in a pair, Gradescope will initially create an individual submission. You will need to explicitly add your partner to your submission, so that your work is associated with them too.

Polling places¶

Getting started¶

Introduction¶

Representing Voters¶

Precincts¶

Precinct Configuration¶

Precinct class¶

Generating Voters¶

Task 0: Implement voter generation¶

Testing¶

Task 1: Simulate an election day¶

Simulating a Precinct¶

Example¶

Managing the Voting Booths¶

Returning the voters¶

Testing¶

Finding the median of a set of trial runs¶

Task 2: Finding the impatience threshold¶

Task 3: Finding a suitable number of booths¶

Grading¶

Cleaning up¶

Submission¶

`Precinct` class¶