Polling places¶
Due: Friday, November 12, 4:30pm CST
You may work alone or in a pair on this assignment.
The goal of this assignment is to give you practice designing and implementing classes and methods, and working with class instances.
Getting started¶
If you are going to work individually, follow the invitation URL
provided on Ed Discussion and, when prompted for a team name, simply
enter your CNetID. Because of a quirk in how GitHub manages team
assignments, it may complain that your CNetID is already “taken” if
you did PA #2 individually (this does not mean anyone is usurping
your CNetID; GitHub simply requires that you sign up with a different
name, because you already used your CNetID in PA 2). If this happens,
simply add a -4
to your CNetID. For example, CNetID jrandom
would become jrandom-4
. Make sure to use this modified CNetID
in place of your CNetID in all the instructions below.
If you are going to work in a pair, the process involves a couple of extra steps. The process is described in our Coursework Basics page. Please head over to that page, and return here once you’ve created your team repository. We have made a small, but important, change to the way teams are named, so please have a look at the Coursework Basics page even if you already worked in a team for PA #2.
Next, you will need to initialize your repository. If you are working in a pair only one of you should complete these steps. If you repeat these steps more than once, you may break your repository.
First, create a TEAM
variable in the terminal that contains either
your CNetID (if you’re working individually) or your team name (if
you’re working in a pair):
TEAM=replace_me_with_your_cnetid_or_team_name
(remember you can double-check whether the variable is properly set by
running echo $TEAM
)
Finally, run these commands (if you don’t recall what some of these commands do, they are explained at the end of the Git Tutorial):
cd ~/capp30121
mkdir pa4-$TEAM
cd pa4-$TEAM
git init
git remote add origin git@github.com:uchicago-CAPP30121-aut-2021/pa4-$TEAM.git
git remote add upstream git@github.com:uchicago-CAPP30121-aut-2021/pa4-initial-code.git
git pull upstream main
git branch -M main
git push -u origin main
You will find the files you need for the programming assignment
directly in the root of your repository, including a README.txt
file that explains what each file is. Make sure you read that file.
If you are working in a pair, the person who did not initialize the repository should setup
the TEAM
variable (as shown above) and then clone the repository like this:
cd ~/capp30121
git clone git@github.com:uchicago-CAPP30121-aut-2021/pa4-$TEAM.git
cd pa4-$TEAM
git remote add upstream git@github.com:uchicago-CAPP30121-aut-2021/pa4-initial-code.git
If you are working in a pair, you should also take a moment to read the working_pair section of the Coursework Basics page.
Introduction¶

Every two years, many people around the United States wait in long lines to vote. Some of these voters get impatient and leave without voting, which is bad for democracy.
In this assignment, you will write code to simulate the flow of voters through polling places and analyze the interplay between different thresholds for voter impatience and the number of voting booths assigned to a precinct.
We will view this problem as an instance of the more general problem of simulating an \(M\)/\(M\)/\(N\) queue, where:
the first \(M\) indicates that arrivals (voters arriving at the polls) obey a Markov process;
the second \(M\) indicates that the service times obey a Markov process (the amount of time a voter takes to complete a ballot); and finally
the \(N\) indicates that there are \(N\) servers (the number of voting booths).
We will tell you how to structure your implementation at a high level, but you will be responsible for the design details.
Representing Voters¶
In this assignment you will write a Voter
class to model voters.
This class will include:
the time the voter arrives at the polls,
the voting duration, that is, the amount of time the voter takes to vote,
whether the voter is of the impatient sort (that is, will they leave if they have to wait too long),
the time the voter is assigned to a voting booth (aka, the start time), assuming they vote,
the time the voter leaves the voting booth (aka, the departure time) assuming they vote, and
whether or not they voted.
For simplicity, we will model time as a floating point value, representing the number of minutes since the polls opened. So, if we say a voter arrived at the polls at time 60.5, that means they arrived an hour and thirty seconds after the polls opened.
We will know the voter’s arrival time and voting duration when we
construct the Voter
instance representing a voter. Both the
arrival time and the voting duration for the voter should be
initialized to their proper values when the voter instance is
constructed.
The start time for a voter, which is not available at construction
time, should be initialized to None
when the voter instance is
constructed. This value should be reset to the correct value when the
voter is assigned to a voting booth and starts to vote.
Similarly, a voter’s departure time cannot be determined when the
voter is constructed. It too should be initialized to None
when
the voter instance is constructed and should be reset to the
appropriate time when the voter is assigned to a voting booth.
The voter’s has voted flag should be initialized to False
and set
to True
only when e determined whether or not the voter actually
votes.
Your implementation must include at a minimum a constructor and public attributes named:
arrival_time
,voting_duration
,is_impatient
,start_time
,departure_time
, andhas_voted
Our utility function for printing voters, described below, relies on the existence of these attributes.
In addition to a constructor, this class could have a method to set the voter’s start and departure times, and their has voted flag, all of which can be set at the same time. Another reasonable design is to separate the work of handling these updates into multiple methods. Whether you use one method for these updates or more than one is up to you.
Precincts¶
Cities are broken up into voting precincts. You will complete the definition of
a Precinct
class that represents a single precinct, and which will be responsible for simulating
the behaviour of voters in that precinct.
A precinct is open for some number of hours and has some maximum number of voters who can vote in the precinct (you can think of this as the number of voters who are registered to vote in that precinct). We will assume that all registered voters will go to the polls on election day, but not all arrive in time to vote. Also, some of the voters will arrive on time, but are impatient and will leave if they have to wait too long.
In addition to these values, a precinct has three values–the arrival rate, the voting duration rate, and the probability that a voter will be impatient–that are used to generate voters. We discuss this process below.
Precinct Configuration¶
The configuration for a precinct is specified using a dictionary containing this information:
The name of the precinct (
name
)Number of hours the polls are open (
hours_open
)Number of voters assigned to the precinct (
num_voters
)The rate at which voters arrive (
arrival_rate
)The rate at which voters complete their voting (
voting_duration_rate
).The probability a voter is impatient (
impatience_prob
)
The significance of the rates and the probability will be explained later.
For example:
{
"name": "Downtown",
"hours_open": 1,
"num_voters": 5,
"voting_duration_rate": 0.1,
"arrival_rate": 0.11,
"impatience_prob": 0.10
}
Information about a precinct is stored in a JSON file that contains a
dictionary with two items: the key "precinct"
maps to a precinct
configuration as specified above and the key "seed"
maps to a
random seed. Similar to PA1, the random seed will be used to ensure
that a simulation involving randomness produces the same result every
time it runs (which will allow us to test whether your code produces
the expected results). We recommend reviewing PA1 (and how it used
random seeds) if you’re unsure of how random seeds are used in the
context of running a simulation.
We have provided a function, load_precinct
, in util.py
that
takes the name of a JSON precinct file and returns two values:
a precinct configuration (as specified above) and the random seed.
Our testing code, and our code for running simulate.py
from the command line already call load_precinct
,
so you should not make any calls to load_precinct
in the code
you’ll add to simulate.py
. However, it can be a useful
function when testing your code. In particular,
you should take a moment to familiarize yourself with the
data stored in these JSON files, and how to access each value once
you’ve loaded the file with load_precinct
. We suggest you start
by looking at data/precinct-0.json
, and loading that file
from IPython:
In [1]: %load_ext autoreload
In [2]: %autoreload 2
In [3]: import util
In [4]: precinct, seed = util.load_precinct("data/precinct-0.json")
In [5]: precinct
Out[5]:
{'name': 'Downtown',
'hours_open': 1,
'num_voters': 5,
'voting_duration_rate': 0.1,
'arrival_rate': 0.11,
'impatience_prob": 0.1}
In [6]: seed
Out[6]: 1468604453
In [7]: precinct["num_voters"]
Out[7]: 5
In [8]: precinct["arrival_rate"]
Out[8]: 0.11
In [9]: precinct["impatience_prob"]
Out[9]: 0.1
Precinct
class¶
Notice how having a Precinct
class allows us to have multiple
instances of the Precinct
class in our program, each with its own
set of parameters. This design will make it easier for us to simulate
how voters behave given different settings for the parameters.
We have provided a skeleton for the Precinct
class that
includes the constructor and a simulate
method to simulate a
single voting day for that precinct. While you are allowed to add
additional methods to this class, you cannot modify the parameter
list of the constructor or the simulate
method.
Please note that we will consider each Precinct
to be independent
from each other: voters cannot go from one precinct to another, and
there is no information shared between the precincts.
Generating Voters¶
One of the main responsibilities of the Precinct
class is to generate
the voters.
When we generate a voter, we set their arrival time, so
let’s take a closer look at how those arrival times are generated.
The arrival time of our voters will follow a Poisson process. From Wikipedia: a Poisson process, named after the French mathematician Siméon-Denis Poisson (1781-1840), is a stochastic process in which the time to the next event is independent of past events. We can use the following fact to generate a voter sample: if the number of arrivals in the time interval \([0, t]\) follows the Poisson distribution with mean \(t \cdot \lambda\), then the gap between arrivals times follows the exponential distribution with rate parameter \(\lambda\).
This means that to generate a new voter, we need to know the arrival time of the previously generated voter. Our voter simulation code should keep track of a “current time” representing when the most recent voter arrived. We will keep track of time in minutes starting at time zero. To generate a voter, you will need to generate an arrival time, a voting duration, and a boolean flag to indicate whether they are impatient.
Arrival time
To generate the arrival time of the next voter, you will generate an inter-arrival time (\(\textit{gap}\)) from the exponential distribution using the specified arrival rate as \(\lambda\) and add it to the current time (\(t\)). You’ll then move time forward by updating the time from \(t\) to \(t + \textit{gap}\). Notice how \(\textit{gap}\) represents the time that elapses between the arrival of the previous voter and the arrival of the next voter.
For example, let’s say we want to generate a voter in a precinct,
and the random number generator returns 5.7
for that
voter’s gap. Since this is the first voter we generate, their arrival
time will be 5.7
, and we advance the precinct’s current time from 0
to
5.7
. We then generate another voter, and the random number generator
returns 2.4
for the voter’s gap. This means that this second voter
arrives 2.4
minutes after the previous voter, so the second voter’s
arrival time will be 8.1
(5.7 + 2.4
) and we advance the precinct’s
current time to 8.1
.
Voting Duration
When we generate a voter, we also need to assign a voting duration to that voter. The voting duration also follows a Poisson process so you will draw the voter’s voting duration from the exponential distribution using the specified voting duration rate as \(\lambda\).
Is the voter impatient?
Finally, we need to determine whether the voter is impatient. We will do this by flipping a weighted coin, with the weight determined by the precinct’s impatience probability.
Task 0: Implement voter generation¶
Implement the Voter
class. And, in the provided Precinct
class, fill in the constructor, and write a private method
for generating a list of voters according to the configuration of the precinct.
In other words, instead of carrying out a full simulation of voters moving through the polling place, you will just
generate instances of the Voter
class for the voters who will be
able to vote in the precinct, and return a list of these instances.
Doing this partial implementation of a voting simulation will
ensure that you’ve correctly implemented the voter generation logic we
described earlier. The code for generating the voters in your
Precinct
should be fairly simple (our implementation is only 15
lines long, not including comments).
When writing your code, it is important that you take into account the following notes.
Repeatability
For testing purposes, it is important that you draw the gap, determine
the voter’s voting duration, and flip the impatience coin in the same
order as our code. To reduce the likelihood that the testing process
fails simply because the values were drawn in the wrong order, we have
provided a function, gen_voter_parameters
, in util.py
that
takes the precinct’s arrival rate, voting duration rate, and
impatience probability and returns a tuple with the gap as a float,
the voting duration as a float, and the impatience flag as a boolean
that indicates whether the voter is impatient (in that order). We
recommend familiarizing yourself with this function to understand how
arrival gaps, voting duration times, and impatience flags are
determined.
Generating a list of voters
Your voter list generation method should be a private method (that is, one with name
that starts with __
), because its main purpose will be as a helper function to the simulate
method you will write later.
We recommend that it take a seed for the random number as input. The number
of voters generated will be determined by one of two stopping
conditions: (1) you have generated the maximum number of the voters
specified for the precinct, or (2) you generate a voter who arrives
after closing time. In the second case, the late voter should be
discarded and no further voters should be generated. We will refer to
voters who arrive before the polls close as valid.
Testing¶
Because your voter list generation method is private, you cannot call it directly from outside the Precinct
class, even for testing. However, since its purpose is as a helper function for the simulate
method, you can do the following: in the simulate
method, call your voter list generation method, and return the resulting list of voters. That way, you can test your voter list generation method by testing the simulate
method.
So that you can test your implementation manually, we have provided two sets of
voter generation parameters. The first set (see
data/precinct-0.json
) includes the following:
number of voters in the precinct: 5
number of booths in the precinct: 1
hours open: 1
arrival rate: 0.11 (corresponds to mean time between voters of approximately 9 minutes)
voting duration rate: 0.1 (corresponds to a mean voting duration of 10 minutes)
impatience probability: 0.1
seed for random number generator: 1468604453
and corresponds to a voter population in which all the voters arrive before the polls close and one voter, the first, is impatient.
The random seed
Remember that the precincts file includes a seed value
(returned by load_precincts
). This seed must be passed
to simulate
(using the seed
parameter) and
you must use the random.seed(seed)
function to set the seed inside
your method that generates the list of voters.
The seed only needs to be set once for each list of voters that you generate.
Remember that PA1 includes a longer discussion of random seeds, and their significance in testing simulations.
We have also provided two functions that will be useful to you as you
test this part your code. The first, a function named
run_simulation
in util_tests
, takes the name of a precinct
file and two optional parameters–the number of voting booths assigned
to the precinct and the impatience threshold. This function loads the
precinct data from the file, creates an instance of the Precinct
class using that data, and then returns the result of calling
simulate
on the precinct.
The second function, print_voters
in util.py
takes a list of
Voter
instances and an optional file name. This function will
either print a representation of the voters to screen or to a file,
depending on whether a filename is specified (you likely won’t have to
write to a file yourselves, but it is something that we’ve found
useful on occasion).
You can use these functions to call your voter list generation method and then to print and check the voters that get returned.
In [10]: import util, util_tests
In [11]: voters = util_tests.run_simulation("data/precinct-0.json")
In [12]: util.print_voters(voters)
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
------------------------------------------------------------------------------------------
1.92 7.46 T None None F
20.17 5.13 F None None F
21.17 1.78 F None None F
23.48 10.08 F None None F
54.75 18.02 F None None F
Notice how the Start Time
and the Departure Time
are initially
None
because you are not yet calculating the time when the voters
start voting (they may need to wait in line once they arrive at the
precinct, and we describe this in more detail in the next task) nor
are you handling impatient voters. Notice that all the voters start
out not having voted (that is, Has Voted
is False
).
The second set of voter generation parameters,
data/precinct-1.json
, is the same except the arrival
rate is set to 0.04, which corresponds to a mean gap of 25 minutes
between arrivals:
In [6]: voters = util_tests.run_simulation("data/precinct-1.json")
In [7]: util.print_voters(voters)
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
------------------------------------------------------------------------------------------
5.27 7.46 T None None F
55.48 5.13 T None None F
58.22 1.78 F None None F
Notice how, in the above set, only three voters get to vote. The
fourth voter’s projected arrival time turns out to be 64.57
, which
means they would arrive after the polls closed since this precinct is
only open for one hour.
Once you’ve done some manual testing, you can run additional automated tests with py.test:
$ py.test -xvk voter
(Recall that we use $
to indicate the Linux command-line prompt.
You should not include it when you run this command.)
As usual, remember that you can find more detailed instructions on how manual and automated testing work in the Testing Your Code page.
Task 1: Simulate an election day¶
In this task, you will be simulating an election for a single
precinct. You will do this by completing the implementation of the
simulate
method in Precinct
.
Simulating a Precinct¶
The simulation itself will be similar to the M/D/1 queue example from class. In the M/D/1 example, we have a single server, such as a toll booth, that serves a queue of clients, such as cars, that arrive with a specified distribution and have a fixed service time. In this assignment, your simulation will need to simulate multiple servers (voting booths) and variable service times (voting durations), but the structure of your simulation will be similar.
The biggest difference is that you will be using our VotingBooths
class (described below) to model the “multiple servers”. This class
will be responsible for determining whether a voting booth is
available and, if not, when a booth will become available. This
information will be necessary to determine when a given voter will
start voting (since it depends on when they can access a booth), and
to ensure that the number of booths in use will never be no more than
the number of booths assigned to the precinct.
A voter who is impatient will ask how long they have to wait. If it is longer than a specified impatience threshold, they will leave without voting. We’ll start by looking at an example that uses an impatience threshold that is long enough that everyone, even the impatient voters, will stay and vote.
Example¶
Let’s walk through an example. The file
data/precinct-2.json
contains the following
precinct:
{
"name": "Little Rodentia",
"hours_open": 1,
"num_voters": 10,
"voting_duration_rate": 0.1,
"arrival_rate": 0.16666666666666666,
"impatience_prob": 0.30
}
And the seed 1438018945
.
The precinct has 10 voters who arrive at the precinct once every 6 minutes on average. Voters take 10 minutes on average to vote. And two of the voters (the fourth voter and the eighth voter) are impatient.
As you work on your implementation, it can be useful to print out the
voters returned by your simulate
method, even if you have not yet
filled in the start and departure times of the voters:
In [15]: voters = util_tests.run_simulation("data/precinct-2.json")
In [16]: util.print_voters(voters)
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
------------------------------------------------------------------------------------------
13.38 4.37 F None None F
15.58 0.12 F None None F
17.92 4.15 F None None F
18.98 11.31 T None None F
26.54 5.51 F None None F
29.31 0.71 F None None F
43.15 5.38 F None None F
44.97 4.09 T None None F
49.22 9.52 F None None F
49.90 18.05 F None None F
Let’s consider what happens when we simulate this precinct with two voting booths and a large value (say, 1000) for the impatience threshold.
Our first voter first voter arrives at 13.38, enters a voting booth immediately at time 13.38, and finishes voting at 17.75 (13.38 + 4.37). (All times in this example have been rounded to two digits for clarity; your code should not round times.)
The second voter also starts voting immediately, because a booth is open, and finishes voting at time 15.70 (15.58 + 0.12).
The third voter arrives at time 17.92 and has a voting duration of 4.15 minutes. Since the first voter departed at 17.75 and the second voter departed at 15.70, this voter can start voting immediately, and finishes voting at 22.07.
The fourth voter arrives at 18.98, has a voting duration of 11.31 minutes, and is impatient. Only one booth is occupied when the fourth voter arrives, so voter can start voting immediately, and finishes voting at 30.29. The voter will vote because their wait time is zero, which is less than the impatience threshold.
The fifth voter arrives at time 26.54, after the third voter has departed, and starts voting immediately. This voter departs at 32.05.
The sixth voter arrives at 29.31 and has a voting duration of 0.71 minutes. Both booths are occupied when the sixth voter arrives, so this voter must wait until a booth becomes free, which happens when the fourth voter finishes at time 30.29. Thus, unlike the previous voters, the sixth voter’s start time (30.29) will be different from their arrival time (29.31). Their finish time will be 31.00. Notice that this voter departs before the fifth voter.
And so on.
All ten voters arrive before the polls close and the two impatient voters get to vote quickly and so all ten get to vote. Please note that (as typical in the US) if a voter arrives before the polls close, they still get to vote even if they will not start voting until after the polls have closed. In this example, the polls close at time 60 (the precinct dictionary specifies the number of hours the polls are open, while all other times are measured in minutes).
Here’s the result of our simulation:
In [20]: voters = util_tests.run_simulation("data/precinct-2.json", 2, 1000)
In [21]: util.print_voters(voters)
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
------------------------------------------------------------------------------------------
13.38 4.37 F 13.38 17.75 T
15.58 0.12 F 15.58 15.70 T
17.92 4.15 F 17.92 22.07 T
18.98 11.31 T 18.98 30.29 T
26.54 5.51 F 26.54 32.05 T
29.31 0.71 F 30.29 31.00 T
43.15 5.38 F 43.15 48.54 T
44.97 4.09 T 44.97 49.06 T
49.22 9.52 F 49.22 58.74 T
49.90 18.05 F 49.90 67.94 T
Notice that we provided values for the two optional parameters of util_tests.run_simulation
:
num_booths
is two and impatience_threshold
is 1000.
Now, let’s look at a precinct with one voting booth where everyone is very impatient. That is, the impatience probability is 1.0 and voters are not willing to wait more than three minutes. That is, if their wait time is (strictly) greater than three, they will leave without voting.
In [30]: voters = util_tests.run_simulation("data/precinct-6.json", 1, 3)
In [31]: util.print_voters(voters)
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
------------------------------------------------------------------------------------------
13.38 4.37 T 13.38 17.75 T
15.58 0.12 T 17.75 17.86 T
17.92 4.15 T 17.92 22.07 T
18.98 11.31 T None None F
26.54 5.51 T 26.54 32.05 T
29.31 0.71 T 32.05 32.76 T
43.15 5.38 T 43.15 48.54 T
44.97 4.09 T None None F
49.22 9.52 T 49.22 58.74 T
49.90 18.05 T None None F
Notice that everyone is impatient as expected. The second voter arrives while the first voter is still voting, so they have to wait. Their projected waiting time, that is, the time between when they arrive and when a booth will be available is 2.17. Since their projected wait time is less than three minutes, the voter waits and starts voting when the first voter finishes.
The third voter does not arrive at the polls until after the second voter has finished, so they vote right away. This voter will finish at time 22.07.
The fourth voter arrives while the third voter is still voting. Their
projected wait time is more than three minutes (22.07 - 18.98 > 3) and
so, they leaving the polling place without voting. Note that their
departure time, which tells us when a voter left the voting booth,
remains set at None
.
The fifth voter arrives at time 26.54. Since the third voter finished at time 22.07 and the fourth voter left without voting, this voter can start voting immediately.
And so, on.
Managing the Voting Booths¶
We have supplied a class VotingBooths
to model the voting booths.
This class allows you to add and remove voters from booths and to
track the state of the booths.
VotingBooths
class has a constructor that takes the number of
booths and initializes the data structures used to manage the booths.
Priority Queues
The underlying data structure used to represent the booths in the
VotingBooths
class is a priority queue. You do not need
to understand priority queues to use the VotingBooths
class.
But in case you are curious: a minimum priority queue is a data
structure that includes operations for (1) adding items along with
a priority to the queue (where the priority is just a
number), (2) removing the item with the lowest priority value
from the queue, and (3) determining whether the queue is empty.
Removing the elements with the lowest priority value may seem
counter-intuitive but, in Computer Science, a lower value usually
denotes a higher priority (think of it as a ranking: if you’re #1,
your ranked above someone who is ranked #2).
The class also provides a collection of methods:
is_booth_available
: which returnsTrue
if there is at least one unoccupied booth andFalse
otherwise.is_some_booth_occupied
: which returnsTrue
if there is at least one occupied booth andFalse
otherwise.time_next_free
: returns a float with the time that the next booth will be free next. This method should only be called when at least one booth is occupied.enter_booth
: takes an instance of theVoter
class and adds the voter to a booth. This method should only be called when at least one booth is open and it also requires thestart_time
attribute of theVoter
to be set.exit_booth
: removes the next voter to depart from their booth and returns the voter and their departure_time. This method should only be called when at least one booth is occupied.
When simulating an election day, you are responsible for ensuring that all booths are empty at the end of the day.
Your Precinct
code may use as many of these methods as needed.
Your code may not access the underlying data directly. In fact
our implementation of VotingBooths
uses a private attribute
(created by adding two underscores at the start of the attribute’s
name) for the data to prevent you accessing it directly.
Returning the voters¶
Your simulate
function must return a list of all the Voter
objects. The start_time
, departure_time
, and has_voted
attributes should be updated for the voters who voted. Furthermore, for a
given precinct, the list of voters should be sorted by the voters’
arrival time. Since your Precinct
should generate voters in the
order in which they arrive, producing a list of voters sorted by
arrival time should not require any additional steps. If you do find
that you need to sort the list explicitly, it is likely that your
solution is headed in the wrong direction; come to office hours or ask
for help on Ed if that is the case.
Testing¶
As we mentioned earlier, the function util_tests.run_simulation
has two optional arguments: num_booths
allows the caller to supply
the number of voting booths to use in the simulation, while
impatience_threshold
allows the caller to specify the impatience
threshold. Here is an example use:
In [40]: voters = util_tests.run_simulation("data/precinct-6.json", num_booths=2, impatience_threshold=1)
In [41]: util.print_voters(voters)
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
------------------------------------------------------------------------------------------
13.38 4.37 T 13.38 17.75 T
15.58 0.12 T 15.58 15.70 T
17.92 4.15 T 17.92 22.07 T
18.98 11.31 T 18.98 30.29 T
26.54 5.51 T 26.54 32.05 T
29.31 0.71 T 30.29 31.00 T
43.15 5.38 T 43.15 48.54 T
44.97 4.09 T 44.97 49.06 T
49.22 9.52 T 49.22 58.74 T
49.90 18.05 T 49.90 67.94 T
We have provided CSV files with the expected results for different
precincts with a variety of numbers of voting booths and thresholds.
The filenames all have the form: data/precint-N_VB_T.csv
,
where precinct-N
specifies the precinct files to use, VB
is the number of
voting booths, and T
is the impatience threshold. So, for
example, the file precinct-7_5_20.csv
was generated using the
precinct from data/precinct-7.json
along with five voting booths,
and an impatience threshold of 20.
We also have provided automated tests that will test your
implementation of the simulate
method. You can run these tests
like this:
$ py.test -xv test_simulate_election_day.py
Finally, we have provided a main function in simulate.py
that
allows you to run simulations from the command-line. If you run
simulate.py
with just the name of a configuration file, it will
print a summary of the simulation results using that precinct with one
voting machine and an impatience threshold of 1000. For example:
$ python3 simulate.py data/precinct-0.json
PRECINCT 'Downtown'
- 5 voters voted.
- Polls closed at 60 and last voter departed at 72.77
- 0 voters left without voting
You can use the --print-voters
options to, instead, print
all the voters:
$ python3 simulate.py data/precinct-0.json --print-voters
Arrival Voting Is Start Departure Has
Time Duration Impatient Time Time Voted
---------------------------------------------------------------------------------------
1.92 7.46 T 1.92 9.37 T
20.17 5.13 F 20.17 25.31 T
21.17 1.78 F 25.31 27.08 T
23.48 10.08 F 27.08 37.16 T
54.75 18.02 F 54.75 72.77 T
You can use the --num-booths
flag to set the number of booths
available to the precinct. (The default is 1) and you can use the
--impatience-threshold
flag to the impatience threshold (the
default is 1000).
Here’s a sample run of the very impatient precinct with one booth and a threshold of three:
$ python3 simulate.py data/precinct-6.json --impatience-threshold 3
- 10 voters voted
- Polls closed at 60 and last *voter* departed at 58.74.
- 3 voters left without voting
including the last person to arrive at the polls
If we keep the impatience threshold the same and add a booth, then none of the voters leave without voting:
$ python3 simulate.py data/precinct-6.json --num-booths 2 --impatience-threshold 3
Precinct Impatience Island
- 10 voters voted
- Polls closed at 60 and last voter departed at 67.94.
- 0 voters left without voting
Finding the median of a set of trial runs¶
In the final pair of tasks, you will need to run a series of trials,
collect the results from the trials, and then report the median
result. One easy way to find the median is to store the results in a
list (let’s call it lst
), sort the list using the built-in
sorted
function (sorted(lst)
), and then return the middle
element of the sorted list (sorted(lst)[len(lst) // 2]
). Our test
code for the last two tasks assumes you use this approach to compute the
median.
Task 2: Finding the impatience threshold¶
At this point, we have the ability to simulate precincts under a variety of parameters. One interesting parameter to tweak is the threshold for how long impatient voters are willing to wait (i.e., the impatience threshold).
Let’s start by thinking about a specific set of voters for a given precinct and a fixed number of voting booths. How might we find a value for the impatience threshold that would be large enough to ensure that every voter would stay and vote. One way to do this task, is to simulate election day for the precinct using the same seed (to guarantee that we are simulating the same set voters) and the same bank of voting booths, but with different values for the impatience threshold. In our case, we’ll start with an impatience threshold of 1 and then keep incrementing it by 10 until we find a value for the threshold that is large enough to guarantee that even the impatient voters in the batch stay to vote.
Here’s an example of this process run manually using the impatient
precinct ("data/precinct-6.json"
)
In [50] import simulate
In [51]: p, seed = util.load_precinct("data/precinct-6.json")
In [52]: voting_booths = simulate.VotingBooths(1)
In [53]: precinct = simulate.Precinct(p["name"],
...: p["hours_open"],
...: p["num_voters"],
...: p["arrival_rate"],
...: p["voting_duration_rate"],
...: p["impatience_prob"])
...:
...:
In [54]: voters = precinct.simulate(seed, voting_booths, 1)
In [55]: [v.has_voted for v in voters]
Out[55]: [True, False, True, False, True, False, True, False, True, False]
In [56]: voting_booths.is_some_booth_occupied()
Out[56]: False
In [57]: voters = precinct.simulate(seed, voting_booths, 11)
In [58]: [v.has_voted for v in voters]
Out[58]: [True, True, True, True, True, True, True, True, True, False]
In [59]: voting_booths.is_some_booth_occupied()
Out[59]: False
In [60]: voters = precinct.simulate(seed, voting_booths, 21)
In [61]: [v.has_voted for v in voters]
Out[61]: [True, True, True, True, True, True, True, True, True, True]
In [62]: voting_booths.is_some_booth_occupied()
Out[62]: False
Notice that we used the same precinct, seed, and bank of voting booths for all three tests. The only thing that varied was the impatience threshold. This makes sense because our goal is to find a threshold suitable for a specific set of voters and a specific set of voting booths.
Also, note that there is a tradeoff between computation time and
accuracy. The threshold we identified (21
) overshoots the true
value. A value closer to 12.3 is sufficient for this specific set of
voters.
If we ran another trial of this process using the same precinct and
set of voting booths but with seed+1
as the seed, we’d end up with
a threshold of 71.
To avoid having the quirks of a specific set of voters mislead us, we will run through this trial process for different sets of voters (that is, using different seeds) and return the median threshold identified across the different trials.
Your task is to write a function find_impatience_threshold
that,
given a seed, a Precinct
, a number of booths, and a number of
trials returns the median threshold from the specified number of
trials. You may find it helpful to write a separate helper function
that runs a single trial and returns the computed threshold for that seed
(set of voters). What parameters do you need for the helper function?
What should it return?
You will need to use a different seed for each trial. Start with the
specified initial seed
and add 1 for each new trial. (That is,
the ith trial should use seed+i
as the seed.)
Note that the number of booths to simulate is a parameter to the
function. If you implemented your simulation method properly, you
should be able to construct a single instance of the VotingBooths
class and use it for all the simulations.
Testing
We have provided a function util_tests.run_threshold_test
to help
you test your work in ipython3. This function takes the name of a
precinct file, the number of booths to use, and the number of trials
to run. It loads the precinct into a Precinct
object and calls
find_impatience_threshold
and returns the result.
In [60]: util_tests.run_threshold_test("data/precinct-6.json", 1, 100)
Out[60]: 41
In [61]: util_tests.run_threshold_test("data/precinct-6.json", 2, 100)
Out[61]: 11
You can find the expected results for many combinations of precincts,
numbers of voting booths, and numbers of trials in the file
data/impatience_threshold.csv
.
The automated tests for this task are split into two groups: tests that will run quickly and test that will run slowly. To skip the slow tests, run:
$ py.test -xv test_impatience_threshold.py --skip-slow
To include in the slow tests run:
$ py.test -xv test_impatience_threshold.py
Our implementation took 31.4 seconds to run the full test suite for this task. And under 50 seconds to run all the tests for all the tasks.
Task 3: Finding a suitable number of booths¶
Ideally, we would like to have enough booths on hand to allow everyone, even the impatient voters, to vote in all situations. Planning for all situations, though, can be wasteful. Lots of booths would sit idle most of the time. Instead, we will determine a number of booths that will be sufficient for many, but not all, situations.
Your task is to implement the function find_voting_booths_needed
,
which takes a starting seed, a precinct, an impatience threshold, and
a number of trials to run and returns this “good enough” number of
booths. The specific value required is defined below.
As in the previous task, a single trial will consist of finding the number of booths needed to ensure that everyone votes given a specified threshold and a specific set of voters (determined by the seed). Start with 1 booth and increment by 1 each time until you find a number of booths that is sufficient to ensure that all the voters stay to vote.
Also similar to the previous task, you should start with the specified
seed and increment it by one for each trial (i.e., the ith trial
should use seed+i
).
Your function should return the median number of booths computed across the trials (determined as noted above), which we have decided is good enough.
Testing
As with the previous function, we have provided a test function
utils_test.run_find_vb_test
to simplify the process of testing
your code manually. It takes the name of a precinct file, the
impatience threshold, and the number of trials to run as arguments.
It loads the precinct file into a Precinct
and then calls your
find_voting_booths_needed
function and returns the results:
In [70]: util_tests.run_find_vb_test("data/precinct-6.json", 5, 1)
Out[70]: 2
In [71]: util_tests.run_find_vb_test("data/precinct-6.json", 5, 10)
Out[71]: 3
In [72]: util_tests.run_find_vb_test("data/precinct-6.json", 5, 100)
Out[72]: 3
The file data/voting_booths_needed.csv
contains the expected
results for a variety of precinct, impatience thresholds, and numbers
of trials.
You can run the automated tests for this function like this from the Linux command-line:
$ py.test -xv test_voting_booths_needed.py
Grading¶
The assignment will be graded according to the Rubric for Programming Assignments. Make sure to read that page carefully; the remainder of this section explains aspects of the grading that are specific to this assignment.
In particular, your completeness score will be determined solely on the basis of the automated tests, which provide a measure of how many of the tasks you have completed:
Grade |
Percent tests passed |
---|---|
Exemplary |
at least 95% |
Satisfactory |
at least 80% |
Needs Improvement |
at least 50% |
Ungradable |
less than 50% |
For example, if your implementation passes 85% of the tests, you will earn an S (satisfactory) score for completeness.
The code quality score will be based on the general criteria in the Rubric for Programming Assignments but, in this programming assignment, we will also be paying special attention to the following:
Appropriate design for the
Voter
class Did you design one or more methods that are apporpriate for updating voters once they vote?Appropriate design for the
Precinct
class Did you design an appropriate helper method for generating voters?Appropriate use of the
VotingBooths
class Does your implementation leverage theVotingBooths
class to manage the voters properly or did you recreate some of the functionality of this class in your own code?Appropriate use of helper functions Did you design a good method for creating voters? Did you write well-designed helper methods for the parameter sweeps?
While these are the main things we care about in this assignment, please remember that it is not possible for us to give you an exhaustive list of every single thing that could affect your code quality score (and that thinking in those terms is generally counterproductive to learning how to program; see our How to Learn in this Class page for more details).
In general, avoiding all of the above issues will increase the likelihood of getting an E; if your code has a few (but not most) of the above issues, that will likely result in an S; if your code suffers from most of those issues, it will likely get an N. That said, to reiterate, we could also find other issues in your code that we did not anticipate, but that end up affecting your code quality score. When that happens, we encourage you to see these as opportunities to improve your code in future assignments (or as specific things to change if you decide to resubmit the assignment).
And don’t forget that style also matters! You could avoid all of the above issues, and still not get an E because of style issues in your code. Make sure to review the general guidelines in the Rubric for Programming Assignments, as well as our Style Guide.
Cleaning up¶
Before you submit your final solution, you should, remove
any
print
statements that you added for debugging purposes andall in-line comments of the form: “YOUR CODE HERE” and “REPLACE …”
Also, check your code against the style guide. Did you use good variable names? Do you have any lines that are too long, etc.
Make sure you have included header comments, that is, the triple-quote strings that describe the purpose, inputs, and return values of each function, for every function you have written.
As you clean up, you should periodically save your file and run your code through the tests to make sure that you have not broken it in the process.
Submission¶
You will be submitting your work through Gradescope (linked from our Canvas site). The process will be the same as with previous coursework: Gradescope will fetch your files directly from your PA4 repository on GitHub, so it is important that you remember to commit and push your work! You should also get into the habit of making partial submissions as you make progress on the assignment; remember that you’re allowed to make as many submissions as you want before the deadline.
If you are working in a pair, only one of you has to make a submission
To submit your work, go to the “Gradescope” section on our Canvas site. Then, click on “Programming Assignment #4”.
Under “Repository”, make sure to select your
uchicago-CAPP30121-aut-2021/pa4-$TEAM.git
repository. Under “Branch”, just
select “main”.
Finally, click on “Upload”. An autograder will run, and will report back a score. Please note that this autograder runs the exact same tests (and the exact same grading script) described in Testing Your Code. If there is a discrepancy between the tests when you run them on your computer, and when you submit your code to Gradescope, please let us know.
If you are working in a pair, Gradescope will initially create an individual submission. You will need to explicitly add your partner to your submission, so that your work is associated with them too.