Polling places¶
Due: Friday, Nov 4th, 4pm
You may work alone or in a pair on this assignment.

Every two years, many people around the country wait in long lines to vote due, in part, to inadequate provisioning of voting booths. In this assignment you will write code to simulate the flow of voters through precincts. The information generated by your simulator could be used by an election official as a crude lower bound to help determine the number of voting booths needed at a precinct.
We will view this problem as an instance of the more general problem of simulating an \(M\)/\(M\)/\(N\) queue, where:
- the first \(M\) indicates that arrivals (voters arriving at the polls) obey a Markov process;
- the second \(M\) indicates that the service times obey a Markov process (the amount of time a voter takes to complete a ballot); and finally
- the \(N\) indicates that there are \(N\) servers (the number of voting booths).
We will tell you how to structure your implementation at a high level, but you will be responsible for the design details.
Getting started¶
See these start-up instructions if you intend to work alone.
See these start-up instructions if you intend to work with the same partner as in a previous assignment.
See these start-up instructions if you intend to work in a NEW pair.
The pa4
directory includes a file named simulate.py
, which you
will modify, a file named util.py
that contains a few useful
functions, and a directory named data
, which contains sample
data. See the file README.txt
in pa4/data
for an explanation
of the contents of the data directory.
Please put all of your code for this assignment in simulate.py
.
Do not add extra files for the required classes.
Precinct Configurations¶
The configurations for a precinct, except the number of voting booths, is specified in a JSON file that contains the:
- number of voters assigned to the precinct (
num_voters
), - number of hours the polls are open (
hours_open
), - voting duration rate (
voting_duration_rate
), - arrival rate (
arrival_rate
), and a - seed for the random number generator (
seed
).
The number of voting booths (number_of_booths
) will be supplied as
a command-line argument.
We have provided a function, setup_config
in util.py
that
takes the name of a configuration file and a value for the number of
voting booths and returns a dictionary containing the configuration
parameters. The key for each value is shown in parenthesis next to
its description above.
In addition to being called to set up the simulation in the main
function of simulate.py
, this function will also be useful when
testing your code.
Here is an example use of this function:
In [1]: import util
In [2]: util.setup_config("data/config-0.json", 3)
Out[2]:
{'arrival_rate': 0.11,
'hours_open': 1,
'num_voters': 5,
'number_of_booths': 3,
'seed': 1468604453,
'voting_duration_rate': 0.1}
Representing Voters¶
Your implementation must contain a class for representing a voter. This data structure must include the time the voter arrives at the polls, voting duration, that is, the amount of time the voter takes to vote, and the time the voter is assigned to a voting booth (aka, the start time). We found it convenient for debugging purposes to store a unique identifier and the time the voter leaves the voting booth (aka, the departure time) as well.
The start time for a voter, which is not available at construction
time, should be initialized to None
when the voter object is constructed.
This value should be reset to the correct value when the voter is
assigned to a voting booth. Similarly, voters’ departure times cannot
be determined until we know their start times.
Your implementation must include at a minimum: a constructor and
properties named arrival_time
, voting_duration
, and
start_time
for the corresponding values. Our utility function for
printing voters, described below, relies on the existence of these
properties.
This class is straightforward. It merely serves as a way to encapsulate information about a voter. Our implementation is roughly 30 lines, including the code for generating unique identifiers for the voters and for handling the optional attributes.
Please note, the voter sample class, described next, is the only class that is allowed to call the voter constructor.
Generating a voter sample¶
Your implementation must include a class for generating voter samples using a Poisson process. From Wikipedia: a Poisson process, named after the French mathematician Simeon-Denis Poisson (1781-1840), is the stochastic process in which events occur continuously and independently of one another. We can use the following fact to generate a voter sample: if the number of arrivals in a given time interval [0,t] follows the Poisson distribution, with mean \(t*\lambda\), then the gap between arrivals times follows the exponential distribution with rate parameter \(\lambda\).
You may not generate the voters en masse when you construct a voter sample. Instead, you must generate voters one at a time as requests are received from the client of the sample.
To generate a voter, you will need to generate an arrival time and a voting duration. To generate the arrival time of the next voter, you will draw an inter-arrival time (\(gap\)) from the exponential distribution using the specified arrival rate as \(\lambda\) and add it to the current time (\(t\)). You’ll then move time forward by setting \(t\) to \(t+gap\). You will draw the voter’s voting duration by drawing a value from the exponential distribution using the specified voting duration rate as \(\lambda\).
For testing purposes, it is important that you draw the gap and the
voting duration in the same order as our code. To reduce the likelihood
that the testing process fails simply because the values were drawn in
the wrong order, we have provided a function,
gen_voter_parameters
, in util.py
that takes the arrival rate
and the voting duration rate as arguments and returns a pair of floats to use
for the gap and voting duration (in that order).
We have one detail left to describe: the stopping conditions. Your simulator should generate new voters until (1) you have generated the number of the voters specified for the precinct or (2) you generate a voter who arrives after closing time. In the second case, the late voter should be discarded and no further voters should be generated. We will refer to voters who arrive before the polls close as valid.
Here is a more formal statement of the stopping conditions: if \(M\) is the number of voters who arrive before the polls close and \(N\) is the number of voters in the precinct, then your implementation should generate exactly:
- \(N\) voters if \(M=N\)
- \(M+1\) voters if \(M<N\)
Our implementation has a constructor, a method named has_next
that
returns a boolean to indicate whether there are valid voters left to
be generated in the sample, and a method named next
that returns
the next voter in the sample. To simplify debugging our next
method fails on an assertion if it is called after the last valid
voter has been generated. You can use this organization or another as
you prefer as long as your implementation generates the expected
voters on demand (that is, as needed).
Printing a voter sample
We have provided a function named print_voters
in util.py
that
takes a list of voter
instances and an optional file name. This
function will either print a representation of the voters to
screen or to a file, depending on whether a filename is specified.
Testing
The code for generating the voter sample is not too long, but can be tricky to get right. We strongly urge you to test your voter sample class carefully before you move on to the next step.
When we were working on our code, we wrote a short function (8 lines)
for testing sample creation. This function takes the name of a
configuration file and creates a sample, pulls the voters one by one
from it and constructs a list, and then calls util.print_voters
with the list. You may want to consider writing
a similar function to aid in your testing.
To help with testing your voter sample class, we have provided two
sets of sample parameters. The first set, (see
data/config-0.json
), includes the following:
- number of voters in the sample: 5
- hours open: 1
- arrival rate: 0.11 (corresponds to mean time between voters of approximately 9 minutes)
- voting duration rate: 0.1 (corresponds to a mean voting duration of 10 minutes)
- seed for random number generator: 1468604453
and corresponds to a sample in which all the voters arrive before the polls close.
Here is the result of calling our sample testing function with
data/config-0.json
.
Arrival Time Voting Duration Start Time Departure Time
6.78 2.11 None None
7.01 5.13 None None
25.27 2.68 None None
26.89 1.10 None None
33.48 10.08 None None
The second set, data/config-1.json
, is the same except the arrival
rate is set to 0.04, which corresponds to a mean gap of 25 minutes.
In this sample, only two of the voters arrive before the polls close.
Here is the result of calling our sample testing function with
data/config-1.json
.
Arrival Time Voting Duration Start Time Departure Time
18.64 2.11 None None
19.28 5.13 None None
Representing the Voting Booths¶
Cities are broken up into voting precincts, each of which has a number of voting booths. We’ll be simulating election day for a single precinct.
Your implementation must include a class for representing a precinct with a specified number of voting booths. At a minimum, your implementation must provide a means for adding voters to voting booths and removing voters from voting booths in increasing order of departure time.
You will need a data structure to keep track of the voters who are
currently occupying voting booths and to determine which voter will
depart next. Your implementation should use the PriorityQueue
class the from the python queue library for this purpose. A minimum
priority queue is a data structure that includes operations for (1)
adding items along with their priorities to the queue, (2) removing
the item with the lowest priority from the queue, and (3) determining
whether the queue is empty. The PriorityQueue
library also allows
you to specify an upper bound on the number of items allowed to be in
the queue at the same time and provides a method for determining
whether the queue is full. Please look through the queue library
for details on how to use the this class.
For this assignment, the priorities will be the voters’ departure times.
Simulate election day¶
Finally, you must complete the function simulate_election_day
in
simulate.py
that simulates the flow of voters through a precinct
on election day. Your function should take a precinct configuration
and return a list of the valid voters, updated with their assigned
starting times, in the order they arrived at the precinct
In lecture, we will describe a program that simulates a single server, such as a toll booth, that serves a queue of clients, such as cars, that arrive with a specified distribution and have a fixed service time. You need to simulate multiple servers (voting booths) and variable service times (voting durations), but the structure of your simulation will be similar.
There are three main differences. First, instead of generating arrival times for the clients directly in the simulation code, you will construct a voter sample and pull voters out of the sample as needed. Second, you will use the precinct data structure to determine whether a voting booth is available and if not, when a booth will become available. And third, you will need to limit the number of booths in use to be no more than the number of booths assigned to the precinct.
Your implementation of the election day simulator must use the voter sample and precinct data structures. It must not call the voter constructor or access the precinct priority queue directly.
Example¶
Let’s walk through an example. The file data/config-2.json
contains the following configuration:
num_voters
: 10hours_open
: 1arrival_rate
: 0.16666666666666666voting_duration_rate
: 0.1seed
: 1438018944
The precinct has 10 voters who arrive at the precinct once every 6 minutes on average and take 10 minutes on average to vote.
Let’s consider what happens when we simulate this precinct using two
voting booths. The first call to util.gen_voter_parameters
generates (3.29, 5.39)
as the gap and voting duration, which means
that our first voter arrives at 3.29, enters a voting booth
immediately at time 3.29, and finishes voting at 8.69 (3.29 + 5.39). (All times
have been rounded to two digits for clarity.)
The second call to util.gen_voter_parameters
yields (9.10,
5.74)
, which means that the second voter arrives at time 12.39 (3.29 + 9.10). She
starts voting immediately, because a booth is open. She finishes at
time 18.13 (12.39 + 5.74).
The third voter arrives at time 12.68 (gap: 0.29) and has a voting duration of 24.45 minutes. Since the first voter departed at 8.69, this voter can start voting immediately. He finishes voting at 37.13.
The fourth voter arrives at 15.78 (gap: 3.10) and has a voting duration of 6.08 minutes. Both booths are occupied when she arrives, so this voter must wait until a booth becomes free, which happens when the second voter finishes at time 18.13. Her start time will be 18.13 and her finish time will be 24.21. Notice that this voter finishes before the third voter.
The fifth voter arrives at time 19.76 (gap: 3.99) and will not start voting until the fourth voter finishes at time 24.21. Notice that the third voter is still voting when the fifth voter starts!
And so on.
All ten voters arrive before the polls close and so all ten get to vote even though the last two will not start voting until after the polls have closed at time 60.
Here’s the output of our simulation:
Arrival Time Voting Duration Start Time Departure Time
3.29 5.39 3.29 8.69
12.39 5.74 12.39 18.13
12.68 24.45 12.68 37.13
15.78 6.08 18.13 24.21
19.76 39.64 24.21 63.85
20.52 5.79 37.13 42.92
22.33 16.65 42.92 59.57
24.23 3.07 59.57 62.64
26.75 5.24 62.64 67.88
29.04 15.43 63.85 79.28
Testing¶
We have provided four sample configurations along with the output of
running our implementation on each configuration with one voting
booth. We have also included the example above (see
data/config-2-2-voters.txt
). These test cases should be used
merely a starting point; our grading code uses these with
configurations with different numbers of voting booths along with some
precinct configurations that describe corner cases.
Our function, util.print_voters
, takes a filename as an optional
parameter. If you print your voters to a file using this function can
compare your results with ours using the Linux diff
command. For
example, if you simulated data/config-0.json
with one voting booth
and saved the resulting voters in a file named
config-0-1-myresults.txt
, you could compare your results with ours
with the command:
diff config-0-1-myresults.txt data/config-0-1-voters.txt
Please note that diff
is silent when the two files are the same.
It will generate output that attempts to identify the differences when
the files are different. This output can be a bit cryptic. You may
want to try including the -y
flag or -u
flag to get output
from diff in an alternative format that might be find easier to
understand.
Putting it all together¶
We have provided a main function that parses the command-line
arguments (configuration file name and number of voting booths), calls
simulate_election_day
, and prints the resulting voters to standard
out.
Please note that if you run simulate.py
in ipython3
without
any arguments, it will generate a usage error message:
In [9]: run simulate-skeleton.py
usage: python3 simulate-skeleton.py <configuration filename> [number of voting booths]
You can safely ignore this message.
If your code does not generate any debugging output or sends its
debugging output to standard error (include file=sys.stderr
as an
argument in your calls to print
), then you can compare your
results to ours using a Linux pipe and diff
. For example,:
python3 simulate.py data/config-0.json | diff - data/config-0-1-voters.txt
Using dash (-
) as an argument to diff
indicates that the data
coming from the pipe should be used in place of data from a file.
Submission¶
See these submission instructions if you are working alone.
See these submission instructions if you are working in the same pair as for PA #2 or PA #3.
See these submission instructions if you are working in a NEW pair.