============== Polling places ============== **Due: Friday, November 8, 4pm** *You may work alone or in a pair on this assignment.* .. image:: images/voting_booths.png :width: 3in :align: right Every two years, many people around the country wait in long lines to vote due, in part, to inadequate provisioning of voting booths. In this assignment you will write code to simulate the flow of voters through precincts. The information generated by your simulator could be used by an election official as a crude lower bound to help determine the number of voting booths needed at a precinct. We will view this problem as an instance of the more general problem of simulating an :math:`M`/:math:`M`/:math:`N` queue, where: - the first :math:`M` indicates that arrivals (voters arriving at the polls) obey a Markov process; - the second :math:`M` indicates that the service times obey a Markov process (the amount of time a voter takes to complete a ballot); and finally - the :math:`N` indicates that there are :math:`N` servers (the number of voting booths). We will tell you how to structure your implementation at a high level, but you will be responsible for the design details. Getting started --------------- See `these start-up instructions `__ if you intend to work alone. See `these start-up instructions `__ if you intend to work with the same partner as in a previous assignment. See `these start-up instructions `__ if you intend to work in a *NEW* pair. The ``pa4`` directory includes a file named ``simulate.py``, which you will modify, a file named ``util.py`` that contains a few useful functions, a directory named ``data``, which contains precint configuration files (described below), and a series of tests contained in files ``test_avg_wait_time.py``, ``test_simulate_election_day.py`` and ``test_find_number_of_booths.py``. Please put all of your code for this assignment in ``simulate.py``. Do not edit other files or add extra files for the required classes. Precinct Configuration ---------------------- The configuration for a precinct is specified using a dictionary containing this information: - The name of the precinct (``name``) - number of voters assigned to the precinct (``num_voters``), - number of hours the polls are open (``hours_open``), - The number of voting booths (``num_booths``) - The voter distribution (``voter_distribution``). For example:: {'name': 'Downtown', 'num_voters': 5, 'hours_open': 1, 'num_booths': 1, 'voter_distribution': {'type': 'poisson', 'arrival_rate': 0.11, 'voting_duration_rate': 0.1} Notice how the ``voter_distribution`` field contains another dictionary containing these fields: - The type of distribution (``type``). In this assignment, we will only deal with one type of distribution: ``poisson`` - The rate at which voters arrive (``arrival_rate``) and the rate at which voters complete their voting (``voting_duration_rate``). The significance of these rates is explained later. Information about precincts is stored in a JSON file that contains a dictionary with two keys: ``precincts``, containing a list of precinct configurations as specified above, and ``seed``, containing a random seed. Similar to PA1, the random seed will be used to ensure that a simulation involving randomness produces the same result (which will allow us to test whether your code produces the expected results). We recommend reviewing PA1 (and how it used random seeds) if you're unsure of how random seeds are used in the context of running a simulation. We have provided a function, ``load_precincts`` in ``util.py`` that takes the name of a JSON precincts file and returns two values: a list of precinct configurations (as specified above) and the random seed. This function is already called from the code we provide for you, so you should not make any calls to ``load_precincts`` in the code you'll add to ``simulate.py``. However, it can be a useful function when testing your code. In particular, you should take a moment to familiarize yourself with the data stored in these JSON files, and how to access each value once you've loaded the file with ``load_precincts``. We suggest you start by looking at ``data/config-single-precinct-0.json``, and loading that file from IPython:: In [1]: import util In [2]: precincts, seed = util.load_precincts("data/config-single-precinct-0.json") In [3]: precincts Out[3]: [{'hours_open': 1, 'name': 'Downtown', 'num_booths': 1, 'num_voters': 5, 'voter_distribution': {'arrival_rate': 0.11, 'type': 'poisson', 'voting_duration_rate': 0.1}}] In [4]: seed Out[4]: 1468604453 In [5]: len(precincts) Out[5]: 1 In [6]: p = precincts[0] In [7]: p["num_voters"] Out[7]: 5 In [8]: p["voter_distribution"]["arrival_rate"] Out[8]: 0.11 Representing Voters ------------------- In this assignment you will need to model voters using a ``Voter`` class. This class must include the time the voter arrives at the polls, voting duration, that is, the amount of time the voter takes to vote, and the time the voter is assigned to a voting booth (aka, the start time). For simplicity, we will model time as a floating point number, representing the number of minutes since the polls opened. So, if we say a voter arrived at the polls at time 60.5, that means they arrived an hour and thirty seconds after the polls opened. While not strictly necessary, we found it convenient for debugging purposes to store the time the voter leaves the voting booth (i.e., the departure time) as well. The start time for a voter, which is not available at construction time, should be initialized to ``None`` when the voter object is constructed. This value should be reset to the correct value when the voter is assigned to a voting booth. Similarly, voters' departure times cannot be determined until we know their start times. Your implementation **must** include at a minimum: a constructor and public attributes named ``arrival_time``, ``voting_duration``, and ``start_time``. Our utility function for printing voters, described below, relies on the existence of these attributes. This class is straightforward. It merely serves as a way to encapsulate information about a voter. Our implementation is roughly 10 lines. Please note, the precinct class, described next, is the only class that is allowed to call the ``Voter`` constructor. Precincts --------- Cities are broken up into voting precincts, and your implementation must include a ``Precinct`` class that represents a single precinct, and which will be responsible for simulating the behaviour of voters in that precinct. A precinct is open for some number of hours, and has some maximum number of voters who can vote in the precinct (you can think of this as the number of voters who are registered to vote in that precinct). We will assume that *all* registered voters will go to the polls on election day, but not all of them may get to vote, because some could arrive after the polls close. As we will explain later on, a precinct also has some number of voting booths (which can affect how quickly we can process voters) Notice how having a ``Precinct`` class allows us to have multiple ``Precinct`` objects in our program, each with its own set of parameters. This will make it easier for us to simulate the behavior of multiple precincts, as well as compare the effects of those parameters. For example, we could create two ``Precinct`` objects with the exact same parameters, but where one precinct has one voting booth and the other one has two voting booths. Simulating both can provide insights into how the number of voting booths can affect waiting times. We have already provided a skeleton for the ``Precinct`` class that includes the constructor and a ``simulate`` method to simulate a single voting day on that precinct. While you are allowed to add additional methods to this class, you *cannot* modify the parameter list of the constructor and the ``simulate`` method. Please note that we will consider each ``Precinct`` as completely independent from each other: voters cannot go from one precinct to another, and there is **no shared information between the precincts**. Generating Voters ----------------- One of the main responsibilities of the ``Precinct`` class is to generate the voters according to some random distribution (in our case, we will only be using a Poisson distribution), so let's take a closer look at how those voters will be generated. Both the arrival time and the voting duration of our voters will follow a Poisson process. From Wikipedia: a Poisson process, named after the French mathematician Simeon-Denis Poisson (1781-1840), is a stochastic process in which the time to the next event is independent of past events. We can use the following fact to generate a voter sample: if the number of arrivals in a given time interval [0,t] follows the Poisson distribution, with mean :math:`t*\lambda`, then the gap between arrivals times follows the exponential distribution with rate parameter :math:`\lambda`. This means that a ``Precinct`` object must keep track of time. We will assume that we keep track of time in minutes starting at time zero. To generate a voter, you will need to generate an arrival time and a voting duration. To generate the arrival time of the next voter, you will generate an inter-arrival time (:math:`gap`) from the exponential distribution using the specified arrival rate as :math:`\lambda` and add it to the current time (:math:`t`). You'll then move time forward by updating the time from :math:`t` to :math:`t + gap`. Notice how :math:`gap` represents the time that elapses between the arrival of the previous voter and the arrival of the next voter. For example, let's say we want to generate a voter in a precinct, and the random number generator returns ``5.7`` for that voter's gap. Since this is the first voter we generate, their arrival time will be ``5.7``, and we advance the precinct's current time from ``0`` to ``5.7``. We then generate another voter, and the random number generator returns ``2.4`` for the voter's gap. This means that this second voter arrives ``2.4`` minutes *after* the previous voter, so the second voter's arrival time will be ``8.1`` (``5.7 + 2.4``) and we advance the precinct's current time to ``8.1``. When we generate a voter, we also need to assign a voting duration to that voter. You will draw the voter's voting duration from the exponential distribution using the specified voting duration rate as :math:`\lambda`. For testing purposes, it is important that you draw the gap and the voting duration in the same order as our code. To reduce the likelihood that the testing process fails simply because the values were drawn in the wrong order, we have provided a function, ``gen_poisson_voter_parameters``, in ``util.py`` that takes the arrival rate and the voting duration rate as arguments and returns a pair of floats to use for the gap and voting duration (in that order). We also recommend that you include a ``next_voter`` method in your ``Precinct`` class. This method would return the next voter in that precinct, and can be useful both for testing and for keeping your simulation code clean. Task 0: Implement voter generation ---------------------------------- You will start by implementing the ``Voter`` class, as well as writing only the voter generation logic in the ``simulate`` method in ``Precinct``. In other words, instead of carrying out a simulation, you will just generate all the ``Voter`` objects for the voters who will be able to vote in the precinct, and will return a list of these objects. When doing so, you must generate voters until you reach one of the simulation's *stopping conditions*. In this case, we need to stop the simulation once there are no more voters who can vote. This can happen in two situations: (1) you have generated the maximum number of the voters specified for the precinct, or (2) you generate a voter who arrives after closing time. In the second case, the late voter should be discarded and no further voters should be generated. We will refer to voters who arrive before the polls close as *valid*. Doing this partial implementation of the ``simulate`` method will ensure that you've correctly implemented the voting generation logic we described earlier. The code for generating the voters in your ``Precinct`` should be fairly simple (our implementation is only 15 lines long, not including comments). Testing ~~~~~~~ To manually test your implementation, we have provided two sets of voter generation parameters. The first set, (see ``data/config-single-precinct-0.json``), includes the following: - number of voters in the precinct: 5 - hours open: 1 - arrival rate: 0.11 (corresponds to mean time between voters of approximately 9 minutes) - voting duration rate: 0.1 (corresponds to a mean voting duration of 10 minutes) - seed for random number generator: 1468604453 and corresponds to a voter population in which all the voters arrive before the polls close. .. admonition:: The random seed Remember that the precincts file includes a seed value (returned by ``load_precincts``). This seed must be passed to ``simulate`` (using the ``seed`` parameter) and you must use the ``random.seed(seed)`` function to set the seed *inside* ``simulate``. The seed only needs to be set once per simulation. Remember that PA1 includes a longer discussion of random seeds, and their significance in testing simulations. We have also provided a function named ``print_voters`` in ``util.py`` that takes a list of ``Voter`` instances and an optional file name. This function will either print a representation of the voters to screen or to a file, depending on whether a filename is specified (you likely won't have to write to a file yourselves, but it is something that we've found useful on occasion). You can use this function to print the return value of the ``simulate`` method, and check whether the returned values look correct:: In [1]: from simulate import Precinct In [2]: import util In [3]: precincts, seed = util.load_precincts("data/config-single-precinct-0.json") In [4]: p = precincts[0] In [5]: precinct = Precinct(p["name"], p["hours_open"], p["num_voters"], ...: p["voter_distribution"]["arrival_rate"], ...: p["voter_distribution"]["voting_duration_rate"]) In [6]: voters = precinct.simulate(seed=seed, num_booths=1) In [7]: util.print_voters(voters) Arrival Time Voting Duration Start Time Departure Time 6.78 2.11 None None 7.01 5.13 None None 25.27 2.68 None None 26.89 1.10 None None 33.48 10.08 None None Notice how the ``Start Time`` and the ``Departure`` will be initially ``None`` because you are not yet calculating the time when the voters start voting (they may need to wait in line once they arrive at the precinct, and we describe this in more detail in the next task) The second set, ``data/config-single-precinct-1.json``, is the same except the arrival rate is set to 0.04, which corresponds to a mean gap of 25 minutes:: In [8]: precincts, seed = util.load_precincts("data/config-single-precinct-1.json") In [9]: p = precincts[0] In [10]: precinct = Precinct(p["name"], p["hours_open"], p["num_voters"], ...: p["voter_distribution"]["arrival_rate"], ...: p["voter_distribution"]["voting_duration_rate"]) In [11]: voters = precinct.simulate(seed=seed, num_booths=1) In [12]: util.print_voters(voters) Arrival Time Voting Duration Start Time Departure Time 18.64 2.11 None None 19.28 5.13 None None Notice how, in the above set, only two voters get to vote (if you actually generated the third voter, it would have an arrival time of ``69.49``, which means they would arrive after the polls closed, since this precinct is only open for one hour). Tip: Instead of having to type in all the above code every time, you may want to write a short ``load_and_print`` function to help you Once you've done some manual testing, you can run additional automated tests with py.test:: $ py.test -xvk voter (Recall that we use ``$`` to indicate the Linux command-line prompt. You should not include it when you run this command.) Task 1: Simulate an election day -------------------------------- In this task, you will be simulating an election for a single precinct. You will do this by completing the implementation of the ``simulate`` method in ``Precinct`` and by implementing a ``VotingBooths`` class that models the voting booths in that precinct. Simulating a Precinct ~~~~~~~~~~~~~~~~~~~~~ The simulation itself will be similar to the M/D/1 queue example from class. In the M/D/1 example, we have a single server, such as a toll booth, that serves a queue of clients, such as cars, that arrive with a specified distribution and have a fixed service time. In this assignment, your simulation will need to simulate multiple servers (voting booths) and variable service times (voting durations), but the structure of your simulation will be similar. The biggest difference is that you will be implementing a ``VotingBooths`` class which models the "multiple servers" in the queue. This class will be responsible for determining whether a voting booth is available and, if not, when a booth will become available. This will be necessary to determine when a given voter will finish voting (since it depends on when they can access a booth), and to ensure that the number of booths in use to be no more than the number of booths assigned to the precinct. We discuss the ``VotingBooths`` class in more detail below. Example ~~~~~~~ Let's walk through an example. The file ``data/config-single-precinct-2.json`` contains the following precinct:: { "name": "Little Rodentia", "hours_open": 1, "num_voters": 10, "num_booths": 2, "voter_distribution": { "type": "poisson", "voting_duration_rate": 0.1, "arrival_rate": 0.16666666666666666 } } And the seed ``1438018944``. The precinct has 10 voters who arrive at the precinct once every 6 minutes on average and take 10 minutes on average to vote. As you work on your implementation, it can be useful to print out the voters returned by your ``simulate`` method, even if you have not yet filled in the start and departure times of the voters:: In [3]: precincts, seed = util.load_precincts("data/config-single-precinct-2.json") In [4]: p = precincts[0] In [5]: precinct = Precinct(p["name"], p["hours_open"], p["num_voters"], ...: p["voter_distribution"]["arrival_rate"], ...: p["voter_distribution"]["voting_duration_rate"]) In [6]: voters = precinct.simulate(seed=seed, num_booths=2) In [7]: util.print_voters(voters) Arrival Time Voting Duration Start Time Departure Time 3.29 5.39 None None 12.39 5.74 None None 12.68 24.45 None None 15.78 6.08 None None 19.76 39.64 None None 20.52 5.79 None None 22.33 16.65 None None 24.23 3.07 None None 26.75 5.24 None None 29.04 15.43 None None Let's consider what happens when we simulate this precinct using two voting booths. The first call to ``util.gen_poisson_voter_parameters`` generates ``(3.29, 5.39)`` as the gap and voting duration, which means that our first voter arrives at 3.29, enters a voting booth immediately at time 3.29, and finishes voting at 8.69 (3.29 + 5.39). (All times have been rounded to two digits for clarity.) The second call to ``util.gen_poisson_voter_parameters`` yields ``(9.10, 5.74)``, which means that the second voter arrives at time 12.39 (3.29 + 9.10). This second voter starts voting immediately, because a booth is open, and finishes voting at time 18.13 (12.39 + 5.74). The third voter arrives at time 12.68 (gap: 0.29) and has a voting duration of 24.45 minutes. Since the first voter departed at 8.69, this voter can start voting immediately, and finishes voting at 37.13. The fourth voter arrives at 15.78 (gap: 3.10) and has a voting duration of 6.08 minutes. Both booths are occupied when the fourth voter arrives, so this voter must wait until a booth becomes free, which happens when the second voter finishes at time 18.13. Thus, the fourth voter's start time will be 18.13 and their finish time will be 24.21. Notice that this voter finishes *before* the third voter. The fifth voter arrives at time 19.76 (gap: 3.99) and will not start voting until the fourth voter finishes at time 24.21. Notice that the third voter is still voting when the fifth voter starts! And so on. All ten voters arrive before the polls close and so all ten get to vote even though the last two will not start voting until after the polls have closed at time 60 (remember that we specify time in *minutes*, but the precinct dictionary specified the number of *hours*) Here's the result of our simulation (assuming the ``simulate`` method has been implemented correctly):: In [3]: precincts, seed = util.load_precincts("data/config-single-precinct-2.json") In [4]: p = precincts[0] In [5]: precinct = Precinct(p["name"], p["hours_open"], p["num_voters"], ...: p["voter_distribution"]["arrival_rate"], ...: p["voter_distribution"]["voting_duration_rate"]) In [6]: voters = precinct.simulate(seed=seed, num_booths=2) In [7]: util.print_voters(voters) Arrival Time Voting Duration Start Time Departure Time 3.29 5.39 3.29 8.69 12.39 5.74 12.39 18.13 12.68 24.45 12.68 37.13 15.78 6.08 18.13 24.21 19.76 39.64 24.21 63.85 20.52 5.79 37.13 42.92 22.33 16.65 42.92 59.57 24.23 3.07 59.57 62.64 26.75 5.24 62.64 67.88 29.04 15.43 63.85 79.28 Modelling voting booths within a precinct ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We will encapsulate the behavior of the voting booths using a ``VotingBooths`` class. In turns, this class will need a data structure to keep track of the voters who are currently occupying voting booths and to determine which voter will depart next. Your implementation should use the ``PriorityQueue`` class the from the python queue_ library for this purpose. A minimum priority queue is a data structure that includes operations for (1) adding items along with a *priority* to the queue (where the priority is just a number), (2) removing the item with the *lowest* priority value from the queue, and (3) determining whether the queue is empty. Removing the elements with the lowest priority value may seem counter-intuitive but, in Computer Science, a lower value usually denotes a higher priority (think of it as a ranking: if you're #1, your ranked above someone who is ranked #2). The ``PriorityQueue`` class also allows you to specify an upper bound on the number of items allowed to be in the queue at the same time and provides a method for determining whether the queue is full. Please look through the queue_ library for details on how to use the this class. For this assignment, the priorities will be the voters' departure times. .. admonition:: Blocking on ``get`` and ``put`` You should be careful with one specific aspect of priority queues: the ``get`` method is a *blocking* method, meaning that if you call ``get`` (to extract a value from the priority queue) and the queue has no values, the operation will *block* (or hang) until a value is available. In this assignment, you should never call ``get`` on an empty queue so, if your code seems to mysteriously hang, the first thing you should check is whether you're calling ``get`` on an empty queue somewhere in your code. Alternatively, you can call ``get`` like this:: my_queue.get(block=False) This way, ``get`` will raise an ``Empty`` exception if you try to extract a value from an empty queue (immediately alerting you to the fact that there may be an error somewhere in your code, since you should never be calling ``get`` on an empty queue in this assignment) Similarly, if you call ``put`` on a queue that is full, your code will also block. However, if you call ``put`` like this:: my_queue.put(value, block=False) Then putting a value into a full queue will raise a ``Full`` exception. Note: the ``PriorityQueue`` class does not have a "peek" function (to look at the element at the front of the queue without removing it), but you should be able to implement your ``VotingBooths`` class without relying on a "peek" function. Take into account that your implementation should not need a separate class for each individual voting booth. There could certainly be models that require modelling each voting booth as its own class but, in the model we are assuming, it is enough to have a ``VotingBooths`` class that uses a priority queue to model the state of all the voting booths. Furthermore, your ``Precinct`` code should not need to access the priority queue directly. Instead, you should implement methods in the ``VotingBooths`` class that encapsulate operations like adding a voter to a booth, or removing the next voter who will finish voting. Your ``Precinct`` code will then call those. An easy way to make sure that your ``Precinct`` doesn't directly access the priority queue is by making sure that the priority queue inside ``VotingBooths`` is a *private* attribute (by adding two underscores at the start of the attribute's name) .. admonition:: Don't queue voters! You may be tempted to create a queue of ``Voter`` objects. Don't! This won't be necessary, since we're already using a priority queue to model the voting booths. This allows us to efficiently answer the question "When will a voting booth become available in this precinct?". If we can easily answer that question, then we will know the voting start time of any new ``Voter``. Returning the voters ~~~~~~~~~~~~~~~~~~~~ Your ``simulate`` function must return a list of ``Voter`` objects with the ``start_time`` attribute filled in. Furthermore, for a given precinct, our tests expects the list of voters to be sorted by the voters' arrival time; since your ``Precinct`` should generate voters in the order in which they arrive, producing a list of voters sorted by arrival time should not require any additional steps to sort the list. If you do find that you need to sort the list explicitly, it is likely that your solution is headed in the wrong direction; ask on Piazza if that is the case. Testing ~~~~~~~ We have provided several automated tests that will test your implementation of the ``simulate`` method. The first tests you should run are the ones that simulate a single precinct (and, thus, create only a single ``Precinct`` object). You can run these tests like this:: $ py.test -xv -k single test_simulate_election_day.py However, remember that the first approach to testing and debugging your code should not rely exclusively on the tests. If you run the tests, and it is not immediately clear what the issue is, you should manually run your simulation code either from IPython (in the way we've been doing so far) or by using a command we describe below. These two approaches will generally produce error messages that may be easier to interpret and debug. If you run a simulation from IPython, and would like to manually verify whether the values are correct, each configuration file has a corresponding CSV file with the expected values (take into account that these are the files the automated tests use; if you find that your implementation seems to produce correct values, you may want to switch to running the automated tests). Once the single precinct tests are passing, you can then try the tests that simulate multiple precincts. Take into account that the precinct configuration files we have used up to this point contain only a single precinct. However, the file format allows for multiple precincts to be specified in the same file. For example, the ``config-multiple-precincts-1.json`` file specifies two identical precincts, except for the number of booths :: { "name": "Little Rodentia (1 booth)", "hours_open": 1, "num_voters": 10, "num_booths": 1, "voter_distribution": { "type": "poisson", "voting_duration_rate": 0.1, "arrival_rate": 0.16666666666666666 } } :: { "name": "Little Rodentia (2 booths)", "hours_open": 1, "num_voters": 10, "num_booths": 2, "voter_distribution": { "type": "poisson", "voting_duration_rate": 0.1, "arrival_rate": 0.16666666666666666 } } In most cases, if you are passing the single precinct tests, you should also pass the multiple precinct tests effortlessly. However, there are a few bugs that will only be caught if we create multiple ``Precinct`` objects. You can run the automated tests that use multiple precincts like this:: $ py.test -xv -k multiple test_simulate_election_day.py If you want to run all the simulation tests (with both single and multiple precincts), you can just run this:: $ py.test -xv -k day We also provide a main function in ``simulate.py`` that allows you to run ``simulate.py`` from the command-line. If you run ``simulate.py`` with just the name of a configuration file, it will print a summary of the simulation results. For example:: $ python3 simulate.py data/config-single-precinct-0.json PRECINCT 'Downtown' - 5 voters voted. - Polls closed at 60.0 and last voter departed at 43.56. - Avg wait time: 0.59 You can use the ``--print-voters`` options to, instead, print all the voters:: $ python3 simulate.py data/config-single-precinct-0.json --print-voters PRECINCT 'Downtown' Arrival Time Voting Duration Start Time Departure Time 6.78 2.11 6.78 8.89 7.01 5.13 8.89 14.02 25.27 2.68 25.27 27.94 26.89 1.10 27.94 29.04 33.48 10.08 33.48 43.56 And here is an example with multiple precincts:: $ python3 simulate.py data/config-multiple-precincts-1.json PRECINCT 'Little Rodentia (1 booth)' - 10 voters voted. - Polls closed at 60.0 and last voter departed at 134.48. - Avg wait time: 46.43 PRECINCT 'Little Rodentia (2 booths)' - 10 voters voted. - Polls closed at 60.0 and last voter departed at 79.28. - Avg wait time: 15.00 Notice how the results make sense intuitively: with more booths, it is possible to accommodate new voters sooner, which means waiting times will go down. Task 2: Finding the average waiting time of a precinct ------------------------------------------------------ At this point, we have the ability to simulate precincts under a variety of parameters. As we saw above, an interesting parameter to tweak is the number of booths, since it has a direct effect on the average waiting time of the voters. In this task, you will implement a function that, given a single precinct and a number of booths, computes the average waiting time when simulating that precinct with that number of booths:: def find_avg_wait_time(precinct, num_booths, ntrials, initial_seed=0): Where: - ``precinct`` is a precinct dictionary. You will use this to construct a ``Precinct`` object. - ``num_booths`` is the number of booths to use in the simulation. Notice that this will override whatever value is specified in the ``precinct`` dictionary. However **do not** modify the ``num_booths`` field of the ``precinct`` dictionary. - ``ntrials`` is a number of trials (this is explained below) - ``initial_seed`` is an initial seed (also explained below) Given a single simulation of a precinct, computing the average waiting time of the voters is not difficult. In fact, if you look at the code we provide(see the ``cmd`` function in ``simulate.py``), you'll find some code within it to do just that! However, as with any simulation, it is unreasonable to accept the result of a single simulation (or "trial"). So, instead you will simulate the precinct ``ntrials`` times, compute the average wait time for each trial, sort the resulting average wait times, and return the median value (which, for simplicity, we will define simply as element ``ntrials//2`` in the sorted list of average wait times). In this task, you will need to pass a different random seed to ``simulate`` in each trial. You will do so by passing ``initial_seed`` to the first trial, and then, before any subsequent trial, incrementing the seed by one. To test your implementation of the function, you can call the function from IPython. For example:: In [1]: %load_ext autoreload In [2]: %autoreload 2 In [3]: import util In [4]: import simulate In [5]: precincts, seed = util.load_precincts("data/config-single-precinct-3.json") In [6]: p = precincts[0] In [7]: simulate.find_avg_wait_time(p, num_booths=1, ntrials=20, initial_seed=seed) Out[7]: 2203.0502079782727 In [8]: simulate.find_avg_wait_time(p, num_booths=8, ntrials=20, initial_seed=seed) Out[8]: 4.630351652014112 You can also run the automated tests for this function like this:: $ py.test -xv -k avg Task 3: Finding the optimal number of booths -------------------------------------------- At this point, we have all the necessary pieces to answer the following question: "Given a target waiting time :math:`W`, how many booths does the precinct need to ensure that the average waiting time is less that :math:`W`?". You will write the following function to answer this question:: def find_number_of_booths(precinct, target_wait_time, max_num_booths, ntrials, seed=0): Where: - ``precinct`` is a precinct dictionary. You will use this to construct a ``Precinct`` object. - ``target_wait_time`` is :math:`W` as defined above. - ``max_num_booths`` is the maximum number of booths the precinct is willing to have. - ``ntrials`` is the number of trials to run when finding the average waiting time of a precinct. - ``seed`` is a random seed. Your function will do a simple linear search: you will first simulate the precinct with one booth, and check whether the average waiting time of the voters (as produced by ``find_avg_wait_time`` from Task 3) is strictly below ``target_wait_time``. If it isn't, you will simulate the precinct with two booths, and check the average waiting time, and so on up to ``max_num_booths`` or until you find a number of booths that achieves an average waiting time less than ``target_wait_time``. Take into account that it could also be the case that the provided ``target_wait_time`` is infeasible with a number of booths less than or equal to ``max_num_booths``. So, the result of this search with be a tuple with two values: the optimal number of booths, and the average waiting time with that number of booths. If the provided target waiting time is infeasible, return ``(0,None)``. To test your implementation of the function, you can call the function from IPython. For example:: In [1]: %load_ext autoreload In [2]: %autoreload 2 In [3]: import util In [4]: import simulate In [5]: precincts, seed = util.load_precincts("data/config-single-precinct-3.json") In [6]: p = precincts[0] In [7]: simulate.find_number_of_booths(p, ntrials=20, target_wait_time=10, max_num_booths=10, seed=seed) Out[7]: (8, 4.630351652014112) In [8]: simulate.find_number_of_booths(p, ntrials=20, target_wait_time=10, max_num_booths=5, seed=seed) Out[8]: (0, None) You can also run the function from the command-line by using the ``--target-wait-time`` and ``--max-num-booths`` options:: $ python3 simulate.py data/config-single-precinct-3.json --max-num-booths 10 --target-wait-time 10 Precinct 'Sahara Square' can achieve average waiting time of 4.63 with 8 booths $ python3 simulate.py data/config-single-precinct-3.json --max-num-booths 5 --target-wait-time 10 The target wait time (10.00) is infeasible in precinct 'Sahara Square' with 5 or less booths Note: The command-line tool sets the value of ``ntrials`` to ``20``. And, finally, you can run the automated tests for this function like this:: $ py.test -xv -k find Grading ------- Programming assignments will be graded according to a general rubric. Specifically, we will assign points for completeness, correctness, design, and style. (For more details on the categories, see our `PA Rubric page <../rubric.html>`__.) The exact weights for each category will vary from one assignment to another. For this assignment, the weights will be: * **Completeness:** 50% * **Correctness:** 10% * **Design:** 30% * **Style:** 10% Design is an important part of this assignment, because you have to design the ``Voter`` and ``VotingBooths`` classes. There are a few things we will be looking out for in your design: * ``Precinct`` is the only class that is allowed to construct ``Voter`` objects. Creating ``Voter`` objects anywhere else typically denotes a bad design. * This assignment does not require implementing a voting booth class (i.e., a class to model an individual voting booth). If you do, you are likely overthinking how to implement a precinct. * Your ``find_avg_wait_time`` function (Task 2), and ``find_number_of_booths`` function (Task 3) should be fairly short and straightforward, with absolutely no simulation logic in them (which should, instead, be in your ``Precinct`` class). If your classes are poorly designed, then these functions will tend to be long and convoluted. You must include header comments in all the methods you implement. You do not need to include a docstring for a class (but it certainly doesn't hurt to do so). Obtaining your test score ~~~~~~~~~~~~~~~~~~~~~~~~~ Like previous assignments, you can obtain your test score by running ``py.test`` followed by ``../common/grader.py``. Continuous Integration ---------------------- Continuous Integration (CI) is available for this assignment. For more details, please see our `Continuous Integration <../../ci/index.html>`__ page. We strongly encourage you to use CI in this and subsequent assignments. Submission ---------- See `these submission instructions `__ if you are working alone. See `these submission instructions `__ if you are working in the same pair as for PA #2 or PA #3. See `these submission instructions `__ if you are working in a *NEW* pair. .. _random: https://docs.python.org/3/library/random.html .. _queue: https://docs.python.org/3/library/queue.html .. |clearfloat| raw:: html