============
Dictionaries
============

Introduction
---------------

The objective of this lab is to give you practice using dictionaries, a
useful data type built into Python. A dictionary (``dict`` for short) is a
generalization of arrays/lists that associates keys with values. In computer
science, this data type is also referred to as an *associative array* or a
*map*.

You will also learn simple statements to read a file in Python. We will cover
file I/O extensively in future lectures.

By the end of the lab, you should be able to:

- Perform basic operations on dictionaries
- Apply dictionaries in several usage scenarios
- Read a file using Python

Getting started
---------------

.. include:: includes/getting-started-labs.txt

Once you have collected the lab materials, navigate to the ``lab4`` directory
and fire up ``ipython``.

Dictionaries
-------------

As a recap, when using arrays/lists we give an index and receive a value,
like this:

::

    names = ['Alice', 'Bob', 'Charlie']
    print(names[1])
    Bob

We might think about this list in the following way, i.e., a list associates
an index with a value:

::

    0: 'Alice'
    1: 'Bob'
    2: 'Charlie'

A dict generalizes what we can use as an index (left of the ``:``).
Dictionaries can use many different types of values as an index--not just
integers. Here is an example that associates items in a grocery store with
their prices:

::

    'Apple': .89
    'Banana': .39
    'Bread': 2.50

Each product (``'Apple'``, ``'Banana'``, ``'Bread'``) is associated with a
*value*: the price. ``'Apple'`` works like an index in an array. When using
dictionaries we refer to the index as the *key*.

We can specify this dict in Python with the following syntax:

::

    prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50}

which adheres to the following pattern:

::

    new_dict = {key1: value1, key2: value2, ... }

Note the use of curly braces and colons.
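
As a small sketch of the literal syntax above, the snippet below builds the
``prices`` dict from the text alongside two others. The names ``empty`` and
``mixed_keys`` and their contents are made up for illustration and are not
part of the lab materials.

```python
# The price dict from the text, written as a dict literal.
prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50}

# An empty dict: the same curly braces with nothing inside.
empty = {}

# Hypothetical example: keys do not have to be strings. Integers and
# tuples can also serve as keys, and values can be of any type.
mixed_keys = {1: 'one', (2, 3): 'a pair', 'four': 4.0}

print(type(prices))      # <class 'dict'>
print(len(prices))       # 3 key/value pairs
print(len(empty))        # 0
print(len(mixed_keys))   # 3
```

``len`` counts key/value pairs, so it is a quick way to confirm that a dict
holds what you expect after you build it.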

We get the value associated with a specific key with standard indexing
syntax:

::

    print(prices['Banana'])
    0.39

As we've described, dictionaries can use many different types for keys and
any type for values. For this dict, we have::

    Give -- string
    Get  -- float

Question: What if we try to print the price of an item that is not in the
dict? Try it yourself.

If you are unsure whether a key is present, it is good practice to test
whether the key exists in the dict before accessing its value, like this:

::

    if 'Chocolate' in prices:
        print(prices['Chocolate'])
    else:
        print("No Chocolate!")

We add new key/value pairs to the dict with standard indexing syntax.

::

    prices['Chocolate'] = 1.50

``prices`` now contains:

::

    'Apple': .89
    'Banana': .39
    'Bread': 2.50
    'Chocolate': 1.50

Looping over Dictionaries
-------------------------

::

    print(list(prices.keys()))
    ['Chocolate', 'Bread', 'Apple', 'Banana']

    print(list(prices.values()))
    [1.5, 2.5, 0.89, 0.39]

    print(list(prices.items()))
    [('Chocolate', 1.5), ('Bread', 2.5), ('Apple', 0.89), ('Banana', 0.39)]

We often want to perform an action on every element in a list. How do we do
this task with dictionaries?

The three methods ``.keys()``, ``.values()``, and ``.items()`` return
"list-like structures" (*dictionary view* objects, which we will cover in
more detail in future lectures) that contain the keys, values, and key/value
tuples respectively. In the example above, we cast them to lists explicitly.
The order shown here may differ from what you see: since Python 3.7, dicts
preserve insertion order, so a recent interpreter lists ``'Apple'`` first.

We can also loop over these views just as we have looped over lists in the
past. The following loop prints out a nicely formatted string displaying the
cost for each product:

::

    for product in prices.keys():
        price = '$' + str(prices[product])
        print(product + ' costs ' + price)

    Chocolate costs $1.5
    Bread costs $2.5
    Apple costs $0.89
    Banana costs $0.39

Looping over the keys of a dict is so common that we can omit ``.keys()``,
as in::

    for product in prices:
        ...

Reading Files
-------------

We often store data in files.
How do we open a file and read its contents into a Python variable? The
following command opens a file for reading.

::

    f = open(filename, 'r')

``f`` is a file object and ``filename`` is a string, such as
``'my_file.txt'``. We can read this file using the ``readlines()`` method,
which returns a list of strings in which each string is one line of the file.

::

    lines = f.readlines()
    print(lines)

``lines`` is a list of strings. We can now use for loops and string methods
to process this data. Once we are done, we need to close the file as follows:

::

    f.close()

Exercises
---------

The following exercises are designed to introduce some of the ways in which
dictionaries can be used. You will work with data contained in files.

Grocery Store
~~~~~~~~~~~~~

In this exercise we will write code to handle purchases in a grocery store.
You will create a dict that encodes the products that the store sells and
their corresponding prices.

Assume that each time a customer buys a set of items, the cash register
generates a new file. This file describes the type and quantity of each item
(e.g. two apples and one loaf of bread). We will read this file, compute the
total cost of the purchase ($4.28 at the example prices above) and write the
cost out to a second file. We presume that the second file will then be sent
to the cashier.

We will do this task in stages. Do the work described in a new file called
``grocery.py``.

Loading in the prices
^^^^^^^^^^^^^^^^^^^^^

In the example discussed above, we associated a price with each item in a
grocery store.

::

    'Apple': .89
    'Banana': .39
    'Bread': 2.50
    'Chocolate': 1.50

The file ``prices.txt`` contains a more thorough list. Perform the following
steps to create a dict out of this file.

1. Look at this file in a text editor and make sure you understand the format
   in which the data are stored.

2. Open this file for reading in Python.

3. Read the lines into a list of strings. Print them out to see what they
   look like.

4.
   Create an empty dict named ``prices`` (syntax for a new dict is
   ``dict_name = {}``). We need to convert this list of lines into a dict
   mapping the first element of each line to the float of the second element
   of each line. For example, the first line:

   ::

       'Apple .89\n'

   should cause us to add the following key/value pair to our dict. Note that
   ``.89`` is now a float and that Python will print a leading zero on floats
   less than one.

   ::

       prices['Apple'] = .89

5. Figure out how to do this transformation for a single line using
   ``s.split()`` and ``float(s)``, where ``s`` is a string. Try calling these
   functions in your IPython session on various strings to understand their
   behavior. Your code should look something like this.

   ::

       line = lines[0]
       # your code here. Make product and cost variables.
       # They should be like 'Apple' and .89 respectively
       prices[product] = cost

6. Wrap this code with a for loop so that we do this subtask for each line in
   the file.

7. Verify that you have a complete price list by printing out the cost of an
   orange.

8. Close the file once you're done.

Put this code in a function ``read_into_dict(filename)``, which takes the
name of the file to be read and returns the ``prices`` dict that we have just
constructed.

Reading in a single receipt
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The file ``receipt.txt`` contains the items and quantities from a single
sale. This file was produced by the cash register.

1. Read this file into a dict as you did for ``prices``. Note that you should
   still be able to use ``read_into_dict`` for this task, since
   ``prices.txt`` and ``receipt.txt`` use the same format.

2. We'll do this a few times (there are a few receipts). We want to compute
   the cost of each sale. Write a function ``calc_cost`` that takes in two
   dictionaries:

   1. prices - a dict mapping each product to the single unit price
   2. receipt - a dict mapping each product in a sale to the quantity

   and returns a float with the total sale cost.
   Write this function and test it on the three receipts in your ``lab4``
   directory.

Printing total cost
^^^^^^^^^^^^^^^^^^^

Having processed a receipt and calculated the total cost, your program should
print the total cost. For example:

::

    receipt3.txt
    Chocolate 4
    Coffee 1

    Output on screen as a result of a print statement:
    10.0

Write a function ``process_receipt`` that takes in a ``prices`` dict and the
name of a receipt file and does the following:

1. Computes the total price of the sale (use your previous work).
2. Prints the price so that you can examine the value.

Counting
~~~~~~~~

In the example above we used dictionaries to represent unchanging values. The
prices and quantities always stayed the same. Sometimes, we also need to
change values in a dict after they are initially defined.

In the following example we want to count the number of times each letter
grade was given for a quiz. Here are the results of the quiz::

    grades = ['B', 'C', 'A', 'A', 'D', 'B', 'B', 'A', 'C']

And the associated counts::

    counts = {'A': 3, 'B': 3, 'C': 2, 'D': 1}

Because the ``grades`` list is so short we can make the ``counts`` variable
by hand. For a class with more students, this task would be tedious. Your
first goal for this task is to write a function that takes in a list like
``grades`` and returns a dict like ``counts``. Do your work in a file called
``grades.py``.

First, write a function called ``count_grades`` that takes a list of grades
as an argument and returns a dict of grade counts. This task is the classic
histogram problem. The problem can be solved with the following steps:

1. Create an empty ``counts`` dict (the syntax is just ``dict_name = {}``).

2. For each item in grades do one of the following:

   a. If that grade is not in the ``counts`` dict then add that grade to the
      dict with a value of 1.
   b. If that grade is already in ``counts`` then add 1 to that grade's
      value.

You may check if a key is in a dict using the ``key in dict`` syntax.
For example::

    'Apples' in {'Apples': 1, 'Bananas': 4}
    True

    'Chocolate' in {'Apples': 1, 'Bananas': 4}
    False

Next, write a function ``process_grades`` that takes the name of a file,
reads in this file, and converts the information it contains into a list like
the ``grades`` variable above. Use ``count_grades`` to convert this list into
a dict that counts the number of times each grade occurs in the file.
Finally, ``process_grades`` should print out nicely formatted text that
follows the format below (your numbers, however, will be different). Use the
file ``grades.txt``, which contains letter grades from a recent quiz, to test
your functions.

::

    A -- 3
    B -- 3
    C -- 2
    D -- 1

Note that the grades do not have to appear in alphabetical order.

Other Data Types in Dictionaries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You know very well by now that dictionaries keep key/value pairs, but the
``value`` does not have to be a simple primitive such as an integer, a float,
or a string. It can be a more complex data type such as a tuple, a list, or
another dict. Similarly, we can use various data types for the ``key`` as
well.

Your task here is to write a function ``list_grades`` that:

#. Reads the file ``grades_names.txt``, which contains a list of student
   names together with their grades.

#. Generates a dict such that keys are letter grades and values are lists of
   student names associated with each grade.

   As in the Grocery Store exercise, you will need to parse each line in the
   file. You will also need to test whether a grade already exists in the
   dict. If the grade does not exist, add it as a new key with its value
   being a single-element list. Otherwise, simply append the name of the
   student to the corresponding list. Note that, if the same name appears
   multiple times for a given letter grade, you should append the name
   multiple times.

#. Prints out a nicely formatted dict like the one below (again, your exact
   output will be different).
   ::

       A -- ['Alice', 'Adam', 'Alex']
       B -- ['Bob', 'Beth', 'Bill']
       C -- ['Charlie', 'Cathy']
       D -- ['David']

Hint: you should be able to follow the structure of ``grades.py`` and reuse a
lot of your code.

A useful library function
-------------------------

Code like::

    if key not in d:
        d[key] = 0
    d[key] = d[key] + 1

occurs frequently. Python dictionaries have a built-in method, called
``get``, that simplifies this type of code. The expression ``d.get(key, 0)``
returns the value of ``d[key]`` if the key occurs in the dict, and zero if it
does not. The above example can be rewritten using ``get`` as follows::

    d[key] = d.get(key, 0) + 1

In general, code of the form::

    if key not in d:
        d[key] = some_default_value
    d[key] = some_transformation(d[key])

can be rewritten as::

    d[key] = some_transformation(d.get(key, some_default_value))

When finished
-------------

.. include:: includes/finished-labs-1.txt

.. code::

    git add grocery.py
    git add grades.py
    git commit -m "Finished with lab4"
    git push

.. include:: includes/finished-labs-2.txt