Dictionaries¶
Introduction¶
The objective of this lab is to give you practice using dictionaries, a useful data type built into Python.
A dictionary (dict for short) is a generalization of arrays/lists that associates keys with values. In computer science, this data type is also referred to as an associative array or a map.
You will also learn simple statements to read a file in Python. We will cover file I/O extensively in future lectures.
By the end of the lab, you should be able to:
- Perform basic operations on dictionaries
- Apply dictionaries in several usage scenarios
- Read a file using Python
Getting started¶
Once you have collected the lab materials, navigate to the lab4 directory and fire up ipython.
Dictionaries¶
As a recap, when using arrays/lists we give an index and receive a value, like this:
names = ['Alice', 'Bob', 'Charlie']
print(names[1])
Bob
We might think about this list in the following way, i.e., a list associates an index with a value:
0: 'Alice'
1: 'Bob'
2: 'Charlie'
A dict generalizes what we can use as an index (left of the :). Dictionaries can use many different types of values as an index, not just integers. Here is an example that associates items in a grocery store with their prices:
'Apple': .89
'Banana': .39
'Bread': 2.50
Each product ('Apple', 'Banana', 'Bread') is associated with a value: the price. 'Apple' works like an index in an array. When using dictionaries we refer to the index as the key. We can specify this dict in Python with the following syntax:
prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50}
which adheres to the following pattern:
new_dict = {key1: value1, key2: value2, ... }
Note the use of curly braces and colons.
We get the value associated with a specific key using standard indexing syntax:
print(prices['Banana'])
0.39
As we’ve described, dictionaries can use many different types for keys and any type for values. For this dict, we have:
Give -- string
Get -- float
Question: What if we try to print the price of an item that is not in the dict? Try it yourself. If you are unsure, it is always good practice to test whether a key exists in a dict before accessing its value, like this:
if 'Chocolate' in prices:
    print(prices['Chocolate'])
else:
    print("No Chocolate!")
We add new key/value pairs to the dict with standard indexing syntax.
prices['Chocolate'] = 1.50
prices now contains:
'Apple': .89
'Banana': .39
'Bread': 2.50
'Chocolate': 1.50
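The same indexing syntax also replaces the value of a key that already exists. The short sketch below illustrates both cases; the new Banana price is made up for illustration.

```python
# Example prices from the text above.
prices = {'Apple': 0.89, 'Banana': 0.39, 'Bread': 2.50}

prices['Chocolate'] = 1.50   # new key: adds a key/value pair
prices['Banana'] = 0.45      # existing key: replaces the old value (made-up price)

print(len(prices))           # 4
print(prices['Banana'])      # 0.45
```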
Looping over Dictionaries¶
print(list(prices.keys()))
['Apple', 'Banana', 'Bread', 'Chocolate']
print(list(prices.values()))
[0.89, 0.39, 2.5, 1.5]
print(list(prices.items()))
[('Apple', 0.89), ('Banana', 0.39), ('Bread', 2.5), ('Chocolate', 1.5)]
We often want to perform an action on every element in a list. How do we do this task with dictionaries? The three methods .keys(), .values(), and .items() return “list-like structures” (dictionary view objects, which we will cover in more detail in future lectures) that contain the keys, values, and key/value tuples respectively. In the example above, we convert them to lists explicitly.
We can also loop over these structures just as we have looped over lists in the past. The following loop prints a formatted string displaying the cost of each product:
for product in prices.keys():
    price = '$' + str(prices[product])
    print(product + ' costs ' + price)
Apple costs $0.89
Banana costs $0.39
Bread costs $2.5
Chocolate costs $1.5
Looping over the keys of a dict is so common that we can omit .keys(), as in:
for product in prices:
    ...
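When both the key and the value are needed, looping over .items() avoids the extra lookup. The following is a small sketch using the example prices from this lab; the variable name messages is only for illustration.

```python
# Example prices from the text above.
prices = {'Apple': 0.89, 'Banana': 0.39, 'Bread': 2.50, 'Chocolate': 1.50}

messages = []
# Each item is a (key, value) tuple, which we unpack into two variables.
for product, price in prices.items():
    messages.append(product + ' costs $' + str(price))

for message in messages:
    print(message)
```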
Reading Files¶
We often store data in files. How do we open a file and read its contents into a Python variable? The following command opens a file for reading.
f = open(filename, 'r')
f is a file object and filename is a string, such as 'my_file.txt'. We can read this file using the readlines() method, which returns a list of strings, where each string is one line of the file.
lines = f.readlines()
print(lines)
lines is a list of strings. We can now use for loops and string methods to process this data. Once we are done we need to close the file as follows:
f.close()
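You may also see files opened with a with statement, which closes the file automatically when the block ends. The sketch below is self-contained: it first writes a small throwaway file (the file name and contents are made up for illustration) and then reads it back.

```python
import os
import tempfile

# Work in a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'my_file.txt')

    # Write two made-up lines to the file.
    with open(path, 'w') as f:
        f.write('first line\nsecond line\n')

    # Read the lines back; the file is closed when this block ends.
    with open(path, 'r') as f:
        lines = f.readlines()

print(lines)      # each string keeps its trailing newline
print(f.closed)   # True
```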
Exercises¶
The following exercises are designed to introduce some of the ways in which dictionaries can be used. You will work with data contained in files.
Grocery Store¶
In this exercise we will write code to handle purchases in a grocery store. You will create a dict that encodes the products that the store sells and their corresponding prices. Assume that each time a customer buys a set of items, the cash register generates a new file. This file describes the type and quantity of each item (e.g. two apples and one loaf of bread). We will read this file, compute the total cost of the purchase ($4.28 at the example prices above) and write the cost out to a second file. We presume that the second file will then be sent to the cashier. We will do this task in stages.
Do the work described in a new file called grocery.py.
Loading in the prices¶
In the example discussed above, we associated a price with each item in a grocery store.
'Apple': .89
'Banana': .39
'Bread': 2.50
'Chocolate': 1.50
The file prices.txt contains a more thorough list. Perform the following steps to create a dict out of this file.
Look at this file in a text editor and make sure you understand the format in which the data are stored.
Open this file for reading in Python.
Read the lines into a list of strings. Print them out to see what they look like.
Create an empty dict named prices (the syntax for a new dict is dict_name = {}).
We need to convert this list of lines into a dict mapping the first element of each line to the float of the second element of each line. For example, the first line:
'Apple .89\n'
should cause us to add the following key/value pair to our dict. Note that .89 is now a float and that Python will print a leading zero on floats less than one.
prices['Apple'] = .89
Figure out how to do this transformation for a single line using s.split() and float(s), where s is a string. Try calling these functions in your IPython session on various strings to understand their behavior. Your code should look something like this:
line = lines[0]
# your code here. Make product and cost variables.
# They should be like 'Apple' and .89 respectively
prices[product] = cost
Wrap this code with a for loop so that we do this subtask for each line in the file.
Verify that you have a complete price list by printing out the cost of an orange.
Close the file once you’re done.
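To get a feel for the two string tools named in the steps above, here is a small sketch of what they return. The input string is made up and only mimics the "name value" format; it is not taken from prices.txt.

```python
# A made-up line in the same "name value" format.
s = 'Milk 3.25\n'

parts = s.split()        # splits on whitespace; the trailing newline is dropped
print(parts)             # ['Milk', '3.25']

value = float(parts[1])  # converts the string '3.25' to the float 3.25
print(value)             # 3.25

print(float('.89'))      # 0.89 (note the leading zero when printed)
```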
Put this code in a function read_into_dict(filename), which takes the name of the file to be read and returns the prices dict that we have just constructed.
Reading in a single receipt¶
The file receipt.txt contains the items and quantities from a single sale. This file was produced by the cash register.
- Read this file into a dict as you did for prices. Note that you should still be able to use read_into_dict for this task, since prices.txt and receipt.txt use the same format.
- We’ll do this a few times (there are a few receipts).
We want to compute the cost of each sale. Write a function calc_cost that takes in two dictionaries:
- prices - a dict mapping each product to the single unit price
- receipt - a dict mapping each product in a sale to the quantity
and returns a float with the total sale cost. Write this function and test it on the three receipts in your lab4 directory.
Printing total cost¶
Having processed a receipt and calculated the total cost, your program should print the total cost. For example:
receipt3.txt
Chocolate 4
Coffee 1
Output on screen as a result of a print statement:
10.0
Write a function process_receipt that takes in a prices dict and the name of a receipt file and does the following:
- Computes the total price of the sale (use your previous work).
- Prints the price so that you can examine the value.
Counting¶
In the example above we used dictionaries to represent unchanging values: the prices and quantities always stayed the same. Sometimes we also need to change values in a dict after they are initially defined. In the following example we want to count the number of times each letter grade was given for a quiz.
Here are the results of the quiz:
grades = ['B', 'C', 'A', 'A', 'D', 'B', 'B', 'A', 'C']
And the associated counts:
counts = {'A': 3, 'B': 3, 'C': 2, 'D': 1}
Because the grades list is so short we can make the counts variable by hand. For a class with more students, this task would be tedious. Your first goal for this task is to write a function that takes in a list like grades and returns a dict like counts.
Do your work in a file called grades.py.
First, write a function called count_grades that takes a list of grades as an argument and returns a dict of grade counts.
This task is the classic histogram problem. The problem can be solved with the following steps:
- Create an empty counts dict (the syntax is just dict_name = {}).
- For each item in grades do one of the following:
  - If that grade is not in the counts dict then add that grade to the dict with a value of 1.
  - If that grade is already in counts then add 1 to that grade’s value.
You may check if a key is in a dict using the key in dict syntax. For example:
'Apples' in {'Apples': 1, 'Bananas': 4}
True
'Chocolate' in {'Apples': 1, 'Bananas': 4}
False
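Note that the in operator tests keys, not values. The sketch below makes the difference visible; the variable name stock is only for illustration.

```python
stock = {'Apples': 1, 'Bananas': 4}

key_found = 'Apples' in stock          # True: 'Apples' is a key
value_as_key = 4 in stock              # False: 4 is a value, not a key
value_found = 4 in stock.values()      # True: search the values explicitly
missing = 'Chocolate' not in stock     # True: "not in" negates the test

print(key_found, value_as_key, value_found, missing)
```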
Next write a function process_grades that takes the name of a file, reads in this file, and converts the information it contains into a list like the grades variable above. Use count_grades to convert this list into a dict that counts the number of times each grade occurs in the file. Finally, process_grades should print out nicely formatted text that follows the format below (your numbers, however, will be different). Use the file grades.txt, which contains letter grades from a recent quiz, to test your functions.
A -- 3
B -- 3
C -- 2
D -- 1
Note that the grades do not have to appear in alphabetical order.
Other Data Types in Dictionaries¶
You know very well by now that dictionaries keep key/value pairs, but the value does not have to be a simple primitive such as an integer, a float, or a string. It can be a more complex data type such as a tuple, a list, or another dict. Similarly, we can use various data types for the key as well.
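As a small illustration, the sketch below uses tuples as keys and lists as values. The data (office hours and names) is entirely made up. It also shows one restriction: a list cannot be used as a key, because lists are mutable and therefore not hashable.

```python
# Hypothetical data: who attends office hours, keyed by (day, hour) tuples.
office_hours = {
    ('Mon', 10): ['Alice', 'Bob'],
    ('Wed', 14): ['Charlie'],
}

# The value is an ordinary list, so we can modify it in place.
office_hours[('Mon', 10)].append('Dana')
print(office_hours[('Mon', 10)])

# A list used as a key raises a TypeError.
list_key_failed = False
try:
    bad = {['Mon', 10]: ['Eve']}
except TypeError:
    list_key_failed = True
print(list_key_failed)
```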
Your task here is to write a function list_grades that:
- Reads the file grades_names.txt, which contains a list of student names together with their grades.
- Generates a dict such that keys are letter grades and values are lists of student names associated with each grade. As in the Grocery Store exercise, you will need to parse each line in the file. You will also need to test whether a grade already exists in the dict. If the grade does not exist, add it as a new key with its value being a single-element list. Otherwise, simply append the name of the student to the corresponding list. Note that, if the same name appears multiple times for a given letter grade, you should append the name multiple times.
- Prints out a nicely formatted dict like below (again, your exact output will be different).
A -- ['Alice', 'Adam', 'Alex']
B -- ['Bob', 'Beth', 'Bill']
C -- ['Charlie', 'Cathy']
D -- ['David']
Hint: you should be able to follow the structure of grades.py and reuse a lot of your code.
A useful library function¶
Code like:
if key not in d:
    d[key] = 0
d[key] = d[key] + 1
occurs frequently. Python dictionaries have a built-in method, called get, that simplifies this type of code. The expression d.get(key, 0) returns the value of d[key] if the key occurs in the dict, and zero if it does not. The above example can be rewritten using get as follows:
d[key] = d.get(key, 0) + 1
In general, code of the form:
if key not in d:
    d[key] = some_default_value
d[key] = some_transformation(d[key])
can be rewritten as:
d[key] = some_transformation(d.get(key, some_default_value))
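As a quick illustration of this pattern, the sketch below counts the letters in a made-up word. It also shows that get on its own only looks a value up; it never adds the key to the dict.

```python
letter_counts = {}
for letter in 'banana':
    # Start from 0 if the letter has not been seen yet, then add 1.
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

print(letter_counts)              # {'b': 1, 'a': 3, 'n': 2}

# get returns the default for a missing key without modifying the dict.
print(letter_counts.get('z', 0))  # 0
print('z' in letter_counts)       # False
```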
When finished¶
When finished with the lab please check in your work (assuming you are inside the lab directory):
git add grocery.py
git add grades.py
git commit -m "Finished with lab4"
git push
No, we’re not grading this, we just want to look for common errors.