CS121 A'14 - Dictionaries

Introduction

The objective of this lab is to give you practice using dictionaries and files in Python.

Dictionary

A dictionary (dict for short) is a data type that associates keys with values. In computer science, this data type is also referred to as an associative array or a map. It is a generalization of arrays/lists.

When using arrays/lists we give an index and receive a value.

names = ['Alice', 'Bob', 'Charlie']
print names[1]
Bob

In the example above we gave the index 1 and received the value 'Bob'. We might think about this list in the following way:

0: 'Alice'
1: 'Bob'
2: 'Charlie'

A dictionary generalizes what we can use as an index (left of the :). Dictionaries can use many different types of values as an index--not just integers. Here is an example that associates items in a grocery store with their prices:

'Apple':   .89
'Banana':  .39
'Bread':  2.50

Each product 'Apple', 'Banana', 'Bread' is associated with a value: the price. 'Apple' works like an index in an array. When using dictionaries we refer to the index as the key. We can specify this dictionary in Python with the following syntax:

prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50}

which adheres to the following pattern:

newDict = {key1: value1,  key2: value2, ... }

Note the use of curly braces and colons.

We get the value associated with a specific key with standard indexing syntax:

print prices['Banana']
0.39

As we've described, dictionaries can use many different types for keys and any type for values. For this dictionary, we have:

Give -- string
Get  -- float

We add new key/value pairs to the dictionary with standard indexing syntax.

prices['Chocolate'] = 1.50

prices now contains:

'Apple':         .89
'Banana':        .39
'Bread':        2.50
'Chocolate':    1.50
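
Keys do not have to be strings. As a small sketch of the claim that many types can serve as keys, the snippet below uses string, integer, and tuple keys. The offices and board dictionaries are made up for illustration and are not part of the lab. print is written with parentheses so that the snippet runs under both Python 2 and Python 3.

```python
# String keys, float values (the grocery example).
prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50}
prices['Chocolate'] = 1.50

# Integer keys, string values: hypothetical office numbers and occupants.
offices = {101: 'Alice', 205: 'Bob'}

# Tuple keys, string values: hypothetical (row, column) board positions.
board = {(0, 0): 'X', (1, 2): 'O'}

print(prices['Chocolate'])   # 1.5
print(offices[205])          # Bob
print(board[(1, 2)])         # O
```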

Looping over Dictionaries

prices.keys()
['Chocolate', 'Bread', 'Apple', 'Banana']
prices.values()
[1.5, 2.5, 0.89, 0.39]
prices.items()
[('Chocolate', 1.5), ('Bread', 2.5), ('Apple', 0.89), ('Banana', 0.39)]

We often want to perform an action on every element in a list. How do we do this task with dictionaries? The three methods .keys(), .values(), and .items() return lists with the keys, values, and key/value tuples respectively. We can loop over these lists as we have looped over lists in the past.

The following loop prints out a nicely formatted string displaying the cost for each product:

for product in prices.keys():
    # '%.2f' formats the float with exactly two decimal places
    price = '$%.2f' % prices[product]
    print product + ' costs ' + price

Chocolate costs $1.50
Bread costs $2.50
Apple costs $0.89
Banana costs $0.39

Looping over the keys of a dictionary is so common that we can omit .keys(), as in:

for product in prices:
    ...
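
When both the key and the value are needed, looping over .items() avoids the extra lookup. The following is a minimal sketch rather than the only way to write it. sorted() is used only to make the output order predictable, because a Python 2 dictionary does not keep its keys in any particular order. The '%.2f' format prints a float with two decimal places.

```python
prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50, 'Chocolate': 1.50}

report = []
# Each item is a (key, value) tuple, unpacked here into two loop variables.
for product, price in sorted(prices.items()):
    report.append(product + ' costs $%.2f' % price)

for line in report:
    print(line)
```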

Reading Files

f = open(filename, 'r')         # open a file for reading
lines = f.readlines()           # get a list of lines of text from the file
f.close()                       # close the file

We often store data in files. How do we open a file and read its contents into a Python variable? The following command opens a file for reading.

f = open(filename, 'r')

f is an object of type file. We will use two of its methods: readlines and close. f.readlines() returns a list of strings. Each string is one line of the file, including the newline character at its end if there is one.

lines = f.readlines()

lines is a list of strings. We can now use for loops and string methods to process this data. Once we are done we need to close the file as follows:

f.close()
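
To see these steps end to end, here is a small self-contained sketch. It first creates a throwaway file (using write, which is covered in the next section) so that there is something to read. The file name example.txt and its contents are made up for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
filename = os.path.join(tempfile.mkdtemp(), 'example.txt')
f = open(filename, 'w')
f.write('first line\nsecond line\n')
f.close()

# Open, read every line into a list, and close.
f = open(filename, 'r')
lines = f.readlines()     # each string keeps its trailing '\n'
f.close()

cleaned = []
for line in lines:
    cleaned.append(line.strip())   # strip() removes the newline

print(lines)     # ['first line\n', 'second line\n']
print(cleaned)   # ['first line', 'second line']
```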

Writing Files

f = open(filename, 'w')         # open a file for writing
f.write('Hello, World!\n')      # write "Hello, World!" with a newline to the file
f.close()                       # save and close the file

We can open a file for writing in a similar way. The write method takes in a string and writes that string to the file. You can call write many times. Note that we have to add the newlines ("\n") manually.
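
The sketch below illustrates two details of write: consecutive calls continue on the same line until a newline is written, and numbers must be converted to strings first. The file name demo.txt is made up, and the file is placed in a temporary directory so that nothing in your lab directory is touched.

```python
import os
import tempfile

# Throwaway location; the name 'demo.txt' is made up for this example.
filename = os.path.join(tempfile.mkdtemp(), 'demo.txt')

f = open(filename, 'w')
f.write('Hello, ')           # no newline, so the next write continues this line
f.write('World!\n')          # the newline ends the first line
f.write(str(3) + '\n')       # write expects a string, so convert numbers
f.close()

# Read the file back to check what was written.
f = open(filename, 'r')
lines = f.readlines()
f.close()

print(lines)   # ['Hello, World!\n', '3\n']
```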

Exercises

The following exercises are designed to show off some of the ways in which dictionaries can be used. You will work with data contained in files, so you will get an opportunity to practice those skills as well.

Grocery Store

In this exercise we will write code to handle purchases in a grocery store. You will create a dict that encodes the products that the store sells and their corresponding prices. Assume that each time a customer buys a set of items, the cash register generates a new file. This file describes the type and quantity of each item (e.g. two apples and one loaf of bread). We will read this file, compute the total cost of the purchase (2 x .89 + 2.50 = $4.28 at the prices above) and write the cost out to a second file. We presume that the second file will then be sent to the cashier. We will do this task in stages.

Do the work described in a new file called grocery.py.

Loading in the prices

In the example above we associated a price with each item in a grocery store.

'Apple':       .89
'Banana':      .39
'Bread':      2.50
'Chocolate':  1.50

The file prices.txt contains a more thorough list. Perform the following steps to create a dictionary out of this file.

  1. Look at this file in a text editor and make sure you understand the format in which the data are stored.

  2. Open this file for reading in Python.

  3. Read the lines into a list of strings. Print them out to see what they look like.

  4. Create an empty dictionary named prices (syntax for a new dict is dict_name = {}).

    We need to convert this list of lines into a dictionary mapping the first element of each line to the float of the second element of each line. For example, the first line:

    'Apple .89\n'
    

    should cause us to add the following key/value pair to our dict. Note that .89 is now a float.

    prices['Apple'] = .89
    
  5. Figure out how to do this transformation for a single line using s.strip(), s.split() and float(s), where s is a string. Try calling these functions in your IPython session on various strings to understand their behavior. Your code should look something like this.

    line = lines[0]
    # your code here. Make product and cost variables.
    # They should be like 'Apple' and .89 respectively
    prices[product] = cost
    
  6. Wrap this code with a for loop so that we do this subtask for each line in the file.

  7. Verify that you have a complete price list by printing out the cost of an orange.

  8. Close the file once you're done.

Put this code in a function read_into_dict(filename), which takes the name of the file to be read and returns the prices dictionary that we have just constructed.
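
Step 5 relies on three string operations. The following sketch shows what each one returns, using a made-up line (Milk may or may not appear in prices.txt). It is meant only to show how the pieces behave, not to be the finished function.

```python
s = 'Milk 3.25\n'             # hypothetical line, for illustration only

stripped = s.strip()          # 'Milk 3.25'  (surrounding whitespace removed)
parts = stripped.split()      # ['Milk', '3.25']  (split on whitespace)
number = float(parts[1])      # 3.25  (a float, no longer a string)

print(parts)                  # ['Milk', '3.25']
print(number + 1)             # 4.25
```

Because number is a float, arithmetic works on it. The string '3.25' would not support number + 1.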

Reading in a single receipt

The file receipt.txt contains the items and quantities from a single sale. This file was produced by the cash register.

  1. Read this file into a dictionary as you did for prices. Note that you should still be able to use read_into_dict for this task, since prices.txt and receipt.txt use the same format.
  2. Repeat this step for each receipt file; there are three in your lab6 directory.

We want to compute the cost of each sale. Write a function calc_cost that takes in two dictionaries:

  1. prices - a dictionary mapping each product to the single unit price
  2. receipt - a dictionary mapping each product in a sale to the quantity

and returns a float with the total sale cost. Write this function and test it on the three receipts in your lab6 directory.
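
The heart of this computation is a loop that multiplies each quantity by the matching unit price and adds the result to a running total. Here is a rough sketch on hand-built dictionaries, using the two apples and one loaf of bread mentioned earlier. Wrapping it in a function and feeding it dictionaries read from files is left to you.

```python
prices = {'Apple': .89, 'Banana': .39, 'Bread': 2.50}
receipt = {'Apple': 2, 'Bread': 1}     # two apples and one loaf of bread

total = 0.0
for product in receipt:
    # unit price times quantity, added to the running total
    total = total + prices[product] * receipt[product]

print('%.2f' % total)    # 4.28
```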

Writing out a total cost

Having processed a receipt, and calculated the total cost, your program should write a file that contains the total cost. For example, say your program takes receipt3.txt as input and produces out-receipt3.txt as output. The two files might have the contents:

receipt3.txt

    Chocolate 4
    Coffee 1

out-receipt3.txt

    10.0

Write a function process_receipt that takes in a prices dict and the name of a receipt file and does the following:

  1. Computes the total price of the sale (use your previous work).
  2. Opens a file called out-[receiptfile] where [receiptfile] is the name of your input receipt file. This file should be opened for writing.
  3. Writes the total cost into this output file (you will need to convert the float to a string with the function str).
  4. Closes the output file.
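
Steps 2 through 4 might look roughly like the sketch below. The total is a stand-in value rather than a computed one, the trailing newline is an assumption about the desired output format, and the file is written to a temporary directory so that the sketch does not create files in your lab directory.

```python
import os
import tempfile

receipt_name = 'receipt3.txt'
out_name = 'out-' + receipt_name          # 'out-receipt3.txt'
total = 10.0                              # stand-in for a computed total

# Write into a temporary directory so the sketch does not touch lab files.
out_path = os.path.join(tempfile.mkdtemp(), out_name)
f = open(out_path, 'w')
f.write(str(total) + '\n')                # str converts the float to '10.0'
f.close()

# Read the file back to check what was written.
f = open(out_path, 'r')
written = f.readlines()
f.close()
print(written)    # ['10.0\n']
```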

Counting

In the example above we used dictionaries to hold values that never changed: once loaded, the prices and quantities stayed the same. Sometimes we also need to change values in a dictionary after they are initially defined. In the following example we want to count the number of times each letter grade was given for a quiz.

Here are the results of the quiz:

grades = ['B', 'C', 'A', 'A', 'D', 'B', 'B', 'A', 'C']

And the associated counts:

counts = {'A': 3, 'B': 3, 'C': 2, 'D': 1}

Because the grades list is so short we can make the counts variable by hand. For a class with more students, this task would be tedious. Your goal for this task is to make a function that takes in a list like grades and creates a dict like counts.

Do your work in a file called grades.py. You will ultimately produce a function called process_grades that takes the name of a file that lists grades and writes the counts for each grade to a different file.

This task is the classic histogram problem. The problem can be solved with the following steps:

  1. Create an empty counts dictionary (the syntax is just dict_name = {})
  2. For each item in grades do one of the following:
    1. If that grade is not in the counts dict then add that grade to the dict with a value of 1.
    2. If that grade is already in counts then add 1 to that grade's value.

You may check if a key is in a dictionary using the key in dict syntax. For example:

'Apples' in {'Apples': 1, 'Bananas': 4}
True
'Chocolate' in {'Apples': 1, 'Bananas': 4}
False
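
The same counting pattern works on any sequence. As an illustration on different data, the sketch below counts the letters in a word. The word is a stand-in, and adapting the loop to a list of grades is the exercise. It uses not in, which is the negation of the in check shown above.

```python
word = 'banana'        # stand-in data; the lab uses a list of grades instead

counts = {}
for letter in word:
    if letter not in counts:
        counts[letter] = 1                      # first time we see this letter
    else:
        counts[letter] = counts[letter] + 1     # seen before: add 1

print(counts['a'])   # 3
print(counts['n'])   # 2
```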

The file grades.txt contains letter grades from a recent quiz.

Your last task is to put these pieces together: write a function process_grades that takes the name of a file, reads in this file and converts the information it contains into a list like the grades variable above. Use your function to convert this list into a dictionary that counts the number of times each grade occurs in the file. Finally, open a file for writing and write out a nicely formatted text file that follows the format below (your numbers, however, will be different).

A -- 3
B -- 3
C -- 2
D -- 1

In the output file, the grades do not have to appear in alphabetical order.
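
One way to build the output lines is sketched below, starting from the counts dictionary shown earlier. Looping over sorted(counts) is optional, since the order does not matter. It is used here only so that the result is predictable. Each entry in lines is a string that could be passed to write.

```python
counts = {'A': 3, 'B': 3, 'C': 2, 'D': 1}

lines = []
for grade in sorted(counts):
    # str converts the integer count so it can be joined to the other strings
    lines.append(grade + ' -- ' + str(counts[grade]) + '\n')

for line in lines:
    print(line.strip())
```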

When Finished

When finished with the lab please check in your work (assuming you are inside the lab6 directory):

git add grocery.py
git add grades.py
git commit -m "Finished with lab6"
git push

No, we're not grading this, we just want to look for common errors.