Testing Your Code
Once you’ve written a piece of code, you will want to test that it works as expected. There are broadly two ways of doing this:
Manual testing: This just involves running the code you wrote with some sample values, to check whether it behaves as expected. For example, if you were writing an expression to compute whether a year is a leap year, you may try running your code in IPython with a few leap years and a few non-leap years, to see whether your code correctly identifies the leap years.
Automated testing: There are automated testing frameworks that allow you to specify a series of tests you want to run on your code, and which make it easy to automatically re-run all those tests. For example, following the leap year example, you wouldn’t have to manually test each year value one by one; instead, the testing framework would test all these year values for you, and would report back how many produced the expected result.
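As a rough illustration of the difference between the two approaches, the sketch below uses a hypothetical is_leap_year function (it is not part of the exercises). The first two calls mimic manual testing; the loop mimics, in a very simplified way, what a testing framework automates for you.

```python
def is_leap_year(year):
    """Return True if year is a leap year in the Gregorian calendar."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Manual testing: try a few values by hand and inspect each result.
print(is_leap_year(2020))  # a leap year
print(is_leap_year(2019))  # not a leap year

# Automated testing (simplified): check many cases and report a summary.
cases = [(2020, True), (2019, False), (1900, False), (2000, True)]
passed = sum(1 for year, expected in cases if is_leap_year(year) == expected)
print("{} of {} cases produced the expected result".format(passed, len(cases)))
```

A real framework such as pytest does much more (test discovery, detailed failure reports), but the core idea is the same: a list of inputs with expected results that can be re-checked at any time.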
All the exercises and programming assignments include a suite of automated tests that you can use to check whether your code is working correctly. Furthermore, these tests also factor into your score for an exercise or programming assignment.
Because the automated tests are easy to run (and affect your score for the assignment), you may be tempted to do the following: write some code, immediately try running the automated tests to find a test that fails, make a guess as to how to modify your code, and then repeat the process until all of the tests pass.
This is not a good way to test your code. Instead, you should start by doing some manual testing to get a sense of whether your code is working before you try the automated tests. On this page, we describe how to do this.
Manual testing
Let’s say you’re working on Exercises #1 and, specifically, on the add_one_and_multiply exercise. You can start by informally testing your expression in IPython. For example, you could do this:
In [1]: a = 5
In [2]: x = 2
In [3]: a + 1 * x
Out[3]: 7
That doesn’t seem quite right: the result should’ve been 12 (if we add 1 to 5, that gives us 6, which multiplied by 2 gives us 12). Looks like you need to experiment a bit more in IPython!
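The result is 7 because Python applies multiplication before addition, so a + 1 * x is evaluated as a + (1 * x). Here is a minimal sketch of the kind of experiment you might try next, written as a plain script rather than an IPython session; the variable names without_parens and with_parens are ours.

```python
a = 5
x = 2

# Multiplication binds more tightly than addition: this is a + (1 * x).
without_parens = a + 1 * x

# Parentheses force the addition to happen first.
with_parens = (a + 1) * x

print(without_parens, with_parens)
```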
Let’s say you’ve figured out what you believe is the correct expression, and you’ve added the code to the se1.py file. You should then try making some sample calls to the add_one_and_multiply function:
In [2]: se1.add_one_and_multiply(5, 2)
Out[2]: 12
In [3]: se1.add_one_and_multiply(7, 0)
Out[3]: 0
If you get the wrong answer for some sample input, stop to reason why your code is behaving the way it is and think about how to modify it to get the correct result.
In short exercises like the one above, it is sometimes enough to look at the code and figure out why it is not working. However, for more complex code, especially once you move on to the programming assignments, you will want to follow a more rigorous approach. You should make a hypothesis about what might be wrong and use print statements to print out key values to help you verify or disprove your hypothesis. You can also find a lot of tips on how to debug your code in The Debugging Guide.
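To make the hypothesis-and-print approach concrete, here is a small sketch built around a hypothetical function, sum_of_squares, that is not part of any exercise. Suppose a call with n = 3 returned 5 instead of the expected 14. One hypothesis is that the loop stops before reaching n; printing the loop variable confirms or refutes it.

```python
def sum_of_squares_buggy(n):
    """Intended to return 1**2 + 2**2 + ... + n**2 (contains a bug)."""
    total = 0
    for i in range(1, n):
        # Hypothesis: the loop never reaches i == n.
        # Print the loop variable and the running total to check.
        print("i =", i, "total so far =", total)
        total = total + i ** 2
    return total


def sum_of_squares(n):
    """Return 1**2 + 2**2 + ... + n**2."""
    total = 0
    for i in range(1, n + 1):  # fixed: include n itself
        total = total + i ** 2
    return total


print(sum_of_squares_buggy(3))  # the printed values of i stop at 2
print(sum_of_squares(3))
```

The printed values of i never include 3, which supports the hypothesis and points directly at the range call as the line to change.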
After you’ve done some manual testing and have a sense that your function works for at least a few simple inputs, you should try running the automated tests. The tests could reveal that there are still issues with your function; at that point, repeat the process described above. The important thing is that you always have an idea in your head about how your code works, and don’t make random changes that you don’t fully understand.
Automated testing
Now on to the automated tests.
We will be using the pytest Python testing framework for this and subsequent assignments. Pytest is available on the CS machines. To run our automated tests, you will use the py.test command from the Linux command line (not from within ipython3). We recommend opening a new terminal window for running this command, which will allow you to go back and forth easily between testing code by hand in ipython3 in one terminal window and running the test suite using py.test in the other. (When we work on assignments, we usually have three windows open: one for editing, one for experimenting in ipython3, and one for running the automated tests.)
For example, to run all the tests for add_one_and_multiply, you can run the following command from the Linux command line:
$ py.test -v -x -k add_one_and_multiply test_se1.py
(Recall that the $ represents the prompt and is not included in the command.)
Here is what each part of this command means:
py.test indicates that we want to run pytest.
test_se1.py is the name of the file that contains the testing code. (If you look at this file, you may find some of the syntax unfamiliar; this is okay for now.)
In between these, we specify three options:
The flag -v means run in verbose mode; this gives us a more detailed readout of the test results.
The flag -x means that pytest should stop running tests after a single test failure.
The option -k add_one_and_multiply restricts pytest to only running the tests for add_one_and_multiply. The way the -k option works is actually a bit more elaborate but, for now, you can assume that providing the name of the function you’re testing will run only the tests for that function.
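If the syntax in the test file looks unfamiliar, it may help to know that a pytest test is, at its core, a function whose name starts with test_ and that uses assert to compare an actual value against an expected one. The sketch below is not the contents of test_se1.py. The function body and the test names are stand-ins chosen for illustration, and the test values reuse the sample calls from the manual testing section.

```python
def add_one_and_multiply(a, x):
    """Stand-in for the function under test: add 1 to a, then multiply by x."""
    return (a + 1) * x


def test_add_one_and_multiply_simple():
    # pytest reports a failure if this assertion is false.
    assert add_one_and_multiply(5, 2) == 12


def test_add_one_and_multiply_zero():
    # Multiplying by zero should always give zero.
    assert add_one_and_multiply(7, 0) == 0
```

Saved in a file whose name starts with test_, functions like these would be found and run by the py.test command.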
Here is (slightly modified) output from using this command to test our reference implementation of add_one_and_multiply:
$ py.test -v -x -k add_one_and_multiply test_se1.py
============================= test session starts ==============================
platform linux -- Python 3.5.2, pytest-3.9.1, py-1.8.1, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
metadata: {'Python': '3.5.2', 'Plugins': {'html': '1.19.0', 'metadata': '1.7.0', 'timeout': '1.3.2', 'json': '0.4.0'}, 'Platform': 'Linux-4.15.0-91-generic-x86_64-with-Ubuntu-16.04-xenial', 'Packages': {'pluggy': '0.13.1', 'py': '1.8.1', 'pytest': '3.9.1'}}
rootdir: /home/username/cmsc12100-aut-20-username/se1, inifile: pytest.ini
plugins: html-1.19.0, metadata-1.7.0, json-0.4.0, timeout-1.3.2
collected 26 items / 20 deselected
::test_se1.py::test_add_one_and_multiply_1 PASSED [ 16%]
::test_se1.py::test_add_one_and_multiply_2 PASSED [ 33%]
::test_se1.py::test_add_one_and_multiply_3 PASSED [ 50%]
::test_se1.py::test_add_one_and_multiply_4 PASSED [ 66%]
::test_se1.py::test_add_one_and_multiply_5 PASSED [ 83%]
::test_se1.py::test_add_one_and_multiply_6 PASSED [100%]
- generated json report: /home/username/cmsc12100-aut-20-username/se1/tests.json -
=================== 6 passed, 20 deselected in 0.18 seconds ====================
This output shows that our code passed all six tests for add_one_and_multiply. It also shows that there were 20 tests that were deselected (that is, were not run) because they did not match the test selection criteria specified by the argument to -k.
If you fail a test, pytest will print out some information about what went wrong. For example, let’s say you specified the following expression, which we saw earlier was incorrect:
a + 1 * x
This would pass the first test, but would fail the second one:
$ py.test -v -x -k add_one_and_multiply test_se1.py
============================= test session starts ==============================
platform linux -- Python 3.5.2, pytest-3.9.1, py-1.8.1, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
metadata: {'Plugins': {'html': '1.19.0', 'metadata': '1.7.0', 'json': '0.4.0', 'timeout': '1.3.2'}, 'Python': '3.5.2', 'Packages': {'pytest': '3.9.1', 'py': '1.8.1', 'pluggy': '0.13.1'}, 'Platform': 'Linux-4.15.0-91-generic-x86_64-with-Ubuntu-16.04-xenial'}
rootdir: /home/username/cmsc12100-aut-20-username/se1, inifile: pytest.ini
plugins: html-1.19.0, metadata-1.7.0, json-0.4.0, timeout-1.3.2
collected 26 items / 20 deselected
::test_se1.py::test_add_one_and_multiply_1 PASSED [ 16%]
::test_se1.py::test_add_one_and_multiply_2 FAILED [ 33%]
=================================== FAILURES ===================================
_________________________ test_add_one_and_multiply_2 __________________________
    def test_add_one_and_multiply_2():
>       do_test_add_one_and_multiply(a=5, x=2, expected=12)
../dist/test_se1.py:17:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../dist/test_se1.py:185: in do_test_add_one_and_multiply
    check_equals(actual, expected, recreate_msg)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
actual = 7, expected = 12
recreate_msg = 'To recreate this test in ipython3 run:\n se1.add_one_and_multiply(5, 2)'
    def check_equals(actual, expected, recreate_msg=None):
        msg = "Actual ({}) and expected ({}) values do not match.".format(actual, expected)
        if recreate_msg is not None:
            msg += "\n" + recreate_msg
>       assert actual == expected, msg
E       AssertionError: Actual (7) and expected (12) values do not match.
E       To recreate this test in ipython3 run:
E        se1.add_one_and_multiply(5, 2)
E       assert 7 == 12
../dist/test_se1.py:158: AssertionError
- generated json report: /home/username/cmsc12100-aut-20-username/se1/tests.json -
============== 1 failed, 1 passed, 20 deselected in 0.12 seconds ===============
The volume of output can be a bit overwhelming. You should focus on the lines towards the end that start with E. These lines will usually contain a helpful message telling you why the test failed:
E AssertionError: Actual (7) and expected (12) values do not match.
E To recreate this test in ipython3 run:
E se1.add_one_and_multiply(5, 2)
This information can help you narrow down the issue with your code. This error message, in particular, tells you that, as in the manual testing example we saw earlier, the test code expected a return value of 12 but got a return value of 7. It also shows you how to run this test in ipython3. At this point, you should switch back to testing your function in ipython3 until you have fixed the problem.
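The E lines come from a helper in the test code, visible in the failure output, that builds a message and then asserts that the two values are equal. Here is that helper as it appears in the output, followed by a call with the values from the failed test. The try/except wrapper and the printing are our additions so that the snippet can run outside pytest.

```python
def check_equals(actual, expected, recreate_msg=None):
    # Build the message shown on the lines that start with E.
    msg = "Actual ({}) and expected ({}) values do not match.".format(actual, expected)
    if recreate_msg is not None:
        msg += "\n" + recreate_msg
    assert actual == expected, msg


recreate_msg = "To recreate this test in ipython3 run:\n se1.add_one_and_multiply(5, 2)"

try:
    # 7 is what the incorrect expression produced; 12 was expected.
    check_equals(7, 12, recreate_msg)
except AssertionError as err:
    failure_message = str(err)
    print(failure_message)
```

When the two values match, the helper returns without doing anything, which is why passing tests produce no E lines.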
A few more notes on the py.test command and its options:
By default, if you do not supply the name of a specific test file (such as test_se1.py), pytest will look in the current directory tree for Python files that have names that start with test_.
Because we specified the -x option, pytest exited as soon as the second test failed (without running the remaining tests). Omitting the -x option makes sense when you want to get a sense of which tests are passing and which ones aren’t; however, when debugging your code, you should always use the -x option so that you can focus on one error at a time.
If you don’t use the -k option, pytest will run any function that starts with test_. You can limit the tests that get run by using the -k option along with any string that uniquely identifies the desired tests. The string is not required to be a prefix. For example, if you specify -k add, pytest will run test functions that start with test_ and include the word add.
Pytest has many other options that we did not use here. You can see the rest of the options by running the command py.test -h.
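The effect of -k and -x can be pictured with a deliberately simplified model. This is not how pytest is implemented (its -k matching is more elaborate, as noted earlier); it only mirrors the behavior described above. The function names select_tests and run_until_first_failure are ours, and the two test_other_exercise names are hypothetical.

```python
def select_tests(test_names, keyword):
    """Simplified model of -k: keep the names that contain keyword."""
    return [name for name in test_names if keyword in name]


def run_until_first_failure(selected, failing):
    """Simplified model of -x: stop after the first failing test.

    Returns (name, outcome) pairs for the tests that actually ran.
    """
    results = []
    for name in selected:
        if name in failing:
            results.append((name, "FAILED"))
            break  # -x: do not run the remaining tests
        results.append((name, "PASSED"))
    return results


# The add_one_and_multiply names match the pytest output shown earlier;
# the other two names are hypothetical.
test_names = ["test_add_one_and_multiply_{}".format(i) for i in range(1, 7)]
test_names += ["test_other_exercise_1", "test_other_exercise_2"]

selected = select_tests(test_names, "add")
results = run_until_first_failure(selected, failing={"test_add_one_and_multiply_2"})

for name, outcome in results:
    print(name, outcome)
print("{} selected, {} deselected".format(len(selected), len(test_names) - len(selected)))
```

In this model, only two tests run before the run stops, which matches the failed run shown earlier: one test passed, one failed, and the remaining selected tests were never executed.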
In general, we will leave out the name of the file with the test code (test_se1.py), use short substrings to describe the desired tests, and combine the option flags (-v -x -k) into a single string (-xvk). For example, the tests for add_one_and_multiply can also be run with the following command:
$ py.test -xvk add
Obtaining your test score
Your score on the exercises, as well as the “Completeness” portion of the programming assignments, is determined by the automated tests. To get your score for the automated tests, simply run the following from the Linux command line. (Remember to leave out the $ prompt when you type the commands.)
$ py.test
$ ../common/grader.py
Notice that we’re running py.test without the -k or -x options: we want it to run all the tests. If you’re still failing some tests, and don’t want to see the output from all the failed tests, you can add the --tb=no option when running py.test:
$ py.test --tb=no
$ python3 ../common/grader.py
Keep in mind that the grader.py program looks at the results from the last time you ran py.test, so if you make any changes to your code, you need to re-run py.test before running the grader. You can also run py.test followed by the grader on one line:
$ py.test --tb=no; ../common/grader.py
If you run this inside your se1 directory, you should see something like this (of course, your actual scores may be different!):
Category                                 Passed / Total       Score / Points
----------------------------------------------------------------------------------------------------
Exercise 1                                    6 / 6           15.00 / 15.00
Exercise 2                                    4 / 4           15.00 / 15.00
Exercise 3                                    3 / 3           15.00 / 15.00
Exercise 4                                    5 / 5           15.00 / 15.00
Exercise 5                                    4 / 4           20.00 / 20.00
Exercise 6                                    4 / 4           20.00 / 20.00
----------------------------------------------------------------------------------------------------
                                                      TOTAL = 100.00 / 100
====================================================================================================
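To see how the columns relate to one another, here is a sketch that recomputes the total from the rows of the sample output. It assumes that each category's score is proportional to the fraction of its tests that pass. That proportional rule is our assumption for illustration; this page does not say how grader.py actually computes partial scores. The function name category_score is ours.

```python
def category_score(passed, total, points):
    """Score for one category, ASSUMING points are awarded in proportion
    to the fraction of tests passed (an assumption, not confirmed)."""
    return points * passed / total


# (category, tests passed, total tests, points), taken from the sample output.
rows = [
    ("Exercise 1", 6, 6, 15.0),
    ("Exercise 2", 4, 4, 15.0),
    ("Exercise 3", 3, 3, 15.0),
    ("Exercise 4", 5, 5, 15.0),
    ("Exercise 5", 4, 4, 20.0),
    ("Exercise 6", 4, 4, 20.0),
]

total_score = sum(category_score(p, t, pts) for _, p, t, pts in rows)
total_points = sum(pts for _, _, _, pts in rows)
print("TOTAL = {:.2f} / {:.0f}".format(total_score, total_points))
```

With every test passing, the assumed rule reproduces the total in the sample output. Always rely on the output of grader.py itself for your actual score.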