Property Based Testing in Python
Posted on Fri 16 October 2020 in Python • 4 min read
Building software can be challenging, but maintaining it after the initial build can be even more so. Being able to test that the software behaves as expected is crucial to building robust applications that users depend on, and being able to automate that testing is even better! There are other posts on this blog on the topic of testing (see Introduction to Pytest & Pipenv), but this post focuses on one very specific type of testing: property based testing.
Property based testing differs from conventional example based testing in that it generates the test data that drives your tests and, even better, can help find the boundaries where the tests fail.
To demonstrate the power of property based testing, we're going to build some tests for the old faithful multiplication operator in Python.
To help with this, we are going to use a few packages:
- pytest (testing framework)
- hypothesis (property testing package)
- ipytest (to enable running tests in jupyter notebooks)
Before we dive in, let's set up ipytest and use some example-based testing to verify the multiplication operator.
import ipytest
import pytest
ipytest.autoconfig()
def multiply(number_1, number_2):
    return number_1 * number_2
%%run_pytest[clean]
def test_example():
    assert multiply(3, 3) == 9
    assert multiply(5, 5) == 25
    assert multiply(4, 6) == 24
Fantastic, our examples passed the test! Now let's make sure that a test fails when one of its expectations is wrong.
%%run_pytest[clean]
def test_fail_example():
    assert multiply(3, 3) == 9
    assert multiply(3, 5) == 150
Perfect! We can see that the test fails as expected, and the output even tells us which line of code it failed on. If we had lots of these examples to test, we could simplify things by using pytest's parametrize decorator.
%%run_pytest[clean]
@pytest.mark.parametrize('number_1, number_2, expected', [
    (3, 3, 9),
    (5, 5, 25),
    (4, 6, 24),
])
def test_multiply(number_1, number_2, expected):
    assert expected == multiply(number_1, number_2)
Is this enough testing to verify our function? We're only testing a few conditions that we already expect to work, when in reality it's the cases nobody foresees that we'd most like our tests to capture. It also raises another issue: whether the developer writes 2 or 2000 test cases, the number alone doesn't guarantee that the function's behaviour is truly covered.
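To make that concern concrete, here is a minimal sketch. The function `subtly_broken_multiply` is hypothetical and exists only for illustration: it is wrong for some inputs, yet it passes every hand-picked example used in the parametrized test.

```python
def subtly_broken_multiply(number_1, number_2):
    # Hypothetical buggy implementation, used only for illustration.
    if number_1 > 30 or number_2 < 0:
        return 0
    return number_1 * number_2


# The same hand-picked examples used in the parametrized test.
examples = [(3, 3, 9), (5, 5, 25), (4, 6, 24)]

# Every example passes, so an example-based test would report success...
all_examples_pass = all(
    subtly_broken_multiply(a, b) == expected for a, b, expected in examples
)

# ...even though an input nobody wrote down gives the wrong answer.
missed_case = subtly_broken_multiply(31, 2)  # a correct multiply gives 62
```

None of the chosen examples happens to cross either of the two conditions in the function, so the bug goes unnoticed no matter how many similar examples are added.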
Introduce Property Based Testing
Property based testing is a form of generative testing: we don't supply specific examples with inputs and expected outputs. Instead, we define certain properties and generate randomized inputs to check that the properties hold. In addition to this, property based testing can also shrink a failing input to find the exact boundary condition where a test fails.
While this doesn't 100% replace example-based testing, both approaches have their uses, and property based testing has a lot of potential for effective testing. Now let's implement the same tests as above, using property based testing with hypothesis.
from hypothesis import given
import hypothesis.strategies as st
%%run_pytest[clean]
@given(st.integers(), st.integers())
def test_multiply(number_1, number_2):
    assert multiply(number_1, number_2) == number_1 * number_2
Note that we've used the given decorator, which makes our test parametrized, along with strategies, which describe the types of input data to generate. As the hypothesis documentation puts it, "Most things should be easy to generate and everything should be possible". More information on strategies is available here: https://hypothesis.readthedocs.io/en/latest/data.html
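The test above checks multiply against the same * operator it wraps. As a rough sketch of what other "properties" could look like, the tests below state some well-known algebraic facts about integer multiplication without restating the implementation. The test names are illustrative choices and not part of the original post.

```python
from hypothesis import given
import hypothesis.strategies as st


def multiply(number_1, number_2):
    return number_1 * number_2


@given(st.integers(), st.integers())
def test_multiply_is_commutative(number_1, number_2):
    # Swapping the arguments must not change the result.
    assert multiply(number_1, number_2) == multiply(number_2, number_1)


@given(st.integers())
def test_one_is_the_identity(number):
    # Multiplying by 1 returns the original number.
    assert multiply(number, 1) == number


@given(st.integers())
def test_zero_absorbs(number):
    # Multiplying by 0 always gives 0.
    assert multiply(number, 0) == 0


@given(st.integers(), st.integers(), st.integers())
def test_distributes_over_addition(a, b, c):
    # a * (b + c) equals a * b + a * c for integers.
    assert multiply(a, b + c) == multiply(a, b) + multiply(a, c)
```

These properties hold for integers; a strategy such as st.floats() would need more care, because rounding and special values like NaN can break them.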
Now this doesn't look very different to last time, so what has actually changed? Let's change our multiply function so that it behaves strangely and watch hypothesis shrink the failures in action. Shrinking means that whenever hypothesis finds a failure, it tries to reduce it to the simplest boundary case to help us find the potential cause. Even better, it remembers the failure for next time so that it doesn't rear its head again!
def bad_multiply(number_1, number_2):
    if number_1 > 30:
        return 0
    if number_2 < 0:
        return 0
    return number_1 * number_2
%%run_pytest[clean]
@given(st.integers(), st.integers())
def test_bad_multiply(number_1, number_2):
    assert bad_multiply(number_1, number_2) == number_1 * number_2
Fantastic, we can see that the failure has been shrunk to number_1 being 31 and number_2 being 1. The first value is one integer past the 'bad' number_1 > 30 boundary we introduced into the multiply function, and the second is the simplest value that keeps the expected product non-zero.
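The shrunk failure above only exposes the number_1 boundary. As a small sketch of how each boundary could be probed separately, the snippet below uses hypothesis's find function, which searches a strategy for the simplest value satisfying a condition. The helper is_wrong and the choice to hold the other argument at 1 are assumptions made for this illustration.

```python
from hypothesis import find
import hypothesis.strategies as st


def bad_multiply(number_1, number_2):
    if number_1 > 30:
        return 0
    if number_2 < 0:
        return 0
    return number_1 * number_2


def is_wrong(number_1, number_2):
    # Illustrative helper: True when bad_multiply disagrees with real multiplication.
    return bad_multiply(number_1, number_2) != number_1 * number_2


# Hold number_2 at 1 and ask for the simplest number_1 that gives a wrong answer.
smallest_bad_number_1 = find(st.integers(), lambda n: is_wrong(n, 1))

# Hold number_1 at 1 and ask for the simplest number_2 that gives a wrong answer.
smallest_bad_number_2 = find(st.integers(), lambda n: is_wrong(1, n))

print(smallest_bad_number_1, smallest_bad_number_2)
```

With the shrinking behaviour described above, these should come out as 31 and -1, the first values on the wrong side of each condition in bad_multiply.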
Hopefully this has introduced the power of property based testing and can help make software more robust for everyone!