PyData London 2024

Adventures in not writing tests
06-16, 16:30–17:10 (Europe/London), Warwick

Developing reliable code without writing tests may be a far-off dream, but Hypothesis' ghostwriter function will generate tests from type hints. The resulting tests are powerful and often appropriate for data analysis. In this talk, I'll discuss how to add tests to your data analysis code that cover a wide range of inputs, all while using just a small amount of code.


Data analysis code is hard to test. After all, we analyze data in order to learn what we don't already know. However, once we've built our analysis, we may want to turn it into production code. Code in production has to be maintained, and that means we need tests to ensure that maintenance changes do not alter other behavior.

Hypothesis is a Python library for creating inputs that are good for exercising code. Hypothesis tests create many different inputs for a single test case, with a special concentration on inputs that are likely to break your code. This is especially good for data science code, where we don't know what input will be presented but still need to analyze it. While we cannot predict our test outputs any more than we can predict our real-world outputs, we can check for certain properties to confirm that the analysis was performed as expected.
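
As a rough illustration of checking properties rather than exact outputs, here is a minimal sketch of a Hypothesis test. The function `min_max_scale` and its test are hypothetical examples invented for this illustration and are not taken from the talk. The input bounds are also an assumption, chosen to keep the arithmetic finite.

```python
from hypothesis import given, strategies as st


def min_max_scale(values: list[float]) -> list[float]:
    """Hypothetical analysis step: rescale values linearly onto [0, 1].

    A constant input (no spread) maps to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothesis generates many non-empty lists of bounded, finite floats.
@given(st.lists(st.floats(min_value=-1e6, max_value=1e6), min_size=1))
def test_min_max_scale_properties(values):
    scaled = min_max_scale(values)
    # We cannot predict the exact output, but these properties must hold.
    assert len(scaled) == len(values)
    assert all(0.0 <= s <= 1.0 for s in scaled)
    if min(values) != max(values):
        assert min(scaled) == 0.0
        assert max(scaled) == 1.0
```

Under pytest the test is collected like any other test function. Calling `test_min_max_scale_properties()` directly also runs it against the generated inputs.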

Ghostwriting is a feature of Hypothesis that writes tests based on the type hints in your code. This not only saves time but also validates our type hints. The savings in time and toil can be significant, but the ghostwritten tests still need some additions to truly test our code. We'll look at what is needed both to generate proper inputs and to check our outputs.


Prior Knowledge Expected

No previous knowledge expected

Andy Fundinger is a senior engineer at Bloomberg, where he develops Python applications in the Data Gateway Platform team and supports Python developers throughout the firm through the company's Python Guild. Andy has spoken several times at PyGotham, as well as other conferences such as QCon, PyCaribbean, and EuroPython.

In the past, Andy has worked on private equity and credit risk applications, web services, and virtual worlds. Andy holds a master's degree in engineering from Stevens Institute of Technology.