PyData London 2024

GPU Development in Python 101
06-14, 09:00–12:30 (Europe/London), Salisbury

Since joining NVIDIA I’ve gotten to grips with the fundamentals of writing accelerated code in Python. I was amazed to discover that I didn’t need to learn C++ and I didn’t need new development tools. Writing GPU code in Python is easier today than ever, and in this tutorial, I will share what I’ve learned and how you can get started with accelerating your code.


In this tutorial we will cover:

  • What is a GPU and why is it different to a CPU?
  • An overview of the CUDA development model.
  • Numba: A high performance compiler for Python.
  • Writing your first GPU code in Python.
  • Managing memory.
  • Understanding what your GPU is doing with pyNVML (memory usage, utilization, etc.).
  • RAPIDS: A suite of GPU accelerated data science libraries.
  • Working with pandas DataFrames on the GPU.
  • Working with NumPy-style arrays on the GPU.
  • Performing some scikit-learn style machine learning on the GPU.

Attendees will be expected to have a general knowledge of Python and programming concepts, but no GPU experience will be necessary. The key takeaway for attendees will be the knowledge that they don’t have to do much differently to get their code running on a GPU.


Prior Knowledge Expected

No previous knowledge expected

Jacob Tomlinson is a senior software engineer at NVIDIA. His work involves maintaining open source projects including RAPIDS and Dask. He also tinkers with kr8s in his spare time. He lives in Exeter, UK.

I lead CUDA Python Product Management, working closely with RAPIDS, Omniverse, and Math Libraries to unify NVIDIA's foundational offering for Python developers and the Python community.

I received my Ph.D. from the University of Chicago in 2010, where I built domain-specific languages to generate high-performance code for physics simulations with the PETSc and FEniCS projects. After spending a brief time as a research professor at the University of Texas and Texas Advanced Computing Center, I have been a serial startup executive, including a founding team member of Anaconda.

I am a leader in the Python open data science community (PyData). A contributor to Python's scientific computing stack since 2006, I am most notably a co-creator of the popular Dask distributed computing framework, the Conda package manager, and the SymPy symbolic computing library. I was a founder of the NumFOCUS foundation. At NumFOCUS, I served as the president and director, leading the development of programs supporting open-source projects such as pandas, NumPy, and Jupyter.