System Development with Python

Week 6 :: C Extensions

C extensions in Python

Today's topics

Motivations for exiting pure Python

Overall motivations:

C-API-level motivations:

Packaging your code

Before we start building extensions, a quick review of building and packaging Python modules with distutils

Packaging with Distutils

write a setup.py script

from distutils.core import setup

setup(
    name='add',
    version='1.0',
    description='Test description',
    ext_modules=[],
    py_modules=['add'],
)

Then build and install:

python setup.py build_ext [--inplace]
python setup.py install

Example code used today

First, the obligatory simple example we'll see in different parts of the slide deck


#include <stdio.h>

int add(int x, int y) {
    return x+y;
}

int main(void) {
    int w = 3;
    int q = 2;
    /* print both operands and their sum */
    printf("%d + %d = %d\n", w, q, add(w,q));
    return 0;
}

The C code

In examples/pure-c you'll find a Makefile containing:

all: add; gcc -o add add.c

Now compile it:

% make

And run it:

./add

 3 + 2 = 5

The Python C-API

Benefits:

Drawbacks: You will hurt yourself

Further reading:

Python Functions and New Python Types

We can loosely divide the things we want to build with the C-API into two categories:

  1. Creating Python functions
  2. Building new Python types

Create Python Functions in the C-API

Four things we'll need:

  1. the Python header file
  2. our C functions that mutate data
  3. a function-mapping struct that defines functions to export. Effectively says, "this module exposes these functions"
  4. an initializer function that creates the Python module and registers its functions ( think of it like main )

Create Python Functions in the C-API

Pull in the Python API to your C code via


#include <Python.h>
/*
Note: Since Python may define some pre-processor definitions which affect the standard headers on some systems, you must include Python.h before any standard headers are included.

stdio.h, string.h, errno.h, and stdlib.h are included for you.
*/

Registering your functions

First, register the name and address of your function in the method table


static PyMethodDef AddMethods[] = {
    {"add", add, METH_VARARGS, "add two numbers"},
    {NULL, NULL, 0, NULL}
};

That second record is a required Sentinel value

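What the heck does METH_VARARGS do? It selects the calling convention: the C function receives (self, args), where args is a tuple of the positional arguments, typically unpacked with PyArg_ParseTuple.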

Further Reading:

http://docs.python.org/2/c-api/structures.html#PyMethodDef

Method Initialization

Now you're ready to create the Python module that exposes your function


PyMODINIT_FUNC
initadd(void) 
{
    // module's initialization function
    // note: it is not called a second time if you use Python's reload()
    Py_InitModule3("add", AddMethods, "add method module docstring" );
}

Our add method, ready to import in Python

#include <Python.h>

static PyObject *
add(PyObject *self, PyObject *args)
{
    int x;
    int y;
    int sts;

    if (!PyArg_ParseTuple(args, "ii", &x, &y))
        return NULL;
    sts = x+y;
    return Py_BuildValue("i", sts);
}

// Module's method table and initialization function
// see: https://docs.python.org/2/extending/extending.html#the-module-s-method-table-and-initialization-function
static PyMethodDef AddMethods[] = {
    {"add", add, METH_VARARGS, "add two numbers"},
    {NULL, NULL, 0, NULL} // sentinel
};


PyMODINIT_FUNC
initadd(void) {
    // Module's initialization function
    // Note: it is not called a second time if you use Python's reload()
    Py_InitModule3("add", AddMethods, "add method module docstring" );
}

Build it

Now let's build our module with distutils

Simple compilation details are handled by distutils

python setup.py build_ext --inplace

Or to install into your virtualenv:

python setup.py install

Now you can "import add; add.add(2,4)" from your Python code

Try it now

Helper Functions for Unpacking Args

Python is dynamically typed, but C needs to know the type of every value at compile time

So function arguments must be parsed and unpacked to C types on the way in:

if (!PyArg_ParseTuple(args, "s", &var1, ...))
    return NULL;

http://docs.python.org/2/c-api/arg.html#PyArg_ParseTuple
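
One way to get a feel for format strings without recompiling is to call PyArg_ParseTuple from Python through ctypes.pythonapi. This is only an exploratory sketch, assuming CPython (written to run under Python 3) on a platform where ctypes can call variadic functions. The helper name parse_two_ints is made up for illustration; a real extension calls PyArg_ParseTuple from C.

```python
import ctypes

# ctypes.pythonapi exposes the C-API of the running CPython interpreter.
parse = ctypes.pythonapi.PyArg_ParseTuple
# Only the fixed (non-variadic) parameters are declared here.
parse.argtypes = [ctypes.py_object, ctypes.c_char_p]
parse.restype = ctypes.c_int


def parse_two_ints(args):
    """Unpack a 2-tuple into two C ints with the "ii" format (hypothetical helper)."""
    x = ctypes.c_int()
    y = ctypes.c_int()
    # On failure PyArg_ParseTuple sets the error indicator. pythonapi checks
    # that indicator after every call, so the failure surfaces as an exception.
    parse(args, b"ii", ctypes.byref(x), ctypes.byref(y))
    return x.value, y.value


ok = parse_two_ints((2, 4))

try:
    parse_two_ints(("two", 4))   # a str cannot be parsed with "i"
    failed_with = None
except TypeError as exc:
    failed_with = type(exc).__name__
```

Swapping "ii" for another format (and c_int for the matching ctypes type) is a quick way to check what a format code accepts before editing the C source.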

Helper Functions for Creating Python Objects

On the way out or internally we can call Py_BuildValue


PyObject* Py_BuildValue(const char *format, ...)

https://docs.python.org/2/c-api/arg.html#c.Py_BuildValue
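
The format strings for Py_BuildValue can be explored the same way, by calling it from Python through ctypes.pythonapi. This is a hedged sketch for experimentation only, assuming CPython (written to run under Python 3) on a platform where ctypes can call variadic functions; in an extension you would call Py_BuildValue from C.

```python
import ctypes

build = ctypes.pythonapi.Py_BuildValue
build.argtypes = [ctypes.c_char_p]   # only the fixed format argument is declared
build.restype = ctypes.py_object     # the object that was built comes back to Python

# Parentheses build a tuple, brackets a list, braces a dict.
as_tuple = build(b"(iii)", 2, 3, 5)
as_list = build(b"[ii]", 2, 3)
as_dict = build(b"{s:i}", b"sum", 5)
```

Here as_tuple is (2, 3, 5), the shape the exercise below asks add() to return.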

Exercise

Modify the C code paying special attention to the formatters used:

  1. Change "add" input and output values to floating point numbers. Check the format options for PyArg_ParseTuple. Compile. Verify that your changes work
  2. Look at the last few "items" formatters for Py_BuildValue. Return the terms and the result of the addition operation as one of these types. For example, "add(2,3)" might return "(2,3,5)".

More Examples with Sequence Types

Let's look at /examples/week-08/c-api/whirlext/

The usefulness of Py_BuildValue should be apparent. If we ever want to interact with "Python-like" structures we can dynamically create them.

These examples are from Ned Batchelder's old-but-still-relevant PyCon talk


static PyMethodDef
module_functions[] = {
   { "string_peek", string_peek, METH_VARARGS, "Pick a character from a string." },
   { "string_peek2", string_peek2, METH_VARARGS, "Safely pick a character from a string." },
   { "insert_powers1", insert_powers1, METH_VARARGS, "Insert a tuple of powers-of-n at index n" },
   { "insert_powers2", insert_powers2, METH_VARARGS, "Insert a tuple of powers-of-n at index n" },
   { NULL }
};

2 Minute Reference Counting Run Down

There are a lot of nuances and exceptions around owned versus borrowed references.
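
Reference counts can be watched from the Python side with sys.getrefcount, which helps build intuition before doing the bookkeeping by hand in C. This is a small sketch, assuming CPython (other interpreters may not provide getrefcount). The function name refcount_deltas is made up for illustration, and only differences between counts are reported because absolute values vary between versions.

```python
import sys


def refcount_deltas():
    """Return how the refcount of one object changes as references come and go."""
    payload = object()
    base = sys.getrefcount(payload)

    alias = payload                      # a second name: one more reference
    after_alias = sys.getrefcount(payload)

    holder = [payload]                   # a container slot: one more again
    after_list = sys.getrefcount(payload)

    del alias                            # dropping references lowers the count
    holder.clear()
    after_release = sys.getrefcount(payload)

    return (after_alias - base, after_list - base, after_release - base)


deltas = refcount_deltas()
```

In C each of those increments and decrements is your responsibility (Py_INCREF / Py_DECREF) whenever you own a reference; forgetting a decrement is how the memory leak examples on the next slide come about.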

Examples -- Memory Leaking

Go look at /examples/week-08/c-api/memleak

Exercise

Go to the dir /examples/week-08/c-api/whirlext/

Exception handling

Major errors in your C code won't magically turn into Python exceptions

You have to detect error conditions and call the proper functions

There is a global indicator (per thread) of the last error that occurred. Most functions don’t clear this on success, but will set it to indicate the cause of the error on failure.

Most functions also return an error indicator, usually NULL if they are supposed to return a pointer, or -1 if they return an integer (exception: the PyArg_*() functions return 1 for success and 0 for failure)

The easy way to set this indicator is with PyErr_SetString

http://docs.python.org/2/c-api/exceptions.html
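
The error indicator can be observed from Python by calling PyErr_SetString through ctypes.pythonapi. This is an exploratory sketch, assuming CPython (written to run under Python 3). pythonapi checks the indicator after each call and raises whatever was set, which is essentially what the interpreter does when your C function sets an error and returns NULL.

```python
import ctypes

set_error = ctypes.pythonapi.PyErr_SetString
set_error.argtypes = [ctypes.py_object, ctypes.c_char_p]
set_error.restype = None

try:
    # C equivalent:
    #     PyErr_SetString(PyExc_ZeroDivisionError, "denominator must not be zero");
    #     return NULL;
    set_error(ZeroDivisionError, b"denominator must not be zero")
    message = None
except ZeroDivisionError as exc:
    message = str(exc)
```

In an extension the two steps stay separate: set the indicator, then return the error value (NULL or -1) so the caller knows to look at it.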

Exercise

Find the divide module in the examples/week-08/c-api/divide/ directory

SWIG

Simplified Wrapper and Interface Generator

A language-agnostic tool for integrating C/C++ code with high-level languages

Advantages

Language interfaces

Further reading

SWIGifying add()

SWIG doesn't require modification to your C source code

The language interface is defined by an "interface file", usually with a suffix of .i

From there, SWIG can generate interfaces for the languages it supports

The interface file contains ANSI C prototypes and variable declarations

The %module directive defines the name of the module that will be created by SWIG

To create a SWIG wrapper:

run it!

python -c 'import add;print add.add(4,5)'

http://www.swig.org/Doc2.0/SWIGDocumentation.html#Introduction_nn5

SWIGifying add(), not just for Python

SWIG will create interfaces for all supported languages

Further reading

ctypes

A foreign function interface in Python

Binds functions in shared libraries to Python functions

Benefits

Drawbacks

Importing Dynamic Shared Libraries

Importing dynamic shared libraries is different on Windows and Unix systems, see https://docs.python.org/2/library/ctypes.html#loading-dynamic-link-libraries

from ctypes import *
add = cdll.LoadLibrary("./add.so")  # a path containing a slash is loaded as a file, not searched for in the system library path
print add.add(3,4)
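
Since add.so may not be built yet, the same loading-and-calling pattern can be tried against the C library that is already loaded into the Python process. This is a minimal sketch, assuming a POSIX system (Linux or macOS), where ctypes.CDLL(None) gives access to the symbols of the running process; it does not work this way on Windows. It is written to run under Python 3.

```python
import ctypes

# POSIX only: a handle on the symbols already loaded into this process,
# which includes the standard C library.
libc = ctypes.CDLL(None)

# Declaring argument and return types lets ctypes check and convert values.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")
```

Without the argtypes and restype declarations ctypes assumes int-sized arguments and an int return value, which happens to work for add(int, int) but not in general.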

Further reading

Calling functions with ctypes

None, integers, longs, byte strings and unicode strings are the only native Python objects that can directly be used as parameters in these function calls.

The rest must be wrapped in a ctypes data type

For instance, floats can be wrapped in c_double() before handing off to ctypes

libc = cdll.LoadLibrary("libc.so.6")  # Linux name; other platforms differ
printf = libc.printf
printf("An int %d, a double %f\n", 1234, c_double(3.14))

You can allow your own classes to be passed to ctypes via the _as_parameter_ instance variable, as long as they can be resolved to an integer or string.

class MyObject(object):
    def __init__(self, number):
        self._as_parameter_ = number

obj = MyObject(32)
printf("object value: %d\n", obj)

http://docs.python.org/2/library/ctypes.html#fundamental-data-types
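
The two ideas above (wrapping floats in c_double, and _as_parameter_) can be checked without watching the terminal by formatting into a buffer with snprintf instead of printf. This is a sketch under assumptions: a POSIX system where ctypes.CDLL(None) exposes the C library, Python 3, and a helper named format_with_c that is made up for illustration.

```python
import ctypes

libc = ctypes.CDLL(None)  # POSIX only: symbols of the running process
# Declare just the fixed parameters; the rest are variadic.
libc.snprintf.argtypes = [ctypes.c_char_p, ctypes.c_size_t, ctypes.c_char_p]
libc.snprintf.restype = ctypes.c_int


class MyObject(object):
    """Any object with _as_parameter_ set to an int or string can be passed."""
    def __init__(self, number):
        self._as_parameter_ = number


def format_with_c(fmt, *args):
    """Format into a fixed-size buffer using the C library (hypothetical helper)."""
    buf = ctypes.create_string_buffer(128)
    libc.snprintf(buf, len(buf), fmt, *args)
    return buf.value


# A bare Python float would be rejected, so it is wrapped in c_double.
line = format_with_c(b"An int %d, a double %f", 1234, ctypes.c_double(3.14))
obj_line = format_with_c(b"object value: %d", MyObject(32))
```

Passing 3.14 unwrapped raises ctypes.ArgumentError, because ctypes cannot guess which C type a Python float should become.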

ctypes

Passing Python objects into C functions

If a function expects a pointer, wrap a ctypes instance (not a bare Python object) in byref(x)

a_lib.a_function(ctypes.byref(ctypes.c_float(x)))

http://docs.python.org/2/library/ctypes.html#passing-pointers-or-passing-parameters-by-reference
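
A concrete case of byref is a C function with an output parameter. The sketch below uses strtol from the C library, which stores a pointer to the first unparsed character through its second argument. It assumes a POSIX system where ctypes.CDLL(None) exposes the C library, and is written to run under Python 3.

```python
import ctypes

libc = ctypes.CDLL(None)  # POSIX only: symbols of the running process

# long strtol(const char *text, char **end, int base);
libc.strtol.argtypes = [ctypes.c_char_p,
                        ctypes.POINTER(ctypes.c_char_p),
                        ctypes.c_int]
libc.strtol.restype = ctypes.c_long

text = b"42 apples"        # kept in a variable so the buffer stays alive
rest = ctypes.c_char_p()   # strtol fills this in through the pointer
value = libc.strtol(text, ctypes.byref(rest), 10)
remainder = rest.value
```

byref is cheaper than pointer() because it does not build a full pointer object; it is only usable as a function argument, which is all that is needed here.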

For callback functions, use a factory that returns function prototypes:

ctypes.CFUNCTYPE(restype, *argtypes, use_errno=False, use_last_error=False)

See examples/ctypes/pointers.py and examples/ctypes/ctypes_test.py

http://docs.python.org/2/library/ctypes.html#ctypes.CFUNCTYPE
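
As an illustration of the CFUNCTYPE factory, here is a sketch that hands a Python comparison function to the C library's qsort, along the lines of the example in the ctypes documentation. It assumes a POSIX system where ctypes.CDLL(None) exposes the C library, and is written to run under Python 3; py_compare is a name chosen for this example.

```python
import ctypes

libc = ctypes.CDLL(None)  # POSIX only: symbols of the running process

# Prototype of the callback: int compare(const int *a, const int *b)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))


def py_compare(a, b):
    # a and b arrive as pointers to int; index 0 dereferences them.
    return (a[0] > b[0]) - (a[0] < b[0])


# Keep a reference to the wrapped callback for as long as C might call it.
compare = CMPFUNC(py_compare)

libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), compare)
sorted_values = list(values)
```

If the CMPFUNC object is garbage collected while the C side still holds the function pointer, the next callback will crash the process, so it is worth storing it somewhere with a clear lifetime.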

ctypes

You can define C structs by subclassing ctypes.Structure:

class POINT(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int),
                ("y", ctypes.c_int)]

point = POINT(10, 20)
print point.x, point.y
point = POINT(y=5)
print point.x, point.y

ctypes summary

ctypes allows you to call shared libraries:

Supports almost all of C:

Upside:

Downsides:

Cython

Cython code is Python with a few extra keywords

Allows definition of static types

Cython compiles down to Python extensions written in C

To type a variable, just add the cdef keyword:

def add(int x, int y):
    cdef int result=0
    result = x + y
    return result

The allowed types are defined here

Further reading

Developing with Cython

First, install Cython with "pip install cython"

Cython files end in the .pyx extension

Once your .pyx file is created, it is converted to C via

cython cy_add.pyx

Generate "annotated" C code in HTML

cython -a cy_add.pyx

Building Cython extensions with distutils

Building your Python extension with distutils is similar to before, but use 'cythonize'

from distutils.core import setup
from Cython.Build import cythonize

setup(name = "cython_example",
      ext_modules = cythonize(['cy_add1.pyx',])
   )

Then you're ready to build:

python setup.py build_ext [--inplace]

See examples/cython/setup.py

Adding types

cdef int i
cdef double dx

Typing everything in sight will not necessarily improve performance. It may even harm it, as there may be unnecessary type checks or conversions

Cython functions

Cython functions can be declared two ways:

Calling external functions with Cython

You can tell Cython about external functions you want to call with 'cdef extern':

# distutils: sources = add.c
# This tells cythonize that you need that c file.

# telling cython what the function we want to call looks like.
cdef extern from "add.h":
    # pull in C add function, renaming to c_add for Cython
    int c_add "add" (int x, int y)

def add(x, y):
    # now that cython knows about it -- we can just call it.
    return c_add(x, y)

Cython can compile pure Python code to C to provide a performance improvement

Consider a more expensive numerical integration function

Numerical Integration

def f(x):
    return x**2

def integrate(f, a, b, N):
    s = 0
    dx = float(b-a)/N  # float() avoids integer division under Python 2
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

For example, integrating this function from 0 to 10 results in 333.333...

This is a good candidate for Cython – an essentially static function called a lot.

http://www.wolframalpha.com/input/?i=integrate+x**2+from+0+to+10
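
Before reaching for Cython it helps to have a pure-Python baseline to compare against. The sketch below repeats the integrate function from this slide, checks it against the expected 333.333..., and times it with timeit. The values of N and the repeat count are arbitrary choices for illustration, and timings will differ from machine to machine.

```python
import timeit


def f(x):
    return x ** 2


def integrate(f, a, b, N):
    """Left Riemann sum of f over [a, b] using N slices."""
    s = 0.0
    dx = float(b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


N = 100000  # arbitrary; more slices means a closer answer and a longer run
result = integrate(f, 0, 10, N)

# Time a few runs to get a baseline for later comparison with a Cython build.
seconds = timeit.timeit(lambda: integrate(f, 0, 10, N), number=5)
print("result = %.4f, 5 runs took %.3f s" % (result, seconds))
```

Running the same timing against the compiled module is what shows whether adding static types actually paid off.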

Improvements with static typing

Find the integrate code in examples/cython/integrate

Build it as usual and test with integrate_main.py

Even more ways to work in C

There are several other ways to work with C code. We'll say a passing hello to them.

Pyrex

http://wiki.python.org/moin/Pyrex

Superseded by Cython

XDress

Cython-based, NumPy-aware automatic wrapper generation for C / C++

Currently, xdress may generate Python bindings (via Cython) for C++ classes and functions and in-memory wrappers for C++ standard library containers (sets, vectors, maps). In the future, other tools and bindings will be supported.

SIP

http://wiki.python.org/moin/SIP

Boost.Python

http://www.boost.org/doc/libs/1_41_0/libs/python/doc/index.html

A C++ library which interfaces Python and C++

Wraps C++ functions in BOOST wrappers, compiled with your regular C++ compiler

shedskin

https://code.google.com/p/shedskin/

A compiler that translates a restricted subset of Python to C++, using type inference to work out static types

Experimental, but growing

A few others

http://wiki.python.org/moin/IntegratingPythonWithOtherLanguages

Choosing one of the methods

Are you calling a few system library calls? - ctypes

Want your code to be included in the standard CPython library? - CPython API

Do you have a really big library to wrap?

Use a wrapper generator: SWIG, XDress, ...

Are you writing extensions from scratch? - Cython

Using C++ or Boost already? - Boost.Python

Do you want a “thick” wrapper around a C/C++ lib? - Cython

Want some easy speed and can use an alternative interpreter? - try http://pypy.org

Questions?
