In this blog post, we present our progress in integrating a memory-managed class system in Cython which doesn't use the GIL. This new class system allows truly multithreaded OOP in Cython with the ease of use of Python classes.

Last Update:2019-05-24
Version:010
Language:en

Quite recently, we built a proof-of-concept GIL-free class system in Cython. This article presents the progress made toward a multithreadable class system with automatic memory management.

CPython and the GIL

CPython, the standard Python interpreter, dates back to 1991. Multithreaded programming was uncommon at the time (POSIX threads were only standardized in 1995). As a result, CPython's approach to multithreading is simple, because there was no need for a complex solution back then. Every time a thread touches a Python object (variable, function, object, class, ...), it must hold the Global Interpreter Lock (GIL), which is, as its name suggests, an interpreter-wide lock.

Consequently, in CPython, multiple threads can perform I/O in parallel, but only one thread at a time can execute Python code. This can be very inefficient for computationally heavy tasks.
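
To see this effect, the following sketch runs a CPU-bound countdown twice sequentially, then once in each of two threads. The function names and the iteration count are arbitrary choices for illustration. On a standard (GIL-enabled) CPython build the threaded run is typically not faster than the sequential one; exact timings depend on the machine and the interpreter version, so no figures are claimed here.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; every step needs the GIL."""
    while n > 0:
        n -= 1
    return n


def timed(work):
    """Run work() and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    work()
    return time.perf_counter() - start


def run_sequential(n):
    count_down(n)
    count_down(n)


def run_threaded(n):
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


N = 1_000_000
sequential_time = timed(lambda: run_sequential(N))
threaded_time = timed(lambda: run_threaded(N))
print(f"sequential: {sequential_time:.3f}s  threaded: {threaded_time:.3f}s")
```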

Cython (briefly)

Cython, which works with the CPython API, lets you write pure Python as well as GIL-free C code with almost the same syntax.

In Cython, you can write code like this:

cdef struct SomeCStruct:
    # This type doesn't need the GIL
    int a # This is a C int
    double b # This is a C double

class SomePythonClass:
    # This type needs the GIL
    def __init__(self):
        self.a = 3 # This is a PyObject

cdef class SomeCythonClass:
    # This type needs the GIL
    cdef object a # This is a PyObject
    cdef int b # This is a C int
    def __init__(self):
        self.a = 3 # This is a PyObject

def test():
    # This function must hold the GIL to be able to instantiate the SomeCythonClass object
    cdef SomeCStruct stack_allocated_struct
    stack_allocated_struct.a = 42
    stack_allocated_struct.b = 4.2

    heap_allocated_cython = SomeCythonClass()
    heap_allocated_cython.b = stack_allocated_struct.a # This is a C assignment (both are C int)
    heap_allocated_cython.a = stack_allocated_struct.b # The C double is coerced to a PyObject before assignment
    print(heap_allocated_cython.a, heap_allocated_cython.b)

Moreover, Cython supports wrapping (declaring) C++ classes, and has experimental support for defining them. C++ classes are declared with the "cppclass" keyword:

cdef extern from "SomeCppHeader.h" nogil:
    cdef cppclass SomeDeclaredCppClass nogil:
        int getter()

cdef cppclass SomeDefinedCppClass nogil:
    int value
    int getter():
        return this.value
    void __init__():
        this.value = 42

cdef int do_test() nogil:
    cdef SomeDefinedCppClass* heap_allocated_cpp_object = new SomeDefinedCppClass()
    val = heap_allocated_cpp_object.getter()
    del heap_allocated_cpp_object
    return val

cpdef void test():
    print(do_test())

As you can see, memory management of C++ objects is done in Cython as in C++: manually. Moreover, the implicit C++-style this variable is used instead of the explicit Python-style self. This feels unnatural to a Python developer.

The problem

Thanks to the C++ class support, it is possible in Cython to do OOP without taking the GIL. However, from a Python user's perspective, doing the memory management by hand is cumbersome.

Thanks to my colleague Boxiang who paved the way for it, I was able to craft a new class system in Cython which combines the GIL-free aspect of C++ classes and the memory management aspect of Python classes.

Cypclass

This new class system is defined with the keyword "cypclass". Let's begin with a simple example:

cdef cypclass SomeMemory:
  double a
  double b

  void foo(self, double a):
    self.a = a
    return

This snippet comes from Boxiang's blog article. As you can see, this is essentially the same syntax, with three changes:

The keyword is not "class" but "cypclass".
The nogil flag is implicitly set for methods inside a cypclass. To define a GIL-taking method, you must define it with "with gil".
There is no more cdef inside the class body. This is because cypclass and cppclass use the exact same parsing path.

At the implementation level, there is a huge shift between cypclass and Boxiang's nogil class.

Boxiang's nogil class is a proof of concept using Cython's Python extension types. Because it avoids the GIL, it has some limitations. Mainly, it has neither inheritance nor memory management.

Cypclass relies on C++ classes, so we benefit from the OOP capabilities of a C++ compiler at (almost) no cost. Each cypclass implicitly inherits from a class (called CyObject) which has the necessary methods for reference counting. Also, each cypclass object is heap-allocated. As with Python objects, you cannot have stack-allocated cypclass objects (unless you tweak things directly in C).

Memory management

At present, the memory management of cypclass is done through simple reference counting. Reference increments and decrements are performed atomically, so they are thread-safe. This is more or less the whole point of this work. At the Cython implementation level, the refcount bookkeeping is done exactly as for Python objects.
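
As a rough plain-Python model of what thread-safe reference counting means (this is not the cypclass implementation, which lives in generated C++), consider the sketch below. The class RefCounted, its incref/decref methods and the freed flag are hypothetical names. A lock stands in for the atomic instructions; without it, concurrent read-modify-write updates of the counter could be lost.

```python
import threading


class RefCounted:
    """Hypothetical model of an object with a thread-safe reference count."""

    def __init__(self):
        self._refcount = 1             # a fresh object starts with one reference
        self._lock = threading.Lock()  # stands in for atomic instructions
        self.freed = False

    def incref(self):
        with self._lock:
            self._refcount += 1

    def decref(self):
        with self._lock:
            self._refcount -= 1
            last_reference_gone = self._refcount == 0
        if last_reference_gone:
            # A real implementation would release the memory here.
            self.freed = True

    @property
    def refcount(self):
        with self._lock:
            return self._refcount


def hammer(obj, rounds):
    """Take and release a reference many times, as a worker thread would."""
    for _ in range(rounds):
        obj.incref()
        obj.decref()


shared = RefCounted()
workers = [threading.Thread(target=hammer, args=(shared, 10_000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

count_after_threads = shared.refcount  # back to the single initial reference
shared.decref()                        # drop the last reference
```

Because every increment is paired with a decrement and each update is indivisible, the counter returns to 1 after the threads finish, and the final decref is the one that frees the object.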

Consequently, Python objects and cypclass objects share a common code base and common semantics in Cython. This makes it easy to swap the current technical backend (C++ classes) for another one (hand-crafted C structs and vtables, for example). As this flexibility was required, C++ smart pointers were rejected.

You can explicitly give away a reference if needed with the del keyword:

cdef cypclass Test:
    int a

cpdef void test() nogil:
    heap_allocated_test = Test()
    del heap_allocated_test

However, there is no name unbinding like in Python, so the snippet above does less than the one below:

cdef cypclass Test:
    int a

cpdef test():
    heap_allocated_test = Test()
    heap_allocated_test = <Test> NULL
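
For comparison, this is what name unbinding looks like in plain Python: after del, the local name no longer exists, whereas rebinding it to None only drops the reference and keeps the name usable. The function names are illustrative.

```python
def after_del():
    value = object()
    del value                 # drops the reference AND unbinds the name
    try:
        value
    except UnboundLocalError:
        return "unbound"
    return "bound"


def after_rebinding():
    value = object()
    value = None              # drops the reference, the name stays usable
    return "bound" if value is None else "unexpected"


print(after_del(), after_rebinding())  # unbound bound
```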

Please note that explicitly giving away a reference is currently the only way to break a reference cycle, as there is no garbage collection algorithm (like mark-and-sweep) beyond reference counting.
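
The effect of a cycle under pure reference counting can be reproduced in CPython by switching off its cycle collector. In the sketch below (class and function names are hypothetical), two objects reference each other; a weak reference is used as a probe to check whether the first object is still alive once both names are gone.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.partner = None


def cycle_survives(break_cycle):
    """Build a two-object cycle and report whether it outlives its names."""
    a, b = Node(), Node()
    a.partner, b.partner = b, a
    probe = weakref.ref(a)
    if break_cycle:
        a.partner = None      # manually give away one reference of the cycle
    del a, b
    return probe() is not None


gc_was_enabled = gc.isenabled()
gc.disable()                  # leave only reference counting at work
try:
    leaked_with_cycle = cycle_survives(break_cycle=False)
    leaked_after_break = cycle_survives(break_cycle=True)
finally:
    if gc_was_enabled:
        gc.enable()
    gc.collect()              # clean up the cycle leaked on purpose

print(leaked_with_cycle, leaked_after_break)  # True False
```

With reference counting alone, the untouched cycle stays alive, while the manually broken one is reclaimed as soon as the last name disappears.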

Self

The cypclass system follows the Python convention for the self argument of methods. This means you must supply self as the first argument of non-static methods, and you can use the Python syntax to call parent class methods:

cdef cypclass Base:
    int getter(self):
        return 4

cdef cypclass Derived(Base):
    int getter(self):
         return 2

def print_42():
    o = Derived()
    print(str(Base.getter(o)) + str(o.getter()))
    return

Operators

The overloaded operators follow the Python syntax and map directly to C++ operators. The following code runs fine:

cdef cypclass CrazyOperations:
    int a
    double b

    CrazyOperations __iadd__(self, CrazyOperations other):
        self.a += other.a

    CrazyOperations __add__(self, CrazyOperations other):
        return CrazyOperations(self.a, self.b + other.b)

    CrazyOperations __isub__(self, CrazyOperations other):
        self.b -= other.b

    __init__(self, int a = 0, double b = 4.2):
        self.a = a
        self.b = b

cdef cypclass SuperCrazyOperations(CrazyOperations):

    __init__(self, int a = 0, double b = 4.2):
        CrazyOperations.__init__(self, a, b)

    SuperCrazyOperations __iadd__(self, CrazyOperations other):
        CrazyOperations.__iadd__(self, other)

    SuperCrazyOperations __iadd__(self, SuperCrazyOperations other):
        # This is one (possibly weird) way to chain to base class' __iadd__, done here for the example
        casted_self = <CrazyOperations> self
        # As this is casted, we will call base __iadd__ with the += below
        casted_self += <CrazyOperations> other
        self.b = 0
        # At the end, we added other.a to self.a, and set our b to 0.0

    SuperCrazyOperations __isub__(self, CrazyOperations other):
        CrazyOperations.__isub__(self, other)

    SuperCrazyOperations __isub__(self, SuperCrazyOperations other):
        CrazyOperations.__isub__(self, other)
        self.a = 4


def test():
    o1 = SuperCrazyOperations(1, 2)
    o2 = CrazyOperations(42, 4.2)
    o3 = SuperCrazyOperations(-3, 2.1)

    # This should print:
    # 1 2.0
    print(o1.a, o1.b)

    # This should print:
    # 42 4.2
    print(o2.a, o2.b)

    # This should print:
    # -3 2.1
    print(o3.a, o3.b)

    # We're calling __add__ so we just add the b attributes
    tmp1 = o1 + o2
    # This should print:
    # 1 6.2
    print(tmp1.a, tmp1.b)

    # CrazyOperations' __iadd__ is called: adding the a attributes 
    o2 += o1
    # This should print:
    # 43 4.2
    print(o2.a, o2.b)

    # SuperCrazyOperations' __iadd__ is called: adding the a attributes AND setting own (o1) b to 0.0
    o1 += o3
    # This should print:
    # -2 0.0
    print(o1.a, o1.b)

    # We're calling CrazyOperations' __isub__, so we're subtracting b attributes
    o1 -= o2
    # This should print:
    # -2 -4.2
    print(o1.a, o1.b)

    # We're calling SuperCrazyOperations' __isub__, so we're subtracting b attributes AND setting own (o1) a to 4
    o1 -= o3
    # This should print:
    # 4 -6.3
    print(o1.a, o1.b)

Typecast operators

In Python, it is possible to overload the truthiness check of a class thanks to the __bool__ method. In the cypclass system, this has been extended to allow casting to an arbitrary type, not just bool.

cdef cypclass SomeClass:
    int a
    double b
    int __int__(self):
        return self.a
    double __double__(self):
        return self.b

cdef cypclass SomeContainer:
     SomeClass some_object
     bint __bool__(self):
         return self.some_object is not NULL
     int __int__(self):
         if self:
             return <int> self.some_object
         else:
             return 0
     double __double__(self):
         if self:
             return <double> self.some_object
         else:
             return 0.0

cpdef void test():
    contained = SomeClass()
    contained.a = 42
    contained.b = 4.2
    container = SomeContainer()
    container.some_object = contained
    print(<bint> container, <int> container, <double> container)

Notice the is / is not syntax for instance comparison, which behaves exactly like its PyObject counterpart (i.e. pointer comparison).
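
For readers coming from Python, the closest plain-Python counterpart uses the conversion hooks __bool__, __int__ and __float__ (Python has no __double__; that name is specific to cypclass). The sketch mirrors the container example, with None playing the role of NULL:

```python
class SomeClass:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __int__(self):
        return self.a

    def __float__(self):
        return self.b


class SomeContainer:
    def __init__(self, some_object=None):
        self.some_object = some_object

    def __bool__(self):
        return self.some_object is not None

    def __int__(self):
        return int(self.some_object) if self else 0

    def __float__(self):
        return float(self.some_object) if self else 0.0


full = SomeContainer(SomeClass(42, 4.2))
empty = SomeContainer()
print(bool(full), int(full), float(full))     # True 42 4.2
print(bool(empty), int(empty), float(empty))  # False 0 0.0
```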

Cypclass objects instantiation

There are two ways to instantiate a cypclass into an object.

The first one uses the normal instantiation syntax. It relies on a wrapper around the C++ construction process in order to get the Python behaviour when constructing objects.

The second one uses the new statement. The behaviour of this second form will be detailed later. For now, you can consider it to be a direct memory allocation (like malloc would do).

C++ construction vs Python construction

In C++, parent class constructors are implicitly called when a subclass is constructed. To properly construct the parent class part of the object, a parent constructor must be named in the initializer list if it is not the default constructor.
The chosen parent class constructor should not be called in the body of the subclass constructor, because you would not be able to set const members.

This is quite different from the Python behaviour, where no parent class constructor is called by default. You must explicitly call the desired parent class constructor in the subclass constructor body.
As this chaining is just a function call in Python, which can be bound to a condition, it is not trivial to translate these calls to a C++ equivalent in an initializer list.
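
The Python side of this contrast fits in a few lines: a subclass __init__ does not run the parent __init__ unless it calls it, and that call can sit behind a condition, which has no direct equivalent in a C++ initializer list. The class names and the chain flag are made up for the example.

```python
class Base:
    def __init__(self):
        self.base_ready = True


class Derived(Base):
    def __init__(self, chain):
        # The parent constructor is an ordinary call and may be conditional.
        if chain:
            Base.__init__(self)
        self.derived_ready = True


chained = Derived(chain=True)
unchained = Derived(chain=False)
print(hasattr(chained, "base_ready"), hasattr(unchained, "base_ready"))  # True False
```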

That's why another path was taken, which uses a wrapper around the C++ construction.

The cypclass construction layout

In a cypclass, it is possible to define an __init__ which will be called at instantiation time. This __init__ is not part of any C++ constructor of the C++ object representing the cypclass object.
In fact, a C++ class representing a cypclass doesn't have any C++ constructor except the default one, which does nothing but call the parent classes' default constructors.

The constructor wrapper is in charge of calling __init__.

The constructor wrapper

The wrapper is in charge of allocating the memory and calling __init__ if there is one. There is no support for variadic (ellipsis) arguments on __init__, but optional arguments are supported, as well as overloading.
So several __init__ can coexist as long as there are no ambiguous calls. Each __init__ is wrapped in its own wrapper with the same signature (except the self argument, which is obviously omitted).

__init__ is inherited. This means the constructor wrapper will call a parent class __init__ if it is not redefined in the subclass. However, if this __init__ is redefined in the subclass, the base __init__ won't be implicitly called. If you want constructor chaining, you must explicitly call the parent class __init__, as for Python classes.

This snippet should make things clearer:

cdef cypclass Base:
    int a
    int b
    __init__(self, int a, int b):
        self.a = a
        self.b = b
    __init__(self, int a):
        self.__init__(a, 0)
    __init__(self):
        self.__init__(0, 0)

cdef cypclass Derived(Base):
    int c
    __init__(self, int a, int b, int c):
        self.c = c
    

cdef cypclass AltDerived(Base):
    __init__(self, int a, int b):
        #self.a = a + b
        self.b = b - a

cpdef void test():
    cdef Base base = Base(4, 2)
    cdef Derived derived1 = Derived(4)
    cdef Derived derived2 = Derived(4, 2)
    cdef Derived derived3 = Derived(4, 2, 1)
    cdef AltDerived alt_derived = AltDerived(4, 2)

    # Base should be correctly set
    print(base.a, base.b)

    # derived1 has undefined c, so derived1.c will be garbage
    print(derived1.a, derived1.b, derived1.c)

    # Same thing for derived2, but derived2.b is not zero
    print(derived2.a, derived2.b, derived2.c)

    # derived3 has the opposite behaviour: c is set, a and b are garbage
    print(derived3.a, derived3.b, derived3.c)

    # alt_derived won't output 4 2 but garbage and -2
    print(alt_derived.a, alt_derived.b)

The above-described behaviour is true but incomplete.

Sometimes in Python, there is a need to redefine __new__, for example because we're making a singleton.

__new__ redefinition is supported in the cypclass system. Only one __new__ per cypclass is allowed. As in Python, it is forced to be static, so you don't have to care about it.

The cypclass __new__ signature is different from the Python class one.

This is mainly because the first argument of a Python class __new__, the class object, doesn't make any sense in a C++ world (you cannot pass a type as an argument in C++).
In Python, this class object is essentially used to call the parent __new__ so that, at the end of the call chain, memory gets allocated: the default __new__ (which resides in the tp_new slot) calls the function in the tp_alloc slot.

In a cypclass __new__, this first argument is directly the equivalent of tp_alloc: a function which only cares about memory allocation. In the cypclass world, it is called __alloc__. The default __alloc__ is just allocating memory using the second form of cypclass instantiation, with the new statement. You can overload the default __alloc__ if, for example, you would like to implement a freelist (the @freelist(n) decorator isn't currently supported in the default __alloc__). There is only one __alloc__ for each cypclass. This is a forced static method with no arguments.

Below you will find an example of cypclass with overloaded __new__, and another one, more complex, with __new__ and __alloc__ overloading:

cdef cypclass Singleton
  
cdef int allocated = 0
cdef Singleton ptr

cdef cypclass Singleton:
    Singleton __new__(alloc):
        global allocated
        global ptr
        if not allocated:
            ptr = alloc()
            allocated = 1
        return ptr

cpdef void testSingleton():
    o1 = Singleton()
    o2 = Singleton()
    if o1 is o2 and o1 is ptr:
        print("Everything is fine")
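
For comparison, here is a plain-Python singleton written with __new__. Note the first argument: Python passes the class object cls, which is then used to reach the parent __new__ and, ultimately, the allocator, whereas the cypclass version above receives the allocator directly. The class name PySingleton is illustrative.

```python
class PySingleton:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            # object.__new__ performs the actual allocation.
            cls._instance = super().__new__(cls)
        return cls._instance


first = PySingleton()
second = PySingleton()
print(first is second)  # True
```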

cdef cypclass AheadOfTimeChunk

DEF CHUNK_SIZE = 2
cdef AheadOfTimeChunk chunk[CHUNK_SIZE]
cdef unsigned int chunk_started = 0
# chunk_allocated[index] tells us the state of memory pointed by chunk[index]
cdef int chunk_allocated[CHUNK_SIZE]

cdef AheadOfTimeChunk chunkclass_singleton
cdef bint chunkclass_singleton_allocated = False

cdef cypclass AheadOfTimeChunk:
    """
    A stupid chunk allocator:
    Whenever the user requests a new object, fills each empty cell of
    the chunk array with fresh memory. Give to the user the first
    available chunk cell with allocated but unused memory.
    This class is reaaally stupid because it still frees memory when
    the user doesn't need the object anymore, so the __alloc__ will be filling
    again and again the cells we just deallocated with fresh memory.
    """
    int cell_index

    __dealloc__(self):
        global chunkclass_singleton_allocated
        global chunkclass_singleton
        global chunk_allocated
        idx = self.cell_index
        if idx >= 0:
            with gil:
                print("Object destruction : mark cell", idx, "as empty")
            chunk_allocated[idx] = -1
        else:
            with gil:
                print("Object destruction: out-of-chunk deallocation")
        if self is chunkclass_singleton:
            with gil:
                print("  Object destruction : Mark singleton flag as unallocated")
            chunkclass_singleton_allocated = False
            chunkclass_singleton = <AheadOfTimeChunk> NULL

    AheadOfTimeChunk __new__(alloc, bint singleton = False, bint must_fit_in_chunk = False):
       global chunkclass_singleton_allocated
       global chunkclass_singleton
       if singleton and chunkclass_singleton_allocated:
           with gil:
               print("Object creation : Returning already allocated singleton")
           obj = chunkclass_singleton
       else:
           obj = alloc()
           if obj is NULL and not must_fit_in_chunk:
               with gil:
                   print("Object creation : Allocating object outside of chunk")
               obj = new AheadOfTimeChunk()
               obj.cell_index = -1
           if obj is not NULL and singleton:
                with gil:
                    print("Object creation : Singleton allocated")
                chunkclass_singleton_allocated = True
                chunkclass_singleton = obj
                # We should decref here (assignment to chunkclass_singleton has incref'ed obj, we do not want that)
                del chunkclass_singleton
       if obj is NULL:
           # This is done to show something for the demo
           with gil:
               print("Object creation : Obj is NULL !")
       return obj
        
    AheadOfTimeChunk __alloc__():
        """
        Convention for chunk_allocated is:
        1 - Memory in use
        0 - Memory allocated and available
        -1 - No memory allocated
        """
        global chunk_started
        global chunk_allocated
        global chunk
        if not chunk_started:
            for i in range(CHUNK_SIZE):
                chunk_allocated[i] = -1
            chunk_started = 1
        cdef AheadOfTimeChunk ptr = <AheadOfTimeChunk> NULL
        for i in range(CHUNK_SIZE):
            if chunk_allocated[i] == -1:
                with gil:
                    print("cell", i, "is empty: allocate it")
                chunk[i] = new AheadOfTimeChunk()
                chunk[i].cell_index = i
                chunk_allocated[i] = 0
            if ptr is NULL and chunk_allocated[i] == 0:
                with gil:
                    print("cell", i, "is allocated but unused: take it")
                chunk_allocated[i] = 1
                ptr = chunk[i]
                # We must decref ptr here, because it is the same reference
                del ptr
        return ptr

cpdef void testChunk():
    print("=== Creating a basic object ===")
    basic_obj = AheadOfTimeChunk()
    print()
    print("=== Getting singleton (no previous references) ===")
    singleton_obj1 = AheadOfTimeChunk(True)
    print()
    print("=== Getting singleton (one existing reference) ===")
    singleton_obj2 = AheadOfTimeChunk(True)

    if singleton_obj1 is singleton_obj2 and singleton_obj1 is chunkclass_singleton:
        print("    Both singleton pointers are correct")
    print()
    print("=== Deleting one singleton reference (one ref left) ===")
    #del singleton_obj1
    singleton_obj1 = <AheadOfTimeChunk> NULL

    print()
    print("=== Getting again singleton (one existing reference) ===")
    singleton_obj3 = AheadOfTimeChunk(True)
    if singleton_obj3 is singleton_obj2:
        print("    Freeing a singleton instance doesn't crash things")

    print()
    print("=== Allocating an object outside of chunk ===")
    outside_allocation = AheadOfTimeChunk()
    print()
    print("=== Make a fail allocation ===")
    failed_outside_allocation = AheadOfTimeChunk(False, True)

    # Make some place in the chunk to see reuse
    print()
    print("=== Deleting one singleton reference (one ref left) ===")
    #del singleton_obj3
    singleton_obj3 = <AheadOfTimeChunk> NULL
    print()
    print("=== Deleting one singleton reference (zero refs left) ===")
    #del singleton_obj2
    singleton_obj2 = <AheadOfTimeChunk> NULL

    print()
    print("=== Allocating again a basic object (inside the chunk) ===")
    inside_chunk = AheadOfTimeChunk()
    # We can afford a very specific test here
    if inside_chunk.cell_index == 1 and inside_chunk is chunk[1]:
        print("    Chunk cell reuse went fine")

    print()
    print("=== Allocating singleton outside of chunk (no previous references) ===")
    outside_singleton = AheadOfTimeChunk(True)

    print()
    print("Remaining objects at the end of this function are:")
    print("1 The first object we created (living at cell 0)")
    print("2 The first out-of-chunk allocated object")
    print("3 The last successfully non-singleton allocated object (living at cell 1)")
    print("4 One singleton instance (living outside of chunk)")
    print("The object destruction messages should reflect this order")
    print()

As you have guessed, __new__ is wrapped in a constructor wrapper, too.

The logic is that a wrapper is created when there is an overloaded __new__ or __init__, and the wrapper code works as follows:

If there is a __new__ with a compatible signature, call it; otherwise perform a default allocation.
    When calling a user-defined __new__, pass it the overloaded __alloc__ if there is one, otherwise the default __alloc__.
If the __new__ return type is the class type and there is an __init__ with a compatible signature, call it.
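
These steps can be modelled at runtime in plain Python. This is only a sketch of the logic: the real wrapper is generated at compile time and resolves overloads by argument types, which this model does not attempt. The attribute names cyp_new, cyp_init and cyp_alloc, as well as the helpers accepts and construct, are hypothetical and chosen to avoid clashing with Python's own __new__ and __init__.

```python
import inspect


def accepts(func, *args):
    """Return True if func can be called with these positional arguments."""
    try:
        inspect.signature(func).bind(*args)
    except TypeError:
        return False
    return True


def construct(cls, *args):
    """Hypothetical runtime model of the constructor wrapper."""
    # Allocator handed to __new__: the class's own one if defined,
    # otherwise a default raw allocation.
    alloc = getattr(cls, "cyp_alloc", None) or (lambda: object.__new__(cls))
    user_new = getattr(cls, "cyp_new", None)
    # Step 1: call a compatible user-defined __new__, else just allocate.
    if user_new is not None and accepts(user_new, alloc, *args):
        obj = user_new(alloc, *args)
    else:
        obj = alloc()
    # Step 2: run a compatible __init__, but only on instances of the class.
    user_init = getattr(cls, "cyp_init", None)
    if isinstance(obj, cls) and user_init is not None and accepts(user_init, obj, *args):
        user_init(obj, *args)
    return obj


class Plain:
    def cyp_init(self, value):
        self.value = value


class ReturnsInt:
    @staticmethod
    def cyp_new(alloc, a, b=42):
        return a * b          # not an instance, so cyp_init is skipped

    def cyp_init(self, a):
        self.a = a


plain = construct(Plain, 7)
product = construct(ReturnsInt, 2)
print(plain.value, product)   # 7 84
```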

We cannot afford to have both __init__ and __new__ wrapping themselves at the same time because of signature overlap. For example, if we have a cypclass defined this way:

cdef cypclass SomeAmbiguousConstruction:
    int a
    int b
    float c
    void __init__(self, int a, float c = 4.2):
        self.a = a
        self.b = 0
        self.c = c
    int __new__(alloc, int a, int b = 42):
        # Return type is different from SomeAmbiguousConstruction, so the wrapper will skip __init__ calling
        return a*b

If both __init__ and __new__ wrap themselves, we should have two wrappers:

wrapper 1 with argument list (int, float=4.2)
wrapper 2 with argument list (int, int=42)

There is an ambiguous wrapper signature match with this statement:

o = SomeAmbiguousConstruction(42) # both signatures are fine

Both can match, so we don't know which one to use.

To fix the ambiguity, the following behaviour is used:

Whenever an __init__ is defined and there is no __new__, wrap around __init__. You may end up with multiple wrappers with distinct signatures.
Whenever a __new__ is defined, discard all wrappers defined so far and wrap around __new__ (after checking that no other __new__ exists, which would be a programming error). You end up with exactly one wrapper.

This is simple but can lead to surprising behaviour. Even if the __init__ methods cannot get their own wrappers because there is a __new__, they are still inherited. Consequently, changing the __new__ signature in a subclass can also change which __init__ the wrapper calls. The snippet below demonstrates this:

cdef cypclass Base:
    double value

    __init__(self, int a):
        self.value = (<double> a)*1.2

    __init__(self, double b):
        self.value = b

    Base __new__(alloc, int a):
        obj = alloc()
        # After the return, the call Base.__init__(obj, a) will be issued
        return obj

cdef cypclass Derived(Base):
    Derived __new__(alloc, double b):
        obj = alloc()
        # After the return, the call Base.__init__(obj, b) will be issued
        return obj

cpdef void test():
    base = Base(5)
    derived = Derived(5)
    # Should print: 6.0 5.0
    print(base.value, derived.value)

As said previously, the wrapper tries to call the __new__ and __init__ whose signatures match, not only those whose signatures are identical. This means you can play with optional arguments. The current implementation has a limitation here: the wrapper doesn't know the default values of optional arguments. As a result, the optional arguments of __new__ and __init__ do not need to line up, provided that no argument which is optional in __new__ is mandatory in __init__. In all other cases, the wrapper simply retrieves the optional arguments supplied at call time and passes them to __init__.
Look at the snippet below if this is not clear:

cdef cypclass OK:
    """
    This is OK because the wrapper will pass a & b from __new__ to __init__ directly,
    and will quite blindly transfer the information about c presence in the actual __new__ call to __init__
    """
    __init__(self, int a, double b = 4.2, int c = 42):
        pass
    OK __new__(alloc, int a, double b, int c = 0):
        return alloc()

cdef cypclass KO:
    """
    This will fail because the wrapper knows b (in __new__) is optional,
    but doesn't know its default value, so it cannot pass it to __init__
    """
    __init__(self, int a, double b, int c = 42):
        pass
    KO __new__(alloc, int a, double b = 4.2, int c = 0):
        return alloc()

As the wrapper doesn't know default values, constructing an OK object with two arguments (obj = OK(42, 4.2)) will set c to 0 in __new__ and to 42 in __init__.
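
This consequence can be modelled in plain Python (an analogy, not the generated wrapper): a forwarding function that passes along only the arguments actually supplied lets each callee apply its own defaults. The names ok_new, ok_init, ko_new, ko_init and forward are hypothetical, and the failure that cypclass reports at compile time shows up here as a runtime TypeError.

```python
def ok_new(a, b, c=0):
    return (a, b, c)


def ok_init(a, b=4.2, c=42):
    return (a, b, c)


def ko_new(a, b=4.2, c=0):
    return (a, b, c)


def ko_init(a, b, c=42):
    return (a, b, c)


def forward(new, init, *args):
    """Pass only the supplied arguments to both callables."""
    return new(*args), init(*args)


seen_by_new, seen_by_init = forward(ok_new, ok_init, 42, 4.2)
print(seen_by_new, seen_by_init)  # (42, 4.2, 0) (42, 4.2, 42)

try:
    forward(ko_new, ko_init, 42)
    ko_outcome = "constructed"
except TypeError:
    # b was optional for ko_new but is mandatory for ko_init.
    ko_outcome = "b is missing for __init__"
print(ko_outcome)
```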

Wrapper generation

The wrapper is generated at a late compilation stage. To avoid generating useless wrappers (and bloating the C++ code), only the wrappers actually used in the module are generated. This also means you won't get any error checking for an unused wrapper. To trigger the error message for the KO cypclass above, you must write a function that tries to construct a KO object. Merely defining the KO cypclass without using it will happily compile.

The new statement

As stated above, every cypclass inherits from a base class called CyObject which handles the refcount. This class has a non-trivial default constructor, as it sets the reference counter to 1. The C++ class representing a cypclass has no C++ constructor except the default one. As a result, a C++ new statement on a cypclass completely bypasses the constructor wrapper, but sets the refcount properly.
Consequently, from a developer's point of view, this is like constructing a Python object without calling any __init__ or __new__, so it is in some way a raw memory allocation.

In Cython, a new statement on a cypclass maps directly to the C++ new statement. You cannot pass arguments to this new statement, as it relies on the C++ constructors, which are all default ones in the cypclass case. You are instantiating the cypclass object without calling any __init__ or __new__, because these calls are handled by the wrapper.
Nevertheless, the result is safely usable as a valid cypclass object (although without any initialization beyond the C++ default one).

cdef cypclass A:
    int a
    int getter(self):
        return self.a*6
    int __int__(self):
        return self.getter()
    __init__(self, int a):
        self.a = a-1

cdef cypclass MyClass:
    # Cypclass objects are initialized to NULL by default.
    # This is done in the C++ construction part.
    A cypobj
    int number

    __init__(self, int a=0):
        self.cypobj = A(a)
        self.number = a

    __init__(self, A obj):
        self.cypobj = obj
        self.number = obj.a+1

    int get_A_value(self):
        if self.cypobj is not NULL:
            return self.cypobj.getter()
        else:
            return 42

cdef int take_value(MyClass o) nogil:
    value = o.get_A_value()
    return value

def print_values():
    method1_specified = MyClass(2)
    method1_default = MyClass()
    method2 = new MyClass()
    print(take_value(method1_specified), take_value(method1_default), take_value(method2))
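
Python offers a loosely similar escape hatch: calling object.__new__ directly allocates an instance without running __init__. The sketch below is an analogy only. The class MyPyClass is hypothetical, and unlike a cypclass attribute, which defaults to NULL as noted in the snippet above, an uninitialized Python attribute simply does not exist.

```python
class MyPyClass:
    def __init__(self, number=0):
        self.number = number

    def get_number(self):
        # Fall back to 42 when __init__ never ran, like get_A_value above.
        return getattr(self, "number", 42)


constructed = MyPyClass(2)
raw = object.__new__(MyPyClass)   # allocation only, __init__ is skipped
print(constructed.get_number(), raw.get_number())  # 2 42
```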

Advanced OOP in Cypclass

Flat class definitions and single inheritance are common aspects of all OOP models.

As cypclass relies on the C++ OOP model, some more advanced aspects are available: nesting and multiple inheritance.

Nested classes

Defining nested classes is perfectly doable in cypclass. You can also define nested ctypedefs, structs, enums, unions, etc., because cppclass already allows you to do so.

cdef cypclass A:
    int a
    cypclass B:
        int b
        __init__(self, int b):
            self.b = 2*b
    B b
    __init__(self, int a, int b):
        self.a = a
        self.b = B(b)

cpdef void test():
    a = A(2, 2)
    b = A.B(3)
    print(a.a, a.b.b, b.b)

Multiple inheritance

Multiple inheritance is also perfectly doable, but beware of diamond inheritance, because C++ doesn't always do what you would expect. For now, inheritance between cypclasses can only be "non-virtual". This means that with diamond inheritance, the base class is present multiple times in the memory of the object, so you must explicitly state which "inheritance branch" you want to use when casting to that base class. As the object carries several base class subobjects (one per inheritance branch), you must be careful when passing it to a function that expects a parent class.

The snippet below crafts an example of a surprising (from a Python developer perspective) inheritance behaviour:

cdef cypclass Base:
    int base

cdef cypclass Derived1(Base):
    int derived1
    void setBase(self, int arg):
        self.base = arg

cdef cypclass Derived2(Base):
    int derived2
    void setBase(self, int arg):
        self.base = arg

cdef cypclass Diamond(Derived1, Derived2):
    int diamond

cdef void printD1(Derived1 d1) with gil:
    print(d1.derived1, d1.base)

cdef void printD2(Derived2 d2) with gil:
    print(d2.derived2, d2.base)

cpdef void test():
     o = Diamond()
     o.derived1 = 42
     o.derived2 = 4242
     Derived1.setBase(o, 4)
     Derived2.setBase(o, 2)

     # This will print the values stored in the Derived1 part of object o:
     # 42 4
     printD1(o)
     # This will print the values stored in the Derived2 part of object o:
     # 4242 2
     printD2(o)
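
To see why this surprises a Python developer, here is the same experiment in plain Python, where the diamond shares a single Base part. The classes mirror the cypclass ones; only set_base is renamed to follow Python naming conventions.

```python
class Base:
    base = None


class Derived1(Base):
    def set_base(self, arg):
        self.base = arg


class Derived2(Base):
    def set_base(self, arg):
        self.base = arg


class Diamond(Derived1, Derived2):
    pass


o = Diamond()
Derived1.set_base(o, 4)
Derived2.set_base(o, 2)
print(o.base)  # 2: the second call overwrote the single shared attribute
print([cls.__name__ for cls in Diamond.__mro__])
# ['Diamond', 'Derived1', 'Derived2', 'Base', 'object']
```

In Python there is only one base attribute per object, so the value written through Derived1 is lost, whereas the cypclass object above keeps both 4 and 2 in separate Base parts.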

Limitations

This new class system has some limitations that were described in the sections above. To sum up:

The memory management uses a simple reference counting algorithm. This means reference cycles must be avoided or manually broken by the developer.
The object construction process uses wrappers which are not aware of the default values of optional arguments. This results in an asymmetry in the ways __new__ and __init__ can be combined.
Multiple inheritance follows the classic C++ scheme. Making diamond inheritance work as a Python developer would expect without adding too much overhead would be a great improvement.

Finally, the cypclass system relies on Cython's ability to define C++ classes. While declaring C++ classes is supported (and documented), defining C++ classes is for now considered experimental.

Further thoughts

As Bryton stated in his blog post, high-performance concurrent programming in Cython is possible and must not be neglected.

The cypclass system allows GIL-free memory-managed OOP in Cython. Combining both works, the cypclass system and the lwan coroutine scheduler, paves the way for a powerful concurrent programming framework.

This would be powerful, but not safe. Indeed, there would be no mechanism at all to avoid data races.

Implementing a safe concurrent programming framework in Cython would lead to an easy to use and high-performance multi-threaded programming kit for Python developers, which would be a great addition to the Python-related world.

An alternative approach to cypclass

During our work on cypclasses, we considered the possibility of using a structure similar to PyObjects, with atomic reference counting. It could work if all the CPython methods that operate on objects, such as PyObject_RichCompare (which handles comparison operators), were reimplemented in C without the GIL. This amounts to about 500 methods and about 10,000 lines of code. We call this approach ccdef class.

The advantage of ccdef classes is that they share the same semantics as Python's OOP model. They could also be converted to PyObjects and manipulated by the CPython interpreter.

However, it would have taken 10 months to complete, compared to 4 months for cypclass.

Source code

References

Cython C++ wrapping: http://docs.cython.org/en/latest/src/userguide/wrapping_CPlusPlus.html
Cython internal introduction: https://wstein.org/wiki/attachments/2008(2f)sageseminar(2f)kantor/slides.pdf
Bryton's blog of Cython multithreaded coroutines: https://www.nexedi.com/NXD-Blog.Cython.Multithreaded.Coroutines
Boxiang's blog of Cython nogil extension: https://www.nexedi.com/blog/NXD-Blog.Cython.Nogil.Extension.Reports

Contact

Gwenaël Samain
gwenael (dot) samain (at) etu (dot) utc (dot) fr

Sven Franck
sven (dot) franck (at) nexedi (dot) com

Jean-Paul Smets
jp (at) rapid (dot) space

Automatic multithreaded-safe memory managed classes in Cython