Adventures in PyOpenCL: Part 2, Particles with PyOpenGL

Posted on March 22, 2011 by enj

Today we journey as pythonauts into the world of particle systems and fast 3D graphics, manipulating thousands upon thousands of little dots in mere milliseconds! We accomplish this with two Python modules, PyOpenCL and PyOpenGL. This is a port of my C++ OpenCL tutorial, but with much less code and all the splendors of numpy.

What you’ll need:

numpy (at least version 1.4)
PyOpenGL
PyOpenCL (for now, Mac users will need to build from git)
the code
A graphics card and driver that support OpenCL

Let’s get started!

First, let’s take a look at the files we will be dissecting:

- important files
    main.py
    part2.py
    part2.cl
    initialize.py
- support files
    glutil.py
    vector.py
    timing.py

You can go ahead and run python main.py to see the code in action. The way our particle system works is that we have a collection of particles, really just 3D points in space, each with its own lifetime. Every time we loop, we update the positions of our particles based on some set of rules (e.g. gravity, initial velocity) and decrease their lifetimes a little. When the lifetime of a particle reaches 0, we set it back to its original position and reset its lifetime. So we want to structure our code so that we initialize the positions of the particles first, then update and render them every frame.
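The update-and-respawn rule described above can be sketched on the CPU with NumPy. This is only an illustration: the names step, life, pos_gen and vel_gen, the gravity constant and the reset lifetime of 1.0 are assumptions made for the example, and the real work happens in the OpenCL kernel in part2.cl, which may differ in detail.

```python
import numpy as np

def step(pos, vel, life, pos_gen, vel_gen, dt, gravity=-9.8):
    """Advance every particle by one time step (hypothetical CPU sketch)."""
    # Age every particle a little.
    life = life - dt
    # Particles whose lifetime ran out go back to their starting state.
    dead = life <= 0.0
    pos = np.where(dead[:, None], pos_gen, pos)
    vel = np.where(dead[:, None], vel_gen, vel)
    life = np.where(dead, 1.0, life)
    # Simple Euler update: gravity acts along z, then move by the velocity.
    vel[:, 2] += gravity * dt
    pos = pos + vel * dt
    return pos, vel, life
```

Calling step once per frame with the arrays it returned on the previous frame gives the loop described above: initialize once, then update and render every frame.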

main.py is responsible for setting up the OpenGL environment using PyOpenGL and GLUT, providing mouse and keyboard interaction as well as a run loop for our program. This gives us a 3D space we can look around in by clicking and dragging (left and right mouse buttons); hit ‘q’ to quit and ‘t’ to print out timings of the update function.

The Setup

Once we have our environment we need to set up some particles; for this, look in initialize.py. The first function you will see is fountain_np, which creates several numpy arrays and sets their values using numpy slice operators. If that’s a little confusing, it may be easier to follow the fountain_loopy function, which does the same thing but with a (slower) for loop. Essentially we are randomly placing the particles on a flat donut in the x, y plane. They each start with an initial velocity in the same direction as their position and a random lifetime. We also make an RGBA color array so that each particle can have its own color.
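To make the “flat donut” idea concrete, here is a minimal NumPy sketch of that kind of initialization. It is not the code from initialize.py: the function name donut_fountain, the radii, the velocity scale, the upward push, the separate life array and the green color are all assumptions chosen for illustration.

```python
import numpy as np

def donut_fountain(num, r_min=0.5, r_max=1.0, seed=0):
    """Hypothetical sketch: random particles on a flat ring in the x, y plane."""
    rng = np.random.RandomState(seed)
    angle = rng.uniform(0.0, 2.0 * np.pi, num)
    radius = rng.uniform(r_min, r_max, num)

    # Positions as float4 (x, y, z, w); slices fill whole columns at once.
    pos = np.zeros((num, 4), dtype=np.float32)
    pos[:, 0] = radius * np.cos(angle)
    pos[:, 1] = radius * np.sin(angle)
    pos[:, 3] = 1.0

    # Velocity points the same way as the position, plus an assumed push in z.
    vel = np.zeros((num, 4), dtype=np.float32)
    vel[:, 0:2] = 2.0 * pos[:, 0:2]
    vel[:, 2] = 3.0

    # Random lifetime per particle (kept in its own array in this sketch).
    life = rng.uniform(0.0, 1.0, num).astype(np.float32)

    # One RGBA color per particle: opaque green here.
    col = np.zeros((num, 4), dtype=np.float32)
    col[:, 1] = 1.0
    col[:, 3] = 1.0
    return pos, col, vel, life
```

The slice assignments are what replace the per-particle for loop: each line fills one component for all particles at once.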

Notice the fountain function at the bottom, which just calls the fountain_np function and creates Vertex Buffer Objects (VBOs) out of the numpy arrays. Presumably when you go on to write your own OpenCL/OpenGL code you will start with your own numpy arrays, and this is a key point for setting up your memory on the GPU. If you want OpenCL to be able to act on OpenGL memory (without making copies) you will need to prepare VBOs (or render buffer objects with clImage, which should be the subject of a later tutorial) before you go to OpenCL. Luckily it’s dead simple with PyOpenGL:

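    from OpenGL.GL import GL_ARRAY_BUFFER, GL_DYNAMIC_DRAW
    from OpenGL.arrays import vbo
    # wrap the numpy array pos in a vertex buffer object
    pos_vbo = vbo.VBO(data=pos, usage=GL_DYNAMIC_DRAW, target=GL_ARRAY_BUFFER)
    pos_vbo.bind()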

Now let’s look in main.py to see how the initialize functions are called and the results are passed into OpenCL.

    # set up initial conditions
    (pos_vbo, col_vbo, vel) = initialize.fountain(num)
    # create our OpenCL instance
    self.cle = part2.Part2(num, dt)
    self.cle.loadData(pos_vbo, col_vbo, vel)

Here num and dt, defined at the top of main.py, are user parameters for how many particles to make and roughly how much simulation time passes each frame. Now we get to the OpenCL: we’ve made an instance of the Part2 class, and then we call loadData with the VBOs and the velocity array.

Interfacing with OpenCL

Let’s take a look at what the constructor and loadData are doing in part2.py

    self.clinit()
    self.loadProgram("part2.cl")
    self.num = num
    self.dt = numpy.float32(dt)

Notice how we explicitly “cast” dt to a numpy float32; this is required to pass scalar (non-buffer) arguments to a PyOpenCL kernel.
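A quick way to see why the cast matters: a plain Python float is a 64-bit double, while a kernel argument declared as a 32-bit float (which is what this sketch assumes for dt) expects exactly 4 bytes. The value 0.01 below is just an example.

```python
import numpy as np

dt = 0.01                  # plain Python float, a 64-bit double
dt32 = np.float32(dt)      # 32-bit value, the size of an OpenCL float

print(np.float64(dt).nbytes)   # 8
print(dt32.nbytes)             # 4
```

With the explicit numpy type, PyOpenCL knows exactly how many bytes to hand to the kernel for that argument.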

Now let’s take a minute to look at what clinit is doing, for it is responsible for setting up our CL Context using the existing GL context.

    plats = cl.get_platforms()
    from pyopencl.tools import get_gl_sharing_context_properties
    import sys
    if sys.platform == "darwin":
        self.ctx = cl.Context(properties=get_gl_sharing_context_properties(),
                             devices=[])
    else:
        self.ctx = cl.Context(properties=[
            (cl.context_properties.PLATFORM, plats[0])]
            + get_gl_sharing_context_properties(), devices=None)

    self.queue = cl.CommandQueue(self.ctx)

First we get a list of available OpenCL platforms on the machine and then import a handy function provided by PyOpenCL which abstracts the messy business of setting up the right properties for sharing a GL context. Due to the eccentric whims of Apple (hey, they pretty much came up with OpenCL), the way you create the context on Mac OS X is slightly different, hence the if statement checking for “darwin”. After that it’s back to business as usual, creating a CommandQueue from the context. If you are using PyOpenCL .92 or beta2011 you will need to replace the contents of the clinit function with this code.
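If context creation fails, it can help to check whether the installed PyOpenCL was built with GL interoperability at all before digging further. The helper below is a hypothetical diagnostic (the name gl_interop_status is made up for this sketch); it only looks at the build and does not create a context, so a True result does not guarantee that your driver supports sharing.

```python
def gl_interop_status():
    """Return True/False for GL-sharing support, or None without PyOpenCL."""
    try:
        import pyopencl as cl
    except ImportError:
        return None  # PyOpenCL is not installed (or failed to import)

    # have_gl() reports whether PyOpenCL was compiled with GL interop;
    # fall back gracefully in case an old version does not provide it.
    have_gl = getattr(cl, "have_gl", None)
    compiled_with_gl = bool(have_gl()) if have_gl is not None else False

    # The GL sharing helper relies on this constant being present.
    has_constant = hasattr(cl.context_properties, "GL_CONTEXT_KHR")
    return compiled_with_gl and has_constant

print(gl_interop_status())
```

A False here suggests rebuilding PyOpenCL with GL support enabled rather than changing the tutorial code.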

loadProgram is the same as in Part 1, where we simply read in the file to instantiate and build a program object. So let’s skip that and get straight into loading the data:

    def loadData(self, pos_vbo, col_vbo, vel):
        import pyopencl as cl
        mf = cl.mem_flags

        #... cut out the saving of variables to self ...

        #Setup vertex buffer objects and share them with OpenCL as GLBuffers
        self.pos_vbo.bind()
        self.pos_cl = cl.GLBuffer(self.ctx, mf.READ_WRITE, int(self.pos_vbo.buffers[0]))
        self.col_vbo.bind()
        self.col_cl = cl.GLBuffer(self.ctx, mf.READ_WRITE, int(self.col_vbo.buffers[0]))

        #pure OpenCL arrays
        self.vel_cl = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vel)
        self.pos_gen_cl = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self.pos)
        self.vel_gen_cl = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=self.vel)
        self.queue.finish()

        # set up the list of GL objects to share with opencl
        self.gl_objects = [self.pos_cl, self.col_cl]

The key here is creating OpenCL buffers from the VBOs using the cl.GLBuffer class. Notice how we do not pass in the VBO object itself, but the actual integer value of the buffer as it is represented by OpenGL. We create normal OpenCL buffers as we did in Part 1, simply by passing in numpy arrays (♥). The other significant change is defining the gl_objects list, which we will use in execute!

Execute!

    def execute(self, sub_intervals):
        cl.enqueue_acquire_gl_objects(self.queue, self.gl_objects)

        global_size = (self.num,)
        local_size = None

        kernelargs = (self.pos_cl,
                      self.col_cl,
                      self.vel_cl,
                      self.pos_gen_cl,
                      self.vel_gen_cl,
                      self.dt)

        # range works on both Python 2 and Python 3
        for i in range(sub_intervals):
            self.program.part2(self.queue, global_size, local_size, *kernelargs)

        cl.enqueue_release_gl_objects(self.queue, self.gl_objects)
        self.queue.finish()

The first thing to point out is that we acquire the gl_objects before we pass them in as arguments to the kernel. This makes sure that OpenGL is not using the buffers for anything and allows us to read from and write to them. We set our global work size to the length of our arrays, essentially saying that each global work-item handles one element of our original array. Once the kernel calls are enqueued, we release the objects so OpenGL can use them again.
We also introduce sub-intervals, allowing us to perform a variable number of updates per frame. This means one can make dt smaller, which generally makes the simulation more accurate, without slowing down the visible motion of the particles. For example, if you decrease dt from .01 to .001, the particles move only a tenth as far in each iteration, so to keep the visual speed the same you would do 10 sub-iterations per frame.
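The trade-off can be checked with a little arithmetic. The sketch below is plain Python, not part of the tutorial code; the function names and the choice of a particle dropped from rest under gravity are assumptions made for illustration. Both settings advance the same amount of simulated time per frame, but the smaller dt lands closer to the exact answer of 0.5 * g * t * t.

```python
def simulated_time_per_frame(dt, sub_intervals):
    """Total simulation time advanced in one rendered frame."""
    return dt * sub_intervals

def fall(dt, sub_intervals, g=-9.8):
    """Euler-integrate a particle dropped from rest for one frame."""
    z, vz = 0.0, 0.0
    for _ in range(sub_intervals):
        vz += g * dt      # update velocity from gravity
        z += vz * dt      # then update position from velocity
    return z

exact = 0.5 * -9.8 * 0.01 ** 2

print(simulated_time_per_frame(0.01, 1), simulated_time_per_frame(0.001, 10))
print(abs(fall(0.01, 1) - exact), abs(fall(0.001, 10) - exact))
```

The second print shows the error of each setting against the exact value; the run with 10 sub-intervals has the smaller error while covering the same time span.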

The Kernel

So after all that setup we are ready to run our kernel, found in part2.cl.

This is also the exact same kernel as used in my C++ version.

So there you have it, the essentials of interoperating between OpenCL and OpenGL in Python! I didn’t cover what I did in my utility files, but those are subjects for later posts. You should be able to poke around to see what’s going on. As for rendering, I’m using very basic GL calls for VBOs, which other tutorials cover.

I’d like to shout out to Keith Brafford for helping test and refactor this code on Windows as well as the PyOpenCL patch I worked on to get GL interop working on the Mac. Of course this tutorial wouldn’t be possible without the valiant efforts of Andreas Klöckner!

As a sample of what’s possible I implemented a solver for the 1D Wave equations as described here.

This entry was posted in advcl, code, opencl, python, tutorial.

21 Responses to Adventures in PyOpenCL: Part 2, Particles with PyOpenGL

Aleksandar says:

June 29, 2011 at 7:58 am

I tried to execute main.py but I get this result:
from pyopencl.tools import get_gl_sharing_context_properties
ImportError: cannot import name get_gl_sharing_context_properties
I’ve searched all over the Internet but couldn’t find anything.

enj says:

June 29, 2011 at 3:23 pm

@Aleksandar
What version of PyOpenCL are you running? It sounds to me like you have an older one. Can you try again with 2011.1?
http://pypi.python.org/pypi/pyopencl

Aleksandar says:

June 30, 2011 at 10:34 am

I use 0.92-1, which is the only one I have in the repository (I’m using Ubuntu 11.04). I’m going to try to install 2011.1.

Aleksandar says:

June 30, 2011 at 11:18 am

I’ve installed pyopencl 2011.1 and when I run main.py I get this error:
AttributeError: type object 'context_properties' has no attribute 'GL_CONTEXT_KHR'

enj says:

June 30, 2011 at 2:14 pm

I will have to try 2011.1 myself then, I was using pyopencl from source and haven’t updated in a few weeks. I’ll get it and see if I can replicate the issue.

What OS are you on?

Aleksandar says:

July 2, 2011 at 4:10 am

Ubuntu 11.04

Jake says:

July 4, 2011 at 6:48 pm

I’m seeing the same thing – Ubuntu 11.04

The error is occurring in pyopencl/tools.py

Jake says:

July 4, 2011 at 7:06 pm

Ah, the solution was rather simple.

When configuring pyopencl, use:

configure.py --no-cl-enable-gl

Once done, pyopencl.context_properties now has:

['CGL_SHAREGROUP_KHR', 'EGL_DISPLAY_KHR', 'GLX_DISPLAY_KHR', 'GL_CONTEXT_KHR', 'PLATFORM', 'WGL_HDC_KHR', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'to_string']

Aleksandar says:

July 6, 2011 at 9:38 am

Thanks man, I resolved that problem but now I have another one:

line 37, in loadData
self.pos_cl = cl.GLBuffer(self.ctx, mf.READ_WRITE, int(self.pos_vbo.buffers[0]))
pyopencl.LogicError: clCreateFromGLBuffer failed: invalid gl object

Radu says:

July 8, 2011 at 10:05 am

Hello.
I followed the steps in the tutorial but failed to run the program.
I get this error:
File "main.py", line 135, in
p2 = window()
File "main.py", line 56, in __init__
(pos_vbo, col_vbo, vel) = initialize.fountain(num)
File "/home/radu/Desktop/Python/adventures_in_opencl/python/part2/initialize.py", line 84, in fountain
pos_vbo.bind()
File "vbo.pyx", line 227, in OpenGL_accelerate.vbo.VBO.bind (src/vbo.c:2590)
File "vbo.pyx", line 181, in OpenGL_accelerate.vbo.VBO.create_buffers (src/vbo.c:1872)
File "latebind.pyx", line 32, in OpenGL_accelerate.latebind.LateBind.__call__ (src/latebind.c:559)
File "wrapper.pyx", line 308, in OpenGL_accelerate.wrapper.Wrapper.__call__ (src/wrapper.c:5059)
File "/usr/lib/pymodules/python2.7/OpenGL/platform/baseplatform.py", line 340, in __call__
self.__name__, self.__name__,
OpenGL.error.NullFunctionError: Attempt to call an undefined function glGenBuffersARB, check for bool(glGenBuffersARB) before calling

I also have Ubuntu 11.04.

enj says:

July 10, 2011 at 1:43 am

@Jake, @Aleksandar I don’t have Ubuntu available at the moment to test with you, but I just updated to the latest source and everything still works on my Mac. Those may be problems with PyOpenGL; as I recall, on a school Ubuntu 11.04 machine I had to build PyOpenGL from source. That might address your glGenBuffers problem, @Radu, unless your graphics card drivers aren’t up to date.

Pingback: Visualizing OpenCL computations within OpenGL | Omar Abo-Namous

Omar says:

August 6, 2011 at 11:31 am

Thank you very much for the tutorial. It helped me further with learning the interaction between opengl and opencl. Here is my article: Visualizing OpenCL computations within OpenGL

Ali says:

August 7, 2011 at 7:57 pm

Hi,
I’m having this problem on my macbook pro. It has a built in graphics unit and ‘GeForce GT 330M’:

Timings:
fountain_np | average: 4.44102287292 | total: 4.44102287292 | count: 1

Traceback (most recent call last):
File "main.py", line 135, in
p2 = window()
File "main.py", line 58, in __init__
self.cle = part2.Part2(num, dt)
File "/Users/aarslan/Documents/PYTHON_TOOLS/enjalot-adventures_in_opencl-a5bb2a1/python/part2/part2.py", line 15, in __init__
self.clinit()
File "/Users/aarslan/Documents/PYTHON_TOOLS/enjalot-adventures_in_opencl-a5bb2a1/python/part2/part2.py", line 78, in clinit
self.ctx = cl.Context(properties=get_gl_sharing_context_properties(),
File "/Library/Frameworks/EPD64.framework/Versions/7.0/lib/python2.7/site-packages/pyopencl-2011.1.2-py2.7-macosx-10.5-x86_64.egg/pyopencl/tools.py", line 272, in get_gl_sharing_context_properties
(ctx_props.CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, cl.get_apple_cgl_share_group()))
AttributeError: type object 'context_properties' has no attribute 'CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE'

Jonathan Lettvin says:

August 28, 2011 at 2:37 am

I just installed ubuntu 11.04 and used synaptic package manager to install pyopencl.
When I try to run the example programs I get:

Xlib: extension "GLX" missing on display ":0.0".
freeglut (main.py): OpenGL GLX extension not supported by display ':0.0'

Please advise how to fix this.

Erik Edin says:

August 31, 2011 at 2:56 pm

Hi!

I just bought a new laptop, with an NVIDIA GT555M card, running Windows 7 x64.
I initially had issues running the Python part2 particle demo, with an error similar to some above.
The actual error I got was:
pyopencl.LogicError: Context failed: invalid gl sharegroup reference khr

I realized after a bit of worrying that the problem was in the Optimus device that the newer laptops have, that is the integrated powersaving graphics card, if I understand correctly.
I was however able to fix that by using the NVIDIA control panel and changing the Global settings to always use the NVIDIA GT555M card, instead of the integrated Optimus card. Then I was able to run the demo without the error message.

Matt says:

December 28, 2011 at 7:09 pm

I get this as well. Google is short on solutions.

python main.py
Timings:
fountain_np | average: 23.1220722198 | total: 23.1220722198 | count: 1

Traceback (most recent call last):
File "main.py", line 135, in
p2 = window()
File "main.py", line 58, in __init__
self.cle = part2.Part2(num, dt)
File "/Users/xxx/pyopencl/enjalot-adventures_in_opencl-94f53cf/python/part2/part2.py", line 15, in __init__
self.clinit()
File "/Users/xxx/pyopencl/enjalot-adventures_in_opencl-94f53cf/python/part2/part2.py", line 78, in clinit
self.ctx = cl.Context(properties=get_gl_sharing_context_properties(),
File "/Library/Python/2.7/site-packages/pyopencl-2011.2-py2.7-macosx-10.7-intel.egg/pyopencl/tools.py", line 282, in get_gl_sharing_context_properties
(ctx_props.CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, cl.get_apple_cgl_share_group()))
AttributeError: type object 'context_properties' has no attribute 'CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE'

OS X 10.7.2 — Build 11C74
Active Graphics Card:

NVIDIA GeForce 9600M GT:

Chipset Model: NVIDIA GeForce 9600M GT
Type: GPU
Bus: PCIe
PCIe Lane Width: x16
VRAM (Total): 512 MB
Vendor: NVIDIA (0x10de)
Device ID: 0x0647
Revision ID: 0x00a1
ROM Revision: 3437
gMux Version: 1.7.3
Displays:
Color LCD:
Resolution: 1440 x 900
Pixel Depth: 32-Bit Color (ARGB8888)
Main Display: Yes
Mirror: Off
Online: Yes
Built-In: Yes

Steve says:

February 14, 2012 at 12:22 pm

@Aleksandar

I’ve just spent aaaaaaaall afternoon trying to fix this problem:

“I’ve installed pyopencl 2011.1 and when i run main.py I get this error:
AttributeError: type object 'context_properties' has no attribute 'GL_CONTEXT_KHR'”

But I finally did it. It indicates that either your device doesn’t support OpenCL/OpenGL interoperability, or that PyOpenCL was compiled without it. To get it to work, I had to download cl_ext.h from the Khronos website and put it in /usr/include/CL/, then alter “siteconf.py” which was created by “configure.py” when building PyOpenCL and set CL_ENABLE_GL to True.

Since I also spent the day installing various versions of NVidia drivers, it’s possible something else also helped, but I’m pretty sure that’s what fixed it.

tn says:

February 17, 2012 at 3:30 pm

Hi,
Thanks for your interesting tutorial.
However, I was wondering how you would do the following thing:
at the moment, you have a single kernel doing all the math, but for some other problems, you can not do that.
Say for example that at a given time you want to sum the speed of all particles (silly example, but you get the idea) and stop on a condition for that value.

As you mention in your code, we sometimes need to keep the vectors in GPU memory, do some other computation and go back to the initial kernel.
How would you do that?
Waiting for the next example :)

Thanks

joost says:

March 25, 2012 at 4:54 pm

I also had the ImportError: cannot import name get_gl_sharing_context_properties first, so I built pyopencl-2011.2 from source.
Then I had the ‘GL_CONTEXT_KHR’ exception, which I tried to solve by following the advice above, but after some trial-and-error this worked for me:
./configure.py --cl-enable-gl

I am on Kubuntu 11.10 (same as Ubuntu), using the most recent nvidia driver.

Ryan says:

April 29, 2012 at 12:50 pm

Thanks for this. I hacked the nbody algo into the kernel. It works fantastic on my gtx285 with around 10k particles.