Adventures in OpenCL Part 2: Particles with OpenGL

20,000 particles being shared by OpenGL and OpenCL

This tutorial series is aimed at developers trying to learn OpenCL from the bottom up, with a focus on practicality. This installment introduces OpenCL context sharing with OpenGL, and we build a simple particle system to demonstrate it. One of the most important benefits of context sharing is the time we save by rendering and computing on the same memory on the GPU: we don’t need to copy data back and forth!

You may want to grab the code and compile it to see it in action. As usual I recommend having a copy of the OpenCL specification handy.

In this tutorial I use the C++ bindings with less explanation, so see Part 1.5 for a more in-depth explanation or Part 1 to get started with the C bindings. As before, I have tested this code on NVIDIA hardware on my MacBook Pro and on an Ubuntu workstation. ATI and Windows users are encouraged to try it out and let me know about any problems so I can update the tutorial.

Sharing is Caring!

Let’s see what we need to do to get OpenCL and OpenGL sharing a context. The first thing we need is an OpenGL context! For this tutorial I use GLUT to create a window that we can draw in, and GLEW for OpenGL extensions (at least on Linux). This means you will need to have those headers and libraries installed on your system. Once you have those, you can try building:

cd part2
mkdir build
cd build
cmake ..
make

The Source Code Files

Let’s go over the source files again. They are the same files as in the last tutorial, but a couple of them gain functionality, so it is worth quickly going over the changes.

main.cpp
This is where we test out our CL class. We set up a GLUT window and OpenGL context. Then we instantiate our CL class, prepare a simple particle system and initialize it. At the end of the file are several helper functions for manipulating the OpenGL view with the mouse and keyboard.

cll.h
The main header file for our CL class definition, also handles including the OpenCL libraries. I’ve downloaded the header files from the Khronos website to avoid having to search the computer for a particular SDK. Note that I’ve had to make a slight change to cl.hpp for Mac users because of a bug in the implementation, which I will cover later.

cll.cpp
The core implementation of our CL class, including functions for initializing the OpenCL context from OpenGL and for loading and building an OpenCL program.

part2.cpp
Implementation of the functions that set up and run the OpenCL kernel. This is where we actually see OpenCL in action.

part2.cl
The actual OpenCL code to be executed. Right now it’s a simple particle system that models gravity.

util.h and util.cpp
Utility functions that make things like creating VBOs or printing out OpenCL error messages easier.

CMakeLists.txt
The configuration and build script used to build the project. This makes it easier to be portable, and building our code as a library makes it easier to contribute to other projects.

The Source Code Contents

Our main.cpp is a bit messier than before, mostly because of the OpenGL setup. You may already have an OpenGL context to plug your OpenCL code into, in which case you just want to make sure you are using Vertex Buffer Objects (VBOs) that OpenCL can use to create its buffers.
I will just point out the bits that matter to OpenCL, starting with the number of particles and a pointer to our CL class:

#define NUM_PARTICLES 10000
CL* example;

and later we instantiate it like:

 example = new CL();

We do this so that we can access the object from the GLUT loop, specifically in the appRender function, which GLUT calls to update the display. Before we can talk about rendering, though, we need to load some data! Take a look at the for loop where we populate three vectors of Vec4, which is just a typedef for four floats (x, y, z, w) declared in cll.h. We then pass these to our class, which pushes the data to the GPU.

example->loadData(pos, vel, color);
example->popCorn();
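The population loop described above might look roughly like the following. This is a minimal sketch rather than the tutorial's actual code: the Vec4 struct is a stand-in for the one in cll.h, makeParticles is a hypothetical helper, and the initial positions, velocities and colors are made-up illustrative values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for the Vec4 declared in cll.h, which the tutorial describes as
// four floats: x, y, z, w. The real declaration may differ.
struct Vec4 {
    float x, y, z, w;
};

// Hypothetical helper (not from the tutorial's source): fills the three
// per-particle arrays. The ring layout, zero velocity, and red color are
// illustrative values only.
void makeParticles(std::size_t num,
                   std::vector<Vec4>& pos,
                   std::vector<Vec4>& vel,
                   std::vector<Vec4>& color) {
    assert(num > 0);  // avoids dividing by zero when spacing the particles
    pos.resize(num);
    vel.resize(num);
    color.resize(num);

    const float twoPi = 6.2831853f;
    const float radius = 0.5f;
    for (std::size_t i = 0; i < num; ++i) {
        // Spread the particles evenly around a ring in the xy-plane.
        const float angle = twoPi * static_cast<float>(i) / static_cast<float>(num);
        const Vec4 p = { radius * std::cos(angle), radius * std::sin(angle), 0.0f, 1.0f };
        const Vec4 v = { 0.0f, 0.0f, 0.0f, 0.0f };  // start at rest
        const Vec4 c = { 1.0f, 0.0f, 0.0f, 1.0f };  // opaque red (RGBA)
        pos[i] = p;
        vel[i] = v;
        color[i] = c;
    }
}
```

With the tutorial's class, the filled vectors would then be handed over with example->loadData(pos, vel, color).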

Lastly in main.cpp let’s take a look at the void appRender() function, where the first thing we do is update the particle system:

example->runKernel();

This is followed by the rendering code, which is standard OpenGL for drawing points from a VBO. I learned this from here.

Now let’s take a look at some changes to cll.cpp.
It is easier to just click the link and view the whole source file, since the code is too long for a snippet here, but the major addition is the way we create the context. Each operating system uses different extensions to accomplish the same thing. The code mostly comes from the NVIDIA GPU SDK examples, but I modified it to use the C++ bindings. In doing so I found an inconsistency in Apple’s implementation and had to add an extra constructor to the Context class in cl.hpp (around line 1448) to compensate.

We see the most OpenCL action in part2.cpp, which also demonstrates creating a CL buffer from a GL buffer. First we create the GL buffer as a VBO:

p_vbo = createVBO(&pos[0], array_size, GL_ARRAY_BUFFER, GL_DYNAMIC_DRAW);

The nice thing about std::vectors is that they store their elements in a tightly packed array, so we can just pass the address of the first element. Next we store our VBOs in another vector as cl::BufferGL objects:

cl_vbos.push_back(cl::BufferGL(context, CL_MEM_READ_WRITE, p_vbo, &err));

We don’t need to push any data to them like we do our pure OpenCL buffers because they just reference the data that is already in the VBO!
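The tightly packed layout mentioned above, and the byte count that a value like array_size has to carry, can be checked without any GPU. The sketch below assumes array_size is the particle count times sizeof(Vec4), which the tutorial does not state. Vec4 is again a stand-in for the cll.h type, and byteSize, rawData and fakeUpload are hypothetical names, with fakeUpload merely imitating the byte copy a real VBO upload would perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the Vec4 from cll.h (four floats: x, y, z, w).
struct Vec4 {
    float x, y, z, w;
};

// If the compiler added padding, the raw-pointer trick would break.
static_assert(sizeof(Vec4) == 4 * sizeof(float), "Vec4 must be tightly packed");

// Hypothetical helper: number of bytes the vector's elements occupy.
// Assumed to be what the tutorial's array_size holds.
std::size_t byteSize(const std::vector<Vec4>& v) {
    return v.size() * sizeof(Vec4);
}

// Hypothetical helper: address of the first element, as passed to createVBO.
const void* rawData(const std::vector<Vec4>& v) {
    assert(!v.empty());  // &v[0] is not valid for an empty vector
    return &v[0];
}

// Stand-in for a buffer upload: copies the raw bytes into a flat float array,
// much as the driver would copy them into the VBO.
std::vector<float> fakeUpload(const void* data, std::size_t bytes) {
    std::vector<float> out(bytes / sizeof(float));
    if (!out.empty()) {
        std::memcpy(&out[0], data, out.size() * sizeof(float));
    }
    return out;
}
```

Copying byteSize(pos) bytes starting at rawData(pos) yields the floats in x, y, z, w order, particle after particle, which is the layout the VBO ends up holding.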

Our popCorn function just loads the kernel like before and sets its arguments. Finally, in the runKernel function we have to do a couple of extra things to work with our VBOs, namely

err = queue.enqueueAcquireGLObjects(&cl_vbos, NULL, &event);

and

err = queue.enqueueReleaseGLObjects(&cl_vbos, NULL, &event);

These calls acquire the buffers before we execute the kernel and release them when we are done. This way we can safely work on the data without interfering with OpenGL, and without OpenGL interfering with our OpenCL.

The last thing to talk about is the cl code itself, but I’ve heavily commented the code so it will be easier to just go read the file :)
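For readers who want a feel for the update before opening the file, here is a CPU-side sketch of the kind of step a simple gravity kernel might perform. It is not a transcription of part2.cl: the function name stepParticles, the choice of z as the vertical axis, the 9.8 constant and the integration order are all assumptions. In the OpenCL version each loop iteration would correspond to one work-item.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the Vec4 from cll.h (four floats: x, y, z, w).
struct Vec4 {
    float x, y, z, w;
};

// Hypothetical CPU reference for one time step of a gravity-only particle
// system. Axis, constant, and update order are assumptions, not taken from
// part2.cl.
void stepParticles(std::vector<Vec4>& pos, std::vector<Vec4>& vel, float dt) {
    assert(pos.size() == vel.size());  // one velocity per particle

    const float gravity = -9.8f;  // assumed acceleration along z
    for (std::size_t i = 0; i < pos.size(); ++i) {
        // Gravity changes the velocity first...
        vel[i].z += gravity * dt;
        // ...then the updated velocity moves the particle.
        pos[i].x += vel[i].x * dt;
        pos[i].y += vel[i].y * dt;
        pos[i].z += vel[i].z * dt;
        // pos[i].w is left untouched so it stays valid as a homogeneous coordinate.
    }
}
```

Because the kernel writes straight into the shared VBO, the equivalent of pos here would never have to be copied back to the host before drawing.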

I hope this tutorial helps. Please let me know if I’m mistaken anywhere so I can correct it for others reading! I’m learning a lot about particle systems for my master’s thesis at the Florida State University Department of Scientific Computing!

12 thoughts on “Adventures in OpenCL Part 2: Particles with OpenGL”

blenderificus August 30, 2010 at 5:40 am

thanks for sharing more insight into programming particles for openCL. Can’t wait for your work to make it into trunk!
Patrick A. September 10, 2010 at 5:55 pm

Really great tutorial! I’m trying to compile your source code under Netbeans 6.9.1 for Windows 7 but it seems that the file GL/glx.h is missing. What is this file used for? Also, which libraries are you linking your GLUT and OpenGL things to, as I am failing to link the source code to the standard libraries (Glew32.lib, Glu32.lib, OpenGL32.lib and Glut32.lib)…
Thanks,
enj (Post author) September 14, 2010 at 6:19 pm

@Patrick,
Thanks for the feedback. Unfortunately I don’t have a Windows machine handy to test with, but I did check and glx.h is not necessary on Windows (it pertains to functions on Linux that get the OpenGL context).
You should be able to link to those standard OpenGL libraries. Is NetBeans using the CMakeLists.txt file or does it have its own build system?

At some point I’ll be testing everything on Windows, but right now my time is focused on my thesis project (which is where the knowledge for these tutorials comes from) so unfortunately I can’t help just yet.
Calin November 24, 2010 at 4:46 am

Thanks for sharing your knowledge. Great tutorial.
I tested it on Ubuntu 10.04 with ATI 4650, StreamSdk 2.2 and it works perfectly.
Garet December 28, 2010 at 5:49 am

Got it up and running in Code::Blocks on Windows 7 x64 with MinGW

Thanks much =D Will be very helpful in conceptualizin’ opencl
Garet December 28, 2010 at 6:04 am

oh btw, i linked to:
libopengl32.a
libglu32.a
libfreeglut.a
OpenCL.lib

in that order

also i changed the top of part2.cpp to look like:

#define GLUT_DISABLE_ATEXIT_HACK
#define GLEW_STATIC

#include <windows.h>
//OpenGL stuff
#include <glew.c>
#if defined __APPLE__ || defined(MACOSX)
#include <GLUT/glut.h>
#else
#include <GL/glut.h>
#endif
#include <GL/gl.h>
#include <stdio.h>
#include <stdlib.h>
#include <cstdlib>
#include <sstream>
#include <iomanip>
#include <math.h>
#include <vector>
#include "cll.h"
#include "util.h"
#include <string>
#include <string.h>

and i removed any place i saw includes being recalled that i could find based on that order.

the top of main.cpp just looks like:
#include "part2.cpp"

#define NUM_PARTICLES 20000
CL* example;

etc,…

and maybe the more important thing with glew is that i just went ahead and cheated and plopped the glew.c file into mingw/include and glew.h into mingw/include/GL
and then just #include<glew.c> to avoid some strange link errors ^_^;;

glew.h should be at top of glew.c ala
#include <GL/glew.h>
#if defined(_WIN32)
# include <GL/wglew.h>
#elif !defined(__APPLE__) || defined(GLEW_APPLE_GLX)
# include <GL/glxew.h>
#endif

which i think should shed some light on that glx business.. prolly related to GLEW_APPLE_GLX
Keith January 12, 2011 at 5:47 am

Nice tutorials so far. Have you considered getting into PyOpenCL?
karimkhan January 13, 2011 at 4:05 am

Hi everybody ,
I tried to run the part2.x, but it’s giving a segmentation fault. What should I do, can anyone suggest?
enja I hope you will do it…

Thanks…
Eustachy November 22, 2011 at 4:11 am

In my application, I had a problem creating shared context under Mac OS X (10.6.8) with CL_INVALID_GL_OBJECT error returned by clCreateFromGLBuffer(). Your example failed to run for the same reason. Turns out it works fine if you create VBO with GL_STATIC_DRAW instead of GL_DYNAMIC_DRAW. Not sure what is the reason, but it seems there is one, because the example I found on developer.apple.com explicitly defines this param as STATIC_DRAW if gl interop is enabled (and uses DYNAMIC_DRAW when gl interop is disabled).