And so I reach another milestone in my journey to put OpenCL Particles in Blender! I’ve written my own (very) simple particle system from scratch in C++ using OpenGL and OpenCL, taking advantage of VBO interoperability. OpenCL and OpenGL interop is all about keeping the processing on the GPU: the array of vertices you are drawing from and the array of vertices you are moving around both refer to the same spot in GPU memory. This saves lots of time, especially if you are manipulating millions of particles in real time and you don’t want to transfer megabytes of data back and forth each frame!
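To put rough numbers on that transfer cost, here is a back-of-the-envelope sketch in C++. It assumes each particle stores a float4 position and a float4 color, and that without interop both arrays are read back to the host and uploaded again once per frame. The function names are made up for illustration; none of this comes from the actual library.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: one float4 for the vertex and one float4 for the color.
constexpr std::size_t kFloatsPerAttribute = 4;
constexpr std::size_t kAttributesPerParticle = 2;

// Bytes occupied by the vertex and color arrays for `particles` particles.
constexpr std::size_t buffer_bytes(std::size_t particles) {
    return particles * kAttributesPerParticle * kFloatsPerAttribute * sizeof(float);
}

// Bytes moved per second if both arrays make one round trip
// (device to host, then host to device) every frame.
double round_trip_bytes_per_second(std::size_t particles, double frames_per_second) {
    assert(frames_per_second > 0.0);
    return 2.0 * static_cast<double>(buffer_bytes(particles)) * frames_per_second;
}
```

With 4-byte floats, one million particles come to 32 MB of buffers, so a round trip every frame at 60 frames per second would move roughly 3.84 GB each second. With interop, that traffic should drop to zero.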
I based my work on the NVIDIA GPU SDK example oclSimpleGL, but I wanted to make my code a standalone library that could be popped into some other graphics context (namely Blender). I was able to nicely separate my code into relevant files: opencl.cpp for the OpenCL functions, enja.cpp for all the particle stuff, and main.cpp for the window/GLUT and rendering stuff. Instantiating an EnjaParticles class creates a system and populates two VBOs, one for vertices and one for colors, and also sets up an OpenCL context and all the necessary OpenCL variables. All one has to do is call the update function and the OpenCL kernel is executed, automatically updating the VBOs, which can then be rendered by OpenGL. Right now the constructor for the particle system only takes in an array of vertices (which I intend to get from a DerivedMesh object in Blender) and makes the VBO ids available so the rendering context can use them.
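The shape of that interface can be sketched without any GPU code at all. The class below is a hypothetical CPU-only stand-in, not the real EnjaParticles class: plain std::vector objects take the place of the two VBOs, and a simple loop takes the place of the OpenCL kernel. The names ParticleSystemSketch, Vec4 and kStep, and the motion itself, are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec4 { float x, y, z, w; };

// Hypothetical step size; the real kernel's motion is not described here.
constexpr float kStep = 0.5f;

// CPU-only stand-in for the library's public surface.
class ParticleSystemSketch {
public:
    // Takes only an array of vertices and fills a matching color array.
    explicit ParticleSystemSketch(const std::vector<Vec4>& vertices)
        : vertices_(vertices),
          colors_(vertices.size(), Vec4{1.0f, 0.0f, 0.0f, 1.0f}) {
        assert(!vertices.empty());
    }

    // Stand-in for launching the kernel: nudge every particle along +y.
    void update() {
        for (Vec4& v : vertices_) {
            v.y += kStep;
        }
    }

    // The real library would hand out VBO ids here; this sketch exposes
    // the arrays themselves so a caller can read the updated data.
    const std::vector<Vec4>& vertices() const { return vertices_; }
    const std::vector<Vec4>& colors() const { return colors_; }

private:
    std::vector<Vec4> vertices_;
    std::vector<Vec4> colors_;
};
```

The point of the split is that the caller never touches OpenCL: it constructs the system, calls update() once per frame, and draws whatever the buffers contain.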
The code currently compiles and runs on both Ubuntu with NVIDIA drivers on a GTX480 (yeah baby!) and my MacBook Pro with an NVIDIA GeForce 9400M (it shouldn’t be hard to port to Windows, but I don’t have time for that now). I use CMake, so check out the README to make sure you configure your environment correctly (you need to set up a couple of environment variables so it can find the right headers to include and libraries to link against). In the CMakeLists.txt file you can also turn GL_INTEROP on and off if you want to see the performance difference. This code also runs in the OpenCL Visual Profiler, which wasn’t the easiest thing to get working. You need to be very careful not only with the OpenCL objects you instantiate, but also with the ones that get created implicitly when you call cl functions! Once it works, it’s very nice for seeing how your code is behaving on the GPU.
I’ve noticed that on my MacBook Pro the GL/CL interop is not behaving as expected, and I suspect that memory is still being transferred. Without the OpenCL profiler I will have to do my own timings to get to the bottom of this. Adding to my suspicion, the oclSimpleGL example also shows no difference in performance on my MBP.
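For hand-rolled timings like that, a small helper along these lines might be enough. It is only a sketch: average_ms_per_call is a made-up name, and the callable you pass in is assumed to wrap one full update-and-draw step.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Average wall-clock milliseconds per call of `frame` over `iterations` calls.
// For GPU work, `frame` should block until the device has finished (for
// example by ending with clFinish on the command queue and glFinish),
// otherwise only the cost of queuing the work gets measured.
double average_ms_per_call(const std::function<void()>& frame, int iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        frame();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}
```

Running it once with GL_INTEROP on and once with it off, over a few hundred frames each, should show whether the buffers are still making the trip through host memory.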
It took me over a week to get this working the way I expected, and I don’t intend for it to take that long to get it into Blender. I really want to start working on interesting stuff like rigid body collision and Python interaction, not to mention more complex physics!