Hello world! I’m excited to be writing to you from the beautiful Bay Area in California, specifically from Lawrence Berkeley National Laboratory at the top of the hill in Berkeley.
This summer I’ll be working with Hank Childs and the visualization team of the Computational Research Division on, excuse my nerdiness, some pretty cool stuff. The underlying goal of my project is to study the use of GPUs for accelerating distributed computations on supercomputers. The specific topic of my research will be particle advection for FTLE (finite-time Lyapunov exponent) calculations. In plain English, you could think of this as trying to find out how fast things separate in a flow: imagine dropping fish food in a turbulent stream and seeing where the flakes get separated the fastest. Like my Master’s research, this project uses OpenCL and moves around a lot of particles, only this time I will be moving a lot more particles on much bigger computers. Luckily, many of the efforts I make here will improve both my understanding and some of the underlying code for my Blender project. I have already reused several pieces from the RTPS library, and I plan on releasing my OpenCL wrappers as an improved standalone library very soon (I just need to clean things up and add a couple of small examples).
Here is an unscientific but interesting picture of my first successful result. After the picture, I describe my project and progress in painful (it hurts so good) computational science detail.
My research problem for the summer is investigating particle advection on the GPU in a distributed memory setting.
The key issues to explore are the costs and benefits of using a GPU in this environment. Since GPUs can greatly accelerate computation but suffer from increased memory latency, we want to find out exactly how using the GPU changes the equations governing communication and performance, with an interesting visualization problem as the guinea pig. The FTLE computations themselves are already complete and were handed to me by some smart coworkers in my first week. The interesting thing about FTLE is that it requires seed particles to be advected by a velocity field, a common problem when visualizing data from computational fluid dynamics simulations.
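As a rough illustration of what advecting seeds for FTLE involves (this is not the OpenCL code from this project, just a minimal single-threaded Python sketch), the block below advects particles with a classical fourth-order Runge-Kutta integrator and then estimates the FTLE from how much neighbouring seeds have separated. The velocity field is an assumption: a simple steady saddle flow chosen because its FTLE is known to be exactly 1. The function names `velocity`, `advect`, and `ftle` are hypothetical.

```python
import math

def velocity(x, y):
    # Assumed toy field (a steady saddle flow), not data from a real simulation.
    return x, -y

def advect(x, y, t_end, steps):
    """Advect one particle from t=0 to t=t_end with classical RK4."""
    h = t_end / steps
    for _ in range(steps):
        k1x, k1y = velocity(x, y)
        k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
        k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
        k4x, k4y = velocity(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y

def ftle(x, y, t_end, steps=100, d=1e-3):
    """Estimate the FTLE at seed (x, y) from four advected neighbours."""
    xr, yr = advect(x + d, y, t_end, steps)
    xl, yl = advect(x - d, y, t_end, steps)
    xu, yu = advect(x, y + d, t_end, steps)
    xd, yd = advect(x, y - d, t_end, steps)
    # Central-difference gradient F of the flow map (final position vs. seed).
    a = (xr - xl) / (2 * d)  # dX/dx0
    b = (xu - xd) / (2 * d)  # dX/dy0
    c = (yr - yl) / (2 * d)  # dY/dx0
    e = (yu - yd) / (2 * d)  # dY/dy0
    # Cauchy-Green tensor C = F^T F, a symmetric 2x2 matrix.
    c11 = a * a + c * c
    c12 = a * b + c * e
    c22 = b * b + e * e
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    mean = 0.5 * (c11 + c22)
    diff = 0.5 * (c11 - c22)
    lam_max = mean + math.sqrt(diff * diff + c12 * c12)
    return math.log(math.sqrt(lam_max)) / abs(t_end)

value = ftle(0.3, 0.2, t_end=1.0)
```

Each seed is independent of every other seed, which is what makes this kind of computation a natural fit for a GPU: in a sketch like this, the loop over seeds would become one work item per particle.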
My approach is to first implement code that accomplishes this in three phases, then study its performance in comparison with an existing CPU implementation by a coworker. The first phase, implementing the FTLE calculations in OpenCL on one GPU, is relatively complete, as evidenced by the picture above. The second phase, which I am beginning this week, is to set up a communications infrastructure using some routines from VisIt (which are based on MPI) for transferring particles between nodes and sending them to and from the GPU. The third phase will combine the two so that calculations are performed and the results are communicated as necessary.
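To give a feel for the bookkeeping behind transferring particles between nodes, here is a minimal single-process Python sketch. It makes no MPI or VisIt calls, and everything in it is an assumption for illustration: the domain is the interval [0, 1) in x, split into equal slabs with one slab per rank, and the names `owner_of` and `route` are hypothetical. In a real distributed run, each inbox would be filled by sends and receives between processes rather than by list appends.

```python
def owner_of(x, n_ranks, x_min=0.0, x_max=1.0):
    """Return the rank whose slab contains x, or None if x left the domain."""
    if not (x_min <= x < x_max):
        return None
    width = (x_max - x_min) / n_ranks
    # min() guards against floating-point round-up at the last slab boundary.
    return min(int((x - x_min) / width), n_ranks - 1)

def route(particles_by_rank, n_ranks):
    """One exchange round: hand each particle to the rank that owns its position."""
    inbox = [[] for _ in range(n_ranks)]
    exited = []  # particles that have left the whole domain
    for particles in particles_by_rank:
        for p in particles:
            dest = owner_of(p[0], n_ranks)
            if dest is None:
                exited.append(p)
            else:
                inbox[dest].append(p)
    return inbox, exited

# Two ranks; after an advection step some particles sit in the wrong slab.
ranks = [[(0.1, 0.5), (0.6, 0.5)], [(0.9, 0.2), (1.2, 0.3)]]
inbox, exited = route(ranks, n_ranks=2)
```

Here the particle at x = 0.6 moves from rank 0 to rank 1, and the particle at x = 1.2 is retired because it has left the domain.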
I’m not sure how long phase 2 will take, as I have relatively little experience with MPI-style coding. My distributed data experience is limited to website load balancing, and I’m not yet sure whether there are more similarities or differences.
I am encouraged by my leaders here to write a weekly report. Some of these, such as this first one, will be posted on my blog; others may remain internal until the time is right. In any case, I’m excited to share as much as possible about what I’m learning, and I look forward to pushing OpenCL to new frontiers!