Swan: A simple tool for porting CUDA to OpenCL
Download latest version
For the impatient: download the latest version of Swan (16 December 2010) here. By downloading this file, you are accepting the GPL2 license (http://www.gnu.org/licenses/gpl-2.0.txt). Earlier versions are available via the wiki history. To build, untar and follow the instructions in the README. The requirements are:
- A modern(ish) Linux OS
- CUDA toolkit and/or OpenCL runtime and a matching GPU
- Perl (with the package Text::Balanced)
- gcc/g++ 3.x or later
Changes in latest update (10 May 2010):
- Support for Fedora 12+ (changed from using varargs to an explicit array)
- Support for SM_20/Fermi
- Some support for OpenCL images (set $has_opencl_images=1 in swan.ocl)
What is it?
Swan is a small tool that aids the reversible conversion of existing CUDA codebases to OpenCL. It does several useful things:
- Translates CUDA kernel source-code to OpenCL.
- Provides a common API that abstracts both CUDA and OpenCL runtimes.
- Preserves the convenience of the CUDA <<< grid, block >>> kernel launch syntax by generating C source-code for kernel entry-point functions.
It can also be used to compile and manage kernels written directly for OpenCL.
Why might you want it?
Possible uses include:
- Evaluating OpenCL performance of an existing CUDA code.
- Maintaining a dual-target OpenCL and CUDA code.
- Reducing dependence on NVCC when compiling host code.
- Supporting multiple CUDA compute capabilities in a single binary.
- Using it as a runtime library for managing OpenCL kernels in new development.
It's not a drop-in replacement for nvcc. Host code needs to have all kernel invocations and CUDA API calls re-written.
Swan does not support a few things. In particular:
- CUDA C++ templating in kernel code.
- OpenCL Images/Samplers (analogous to CUDA textures); texture interpolation is done in software instead.
- Multiple device management in a single process.
- Compiling kernels for the CPU.
- CUDA device-emulation mode.
Furthermore, it's a work in progress. It works for our code, but there are no promises that it will work for yours.
The essential steps for converting a CUDA code:
- Separate CUDA kernels into separate source files.
- Compile the kernel source using swan. For a CUDA target:
swan --cuda kernel.kh kernel.cu
or, for an OpenCL target:
swan --opencl kernel.kh kernel.cu
The output file kernel.kh contains:
- a data block which holds the compiled kernel source code (PTX or IL).
- autogenerated source-code functions for running each of the kernels
- Add #include "kernel.kh" to the host source.
- Re-write all kernel invocations to use the corresponding entry-point functions. For example, the kernel with formal declaration:
__global__ void vec_add( float *in, float *out, int N );
will have an entry point defined as:
void k_vec_add( int3 grid, int3 block, int shmem, float *in, float *out, int N );
where grid, block and shmem correspond to the first three arguments of the <<< >>> syntax.
Entry points are named systematically. For any kernel kernel(), there will be two entry points: k_kernel() and k_kernel_async(). The former is a synchronous call that blocks until the kernel completes execution; the latter is its asynchronous counterpart.
- Re-write all of the CUDA API calls with the Swan analogues (see swan_api.h).
- Compile and link code against the appropriate Swan library (libswan_ocl or libswan_cuda).
Examples can be found in the package.
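As a rough illustration of the kernel-invocation rewrite described in the steps above, the sketch below shows a CUDA launch of vec_add being replaced by a call to its k_vec_add entry point. It is plain C and rests on several assumptions: the int3 struct and the body of k_vec_add are hypothetical stand-ins (the real definitions come from Swan and the generated kernel.kh), the stand-in emulates a one-dimensional launch on the CPU so that the example runs without a GPU, the kernel is assumed to compute out[i] += in[i], and blocks_for() and launch_vec_add() are helper names invented here.

```c
#include <assert.h>

/* Stand-in for the int3 type taken by Swan entry points
   (assumption: three int fields named x, y, z, as in CUDA). */
typedef struct { int x, y, z; } int3;

/* HYPOTHETICAL stand-in for the entry point that Swan generates in kernel.kh.
   The real function launches the kernel on the GPU; this one emulates a 1-D
   launch on the CPU and assumes the kernel body is out[i] += in[i]. */
void k_vec_add(int3 grid, int3 block, int shmem, float *in, float *out, int N)
{
    (void)shmem;                            /* shared memory is unused in this emulation */
    assert(grid.x > 0 && block.x > 0);
    for (int b = 0; b < grid.x; b++) {      /* plays the role of blockIdx.x  */
        for (int t = 0; t < block.x; t++) { /* plays the role of threadIdx.x */
            int i = b * block.x + t;
            if (i < N)                      /* guard against running past the array */
                out[i] += in[i];
        }
    }
}

/* Number of blocks needed to cover n elements (rounds up). */
int blocks_for(int n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

/* Hypothetical host-side wrapper showing the rewritten invocation. */
void launch_vec_add(float *in, float *out, int N)
{
    if (N <= 0)
        return;                             /* nothing to launch */
    int3 block = { 128, 1, 1 };
    int3 grid  = { blocks_for(N, block.x), 1, 1 };

    /* CUDA original:  vec_add<<< grid, block, 0 >>>( in, out, N );  */
    k_vec_add(grid, block, 0, in, out, N);  /* Swan-style replacement */
}
```

With the real library, in and out would presumably be device pointers obtained through the Swan API (see swan_api.h) rather than host arrays, and the synchronous k_vec_add call would return only once the kernel has finished.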
The API is documented here.
For queries, problems and suggestions, contact Matt @ M.J.Harvey (at) ic (dot) ac (dot) uk.
Experiences porting from CUDA to OpenCL: presentation at the Daresbury Machine Evaluation Workshop, 2009.
If publishing work that uses Swan, please cite:
M. J. Harvey and G. De Fabritiis, "Swan: A tool for porting CUDA programs to OpenCL", Computer Physics Communications 182 (4), 1093-1099 (2011), doi:10.1016/j.cpc.2010.12.052
Swan is licensed under the GPL version 2.