Swan: A simple tool for porting CUDA to OpenCL

Download latest version

For the impatient: download the latest version of Swan (16 December 2010) here (by downloading this file, you are accepting the GPL2 license http://www.gnu.org/licenses/gpl-2.0.txt). Earlier versions are available via the wiki history. To build, untar the archive and follow the instructions in the README. The requirements are:

  • A modern(ish) Linux OS
  • CUDA toolkit and/or OpenCL runtime and a matching GPU
  • Perl (with the package Text::Balanced)
  • gcc/g++ 3.x or later

Changes in latest update (10 May 2010):

  • Support for Fedora 12+ (changed from using varargs to an explicit array)
  • Support for SM_20/Fermi
  • Some support for OpenCL images (set $has_opencl_images=1 in swan.ocl)

What is it?

Swan is a small tool that aids the reversible conversion of existing CUDA codebases to OpenCL. It does several useful things:

  • Translates CUDA kernel source-code to OpenCL.
  • Provides a common API that abstracts both CUDA and OpenCL runtimes.
  • Preserves the convenience of the CUDA <<< grid, block >>> kernel launch syntax by generating C source-code for kernel entry-point functions.

It is also useful for compiling and managing kernels written directly for OpenCL.

Why might you want it?

Possible uses include:

  • Evaluating OpenCL performance of an existing CUDA code.
  • Maintaining a dual-target OpenCL and CUDA code.
  • Reducing dependence on nvcc when compiling host code.
  • Supporting multiple CUDA compute capabilities in a single binary.
  • Providing a runtime library for managing OpenCL kernels in new development.


It's not a drop-in replacement for nvcc. Host code needs to have all kernel invocations and CUDA API calls re-written.

Swan does not support a few things. In particular:

  • CUDA C++ templating in kernel code.
  • OpenCL Images/Samplers (analogous to Textures); texture interpolation is done in software.
  • Multiple device management in a single process.
  • Compiling kernels for the CPU.
  • CUDA device-emulation mode.

Furthermore, it's a work in progress. It works for our code, but there are no promises that it will work for yours.


The essential steps for converting a CUDA code:

  • Move the CUDA kernels into their own source files.
  • Compile the kernel source using swan. For a CUDA target:

swan --cuda kernel.kh kernel.cu

or, for an OpenCL target:

swan --opencl kernel.kh kernel.cu

  • The output file kernel.kh contains:

    • A data block that holds the compiled kernel source code (PTX or IL).
    • Autogenerated source-code functions for running each of the kernels.

  • #include "kernel.kh" in the host source.

  • Re-write all kernel invocations to use the corresponding entry-point functions. For example, the kernel with formal declaration:

__global__ void vec_add( float *in, float *out, int N );

will have an entry point defined as:

void k_vec_add( int3 grid, int3 block, int shmem, float *in, float *out, int N );

where grid, block and shmem correspond to the first three arguments of the <<< >>> syntax.

Entry points are named systematically. For any kernel kernel(), there will be two entry points: k_kernel() and k_kernel_async(). The former is a synchronous call that blocks until the kernel completes execution; the latter is its asynchronous counterpart.

  • Replace all of the CUDA API calls with their Swan analogues (see swan_api.h).

  • Compile the code and link it against the appropriate Swan library (libswan_ocl or libswan_cuda).
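The kernel-invocation rewrite described in the steps above can be sketched in C as follows. This is only an illustration under stated assumptions: the int3 typedef and the body of k_vec_add are CPU stand-ins written for this example (the real entry point is generated by swan in kernel.kh and launches the kernel on the GPU), the kernel's arithmetic (out[i] += in[i]) is invented, and grid_for and run_vec_add are hypothetical helper names. The arrays are plain host memory here only because the stand-in runs on the CPU.

```c
#include <assert.h>

/* Stand-in for the int3 type that the CUDA/Swan headers normally supply. */
typedef struct { int x, y, z; } int3;

/* ASSUMPTION: CPU stand-in for the entry point that swan would generate in
 * kernel.kh. It mimics a 1-D launch by visiting every (block, thread) pair.
 * The kernel body (out[i] += in[i]) is invented for illustration only. */
void k_vec_add(int3 grid, int3 block, int shmem, float *in, float *out, int N)
{
    (void)shmem; /* dynamic shared memory is not used by this stand-in */
    for (int b = 0; b < grid.x; b++) {
        for (int t = 0; t < block.x; t++) {
            int i = b * block.x + t; /* blockIdx.x * blockDim.x + threadIdx.x */
            if (i < N) {
                out[i] += in[i];
            }
        }
    }
}

/* Hypothetical helper: smallest 1-D grid whose threads cover n elements. */
int3 grid_for(int n, int3 block)
{
    assert(n >= 0 && block.x > 0);
    int3 grid = { (n + block.x - 1) / block.x, 1, 1 };
    return grid;
}

/* Host-side call site.
 * CUDA original:  vec_add<<< grid, block, 0 >>>( in, out, N );
 * Swan rewrite:   k_vec_add( grid, block, 0, in, out, N );        */
void run_vec_add(float *in, float *out, int N)
{
    int3 block = { 128, 1, 1 };
    int3 grid  = grid_for(N, block);
    k_vec_add(grid, block, 0, in, out, N);
}
```

The point of the sketch is the call site in run_vec_add: the three launch parameters move from the <<< >>> brackets to the front of the argument list, and the kernel's own arguments follow unchanged.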

Examples can be found in the package.

The API is documented here.


For queries, problems and suggestions, contact Matt @ M.J.Harvey (at) ic (dot) ac (dot) uk.


If publishing work that uses Swan, please cite:

  • Swan: A tool for porting CUDA programs to OpenCL, M J Harvey and G De Fabritiis, Computer Physics Communications, 182 (4) 1093-1099 (2011) 10.1016/j.cpc.2010.12.052


Swan is licensed under the GPL version 2.

For other types of license and for converting legacy code, please contact info@acellera.com (web: http://www.acellera.com).

Copyright 2008-2011. All rights reserved.