Dragan Djuric
Clojure + CUDA + OpenCL infrastructure; Bayesian GPU software
Dragan Djuric is a professor at the Department of Software Engineering, FON, University of Belgrade, Serbia. He has passionately used Clojure as his primary language since 2009, and has taught Clojure-based courses at the university since 2010. He has published his Clojure-based research in leading scientific journals, and also contributes to the community through open-source Clojure projects (www.uncomplicate.org). His main interests are in the areas of software engineering and intelligent systems, but programming in Clojure is the activity he enjoys the most. When he is not working in Emacs, he likes his daily dose of long-distance running, the gym, and Cuban salsa dancing.
Past Activities
Code Mesh LDN 2018
11.25 - 12.10
Interactive GPU programming with ClojureCUDA and ClojureCL
Who wouldn't like to program with CUDA dynamically, in an interactive, but compiled, environment? I present an interactive approach to accelerating dynamic functional programs with GPU kernels.
I developed ClojureCUDA and ClojureCL, libraries that integrate CUDA and OpenCL into Clojure, a language that compiles to Java bytecode and fits into the ubiquitous Java enterprise ecosystem. They help programmers discover solutions by growing GPU programs inside a live session, constantly learning and experimenting with instant results, rather than specifying the whole GPU program up front, compiling it, and running it just to see the result.
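To give a flavour of such a live session, here is a minimal sketch using ClojureCUDA's core namespace; the calls (init, device, context, mem-alloc, memcpy-host!) follow the library's public API circa 2018, but treat the exact forms as an approximation rather than the talk's verbatim code:

  (require '[uncomplicate.clojurecuda.core :refer :all])

  (init)                            ;; initialize the CUDA driver
  (def gpu (device 0))              ;; pick the first GPU
  (def ctx (context gpu))           ;; create a context and make it current
  (current-context! ctx)

  ;; allocate 1024 bytes (256 floats) on the GPU and copy host data into it,
  ;; evaluating each form in the REPL and inspecting the result as you go
  (def gpu-array (mem-alloc 1024))
  (memcpy-host! (float-array (range 256)) gpu-array)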
CUDA and parallel programming are complex and brittle. Interactive programming is indispensable when the details of the solution are not certain beforehand, and the instant feedback helps novice and expert alike. We aim at full CUDA power at a low level, and offer a pleasant dynamic environment and the power of Lisp to automate the grunt work without hiding the important details of CUDA kernels.
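The kernel itself stays as plain CUDA C, compiled and launched from the same live session. Continuing the sketch above (the kernel name and sizes are illustrative, not taken from the talk):

  ;; the kernel is ordinary CUDA C, visible and editable in the session
  (def kernel-source
    "extern \"C\" __global__ void increment (int n, float *a) {
       int i = blockIdx.x * blockDim.x + threadIdx.x;
       if (i < n) a[i] = a[i] + 1.0f;
     }")

  ;; compile, load, and launch it without leaving the REPL
  (def prog (compile! (program kernel-source)))
  (def increment (function (module prog) "increment"))
  (launch! increment (grid-1d 256) (parameters 256 gpu-array))

  ;; copy the result back to the host and inspect it immediately
  (take 5 (memcpy-host! gpu-array (float-array 256)))
  ;; => (1.0 2.0 3.0 4.0 5.0)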
The Bayadera and Neanderthal projects demonstrate this approach with ClojureCUDA. The resulting programs are novel, typically require very little code (thousands of lines instead of hundreds of thousands), have little if any overhead, and, unusually, offer equivalent functionality and similar performance on both Nvidia (CUDA) and AMD (OpenCL) hardware.