In addition to the rCUDA demo that you can attend next week at the Mellanox Technologies booth #2722 at SC13 in Denver, CO, we will be glad to answer your questions about this remote GPU virtualization technology. Furthermore, to help you better understand rCUDA, we have prepared several white papers, which will be available at SC13. You can find one of them under the "Support" tab of this website.
The rCUDA remote GPU virtualization technology, whose latest release also supports the ARM processor architecture, has now been leveraged to execute the LAMMPS and CUDASW++ applications on ARM-based systems. More specifically, the Tegra 3 ARM Cortex-A9 quad-core CPUs (1.4 GHz) present in the NVIDIA CARMA and KAYLA development kits have been used to execute the CPU code of these applications, whereas the GPU code has been offloaded to an NVIDIA GeForce GTX480 “Fermi” GPU installed in a remote, regular Xeon-based system. Results clearly show that using rCUDA and remote accelerators yields noticeable performance improvements over using the local NVIDIA Quadro 1000M GPU already present in the CARMA system, despite relying on a traditional 1Gbps Ethernet network.
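To make the offloading mechanism more concrete, below is a minimal sketch of an ordinary CUDA program that could be executed this way. Nothing in the source is rCUDA-specific: the program is compiled with nvcc as usual and simply linked against the rCUDA client library instead of NVIDIA's CUDA runtime. The environment variable names in the comments and the server name are illustrative assumptions; please check the rCUDA documentation for the exact client-side configuration.

    // vadd.cu: an ordinary CUDA program; nothing here is rCUDA-specific.
    // When the binary is linked against the rCUDA client library instead of
    // NVIDIA's libcudart, each CUDA call is forwarded over the network to a
    // remote GPU server, selected for example via (names illustrative):
    //   export RCUDA_DEVICE_COUNT=1           # number of remote GPUs to expose
    //   export RCUDA_DEVICE_0=xeon-server:0   # hypothetical server and GPU index
    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    __global__ void vadd(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main(void) {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);
        float *ha = (float *)malloc(bytes);
        float *hb = (float *)malloc(bytes);
        float *hc = (float *)malloc(bytes);
        for (int i = 0; i < n; i++) { ha[i] = (float)i; hb[i] = 2.0f * i; }

        float *da, *db, *dc;
        cudaMalloc((void **)&da, bytes);
        cudaMalloc((void **)&db, bytes);
        cudaMalloc((void **)&dc, bytes);

        // With rCUDA, these transfers travel over Ethernet or InfiniBand to
        // the remote GPU server rather than over the local PCIe bus.
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        vadd<<<(n + 255) / 256, 256>>>(da, db, dc, n);

        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
        printf("c[42] = %.1f\n", hc[42]); /* expected: 126.0 */

        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }

Because the CUDA API is intercepted transparently, the same unmodified binary runs whether the GPU is local or remote, which is precisely what allows GPU-less ARM boards to run GPU-accelerated codes such as LAMMPS and CUDASW++.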
Paper on the benefits of InfiniBand FDR on remote GPU virtualization presented at the CLUSTER 2013 Conference
The rCUDA team is glad to announce that its latest research results were presented on Tuesday, September 24th, at the CLUSTER 2013 Conference held in Indianapolis, IN (USA). The presented paper, titled "Influence of InfiniBand FDR on the performance of remote GPU virtualization", showed the benefits that the latest version of the InfiniBand technology brings to GPU virtualization. The paper received the Best Paper Award.
A first fully functional version of an rCUDA module for the SLURM scheduler will be available soon
The rCUDA team has been working over the last months to create an rCUDA module for the SLURM scheduler. This scheduler, which is used in many clusters around the world, efficiently dispatches computing jobs to the different nodes of the cluster. However, when a job requires the use of one or several GPUs, the SLURM scheduler assumes that those GPUs will be local to the node where the job is dispatched, thus hindering the use of remote GPU virtualization frameworks such as rCUDA. With the new module created by the rCUDA team, the SLURM scheduler is aware of the use of remote GPU virtualization, making it possible to share the GPUs available in the cluster among the applications demanding them, independently of the node where each application executes and of the node where each GPU is located. The new module can schedule GPUs in two ways: exclusively, or concurrently shared among several applications. Notice that in both cases the use of remote GPUs is feasible.
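As a rough illustration of how a job submission might look once the module is available, consider the following sketch of a SLURM batch script. The "rgpu" generic resource name and the way exclusive or shared use is selected are assumptions made here for illustration, not the final interface of the module.

    #!/bin/bash
    # Hypothetical batch script: the "rgpu" generic resource below is an
    # illustrative assumption, not the final interface of the rCUDA module.
    #SBATCH --job-name=lammps-rcuda
    #SBATCH --nodes=1
    #SBATCH --gres=rgpu:2     # request two GPUs located anywhere in the cluster
    # SLURM assigns the GPUs (exclusively or shared, depending on how the
    # administrator configured the resource), and rCUDA forwards the CUDA
    # calls of the application to the nodes where those GPUs reside.
    srun ./lmp_gpu -in in.lj

The key design point is that the GPU request is decoupled from node placement: the job can land on any node, and the scheduler only has to guarantee that enough GPU capacity exists somewhere in the cluster.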
The latest developments of rCUDA, including a thorough performance analysis, were presented on September 12th at the HPC Advisory Council Spain Conference 2013, held in Barcelona (Spain) and co-hosted by the Barcelona Supercomputing Center. The slides of the presentation are available here. You can also access a video of the presentation via this link.