Latest Release

What's new in rCUDA 20.07alpha?

- A completely new and disruptive internal architecture has been designed and implemented for the core of rCUDA. The new internal architecture aims at providing support for many more CUDA applications as well as better performance. Notice, however, that we have not yet verified either the number of applications supported or their performance when using rCUDA

- A completely new communications layer has been implemented. It has been architected to make the code much easier to maintain and to provide much better performance than previous versions of rCUDA

- Multi-tenancy is supported. That is, a real GPU can be virtualized into multiple GPUs, which can be concurrently provided to several applications

- The rCUDA server can simultaneously provide service over TCP/IP and InfiniBand networks. That is, the rCUDA server can attend some applications over TCP/IP while, at the same time, serving other applications over the InfiniBand network
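As a sketch of how a client would be pointed at remote GPUs, the rCUDA user guide configures clients through environment variables; the variable names below follow that convention, but the server addresses and install path are placeholders and should be checked against the documentation shipped with your release:

```shell
# Client-side configuration sketch (variable names as in the rCUDA user
# guide; server addresses and install path are placeholders).

# Number of remote GPUs the application will see as local devices:
export RCUDA_DEVICE_COUNT=2

# Map each virtual device to "server[:gpu_index]". Because the rCUDA
# server attends TCP/IP and InfiniBand clients at the same time, the two
# servers below could be reached over different network fabrics:
export RCUDA_DEVICE_0=192.168.0.10:0
export RCUDA_DEVICE_1=192.168.0.11:0

# Load the rCUDA client library instead of the native CUDA runtime:
export LD_LIBRARY_PATH=/opt/rCUDA/lib:$LD_LIBRARY_PATH
```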

- Support for functions in the Driver API has been noticeably improved

- The use of P2P data copies has been noticeably simplified, making it fully transparent to the user.

- GPU memory can be safely partitioned among different applications. Next releases of rCUDA will disclose the public API to do so

- Next releases of rCUDA will include the rCUDA-smi tool, which is similar to the nvidia-smi tool except that it monitors remote GPUs

- Next releases of rCUDA will include the rCUDA GPU scheduler, intended to provide efficient integration of rCUDA with Slurm and other job schedulers

- Next releases of rCUDA will include the sbatch and srun commands required to integrate rCUDA with Slurm. Other job schedulers, such as PBSpro, could also be supported

The rCUDA Team hopes that you enjoy this new version of the rCUDA technology! 

Contact Us


Universidad Politecnica de Valencia

Address: Camino de vera s/n

Postal code/ZIP : 46022

Town/Suburb/City : Valencia

State/County/Province : Valencia

Country : Spain

Phone : +34 963877007 Ext. 75745

Fax : +34 963877579

Web Site : 

@ Email :  

Please notice that, if you contact us at the info@ address, you are encouraged to use your company or institution email address. Requests received from domains such as gmail, hotmail, yahoo, etc. will be discarded.



The use of GPUs to accelerate general-purpose scientific and engineering applications is mainstream today, but their adoption in current high-performance computing clusters is impaired primarily by acquisition costs and power consumption. Furthermore, GPU utilization is in general low, so the investment in GPU hardware cannot be quickly amortized.

Virtualizing GPUs is an appealing strategy to deal with all these drawbacks simultaneously. With GPU virtualization, physical GPUs are installed only in some nodes of the cluster and are transparently shared among all the nodes. Hence, the nodes equipped with GPUs become servers that provide GPU services to every node in the cluster. GPU virtualization leads to fewer GPUs across the cluster, thus reducing acquisition costs and power consumption while increasing the accelerator utilization rate. Consequently, GPU virtualization enables a more efficient use of the available hardware. Moreover, with GPU virtualization, a single application running in one node of the cluster can be provided with all the GPUs installed in the cluster, usually far more than can fit into a single box. Therefore, GPU virtualization can further accelerate applications.

The rCUDA framework is the most modern remote GPU virtualization solution available today. It is developed by the Parallel Architectures Group at Universitat Politecnica de Valencia (Spain). rCUDA enables the concurrent, transparent, remote usage of CUDA-enabled devices. Thus, the source code of applications does not need to be modified in order to use remote GPUs; rCUDA takes care of all the necessary details. Furthermore, the overhead introduced by using a remote GPU is very small: execution time is usually increased by less than 4% when a high-performance network fabric is used. Basically, rCUDA can be useful in three different environments:

  • Clusters. rCUDA allows a single non-MPI application to make use of all the GPUs in the cluster, independently of the exact node where they are installed. Additionally, rCUDA allows the number of GPUs in the cluster to be adjusted to the actual computing needs, leading to increased GPU utilization and reduced overall costs (energy, acquisition, maintenance, space, cooling, etc.).
  • Academia. In commodity networks, rCUDA provides concurrent access to a few high performance GPUs to many students, thus reducing teaching costs.
  • Virtual Machines. rCUDA allows applications running inside virtual machines to access GPUs installed in remote physical machines.
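As an illustration of this transparency, launching an existing CUDA binary against a remote GPU only requires pointing the dynamic linker at the rCUDA client library; no changes to the application source are needed. The paths, server name, and binary below are placeholders:

```shell
# Run an unmodified CUDA application (e.g. an NVIDIA SDK sample) on a
# remote GPU. All names below are placeholders for illustration.
export RCUDA_DEVICE_COUNT=1                  # one remote GPU visible
export RCUDA_DEVICE_0=gpuserver.example.com  # node hosting the real GPU
export LD_LIBRARY_PATH=/opt/rCUDA/lib:$LD_LIBRARY_PATH
./deviceQuery   # the remote GPU is reported as if it were local
```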

rCUDA provides full compatibility with CUDA. It implements all of the functions in the CUDA Runtime API and Driver API, excluding only those related to graphics interoperability. It additionally includes highly optimized TCP and low-level InfiniBand pipelined communications, as well as full multi-thread and multi-node capabilities. rCUDA targets the same Linux OS distributions as CUDA does, also providing support for the x86 and ARM processor architectures. Furthermore, an integration of rCUDA with the SLURM scheduler has been developed, allowing scheduled jobs to use remote GPUs. The combination of SLURM + rCUDA reduces the overall execution time of job batches by between 25% and 45%, depending on the exact composition of the batch. Consumed energy is also noticeably reduced.
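As a rough sketch of what a SLURM job using remote GPUs could look like, the script below assumes a generic resource has been defined for rCUDA-managed GPUs; the resource name, option syntax, and application binary are assumptions for illustration only, since the exact integration details depend on how the plugin is configured in each SLURM installation:

```shell
#!/bin/bash
# Hypothetical SLURM batch script; the "rgpu" resource name and the
# application binary are assumptions, not the confirmed rCUDA interface.
#SBATCH --job-name=gpu-job
#SBATCH --nodes=1
#SBATCH --gres=rgpu:2        # request two remote GPUs (name assumed)
srun ./my_cuda_app           # unmodified CUDA application
```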

Currently, rCUDA has been successfully tested with several applications selected from NVIDIA's list of “Popular GPU-accelerated Applications”. In addition to behaving correctly with the NVIDIA SDK samples, rCUDA has been applied to the following applications: LAMMPS, WideLM, CUDASW++, HOOMDBlue, mCUDA-MEME, GPU-Blast, Gromacs, GAMESS, DL-POLY, and HPL. In the papers and presentations available on the documentation page you may find additional information about the performance of many of these applications when used with rCUDA.

If you are interested in using rCUDA, please proceed to the software request form page. The rCUDA team will be glad to send you a copy of the software at no cost; it is distributed for free. Notice that the rCUDA technology is owned by the Technical University of Valencia and the software request form page is the only way to obtain a copy of this technology.

For further information, please refer to the papers and presentations listed in the documentation page.

The rCUDA Team


The rCUDA Team

The rCUDA Team is affiliated with Universitat Politècnica de València (Technical University of Valencia) in Spain. The team has been led by Federico Silla since 2008, when the development of the rCUDA technology began. Different members of the rCUDA Team carry out different tasks. Currently, the team is composed of the following members:

Leadership & Coordination:

Federico Silla. Full Professor.

Development & Testing:

Cristian Peñaranda. Developer.

Javier Prades. Developer.

Carlos Reaño. External collaborator.

Jaime Sierra. Developer.

Previous Developers:

Tony Díaz.

Pablo Higueras.

Antonio Peña.


Gold Sponsors

Silver Sponsors

[Sponsor logos: GVA, Bright, NVIDIA]