A first complete functional version of an rCUDA module for the SLURM scheduler will be available soon
The rCUDA team has been working over the last few months to create an rCUDA module for the SLURM scheduler. This scheduler, used in many clusters around the world, efficiently dispatches computing jobs to the different nodes of the cluster. However, when a job requires one or several GPUs, the SLURM scheduler assumes that those GPUs are local to the node where the job is placed, thus hindering the use of remote GPU virtualization frameworks such as rCUDA. With the new module created by the rCUDA team, the SLURM scheduler is aware of remote GPU virtualization, making it possible to share the GPUs available in the cluster among the applications demanding them, regardless of the node where each application executes and of the node where each GPU is located. The new module allows GPUs to be scheduled in two ways: exclusively, or shared concurrently among several applications. In both cases, remote GPUs can be used.
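As an illustrative sketch of how a job might request GPUs under such a scheduler, the script below uses SLURM's standard GRES syntax; with the rCUDA module, the granted GPUs would not need to reside on the node running the job. Note that the `--rcuda-mode` option shown commented out is hypothetical: the actual flag names for selecting exclusive versus shared GPU scheduling in the rCUDA module are not specified here.

```shell
#!/bin/bash
#SBATCH --job-name=rcuda-demo
#SBATCH --nodes=1
#SBATCH --gres=gpu:2        # request two GPUs; with the rCUDA module they need not be local to the node
# Hypothetical option (exact name not confirmed): choose exclusive or shared GPU scheduling.
##SBATCH --rcuda-mode=shared

srun ./my_cuda_application
```

Submitting with `sbatch job.sh` would then let the scheduler assign any two GPUs in the cluster, local or remote, to the job.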