13 papers:
SAC-2014-KwonB #implementation #library #prototype- A CUDA-based implementation of OpenGL-compatible rasterization library prototype (YCK, NB), pp. 1747–1748.
PPoPP-2014-YangZ #concurrent #named #parallel #thread- CUDA-NP: realizing nested thread-level parallelism in GPGPU applications (YY, HZ), pp. 93–106.
DAC-2011-KimS #data access #memory management #named- CuMAPz: a tool to analyze memory access patterns in CUDA (YK, AS), pp. 128–133.
HCI-DDA-2011-LuoY #algorithm #clustering #framework #network #novel #parallel #using- A Novel Parallel Clustering Algorithm Based on Artificial Immune Network Using nVidia CUDA Framework (RL, QY), pp. 598–607.
HCI-DDA-2011-XuZC #algorithm #image #performance- High-Quality Fast Image Upsampling Algorithm Based on CUDA (QX, XZ, JC), pp. 677–683.
PPoPP-2011-HongKOO #algorithm #graph- Accelerating CUDA graph algorithms at maximum warp (SH, SKK, TO, KO), pp. 267–276.
ICPR-2010-MizukamiTWLP #database #implementation #pattern matching #pattern recognition #recognition- CUDA Implementation of Deformable Pattern Recognition and its Application to MNIST Handwritten Digit Database (YM, KT, JW, PL, SP), pp. 2001–2004.
SAC-2010-JiCW #scalability #simulation- A simulation of large-scale groundwater flow on CUDA-enabled GPUs (XJ, TC, QW), pp. 2402–2403.
CC-2010-BaskaranRS #automation #code generation #source code- Automatic C-to-CUDA Code Generation for Affine Programs (MMB, JR, PS), pp. 244–263.
HPDC-2010-DiasBG- CUDA-based triangulations of convolution molecular surfaces (SD, KB, AJPG), pp. 531–540.
PPoPP-2010-LiGKQ #source code #verification- A symbolic verifier for CUDA programs (GL, GG, RMK, DQ), pp. 357–358.
PPoPP-2008-RyooRBSKH #evaluation #gpu #optimisation #parallel #performance #thread #using- Optimization principles and application performance evaluation of a multithreaded GPU using CUDA (SR, CIR, SSB, SSS, DBK, WmWH), pp. 73–82.
ISMM-2007-Kirk #architecture #gpu #parallel- NVIDIA CUDA software and GPU parallel computing architecture (DK), pp. 103–104.