25 papers:
- DAC-2015-MaoHCL #named
- VWS: a versatile warp scheduler for exploring diverse cache localities of GPGPU applications (MM, JH, YC, HL), p. 6.
- SEFM-2015-AmighiDBH #source code #specification #verification
- Specification and Verification of Atomic Operations in GPGPU Programs (AA, SD, SB, MH), pp. 69–83.
- CGO-2015-JiaoLHM #concurrent #energy #execution #kernel
- Improving GPGPU energy-efficiency through concurrent kernel execution and DVFS (QJ, ML, HPH, TM), pp. 1–11.
- HPCA-2015-WuGLJC #estimation #machine learning #performance #using
- GPGPU performance and power estimation using machine learning (GYW, JLG, AL, NJ, DC), pp. 564–576.
- DAC-2014-MaoWZCL #architecture #memory management #using
- Exploration of GPGPU Register File Architecture Using Domain-wall-shift-write based Racetrack Memory (MM, WW, YZ, YC, HHL), p. 6.
- DAC-2014-RahimiGLCBG #architecture #collaboration #compilation #energy
- Energy-Efficient GPGPU Architectures via Collaborative Compilation and Memristive Memory-Based Computing (AR, AG, MALM, KTC, LB, RKG), p. 6.
- DAC-2014-ZhangPL #hardware #power management
- Low Power GPGPU Computation with Imprecise Hardware (HZ, MP, JL), p. 6.
- DATE-2014-LeeF #framework #named #realtime #runtime #scheduling
- GPU-EvR: Run-time event based real-time scheduling framework on GPGPU platform (HL, MAAF), pp. 1–6.
- DATE-2014-SongLKSCR #energy #scheduling
- Energy-efficient scheduling for memory-intensive GPGPU workloads (SS, ML, JK, WS, YGC, SR), pp. 1–6.
- CGO-2014-MargiolasO #communication #optimisation
- Portable and Transparent Host-Device Communication Optimization for GPGPU Environments (CM, MFPO), p. 55.
- HPCA-2014-LeeSMKSCR #concurrent #resource management #scheduling #thread
- Improving GPGPU resource utilization through alternative thread block scheduling (ML, SS, JM, JK, WS, YGC, SR), pp. 260–271.
- PPoPP-2014-YangZ #concurrent #named #parallel #thread
- CUDA-NP: realizing nested thread-level parallelism in GPGPU applications (YY, HZ), pp. 93–106.
- DAC-2013-RahimiBG #architecture
- Aging-aware compiler-directed VLIW assignment for GPGPU architectures (AR, LB, RKG), p. 6.
- DATE-2013-NugterenBC #architecture #future of #parametricity
- Future of GPGPU micro-architectural parameters (CN, GJvdB, HC), pp. 392–395.
- ASPLOS-2013-JogKNMKMID #array #concurrent #named #owl #performance #scheduling #thread
- OWL: cooperative thread array aware scheduling techniques for improving GPGPU performance (AJ, OK, NCN, AKM, MTK, OM, RI, CRD), pp. 395–406.
- ASPLOS-2013-PaiTG #concurrent #kernel
- Improving GPGPU concurrency with elastic kernels (SP, MJT, RG), pp. 407–418.
- HPCA-2013-GilaniKS #power management
- Power-efficient computing for compute-intensive GPGPU applications (SZG, NSK, MJS), pp. 330–341.
- PPoPP-2013-LiuDJK #architecture #layout #optimisation
- Data layout optimization for GPGPU architectures (JL, WD, OJ, MTK), pp. 283–284.
- ICSE-2012-NaganoNKAHUF #mining #repository #scalability #using
- Using the GPGPU for scaling up Mining Software Repositories (RN, HN, YK, BA, KH, NU, AF), pp. 1435–1436.
- HPCA-2012-AdriaensCKS #multi
- The case for GPGPU spatial multitasking (JA, KC, NSK, MJS), pp. 79–90.
- HPCA-2012-YangXMZ #architecture #cpu #gpu
- CPU-assisted GPGPU on fused CPU-GPU architectures (YY, PX, MM, HZ), pp. 103–114.
- PPoPP-2012-SimDKV #analysis #framework #identification #performance
- A performance analysis framework for identifying potential benefits in GPGPU applications (JS, AD, HK, RWV), pp. 11–22.
- PLDI-2010-YangXKZ #compilation #memory management #optimisation #parallel
- A GPGPU compiler for memory optimization and parallelism management (YY, PX, JK, HZ), pp. 86–97.
- PPoPP-2010-YangXKZ #compilation #optimisation #source code
- An optimizing compiler for GPGPU programs with input-data sharing (YY, PX, JK, HZ), pp. 343–344.
- PPoPP-2009-LeeME #automation #compilation #framework #optimisation
- OpenMP to GPGPU: a compiler framework for automatic translation and optimization (SL, SJM, RE), pp. 101–110.