BibSLEIGH
CC-BY
Used together with:
use (10)
process (9)
graph (8)
algorithm (7)
memori (7)

Stem gpus$ (all stems)

72 papers:

DAC-2015-JungC #embedded #multi #named #performance #simulation
ΣVP: host-GPU multiplexing for efficient simulation of multiple embedded GPUs on virtual platforms (YJ, LPC), p. 6.
DATE-2015-RahimiGCBG #approximate #energy #memory management
Approximate associative memristive memory for energy-efficient GPUs (AR, AG, KTC, LB, RKG), pp. 1497–1502.
VLDB-2015-ZhangWYGLZ #in memory #named #throughput
Mega-KV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores (KZ, KW, YY, LG, RL, XZ), pp. 1226–1237.
SAC-2015-RochaRCOMVADGF #algorithm #classification #dataset #documentation #named #performance #using
G-KNN: an efficient document classification algorithm for sparse datasets on GPUs using KNN (LCdR, GSR, RC, RSO, DM, FV, GA, SD, MAG, RF), pp. 1335–1338.
SAC-2015-RodriguesJD #recommendation #using
Accelerating recommender systems using GPUs (AVR, AJ, ID), pp. 879–884.
ASPLOS-2015-AgarwalNSOK #memory management
Page Placement Strategies for GPUs within Heterogeneous Memory Systems (NA, DWN, MS, MO, SWK), pp. 607–618.
CGO-2015-FauziaPS #memory management
Characterizing and enhancing global memory data coalescing on GPUs (NF, LNP, PS), pp. 12–22.
HPCA-2015-AgarwalNOKW
Unlocking bandwidth for GPUs in CC-NUMA systems (NA, DWN, MO, SWK, TFW), pp. 354–365.
HPCA-2015-LiuLJCT #comprehension #empirical
Understanding the virtualization “Tax” of scale-out pass-through GPUs in GaaS clouds: An empirical study (ML, TL, NJ, AC, VT), pp. 259–270.
HPCA-2015-XieLWSW #coordination
Coordinated static and dynamic cache bypassing for GPUs (XX, YL, YW, GS, TW), pp. 76–88.
PPoPP-2015-SeoKK #graph #named #scalability #streaming
GStream: a graph streaming processing method for large-scale graphs on GPUs (HS, JK, MSK), pp. 253–254.
ICLP-2015-DovierFPV #execution #parallel
Parallel Execution of the ASP Computation — an Investigation on GPUs (AD, AF, EP, FV).
ASE-2014-RajanSSK #execution #using
Accelerated test execution using GPUs (AR, SS, PS, DK), pp. 97–102.
DAC-2014-NandakumarM #analysis
System-Level Floorplan-Aware Analysis of Integrated CPU-GPUs (VSN, MMS), p. 6.
DAC-2014-SamavatianAAS #architecture #performance
An Efficient STT-RAM Last Level Cache Architecture for GPUs (MHS, HA, MA, HSA), p. 6.
DATE-2014-AguileraLFMSK #algorithm #clustering #multi #process
Process variation-aware workload partitioning algorithms for GPUs supporting spatial-multitasking (PA, JL, AFF, KM, MJS, NSK), pp. 1–6.
VLDB-2014-WangZYMLD0 #concurrent #query
Concurrent Analytical Query Processing with GPUs (KW, KZ, YY, SM, RL, XD, XZ), pp. 1011–1022.
TACAS-2014-WijsB #manycore #named #on the fly #using
GPUexplore: Many-Core On-the-Fly State Space Exploration Using GPUs (AW, DB), pp. 233–247.
ICML-c1-2014-GiesekeHOI #nearest neighbour #query
Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs (FG, JH, CEO, CI), pp. 172–180.
PADL-2014-CampeottoPDFP #constraints #theorem proving
Exploring the Use of GPUs in Constraint Solving (FC, ADP, AD, FF, EP), pp. 152–167.
ASPLOS-2014-PichaiHB #architecture #cpu #design #memory management
Architectural support for address translation on GPUs: designing memory management units for CPU/GPUs with unified address spaces (BP, LH, AB), pp. 743–758.
CC-2014-AnantpurG #control flow
Taming Control Divergence in GPUs through Control Flow Linearization (JA, RG), pp. 133–153.
CC-2014-WangPFO #legacy #parallel
Exploitation of GPUs for the Parallelisation of Probably Parallel Legacy Code (ZW, DCP, BF, MFPO), pp. 154–173.
CGO-2014-BarikKMLSHNA #c++ #performance
Efficient Mapping of Irregular C++ Applications to Integrated GPUs (RB, RK, DM, BTL, TS, CH, YN, ARAT), p. 33.
CGO-2014-GrosserCHSV #hybrid
Hybrid Hexagonal/Classical Tiling for GPUs (TG, AC, JH, PS, SV), p. 66.
CGO-2014-JuegaGTC #adaptation #automation #code generation #parametricity
Adaptive Mapping and Parameter Selection Scheme to Improve Automatic Code Generation for GPUs (JCJ, JIG, CT, FC), p. 251.
CGO-2014-WuDSABGY #execution #query #relational
Red Fox: An Execution Environment for Relational Query Processing on GPUs (HW, GFD, TS, MA, SB, MG, SY), p. 44.
HPCA-2014-HechtmanCHTBHRW #approach #consistency #named
QuickRelease: A throughput-oriented approach to release consistency on GPUs (BAH, SC, DRH, YT, BMB, MDH, SKR, DAW), pp. 189–200.
HPCA-2014-LakshminarayanaK #algorithm #graph
Spare register aware prefetching for graph algorithms on GPUs (NBL, HK), pp. 614–625.
HPCA-2014-PalframanKL #fault
Precision-aware soft error protection for GPUs (DJP, NSK, MHL), pp. 49–59.
HPCA-2014-XiangYZ
Warp-level divergence in GPUs: Characterization, impact, and mitigation (PX, YY, HZ), pp. 284–295.
HPDC-2014-KhorasaniVGB #graph #named
CuSha: vertex-centric graph processing on GPUs (FK, KV, RG, LNB), pp. 239–252.
ISMM-2014-EgielskiHZ #parallel
Massive atomics for massive parallelism on GPUs (IJE, JH, EZZ), pp. 93–103.
PPoPP-2014-BauerTA #named #performance
Singe: leveraging warp specialization for high performance on GPUs (MB, ST, AA), pp. 119–130.
PPoPP-2014-MaAC #algorithm #analysis #manycore #thread
Theoretical analysis of classic algorithms on highly-threaded many-core GPUs (LM, KA, RDC), pp. 391–392.
PPoPP-2014-SandesMMMA #comparison #parallel #sequence
Fine-grain parallel megabase sequence comparison with multiple heterogeneous GPUs (EFdOS, GM, ACMAdM, XM, EA), pp. 383–384.
PPoPP-2014-YanLZZ #framework #named
yaSpMV: yet another SpMV framework on GPUs (SY, CL, YZ, HZ), pp. 107–118.
DATE-2013-BertaccoCBFVKP #on the
On the use of GP-GPUs for accelerating compute-intensive EDA applications (VB, DC, NB, FF, SV, AMK, HDP), pp. 1357–1366.
ASPLOS-2013-SilbersteinFKW #file system #named
GPUfs: integrating a file system with GPUs (MS, BF, IK, EW), pp. 485–498.
CGO-2013-LaiS #analysis #bound #optimisation #performance
Performance upper bound analysis and optimization of SGEMM on Fermi and Kepler GPUs (JL, AS), p. 10.
HPDC-2013-SajjapongseWB #clustering #multi #runtime
A preemption-based runtime to efficiently schedule multi-process applications on heterogeneous clusters with GPUs (KS, XW, MB), pp. 179–190.
PPoPP-2013-NasreBP #algorithm
Morph algorithms on GPUs (RN, MB, KP), pp. 147–156.
PPoPP-2013-YanLZ #algorithm #named #performance
StreamScan: fast scan algorithms for GPUs without global barrier synchronization (SY, GL, YZ), pp. 229–238.
PPoPP-2013-YuB #automaton #performance #regular expression
Exploring different automata representations for efficient regular expression matching on GPUs (XY, MB), pp. 287–288.
DATE-2012-BombieriFG #fault #framework #functional #named #simulation #verification
FAST-GP: An RTL functional verification framework based on fault simulation on GP-GPUs (NB, FF, VG), pp. 562–565.
DATE-2012-LiangCZRZJC #3d #implementation #locality #optimisation #performance #realtime
Real-time implementation and performance optimization of 3D sound localization on GPUs (YL, ZC, SZ, KR, YZ, DLJ, DC), pp. 832–835.
DATE-2012-WangRR #energy #runtime
Run-time power-gating in caches of GPUs for leakage energy savings (YW, SR, NR), pp. 300–303.
ESOP-2012-HabermaierK #correctness #execution #on the
On the Correctness of the SIMT Execution Model of GPUs (AH, AK), pp. 316–335.
PLDI-2012-DubachCRBF #architecture #compilation
Compiling a high-level language for GPUs: (via language support for architectures and compilers) (CD, PC, RMR, DFB, SJF), pp. 1–12.
HPDC-2012-BecchiSGPRC #clustering #memory management #multitenancy #runtime
A virtual memory based runtime to support multi-tenancy in clusters with GPUs (MB, KS, IG, AMP, VTR, STC), pp. 97–108.
HPDC-2012-ChenA #effectiveness #memory management #optimisation #pipes and filters
Optimizing MapReduce for GPUs with effective shared memory usage (LC, GA), pp. 199–210.
ISMM-2012-MaasRMAJK #garbage collection
GPUs as an opportunity for offloading garbage collection (MM, PR, JM, KA, ADJ, JK), pp. 25–36.
PPoPP-2012-LiLSGGR #generative #named #testing #verification
GKLEE: concolic verification and test generation for GPUs (GL, PL, GS, GG, IG, SPR), pp. 215–224.
PPoPP-2012-ZhongH #bibliography #graph
An overview of Medusa: simplified graph processing on GPUs (JZ, BH), pp. 283–284.
VLDB-2011-YangPS #graph #mining #multi #performance
Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining (XY, SP, PS), pp. 231–242.
GPCE-2011-NystromWD #compilation #named #runtime #scala
Firepile: run-time compilation for GPUs in scala (NN, DW, KD), pp. 107–116.
POPL-2011-PrabhuRMH #analysis #named
EigenCFA: accelerating flow analysis with GPUs (TP, SR, MM, MWH), pp. 511–522.
PPoPP-2011-GrossetZLVH #graph
Evaluating graph coloring on GPUs (AVPG, PZ, SL, SV, MWH), pp. 297–298.
PPoPP-2011-KimKLL #image #multi
Achieving a single compute device image in OpenCL for multiple GPUs (JK, HK, JHL, JL), pp. 277–288.
SOSP-2011-RossbachCSRW #abstraction #named #operating system
PTask: operating system abstractions to manage GPUs as compute devices (CJR, JC, MS, BR, EW), pp. 233–248.
DAC-2010-FengZ #analysis #grid #parallel #power management #robust
Parallel multigrid preconditioning on graphics processing units (GPUs) for robust power grid analysis (ZF, ZZ), pp. 661–666.
DAC-2010-WangZD #distributed #logic #parallel #simulation
Distributed time, conservative parallel logic simulation on GPUs (BDW, YZ, YD), pp. 761–766.
SIGMOD-2010-KimCSSNKLBD #architecture #named #performance
FAST: fast architecture sensitive tree search on modern CPUs and GPUs (CK, JC, NS, ES, ADN, TK, VWL, SAB, PD), pp. 339–350.
SIGMOD-2010-SatishKCNLKD #performance
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort (NS, CK, JC, ADN, VWL, DK, PD), pp. 351–362.
SAC-2010-JiCW #scalability #simulation
A simulation of large-scale groundwater flow on CUDA-enabled GPUs (XJ, TC, QW), pp. 2402–2403.
PPoPP-2010-ChoiSV #modelling #multi
Model-driven autotuning of sparse matrix-vector multiply on GPUs (JC, AS, RWV), pp. 115–126.
DAC-2009-ChatterjeeDB #simulation
Event-driven gate-level simulation with GP-GPUs (DC, AD, VB), pp. 557–562.
CGO-2009-UdupaGT #execution #pipes and filters #source code
Software Pipelined Execution of Stream Programs on GPUs (AU, RG, MJT), pp. 200–209.
PPoPP-2009-MaA #compilation #data mining #mining #runtime
A compiler and runtime system for enabling data mining applications on gpus (WM, GA), pp. 287–288.
ICPR-2008-GongC #graph #learning #online #optimisation #realtime #segmentation #using
Real-time foreground segmentation on GPUs using local online learning and global graph cut optimization (MG, LC), pp. 1–4.
ASPLOS-2006-TarditiPO #named #parallel #using
Accelerator: using data parallelism to program GPUs for general-purpose uses (DT, SP, JO), pp. 325–335.
ICDAR-2005-SteinkrauSB #algorithm #machine learning #using
Using GPUs for Machine Learning Algorithms (DS, PYS, IB), pp. 1115–1119.

Bibliography of Software Language Engineering in Generated Hypertext (BibSLEIGH) is created and maintained by Dr. Vadim Zaytsev.
Hosted as a part of SLEBOK on GitHub.