BibSLEIGH — gpu stem

Used together with:

base (28)
cpu (25)
acceler (19)
parallel (19)
architectur (17)

Stem gpu$

146 papers:

DAC-2015-HanF #analysis #approach #cpu #gpu #graph #scalability: Transient-simulation guided graph sparsification approach to scalable harmonic balance (HB) analysis of post-layout RF circuits leveraging heterogeneous CPU-GPU computing systems (LH, ZF), p. 6.
DAC-2015-JungC #embedded #multi #named #performance #simulation: ΣVP: host-GPU multiplexing for efficient simulation of multiple embedded GPUs on virtual platforms (YJ, LPC), p. 6.
DAC-2015-KadjoAKG #approach #cpu #energy #gpu #mobile #performance: A control-theoretic approach for energy efficient CPU-GPU subsystem in mobile platforms (DK, RA, MK, PVG), p. 6.
DATE-2015-GerumBR #gpu #performance #simulation: Source level performance simulation of GPU cores (CG, OB, WR), pp. 217–222.
DATE-2015-NguyenASS #gpu #simulation: Accelerating complex brain-model simulations on GPU platforms (HADN, ZAA, GS, CS), pp. 974–979.
DATE-2015-ParkAHYL #big data #energy #gpu #low cost #memory management #performance: Memory fast-forward: a low cost special function unit to enhance energy efficiency in GPU for big data processing (EP, JA, SH, SY, SL), pp. 1341–1346.
DATE-2015-SchneiderHKWW #fault #simulation: GPU-accelerated small delay fault simulation (ES, SH, MAK, XW, HJW), pp. 1174–1179.
DATE-2015-WangLWY #gpu: Eliminating intra-warp conflict misses in GPU (BW, ZL, XW, WY), pp. 689–694.
SIGMOD-2015-HeimelKM #estimation #kernel #modelling #multi #self: Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation (MH, MK, VM), pp. 1477–1492.
VLDB-2015-BoghCA #gpu #parallel: Work-Efficient Parallel Skyline Computation for the GPU (KSB, SC, IA), pp. 962–973.
TACAS-2015-Wijs #branch #gpu #similarity: GPU Accelerated Strong and Branching Bisimilarity Checking (AW), pp. 368–383.
PLDI-2015-SharmaBA #gpu #source code #verification: Verification of producer-consumer synchronization in GPU programs (RS, MB, AA), pp. 88–98.
ICML-2015-TristanTS #estimation #gpu #performance: Efficient Training of LDA on a GPU by Mean-for-Mode Estimation (JBT, JT, GLSJ), pp. 59–68.
GPCE-2015-KolesnichenkoPN #contract #gpu #programming: Contract-based general-purpose GPU programming (AK, CMP, SN, BM), pp. 75–84.
SAC-2015-JoselliJC #animation #data type #gpu #named #proximity: NGrid: a proximity data structure for fluids animation with GPU computing (MJ, JRdSJ, EC), pp. 1303–1308.
SAC-2015-MartinCBGP #algorithm #gpu: OpenACC-based GPU acceleration of an optical flow algorithm (NM, JC, GB, CG, MP), pp. 96–98.
ICSE-v2-2015-Salgado #behaviour #cpu #gpu #interactive #kernel #profiling: Profiling Kernels Behavior to Improve CPU / GPU Interactions (RS), pp. 754–756.
ASPLOS-2015-AlglaveBDGKPSW #behaviour #concurrent #gpu #programming: GPU Concurrency: Weak Behaviours and Programming Assumptions (JA, MB, AFD, GG, JK, DP, TS, JW), pp. 577–591.
ASPLOS-2015-ParkPM #collaboration #gpu #multi #named: Chimera: Collaborative Preemption for Multitasking on a Shared GPU (JJKP, YP, SAM), pp. 593–606.
CGO-2015-LiYLZ #automation #gpu #memory management: Automatic data placement into GPU on-chip memory resources (CL, YY, ZL, HZ), pp. 23–33.
HPCA-2015-AroraMPJT #behaviour #benchmark #comprehension #cpu #gpu #metric #power management: Understanding idle behavior and power gating mechanisms in the context of modern benchmarks on CPU-GPU Integrated systems (MA, SM, IP, NJ, DMT), pp. 366–377.
HPCA-2015-LengZR #architecture #gpu: GPU voltage noise: Characterization and hierarchical smoothing of spatial and temporal voltage noise interference in GPU architectures (JL, YZ, VJR), pp. 161–173.
HPCA-2015-SethiaJM #gpu #memory management #named: Mascar: Speeding up GPU warps by reducing memory pitstops (AS, DAJ, SAM), pp. 174–185.
HPCA-2015-TiwariGRMRVOLDN #comprehension #design #fault #gpu #scalability: Understanding GPU errors on large-scale HPC systems and the implications for system design and operation (DT, SG, JHR, DM, PR, SSV, DAGdO, DL, ND, POAN, LC, ASB), pp. 331–342.
HPDC-2015-WahibM #automation #gpu #kernel #scalability: Automated GPU Kernel Transformations in Large-Scale Production Stencil Applications (MW, NM), pp. 259–270.
HPDC-2015-XiaoCHZ #cpu #gpu #monte carlo: Monte Carlo Based Ray Tracing in CPU-GPU Heterogeneous Systems and Applications in Radiation Therapy (KX, DZC, XSH, BZ), pp. 247–258.
PPoPP-2015-AlSaberK #multi #performance #semantics: SemCache++: semantics-aware caching for efficient multi-GPU offloading (NA, MK), pp. 255–256.
PPoPP-2015-PiaoKOLKKL #adaptation #cpu #framework #gpu #javascript #named: JAWS: a JavaScript framework for adaptive CPU-GPU work sharing (XP, CK, YO, HL, JK, HK, JWL), pp. 251–252.
PPoPP-2015-ShiLDHJLWLZ #gpu #graph #hybrid #optimisation: Optimization of asynchronous graph processing on GPU with hybrid coloring model (XS, JL, SD, BH, HJ, LL, ZW, XL, JZ), pp. 271–272.
PPoPP-2015-WangDPWRO #gpu #graph #library #named: Gunrock: a high-performance graph processing library on the GPU (YW, AAD, YP, YW, AR, JDO), pp. 265–266.
DAC-2014-KoKYKH #cpu #gpu #simulation: Hardware-in-the-loop Simulation for CPU/GPU Heterogeneous Platforms (YK, TK, YY, MK, SH), p. 6.
DAC-2014-PathaniaJPM #3d #cpu #game studies #gpu #mobile #power management: Integrated CPU-GPU Power Management for 3D Mobile Games (AP, QJ, AP, TM), p. 6.
DATE-2014-LeeF #framework #named #realtime #runtime #scheduling: GPU-EvR: Run-time event based real-time scheduling framework on GPGPU platform (HL, MAAF), pp. 1–6.
DATE-2014-LeeL #3d #gpu #on the #reduction: On GPU bus power reduction with 3D IC technologies (YJL, SKL), pp. 1–6.
VLDB-2015-HeZH14 #architecture #cpu #gpu #query: In-Cache Query Co-Processing on Coupled CPU-GPU Architectures (JH, SZ, BH), pp. 329–340.
ICEIS-v1-2014-PenaAMFF #algorithm #gpu #parallel #using: An Improved Parallel Algorithm Using GPU for Siting Observers on Terrain (GCP, MVAA, SVGM, WRF, CRF), pp. 367–375.
ICPR-2014-ScottEMFA #pattern matching #scalability: GPU-Based PostgreSQL Extensions for Scalable High-Throughput Pattern Matching (GJS, ME, KM, ZF, DTA), pp. 1880–1885.
SEKE-2014-JuniorCMS #data analysis #gpu #repository: Exploratory Data Analysis of Software Repositories via GPU Processing (JRDSJ, EC, LM, AS), pp. 495–500.
OOPSLA-2014-HolkNSL #data type #gpu #memory management #programming language: Region-based memory management for GPU programming languages: enabling rich data structures on a spartan host (EH, RN, JGS, AL), pp. 141–155.
SAC-2014-AlexandreMP #algorithm #multi #on the: On the support of task-parallel algorithmic skeletons for multi-GPU computing (FA, RM, HP), pp. 880–885.
SAC-2014-AvilaMRPY #distributed #quantum #simulation: GPU-aware distributed quantum simulation (AA, AM, RR, MLP, ACY), pp. 860–865.
CGO-2014-XuWGLGQ #architecture #gpu #memory management #transaction: Software Transactional Memory for GPU Architectures (YX, RW, NG, TL, LG, DQ), p. 1.
HPCA-2014-ElTantawyMOA #architecture #control flow #gpu #multi #performance #scalability: A scalable multi-path microarchitecture for efficient GPU control flow (AE, JWM, MO, TMA), pp. 248–259.
HPCA-2014-KimLJK #architecture #gpu #memory management #named #using: GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management (YK, JL, JEJ, JK), pp. 546–557.
HPCA-2014-NugterenBCB #distance #gpu #modelling #reuse: A detailed GPU cache model based on reuse distance theory (CN, GJvdB, HC, HEB), pp. 37–48.
HPCA-2014-PowerHW #gpu: Supporting x86-64 address translation for 100s of GPU lanes (JP, MDH, DAW), pp. 568–578.
HPDC-2014-KriederWAWKGFR #design #evaluation #framework: Design and evaluation of the gemtc framework for GPU-enabled many-task computing (SJK, JMW, TGA, MW, DSK, BG, ITF, IR), pp. 153–164.
OSDI-2014-KimHZHWWS #abstraction #gpu #named #network #source code: GPUnet: Networking Abstractions for GPU Programs (SK, SH, XZ, YH, AW, EW, MS), pp. 201–216.
CAV-2014-BardsleyBCCDDKLQ #gpu #kernel #verification: Engineering a Static Verification Tool for GPU Kernels (EB, AB, NC, PC, PD, AFD, JK, DL, SQ), pp. 226–242.
CAV-2014-WijsKB #component #composition #graph: GPU-Based Graph Decomposition into Strongly Connected and Maximal End Components (AW, JPK, DB), pp. 310–326.
DAC-2013-HanZF #gpu #named #parallel #simulation: TinySPICE: a parallel SPICE simulator on GPU for massively repeated small circuit simulations (LH, XZ, ZF), p. 8.
DAC-2013-LiaoHL #detection #fault: GPU-based n-detect transition fault ATPG (KYL, SCH, JCML), p. 8.
DATE-2013-ZakharenkoAM #cpu #gpu #performance #using: Characterizing the performance benefits of fused CPU/GPU systems using FusionSim (VZ, TMA, AM), pp. 685–688.
DATE-2013-ZhaiYZ #algorithm #float #random: GPU-friendly floating random walk algorithm for capacitance extraction of VLSI interconnects (KZ, WY, HZ), pp. 1661–1666.
ICDAR-2013-ZhouYL #learning #performance #polynomial #recognition: GPU-Based Fast Training of Discriminative Learning Quadratic Discriminant Function for Handwritten Chinese Character Recognition (MKZ, FY, CLL), pp. 842–846.
VLDB-2013-Bress #gpu #hybrid #performance #query #why: Why it is time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS (SB), pp. 1398–1403.
VLDB-2013-HeLH #architecture #cpu #gpu: Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture (JH, ML, BH), pp. 889–900.
VLDB-2013-YuanL0 #gpu #query: The Yin and Yang of Processing Data Warehousing Queries on GPU Devices (YY, RL, XZ), pp. 817–828.
VLDB-2013-ZhangHHL #architecture #cpu #gpu #named #parallel #performance #query #towards: OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures (SZ, JH, BH, ML), pp. 1374–1377.
ESOP-2013-CollingbourneDKQ #analysis #gpu #kernel #semantics #verification: Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels (PC, AFD, JK, SQ), pp. 270–289.
CSMR-2013-ScannielloECG #gpu #using: Using the GPU to Green an Intensive and Massive Computation System (GS, UE, GC, CG), pp. 384–387.
ICFP-2013-McDonellCKL #functional #gpu #optimisation #source code: Optimising purely functional GPU programs (TLM, MMTC, GK, BL), pp. 49–60.
OOPSLA-2013-ChongDKKQ #abstraction #analysis #gpu #invariant #kernel: Barrier invariants: a shared state abstraction for the analysis of data-dependent GPU kernels (NC, AFD, PHJK, JK, SQ), pp. 605–622.
ASPLOS-2013-JooybarFODA #architecture #gpu #named: GPUDet: a deterministic GPU architecture (HJ, WWLF, MO, JD, TMA), pp. 1–12.
HPCA-2013-LustigM #cpu #fine-grained #gpu #latency: Reducing GPU offload latency via fine-grained CPU-GPU synchronization (DL, MM), pp. 354–365.
HPCA-2013-RhuE #control flow #execution #gpu #performance: The dual-path execution model for efficient GPU control flow (MR, ME), pp. 591–602.
HPCA-2013-SinghSFOA #architecture #gpu: Cache coherence for GPU architectures (IS, AS, WWLF, MO, TMA), pp. 578–590.
HPDC-2013-AjiPJCMBBDFMMT #on the: On the efficacy of GPU-integrated MPI for scientific applications (AMA, LSP, FJ, MC, KM, PB, KRB, JD, WcF, JMMC, XM, RT), pp. 191–202.
HPDC-2013-YuZQYWG #game studies #gpu #named #scheduling: VGRIS: virtualized GPU resource isolation and scheduling in cloud gaming (MY, CZ, ZQ, JY, YW, HG), pp. 203–214.
PPoPP-2013-DeoK #array #gpu #parallel: Parallel suffix array and least common prefix for the GPU (MD, SK), pp. 197–206.
PPoPP-2013-WuZZJS #algorithm #analysis #complexity #design #gpu #memory management: Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU (BW, ZZ, EZZ, YJ, XS), pp. 57–68.
PPoPP-2013-YangXFGLXLSYZ #algorithm #cpu #gpu #simulation: A peta-scalable CPU-GPU algorithm for global atmospheric simulations (CY, WX, HF, LG, LL, YX, YL, JS, GY, WZ), pp. 1–12.
DAC-2012-JeongESP #cpu #gpu #memory management: A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC (MKJ, ME, CS, NCP), pp. 850–855.
DAC-2012-KimLCKWYL #cpu #gpu #hybrid #in memory #memory management: Hybrid DRAM/PRAM-based main memory for single-chip CPU/GPU (DK, SL, JC, DK, DHW, SY, SL), pp. 888–896.
DAC-2012-RenCWZY #gpu #parallel #simulation: Sparse LU factorization for parallel circuit simulation on GPU (LR, XC, YW, CZ, HY), pp. 1125–1130.
DAC-2012-VincoCBF #architecture #gpu #named: SAGA: SystemC acceleration on GPU architectures (SV, DC, VB, FF), pp. 115–120.
DATE-2012-LiuTW #analysis #approach #graph #parallel #statistics: Parallel statistical analysis of analog circuits by GPU-accelerated graph-based approach (XL, SXDT, HW), pp. 852–857.
DATE-2012-LiuTWY #simulation: A GPU-accelerated envelope-following method for switching power converter simulation (XL, SXDT, HW, HY), pp. 1349–1354.
DATE-2012-SuriBE #approach #multi #problem #scalability: A scalable GPU-based approach to accelerate the multiple-choice knapsack problem (BS, UDB, PE), pp. 1126–1129.
VLDB-2012-WangHLWZS #cpu #gpu #hybrid #image: Accelerating Pathology Image Data Cross-Comparison on CPU-GPU Hybrid Systems (KW, YH, RL, FW, XZ, JHS), pp. 1543–1554.
CSMR-2012-JuniorPCM #architecture #parallel #version control: A GPU-based Architecture for Parallel Image-aware Version Control (JRdSJ, TP, EWGC, LGPM), pp. 191–200.
PLDI-2012-LeungGAGJL #gpu #kernel #verification: Verifying GPU kernels by test amplification (AL, MG, YA, RG, RJ, SL), pp. 383–394.
ICFP-2012-BergstromR #gpu: Nested data-parallelism on the GPU (LB, JHR), pp. 247–258.
GRAPHITE-2012-Cormie-Bowins #comparison #gpu #implementation #reachability: A Comparison of Sequential and GPU Implementations of Iterative Methods to Compute Reachability Probabilities (ECB), pp. 20–34.
CIKM-2012-KozawaAK #database #gpu #mining #nondeterminism #probability: GPU acceleration of probabilistic frequent itemset mining from uncertain databases (YK, TA, HK), pp. 892–901.
CIKM-2012-MasadaT #gpu #topic: Extraction of topic evolutions from references in scientific articles and its GPU acceleration (TM, AT), pp. 1522–1526.
OOPSLA-2012-BettsCDQT #gpu #kernel #named #verification: GPUVerify: a verifier for GPU kernels (AB, NC, AFD, SQ, PT), pp. 113–132.
SAC-2012-FazackerleyML #database #gpu: GPU accelerated AES-CBC for database applications (SF, SMM, RL), pp. 873–878.
SAC-2012-JiXWLTY #gpu #sequence: High-throughput antibody sequence alignment based on GPU computing (GJ, ZX, XW, SL, MT, JY), pp. 1417–1418.
CC-2012-UnkuleSQ #automation #gpu #kernel #locality #thread: Automatic Restructuring of GPU Kernels for Exploiting Inter-thread Data Locality (SU, CS, AQ), pp. 21–40.
CGO-2012-JablinJPLA #architecture #cpu #gpu: Dynamically managed data for CPU-GPU architectures (TBJ, JAJ, PP, FL, DIA), pp. 165–174.
CGO-2012-ZhangM #3d #clustering #gpu: Auto-generation and auto-tuning of 3D stencil codes on GPU clusters (YZ, FM), pp. 155–164.
HPCA-2012-LeeK #architecture #cpu #gpu #named #policy: TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture (JL, HK), pp. 91–102.
HPCA-2012-YangXMZ #architecture #cpu #gpu: CPU-assisted GPGPU on fused CPU-GPU architectures (YY, PX, MM, HZ), pp. 103–114.
HPDC-2012-PhullLRCC #clustering #resource management: Interference-driven resource management for GPU-based heterogeneous clusters (RP, CHL, KR, SC, STC), pp. 109–120.
PPoPP-2012-HuynhHWG #framework #multi #scalability #streaming: Scalable framework for mapping streaming applications onto multi-GPU systems (HPH, AH, WFW, RSMG), pp. 1–10.
PPoPP-2012-KimSLNJL #clustering #cpu #gpu #programming: OpenCL as a unified programming model for heterogeneous CPU/GPU clusters (JK, SS, JL, JN, GJ, JL), pp. 299–300.
PPoPP-2012-LiuAHLSZWT #gpu #implementation #named: FlexBFS: a parallelism-aware implementation of breadth-first search on GPU (GL, HA, WH, XL, TS, WZ, XW, XT), pp. 279–280.
PPoPP-2012-Mendez-LojoBP #analysis #gpu #implementation #points-to: A GPU implementation of inclusion-based points-to analysis (MML, MB, KP), pp. 107–116.
PPoPP-2012-MerrillGG #gpu #graph #scalability #traversal: Scalable GPU graph traversal (DM, MG, ASG), pp. 117–128.
PPoPP-2012-TaoBB #development #gpu #kernel #scalability #using: Using GPU’s to accelerate stencil-based computation kernels for the development of large scale scientific applications on heterogeneous systems (JT, MB, SRB), pp. 287–288.
PPoPP-2012-ZuYXWTPD #automaton #implementation #memory management #nondeterminism #performance #regular expression: GPU-based NFA implementation for memory efficient high speed regular expression matching (YZ, MY, ZX, LW, XT, KP, QD), pp. 129–140.
DAC-2011-ZhaoF #3d #gpu #parallel #performance: Fast multipole method on GPU: tackling 3-D capacitance extraction on massively parallel SIMD platforms (XZ, ZF), pp. 558–563.
DAC-2011-ZhuDC #architecture #cpu #gpu #named: Hermes: an integrated CPU/GPU microarchitecture for IP routing (YZ, YD, YC), pp. 1044–1049.
DATE-2011-KangD #classification #gpu #metaprogramming #scalability: Scalable packet classification via GPU metaprogramming (KK, YSD), pp. 871–874.
DATE-2011-Wang #coordination #gpu #kernel #power management: Coordinate strip-mining and kernel fusion to lower power consumption on GPU (GW), pp. 1218–1219.
PLDI-2011-JablinPJJBA #automation #communication #cpu #gpu #optimisation: Automatic CPU-GPU communication management and optimization (TBJ, PP, JAJ, NPJ, SRB, DIA), pp. 142–151.
CSCW-2011-AspinR #3d #approach #gpu #multi: A GPU based, projective multi-texturing approach to reconstructing the 3D human form for application in tele-presence (RAA, DJR), pp. 105–112.
CIKM-2011-KrulisLBSS #architecture #distance #gpu #manycore #polynomial: Processing the signature quadratic form distance on many-core GPU architectures (MK, JL, CB, TS, TS), pp. 2373–2376.
KDD-2011-CotterSK #approach #kernel: A GPU-tailored approach for training kernelized SVMs (AC, NS, JK), pp. 805–813.
ASPLOS-2011-ZhangJGTS #gpu #on the fly: On-the-fly elimination of dynamic irregularities for GPU computing (EZZ, YJ, ZG, KT, XS), pp. 369–380.
HPCA-2011-ZhangO #analysis #architecture #gpu #performance: A quantitative performance analysis model for GPU architectures (YZ, JDO), pp. 382–393.
HPDC-2011-LiLTCZ #3d #cpu #experience #gpu #re-engineering: Experience of parallelizing cryo-EM 3D reconstruction on a CPU-GPU heterogeneous system (LL, XL, GT, MC, PZ), pp. 195–204.
HPDC-2011-RaviBAC #framework #gpu #runtime: Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework (VTR, MB, GA, STC), pp. 217–228.
ISMM-2011-VeldemaP #gpu: Iterative data-parallel mark&sweep on a GPU (RV, MP), pp. 1–10.
PPoPP-2011-ZhengRQA #detection #gpu #named #source code: GRace: a low-overhead mechanism for detecting data races in GPU programs (MZ, VTR, FQ, GA), pp. 135–146.
DAC-2010-LuoWH #effectiveness #gpu #implementation: An effective GPU implementation of breadth-first search (LL, MDFW, WmWH), pp. 52–55.
DATE-2010-RathiDGCV #distance #feature model #gpu #implementation: A GPU based implementation of Center-Surround Distribution Distance for feature extraction and matching (AR, MD, WG, RTC, NV), pp. 172–177.
FSE-2010-LiG #gpu #kernel #scalability #smt #verification: Scalable SMT-based verification of GPU kernel functions (GL, GG), pp. 187–196.
ASPLOS-2010-WooL #gpu #named #programmable #using: COMPASS: a programmable data prefetcher using idle GPU shaders (DHW, HHSL), pp. 297–310.
HPDC-2010-GharaibehAGR #gpu: A GPU accelerated storage system (AG, SAK, SG, MR), pp. 167–178.
HPDC-2010-LinWG #gpu #migration: OpenGL application live migration with GPU acceleration in personal cloud (YL, WW, KG), pp. 280–283.
HPDC-2010-LiuS #parallel: GPU-based parallel householder bidiagonalization (FL, FJS), pp. 288–291.
HPDC-2010-StuartCMO #multi #pipes and filters #using: Multi-GPU volume rendering using MapReduce (JAS, CKC, KLM, JDO), pp. 841–848.
PPoPP-2010-BaghsorkhiDPGH #adaptation #architecture #gpu #modelling #performance: An adaptive performance modeling tool for GPU architectures (SSB, MD, SJP, WDG, WmWH), pp. 105–114.
PPoPP-2010-SandesM #comparison #gpu #named #sequence #using: CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences (EFdOS, ACMAdM), pp. 137–146.
PPoPP-2010-ZhangCO #gpu #performance: Fast tridiagonal solvers on the GPU (YZ, JC, JDO), pp. 127–136.
DAC-2009-LiuH #optimisation #parallel #performance: GPU-based parallelization for fast circuit optimization (YL, JH), pp. 943–946.
DAC-2009-ShiCHMTHW #analysis #gpu #grid #network #performance #power management: GPU friendly fast Poisson solver for structured power grid network analysis (JS, YC, WH, LM, SXDT, PHH, XW), pp. 178–183.
SAC-2009-FortS #distance #network: GPU-based computation of distance functions on road networks with applications (MF, JAS), pp. 1320–1324.
DAC-2008-Garland #gpu #manycore #matrix: Sparse matrix computations on manycore GPU’s (MG), pp. 2–6.
DATE-2008-CopeCL #configuration management #gpu #logic #memory management #using: Using Reconfigurable Logic to Optimise GPU Memory Accesses (BC, PYKC, WL), pp. 44–49.
ICPR-2008-ChariotK #image #online: GPU-boosted online image matching (AC, RK), pp. 1–4.
ICPR-2008-KauffmannP #automaton #gpu: Cellular automaton for ultra-fast watershed transform on GPU (CK, NP), pp. 1–4.
ICPR-2008-LeeWN #detection #performance #using: Very fast ellipse detection using GPU-based RHT (JKL, BAW, TSN), pp. 1–4.
CGO-2008-RyooRSBUSH #gpu #optimisation #parallel #thread: Program optimization space pruning for a multithreaded GPU (SR, CIR, SSS, SSB, SZU, JAS, WmWH), pp. 195–204.
HPDC-2008-Al-KiswanyGSYR #distributed #named: StoreGPU: exploiting graphics processing units to accelerate distributed storage systems (SAK, AG, ESN, GY, MR), pp. 165–174.
PPoPP-2008-FernandesSS #gpu #parallel: Massive parallel LDPC decoding on GPU (GFPF, LS, VMMdS), pp. 83–90.
PPoPP-2008-RyooRBSKH #evaluation #gpu #optimisation #parallel #performance #thread #using: Optimization principles and application performance evaluation of a multithreaded GPU using CUDA (SR, CIR, SSB, SSS, DBK, WmWH), pp. 73–82.
CASE-2007-LucianoBR: GPU-based elastic-object deformation for enhancement of existing haptic applications (CL, PPB, SHRR), pp. 146–151.
HIMI-IIE-2007-MoradiKSH #algorithm #detection #navigation #realtime: A Real-Time GPU-Based Wall Detection Algorithm for Mapping and Navigation in Indoor Environments (HM, EK, DNS, JH), pp. 1072–1077.
CGO-2007-Buck #gpu #parallel #programming: GPU Computing: Programming a Massively Parallel Processor (IB), p. 17.
ISMM-2007-Kirk #architecture #gpu #parallel: NVIDIA CUDA software and GPU parallel computing architecture (DK), pp. 103–104.
SIGMOD-2006-GovindarajuGKM #database #named #performance #scalability #sorting: GPUTeraSort: high performance graphics co-processor sorting for large database management (NKG, JG, RK, DM), pp. 325–336.
ICPR-v3-2006-MinM #gpu: Tensor Voting Accelerated by Graphics Processing Units (GPU) (CM, GGM), pp. 1103–1106.
SAC-2006-LejdforsO #embedded #generative #gpu #implementation: Implementing an embedded GPU language by combining translation and generation (CL, LO), pp. 1610–1614.