146 papers:
- DAC-2015-HanF #analysis #approach #cpu #gpu #graph #scalability
- Transient-simulation guided graph sparsification approach to scalable harmonic balance (HB) analysis of post-layout RF circuits leveraging heterogeneous CPU-GPU computing systems (LH, ZF), p. 6.
- DAC-2015-JungC #embedded #multi #named #performance #simulation
- ΣVP: host-GPU multiplexing for efficient simulation of multiple embedded GPUs on virtual platforms (YJ, LPC), p. 6.
- DAC-2015-KadjoAKG #approach #cpu #energy #gpu #mobile #performance
- A control-theoretic approach for energy efficient CPU-GPU subsystem in mobile platforms (DK, RA, MK, PVG), p. 6.
- DATE-2015-GerumBR #gpu #performance #simulation
- Source level performance simulation of GPU cores (CG, OB, WR), pp. 217–222.
- DATE-2015-NguyenASS #gpu #simulation
- Accelerating complex brain-model simulations on GPU platforms (HADN, ZAA, GS, CS), pp. 974–979.
- DATE-2015-ParkAHYL #big data #energy #gpu #low cost #memory management #performance
- Memory fast-forward: a low cost special function unit to enhance energy efficiency in GPU for big data processing (EP, JA, SH, SY, SL), pp. 1341–1346.
- DATE-2015-SchneiderHKWW #fault #simulation
- GPU-accelerated small delay fault simulation (ES, SH, MAK, XW, HJW), pp. 1174–1179.
- DATE-2015-WangLWY #gpu
- Eliminating intra-warp conflict misses in GPU (BW, ZL, XW, WY), pp. 689–694.
- SIGMOD-2015-HeimelKM #estimation #kernel #modelling #multi #self
- Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation (MH, MK, VM), pp. 1477–1492.
- VLDB-2015-BoghCA #gpu #parallel
- Work-Efficient Parallel Skyline Computation for the GPU (KSB, SC, IA), pp. 962–973.
- TACAS-2015-Wijs #branch #gpu #similarity
- GPU Accelerated Strong and Branching Bisimilarity Checking (AW), pp. 368–383.
- PLDI-2015-SharmaBA #gpu #source code #verification
- Verification of producer-consumer synchronization in GPU programs (RS, MB, AA), pp. 88–98.
- ICML-2015-TristanTS #estimation #gpu #performance
- Efficient Training of LDA on a GPU by Mean-for-Mode Estimation (JBT, JT, GLSJ), pp. 59–68.
- GPCE-2015-KolesnichenkoPN #contract #gpu #programming
- Contract-based general-purpose GPU programming (AK, CMP, SN, BM), pp. 75–84.
- SAC-2015-JoselliJC #animation #data type #gpu #named #proximity
- NGrid: a proximity data structure for fluids animation with GPU computing (MJ, JRdSJ, EC), pp. 1303–1308.
- SAC-2015-MartinCBGP #algorithm #gpu
- OpenACC-based GPU acceleration of an optical flow algorithm (NM, JC, GB, CG, MP), pp. 96–98.
- ICSE-v2-2015-Salgado #behaviour #cpu #gpu #interactive #kernel #profiling
- Profiling Kernels Behavior to Improve CPU / GPU Interactions (RS), pp. 754–756.
- ASPLOS-2015-AlglaveBDGKPSW #behaviour #concurrent #gpu #programming
- GPU Concurrency: Weak Behaviours and Programming Assumptions (JA, MB, AFD, GG, JK, DP, TS, JW), pp. 577–591.
- ASPLOS-2015-ParkPM #collaboration #gpu #multi #named
- Chimera: Collaborative Preemption for Multitasking on a Shared GPU (JJKP, YP, SAM), pp. 593–606.
- CGO-2015-LiYLZ #automation #gpu #memory management
- Automatic data placement into GPU on-chip memory resources (CL, YY, ZL, HZ), pp. 23–33.
- HPCA-2015-AroraMPJT #behaviour #benchmark #comprehension #cpu #gpu #metric #power management
- Understanding idle behavior and power gating mechanisms in the context of modern benchmarks on CPU-GPU Integrated systems (MA, SM, IP, NJ, DMT), pp. 366–377.
- HPCA-2015-LengZR #architecture #gpu
- GPU voltage noise: Characterization and hierarchical smoothing of spatial and temporal voltage noise interference in GPU architectures (JL, YZ, VJR), pp. 161–173.
- HPCA-2015-SethiaJM #gpu #memory management #named
- Mascar: Speeding up GPU warps by reducing memory pitstops (AS, DAJ, SAM), pp. 174–185.
- HPCA-2015-TiwariGRMRVOLDN #comprehension #design #fault #gpu #scalability
- Understanding GPU errors on large-scale HPC systems and the implications for system design and operation (DT, SG, JHR, DM, PR, SSV, DAGdO, DL, ND, POAN, LC, ASB), pp. 331–342.
- HPDC-2015-WahibM #automation #gpu #kernel #scalability
- Automated GPU Kernel Transformations in Large-Scale Production Stencil Applications (MW, NM), pp. 259–270.
- HPDC-2015-XiaoCHZ #cpu #gpu #monte carlo
- Monte Carlo Based Ray Tracing in CPU-GPU Heterogeneous Systems and Applications in Radiation Therapy (KX, DZC, XSH, BZ), pp. 247–258.
- PPoPP-2015-AlSaberK #multi #performance #semantics
- SemCache++: semantics-aware caching for efficient multi-GPU offloading (NA, MK), pp. 255–256.
- PPoPP-2015-PiaoKOLKKL #adaptation #cpu #framework #gpu #javascript #named
- JAWS: a JavaScript framework for adaptive CPU-GPU work sharing (XP, CK, YO, HL, JK, HK, JWL), pp. 251–252.
- PPoPP-2015-ShiLDHJLWLZ #gpu #graph #hybrid #optimisation
- Optimization of asynchronous graph processing on GPU with hybrid coloring model (XS, JL, SD, BH, HJ, LL, ZW, XL, JZ), pp. 271–272.
- PPoPP-2015-WangDPWRO #gpu #graph #library #named
- Gunrock: a high-performance graph processing library on the GPU (YW, AAD, YP, YW, AR, JDO), pp. 265–266.
- DAC-2014-KoKYKH #cpu #gpu #simulation
- Hardware-in-the-loop Simulation for CPU/GPU Heterogeneous Platforms (YK, TK, YY, MK, SH), p. 6.
- DAC-2014-PathaniaJPM #3d #cpu #game studies #gpu #mobile #power management
- Integrated CPU-GPU Power Management for 3D Mobile Games (AP, QJ, AP, TM), p. 6.
- DATE-2014-LeeF #framework #named #realtime #runtime #scheduling
- GPU-EvR: Run-time event based real-time scheduling framework on GPGPU platform (HL, MAAF), pp. 1–6.
- DATE-2014-LeeL #3d #gpu #on the #reduction
- On GPU bus power reduction with 3D IC technologies (YJL, SKL), pp. 1–6.
- VLDB-2015-HeZH14 #architecture #cpu #gpu #query
- In-Cache Query Co-Processing on Coupled CPU-GPU Architectures (JH, SZ, BH), pp. 329–340.
- ICEIS-v1-2014-PenaAMFF #algorithm #gpu #parallel #using
- An Improved Parallel Algorithm Using GPU for Siting Observers on Terrain (GCP, MVAA, SVGM, WRF, CRF), pp. 367–375.
- ICPR-2014-ScottEMFA #pattern matching #scalability
- GPU-Based PostgreSQL Extensions for Scalable High-Throughput Pattern Matching (GJS, ME, KM, ZF, DTA), pp. 1880–1885.
- SEKE-2014-JuniorCMS #data analysis #gpu #repository
- Exploratory Data Analysis of Software Repositories via GPU Processing (JRDSJ, EC, LM, AS), pp. 495–500.
- OOPSLA-2014-HolkNSL #data type #gpu #memory management #programming language
- Region-based memory management for GPU programming languages: enabling rich data structures on a spartan host (EH, RN, JGS, AL), pp. 141–155.
- SAC-2014-AlexandreMP #algorithm #multi #on the
- On the support of task-parallel algorithmic skeletons for multi-GPU computing (FA, RM, HP), pp. 880–885.
- SAC-2014-AvilaMRPY #distributed #quantum #simulation
- GPU-aware distributed quantum simulation (AA, AM, RR, MLP, ACY), pp. 860–865.
- CGO-2014-XuWGLGQ #architecture #gpu #memory management #transaction
- Software Transactional Memory for GPU Architectures (YX, RW, NG, TL, LG, DQ), p. 1.
- HPCA-2014-ElTantawyMOA #architecture #control flow #gpu #multi #performance #scalability
- A scalable multi-path microarchitecture for efficient GPU control flow (AE, JWM, MO, TMA), pp. 248–259.
- HPCA-2014-KimLJK #architecture #gpu #memory management #named #using
- GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management (YK, JL, JEJ, JK), pp. 546–557.
- HPCA-2014-NugterenBCB #distance #gpu #modelling #reuse
- A detailed GPU cache model based on reuse distance theory (CN, GJvdB, HC, HEB), pp. 37–48.
- HPCA-2014-PowerHW #gpu
- Supporting x86-64 address translation for 100s of GPU lanes (JP, MDH, DAW), pp. 568–578.
- HPDC-2014-KriederWAWKGFR #design #evaluation #framework
- Design and evaluation of the GeMTC framework for GPU-enabled many-task computing (SJK, JMW, TGA, MW, DSK, BG, ITF, IR), pp. 153–164.
- OSDI-2014-KimHZHWWS #abstraction #gpu #named #network #source code
- GPUnet: Networking Abstractions for GPU Programs (SK, SH, XZ, YH, AW, EW, MS), pp. 201–216.
- CAV-2014-BardsleyBCCDDKLQ #gpu #kernel #verification
- Engineering a Static Verification Tool for GPU Kernels (EB, AB, NC, PC, PD, AFD, JK, DL, SQ), pp. 226–242.
- CAV-2014-WijsKB #component #composition #graph
- GPU-Based Graph Decomposition into Strongly Connected and Maximal End Components (AW, JPK, DB), pp. 310–326.
- DAC-2013-HanZF #gpu #named #parallel #simulation
- TinySPICE: a parallel SPICE simulator on GPU for massively repeated small circuit simulations (LH, XZ, ZF), p. 8.
- DAC-2013-LiaoHL #detection #fault
- GPU-based n-detect transition fault ATPG (KYL, SCH, JCML), p. 8.
- DATE-2013-ZakharenkoAM #cpu #gpu #performance #using
- Characterizing the performance benefits of fused CPU/GPU systems using FusionSim (VZ, TMA, AM), pp. 685–688.
- DATE-2013-ZhaiYZ #algorithm #float #random
- GPU-friendly floating random walk algorithm for capacitance extraction of VLSI interconnects (KZ, WY, HZ), pp. 1661–1666.
- ICDAR-2013-ZhouYL #learning #performance #polynomial #recognition
- GPU-Based Fast Training of Discriminative Learning Quadratic Discriminant Function for Handwritten Chinese Character Recognition (MKZ, FY, CLL), pp. 842–846.
- VLDB-2013-Bress #gpu #hybrid #performance #query #why
- Why it is time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS (SB), pp. 1398–1403.
- VLDB-2013-HeLH #architecture #cpu #gpu
- Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture (JH, ML, BH), pp. 889–900.
- VLDB-2013-YuanL0 #gpu #query
- The Yin and Yang of Processing Data Warehousing Queries on GPU Devices (YY, RL, XZ), pp. 817–828.
- VLDB-2013-ZhangHHL #architecture #cpu #gpu #named #parallel #performance #query #towards
- OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures (SZ, JH, BH, ML), pp. 1374–1377.
- ESOP-2013-CollingbourneDKQ #analysis #gpu #kernel #semantics #verification
- Interleaving and Lock-Step Semantics for Analysis and Verification of GPU Kernels (PC, AFD, JK, SQ), pp. 270–289.
- CSMR-2013-ScannielloECG #gpu #using
- Using the GPU to Green an Intensive and Massive Computation System (GS, UE, GC, CG), pp. 384–387.
- ICFP-2013-McDonellCKL #functional #gpu #optimisation #source code
- Optimising purely functional GPU programs (TLM, MMTC, GK, BL), pp. 49–60.
- OOPSLA-2013-ChongDKKQ #abstraction #analysis #gpu #invariant #kernel
- Barrier invariants: a shared state abstraction for the analysis of data-dependent GPU kernels (NC, AFD, PHJK, JK, SQ), pp. 605–622.
- ASPLOS-2013-JooybarFODA #architecture #gpu #named
- GPUDet: a deterministic GPU architecture (HJ, WWLF, MO, JD, TMA), pp. 1–12.
- HPCA-2013-LustigM #cpu #fine-grained #gpu #latency
- Reducing GPU offload latency via fine-grained CPU-GPU synchronization (DL, MM), pp. 354–365.
- HPCA-2013-RhuE #control flow #execution #gpu #performance
- The dual-path execution model for efficient GPU control flow (MR, ME), pp. 591–602.
- HPCA-2013-SinghSFOA #architecture #gpu
- Cache coherence for GPU architectures (IS, AS, WWLF, MO, TMA), pp. 578–590.
- HPDC-2013-AjiPJCMBBDFMMT #on the
- On the efficacy of GPU-integrated MPI for scientific applications (AMA, LSP, FJ, MC, KM, PB, KRB, JD, WcF, JMMC, XM, RT), pp. 191–202.
- HPDC-2013-YuZQYWG #game studies #gpu #named #scheduling
- VGRIS: virtualized GPU resource isolation and scheduling in cloud gaming (MY, CZ, ZQ, JY, YW, HG), pp. 203–214.
- PPoPP-2013-DeoK #array #gpu #parallel
- Parallel suffix array and least common prefix for the GPU (MD, SK), pp. 197–206.
- PPoPP-2013-WuZZJS #algorithm #analysis #complexity #design #gpu #memory management
- Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU (BW, ZZ, EZZ, YJ, XS), pp. 57–68.
- PPoPP-2013-YangXFGLXLSYZ #algorithm #cpu #gpu #simulation
- A peta-scalable CPU-GPU algorithm for global atmospheric simulations (CY, WX, HF, LG, LL, YX, YL, JS, GY, WZ), pp. 1–12.
- DAC-2012-JeongESP #cpu #gpu #memory management
- A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC (MKJ, ME, CS, NCP), pp. 850–855.
- DAC-2012-KimLCKWYL #cpu #gpu #hybrid #in memory #memory management
- Hybrid DRAM/PRAM-based main memory for single-chip CPU/GPU (DK, SL, JC, DK, DHW, SY, SL), pp. 888–896.
- DAC-2012-RenCWZY #gpu #parallel #simulation
- Sparse LU factorization for parallel circuit simulation on GPU (LR, XC, YW, CZ, HY), pp. 1125–1130.
- DAC-2012-VincoCBF #architecture #gpu #named
- SAGA: SystemC acceleration on GPU architectures (SV, DC, VB, FF), pp. 115–120.
- DATE-2012-LiuTW #analysis #approach #graph #parallel #statistics
- Parallel statistical analysis of analog circuits by GPU-accelerated graph-based approach (XL, SXDT, HW), pp. 852–857.
- DATE-2012-LiuTWY #simulation
- A GPU-accelerated envelope-following method for switching power converter simulation (XL, SXDT, HW, HY), pp. 1349–1354.
- DATE-2012-SuriBE #approach #multi #problem #scalability
- A scalable GPU-based approach to accelerate the multiple-choice knapsack problem (BS, UDB, PE), pp. 1126–1129.
- VLDB-2012-WangHLWZS #cpu #gpu #hybrid #image
- Accelerating Pathology Image Data Cross-Comparison on CPU-GPU Hybrid Systems (KW, YH, RL, FW, XZ, JHS), pp. 1543–1554.
- CSMR-2012-JuniorPCM #architecture #parallel #version control
- A GPU-based Architecture for Parallel Image-aware Version Control (JRdSJ, TP, EWGC, LGPM), pp. 191–200.
- PLDI-2012-LeungGAGJL #gpu #kernel #verification
- Verifying GPU kernels by test amplification (AL, MG, YA, RG, RJ, SL), pp. 383–394.
- ICFP-2012-BergstromR #gpu
- Nested data-parallelism on the GPU (LB, JHR), pp. 247–258.
- GRAPHITE-2012-Cormie-Bowins #comparison #gpu #implementation #reachability
- A Comparison of Sequential and GPU Implementations of Iterative Methods to Compute Reachability Probabilities (ECB), pp. 20–34.
- CIKM-2012-KozawaAK #database #gpu #mining #nondeterminism #probability
- GPU acceleration of probabilistic frequent itemset mining from uncertain databases (YK, TA, HK), pp. 892–901.
- CIKM-2012-MasadaT #gpu #topic
- Extraction of topic evolutions from references in scientific articles and its GPU acceleration (TM, AT), pp. 1522–1526.
- OOPSLA-2012-BettsCDQT #gpu #kernel #named #verification
- GPUVerify: a verifier for GPU kernels (AB, NC, AFD, SQ, PT), pp. 113–132.
- SAC-2012-FazackerleyML #database #gpu
- GPU accelerated AES-CBC for database applications (SF, SMM, RL), pp. 873–878.
- SAC-2012-JiXWLTY #gpu #sequence
- High-throughput antibody sequence alignment based on GPU computing (GJ, ZX, XW, SL, MT, JY), pp. 1417–1418.
- CC-2012-UnkuleSQ #automation #gpu #kernel #locality #thread
- Automatic Restructuring of GPU Kernels for Exploiting Inter-thread Data Locality (SU, CS, AQ), pp. 21–40.
- CGO-2012-JablinJPLA #architecture #cpu #gpu
- Dynamically managed data for CPU-GPU architectures (TBJ, JAJ, PP, FL, DIA), pp. 165–174.
- CGO-2012-ZhangM #3d #clustering #gpu
- Auto-generation and auto-tuning of 3D stencil codes on GPU clusters (YZ, FM), pp. 155–164.
- HPCA-2012-LeeK #architecture #cpu #gpu #named #policy
- TAP: A TLP-aware cache management policy for a CPU-GPU heterogeneous architecture (JL, HK), pp. 91–102.
- HPCA-2012-YangXMZ #architecture #cpu #gpu
- CPU-assisted GPGPU on fused CPU-GPU architectures (YY, PX, MM, HZ), pp. 103–114.
- HPDC-2012-PhullLRCC #clustering #resource management
- Interference-driven resource management for GPU-based heterogeneous clusters (RP, CHL, KR, SC, STC), pp. 109–120.
- PPoPP-2012-HuynhHWG #framework #multi #scalability #streaming
- Scalable framework for mapping streaming applications onto multi-GPU systems (HPH, AH, WFW, RSMG), pp. 1–10.
- PPoPP-2012-KimSLNJL #clustering #cpu #gpu #programming
- OpenCL as a unified programming model for heterogeneous CPU/GPU clusters (JK, SS, JL, JN, GJ, JL), pp. 299–300.
- PPoPP-2012-LiuAHLSZWT #gpu #implementation #named
- FlexBFS: a parallelism-aware implementation of breadth-first search on GPU (GL, HA, WH, XL, TS, WZ, XW, XT), pp. 279–280.
- PPoPP-2012-Mendez-LojoBP #analysis #gpu #implementation #points-to
- A GPU implementation of inclusion-based points-to analysis (MML, MB, KP), pp. 107–116.
- PPoPP-2012-MerrillGG #gpu #graph #scalability #traversal
- Scalable GPU graph traversal (DM, MG, ASG), pp. 117–128.
- PPoPP-2012-TaoBB #development #gpu #kernel #scalability #using
- Using GPU’s to accelerate stencil-based computation kernels for the development of large scale scientific applications on heterogeneous systems (JT, MB, SRB), pp. 287–288.
- PPoPP-2012-ZuYXWTPD #automaton #implementation #memory management #nondeterminism #performance #regular expression
- GPU-based NFA implementation for memory efficient high speed regular expression matching (YZ, MY, ZX, LW, XT, KP, QD), pp. 129–140.
- DAC-2011-ZhaoF #3d #gpu #parallel #performance
- Fast multipole method on GPU: tackling 3-D capacitance extraction on massively parallel SIMD platforms (XZ, ZF), pp. 558–563.
- DAC-2011-ZhuDC #architecture #cpu #gpu #named
- Hermes: an integrated CPU/GPU microarchitecture for IP routing (YZ, YD, YC), pp. 1044–1049.
- DATE-2011-KangD #classification #gpu #metaprogramming #scalability
- Scalable packet classification via GPU metaprogramming (KK, YSD), pp. 871–874.
- DATE-2011-Wang #coordination #gpu #kernel #power management
- Coordinate strip-mining and kernel fusion to lower power consumption on GPU (GW), pp. 1218–1219.
- PLDI-2011-JablinPJJBA #automation #communication #cpu #gpu #optimisation
- Automatic CPU-GPU communication management and optimization (TBJ, PP, JAJ, NPJ, SRB, DIA), pp. 142–151.
- CSCW-2011-AspinR #3d #approach #gpu #multi
- A GPU based, projective multi-texturing approach to reconstructing the 3D human form for application in tele-presence (RAA, DJR), pp. 105–112.
- CIKM-2011-KrulisLBSS #architecture #distance #gpu #manycore #polynomial
- Processing the signature quadratic form distance on many-core GPU architectures (MK, JL, CB, TS, TS), pp. 2373–2376.
- KDD-2011-CotterSK #approach #kernel
- A GPU-tailored approach for training kernelized SVMs (AC, NS, JK), pp. 805–813.
- ASPLOS-2011-ZhangJGTS #gpu #on the fly
- On-the-fly elimination of dynamic irregularities for GPU computing (EZZ, YJ, ZG, KT, XS), pp. 369–380.
- HPCA-2011-ZhangO #analysis #architecture #gpu #performance
- A quantitative performance analysis model for GPU architectures (YZ, JDO), pp. 382–393.
- HPDC-2011-LiLTCZ #3d #cpu #experience #gpu #re-engineering
- Experience of parallelizing cryo-EM 3D reconstruction on a CPU-GPU heterogeneous system (LL, XL, GT, MC, PZ), pp. 195–204.
- HPDC-2011-RaviBAC #framework #gpu #runtime
- Supporting GPU sharing in cloud environments with a transparent runtime consolidation framework (VTR, MB, GA, STC), pp. 217–228.
- ISMM-2011-VeldemaP #gpu
- Iterative data-parallel mark&sweep on a GPU (RV, MP), pp. 1–10.
- PPoPP-2011-ZhengRQA #detection #gpu #named #source code
- GRace: a low-overhead mechanism for detecting data races in GPU programs (MZ, VTR, FQ, GA), pp. 135–146.
- DAC-2010-LuoWH #effectiveness #gpu #implementation
- An effective GPU implementation of breadth-first search (LL, MDFW, WmWH), pp. 52–55.
- DATE-2010-RathiDGCV #distance #feature model #gpu #implementation
- A GPU based implementation of Center-Surround Distribution Distance for feature extraction and matching (AR, MD, WG, RTC, NV), pp. 172–177.
- FSE-2010-LiG #gpu #kernel #scalability #smt #verification
- Scalable SMT-based verification of GPU kernel functions (GL, GG), pp. 187–196.
- ASPLOS-2010-WooL #gpu #named #programmable #using
- COMPASS: a programmable data prefetcher using idle GPU shaders (DHW, HHSL), pp. 297–310.
- HPDC-2010-GharaibehAGR #gpu
- A GPU accelerated storage system (AG, SAK, SG, MR), pp. 167–178.
- HPDC-2010-LinWG #gpu #migration
- OpenGL application live migration with GPU acceleration in personal cloud (YL, WW, KG), pp. 280–283.
- HPDC-2010-LiuS #parallel
- GPU-based parallel householder bidiagonalization (FL, FJS), pp. 288–291.
- HPDC-2010-StuartCMO #multi #pipes and filters #using
- Multi-GPU volume rendering using MapReduce (JAS, CKC, KLM, JDO), pp. 841–848.
- PPoPP-2010-BaghsorkhiDPGH #adaptation #architecture #gpu #modelling #performance
- An adaptive performance modeling tool for GPU architectures (SSB, MD, SJP, WDG, WmWH), pp. 105–114.
- PPoPP-2010-SandesM #comparison #gpu #named #sequence #using
- CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences (EFdOS, ACMAdM), pp. 137–146.
- PPoPP-2010-ZhangCO #gpu #performance
- Fast tridiagonal solvers on the GPU (YZ, JC, JDO), pp. 127–136.
- DAC-2009-LiuH #optimisation #parallel #performance
- GPU-based parallelization for fast circuit optimization (YL, JH), pp. 943–946.
- DAC-2009-ShiCHMTHW #analysis #gpu #grid #network #performance #power management
- GPU friendly fast Poisson solver for structured power grid network analysis (JS, YC, WH, LM, SXDT, PHH, XW), pp. 178–183.
- SAC-2009-FortS #distance #network
- GPU-based computation of distance functions on road networks with applications (MF, JAS), pp. 1320–1324.
- DAC-2008-Garland #gpu #manycore #matrix
- Sparse matrix computations on manycore GPU’s (MG), pp. 2–6.
- DATE-2008-CopeCL #configuration management #gpu #logic #memory management #using
- Using Reconfigurable Logic to Optimise GPU Memory Accesses (BC, PYKC, WL), pp. 44–49.
- ICPR-2008-ChariotK #image #online
- GPU-boosted online image matching (AC, RK), pp. 1–4.
- ICPR-2008-KauffmannP #automaton #gpu
- Cellular automaton for ultra-fast watershed transform on GPU (CK, NP), pp. 1–4.
- ICPR-2008-LeeWN #detection #performance #using
- Very fast ellipse detection using GPU-based RHT (JKL, BAW, TSN), pp. 1–4.
- CGO-2008-RyooRSBUSH #gpu #optimisation #parallel #thread
- Program optimization space pruning for a multithreaded GPU (SR, CIR, SSS, SSB, SZU, JAS, WmWH), pp. 195–204.
- HPDC-2008-Al-KiswanyGSYR #distributed #named
- StoreGPU: exploiting graphics processing units to accelerate distributed storage systems (SAK, AG, ESN, GY, MR), pp. 165–174.
- PPoPP-2008-FernandesSS #gpu #parallel
- Massive parallel LDPC decoding on GPU (GFPF, LS, VMMdS), pp. 83–90.
- PPoPP-2008-RyooRBSKH #evaluation #gpu #optimisation #parallel #performance #thread #using
- Optimization principles and application performance evaluation of a multithreaded GPU using CUDA (SR, CIR, SSB, SSS, DBK, WmWH), pp. 73–82.
- CASE-2007-LucianoBR
- GPU-based elastic-object deformation for enhancement of existing haptic applications (CL, PPB, SHRR), pp. 146–151.
- HIMI-IIE-2007-MoradiKSH #algorithm #detection #navigation #realtime
- A Real-Time GPU-Based Wall Detection Algorithm for Mapping and Navigation in Indoor Environments (HM, EK, DNS, JH), pp. 1072–1077.
- CGO-2007-Buck #gpu #parallel #programming
- GPU Computing: Programming a Massively Parallel Processor (IB), p. 17.
- ISMM-2007-Kirk #architecture #gpu #parallel
- NVIDIA CUDA software and GPU parallel computing architecture (DK), pp. 103–104.
- SIGMOD-2006-GovindarajuGKM #database #named #performance #scalability #sorting
- GPUTeraSort: high performance graphics co-processor sorting for large database management (NKG, JG, RK, DM), pp. 325–336.
- ICPR-v3-2006-MinM #gpu
- Tensor Voting Accelerated by Graphics Processing Units (GPU) (CM, GGM), pp. 1103–1106.
- SAC-2006-LejdforsO #embedded #generative #gpu #implementation
- Implementing an embedded GPU language by combining translation and generation (CL, LO), pp. 1610–1614.