J. Ramanujam, P. Sadayappan
Proceedings of the 17th Symposium on Principles and Practice of Parallel Programming
PPoPP, 2012.
@proceedings{PPoPP-2012,
  acmid = "2145816",
  address = "New Orleans, Louisiana, USA",
  editor = "J. Ramanujam and P. Sadayappan",
  isbn = "978-1-4503-1160-1",
  publisher = "{ACM}",
  title = "{Proceedings of the 17th Symposium on Principles and Practice of Parallel Programming}",
  year = 2012,
}
Contents (56 items)
- PPoPP-2012-HuynhHWG #framework #multi #scalability #streaming
- Scalable framework for mapping streaming applications onto multi-GPU systems (HPH, AH, WFW, RSMG), pp. 1–10.
- PPoPP-2012-SimDKV #analysis #framework #identification #performance
- A performance analysis framework for identifying potential benefits in GPGPU applications (JS, AD, HK, RWV), pp. 11–22.
- PPoPP-2012-BaghsorkhiGDH #evaluation #memory management #parallel #performance #thread
- Efficient performance evaluation of memory hierarchy for highly multithreaded graphics processors (SSB, IG, MD, WmWH), pp. 23–34.
- PPoPP-2012-BallardDK #communication #reduction
- Communication avoiding successive band reduction (GB, JD, NK), pp. 35–44.
- PPoPP-2012-SackG #algorithm #communication #performance
- Faster topology-aware collective algorithms through non-minimal communication (PS, WG), pp. 45–54.
- PPoPP-2012-KimH #code generation #kernel #performance
- Efficient SIMD code generation for irregular kernels (SK, HH), pp. 55–64.
- PPoPP-2012-LeissaHW #programming
- Extending a C-like language for portable SIMD programming (RL, SH, IW), pp. 65–74.
- PPoPP-2012-KwonJEM #approach #clustering #hybrid
- A hybrid approach of OpenMP for clusters (OK, FJ, RE, SPM), pp. 75–84.
- PPoPP-2012-EomYJD #named #object-oriented #source code
- DOJ: dynamically parallelizing object-oriented programs (YHE, SY, JCJ, BD), pp. 85–96.
- PPoPP-2012-BonettaPPB #named #rest #scripting language #web #web service
- S: a scripting language for high-performance RESTful web services (DB, AP, CP, WB), pp. 97–106.
- PPoPP-2012-Mendez-LojoBP #analysis #gpu #implementation #points-to
- A GPU implementation of inclusion-based points-to analysis (MML, MB, KP), pp. 107–116.
- PPoPP-2012-MerrillGG #gpu #graph #scalability #traversal
- Scalable GPU graph traversal (DM, MG, ASG), pp. 117–128.
- PPoPP-2012-ZuYXWTPD #automaton #implementation #memory management #nondeterminism #performance #regular expression
- GPU-based NFA implementation for memory efficient high speed regular expression matching (YZ, MY, ZX, LW, XT, KP, QD), pp. 129–140.
- PPoPP-2012-KoganP #data type #performance
- A methodology for creating fast wait-free data structures (AK, EP), pp. 141–150.
- PPoPP-2012-ProkopecBBO #concurrent #performance
- Concurrent tries with efficient non-blocking snapshots (AP, NGB, PB, MO), pp. 151–160.
- PPoPP-2012-CrainGR
- A speculation-friendly binary search tree (TC, VG, MR), pp. 161–170.
- PPoPP-2012-ChenCM #array #named #parallel #representation
- PARRAY: a unifying array representation for heterogeneous parallelism (YC, XC, HM), pp. 171–180.
- PPoPP-2012-BlellochFGS #algorithm #parallel #performance
- Internally deterministic parallel algorithms can be fast (GEB, JTF, PBG, JS), pp. 181–192.
- PPoPP-2012-LeisersonSS #generative #parallel #platform #thread
- Deterministic parallel random-number generation for dynamic-multithreading platforms (CEL, TBS, JS), pp. 193–204.
- PPoPP-2012-NobariCKB #parallel #scalability
- Scalable parallel minimum spanning forest computation (SN, TTC, PK, SB), pp. 205–214.
- PPoPP-2012-LiLSGGR #generative #named #testing #verification
- GKLEE: concolic verification and test generation for GPUs (GL, PL, GS, GG, IG, SPR), pp. 215–224.
- PPoPP-2012-DuBBHD #fault tolerance #matrix
- Algorithm-based fault tolerance for dense matrix factorizations (PD, AB, GB, TH, JD), pp. 225–234.
- PPoPP-2012-BuhlerALC #concurrent #performance #streaming
- Efficient deadlock avoidance for streaming computation with filtering (JDB, KA, PL, RDC), pp. 235–246.
- PPoPP-2012-DiceMS #design
- Lock cohorting: a general technique for designing NUMA locks (DD, VJM, NS), pp. 247–256.
- PPoPP-2012-FatourouK
- Revisiting the combining synchronization technique (PF, NDK), pp. 257–266.
- PPoPP-2012-TardieuWL #parallel
- A work-stealing scheduler for X10’s task parallelism with suspension (OT, HW, HL), pp. 267–276.
- PPoPP-2012-BaskaranVML #automation #communication #memory management #optimisation #reuse
- Automatic communication optimizations through memory reuse strategies (MMB, NV, BM, RL), pp. 277–278.
- PPoPP-2012-LiuAHLSZWT #gpu #implementation #named
- FlexBFS: a parallelism-aware implementation of breadth-first search on GPU (GL, HA, WH, XL, TS, WZ, XW, XT), pp. 279–280.
- PPoPP-2012-AnderschCJ #embedded #parallel #programming
- Programming parallel embedded and consumer applications in OpenMP superscalar (MA, CCC, BHHJ), pp. 281–282.
- PPoPP-2012-ZhongH #graph #overview
- An overview of Medusa: simplified graph processing on GPUs (JZ, BH), pp. 283–284.
- PPoPP-2012-AliasDP #kernel #optimisation #synthesis
- Optimizing remote accesses for offloaded kernels: application to high-level synthesis for FPGA (CA, AD, AP), pp. 285–286.
- PPoPP-2012-TaoBB #development #gpu #kernel #scalability #using
- Using GPU’s to accelerate stencil-based computation kernels for the development of large scale scientific applications on heterogeneous systems (JT, MB, SRB), pp. 287–288.
- PPoPP-2012-MarkerTPBG #algebra #developer #linear
- Mechanizing the expert dense linear algebra developer (BM, AT, JP, DSB, RAvdG), pp. 289–290.
- PPoPP-2012-NugterenC #adaptation #parallel #performance #predict
- The boat hull model: adapting the roofline model to enable performance prediction for parallel computing (CN, HC), pp. 291–292.
- PPoPP-2012-FengGB #parallel
- Speculative parallelization on GPGPUs (MF, RG, LNB), pp. 293–294.
- PPoPP-2012-JimboreanCPML #adaptation #framework #parallel #performance
- Adapting the polyhedral model as a framework for efficient speculative parallelization (AJ, PC, BP, LM, VL), pp. 295–296.
- PPoPP-2012-GongHZ #in the cloud #network #overview #performance
- An overview of CMPI: network performance aware MPI in the cloud (YG, BH, JZ), pp. 297–298.
- PPoPP-2012-KimSLNJL #clustering #cpu #gpu #programming
- OpenCL as a unified programming model for heterogeneous CPU/GPU clusters (JK, SS, JL, JN, GJ, JL), pp. 299–300.
- PPoPP-2012-TzenakisPKPVN #analysis #dependence #named #parallel
- BDDT: block-level dynamic dependence analysis for deterministic task-based parallelism (GT, AP, JK, PP, HV, DSN), pp. 301–302.
- PPoPP-2012-KamilCBCGHMF #domain-specific language #effectiveness #embedded #parallel #performance
- Portable parallel performance from sequential, productive, embedded domain-specific languages (SK, DC, SB, HC, EG, JH, JM, AF), pp. 303–304.
- PPoPP-2012-HoeflerS #detection #optimisation
- Communication-centric optimizations by dynamically detecting collective operations (TH, TS), pp. 305–306.
- PPoPP-2012-TimnatBKP
- Wait-free linked-lists (ST, AB, AK, EP), pp. 309–310.
- PPoPP-2012-DinhAJGMR #debugging #parallel #scalability #statistics
- Scalable parallel debugging with statistical assertions (MND, DA, CJ, AG, BM, LDR), pp. 311–312.
- PPoPP-2012-MalkisB #verification
- Verification of software barriers (AM, AB), pp. 313–314.
- PPoPP-2012-MittalJGSK #algorithm
- Collective algorithms for sub-communicators (AM, NJ, TG, YS, SK), pp. 315–316.
- PPoPP-2012-KosterMD
- Synchronization views for event-loop actors (JDK, SM, TD), pp. 317–318.
- PPoPP-2012-MetreveliZK #named
- CPHASH: a cache-partitioned hash table (ZM, NZ, MFK), pp. 319–320.
- PPoPP-2012-WernsingS #automation #heuristic #manycore #named
- RACECAR: a heuristic for automatic function specialization on multi-core heterogeneous systems (JRW, GS), pp. 321–322.
- PPoPP-2012-LiuS #queue
- A lock-free, array-based priority queue (YL, MFS), pp. 323–324.
- PPoPP-2012-NollG #framework #optimisation #parallel #source code
- An infrastructure for dynamic optimization of parallel programs (AN, TRG), pp. 325–326.
- PPoPP-2012-KjolstadHS #automation #data type #generative #optimisation
- Automatic datatype generation and optimization (FK, TH, MS), pp. 327–328.
- PPoPP-2012-BurnimENS #correctness #named #nondeterminism #parallel #specification
- NDetermin: inferring nondeterministic sequential specifications for parallelism correctness (JB, TE, GCN, KS), pp. 329–330.
- PPoPP-2012-ParkS #concurrent
- Concurrent breakpoints (CSP, KS), pp. 331–332.
- PPoPP-2012-StoneDS #programmable
- Establishing a Miniapp as a programmability proxy (AS, JD, MS), pp. 333–334.
- PPoPP-2012-JiangPOJ #manycore #parallel
- OpenMP-style parallelism in data-centered multicore computing with R (LJ, PBP, GO, FJ), pp. 335–336.
- PPoPP-2012-CaniouDRCA #analysis #constraints #parallel #performance
- Performance analysis of parallel constraint-based local search (YC, DD, FR, PC, SA), pp. 337–338.
17 ×#parallel
14 ×#performance
10 ×#named
5 ×#gpu
5 ×#optimisation
5 ×#scalability
4 ×#analysis
4 ×#framework
3 ×#algorithm
3 ×#automation