Albert Cohen, David Grove
Proceedings of the 20th Symposium on Principles and Practice of Parallel Programming
PPoPP, 2015.
@proceedings{PPoPP-2015,
	acmid         = "2688500",
	address       = "San Francisco, California, USA",
	editor        = "Albert Cohen and David Grove",
	isbn          = "978-1-4503-3205-7",
	publisher     = "{ACM}",
	title         = "{Proceedings of the 20th Symposium on Principles and Practice of Parallel Programming}",
	year          = 2015,
}
Contents (44 items)
- PPoPP-2015-Gramoli #algorithm #concurrent #impact analysis
- More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms (VG), pp. 1–10.
- PPoPP-2015-AlistarhKLS #queue #scalability
- The SprayList: a scalable relaxed priority queue (DA, JK, JL, NS), pp. 11–20.
- PPoPP-2015-ArbelM #concurrent #scalability
- Predicate RCU: an RCU for scalable concurrent updates (MA, AM), pp. 21–30.
- PPoPP-2015-Golan-GuetaRSY #automation #scalability #semantics
- Automatic scalable atomicity via semantic locking (GGG, GR, MS, EY), pp. 31–41.
- PPoPP-2015-BensonB #framework #matrix #parallel #performance
- A framework for practical parallel fast matrix multiplication (ARB, GB), pp. 42–53.
- PPoPP-2015-AcharyaB #locality #modelling #parallel
- PLUTO+: near-complete modeling of affine transformations for parallelism and locality (AA, UB), pp. 54–64.
- PPoPP-2015-RavishankarDEPRRS #code generation #distributed #memory management
- Distributed memory code generation for mixed Irregular/Regular computations (MR, RD, VE, LNP, JR, AR, PS), pp. 65–75.
- PPoPP-2015-XiangS #clustering #hardware #transaction
- Software partitioning of hardware transactions (LX, MLS), pp. 76–86.
- PPoPP-2015-BaldassinBA #memory management #performance #transaction
- Performance implications of dynamic memory allocators on transactional memory systems (AB, EB, GA), pp. 87–96.
- PPoPP-2015-ZhangHCB #memory management #semantics #transaction
- Low-overhead software transactional memory with progress guarantees and strong semantics (MZ, JH, MC, MDB), pp. 97–108.
- PPoPP-2015-ChabbiLJSMI #parallel #source code
- Barrier elision for production parallel programs (MC, WL, WdJ, KS, JMMC, CI), pp. 109–119.
- PPoPP-2015-ThebaultPD #3d #assembly #case study #implementation #matrix #performance #scalability
- Scalable and efficient implementation of 3D unstructured meshes computation: a case study on matrix assembly (LT, EP, QD), pp. 120–129.
- PPoPP-2015-TallentVDDKH
- Diagnosing the causes and severity of one-sided message contention (NRT, AV, HvD, JD, DJK, AH), pp. 130–139.
- PPoPP-2015-ChangG #algorithm #concurrent #parallel
- A parallel algorithm for global states enumeration in concurrent systems (YJC, VKG), pp. 140–149.
- PPoPP-2015-CogumbreiroHMY #concurrent #verification
- Dynamic deadlock verification for general barrier synchronisation (TC, RH, FM, NY), pp. 150–160.
- PPoPP-2015-YouWTC #abstraction #framework #named
- VirtCL: a framework for OpenCL device abstraction and management (YPY, HJW, YNT, YTC), pp. 161–172.
- PPoPP-2015-AshariTBRCKS #kernel #machine learning #on the #optimisation
- On optimizing machine learning workloads via kernel fusion (AA, ST, MB, BR, KC, JK, PS), pp. 173–182.
- PPoPP-2015-ZhangCC
- NUMA-aware graph-structured analytics (KZ, RC, HC), pp. 183–193.
- PPoPP-2015-XieCGZC #distributed
- SYNC or ASYNC: time to fuse for distributed graph-parallel computation (CX, RC, HG, BZ, HC), pp. 194–204.
- PPoPP-2015-TangYKTGC #algorithm #parallel #programming #recursion
- Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency (YT, RY, HK, JJT, PG, RAC), pp. 205–214.
- PPoPP-2015-ChabbiFM #multi #performance
- High performance locks for multi-level NUMA systems (MC, MWF, JMMC), pp. 215–226.
- PPoPP-2015-MajoG #composition #library #locality #optimisation
- A library for portable and composable data locality optimizations for NUMA systems (ZM, TRG), pp. 227–238.
- PPoPP-2015-AmerLWBM #runtime #thread
- MPI+Threads: runtime contention and remedies (AA, HL, YW, PB, SM), pp. 239–248.
- PPoPP-2015-McPhersonNSC #detection #legacy #source code
- Fence placement for legacy data-race-free programs via synchronization read detection (AJM, VN, SS, MC), pp. 249–250.
- PPoPP-2015-PiaoKOLKKL #adaptation #cpu #framework #gpu #javascript #named
- JAWS: a JavaScript framework for adaptive CPU-GPU work sharing (XP, CK, YO, HL, JK, HK, JWL), pp. 251–252.
- PPoPP-2015-SeoKK #graph #named #scalability #streaming
- GStream: a graph streaming processing method for large-scale graphs on GPUs (HS, JK, MSK), pp. 253–254.
- PPoPP-2015-AlSaberK #multi #performance #semantics
- SemCache++: semantics-aware caching for efficient multi-GPU offloading (NA, MK), pp. 255–256.
- PPoPP-2015-KimLV #multi #programming
- An OpenACC-based unified programming model for multi-accelerator systems (JK, SL, JSV), pp. 257–258.
- PPoPP-2015-ThomsonD #concurrent #lazy evaluation #partial order #reduction #testing
- The lazy happens-before relation: better partial-order reduction for systematic concurrency testing (PT, AFD), pp. 259–260.
- PPoPP-2015-HaidarDLTD #hardware #linear #platform #towards
- Towards batched linear solvers on accelerated hardware platforms (AH, TD, PL, ST, JJD), pp. 261–262.
- PPoPP-2015-MuralidharanGCSH #performance #programming
- A collection-oriented programming model for performance portability (SM, MG, BCC, AS, MWH), pp. 263–264.
- PPoPP-2015-WangDPWRO #gpu #graph #library #named
- Gunrock: a high-performance graph processing library on the GPU (YW, AAD, YP, YW, AR, JDO), pp. 265–266.
- PPoPP-2015-PearceGSSA
- Decoupled load balancing (OP, TG, BRdS, MS, NMA), pp. 267–268.
- PPoPP-2015-JinLMLLPCK #automation #benchmark #generative #identification #metric #modelling #parallel #statistics
- Combining phase identification and statistic modeling for automated parallel benchmark generation (YJ, ML, XM, QL, JSL, NP, JYC, SK), pp. 269–270.
- PPoPP-2015-ShiLDHJLWLZ #gpu #graph #hybrid #optimisation
- Optimization of asynchronous graph processing on GPU with hybrid coloring model (XS, JL, SD, BH, HJ, LL, ZW, XL, JZ), pp. 271–272.
- PPoPP-2015-WestNM #concurrent #object-oriented #performance
- Efficient and reasonable object-oriented concurrency (SW, SN, BM), pp. 273–274.
- PPoPP-2015-VassiliadisPCALBVN #energy #programming #runtime
- A programming model and runtime system for significance-aware energy-efficient computing (VV, KP, CC, CDA, SL, NB, HV, DSN), pp. 275–276.
- PPoPP-2015-0003GTT #queue
- The lock-free k-LSM relaxed priority queue (MW, JG, JLT, PT), pp. 277–278.
- PPoPP-2015-SaillardCB #concurrent #multi #thread #validation
- Static/Dynamic validation of MPI collective communications in multi-threaded context (ES, PC, DB), pp. 279–280.
- PPoPP-2015-RamachandranM #concurrent #named #performance #using
- CASTLE: fast concurrent internal binary search tree using edge-based locking (AR, NM), pp. 281–282.
- PPoPP-2015-DasSR #communication #concurrent #detection #program analysis #thread
- Section based program analysis to reduce overhead of detecting unsynchronized thread communication (MD, GS, JR), pp. 283–284.
- PPoPP-2015-HarshvardhanAR #algorithm #approach #communication #graph #parallel
- A hierarchical approach to reducing communication in parallel graph algorithms (H, NMA, LR), pp. 285–286.
- PPoPP-2015-ChenCM #named #parallel
- Tiles: a new language mechanism for heterogeneous parallelism (YC, XC, HM), pp. 287–288.
- PPoPP-2015-RadoiHSD #parallel #question #web
- Are web applications ready for parallelism? (CR, SH, JS, DD), pp. 289–290.
9 ×#concurrent
9 ×#parallel
8 ×#performance
6 ×#named
5 ×#scalability
4 ×#algorithm
4 ×#graph
4 ×#multi
4 ×#programming
3 ×#framework