Mahmut Taylan Kandemir, Alexandra Jimborean, Tipp Moseley
Proceedings of the 17th International Symposium on Code Generation and Optimization
CGO, 2019.
Contents (33 items)
- CGO-2019-PanchenkoANO #named
- BOLT: A Practical Binary Optimizer for Data Centers and Beyond (MP, RA, BN, GO), pp. 2–14.
- CGO-2019-Zhou0 #automation #named #parallel
- Janus: Statically-Driven and Profile-Guided Automatic Dynamic Binary Parallelisation (RZ, TMJ0), pp. 15–25.
- CGO-2019-AgaA #layout #named #runtime #stack
- Smokestack: Thwarting DOP Attacks with Runtime Stack Layout Randomization (MTA, TMA), pp. 26–36.
- CGO-2019-LimN #assembly #automation #encryption #equivalence #implementation #library
- Automatic Equivalence Checking for Assembly Implementations of Cryptography Libraries (JPL, SN), pp. 37–49.
- CGO-2019-LiuSWDL #detection #named
- CSOD: Context-Sensitive Overflow Detection (HL, SS, XW, LD, TL), pp. 50–60.
- CGO-2019-SunBSB #graph #reasoning #using
- Reasoning about the Node.js Event Loop using Async Graphs (HS, DB, FS, WB), pp. 61–72.
- CGO-2019-GonzaloHGHMH #automation #generative #parallel #performance #reduction
- Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs (SGDG, SH, JGL, SDH, OM, WMH), pp. 73–84.
- CGO-2019-KimSTKPPRS #code generation
- A Code Generator for High-Performance Tensor Contractions on GPUs (JK, ASR, VT, SK, AP, LNP, AR, PS), pp. 85–95.
- CGO-2019-TianQ0LR #manycore #query #sequence
- Transforming Query Sequences for High-Throughput B+ Tree Processing on Many-Core Processors (RT, JQ, ZZ0, XL0, BR), pp. 96–108.
- CGO-2019-MururuGP #commit #execution #modelling #optimisation
- Quantifying and Reducing Execution Variance in STM via Model Driven Commit Optimization (GM, AG, SP), pp. 109–121.
- CGO-2019-LeeLLMCZ0
- White-Box Program Tuning (WCL, YL, PL, SM, HC, XZ0, RG0), pp. 122–135.
- CGO-2019-RodriguesGP #array #bound #generative #in memory
- Generation of In-Bounds Inputs for Arrays in Memory-Unsafe Languages (MR, BG, FMQP), pp. 136–148.
- CGO-2019-RochaP0CL #sequence
- Function Merging by Sequence Alignment (RCOR, PP, ZW0, MC, HL), pp. 149–163.
- CGO-2019-ProkopecDLW #algorithm #compilation #incremental
- An Optimization-Driven Incremental Inline Substitution Algorithm for Just-in-Time Compilers (AP, GD, DL, TW), pp. 164–179.
- CGO-2019-KjolstadAKA #algebra #compilation
- Tensor Algebra Compilation with Workspaces (FK, PA, SK, SPA), pp. 180–192.
- CGO-2019-BaghdadiRRSAZSK #compilation #named #performance
- Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code (RB, JR, MBR, EDS, AA, YZ, PS, SK, SPA), pp. 193–205.
- CGO-2019-PorpodasRBGM #sequence
- Super-Node SLP: Optimized Vectorization for Code Sequences Containing Operators and Their Inverse Elements (VP, RCOR, EB, LFWG, TGM), pp. 206–216.
- CGO-2019-TeixeiraAPG #named #optimisation
- Locus: A System and a Language for Program Optimization (TSFXT, CA, DAP, WG), pp. 217–228.
- CGO-2019-HayesHHCZ
- Decoding CUDA Binary (ABH, FH, JH, YHC, EZZ), pp. 229–241.
- CGO-2019-QiaoRHT #approach #kernel #locality #optimisation
- From Loop Fusion to Kernel Fusion: A Domain-Specific Approach to Locality Optimization (BQ, OR, FH, JT), pp. 242–253.
- CGO-2019-ChandrasekharCC #compilation #named #open source
- IGC: The Open Source Intel Graphics Compiler (AC, GC, PYC, WYC, JG, PG, SHPK, GYL, PM, WP, TR, KT), pp. 254–265.
- CGO-2019-NethS #automation #parallel
- Automatic Parallelization of Irregular x86-64 Loops (BN, MMS), p. 266.
- CGO-2019-DasBS #design #manycore
- A Shared BTB Design for Multicore Systems (MD, AB, BS), pp. 267–268.
- CGO-2019-Varadarajan #interactive #optimisation
- Optimizing RNA-RNA Interaction Computations (SV), pp. 269–270.
- CGO-2019-GomesB #automation #code generation #formal method #modelling
- Code Generation from Formal Models for Automatic RTOS Portability (RMG, MB), pp. 271–272.
- CGO-2019-NelsonP #behaviour #comprehension
- Understanding RDMA Behavior in NUMA Systems (JN, RP), pp. 273–274.
- CGO-2019-FuH #architecture
- Translating Traditional SIMD Instructions to Vector Length Agnostic Architectures (SYF, WCH), p. 275.
- CGO-2019-LiL0 #gpu #optimisation #runtime
- Accelerating GPU Computing at Runtime with Binary Optimization (GL, LL, XF0), pp. 276–277.
- CGO-2019-KruppeOS0 #lightweight #using
- Extending LLVM for Lightweight SPMD Vectorization: Using SIMD and Vector Instructions Easily from Any Language (RK, JO, LS, AK0), pp. 278–279.
- CGO-2019-Castro-LopezL #compilation #deployment #machine learning #modelling #multi
- Multi-target Compiler for the Deployment of Machine Learning Models (OCL, IFVL), pp. 280–281.
- CGO-2019-ZhouM #analysis #performance
- A Tool for Performance Analysis of GPU-Accelerated Applications (KZ, JMMC), p. 282.
- CGO-2019-MishraKC #automation #composition #kernel
- Kernel Fusion/Decomposition for Automatic GPU-Offloading (AM, MK, BMC), pp. 283–284.
- CGO-2019-KimK #generative #hardware #using
- Translating CUDA to OpenCL for Hardware Generation using Neural Machine Translation (YK, HK), pp. 285–286.