58 papers:
- DATE-2015-FuWH #code generation
- Improving SIMD code generation in QEMU (SYF, JJW, WCH), pp. 1233–1236.
- SIGMOD-2015-PolychroniouRR #database #in memory
- Rethinking SIMD Vectorization for In-Memory Databases (OP, AR, KAR), pp. 1493–1508.
- VLDB-2015-InoueT #algorithm #array #sorting
- SIMD- and Cache-Friendly Algorithm for Sorting an Array of Structures (HI, KT), pp. 1274–1285.
- DAC-2014-WaeijenSCH #reduction
- Reduction Operator for Wide-SIMDs Reconsidered (LW, DS, HC, YH), p. 6.
- DATE-2014-BoettcherAEGR #architecture
- Advanced SIMD: Extending the reach of contemporary SIMD architectures (MB, BMAH, ME, GG, AR), pp. 1–4.
- DATE-2014-KimH #automation #generative #parallel
- Automatic generation of custom SIMD instructions for Superword Level Parallelism (TK, YH), pp. 1–6.
- VLDB-2015-InoueOT14 #branch #performance #predict #set
- Faster Set Intersection with SIMD instructions by Reducing Branch Mispredictions (HI, MO, KT), pp. 293–304.
- PLDI-2013-KongVSFPS #code generation
- When polyhedral transformations meet SIMD code generation (MK, RV, KS, FF, LNP, PS), pp. 127–138.
- ICFP-2013-PetersenOG #automation #haskell
- Automatic SIMD vectorization for Haskell (LP, DAO, NG), pp. 25–36.
- CGO-2013-RenALMPS #data type #parallel
- SIMD parallelization of applications that traverse irregular data structures (BR, GA, JRL, TM, TP, WS), p. 10.
- HPCA-2013-WangCWMZLN #architecture #execution #parallel
- A multiple SIMD, multiple data (MSMD) architecture: Parallel execution of dynamic and static SIMD fragments (YW, SC, JW, JM, KZ, WL, XN), pp. 603–614.
- PPoPP-2013-BartheCKGM #relational #synthesis #verification
- From relational verification to SIMD loop synthesis (GB, JMC, SG, CK, MM), pp. 123–134.
- DAC-2012-SeoDWPCMBM #architecture #process
- Process variation in near-threshold wide SIMD architectures (SS, RGD, MW, YP, CC, SAM, DB, TNM), pp. 980–987.
- ASPLOS-2012-ParkSPCM #architecture #performance
- SIMD defragmenter: efficient ILP realization on data-parallel architectures (YP, SS, HP, HKC, SAM), pp. 363–374.
- PPoPP-2012-KimH #code generation #kernel #performance
- Efficient SIMD code generation for irregular kernels (SK, HH), pp. 55–64.
- PPoPP-2012-LeissaHW #programming
- Extending a C-like language for portable SIMD programming (RL, SH, IW), pp. 65–74.
- DAC-2011-ZhaoF #3d #gpu #parallel #performance
- Fast multipole method on GPU: tackling 3-D capacitance extraction on massively parallel SIMD platforms (XZ, ZF), pp. 558–563.
- DATE-2011-MichelFP #embedded #simulation
- Speeding-up SIMD instructions dynamic binary translation in embedded processor simulation (LM, NF, FP), pp. 277–280.
- DATE-2011-WohSDKSBM #power management
- Low power interconnects for SIMD computers (MW, SS, RGD, DK, DS, DB, TNM), pp. 600–605.
- CIKM-2011-StepanovGREO
- SIMD-based decoding of posting lists (AAS, ARG, DER, RJE, PSO), pp. 317–326.
- CC-2011-HenrettySPFRS #architecture #layout
- Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures (TH, KS, LNP, FF, JR, PS), pp. 225–245.
- CGO-2011-NuzmanDRRWYCZ
- Vapor SIMD: Auto-vectorize once, run everywhere (DN, SD, ER, IR, KW, DY, AC, AZ), pp. 151–160.
- DAC-2010-HePKYALC #energy #named #throughput
- Xetal-Pro: an ultra-low energy and high throughput SIMD processor (YH, YP, RPK, ZY, AAA, SML, HC), pp. 543–548.
- SIGMOD-2010-SatishKCNLKD #performance
- Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort (NS, CK, JC, ADN, VWL, DK, PD), pp. 351–362.
- HPCA-2010-HuangSWSXM #named #permutation
- SIF: Overcoming the limitations of SIMD devices via implicit permutation (LH, LS, ZW, WS, NX, SM), pp. 1–12.
- VLDB-2009-WillhalmPBPZS #in memory #named #performance #using
- SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units (TW, NP, YB, HP, AZ, JS), pp. 385–394.
- DATE-2008-BonnotLEGRG #approach #architecture #implementation #multi
- Definition and SIMD Implementation of a Multi-Processing Architecture Approach on FPGA (PB, FL, GE, GG, OR, PG), pp. 610–615.
- VLDB-2008-ChhuganiNLMHCBKD #architecture #cpu #implementation #manycore #performance #sorting
- Efficient implementation of sorting on multi-core SIMD CPU architecture (JC, ADN, VWL, WM, MH, YKC, AB, SK, PD), pp. 1313–1324.
- CC-2008-FranchettiP #generative #permutation
- Generating SIMD Vectorized Permutations (FF, MP), pp. 116–131.
- CC-2008-LashariLM #architecture #control flow
- Control Flow Emulation on Tiled SIMD Architectures (GL, OL, MM), pp. 100–115.
- PPoPP-2008-Cameron #case study #parallel
- A case study in SIMD text processing with parallel bit streams: UTF-8 to UTF-16 transcoding (RDC), pp. 91–98.
- DATE-2007-KraemerLAM #interactive #parallel #program transformation #source code #using
- Interactive presentation: SoftSIMD — exploiting subword parallelism using source code transformations (SK, RL, GA, HM), pp. 1349–1354.
- AGTIVE-2007-AnandK #assembly #generative #graph transformation
- Code Graph Transformations for Verifiable Generation of SIMD-Parallel Assembly Code (CKA, WK), pp. 217–232.
- CC-2007-FiremanPZ #algorithm
- New Algorithms for SIMD Alignment (LF, EP, AZ), pp. 1–15.
- HPCA-2007-ClarkHYMF #hardware #lightweight #using
- Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping (NC, AH, SY, SAM, KF), pp. 216–227.
- DATE-DF-2006-DavilaTSSBR #algorithm #architecture #configuration management #design #implementation
- Design and implementation of a rendering algorithm in a SIMD reconfigurable architecture (MorphoSys) (JD, AdT, JMS, MSE, NB, FR), pp. 52–57.
- PLDI-2006-NuzmanRZ
- Auto-vectorization of interleaved data for SIMD (DN, IR, AZ), pp. 132–143.
- PLDI-2006-RenWP #optimisation #permutation
- Optimizing data permutations for SIMD devices (GR, PW, DAP), pp. 118–131.
- ASPLOS-2006-PatwardhanJDL #architecture #fault #self
- A defect tolerant self-organizing nanoscale SIMD architecture (JPP, VJ, CD, ARL), pp. 241–251.
- CGO-2006-LiZXH #optimisation
- Optimizing Dynamic Binary Translation for SIMD Instructions (JL, QZ, SX, BH), pp. 269–280.
- LCTES-2006-ZhangQWZZ #architecture #compilation #multi #optimisation
- Optimizing compiler for shared-memory multiple SIMD architecture (WZ, XQ, YW, BZ, CZ), pp. 199–208.
- CC-2005-JiangMHLZZZ #multi #performance #using
- Boosting the Performance of Multimedia Applications Using SIMD Instructions (WJ, CM, BH, JL, JZ, BZ, CZ), pp. 59–75.
- CGO-2005-WuEW #code generation #performance #runtime
- Efficient SIMD Code Generation for Runtime Alignment and Length Conversion (PW, AEE, AW), pp. 153–164.
- LCTES-2005-KudriavtsevK #generative #permutation
- Generation of permutations for SIMD processors (AK, PMK), pp. 147–156.
- PLDI-2004-EichenbergerWO #architecture #constraints
- Vectorization for SIMD architectures with alignment constraints (AEE, PW, KO), pp. 82–93.
- DATE-2003-BeeckGBMCD #data transformation #implementation #power management #realtime
- Background Data Organisation for the Low-Power Implementation in Real-Time of a Digital Audio Broadcast Receiver on a SIMD Processor (POdB, CG, EB, MM, FC, GD), pp. 11144–11145.
- DATE-2003-DuSTBAF #configuration management #interactive
- Interactive Ray Tracing on Reconfigurable SIMD MorphoSys (HD, MSE, NT, NB, MLA, MF), pp. 20144–20149.
- SIGMOD-2002-ZhouR #database #implementation #using
- Implementing database operations using SIMD instructions (JZ, KAR), pp. 145–156.
- LCTES-SCOPES-2002-LorenzWD #compilation #energy
- Energy aware compilation for DSPs with SIMD instructions (ML, LW, TD), pp. 94–101.
- DATE-2000-Leupers
- Code Selection for Media Processors with SIMD Instructions (RL), pp. 4–8.
- ICPR-1996-ArumugaveluR #algorithm #clustering
- SIMD algorithms for single link and complete link pattern clustering (SA, NR), pp. 625–629.
- SAC-1995-BaudinoCMS #set
- Processing sets on a SIMD machine (AB, GC, GM, GS), pp. 593–598.
- ILPS-1993-TongL #concurrent #constraints #logic programming #parallel
- Concurrent Constraint Logic Programming On Massively Parallel SIMD Computers (BMT, HfL), pp. 388–402.
- ESOP-1992-Levaire #case study #programming #using
- Using the Centaur System for Data-Parallel SIMD Programming: A Case Study (JLL), pp. 341–350.
- PLDI-1992-HanxledenK #constraints #control flow #using
- Relaxing SIMD Control Flow Constraints using Loop Transformations (RvH, KK), pp. 188–199.
- IWMM-1992-Yuasa #architecture #garbage collection #lisp #memory management #parallel
- Memory Management and Garbage Collection of an Extended Common Lisp System for Massively Parallel SIMD Architecture (TY), pp. 490–506.
- AdaEurope-1991-Gudenberg #ada #parallel
- Modelling SIMD-Type Parallel Arithmetic Operations in Ada (JWvG), pp. 110–124.
- LFP-1988-HudakH
- Graphinators and the Duality of SIMD and MIMD (PH, EM), pp. 224–234.