58 papers:
DATE-2015-FuWH #code generation- Improving SIMD code generation in QEMU (SYF, JJW, WCH), pp. 1233–1236.
SIGMOD-2015-PolychroniouRR #database #in memory- Rethinking SIMD Vectorization for In-Memory Databases (OP, AR, KAR), pp. 1493–1508.
VLDB-2015-InoueT #algorithm #array #sorting- SIMD- and Cache-Friendly Algorithm for Sorting an Array of Structures (HI, KT), pp. 1274–1285.
DAC-2014-WaeijenSCH #reduction- Reduction Operator for Wide-SIMDs Reconsidered (LW, DS, HC, YH), p. 6.
DATE-2014-BoettcherAEGR #architecture- Advanced SIMD: Extending the reach of contemporary SIMD architectures (MB, BMAH, ME, GG, AR), pp. 1–4.
DATE-2014-KimH #automation #generative #parallel- Automatic generation of custom SIMD instructions for Superword Level Parallelism (TK, YH), pp. 1–6.
VLDB-2015-InoueOT14 #branch #performance #predict #set- Faster Set Intersection with SIMD instructions by Reducing Branch Mispredictions (HI, MO, KT), pp. 293–304.
PLDI-2013-KongVSFPS #code generation- When polyhedral transformations meet SIMD code generation (MK, RV, KS, FF, LNP, PS), pp. 127–138.
ICFP-2013-PetersenOG #automation #haskell- Automatic SIMD vectorization for Haskell (LP, DAO, NG), pp. 25–36.
CGO-2013-RenALMPS #data type #parallel- SIMD parallelization of applications that traverse irregular data structures (BR, GA, JRL, TM, TP, WS), p. 10.
HPCA-2013-WangCWMZLN #architecture #execution #parallel- A multiple SIMD, multiple data (MSMD) architecture: Parallel execution of dynamic and static SIMD fragments (YW, SC, JW, JM, KZ, WL, XN), pp. 603–614.
PPoPP-2013-BartheCKGM #relational #synthesis #verification- From relational verification to SIMD loop synthesis (GB, JMC, SG, CK, MM), pp. 123–134.
DAC-2012-SeoDWPCMBM #architecture #process- Process variation in near-threshold wide SIMD architectures (SS, RGD, MW, YP, CC, SAM, DB, TNM), pp. 980–987.
ASPLOS-2012-ParkSPCM #architecture #performance- SIMD defragmenter: efficient ILP realization on data-parallel architectures (YP, SS, HP, HKC, SAM), pp. 363–374.
PPoPP-2012-KimH #code generation #kernel #performance- Efficient SIMD code generation for irregular kernels (SK, HH), pp. 55–64.
PPoPP-2012-LeissaHW #programming- Extending a C-like language for portable SIMD programming (RL, SH, IW), pp. 65–74.
DAC-2011-ZhaoF #3d #gpu #parallel #performance- Fast multipole method on GPU: tackling 3-D capacitance extraction on massively parallel SIMD platforms (XZ, ZF), pp. 558–563.
DATE-2011-MichelFP #embedded #simulation- Speeding-up SIMD instructions dynamic binary translation in embedded processor simulation (LM, NF, FP), pp. 277–280.
DATE-2011-WohSDKSBM #power management- Low power interconnects for SIMD computers (MW, SS, RGD, DK, DS, DB, TNM), pp. 600–605.
CIKM-2011-StepanovGREO- SIMD-based decoding of posting lists (AAS, ARG, DER, RJE, PSO), pp. 317–326.
CC-2011-HenrettySPFRS #architecture #layout- Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures (TH, KS, LNP, FF, JR, PS), pp. 225–245.
CGO-2011-NuzmanDRRWYCZ- Vapor SIMD: Auto-vectorize once, run everywhere (DN, SD, ER, IR, KW, DY, AC, AZ), pp. 151–160.
DAC-2010-HePKYALC #energy #named #throughput- Xetal-Pro: an ultra-low energy and high throughput SIMD processor (YH, YP, RPK, ZY, AAA, SML, HC), pp. 543–548.
SIGMOD-2010-SatishKCNLKD #performance- Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort (NS, CK, JC, ADN, VWL, DK, PD), pp. 351–362.
HPCA-2010-HuangSWSXM #named #permutation- SIF: Overcoming the limitations of SIMD devices via implicit permutation (LH, LS, ZW, WS, NX, SM), pp. 1–12.
VLDB-2009-WillhalmPBPZS #in memory #named #performance #using- SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units (TW, NP, YB, HP, AZ, JS), pp. 385–394.
DATE-2008-BonnotLEGRG #approach #architecture #implementation #multi- Definition and SIMD Implementation of a Multi-Processing Architecture Approach on FPGA (PB, FL, GE, GG, OR, PG), pp. 610–615.
VLDB-2008-ChhuganiNLMHCBKD #architecture #cpu #implementation #manycore #performance #sorting- Efficient implementation of sorting on multi-core SIMD CPU architecture (JC, ADN, VWL, WM, MH, YKC, AB, SK, PD), pp. 1313–1324.
CC-2008-FranchettiP #generative #permutation- Generating SIMD Vectorized Permutations (FF, MP), pp. 116–131.
CC-2008-LashariLM #architecture #control flow- Control Flow Emulation on Tiled SIMD Architectures (GL, OL, MM), pp. 100–115.
PPoPP-2008-Cameron #case study #parallel- A case study in SIMD text processing with parallel bit streams: UTF-8 to UTF-16 transcoding (RDC), pp. 91–98.
DATE-2007-KraemerLAM #interactive #parallel #program transformation #source code #using- Interactive presentation: SoftSIMD — exploiting subword parallelism using source code transformations (SK, RL, GA, HM), pp. 1349–1354.
AGTIVE-2007-AnandK #assembly #generative #graph transformation- Code Graph Transformations for Verifiable Generation of SIMD-Parallel Assembly Code (CKA, WK), pp. 217–232.
CC-2007-FiremanPZ #algorithm- New Algorithms for SIMD Alignment (LF, EP, AZ), pp. 1–15.
HPCA-2007-ClarkHYMF #hardware #lightweight #using- Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping (NC, AH, SY, SAM, KF), pp. 216–227.
DATE-DF-2006-DavilaTSSBR #algorithm #architecture #configuration management #design #implementation- Design and implementation of a rendering algorithm in a SIMD reconfigurable architecture (MorphoSys) (JD, AdT, JMS, MSE, NB, FR), pp. 52–57.
PLDI-2006-NuzmanRZ- Auto-vectorization of interleaved data for SIMD (DN, IR, AZ), pp. 132–143.
PLDI-2006-RenWP #optimisation #permutation- Optimizing data permutations for SIMD devices (GR, PW, DAP), pp. 118–131.
ASPLOS-2006-PatwardhanJDL #architecture #fault #self- A defect tolerant self-organizing nanoscale SIMD architecture (JPP, VJ, CD, ARL), pp. 241–251.
CGO-2006-LiZXH #optimisation- Optimizing Dynamic Binary Translation for SIMD Instructions (JL, QZ, SX, BH), pp. 269–280.
LCTES-2006-ZhangQWZZ #architecture #compilation #multi #optimisation- Optimizing compiler for shared-memory multiple SIMD architecture (WZ, XQ, YW, BZ, CZ), pp. 199–208.
CC-2005-JiangMHLZZZ #multi #performance #using- Boosting the Performance of Multimedia Applications Using SIMD Instructions (WJ, CM, BH, JL, JZ, BZ, CZ), pp. 59–75.
CGO-2005-WuEW #code generation #performance #runtime- Efficient SIMD Code Generation for Runtime Alignment and Length Conversion (PW, AEE, AW), pp. 153–164.
LCTES-2005-KudriavtsevK #generative #permutation- Generation of permutations for SIMD processors (AK, PMK), pp. 147–156.
PLDI-2004-EichenbergerWO #architecture #constraints- Vectorization for SIMD architectures with alignment constraints (AEE, PW, KO), pp. 82–93.
DATE-2003-BeeckGBMCD #data transformation #implementation #power management #realtime- Background Data Organisation for the Low-Power Implementation in Real-Time of a Digital Audio Broadcast Receiver on a SIMD Processor (POdB, CG, EB, MM, FC, GD), pp. 11144–11145.
DATE-2003-DuSTBAF #configuration management #interactive- Interactive Ray Tracing on Reconfigurable SIMD MorphoSys (HD, MSE, NT, NB, MLA, MF), pp. 20144–20149.
SIGMOD-2002-ZhouR #database #implementation #using- Implementing database operations using SIMD instructions (JZ, KAR), pp. 145–156.
LCTES-SCOPES-2002-LorenzWD #compilation #energy- Energy aware compilation for DSPs with SIMD instructions (ML, LW, TD), pp. 94–101.
DATE-2000-Leupers- Code Selection for Media Processors with SIMD Instructions (RL), pp. 4–8.
ICPR-1996-ArumugaveluR #algorithm #clustering- SIMD algorithms for single link and complete link pattern clustering (SA, NR), pp. 625–629.
SAC-1995-BaudinoCMS #set- Processing sets on a SIMD machine (AB, GC, GM, GS), pp. 593–598.
ILPS-1993-TongL #concurrent #constraints #logic programming #parallel- Concurrent Constraint Logic Programming On Massively Parallel SIMD Computers (BMT, HfL), pp. 388–402.
ESOP-1992-Levaire #case study #programming #using- Using the Centaur System for Data-Parallel SIMD Programming: A Case Study (JLL), pp. 341–350.
PLDI-1992-HanxledenK #constraints #control flow #using- Relaxing SIMD Control Flow Constraints using Loop Transformations (RvH, KK), pp. 188–199.
IWMM-1992-Yuasa #architecture #garbage collection #lisp #memory management #parallel- Memory Management and Garbage Collection of an Extended Common Lisp System for Massively Parallel SIMD Architecture (TY), pp. 490–506.
AdaEurope-1991-Gudenberg #ada #parallel- Modelling SIMD-Type Parallel Arithmetic Operations in Ada (JWvG), pp. 110–124.
LFP-1988-HudakH- Graphinators and the Duality of SIMD and MIMD (PH, EM), pp. 224–234.