Travelled to:
1 × China
1 × France
1 × Spain
4 × USA
Collaborated with:
W.W.L.Fung M.O'Connor X.E.Chen V.Zakharenko A.Moshovos A.ElTantawy J.W.Ma H.Jooybar J.Devietti I.Singh A.Shriraman P.Chow P.Hammarlund H.Wang J.P.Shen
Talks about:
gpu (5) architectur (2) control (2) effici (2) flow (2) microarchitectur (1) multithread (1) determinist (1) throughput (1) prescient (1)
Person: Tor M. Aamodt
DBLP: Aamodt:Tor_M=
Wrote 7 papers:
- HPCA-2014-ElTantawyMOA #architecture #control flow #gpu #multi #performance #scalability
- A scalable multi-path microarchitecture for efficient GPU control flow (AE, JWM, MO, TMA), pp. 248–259.
- ASPLOS-2013-JooybarFODA #architecture #gpu #named
- GPUDet: a deterministic GPU architecture (HJ, WWLF, MO, JD, TMA), pp. 1–12.
- DATE-2013-ZakharenkoAM #cpu #gpu #performance #using
- Characterizing the performance benefits of fused CPU/GPU systems using FusionSim (VZ, TMA, AM), pp. 685–688.
- HPCA-2013-SinghSFOA #architecture #gpu
- Cache coherence for GPU architectures (IS, AS, WWLF, MO, TMA), pp. 578–590.
- HPCA-2011-FungA #concurrent #control flow #performance #thread
- Thread block compaction for efficient SIMT control flow (WWLF, TMA), pp. 25–36.
- HPCA-2009-ChenA #fine-grained #first-order #parallel #thread #throughput
- A first-order fine-grained multithreaded throughput model (XEC, TMA), pp. 329–340.
- HPCA-2004-AamodtCHWS #hardware
- Hardware Support for Prescient Instruction Prefetch (TMA, PC, PH, HW, JPS), pp. 84–95.