Single-Instruction Multiple-Data Execution

Christopher J. Hughes
ISBN: 9781627057639 | PDF ISBN: 9781627057646
Copyright © 2015 | 121 Pages | Publication Date: May, 2015

DOI: 10.2200/S00647ED1V01Y201505CAC032


Having hit power limitations to even more aggressive out-of-order execution in processor cores, many architects in the past decade have turned to single-instruction multiple-data (SIMD) execution to increase single-threaded performance. SIMD execution, or having a single instruction drive execution of an identical operation on multiple data items, was already well established as a technique to efficiently exploit data parallelism. Furthermore, support for it was already included in many commodity processors. However, in the past decade, SIMD execution has seen a dramatic increase in the set of applications using it, which has motivated big improvements in hardware support in mainstream microprocessors.

The easiest way to provide a big performance boost to SIMD hardware is to make it wider, i.e., increase the number of data items hardware operates on simultaneously. Indeed, microprocessor vendors have done this. However, as we exploit more data parallelism in applications, certain challenges can negatively impact performance. In particular, conditional execution, non-contiguous memory accesses, and the presence of some dependences across data items are key roadblocks to achieving peak performance with SIMD execution.
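As a rough illustration of the terms used above, the following Python sketch models SIMD semantics with ordinary lists. It is a scalar model written for this summary, not code from the book and not real vector hardware: each list stands in for one vector register, each function stands in for a single instruction, and all function names and data values are hypothetical. It shows a plain element-wise operation plus the three roadblocks named above, treating a reduction as one common case of a dependence across data items.

```python
# Scalar model of SIMD semantics (illustrative assumption, not real hardware):
# each list plays the role of one vector register, and each function plays
# the role of one vector instruction applied to all lanes.


def simd_add(a, b):
    """One 'instruction': the same add applied to every lane."""
    return [x + y for x, y in zip(a, b)]


def simd_masked_add(a, b, mask):
    """Conditional execution: lanes whose mask entry is False keep a's value."""
    return [x + y if m else x for x, y, m in zip(a, b, mask)]


def simd_gather(memory, indices):
    """Non-contiguous memory access: each lane loads from its own index."""
    return [memory[i] for i in indices]


def simd_reduce_add(a):
    """Horizontal operation: combine values across the lanes of one register."""
    total = 0
    for x in a:
        total += x
    return total


# Hypothetical 4-lane example data.
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mask = [x % 2 == 0 for x in a]  # per-lane condition: "this element is even"
memory = [100, 101, 102, 103, 104, 105, 106, 107]

added = simd_add(a, b)
masked = simd_masked_add(a, b, mask)
gathered = simd_gather(memory, [7, 0, 3, 3])
summed = simd_reduce_add(a)
```

In this model, making the hardware "wider" corresponds to longer lists. The masked, gather, and reduction cases are the ones where a real implementation can no longer treat every lane identically and independently, which is why they are harder to run at peak rate.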

This book first describes data parallelism and why it is so common in popular applications. We then describe SIMD execution and explain where its performance and energy benefits come from compared to other techniques for exploiting parallelism. Finally, we describe SIMD hardware support in current commodity microprocessors. This includes both expected and unexpected design tradeoffs, as we work to overcome challenges encountered when trying to map real software to SIMD execution.

Table of Contents

Data Parallelism
Exploiting Data Parallelism with SIMD Execution
Computation and Control Flow
Memory Operations
Horizontal Operations
Author's Biography

About the Author(s)

Christopher J. Hughes, Intel
Christopher J. Hughes is a principal engineer at Intel Labs, which he joined in August 2003. He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2003, a Master of Science degree in Computer Science from the University of Illinois at Urbana-Champaign in 2000, and a Bachelor of Science degree in Electrical Engineering and a Bachelor of Arts degree in Computer Science from Rice University in 1998. He led the teams that defined the gather instructions in Intel AVX2, the gather and scatter instructions in Intel AVX-512, and Intel's AVX-512CD instructions. His research focuses on highly parallel architectures for compute- and data-intensive applications.
