ICASSP 2007 - April 15-20, 2007 - Honolulu, Hawai'i, U.S.A.

TUT-3: Multimedia Signal Processing on Personal Computers

Sunday Afternoon, April 15
14:00 - 17:00
Room 325A

Presented by

Yen-Kuang Chen, Intel Corporation

Abstract

To get the best performance from multimedia applications on personal computers, we must carefully consider the interplay between microprocessors and algorithms/applications. The performance of personal computers has improved significantly during the past two decades. A significant portion of that improvement comes from data-level parallelism (e.g., MMX/SSE instructions) and thread-level parallelism (e.g., the latest Intel Core Duo processor). Moving forward, we expect increases not only in the capability of single-instruction-multiple-data (SIMD) instructions but also in the number of processing cores in a single personal computer. Conventional optimization of digital signal processing algorithms, which minimizes the number of operations, may not be suitable for modern personal computers. To capture the increasing computational performance provided by future architectures, we must carefully design or choose the algorithm for a specific task. This tutorial covers algorithm design and algorithmic-level optimization for modern processors.

  1. Overview & motivation
    1. Sequential vs parallel processing
    2. Goal of the tutorial---from architectural style to algorithm design
  2. Performance enhancement features in personal computers for media applications
    1. Data-level parallelism, e.g., MMX/SSE Technologies
    2. Thread-level parallelism, e.g., Hyper-Threading Technology and Dual Core
  3. SIMD optimization techniques
    1. Match the algorithms to SIMD instruction capability---put data into the right format for parallel execution (using H.264 integer transform as an example)
    2. Execute multiple identical operations in one instruction (using H.264 luminance sub-pel interpolation as an example)
    3. Transform conditional executions into logic operations (using MPEG-4 repetitive pixel padding as an example)
    4. Reduce shuffling and maximize the grouping of operations into one instruction (using MPEG-4 SA-DCT as an example)
  4. Multi-threading algorithm design
    1. Partition the application into multiple threads that run the same program on different pieces of data (using H.264 encoder as an example)
    2. Dynamically balance loads for better parallelism (using MPEG-2 video decoder as an example)
    3. Take advantage of shared caches to increase effectiveness (using SVM-based face detection as an example)
  5. Conclusions
    1. Match algorithms to SIMD instruction capabilities
    2. Design algorithms with minimal/simple data dependencies for data-level and functional-level parallelism
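
As a rough illustration of outline item 3.3 (transforming conditional execution into logic operations), the sketch below replaces a per-pixel branch with a mask and bitwise AND/OR. It is not taken from the tutorial and is not the MPEG-4 repetitive padding algorithm; the function names, the `inside` shape mask, and the `fill` value are hypothetical. The code is scalar, portable C; a SIMD version would presumably apply the same mask-and-merge pattern to many pixels per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branching reference: keep the pixel where the (hypothetical) shape mask
 * is nonzero, otherwise write the fill value. */
static void select_branch(const uint8_t *pixel, const uint8_t *inside,
                          uint8_t fill, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (inside[i])
            out[i] = pixel[i];
        else
            out[i] = fill;
    }
}

/* Branch-free version: build an all-ones or all-zeros byte mask from the
 * condition, then merge the two candidates with AND/OR. */
static void select_logic(const uint8_t *pixel, const uint8_t *inside,
                         uint8_t fill, uint8_t *out, size_t n)
{
    assert(pixel != NULL && inside != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        /* (inside[i] != 0) is 0 or 1; negating gives 0 or -1,
         * which truncates to 0x00 or 0xFF. */
        uint8_t m = (uint8_t)-(inside[i] != 0);
        out[i] = (uint8_t)((pixel[i] & m) | (fill & (uint8_t)~m));
    }
}
```

Both functions produce the same output; the second has no data-dependent branch in the loop body, which is the property that makes the pattern map onto compare, AND, and OR instructions operating on packed data.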

Target audience

This tutorial is intended to provide a basic overview of implementing multimedia applications (specifically video codecs) on modern personal computers. The intended audience includes those who are interested in (1) implementation and performance evaluation of multimedia applications on personal computers, and (2) parallelization and optimization techniques. Some background in video compression may help but is not required.

Speaker Biography

Yen-Kuang Chen received his Ph.D. in Electrical Engineering from Princeton University. He is a Senior Staff Researcher in the Corporate Technology Group, Intel Corporation. His research interests include developing innovative multimedia and Internet applications, studying the performance bottlenecks in current computers, and designing next-generation microprocessors and platforms. He has 10 US patents, 25+ pending patent applications, and 50+ technical publications. He is one of the key contributors to Supplemental Streaming SIMD Extensions 3 (SSSE3). As an expert in video compression (e.g., MPEG-2, MPEG-4, H.263, and H.264) and computer architecture for emerging applications (e.g., SIMD and multi-threading), he has been an invited speaker at the 2005 Emerging Information Technology Conference, the 2005 New Technology Business Opportunities Forum, the 2004 Sino-American Technology & Engineering Conference, and the 2003 Workshop on Media and Signal Processors for Embedded Systems and SoCs. He is an associate editor of the Journal of VLSI Signal Processing Systems (including special issues on “System-on-a-Chip for Multimedia Systems” and on “Design and Programming of Signal Processors for Multimedia Communication”) and of IEEE Transactions on Circuits and Systems I. He has served as a program committee member of 20+ international conferences and workshops on multimedia, video communication, image processing, VLSI circuits and systems, parallel processing, and software optimization. He was an invited participant in the 2002 Frontiers of Engineering Symposium (National Academy of Engineering) and the 2003 German-American Frontiers of Engineering Symposium (Alexander von Humboldt Foundation). He is an IEEE Senior Member.

