TY  - BOOK
AU  - McCool, Michael D.
AU  - Robison, Arch D.
AU  - Reinders, James
TI  - Structured parallel programming: patterns for efficient computation
SN  - 9780124159938
AV  - QA76.76.P37 M34 2012
U1  - 005.1 23
PY  - 2012///
CY  - San Francisco, Calif.; Oxford
PB  - Morgan Kaufmann, Elsevier Science [distributor]
KW  - Software patterns
KW  - Structured programming
N1  - Includes bibliographical references (pages 391-396) and index
N1  - Chapter 1. Introduction: 1.1 Think Parallel; 1.2 Performance; 1.3 Motivation: Pervasive Parallelism; 1.4 Structured Pattern-Based Programming; 1.5 Parallel Programming Models; 1.6 Organization of this Book; 1.7 Summary
N1  - Chapter 2. Background: 2.1 Vocabulary and Notation; 2.2 Strategies; 2.3 Mechanisms; 2.4 Machine Models; 2.5 Performance Theory; 2.6 Pitfalls; 2.7 Summary
N1  - PART I. Patterns
N1  - Chapter 3. Patterns: 3.1 Nesting Pattern; 3.2 Structured Serial Control Flow Patterns; 3.3 Parallel Control Patterns; 3.4 Serial Data Management Patterns; 3.5 Parallel Data Management Patterns; 3.6 Other Parallel Patterns; 3.7 Non-Deterministic Patterns; 3.8 Programming Model Support for Patterns; 3.9 Summary
N1  - Chapter 4. Map: 4.1 Map; 4.2 Scaled Vector Addition (SAXPY); 4.3 Mandelbrot; 4.4 Sequence of Maps versus Map of Sequence; 4.5 Comparison of Parallel Models; 4.6 Related Patterns; 4.7 Summary
N1  - Chapter 5. Collectives: 5.1 Reduce; 5.2 Fusing Map and Reduce; 5.3 Dot Product; 5.4 Scan; 5.5 Fusing Map and Scan; 5.6 Integration; 5.7 Summary
N1  - Chapter 6. Data Reorganization: 6.1 Gather; 6.2 Scatter; 6.3 Converting Scatter to Gather; 6.4 Pack; 6.5 Fusing Map and Pack; 6.6 Geometric Decomposition and Partition; 6.7 Array of Structures vs. Structures of Arrays; 6.8 Summary
N1  - Chapter 7. Stencil and Recurrence: 7.1 Stencil; 7.2 Implementing Stencil with Shift; 7.3 Tiling Stencils for Cache; 7.4 Optimizing Stencils for Communication; 7.5 Recurrence; 7.6 Summary
N1  - Chapter 8. Fork–Join: 8.1 Definition; 8.2 Programming Model Support for Fork–Join; 8.3 Recursive Implementation of Map; 8.4 Choosing Base Cases; 8.5 Load Balancing; 8.6 Complexity of Parallel Divide-and-Conquer; 8.7 Karatsuba Multiplication of Polynomials; 8.8 Cache Locality and Cache-Oblivious Algorithms; 8.9 Quicksort; 8.10 Reductions and Hyperobjects; 8.11 Implementing Scan with Fork–Join; 8.12 Applying Fork–Join to Recurrences; 8.13 Summary
N1  - Chapter 9. Pipeline: 9.1 Basic Pipeline; 9.2 Pipeline with Parallel Stages; 9.3 Implementation of a Pipeline; 9.4 Programming Model Support for Pipelines; 9.5 More General Topologies; 9.6 Mandatory versus Optional Parallelism; 9.7 Summary
N1  - PART II. Examples
N1  - Chapter 10. Forward Seismic Simulation: 10.1 Background; 10.2 Stencil Computation; 10.3 Impact of Caches on Arithmetic Intensity; 10.4 Raising Arithmetic Intensity with Space–Time Tiling; 10.5 Cilk Plus Code; 10.6 ArBB Implementation; 10.7 Summary
N1  - Chapter 11. K-Means Clustering: 11.1 Algorithm; 11.2 K-Means with Cilk Plus; 11.3 K-Means with TBB; 11.4 Summary
N1  - Chapter 12. Bzip2 Data Compression: 12.1 The Bzip2 Algorithm; 12.2 Three-Stage Pipeline Using TBB; 12.3 Four-Stage Pipeline Using TBB; 12.4 Three-Stage Pipeline Using Cilk Plus; 12.5 Summary
N1  - Chapter 13. Merge Sort: 13.1 Parallel Merge; 13.2 Parallel Merge Sort; 13.3 Summary
N1  - Chapter 14. Sample Sort: 14.1 Overall Structure; 14.2 Choosing the Number of Bins; 14.3 Binning; 14.4 Repacking and Subsorting; 14.5 Performance Analysis of Sample Sort; 14.6 For C++ Experts; 14.7 Summary
N1  - Chapter 15. Cholesky Factorization: 15.1 Fortran Rules!; 15.2 Recursive Cholesky Decomposition; 15.3 Triangular Solve; 15.4 Symmetric Rank Update; 15.5 Where is the Time Spent?; 15.6 Summary
N1  - APPENDIX A. Further Reading
N1  - APPENDIX B. Cilk Plus
N1  - APPENDIX C. TBB
N1  - APPENDIX D. C++11
N1  - APPENDIX E. Glossary
N1  - Bibliography
UR  - http://repository.fue.edu.eg/xmlui/handle/123456789/3502
ER  - 