Wednesday, July 8, 2015

Parallel Computing for Data Science

Hot off the press, Norman Matloff's book Parallel Computing for Data Science: With Examples in R, C++ and CUDA (Chapman and Hall/CRC Press, 2015) should appeal to many readers of this blog.

The book's coverage is clear from the following chapter titles:

1. Introduction to Parallel Processing in R
2. Performance Issues: General
3. Principles of Parallel Loop Scheduling
4. The Message Passing Paradigm
5. The Shared Memory Paradigm
6. Parallelism through Accelerator Chips
7. An Inherently Statistical Approach to Parallelization: Subset Methods
8. Distributed Computation
9. Parallel Sorting, Filtering and Prefix Scan
10. Parallel Linear Algebra
Appendix - Review of Matrix Algebra 

The Preface makes it perfectly clear what this book is intended to be, and what it is not intended to be. Consider these passages: