Title: Theory and Applications of Asynchronous Parallel Computing
Single-core performance stopped improving around 2005. However, 64-core workstations, thousand-core GPUs, and even eight-core cellphones are now sold at affordable prices. To take advantage of multiple cores, we must parallelize our algorithms. For iterative algorithms to have strong parallel performance, it is important to reduce or even remove synchronization so that the cores can keep running with the information they have, even when the latest information has not arrived. This talk explains why such async-parallel computing is both theoretically sound and practically attractive. In particular, we study fixed-point iterations of a nonexpansive operator and show that randomized async-parallel iterations converge almost surely to a fixed point, as long as the operator has a fixed point and the step size is properly chosen. Roughly speaking, the convergence speed scales linearly with the number of cores when the number of cores is no more than the square root of the number of variables. As special cases, we introduce novel algorithms for systems of linear equations, machine learning, and distributed and decentralized optimization. On sparse logistic regression and other problems, the new async-parallel algorithms run an order of magnitude faster than traditional sync-parallel algorithms. This is joint work with Zhimin Peng (UCLA), Yangyang Xu (IMA), and Ming Yan (Michigan State).
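To make the idea of a randomized async-parallel fixed-point iteration concrete, here is a minimal serial simulation in Python. It is a sketch under stated assumptions, not the speakers' implementation: the test problem (`A`, `b`), the operator `T`, the function name `async_fixed_point`, and the values of `eta` and `max_delay` are all hypothetical choices made for illustration. Each step picks one coordinate at random and updates it using a possibly stale copy of the iterate, which mimics a core that keeps working without waiting for the latest information.

```python
import random

# Hypothetical test problem (not from the talk): solve A x = b,
# where A is symmetric positive definite.
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0]

# Every row of A has absolute row sum <= 6, so its largest eigenvalue is <= 6
# (Gershgorin). With ALPHA = 1/6 the operator T below is nonexpansive.
ALPHA = 1.0 / 6.0


def T(x):
    """Gradient-step operator T(x) = x - ALPHA * (A x - b).

    Its fixed points are exactly the solutions of A x = b.
    """
    n = len(x)
    return [x[i] - ALPHA * (sum(A[i][j] * x[j] for j in range(n)) - b[i])
            for i in range(n)]


def async_fixed_point(T, x0, eta=0.25, max_delay=2, iters=3000, seed=0):
    """Serially simulate randomized coordinate updates with stale reads.

    eta and max_delay are illustrative assumptions; the talk only says
    that the step size must be properly chosen.
    """
    rng = random.Random(seed)
    x = list(x0)
    history = [list(x0)]  # recent iterates, newest last
    for _ in range(iters):
        i = rng.randrange(len(x))  # coordinate chosen uniformly at random
        delay = rng.randint(0, min(max_delay, len(history) - 1))
        x_hat = history[-1 - delay]  # possibly outdated snapshot of x
        # Move coordinate i toward the fixed point using the stale snapshot.
        x[i] -= eta * (x_hat[i] - T(x_hat)[i])
        history.append(list(x))
        if len(history) > max_delay + 1:
            history.pop(0)  # keep only the snapshots a delayed read can reach
    return x


x = async_fixed_point(T, [0.0, 0.0, 0.0])
print(x)
```

For this small system the exact solution is (5/28, 2/7, 19/28), so the printed vector can be checked by hand. A real async-parallel implementation would run the loop body concurrently on several cores with shared memory; the `history` buffer here only imitates the resulting delays.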
Wotao Yin is a professor in the Department of Mathematics at UCLA. His research interests lie in computational optimization and its applications in image processing, machine learning, and other inverse problems. He received his B.S. in mathematics from Nanjing University in 2001, and his M.S. and Ph.D. in operations research from Columbia University in 2003 and 2006, respectively. From 2006 to 2013, he was with Rice University. He received an NSF CAREER Award in 2008 and an Alfred P. Sloan Research Fellowship in 2009. His recent work has been in optimization algorithms for large-scale and distributed signal processing and machine learning problems.