Keywords: randomized algorithms; machine learning; quality assessment
Abstract: Many existing procedures in machine learning and statistics are computationally intractable in the setting of large-scale data. As a result, the advent of rapidly increasing dataset sizes, which should be a boon yielding improved statistical performance, instead severely blunts the usefulness of a variety of existing inferential methods. In this work, we use randomness to ameliorate this lack of scalability by reducing complex, computationally difficult inferential problems to larger sets of significantly smaller and more tractable subproblems. This approach allows us to devise algorithms which are both more efficient and more amenable to the use of parallel and distributed computation. We propose novel randomized algorithms for two broad classes of problems that arise in machine learning and statistics: estimator quality assessment and semidefinite programming.
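The decomposition strategy described above can be illustrated with a minimal sketch (this is an illustrative example, not the authors' actual algorithm): to assess the quality of an estimator on a large dataset, apply it to many small random subsamples and measure the spread of the resulting estimates. Each subproblem is independent, so the loop parallelizes trivially. The function name `subsample_quality` and all parameter defaults here are hypothetical.

```python
import random
import statistics

def subsample_quality(data, estimator, num_subsamples=100,
                      subsample_size=50, seed=0):
    """Illustrative sketch: assess an estimator's variability by applying
    it to many small random subsamples of a large dataset, rather than
    running a costly resampling procedure on the full data.

    Each subsample is drawn and evaluated independently, so the loop body
    could be distributed across workers with no coordination."""
    rng = random.Random(seed)
    estimates = [
        estimator(rng.sample(data, subsample_size))
        for _ in range(num_subsamples)
    ]
    # The center and spread of the subsample estimates serve as a rough
    # quality measure (point estimate and variability) for the estimator.
    return statistics.mean(estimates), statistics.stdev(estimates)

# Usage on synthetic data: estimate the mean of a large Gaussian sample.
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
center, spread = subsample_quality(data, statistics.mean)
```

The spread of the subsample estimates plays the role of a quality measure here; the actual methods proposed in the work handle the necessary statistical corrections for subsample size, which this sketch omits.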