MLlib: Machine learning in Apache Spark

Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning...

Bibliographic Details
Main Authors: Meng, Xiangrui; Bradley, Joseph; Yavuz, Burak; Sparks, Evan; Venkataraman, Shivaram; Liu, Davies; Freeman, Jeremy; Tsai, DB; Amde, Manish; Owen, Sean; Xin, Doris; Franklin, Michael J.; Zadeh, Reza; Talwalkar, Ameet; Zaharia, Matei A
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Format: Article
Published: JMLR, Inc. 2018
Online Access: http://hdl.handle.net/1721.1/116816
https://orcid.org/0000-0002-7547-7204