Vancouver, BC, Canada
August 27 & 28 - Co-Located Events, Tutorials, Labs & Lightning Talks
August 29-31 - Conference
Tuesday, August 28 • 3:15pm - 4:45pm
Workshop: Matrix Math at Scale with Apache Mahout and Spark - Andrew Musselman, Apache Mahout

This workshop will provide an overview of the process of building predictions from raw data using the Apache Mahout machine-learning library and the Apache Spark compute framework. We will present recent developments in Mahout including the matrix-math-oriented domain-specific language named “Samsara” as well as a new algorithm development framework. Then we will walk through the steps required to convert a data set to the format required for analytic methods, split the data, train and test a model, and then output predictions.
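
The steps listed above (convert, split, train, test, predict) can be sketched in a few lines. The sketch below is plain Python, not Mahout or Spark code and not the workshop's own material. It assumes a linear regression fitted with the normal equations as the analytic method, and every function name and the synthetic data set are hypothetical, chosen only for illustration.

```python
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a x = b (b is a vector) by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def to_design_matrix(rows):
    """Step 1: convert raw (feature..., target) rows to X (with bias column) and y."""
    X = [[1.0] + [float(v) for v in row[:-1]] for row in rows]
    y = [float(row[-1]) for row in rows]
    return X, y

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Step 2: shuffle reproducibly and split into training and test rows."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def fit(X, y):
    """Step 3: train by solving the normal equations (X^T X) beta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [row[0] for row in matmul(Xt, [[v] for v in y])]
    return solve(XtX, Xty)

def predict(X, beta):
    """Step 5: output predictions as X beta."""
    return [sum(b * v for b, v in zip(beta, row)) for row in X]

def mean_squared_error(y, y_hat):
    """Step 4: test the model on held-out rows."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

# Synthetic, noise-free raw data generated from y = 2 + 3*x1 - x2.
raw = [(x1, x2, 2 + 3 * x1 - x2) for x1 in range(5) for x2 in range(4)]

train_rows, test_rows = train_test_split(raw)
X_train, y_train = to_design_matrix(train_rows)
X_test, y_test = to_design_matrix(test_rows)

beta = fit(X_train, y_train)
predictions = predict(X_test, beta)
test_mse = mean_squared_error(y_test, predictions)
```

Because the synthetic data has no noise, the fitted coefficients should recover the generating values and the test error should be close to zero. A distributed framework would replace the in-memory lists with partitioned matrices, but the sequence of steps stays the same.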

This approach takes advantage of a declarative Scala-based DSL that lets practitioners operate on very large data sets while expressing the math in their work symbolically, without needing significant programming experience to manipulate matrices and other structures. Attendees will be given example source code and pointers on getting started on their own projects.


Andrew Musselman

PMC Member, Apache Mahout
Andrew Musselman is a member of the Apache Mahout Project Management Committee, an independent data engineering and analytics consultant, and co-host of the Adversarial Learning podcast. He loves distributed matrix math and lives in Seattle with his wife and kids. He has spoken on...

Tuesday August 28, 2018 3:15pm - 4:45pm PDT
Room 212