Event image

Abstract:

Machine learning has become an indispensable component of the “Big Data” software stack. However, applying machine-learning algorithms on massive datasets remains a challenge, in part due to the poor support for iteration and recursion in existing frameworks for large-scale data analytics. This talk reviews our recent work that addresses this challenge by leveraging principles and techniques from database query processing. Our approach matches or exceeds the performance of specialized tools on actual machine-learning tasks, while providing a general and extensible framework for applying machine learning on large datasets.

Bio:

Neoklis PolyzotisNeoklis Polyzotis is a full professor at UC Santa Cruz.  His research focuses on database systems, and in particular on machine learning on big data sets, on-line database tuning, and declarative crowdsourcing. He is the recipient of an NSF CAREER award in 2004 and of an IBM Faculty Award in 2005 and 2006. He has also received the runner-up for best paper in VLDB 2007 and the best newcomer paper award in PODS 2008. He received his PhD from the University of Wisconsin at Madison in 2003.