Better than Deep Learning: Gradient Boosting Machines (GBM)

Overview
Despite all the hype about deep learning and “AI”, it is not well publicized that for the structured/tabular data widely encountered in business applications, it is another machine learning algorithm, the gradient boosting machine (GBM), that most often achieves the highest accuracy. In this workshop I’ll introduce GBMs and several implementations (accessible from R or Python), and we’ll train and tune GBMs hands-on with some public datasets using R and Python.

Recommended audience
Recommended for people who are already familiar with machine learning basics and with coding in R or Python, and who want to learn about GBMs and gain some familiarity with using them in practical applications.

Prerequisites
You must already have some familiarity with machine learning (e.g. train-test split, overfitting etc.) and with R or Python in order to attend this workshop.

To participate in the hands-on exercises in the final part of the workshop, you need to bring your own laptop with either R or Python installed. Please also install Java and the h2o and xgboost packages for R or Python.
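For Python users, a quick way to confirm the setup before the workshop is a small check like the one below. It is only a sketch: the helper name `check_workshop_setup` is made up for illustration, and it assumes that the packages are importable as `h2o` and `xgboost` and that Java is reachable as `java` on the PATH. It only tests that these can be located, not that they work correctly.

```python
import importlib.util
import shutil


def check_workshop_setup(packages=("h2o", "xgboost")):
    """Report which workshop prerequisites appear to be available.

    Hypothetical helper for illustration; not part of h2o or xgboost.
    """
    status = {}
    # find_spec returns None when a top-level package cannot be located.
    for name in packages:
        status[name] = importlib.util.find_spec(name) is not None
    # h2o runs on the JVM, so look for a `java` executable on the PATH.
    status["java"] = shutil.which("java") is not None
    return status


if __name__ == "__main__":
    for item, found in check_workshop_setup().items():
        print(f"{item}: {'found' if found else 'MISSING'}")
```

Anything reported as missing should be installed before the hands-on part begins.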

Presenter

Pafka Szilárd
Chief Scientist, Epoch USA

Szilard studied physics in the 1990s and obtained a PhD by using statistical methods to analyze the risk of financial portfolios. For the last decade he has been the Chief Scientist of a tech company in California, doing everything data (analysis, modeling, data visualization, data engineering, machine learning etc.). He is the founder of the LA R and LA data science meetups and of the data community website datascience.la. He is also the author of a well-known machine learning benchmark on GitHub (1000+ stars) and a frequent speaker at data science conferences, and he has developed and taught graduate machine learning courses at two universities (UCLA and CEU).