Showing posts from October, 2015

RuleFit: When disassembled trees meet Lasso

The RuleFit algorithm from Friedman and Popescu is an interesting regression and classification approach that uses decision rules in a linear model.

RuleFit is not a completely new idea, but it combines several existing algorithms in a clever way. It consists of two components: the first produces "rules", and the second fits a linear model with these rules as input (hence the name "RuleFit"). The cool thing about the algorithm is that the resulting model is highly interpretable, because the decision rules have an easily understandable format, while the approach stays flexible enough to capture complex interactions and get a good fit.

Part I: Generate rules

The rules that the algorithm generates have a simple form:

if  x2 < 3  and  x5 < 7  then  1  else  0

The rules are generated from the covariate matrix X. You can also think of the rules simply as new binary features derived from your original features.
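To make the "rules as new features" idea concrete, here is a minimal sketch in plain Python. It is not taken from the RuleFit paper or any library: the helper `make_rule` and the toy matrix `X` are made up for illustration, and it only handles conjunctions of "feature < threshold" conditions like the example rule above.

```python
# Hypothetical helper (not part of any RuleFit implementation):
# a rule is a conjunction of "feature < threshold" conditions.
def make_rule(conditions):
    """conditions: list of (column_index, threshold) pairs that must all hold."""
    def rule(row):
        return 1 if all(row[j] < t for j, t in conditions) else 0
    return rule

# "if x2 < 3 and x5 < 7 then 1 else 0"
# x2 and x5 are 1-based in the text, so they map to columns 1 and 4 here.
rule = make_rule([(1, 3), (4, 7)])

# Toy covariate matrix, invented for this example (one row per observation).
X = [
    [0.5, 2.0, 1.0, 9.0, 6.5],  # x2 < 3 and x5 < 7
    [0.5, 4.0, 1.0, 9.0, 6.5],  # x2 >= 3, so the rule does not fire
    [0.5, 2.0, 1.0, 9.0, 7.0],  # x5 == 7 is not < 7, so the rule does not fire
]

# Applying the rule to every row yields one new 0/1 feature column.
rule_feature = [rule(row) for row in X]
print(rule_feature)  # [1, 0, 0]
```

Each rule contributes one such 0/1 column, and the second component of RuleFit would then fit a linear model on these columns.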

The RuleFit paper uses the Boston housing data as…