LightGBM uses a leaf-wise tree growth algorithm.

  • Advantages: converges faster.

  • Disadvantages: tends to over-fit.
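As a minimal sketch of what that looks like in practice (the synthetic data and the Python API call below are my own illustration, not part of the original write-up), leaf-wise growth is LightGBM's default, so a plain training call already uses it:

```python
import numpy as np
import lightgbm as lgb

# Synthetic binary-classification data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

train_set = lgb.Dataset(X, label=y)

# Leaf-wise growth is the default tree growth strategy; no extra flag is needed.
params = {"objective": "binary", "metric": "auc"}
booster = lgb.train(params, train_set, num_boost_round=100)
```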

Parameters to tune (ordered by importance level DESC, in my opinion of course):

  1. num_leaves
  2. min_data_in_leaf
  3. learning_rate
  4. max_depth

num_leaves.

This is the main parameter to control the complexity of the tree model (XGBoost controls complexity with depth instead).

Values smaller than 2^(max_depth) could be better choices.
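For example (the concrete numbers are assumptions for illustration, not recommendations): a depth-wise tree with max_depth = 7 has at most 2^7 = 128 leaves, so a leaf-wise model might start from a num_leaves well below 128 and tune from there.

```python
# Hypothetical starting point: a depth-7 tree has at most 2**7 = 128 leaves,
# so pick num_leaves noticeably below that and tune by validation AUC.
depth_equivalent = 7
params = {
    "objective": "binary",
    "metric": "auc",
    "num_leaves": 70,  # < 2**depth_equivalent = 128
}
```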

min_data_in_leaf.

This is the parameter to deal with over-fitting in a leaf-wise tree.

Its value depends on the number of training samples and on num_leaves.

Setting it somewhere in the range of (200, 999) should be enough for a large dataset.
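As a rough sanity check (an assumption on my part, not a LightGBM rule), the product of num_leaves and min_data_in_leaf should stay well below the number of training rows; otherwise the trees cannot actually reach that many leaves.

```python
# Hypothetical numbers, only to illustrate the relationship between
# dataset size, num_leaves, and min_data_in_leaf.
n_train = 1_000_000      # a large dataset
num_leaves = 70
min_data_in_leaf = 500   # inside the (200, 999) range suggested above

assert num_leaves * min_data_in_leaf < n_train
```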

learning_rate.

The initial learning rate I set was 0.1. After 140 rounds, I found the AUC started to decrease from 0.688.

To improve accuracy, I chose a lower learning rate and enlarged the number of boosting iterations accordingly, since with a lower learning rate it takes more rounds to converge. The AUC of 0.688 can serve as a target, indicating that I had gone into a local minimum.
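A sketch of that retraining step, assuming the LightGBM Python API with the early-stopping callback and synthetic stand-in data (the concrete values are illustrative, not the exact settings of the original run):

```python
import numpy as np
import lightgbm as lgb

# Synthetic stand-in for the competition data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))
y = (X[:, 0] + rng.normal(scale=0.5, size=5000) > 0).astype(int)
train_set = lgb.Dataset(X[:4000], label=y[:4000])
valid_set = lgb.Dataset(X[4000:], label=y[4000:], reference=train_set)

params = {
    "objective": "binary",
    "metric": "auc",
    "learning_rate": 0.02,   # lower than the initial 0.1
    "num_leaves": 70,
    "min_data_in_leaf": 20,  # tiny here; on a large dataset this would sit in (200, 999)
}

# More boosting rounds compensate for the smaller step size; early stopping
# halts training once validation AUC stops improving.
booster = lgb.train(
    params,
    train_set,
    num_boost_round=2000,
    valid_sets=[valid_set],
    callbacks=[lgb.early_stopping(stopping_rounds=50)],
)
```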


max_depth.

This is the parameter to limit the tree depth.

In practice, the concept of depth can largely be forgotten for a leaf-wise tree, since there is no fixed mapping from the number of leaves to depth.
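For completeness, a small hedged note in code form (the default of -1 is, to my knowledge, LightGBM's "no explicit depth limit"; the other values are just carried over from the sketches above):

```python
# max_depth defaults to -1 in LightGBM, i.e. no explicit depth limit;
# with leaf-wise growth, num_leaves is what actually bounds the tree size.
params = {
    "objective": "binary",
    "metric": "auc",
    "num_leaves": 70,
    "max_depth": -1,
}
```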

### How do tree methods deal with NaN?

During EDA, I found many missing values.
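To my knowledge, LightGBM treats NaN as missing by default (use_missing=true) and learns, at each split, which side the missing values should go to; XGBoost similarly learns a default direction per split. A minimal sketch with synthetic data (my own illustration, not the competition data):

```python
import numpy as np
import lightgbm as lgb

# Synthetic data with injected missing values, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
X[rng.random(X.shape) < 0.1] = np.nan            # roughly 10% NaN cells
y = (np.nan_to_num(X[:, 0]) > 0).astype(int)

# use_missing=true is the default: NaN is treated as missing and routed to
# whichever side of each split gives the better gain, so no manual
# imputation is strictly required before training.
params = {"objective": "binary", "metric": "auc", "use_missing": True}
booster = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=50)
```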
