Machine Learning

[Tree-Based Models 2] Boosting

bomishot 2023. 4. 14. 17:37

🌀 Bagging vs Boosting

Bagging: bootstrap sampling (sampling with replacement) → train the base models (each base model is trained independently, in parallel) → combine the base models' predictions with equal weight.

 

Boosting: the base models are trained sequentially; each new model concentrates on the parts that the models trained so far predict poorly.

 

  • Influence between base models
    • Bagging: the base models are built individually, without being affected by one another.
    • Boosting: each model is built to focus on what the previous base models failed to predict.
  • Dataset
    • Bagging: built by random sampling with replacement (bootstrapping) from the original dataset.
    • Boosting: built by giving higher weight to the observations that had large errors in the previous round and then sampling.
  • Variance and bias
    • Bagging: errors that arise in different ways across the base models cancel out, reducing variance → fights overfitting.
    • Boosting: repeating the boosting steps raises the complexity of the final model and reduces bias → fights underfitting.
  • Final result
    • Bagging: the average of the base models (regression) or a majority vote (classification).
    • Boosting: the base models' outputs are aggregated to make the prediction.
  • Representative algorithms
    • Bagging: Random Forest
    • Boosting: AdaBoost, GBM, XGBoost, LightGBM

cf) XGBoost's main weakness, its speed, is much improved by switching to LightGBM!
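
To see the contrast in code, here is a minimal sketch using scikit-learn (the synthetic dataset and settings are only illustrative, not from this post) that trains one bagging ensemble and one boosting ensemble on the same data:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Bagging: trees are built independently on bootstrap samples, then their votes are combined
bagging = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)

# Boosting: trees are built sequentially, each one correcting the errors of the ones before it
boosting = GradientBoostingClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)

print('bagging  (RandomForest)     :', bagging.score(X_te, y_te))
print('boosting (GradientBoosting) :', boosting.score(X_te, y_te))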


Let's take a look at the representative boosting algorithms, AdaBoost and Gradient Boost!

 

AdaBoost

Suited to classification problems; more sensitive to outliers than Gradient Boost; its performance is weaker, so it is rarely used in practice.

์ž˜๋ชป ๋ถ„๋ฅ˜๋œ ๊ด€์ธก์น˜์— ๊ฐ€์ค‘์น˜ ๋ถ€์—ฌํ•ด ์ƒ˜ํ”Œ๋งํ•จ. (๊ฐ€์ค‘ ์ƒ˜ํ”Œ๋ง)

 

🌀 Gradient Boost

Can be used for both regression and classification problems.

Strong performance!

Very popular on Kaggle and in industry!

Many libraries implement it, so it is easy to build a model.

To concentrate on the data it got wrong, Gradient Boosting learns the residuals.

  • Samples that the previous trees predicted poorly get more emphasis, so the next tree learns them more intensively. "Raising the weight" here does not mean drawing those samples more often; instead, the residual error of each sample is computed and the next tree is fit to correct it.

 

This is why the ensemble tends to outperform a single model: the first model cannot learn everything, so the errors it leaves behind are used to keep training the following models.

 

This has the effect of making observations with large residuals get learned more; each new model directly learns how much the previous model was off by, sequentially complementing it.
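
The residual idea can be written out by hand; the sketch below (a toy regression with squared error, where the negative gradient is exactly the residual; this is only an illustration, not how XGBoost is implemented internally) builds the ensemble one small correction at a time:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, 500)

learning_rate = 0.1
pred = np.full_like(y, y.mean())   # initial prediction: just the mean
trees = []

for _ in range(100):
    residual = y - pred                                         # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)  # fit the next tree to the residuals
    pred += learning_rate * tree.predict(X)                     # add a small correction
    trees.append(tree)

print('final training MSE:', np.mean((y - pred) ** 2))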

 

 

🌀 XGBoost

Better performance and computation speed than plain Gradient Boosting!

The XGBoost library: a Gradient Boosting Decision Tree implementation released in 2014 that has been consistently popular on Kaggle and elsewhere.

The GradientBoostingRegressor and GradientBoostingClassifier classes in scikit-learn's ensemble module are also Gradient Boosting Decision Tree models, but their performance and computation speed fall behind XGBoost, so they are not used as often.

 

Gradient Boosting Decision Trees share the usual characteristics of tree-based models.

  • Features need to be numeric. (CatBoost and a few others can handle string-type features directly.)
  • No feature scaling or normalization is needed.
  • Ordinal encoding is preferred over one-hot encoding (see the sketch after this list).
    • In particular, one-hot encoding a high-cardinality feature consumes a lot of training time, memory, and compute, so be careful.
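
A quick sketch of the encoding point (a made-up high-cardinality column, using the same category_encoders OrdinalEncoder that appears in the Kaggle example below): ordinal encoding keeps one integer column, while one-hot explodes it into one column per category.

import pandas as pd
from category_encoders import OrdinalEncoder

# a single categorical column with 1,000 distinct values
df = pd.DataFrame({'city': [f'city_{i % 1000}' for i in range(10_000)]})

ordinal = OrdinalEncoder().fit_transform(df)  # still one column of integer codes
onehot = pd.get_dummies(df['city'])           # 1,000 indicator columns

print(ordinal.shape, onehot.shape)  # (10000, 1) (10000, 1000)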

 

XGBoost parameters

booster

  • Parameter that chooses the weak learner model.
  • gbtree: uses decision tree models.
  • dart: uses decision tree models, but applies the DART (dropout) algorithm to regularize the model.

objective

  • Sets the learning objective (loss), e.g. reg:squarederror for regression or binary:logistic for binary classification.

eval_metric

  • When a validation set is passed in as well, this sets the metric used to evaluate it.
  • Defaults: rmse for regression, logloss for classification.
  • eval_metric='error': evaluates with the 1 − accuracy metric.

 

XGBoost's main hyperparameters

XGBoost's performance varies a lot with its hyperparameters, so consider every one of them when using it!!!

n_estimators

  • Determines the number of weak learners (for a random forest, this would be the number of decision trees).

learning_rate

  • Determines how strongly each weak learner's contribution is applied at each step.
  • Range: 0–1.
    • If the value is too large, overfitting happens easily.
    • If the value is too small, training becomes slow.
  • Usually searched in roughly the 0.05–0.3 range.

max_depth

  • ๊ฐ weak learner ํŠธ๋ฆฌ๋“ค์˜ ์ตœ๋Œ€ ๊นŠ์ด ๊ฒฐ์ •
  • ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์— ๊ฐ€์žฅ ํฐ ์˜ํ–ฅ์„ ์ฃผ๋Š” ๋ณ€์ˆ˜!!!
  • ๊ฐ’์ด ๋„ˆ๋ฌด ํฌ๋ฉด, overfitting ๋ฐœ์ƒ ์‰ฌ์šฐ๋ฉฐ ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰ ๋Š˜์–ด๋‚จ.
  • ์ผ๋ฐ˜์ ์œผ๋กœ 5-12 ์ •๋„์˜ ๋ฒ”์œ„์—์„œ ํƒ์ƒ‰ ์ง„ํ–‰

min_child_weight

  • Determines how many observations (more precisely, how much instance weight) must fall into a leaf node.
  • The larger the value, the lower the complexity of the weak learners.
  • When overfitting, it is common to double the value, 1, 2, 4, 8, ..., and check the effect each time (see the sketch after this list).
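
A sketch of that doubling search (assuming feature matrices x_train/x_val and labels y_train/y_val like the ones prepared in the Kaggle example later in this post; the other settings are only illustrative):

from xgboost import XGBClassifier

# double min_child_weight until the train/validation gap stops shrinking
for mcw in [1, 2, 4, 8, 16, 32]:
    m = XGBClassifier(n_estimators=100, max_depth=7, learning_rate=0.1,
                      min_child_weight=mcw, n_jobs=-1, random_state=42)
    m.fit(x_train, y_train)
    print(f'min_child_weight={mcw:>2}  '
          f'train={m.score(x_train, y_train):.3f}  val={m.score(x_val, y_val):.3f}')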

subsample

  • ๊ฐ weak learner๋“ค์„ ํ•™์Šตํ•  ๋•Œ ๊ณผ์ ํ•ฉ์„ ๋ง‰๊ณ , ์ผ๋ฐ˜ํ™” ์„ฑ๋Šฅ์„ ์˜ฌ๋ฆฌ๊ธฐ ์œ„ํ•ด, ์ „์ฒด ๋ฐ์ดํ„ฐ ์ค‘ ์ผ๋ถ€๋ฅผ ์ƒ˜ํ”Œ๋งํ•˜์—ฌ ํ•™์Šตํ•จ.
  • subsample ํŒŒ๋ผ๋ฏธํ„ฐ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ˜ํ”Œ๋งํ•  ๋น„์œจ ๊ฒฐ์ •ํ•จ.
  • ์ผ๋ฐ˜์ ์œผ๋กœ 0.8์ •๋„๋กœ ์„ค์ •ํ•˜๋ฉฐ, ๋ฐ์ดํ„ฐ์˜ ํฌ๊ธฐ์— ๋”ฐ๋ผ ๋‹ฌ๋ผ์งˆ ์ˆ˜ ์žˆ๋‹ค.

colsample_bytree

  • The fraction of columns (features) to sample for each tree.
  • Usually set around 0.8; the right value can vary with the number of features. When there are very many features (a thousand or more), a very small value such as 0.1 is sometimes used.

scale_pos_weight

  • Use it when the target is imbalanced!!
  • Setting it to sum(negative cases) / sum(positive cases) is equivalent to scikit-learn's class_weight='balanced' option (see the sketch below).
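
A minimal sketch of computing that ratio (assuming y_train is a 0/1 label Series; the early-stopping example below computes the same ratio with value_counts):

from xgboost import XGBClassifier

# ratio of negative to positive cases, roughly equivalent to class_weight='balanced'
neg, pos = (y_train == 0).sum(), (y_train == 1).sum()
model = XGBClassifier(scale_pos_weight=neg / pos)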

In general, max_depth and learning_rate are the most important hyperparameters, and the others are tuned additionally to prevent overfitting.
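
One way to search those ranges together is a randomized search; this sketch uses scikit-learn's RandomizedSearchCV (not something this post itself walks through) and assumes x_train/y_train are already prepared:

from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

# candidate values follow the rules of thumb above
param_dist = {
    'max_depth': list(range(5, 13)),
    'learning_rate': [0.05, 0.1, 0.2, 0.3],
    'min_child_weight': [1, 2, 4, 8, 16],
    'subsample': [0.6, 0.8, 1.0],
    'colsample_bytree': [0.6, 0.8, 1.0],
}
search = RandomizedSearchCV(
    XGBClassifier(n_estimators=200, n_jobs=-1, random_state=42),
    param_distributions=param_dist, n_iter=20, cv=3, scoring='f1', random_state=42)
search.fit(x_train, y_train)
print(search.best_params_, search.best_score_)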

 

https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters

 


 

 

Early Stopping

  • A way to stop training when performance has not improved for a given number of rounds, even before the specified n_estimators is reached.
  • Convenient because you do not have to re-adjust n_estimators every time you tune the other hyperparameters.
  • In the XGBoost library, set early_stopping_rounds to use this feature.
  • You must provide eval_set, the dataset used as the criterion for early stopping.
  • If several eval_sets are provided, the last dataset becomes the criterion.

 

model = XGBClassifier(objective='binary:logistic',
                      eval_metric='error',
                      n_estimators=3244224,  # intentionally huge; early stopping will cut training short
                      n_jobs=-1,
                      max_depth=7,
                      learning_rate=0.1,
                      scale_pos_weight=train[target].value_counts(normalize=True)[0] / train[target].value_counts(normalize=True)[1],
                      reg_lambda=1)

watchlist = [(x_train, y_train), (x_val, y_val)]

model.fit(x_train, y_train,
          eval_set=watchlist,
          early_stopping_rounds=50)  # stop if there is no improvement for 50 rounds
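
After early stopping fires, the fitted model remembers the best round; a short follow-up sketch (XGBoost's scikit-learn wrapper exposes best_iteration and best_score when early stopping is used):

print('stopped at boosting round :', model.best_iteration)
print('best validation error     :', model.best_score)

# predict() defaults to the trees up to the best iteration when early stopping was used
y_val_pred = model.predict(x_val)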

 

Kaggle - H1N1 flu vaccine response

 

from sklearn.model_selection import train_test_split

# split into train / validation sets at an 80/20 ratio
train, val = train_test_split(train, test_size=0.2, random_state=42, stratify=train[target])

# Feature Engineering
def engineer(df):
  # drop high cardinality columns 
  selected_cols = df.select_dtypes(include=["number", "object"])
  labels = selected_cols.nunique()  
  selected_features = labels[labels <= 30].index.tolist()  
  df = df[selected_features]

  # new feature
  behaviorals = [col for col in df.columns if 'behavioral' in col]
  df['behaviorals'] = df[behaviorals].sum(axis=1)

  # drop feature
  dels = [col for col in df.columns if ('employment' in col or 'seas' in col)]
  df.drop(columns=dels, inplace=True)

  return df


train = engineer(train.copy())
val = engineer(val.copy())
test = engineer(test.copy())
train.shape, val.shape, test.shape
>> ((33723, 32), (8431, 32), (28104, 31))

# separate features and label
x_train, y_train = train.drop(target, axis=1), train[target]
x_val, y_val = val.drop(target, axis=1), val[target]
x_test = test
from category_encoders import OrdinalEncoder
from sklearn.impute import SimpleImputer
from xgboost import XGBClassifier

encoder = OrdinalEncoder()
x_train = encoder.fit_transform(x_train)
x_val = encoder.transform(x_val)
x_test = encoder.transform(x_test)

model = XGBClassifier(
    objective='binary:logistic',
    eval_metric='error',
    n_estimators=100,
    random_state=42,
    n_jobs=-1,
    max_depth=7,
    learning_rate=0.1,
    scale_pos_weight=3,
    subsample=0.8,
    colsample_bytree=0.8,
    min_child_weight=16,  # minimum weight required in a leaf; when overfitting, grow it as 1, 2, 4, 8, ...
    reg_lambda=1)  # L2 regularization strength - the larger the value, the smaller the weights, which curbs overfitting

eval_set = [(x_train, y_train), (x_val, y_val)]

model.fit(x_train, y_train,
                    eval_set=eval_set,
                    early_stopping_rounds=50)


print('train score :', model.score(x_train, y_train))
print('val score :', model.score(x_val, y_val))
>> 
train score : 0.7967
val score : 0.7771

Compared with the random forest model, the chance of overfitting has gone down (the gap between the train and validation scores is smaller).

When submitted to Kaggle, the XGBClassifier's f1-score: 0.61997
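
The leaderboard score is computed on Kaggle's hidden test labels; as a quick sanity check, the same metric can be computed on the validation split (a sketch, reusing the fitted model and x_val/y_val from above):

from sklearn.metrics import f1_score

y_val_pred = model.predict(x_val)
print('validation f1 :', f1_score(y_val, y_val_pred))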

 

Next time, let's push the score higher with a different model!!

 

More study

Tree visualization

# import the visualization helper
import matplotlib.pyplot as plt
from xgboost import plot_tree

# plot one tree of the ensemble
plot_tree(model,        # fitted model
          num_trees=0,  # index of the tree to display
          rankdir='TB') # layout direction: 'LR' or 'TB'
plt.gcf().set_size_inches(13, 13);

It definitely does not come out as pretty as a single decision tree's plot..

 

 

LightGBM

import lightgbm as lgb

# configure the LightGBM model
model = lgb.LGBMClassifier(
    objective='binary',     # LightGBM's name for binary classification (XGBoost's 'binary:logistic')
    metric='binary_error',  # 1 - accuracy
    n_estimators=100,
    random_state=42,
    num_leaves=31,  # number of leaf nodes: the final regions that are not split any further
    learning_rate=0.1,
    max_depth=5,
    reg_lambda=1,
    scale_pos_weight=3,
    subsample=0.8,
    colsample_bytree=0.8,
    min_child_weight=16
)

watchlist = [(x_train, y_train), (x_val, y_val)]
# train the LightGBM model
model.fit(x_train, y_train, eval_set=watchlist, early_stopping_rounds=50)

The LightGBM model is definitely faster than XGB..!

But on the dataset I used, the XGB model performed better.
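
One rough way to check the speed claim on your own data is a wall-clock comparison (a sketch; the hyperparameters are only illustrative, and timings will vary by machine and dataset):

import time
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

for name, clf in [('XGBoost', XGBClassifier(n_estimators=100, max_depth=5, n_jobs=-1)),
                  ('LightGBM', LGBMClassifier(n_estimators=100, max_depth=5, n_jobs=-1))]:
    start = time.perf_counter()
    clf.fit(x_train, y_train)
    print(f'{name:8s} fit time: {time.perf_counter() - start:.2f}s  '
          f'val accuracy: {clf.score(x_val, y_val):.4f}')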

 

[Reference]

Python libraries that implement Gradient Boosting

  • scikit-learn Gradient Tree Boosting — can be relatively slow.
    • Anaconda: already installed
    • Google Colab: already installed
  • xgboost — handles missing values and can enforce monotonic constraints.
    • Anaconda, Mac/Linux: conda install -c conda-forge xgboost
    • Windows: conda install -c anaconda py-xgboost
    • Google Colab: already installed
  • LightGBM — handles missing values and can enforce monotonic constraints.
    • Anaconda: conda install -c conda-forge lightgbm
    • Google Colab: already installed
  • CatBoost — handles missing values and can use categorical features without preprocessing.
    • Anaconda: conda install -c conda-forge catboost
    • Google Colab: pip install catboost

 
