Demo¶
Library imports and configuration¶
%reload_ext autoreload
%autoreload 2
%matplotlib inline
import lightgbm as lgb
from matplotlib import pyplot as plt
from matplotlib import rcParams
import numpy as np
from pathlib import Path
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold
import seaborn as sns
import warnings
rcParams['figure.figsize'] = (16, 8)
plt.style.use('fivethirtyeight')
pd.set_option('display.max_columns', 100)
pd.set_option("display.precision", 4)
warnings.simplefilter('ignore')
Loading the training data¶
Uses the feature.csv feature file generated in 03-pandas-eda.ipynb
data_dir = Path('../data/dacon-dku')
feature_dir = Path('../build/feature')
val_dir = Path('../build/val')
tst_dir = Path('../build/tst')
sub_dir = Path('../build/sub')
trn_file = data_dir / 'train.csv'
tst_file = data_dir / 'test.csv'
sample_file = data_dir / 'sample_submission.csv'
target_col = 'class'
n_fold = 5
n_class = 3
seed = 42
algo_name = 'lgbcv'
feature_name = 'stacking1'
model_name = f'{algo_name}_{feature_name}'
feature_file = feature_dir / f'{feature_name}.csv'
p_val_file = val_dir / f'{model_name}.val.csv'
p_tst_file = tst_dir / f'{model_name}.tst.csv'
sub_file = sub_dir / f'{model_name}.csv'
Generating stacking features¶
model_names = ['lrcv_polyfeature', 'rfcv_feature', 'lgbcv_feature']
trn = []
tst = []
feature_names = []
for model in model_names:
    trn.append(np.loadtxt(val_dir / f'{model}.val.csv', delimiter=','))
    tst.append(np.loadtxt(tst_dir / f'{model}.tst.csv', delimiter=','))
    feature_names += [f'{model}_class0', f'{model}_class1', f'{model}_class2']
trn = np.hstack(trn)
tst = np.hstack(tst)
feature_names
['lrcv_polyfeature_class0',
'lrcv_polyfeature_class1',
'lrcv_polyfeature_class2',
'rfcv_feature_class0',
'rfcv_feature_class1',
'rfcv_feature_class2',
'lgbcv_feature_class0',
'lgbcv_feature_class1',
'lgbcv_feature_class2']
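Each base model contributes one probability column per class, so stacking three 3-class models yields a 9-column feature matrix. A minimal sketch of the same concatenation pattern, using random stand-in predictions instead of the saved CSV files:

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_class = 10, 3

# Stand-in out-of-fold probability matrices for three base models;
# Dirichlet draws make each row a valid probability distribution.
preds = [rng.dirichlet(np.ones(n_class), size=n_samples) for _ in range(3)]

# Horizontally stack them into one stacking-feature matrix.
stacked = np.hstack(preds)
print(stacked.shape)  # (10, 9): 3 models x 3 classes
```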
y = pd.read_csv(trn_file, index_col=0, usecols=['id', target_col]).values.flatten()
y.shape
(320000,)
Stratified K-Fold Cross Validation¶
cv = StratifiedKFold(n_splits=n_fold, shuffle=True, random_state=seed)
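StratifiedKFold keeps the class proportions of `y` nearly constant across folds, which matters for a multi-class target like this one. A small illustrative check (toy labels, not the competition data):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy labels with a 50/30/20 class split.
y_demo = np.array([0] * 50 + [1] * 30 + [2] * 20)
X_demo = np.zeros((len(y_demo), 1))

cv_demo = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for _, i_val in cv_demo.split(X_demo, y_demo):
    # Every validation fold mirrors the 50/30/20 ratio: [10  6  4]
    print(np.bincount(y_demo[i_val], minlength=3))
```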
Training the LightGBM model¶
p_val = np.zeros((trn.shape[0], n_class))
p_tst = np.zeros((tst.shape[0], n_class))
for i, (i_trn, i_val) in enumerate(cv.split(trn, y), 1):
    print(f'training model for CV #{i}')
    clf = lgb.LGBMClassifier(objective='multiclass',
                             n_estimators=1000,
                             num_leaves=64,
                             learning_rate=0.1,
                             min_child_samples=10,
                             subsample=.5,
                             subsample_freq=1,
                             colsample_bytree=.8,
                             random_state=seed,
                             n_jobs=-1)
    # Note: in LightGBM >= 4.0 the early_stopping_rounds keyword is gone;
    # pass callbacks=[lgb.early_stopping(10)] instead.
    clf.fit(trn[i_trn], y[i_trn],
            eval_set=[(trn[i_val], y[i_val])],
            eval_metric='multiclass',
            early_stopping_rounds=10)
    p_val[i_val, :] = clf.predict_proba(trn[i_val])
    p_tst += clf.predict_proba(tst) / n_fold
training model for CV #1
[1] valid_0's multi_logloss: 0.872561
Training until validation scores don't improve for 10 rounds
[2] valid_0's multi_logloss: 0.781259
[3] valid_0's multi_logloss: 0.705565
[4] valid_0's multi_logloss: 0.641647
[5] valid_0's multi_logloss: 0.586861
[6] valid_0's multi_logloss: 0.539686
[7] valid_0's multi_logloss: 0.498697
[8] valid_0's multi_logloss: 0.462928
[9] valid_0's multi_logloss: 0.431543
[10] valid_0's multi_logloss: 0.403925
[11] valid_0's multi_logloss: 0.379427
[12] valid_0's multi_logloss: 0.357745
[13] valid_0's multi_logloss: 0.33846
[14] valid_0's multi_logloss: 0.321341
[15] valid_0's multi_logloss: 0.306008
[16] valid_0's multi_logloss: 0.292285
[17] valid_0's multi_logloss: 0.280025
[18] valid_0's multi_logloss: 0.269093
[19] valid_0's multi_logloss: 0.259231
[20] valid_0's multi_logloss: 0.250454
[21] valid_0's multi_logloss: 0.242495
[22] valid_0's multi_logloss: 0.235348
[23] valid_0's multi_logloss: 0.228915
[24] valid_0's multi_logloss: 0.223113
[25] valid_0's multi_logloss: 0.217877
[26] valid_0's multi_logloss: 0.21314
[27] valid_0's multi_logloss: 0.208835
[28] valid_0's multi_logloss: 0.204941
[29] valid_0's multi_logloss: 0.201432
[30] valid_0's multi_logloss: 0.198208
[31] valid_0's multi_logloss: 0.19532
[32] valid_0's multi_logloss: 0.192669
[33] valid_0's multi_logloss: 0.190273
[34] valid_0's multi_logloss: 0.188068
[35] valid_0's multi_logloss: 0.186113
[36] valid_0's multi_logloss: 0.184264
[37] valid_0's multi_logloss: 0.182672
[38] valid_0's multi_logloss: 0.181179
[39] valid_0's multi_logloss: 0.179791
[40] valid_0's multi_logloss: 0.178508
[41] valid_0's multi_logloss: 0.177326
[42] valid_0's multi_logloss: 0.176257
[43] valid_0's multi_logloss: 0.175299
[44] valid_0's multi_logloss: 0.174391
[45] valid_0's multi_logloss: 0.173569
[46] valid_0's multi_logloss: 0.172829
[47] valid_0's multi_logloss: 0.172107
[48] valid_0's multi_logloss: 0.171481
[49] valid_0's multi_logloss: 0.170876
[50] valid_0's multi_logloss: 0.170352
[51] valid_0's multi_logloss: 0.169814
[52] valid_0's multi_logloss: 0.169338
[53] valid_0's multi_logloss: 0.168891
[54] valid_0's multi_logloss: 0.168508
[55] valid_0's multi_logloss: 0.168146
[56] valid_0's multi_logloss: 0.167838
[57] valid_0's multi_logloss: 0.167517
[58] valid_0's multi_logloss: 0.167205
[59] valid_0's multi_logloss: 0.16696
[60] valid_0's multi_logloss: 0.166668
[61] valid_0's multi_logloss: 0.166455
[62] valid_0's multi_logloss: 0.166238
[63] valid_0's multi_logloss: 0.166082
[64] valid_0's multi_logloss: 0.16592
[65] valid_0's multi_logloss: 0.165746
[66] valid_0's multi_logloss: 0.165551
[67] valid_0's multi_logloss: 0.165416
[68] valid_0's multi_logloss: 0.165258
[69] valid_0's multi_logloss: 0.165123
[70] valid_0's multi_logloss: 0.165009
[71] valid_0's multi_logloss: 0.164925
[72] valid_0's multi_logloss: 0.164826
[73] valid_0's multi_logloss: 0.164745
[74] valid_0's multi_logloss: 0.164649
[75] valid_0's multi_logloss: 0.164562
[76] valid_0's multi_logloss: 0.164544
[77] valid_0's multi_logloss: 0.164477
[78] valid_0's multi_logloss: 0.164384
[79] valid_0's multi_logloss: 0.16433
[80] valid_0's multi_logloss: 0.164247
[81] valid_0's multi_logloss: 0.164178
[82] valid_0's multi_logloss: 0.16412
[83] valid_0's multi_logloss: 0.164029
[84] valid_0's multi_logloss: 0.16399
[85] valid_0's multi_logloss: 0.163989
[86] valid_0's multi_logloss: 0.163934
[87] valid_0's multi_logloss: 0.163912
[88] valid_0's multi_logloss: 0.163863
[89] valid_0's multi_logloss: 0.163824
[90] valid_0's multi_logloss: 0.163786
[91] valid_0's multi_logloss: 0.163764
[92] valid_0's multi_logloss: 0.163741
[93] valid_0's multi_logloss: 0.163714
[94] valid_0's multi_logloss: 0.163693
[95] valid_0's multi_logloss: 0.163672
[96] valid_0's multi_logloss: 0.163659
[97] valid_0's multi_logloss: 0.163627
[98] valid_0's multi_logloss: 0.163612
[99] valid_0's multi_logloss: 0.163636
[100] valid_0's multi_logloss: 0.16364
[101] valid_0's multi_logloss: 0.163627
[102] valid_0's multi_logloss: 0.163662
[103] valid_0's multi_logloss: 0.163649
[104] valid_0's multi_logloss: 0.163668
[105] valid_0's multi_logloss: 0.163675
[106] valid_0's multi_logloss: 0.163678
[107] valid_0's multi_logloss: 0.163689
[108] valid_0's multi_logloss: 0.163684
Early stopping, best iteration is:
[98] valid_0's multi_logloss: 0.163612
training model for CV #2
[1] valid_0's multi_logloss: 0.87104
Training until validation scores don't improve for 10 rounds
[2] valid_0's multi_logloss: 0.779017
[3] valid_0's multi_logloss: 0.702984
[4] valid_0's multi_logloss: 0.638549
[5] valid_0's multi_logloss: 0.583572
[6] valid_0's multi_logloss: 0.536297
[7] valid_0's multi_logloss: 0.495139
[8] valid_0's multi_logloss: 0.459251
[9] valid_0's multi_logloss: 0.427733
[10] valid_0's multi_logloss: 0.399993
[11] valid_0's multi_logloss: 0.375475
[12] valid_0's multi_logloss: 0.353711
[13] valid_0's multi_logloss: 0.334478
[14] valid_0's multi_logloss: 0.317348
[15] valid_0's multi_logloss: 0.302085
[16] valid_0's multi_logloss: 0.288467
[17] valid_0's multi_logloss: 0.276335
[18] valid_0's multi_logloss: 0.265425
[19] valid_0's multi_logloss: 0.255703
[20] valid_0's multi_logloss: 0.246887
[21] valid_0's multi_logloss: 0.238999
[22] valid_0's multi_logloss: 0.231934
[23] valid_0's multi_logloss: 0.225559
[24] valid_0's multi_logloss: 0.219843
[25] valid_0's multi_logloss: 0.214677
[26] valid_0's multi_logloss: 0.210019
[27] valid_0's multi_logloss: 0.205789
[28] valid_0's multi_logloss: 0.201973
[29] valid_0's multi_logloss: 0.198476
[30] valid_0's multi_logloss: 0.195345
[31] valid_0's multi_logloss: 0.192493
[32] valid_0's multi_logloss: 0.189909
[33] valid_0's multi_logloss: 0.187561
[34] valid_0's multi_logloss: 0.185438
[35] valid_0's multi_logloss: 0.18355
[36] valid_0's multi_logloss: 0.181804
[37] valid_0's multi_logloss: 0.180169
[38] valid_0's multi_logloss: 0.17873
[39] valid_0's multi_logloss: 0.177354
[40] valid_0's multi_logloss: 0.176112
[41] valid_0's multi_logloss: 0.174967
[42] valid_0's multi_logloss: 0.173916
[43] valid_0's multi_logloss: 0.172939
[44] valid_0's multi_logloss: 0.172071
[45] valid_0's multi_logloss: 0.171285
[46] valid_0's multi_logloss: 0.170578
[47] valid_0's multi_logloss: 0.169946
[48] valid_0's multi_logloss: 0.169335
[49] valid_0's multi_logloss: 0.168798
[50] valid_0's multi_logloss: 0.168302
[51] valid_0's multi_logloss: 0.167796
[52] valid_0's multi_logloss: 0.167393
[53] valid_0's multi_logloss: 0.166924
[54] valid_0's multi_logloss: 0.166532
[55] valid_0's multi_logloss: 0.166144
[56] valid_0's multi_logloss: 0.165817
[57] valid_0's multi_logloss: 0.165534
[58] valid_0's multi_logloss: 0.165274
[59] valid_0's multi_logloss: 0.165049
[60] valid_0's multi_logloss: 0.164807
[61] valid_0's multi_logloss: 0.164894
[62] valid_0's multi_logloss: 0.164639
[63] valid_0's multi_logloss: 0.164447
[64] valid_0's multi_logloss: 0.164277
[65] valid_0's multi_logloss: 0.16499
[66] valid_0's multi_logloss: 0.163987
[67] valid_0's multi_logloss: 0.163847
[68] valid_0's multi_logloss: 0.166748
[69] valid_0's multi_logloss: 0.163951
[70] valid_0's multi_logloss: 0.165424
[71] valid_0's multi_logloss: 0.164133
[72] valid_0's multi_logloss: 0.170719
[73] valid_0's multi_logloss: 0.164874
[74] valid_0's multi_logloss: 0.167003
[75] valid_0's multi_logloss: 0.163835
[76] valid_0's multi_logloss: 0.163779
[77] valid_0's multi_logloss: 0.169687
[78] valid_0's multi_logloss: 0.16437
[79] valid_0's multi_logloss: 0.167214
[80] valid_0's multi_logloss: 0.165725
[81] valid_0's multi_logloss: 0.166933
[82] valid_0's multi_logloss: 0.165532
[83] valid_0's multi_logloss: 0.16543
[84] valid_0's multi_logloss: 0.169679
[85] valid_0's multi_logloss: 0.165332
[86] valid_0's multi_logloss: 0.165313
Early stopping, best iteration is:
[76] valid_0's multi_logloss: 0.163779
training model for CV #3
[1] valid_0's multi_logloss: 0.8728
Training until validation scores don't improve for 10 rounds
[2] valid_0's multi_logloss: 0.78198
[3] valid_0's multi_logloss: 0.706436
[4] valid_0's multi_logloss: 0.642692
[5] valid_0's multi_logloss: 0.588198
[6] valid_0's multi_logloss: 0.541148
[7] valid_0's multi_logloss: 0.500206
[8] valid_0's multi_logloss: 0.464501
[9] valid_0's multi_logloss: 0.433164
[10] valid_0's multi_logloss: 0.405452
[11] valid_0's multi_logloss: 0.380965
[12] valid_0's multi_logloss: 0.359351
[13] valid_0's multi_logloss: 0.340064
[14] valid_0's multi_logloss: 0.322961
[15] valid_0's multi_logloss: 0.30769
[16] valid_0's multi_logloss: 0.294058
[17] valid_0's multi_logloss: 0.281866
[18] valid_0's multi_logloss: 0.270968
[19] valid_0's multi_logloss: 0.261074
[20] valid_0's multi_logloss: 0.252274
[21] valid_0's multi_logloss: 0.244306
[22] valid_0's multi_logloss: 0.237134
[23] valid_0's multi_logloss: 0.230648
[24] valid_0's multi_logloss: 0.224885
[25] valid_0's multi_logloss: 0.219671
[26] valid_0's multi_logloss: 0.214968
[27] valid_0's multi_logloss: 0.210653
[28] valid_0's multi_logloss: 0.206748
[29] valid_0's multi_logloss: 0.203192
[30] valid_0's multi_logloss: 0.199978
[31] valid_0's multi_logloss: 0.19708
[32] valid_0's multi_logloss: 0.194418
[33] valid_0's multi_logloss: 0.191981
[34] valid_0's multi_logloss: 0.1898
[35] valid_0's multi_logloss: 0.187818
[36] valid_0's multi_logloss: 0.186013
[37] valid_0's multi_logloss: 0.184354
[38] valid_0's multi_logloss: 0.182853
[39] valid_0's multi_logloss: 0.181494
[40] valid_0's multi_logloss: 0.18026
[41] valid_0's multi_logloss: 0.179135
[42] valid_0's multi_logloss: 0.178116
[43] valid_0's multi_logloss: 0.17709
[44] valid_0's multi_logloss: 0.176216
[45] valid_0's multi_logloss: 0.175425
[46] valid_0's multi_logloss: 0.174681
[47] valid_0's multi_logloss: 0.173971
[48] valid_0's multi_logloss: 0.173332
[49] valid_0's multi_logloss: 0.172694
[50] valid_0's multi_logloss: 0.172174
[51] valid_0's multi_logloss: 0.171638
[52] valid_0's multi_logloss: 0.171183
[53] valid_0's multi_logloss: 0.170741
[54] valid_0's multi_logloss: 0.170368
[55] valid_0's multi_logloss: 0.169994
[56] valid_0's multi_logloss: 0.169661
[57] valid_0's multi_logloss: 0.169316
[58] valid_0's multi_logloss: 0.169031
[59] valid_0's multi_logloss: 0.168771
[60] valid_0's multi_logloss: 0.168526
[61] valid_0's multi_logloss: 0.168334
[62] valid_0's multi_logloss: 0.168221
[63] valid_0's multi_logloss: 0.168015
[64] valid_0's multi_logloss: 0.167923
[65] valid_0's multi_logloss: 0.167733
[66] valid_0's multi_logloss: 0.167573
[67] valid_0's multi_logloss: 0.167399
[68] valid_0's multi_logloss: 0.167256
[69] valid_0's multi_logloss: 0.167135
[70] valid_0's multi_logloss: 0.166992
[71] valid_0's multi_logloss: 0.166866
[72] valid_0's multi_logloss: 0.16677
[73] valid_0's multi_logloss: 0.166671
[74] valid_0's multi_logloss: 0.166616
[75] valid_0's multi_logloss: 0.166545
[76] valid_0's multi_logloss: 0.166461
[77] valid_0's multi_logloss: 0.166428
[78] valid_0's multi_logloss: 0.166358
[79] valid_0's multi_logloss: 0.166254
[80] valid_0's multi_logloss: 0.166183
[81] valid_0's multi_logloss: 0.166151
[82] valid_0's multi_logloss: 0.166108
[83] valid_0's multi_logloss: 0.166125
[84] valid_0's multi_logloss: 0.166065
[85] valid_0's multi_logloss: 0.166014
[86] valid_0's multi_logloss: 0.166006
[87] valid_0's multi_logloss: 0.165961
[88] valid_0's multi_logloss: 0.165919
[89] valid_0's multi_logloss: 0.165903
[90] valid_0's multi_logloss: 0.165874
[91] valid_0's multi_logloss: 0.165838
[92] valid_0's multi_logloss: 0.165817
[93] valid_0's multi_logloss: 0.165794
[94] valid_0's multi_logloss: 0.165784
[95] valid_0's multi_logloss: 0.165881
[96] valid_0's multi_logloss: 0.165825
[97] valid_0's multi_logloss: 0.165803
[98] valid_0's multi_logloss: 0.165792
[99] valid_0's multi_logloss: 0.165782
[100] valid_0's multi_logloss: 0.165782
[101] valid_0's multi_logloss: 0.165764
[102] valid_0's multi_logloss: 0.165778
[103] valid_0's multi_logloss: 0.16577
[104] valid_0's multi_logloss: 0.165788
[105] valid_0's multi_logloss: 0.16578
[106] valid_0's multi_logloss: 0.165779
[107] valid_0's multi_logloss: 0.165797
[108] valid_0's multi_logloss: 0.165828
[109] valid_0's multi_logloss: 0.165822
[110] valid_0's multi_logloss: 0.165841
[111] valid_0's multi_logloss: 0.16584
Early stopping, best iteration is:
[101] valid_0's multi_logloss: 0.165764
training model for CV #4
[1] valid_0's multi_logloss: 0.871694
Training until validation scores don't improve for 10 rounds
[2] valid_0's multi_logloss: 0.779868
[3] valid_0's multi_logloss: 0.703921
[4] valid_0's multi_logloss: 0.639731
[5] valid_0's multi_logloss: 0.584873
[6] valid_0's multi_logloss: 0.53759
[7] valid_0's multi_logloss: 0.496525
[8] valid_0's multi_logloss: 0.460625
[9] valid_0's multi_logloss: 0.42907
[10] valid_0's multi_logloss: 0.401367
[11] valid_0's multi_logloss: 0.376949
[12] valid_0's multi_logloss: 0.355235
[13] valid_0's multi_logloss: 0.336027
[14] valid_0's multi_logloss: 0.318942
[15] valid_0's multi_logloss: 0.303773
[16] valid_0's multi_logloss: 0.290198
[17] valid_0's multi_logloss: 0.278005
[18] valid_0's multi_logloss: 0.26714
[19] valid_0's multi_logloss: 0.257386
[20] valid_0's multi_logloss: 0.248613
[21] valid_0's multi_logloss: 0.240744
[22] valid_0's multi_logloss: 0.233663
[23] valid_0's multi_logloss: 0.227289
[24] valid_0's multi_logloss: 0.221605
[25] valid_0's multi_logloss: 0.216433
[26] valid_0's multi_logloss: 0.211709
[27] valid_0's multi_logloss: 0.207478
[28] valid_0's multi_logloss: 0.203661
[29] valid_0's multi_logloss: 0.200235
[30] valid_0's multi_logloss: 0.197127
[31] valid_0's multi_logloss: 0.194295
[32] valid_0's multi_logloss: 0.191721
[33] valid_0's multi_logloss: 0.189326
[34] valid_0's multi_logloss: 0.1872
[35] valid_0's multi_logloss: 0.185294
[36] valid_0's multi_logloss: 0.183495
[37] valid_0's multi_logloss: 0.181857
[38] valid_0's multi_logloss: 0.180362
[39] valid_0's multi_logloss: 0.17896
[40] valid_0's multi_logloss: 0.177727
[41] valid_0's multi_logloss: 0.176646
[42] valid_0's multi_logloss: 0.175597
[43] valid_0's multi_logloss: 0.174636
[44] valid_0's multi_logloss: 0.173761
[45] valid_0's multi_logloss: 0.172954
[46] valid_0's multi_logloss: 0.172263
[47] valid_0's multi_logloss: 0.171648
[48] valid_0's multi_logloss: 0.171068
[49] valid_0's multi_logloss: 0.170448
[50] valid_0's multi_logloss: 0.169945
[51] valid_0's multi_logloss: 0.169456
[52] valid_0's multi_logloss: 0.16904
[53] valid_0's multi_logloss: 0.168621
[54] valid_0's multi_logloss: 0.168233
[55] valid_0's multi_logloss: 0.167896
[56] valid_0's multi_logloss: 0.167586
[57] valid_0's multi_logloss: 0.167323
[58] valid_0's multi_logloss: 0.167104
[59] valid_0's multi_logloss: 0.166887
[60] valid_0's multi_logloss: 0.166663
[61] valid_0's multi_logloss: 0.166436
[62] valid_0's multi_logloss: 0.166233
[63] valid_0's multi_logloss: 0.166042
[64] valid_0's multi_logloss: 0.165909
[65] valid_0's multi_logloss: 0.165741
[66] valid_0's multi_logloss: 0.165598
[67] valid_0's multi_logloss: 0.165444
[68] valid_0's multi_logloss: 0.165349
[69] valid_0's multi_logloss: 0.16587
[70] valid_0's multi_logloss: 0.165065
[71] valid_0's multi_logloss: 0.166954
[72] valid_0's multi_logloss: 0.166051
[73] valid_0's multi_logloss: 0.165968
[74] valid_0's multi_logloss: 0.165892
[75] valid_0's multi_logloss: 0.165815
[76] valid_0's multi_logloss: 0.165763
[77] valid_0's multi_logloss: 0.164637
[78] valid_0's multi_logloss: 0.164582
[79] valid_0's multi_logloss: 0.164509
[80] valid_0's multi_logloss: 0.164451
[81] valid_0's multi_logloss: 0.164381
[82] valid_0's multi_logloss: 0.164346
[83] valid_0's multi_logloss: 0.16433
[84] valid_0's multi_logloss: 0.164325
[85] valid_0's multi_logloss: 0.165469
[86] valid_0's multi_logloss: 0.164241
[87] valid_0's multi_logloss: 0.169313
[88] valid_0's multi_logloss: 0.16509
[89] valid_0's multi_logloss: 0.164636
[90] valid_0's multi_logloss: 0.164581
[91] valid_0's multi_logloss: 0.16457
[92] valid_0's multi_logloss: 0.168354
[93] valid_0's multi_logloss: 0.168279
[94] valid_0's multi_logloss: 0.170919
[95] valid_0's multi_logloss: 0.169824
[96] valid_0's multi_logloss: 0.175127
Early stopping, best iteration is:
[86] valid_0's multi_logloss: 0.164241
training model for CV #5
[1] valid_0's multi_logloss: 0.871917
Training until validation scores don't improve for 10 rounds
[2] valid_0's multi_logloss: 0.780226
[3] valid_0's multi_logloss: 0.704197
[4] valid_0's multi_logloss: 0.639978
[5] valid_0's multi_logloss: 0.585046
[6] valid_0's multi_logloss: 0.537799
[7] valid_0's multi_logloss: 0.496669
[8] valid_0's multi_logloss: 0.460732
[9] valid_0's multi_logloss: 0.429252
[10] valid_0's multi_logloss: 0.401499
[11] valid_0's multi_logloss: 0.376988
[12] valid_0's multi_logloss: 0.355251
[13] valid_0's multi_logloss: 0.335915
[14] valid_0's multi_logloss: 0.318731
[15] valid_0's multi_logloss: 0.303479
[16] valid_0's multi_logloss: 0.289859
[17] valid_0's multi_logloss: 0.27766
[18] valid_0's multi_logloss: 0.266741
[19] valid_0's multi_logloss: 0.256899
[20] valid_0's multi_logloss: 0.248123
[21] valid_0's multi_logloss: 0.240227
[22] valid_0's multi_logloss: 0.233112
[23] valid_0's multi_logloss: 0.22671
[24] valid_0's multi_logloss: 0.220957
[25] valid_0's multi_logloss: 0.215741
[26] valid_0's multi_logloss: 0.210974
[27] valid_0's multi_logloss: 0.206716
[28] valid_0's multi_logloss: 0.202885
[29] valid_0's multi_logloss: 0.199392
[30] valid_0's multi_logloss: 0.196203
[31] valid_0's multi_logloss: 0.193356
[32] valid_0's multi_logloss: 0.190782
[33] valid_0's multi_logloss: 0.188429
[34] valid_0's multi_logloss: 0.186254
[35] valid_0's multi_logloss: 0.184275
[36] valid_0's multi_logloss: 0.182449
[37] valid_0's multi_logloss: 0.180789
[38] valid_0's multi_logloss: 0.179286
[39] valid_0's multi_logloss: 0.177917
[40] valid_0's multi_logloss: 0.176665
[41] valid_0's multi_logloss: 0.175579
[42] valid_0's multi_logloss: 0.174546
[43] valid_0's multi_logloss: 0.173632
[44] valid_0's multi_logloss: 0.172763
[45] valid_0's multi_logloss: 0.171961
[46] valid_0's multi_logloss: 0.17123
[47] valid_0's multi_logloss: 0.170586
[48] valid_0's multi_logloss: 0.169966
[49] valid_0's multi_logloss: 0.169434
[50] valid_0's multi_logloss: 0.168887
[51] valid_0's multi_logloss: 0.168419
[52] valid_0's multi_logloss: 0.167972
[53] valid_0's multi_logloss: 0.167581
[54] valid_0's multi_logloss: 0.167191
[55] valid_0's multi_logloss: 0.166874
[56] valid_0's multi_logloss: 0.166523
[57] valid_0's multi_logloss: 0.166254
[58] valid_0's multi_logloss: 0.165973
[59] valid_0's multi_logloss: 0.165689
[60] valid_0's multi_logloss: 0.165602
[61] valid_0's multi_logloss: 0.165345
[62] valid_0's multi_logloss: 0.165138
[63] valid_0's multi_logloss: 0.16491
[64] valid_0's multi_logloss: 0.164738
[65] valid_0's multi_logloss: 0.164543
[66] valid_0's multi_logloss: 0.164355
[67] valid_0's multi_logloss: 0.164264
[68] valid_0's multi_logloss: 0.164142
[69] valid_0's multi_logloss: 0.164023
[70] valid_0's multi_logloss: 0.163922
[71] valid_0's multi_logloss: 0.16384
[72] valid_0's multi_logloss: 0.163753
[73] valid_0's multi_logloss: 0.163659
[74] valid_0's multi_logloss: 0.163542
[75] valid_0's multi_logloss: 0.163467
[76] valid_0's multi_logloss: 0.163388
[77] valid_0's multi_logloss: 0.163334
[78] valid_0's multi_logloss: 0.163245
[79] valid_0's multi_logloss: 0.163165
[80] valid_0's multi_logloss: 0.163083
[81] valid_0's multi_logloss: 0.163035
[82] valid_0's multi_logloss: 0.16299
[83] valid_0's multi_logloss: 0.162953
[84] valid_0's multi_logloss: 0.1629
[85] valid_0's multi_logloss: 0.162846
[86] valid_0's multi_logloss: 0.163054
[87] valid_0's multi_logloss: 0.163007
[88] valid_0's multi_logloss: 0.162984
[89] valid_0's multi_logloss: 0.162932
[90] valid_0's multi_logloss: 0.169905
[91] valid_0's multi_logloss: 0.166641
[92] valid_0's multi_logloss: 0.166626
[93] valid_0's multi_logloss: 0.166576
[94] valid_0's multi_logloss: 0.166539
[95] valid_0's multi_logloss: 0.166526
Early stopping, best iteration is:
[85] valid_0's multi_logloss: 0.162846
print(f'{accuracy_score(y, np.argmax(p_val, axis=1)) * 100:.4f}%')
93.1559%
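The OOF probabilities are turned into hard labels with np.argmax along each row before scoring. A toy example of the same pattern (made-up probabilities, not the notebook's predictions):

```python
import numpy as np
from sklearn.metrics import accuracy_score

p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4],
              [0.5, 0.4, 0.1]])
y_true = np.array([0, 1, 2, 1])

# argmax over axis=1 picks the highest-probability class per row.
y_pred = np.argmax(p, axis=1)  # -> [0, 1, 2, 0]
print(f'{accuracy_score(y_true, y_pred) * 100:.4f}%')  # 75.0000%
```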
print(p_val.shape, p_tst.shape)
(320000, 3) (80000, 3)
np.savetxt(p_val_file, p_val, fmt='%.6f', delimiter=',')
np.savetxt(p_tst_file, p_tst, fmt='%.6f', delimiter=',')
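The saved CSVs can later be reloaded with np.loadtxt for a further stacking level, exactly as this notebook loaded its base-model predictions. A quick round-trip sketch using a temporary file and stand-in data:

```python
import numpy as np
from tempfile import NamedTemporaryFile

p_demo = np.random.default_rng(0).random((5, 3))

with NamedTemporaryFile(suffix='.csv', delete=False) as f:
    path = f.name
np.savetxt(path, p_demo, fmt='%.6f', delimiter=',')

loaded = np.loadtxt(path, delimiter=',')
# fmt='%.6f' keeps only 6 decimals, so agreement is approximate.
print(np.allclose(p_demo, loaded, atol=1e-6))  # True
```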
Visualizing feature importance¶
Unlike linear models, LGBMClassifier has no coef_ attribute; tree ensembles expose feature_importances_ instead.
clf.coef_.shape
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-15-742ca016ff99> in <module>
----> 1 clf.coef_.shape
AttributeError: 'LGBMClassifier' object has no attribute 'coef_'
imp = pd.DataFrame({'feature': feature_names, 'importance': clf.feature_importances_})
imp = imp.sort_values('importance').set_index('feature')
imp.plot(kind='barh')
Creating the submission file¶
sub = pd.read_csv(sample_file, index_col=0)
print(sub.shape)
sub.head()
sub[target_col] = np.argmax(p_tst, axis=1)
sub.head()
sub[target_col].value_counts()
sub.to_csv(sub_file)