[DACON] East-West Power Solar Power Generation Forecasting AI Competition
Posted 2022. 9. 14. 16:25 (written 220914)
<This post was written while studying the baseline for the DACON competition, referencing haenara-shin's GitHub :-) >
GitHub - haenara-shin/DACON: DACON competition code repos.
https://dacon.io/competitions/official/235720/codeshare/2512?page=1&dtype=recent
[Public LB-9.4531] Baseline Code / LightGBM (revised)
1. Competition overview
Solar power generation is affected by daily weather conditions and by season-dependent irradiance.
If it could be forecast, smoother electricity supply planning would be possible.
The goal: build an AI-based model that predicts solar power generation.
- Input dataset: 7 days (Day 0 – Day 6) of data
- Prediction (target): generation at 30-minute intervals over the following 2 days (Day 7 – Day 8) — 48 steps per day, 96 timesteps in total
2. Data description
- dataset
- train.csv : 3 years (Day 0 – Day 1094) of weather data plus generation (target) data
- test folder files : data for inference
- 81 files sampled from 2 years of weather and generation (target) data; the file order is random and unrelated to time-series order
- each file provides all or part of 7 days of data as input, from which the next 2 days' generation must be predicted at 30-minute intervals (48 per day, 96 timesteps)
- submission.csv : submission template
- for every file in the test folder, submit 9 quantile predictions (0.1, 0.2, …, 0.9) of generation per time slot, keyed as 'filename_date_time'
- Loss
- Pinball loss
- at quantiles above 0.5, under-forecasting is penalized heavily
- i.e., at high quantiles the observed value should fall below the prediction, nudging the model toward over-forecasting
- at quantiles below 0.5, over-forecasting is penalized heavily
- i.e., at low quantiles the observed value should fall above the prediction, nudging the model toward under-forecasting
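A minimal numpy sketch of this asymmetry (the numbers are illustrative only, not from the dataset):

```python
import numpy as np

def pinball(q, y, f):
    """Pinball loss for quantile q: max(q*(y-f), (q-1)*(y-f))."""
    err = y - f
    return np.maximum(q * err, (q - 1) * err)

y = 10.0                      # observed generation
print(pinball(0.9, y, 8.0))   # under-forecast at q=0.9: 0.9*2 = 1.8 (heavy)
print(pinball(0.9, y, 12.0))  # over-forecast at q=0.9:  0.1*2 = 0.2 (light)
print(pinball(0.1, y, 12.0))  # over-forecast at q=0.1:  0.9*2 = 1.8 (heavy)
```

Missing the target on the "wrong side" of a quantile costs roughly q/(1-q) times more, which is what pushes the 0.9-quantile prediction above the median and the 0.1-quantile prediction below it.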
- QR
- percentile : the *value* sitting at a given position when data with magnitudes are sorted in order, indexed from the smallest (0) to the largest (100)
- percentile rank : the *position* a particular value occupies within the whole sample
- quantile : the same idea on a 0–1 scale (e.g., quartiles split at steps of 0.25)
- providing a range of predictions yields a more 'stable or believable' forecast band than a single point
- predicting intervals with QR (Quantile Regression) is therefore more 'believable or stable'
- why QR is more meaningful than OLS (ordinary least squares, which provides only a mean estimate)
- QR can model the entire conditional distribution of the target variable (OLS gives only the conditional mean)
- QR makes no assumption about the target distribution, so it is more robust to mis-specification of the error distribution
- QR is less sensitive to outliers
- QR behaves conveniently under monotonic transformations (such as log)
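A quick way to see the link between the pinball loss and quantiles: over constant predictions, the average pinball loss at level q is minimized near the empirical q-quantile. A small numpy sketch on synthetic, skewed data (not the competition's):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.exponential(scale=5.0, size=10_000)  # skewed, generation-like sample

def avg_pinball(q, y, f):
    err = y - f
    return np.mean(np.maximum(q * err, (q - 1) * err))

# scan constant predictions; the minimizer should land near the empirical q-quantile
candidates = np.linspace(0.0, 30.0, 3001)
best = {}
for q in (0.1, 0.5, 0.9):
    losses = [avg_pinball(q, y, c) for c in candidates]
    best[q] = candidates[int(np.argmin(losses))]
    print(q, round(best[q], 2), round(float(np.quantile(y, q)), 2))
```

This is exactly why training one model per quantile with the pinball loss, as done below, yields quantile predictions rather than mean predictions.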
3. Implementation
1️⃣ Package load
import pandas as pd
# import cupy as cp  # optional GPU array library (unused here)
import numpy as np
import os
import random
import math
from scipy.optimize import curve_fit # Use non-linear least squares to fit a function - for the zenith angle calculation
import warnings
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
warnings.filterwarnings("ignore")
from sklearn.model_selection import train_test_split
import tensorflow as tf
import tensorflow_addons as tfa
train = pd.read_csv('data/train/train.csv')
print(train.shape)
train.head()
- Day : day index
- Hour : hour
- Minute : minute
- DHI : Diffuse Horizontal Irradiance (W/m2)
- DNI : Direct Normal Irradiance (W/m2)
- WS : Wind Speed (m/s)
- RH : Relative Humidity (%)
- T : Temperature (Degree C)
- TARGET : solar power generation (kW)
train.describe()
train.info()
- plot feature data distribution
## plot feature data distribution
fig, ax = plt.subplots(2, train.shape[1]//2+1, figsize=(20, 6))
for idx in range(train.shape[1]):
    if idx < train.shape[1]//2 + 1:
        ax[0, idx].hist(train.iloc[:, idx], bins=10, alpha=0.5)
        ax[0, idx].set_title(train.columns[idx])
    else:
        ax[1, idx-train.shape[1]//2-1].hist(train.iloc[:, idx], bins=10, alpha=0.5)
        ax[1, idx-train.shape[1]//2-1].set_title(train.columns[idx])
plt.show()
- correlation of each feature (x) with the target
fig, axes = plt.subplots(2, 3, figsize=(10,7))
train.plot(x='Hour', y='TARGET', kind='scatter', alpha=0.1, ax=axes[0,0])
train.plot(x='DHI', y='TARGET', kind='scatter', alpha=0.1, ax=axes[0,1])
train.plot(x='DNI', y='TARGET', kind='scatter', alpha=0.1, ax=axes[0,2])
train.plot(x='WS', y='TARGET', kind='scatter', alpha=0.1, ax=axes[1,0])
train.plot(x='RH', y='TARGET', kind='scatter', alpha=0.1, ax=axes[1,1])
train.plot(x='T', y='TARGET', kind='scatter', alpha=0.1, ax=axes[1,2])
fig.tight_layout()
2️⃣ Data preprocessing (feature engineering)
- Merge Hour and Minute into a single float column, drop the Day column, then express time continuously with trigonometric (sin, cos) functions
- Mean of the TARGET over the most recent 3 and 5 days
- Derive the dew point from temperature and relative humidity
- rather than feeding temperature and relative humidity in directly, it is more reasonable to derive the dew point (the condensation temperature) as a new feature
- cf. the effect of dust, humidity, and solar irradiation on photovoltaic cell efficiency
- Extract sunrise/sunset times
- estimated from DHI > 0
- Derive the zenith angle via a per-day quadratic fit that reflects the annual and daily seasonality of sunrise/sunset times
- Compute GHI from the zenith angle together with DNI and DHI
- domestic PV panels are installed at a fixed tilt, so GHI (Global Horizontal Irradiance) is needed
- cf. selection of standard weather data suitable for solar energy systems and analysis of irradiance data
- GHI = DHI + (DNI × cos θ_zenith)
- cf. uncertainty of PV generation forecasts from standard weather-year data under different direct/diffuse irradiance separation models
- (solar zenith angle) + (solar altitude angle) = 90 degrees
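The GHI decomposition above can be sanity-checked numerically. All inputs below are made-up, illustrative values, not rows from the dataset:

```python
import math

def ghi(dhi, dni, zenith_deg):
    """GHI = DHI + DNI * cos(zenith), with the zenith angle in degrees."""
    return dhi + dni * math.cos(math.radians(zenith_deg))

# midday: high sun (zenith 30 deg, i.e., altitude 60 deg) -> strong direct contribution
print(ghi(dhi=120.0, dni=800.0, zenith_deg=30.0))  # 120 + 800*cos(30 deg) ~ 812.8
# near sunset (zenith ~ 89 deg) the direct beam contributes almost nothing
print(ghi(dhi=40.0, dni=200.0, zenith_deg=89.0))
```

The cosine factor projects the direct-normal beam onto the horizontal plane, which is why the zenith-angle approximation below matters so much for the GHI feature.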
# derive the zenith angle via a per-day quadratic fit
def obj_curve(x, a, b, c):
    return a*(x-b)**2 + c

def preprocess_data(data, is_train=True):
    temp = data.copy()
    ## merge Hour and Minute into one float column before the cyclical encoding
    temp.Hour = temp.Hour + temp.Minute/60
    temp.drop(['Minute', 'Day'], axis=1, inplace=True)
    ## cyclical (sin/cos) encoding of the continuous time feature
    temp['cos_time'] = np.cos(2*np.pi*(temp.Hour/24))
    temp['sin_time'] = np.sin(2*np.pi*(temp.Hour/24))
    ## add 3-day & 5-day mean of TARGET per time slot (48 half-hour slots per day)
    temp['shft1'] = temp['TARGET'].shift(48)
    temp['shft2'] = temp['TARGET'].shift(48*2)
    temp['shft3'] = temp['TARGET'].shift(48*3)
    temp['shft4'] = temp['TARGET'].shift(48*4)
    temp['avg3'] = np.mean(temp[['TARGET', 'shft1', 'shft2']].values, axis=-1)
    temp['avg5'] = np.mean(temp[['TARGET', 'shft1', 'shft2', 'shft3', 'shft4']].values, axis=-1)
    temp.drop(['shft1', 'shft2', 'shft3', 'shft4'], axis=1, inplace=True)
    ## dew point (condensation temperature) feature, Magnus formula
    c = 243.12
    b = 17.62
    gamma = (b * temp['T'] / (c + temp['T'])) + np.log(temp['RH'] / 100)
    dp = (c * gamma) / (b - gamma)
    temp['Td'] = dp
    # approximate the zenith angle, then compute GHI:
    # 1. estimate sunrise/sunset times (DHI > 0)
    # 2. approximate the zenith angle with a per-day quadratic fit (annual/daily seasonality)
    # 3. compute GHI from DNI, DHI, and the zenith angle
    for day in temp.rolling(window=48):
        if day.values[0][0] == 0 and day.shape[0] == 48:
            sun_rise = day[day.DHI > 0]
            sunrise = sun_rise.Hour.values[0]
            sunset = sun_rise.Hour.values[-1]
            peak = (sunrise + sunset)/2
            param, _ = curve_fit(obj_curve,  # per-day quadratic approximation of the zenith angle
                                 [sunrise-0.5, peak, sunset+0.5],
                                 [90, (sunrise-6.5)/1.5*25+35, 90],
                                 p0=[0.5, peak, 36],
                                 bounds=([0.01, (sunrise+sunset)/2-1, 10],
                                         [1.2, (sunrise+sunset)/2+1, 65]))
            temp.loc[day.index, 'zenith'] = obj_curve(day.Hour, *param)
    ## solar altitude: the same angle measured up from the horizon instead of down from the zenith
    temp['altitude'] = 90 - temp.zenith
    temp['GHI'] = temp.DHI + temp.DNI * np.cos(temp.zenith * math.pi / 180)
    temp = temp[['Hour','cos_time','sin_time','altitude','GHI','DHI','DNI','WS','RH','T','Td','avg3','avg5','TARGET']]
    ## for training data, append the next two days' target values as extra columns
    if is_train:
        temp['Target1'] = temp['TARGET'].shift(-48)
        temp['Target2'] = temp['TARGET'].shift(-48*2)
    ## dropna removes the first 4 days (NaN from the shifted means)
    ## and, for training data, the last 2 days (NaN from the shifted targets)
    temp = temp.dropna()
    return temp
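The dew-point step above uses the Magnus formula with constants b = 17.62 and c = 243.12 (°C). A quick standalone check with illustrative inputs:

```python
import numpy as np

def dew_point(T, RH, b=17.62, c=243.12):
    """Magnus-formula dew point (deg C) from temperature (deg C) and relative humidity (%)."""
    gamma = b * T / (c + T) + np.log(RH / 100.0)
    return c * gamma / (b - gamma)

print(dew_point(20.0, 100.0))  # saturated air: dew point equals the temperature, 20.0
print(dew_point(20.0, 50.0))   # drier air: dew point well below 20 deg C
```

At RH = 100 % the log term vanishes and the formula collapses to Td = T, which is a handy sanity check that the constants were transcribed correctly.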
df_train = preprocess_data(train)
df_train.iloc[:48]
df_train.columns
3️⃣ Feature selection & model building
- check correlations in the data
f, ax = plt.subplots(figsize=(10, 8))
corr = df_train.corr()
sns.heatmap(corr, mask=np.zeros_like(corr, dtype=bool), square=True, annot=True, ax=ax)  # np.bool is removed in recent NumPy
- model day 7 and day 8 separately
print(df_train.shape)
tf_train = []
for day in df_train.rolling(48):  # rolling(48) yields successive 48-row windows
    if day.shape[0] == 48 and day.values[0][0] == 0:  # keep only full windows starting at Hour == 0, i.e., whole days
        day = day.drop(['Td', 'WS'], axis=1)  # drop the weakly correlated features
        tf_train.append(day.values)
tf_train = np.asarray(tf_train)  # stack the per-day windows into one array
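Since df_train consists of consecutive whole days of 48 half-hour rows, the rolling-window filter above is, under that assumption, equivalent to a plain numpy reshape. A sketch with dummy data (hypothetical frame, not the competition's):

```python
import numpy as np
import pandas as pd

# dummy frame: 3 whole days x 48 half-hour slots, 4 columns, Hour first
n_days, slots, n_feat = 3, 48, 4
df = pd.DataFrame(np.arange(n_days*slots*n_feat, dtype=float).reshape(-1, n_feat),
                  columns=['Hour', 'f1', 'f2', 'f3'])
df['Hour'] = np.tile(np.arange(slots) / 2, n_days)  # 0.0, 0.5, ..., 23.5 per day

# rolling-window route: keep 48-row windows that start at Hour == 0
windows = [w.values for w in df.rolling(48)
           if w.shape[0] == 48 and w.values[0][0] == 0]
rolled = np.asarray(windows)

# reshape route: only valid because every day is complete and in order
reshaped = df.values.reshape(n_days, slots, -1)
print(rolled.shape, np.array_equal(rolled, reshaped))
```

The rolling version is more defensive (it silently skips incomplete days), which is presumably why the baseline uses it.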
df_test = []
for i in range(81):  # 81 test files
    file_path = './data/test/' + str(i) + '.csv'
    temp = pd.read_csv(file_path)
    temp = preprocess_data(temp, is_train=False)
    temp = temp.drop(['Td', 'WS'], axis=1)
    temp = temp.values[-48:, :]  # use only the last available day of each file as model input
    df_test.append(temp)
tf_test = np.asarray(df_test)
print(tf_train.shape)
print(tf_test.shape)
- data split
## train & validation split (shuffle=False keeps the split chronological, so random_state has no effect)
TF_X_train, TF_X_valid, TF_Y_train_1, TF_Y_valid_1 = train_test_split(tf_train[:,:,:-2], tf_train[:,:,-2], test_size=0.3, shuffle=False, random_state=42)
TF_X_train, TF_X_valid, TF_Y_train_2, TF_Y_valid_2 = train_test_split(tf_train[:,:,:-2], tf_train[:,:,-1], test_size=0.3, shuffle=False, random_state=42)
- check data shapes before feeding the models
print('for Day_7')
print(TF_X_train.shape)
print(TF_Y_train_1.shape)
print(TF_X_valid.shape)
print(TF_Y_valid_1.shape, '\n')
print('for Day_8')
print(TF_X_train.shape)
print(TF_Y_train_2.shape)
print(TF_X_valid.shape)
print(TF_Y_valid_2.shape)
4️⃣ Model building and selection
- four model types (MLP, Conv1D CNN, LSTM, CNN-LSTM)
- Conv1D CNN : learns correlations between neighboring timesteps, + BatchNormalization
- LSTM, widely considered well suited to time-series learning
- CNN(w/ BN)-LSTM, combining the strengths of CNN and LSTM
- basic MLP + Dropout
- one model per quantile and per prediction day (day 7, day 8)
- a total of 4 (models) × 9 (quantiles) × 2 (prediction days) = 72 models
- optimizer
- Rectified Adam (RAdam) combined with Lookahead
- Rectified Adam (RAdam)
- acts as a guide that descends the loss surface stably in the early phase
- effectively provides an automated warmup tailored to the current dataset, ensuring a stable start of training
- Lookahead
- acts as a helper that watches from a step behind and pulls the weights back toward a safer point when an update goes astray
- faster convergence on a variety of deep-learning tasks with minimal computational overhead
- => every time the fast weights are updated, training continues steadily from a stable starting point
## compare Lookahead(RAdam), Adam, SGD
def get_opt(init_lr=5e-4):
    RAdam = tfa.optimizers.RectifiedAdam(learning_rate=init_lr)
    opt = tfa.optimizers.Lookahead(RAdam)
    # opt = tf.keras.optimizers.Adam(learning_rate=init_lr)
    # opt = tf.keras.optimizers.SGD(learning_rate=init_lr)
    return opt
## define the pinball loss specified by the competition
from tensorflow.keras.backend import mean, maximum
def quantile_loss(q, y, f):
    err = y - f
    return mean(maximum(q*err, (q-1)*err), axis=-1)
- CNN
## CNN model structure
def CNN(q, X_train, Y_train, X_valid, Y_valid, X_test):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], X_train.shape[2]), name='input')
    norm = tf.keras.layers.experimental.preprocessing.Normalization()  # feature normalization
    norm.adapt(X_train)
    norm_data = norm(inputs)
    x = tf.keras.layers.Conv1D(100, 3, activation='relu', kernel_initializer='he_normal')(norm_data)  # He-normal weight initialization
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(80, 3, activation='relu', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(50, 3, activation='relu', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(30, 3, activation='relu', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(Y_train.shape[-1])(x)
    x1 = tf.keras.layers.Flatten()(x)
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    tf.keras.utils.plot_model(model, show_shapes=True)
    # model.summary()
    model.compile(loss=lambda y, f: quantile_loss(q, y, f), optimizer=get_opt())
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True,
                        validation_data=(X_valid, Y_valid), callbacks=[es], verbose=0)
    train_score = np.asarray(quantile_loss(q, Y_train, model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q, Y_valid, model.predict(X_valid))).flatten().mean()
    print("CNN_train_score: ", train_score, '\n', 'CNN_val_score: ', val_score)
    pred = np.asarray(model.predict(X_test))
    return pred, model, val_score
- LSTM
def LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], X_train.shape[2]), name='input')
    norm = tf.keras.layers.experimental.preprocessing.Normalization()
    norm.adapt(X_train)
    norm_data = norm(inputs)
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, kernel_initializer='he_normal', recurrent_dropout=0.3, return_sequences=True))(norm_data)
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16, kernel_initializer='he_normal', recurrent_dropout=0.3, return_sequences=True))(x)
    x = tf.keras.layers.Dense(1)(x)
    x1 = tf.keras.layers.Flatten()(x)
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    tf.keras.utils.plot_model(model, show_shapes=True)
    model.compile(loss=lambda y, f: quantile_loss(q, y, f), optimizer=get_opt(1e-2))
    # model.summary()
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True,
                        validation_data=(X_valid, Y_valid), callbacks=[es], verbose=0)
    train_score = np.asarray(quantile_loss(q, Y_train, model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q, Y_valid, model.predict(X_valid))).flatten().mean()
    print("LSTM_train_score: ", train_score, '\n', 'LSTM_val_score: ', val_score)
    pred = np.asarray(model.predict(X_test))
    return pred, model, val_score
- CNN-LSTM
- a bidirectional LSTM uses not only the past data but also later data in the sequence when encoding each timestep
- a vanilla RNN considers order in a single direction (past → future) only, whereas a bidirectional RNN also considers the reverse direction
def CNN_LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], X_train.shape[2]), name='input')
    norm = tf.keras.layers.experimental.preprocessing.Normalization()
    norm.adapt(X_train)
    norm_data = norm(inputs)
    x = tf.keras.layers.Conv1D(100, 3, activation='relu', padding='same', kernel_initializer='he_normal')(norm_data)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(80, 3, activation='relu', padding='same', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(50, 3, activation='relu', padding='same', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(30, 3, activation='relu', padding='same', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    # a bidirectional LSTM uses later data as well as past data when encoding each timestep;
    # a vanilla RNN considers order in one direction (past -> future) only, while a bidirectional RNN also considers the reverse
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20, kernel_initializer='he_normal', return_sequences=True, recurrent_dropout=0.3))(x)
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(10, kernel_initializer='he_normal', return_sequences=True, recurrent_dropout=0.3))(x)
    x = tf.keras.layers.Dense(1)(x)
    x1 = tf.keras.layers.Flatten()(x)
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    tf.keras.utils.plot_model(model, show_shapes=True)
    model.compile(loss=lambda y, f: quantile_loss(q, y, f), optimizer=get_opt(1e-2))
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True,
                        validation_data=(X_valid, Y_valid), callbacks=[es], verbose=0)
    train_score = np.asarray(quantile_loss(q, Y_train, model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q, Y_valid, model.predict(X_valid))).flatten().mean()
    print("CNN-LSTM_train_score: ", train_score, '\n', 'CNN-LSTM_val_score: ', val_score)
    pred = np.asarray(model.predict(X_test))
    return pred, model, val_score
- MLP
def MLP(q, X_train, Y_train, X_valid, Y_valid, X_test):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1], X_train.shape[2]), name='input')
    norm = tf.keras.layers.experimental.preprocessing.Normalization()
    norm.adapt(X_train)
    norm_data = norm(inputs)
    x = tf.keras.layers.Flatten()(norm_data)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x1 = tf.keras.layers.Dense(Y_train.shape[-1])(x)
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    tf.keras.utils.plot_model(model, show_shapes=True)
    model.compile(loss=lambda y, f: quantile_loss(q, y, f), optimizer=get_opt())
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True,
                        validation_data=(X_valid, Y_valid), callbacks=[es], verbose=0)
    train_score = np.asarray(quantile_loss(q, Y_train, model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q, Y_valid, model.predict(X_valid))).flatten().mean()
    print("MLP_train_score: ", train_score, '\n', 'MLP_val_score: ', val_score)
    pred = np.asarray(model.predict(X_test))
    return pred, model, val_score
- train and predict Test data
def TF_train_func(X_train, Y_train, X_valid, Y_valid, X_test):
    models = []
    actual_pred = []
    for model_select in ['CNN', 'MLP', 'LSTM', 'CNNLSTM']:
        score_lst = []
        for q in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]:
            print(model_select, q)
            if model_select == 'LSTM':
                pred, mod, s = LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test)
            elif model_select == 'CNN':
                pred, mod, s = CNN(q, X_train, Y_train, X_valid, Y_valid, X_test)
            elif model_select == 'CNNLSTM':
                pred, mod, s = CNN_LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test)
            elif model_select == 'MLP':
                pred, mod, s = MLP(q, X_train, Y_train, X_valid, Y_valid, X_test)
            score_lst.append(s)
            models.append(mod)
            actual_pred.append(pred)
        print(sum(score_lst)/len(score_lst))  # mean validation pinball loss per model type
    return models, np.asarray(actual_pred)
- train
import keras
import pydot
import pydotplus
from pydotplus import graphviz
from keras.utils.vis_utils import plot_model
## a total of 4*9*2 models are trained (4 models, 9 quantiles, 2 separate target days)
models_tf1, results_tf1 = TF_train_func(TF_X_train, TF_Y_train_1, TF_X_valid, TF_Y_valid_1, tf_test)
models_tf2, results_tf2 = TF_train_func(TF_X_train, TF_Y_train_2, TF_X_valid, TF_Y_valid_2, tf_test)
Step 1 : pip install pydot
Step 2 : pip install pydotplus
Step 3 : sudo apt-get install graphviz
I installed all of them, but still got an error ㅠㅠㅠ
I'm curious about the results.. ㅠㅠ
Oh.. after waiting, output is slowly appearing. Looks like it will take quite a long time.
Still running hard~