
[DACON] ๋™์„œ๋ฐœ์ „ ํƒœ์–‘๊ด‘ ๋ฐœ์ „๋Ÿ‰ ์˜ˆ์ธก AI ๊ฒฝ์ง„๋Œ€ํšŒ

์ง•์ง•์•ŒํŒŒ์นด 2022. 9. 14. 16:25

Written on 2022-09-14

<This post was written while studying from the Baseline code shared on DACON and haenara-shin's GitHub :-) >

https://github.com/haenara-shin/DACON/blob/main/2_TimeSeries_%ED%83%9C%EC%96%91%EA%B4%91%EB%B0%9C%EC%A0%84%EB%9F%89%EC%98%88%EC%B8%A1_KAERI/%5BPrediction_of_PV_Power_Generation%5D_Stacking_Quantile_Regression_Final_submission.ipynb

 


https://dacon.io/competitions/official/235720/codeshare/2512?page=1&dtype=recent 

 

[Public LB-9.4531] Baseline Code / LightGBM (revised)


 

 

 

😎 1. Competition Overview

ํƒœ์–‘๊ด‘ ๋ฐœ์ „์€ ๋งค์ผ ๊ธฐ์ƒ ์ƒํ™ฉ๊ณผ ๊ณ„์ ˆ์— ๋”ฐ๋ฅธ ์ผ์‚ฌ๋Ÿ‰์˜ ์˜ํ–ฅ์„ ๋ฐ›๋Š”๋‹ค

์ด์— ๋Œ€ํ•œ ์˜ˆ์ธก์ด ๊ฐ€๋Šฅํ•˜๋‹ค๋ฉด ๋ณด๋‹ค ์›ํ™œํ•œ ์ „๋ ฅ ์ˆ˜๊ธ‰ ๊ณ„ํš์ด ๊ฐ€๋Šฅํ•˜๋‹ค

์ธ๊ณต์ง€๋Šฅ ๊ธฐ๋ฐ˜ ํƒœ์–‘๊ด‘ ๋ฐœ์ „๋Ÿ‰ ์˜ˆ์ธก ๋ชจ๋ธ์„ ๋งŒ๋“ค์–ด๋ณด์ž

  • Solar power generation is driven by each day's 'weather conditions' and the season-dependent 'irradiance'
    • Input dataset : data covering 7 days (Day 0 ~ Day 6)
    • Prediction (target) : generation at 30-minute intervals over the following 2 days (Day 7 ~ Day 8), i.e. 48 timesteps per day, 96 timesteps in total
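In array terms, the task maps a week of half-hourly weather features to 96 half-hourly generation values at 9 quantiles. A minimal sketch of the shapes involved (the feature count of 8 is illustrative, not part of the competition spec):

```python
import numpy as np

STEPS_PER_DAY = 48            # 30-minute intervals
INPUT_DAYS, TARGET_DAYS = 7, 2

# one sample: 7 days of inputs (8 illustrative weather features) ...
n_features = 8
x = np.zeros((INPUT_DAYS * STEPS_PER_DAY, n_features))       # (336, 8)

# ... and 2 days of generation to predict, at 9 quantiles each
quantiles = np.arange(0.1, 1.0, 0.1)                         # 0.1 ... 0.9
y = np.zeros((TARGET_DAYS * STEPS_PER_DAY, len(quantiles)))  # (96, 9)

print(x.shape, y.shape)   # (336, 8) (96, 9)
```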

 

😎 2. Data Composition

  • dataset
    • train.csv : 3 years (Day 0 ~ Day 1094) of weather data plus the generation (TARGET) data
    • test fold files : input files for prediction
      • 81 files drawn from 2 years of weather and generation (TARGET) data; the file order is random and unrelated to the time-series order
      • using all or part of each file's 7 days of data as input, predict the generation (TARGET) at 30-minute intervals over the following 2 days (48 timesteps per day, 96 in total)
    • submission.csv : submission file
      • for every file in the test folder, predict the generation at each timestamp for 9 quantiles (0.1, 0.2 ... 0.9), with rows keyed as 'filename_date_hour'
  • Loss
    • Pinball loss
      • under-forecasting at percentiles above 50% is penalized heavily
        • at high quantiles the measured value should fall below the prediction, i.e. over-forecasting is induced
      • over-forecasting at percentiles below 50% is penalized heavily
        • at low quantiles the measured value should fall above the prediction, i.e. under-forecasting is induced
  • QR (Quantile Regression)
    • percentile : the 'value' sitting at a given position when ordered data are indexed from the smallest (0) to the largest (100)
    • percentile rank : the 'position' a particular value occupies within the whole dataset
    • quantile : think of quartiles, which cut the ordered data at every 0.25 (here the cut points are 0.1 ~ 0.9)
    • providing a range of predicted values gives a more 'stable and trustworthy' prediction guide
    • predicting interval by interval with QR (Quantile Regression) is therefore 'more reliable and stable'
      • why QR is more informative than OLS (ordinary least squares, which only estimates the mean):
      • QR can model the entire conditional distribution of the target variable (OLS gives only a mean estimate)
      • QR makes no assumption about the target distribution, so it is more robust to mis-specification of the error distribution
      • QR is less sensitive to outliers
      • QR is equivariant under monotonic transformations such as log (the quantile of the transformed target is the transform of the quantile)

 

😎 3. Implementation

1๏ธโƒฃ Package load

import pandas as pd
# import cupy as cp
import numpy as np
import os

import random
import math
from scipy.optimize import curve_fit # Use non-linear least squares to fit a function - for the zenith angle calculation
import warnings

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

warnings.filterwarnings("ignore")

from sklearn.model_selection import train_test_split

import tensorflow as tf
import tensorflow_addons as tfa
train = pd.read_csv('data/train/train.csv')

print(train.shape)
train.head()

 

  • Day : date
  • Hour : hour
  • Minute : minute
  • DHI : Diffuse Horizontal Irradiance (W/m2)
  • DNI : Direct Normal Irradiance (W/m2)
  • WS : Wind Speed (m/s)
  • RH : Relative Humidity (%)
  • T : Temperature (°C)
  • TARGET : solar power generation (kW)
train.describe()

train.info()

  • plot feature data distribution
## plot feature data distribution

fig, ax = plt.subplots(2, train.shape[1]//2+1, figsize=(20, 6))

for idx, feature in enumerate(train.columns):
    data = train[feature]
    if idx<train.shape[1]//2 + 1:
        ax[0,idx].hist(train.iloc[:,idx], bins=10, alpha=0.5)
        ax[0,idx].set_title(train.columns[idx])
    else:
        ax[1,idx-train.shape[1]//2-1].hist(train.iloc[:,idx], bins=10, alpha=0.5)
        ax[1,idx-train.shape[1]//2-1].set_title(train.columns[idx])
plt.show()

  • Correlation between the features (x) and the target
fig, axes = plt.subplots(2, 3, figsize=(10,7))

train.plot(x='Hour', y='TARGET', kind='scatter', alpha=0.1, ax=axes[0,0])
train.plot(x='DHI', y='TARGET', kind='scatter', alpha=0.1, ax=axes[0,1])
train.plot(x='DNI', y='TARGET', kind='scatter', alpha=0.1, ax=axes[0,2])
train.plot(x='WS', y='TARGET', kind='scatter', alpha=0.1, ax=axes[1,0])
train.plot(x='RH', y='TARGET', kind='scatter', alpha=0.1, ax=axes[1,1])
train.plot(x='T', y='TARGET', kind='scatter', alpha=0.1, ax=axes[1,2])

fig.tight_layout()

 

 

 

2๏ธโƒฃ Data preprocessing (feature engineering)

  • Merge Hour and Minute into a single float column, drop the Day column, then express time continuously with trigonometric (sin, cos) functions
  • Mean of TARGET at the same time of day over the most recent 3 and 5 days
  • Dew point derived from temperature and relative humidity
    • rather than feeding raw temperature and humidity as features, it is more reasonable to derive the dew point (the point where condensation occurs) as a new feature
  • Dust, humidity and wind speed affect solar-cell efficiency
  • Extract sunrise/sunset times
  • Estimated based on DHI > 0
    • zenith angle obtained from a per-day quadratic approximation that reflects the annual and daily seasonality of sunrise/sunset times
  • GHI computed from the zenith angle together with DNI and DHI
    • domestic PV collector panels are fixed-tilt, so GHI (Global Horizontal Irradiance) is needed
    • ref: construction of standard weather data suitable for solar-energy systems and analysis of irradiance data
    • GHI = DHI + DNI × cos(θ_zenith)
  • Ref: uncertainty of standard-weather-year data and PV generation forecasts under irradiance direct/diffuse separation models
    • (solar zenith angle) + (solar altitude angle) = 90 degrees
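Two of the formulas above can be sanity-checked in isolation: the Magnus dew-point formula (with the same constants b = 17.62, c = 243.12 used in the preprocessing code below) and GHI = DHI + DNI × cos(θ_zenith). The numeric inputs here are illustrative:

```python
import numpy as np

def dew_point(T, RH, b=17.62, c=243.12):
    """Magnus-formula dew point (°C) from temperature (°C) and relative humidity (%)."""
    gamma = b * T / (c + T) + np.log(RH / 100.0)
    return c * gamma / (b - gamma)

def ghi(dhi, dni, zenith_deg):
    """GHI = DHI + DNI * cos(zenith angle)."""
    return dhi + dni * np.cos(np.radians(zenith_deg))

# at 100% humidity the dew point equals the air temperature
print(round(dew_point(20.0, 100.0), 2))   # 20.0
# with the sun at the zenith (0°), GHI = DHI + DNI
print(ghi(100.0, 800.0, 0.0))             # 900.0
```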
# estimate the zenith angle via a per-day quadratic fit
def obj_curve(x, a, b, c):
    return a*(x-b)**2+c
def preprocess_data(data, is_train=True):
    
    temp = data.copy()
    
    ## preprocessing to build the cyclical time feature
    temp.Hour = temp.Hour + temp.Minute/60
    temp.drop(['Minute','Day'], axis=1, inplace=True)
    
    ## encode clock time as a cyclical (continuous) time feature
    temp['cos_time'] = np.cos(2*np.pi*(temp.Hour/24))
    temp['sin_time'] = np.sin(2*np.pi*(temp.Hour/24))
    
    
    ## add the 3-day & 5-day means of TARGET at the same time of day
    temp['shft1'] = temp['TARGET'].shift(48)
    temp['shft2'] = temp['TARGET'].shift(48*2)
    temp['shft3'] = temp['TARGET'].shift(48*3)
    temp['shft4'] = temp['TARGET'].shift(48*4)
    
    temp['avg3'] = np.mean(temp[['TARGET', 'shft1', 'shft2']].values, axis=-1)
    temp['avg5'] = np.mean(temp[['TARGET', 'shft1', 'shft2', 'shft3','shft4']].values, axis=-1)
    temp.drop(['shft1','shft2','shft3','shft4'], axis=1, inplace=True)
    
    ## dew-point (condensation temperature) feature via the Magnus formula
    c = 243.12
    b = 17.62
    gamma = (b * (temp['T']) / (c + (temp['T']))) + np.log(temp['RH'] / 100)
    dp = ( c * gamma) / (b - gamma)
    temp['Td']=dp
    
    
    # zenith angle์˜ ๊ทผ์‚ฌ๋ฅผ ํ†ตํ•ด GHI ๊ตฌํ•จ
    # 1. ์ผ์ถœ/์ผ๋ณผ ์‹œ๊ฐ„ ์ถ”์ • (DHI > 0)
    # 2. zenith angle ๊ทผ์‚ฌ๋Š” ์—ฐ๋ณ„, ์ผ๋ณ„ ๊ณ„์ ˆ ๊ทผ์‚ฌ๋กœ ๊ตฌํ•จ
    # 3. GHI ๊ณ„์‚ฐ (calculated from DNI DHI and zenith angle)
    for day in temp.rolling(window = 48):
        if day.values[0][0] == 0 and day.shape[0] == 48:
            sun_rise = day[day.DHI > 0]
            sun_rise['zenith'] = np.nan
            
            sunrise = sun_rise.Hour.values[0]
            sunset = sun_rise.Hour.values[-1]
            
            peak = (sunrise + sunset)/2
                        
            param, _ = curve_fit(obj_curve,         # zenith angle via the per-day quadratic fit
                                 [sunrise-0.5, peak, sunset+0.5],
                                 [90, (sunrise-6.5)/1.5*25+35, 90],
                                 p0=[0.5, peak, 36],
                                 bounds=([0.01, (sunrise+sunset)/2-1, 10],
                                         [1.2,  (sunrise+sunset)/2+1, 65]))
                                
            temp.loc[day.index,'zenith']= obj_curve(day.Hour, *param)
    
    
    ## ํƒœ์–‘์˜ zenitn angle ๋ง๊ณ  ์ง€ํ‰์„ ์—์„œ๋ถ€ํ„ฐ ๊ณ„์‚ฐํ•œ ๊ฐ๋„
    temp['altitude'] = 90 - temp.zenith
    temp['GHI'] = temp.DHI + temp.DNI * np.cos(temp.zenith * math.pi / 180)
    temp = temp[['Hour','cos_time','sin_time','altitude','GHI','DHI','DNI','WS','RH','T','Td','avg3','avg5','TARGET']]

    ## for training data, append the target values of the next two days as the last two columns
    if is_train==True:          
        temp['Target1'] = temp['TARGET'].shift(-48) 
        temp['Target2'] = temp['TARGET'].shift(-48*2) 
    else:
        pass
       
    ## ์ฒ˜์Œ 4์ผ์€ drop. nan values ๋“ค์ด ์ฑ„์›Œ์ ธ ์žˆ์Œ
    ## ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด์„œ, ๋งˆ์ง€๋ง‰ 2์ผ์€ ์ถ”๊ฐ€์ ์œผ๋กœ ๋“œ๋ž
    temp = temp.dropna()
    return temp
df_train = preprocess_data(train)
df_train.iloc[:48]

df_train.columns
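As a stand-alone illustration of the per-day quadratic idea used in preprocess_data: the zenith curve is pinned to three points (sunrise, solar noon, sunset), and an exact quadratic through three points can be found with np.polyfit instead of curve_fit. All hours and angles here are illustrative:

```python
import numpy as np

sunrise, sunset = 6.0, 18.0
peak = (sunrise + sunset) / 2           # solar noon

# zenith ~ 90° at sunrise/sunset, minimal at noon (35° here, illustrative)
hours  = np.array([sunrise, peak, sunset])
zenith = np.array([90.0, 35.0, 90.0])

a, b, c = np.polyfit(hours, zenith, 2)  # fit z(h) = a*h**2 + b*h + c
z = np.poly1d([a, b, c])

print(round(z(peak), 1))     # 35.0 (minimum at noon)
print(round(z(sunrise), 1))  # 90.0
```

Because the three points are symmetric about noon, the fitted parabola opens upward (a > 0) with its vertex exactly at solar noon, mirroring what obj_curve produces.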

 

 

3๏ธโƒฃ  ๋ณ€์ˆ˜ ์„ ํƒ & ๋ชจ๋ธ ๊ตฌ์ถ•

  • ๋ฐ์ดํ„ฐ ์ƒ๊ด€๊ด€๊ณ„ ํ™•์ธ
f, ax = plt.subplots(figsize=(10,8))

corr = df_train.corr()
sns.heatmap(corr, mask=np.zeros_like(corr, dtype=bool), square=True, annot=True, ax=ax)

  • Model day 7 and day 8 separately
print(df_train.shape)

tf_train=[]
for day in df_train.rolling(48):    # rolling(48): iterate over sliding 48-row windows (used here to slice out whole days)
    if day.shape[0] == 48 and day.values[0][0] == 0:
        day=day.drop(['Td','WS'],axis=1)        # drop the weakly correlated features
        tf_train.append(day.values)
tf_train = np.asarray(tf_train)                 # np.asarray: convert to ndarray (shares memory with the source where possible)

df_test = []
for i in range(81): # there are 81 test files
    file_path = './data/test/' + str(i) + '.csv'
    temp = pd.read_csv(file_path)
    temp = preprocess_data(temp, is_train=False)
    temp=temp.drop(['Td','WS'],axis=1)
    temp = temp.values[-48:,:]
    df_test.append(temp)
tf_test = np.asarray(df_test)

print(tf_train.shape)
print(tf_test.shape)

  • ๋ฐ์ดํ„ฐ split
## train & validation split
TF_X_train, TF_X_valid, TF_Y_train_1, TF_Y_valid_1 = train_test_split(tf_train[:,:,:-2], tf_train[:,:,-2], test_size=0.3, shuffle=False, random_state=42)
TF_X_train, TF_X_valid, TF_Y_train_2, TF_Y_valid_2 = train_test_split(tf_train[:,:,:-2], tf_train[:,:,-1], test_size=0.3, shuffle=False, random_state=42)
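Since shuffle=False makes the split purely chronological, the same 70/30 split could also be written as plain slicing. A sketch (scikit-learn rounds the test share up, which the helper below reproduces):

```python
import math
import numpy as np

def chrono_split(arr, test_size=0.3):
    """Chronological split along the first axis; the test share is
    rounded up (ceil), matching scikit-learn's convention."""
    n_test = math.ceil(len(arr) * test_size)
    return arr[:-n_test], arr[-n_test:]

demo = np.arange(10)
tr, va = chrono_split(demo)
print(tr, va)   # [0 1 2 3 4 5 6] [7 8 9]
```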
  • ๋ชจ๋ธ ๋„ฃ๊ธฐ ์ „์— ๋ฐ์ดํ„ฐ shape ํ™•์ธ
print('for Day_7')
print(TF_X_train.shape)
print(TF_Y_train_1.shape)
print(TF_X_valid.shape)
print(TF_Y_valid_1.shape, '\n')

print('for Day_8')
print(TF_X_train.shape)
print(TF_Y_train_2.shape)
print(TF_X_valid.shape)
print(TF_Y_valid_2.shape)

 

 

 

4๏ธโƒฃ Model building and selection

  • Four models (MLP, Conv1D CNN, LSTM, CNN-LSTM)
    • Conv1D CNN : learns correlations with neighboring data + BatchNormalization
    • LSTM, known to be well suited to time-series learning
    • CNN(w/ BN)-LSTM, combining the strengths of CNN and LSTM
    • plain MLP + Dropout
  • One model per quantile and per target day (day 7, 8)
  • 4 (models) × 9 (quantiles) × 2 (target days) = 72 models in total
  • optimizer
    • rectified Adam (RAdam) combined with Lookahead
      • rectified Adam (RAdam)
        • acts as an early-stage guide so the descent down the loss landscape starts out stably
        • effectively provides an automated warm-up tailored to the current dataset, guaranteeing a stable start of training
      • Lookahead
        • acts as a helper watching from above that pulls the weights back up when they go astray
        • faster convergence across a variety of deep-learning tasks with minimal computational overhead
      • => every time the slow weights are updated, the stable early-stage learning behavior is effectively sustained
## compare Lookahead(RAdam) with Adam and SGD
def get_opt(init_lr=5e-4):
    RAdam = tfa.optimizers.RectifiedAdam(learning_rate=init_lr)
    opt = tfa.optimizers.Lookahead(RAdam)
    #opt = tf.keras.optimizers.Adam(learning_rate=init_lr)
    #opt = tf.keras.optimizers.SGD(learning_rate=init_lr)
    return opt

## ๊ทœ์ •๋œ Pinball loss ๋งŒ๋“ค์–ด์„œ ์‚ฌ์šฉ
from tensorflow.keras.backend import mean, maximum
def quantile_loss(q, y, f):
    err = (y-f)
    return mean(maximum(q*err, (q-1)*err), axis=-1)
  • CNN
## CNN model structure
def CNN(q, X_train, Y_train, X_valid, Y_valid, X_test):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1],X_train.shape[2]), name='input')
    
    norm = tf.keras.layers.experimental.preprocessing.Normalization() # normalization
    norm.adapt(X_train)
    norm_data = norm(inputs)

    x = tf.keras.layers.Conv1D(100, 3, activation='relu', kernel_initializer='he_normal')(norm_data)        # weight init: He normal initializer
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(80, 3, activation='relu', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(50, 3, activation='relu', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(30, 3, activation='relu', kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)

    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dropout(0.7)(x)

    x= tf.keras.layers.Dense(Y_train.shape[-1])(x)
    x1= tf.keras.layers.Flatten()(x)
    
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    
    tf.keras.utils.plot_model(model, show_shapes=True)
#     model.summary()
    
    model.compile(loss=lambda y,f: quantile_loss(q,y,f), optimizer=get_opt())
    
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True, validation_data=[X_valid, Y_valid], callbacks=[es], verbose=0)
    
    train_score = np.asarray(quantile_loss(q,Y_train,model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q,Y_valid,model.predict(X_valid))).flatten().mean()
    
    print("CNN_train_score: ", train_score, '\n', 'CNN_val_score: ', val_score)
    
    pred = np.asarray(model.predict(X_test))
    
    return pred, model, val_score
  • LSTM
def LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test):
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1],X_train.shape[2]), name='input')
    
    norm = tf.keras.layers.experimental.preprocessing.Normalization()
    norm.adapt(X_train)
    norm_data = norm(inputs)
    
    x =tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, kernel_initializer='he_normal', recurrent_dropout=0.3, return_sequences=True))(norm_data)
    x =tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16, kernel_initializer='he_normal', recurrent_dropout=0.3, return_sequences=True))(x)

    x= tf.keras.layers.Dense(1)(x)
    x1= tf.keras.layers.Flatten()(x)
    
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    
    tf.keras.utils.plot_model(model, show_shapes=True)
    
    model.compile(loss=lambda y,f: quantile_loss(q,y,f), optimizer=get_opt(1e-2))
#     model.summary()
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True, validation_data=[X_valid, Y_valid], callbacks=[es], verbose=0)
 
    train_score = np.asarray(quantile_loss(q,Y_train,model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q,Y_valid,model.predict(X_valid))).flatten().mean()
    
    print("LSTM_train_score: ", train_score, '\n', 'LSTM_val_score: ', val_score)
    
    pred = np.asarray(model.predict(X_test))
    
    return pred, model, val_score
  • CNN-LSTM
    • a bidirectional LSTM uses not only past data but also future data to predict what comes before
    • a vanilla RNN only considers one direction (front → back); a bidirectional RNN also considers the reverse order
def CNN_LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test): 
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1],X_train.shape[2]), name='input')
    
    norm = tf.keras.layers.experimental.preprocessing.Normalization()
    norm.adapt(X_train)
    norm_data = norm(inputs)

    x = tf.keras.layers.Conv1D(100, 3, activation='relu', padding='same',kernel_initializer='he_normal')(norm_data)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(80, 3, activation='relu', padding='same',kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(50, 3, activation='relu', padding='same',kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Conv1D(30, 3, activation='relu', padding='same',kernel_initializer='he_normal')(x)
    x = tf.keras.layers.BatchNormalization()(x)

    # a bidirectional LSTM uses future as well as past timesteps
    # a vanilla RNN only scans front-to-back; the bidirectional version also scans in reverse
    x =tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(20, kernel_initializer='he_normal', return_sequences=True,recurrent_dropout=0.3))(x)
    x =tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(10, kernel_initializer='he_normal', return_sequences=True,recurrent_dropout=0.3))(x)

    x= tf.keras.layers.Dense(1)(x)
    x1= tf.keras.layers.Flatten()(x)
    
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    
    tf.keras.utils.plot_model(model, show_shapes=True)
    
    model.compile(loss=lambda y,f: quantile_loss(q,y,f), optimizer=get_opt(1e-2))
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True, validation_data=[X_valid, Y_valid], callbacks=[es], verbose=0)

    train_score = np.asarray(quantile_loss(q,Y_train,model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q,Y_valid,model.predict(X_valid))).flatten().mean()
    
    print("CNN-LSTM_train_score: ", train_score, '\n', 'CNN-LSTM_val_score: ', val_score)
    
    pred = np.asarray(model.predict(X_test))
    
    return pred, model, val_score
  • MLP
def MLP(q, X_train, Y_train, X_valid, Y_valid, X_test):
    
    inputs = tf.keras.layers.Input(shape=(X_train.shape[1],X_train.shape[2]), name='input')
    
    norm = tf.keras.layers.experimental.preprocessing.Normalization()
    norm.adapt(X_train)
    norm_data = norm(inputs)
    
    x = tf.keras.layers.Flatten()(norm_data)
    
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)
    x = tf.keras.layers.Dense(100, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(1e-3))(x)
    x = tf.keras.layers.Dropout(0.7)(x)

    x1= tf.keras.layers.Dense(Y_train.shape[-1])(x)
    
    model = tf.keras.models.Model(inputs=inputs, outputs=x1)
    
    tf.keras.utils.plot_model(model, show_shapes=True)
    
    model.compile(loss=lambda y,f: quantile_loss(q,y,f), optimizer=get_opt())
    
    es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True, verbose=1)
    
    history = model.fit(X_train, Y_train, epochs=500, batch_size=16, shuffle=True, validation_data=[X_valid, Y_valid], callbacks=[es], verbose=0)
    
    train_score = np.asarray(quantile_loss(q,Y_train,model.predict(X_train))).flatten().mean()
    val_score = np.asarray(quantile_loss(q,Y_valid,model.predict(X_valid))).flatten().mean()
    
    print("MLP_train_score: ", train_score, '\n', 'MLP_val_score: ', val_score)
    
    pred = np.asarray(model.predict(X_test))
    
    return pred, model, val_score
  • Train and predict on the test data
def TF_train_func(X_train, Y_train, X_valid, Y_valid, X_test):
    models=[]
    actual_pred = []
    
    for model_select in ['CNN','MLP','LSTM','CNNLSTM']:
        score_lst=[]
        for q in [0.1, 0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9]:
            print(model_select, q)
            if model_select=='LSTM':
                pred , mod, s = LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test)
            elif model_select=='CNN':
                pred , mod, s = CNN(q, X_train, Y_train, X_valid, Y_valid, X_test)
            elif model_select=='CNNLSTM':
                pred , mod, s = CNN_LSTM(q, X_train, Y_train, X_valid, Y_valid, X_test)
            elif model_select=='MLP':
                pred , mod, s = MLP(q, X_train, Y_train, X_valid, Y_valid, X_test)
            score_lst.append(s)
            models.append(mod)
            actual_pred.append(pred)
            
        print(sum(score_lst)/len(score_lst))
    return models, np.asarray(actual_pred)
  • train
import keras
import pydot
import pydotplus
from pydotplus import graphviz
from keras.utils.vis_utils import plot_model
## a total of 4*9*2 models are trained (4 models, 9 quantiles, 2 separate target days)
models_tf1, results_tf1 = TF_train_func(TF_X_train, TF_Y_train_1 , TF_X_valid, TF_Y_valid_1, tf_test)
models_tf2, results_tf2 = TF_train_func(TF_X_train, TF_Y_train_2 , TF_X_valid, TF_Y_valid_2, tf_test)

Step 1 : pip install pydot
Step 2 : pip install pydotplus
Step 3 : sudo apt-get install graphviz

I installed all of them, but I'm still getting an error ㅠㅠ
I'm curious about the results.. ㅠㅠ
Ah.. after waiting a while, they are slowly coming out. It looks like this will take quite a long time.

 

 

 

 

 

์•„์ง ์ˆ˜์ •์ค‘~

 

 
