😎 First time meeting a studying Jingjing Alpaca (징징알파카)?

Time Series Model ARIMA 2 (Autoregressive Integrated Moving Average)


징징알파카 2022. 9. 8. 13:21

Written 220908

<This post was written while studying, with reference to the IBM documentation, Chapter 8 of otexts, and byeongkijeong's blog :-) >

https://www.ibm.com/docs/ko/spss-statistics/25.0.0?topic=modeler-custom-arima-models
Custom ARIMA Models (IBM SPSS Statistics documentation, www.ibm.com)

https://otexts.com/fppkr/arima.html
Chapter 8 ARIMA Models | Forecasting: Principles and Practice, 2nd edition (otexts.com)

https://byeongkijeong.github.io/ARIMA-with-Python/
ARIMA, Time Series Analysis with Python (feat. Bitcoin Price Prediction) (byeongkijeong.github.io)

 

 

 

😎 1. ARIMA (Autoregressive Integrated Moving Average)

 

▶ What is ARIMA?

: Uses differencing (difference) between observations to account for the non-stationarity of a time series

  • AR : Autoregression
    • A model in which previous observations influence later observations
    • theta : autocorrelation coefficient (the AR coefficient)
    • epsilon : white noise
    • Time lag : can be 1 or greater
  • I : Integrated
    • Means "accumulated"
    • A label attached to time series models that use differencing
  • MA : Moving Average
    • A model in which an observation is influenced by the preceding consecutive error terms
    • beta : moving average coefficient
    • epsilon : error term at time t
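
To make the AR and MA definitions above concrete, here is a minimal sketch that simulates an AR(1) process, y_t = theta * y_{t-1} + epsilon_t, and an MA(1) process, y_t = epsilon_t + beta * epsilon_{t-1}, using NumPy only. The lag-1 forms, the coefficient values (theta = 0.7, beta = 0.5) and the function names are assumptions chosen for illustration; they do not come from the referenced posts.

```python
import numpy as np

def simulate_ar1(theta, n, rng):
    """AR(1): y_t = theta * y_{t-1} + epsilon_t, with epsilon as white noise."""
    eps = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = theta * y[t - 1] + eps[t]
    return y

def simulate_ma1(beta, n, rng):
    """MA(1): y_t = epsilon_t + beta * epsilon_{t-1}."""
    eps = rng.normal(size=n + 1)
    return eps[1:] + beta * eps[:-1]

def lag1_autocorr(y):
    """Sample autocorrelation at lag 1."""
    y = y - y.mean()
    return float(np.sum(y[1:] * y[:-1]) / np.sum(y * y))

rng = np.random.default_rng(42)
ar = simulate_ar1(theta=0.7, n=5000, rng=rng)
ma = simulate_ma1(beta=0.5, n=5000, rng=rng)

# In theory the lag-1 autocorrelation is theta for AR(1)
# and beta / (1 + beta**2) for MA(1).
print("AR(1) lag-1 autocorrelation:", round(lag1_autocorr(ar), 3))
print("MA(1) lag-1 autocorrelation:", round(lag1_autocorr(ma), 3))
```

With a series this long, the printed values should land near 0.7 and 0.4 respectively, though the exact digits depend on the random seed.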

 

▶ Parameter settings: ARIMA(p, d, q)

  • p : the lag order of the AR model
  • q : the lag order of the MA model
  • d : the number of times differencing (difference) is applied
  • As a rule of thumb, values satisfying p + q < 2 and p * q = 0 are commonly used, i.e. at most a single AR or MA term
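
As a small illustration of the rule of thumb above, the sketch below enumerates which (p, q) pairs satisfy p + q < 2 and p * q = 0. The search range 0..2 and the choice d = 1 are assumptions made only for this example.

```python
# Enumerate (p, q) pairs satisfying the rule of thumb p + q < 2 and p * q = 0.
candidates = [
    (p, q)
    for p in range(3)
    for q in range(3)
    if p + q < 2 and p * q == 0
]
print(candidates)  # [(0, 0), (0, 1), (1, 0)]

d = 1  # hypothetical: a single round of differencing
orders = [(p, d, q) for p, q in candidates]
print(orders)  # [(0, 1, 0), (0, 1, 1), (1, 1, 0)]
```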

 

  • Parameter estimation methods
    • ACF (Autocorrelation function) : a function that measures the correlation between observations as a function of the lag
    • PACF (Partial autocorrelation function) : a function that measures the correlation between the two observations y_t and y_{t-k} after excluding the influence of the observations at all other time points
    • When the series shows AR characteristics
      • ACF decreases slowly
      • PACF drops off sharply after the first lag (for AR(p), after lag p)
    • When the series shows MA characteristics
      • ACF drops off sharply (for MA(q), after lag q)
      • PACF decreases slowly
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

[Figures: plot_acf (left), plot_pacf (right)]
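
The AR and MA signatures listed above can be checked numerically. The following is a sketch under stated assumptions: it computes the sample ACF directly and the PACF with the Durbin-Levinson recursion in plain NumPy, rather than calling statsmodels, so its numbers may differ slightly from what plot_acf and plot_pacf draw. The helper names, the simulated AR(1) and MA(1) series and their coefficients (0.7 and 0.5) are all hypothetical.

```python
import numpy as np

def acf(y, nlags):
    """Sample autocorrelation for lags 0..nlags."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    denom = np.sum(y * y)
    n = len(y)
    return np.array([np.sum(y[k:] * y[:n - k]) / denom for k in range(nlags + 1)])

def pacf(y, nlags):
    """Partial autocorrelation for lags 0..nlags (Durbin-Levinson recursion)."""
    rho = acf(y, nlags)
    out = np.zeros(nlags + 1)
    out[0] = 1.0
    out[1] = rho[1]
    phi_prev = np.array([rho[1]])  # coefficients of the order-1 fit
    for k in range(2, nlags + 1):
        num = rho[k] - np.sum(phi_prev * rho[k - 1:0:-1])
        den = 1.0 - np.sum(phi_prev * rho[1:k])
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new last one.
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        out[k] = phi_kk
    return out

rng = np.random.default_rng(0)
n = 5000
eps = rng.normal(size=n + 1)

# AR(1) with coefficient 0.7
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.7 * ar[t - 1] + eps[t]

# MA(1) with coefficient 0.5
ma = eps[1:] + 0.5 * eps[:-1]

ar_acf, ar_pacf = acf(ar, 4), pacf(ar, 4)
ma_acf, ma_pacf = acf(ma, 4), pacf(ma, 4)

print("AR(1) ACF :", np.round(ar_acf[1:], 2))   # decays gradually
print("AR(1) PACF:", np.round(ar_pacf[1:], 2))  # near zero after lag 1
print("MA(1) ACF :", np.round(ma_acf[1:], 2))   # near zero after lag 1
print("MA(1) PACF:", np.round(ma_pacf[1:], 2))  # decays gradually
```

The pattern to look for is a slowly shrinking ACF with a PACF that cuts off after lag 1 for the AR series, and the mirror image for the MA series.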

  • diff is a method that outputs the difference between columns or between rows within a single object
    • periods : how many positions back the subtracted element is; axis=0 : differences between rows, axis=1 : differences between columns
DataFrame.diff(periods=1, axis=0)
  • Differencing is applied to turn non-stationary data into stationary data

[Figures: plot_acf (left), plot_pacf (right)] => ARIMA(0,1,1) is used!
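
The diff method described above can be tried on a tiny example. This sketch first shows the effect of axis on a small DataFrame, then differences a random walk (a standard non-stationary series) once to recover the noise it was built from. The DataFrame contents, the random walk and the variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 6], "b": [2, 5, 9]})

# axis=0: each row minus the previous row (the first row becomes NaN)
row_diff = df.diff(periods=1, axis=0)
# axis=1: each column minus the previous column (the first column becomes NaN)
col_diff = df.diff(periods=1, axis=1)
print(row_diff)
print(col_diff)

# A random walk is the cumulative sum of white noise, so it is non-stationary.
rng = np.random.default_rng(0)
noise = rng.normal(size=1000)
walk = pd.Series(np.cumsum(noise))

# One round of differencing (d = 1) gives back the underlying noise.
diff_1 = walk.diff(periods=1).dropna()
print(np.allclose(diff_1.to_numpy(), noise[1:]))  # True
```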

 

 

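▶ Building the model

  • Estimate the model parameters with ARIMA(0,1,1) and check the results
# statsmodels.tsa.arima_model was removed in statsmodels 0.13;
# the current ARIMA class lives in statsmodels.tsa.arima.model
from statsmodels.tsa.arima.model import ARIMA

# series: the time series to model (e.g. a pandas Series)
model = ARIMA(series, order=(0, 1, 1), trend='n')  # trend='n': no constant term
model_fit = model.fit()
print(model_fit.summary())

▶ Forecasting

  • The model is fitted without a constant term

  • To predict future values, use the forecast method
    • steps : number of values to predict
    • In current statsmodels, forecast returns only the point forecasts; get_forecast also exposes the standard error and the bounds
# get_forecast returns an object holding the forecast and its uncertainty
fore = model_fit.get_forecast(steps=1)
print(fore.summary_frame())

=> predicted value (mean), stderr (mean_se), lower bound (mean_ci_lower), upper bound (mean_ci_upper)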

 

 

 
