😎 Studying with 징징알파카: first time here?
PyScript: Using Python in HTML (13)
징징알파카 2022. 11. 28. 10:54
<This post was written while studying from itadventure's blog :-)>
https://itadventure.tistory.com/555
파도!(14) - Accuracy goes up with Ridge regression?
🔵 Underfitting
When the training-data score is lower than the test-data score (or both scores are low), the model is 'underfitting'
A good algorithm scores about the same on the training set and the test data
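🔵 Degree
- Linear regression
- A first-degree equation
- There is not just one x value but several x values
- During training, machine learning adjusts the weight on each of those values to produce the sales volume (y)
Supplying squared values as extra data lets the fitted graph take a more precise shape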
# np.float was removed in NumPy 1.24; the builtin float is equivalent
주간매출데이터훈련_넘파이 = 주간매출데이터훈련_넘파이.astype(float)
주간매출데이터훈련_넘파이 = np.column_stack((
    주간매출데이터훈련_넘파이,
    주간매출데이터훈련_넘파이[:, 0] ** 2,
    주간매출데이터훈련_넘파이[:, 1] ** 2,
    주간매출데이터훈련_넘파이[:, 2] ** 2,
    주간매출데이터훈련_넘파이[:, 3] ** 2,
    주간매출데이터훈련_넘파이[:, 4] ** 2,
    주간매출데이터훈련_넘파이[:, 5] ** 2
))
astype(float) converts the NumPy data from strings to numbers
np.column_stack appends columns one at a time to the original NumPy data
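As a small illustration of the two calls above, here is a sketch on a made-up 3x2 array. The values and the names `features` and `expanded` are assumptions for this example, not the avocado data. It shows the string-to-float conversion and one squared column being appended per original column.

```python
import numpy as np

# Hypothetical toy data: 3 rows, 2 feature columns, stored as strings
features = np.array([["1", "2"], ["3", "4"], ["5", "6"]])

# Convert the strings to floating-point numbers
features = features.astype(float)

# Append the square of each original column as a new column
expanded = np.column_stack((
    features,
    features[:, 0] ** 2,
    features[:, 1] ** 2,
))

print(expanded.shape)        # (3, 4)
print(expanded[1].tolist())  # [3.0, 4.0, 9.0, 16.0]
```

The original two columns stay in place on the left, so a model trained on `expanded` sees both x and x squared for every feature.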
🔵 Overfitting
The model follows the training data too faithfully
The test-data score turns out very poor
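The underfitting and overfitting notes above boil down to comparing two scores. A rough sketch of that comparison follows. The function name `diagnose_fit` and the thresholds `gap` and `low` are hypothetical, picked only for illustration, and real projects would choose their own cut-offs.

```python
def diagnose_fit(train_score, test_score, gap=0.1, low=0.5):
    """Rough rule-of-thumb label for a pair of model scores.

    gap and low are hypothetical thresholds chosen for illustration only.
    """
    # Training score below test score, or both scores low: underfitting
    if train_score < test_score or (train_score < low and test_score < low):
        return "underfitting"
    # Training score far above test score: overfitting
    if train_score - test_score > gap:
        return "overfitting"
    return "balanced"


print(diagnose_fit(0.55, 0.70))  # underfitting
print(diagnose_fit(0.98, 0.40))  # overfitting
print(diagnose_fit(0.85, 0.82))  # balanced
```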
🟥 Ridge Regression -> addresses overfitting
from sklearn.linear_model import Ridge
릿지모델 = Ridge(alpha=알파값)  # 알파값: the alpha value to try
릿지모델.fit(훈련용데이터, 훈련용목표)
# score() returns R^2 for regression models
print("Training data accuracy")
print(릿지모델.score(훈련용데이터, 훈련용목표))
print("Test data accuracy")
print(릿지모델.score(테스트데이터, 테스트목표))
훈련용목표예측 = 릿지모델.predict(훈련용데이터)
테스트목표예측 = 릿지모델.predict(테스트데이터)
The alpha value is passed in as a parameter
This alpha value is the option that keeps the model from overfitting
It is common to step through powers of ten such as 0.01, 0.1, 1, 10
Because a person has to tune the strength with this value, it is called a hyperparameter
릿지모델 = Ridge(alpha=0.1)
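To see what turning alpha up actually does, here is a minimal NumPy sketch of the textbook closed-form ridge solution on synthetic data. It is not scikit-learn's code: the helper `ridge_fit`, the random data, and the list of alpha values are all assumptions made for this example. It only illustrates that larger alpha values shrink the learned weights.

```python
import numpy as np


def ridge_fit(X, y, alpha):
    """Closed-form ridge on centered data: w = (X^T X + alpha * I)^-1 X^T y."""
    X_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - X_mean, y - y_mean          # centering leaves the intercept unpenalized
    n_features = X.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    b = y_mean - X_mean @ w                  # recover the intercept
    return w, b


# Synthetic data (hypothetical): 30 samples, 5 features, known true weights
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = X @ np.array([3.0, -2.0, 0.0, 0.0, 1.0]) + rng.normal(scale=0.5, size=30)

norms = []
for alpha in [0.01, 0.1, 1, 10, 100]:
    w, b = ridge_fit(X, y, alpha)
    norms.append(float(np.linalg.norm(w)))
    print(f"alpha={alpha:<6} |w|={norms[-1]:.4f}")
```

The printed weight norm gets smaller as alpha grows. A small alpha leaves the model close to plain linear regression, while a large alpha restrains it, which is why the value has to be tuned by hand.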
🟥 Lasso Regression -> addresses overfitting
Lasso is covered on the next page~
🔵 Code implementation
- index.html
<html>
<head>
<link rel="stylesheet"
href="https://pyscript.net/alpha/pyscript.css" />
<script defer
src="https://pyscript.net/alpha/pyscript.js"></script>
<py-env>
- pandas
- matplotlib
- seaborn
- scikit-learn
- paths :
- ./common.py
</py-env>
</head>
<body>
<link rel="stylesheet" href="pytable.css"/>
<py-script>
import pandas as pd
from pyodide.http import open_url
from common import *
import numpy as np
from datetime import datetime
import time
<!-- Set the number of decimal places when printing NumPy arrays -->
np.set_printoptions(formatter={'float_kind': lambda x: "{0:0.2f}".format(x)})
<!-- Suppress warning messages -->
import warnings
warnings.filterwarnings( 'ignore' )
<!-- Read the csv into a DataFrame with pandas -->
SalesData = pd.read_csv(open_url(
"http://dreamplan7.cafe24.com/pyscript/csv/avocado.csv"
))
<!-- # Keep only 3 fields and rebuild the DataFrame -->
SalesData = SalesData[[
'Date',
'Total Volume',
'AveragePrice'
]]
SalesData.columns = [
'Day',
'Amount',
'AveragePrice'
]
<!-- Group by date (weekly) and sum the sales volume within each group -->
WeekdaysSales_sum = SalesData.fillna(0) \
.groupby('Day', as_index=False)[['Amount']].sum() \
.sort_values(by='Day', ascending=True)
WeekdaysSales_mean = SalesData.fillna(0) \
.groupby('Day', as_index=False)[['AveragePrice']].mean() \
.sort_values(by='Day', ascending=True)
<!-- Merge the 2 DataFrames into one (keyed on the 'Day' column given in on) -->
WeekdaysSalesData = pd.merge(WeekdaysSales_sum, WeekdaysSales_mean, on = 'Day')
<!-- Add the date as a time value -->
WeekdaysSalesData.insert(1, 'Day(timeValue)',
'', True)
for i in WeekdaysSalesData['Day'].index:
    WeekdaysSalesData.loc[i, 'Day(timeValue)'] = time.mktime(
        datetime.strptime(
            WeekdaysSalesData['Day'].loc[i],
            '%Y-%m-%d'
        ).timetuple()
    )
<!-- Add a field holding the sales volume divided by 10000 -->
WeekdaysSalesData.insert(3, 'Amount(10000)',
WeekdaysSalesData['Amount']/10000,
True)
<!-- Split the date into year, month, day (and ISO week) for training -->
WeekdaysSalesData.insert(4, 'year', '', True)
WeekdaysSalesData.insert(5, 'month', '', True)
WeekdaysSalesData.insert(6, 'day', '', True)
WeekdaysSalesData.insert(7, 'week', '', True)
for i in WeekdaysSalesData['Day'].index:
    temp = str(WeekdaysSalesData['Day'].loc[i]).split('-')
    year = int(temp[0])
    month = int(temp[1])
    day = int(temp[2])
    WeekdaysSalesData.loc[i, 'year'] = year
    WeekdaysSalesData.loc[i, 'month'] = month
    WeekdaysSalesData.loc[i, 'day'] = day
    WeekdaysSalesData.loc[i, 'week'] = str(
        datetime(year, month, day).isocalendar()[1]
    )
createElementDiv(
document,
Element,
'output2'
).write(WeekdaysSalesData)
WeekdaysSalesDataTrain_numpy = WeekdaysSalesData[['Day(timeValue)', 'year', 'month', 'day', 'week', 'AveragePrice']].to_numpy()
WeekdaysSalesDataTest_numpy = WeekdaysSalesData['Amount(10000)'].to_numpy()
WeekdaysSalesDataTrain_numpy = WeekdaysSalesDataTrain_numpy.astype(float)
WeekdaysSalesDataTrain_numpy = np.column_stack((
WeekdaysSalesDataTrain_numpy ,
WeekdaysSalesDataTrain_numpy[:,0] ** 2,
WeekdaysSalesDataTrain_numpy[:,1] ** 2,
WeekdaysSalesDataTrain_numpy[:,2] ** 2,
WeekdaysSalesDataTrain_numpy[:,3] ** 2,
WeekdaysSalesDataTrain_numpy[:,4] ** 2,
WeekdaysSalesDataTrain_numpy[:,5] ** 2
))
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = \
train_test_split(
WeekdaysSalesDataTrain_numpy,
WeekdaysSalesDataTest_numpy,
random_state=100,
shuffle=False)
<!-- Scaling means 'stabilizing the data' -->
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaler = scaler.transform(X_train)
X_test_scaler = scaler.transform(X_test)
from sklearn.linear_model import Ridge
ridge_model = Ridge(alpha=0.1)
ridge_model.fit(X_train_scaler, y_train)
<!-- Evaluate how well training went -> score() -->
print("Training model accuracy")
print(ridge_model.score(X_train_scaler, y_train))
print("Test model accuracy")
print(ridge_model.score(X_test_scaler, y_test))
<!-- Predictions based on the scaled data -->
y_train_predict = ridge_model.predict(X_train_scaler)
y_test_predict = ridge_model.predict(X_test_scaler)
import matplotlib.pyplot as plt
import matplotlib as mat
<!-- Graph -->
fig = plt.figure(
figsize=(15, 7)
)
plt.xticks(WeekdaysSalesData['Day(timeValue)'].to_numpy(), WeekdaysSalesData[['Day']].to_numpy()[:,0], rotation=90)
plt.title('Weekdays Avocado SalesAmount')
plt.plot(
X_train[:,0],
y_train,
marker='o',
color='#c14549',
label='Original'
)
plt.plot(
X_train[:,0],
y_train_predict,
marker='d',
color='blue',
label='Train pattern'
)
plt.plot(
X_test[:, 0],
y_test,
marker='o',
color='#c14549'
)
plt.plot(
X_test[:, 0],
y_test_predict,
marker='d',
color='green',
label='Predict pattern'
)
plt.xlabel('Day')
plt.ylabel('Amount(10000)')
plt.legend(
shadow=True
)
ax = plt.gca()
<!-- Grid lines on the x axis only -->
ax.xaxis.grid(True)
<!-- Adjust background color and margins -->
ax.set_facecolor('#e8e7d2')
ax.margins(x=0.01, y=0.02)
<!-- Remove the odd whitespace around the figure -->
fig.tight_layout()
fig
</py-script>
</body>
</html>
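The date handling in index.html above turns each 'YYYY-MM-DD' string into a time value plus year, month, day, and week number. A compact sketch of that step, using only the standard library, might look like this. The helper name `date_features` and the sample date are assumptions for illustration.

```python
import time
from datetime import datetime


def date_features(day_text):
    """Split a 'YYYY-MM-DD' string into the numeric fields used for training."""
    parsed = datetime.strptime(day_text, "%Y-%m-%d")
    return {
        "timeValue": time.mktime(parsed.timetuple()),  # local-time Unix timestamp
        "year": parsed.year,
        "month": parsed.month,
        "day": parsed.day,
        "week": parsed.isocalendar()[1],  # ISO week number
    }


features = date_features("2015-01-04")
print(features["year"], features["month"], features["day"], features["week"])
# 2015 1 4 1
```

The time value depends on the local time zone of the machine running the code, so it is best used for ordering points on the x axis, not compared across machines.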
- common.py
def createElementDiv(document, Element, name):
element = document.createElement('div')
element.id = name
document.body.append(element)
return Element(name)
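One detail of index.html worth spelling out is the scaling step: the scaler is fitted on X_train only, and the same mean and standard deviation are then applied to X_test. Here is a minimal NumPy sketch of that idea on made-up numbers. The arrays are hypothetical, and the arithmetic mirrors what StandardScaler computes (population standard deviation) without calling scikit-learn.

```python
import numpy as np

# Hypothetical toy data with two features on very different scales
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[4.0, 400.0]])

# "fit": learn mean and standard deviation from the training set only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # population std (ddof=0)

# "transform": apply the training statistics to both sets
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std

print(X_train_scaled.mean(axis=0))  # close to 0 for each column
print(X_test_scaled)                # test point lands outside the training range
```

Reusing the training statistics keeps information about the test data out of the fitting process, which is why index.html calls scaler.fit only on X_train.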