First time meeting 징징알파카, the studying alpaca?
PyScript, which lets you use Python in HTML (14)
징징알파카 2022. 11. 28. 13:38
<This post was written while studying, with reference to itadventure's blog :-)>
https://itadventure.tistory.com/557
파도! (15) - Lasso regression, up to a 4th-degree equation
('파도' is short for 파이스크립트 도전기, roughly "PyScript challenge log". The source post continues from https://itadventure.tistory.com/555, part (14), on whether Ridge regression raises the score.)
itadventure.tistory.com
🥑 Lasso Regression
PolynomialFeatures: the polynomial-features module
It generates every power and cross-product term of the input features, up to the chosen degree.
1) Import the module and create the polynomial transformer
from sklearn.preprocessing import PolynomialFeatures
폴리 = PolynomialFeatures(degree=4, include_bias=False)  # 폴리: "poly"
2) Fit it on the training data so that it learns the input layout
폴리.fit(훈련용데이터)  # 훈련용데이터: training data
3) Pass data through the fitted transformer to get the expanded feature matrix
훈련용데이터_가공 = 폴리.transform(훈련용데이터)  # 가공: processed
4) Print the list of generated feature names
print(폴리.get_feature_names_out())
🥑 Lasso Regression vs Ridge Regression
The more polynomial terms there are, the more easily the model overfits the training data.
So the more terms there are, the larger the alpha value (regularization strength) needs to be.
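To make the role of alpha concrete, here is a small sketch on synthetic data. The data, the alpha values, and all variable names are assumptions chosen for illustration. It shows two things: a larger alpha shrinks the Ridge coefficients, and Lasso can set some coefficients exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (assumed): 10 features, but only feature 0 drives y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=100)

ridge_weak = Ridge(alpha=0.1).fit(X, y)    # light regularization
ridge_strong = Ridge(alpha=100).fit(X, y)  # heavy regularization
lasso = Lasso(alpha=1.0).fit(X, y)

weak_norm = np.linalg.norm(ridge_weak.coef_)
strong_norm = np.linalg.norm(ridge_strong.coef_)
lasso_zeros = int(np.sum(lasso.coef_ == 0))

print("Ridge coefficient norm, alpha=0.1:", weak_norm)
print("Ridge coefficient norm, alpha=100:", strong_norm)
print("Lasso coefficients that are exactly 0:", lasso_zeros)
```

Ridge only shrinks the weights toward zero, while Lasso effectively drops terms, which is useful when a degree-4 expansion creates many columns that carry little signal.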
#=====================================
# Ridge model
from sklearn.linear_model import Ridge
릿지모델 = Ridge(alpha=10)
릿지모델.fit(훈련용데이터_가공, 훈련용목표)
# The target is not a category, so accuracy cannot be measured; score() returns R^2 instead
print("Ridge training-set score")
print(릿지모델.score(훈련용데이터_가공, 훈련용목표))
print("Ridge test-set score")
print(릿지모델.score(테스트데이터_가공, 테스트목표))
훈련용목표예측_릿지 = 릿지모델.predict(훈련용데이터_가공)
테스트목표예측_릿지 = 릿지모델.predict(테스트데이터_가공)
#=====================================
# Lasso model
from sklearn.linear_model import Lasso
라쏘모델 = Lasso(alpha=10)
라쏘모델.fit(훈련용데이터_가공, 훈련용목표)
# The target is not a category, so accuracy cannot be measured; score() returns R^2 instead
print("Lasso training-set score")
print(라쏘모델.score(훈련용데이터_가공, 훈련용목표))
print("Lasso test-set score")
print(라쏘모델.score(테스트데이터_가공, 테스트목표))
훈련용목표예측_라쏘 = 라쏘모델.predict(훈련용데이터_가공)
테스트목표예측_라쏘 = 라쏘모델.predict(테스트데이터_가공)
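Since the numbers printed by score() above are not classification accuracy, it may help to see what they are. For scikit-learn regressors, score() returns the coefficient of determination (R^2). The sketch below checks this against `r2_score` on synthetic data; the data and the names `model`, `score`, and `manual` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

# Synthetic regression data (assumed): a linear signal plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

model = Ridge(alpha=1.0).fit(X, y)

score = model.score(X, y)                # what the post prints as the "score"
manual = r2_score(y, model.predict(X))   # R^2 computed explicitly

print(score, manual)  # the two values agree
```

An R^2 of 1.0 means a perfect fit, and the value can go negative on a test set when the model does worse than always predicting the mean.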
🥑 Code implementation
Plot the original data together with the Ridge and Lasso curves
- index.html
<html>
<head>
<title>Polynomial Regression + Lasso Regression</title>
<link rel="stylesheet"
href="https://pyscript.net/alpha/pyscript.css" />
<script defer
src="https://pyscript.net/alpha/pyscript.js"></script>
<py-env>
- pandas
- matplotlib
- seaborn
- scikit-learn
- paths:
  - ./common.py
</py-env>
</head>
<body>
<link rel="stylesheet" href="pytable.css"/>
<py-script>
import pandas as pd
from pyodide.http import open_url
from common import *
import numpy as np
from datetime import datetime
import time
<!-- Set the number of decimal places used when printing NumPy arrays -->
np.set_printoptions(formatter={'float_kind': lambda x: "{0:0.2f}".format(x)})
<!-- Suppress warning messages -->
import warnings
warnings.filterwarnings( 'ignore' )
<!-- Read the CSV into a DataFrame with pandas -->
SalesData = pd.read_csv(open_url(
"http://dreamplan7.cafe24.com/pyscript/csv/avocado.csv"
))
<!-- Keep only 3 fields and rebuild the DataFrame -->
SalesData = SalesData[[
'Date',
'Total Volume',
'AveragePrice'
]]
SalesData.columns = [
'Day',
'Amount',
'AveragePrice'
]
<!-- Group by date (weekly) and sum the sales volume within each group -->
WeekdaysSales_sum = SalesData.fillna(0) \
.groupby('Day', as_index=False)[['Amount']].sum() \
.sort_values(by='Day', ascending=True)
WeekdaysSales_mean = SalesData.fillna(0) \
.groupby('Day', as_index=False)[['AveragePrice']].mean() \
.sort_values(by='Day', ascending=True)
<!-- Merge the 2 DataFrames into one, keyed on the 'Day' column given in on -->
WeekdaysSalesData = pd.merge(WeekdaysSales_sum, WeekdaysSales_mean, on = 'Day')
<!-- Add the date as a numeric time value -->
WeekdaysSalesData.insert(1, 'Day(timeValue)',
'', True)
for i in WeekdaysSalesData['Day'].index:
    WeekdaysSalesData.loc[i, 'Day(timeValue)'] = time.mktime(
        datetime.strptime(
            WeekdaysSalesData.loc[i, 'Day'],
            '%Y-%m-%d'
        ).timetuple()
    )
<!-- Add a field holding the sales volume divided by 10000 -->
WeekdaysSalesData.insert(3, 'Amount(10000)',
WeekdaysSalesData['Amount']/10000,
True)
<!-- Split the date into year, month, day, and ISO week for training -->
WeekdaysSalesData.insert(4, 'year', '', True)
WeekdaysSalesData.insert(5, 'month', '', True)
WeekdaysSalesData.insert(6, 'day', '', True)
WeekdaysSalesData.insert(7, 'week', '', True)
for i in WeekdaysSalesData['Day'].index:
    temp = str(WeekdaysSalesData.loc[i, 'Day']).split('-')
    year = int(temp[0])
    month = int(temp[1])
    day = int(temp[2])
    WeekdaysSalesData.loc[i, 'year'] = year
    WeekdaysSalesData.loc[i, 'month'] = month
    WeekdaysSalesData.loc[i, 'day'] = day
    WeekdaysSalesData.loc[i, 'week'] = str(
        datetime(year, month, day).isocalendar()[1]
    )
createElementDiv(
document,
Element,
'output2'
).write(WeekdaysSalesData)
WeekdaysSalesDataTrain_numpy = WeekdaysSalesData[['Day(timeValue)', 'year', 'month', 'day', 'week', 'AveragePrice']].to_numpy()
WeekdaysSalesDataTest_numpy = WeekdaysSalesData['Amount(10000)'].to_numpy()
WeekdaysSalesDataDay_numpy = WeekdaysSalesData['Day'].to_numpy()
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = \
train_test_split(
WeekdaysSalesDataTrain_numpy,
WeekdaysSalesDataTest_numpy,
random_state=100,
shuffle=False)
<!-- PolynomialFeatures: the polynomial-features module -->
<!-- Generates every power and cross-product term of the features -->
from sklearn.preprocessing import PolynomialFeatures
polynomial = PolynomialFeatures(degree=4, include_bias=False) # drop the bias (intercept) column
polynomial.fit(X_train) # learn how to expand the features into polynomial terms
train_polynomial_added = polynomial.transform(X_train) # polynomial-transform the training set with the fitted parameters
test_polynomial_added = polynomial.transform(X_test) # transform the test set too, reusing the transformer fitted on the training set
<!-- Scaling stabilizes the data: each feature is standardized to zero mean and unit variance -->
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(train_polynomial_added)
train_polynomial_added = scaler.transform(train_polynomial_added)
test_polynomial_added = scaler.transform(test_polynomial_added)
<!-- ===================================== -->
<!-- Ridge model -->
from sklearn.linear_model import Ridge
ridge_model = Ridge(alpha=0.1)
ridge_model.fit(train_polynomial_added, y_train)
<!-- Evaluate the fit with score(), which returns R^2 for regressors -->
print("Ridge training-set score")
print(ridge_model.score(train_polynomial_added, y_train))
print("Ridge test-set score")
print(ridge_model.score(test_polynomial_added, y_test))
<!-- Predictions based on the scaled data -->
y_train_ridge_predict = ridge_model.predict(train_polynomial_added)
y_test_ridge_predict = ridge_model.predict(test_polynomial_added)
<!-- ===================================== -->
<!-- Lasso model -->
from sklearn.linear_model import Lasso
lasso_model = Lasso(alpha=0.1)
lasso_model.fit(train_polynomial_added, y_train)
<!-- Evaluate the fit with score(), which returns R^2 for regressors -->
print("Lasso training-set score")
print(lasso_model.score(train_polynomial_added, y_train))
print("Lasso test-set score")
print(lasso_model.score(test_polynomial_added, y_test))
<!-- Predictions based on the scaled data -->
y_train_lasso_predict = lasso_model.predict(train_polynomial_added)
y_test_lasso_predict = lasso_model.predict(test_polynomial_added)
import matplotlib.pyplot as plt
import matplotlib as mat
<!-- Graph -->
fig = plt.figure(
figsize=(15, 7)
)
plt.xticks(
WeekdaysSalesDataTrain_numpy[:, 0],
WeekdaysSalesDataDay_numpy,
rotation=90)
plt.title('Weekdays Avocado SalesAmount (Ridge, Lasso)')
line_alpha=0.5
<!-- Original data -->
plt.plot(
X_train[:,0],
y_train,
marker='o',
color='gray',
label='Original',
alpha = line_alpha
)
plt.plot(
X_test[:,0],
y_test,
marker='o',
color='gray',
alpha = line_alpha
)
<!-- Ridge (training fit) -->
plt.plot(
X_train[:,0],
y_train_ridge_predict,
marker='d',
color='green',
label='Train pattern (Ridge)',
alpha = line_alpha
)
<!-- Lasso (training fit) -->
plt.plot(
X_train[:,0],
y_train_lasso_predict,
marker='d',
color='red',
label='Train pattern (Lasso)',
alpha = line_alpha
)
<!-- Ridge prediction -->
plt.plot(
X_test[:,0],
y_test_ridge_predict,
marker='*',
color='blue',
label='Predict pattern (Ridge)',
alpha = line_alpha
)
<!-- Lasso prediction -->
plt.plot(
X_test[:,0],
y_test_lasso_predict,
marker='*',
color='red',
label='Predict pattern (Lasso)',
alpha = line_alpha
)
plt.xlabel('Day')
plt.ylabel('Amount(10000)')
plt.legend(
shadow=True
)
ax = plt.gca()
<!-- Show grid lines for the x axis only -->
ax.xaxis.grid(True)
<!-- Adjust the background color and margins -->
ax.set_facecolor('#e8e7d2')
ax.margins(x=0.01, y=0.02)
<!-- Remove the stray padding around the figure -->
fig.tight_layout()
fig
</py-script>
</body>
</html>
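The date handling in index.html fills the 'year', 'month', 'day', 'week', and 'Day(timeValue)' columns one row at a time. As a hedged alternative, the same columns can be derived with vectorized pandas calls. The three sample dates below are made up. Note that this sketch computes UTC epoch seconds and stores the week as an integer, whereas the page uses `time.mktime` (local time) and stores the week as a string.

```python
import pandas as pd

# Sample rows (assumed); the real page reads them from avocado.csv.
df = pd.DataFrame({"Day": ["2015-01-04", "2015-01-11", "2015-12-27"]})

# Parse the strings once, then derive every date feature from the result.
dates = pd.to_datetime(df["Day"], format="%Y-%m-%d")

df["year"] = dates.dt.year
df["month"] = dates.dt.month
df["day"] = dates.dt.day
df["week"] = dates.dt.isocalendar().week.astype(int)  # ISO week number

# Seconds since 1970-01-01 UTC (differs from time.mktime by the local UTC offset).
df["Day(timeValue)"] = (dates - pd.Timestamp("1970-01-01")) // pd.Timedelta(seconds=1)

print(df)
```

Avoiding the row loop also sidesteps chained assignments such as `df['year'].loc[i] = ...`, which pandas warns about.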
- common.py
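def createElementDiv(document, Element, name):
    # Create a <div> with the given id, append it to the page body,
    # and return a PyScript Element wrapper for it.
    element = document.createElement('div')
    element.id = name
    document.body.append(element)
    return Element(name)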
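The polynomial expansion, scaling, and model steps in index.html are fitted one after another by hand. One possible way to bundle them, sketched here on synthetic data, is scikit-learn's `make_pipeline`. This is a swapped-in convenience and not what the original page does; the data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data (assumed): 80 samples, 2 features, a quadratic signal plus noise.
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(80, 2))
y = 1.5 * X[:, 0] ** 2 - X[:, 1] + rng.normal(scale=0.05, size=80)

# shuffle=False keeps row order, as in the page: the last 25% becomes the test set.
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False)

# Each step is fitted on the training data only, then reused for the test data.
model = make_pipeline(
    PolynomialFeatures(degree=4, include_bias=False),
    StandardScaler(),
    Lasso(alpha=0.1),
)
model.fit(X_train, y_train)

test_pred = model.predict(X_test)
print("train R^2:", model.score(X_train, y_train))
print("test R^2:", model.score(X_test, y_test))
```

The pipeline guarantees that the test set goes through the transformer and scaler fitted on the training set, which the page does manually with `polynomial.transform(X_test)` and `scaler.transform(...)`.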