😎 공부하는 징징알파카는 처음이지? (blog name)

PyScript, which lets you use Python in HTML (11)


징징알파카 2022. 11. 25. 16:56

<This post was written as study notes, with reference to itadventure's blog :-)>

https://itadventure.tistory.com/553

 

ํŒŒ๋„!(12) - ๋ฌด์‹  ๋Ÿฌ๋‹? ๋จธ์‹ ๋Ÿฌ๋‹! - ๋ฆฌ๋‹ˆ์–ด ๋ฆฌ๊ทธ๋ ˆ์…˜ ( LinearRegression )

'ํŒŒ๋„'๋Š” ํŒŒ์ด์Šคํฌ๋ฆฝํŠธ ๋„์ „๊ธฐ์˜ ์ค„์ž„๋ง์ž…๋‹ˆ๋‹ค. ์ง€๋‚œ ๊ฒŒ์‹œ๊ธ€๊นŒ์ง€๋Š” ํŒŒ์ด์Šคํฌ๋ฆฝํŠธ์—์„œ csv ๋กœ ๋œ ๋ฐ์ดํ„ฐ๋ฅผ ๋ถˆ๋Ÿฌ์™€ ๊ทธ๋ž˜ํ”„๋กœ ํ‘œํ˜„ํ•˜๋Š” ๊ฒƒ์ด ์ค‘์ ์ ์ธ ๋‚ด์šฉ์ด์—ˆ์ง€์š”. https://itadventure.tistory.com/552 ํŒŒ

itadventure.tistory.com

 

 

 

 

 

 

🎄 Handling machine learning in PyScript

● Machine learning means a machine (Machine) that learns (Learning)

The scikit-learn library makes a range of AI-related techniques freely available in Python. Add it to the PyScript environment:

<py-env>
  - pandas
  - matplotlib
  - seaborn
  - scikit-learn
</py-env>

 

โ— ๋ฆฌ๋‹ˆ์–ด ๋ฆฌ๊ทธ๋ ˆ์…˜(Linear Regression)

 '์„ ํ˜•ํšŒ๊ท€๋ชจ๋ธ'์„ ์„ ์–ธ

from sklearn.linear_model import LinearRegression
선형회귀모델 = LinearRegression()  # 선형회귀모델 = "linear regression model"

 

โ— ๊ธฐ๊ณ„ ํ•™์Šต ๋ฐฉ๋ฒ•

๋จธ์‹ ๋Ÿฌ๋‹์ด ์˜ˆ์ธกํ•  ๊ฒฐ๊ณผ๋ฅผ ์ •ํ•จ
๋จธ์‹ ๋Ÿฌ๋‹์ด ๋งค์ถœ๋Ÿ‰์„ ์˜ˆ์ธกํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ

 

๋ฐ์ดํ„ฐ๋ฅผ ๋‚˜๋ˆ„์–ด์„œ ์ œ๊ณต

ํ›ˆ๋ จ์šฉ๋ฐ์ดํ„ฐ : ์—ฐ๋„, ์›”, ์ผ, ์ฃผ(1๋…„์ค‘ ๋ช‡๋ฒˆ์งธ ์ฃผ์ธ์ง€)
ํ›ˆ๋ จ์šฉ๋ชฉํ‘œ : ๋งค์ถœ๋Ÿ‰
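
The four training fields can all be derived from a single date string. Here is a minimal sketch using only the standard library; the helper name `date_features` is made up for illustration and does not appear in the original post.

```python
from datetime import datetime

def date_features(day):
    """Split a 'YYYY-MM-DD' string into (year, month, day, ISO week number)."""
    parsed = datetime.strptime(day, "%Y-%m-%d")
    # isocalendar() returns (ISO year, ISO week, ISO weekday); index 1 is the week.
    return parsed.year, parsed.month, parsed.day, parsed.isocalendar()[1]

print(date_features("2015-01-04"))  # (2015, 1, 4, 1)
print(date_features("2016-01-03"))  # (2016, 1, 3, 53)
```

Note the second result: under ISO numbering, the first days of January can still belong to week 53 of the previous year, so the week field is not simply "days since January 1 divided by 7".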

 

Train

선형회귀모델.fit(훈련용데이터, 훈련용목표)

 

๋ชจ๋ธ์˜ ์˜ˆ์ธก

์„ ํ˜•ํšŒ๊ท€ ๋ชจ๋ธ์€ ๋‹ค๋ฅธ ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ๊ฒฐ๊ณผ๋ฅผ ์˜ˆ์ธก

์—ฐ๋„, ์›”, ์ผ, ์ฃผ ๋‹จ์œ„์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ…Œ์ŠคํŠธ๋ฐ์ดํ„ฐ๋กœ ์ œ๊ณตํ•˜๋ฉด ๋งค์ถœ๋Ÿ‰ ๊ฒฐ๊ณผ๋ฅผ ์˜ˆ์ธก

테스트목표예측 = 선형회귀모델.predict(테스트데이터)

Visualize the result as a graph
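
To make `fit` and `predict` a little less of a black box, here is a minimal pure-Python sketch of ordinary least squares with a single feature. This is not scikit-learn's implementation: the helpers `fit_line` and `predict_line` and the toy numbers are assumptions made up for illustration. `LinearRegression` solves the same kind of least-squares problem, but for several features at once.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimising the squared error of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_line(model, xs):
    """Apply a fitted (slope, intercept) pair to new x values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]

# Hypothetical toy data: week number -> sales volume (in units of 10,000)
train_weeks = [1, 2, 3, 4]
train_sales = [10.0, 12.0, 14.0, 16.0]

model = fit_line(train_weeks, train_sales)
print(model)                        # (2.0, 8.0)
print(predict_line(model, [5, 6]))  # [18.0, 20.0]
```

The toy data lies exactly on a line, so the predictions continue it perfectly. Real sales data will not, and the fitted line is then only the closest straight-line approximation.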

 

 

🎄 Sales volume prediction graph

● Code implementation

  • index.html
<html> 
    <head> 
      <link rel="stylesheet" 
        href="https://pyscript.net/alpha/pyscript.css" /> 
      <script defer 
        src="https://pyscript.net/alpha/pyscript.js"></script> 
<py-env>
  - pandas
  - matplotlib
  - seaborn
  - scikit-learn
  - paths :
    - ./common.py
</py-env>
    </head>
  <body> 
    <link rel="stylesheet" href="pytable.css"/>
    <py-script>
    import pandas as pd
    from pyodide.http import open_url
    from common import *
    import numpy as np

    import time
    from datetime import datetime

    # Suppress warning messages
    import warnings
    warnings.filterwarnings( 'ignore' )

    # Read the CSV into a DataFrame with pandas
    SalesData = pd.read_csv(open_url(
      "http://dreamplan7.cafe24.com/pyscript/csv/avocado.csv"
    ))      

    # Keep only two fields and rebuild the DataFrame
    SalesData = SalesData[[
      'Date', 
      'Total Volume'
    ]]   

    SalesData.columns = [
      'Day', 
      'Amount'
    ]

    # Group the weekly sales volume by date
    WeekdaysSalesData = SalesData.fillna(0) \
      .groupby('Day', as_index=False)[['Amount']] \
      .sum() \
      .sort_values(
        by='Day', 
        ascending=True
      )

    # Add a numeric time value for each date
    WeekdaysSalesData.insert(1, 'Day(timeValue)',
      '',
      True)
    
    for i in WeekdaysSalesData['Day'].index:
        # .loc[row, column] writes to the DataFrame itself, not to a temporary copy
        WeekdaysSalesData.loc[i, 'Day(timeValue)'] = time.mktime(
            datetime.strptime(
                WeekdaysSalesData.loc[i, 'Day'], 
                '%Y-%m-%d'
            ).timetuple()
        )

    # Add a field holding the sales volume divided by 10000
    WeekdaysSalesData.insert(2, 'Amount(10000)', 
    WeekdaysSalesData['Amount']/10000, 
      True)

    # Split the date into year, month, day, and week for training
    WeekdaysSalesData.insert(4, 'year', '', True)
    WeekdaysSalesData.insert(5, 'month', '', True)
    WeekdaysSalesData.insert(6, 'day', '', True)
    WeekdaysSalesData.insert(7, 'week', '', True)

    for i in WeekdaysSalesData['Day'].index:
      temp = str(WeekdaysSalesData.loc[i, 'Day']).split('-')
      year = int(temp[0])
      month = int(temp[1])
      day = int(temp[2])
      WeekdaysSalesData.loc[i, 'year'] = year
      WeekdaysSalesData.loc[i, 'month'] = month
      WeekdaysSalesData.loc[i, 'day'] = day
      # isocalendar()[1] is the ISO week number, already an int
      WeekdaysSalesData.loc[i, 'week'] = datetime(year, month, day).isocalendar()[1]

    createElementDiv(
      document, 
      Element, 
      'output2'
    ).write(WeekdaysSalesData)

    WeekdaysSalesDataTrain_numpy = WeekdaysSalesData[['Day(timeValue)', 'year', 'month', 'day', 'week']].to_numpy()
    WeekdaysSalesDataTest_numpy = WeekdaysSalesData['Amount(10000)'].to_numpy()

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = \
      train_test_split(
        WeekdaysSalesDataTrain_numpy, 
        WeekdaysSalesDataTest_numpy,
        random_state=100,
        shuffle=False)

    # Linear regression algorithm
    # Training finds the best-fitting line for the data
    from sklearn.linear_model import LinearRegression
    lr = LinearRegression()
    lr.fit(X_train, y_train)

    y_train_predict = lr.predict(X_train)
    y_test_predict = lr.predict(X_test)

    import matplotlib.pyplot as plt
    import matplotlib as mat

    # Graph
    fig = plt.figure(
      figsize=(15, 7)
    )
    plt.xticks(WeekdaysSalesData['Day(timeValue)'].to_numpy(), WeekdaysSalesData[['Day']].to_numpy()[:,0], rotation=90)

    plt.title('Weekdays Avocado SalesAmount')

    plt.plot(        
        X_train[:,0],
        y_train,
        marker='o',
        color='#c14549',
        label='Original'
    )
    plt.plot(        
        X_train[:,0],
        y_train_predict,
        marker='d',
        color='blue',
        label='Train pattern'
    )

    plt.plot(        
        X_test[:, 0],
        y_test,
        marker='o',
        color='#c14549'
    )

    plt.plot(        
        X_test[:, 0],
        y_test_predict,
        marker='d',
        color='green',
        label='Predict pattern'
    )

    plt.xlabel('Day')
    plt.ylabel('Amount(10000)')

    plt.legend(
      shadow=True
    )

    ax = plt.gca()
    # Grid lines on the x-axis only
    ax.xaxis.grid(True)

    # Adjust the background color and margins
    ax.set_facecolor('#e8e7d2')
    ax.margins(x=0.01, y=0.02)

    # Remove the stray whitespace around the figure
    fig.tight_layout() 
    fig

</py-script> 
  </body> 
</html>
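
One thing to keep in mind about the `Day(timeValue)` column: `time.mktime` interprets the date in the local time zone of whatever environment runs the code, so the number can differ from one machine to another. As a hedged alternative sketch (not what the listing above does), `calendar.timegm` from the standard library treats the date as UTC and gives the same value everywhere. The helper name `to_utc_timestamp` is hypothetical.

```python
import calendar
from datetime import datetime

def to_utc_timestamp(day):
    """Convert 'YYYY-MM-DD' to seconds since the Unix epoch, treating the date as UTC."""
    parsed = datetime.strptime(day, "%Y-%m-%d")
    return calendar.timegm(parsed.timetuple())

first = to_utc_timestamp("2015-01-04")
second = to_utc_timestamp("2015-01-11")
print(first)           # 1420329600
print(second - first)  # 604800 seconds, i.e. exactly one week
```

For the graph itself the difference hardly matters, since the time value is only used to position points along the x-axis.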

 

  • common.py
def createElementDiv(document, Element, name):
    element = document.createElement('div')
    element.id = name
    document.body.append(element)
    return Element(name)
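
Because index.html calls `train_test_split` with `shuffle=False`, the rows stay in date order: the model trains on the earlier dates and is tested on the most recent ones. The sketch below imitates that behaviour in plain Python, assuming scikit-learn's default test share of 25% with the test count rounded up. `ordered_split` is a hypothetical helper, not a scikit-learn function.

```python
import math

def ordered_split(rows, test_fraction=0.25):
    """Split rows without shuffling: the first part trains, the last part tests."""
    n_test = math.ceil(len(rows) * test_fraction)
    n_train = len(rows) - n_test
    return rows[:n_train], rows[n_train:]

weeks = list(range(1, 11))  # ten hypothetical weekly observations, oldest first
train, test = ordered_split(weeks)
print(train)  # [1, 2, 3, 4, 5, 6, 7]
print(test)   # [8, 9, 10]
```

This is why the green "Predict pattern" line sits at the right-hand end of the graph. With `shuffle=False`, the `random_state=100` argument in the listing has no effect, since there is no shuffling to seed.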

 

 

 

 

๋‚˜๋Š” ํ•œ๊ธ€ ์•ˆํ•ด์„œ

์˜์–ด๋กœ ๋ฐ”๊พธ๋Š๋ผ ํž˜๋“ค์–ด๋”ฐ~~~ ใ…‹ใ…‹ใ…‹
