
[Audio] Speech Recognition: Music Genre Classification & Recommendation Algorithm

징징알파카 2022. 1. 31. 13:59

Written 2022-01-31

<This post was written while studying, using the 조녁 코딩일기 (Jonyeok's coding diary) blog as a reference>

[Librosa] Speech recognition basics, music classification & recommendation algorithm
https://jonhyuk0922.tistory.com/114

Understanding the basics of speech recognition
https://newsight.tistory.com/294


1. Sound

: Sound exists only when the air undergoes periodic vibration (back-and-forth compression and expansion).

: Sound is, at its core, wave energy travelling through air as its medium.

: A wave with a low vibration frequency is a low-pitched sound carrying less energy.

: A wave with a high vibration frequency is a high-pitched sound carrying more energy.


2. Wav files

: An array of real numbers recording the height of the sound wave (a float value) at regular time steps determined by the sampling rate.

: Stored using PCM (Pulse-Code Modulation).

: In other words, it is just numbers.

 

- y : the strength (amplitude) of the vibration, listed in time order

- Sampling rate : the number of samples per second, expressed in Hz or kHz

- Mono vs. Stereo

: Stereo stores the sound captured by two microphones (left and right, i.e. two mono signals) recorded together in one file.
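As a quick check of the mono/stereo distinction, librosa can load a file without down-mixing (a sketch of my own; the file name is a placeholder, and the GTZAN files used later are mono, so a stereo recording of your own is needed to actually see two channels):

import librosa

path = 'my_stereo_recording.wav'               # hypothetical stereo file -- replace with your own
y_mono, sr = librosa.load(path)                # default: down-mix to mono -> shape (n,)
y_stereo, sr = librosa.load(path, mono=False)  # keep channels -> shape (2, n) for stereo input

print(y_mono.shape, y_stereo.shape)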

 

- Sampling rate

: the x-axis resolution of the data points

: how many data points are recorded per second

 

- Bit depth

: the y-axis resolution of the data points

: the resolution with which the height (amplitude) of each point can be distinguished

 

- Bit rate

: Sampling rate * Bit depth

: the number of bits transmitted per second
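As a worked example of the formula (using standard CD-quality numbers, which are not from the original post; for multi-channel audio the channel count is multiplied in as well):

sampling_rate = 44_100          # samples per second (CD quality)
bit_depth = 16                  # bits per sample
channels = 2                    # stereo

bit_rate = sampling_rate * bit_depth * channels
print(bit_rate)                 # 1,411,200 bits per second (about 1,411 kbps)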


3. ์†Œ๋ฆฌ ํŒŒ์ผ ๋ถ„์„ ( with kaggle )

  • 1) ๋ฐ์ดํ„ฐ์…‹ ๋กœ๋“œ
import librosa

# librosa.load() : ์˜ค๋””์˜ค ํŒŒ์ผ์„ ๋กœ๋“œ
y , sr = librosa.load('Data/genres_original/reggae/reggae.00036.wav') 

print(y)
print(len(y))
print(y.shape)
print('Sampling rate (Hz): %d' %sr)
print('Audio length (seconds): %.2f' % (len(y) / sr)) 
#์Œ์•…์˜ ๊ธธ์ด(์ดˆ) = ์ŒํŒŒ์˜ ๊ธธ์ด/Sampling rate

 

 

  • 2) Listen to the audio
import IPython.display as ipd
ipd.Audio(y, rate=sr)

- In VS Code the audio player does not work for me..

- Running the same cell in Colab, it plays fine!

 

 

  • 3) Plot the waveform
  • 2D plot
import matplotlib.pyplot as plt
import librosa.display

plt.figure(figsize=(16,6))
# note: in librosa >= 0.10 this function was replaced by librosa.display.waveshow
librosa.display.waveplot(y=y, sr=sr)
plt.show()

 

 

  • Fourier Transform

: converts time-domain data into the frequency domain

: time domain -> frequency domain

 

In the spectrogram plots further below:

- y-axis : frequency (log scale)

- color axis : decibels (amplitude)

import numpy as np

# n_fft : window size
# how long a chunk of the signal is analyzed at a time => the window
Fourier = np.abs(librosa.stft(y, n_fft=2048, hop_length=512)) 

print(Fourier.shape)

plt.figure(figsize=(16,6))
# plotting the (1025, n_frames) magnitude matrix draws one line per time frame over the frequency bins
plt.plot(Fourier)
plt.show()
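As a small sanity check of the time -> frequency idea (my own illustrative example, not part of the original notebook): the FFT of a pure 440 Hz tone should show its peak at 440 Hz.

# FFT of a synthetic 440 Hz sine wave
sr_demo = 22050
t_demo = np.arange(0, 1.0, 1.0 / sr_demo)
tone = np.sin(2 * np.pi * 440.0 * t_demo)

spectrum = np.abs(np.fft.rfft(tone))                        # magnitude spectrum
freqs = np.fft.rfftfreq(len(tone), d=1.0 / sr_demo)         # frequency of each bin

print('Peak frequency: %.1f Hz' % freqs[spectrum.argmax()]) # ~440.0 Hz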


  • Spectrogram

: a plot of the signal's frequency spectrum as it changes over time

# convert amplitude -> dB (decibels)
DB = librosa.amplitude_to_db(Fourier, ref=np.max)

plt.figure(figsize=(16,6))
librosa.display.specshow(DB, sr=sr, hop_length=512, x_axis='time', y_axis='log')
plt.colorbar()
plt.show()


  • Mel Spectrogram

: a spectrogram whose y-axis has been converted to the Mel scale

Mel = librosa.feature.melspectrogram(y=y, sr=sr)
# melspectrogram returns a power spectrogram, so power_to_db is the appropriate dB conversion
Mel_DB = librosa.power_to_db(Mel, ref=np.max)

plt.figure(figsize=(16,6))
librosa.display.specshow(Mel_DB, sr=sr, hop_length=512, x_axis='time', y_axis='log')
plt.colorbar()
plt.show()


  • 4) Comparing reggae vs. classical
y, sr = librosa.load('Data/genres_original/classical/classical.00036.wav')
y, _ = librosa.effects.trim(y)   # trim leading/trailing silence

S = librosa.feature.melspectrogram(y=y, sr=sr)
S_DB = librosa.power_to_db(S, ref=np.max)

plt.figure(figsize=(16,6))
librosa.display.specshow(S_DB, sr=sr, hop_length=512, x_axis='time', y_axis='log')
plt.colorbar()
plt.show()

 

 

 

  • 5) Audio Feature Extraction
  • Tempo (BPM)
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
print(tempo)

 

  • Zero Crossing Rate

: the rate at which the waveform changes sign, from positive to negative or from negative to positive

# count how many times the signal crosses the zero line
zero_crossings = librosa.zero_crossings(y, pad=False)

print(zero_crossings)
print(sum(zero_crossings)) # number of negative <-> positive transitions

: the zero-crossing count is simply the number of times the signal crosses zero

n0 = 9000
n1 = 9080

plt.figure(figsize=(16,6))
plt.plot(y[n0:n1])
plt.grid()
plt.show()

: counting the places where the curve crosses zero in the plot above, there look to be about 11 crossings

 

# zero crossings between samples n0 and n1
zero_crossings = librosa.zero_crossings(y[n0:n1], pad=False)
print(sum(zero_crossings))


  • 6) Feature extraction
  • 1) Harmonic and Percussive Components

- Percussives: impact-like components that carry the rhythm and emotion

- Harmonics: components the human ear does not pick apart individually, which give the music its tonal color

y_harm, y_perc = librosa.effects.hpss(y)   # harmonic-percussive source separation

plt.figure(figsize=(16,6))
plt.plot(y_harm, color='b')
plt.plot(y_perc, color='r')
plt.show()


  • 2) Spectral Centroid

: with the sound expressed in the frequency domain, the weighted mean of the frequencies, i.e. where the "center of mass" of the spectrum lies

spectral_centroids = librosa.feature.spectral_centroid(y=y, sr=sr)[0]

# computing the time variable for visualization
frames = range(len(spectral_centroids))

# converts frame counts to time (seconds)
t = librosa.frames_to_time(frames)

import sklearn.preprocessing

def normalize(x, axis=0):
  # minmax_scale() : rescales values so the minimum and maximum map to 0 and 1
  return sklearn.preprocessing.minmax_scale(x, axis=axis) 

plt.figure(figsize=(16,6))
librosa.display.waveplot(y, sr=sr, alpha=0.5, color='b')
plt.plot(t, normalize(spectral_centroids), color='r')
plt.show()

 

 

  • 3) Spectral Rolloff

: the frequency below which a given fraction (85% by default) of the total spectral energy is contained

: a rough measure of the shape of the signal's spectrum

spectral_rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)[0]

plt.figure(figsize=(16,6))
librosa.display.waveplot(y, sr=sr, alpha=0.5, color='b')
plt.plot(t, normalize(spectral_rolloff), color='r')
plt.show()


  • 4) Mel-Frequency Cepstral Coefficients (MFCCs)

: MFCCs summarize the overall shape of the spectral envelope with a small set of features (roughly 10-20 coefficients)

They extract audio information in a way that reflects the structure of human hearing.

1.  Split the audio signal into regular frames and apply the Fourier transform to obtain a spectrogram.
2.  Apply a Mel-scale filter bank to the power spectrogram (the squared spectrum) to reduce its dimensionality.
3.  Apply cepstral analysis to obtain the MFCCs.

Source: https://tech.kakaoenterprise.com//66

mfccs = librosa.feature.mfcc(y=y, sr=sr)
mfccs = normalize(mfccs, axis=1)

print('mean : %.2f' % mfccs.mean())
print('var : %.2f' % mfccs.var())

plt.figure(figsize=(16,6))
librosa.display.specshow(mfccs, sr=sr, x_axis='time')
plt.show()
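To connect the three steps above to code, here is a rough sketch using librosa's building blocks (my own illustration; librosa.feature.mfcc essentially wraps this mel power spectrogram -> dB -> DCT chain, so exact values may differ slightly with version and parameters):

import scipy.fftpack

# 1) frame the signal, take the STFT, and square it to get the power spectrogram
power_spec = np.abs(librosa.stft(y, n_fft=2048, hop_length=512)) ** 2

# 2) reduce dimensionality with a Mel-scale filter bank
mel_basis = librosa.filters.mel(sr=sr, n_fft=2048, n_mels=128)
mel_power = mel_basis @ power_spec

# 3) cepstral analysis: log scale (dB) followed by a DCT, keeping the first 20 coefficients
log_mel = librosa.power_to_db(mel_power)
mfccs_manual = scipy.fftpack.dct(log_mel, axis=0, type=2, norm='ortho')[:20]

print(mfccs_manual.shape)   # (20, number of frames)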

 

 

  • Chroma Frequencies

: chroma features build on the fact that human hearing perceives two pitches an octave apart as similar

: chroma features are an interesting and powerful representation of music

chromagram = librosa.feature.chroma_stft(y=y, sr=sr, hop_length=512)

plt.figure(figsize=(16,6))
librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', hop_length=512)
plt.show()


4. Music genre classification ( with Kaggle data )

  • 1) Load the dataset
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn import svm
from sklearn.svm import SVC
from sklearn.linear_model import SGDClassifier
from sklearn.linear_model import LogisticRegression

from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler

from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
from sklearn.metrics import roc_curve, auc

from sklearn.model_selection import cross_val_score
from sklearn.model_selection import cross_val_predict
from sklearn.model_selection import GridSearchCV

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set_style('whitegrid')  # seaborn 'whitegrid' style: white background with grid lines
df = pd.read_csv('Data/features_3_sec.csv')

df.head()

 

 

 

  • 2) ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ
X = df.drop(columns=['filename','length','label']) # ํ•„์š” ์—†๋Š”๊ฒƒ!
y = df['label'] #์žฅ๋ฅด๋ช…

scaler = MinMaxScaler()   # scale 0~1 ์กฐ์ •
np_scaled = scaler.fit_transform(X)

X = pd.DataFrame(np_scaled, columns=X.columns)

X.head()

 

 

 

 

  • 3) Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

 

 

 

  • 4) ๋ชจ๋ธ ๊ตฌ์ถ•
  • xgboost


from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

xgb = XGBClassifier(n_estimators=100, learning_rate=0.05) # 100 trees, learning rate 0.05

xgb.fit(X_train, y_train) # train

print("xgb Test Accuracy : {}%".format(round(xgb.score(X_test, y_test) * 100, 2)))

 

 

  • Logistic Regression
lr = LogisticRegression(solver = "lbfgs")
lr.fit(X_train, y_train)

print("lr Test Accuracy : {}%".format(round(lr.score(X_test, y_test) * 100, 2)))

 

 

  • RandomForest
rf = RandomForestClassifier(n_estimators = 100, random_state = 42)
rf.fit(X_train, y_train)

print("rf Test Accuracy : {}%".format(round(rf.score(X_test, y_test) * 100, 2)))

 

 

 

  • Decision Tree
tree = DecisionTreeClassifier()
params = {
    'max_depth' : [6, 8, 10, 12, 16, 20, 24],
    'min_samples_split' : [16, 24]
}

grid_dt = GridSearchCV(tree, param_grid=params, scoring='accuracy', cv=5, verbose=1)
grid_dt.fit(X_train, y_train)

print('์ตœ์ƒ์˜ ๊ต์ฐจ๊ฒ€์ฆ ์ •ํ™•๋„ {:.2f}'.format(grid_dt.best_score_))
print("rf Test Accuarcy : {}%".format(round(grid_dt.score(X_test, y_test) * 100, 2)))
print('์ตœ์ ์˜ ๋งค๊ฐœ๋ณ€์ˆ˜ : {}'.format(grid_dt.best_params_))


  • 5) Confusion Matrix
y_preds = rf.predict(X_test) # predictions on the test set

cm = confusion_matrix(y_test, y_preds)

plt.figure(figsize=(16,9))
sns.heatmap(
    cm,
    annot=True,
    fmt='d',   # show raw counts instead of scientific notation
    xticklabels=["blues","classical","country","disco","hiphop","jazz","metal","pop","reggae","rock"],
    yticklabels=["blues","classical","country","disco","hiphop","jazz","metal","pop","reggae","rock"]
)
plt.show()
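The confusion matrix pairs well with per-class precision and recall; classification_report was already imported above, so a short addition of my own:

# per-genre precision, recall and F1 for the random forest predictions
print(classification_report(y_test, y_preds))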


  • 6) Feature importance
for feature, importance in zip(X_test.columns, rf.feature_importances_):
  print('%s: %.2f' % (feature, importance))   # shows which features mattered most to the random forest
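Printing every feature is hard to read, so here is a small sketch of my own that ranks them and shows the ten most influential ones:

# rank features by importance and show the top 10
importances = pd.Series(rf.feature_importances_, index=X_test.columns)
print(importances.sort_values(ascending=False).head(10))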


5. Song recommendation

  • 1) Load the data
df_30 = pd.read_csv('Data/features_30_sec.csv', index_col='filename')

labels = df_30[['label']]
df_30 = df_30.drop(columns=['length','label'])

df_30_scaled = StandardScaler().fit_transform(df_30)    # standardize: mean 0, standard deviation 1

df_30 = pd.DataFrame(df_30_scaled, columns=df_30.columns)

df_30.head()


  • 2) ์œ ์‚ฌ๋„ ์„ค์ •
from sklearn.metrics.pairwise import cosine_similarity

# ๋ฒกํ„ฐ์˜ ์œ ์‚ฌ๋„ , ์ฆ‰ ๋ฒกํ„ฐ๊ฐ„์˜ ๊ฐ๋„๋ฅผ ํ†ตํ•ด ์ถ”์ • cos0 =1 ์ด๋ฏ€๋กœ 1์— ๊ฐ€๊นŒ์šธ ์ˆ˜๋ก ์œ ์‚ฌ
# cos180 = -1 ์ด๋ฏ€๋กœ -1์— ๊ฐ€๊นŒ์šธ ์ˆ˜๋ก ๋‹ค๋ฅด๋‹ค.
similarity = cosine_similarity(df_30)   

sim_df = pd.DataFrame(similarity, index=labels.index, columns=labels.index)

sim_df.head()


  • +) Wrap it in a function
def find_similar_songs(name, n=5):
  # sort every song by its similarity to the query track, most similar first
  series = sim_df[name].sort_values(ascending=False)

  # drop the query track itself
  series = series.drop(name)

  # return the n most similar tracks
  return series.head(n).to_frame()

find_similar_songs('rock.00000.wav')
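As a quick sanity check of my own, the recommended file names can be joined with the label table loaded earlier to see whether the neighbours really share the query track's genre:

# attach the genre label of each recommended track (labels is indexed by filename)
recs = find_similar_songs('rock.00000.wav')
print(recs.join(labels))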
