

[Kaggle] CNN Architectures

징징알파카 2022. 2. 4. 23:17

Written on 2022-02-04

<This post was written while studying, with reference to Kaggle>

https://www.kaggle.com/shivamb/cnn-architectures-vgg-resnet-inception-tl

 

CNN Architectures : VGG, ResNet, Inception + TL


https://wooono.tistory.com/233

 

[DL] LeNet-5, AlexNet, VGG-16, ResNet, Inception Network

1. CNN Architectures

  • VGG16

: VGG16 has a simple, uniform structure but a large number of parameters (about 138 million)

: It repeatedly stacks Convolution Layers (filter = (3, 3), stride = 1, padding = same) and Pooling Layers (filter = (2, 2), stride = 2)

: The number of channels doubles after each pooling stage (64 -> 128 -> 256 -> 512, then stays at 512), while the image size is halved by each Pooling Layer

: Characteristics

  • Uses Max Pooling
  • Uses ReLU as the activation function
  • Called VGG-16 because it has 16 weight layers (13 convolutional + 3 fully connected)

: Drawbacks

  • The Fully Connected Layers hold most of the parameters
  • This leads to considerable memory cost and a risk of overfitting

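To make the drawback above concrete, here is a rough parameter count for VGG16. It is only a sketch: it assumes the standard configuration (3-channel 224x224 input, 1000 output classes, a bias on every layer), and the names `conv_layers` and `fc_layers` are made up for this illustration.

```python
# (in_channels, out_channels) for the 13 conv layers of VGG16, all 3x3 kernels
conv_layers = [(3, 64), (64, 64),
               (64, 128), (128, 128),
               (128, 256), (256, 256), (256, 256),
               (256, 512), (512, 512), (512, 512),
               (512, 512), (512, 512), (512, 512)]

# (inputs, outputs) for the 3 fully connected layers.
# After five 2x2 poolings a 224x224 image is 7x7, so flatten gives 7*7*512 values.
fc_layers = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

# weights + one bias per output channel / unit
conv_params = sum((3 * 3 * c_in + 1) * c_out for c_in, c_out in conv_layers)
fc_params = sum((n_in + 1) * n_out for n_in, n_out in fc_layers)
total = conv_params + fc_params

print(f"conv layers: {conv_params:,}")
print(f"fc layers:   {fc_params:,}")
print(f"total:       {total:,}")
```

Under these assumptions the three fully connected layers account for well over half of all parameters, which is where the memory cost comes from.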
from keras.layers import Input, Conv2D, MaxPooling2D
from keras.layers import Dense, Flatten
from keras.models import Model

# VGG16 takes 224x224 RGB images, so the input has 3 channels
_input = Input((224, 224, 3))

# Block 1: two 64-filter convs, then 2x2 max pooling (224 -> 112)
conv1  = Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu")(_input)
conv2  = Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu")(conv1)
pool1  = MaxPooling2D((2, 2))(conv2)

# Block 2: two 128-filter convs (112 -> 56)
conv3  = Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu")(pool1)
conv4  = Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu")(conv3)
pool2  = MaxPooling2D((2, 2))(conv4)

# Block 3: three 256-filter convs (56 -> 28)
conv5  = Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu")(pool2)
conv6  = Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu")(conv5)
conv7  = Conv2D(filters=256, kernel_size=(3,3), padding="same", activation="relu")(conv6)
pool3  = MaxPooling2D((2, 2))(conv7)

# Block 4: three 512-filter convs (28 -> 14)
conv8  = Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu")(pool3)
conv9  = Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu")(conv8)
conv10 = Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu")(conv9)
pool4  = MaxPooling2D((2, 2))(conv10)

# Block 5: three 512-filter convs (14 -> 7)
conv11 = Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu")(pool4)
conv12 = Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu")(conv11)
conv13 = Conv2D(filters=512, kernel_size=(3,3), padding="same", activation="relu")(conv12)
pool5  = MaxPooling2D((2, 2))(conv13)

# Classifier head: two 4096-unit dense layers and a 1000-class softmax
flat   = Flatten()(pool5)
dense1 = Dense(4096, activation="relu")(flat)
dense2 = Dense(4096, activation="relu")(dense1)
output = Dense(1000, activation="softmax")(dense2)

vgg16_model  = Model(inputs=_input, outputs=output)

from keras.applications.vgg16 import decode_predictions
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
import matplotlib.pyplot as plt 
from PIL import Image 
import seaborn as sns
import pandas as pd 
import numpy as np 
import os 

img1 = "dogs-vs-cats-redux-kernels-edition/train/cat.11679.jpg"
img2 = "dogs-vs-cats-redux-kernels-edition/train/dog.2811.jpg"
img3 = "dogs-vs-cats-redux-kernels-edition/train/cat.11679.jpg"
img4 = "dogs-vs-cats-redux-kernels-edition/train/dog.2811.jpg"
imgs = [img1, img2, img3, img4]

def _load_image(img_path):
    img = image.load_img(img_path, target_size=(224, 224))
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = preprocess_input(img)
    return img 

def _get_predictions(_model):
    # first figure: show the four input images
    f, ax = plt.subplots(1, 4)
    f.set_size_inches(80, 40)
    for i in range(4):
        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
        ax[i].imshow(Image.open(imgs[i]).resize((200, 200), Image.LANCZOS))
    plt.show()

    # second figure: bar chart of the top-3 predicted classes for each image
    f, axes = plt.subplots(1, 4)
    f.set_size_inches(80, 20)
    for i, img_path in enumerate(imgs):
        img = _load_image(img_path)
        preds = decode_predictions(_model.predict(img), top=3)[0]
        b = sns.barplot(y=[c[1] for c in preds], x=[c[2] for c in preds], color="gray", ax=axes[i])
        b.tick_params(labelsize=55)
    f.tight_layout()
    plt.show()

from keras.applications.vgg16 import VGG16
vgg16_weights = 'dogs-vs-cats-redux-kernels-edition/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
vgg16_model = VGG16(weights=vgg16_weights)
_get_predictions(vgg16_model)

 

 

 

 

  • VGG19

: VGG19 has 19 weight layers

: It has the most parameters of the models covered here; its basic design is the same as VGG16, with 3 additional Conv layers

from keras.applications.vgg19 import VGG19
vgg19_weights = 'dogs-vs-cats-redux-kernels-edition/vgg19_weights_tf_dim_ordering_tf_kernels.h5'
vgg19_model = VGG19(weights=vgg19_weights)
_get_predictions(vgg19_model)

 

 

 

  • InceptionNet

1) 1 x 1 convolution

(1, 1, channel) convolution

=> Leaves the spatial size (height, width) of the volume unchanged and changes only the number of channels

=> Typically used with fewer filters than input channels, to reduce the channel count

=> Depending on the number of filters, a 1x1 convolution can reduce, keep, or increase the number of channels
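As a small illustration of the points above, a 1x1 convolution can be written as a matrix multiply over the channel axis at every pixel. This is a NumPy sketch rather than Keras code; the 28x28x192 input and the helper name `conv1x1` are assumptions chosen for the example, and bias and activation are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature map: height 28, width 28, 192 channels
x = rng.standard_normal((28, 28, 192))

def conv1x1(x, weights):
    """Apply a 1x1 convolution: mix channels independently at every pixel."""
    # x: (H, W, C_in), weights: (C_in, C_out) -> (H, W, C_out)
    return np.einsum("hwc,cd->hwd", x, weights)

reduced = conv1x1(x, rng.standard_normal((192, 32)))    # fewer channels
same = conv1x1(x, rng.standard_normal((192, 192)))      # same number of channels
expanded = conv1x1(x, rng.standard_normal((192, 256)))  # more channels

# Height and width stay 28x28 in every case; only the last axis changes
print(reduced.shape, same.shape, expanded.shape)
```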

 

2) Inception Module

: Applies Convolutions (1x1, 3x3, 5x5) and Max-Pooling (3x3) to the input volume in parallel and stacks (concatenates) the results along the channel axis to form the output

: Instead of fixing one filter size by hand, the network learns the parameters for all of these filter-size combinations

=> Four branches: 1x1 Conv, 1x1 Conv -> 3x3 Conv, 1x1 Conv -> 5x5 Conv, MAXPOOL -> 1x1 Conv

=> Note that a 1x1 Conv layer follows the MAXPOOL layer

=> MAXPOOL cannot reduce the number of channels, so a 1x1 Conv is used to reduce them

=> Adding a 1x1 Convolution layer implements a bottleneck layer, which reduces the number of channels and therefore the amount of computation

( A (1, 1) Conv inserted in the middle of the Conv operations is called a bottleneck layer )
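The saving from a bottleneck layer can be estimated by counting multiplications. The numbers below (a 28x28x192 input, 16 bottleneck channels, 32 output channels) are hypothetical values picked for illustration, not taken from the notebook, and `conv_mults` is a helper defined only for this sketch.

```python
def conv_mults(h, w, k, c_in, c_out):
    """Multiplications for a k x k 'same' convolution with an h x w x c_out output."""
    return h * w * c_out * k * k * c_in

h, w, c_in = 28, 28, 192   # hypothetical input volume
c_mid, c_out = 16, 32      # hypothetical bottleneck width and output channels

# 5x5 convolution applied directly to all 192 channels
direct = conv_mults(h, w, 5, c_in, c_out)

# 1x1 bottleneck down to 16 channels first, then the 5x5 convolution
bottleneck = conv_mults(h, w, 1, c_in, c_mid) + conv_mults(h, w, 5, c_mid, c_out)

print(f"5x5 directly: {direct:,}")
print(f"1x1 then 5x5: {bottleneck:,}")
```

With these values the bottleneck version needs roughly a tenth of the multiplications while producing an output of the same shape.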

 

3) Inception (GoogLeNet) Network

: A stack of Inception Modules

: Max Pooling layers are inserted between groups of modules

: The end of the model outputs the result through a Fully connected layer

: Extra (auxiliary) softmax layers are attached partway through the network; they help the parameters of the earlier layers update well, so that the intermediate features already give reasonable predictions

: This also gives a Regularization effect and helps prevent overfitting

 

 

 

  • ResNet

: Developed by Microsoft to train very deep neural networks, which otherwise run into Vanishing Gradient and Exploding Gradient problems

: Introduces the Residual block

Residual block

: Creates a shortcut so that the output of the l-th nonlinearity is added to the input of the (l+2)-th nonlinearity
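A minimal sketch of that shortcut, using small fully connected layers in NumPy as a stand-in for the convolution layers of the real network. The function name `residual_block`, the weight names, and the 4-dimensional input are assumptions made for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(a_l, W1, b1, W2, b2):
    """Two-layer block with an identity shortcut from layer l to layer l+2."""
    a_l1 = relu(W1 @ a_l + b1)   # layer l+1
    z_l2 = W2 @ a_l1 + b2        # pre-activation of layer l+2
    return relu(z_l2 + a_l)      # shortcut: add a_l before the nonlinearity

n = 4
a_l = np.array([0.5, 1.0, 0.0, 2.0])  # non-negative, as if it came out of a ReLU
zero_W, zero_b = np.zeros((n, n)), np.zeros(n)

# Even if the block's weights are all zero, the shortcut carries a_l through
out = residual_block(a_l, zero_W, zero_b, zero_W, zero_b)
print(out)
```

Because the block can fall back to passing its input through unchanged, adding more blocks should not make the network harder to fit than a shallower one, which is the intuition behind the shortcut.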

ResNet

: Basically follows the VGG19 structure

: Adds more convolution layers to make the network deeper, and adds shortcuts

: The 34-layer residual network has 34 layers; the 34-layer plain network is the same structure with the shortcuts removed

: Apart from the first layer, it uses 3x3 convolution layers throughout

: Whenever the image size is halved, the number of channels is doubled

 

  • XceptionNet

: Extreme version of the Inception module

: Uses a modified Depthwise separable convolution ( a conv applied to each channel separately, with a 1x1 conv applied to the result )

: The idea is to completely decouple the cross-channel convolution from the spatial convolution by using depthwise separable convolutions
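To see why this separation is cheaper, the weight counts of a standard convolution and a depthwise separable one can be compared. This is a sketch that ignores biases; the layer size (3x3 kernel, 128 -> 256 channels) and both function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a regular k x k convolution (spatial and channel mixing at once)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes the channels
    return depthwise + pointwise

k, c_in, c_out = 3, 128, 256   # hypothetical layer size

standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)

print(f"standard:  {standard:,}")
print(f"separable: {separable:,}")
```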

 

 

2. Image Feature Extraction

# VGG16 without its classifier head, used as a fixed feature extractor
vgg1616 = VGG16(weights="imagenet", include_top=False)

def _get_features(img_path):
    img = image.load_img(img_path, target_size=(224, 224))
    img_data = image.img_to_array(img)
    img_data = np.expand_dims(img_data, axis=0)
    img_data = preprocess_input(img_data)
    # feature maps from the last convolutional block of VGG16
    vgg_features = vgg1616.predict(img_data)
    return vgg_features

img_path = "dogs-vs-cats-redux-kernels-edition/train/dog.2811.jpg"
vgg16_features = _get_features(img_path)
features_representation_1 = vgg16_features.flatten()
features_representation_2 = vgg16_features.squeeze()

print ("Shape 1: ", features_representation_1.shape)
print ("Shape 2: ", features_representation_2.shape)
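The two representations differ only in shape. The sketch below uses a dummy array in place of the output of `_get_features`, assuming the (1, 7, 7, 512) shape that VGG16 without its top layers produces for a 224x224 input; `dummy_features` is a name made up for this example.

```python
import numpy as np

# Stand-in for the array returned by _get_features:
# a batch of 1, a 7x7 spatial grid, 512 channels (values are dummy zeros)
dummy_features = np.zeros((1, 7, 7, 512), dtype="float32")

flat = dummy_features.flatten()      # everything collapsed into one long vector
squeezed = dummy_features.squeeze()  # only the size-1 batch axis is dropped

print("Shape 1: ", flat.shape)
print("Shape 2: ", squeezed.shape)
```

The flattened vector is the form used as classifier input in the Transfer Learning section, while the squeezed array keeps the spatial layout of the feature maps.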

 

 

 

3. Transfer Learning

basepath = "dogs-vs-cats-redux-kernels-edition/train/"
class1 = os.listdir(basepath + "dog/")
class2 = os.listdir(basepath + "cat/")

# 10 training images per class, plus one held-out image per class for testing
data = {'dog': class1[:10],
        'cat': class2[:10],
        'test': [class1[11], class2[11]]}
features = {"dog" : [], "cat" : [], "test" : []}
testimgs = []
for label, val in data.items():
    for k, each in enumerate(val):
        if label == "test":
            # the first test image is a dog, the second a cat
            folder = "dog" if k == 0 else "cat"
            img_path = os.path.join(basepath, folder, each)
            testimgs.append(img_path)
        else:
            # folder names are lowercase, matching the os.listdir calls above
            img_path = os.path.join(basepath, label, each)
        feats = _get_features(img_path)
        features[label].append(feats.flatten())

# one row per image; DataFrame.append was removed in pandas 2.0, so use pd.concat
frames = []
for label, feats in features.items():
    temp_df = pd.DataFrame(feats)
    temp_df['label'] = label
    frames.append(temp_df)
dataset = pd.concat(frames, ignore_index=True)
dataset.head()

y = dataset[dataset.label != 'test'].label
X = dataset[dataset.label != 'test'].drop('label', axis=1)
from sklearn.feature_selection import VarianceThreshold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

model = MLPClassifier(hidden_layer_sizes=(100, 10))
pipeline = Pipeline([('low_variance_filter', VarianceThreshold()), ('model', model)])
pipeline.fit(X, y)

print ("Model Trained on pre-trained features")

preds = pipeline.predict(features['test'])

f, ax = plt.subplots(1, 2)
for i in range(2):
    # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    ax[i].imshow(Image.open(testimgs[i]).resize((200, 200), Image.LANCZOS))
    ax[i].text(10, 180, 'Predicted: %s' % preds[i], color='k', backgroundcolor='red', alpha=0.8)
plt.show()

 

 

 

 

 

 

There were no separate dog and cat folders, so I moved

91 images of each into their own folders myself and trained on those....

With so little data, it got the dog prediction wrong T_T

Also, the models other than VGG

threw errors... why??
