😎 공부하는 징징알파카는 처음이지?


[CNN] The Convolution Process

징징알파카 2022. 1. 29. 17:40

Written 2022-01-29

<This post was written while studying from the blog of 김태환 (TAEWAN.KIM)>

http://taewan.kim/post/cnn/

 

 

 

 

 

 

1. CNN, Convolutional Neural Network

CNN

- Conventional approach

: The input to a neural network built from Fully Connected layers must be 1-dimensional

: A single color photo is 3-dimensional data

: Several photos in batch mode are 4-dimensional data

=> The 3-dimensional data has to be flattened to 1 dimension

=> Spatial information is lost

=> With the spatial information of the image gone, feature extraction and learning are inefficient and accuracy is limited

 

 

- CNN
: Can learn while keeping the spatial information of the image

: The deep learning model learns the filter values by itself

 

  • The shape of each layer's input and output data is preserved
  • Spatial information is kept, so features of neighboring image regions are recognized effectively
  • Multiple filters extract the features of the image
  • A Pooling layer gathers and reinforces the extracted features
  • Filters are used as shared parameters, so there are very few parameters to learn
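To get a feel for the last point, here is a rough parameter count, assuming a 56 x 56 grayscale input. The layer sizes (128 dense units, twelve 5 x 5 filters) and both helper names are illustrative choices, not something fixed by CNNs in general.

```python
def dense_param_count(n_inputs, n_units):
    # One weight per input-output pair, plus one bias per unit.
    return n_inputs * n_units + n_units


def conv_param_count(kernel_h, kernel_w, in_channels, n_filters):
    # Each filter holds kernel_h * kernel_w weights per input channel, plus one bias.
    return kernel_h * kernel_w * in_channels * n_filters + n_filters


flattened_inputs = 56 * 56 * 1  # the image flattened for a Fully Connected layer
fc_params = dense_param_count(flattened_inputs, 128)
conv_params = conv_param_count(5, 5, 1, 12)

print(fc_params, conv_params)  # 401536 312
```

The convolution layer reuses the same small filters at every position, which is why its count does not depend on the image size.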

 

 

1) Image feature extraction (Feature Extraction)

: Stack several Convolution Layers and Pooling Layers

: The filter sweeps over the input data and computes the convolution -> Feature map

: The output data shape changes with the filter size, stride, padding, and max pooling size

: A Convolution Layer applies the filters to the input data and then applies the activation function

: The Pooling Layer is an optional layer

 

+ A flatten layer turns the image-shaped data into a 1-dimensional array

 

2) Class classification (Classification)

: Fully Connected layers for classifying the image

 

 

 

 

 

 

2. Key CNN terms

  • Convolution

: Multiply the original image by a filter element by element and add up the products
: For two functions f and g, the convolution is written f * g
: The convolution reverses and shifts one of the two functions, multiplies it by the other, and integrates the result

Figure 1: Convolution procedure (http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution)

: Convolving 2-dimensional input data (5 x 5) with one filter (3 x 3) produces a Feature map (3 x 3 at stride 1)
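The 5 x 5 input and 3 x 3 filter case can be sketched in plain Python as follows. The function name `conv2d` and the sample values are assumptions made for illustration. Like most CNN libraries, the sketch slides the filter without flipping it (strictly speaking a cross-correlation); for the symmetric filter used here the result is the same either way.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the window by the kernel element by element and sum.
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        feature_map.append(row)
    return feature_map


image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]

# Moving 2 cells at a time gives a smaller 2 x 2 feature map.
print(conv2d(image, kernel, stride=2))  # [[4, 4], [2, 4]]
```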

 

 

  • Channel

: Each pixel of an image is a real number
: A color image has 3 channels (RGB)
: A grayscale image is 2-dimensional data with 1 channel
: If n filters are used, the output data has n channels

EX) height 39 pixels, width 31 pixels, color => shape (39, 31, 3)

The filter is applied to each channel

 

 

 

  • Filter = Kernel

: Shared parameters for finding the features of an image
: Decides whether the pixel region being scanned contains the target it is looking for
: filter = kernel
: This is what a CNN learns
: The values inside a filter (weights) usually start out random -> the weights are updated as training proceeds
: The filter sweeps the input data at a fixed interval, convolves each channel, and the sum of the convolutions over all channels becomes the Feature map


- Stride
: The interval at which the filter sweeps the input
: The number of pixels the filter moves in one step while scanning the input data

ex) Compute the convolution while moving 2 cells at a time (figure: http://taewan.kim/post/cnn/)


: When the input has several channels, the filter sweeps each channel and makes one feature map per channel
: The per-channel feature maps are summed into the final feature map
: Regardless of the number of input channels, each filter produces exactly 1 feature map

Figure: Convolution of multi-channel input data with a filter


: One Convolution Layer can apply several filters of the same size
: The feature map gets as many channels as there are filters
: Number of filters applied to the input = number of channels of the output feature map

- Feature map and Activation Map
: Feature map: the matrix the filter produces by sweeping the input data of the Convolution Layer and computing the convolution
: Activation Map: the result of applying the activation function to the Feature map matrix
=> The Activation Map is the final output of the Convolution layer
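A minimal sketch of the multi-channel case, assuming ReLU as the activation function and stride 1 without padding. The names `conv2d_single` and `conv_layer`, the bias values, and all sample numbers are hypothetical. Each filter holds one kernel per input channel; the per-channel results are summed into one feature map, and the activation function turns that into the activation map.

```python
def conv2d_single(channel, kernel):
    """Convolve one 2-D channel with one 2-D kernel (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_layer(channels, filters, biases):
    """Return one activation map per filter."""
    activation_maps = []
    for kernels, bias in zip(filters, biases):
        # One intermediate map per input channel.
        per_channel = [conv2d_single(c, k) for c, k in zip(channels, kernels)]
        out_h, out_w = len(per_channel[0]), len(per_channel[0][0])
        # Sum over channels (plus bias) to get a single feature map.
        feature_map = [
            [sum(m[i][j] for m in per_channel) + bias for j in range(out_w)]
            for i in range(out_h)
        ]
        # ReLU turns the feature map into the activation map.
        activation_maps.append([[max(0, v) for v in row] for row in feature_map])
    return activation_maps


channels = [
    [[1, 2, 0], [0, 1, 3], [4, 0, 1]],  # channel 0
    [[0, 1, 1], [2, 0, 0], [1, 1, 0]],  # channel 1
]
filters = [
    [[[1, 0], [0, 1]], [[0, 1], [1, 0]]],    # filter 0: one kernel per channel
    [[[-1, 0], [0, -1]], [[1, 0], [0, 0]]],  # filter 1
]
biases = [0, 1]

activation_maps = conv_layer(channels, filters, biases)
print(len(activation_maps))  # 2 filters -> 2 output channels
print(activation_maps)       # [[[5, 6], [1, 3]], [[0, 0], [3, 0]]]
```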

 

 

  • Stride

: The interval at which the filter sweeps the input
ex) Compute the convolution while moving 2 cells at a time

 

  • Padding

: In a Convolution layer, the filter and the stride make the feature map smaller than the input data
: Padding keeps the output of the Convolution layer from shrinking (and keeps edge pixel information from being lost)
: The border of the input data is filled with a given value for the specified number of pixels (usually 0: zero-padding)

Figure: zero-padding
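Zero-padding can be sketched as below; `zero_pad` is a hypothetical helper name and the 2 x 2 input is made up for the example.

```python
def zero_pad(image, p):
    """Surround a 2-D image with a border of zeros that is p pixels wide."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]          # top border
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)  # left and right borders
    padded.extend([0] * width for _ in range(p))      # bottom border
    return padded


padded = zero_pad([[1, 2], [3, 4]], 1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]
```

With P = 1, a 5 x 5 input becomes 7 x 7, so a 3 x 3 filter at stride 1 again yields a 5 x 5 feature map: (5 + 2 * 1 - 3) / 1 + 1 = 5.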

 

 

  • Pooling layer

: Takes the output of a Convolution Layer (the Activation map) as input and reduces its size or emphasizes particular values
: Starting from the full feature map, each pooling step extracts the important information and passes on a condensed summary
: Has no parameters to learn
: The matrix size decreases after a pooling layer
: The number of channels does not change through a pooling layer

1) Max Pooling
: Take the maximum of the values inside a window of a given size as its representative

2) Average Pooling
: Take the average of the values inside a window of a given size as its representative

3) Min Pooling
: Take the minimum of the values inside a window of a given size as its representative
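The three pooling variants differ only in how each window is reduced to one value. A small sketch, assuming square non-overlapping windows; `pool2d`, `mean`, and the sample feature map are illustrative names and values.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 4, 3, 8],
]

max_pooled = pool2d(feature_map, 2, max)
avg_pooled = pool2d(feature_map, 2, mean)
min_pooled = pool2d(feature_map, 2, min)

print(max_pooled)  # [[6, 4], [7, 9]]
print(avg_pooled)  # [[3.75, 2.25], [3.5, 5.0]]
print(min_pooled)  # [[1, 1], [1, 0]]
```

No weights appear anywhere in the function, which matches the note that a pooling layer has nothing to learn.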

 

 

 

 

 

 

 

 

 

3. Computing the output data of each layer

1) Formula for the shape of the Convolution Layer output (Activation Map)

: The feature map size is determined by the input size, the filter size, the stride, and the padding

  • EX) 
  • Input shape = (39 : H height, 31 : W width, 1 : channels)
  • Input channels = 1
  • Filter F = (4, 4)
  • Output channels = 20
  • stride = 1
  • Padding P = 0
RowSize = (H + 2P - F) / Stride + 1 = (39 + 0 - 4) / 1 + 1 = 36
ColumnSize = (W + 2P - F) / Stride + 1 = (31 + 0 - 4) / 1 + 1 = 28

=> The shape of the Activation Map is (36, 28, 20)
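The formula can be checked with a short function. `conv_output_shape` is a hypothetical helper, and integer division stands in for the case where the stride divides evenly. The stated (36, 28, 20) result corresponds to no padding (P = 0); a padding of 2 would give (40, 32, 20) instead.

```python
def conv_output_shape(h, w, f, stride=1, padding=0, n_filters=1):
    """(rows, columns, channels) of a convolution layer's output."""
    rows = (h + 2 * padding - f) // stride + 1
    cols = (w + 2 * padding - f) // stride + 1
    return (rows, cols, n_filters)


no_padding = conv_output_shape(39, 31, f=4, stride=1, padding=0, n_filters=20)
with_padding = conv_output_shape(39, 31, f=4, stride=1, padding=2, n_filters=20)

print(no_padding)    # (36, 28, 20)
print(with_padding)  # (40, 32, 20)
```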

 

2) Output data size of a Pooling Layer

: The pooling window is usually square

: The row and column sizes of the input should be multiples of the pooling size (divisible without remainder)

: The output size of a pooling layer is the row and column size divided by the pooling size

OutputRowSize = InputRowSize / PoolingSize
OutputColumnSize = InputColumnSize / PoolingSize

=> With PoolingSize = (2, 2), the (36, 28, 20) input gives output shape (18, 14, 20)
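The same check for pooling, as a sketch that assumes a square window and rejects inputs that are not multiples of the pooling size. `pool_output_shape` is an illustrative name.

```python
def pool_output_shape(rows, cols, channels, pool_size):
    """Output shape of a pooling layer; the channel count is unchanged."""
    if rows % pool_size or cols % pool_size:
        raise ValueError("rows and columns must be multiples of pool_size")
    return (rows // pool_size, cols // pool_size, channels)


pooled_shape = pool_output_shape(36, 28, 20, pool_size=2)
print(pooled_shape)  # (18, 14, 20)
```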

 

 

 

 

 

 

 

4. CNN structure

: Feature Extraction: repeatedly stack Convolution Layers and Max pooling Layers

: Then add Fully Connected Layers

: Apply Softmax at the final output layer for Classification

 

: Tune Filter, Stride, and Padding

: Getting the input and output sizes of the Feature Extraction part to match is important

 

 

 

 

 

 

5. CNN code

1) The Convolution layer, which extracts features with filters

Conv2D(filters = 1, kernel_size = (2, 2), padding='valid', input_shape=(3, 3, 1), activation='relu')

input : one filter with a 2*2 kernel_size convolves a 3*3 input

  • First argument : number of Convolution filters
  • Second argument : Convolution kernel size (rows, columns)
  • padding : how the borders are handled
    • valid : only the valid region is output => the output image is smaller than the input   ( default = "valid" )
    • same : the output image size equals the input image size (at stride 1)
  • input_shape : the input shape without the sample count (only given when this is the first layer of the model)
    • Defined as (rows, columns, channels)
    • For a grayscale image the channel count is 1
    • For a color (RGB) image the channel count is 3
  • strides : stride of the convolution ( default = 1 )
  • activation : activation function
    • linear : the default; the value computed from the input neurons and weights is output as is
    • relu : rectifier function, mostly used in hidden layers
    • sigmoid : sigmoid function, mostly used in the output layer for binary classification
    • softmax : softmax function, mostly used in the output layer for multi-class classification
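How the padding option changes the output size along one axis can be sketched without Keras. The helper name `conv_output_length` is an assumption; the two formulas are the usual ones for 'valid' and 'same' padding.

```python
import math


def conv_output_length(n, kernel, stride=1, padding="valid"):
    """Output length along one axis for 'valid' or 'same' padding."""
    if padding == "valid":
        # Only positions where the kernel fits entirely inside the input.
        return math.ceil((n - kernel + 1) / stride)
    if padding == "same":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(n / stride)
    raise ValueError("padding must be 'valid' or 'same'")


valid_len = conv_output_length(3, 2, padding="valid")
same_len = conv_output_length(3, 2, padding="same")

print(valid_len)  # 2
print(same_len)   # 3
```

For the Conv2D call above (3 x 3 input, 2 x 2 kernel, one filter, 'valid'), this gives an output of shape (2, 2, 1).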

 

2) The Dense layer, which connects every input to every output

Dense(8, input_dim = 4, kernel_initializer = 'random_uniform', activation = 'relu')

 

: Connects all inputs to all outputs, with one weight for each input-output connection

: With 4 input neurons and 8 output neurons there are 32 connections in total (4 * 8 = 32)

: Each connection carries a weight, which represents the strength of the connection

  • First argument : number of output neurons
  • input_dim : number of input neurons
  • kernel_initializer : weight initialization method (called init in Keras 1)
    • random_uniform : uniform distribution
    • random_normal : Gaussian distribution
  • activation : activation function
    • linear : the default; the value computed from the input neurons and weights is output as is
    • relu : rectifier function, mostly used in hidden layers
    • sigmoid : sigmoid function, mostly used in the output layer for binary classification
    • softmax : softmax function, mostly used in the output layer for multi-class classification

Figure: with 4 input signals and 3 output signals there are 12 synapse weights

# Binary classification (sigmoid) from 4 input values
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()

# kernel_initializer replaces the Keras 1 argument `init`
model.add(Dense(8, input_dim=4, kernel_initializer='random_uniform', activation='relu'))
model.add(Dense(6, kernel_initializer='random_uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='random_uniform', activation='sigmoid'))
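The number of weights in this small model can be worked out by hand. The sketch below assumes each Dense layer also has one bias per output neuron (the Keras default); `dense_params` is a hypothetical helper.

```python
def dense_params(n_in, n_out):
    # n_in * n_out connection weights, plus one bias per output neuron.
    return n_in * n_out + n_out


layer_sizes = [4, 8, 6, 1]  # input, two hidden layers, output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [40, 54, 7]
print(sum(per_layer))  # 101
```

The first layer has the 32 connections mentioned above plus 8 biases, which gives 40.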

 

 

 

 

 

 

6. CNN practice code

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten

model = Sequential()

model.add(Conv2D(12, kernel_size=(5, 5), activation='relu', input_shape=(56, 56, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(16, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(20, kernel_size=(4, 4), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation="relu"))    # 128 outputs
model.add(Dense(4, activation="softmax"))   # 4 outputs, 128 inputs
model.summary()
Model: "sequential_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_28 (Conv2D)           (None, 52, 52, 12)        312       
_________________________________________________________________
max_pooling2d_20 (MaxPooling (None, 26, 26, 12)        0         
_________________________________________________________________
conv2d_29 (Conv2D)           (None, 22, 22, 16)        4816      
_________________________________________________________________
max_pooling2d_21 (MaxPooling (None, 11, 11, 16)        0         
_________________________________________________________________
conv2d_30 (Conv2D)           (None, 8, 8, 20)          5140      
_________________________________________________________________
max_pooling2d_22 (MaxPooling (None, 4, 4, 20)          0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 320)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 128)               41088     
_________________________________________________________________
dense_7 (Dense)              (None, 4)                 516       
=================================================================
Total params: 51,872
Trainable params: 51,872
Non-trainable params: 0
_________________________________________________________________
  • Layer (type)

: Name and type of the layer

: To set the name yourself, pass name='the name you want' to the layer (for example Dense)

 

  • Output Shape

: (None, 4) means None rows with 4 output values each

: The row count is None because it is the number of samples (the batch size), which is not fixed when the model is defined

: So building the model is a matter of matching the remaining dimensions of the shape

 

  • Param #

: Number of parameters, i.e. the number of edges connecting the input nodes to the output nodes

: A Bias (b) node is added to the inputs, so each output node has one extra parameter

 

 

 

7. CNN input/output and parameter calculation

input_shape = (56, 56, 1)

- conv2d_28    ( 52, 52, 12 )
1) Parameters

Convolution layer 1 learns filters with 1 input channel, kernel size (5, 5), and 12 output channels
=> 12 feature maps are produced from the (56, 56, 1) image
=> (5 * 5) kernel * 1 channel * 12 filters + 12 bias terms = 312 parameters

2) Convolution Layer output data

RowSize = (H + 2P - F) / Stride + 1
ColumnSize = (W + 2P - F) / Stride + 1
=> (56 + 2 * 0 - 5) / 1 + 1 = 52


- max_pooling2d_20    ( 26, 26, 12 )
1) Output data
The 12 images of size (52, 52) pass through (2, 2) max pooling and shrink to 12 images of size (26, 26)
OutputRowSize = InputRowSize / PoolingSize

OutputColumnSize = InputColumnSize / PoolingSize
=> 52 / 2 = 26


- conv2d_29    ( 22, 22, 16 )
1) Parameters

Convolution layer 2 learns filters with 12 input channels, kernel size (5, 5), and 16 output channels
=> 16 feature maps are produced from the (26, 26, 12) input
=> (5 * 5) kernel * 12 channels * 16 filters + 16 bias terms = 4816 parameters

2) Convolution Layer output data

RowSize = (H + 2P - F) / Stride + 1
ColumnSize = (W + 2P - F) / Stride + 1
=> (26 + 2 * 0 - 5) / 1 + 1 = 22


- max_pooling2d_21    ( 11, 11, 16 )
1) Output data

The 16 images of size (22, 22) pass through (2, 2) max pooling and shrink to 16 images of size (11, 11)
OutputRowSize = InputRowSize / PoolingSize

OutputColumnSize = InputColumnSize / PoolingSize
=> 22 / 2 = 11


- conv2d_30    ( 8, 8, 20 )
1) Parameters

Convolution layer 3 learns filters with 16 input channels, kernel size (4, 4), and 20 output channels
=> 20 feature maps are produced from the (11, 11, 16) input
=> (4 * 4) kernel * 16 channels * 20 filters + 20 bias terms = 5140 parameters

2) Convolution Layer output data

RowSize = (H + 2P - F) / Stride + 1
ColumnSize = (W + 2P - F) / Stride + 1
=> (11 + 2 * 0 - 4) / 1 + 1 = 8


- max_pooling2d_22    ( 4, 4, 20 )
1) Output data

The 20 images of size (8, 8) pass through (2, 2) max pooling and shrink to 20 images of size (4, 4)
OutputRowSize = InputRowSize / PoolingSize

OutputColumnSize = InputColumnSize / PoolingSize
=> 8 / 2 = 4


- flatten_3     ( 320 )
(4 * 4) * 20 = 320 values form the input tensor of the next layer

- dense_6
Takes 320 inputs and produces 128 outputs with 128 biases, so the number of unknowns is 320 * 128 + 128 = 41088

- dense_7
Takes 128 inputs and produces 4 outputs with 4 biases, so the number of unknowns is 128 * 4 + 4 = 516

=> Adding up all the unknowns gives 51,872
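The whole walk-through can be replayed in a few lines of plain Python. This is a sketch under the assumptions used above (stride 1, 'valid' padding, one bias per filter or unit); the helper functions are illustrative and only mimic what model.summary() reports.

```python
def conv(shape, n_filters, k):
    h, w, c = shape
    params = k * k * c * n_filters + n_filters  # weights + biases
    return (h - k + 1, w - k + 1, n_filters), params


def pool(shape, size):
    h, w, c = shape
    return (h // size, w // size, c), 0  # nothing to learn


def flatten(shape):
    h, w, c = shape
    return (h * w * c,), 0


def dense(shape, units):
    (n,) = shape
    return (units,), n * units + units


layers = [
    ("conv2d_28", conv, (12, 5)),
    ("max_pooling2d_20", pool, (2,)),
    ("conv2d_29", conv, (16, 5)),
    ("max_pooling2d_21", pool, (2,)),
    ("conv2d_30", conv, (20, 4)),
    ("max_pooling2d_22", pool, (2,)),
    ("flatten_3", flatten, ()),
    ("dense_6", dense, (128,)),
    ("dense_7", dense, (4,)),
]

shape = (56, 56, 1)  # input_shape
summary = []
for name, layer_fn, args in layers:
    shape, params = layer_fn(shape, *args)
    summary.append((name, shape, params))

total_params = sum(params for _, _, params in summary)
for name, out_shape, params in summary:
    print(f"{name:18s} {str(out_shape):14s} {params}")
print("Total params:", total_params)  # Total params: 51872
```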

 

 

 

 

 

8. CNN summary

: A CNN keeps the spatial information of the image while effectively recognizing and emphasizing the features of neighboring regions

: Image feature extraction + image classification

- Image feature extraction

=> Made of convolution layers, which use filters to find image features while keeping the number of shared parameters small, and pooling layers, which reinforce and gather the features

 

: The output data size is controlled by the Filter size, Stride, Padding, and Pooling size

: The number of filters determines the number of output channels

: A layer in which every node of the previous layer is connected to every node of the next layer is called a Fully Connected Layer (FC Layer)

=> An FC Layer is also called a Dense Layer
