Object Detection: Major One-Stage & Two-Stage Models

징징알파카 2022. 11. 7. 11:13

<This post was written while studying from pseudo-lab's Tutorial-Book>

https://pseudo-lab.github.io/Tutorial-Book/chapters/object-detection/Ch1-Object-Detection.html

 


🦄 Object Detection

A subfield of computer vision that detects the objects a user is interested in within a given image.

 

 

✅ Bounding Box

A way of marking a specific object so that a model can be trained on it efficiently.

Used to specify the location of the target.

The target location is expressed as a rectangle using the X and Y axes.

ex) A bounding box is written as (X min, Y min, X max, Y max)

X and Y are pixel values; for efficient computation they are normalized so that the maximum value is 1.
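The normalization described above can be sketched as follows. This is a minimal illustration, assuming that normalizing means dividing each coordinate by the image width or height; the function name `normalize_box` and the 640x480 image are hypothetical.

```python
def normalize_box(box, image_width, image_height):
    """Scale a pixel-space (x_min, y_min, x_max, y_max) box into the [0, 1] range."""
    x_min, y_min, x_max, y_max = box
    return (
        x_min / image_width,
        y_min / image_height,
        x_max / image_width,
        y_max / image_height,
    )


# Hypothetical 640x480 image with one box given in pixels.
print(normalize_box((64, 120, 320, 360), image_width=640, image_height=480))
# -> (0.1, 0.25, 0.5, 0.75)
```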

 

 

✅ Sliding Window

A window of a given shape is moved step by step from the top-left of the image to the bottom-right, proposing regions that are likely to contain an object.

Region proposal is done either by sliding several windows of different shapes, or by keeping the window size fixed and rescaling the image to several sizes.

Because the window also slides over areas that contain no object, the method takes a long time, and the chance of detecting objects well also drops.
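A rough sketch of the top-left to bottom-right scan described above. The `stride` parameter and the function name `sliding_windows` are assumptions for illustration; the notes do not specify a step size.

```python
def sliding_windows(image_width, image_height, window_size, stride):
    """Yield (x_min, y_min, x_max, y_max) windows, left to right, then top to bottom."""
    win_w, win_h = window_size
    for y in range(0, image_height - win_h + 1, stride):
        for x in range(0, image_width - win_w + 1, stride):
            yield (x, y, x + win_w, y + win_h)


# Tiny hypothetical 8x6 image, 4x4 window, stride 2.
windows = list(sliding_windows(8, 6, window_size=(4, 4), stride=2))
print(len(windows))              # 6
print(windows[0], windows[-1])   # (0, 0, 4, 4) (4, 2, 8, 6)
```

Every window is produced whether or not it contains an object, which is where the cost mentioned above comes from.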

 

 

✅ Model Types

Classification : deciding what kind of object a given object is

Region Proposal : an algorithm that quickly finds regions likely to contain an object

 

💫 One-Stage Detector

Performs Classification and Region Proposal at the same time to get the result.

After the image is fed to the model, Conv Layers are used to extract image features.

Generally faster, but tends to be less accurate than a Two-Stage Detector.

ex) YOLO, SSD, RetinaNet

 

👁‍🗨 YOLO

Treats the Bounding-box and the Class probability as a single problem, predicting the object's class and location in one pass.

The image is divided into a grid of fixed-size cells, and Bounding-boxes are predicted for each grid cell.

Trained on the Bounding-box confidence score and the grid cell's class score.

The simple pipeline makes it very fast, but accuracy is relatively low for small objects.
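To make the grid idea concrete, here is a minimal sketch of mapping an object to a grid cell. It assumes the cell that contains the box center is the one responsible for the object, as in the original YOLO design; the 7x7 grid and 448x448 input are illustrative values, and `responsible_cell` is a hypothetical name.

```python
def responsible_cell(box, image_size, grid_size):
    """Return (row, col) of the grid cell that contains the center of the box."""
    x_min, y_min, x_max, y_max = box
    width, height = image_size
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    # Clamp so a center lying exactly on the right/bottom edge stays inside the grid.
    col = min(int(cx / width * grid_size), grid_size - 1)
    row = min(int(cy / height * grid_size), grid_size - 1)
    return row, col


print(responsible_cell((100, 200, 300, 400), image_size=(448, 448), grid_size=7))
# -> (4, 3)
```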

 

👁‍🗨 SSD (Single Shot MultiBox Detector)

For the Feature map produced after each Convolutional Layer, the class scores and offsets (position coordinates) of the Bounding-boxes are computed.

The final Bounding-boxes are decided with the NMS (Non-Maximum Suppression) algorithm.

Because each Feature map has a different scale, both small and large objects can be detected.
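A minimal sketch of how predicted offsets could be turned back into a box. It assumes the common center/size parameterization (center shifts scaled by the default box size, log-scale width and height) and leaves out the variance scaling that many implementations add; `decode_offsets` and the example values are hypothetical.

```python
import math


def decode_offsets(default_box, offsets):
    """Apply predicted offsets to a default box.

    default_box: (cx, cy, w, h) of the predefined box on the feature map.
    offsets:     (t_x, t_y, t_w, t_h) predicted by the network.
    Returns the decoded box as (x_min, y_min, x_max, y_max).
    """
    d_cx, d_cy, d_w, d_h = default_box
    t_x, t_y, t_w, t_h = offsets
    cx = d_cx + t_x * d_w          # shift the center by a fraction of the box width
    cy = d_cy + t_y * d_h          # shift the center by a fraction of the box height
    w = d_w * math.exp(t_w)        # rescale width
    h = d_h * math.exp(t_h)        # rescale height
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Zero offsets return the default box itself, expressed as corner coordinates.
print(decode_offsets((0.5, 0.5, 0.2, 0.4), (0.0, 0.0, 0.0, 0.0)))
```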

 

 

👁‍🗨 RetinaNet

Changes the loss function computed during training to improve the lower accuracy of earlier One-Stage Detectors.

A One-Stage Detector trains on as many as a hundred thousand candidates.

Of those, typically fewer than 10 are actual objects, and most candidates fall into the background class.

By reducing the loss for background candidates, which are relatively easy to classify, the loss share of the hard-to-classify actual objects goes up, so training focuses on the actual objects.

Fast, while showing accuracy similar to a Two-Stage Detector.
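The loss change described above is the focal loss. A minimal per-candidate sketch, assuming the binary form with `gamma=2.0` and `alpha=0.25` as illustrative defaults; the probabilities in the example are made up.

```python
import math


def cross_entropy(p, is_object):
    """Plain binary cross-entropy for one candidate; p is the predicted object probability."""
    p_t = p if is_object else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, is_object, gamma=2.0, alpha=0.25):
    """Cross-entropy scaled by (1 - p_t)^gamma, which shrinks the loss of easy candidates."""
    p_t = p if is_object else 1.0 - p
    alpha_t = alpha if is_object else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background candidate (model is already confident it is background)
# versus a hard real object (model gives it only 0.3 probability).
for name, p, is_object in [("easy background", 0.01, False), ("hard object", 0.3, True)]:
    print(name, cross_entropy(p, is_object), focal_loss(p, is_object))
```

The easy background candidate keeps only a tiny fraction of its cross-entropy loss, while the hard object keeps a much larger fraction, so the many background candidates no longer dominate the total.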

 

 

💫 Two-Stage Detector

Performs Region Proposal and Classification one after the other to get the result.

Generally slower, but more accurate.

ex) R-CNN, Fast R-CNN, Faster R-CNN

 

👁‍🗨 R-CNN

Uses Selective Search to generate candidate regions (Region Proposals) for the image.

  • Selective Search: hierarchically groups similar Regions based on the color, texture, size, and shape of image pixels
  • Region Proposal: extracts multiple candidate regions that are likely to contain an object

Each candidate region is warped to a fixed size and used as the input to the CNN.

The Feature map from the CNN is classified with an SVM, and a Regressor adjusts the Bounding-box.

Warping forces every region to the same size, which distorts the image or loses information, and the CNN has to run once per candidate region, so the method needs a lot of storage and is slow.

 

 

👁‍🗨 Fast R-CNN

Unlike R-CNN, which applies the CNN to each candidate region, the CNN is applied once to the whole image, and the candidate regions are located on the resulting Feature map.

Each candidate region is turned into a fixed-size Feature vector through RoI Pooling.

  • RoI Pooling: a layer that max-pools the features at the desired locations (regions)
  • Within the proposal region of the feature map, a grid of a predefined size (the input size of the FC layer) is laid over the region and each bin is max-pooled, producing a fixed-size vector
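The bin-wise max pooling described in the bullets above can be sketched in plain Python. This is an illustration only: it assumes a single-channel feature map stored as a list of lists, integer RoI coordinates with exclusive maximums, and floor/ceil bin boundaries; `roi_max_pool` is a hypothetical name.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, output_size):
    """Max-pool the region roi = (x_min, y_min, x_max, y_max) into output_size = (rows, cols) bins."""
    x_min, y_min, x_max, y_max = roi
    out_rows, out_cols = output_size
    roi_h, roi_w = y_max - y_min, x_max - x_min
    pooled = []
    for i in range(out_rows):
        # Row range of bin i: floor for the start, ceil for the end, so no bin is empty.
        y0 = y_min + (i * roi_h) // out_rows
        y1 = y_min + ceil_div((i + 1) * roi_h, out_rows)
        row = []
        for j in range(out_cols):
            x0 = x_min + (j * roi_w) // out_cols
            x1 = x_min + ceil_div((j + 1) * roi_w, out_cols)
            row.append(max(feature_map[y][x]
                           for y in range(y0, y1)
                           for x in range(x0, x1)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]
# A 5-wide, 3-tall region is reduced to a fixed 2x2 output.
print(roi_max_pool(feature_map, roi=(1, 0, 6, 3), output_size=(2, 2)))
# -> [[10, 12], [16, 18]]
```

Whatever the size of the region, the output always has the same shape, which is what lets it feed a fixed-size FC layer.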

 

The Feature vector passes through FC layers, then Softmax does the classification and a Regressor adjusts the Bounding-box.

 

 

👁‍🗨 Faster R-CNN

Replaces the Selective Search step with a deep-learning Region Proposal Network (RPN).

  • RPN: at every position a sliding window visits on the CNN Feature map, candidate regions are predicted using Anchor-boxes
  • Anchor-box: Bounding-boxes predefined with several aspect ratios and sizes
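A small sketch of generating the Anchor-boxes for one sliding-window position. It assumes each anchor has area `scale * scale` and that `ratio` means width divided by height; the three scales and three ratios are illustrative values, and `make_anchors` is a hypothetical name.

```python
import math


def make_anchors(center, scales, ratios):
    """Return (x_min, y_min, x_max, y_max) anchors centered at `center`, one per scale/ratio pair."""
    cx, cy = center
    anchors = []
    for scale in scales:
        for ratio in ratios:
            # Keep the area at scale * scale while changing the width/height ratio.
            width = scale * math.sqrt(ratio)
            height = scale / math.sqrt(ratio)
            anchors.append((cx - width / 2, cy - height / 2,
                            cx + width / 2, cy + height / 2))
    return anchors


anchors = make_anchors((100, 100), scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0))
print(len(anchors))   # 9
print(anchors[1])     # (36.0, 36.0, 164.0, 164.0)
```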

 

The candidate regions from the RPN are sorted by score, and the Non-Maximum Suppression (NMS) algorithm selects the final candidate regions.

  • Non-Maximum Suppression(NMS) 
    • Among the detected bounding boxes, remove those at or below the confidence threshold
    • Sort the boxes by confidence score in descending order and apply the logic below to every box in turn
      • Take the box with the highest confidence score, check every other box that overlaps it, and remove all boxes whose IoU (intersection area divided by union area) is at or above a given threshold
      • A bounding box whose IoU is above that level is assumed to have detected the same object, so the overlapping box is removed
    • Keep only the remaining boxes (ideally one detection box per object)
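The NMS steps above can be sketched as follows. Both thresholds are set to 0.5 only as example values, and the boxes and scores are made up; `iou` and `nms` are hypothetical helper names.

```python
def iou(a, b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.5, iou_threshold=0.5):
    """Return the indices of the boxes kept after Non-Maximum Suppression."""
    # 1. Drop boxes at or below the confidence threshold.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    # 2. Sort the rest by confidence, highest first.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    while candidates:
        # 3. Keep the best remaining box and remove every box that overlaps it too much.
        best = candidates.pop(0)
        keep.append(best)
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 150, 150), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
print(nms(boxes, scores))   # [0, 2]
```

Box 1 is removed because it overlaps box 0 with an IoU of about 0.82, and box 3 is removed by the confidence threshold.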

 

The selected candidate regions go through RoI Pooling to bring them to a common size, and the rest proceeds the same way as Fast R-CNN.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 
