😎 공부하는 징징알파카는 처음이지?

Hyperparameter optimization: Learning rate & batch size & iteration


징징알파카 2022. 9. 5. 15:18

Written 2022-09-05

<This post was written while studying from inhovation97's blog :-) >

https://inhovation97.tistory.com/32

 

Finding the best learning rate & batch size combination (feat. paper review and experimental results)


https://www.slideshare.net/w0ong/ss-82372826

 

Deep Learning with TensorFlow

An introductory presentation prepared for the students in our department.


 

 

 

 

😎 Learning rate

  • How far to move from the current point to the next point; in other words, how finely the model learns at each step
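
The step described above is commonly written as the gradient descent update, in which the learning rate scales the gradient. The symbols here are assumed notation and do not come from the original post: $\theta$ for the parameters, $\eta$ for the learning rate, and $L$ for the loss.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_\theta L(\theta_t)
```

A larger $\eta$ moves the parameters further per step; a smaller $\eta$ moves them less.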

 

▶ When the learning rate is large

  • Each step changes the parameters by a large amount, so the stride is large
  • 🔵 The large stride allows faster convergence, and the risk of getting stuck in a local minimum is lower
  • 🔴 If it is too large, the updates overshoot badly and the loss may not decrease at all => training does not converge
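
To make the overshoot case concrete, here is a minimal sketch on a toy problem. The function f(x) = x**2, the starting point, and both learning rates are assumptions chosen for illustration, not values from the original post.

```python
def gradient_descent(lr, x0=1.0, steps=20):
    """Minimize f(x) = x**2 (gradient 2*x) and return every visited point."""
    xs = [x0]
    for _ in range(steps):
        grad = 2.0 * xs[-1]
        xs.append(xs[-1] - lr * grad)
    return xs

# Each step multiplies x by (1 - 2*lr).
moderate = gradient_descent(lr=0.4)   # factor 0.2: shrinks toward the minimum at 0
too_large = gradient_descent(lr=1.1)  # factor -1.2: flips sign and grows every step

print(abs(moderate[-1]))   # very close to 0
print(abs(too_large[-1]))  # larger than the starting point: no convergence
```

On this toy function any learning rate above 1.0 makes each jump land further from the minimum than the point it started from, which is the "loss does not decrease at all" situation.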

 

▶ When the learning rate is small

  • Each step has a small stride, so the model learns a little at a time
  • 🔵 The small stride makes overshooting unlikely
  • 🔴 The small stride raises the risk of getting stuck in a local minimum
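
The "learns a little at a time" behavior can be sketched by counting steps. The toy function f(x) = x**2, the tolerance, and the two learning rates are assumptions for illustration. The sketch only shows the slower pace; it does not demonstrate the local-minimum risk, because this function has a single minimum.

```python
def steps_to_converge(lr, x0=1.0, tol=1e-3, max_steps=10_000):
    """Count gradient steps on f(x) = x**2 until |x| drops below tol."""
    x = x0
    for step in range(max_steps):
        if abs(x) < tol:
            return step
        x -= lr * 2.0 * x  # gradient of x**2 is 2*x
    return max_steps

small = steps_to_converge(lr=0.01)
moderate = steps_to_converge(lr=0.1)
print(small, moderate)  # the smaller learning rate needs many more steps
```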

 

 

😎 Batch size

  • The number of samples in each chunk (mini-batch) that the dataset is split into
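
A minimal sketch of that split, assuming the dataset is a plain Python list. The helper name `split_into_batches` and the 10-sample dataset are hypothetical.

```python
def split_into_batches(dataset, batch_size):
    """Return consecutive chunks of at most batch_size items."""
    return [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]

samples = list(range(10))  # hypothetical dataset of 10 samples
batches = split_into_batches(samples, batch_size=4)
print([len(b) for b in batches])  # [4, 4, 2]
```

The last batch is smaller whenever the dataset size is not a multiple of the batch size.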

 

▶ When the batch size is large

  • Each update is computed from many samples at once
  • 🔵 Training is fast and converges very quickly up to a certain level -> lower chance of getting stuck in a local optimum
  • 🔴 Higher risk of overfitting than with a small batch -> with a large batch, the computed loss varies little from batch to batch
  • 🔴 If it is too large, each step has too much data to process, so training slows down and memory can run out

 

▶ When the batch size is small

  • There are more iterations per epoch, so more update steps
  • 🔵 Each update uses only a few samples, so the loss has higher variance, which acts as regularization -> training proceeds in finer detail
  • 🔴 With so many steps, training can get stuck in a local minimum
  • 🔴 Training takes a long time
  • 🔴 If it is too small, updates happen excessively often and training becomes unstable
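
The loss-variance points in the large-batch and small-batch lists can be checked with a small simulation. This is a sketch under stated assumptions: the per-sample losses are synthetic Gaussian numbers, not output from a real model, and the batch sizes 4 and 256 are arbitrary.

```python
import random
import statistics

random.seed(0)
# Hypothetical per-sample losses; a real model would produce these.
per_sample_loss = [random.gauss(1.0, 0.5) for _ in range(10_000)]

def batch_loss_spread(losses, batch_size, num_batches=200):
    """Standard deviation of the mean loss over randomly drawn batches."""
    means = [statistics.fmean(random.sample(losses, batch_size))
             for _ in range(num_batches)]
    return statistics.stdev(means)

spread_small = batch_loss_spread(per_sample_loss, batch_size=4)
spread_large = batch_loss_spread(per_sample_loss, batch_size=256)
print(spread_small, spread_large)  # the small batch gives the noisier loss
```

The mean loss of a small batch fluctuates much more from batch to batch. That noise is what the lists above credit with a regularizing effect, and also what they blame for unstable training when the batch is too small.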

 

😎 1 Epoch

  • One pass of training over the entire dataset
  • That is, every sample in the dataset has gone through one forward pass and one backward pass of the model
  • 🔵 More epochs give the randomly initialized weights more updates, so the chance of reaching suitable values goes up
  • 🔴 Too many epochs cause overfitting
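
The definition above can be sketched as a nested loop: the outer loop runs over epochs and the inner loop over batches. No model is trained here; the counter `times_seen` is a hypothetical stand-in for the forward and backward pass, and the sizes are illustrative.

```python
def train_epochs(num_samples, batch_size, epochs):
    """Walk the dataset in batches; count visits per sample and update steps."""
    times_seen = [0] * num_samples
    updates = 0
    for _ in range(epochs):
        for start in range(0, num_samples, batch_size):
            for i in range(start, min(start + batch_size, num_samples)):
                times_seen[i] += 1  # stand-in for the forward/backward pass
            updates += 1            # one parameter update per batch
    return times_seen, updates

times_seen, updates = train_epochs(num_samples=10, batch_size=4, epochs=3)
print(set(times_seen), updates)  # {3} 9
```

After 3 epochs every sample has been visited exactly 3 times, and the 3 batches per epoch add up to 9 updates.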

 

 

😎 Iteration

  • Training on one mini-batch
  • Parameters are updated once per batch, so the number of iterations in one epoch equals both the number of parameter updates and the total number of batches in the dataset
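
The relationship between these quantities can be written as a short worked equation. The symbols are assumed notation, not from the original post: $N$ for the number of samples, $B$ for the batch size, and $E$ for the number of epochs. The numbers are purely illustrative.

```latex
\text{iterations per epoch} = \left\lceil \frac{N}{B} \right\rceil,
\qquad
\text{total updates} = E \cdot \left\lceil \frac{N}{B} \right\rceil
% Example with N = 1000, B = 32, E = 10:
\left\lceil \frac{1000}{32} \right\rceil = \lceil 31.25 \rceil = 32,
\qquad
10 \cdot 32 = 320
```

The ceiling accounts for the final, smaller batch when $N$ is not a multiple of $B$.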

 
