AI Notes [Part 4] :: The Beginning of Deep Learning

๋”ฅ๋Ÿฌ๋‹์˜ ์‹œ์ž‘

์ด๋Ÿฌํ•œ multi-layer ์˜ forward-propagation ๊ณผ์ •์„ ์‹์œผ๋กœ ๋‚˜ํƒ€๋‚ด๋ณด๋ฉด,

$$h_1 = f(x_{11}w_{11} + x_{12}w_{21}), \qquad h_2 = f(x_{11}w_{12} + x_{12}w_{22})$$

$$net = h_1 w_{13} + h_2 w_{23} = f(x_{11}w_{11} + x_{12}w_{21})\,w_{13} + f(x_{11}w_{12} + x_{12}w_{22})\,w_{23}$$

Now suppose that f, the activation function, is a linear function.

๊ทธ๋ ‡๋‹ค๋ฉด f(x) = ax์˜ ํ˜•ํƒœ์ด๋ฏ€๋กœ,

net = x11*a(w11*w13+w12*w23)+x12*a(w21*w13+w22*w23)

This net is just the original input multiplied by constants, so the whole network can be rewritten as a single layer. In other words, even after passing through several layers, the model could not reshape the problem into an easier one and produced exactly the same results as a single-layer perceptron.

๋”ฐ๋ผ์„œ, multi-layer perceptron์—์„œ๋Š” non-linear ํ•จ์ˆ˜๋ฅผ activation function์œผ๋กœ ์‚ฌ์šฉํ•ด์•ผํ•œ๋‹ค.

activation function

Types

sigmoid function

$$f(x) = \frac{1}{1 + e^{-x}}$$

  • Like the hard-limiting activation function used earlier, it has a thresholding characteristic.
  • At the same time, it is a continuous function.
  • It is easy to differentiate.

multi-layer perceptron์„ ๋งŒ๋“ค๊ธฐ ์œ„ํ•ด์„œ non-linear ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•ด์•ผํ•˜๋ฉฐ, ์•ž์„œ ์„ค๋ช…ํ•œ ๋ฏธ๋ถ„์˜ ๋ฐฉ์‹์œผ๋กœ w๋ฅผ ํ•™์Šตํ•˜๋ ค๋ฉด ๋ฏธ๋ถ„ ๊ฐ€๋Šฅํ•œ ํ•จ์ˆ˜์—ฌ์•ผ ํ•˜๊ธฐ ๋•Œ๋ฌธ์— sigmoid ํ•จ์ˆ˜๋Š” ์ ์ ˆํ•˜๋‹ค.

๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ์ ์ ˆํ•œ ํ•จ์ˆ˜๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  • sigmoid
  • tanh
  • LReLU
  • ReLU
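For reference, here is a minimal NumPy sketch of the four activation functions above; the 0.01 negative slope for LReLU is a common default and my own assumption, not a value from the original text:

```python
import numpy as np

def sigmoid(x):
    # Smooth, differentiable thresholding around 0
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Like sigmoid but zero-centered, ranging over (-1, 1)
    return np.tanh(x)

def relu(x):
    # Zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

def lrelu(x, alpha=0.01):
    # Leaky ReLU: small non-zero slope for negative inputs
    return np.where(x > 0, x, alpha * x)

x = np.linspace(-3, 3, 7)
for f in (sigmoid, tanh, relu, lrelu):
    print(f.__name__, np.round(f(x), 3))
```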
๋ฐ˜์‘ํ˜•