Object Localization & Object Detection

Object localization and Object detection with Sliding Window Algorithm, and its Convolutional Implementation


Object Localization

Generating a bounding box for an object in an image.

Bounding Box : Parameters

Image top left: (0,0)
Image bottom right: (1,1)
Box midpoint: $(b_x,b_y)$
Box height: $b_h$
Box width: $b_w$

All Parameters : $(b_x,b_y, b_h, b_w)$
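Since the image corners are fixed at (0,0) and (1,1), the four parameters are fractions of the image size. As a minimal sketch of how they map back to pixels (the helper name `box_to_pixel_corners` and the 640x480 image size are assumptions made for illustration):

```python
def box_to_pixel_corners(b_x, b_y, b_h, b_w, img_w, img_h):
    """Convert a normalized midpoint box to pixel corner coordinates."""
    # Scale the normalized midpoint and size to pixels.
    cx, cy = b_x * img_w, b_y * img_h
    w, h = b_w * img_w, b_h * img_h
    # The corners sit half a width / half a height away from the midpoint.
    x_min, y_min = cx - w / 2, cy - h / 2
    x_max, y_max = cx + w / 2, cy + h / 2
    return x_min, y_min, x_max, y_max


# Hypothetical box: centered horizontally, in the lower part of the image.
corners = box_to_pixel_corners(0.5, 0.75, 0.25, 0.5, img_w=640, img_h=480)
print(corners)
```

Note that $b_x$ and $b_w$ scale with the image width, while $b_y$ and $b_h$ scale with the image height.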

$P_c$ = 1 if an object is present

$C_1, C_2, C_3, \dots$: Class labels


For the example case, say there are three classes: Pedestrian ($C_1$), Car ($C_2$) and Bike ($C_3$). If there is no object, all three class labels $C_1, C_2, C_3$ will be equal to 0.

$P_c$ = 1 if any of the three objects is present in the image, and $P_c$ = 0 if there is no object and only background is present.

Output unit:

y = $\left[ \eqalign{P_c\cr b_x\cr b_y\cr b_h\cr b_w\cr C_1\cr C_2 \cr C_3} \right]$

For example, for a Car with bounding box parameters $b_x, b_y, b_h, b_w$, y = $\left[ \eqalign{1 \cr b_x\cr b_y\cr b_h\cr b_w\cr 0\cr 1 \cr 0} \right]$

For no object, y = $\left[ \eqalign{0\cr ?\cr ?\cr ?\cr ?\cr ?\cr ? \cr ?} \right]$ $\rightarrow$ ‘?’ means don’t care about the value
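The target vectors above can be assembled with a few lines of code. This is only a sketch: the helper `make_target` and the `CLASSES` list are names invented here, and `None` is used as a stand-in for the '?' (don't care) entries.

```python
CLASSES = ["pedestrian", "car", "bike"]  # C_1, C_2, C_3


def make_target(label=None, box=None):
    """Build the 8-element target y = [P_c, b_x, b_y, b_h, b_w, C_1, C_2, C_3]."""
    if label is None:
        # Background only: P_c = 0 and every other entry is a don't-care.
        return [0] + [None] * 7
    b_x, b_y, b_h, b_w = box
    # One-hot class labels: 1 for the object's class, 0 for the others.
    one_hot = [1 if name == label else 0 for name in CLASSES]
    return [1, b_x, b_y, b_h, b_w] + one_hot


print(make_target("car", (0.5, 0.75, 0.25, 0.5)))
print(make_target())
```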

Loss Functions

$L(\hat y, y) = (\hat y_1 - y_1)^2 + \dots + (\hat y_8 - y_8)^2, \quad \text{if } y_1 = 1$

$L(\hat y, y) = (\hat y_1 - y_1)^2, \quad \text{if } y_1 = 0$ (since $P_c$ = 0, all other values are equal to ‘?’)
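The two cases of the loss translate directly into code. The function name `localization_loss` is an assumption for this sketch, and the vectors are plain Python lists with $y_1 = P_c$ stored at index 0.

```python
def localization_loss(y_hat, y):
    """Squared-error loss over the 8-element output, as in the two cases above."""
    if y[0] == 1:
        # Object present: every component contributes to the loss.
        return sum((p - t) ** 2 for p, t in zip(y_hat, y))
    # No object: only the P_c term counts, the rest are don't-cares.
    return (y_hat[0] - y[0]) ** 2


# Car example: the prediction gets everything right except the class label C_2.
y_car = [1, 0.5, 0.5, 0.25, 0.25, 0, 1, 0]
y_hat_car = [1, 0.5, 0.5, 0.25, 0.25, 0, 0, 0]
print(localization_loss(y_hat_car, y_car))

# Background example: None marks the '?' entries, which are never read.
y_background = [0] + [None] * 7
y_hat_background = [0.5] + [0.1] * 7
print(localization_loss(y_hat_background, y_background))
```

The branch on `y[0]` is what keeps the don't-care entries out of the computation when no object is present.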

Object Detection

Sliding window detection algorithm

Pass a window over the image with a certain stride and run the classifier on each crop. Start with a small window and move it all over the image, then increase the size of the window and move it all over the image again.

Disadvantage: Huge computation cost, slow algorithm
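To get a feel for the cost, here is a small sketch that enumerates the window positions. The function names, image size, window sizes and stride are all made up for illustration; each window would be one separate run of the classifier.

```python
def sliding_windows(img_w, img_h, win, stride):
    """Yield (left, top, width, height) for every square window that fits."""
    for top in range(0, img_h - win + 1, stride):
        for left in range(0, img_w - win + 1, stride):
            yield left, top, win, win


def count_windows(img_w, img_h, sizes, stride):
    """Total number of crops when sweeping several window sizes."""
    return sum(1 for s in sizes for _ in sliding_windows(img_w, img_h, s, stride))


# A 14x14 window on a 16x16 image with stride 2 gives a 2x2 grid of positions.
print(list(sliding_windows(16, 16, win=14, stride=2)))

# Sweeping three window sizes over a 640x480 image quickly adds up.
print(count_windows(640, 480, sizes=[64, 128, 256], stride=16))
```

A smaller stride or more window sizes increases the count further, which is where the computation cost comes from.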

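Convolutional implementation of sliding window: computes the predictions for all window positions in a single forward pass over the whole image, so overlapping windows share computation instead of being processed one crop at a time.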
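The equivalence can be checked on a toy example. This is a sketch under simplifying assumptions, not a real detector: the "network" is one 3x3 convolution with ReLU followed by a fully connected layer rewritten as a 3x3 convolution, the weights `k1` and `k2` are arbitrary integers, the window is 5x5, and the effective stride is 1.

```python
def conv2d_valid(x, k):
    """Valid cross-correlation of a 2-D list x with a 2-D kernel k (stride 1)."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def relu(x):
    return [[max(0, v) for v in row] for row in x]


def tiny_net(x, k1, k2):
    """Conv layer + ReLU, then a 'fully connected' layer written as a conv."""
    return conv2d_valid(relu(conv2d_valid(x, k1)), k2)


def crop(x, top, left, size):
    return [row[left:left + size] for row in x[top:top + size]]


# Hypothetical integer weights and a 7x7 toy image, so comparisons are exact.
k1 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
k2 = [[1, -1, 0], [0, 2, 1], [-1, 0, 1]]
image = [[(3 * r + 5 * c) % 7 for c in range(7)] for r in range(7)]
WIN = 5  # on a 5x5 crop, tiny_net produces a single 1x1 output

# Explicit sliding window: crop each of the 3x3 window positions, run the net.
slow = [[tiny_net(crop(image, t, l, WIN), k1, k2)[0][0] for l in range(3)]
        for t in range(3)]

# Convolutional implementation: one pass over the whole 7x7 image.
fast = tiny_net(image, k1, k2)

print(fast == slow)  # the 3x3 output grid holds one prediction per window
```

In the single pass, the first convolution is computed once for the whole image and reused by every window, whereas the explicit loop recomputes it for each overlapping crop.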


Written on December 22, 2017