# Hyperparameter Tuning

Hyperparameter tuning in deep learning involves: learning rate $\alpha$; momentum/Adam terms $\beta$, $\beta_1$, $\beta_2$, $\epsilon$; number of layers; number of hidden units; learning rate decay; mini-batch size.

#### Some hyperparameters are more important than others

In rough order of importance:

1. Learning rate $\alpha$
2. Momentum term $\beta$, number of hidden units, mini-batch size
3. Number of layers, learning rate decay
4. Adam parameters, usually left at their defaults: $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$

#### Importance of picking an appropriate scale for hyperparameters

Suppose we are tuning the number of hidden units in a layer somewhere between 50 and 100, or the number of layers between 2 and 4. In such cases sampling uniformly at random makes sense. This is not true for all hyperparameters, though.

For example, say the learning rate $\alpha$ lies somewhere between 0.0001 and 1. Sampling uniformly would spend about 90% of the samples between 0.1 and 1, so instead search on a log scale, which devotes equal effort to each order of magnitude.

Python implementation (the original snippet used `np.random.randn` and `10^r`; it should be a uniform draw and `10 ** r`):

```python
import numpy as np

r = -4 * np.random.rand()  # r will be uniform in [-4, 0]
alpha = 10 ** r
```

`alpha` will be between $10^{-4}$ and $10^0$.

##### Generalization for a log scale

To search between $10^a$ and $10^b$, where $a$ and $b$ are the ends of the scale, sample $r$ uniformly in $[a, b]$ and set the hyperparameter to $10^r$. In the case above, $a = \log_{10} 0.0001 = -4$ and $b = \log_{10} 1 = 0$, so $r$ will be in $[-4, 0]$.
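This recipe can be wrapped in a small helper; the name `sample_log_uniform` is just for illustration:

```python
import numpy as np

def sample_log_uniform(low, high):
    """Sample a value between low and high, uniform on a log10 scale."""
    a, b = np.log10(low), np.log10(high)
    r = np.random.uniform(a, b)  # r is uniform in [a, b]
    return 10 ** r

alpha = sample_log_uniform(0.0001, 1)  # alpha lands in [1e-4, 1]
```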

#### Hyperparameter for exponentially weighted averages $\beta$

Say $\beta$ ranges from 0.9 to 0.999. An exponentially weighted average with parameter $\beta$ averages over roughly the last $\frac{1}{1-\beta}$ values:

- $\beta = 0.9$: averaging over roughly the last 10 values
- $\beta = 0.999$: averaging over roughly the last 1000 values

Similar to the log scale above, explore $1-\beta$ from 0.1 down to 0.001:

$1-\beta = 10^r$ with $r \in [-3, -1]$, so $\beta = 1 - 10^r$.
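A sketch of this sampling in Python (assuming NumPy):

```python
import numpy as np

# Sample 1 - beta on a log scale over [0.001, 0.1],
# so that beta ranges over [0.9, 0.999].
r = np.random.uniform(-3, -1)  # r is uniform in [-3, -1]
beta = 1 - 10 ** r
```

Sampling this way spends as much effort near $\beta = 0.999$ as near $\beta = 0.9$, which matters because the averaging window $\frac{1}{1-\beta}$ is very sensitive to $\beta$ when $\beta$ is close to 1.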

#### Pandas vs Caviar

- Re-test hyperparameters occasionally; intuitions do get stale.

Approaches:

1. Babysit one model and keep working on it (when computational capacity is limited) [Pandas]
2. Train multiple models in parallel and compare their learning curves [Caviar]
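A minimal sketch of the Caviar-style setup, drawing several random configurations to train in parallel; the ranges and dictionary keys here are illustrative, not from the source:

```python
import numpy as np

def random_config():
    """Draw one hyperparameter configuration (illustrative ranges)."""
    return {
        "alpha": 10 ** np.random.uniform(-4, 0),      # learning rate, log scale
        "beta": 1 - 10 ** np.random.uniform(-3, -1),  # log scale on 1 - beta
        "hidden_units": np.random.randint(50, 101),   # uniform scale is fine here
    }

configs = [random_config() for _ in range(20)]  # one model per config, in parallel
```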
Written on December 16, 2017