Friday, 6 September 2019

Case Study: Separate Close-by Classes (Y values)

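When training a DNN, a class (Y value) that lies too close to another class is hard to separate from it (linear separation). Training therefore has to run longer: several times the number of batches needed for easy-to-separate classes. In the data below, the inputs [10,10] and [MAX,MAX] = [11,11] belong to different classes but are nearly identical after normalisation.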
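
To make "close" concrete, here is a minimal sketch that measures the distances between the normalised inputs from the source code further down. The class grouping follows the Y values in that code; the helper names `normalise` and `distance` are hypothetical and used only for illustration.

```python
import math

MAX = 11
# Inputs from the post's data set, grouped by class (Y value).
class_a = [[0, 0], [0, 1], [1, 0]]
class_b = [[10, 10]]
class_c = [[MAX, MAX]]

def normalise(points, max_value):
    """Scale every coordinate into [0, 1] by dividing by max_value."""
    return [[v / max_value for v in p] for p in points]

def distance(p, q):
    """Euclidean distance between two points (hypothetical helper)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

a, b, c = (normalise(cls, MAX) for cls in (class_a, class_b, class_c))

# Smallest gap between the first class and the second class.
gap_ab = min(distance(p, q) for p in a for q in b)
# Gap between the two close-by classes.
gap_bc = distance(b[0], c[0])

print("gap A-B:", round(gap_ab, 3))
print("gap B-C:", round(gap_bc, 3))
```

The second and third classes sit roughly nine times closer to each other than the second class sits to the first, which is why they take longer to pull apart.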

Normal training flow:
  • Normalise input values to the range [0,1]
  • Use ReLU for hidden layers (ReLU is chosen for performance)
  • Use sigmoid for the output layer (per-class probabilities; each is computed independently, so they need not sum to 1)
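
The two activations in the flow above can be sketched in plain Python. This is only an illustration with made-up pre-activation values, not the TensorFlow implementation used in the source code.

```python
import math

def relu(x):
    """ReLU: pass positive values through, clamp negatives to zero."""
    return x if x > 0.0 else 0.0

def sigmoid(x):
    """Sigmoid: squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

raw_scores = [2.0, -1.0, 0.5]  # made-up pre-activation values
hidden = [relu(v) for v in raw_scores]
outputs = [sigmoid(v) for v in raw_scores]

print("relu:   ", hidden)
print("sigmoid:", outputs, "sum:", sum(outputs))
```

Each sigmoid output is a score for its own class only. If the outputs must form a distribution that sums to 1, softmax is the usual alternative.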
Training process:
  • Suppose training on easy-to-separate inputs finishes in N batches
  • For close-by classes, run several times N batches
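
One way to read "several times N batches" is to train in rounds of N steps and stop once the loss is small enough. The sketch below assumes that reading. `train_in_rounds`, the toy model and the target value are all hypothetical stand-ins, not part of the original code.

```python
def train_in_rounds(step, loss, n, max_rounds, target):
    """Run rounds of n training steps until loss() <= target or max_rounds is hit.

    Returns (rounds_run, final_loss).
    """
    rounds = 0
    while rounds < max_rounds and loss() > target:
        for _ in range(n):
            step()
        rounds += 1
    return rounds, loss()

# Toy stand-in for a model: minimise (w - 3)^2 with gradient descent.
state = {"w": 0.0}

def toy_step(lr=0.01):
    # Gradient of (w - 3)^2 with respect to w is 2 * (w - 3).
    state["w"] -= lr * 2.0 * (state["w"] - 3.0)

def toy_loss():
    return (state["w"] - 3.0) ** 2

rounds, final = train_in_rounds(toy_step, toy_loss, n=50, max_rounds=20, target=1e-6)
print("rounds:", rounds, "final loss:", final)
```

The source code below takes the simpler route of fixing the total step count (2000) in advance.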
The losses below, printed every 200 training steps plus once at the end, show when separation is achieved (marked by a colour change in the original output):
Loss: 3.7734208
Loss: 1.4345056
Loss: 0.921915
Loss: 0.6754134
Loss: 0.62778795
Loss: 0.29518488
Loss: 0.019447815
Loss: 0.007134443
Loss: 0.00425537
Loss: 0.0029799175
Loss: 0.0022652268 (Last)
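
A small sketch for locating the sharpest relative drop in the loss values listed above. It is an assumption that this drop coincides with the point where the classes separate. `sharpest_drop` is a hypothetical helper.

```python
# Loss values copied from the training output above.
losses = [3.7734208, 1.4345056, 0.921915, 0.6754134, 0.62778795,
          0.29518488, 0.019447815, 0.007134443, 0.00425537,
          0.0029799175, 0.0022652268]
LOG_EVERY = 200  # the training loop prints the loss every 200 steps

def sharpest_drop(values):
    """Return the index i (i >= 1) that minimises values[i] / values[i - 1]."""
    return min(range(1, len(values)), key=lambda i: values[i] / values[i - 1])

i = sharpest_drop(losses)
print("sharpest drop between steps", (i - 1) * LOG_EVERY, "and", i * LOG_EVERY)
print("loss fell from", losses[i - 1], "to", losses[i])
```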

Loss curve:

Source code:
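#libs
#Written for the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
#On TensorFlow 2.x, use "import tensorflow.compat.v1 as tf" and call tf.disable_v2_behavior() first.
import tensorflow        as tf;
import matplotlib.pyplot as pyplot;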

#data
MAX = 11;
X = [[0,0],   [0,1],   [1,0],   [10,10], [MAX,MAX]];
Y = [[1,0,0], [1,0,0], [1,0,0], [0,1,0], [0,0,1]  ];
Batch_Size = 5;

#normalise
for I in range(len(X)):
  X[I][0] = X[I][0]/MAX;
  X[I][1] = X[I][1]/MAX;
#end for

#model
Input     = tf.placeholder(dtype=tf.float32, shape=[Batch_Size,2]);
Expected  = tf.placeholder(dtype=tf.float32, shape=[Batch_Size,3]); #3 classes

Weight1   = tf.Variable(tf.random_uniform(shape=[2,20], minval=-1, maxval=1));
Bias1     = tf.Variable(tf.random_uniform(shape=[  20], minval=-1, maxval=1));
Hidden1   = tf.nn.relu(tf.matmul(Input,Weight1) + Bias1);

Weight2   = tf.Variable(tf.random_uniform(shape=[20,10], minval=-1, maxval=1));
Bias2     = tf.Variable(tf.random_uniform(shape=[   10], minval=-1, maxval=1));
Hidden2   = tf.nn.relu(tf.matmul(Hidden1,Weight2) + Bias2);

Weight3   = tf.Variable(tf.random_uniform(shape=[10,3], minval=-1, maxval=1));
Bias3     = tf.Variable(tf.random_uniform(shape=[   3], minval=-1, maxval=1));
Output    = tf.sigmoid(tf.matmul(Hidden2,Weight3) + Bias3);

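#loss: sum of squared errors over all outputs in the batch
Loss      = tf.reduce_sum(tf.square(Expected-Output));
#plain gradient descent with learning rate 0.1
Optimiser = tf.train.GradientDescentOptimizer(1e-1);
Training  = Optimiser.minimize(Loss);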

#train
Sess = tf.Session();
Init = tf.global_variables_initializer();
Sess.run(Init);

Losses = [];
for I in range(2000):
  if (I%200==0):
    Lossvalue = Sess.run(Loss, feed_dict={Input:X, Expected:Y});
    Losses += [Lossvalue];
    print("Loss:",Lossvalue);
  #end if
  
  Sess.run(Training, feed_dict={Input:X, Expected:Y});
#end for

#result: loss
Lastloss = Sess.run(Loss, feed_dict={Input:X, Expected:Y});
Losses  += [Lastloss];
print("Loss:",Lastloss,"(Last)");

#result: eval
Evalresult = Sess.run(Output, feed_dict={Input:X, Expected:Y});
for I in range(Batch_Size):
  Evalresult[I][0] = round(Evalresult[I][0],3);  
  Evalresult[I][1] = round(Evalresult[I][1],3);
  Evalresult[I][2] = round(Evalresult[I][2],3);
#end for
print("Eval:\n"+str(Evalresult));

#result: diagram
print("Loss curve:");
pyplot.plot(Losses);
pyplot.show(); #needed outside notebooks to display the figure
#eof

Colab link:
https://colab.research.google.com/drive/1y3wV_5K__46tbxNHq-S1n_W4LOF0ANYx
