The Data Science Lab
Neural Networks Using Python and NumPy
The rnd object establishes a random number generator that can be used instead of the default global random object. The initializeWeights method uses rnd to set the weights and biases to random values between -0.01 and +0.01.
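A minimal sketch of how such a generator might be created in the class __init__ method is shown below; the parameter names and the seed argument are assumptions for illustration, not code taken from the demo:

import random  # standard library module that provides the Random class

class NeuralNetwork:
  def __init__(self, numInput, numHidden, numOutput, seed):
    # ... node, weight and bias setup elided ...
    self.rnd = random.Random(seed)  # dedicated generator, independent of the global random module
    self.initializeWeights()        # uses self.rnd to set small random weights and biases

Using a dedicated Random object rather than the global random module makes each NeuralNetwork instance's random sequence independent and reproducible.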
Computing Output Values
The input-output mechanism is implemented in class method computeOutputs. The definition begins:
def computeOutputs(self, xValues):
  print("\n ihWeights: ")
  showMatrix(self.ihWeights, 2)
  print("\n hBiases: ")
  showVector(self.hBiases, 2)
  print("\n hoWeights: ")
  showMatrix(self.hoWeights, 2)
  print("\n oBiases: ")
  showVector(self.oBiases, 2)
  ...
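The showVector and showMatrix calls display the current weights and bias values. They are program-defined display helpers rather than NumPy functions; minimal sketches, assuming each accepts an array and the number of decimals to display, might look like this:

def showVector(v, dec):
  fmt = "%." + str(dec) + "f"  # e.g. "%.2f" when dec = 2
  print(" ".join(fmt % x for x in v))

def showMatrix(m, dec):
  for row in m:  # one matrix row per output line
    showVector(row, dec)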
In most situations you wouldn't print out the weights and bias values, but the ability to do so is useful during program development and debugging. Next, two local vectors are initialized:
hSums = np.zeros(shape=[self.nh], dtype=np.float32)
oSums = np.zeros(shape=[self.no], dtype=np.float32)
These vectors hold the pre-activation sums of products plus bias values for the hidden and output nodes. You could omit these vectors and compute directly into the self.hNodes and self.oNodes vectors, but then you'd have to remember to explicitly zero out those vectors before accumulating into them. Next, the input values are copied into the self.iNodes vector:
for i in range(self.ni):
  self.iNodes[i] = xValues[i]
In some scenarios, to improve performance, you might want to use the values in the xValues parameter directly, rather than take time to copy the values into the self.iNodes vector. Next, the hidden node values are computed:
for j in range(self.nh):
  for i in range(self.ni):
    hSums[j] += self.iNodes[i] * self.ihWeights[i][j]

for j in range(self.nh):
  hSums[j] += self.hBiases[j]

print("\n pre-tanh activation hidden node values: ")
showVector(hSums, 4)

for j in range(self.nh):
  self.hNodes[j] = self.hypertan(hSums[j])

print("\n after activation hidden node values: ")
showVector(self.hNodes, 4)
Other than having to be careful with matrix and vector indexing, the code is relatively straightforward. The demo uses a program-defined hypertan function, which is just a wrapper around the built-in math.tanh function:
@staticmethod
def hypertan(x):
  if x < -20.0:
    return -1.0
  elif x > 20.0:
    return 1.0
  else:
    return math.tanh(x)
When x is less than -20.0 or greater than +20.0, the value of tanh(x) is very, very close to -1 or +1, respectively, so the hypertan wrapper is a bit more efficient in most situations than calling math.tanh directly.
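For reference, the same hidden-layer computation can be written with NumPy matrix operations instead of explicit loops. This is an alternative sketch, not the demo's code, and it assumes the same instance fields shown above:

hSums = np.dot(self.iNodes, self.ihWeights) + self.hBiases  # [nh] vector of pre-activation sums
self.hNodes = np.tanh(hSums)  # np.tanh handles extreme values without overflow

The explicit-loop version used in the demo maps directly to the underlying sum-of-products math, which makes it easier to follow and to step through in a debugger.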
The definition of computeOutputs concludes by calculating the values of the output nodes:
...
for k in range(self.no):
  for j in range(self.nh):
    oSums[k] += self.hNodes[j] * self.hoWeights[j][k]

for k in range(self.no):
  oSums[k] += self.oBiases[k]

print("\n pre-softmax output values: ")
showVector(oSums, 4)

softOut = self.softmax(oSums)
for k in range(self.no):
  self.oNodes[k] = softOut[k]

result = np.zeros(shape=self.no, dtype=np.float32)
for k in range(self.no):
  result[k] = self.oNodes[k]
return result
Because the softmax function requires all of the pre-activation output sums of products in order to compute a common divisor, it's more efficient to define a softmax implementation that operates on the entire vector of pre-activation sums rather than on each individual sum. The implementation of function softmax uses the "max trick" to reduce the possibility of arithmetic overflow:
@staticmethod
def softmax(oSums):
  result = np.zeros(shape=[len(oSums)], dtype=np.float32)
  m = max(oSums)
  divisor = 0.0
  for k in range(len(oSums)):
    divisor += math.exp(oSums[k] - m)
  for k in range(len(result)):
    result[k] = math.exp(oSums[k] - m) / divisor
  return result
The softmax function is an interesting topic in its own right, and you can find more information in the Wikipedia article on the function.
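If you prefer vectorized code, the same max-trick computation can be expressed with NumPy operations. This is an alternative sketch, not the article's implementation:

def softmaxNp(oSums):
  # the max trick: subtracting the largest value makes the biggest exponent 0, so exp can't overflow
  shifted = oSums - np.max(oSums)
  exps = np.exp(shifted)
  return exps / np.sum(exps)

In either form, the returned values are all between 0.0 and 1.0 and sum to 1.0, so they can be loosely interpreted as probabilities.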
Setting Weights and Bias Values
The NeuralNetwork class defines four methods for working with the weights and bias values: setWeights, getWeights, initializeWeights and totalWeights. The totalWeights method calculates the total number of weights and biases needed for a neural network with the given number of input, hidden and output nodes:
@staticmethod
def totalWeights(nInput, nHidden, nOutput):
  tw = (nInput * nHidden) + (nHidden * nOutput) + \
       nHidden + nOutput
  return tw
I defined the method as static so it would be available to code outside the NeuralNetwork class without having to instantiate a class object.
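For example, a network with 3 input, 4 hidden and 2 output nodes (illustrative dimensions, not necessarily those of the demo) requires (3 * 4) + (4 * 2) + 4 + 2 = 26 weights and bias values. Because the method is static, it can be called without instantiating the class:

numWts = NeuralNetwork.totalWeights(3, 4, 2)  # 26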
The setWeights method accepts a vector of float32 values and then assigns those values in the following order: input-to-hidden weights, hidden biases, hidden-to-output weights, output biases. This order is arbitrary.
The getWeights method traverses the ihWeights, hBiases, hoWeights, and oBiases data structures and stores their values into a one-dimensional vector.
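A minimal sketch of what setWeights might look like, given the serialization order just described (the loop structure here is an assumption; only the ordering comes from the article):

def setWeights(self, weights):
  idx = 0
  for i in range(self.ni):  # input-to-hidden weights
    for j in range(self.nh):
      self.ihWeights[i][j] = weights[idx]; idx += 1
  for j in range(self.nh):  # hidden node biases
    self.hBiases[j] = weights[idx]; idx += 1
  for j in range(self.nh):  # hidden-to-output weights
    for k in range(self.no):
      self.hoWeights[j][k] = weights[idx]; idx += 1
  for k in range(self.no):  # output node biases
    self.oBiases[k] = weights[idx]; idx += 1

The getWeights method simply traverses the same structures in the same order, reading values into a result vector instead of writing them.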
The initializeWeights method uses the totalWeights and setWeights methods to set the weights and bias values to small random values between -0.01 and +0.01:
def initializeWeights(self):
  numWts = self.totalWeights(self.ni, self.nh, self.no)
  wts = np.zeros(shape=[numWts], dtype=np.float32)
  lo = -0.01; hi = 0.01
  for idx in range(len(wts)):
    wts[idx] = (hi - lo) * self.rnd.random() + lo
  self.setWeights(wts)
Recall that initializeWeights is called by the __init__ method. However, in the demo program, the initial random weights and bias values are not used because the main function calls setWeights. Initializing neural network weights and bias values is somewhat more subtle than you might guess, but for most scenarios using a uniform random distribution is OK.
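Putting the pieces together, a hypothetical usage pattern might look like the following. The constructor signature, dimensions and input values are assumptions for illustration; the demo's main function supplies its own weights via setWeights:

nn = NeuralNetwork(3, 4, 2, seed=0)  # assumed constructor signature
numWts = NeuralNetwork.totalWeights(3, 4, 2)
wts = np.full(numWts, 0.01, dtype=np.float32)  # placeholder weight and bias values
nn.setWeights(wts)
xValues = np.array([1.0, 2.0, 3.0], dtype=np.float32)
yValues = nn.computeOutputs(xValues)  # two output values that sum to 1.0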
Wrapping Up
Python has been used for many years, and it seems as if Python is becoming the most common language for accessing sophisticated neural network libraries such as CNTK, TensorFlow and others. At the time of writing this article, Visual Studio 2017 had just been released. Originally Visual Studio 2017 was slated to include full support for Python; however, at the last moment Python support was pulled because the code wasn't quite ready. I've talked to the Python and Visual Studio people at Microsoft and they tell me that full support for Python in Visual Studio will come soon. [Editor's Note: The Python tools are available as of May 12. More information from Microsoft's Visual Studio blog here.]
About the Author
Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Azure and Bing. James can be reached at [email protected].