Shufang Ci

Understanding a Neural Network by Making Pizza


Neural networks are currently a hot field, especially in healthcare. People have talked about using them to score pathology slides and mammograms, and to mine the EMR for connections. However, they can be confusing. According to Maureen Caudill, a neural network is “a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs.” Can’t figure out what is going on in Maureen’s definition? No worries. At healthcare.ai, we are all about making machine learning transparent. Let’s step out of the hospital and into the kitchen.

Imagine we are trying to make a pizza and a sandwich, but we don’t know how. We are only given some ingredients: flour, meat, water, and salt. No recipes. We have no choice but to randomly guess how much of each ingredient we should add. The figure below shows our workflow.

What are Input Layers, Neurons, and Weights?

First, we do some trial and error. If we choose 1lb flour + 0.5lb meat + 0.7lb water + 0.6lb salt, simply dump them in a bowl, and mix them, we might get a dough. If we choose 0.5lb flour + 0.8lb meat + 0.6lb water + 0.7lb salt, put them in another bowl, and mix them, we might get a sausage. Here, the ingredients represent the “neurons” of a neural network. Since we start building our food from these ingredients, we call them “input neurons.” The input neurons make up the “input layer” in a neural network. The amount of each ingredient is called a “weight,” and its initial value can be chosen randomly. Dumping everything in the bowl is messy at the beginning, but with some hand mixing and kneading, the mixture transforms into a better-looking shape: dough or sausage. The way we transform the ingredients from one state to another is called the “activation function” in a neural network.
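
If you prefer to see this in code, here is a minimal sketch of one “mixing bowl” in Python: the input neurons are the ingredients, the weights are the randomly guessed amounts, and an activation function turns the weighted mixture into a new state. The specific numbers and the sigmoid activation are illustrative assumptions, not a real recipe.

```python
# A minimal sketch of one "mixing bowl."
# The numbers and the sigmoid activation are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Input neurons: the raw ingredients on hand (flour, meat, water, salt).
ingredients = np.array([1.0, 1.0, 1.0, 1.0])

# Weights: how much of each ingredient to use, randomly guessed at first,
# just like our 1lb flour + 0.5lb meat + 0.7lb water + 0.6lb salt attempt.
weights = rng.random(4)

def activation(x):
    """The 'mixing and kneading' that turns a raw mixture into a tidier state."""
    return 1.0 / (1.0 + np.exp(-x))  # sigmoid, one common choice

dough = activation(ingredients @ weights)  # weighted sum of inputs, then activation
print(dough)                               # a single number describing the "dough"
```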

What are Hidden Layers and Output Layers?

Next, we go through a similar process, using dough and sausage as our intermediate ingredients. The layers made up of these intermediate products are called “hidden layers” in a neural network, and the intermediate products themselves are the “hidden neurons.” We choose 1 piece of dough and 5 pieces of sausage, place them together in the oven, and get a pizza. We choose 2 pieces of dough and 6 pieces of sausage, place them together in the oven, and get a sandwich. Here, we initialized another set of “weights”: 1 piece, 2 pieces, etc. We also used a new way to process the ingredients: baking. It is another type of activation function in a neural network. When the oven is done, we get the “outputs”: a pizza and a sandwich!
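
Here is how the whole kitchen might look as a tiny network in Python: an input layer of ingredients, one hidden layer for the dough and sausage, and an output layer for the pizza and sandwich. The layer sizes, the random starting weights, and the use of sigmoid for both “kneading” and “baking” are illustrative assumptions.

```python
# A minimal sketch of the whole kitchen as a tiny network:
# ingredients -> (mix/knead) -> dough & sausage -> (bake) -> pizza & sandwich.
# Layer sizes, starting weights, and the sigmoid activation are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

ingredients = np.array([1.0, 0.5, 0.7, 0.6])  # flour, meat, water, salt

W_mix = rng.random((4, 2))    # amounts of each ingredient in dough / sausage
W_bake = rng.random((2, 2))   # pieces of dough / sausage in pizza / sandwich

hidden = sigmoid(ingredients @ W_mix)   # hidden layer: dough and sausage
outputs = sigmoid(hidden @ W_bake)      # output layer: pizza and sandwich
print(outputs)                          # how "pizza-like" and "sandwich-like" we got
```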

How does one train a neural network?

As you might have imagined, our first pizza doesn’t look like a real pizza (for example, there is too much meat in the dough) and it tastes terrible! We need to reduce the salt. We calculate the new amount of salt by looking at the last recipe and adjusting it. The next time we make pizza and a sandwich, we will use the reduced amount of salt, in a sort of iterative process. Updating the weights using the difference between the current and expected outcome is called “backpropagation” in training a neural network. Not only would we update the salt, we would also adjust the amounts of all the other ingredients. After going back and forth many times, our pizza and sandwich look better and better, and the taste gets closer to our expectations. We finally get our best recipes for making pizza and a sandwich. This kind of thinking is how a neural network is trained, even in healthcare. The ingredients might be pixels in an image, or data points in a table.
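
To make the back-and-forth concrete, here is a toy training loop in Python, assuming a tiny two-layer network with sigmoid activations and a squared-error loss. The two “recipes” and their pizza/sandwich labels are made up for illustration; real networks apply the same idea at a much larger scale.

```python
# A toy training loop: forward pass, compare with the expected dish,
# then backpropagate the error to adjust every "amount" in the recipe.
# The recipes, labels, learning rate, and network size are all illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(42)
X = np.array([[1.0, 0.5, 0.7, 0.6],    # a recipe that should become a pizza
              [0.5, 0.8, 0.6, 0.7]])   # a recipe that should become a sandwich
y = np.array([[1.0, 0.0],              # targets as [pizza, sandwich]
              [0.0, 1.0]])

W1 = rng.random((4, 2))                # ingredients -> dough / sausage
W2 = rng.random((2, 2))                # dough / sausage -> pizza / sandwich
lr = 0.5                               # how boldly we adjust the recipe each time

for step in range(5000):
    # Forward pass: mix, knead, bake.
    hidden = sigmoid(X @ W1)
    out = sigmoid(hidden @ W2)

    # Backpropagation: push the difference between what we got and what we
    # wanted back through the layers, and nudge each weight accordingly.
    out_err = (out - y) * out * (1 - out)
    hid_err = (out_err @ W2.T) * hidden * (1 - hidden)
    W2 -= lr * hidden.T @ out_err
    W1 -= lr * X.T @ hid_err

print(np.round(sigmoid(sigmoid(X @ W1) @ W2), 2))  # should end up near the targets
```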

How does one make new predictions?

Now the pizza/sandwich neural network has been trained. In the future, if someone gives us a recipe but doesn’t say what type of food it makes, we can use our pizza/sandwich neural network to guess. We would run the ingredients through the network, and it would tell us whether the output is more similar to a pizza or a sandwich.
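
Continuing the toy example from the training sketch above, making a prediction is just one more forward pass: run the mystery recipe through the learned weights and see which output is larger. The classify helper and the mystery recipe are invented for illustration, and W1 and W2 are the weights learned in the previous sketch.

```python
# Classify a new, unlabeled recipe with the weights (W1, W2) learned in the
# training sketch above. The mystery recipe here is made up for illustration.
def classify(recipe, W1, W2):
    hidden = sigmoid(recipe @ W1)   # mix and knead
    out = sigmoid(hidden @ W2)      # bake
    return "pizza" if out[0] > out[1] else "sandwich"

mystery_recipe = np.array([0.9, 0.4, 0.7, 0.5])
print(classify(mystery_recipe, W1, W2))
```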

In much the same way, a neural network might be able to tell whether an array of pixels is more similar to a malignant case or a healthy case. Though the analogy is a little silly, the thought process is pretty realistic. Just don’t try to use a pizza/sandwich classifier in the hospital!

Thanks for reading, and if you want to get in touch, feel free to reach out.