Have you ever wondered how AI makes decisions? It’s not magic; it’s math, and it starts with something you do every day: making choices! Let’s dive into the world of neural networks using your morning coffee routine as our guide.
Coffee Time: A Neuron’s Decision-Making
Imagine you’re deciding whether to order coffee. You take into account several factors:
- How much sleep did I get?
- Is it cold outside?
- Do I have a busy day ahead?
Each factor has a different importance (weight) in your decision. Your personal preference (bias) also plays a role.
A neuron in a neural network works similarly. Here’s how we represent this mathematically:
z = (input1×weight1) + (input2×weight2) + (input3×weight3) + bias
This is a linear function: the output z is simply a weighted sum of the inputs plus the bias. That matters because a linear function can only draw a flat decision boundary (a straight line with two inputs, a plane or hyperplane with more). No matter how many inputs we add, the neuron cannot learn curved or more complex patterns without an additional operation.
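As a rough illustration, the weighted sum can be sketched in a few lines of Python. The function name weighted_sum and the weights and bias below are hypothetical, chosen only to make the arithmetic easy to follow. The sketch shows what "linear" means in practice: raising one input by a single unit always shifts z by exactly that input's weight, wherever you start.

```python
def weighted_sum(inputs, weights, bias):
    """Return z = sum(input_i * weight_i) + bias for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + bias


# Hypothetical weights and bias, for illustration only.
weights = [0.5, 1.0, 2.0]
bias = -1.0

# Raise the first input by one unit from a low and from a high starting point.
step_from_low = weighted_sum([1, 0, 1], weights, bias) - weighted_sum([0, 0, 1], weights, bias)
step_from_high = weighted_sum([9, 0, 1], weights, bias) - weighted_sum([8, 0, 1], weights, bias)

# Both steps equal the first weight, which is the signature of a linear function.
print(step_from_low, step_from_high)
```

Because every step has the same size, the output can never bend or level off, which is why a purely linear neuron is limited to flat decision boundaries.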
Weights and Bias
Think of the weights as how much each input influences the decision. If it’s cold outside and you care a lot about that factor when deciding to order coffee, the “cold weather” input will have a higher weight. The bias acts like your general inclination toward coffee. Maybe you’re a coffee lover who’s likely to order even on warm days, or perhaps you only drink coffee when you’re particularly tired. The bias adjusts the threshold for making a decision.
Let’s break this down with a quick example using three inputs (sleep, weather, and busy day), three weights, and a bias.
Assume the following:
- Sleep (hours): 6
- Weather (cold = 1, warm = 0): 1
- Busy Day (yes = 1, no = 0): 1
- Weight for sleep: 0.3
- Weight for weather: 0.6
- Weight for busy day: 0.8
- Bias: -1
So, the equation for z is:
z = (6×0.3) + (1×0.6) + (1×0.8) + (−1)
z = 1.8 + 0.6 + 0.8 − 1 = 2.2
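To double-check the arithmetic, here is a small Python sketch that uses the values assumed above. The dictionary keys (sleep_hours, cold_weather, busy_day) are illustrative names, not part of any library.

```python
# Inputs, weights, and bias from the worked example.
inputs = {"sleep_hours": 6, "cold_weather": 1, "busy_day": 1}
weights = {"sleep_hours": 0.3, "cold_weather": 0.6, "busy_day": 0.8}
bias = -1

# Multiply each input by its weight, add the products, then add the bias.
z = sum(inputs[name] * weights[name] for name in inputs) + bias

# Round for display, since floating-point arithmetic is not exact.
print(round(z, 2))
```

Rounded to two decimal places, the result matches the hand calculation of 2.2.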
Here, z is the raw result of the linear function. On its own, it is just a weighted sum: an unbounded score that can be any real number, not yet a yes-or-no decision or a probability.
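Now, let’s pass z through an activation function such as the sigmoid, which squashes any real number into the range between 0 and 1:
σ(z) = 1 / (1 + e^(−z))
Substituting z = 2.2:
σ(2.2) = 1 / (1 + e^(−2.2)) ≈ 0.90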
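The sigmoid step can be sketched in Python as well. This assumes the standard logistic sigmoid and uses only the standard library. The helper name sigmoid is our own, and this simple form is meant for small values like the one here, not for very large negative inputs, where math.exp can overflow.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# z from the coffee example.
z = 2.2
probability = sigmoid(z)

# Show the value to two decimal places.
print(f"{probability:.2f}")
```

The value comes out close to 0.90, which can be read as roughly a 90% inclination to order coffee under these assumed weights and bias.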