Neural Networks from Scratch — Complete Beginner's Guide
In this tutorial, you'll build a neural network from scratch. We cover the key concepts, practical examples, and best practices you need to understand and apply the technique.
A neural network is a computational system inspired by biological neurons. It learns patterns by adjusting its weights through forward propagation and backpropagation, and it forms the foundation of modern deep learning.
What You'll Learn
How to build a fully connected neural network from scratch using only NumPy: the forward pass, activation functions, loss calculation, backpropagation, and gradient descent.
Why It Matters
Frameworks like TensorFlow and PyTorch abstract away the math, but understanding what happens inside the black box is essential for debugging, architecture design, and performance optimization in real projects.
Real-World Use
Neural networks in production, from Durga Antivirus Pro's malware classifier to Doda Browser's content recommendation system, are trained with the same forward/backward pattern you will build here.
Neural Network Architecture
flowchart LR
subgraph Input Layer
I1((x1))
I2((x2))
I3((x3))
end
subgraph Hidden Layer
H1((h1))
H2((h2))
H3((h3))
H4((h4))
end
subgraph Output Layer
O1((y1))
O2((y2))
end
I1 --> H1
I1 --> H2
I1 --> H3
I1 --> H4
I2 --> H1
I2 --> H2
I2 --> H3
I2 --> H4
I3 --> H1
I3 --> H2
I3 --> H3
I3 --> H4
H1 --> O1
H1 --> O2
H2 --> O1
H2 --> O2
H3 --> O1
H3 --> O2
H4 --> O1
H4 --> O2
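Every neuron in one layer connects to every neuron in the next layer. The weights on these connections are what the network learns. The diagram shows three inputs, four hidden neurons, and two outputs; the code examples that follow use a single output neuron.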
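To make the "every neuron connects to every neuron" rule concrete, here is a minimal sketch that counts the connections and learnable parameters for the 3-4-2 layout in the diagram. The helper names count_connections and count_parameters are hypothetical, and the sketch assumes one bias per non-input neuron, as in the code later in this guide.

```python
# Layer sizes taken from the diagram: 3 inputs, 4 hidden neurons, 2 outputs
layer_sizes = [3, 4, 2]

def count_connections(sizes):
    # A fully connected layer has n_in * n_out weights (one per arrow)
    return sum(n_in * n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

def count_parameters(sizes):
    # Assumption: each non-input neuron also has one bias term
    biases = sum(sizes[1:])
    return count_connections(sizes) + biases

print(count_connections(layer_sizes))  # → 20
print(count_parameters(layer_sizes))   # → 26
```

The 20 connections match the 20 arrows in the diagram (12 from input to hidden, 8 from hidden to output); the 6 biases bring the total to 26 learnable values.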
Forward Pass
import numpy as np

def sigmoid(x):
    # Squash any real value into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Derivative of sigmoid, evaluated at the pre-activation x
    return sigmoid(x) * (1 - sigmoid(x))

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        # Small random weights break symmetry; biases start at zero
        self.W1 = np.random.randn(input_size, hidden_size) * 0.1
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size) * 0.1
        self.b2 = np.zeros((1, output_size))

    def forward(self, X):
        # Hidden layer: linear step followed by sigmoid
        self.z1 = np.dot(X, self.W1) + self.b1
        self.a1 = sigmoid(self.z1)
        # Output layer: same pattern, fed by the hidden activations
        self.z2 = np.dot(self.a1, self.W2) + self.b2
        self.a2 = sigmoid(self.z2)
        return self.a2

nn = NeuralNetwork(input_size=3, hidden_size=4, output_size=1)
X_sample = np.array([[0.5, 0.8, 0.2]])
output = nn.forward(X_sample)
print(f"Input: {X_sample}")
print(f"Output: {output[0][0]:.6f}")
Example output (the exact value varies between runs because the random weights are not seeded):
Input: [[0.5 0.8 0.2]]
Output: 0.524873
The output is a value between 0 and 1 because of the sigmoid activation. The initial weights are scaled by 0.1 and the biases are zero, so the output-layer input z2 is close to 0 and the output starts near sigmoid(0) = 0.5.
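The same forward pass works on a whole batch at once. The following sketch traces the array shapes for five samples; it is standalone rather than part of the class, and the seeded generator and the variable names (rng, X_batch, and so on) are assumptions added for repeatability.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)             # seed is an assumption, for repeatability
X_batch = rng.random((5, 3))               # 5 samples, 3 features each

W1 = rng.standard_normal((3, 4)) * 0.1     # input -> hidden weights
b1 = np.zeros((1, 4))                      # broadcast across the 5 rows
W2 = rng.standard_normal((4, 1)) * 0.1     # hidden -> output weights
b2 = np.zeros((1, 1))

a1 = sigmoid(X_batch @ W1 + b1)            # (5, 3) @ (3, 4) -> (5, 4)
a2 = sigmoid(a1 @ W2 + b2)                 # (5, 4) @ (4, 1) -> (5, 1)

print(a1.shape, a2.shape)
print(a2.ravel())                          # every value sits close to 0.5
```

Each row of the input produces one row of output, which is why the training code later can pass all samples through the network in a single call.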
Backpropagation
def backward(self, X, y, output):
    m = X.shape[0]  # number of training samples
    # output - y is the gradient of binary cross-entropy with respect to z2
    # for a sigmoid output layer; the MSE in train() is only a progress metric
    dZ2 = output - y.reshape(-1, 1)
    dW2 = (1 / m) * np.dot(self.a1.T, dZ2)
    db2 = (1 / m) * np.sum(dZ2, axis=0, keepdims=True)
    # Chain rule: push the error back through W2 and the hidden sigmoid
    dZ1 = np.dot(dZ2, self.W2.T) * sigmoid_derivative(self.z1)
    dW1 = (1 / m) * np.dot(X.T, dZ1)
    db1 = (1 / m) * np.sum(dZ1, axis=0, keepdims=True)
    return dW1, db1, dW2, db2

def train(self, X, y, epochs=1000, lr=0.1):
    losses = []
    for i in range(epochs):
        output = self.forward(X)
        loss = np.mean((output - y.reshape(-1, 1)) ** 2)
        losses.append(loss)
        dW1, db1, dW2, db2 = self.backward(X, y, output)
        # Gradient descent: step each parameter against its gradient
        self.W1 -= lr * dW1
        self.b1 -= lr * db1
        self.W2 -= lr * dW2
        self.b2 -= lr * db2
        if i % 200 == 0:
            print(f"Epoch {i}, Loss: {loss:.6f}")
    return losses

# Attach both functions as methods of the class defined earlier
NeuralNetwork.backward = backward
NeuralNetwork.train = train

# XOR truth table: inputs and expected outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
losses = nn.train(X, y, epochs=1000, lr=0.5)

print("\nFinal predictions:")
predictions = nn.forward(X)
for i, (inp, pred) in enumerate(zip(X, predictions)):
    print(f"XOR({inp[0]}, {inp[1]}) = {pred[0]:.4f} (expected {y[i][0]})")
Example output (exact values vary from run to run because the weights are not seeded, and some runs may need more epochs to converge):
Epoch 0, Loss: 0.252713
Epoch 200, Loss: 0.250041
Epoch 400, Loss: 0.237993
Epoch 600, Loss: 0.001807
Epoch 800, Loss: 0.001061
Final predictions:
XOR(0, 0) = 0.0171 (expected 0)
XOR(0, 1) = 0.9804 (expected 1)
XOR(1, 0) = 0.9801 (expected 1)
XOR(1, 1) = 0.0213 (expected 0)
The network learns the XOR function, which a single-layer perceptron cannot solve because the two classes are not linearly separable. This demonstrates why hidden layers are necessary.
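To see the single-layer limitation directly, the sketch below trains a model with no hidden layer (inputs wired straight to one sigmoid output) on the same XOR data. It is a standalone illustration, not part of the NeuralNetwork class; the seed, the 5000-step budget, and the variable names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)           # seed is an assumption, for repeatability
W = rng.standard_normal((2, 1)) * 0.1    # one weight per input, no hidden layer
b = np.zeros((1, 1))

lr = 0.5
for _ in range(5000):
    out = sigmoid(X @ W + b)
    dZ = out - y                         # same output-layer gradient as in backward()
    W -= lr * (X.T @ dZ) / len(X)
    b -= lr * dZ.mean(axis=0, keepdims=True)

single_layer_preds = sigmoid(X @ W + b)
print(single_layer_preds.round(3).ravel())
```

All four predictions settle at roughly 0.5: with only a straight-line decision boundary available, the best the model can do is stay undecided on every input.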
Testing on a Synthetic Dataset
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
X_moons, y_moons = make_moons(n_samples=500, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X_moons, y_moons, test_size=0.2)
nn2 = NeuralNetwork(input_size=2, hidden_size=10, output_size=1)
nn2.train(X_train, y_train, epochs=2000, lr=0.3)
preds = nn2.forward(X_test)
preds_binary = (preds > 0.5).astype(int).flatten()
accuracy = np.mean(preds_binary == y_test)
print(f"\nTest accuracy on moon dataset: {accuracy:.3f}")
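Example output (exact values vary between runs because neither the weights nor the train/test split is seeded):
Epoch 0, Loss: 0.251988
...
Test accuracy on moon dataset: 0.990
In this run, a network with one hidden layer of 10 neurons reached about 99% test accuracy on a dataset that is not linearly separable. The exact figure depends on the random split and the weight initialization.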
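A single accuracy number hides which kind of mistakes the classifier makes. The sketch below breaks thresholded predictions into true/false positives and negatives using plain NumPy. The helper name binary_confusion and the five sample probabilities are hypothetical, chosen only to illustrate the calculation; they are not results from the moons run.

```python
import numpy as np

def binary_confusion(probs, y_true, threshold=0.5):
    # Turn sigmoid outputs into 0/1 labels, as in the accuracy check above
    y_pred = (np.asarray(probs).ravel() > threshold).astype(int)
    y_true = np.asarray(y_true).ravel()
    return {
        "tp": int(np.sum((y_pred == 1) & (y_true == 1))),
        "tn": int(np.sum((y_pred == 0) & (y_true == 0))),
        "fp": int(np.sum((y_pred == 1) & (y_true == 0))),
        "fn": int(np.sum((y_pred == 0) & (y_true == 1))),
    }

# Hypothetical network outputs and labels, for illustration only
probs = np.array([[0.9], [0.2], [0.6], [0.4], [0.1]])
labels = np.array([1, 0, 0, 1, 0])

counts = binary_confusion(probs, labels)
toy_accuracy = (counts["tp"] + counts["tn"]) / len(labels)
print(counts)        # → {'tp': 1, 'tn': 2, 'fp': 1, 'fn': 1}
print(toy_accuracy)  # → 0.6
```

With the tutorial's variables, the call would be binary_confusion(preds, y_test), assuming preds and y_test are defined as in the code above.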
Practice Questions
- What is the role of the activation function in a neural network?
- Why does backpropagation use the chain rule from calculus?
- What would happen if you removed the hidden layer from the XOR example?
Related Topics
- Python — language used throughout
- NumPy Guide — powering all matrix operations
- TensorFlow Beginners Guide — production framework
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.