title: Deep Learning Fundamentals - Building Neural Networks with TensorFlow
date: 2023-10-23 13:00:00
description: Building your first neural network for image classification
Unlike Scikit-learn, TensorFlow's MNIST dataset comes in a slightly different format. We'll keep the images in their original 2D shape (28x28 pixels) since neural networks can work directly with this structure - another advantage over traditional methods.
```python
# Model parameters
num_classes = 10  # One class for each digit (0-9)
input_shape = (28, 28, 1)  # Height, width, and channels (1 for grayscale)
```
- **60,000 training samples**: Much larger than scikit-learn's version for better learning
- **28x28 pixels**: Higher resolution images than Part 1's 8x8 grid
- **1 channel**: Grayscale images (RGB would be 3 channels)
- **10,000 test samples**: Large test set for robust evaluation
The final dimension (1) represents the color channel. Since MNIST contains grayscale images, we only need one channel, unlike RGB images which would have 3 channels.
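As a quick sketch of what the loading and scaling step looks like with Keras (these are the standard `keras.datasets` calls; the exact code in this post may differ slightly):

```python
import numpy as np
from tensorflow import keras

# Load the dataset: 60,000 training and 10,000 test images of 28x28 pixels
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Add the channel dimension: (28, 28) -> (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
```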
Now that the data is loaded and scaled to an appropriate range, we can create the neural network.
For image classification, we'll use a Convolutional Neural Network (CNN). CNNs are specifically designed to work with image data through specialized layers:
- **Convolutional layers**: Extract spatial features like edges, textures, and shapes
- **Pooling layers**: Reduce spatial dimensions while preserving important features
- **Dense layers**: Combine extracted features for final classification
- **Dropout layers**: Prevent overfitting by randomly deactivating neurons during training
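To make the shape bookkeeping concrete, here is a small illustration (hypothetical, not part of the original post): a 3x3 convolution shrinks a 28x28 input to 26x26, and 2x2 max-pooling then halves each spatial dimension to 13x13:

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(28, 28, 1))
conv = layers.Conv2D(32, kernel_size=(3, 3))(inputs)  # -> (None, 26, 26, 32)
pooled = layers.MaxPooling2D(pool_size=(2, 2))(conv)  # -> (None, 13, 13, 32)
print(conv.shape, pooled.shape)
```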
There are multiple ways to define a model in TensorFlow. Let's explore two common approaches:
### 1. Sequential API
The Sequential API is the simplest way to build neural networks - layers are stacked linearly, one after another:
```python
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),

        # Feature extraction layers
        layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
        layers.MaxPooling2D(pool_size=(2, 2)),

        # Classification head
        layers.Flatten(),
        layers.Dropout(0.5),  # Prevents overfitting by randomly dropping 50% of connections
        layers.Dense(32, activation='relu'),  # Hidden layer combines features

        # Output layer for classification
        layers.Dense(num_classes, activation='softmax'),
    ]
)
```
### 2. Layer-by-Layer Sequential API
For more explicit control, we can separate each layer and activation:
```python
# More precise and sequential approach
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),

        layers.Conv2D(32, kernel_size=(3, 3)),
        layers.Activation('relu'),
        layers.MaxPooling2D(pool_size=(2, 2)),

        layers.Conv2D(64, kernel_size=(3, 3)),
        layers.Activation('relu'),
        layers.MaxPooling2D(pool_size=(2, 2)),

        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(32),
        layers.Activation('relu'),

        layers.Dense(num_classes),
        layers.Activation('softmax'),
    ]
)
```

The two models are functionally identical, but the layer-by-layer approach offers several advantages:

- Makes it easier to insert additional layers like BatchNormalization
- Provides more explicit activation functions
- Makes the data flow more transparent
- Allows finer control over layer parameters

Besides this Sequential API, there is also a functional API. We'll explore this more flexible approach in our advanced TensorFlow tutorial; it allows for:

- Multiple inputs and outputs
- Layer sharing
- Non-sequential layer connections
- Complex architectures like residual networks
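As a quick teaser (a minimal sketch only; the advanced tutorial covers this properly), in the functional API each layer is called on a tensor, so connections need not be strictly linear:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Minimal functional-API model: layers are called on tensors
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(inputs)
x = layers.Dense(32, activation='relu')(x)
outputs = layers.Dense(10, activation='softmax')(x)
model = keras.Model(inputs=inputs, outputs=outputs)
```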
Once the model is created, you can use the `summary()` method to get an overview of the network's architecture
and the number of trainable and non-trainable parameters.
Choosing values like the batch size and the number of epochs is one of the black arts of any deep learning practitioner. For this example, let's just go with some standard parameters.
```python
# Training configuration
batch_size = 128  # Number of samples processed before model update
epochs = 10  # Number of complete passes through the dataset

# Compile model with appropriate loss function and optimizer
model.compile(
    loss='sparse_categorical_crossentropy',  # Appropriate for integer labels
    optimizer='adam',
    metrics=['accuracy'],
)
```

After compiling, we train the model with `fit()`; the resulting learning curves are shown in the figure below.
Figure 1: Training metrics over time showing model loss (left) and Mean Absolute Error (right) for both training and validation sets. The logarithmic scale helps visualize improvement across different magnitudes.
</div>
Once the model is trained, we can also compute its score on the test set. For this we can use the `evaluate()` method.
- Watch for overfitting (validation loss increasing)
- Use appropriate batch sizes

```python
# Add validation monitoring during training
history = model.fit(
    x_train, y_train,
    validation_split=0.1,
    batch_size=128,
    epochs=10,
)
```

4. **Memory Management**
- Clear unnecessary variables
- Use appropriate data types
- Watch batch sizes on limited hardware

```python
# Free memory after training
import gc
gc.collect()
keras.backend.clear_session()
```
## Summary and Next Steps
In this tutorial, we've introduced neural networks using TensorFlow:
- Monitoring learning progress
- Visualizing learned features

Our neural network achieved accuracy comparable to our Scikit-learn models (~99%), but this time on higher-resolution images, with the potential for even better performance through further optimization.
**Key takeaways:**
1. Neural networks can work directly with structured data like images
In Part 3, we'll explore more advanced machine learning concepts using Scikit-learn, focusing on regression problems and complex preprocessing pipelines.

[← Previous: Getting Started]({{ site.baseurl }}/blog/2023/01_scikit_simple) or [Continue to Part 3 →]({{ site.baseurl }}/blog/2023/03_scikit_advanced)