Transfer Learning From Pre-Trained Model for Image Recognition
The purpose of this article is to solve image recognition problems in a fast and easy way using transfer learning. For demonstration purposes, we will write deep learning code using Keras and one of its pre-trained models.
Before jumping straight into the practical part, let us first understand what transfer learning really is.
Nowadays, deep learning plays a major role in artificial intelligence applications, most of them in the fields of computer vision, natural language processing, and speech recognition.
But deep learning has done a tremendous job mainly in the field of computer vision, and specifically in image classification and recognition. Image classification is the task of assigning an image to one of a set of possible categories, while image recognition refers to the ability of software to identify objects, places, people, writing, and actions in images. One example of image classification is distinguishing cars from bikes.
Since so much work has already been done on image recognition and classification, we can use a machine learning technique that solves image classification problems with great results in a fast and easy way, and that technique is transfer learning.
What is Transfer Learning?
Transfer learning generally refers to a process where a model trained on one problem is used in some way on a second related problem.
In deep learning, transfer learning is a technique whereby a neural network model is first trained on a problem similar to the problem that is being solved. One or more layers from the trained model are then used in a new model trained on the problem of interest.
Transfer learning is a popular method in computer vision because it allows us to build accurate models in a time-saving way. With transfer learning, instead of starting the learning process from scratch, you start from patterns that were learned while solving a different problem, leveraging that previous learning rather than starting over.
Transfer learning has the benefit of decreasing the training time for a neural network model and can result in lower generalization error.
The weights in re-used layers may be used as the starting point for the training process and adapted in response to the new problem. This usage treats transfer learning as a type of weight initialization scheme, which is especially useful when the first related problem has much more labeled data than the problem of interest, and the similarity in structure between the two problems lets features learned on one carry over to the other.
For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks.
Why Not Train a Model From Scratch?
Convolutional Neural Networks (CNNs) can learn extremely complex mapping functions when trained on enough data, and they are best known for learning patterns in datasets with a large number of features.
At a basic level, the weights of a CNN consist of filters. Think of a filter as an (n x n) matrix of numbers. This filter is convolved (slid and multiplied) across the provided image. Suppose the input image is of size (10, 10) and the filter is of size (3, 3): first the filter is multiplied element-wise with the 3 x 3 patch of pixels at the top-left of the input image, producing another (3, 3) matrix. The 9 values of this matrix are summed up, and that sum becomes a single pixel value at the top-left of the next layer of the CNN.
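To make the sliding-window arithmetic concrete, here is a minimal NumPy sketch of the convolution just described (a plain Python loop for clarity, not how real frameworks implement it):

```python
# A minimal sketch of the convolution described above, using plain NumPy.
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position, multiply
    element-wise and sum to produce one output pixel (no padding, stride 1)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.random.rand(10, 10)   # the (10, 10) input from the example
kernel = np.random.rand(3, 3)    # the (3, 3) filter
print(convolve2d(image, kernel).shape)  # (8, 8): each entry is one summed patch
```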
Basically, training a CNN means finding the right values for each of the filters so that an input image, when passed through the multiple layers, activates certain neurons of the last layer and the correct class is predicted.
Though training a CNN from scratch is possible for small projects, most applications require training very large CNNs, and that, as you guessed, takes huge amounts of processed data and computational power, neither of which comes easily or cheaply. The alternative that tackles this problem is transfer learning.
In transfer learning, we take the pre-trained weights of an already trained model (one that has been trained on millions of images belonging to thousands of classes, on several high-power GPUs for several days) and use these already learned features to predict new classes.
The advantages of transfer learning are:
1. No need for a large training dataset.
2. Less computational power is required compared to training a CNN from scratch, since we reuse the pre-trained weights and only have to learn the weights of the last few layers.
How to Use Pre-Trained Models for Transfer Learning:
To put a pre-trained model to use, we first remove its classifier (the fully connected layers), then add our own classifier that fits our purpose, and finally fine-tune the model according to one of the following strategies, depending on our goal and the kind of dataset we have (a code sketch of these strategies follows the list):
- Train the entire model: in this case, you use the architecture of the pre-trained model and train it on your dataset. You are learning the model from scratch, so you will need a large dataset as well as a great amount of computational power.
- Train some layers and leave the others frozen: remember, lower layers capture general (problem-independent) features, while higher layers capture specific (problem-dependent) features. Usually, if you have a small dataset and a large number of parameters, you leave more layers frozen to avoid overfitting. By contrast, if the dataset is large and the number of parameters is small, you can improve the model by training more layers on the new task, since overfitting is not an issue.
- Freeze the convolutional base: the main idea is to keep the convolutional base in its original form and use its outputs to feed the classifier (the fully connected layers). You are using the pre-trained model as a fixed feature extractor, which can be useful if you are short on computational power, your dataset is small, and/or the pre-trained model solves a problem very similar to the one you want to solve.
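Here is a minimal Keras sketch of how each strategy maps to the `trainable` flag (assuming tensorflow.keras and VGG16; the layer counts and head sizes are illustrative, not prescriptive):

```python
# A minimal sketch of the three strategies in Keras (assuming tensorflow.keras
# and VGG16; layer counts and sizes here are illustrative, not prescriptive).
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load the convolutional base without the original fully connected classifier.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Pick exactly one of the three strategies:

# 1. Train the entire model: every layer stays trainable.
base.trainable = True

# 2. Train some layers, freeze the others: e.g. freeze all but the last few layers.
for layer in base.layers[:-4]:
    layer.trainable = False

# 3. Freeze the convolutional base: a fixed feature extractor.
base.trainable = False

# In every case, a new classifier head replaces the original fully connected layers.
num_classes = 10  # adjust to your own number of target classes
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
```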
The Process of Transfer Learning:
This process can be understood in 3 major steps:
1. Select a Pre-Trained Model: there are perhaps a dozen or more top-performing models for image recognition that can be downloaded and used as the basis for image recognition and related computer vision tasks.
Perhaps three of the more popular models are as follows:
- VGG (e.g. VGG16 or VGG19).
- GoogLeNet (e.g. InceptionV3).
- Residual Network (e.g. ResNet50).
These models are widely used for transfer learning both because of their performance and because they introduced specific architectural innovations, namely consistent and repeating structures (VGG), inception modules (GoogLeNet), and residual modules (ResNet).
Keras provides access to a number of top-performing pre-trained models that were developed for image recognition tasks.
They are available via the Applications API, and include functions to load a model with or without the pre-trained weights, and prepare data in a way that a given model may expect (e.g. scaling of size and pixel values).
The first time a pre-trained model is loaded, Keras will download the required model weights, which may take some time given the speed of your internet connection. Weights are stored in the .keras/models/ directory under your home directory and will be loaded from this location the next time that they are used.
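For instance, here is a minimal sketch of loading a pre-trained model through the Applications API and classifying a single image ("some_image.jpg" is a hypothetical path, replace it with your own file):

```python
# A minimal sketch of loading a pre-trained model through the Keras Applications
# API; "some_image.jpg" is a hypothetical path, replace it with your own file.
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image

model = VGG16(weights="imagenet")  # full model, including the ImageNet classifier

# Prepare an image the way VGG16 expects: 224x224 RGB with VGG-style pixel scaling.
img = image.load_img("some_image.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 ImageNet classes with scores
```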
2. Identify your problem on the Size-Similarity matrix: this matrix classifies your problem by the size of your dataset and its similarity to the dataset the pre-trained model was originally trained on, and your quadrant decides which of the strategies above to use. Roughly: a large dataset that differs from the original calls for training the entire model; a large, similar dataset lets you train some layers and freeze the rest; a small, different dataset also means training some layers while keeping others frozen; and a small, similar dataset is the case for freezing the convolutional base.
3. Fine-Tune your Model: fine-tuning is just about making small adjustments to further improve performance. For example, during transfer learning you can unfreeze the pre-trained model and let it adapt further to the task at hand, as the sketch below shows.
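```python
# A hedged sketch of fine-tuning, reusing "base" and "model" from the strategies
# sketch above: unfreeze the base and re-train with a very low learning rate so
# the pre-trained weights are only nudged, not destroyed.
from tensorflow.keras.optimizers import Adam

base.trainable = True  # unfreeze the pre-trained convolutional base
model.compile(optimizer=Adam(learning_rate=1e-5),  # much lower than the usual rate
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# ...then call model.fit(...) again for a few epochs on the same data.
```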
Implementing Transfer Learning for Building a Face-Recognition Model:
Pre-requisites:
- Keras
- TensorFlow
- Pillow
- NumPy
- pandas
- Jupyter Notebook
Pre-trained Model: the model we will use here is VGG16, which is already available in the Applications API of the Keras framework.
Dataset: for this practical, we will use a dataset I created myself, but you can build a dataset of your own images, or images of your family or friends, to train the model.
The dataset we will be using consists of 10 separate classes of images, one for each of ten different Indian celebrities.
Below is sample data from the dataset, showing images of Aamir Khan.
Before we start, download the code and dataset for a better practical understanding. Click here to download.
Please launch your Jupyter Notebook (I am also using Jupyter Notebook) in the environment where all the dependencies are installed, or else you might face some issues.
Building the model is a 3-step process (steps 1 and 2 are sketched in code right after this list, and step 3 after the evaluation notes below):
1. Import the pre-trained model and add the Dense layers.
2. Load the training data into Image Data Generators.
3. Load the trained model and evaluate it by predicting labels for the validation data.
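Below is a hedged sketch of steps 1 and 2, assuming the downloaded dataset is laid out as dataset/train/&lt;class&gt;/... and dataset/val/&lt;class&gt;/... (hypothetical paths; adjust them to wherever you extracted the download):

```python
# Steps 1 and 2 in code (a sketch; "dataset/train" and "dataset/val" are
# hypothetical paths, adjust them to wherever you extracted the download).
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Step 1: import the pre-trained model (without its classifier) and add Dense layers.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the convolutional base

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),  # 10 celebrity classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Step 2: load the training and validation data through Image Data Generators.
train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "dataset/train", target_size=(224, 224), batch_size=32, class_mode="categorical")
val_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "dataset/val", target_size=(224, 224), batch_size=32, class_mode="categorical")

model.fit(train_gen, validation_data=val_gen, epochs=5)
model.save("face_recognition_vgg16.h5")  # hypothetical filename, reloaded in step 3
```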
When you run the last block, which is marked [96] in the image, an image will come up showing the picture whose label is to be predicted, and at the bottom of the code block you will see the original label of the image. The image is chosen randomly by the getRandomImage function, so you can re-run block [96] again and again to check predictions for different images.
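Here is a sketch of step 3; the getRandomImage helper below is a hypothetical reconstruction of the one used in the notebook, not the original:

```python
# Step 3 in code: load the trained model and predict the label of a randomly
# chosen validation image. "getRandomImage" here is a hypothetical
# reconstruction of the helper used in the notebook, not the original.
import os
import random
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model("face_recognition_vgg16.h5")
# flow_from_directory assigns class indices in sorted folder-name order,
# so a sorted directory listing recovers the index-to-label mapping.
class_names = sorted(os.listdir("dataset/val"))

def getRandomImage(path):
    """Pick a random class folder, then a random image inside it."""
    label = random.choice(class_names)
    filename = random.choice(os.listdir(os.path.join(path, label)))
    return os.path.join(path, label, filename), label

img_path, true_label = getRandomImage("dataset/val")
img = image.load_img(img_path, target_size=(224, 224))
x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)  # same scaling as training

pred_label = class_names[np.argmax(model.predict(x))]
print(f"original label: {true_label}, predicted label: {pred_label}")
```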
This time the label predicted for the image is the same as the original label of the image, which means our face-recognition model works.
But sometimes the model may make a wrong prediction.
In the evaluation above, the label predicted for the image does not match the original label, which means our model is not 100% accurate; that is normal in machine learning problems.
To achieve better accuracy, the only way is to fine-tune the model again and again until its accuracy improves without over-fitting.
Conclusion:
With this, we have completed the practical demonstration of transfer learning using the VGG16 pre-trained model in Keras. I hope you feel motivated to start developing your own deep learning projects in computer vision and to try some more transfer learning models. This is a great field of study, and exciting new findings come out every day.
I’d be glad to help you, so let me know if you have any questions or improvement suggestions!
Below is my LinkedIn profile link; you can also connect with me there.
LinkedIn: https://www.linkedin.com/in/sagar-sonwane-b2a960178/
Thank You.