The end of lockdown does not remove the need for measures limiting the propagation of Covid-19. European governments have defined lockdown exit strategies balancing public-health imperatives with the psycho-social and economic impacts of confinement.
However, it is still required to wear a mask and maintain some social distance. AI models can help detect people not wearing masks and so help limit the propagation of the virus. Following six simple steps, it is possible to create, test and operate a computer vision classification model able to detect the presence of masks.
In the current post-COVID-19 context, there are still places where everyone needs to wear a mask. Each European country, and many countries outside Europe, has put in place a set of measures to limit virus contamination and the pressure on its health system.
In Luxembourg, for example, the resumption of activities is accompanied by very strict barrier gestures, supplemented by the compulsory wearing of a mask, or any other device covering the nose and mouth, in situations of interpersonal contact where a sanitary distance of 2 metres cannot be guaranteed.
How can the wearing of masks be monitored in these different cases?
Automatic analysis of video surveillance camera streams can help here by raising a warning when a mask is absent. In this article, we will show how easy creating such a model can be.
An end-to-end assisted creation process
The Max-ICS Platform includes a set of tools to assist data scientists in the different steps of creating an Artificial Intelligence model. In the context of this article, the data scientist will create a specific type of model: a computer vision classification model.
This type of model is used to classify images into one or several categories, or classes. The classification of 'cats' and 'dogs' is often used to illustrate this type of model. In the current case, the goal is to develop a classification model that distinguishes people 'wearing a mask' from people 'not wearing a mask'.
The model creation process can be divided into six steps. They cover the most important elements of model preparation, from the collection of the training data to the use of the model in the operational phase.
Step 1 - collection of a dataset
The first step is to collect a set of representative pictures illustrating both people 'wearing a mask' and people 'not wearing a mask'.
Max-ICS already integrates a set of public datasets, but it also proposes different solutions to create a computer vision dataset.
It is possible to import data by scraping Google image results, from Sentinel and other applicable satellite imagery, from local images uploaded by FTP, from Geo-Files uploaded by FTP and, for coders, directly through the exposed API.
In the scope of this article, FTP upload will be used to import 1,376 images, including both people with masks and people without masks.
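As an illustration, this upload step could also be scripted outside the platform. A minimal sketch in Python, assuming the images are stored locally in one folder per class (`with_mask/`, `without_mask/`); the host, credentials and target layout are hypothetical, not Max-ICS specifics:

```python
import os
from ftplib import FTP

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def collect_images(root):
    """Return (class_name, file_path) pairs for every image under root/<class>/."""
    pairs = []
    for cls in sorted(os.listdir(root)):
        cls_dir = os.path.join(root, cls)
        if not os.path.isdir(cls_dir):
            continue
        for name in sorted(os.listdir(cls_dir)):
            if name.lower().endswith(IMAGE_EXTENSIONS):
                pairs.append((cls, os.path.join(cls_dir, name)))
    return pairs

def upload_dataset(root, host, user, password):
    """Upload every collected image to the FTP endpoint, one folder per class."""
    ftp = FTP(host)
    ftp.login(user, password)
    for cls, path in collect_images(root):
        target = "{}/{}".format(cls, os.path.basename(path))
        with open(path, "rb") as handle:
            ftp.storbinary("STOR " + target, handle)
    ftp.quit()
```

Keeping one folder per class lets the class labels be inferred directly from the directory names at import time.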
Step 2 - review of the dataset
An important step is to validate the quality of the corpus. Max-ICS includes different tools to analyse corpus quality, but this will be detailed in another article. At this stage, it will be assumed that the imported images are correct. One important element to mention is that the corpus is balanced: there are around the same number of images of people with masks as images of people without masks. The following Max-ICS interface gives an overview of the class distribution.
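The balance check itself can be expressed in a couple of lines of Python. A sketch, assuming the class labels are available as a plain list (the 690/686 split used below is illustrative, not the article's actual split):

```python
from collections import Counter

def class_distribution(labels):
    """Return the share of each class in the corpus, as fractions summing to 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: count / total for cls, count in counts.items()}

# A corpus is roughly balanced when every class share is close to 1 / num_classes.
```

For the two-class mask corpus, shares close to 0.5 for each class indicate a balanced dataset.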
Step 3 - defining different models to train
The creation of classification models relies on training a raw or pre-trained neural architecture. Max-ICS integrates a 'model-zoo': a repository of neural architectures with different pre-trained weights. These pre-trained models have already been trained on generic datasets like ImageNet. Within the Max-ICS model-zoo, more than 30 different architectures, pre-trained or not, are accessible.
To train our classification model, different architectures can be used, such as ResNet, Inception, DenseNet, MobileNet or SqueezeNet.
Step 4 - Following the models' training
Max-ICS integrates a novel concept called the Deep Learning Factory. The concept takes its inspiration from the car industry, where vehicles are built on a production chain. A similar set of production chains is implemented in Max-ICS: these chains train models following the training parameters specified by the user. The following screenshot illustrates a set of models trained with different sets of parameters or architectures from the 'model-zoo'.
These different models have been configured in a naive way, with default parameters and different architectures. At the top right of the interface, the user can compare the trained models using both the accuracy (the proportion of correct detections) and the loss (a measure of the prediction error).
Step 5 - comparing the models performances
To operate correctly, a good model should have a high accuracy and a low loss.
This interface also allows the training of the models to be followed in real time: the evolution of accuracy, loss and learning rate can be monitored live, as illustrated in the copy of the interface above.
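For reference, the two quantities the interface plots can be computed directly from the model outputs. A minimal sketch in numpy, assuming the model returns one row of class probabilities per image:

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of images whose highest-probability class matches the true label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

def cross_entropy_loss(probs, labels):
    """Average negative log-probability assigned to the true class."""
    eps = 1e-12  # avoid log(0) for fully confident wrong predictions
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))
```

Accuracy only counts whether the top class is right, while the loss also penalises low confidence, which is why both curves are worth watching during training.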
Max-ICS also computes different quality metrics such as the F-score (which will be further detailed in another article) and the confusion matrix. The confusion matrix is computed at the end of training. The principle is to perform predictions on a set of images representing a complete subset of the different classes and to compare the predicted classes with the expected ones. Such a comparison can be represented as a matrix, as illustrated in the interface copy below.
For each trained model, a confusion matrix is processed. In the presented example, the diagonal, which represents good predictions, is maximised; the error is marginal, with 3.8% wrong predictions. The confusion matrix is an important tool for the data scientist to monitor the performance of the trained models, as a complement to the accuracy and loss values.
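The construction of such a matrix is straightforward. A sketch in numpy, assuming integer class indices (for example 0 = 'without mask', 1 = 'with mask'):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=2):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true, pred in zip(y_true, y_pred):
        matrix[true, pred] += 1
    return matrix

def error_rate(matrix):
    """Share of off-diagonal (wrong) predictions, like the 3.8% in the example."""
    return 1.0 - np.trace(matrix) / matrix.sum()
```

Unlike a single accuracy number, the matrix shows *which* class is confused with which, which matters when one error direction (missing an unmasked person) is more costly than the other.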
Max-ICS also proposes a way to manually test the models and analyse their behaviour on specific predictions. The above copy of the Max-ICS interface illustrates such a manual test with the integration of CAM (Class Activation Mapping), or Grad-CAM.
The tested image is correctly classified as someone 'with a mask' with a confidence of 99.99%. The CAM layer highlights that the man's face was the main focus used to determine the predicted class: this is a good hint that the model is working properly. In a future article, I will illustrate how CAM maps can help to uncover 'clever Hans' effects.
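The principle behind the classic CAM visualisation fits in a few lines: the final convolutional feature maps are summed, weighted by the classifier weights of the chosen class. A sketch in numpy, assuming a global-average-pooling architecture where such per-class weights exist (Grad-CAM generalises this by using gradients instead):

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """feature_maps: (C, H, W) last conv output; fc_weights: (num_classes, C).

    Returns an (H, W) map normalised to [0, 1], highlighting the image
    regions that contributed most to the chosen class.
    """
    cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=1)  # (H, W)
    cam = np.maximum(cam, 0)  # keep positive evidence only
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

The resulting low-resolution map is then upsampled and overlaid on the input image, producing the heatmap shown in the interface.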
Step 6 - put the best model in operation
Max-ICS also integrates the possibility to put the generated models into operation within processing chains, called pipelines in Max-ICS jargon.
The above pipeline is an example implementation that collects images from an HTTP camera stream, pre-processes them (here a simple resize and, optionally, face detection), detects masks, stores the analysed images through an API and visualises the results.
Such a simple pipeline is even simpler to implement thanks to the Max-ICS internal marketplace, which already includes dozens of pre-coded components. The use of marketplace components greatly reduces the effort and the risk in the implementation of processing chains.
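Conceptually, such a pipeline is a chain of small functions applied to each incoming frame. A sketch in plain Python, with hypothetical stand-ins for the resize, mask-detection and storage components (a real deployment would call the camera stream, the trained model and the storage API):

```python
def run_pipeline(frames, stages):
    """Push each frame through every stage in order and collect the results."""
    results = []
    for frame in frames:
        item = frame
        for stage in stages:
            item = stage(item)
        results.append(item)
    return results

# Hypothetical stages mirroring the article's chain: resize -> detect -> store.
def resize(frame):
    return {"frame": frame, "size": (224, 224)}

def detect_mask(item):
    # Placeholder logic; a real stage would run the trained classifier.
    item["label"] = "with_mask" if "masked" in item["frame"] else "without_mask"
    return item

def store(item):
    # A real stage would POST the result to the storage API.
    return (item["frame"], item["label"])
```

Keeping each stage as an independent function is what makes marketplace components reusable: a stage can be swapped without touching the rest of the chain.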
Finally, it is important to note that instantiating the trained model is simple thanks to a selection assistant (refer to the following screenshot).
In the illustrated example, the Max-ICS platform made it possible to create, with an end-to-end and zero-code approach, a deep learning classification model to detect the presence of protection masks. Thanks to the different tools provided by the platform, the complete process, from the integration of a dataset to the deployment of the model in operation, was achieved without strong coding knowledge and with impressive efficiency.
The quality and explainability features can be used to assess the quality of the model.
To finish, I wanted to be sure that our model does not confuse an FFP protection mask with... another type of mask... fortunately, the model sees the difference :-)