The design of experimental platform and algorithmic basis for multi-modal face recognition

Participant: Olegs Nikisins (Institute of Electronics and Computer Science, Riga, Latvia)

Scientific supervisors at the host institution: Assistant Professor Kamal Nasrollahi and Professor Thomas B. Moeslund (Aalborg University, Laboratory of Visual Analysis of People, Aalborg, Denmark)

Period:  01/11/2013 to 31/12/2013

  1.        Purpose of the STSM

The purpose of this Short Term Scientific Mission (STSM) is to establish an international collaboration between two institutions within the COST Action IC1106 (Integrating Biometrics and Forensics for the Digital Age): the Institute of Electronics and Computer Science (Riga, Latvia) and the Laboratory of Visual Analysis of People (VAP) at Aalborg University (Aalborg, Denmark), which is the host institution. The general scientific idea behind the collaboration is to develop a multi-modal face recognition system based on the combination of depth, thermal and RGB images of the face. The ultimate goal is to study the robustness of such a system in different capturing scenarios. The research during the STSM covered two major fields: 1) development, testing and tuning of the algorithmic basis of the multi-modal face recognition system; 2) acquisition of a face database consisting of depth, thermal and RGB images. Developing the algorithms for such a system is a challenging task that will benefit both biometrics and forensics and requires the complementary competences of both institutions. Achieving the second goal requires specific instruments, which are available at the host institution.

  2.       Description of the work carried out during the STSM

The work carried out during the STSM can be divided into the following stages:

1)      Development of a multi-modal face database, which incorporates RGB, depth and thermal (RGB-D-T) images of the face. The database covers three capturing scenarios: rotations of the head, variable facial expressions and variable illumination conditions. The database has also been supplemented with ground-truth data (coordinates of the face bounding boxes for all modalities) and a standard evaluation protocol. Activity duration: approx. 2 weeks.

2)      Evaluation of feature parameters. The selected features are Local Binary Patterns (LBP), Histograms of Oriented Gradients (HOG) and Haar-like features. The parameters are selected to provide the best recognition performance for each modality, under the constraint that the feature spaces of all modalities have equal dimensionality (needed for feature-level fusion). Activity duration: approx. 1 week.

3)      Testing of multi-modal face recognition algorithms on the acquired face database. Each module has been evaluated individually and the complete multi-modal system has been analysed in order to quantify the performance gain of the multi-modal approach. The modalities are fused at the feature level. Three classifiers have been considered in the recognition module: NNC, Weighted NNC and linear SVM. Activity duration: approx. 2 weeks.

4)      Preparation of a publication based on the obtained results. The paper entitled “RGB-D-T based Face Recognition” has been submitted to the ICPR 2014 conference. Deadline for submission: 20.12.2013.

5)      Research on sparse coding with application to RGB-D-T based face recognition has been initiated. Activity duration in the host institution: until the end of the STSM; this is ongoing joint research.
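As an illustration of the kind of descriptor tuned in stage 2, the following is a minimal sketch of the basic 3x3 LBP operator producing a 256-bin histogram for a single-channel image. It is not the implementation used during the STSM: the function name `lbp_histogram`, the neighbour ordering and the use of a single global histogram are assumptions made only for this example, and the parameters actually selected are not given in this report.

```python
from typing import List, Sequence


def lbp_histogram(image: Sequence[Sequence[int]]) -> List[float]:
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    Hypothetical helper: `image` is a 2-D grid of intensities (any one
    modality), given as nested sequences.
    """
    # Offsets of the 8 neighbours, clockwise from the top-left pixel
    # (the ordering is an assumption; any fixed ordering works).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    # Border pixels are skipped because they lack a full neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256
```

Because the histogram length is fixed at 256 regardless of image size, applying the same operator to the RGB (grey-level), depth and thermal images yields feature vectors of equal dimensionality, which is the constraint stated for feature-level fusion.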

 3.       Description of the main results obtained

1)      RGB-D-T facial database

The first significant result of the STSM is the RGB-D-T facial database. The images of each person in the database are organized in three sets corresponding to the rotation, expression and illumination scenarios. The total number of persons in the database is 51. Each capturing sequence (rotation, expression, illumination) has 300 images per person: 100 RGB, 100 depth and 100 thermal synchronized images. The total number of images per person is therefore 900, resulting in 45,900 images in the database.
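The database size follows directly from the figures above; the short check below reproduces the arithmetic. The constant names are chosen only for this example.

```python
# Figures taken from the database description above.
PERSONS = 51
SCENARIOS = ("rotation", "expression", "illumination")
MODALITIES = ("RGB", "Depth", "Thermal")
IMAGES_PER_MODALITY = 100  # per person, per capturing sequence

images_per_sequence = len(MODALITIES) * IMAGES_PER_MODALITY  # 300
images_per_person = len(SCENARIOS) * images_per_sequence     # 900
total_images = PERSONS * images_per_person                   # 45900

print(images_per_sequence, images_per_person, total_images)
```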

For development purposes the database is supplemented with a Matlab indexing function, which must be used to split the data into Training, Validation and Testing sets. This function is introduced in order to unify both the development and testing of the face recognition algorithms among the researchers who use the database.
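The actual indexing function is distributed in Matlab and its split rule is not described in this report. Purely to illustrate the idea of a fixed, shared split, the sketch below divides the frame indices of one 100-image sequence into three disjoint sets. The function name `split_indices`, the 50/25/25 proportions and the contiguous ordering are all assumptions of this example, not the database protocol.

```python
from typing import Dict, List


def split_indices(n_frames: int = 100,
                  train_fraction: float = 0.5,
                  validation_fraction: float = 0.25) -> Dict[str, List[int]]:
    """Deterministically split frame indices 0..n_frames-1 into three sets.

    Hypothetical stand-in for the Matlab indexing function; the fractions
    are illustrative defaults, not the published protocol.
    """
    if not 0.0 < train_fraction + validation_fraction < 1.0:
        raise ValueError("fractions must leave room for a testing set")
    n_train = int(n_frames * train_fraction)
    n_val = int(n_frames * validation_fraction)
    indices = list(range(n_frames))
    return {
        "training": indices[:n_train],
        "validation": indices[n_train:n_train + n_val],
        # Whatever remains after training and validation is used for testing.
        "testing": indices[n_train + n_val:],
    }
```

Because the function involves no randomness, every researcher calling it with the same arguments obtains the same partition, which is the property the shared indexing function is meant to guarantee.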

Each facial image in the database is supplemented with ground-truth data, which includes bounding box parameters of the face. The ground-truth data is generated automatically.

The database will shortly be made publicly available on the webpage of the Laboratory of Visual Analysis of People (Aalborg University).

2)      RGB-D-T based face recognition

The next result is the developed multi-modal face recognition algorithm, which is based on the concept of feature-level fusion. The obtained experimental results cover various combinations of classifiers (NNC, Weighted NNC and linear SVM) and feature spaces (LBP, HOG, Haar-like). It is worth mentioning that preprocessing of the input face images was deliberately excluded from the algorithmic pipeline in order to gain a clear insight into the “as is” capabilities of each particular modality.
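A minimal sketch of feature-level fusion followed by a nearest-neighbour decision is given below. It rests on several assumptions that this report does not confirm: that fusion is performed by concatenating the per-modality feature vectors, that NNC denotes a nearest-neighbour classifier, and that Euclidean distance is used. The names `fuse` and `nearest_neighbour` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fuse(features: Dict[str, Sequence[float]],
         order: Sequence[str] = ("rgb", "depth", "thermal")) -> List[float]:
    """Concatenate per-modality feature vectors in a fixed order."""
    # The report requires equal dimensionality across modalities.
    if len({len(features[m]) for m in order}) != 1:
        raise ValueError("all modalities must have equal dimensionality")
    fused: List[float] = []
    for modality in order:
        fused.extend(features[modality])
    return fused


def nearest_neighbour(query: Sequence[float],
                      gallery: Sequence[Tuple[str, Sequence[float]]]) -> str:
    """Return the label of the gallery vector closest to `query`.

    Euclidean distance is an assumption of this sketch.
    """
    if not gallery:
        raise ValueError("gallery must not be empty")
    best_label, best_distance = gallery[0][0], math.inf
    for label, vector in gallery:
        distance = math.dist(query, vector)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label
```

A toy usage example with two enrolled identities and two-dimensional features per modality:

```python
gallery = [
    ("person_a", fuse({"rgb": [0.0, 0.0], "depth": [0.0, 0.0], "thermal": [0.0, 0.0]})),
    ("person_b", fuse({"rgb": [1.0, 1.0], "depth": [1.0, 1.0], "thermal": [1.0, 1.0]})),
]
query = fuse({"rgb": [0.9, 1.0], "depth": [1.0, 1.1], "thermal": [0.8, 1.0]})
print(nearest_neighbour(query, gallery))
```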

A few important conclusions can be drawn from the experimental results, which are described in the scientific publication. First, in terms of recognition difficulty the capturing scenarios can be ranked as follows: rotations (the most difficult), illumination (less difficult), expressions (the simplest). Second, the importance of each modality in the recognition process depends on the capturing scenario; however, thermal data consistently has a high impact on recognition regardless of the scenario. Third, among the evaluated features, LBP provides the best recognition results in most cases.

Details are covered in the publication, which will be attached for consideration.

4.       Future collaboration with the host institution

The collaboration with the host institution continues in the form of joint research on sparse coding with application to RGB-D-T based face recognition.

Participation in joint research projects (Horizon 2020) is also currently under discussion.

5.       Foreseen publications/articles resulting from the STSM

1)      Submitted publications:

O. Nikisins, K. Nasrollahi, M. Greitans and T.B. Moeslund. RGB-D-T based Face Recognition. International Conference on Pattern Recognition (ICPR 2014), under review, 2014

2) Foreseen publications:

A journal paper on sparse coding with application to RGB-D-T based face recognition.

Other comments

Many thanks to all for a great opportunity to participate in the STSM within the COST project and for comprehensive support during the STSM!