Recognizing Visual Object Using Machine Learning Techniques

Korichi, Aicha (2022) Recognizing Visual Object Using Machine Learning Techniques. Doctoral thesis, Université de mohamed kheider biskra.

[img] Text
Thesis.pdf

Download (2MB)

Abstract

Nowadays, Visual Object Recognition (VOR) has received growing interest from researchers and it has become a very active area of research due to its vital applications including handwriting recognition, diseases classification, face identification ..etc. However, extracting the relevant features that faithfully describe the image represents the challenge of most existing VOR systems. This thesis is mainly dedicated to the development of two VOR systems, which are presented in two different contributions. As a first contribution, we propose a novel generic feature-independent pyramid multilevel (GFIPML) model for extracting features from images. GFIPML addresses the shortcomings of two existing schemes namely multi-level (ML) and pyramid multi-level (PML), while also taking advantage of their pros. As its name indicates, the proposed model can be used by any kind of the large variety of existing features extraction methods. We applied GFIPML for the task of Arabic literal amount recognition. Indeed, this task is challenging due to the specific characteristics of Arabic handwriting. While most literary works have considered structural features that are sensitive to word deformations, we opt for using Local Phase Quantization (LPQ) and Binarized Statistical Image Feature (BSIF) as Arabic handwriting can be considered as texture. To further enhance the recognition yields, we considered a multimodal system based on the combination of LPQ with multiple BSIF descriptors, each one with a different filter size. As a second contribution, a novel simple yet effcient, and speedy TR-ICANet model for extracting features from unconstrained ear images is proposed. To get rid of unconstrained conditions (e.g., scale and pose variations), we suggested first normalizing all images using CNN. The normalized images are fed then to the TR-ICANet model, which uses ICA to learn filters. A binary hashing and block-wise histogramming are used then to compute the local features. At the final stage of TR-ICANet, we proposed to use an effective normalization method namely Tied Rank normalization in order to eliminate the disparity within blockwise feature vectors. Furthermore, to improve the identification performance of the proposed system, we proposed a softmax average fusing of CNN-based feature extraction approaches with our proposed TR-ICANet at the decision level using SVM classifier.

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: Keywords: Visual object recognition, Machine learning, Pattern recognition, Deep learning, Ear recognition, Arabic handwriting recognition
Subjects: Q Science > Q Science (General)
Divisions: Faculté des Sciences Exactes et des Sciences de la Nature et de la Vie > Département d'informatique
Depositing User: BFSE
Date Deposited: 06 Sep 2022 10:44
Last Modified: 06 Sep 2022 10:44
URI: http://thesis.univ-biskra.dz/id/eprint/5739

Actions (login required)

View Item View Item