Review paper on Face Recognition using Different Feature Extraction Methods


Neural Networks 69 (2015) 111–125
Contents lists available at ScienceDirect: Neural Networks. Journal homepage: www.elsevier.com/locate/neunet

Optimized face recognition algorithm using radial basis function neural networks and its practical applications

Sung-Hoon Yoo (a), Sung-Kwun Oh (a,*), Witold Pedrycz (b,c,d)

a Department of Electrical Engineering, The University of Suwon, San 2-2 Wau-ri, Bongdam-eup, Hwaseong-si, Gyeonggi-do, 445-743, South Korea
b Department of Electrical & Computer Engineering, University of Alberta, Edmonton T6R 2V4 AB, Canada
c Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
d Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Article history: Received 14 July 2014; received in revised form 14 May 2015; accepted 17 May 2015; available online 5 June 2015.

Keywords: P-RBF NNs (Polynomial-based Radial Basis Function Neural Networks); PCA (Principal Component Analysis); ASM (Active Shape Model); FCM (Fuzzy C-Means Method); DE (Differential Evolution)

Abstract

In this study, we propose a hybrid method of face recognition that uses face region information extracted from the detected face region. In the preprocessing part, we develop a hybrid approach based on the Active Shape Model (ASM) and the Principal Component Analysis (PCA) algorithm. At this step, we use a CCD (Charge Coupled Device) camera to acquire a facial image, detect the face by using AdaBoost, and then employ Histogram Equalization (HE) to improve the quality of the image. ASM extracts the face contour and image shape to produce a personal profile. Then we use a PCA method to reduce the dimensionality of the face images. In the recognition part, we consider improved Radial Basis Function Neural Networks (RBF NNs) to identify a unique pattern associated with each person.
The proposed RBF NN architecture consists of three functional modules realizing the condition phase, the conclusion phase, and the inference phase, completed with the help of fuzzy rules coming in the standard 'if-then' format. In the formation of the condition part of the fuzzy rules, the input space is partitioned with the use of Fuzzy C-Means (FCM) clustering. In the conclusion part of the fuzzy rules, the connections (weights) of the RBF NNs are represented by four kinds of polynomials: constant, linear, quadratic, and reduced quadratic. The values of the coefficients are determined by running a gradient descent method. The output of the RBF NN model is obtained by running a fuzzy inference method. The essential design parameters of the network (including the learning rate, momentum coefficient, and fuzzification coefficient used by the FCM) are optimized by means of Differential Evolution (DE). The proposed P-RBF NNs (Polynomial-based RBF NNs) are applied to facial recognition, and their performance is quantified from the viewpoint of output performance and recognition rate.

© 2015 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.neunet.2015.05.001
0893-6080/© 2015 Elsevier Ltd. All rights reserved.

Corresponding author. Tel.: +82 31 229 6342; fax: +82 31 220 2667. E-mail addresses: shyoo@suwon.ac.kr (S.-H. Yoo), ohsk@suwon.ac.kr (S.-K. Oh), wpedrycz@ualberta.ca (W. Pedrycz).

1. Introduction

Biometrics delivers technologies that identify individuals by measuring physical or behavioral characteristics of humans. A password or PIN (Personal Identification Number), the means of personal authentication in common use, must be memorized and can be compromised relatively easily. In more advanced biometric scenarios, memorization is not required (Chellappa, Wilson, & Sirohey, 1995). Existing face recognition algorithms were studied using 2D images. Besides the whole face, local eye- or template-matching-based methods were used. These raised the issues of computing-time overhead as well as memory requirements concerning the acquisition of image data and the ensuing learning. PCA transformation, which decreases processing time by reducing the dimensionality of the data, has been proposed to solve this problem. Boehnen and Russ (2005) used facial color present in 2D images, while Colombo, Cusano, and Schettini (2005) used curvature, position, and shape of the face. Colbry, Stockman, and Jain (2005) and Lu and Jain (2005) generated a statistical model of the eyes, nose, and mouth. Recently, more effective applications were supported by the use of ASM (Cootes, Cooper, Taylor, & Graham, 1995).

Most 2D feature-based biometric algorithms typically require high-quality images in order to achieve high performance (Mohammed, Minhas, Jonathan Wu, & Sid-Ahmed, 2011). Therefore, in order to properly extract the feature candidates of a face image, all unnecessary information existing in the face image must be removed. The facial feature image can thus be obtained after removing the unnecessary features by augmenting the ''conventional'' recognition system with the ASM process. That is, facial features are extracted by removing the background and obstacles from input images by using the ASM.

Fig. 1. An overall architecture of the face recognition system.

In the sequel, face extraction and recognition are carried out by running a series of algorithms: ASM, PCA, and DE-based P-RBF NNs. The main objective of this study is to improve the face recognition rate by handling the image of enhanced facial features through multi-dimensional data preprocessing technologies (PCA combined with ASM) and DE-based P-RBF NNs.
We realize this objective within the context of a practical recognition environment. Along with this new design approach, we provide a thorough comparative analysis contrasting the performance of the recognition scheme proposed here with the performance of other approaches existing in the literature.

This paper is organized as follows: in Section 2, we introduce the proposed overall face recognition system and the design method. Section 3 covers histogram equalization, AdaBoost, ASM, and PCA, which form the preprocessing part of face recognition. Optimization techniques and a design method of a pattern classifier for face recognition are covered in Section 4. In Section 5, we analyze the performance of the proposed system by using input image data coming from a CCD camera. Finally, conclusions are drawn in Section 6.

2. Structure of face recognition system

In this section, the overall structure of the proposed face recognition system and the design method are described. The face recognition system consists of data preprocessing and an RBF NN pattern classifier that applies optimization techniques. Histogram equalization, AdaBoost, ASM, and PCA realize the data preprocessing phase. Histogram equalization compensates for the illumination distortion of images. The face area is detected by using AdaBoost. By using the ASM, face information is extracted and eventual obstacles are removed. The features of the face images are extracted with the use of the PCA method. The RBF NN pattern classifier is then constructed. The classifier comprises three functional modules, which realize the condition, conclusion, and aggregation phases. The input space of the condition phase is represented by fuzzy sets formed by running the FCM clustering algorithm (Tsekourasa, Sarimveisb, Kavaklia, & Bafasb, 2005). The conclusion phase involves a certain polynomial. The output of the network is determined by running fuzzy inference. The proposed RBF NNs come with a fuzzy inference mechanism constructed with the use of the fuzzy rule-based network.
In the construction of the classifier, Differential Evolution (DE) is used to optimize the momentum coefficient, the learning rate, and the fuzzification coefficient used in the FCM algorithm. Fig. 1 portrays the overall architecture of the system.

3. Data preprocessing for facial feature extraction

In this section, we briefly elaborate on histogram equalization, the AdaBoost algorithm, ASM, and PCA as they are used in the data preprocessing phase.

3.1. Histogram equalization

Histogram equalization (HE) is a commonly used technique for enhancing image contrast (Gonzalez & Woods, 2002). Consider a facial image W with n pixels and a total of L gray levels, e.g., 256 gray levels in the dynamic range [0, L−1]. The generic idea is to map the gray levels based on the probability distribution of the input gray levels. For a given image W, the probability density function PDF(W_k) is defined as

PDF(W_k) = n_k / n   (1)

for k = 0, 1, ..., L−1, where n_k represents the number of times the level W_k appears in the input image W, and n is the total number of samples (pixels) in the input image. Note that PDF(W_k) is associated with the histogram of the input image, which represents the number of pixels that have a specific input intensity W_k. A plot of n_k versus W_k is known as the histogram of the image W. Based on the PDF, the cumulative distribution function (CDF) is defined as

CDF(w) = Σ_{j=0}^{k} PDF(W_j)   (2)

where W_k = w, for k = 0, 1, ..., L−1. By definition, CDF(W_{L−1}) = 1. Through histogram equalization we map the input image into the entire dynamic range (W_0, W_{L−1}).

Fig. 2(a) and (b) show a face image along with its histogram. The output image produced after histogram equalization is given in Fig. 2(c) and (d). This result demonstrates the performance of the histogram equalization method in enhancing the contrast of an image through dynamic range expansion.

3.2. AdaBoost-based face detection

The purpose of face detection is to locate faces present in still images.
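The equalization mapping of Section 3.1 (Eqs. (1)–(2)) can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation; the function name and the round-to-level remapping step are our own choices.

```python
import numpy as np

def histogram_equalize(image, L=256):
    """Histogram equalization following Eqs. (1)-(2):
    PDF(W_k) = n_k / n, CDF(w) = sum_{j<=k} PDF(W_j),
    then map each level into the full dynamic range [0, L-1]."""
    img = np.asarray(image)
    n = img.size                                    # total number of pixels
    counts = np.bincount(img.ravel(), minlength=L)  # n_k for each gray level
    pdf = counts / n                                # Eq. (1)
    cdf = np.cumsum(pdf)                            # Eq. (2); cdf[L-1] == 1
    mapping = np.round((L - 1) * cdf).astype(img.dtype)
    return mapping[img]
```

Because the CDF is nondecreasing and reaches 1 at the top level, the mapping stretches frequently occurring intensity ranges across the full dynamic range, which is exactly the contrast-expansion effect shown in Fig. 2.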
This has long been a focus of computer vision research and has achieved a great deal of success (Gao, Pan, Ji, & Yang, 2012; Lopez-Molina, Baets, Bustince, Sanz, & Barrenechea, 2013; Rowley, Baluja, & Kanade, 1998; Sung & Poggio, 1998; Viola & Jones, 2004). Comprehensive reviews are given by Yang, Kriegman, and Ahuja (2002), and Zhao, Chellappa, and Phillips (2003). Among face detection algorithms, the AdaBoost-based method (Freund & Schapire, 1995) proposed by Viola and Jones (2001) has gained great popularity due to its high detection rate, low complexity, and solid theoretical foundations.

Fig. 2. Face images and histogram.
Fig. 3. Rectangle features of Haar-like type.

The high speed of the AdaBoost method is mainly due to the use of simple Haar-like features and a cascaded classifier structure, which quickly excludes most of the image window hypotheses. Fig. 3 shows several types of Haar-like features.

At a preprocessing stage, an auxiliary image Ii, called the integral image or summed-area table (Lienhart & Maydt, 2002), is formed on the basis of the original image I0, where the value Ii(i, j) is the sum of the pixels above and to the left of position (i, j) in I0. Using Ii, the sum of pixel intensities over any rectangle in I0 can be calculated in constant time. Afterwards, each stage classifier is trained using AdaBoost.

AdaBoost constructs the strong classifier as a combination of weak classifiers with proper coefficients. This is an iterative learning process. Each candidate image window w, at all positions and all scales, is fed into a cascaded classifier. At each stage, the classifier response h(w) comes as the sum of a series of feature responses h_j(w), as described below:

h(w) = Σ_{j=1}^{n_i} h_j(w),  h_j(w) = { α_j1 if f_j(w) < t_j; α_j2 otherwise }   (3)

where f_j(w) is the feature response of the jth Haar feature, and α_j1 and α_j2 are the feature weight coefficients.
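The constant-time rectangle sums behind the Haar-like features can be sketched with a zero-padded summed-area table. This is an illustrative sketch under our own naming; the two-rectangle feature shown is just one of the feature types in Fig. 3.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[r, c] = sum of img[:r, :c] (exclusive),
    padded with a zero row/column so rectangle sums need no bounds checks."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixel intensities over any rectangle in O(1): four lookups."""
    return (ii[top + height, left + width] - ii[top, left + width]
            - ii[top + height, left] + ii[top, left])

def haar_two_rect(ii, top, left, height, width):
    """A two-rectangle Haar-like feature (cf. Fig. 3): difference between
    the sums of the left and right halves of the window."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

Because every feature response reduces to a handful of table lookups, thousands of candidate windows per frame can be scored cheaply, which is what makes the cascade fast.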
If h(w) is lower than a threshold t, the candidate window w is regarded as non-face and thrown away; otherwise, it is sent to the next classifier. Multiple detections of a single face are pruned by non-maximum suppression realized at the last step. Fig. 4 shows the cascade of classifiers with N stages. Each classifier in the cascaded AdaBoost detector works independently, and the minimum true positive and maximum false alarm rates of these stages are the same. Only those sub-windows accepted as true positives by all stages of the detector are regarded as targets. 'T' means that true candidate sub-windows passed the verification of each classifier, and 'F' means that false candidates are rejected by the corresponding classifier.

Fig. 4. Flowchart of the cascade AdaBoost detector.

3.3. Face shape extraction using ASM

ASM is a commonly used technique for facial feature extraction. This method is similar to the Active Contour Model, or snakes (Huang, Hzu, & Cheng, 2010), but exhibits the advantage that instances of an ASM can only deform in the ways found in its training set. ASM also allows for a considerable level of variability in shape modeling, but the model is specific to the class of target objects or structures that it intends to represent.

3.3.1. The shape model

A shape model is described by n landmark points that represent the important positions in the object to be represented. These points are generated based on a set of training shapes. Each training shape x is represented as a shape vector, which is a collection of landmark points called a point distribution model (Wang, Xie, Zhu, Yang, & Zheng, 2013):

x = (x_0, y_0, x_1, y_1, ..., x_k, y_k, ..., x_{n−1}, y_{n−1})^T   (4)

where T denotes the transpose operation, and (x_k, y_k) are the coordinates of the kth landmark point. Fig. 5 shows a training image with its landmark points marked.

Fig. 5. Boundary of the face area.

The training shapes are all aligned by translation, rotation, and scaling so as to minimize the sum of squared distances between their corresponding landmark points. Then, the mean shape x̄ and the deviation of each training shape from the mean are calculated. Principal component analysis (PCA) is then applied to capture most of the shape variations. Therefore, a shape model can be approximated as follows:

x ≈ x̄ + Pb   (5)

where P = (p_1, p_2, ..., p_t) is the matrix whose columns are the first t eigenvectors with the largest eigenvalues arranged in descending order, and b = (b_1, b_2, ..., b_t)^T is a weight vector for the t eigenvectors, referred to as the shape parameters. When fitting the shape model to an object, the value of b_i is constrained to lie within the range of ±3 standard deviations. This ensures that this range of the shape parameters can represent most of the shape variations in the training set. The number of eigenvectors t to be used is determined such that the eigenvectors can represent a certain amount of the shape variation in the training shapes, usually ranging from 90% to 95%. The desired number of eigenvectors t is given as the smallest t which satisfies the condition

Σ_{i=1}^{t} λ_i ≥ 0.95 Σ_{i=1}^{N} λ_i   (6)

where N is the overall number of eigenvectors available.

3.3.2. Modeling the gray-level appearance

The gray-level appearance (Cootes, Taylor, Lanitis, Cooper, & Graham, 1993), which describes the local texture feature around each landmark, is the normalized derivative of the profiles sampled perpendicular to the landmark contour and centered at the landmark. This gray-level information is used to estimate the best position of the landmarks in the searching process. The normalized derivative of the profiles is invariant to offsets of the gray levels. The gray-level profile g_ij of landmark j in image i is a (2n+1)-dimensional vector, in which n pixels are sampled on either side of the landmark under consideration:

g_ij = [g_ij0, g_ij1, ..., g_ij(2n)]   (7)

where g_ijk, k = 0, ..., 2n, is the gray-level intensity of the corresponding pixel. The derivative profile of g_ij, of length 2n, is given as follows:

dg_ij = [g_ij1 − g_ij0, g_ij2 − g_ij1, ..., g_ij(2n) − g_ij(2n−1)].   (8)

The normalized derivative profile is given by

y_ij = dg_ij / Σ_{k=0}^{2n−1} |dg_ijk|   (9)

where dg_ijk = g_ij(k+1) − g_ijk. The covariance matrix of the normalized derivative profiles over N training images comes in the form

C_yj = (1/N) Σ_{i=1}^{N} (y_ij − ȳ_j)(y_ij − ȳ_j)^T   (10)

where ȳ_j is the mean profile. The ASM employs the information coming from modeling the gray-level statistics around each landmark to determine the desired movement or adjustment of each landmark such that the face shape model can fit the target object accurately. To determine the movement of a landmark, a search profile, i.e., a line passing through the landmark under consideration and perpendicular to the contour formed by the landmark and its neighbors, is extracted. A number of sub-profiles are generated while the best set of shape parameters is being searched. These sub-profiles are matched to the corresponding profiles obtained from the training set. The difference between a sub-profile y and the training profile is computed using the Mahalanobis distance:

f(y) = (y − ȳ_j)^T C_yj^{−1} (y − ȳ_j).   (11)

Minimizing f(y) is equivalent to maximizing the probability that y matches ȳ_j according to a Gaussian distribution. Fig. 6 shows each movement of the singular points between the model point (y) and the nearest edge (ȳ) under the ASM.

Fig. 6. Movement of singular point.

3.3.3. The optimization algorithm

Modifying the point distribution model (PDM) to fit the object in an image is an iterative optimization process. Starting from the mean shape representation of the model, each point of the model is allowed to move dynamically until it fits the object.
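The profile matching of Section 3.3.2 (Eqs. (8), (9), and (11)) can be sketched as follows. This is an illustrative sketch with our own function names; the pseudo-inverse is our guard for a possibly singular covariance estimate, an assumption not stated in the paper.

```python
import numpy as np

def normalized_derivative_profile(g):
    """Eqs. (8)-(9): difference the (2n+1)-sample gray-level profile,
    then normalize by the sum of absolute differences (offset-invariant)."""
    dg = np.diff(g)                      # derivative profile, length 2n
    denom = np.sum(np.abs(dg))
    return dg / denom if denom > 0 else dg

def mahalanobis_profile_distance(y, y_mean, cov):
    """Eq. (11): f(y) = (y - y_mean)^T C^{-1} (y - y_mean).
    np.linalg.pinv guards against a singular covariance estimate."""
    d = y - y_mean
    return float(d @ np.linalg.pinv(cov) @ d)
```

During the search, each candidate sub-profile y is scored against the trained mean profile ȳ_j this way, and the landmark moves to the position with the smallest f(y).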
At each model point or landmark, a profile perpendicular to the contour is extracted, and a new, better position of that point is estimated along this profile. Different approaches (Yong, Zhang, Xiaoguang, & Zhang, 2000) can be used to search for a better position of the points. The simplest way is to find the strongest edge along the search profile. Another approach is to create the gray-level appearance model, or profile, of each point, which maximizes the probability of the gray-level profile, as described in the last section. After searching, the shape parameters b = (b_1, b_2, ..., b_t)^T and the pose parameters (i.e., rotation, scale, and translation of the model) are adjusted in such a way as to minimize the overall distance between the new positions of the points and the positions of the original points. The adjustment process is repeated until no significant changes in the model points are observed.

3.4. Feature extraction using PCA

Principal component analysis is a standard technique used in statistical pattern recognition and signal processing for data reduction and feature extraction. As the pattern often contains redundant information, we map it to a feature space of lower dimensionality (Mohammed et al., 2011).

A face image of size N × N pixels can be considered a one-dimensional vector of dimensionality N². For example, a face image from the AT&T database (formerly the ORL database of faces) of size 112 × 92 can be considered a vector of dimension 10,304, or equivalently a point in a 10,304-dimensional space. An ensemble of images maps to a collection of points in this high-dimensional space. The main idea of the principal components is to find the vectors that best account for the distribution of face images within the entire image space.
Because these vectors are the eigenvectors of the covariance matrix corresponding to the original face images, and because they are face-like in appearance, we refer to them as 'eigenfaces'.

Let the training set of face images be Γ_1, Γ_2, ..., Γ_M. The average face of the set is expressed in the form

Ψ = (1/M) Σ_{n=1}^{M} Γ_n.   (12)

Fig. 7 shows the average face coming from the AT&T database.

Fig. 7. Average face of AT&T database.

Each face differs from the average by the vector

Φ_i = Γ_i − Ψ.   (13)

This set of high-dimensional vectors is then subject to principal component analysis, which seeks a set of M orthonormal vectors U_m that best describe the distribution of the data. The kth vector U_k is selected in such a way that the expression

λ_k = (1/M) Σ_{n=1}^{M} (U_k^T Φ_n)²   (14)

attains a maximum, subject to the constraint

U_I^T U_k = δ_Ik = { 1 if I = k; 0 otherwise }.   (15)

The vectors U_k are the eigenvectors of the covariance matrix

C = (1/M) Σ_{n=1}^{M} Φ_n Φ_n^T = A A^T.   (16)

The covariance matrix C, however, is of dimensionality N² × N² (an N² × N² real symmetric matrix), so determining its N² eigenvectors and eigenvalues directly is an intractable and computationally unfeasible task for images of typical sizes. Recall that A = [Φ_1, Φ_2, ..., Φ_M]; the matrix product A^T A is only an M × M matrix. Since M is the number of faces in the database, the eigenvector analysis is reduced from the order of the number of pixels in the images (N²) to the order of the number of images in the training set (M).

Consider the eigenvectors v_i of A^T A such that

A^T A v_i = µ_i v_i.   (17)

Pre-multiplying both sides by A, we obtain

A A^T A v_i = µ_i A v_i.   (18)

We see that the A v_i are the eigenvectors and the µ_i are the eigenvalues of C = A A^T.

Following the analysis shown above, we construct the M × M matrix L = A^T A, where L_mn = Φ_m^T Φ_n, and find the M eigenvectors v_i of L. These vectors determine linear combinations of the M training-set face images that form the eigenfaces U_I:

U_I = Σ_{k=1}^{M} v_Ik Φ_k,  I = 1, ..., M.   (19)

Fig. 8 shows eigenfaces produced by the PCA.

Fig. 8. Eigenface.

With this analysis, the calculations are greatly reduced, from the order of the number of pixels in the images (N²) to the order of the number of images in the training set (M). In practice, the training set of face images is relatively small (M ≪ N²), and the calculations become quite manageable. The associated eigenvalues allow us to rank the eigenvectors according to their usefulness in characterizing the variation among the images. The eigenface images calculated from the eigenvectors of L span a basis set that can be used to describe face images (Slavkovic & Jevtic, 2012). We evaluated a limited version of this framework on an ensemble of 115 images (M = 115) of Caucasian males digitized in a controlled manner, and found that 40 eigenfaces (M′ = 40) were sufficient to realize a very good description of the face images. In practice, a lower value of M′ can be sufficient for identification, since accurate reconstruction of the image is not an absolute requirement. In the framework of face recognition, we are concerned with a pattern recognition task rather than image reconstruction. The eigenfaces span an M′-dimensional subspace of the original N² image space; hence, the M′ significant eigenvectors of the L matrix with the largest associated eigenvalues are sufficient for a reliable representation of the faces in the face space characterized by the eigenfaces.

Then a new face image Γ is transformed into its eigenface components (projected onto ''face space'') in the following manner:

w_k = U_k^T (Γ − Ψ)   (20)

for k = 1, ..., M′. The weights form a projection vector Ω^T = [w_1, w_2, ..., w_M′] describing the contribution of each eigenface to the representation of the input face image, treating the eigenfaces as a basis set for face images. The projection vector is then used in a standard pattern recognition algorithm to identify which of the predefined face classes, if any, describes the face to the highest extent.

4. Designing the pattern classifier using radial basis function neural networks

In this section, we outline the design of fuzzy inference mechanism-based Radial Basis Function Neural Networks. The network structure of the proposed RBF NNs is divided into three modules involving the condition, conclusion, and inference phases. The input space of the condition phase is divided by using FCM, and the local area of the conclusion phase is represented by a polynomial function. The final output of the network is obtained through the fuzzy inference of the inference phase. The performance of the proposed RBF NNs is improved by generating a nonlinear discriminant function in the output space, owing to the fuzzy inference mechanism of the polynomial-based structure (Balasubramanian, Palanivel, & Ramalingam, 2009; Connolly, Granger, & Sabourin, 2012; Han, Chen, & Qiao, 2011; Oh, Kim, Pedrycz, & Park, 2014; Park, Oh, & Kim, 2008).

4.1. Architecture of polynomial-based RBF networks

The general architecture of the radial basis function neural networks (RBF NNs) consists of three layers, as shown in Fig. 9.

Fig. 9. General architecture of RBF neural networks.

The network exhibits a single hidden layer. Each node in the hidden layer determines a level of activation of the receptive field (radial basis function) Θ(x) given some input x. The jth output y_j(x) is a weighted linear combination of the activation levels of the receptive fields:

y_j(x) = Σ_{i=1}^{c} w_ji Θ_i(x)   (21)

with j = 1, ..., s, where s stands for the number of outputs (being equal to the number of classes encountered in a given classification problem).
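The eigenface computation of Section 3.4 (Eqs. (12)–(20)), using the small M × M matrix L = A^T A in place of the N² × N² covariance, can be sketched as follows. This is an illustrative numpy sketch with our own function names, not the authors' code.

```python
import numpy as np

def eigenfaces(faces, n_components):
    """Eigenfaces via the small M x M matrix L = A^T A (Eqs. 16-19).
    faces: (M, N2) array, one flattened face image per row."""
    psi = faces.mean(axis=0)                  # average face, Eq. (12)
    A = (faces - psi).T                       # columns Phi_i, Eq. (13); shape (N2, M)
    L = A.T @ A                               # M x M instead of N2 x N2, Eq. (17)
    mu, v = np.linalg.eigh(L)                 # eigenvalues in ascending order
    order = np.argsort(mu)[::-1][:n_components]
    U = A @ v[:, order]                       # U_I = A v_I, Eq. (19)
    U /= np.linalg.norm(U, axis=0)            # normalize each eigenface
    return psi, U

def project(face, psi, U):
    """Eq. (20): eigenface weights w_k = U_k^T (Gamma - Psi)."""
    return U.T @ (face - psi)
```

The columns of U are orthogonal because (A v_i)^T (A v_j) = µ_j v_i^T v_j vanishes for i ≠ j, which is exactly why the M × M detour recovers valid eigenvectors of A A^T.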
In the case of the Gaussian type of RBFs, we have

Θ_i(x) = exp(−‖x − v_i‖² / 2σ_i²)   (22)

where x is the n-dimensional input vector [x_1, ..., x_n]^T, v_i = [v_i1, ..., v_in]^T is the center of the ith basis function Θ_i(x), and c is the number of nodes in the hidden layer. Typically, the distance ‖·‖ used in (22) is the Euclidean one (Staiano, Tagliaferri, & Pedrycz, 2006).

The proposed P-RBF NNs (Polynomial-based RBF NNs) exhibit a topology similar to the one encountered in RBF NNs. However, the functionality and the associated design process exhibit several evident differences. In particular, the receptive fields do not assume any explicit functional form (say, Gaussian, ellipsoidal, etc.), but are directly reflective of the nature of the data and come as the result of fuzzy clustering. Given the prototypes formed by the FCM method, the receptive fields are described in the following way:

Θ_i(x) = A_i(x) = 1 / Σ_{j=1}^{c} (‖x − v_i‖² / ‖x − v_j‖²).   (23)

In addition, the weights between the output layer and the hidden layer are not single numeric values but come in the form of polynomials of the input variables (hence the term functional links used in this architecture):

w_ji = f_ji(x).   (24)

The neuron located at the output layer realizes a linear combination of the activation levels of the corresponding receptive fields; hence (21) can be rewritten as follows:

y_j(x) = Σ_{i=1}^{c} f_ji(x) A_i(x).   (25)

The above structure of the classifier can be represented through a collection of fuzzy rules:

If x is A_i, then f_ji(x)   (26)

where the fuzzy set A_i is the ith cluster (membership function) of the ith fuzzy rule, f_ji(x) is a polynomial function generalizing a numeric weight used in the standard form of the RBF NNs, c is the number of fuzzy rules (clusters), and j = 1, ..., s.

4.2. Three processing phases of polynomial-based radial basis function neural networks

The proposed P-RBF NNs are implemented by realizing three processing phases, that is, the condition, conclusion, and aggregation phases.
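The FCM-derived receptive field of Eq. (23) and the functional-link output of Eq. (25) can be sketched as follows; for simplicity the sketch uses constant conclusions f_ji(x) = a_i0 (the special case that reduces the P-RBF NN to a standard RBF NN). Function names are our own.

```python
import numpy as np

def fcm_receptive_fields(x, prototypes):
    """Eq. (23): Theta_i(x) = 1 / sum_j (||x - v_i||^2 / ||x - v_j||^2).
    Memberships are nonnegative and sum to one over the clusters."""
    d2 = np.array([np.sum((x - v) ** 2) for v in prototypes])
    if np.any(d2 == 0):                  # x coincides with a prototype
        return (d2 == 0).astype(float)
    return 1.0 / (d2 * np.sum(1.0 / d2))

def p_rbf_output(x, prototypes, weights):
    """Eq. (25) with constant conclusions f_i(x) = a_i0:
    y(x) = sum_i a_i0 * Theta_i(x)."""
    return float(fcm_receptive_fields(x, prototypes) @ weights)
```

Because the memberships form a partition of unity, the weighted sum of Eq. (25) needs no extra normalization, which is what Eq. (37) later exploits.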
Condition and conclusion phases relate to the formationof the fuzzy rules and their ensuing analysis. Aggregation phase isconcerned with a fuzzy inference (mapping procedure).4.2.1. Condition phase of networksThe condition phase of the P-RBF NNs is formed by means ofthe Fuzzy C-Means. In this section, we briefly review the objectivefunction-based fuzzy clustering with intent of highlighting its keyfeatures pertinent to the architecture of the network (Roh, Oh, &Pedrycz, 2010). The FCM algorithm is aimed at the formation of ‘c’fuzzy sets (relations) in Rn. The objective function Q guiding theclustering is expressed as a sum of the distances of individual datafrom the prototypes v1, v2, . . . , and vc,Q =ci=1Nk=1umik ∥xk − Vi∥2. (27)Here, ∥ ∥ denotes a certain distance function; ‘m’ stands for afuzzification factor (coefficient), m > 1.0. N is the number ofpatterns (data). The resulting partition matrix is denoted by U =[uik], i = 1, 2, . . . , c; k = 1, 2, . . . , N. While there is a substantialdiversity as far as distance functions are concerned, here we adhereto a weighted Euclidean distance taking on the following form∥xk − Vi∥2 =nj=1(xkj − vij)2σ2j(28)with σj being a standard deviation of the jth variable. While notbeing computationally demanding, this type of distance is stillquite flexible and commonly used.Consider the set X consisting of N patterns X = {x1, x2, . . . , xN},xk ∈ Rn, 1 ≤ k ≤ N. In clustering we assign patterns xk ∈ X intoc clusters, which are represented by its prototypes vi ∈ Rn, 1 ≤i ≤ c. The assignment to individual clusters is expressed in termsof the partition matrix U = [uik] whereci=1uik = 1, 1 ≤ k ≤ N (29)and0 <Nk=1uik < N, 1 ≤ i ≤ c. (30)The minimization of Q is realized in successive iterations byadjusting both the prototypes and the partition matrix, that ismin Q (U, v1, v2, . . . , vc ). The corresponding formulas used in aniterative fashion read as followsuik =1cj=1∥xk−vi∥∥xk−vj∥ 2m−1, 1 ≤ k ≤ N, 1 ≤ i ≤ c (31)Fig. 10. 
Topology of P-RBF NNs exhibiting three functional modules of condition,conclusion and aggregation phases.andvi =Nk=1umikxkNk=1umik, 1 ≤ i ≤ c. (32)The properties of the optimization algorithm are well documentedin the literature, cf. Bezdek (1981). In the context of our investigations,we note that the resulting partition matrix produces‘c’ fuzzy relations (multivariable fuzzy sets) with the membershipfunctions u1, u2, . . . , ucforming the corresponding rows of thepartition matrix U, that is U = [uT1 uT2. . . uTc]. From the designstandpoint, there are several essential parameters of the FCMthat impacts the usage of the produced results. These parametersconcern the number of clusters, the values of the fuzzification coefficientand a form of the distance function. The fuzzification coefficientexhibits a significant impact on the form (shape) of thedeveloped clusters. The commonly used value of m is equal to2. Lower values of the fuzzification coefficient produce moreBoolean-like shapes of the fuzzy sets where the regions of intermediatemembership values are very much reduced. When weincrease the values of ‘‘m’’ above 2, the resulting membership functionsstart to become ‘‘spiky’’ with the values close to 1 in a veryclose vicinity of the prototypes. We anticipate that the adjustmentof the values of m will substantially impact the performance of thenetwork.4.2.2. Conclusion phase of the networkPolynomial functions are dealt with in the conclusion phase.For convenience, we omit the suffix j from the original notationfji(x) shown in Fig. 10 and described by (26). Several classes ofpolynomials are worth notingConstant; fi(x) = ai0 (33)Linear; fi(x) = ai0 +nj=1aijxj (34)118 S.-H. Yoo et al. / Neural Networks 69 (2015) 111–125Quadratic; fi(x) = ai0 +nj=1aijxj +nj=1nk=1aijkxjxk. 
(35)These functions are activated by the corresponding entries of thepartition matrix and lead to local regression models located at thecondition phase of the individual rules.In case of the quadratic function, the dimensionality of theproblem increases quite quickly especially when dealing withproblems of high dimensionality. The reduced quadratic functionis also discussed with intent of reducing computational burden.Reduced Quadratic; fi(x) = ai0 +nj=1aijxj +nk=1aijkx2k. (36)The use of some constant in (33) (which are special cases of thepolynomial functions) reduces the P-RBF NNs to the standard RBFneural networks as illustrated in Fig. 9.4.2.3. Aggregation phase of networksLet us consider the P-RBF NNs structure by considering thefuzzy partition realized in terms of FCM as shown in Fig. 10. Thenode denoted by Π is realized as a product of the correspondingfuzzy set and the polynomial function. The family of fuzzy sets Aiforms a partition (so that the sum of membership grades sum upto one at each point of the input space). The ‘‘’’ neuron realizesa sum as shown in (25). The output of P-RBF NNs can be obtainedby following a standard inference mechanism used in rule-basedsystems (Oh, Pedrycz, & Park, 2004),yj = gj(x) =ci=1uifji(x)ck=1uk=ci=1uifji(x) (37)where, ui = Ai(x). All the entries sum up to 1 as indicated by (9).gj(x) describes here the discriminant function for discerning jthclass.Based on the local polynomial-like representation, the globalcharacteristics of the P-RBF NNs result through the composition oftheir local relationships.4.3. The discriminant function polynomial-based radial basis functionneural networks classifiersThere are many different ways to describe pattern classifiers.One of the most useful ways is the one realized in terms of a set ofdiscriminant functions gi(x), i = 1, . . . , m (where m stands for thenumber of classes). The classifier is said to assign an input vector xto class ωiifgi(x) > gj(x) for all j ̸= i. 
(38)

Thus, the classifiers are viewed as networks that compute m discriminant functions and select the category corresponding to the largest value produced by these functions. In this paper, the proposed P-RBF NNs classifier is used for two-class or multi-class problems. If a classification problem is a multi-class one, then we use (38) as the discriminant function; otherwise, we consider the following decision rule, defined commonly through a single discriminant function g(x) in a two-class problem:

Decide ω_1 if g(x) > 0; otherwise decide ω_2.  (39)

The final output of the networks, (37), is used as a discriminant function g(x) and can be rewritten in the form of the linear combination

g(x) = a^T f_x  (40)

where a is a vector of coefficients of the polynomial functions used in the conclusion phase of the rules in (33)–(36) and f_x is a vector built from the membership values u. These can be defined for each of the polynomials as follows:

(i) Constant:
a^T = [a_10, ..., a_c0],  f_x = [u_1, ..., u_c]^T.

(ii) Linear:
a^T = [a_10, ..., a_c0, a_11, ..., a_c1, ..., a_cn]
f_x = [u_1, ..., u_c, u_1 x_1, ..., u_c x_1, ..., u_c x_n]^T.

(iii) Quadratic:
a^T = [a_10, ..., a_c0, a_11, ..., a_c1, ..., a_cn, ..., a_cnn]
f_x = [u_1, ..., u_c, u_1 x_1, ..., u_c x_1, ..., u_c x_n, ..., u_c x_n x_n]^T.

(iv) Reduced quadratic:
a^T = [a_10, ..., a_c0, a_11, ..., a_c1, ..., a_cn, ..., a_cnn]
f_x = [u_1, ..., u_c, u_1 x_1, ..., u_c x_1, ..., u_c x_n, ..., u_c x_n^2]^T.

For the discriminant function coming in the form of (40), a two-class classifier implements the decision rule expressed by (39). Namely, x is assigned to ω_1 if the inner product a^T f_x is greater than zero, and to ω_2 otherwise. The equation g(x) = 0 defines the decision surface that separates the two classes. A multi-class classifier, in turn, implements the decision rule expressed by (38). Each output node generates a discriminant function corresponding to one class; if the ith output is larger than all the remaining outputs, the pattern x is assigned to the ith class.

4.4.
Optimized process of P-RBF NNs by using differential evolution

The algorithm of Differential Evolution (DE) (Dervis & Selcuk, 2004; Storn, 1997), introduced by Storn (1997), is a parallel direct search method which utilizes NP parameter vectors as a population for each generation G. DE can be categorized into the class of floating-point encoded evolutionary optimization algorithms. Currently, there are several variants of DE. The particular variant used throughout this investigation is the DE/rand/1/bin scheme; this scheme is discussed here, and more detailed descriptions are provided in the literature. Since the DE algorithm was originally designed to work with continuous variables, the optimization of continuous problems is discussed first.

The DE algorithm is a population-based algorithm using three operators: crossover, mutation, and selection. Several optimization parameters require tuning; these parameters are put together under the common name of control parameters. In fact, there are only three real control parameters of the algorithm, namely the differentiation (or mutation) constant F, the crossover constant CR, and the size of the population NP. The rest of the parameters are the dimension of the problem D, which scales the difficulty of the optimization task; the maximum number of generations (iterations) GEN, which may serve as a stopping condition; and the boundary constraints imposed on the variables, which limit the feasible area.

Regarding DE variants, Price et al. present a notation to identify the different ways to generate new vectors in DE. The most popular of them is called DE/rand/1/bin, where the first term means differential evolution, the second term indicates how the base vector is chosen (in this case, a vector chosen at random), and the number in the third term expresses how many vector differences contribute to the differential mutation (one pair in this case).
Finally, the fourth term shows the type of crossover utilized (bin, from binomial, in this variant).

The DE algorithm can be outlined shortly as the following sequence of steps:

Step 1: Set up the values of the parameters of the method, such as the crossover rate (CR), scaling factor (SF), and mutation type (one out of the 5 types available), and then randomly generate the "NP" population in the search space. Each variable (or vector) in the n-dimensional search space is denoted by D_i(t) = [x_1(t), x_2(t), ..., x_n(t)], and the population NP(t) = {D_1(t), D_2(t), ..., D_s(t)} is composed of the elements D(t). Evaluate each individual using the objective function.

Step 2: Perform mutation to generate a mutant vector D_mutant(t + 1). For the target vector, randomly choose distinct vectors D_a(t), D_b(t), D_c(t), etc. (a, b, c ∈ {1, 2, ..., NP}). There are five types of mutation:

DE/Rand/1/β:  D_mutant(t + 1) = D_c(t) + β(D_a(t) − D_b(t))  (41)
DE/Best/1/β:  D_mutant(t + 1) = D_best(t) + β(D_a(t) − D_b(t))  (42)
DE/Rand/2/β:  D_mutant(t + 1) = D_e(t) + β(D_a(t) + D_b(t) − D_c(t) − D_d(t))  (43)
DE/Best/2/β:  D_mutant(t + 1) = D_best(t) + β(D_a(t) + D_b(t) − D_c(t) − D_d(t))  (44)
DE/RandToBest/1:  D_mutant(t + 1) = D_c(t) + β(D_best(t) − D_c(t)) + β(D_a(t) − D_b(t)).  (45)

Produce a mutant vector using one of the mutation methods shown above. Generally, DE/Rand/1/β is used.

Step 3: Perform crossover to obtain a trial vector for each target vector using its mutant vector, according to the following equation:

D_j,trial(t + 1) = D_j,mutant(t + 1) if (rand < CR) or j = j_rand_index; D_j,target(t) otherwise  (46)

where j = 1, 2, ..., n; rand ∈ [0, 1] is a random number drawn from a uniform distribution over [0, 1]; CR ∈ [0, 1] is the crossover rate; and j_rand_index ∈ {1, 2, ..., n} is a randomly selected index.

Step 4: Evaluate the trial vectors D_trial(t + 1).
If a trial vector comes with a better fitness than that of the corresponding individual in the current generation, it is transferred to the next generation.

Step 5: Repeat Steps 2–4 until the termination criterion has been satisfied.

In this study, the essential design parameters are optimized by the DE/rand/1/bin method.

4.4.1. Initialization

As with all evolutionary optimization algorithms, DE works with a population of solutions rather than a single solution. The population P at generation G contains NP solution vectors, called individuals of the population, where each vector represents a potential solution for the optimization problem:

P^(G) = X_i^(G) = x_j,i^(G),  i = 1, ..., NP;  j = 1, ..., D;  G = 1, ..., G_max.  (47)

In order to form a starting point for optimum seeking, the population must be initialized. Often there is no specific knowledge available about the location of a global optimum; typically, we might only have knowledge about the boundaries of the problem's variables. In this case, a natural way to initialize the population P^(0) (the initial population) is to seed it with random values within the given boundary constraints:

P^(0) = x_j,i^(0) = x_j^(L) + rand_j[0, 1] × (x_j^(U) − x_j^(L)),  ∀i ∈ [1, NP]; ∀j ∈ [1, D],  (48)

where rand_j[0, 1] represents a uniformly distributed random variable assuming values in [0, 1].

4.4.2. Mutation

The self-referential population recombination scheme of DE is different from that of the other evolutionary algorithms.
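The random initialization of (48) can be sketched in a few lines of Python (an illustrative fragment, not the authors' implementation; the parameter bounds shown are placeholders chosen only for the example):

```python
import numpy as np

def init_population(np_size, bounds, rng):
    """Seed NP individuals uniformly at random inside the box constraints,
    following x_j = x_j^(L) + rand[0,1] * (x_j^(U) - x_j^(L))."""
    low, high = bounds[:, 0], bounds[:, 1]
    return low + rng.random((np_size, low.size)) * (high - low)

# Placeholder bounds for three hypothetical design variables:
bounds = np.array([[0.0, 1.0],   # e.g. a learning rate
                   [0.0, 1.0],   # e.g. a momentum coefficient
                   [1.1, 3.0]])  # e.g. a fuzzification coefficient
rng = np.random.default_rng(0)
pop = init_population(20, bounds, rng)
assert pop.shape == (20, 3)
assert (pop >= bounds[:, 0]).all() and (pop <= bounds[:, 1]).all()
```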
From the first generation onward, the population of the subsequent generation P^(G+1) is obtained on the basis of the current population P^(G). First, a temporary (trial) population of candidate vectors for the subsequent generation, P'^(G+1) = V^(G+1) = v_j,i^(G+1), is generated as follows:

v_j,i^(G+1) = x_j,r3^(G) + F × (x_j,r1^(G) − x_j,r2^(G)),  if rand_j[0, 1] < CR ∨ j = k;  x_j,i^(G), otherwise  (49)

where i ∈ [1, NP]; j ∈ [1, D]; r1, r2, r3 ∈ [1, NP] are randomly selected, except that r1 ≠ r2 ≠ r3 ≠ i; k = int(rand_i[0, 1] × D) + 1; and CR ∈ [0, 1], F ∈ (0, 1].

The three randomly chosen indexes r1, r2, and r3 refer to three randomly chosen vectors of the population. They are different from each other and also different from the running index i. New random values for r1, r2, and r3 are assigned for each value of the index i (for each vector). A new value for the random number rand[0, 1] is assigned for each value of the index j (for each vector parameter).

4.4.3. Crossover

The index k refers to a randomly chosen vector parameter and is used to ensure that at least one vector parameter of each individual trial vector V^(G+1) differs from its counterpart in the previous generation X^(G). A new random integer value is assigned to k for each value of the index i (prior to the construction of each trial vector). F and CR are control parameters of DE; both values, as well as the third control parameter NP (population size), remain constant during the search process. F is a real-valued factor in the range [0.0, 1.0] that controls the amplification of differential variations. CR is a real-valued crossover factor in the range [0.0, 1.0] that controls the probability that a trial vector parameter will be selected from the randomly chosen, mutated vector v_j,i^(G+1) instead of from the current vector x_j,i^(G). Generally, both F and CR affect the convergence rate and the robustness of the search process.
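The trial-vector construction of (49), which combines DE/rand/1 mutation with binomial crossover, might be sketched as follows (an illustrative fragment under the assumption of a NumPy array population; not the authors' code):

```python
import numpy as np

def trial_vector(pop, i, F, CR, rng):
    """DE/rand/1/bin candidate for target index i: mix the mutant
    x_r3 + F*(x_r1 - x_r2) with the target component-wise, while one
    forced component k is guaranteed to come from the mutant."""
    NP, D = pop.shape
    r1, r2, r3 = rng.choice([j for j in range(NP) if j != i],
                            size=3, replace=False)
    mutant = pop[r3] + F * (pop[r1] - pop[r2])
    mask = rng.random(D) < CR          # components taken from the mutant
    mask[rng.integers(D)] = True       # forced index k
    return np.where(mask, mutant, pop[i])

rng = np.random.default_rng(1)
pop = rng.random((10, 4))
v = trial_vector(pop, 0, F=0.5, CR=0.9, rng=rng)
assert v.shape == (4,)
```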
Their optimal values depend both on the characteristics of the objective function and on the population size NP. Usually, suitable values for F, CR and NP can be found through experimentation after a number of tests using different values. Practical guidelines on how to select the control parameters NP, F and CR can be found in Storn (1997).

4.4.4. Selection

The selection scheme of DE differs from the one encountered in other evolutionary algorithms. On the basis of the current population P^(G) and the temporary population P'^(G+1), the population of the next generation P^(G+1) is formed as follows:

X_i^(G+1) = V_i^(G+1), if ℑ(V_i^(G+1)) ≤ ℑ(X_i^(G));  X_i^(G), otherwise.  (50)

Thus each individual of the temporary (trial) population is compared with its counterpart in the current population. The one with the lower value of the cost function ℑ(X) to be minimized propagates to the population of the next generation. As a result, all the individuals of the next generation are at least as good as their counterparts in the current generation. An interesting point concerning the DE selection scheme is that a trial vector is only compared to one individual vector, not to all the vectors in the current population.

Fig. 11. A content of the vector of parameters.

4.4.5. Boundary constraints

It is important to notice that the recombination operation of DE is able to extend the search outside the initialized range of the search space. This is sometimes a beneficial property in problems with no boundary constraints, because it makes it possible to find an optimum located outside the initialized range.
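The greedy one-to-one selection of (50) can be sketched as follows; `sphere` is a stand-in cost function used only for this illustration, not the paper's objective:

```python
import numpy as np

def select(pop, trials, cost):
    """Greedy one-to-one selection: a trial replaces its counterpart
    only if its cost is lower or equal (the cost is minimized)."""
    new_pop = pop.copy()
    for i in range(len(pop)):
        if cost(trials[i]) <= cost(pop[i]):
            new_pop[i] = trials[i]
    return new_pop

sphere = lambda x: float(np.sum(x * x))   # placeholder objective
pop = np.array([[2.0, 2.0], [0.1, 0.1]])
trials = np.array([[1.0, 1.0], [3.0, 3.0]])
out = select(pop, trials, sphere)
assert (out[0] == [1.0, 1.0]).all()  # better trial accepted
assert (out[1] == [0.1, 0.1]).all()  # worse trial rejected
```

Note that each trial vector is compared only to its own counterpart, which is exactly the property emphasized in the text.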
However, in boundary-constrained problems it is essential to ensure that the parameter values lie inside their allowed ranges after recombination. A simple way to guarantee this is to replace any parameter value that violates the boundary constraints with a random value generated within the feasible range:

u_j,i^(G+1) = x_j^(L) + rand_j[0, 1] × (x_j^(U) − x_j^(L)),  if u_j,i^(G+1) < x_j^(L) ∨ u_j,i^(G+1) > x_j^(U);  u_j,i^(G+1), otherwise  (51)

where i ∈ [1, NP]; j ∈ [1, D]. This is the method used in this work. Another simple but less efficient method is to regenerate the boundary-violating values according to relationship (51) as many times as necessary to satisfy the boundary constraints. Yet another simple method, which allows the bounds to be approached asymptotically while minimizing the disruption that results from resetting out-of-bound values, is expressed as follows:

u_j,i^(G+1) = (x_j,i^(G) + x_j^(L))/2, if u_j,i^(G+1) < x_j^(L);  (x_j,i^(G) + x_j^(U))/2, if u_j,i^(G+1) > x_j^(U);  u_j,i^(G+1), otherwise.  (52)

Through the optimization of parameters such as the learning rate, momentum coefficient, and fuzzification coefficient by using DE, the P-RBF NNs structure exhibits better convergence properties in the generation process of the networks from the viewpoint of performance. Fig. 11 shows the content of the vectors used in the optimization of the P-RBF NNs. Individual particles of the P-RBF NNs include entries that represent the optimized learning rate, momentum, and fuzzification coefficient. The learning rate and momentum entries of the vectors are applied to optimize the connections (weights). The fuzzification coefficient changes the shape of the membership functions produced by the fuzzy C-means clustering; the value of the membership function depends on the center point and on the fuzzification coefficient to be adjusted.

Fig. 13. Face contour extraction.

5. Application to the design of the face recognition system

The proposed face recognition system is designed by using the P-RBF NNs-based pattern recognition scheme.
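Before moving on, the random-reinitialization boundary repair of (51), used in the DE loop of Section 4.4, can be sketched as follows (an illustrative fragment with placeholder bounds, not the authors' implementation):

```python
import numpy as np

def repair_bounds(u, bounds, rng):
    """Re-sample any component that left its allowed range, as in the
    random-reinitialization repair of (51)."""
    low, high = bounds[:, 0], bounds[:, 1]
    violated = (u < low) | (u > high)
    fresh = low + rng.random(u.size) * (high - low)
    return np.where(violated, fresh, u)

rng = np.random.default_rng(3)
bounds = np.array([[0.0, 1.0], [1.1, 3.0]])   # placeholder ranges
u = np.array([1.5, 0.2])                      # both components out of range
r = repair_bounds(u, bounds, rng)
assert (r >= bounds[:, 0]).all() and (r <= bounds[:, 1]).all()
```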
Histogram equalization, AdaBoost, ASM and PCA are used for data preprocessing (as discussed in Section 3). First, the image obtained from the camera is converted to gray scale and its dimensionality is reduced by using the PCA (Section 3.4). The PCA-reduced data are then used as the inputs for the P-RBF NNs-based classification scheme. Fig. 12 shows the flow of processing realized by the overall classification system.

Fig. 12. An overall processing scheme of the classification system.

Color images of size 640 × 480 are converted to gray images. The distorted images are improved through histogram equalization. We extract images including the face area as squares, each of size N × N. After the completion of the extraction phase, a personal profile consists of the extracted face contour and the shape obtained by ASM. Finally, face images are stored in the JPG format, where the images are of size 200 × 200 pixels. The face area is extracted using the ASM, which helps remove disturbances and reflects the morphological characteristics of individual faces. Fig. 13 visualizes the process of extracting the face contour from images using the ASM. A new face shape is formed by finding edges in the vertical direction on the boundary of the face shape. The process of face contour extraction is dependent on the size and position of the face shape: with an incorrect initial position of the shape, finding a face takes a long time and may also lead to incorrect results.

Fig. 14. Example of the facial image dataset used in the experiment.

The dataset of faces obtained in this way consists of 350 images, with 10 images per person (35 persons in total). Fig. 14 provides an example of the extracted facial image dataset. The weights of the new facial database are calculated by running the PCA algorithm. Fig. 15 presents the average face of 15 persons obtained in this way.

Fig. 15. Average face obtained after running the PCA.

Fig.
16 shows the reconstructed eigenfaces in the face space, obtained by extracting the eigenvectors corresponding to the largest eigenvalues among the overall 350 eigenvectors.

The PCA weights associated with each candidate image are used to form the discrimination functions associated with the RBF NNs. The weights of the candidate image are obtained in real time and, as shown in Fig. 17, are used as the inputs of the discriminant function. Fig. 17 shows the discrimination process for a recognition candidate based on the real output. As shown, we extracted 5-dimensional features by PCA, i.e., 5 eigenvectors, as shown in Fig. 16. Fig. 17 displays the polynomial functions of the conclusion part connected to the recognized candidates (A–D).

In this study, we carry out two suites of experiments, as described below:

(a) Case 1: Carry out AdaBoost and histogram equalization without ASM on real-time images.
(b) Case 2: Carry out AdaBoost and histogram equalization with ASM on real-time images.

In this paper, we experiment with the Yale and IC&CI Lab datasets.

5.1. The Yale dataset

The Yale face dataset consists of 165 grayscale images of 15 individuals represented in the GIF format. There are 11 images per person, one per facial expression or configuration: center-light, w/glasses, happy, left-light, w/no glasses, normal, right-light, sad, sleepy, surprised, and wink. Because of the extreme changes in facial expression, the images showing surprised expressions are not included in the experiments. As a result, we arrive at a dataset comprising 150 images of 15 persons, with 10 images per person.

Table 1 summarizes the values of the parameters of the proposed classifier along with the parameters of the design environment. In each experiment we split the data into 50%–30%–20% training, validation, and testing subsets; namely, 80% (the 50%–30% training and validation sets) of the whole set of patterns is selected randomly for training, and the remaining patterns are used for testing purposes.

Fig. 16.
Eigenfaces formed in the face space.

Fig. 17. Weights of the PCA used in the discrimination process.

Table 1
Values of the parameters used in the experiments.

RBF NNs
  Data split: Training : Validation : Testing = 50% : 30% : 20%
  Optimization algorithm: DE
  Number of evaluations of the objective function (generations × swarms): 1000 (20 × 50)
Search space
  Number of rules: 2, 3, 4, 5
  Polynomial type: linear and reduced quadratic
  Fuzzification coefficient: [1.1, 3.0]

Table 2
Classification rate obtained for the Yale dataset (without obstacle factors). The best results are shown in boldface.

Number of rules | Case 1 (without ASM)            | Case 2 (with ASM)
                | L-RBF NNs      RQ-RBF NNs       | L-RBF NNs      RQ-RBF NNs
2               | 86.00% ± 4.94  73.33% ± 4.35    | 82.00% ± 5.58  73.33% ± 5.84
3               | 86.00% ± 5.48  64.67% ± 12.61   | 86.67% ± 5.27  52.66% ± 5.96
4               | 88.00% ± 2.98  66.00% ± 8.63    | 83.33% ± 4.08  55.33% ± 6.06
5               | 86.00% ± 3.65  66.66% ± 10.54   | 79.33% ± 4.94  50.66% ± 9.83

Table 3
Results for the IC&CI Lab. data (without obstacle factors). The best results are shown in boldface.

Number of rules | Case 1 (without ASM)            | Case 2 (with ASM)
                | L-RBF NNs      RQ-RBF NNs       | L-RBF NNs      RQ-RBF NNs
2               | 94.88% ± 1.19  93.33% ± 3.45    | 92.92% ± 2.98  94.67% ± 2.44
3               | 93.56% ± 2.35  92.78% ± 1.51    | 95.73% ± 1.97  97.56% ± 3.94
4               | 95.14% ± 2.13  95.67% ± 1.71    | 94.21% ± 1.34  97.22% ± 2.45
5               | 95.45% ± 4.34  94.64% ± 2.70    | 96.45% ± 2.61  96.12% ± 2.16

There was no overlap between the training, validation, and testing sets (Lewrence, Giles, Tsoi, & Back, 1997). For each combination of the parameters, the experiment was repeated five times. The results are reported by presenting the average and standard deviation of the classification rate obtained over these five repetitions of the experiment. The number of rules, the polynomial type, and the fuzzification coefficient are optimized by the DE. The experimental results, expressed in terms of the classification rate and its standard deviation, are reported in Table 2. In
InTables 2–5, the abbreviations L-RBF NNs and RQ-RBF NNS referto polynomial types of each RBF NNs such as linear and reducedquadratic type.When the polynomial type is linear and the number of rules isset to 4, the recognition rate is higher than 85% in Case 1 (withoutASM). When the polynomial type is linear and the number of rulesis 3, we obtain a similar recognition rate in Case 2 (with ASM).Notably, the recognition rate in Case 1 is slightly higher than theone obtained for Case 2.5.2. IC&CI Laboratory datasetIC&CI (Intelligent Control & Computational Intelligence) facedatabase contains a set of face images taken by students in theUniversity of Suwon. There are 10 different images of each of 35distinct subjects. Each image was digitized and presented in the200×200 pixel array with gray levels. Images feature frontal viewfaces with different facial expressions, and occlusions. The valuesof the parameters of the experimental setup are the same as usedin the experiments with the Yale data.The obtained results are included in Table 3.Next, the performance of the classifier face recognition isreported in presence of various obstacle factors.In this study, the classifier is trained by images used in theprevious experiment (Case 1). The classification rate is evaluated byusing test images that are affected by obstacles. The experimentsare completed under the same conditions as set up in the previousexperiments.S.-H. Yoo et al. / Neural Networks 69 (2015) 111–125 123Table 4Results for the IC&CI Lab. dataset (in case of wearing a cap). The best results shown in boldface.Number of rules Case 1 (without ASM) Case 2 (with ASM)L-RBF NNs Q-RBF NNs L-RBF NNs Q-RBF NNs2 53.18% ± 4.09 50.75% ± 3.98 55.51% ± 3.55 64.19% ± 5.383 56.16% ± 3.78 53.16% ± 4.34 63.75% ± 4.47 74.72% ± 6.344 61.02% ± 5.46 59.57% ± 4.68 68.07% ± 6.61 73.14% ± 5.695 62.97% ± 4.51 65.55% ± 5.97 66.22% ± 5.34 70.25% ± 4.56Table 5Results for the IC&CI Lab. 
dataset (in case of using a mobile phone).

Number of rules | Case 1 (without ASM)            | Case 2 (with ASM)
                | L-RBF NNs      RQ-RBF NNs       | L-RBF NNs      RQ-RBF NNs
2               | 63.67% ± 2.09  67.02% ± 3.57    | 72.23% ± 3.71  76.83% ± 3.90
3               | 68.92% ± 3.60  70.21% ± 3.85    | 74.31% ± 2.61  80.40% ± 5.42
4               | 67.15% ± 4.86  70.32% ± 2.88    | 75.07% ± 4.77  80.44% ± 4.37
5               | 71.61% ± 3.46  63.22% ± 4.16    | 77.88% ± 4.75  76.23% ± 3.26

Fig. 18. A real-time acquired testing image for recognition (wearing a cap).
Fig. 19. A real-time acquired testing image for recognition (in case of using a mobile phone).

– In case of wearing a cap (see Fig. 18): Table 4 shows the experimental results obtained for the two cases. When wearing a cap, a noticeable decrease of overall performance occurs in comparison with the results produced in the previous experiment (without obstacles). The recognition rate of Case 2 is better than the one reported for Case 1 because the effective facial features are used: the unnecessary image parts have been removed with the use of the ASM.

– In case of using a mobile phone (see Fig. 19): The corresponding results are reported in Table 5. In this case, the recognition rate for Case 2 is better than for Case 1 because the unnecessary image parts have been removed with the aid of the ASM.

5.3. Comparative analysis—PCA and RBF NNs

We now compare the recognition performance of PCA and RBF NNs in order to evaluate the performance of the proposed model. The face recognition experiment using PCA proceeds as follows. First, the data are divided into training data (50%), validation data (30%), and testing data (20%), under exactly the same conditions as for the RBF NNs. The database consists of the PCA weights of the training images. Next, we calculate the Euclidean distance between the test image weights and the database image (Yale and IC&CI Lab. image) weights. Then the final recognized candidate is decided as the candidate with the smallest error.
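The PCA-plus-Euclidean-distance matching just described can be sketched as follows. This is an illustrative fragment: the SVD-based eigenface computation, the toy array sizes, and the helper names are assumptions for the example, not the authors' code:

```python
import numpy as np

def pca_weights(train_flat, n_components):
    """Project mean-centered images onto the top principal directions
    (eigenfaces) obtained via SVD; rows of train_flat are flattened images."""
    mean = train_flat.mean(axis=0)
    centered = train_flat - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                 # the eigenfaces
    return centered @ basis.T, mean, basis

def nearest_candidate(test_img, db_weights, mean, basis):
    """Index of the database entry with the smallest Euclidean
    distance in the PCA weight space."""
    w = (test_img - mean) @ basis.T
    return int(np.argmin(np.linalg.norm(db_weights - w, axis=1)))

rng = np.random.default_rng(4)
faces = rng.random((20, 64))                  # 20 toy "images", 64 pixels each
W, mean, basis = pca_weights(faces, n_components=5)
probe = faces[7] + 0.01 * rng.random(64)      # noisy copy of face 7
assert nearest_candidate(probe, W, mean, basis) == 7
```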
Finally, the recognition performance is obtained by using the test images and the determined database images. Recognition performance is reported as a recognition rate (%) and as the number of false recognitions (false recognitions/test data). A summary of the final classification results is presented in Fig. 20.

When we used the non-obstacle (general) images, the recognition rate of Case 1 (without ASM) was slightly better than that of Case 2 (with ASM). When obstacles are involved, the recognition rate of Case 2 is higher than the one reported for Case 1 because the effective facial features are used: the unnecessary image parts are removed by using ASM. Also, the recognition rate of the RBF NNs is better than the one reported for the PCA.

6. Conclusions

In this study, the proposed face recognition system comprises two main functional modules. In the preprocessing part, two-dimensional gray face images are obtained by using AdaBoost, and then histogram equalization is used to improve the quality of the image. The personal profile consists of the face contour and shape extracted by using the ASM. The features were extracted by the PCA algorithm. In the classifier part, we proposed the optimized RBF NNs for the problem of face recognition. In the recognition part, the proposed P-RBF NNs exhibit some unique and useful characteristics. The P-RBF NNs involve a partition module formed by FCM clustering and used here as the activation function of the neurons located in the hidden layer. The P-RBF NNs are expressed as a collection of "if-then" fuzzy rules.

Fig. 20. Comparison of face recognition results—shown are selected faces.

The image data obtained from the CCD camera are used in preprocessing procedures including image stabilization, face detection and feature extraction. The preprocessed image data are processed with the aid of the RBF NNs. We have reported the recognition rates of the ASM-processed images, generic images, and facial images containing some obstacles.
In the presence of obstacles, the recognition rate of Case 2 is better than the one reported for Case 1 because the effective facial features are used. This improvement can be attributed to the fact that the unnecessary parts of the image have been removed with the use of the ASM.

Acknowledgment

This work was supported by the GRRC program of Gyeonggi province [GRRC Suwon 2015-B2, Center for U-city Security & Surveillance Technology].

References

Balasubramanian, M., Palanivel, S., & Ramalingam, V. (2009). Real time face and mouth recognition using radial basis function neural networks. Expert Systems with Applications, 36, 6879–6888.
Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press.
Boehnen, C., & Russ, T. (2005). A fast multi-modal approach to facial feature detection. In Proceedings of the seventh IEEE workshop on applications of computer vision, WACV/MOTION'05 (pp. 135–142).
Chellappa, R., Wilson, C. L., & Sirohey, S. (1995). Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83(5), 704–740.
Colbry, D., Stockman, G., & Jain, A. (2005). Detection of anchor points for 3D face verification. In Proc. of A3DISS, San Diego, CA.
Colombo, A., Cusano, C., & Schettini, R. (2005). 3D face detection using curvature analysis. In 4th int'l symposium on image and signal processing and analysis, ISPA.
Connolly, J.-F., Granger, E., & Sabourin, R. (2012). Evolution of heterogeneous ensembles through dynamic particle swarm optimization for video-based face recognition. Pattern Recognition, 45(7), 2460–2477.
Cootes, T., Cooper, D., Taylor, C., & Graham, J. (1995). Active shape models—their training and application. Computer Vision and Image Understanding, 61(1), 38–59.
Cootes, T. F., Taylor, C. J., Lanitis, A., Cooper, D. H., & Graham, J. (1993). Building and using flexible models incorporating grey-level information. In Fourth internat. conf. computer vision (pp. 242–246).
Dervis, K., & Selcuk, O. (2004).
A simple and global optimization algorithm for engineering problems: Differential evolution algorithm. Turkish Journal of Electrical Engineering, 12, 53–60.
Freund, Y., & Schapire, R. E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting. In European conference on computational learning theory (pp. 23–37).
Gao, Y., Pan, J., Ji, G., & Yang, Z. (2012). A novel two-level nearest neighbor classification algorithm using an adaptive distance metric. Knowledge-Based Systems, 26, 103–110.
Gonzalez, R. C., & Woods, R. E. (2002). Digital image processing. New Jersey: Prentice Hall.
Han, H. G., Chen, Q. L., & Qiao, J. F. (2011). An efficient self-organizing RBF neural network for water quality prediction. Neural Networks, 24, 717–725.
Huang, Y.-S., Hzu, T.-C., & Cheng, F.-H. (2010). Facial landmark detection by combining object detection and active shape model. In Electronic commerce and security, ISECS (pp. 381–386).
Lewrence, S., Giles, C. L., Tsoi, A. C., & Back, A. D. (1997). Face recognition: A convolutional neural network approach. IEEE Transactions on Neural Networks, 8(1), 98–113.
Lienhart, R., & Maydt, J. (2002). An extended set of Haar-like features for rapid object detection. In IEEE conference on image processing, ICIP, New York (1) (pp. 900–903).
Lopez-Molina, C., Baets, B. D., Bustince, H., Sanz, J., & Barrenechea, E. (2013). Multiscale edge detection based on Gaussian smoothing and edge tracking. Knowledge-Based Systems, 50, 101–111.
Lu, X., & Jain, A. K. (2005). Multimodal facial feature extraction for automatic 3D face recognition. Technical report MSU-CSE-05-22, Computer Science and Engineering, Michigan State University.
Mohammed, A. A., Minhas, R., Jonathan Wu, Q. M., & Sid-Ahmed, M. A. (2011). Human face recognition based on multidimensional PCA and extreme learning machine. Pattern Recognition, 44(10–11), 2588–2597.
Oh, S.-K., Kim, W.-D., Pedrycz, W., & Park, H.-S. (2014).
Fuzzy radial basis function neural networks with information granulation and its parallel genetic optimization. Fuzzy Sets and Systems, 237, 96–117.
Oh, S.-K., Pedrycz, W., & Park, B.-J. (2004). Self-organizing neurofuzzy networks in modeling software data. Fuzzy Sets and Systems, 145, 165–181.
Park, B.-J., Oh, S.-K., & Kim, H.-K. (2008). Design of polynomial neural network classifier for pattern classification with two classes. Journal of Electrical Engineering & Technology, 3(1), 108–114.
Roh, S.-B., Oh, S.-K., & Pedrycz, W. (2010). A fuzzy ensemble of parallel polynomial neural networks with information granules formed by fuzzy clustering. Knowledge-Based Systems, 23(3), 202–219.
Rowley, H. A., Baluja, S., & Kanade, T. (1998). Neural network-based face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1), 23–38.
Slavkovic, M., & Jevtic, D. (2012). Face recognition using eigenface approach. Serbian Journal of Electrical Engineering, 9(6), 121–130.
Staiano, A., Tagliaferri, R., & Pedrycz, W. (2006). Improving RBF networks performance in regression tasks by means of a supervised fuzzy clustering. Neurocomputing, 69, 1570–1581.
Storn, R. (1997). Differential evolution, a simple and efficient heuristic strategy for global optimization over continuous spaces. Journal of Global Optimization, 11, 341–359.
Sung, K.-K., & Poggio, T. (1998). Example-based learning for view-based human face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1), 39–51.
Tsekourasa, G., Sarimveisb, H., Kavaklia, E., & Bafasb, G. (2005). A hierarchical fuzzy clustering approach to fuzzy modeling. Fuzzy Sets and Systems, 150(2), 245–266.
Viola, P., & Jones, M. J. (2001). Rapid object detection using a boosted cascade of simple features. In IEEE conference on computer vision and pattern recognition, CVPR, Hawaii (1) (pp. 511–518).
Viola, P., & Jones, M. J. (2004). Robust real-time object detection.
International Journal of Computer Vision, 57(2), 137–154.
Wang, Qihui, Xie, Lijun, Zhu, Bo, Yang, Tingjun, & Zheng, Yao (2013). Facial feature extraction based on active shape model. Journal of Multimedia, 8(6), 747–754.
Yang, M.-H., Kriegman, D., & Ahuja, N. (2002). Detecting faces in images: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(1), 34–58.
Yong, L., Zhang, C. S., Xiaoguang, L., & Zhang, D. (2000). Face contour extraction with active shape models embedded knowledge. In Internat. conf. signal processing, 2 (pp. 1347–1350).
Zhao, W., Chellappa, R., & Phillips, P. (2003). Face recognition: A literature survey. ACM Computing Surveys, 35(4), 399–458.
Optik 126 (2015) 3483–3487

A new pose invariant face recognition system using PCA and ANFIS

Reecha Sharma, M.S. Patterh
Department of Electronics and Communication Engineering, Punjabi University, Patiala 147002, Punjab, India

Article history: Received 4 September 2014; Accepted 27 August 2015

Keywords: Principle component analysis (PCA); Face recognition; ANFIS; Score value

Abstract: In this paper, an efficient pose invariant face recognition system using PCA and ANFIS (PCA–ANFIS) has been proposed. The features of an image under test are extracted using PCA, and then the neuro-fuzzy system ANFIS is used for recognition. The proposed system recognizes face images under a variety of pose conditions by using ANFIS. The training face image dataset is processed by the PCA technique to compute the score values, which are then utilized in the recognition process. The proposed face recognition technique with the neuro-fuzzy system recognizes the input face images with a high recognition ratio. The proposed approach is implemented on the MATLAB platform and is evaluated by employing a variety of database images under various pose-variant conditions. © 2015 Elsevier GmbH. All rights reserved.

1. Introduction

Face recognition is the task of identifying or verifying one or more persons in given still or video images of a scene using a stored database of faces [1]. Face recognition methods can be classified into two categories: geometric feature-based and appearance-based [4]. The geometric feature-based methods, such as elastic bunch graph matching [5] and the active appearance model [6], make use of geometrical parameters that measure the facial parts, whereas the appearance-based methods use intensity or intensity-derived parameters [1]. A face recognition system consists of two stages: face detection and face identification [2].
In the face detection stage, facial images are localized in an input image. In the face identification stage, the localized faces are identified as individuals registered in the system. Therefore, developing both face detection algorithms and face identification algorithms is quite important [11].

The variations involved in face recognition include illumination, pose, identity [3], facial expression, hair style, aging, make-up, and scale. It is very difficult even for humans to recognize faces correctly when the illumination varies severely, since the same person can appear very different [10]. A common solution to handling pose variations in face recognition is the view-based method. In this method, face images of the individuals to be recognized are acquired from different view angles [13]. The images of the same view are used to construct an eigenspace representation for each view, and the view-specific eigenspace representations are then used for recognizing a person in different poses [12]. However, the 2D image patterns of a 3D face object can change dramatically due to lighting and viewing variations [7]. Recently there has been growing interest in face recognition from sets of images. Here, rather than supplying a single query image, the user supplies a set of images of the same unknown individual. In general the gallery also contains a set of images for each known individual, so the system must recover the individual whose gallery set is the best match for the given query set [9].

* Corresponding author. Tel.: +91 9501895500. E-mail address: sharmareecha@gmail.com (R. Sharma).
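The view-based eigenspace idea described above can be sketched with PCA: training images of one view are flattened, a mean and the leading principal axes are computed, and a probe is matched to the gallery image whose projection is nearest. The array shapes, function names, and toy data below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_eigenspace(images, k=8):
    """Build a view-specific eigenspace from training images.
    images: (n_samples, h*w) array of flattened face images (assumed layout)."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]  # keep the top-k components

def project(image, mean, axes):
    """Score vector of one flattened image in the eigenspace."""
    return axes @ (image - mean)

# Toy usage: recognize a probe by its nearest gallery projection.
rng = np.random.default_rng(0)
gallery = rng.random((10, 64))               # 10 "images" of 64 pixels each
mean, axes = fit_eigenspace(gallery, k=4)
scores = np.array([project(im, mean, axes) for im in gallery])
probe = gallery[3] + 0.01 * rng.random(64)   # slightly perturbed copy of image 3
d = np.linalg.norm(scores - project(probe, mean, axes), axis=1)
print(int(np.argmin(d)))                     # index of the matched identity
```

In a multi-view system, one such eigenspace would be fitted per view angle, and a probe would be projected into each view-specific space before matching.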
Recently, face recognition using image sets or video sequences has attracted more and more attention within the computer vision and pattern recognition community. More importantly, compared with a single snapshot, a set or a sequence of images provides much more information about the variation in the appearance of the target subject [8].

The overall structure of the paper is organized as follows: Section 2 discusses the proposed face recognition system using PCA and ANFIS (PCA–ANFIS). Section 3 gives the experimental results and discussion. Section 4 concludes the paper.

2. The proposed face recognition system using PCA–ANFIS

For the proposed work, the face images are taken from the ORL database. These images are first denoised using the adaptive median filter before further processing. The denoised images are then used to calculate score values with the principal component analysis (PCA) technique. The score values so obtained from PCA are then used by the ANFIS classifier to carry out the training process. Based on a predefined threshold value, the image under test is indicated as recognized or not recognized.

http://dx.doi.org/10.1016/j.ijleo.2015.08.205

Fig. 1. Architecture of the proposed face recognition system.

The face database images are represented as

fd(r, s) = {fd1(r, s), fd2(r, s), ..., fdi(r, s)}; i = 1, 2, 3, ..., N,  (1)

where N is the total number of images in the database D. These face images from the database D are utilized in the recognition process. The basic structure of the proposed face recognition system is given in Fig. 1.

The proposed face recognition technique consists of three stages, namely:
(i) Preprocessing
• Adaptive median filter
(ii) Principal component analysis
• Score value calculation
(iii) Classification using ANFIS

2.1. Adaptive median filter

The adaptive median filter is applied to the images fd(r, s), which are affected by salt-and-pepper noise, to acquire a noise-free image as output. The process of adaptive median filtering for noise removal is given below:

Step 1: Initialize the window w of size wz.
Step 2: Check whether the center pixel pcen(r, s) within w is noisy. If the pixel pcen(r, s) is noisy, go to Step 3; otherwise slide the window to the next pixel and repeat Step 1.
Step 3: Sort all pixels within the window w in ascending order and find the minimum (pmin(r, s)), median (pmed(r, s)), and maximum (pmax(r, s)) values.
Step 4: Check whether pmed(r, s) is noisy, i.e., whether

pmin(r, s) < pmed(r, s) < pmax(r, s).  (2)

If the median value lies between the minimum and the maximum, the pixel is not noisy; go to Step 5. Otherwise pmed(r, s) is a noisy pixel; go to Step 6.
Step 5: Replace the corresponding center pixel in the output image with pmed(r, s) and go to Step 8.
Step 6: Check whether all other pixels are noisy. If yes, expand the window size by 2 and go to Step 3. Otherwise, go to Step 7.
Step 7: Replace the center pixel of the image with the noise-free pixel that is closest to the median pixel pmed(r, s).
Step 8: Reset the window size wz and move the center of the window to the next pixel.
Step 9: Repeat the steps until all pixels are processed.

Using the above adaptive median filter algorithm, the salt-and-pepper noise is removed. The denoised image is then passed to the next stage to calculate the score values using the PCA technique.

Fig. 2.
Flow chart of the principal component analysis.

Fig. 3. Architecture of ANFIS.

2.2. Score value calculation using principal component analysis

The denoised image fd acquired from the adaptive median filter is subjected to score value estimation using principal component analysis [14]. Fig. 2 shows the flow chart of PCA. In the last step of the flow chart, the score values p(x1), p(x2), ..., p(xn) obtained from the PCA process for different pose images are passed to the ANFIS-based classification process.

2.3. Classification using the ANFIS classifier

The score values p(x1), p(x2), ..., p(xn) obtained from the PCA are classified using the well-known ANFIS classifier, which comprises five layers of nodes. Of the five layers, the first and the fourth layers possess adaptive nodes, whereas the second, third and fifth layers possess fixed nodes. The architecture of the ANFIS is given in Fig. 3.

The learning process of ANFIS is carried out on the extracted PCA features, i.e., the eigenvectors. The rule base of the ANFIS is of the form:

If p(x1) is Ai, p(x2) is Bi and p(xn) is Ci, then
Rulesi = ai p(x1) + bi p(x2) + ci p(xn) + fi  (3)

where p(x1), p(x2), p(xn) are the inputs; Ai, Bi and Ci are the fuzzy sets; Rulesi is the output within the fuzzy region specified by the fuzzy rule; and ai, bi, ci and fi are the design parameters determined by the training process.

Layer 1: Every node i in this layer is a square node with a node function

O1,i = Ai(p(x1)), O1,i = Bi(p(x2)), O1,i = Ci(p(xn))  (4)

Usually Ai(p(x1)), Bi(p(x2)), Ci(p(xn)) are chosen to be bell-shaped, with maximum equal to 1 and minimum equal to 0, and are defined as

Ai(p(x1)) = Bi(p(x2)) = Ci(p(xn)) = 1 / (1 + ((x − oi)/pi)^2)^qi  (5)

where oi, pi, qi is the parameter set.
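The generalized bell membership function of Eq. (5) can be sketched as follows; the parameter values used in the example are arbitrary illustrations, not values from the paper.

```python
def bell_mf(x, o, p, q):
    """Generalized bell-shaped membership function of Eq. (5):
    equals 1 at the center x = o and decays symmetrically toward 0.
    o: center, p: width, q: steepness of the shoulders."""
    return 1.0 / (1.0 + ((x - o) / p) ** 2) ** q

# Membership is maximal at the center and falls to 0.5 one width away.
print(bell_mf(0.0, 0.0, 2.0, 1.0))             # 1.0 at the center
print(round(bell_mf(2.0, 0.0, 2.0, 1.0), 2))   # 0.5 at x = o + p
```

Training adjusts o, p, q per input so that each fuzzy set covers the region of score values where its rule should fire.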
These parameters in this layer are referred to as premise parameters.

Layer 2: Every node in this layer is a circle node labeled Π, which multiplies the incoming signals and sends the product out. For instance,

O2,i = wti = Ai(p(x1)) × Bi(p(x2)) × Ci(p(xn)), i = 1, 2  (6)

Each node output represents the firing strength of a rule.

Layer 3: Every node in this layer is a circle node labeled N. The ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths:

O3,i = w̄ti = wti / (wt1 + wt2), i = 1, 2  (7)

Layer 4: Every node i in this layer is a square node with a node function

O4,i = w̄ti · Rulesi, i = 1, 2  (8)

where w̄ti is the output of Layer 3 and ai, bi, ci, fi are the parameter set. Parameters in this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a circle node labeled Σ that computes the overall output as the summation of all incoming signals:

O5,i = Σi w̄ti Rulesi = (Σi wti Rulesi) / (Σi wti)  (9)

Z = (wt1 Rules1 + wt2 Rules2) / (wt1 + wt2)  (10)

Z = w̄t1 Rules1 + w̄t2 Rules2  (11)

Fig. 4. Sample dataset from the ORL database.

Fig. 5. Denoised images after adaptive median filtering.

Table 1. Image denoising performance (PSNR, in dB) of the adaptive median filter and the existing average and Gaussian filtering methods.

Image | Proposed adaptive median filter | Existing average filter | Existing Gaussian filter
1 | 38.64005 | 28.37 | 26.35
2 | 33.977 | 26.28 | 24.41
3 | 35.1861 | 26.81 | 26.02
4 | 34.54 | 26.19 | 25.52
5 | 33.96 | 26.68 | 25.08

Fig. 6.
Comparison of the adaptive median filtering technique with the existing average and Gaussian filtering methods.

Table 2. Performance comparison of the proposed PCA–ANFIS technique with the ICA–ANFIS and LDA–ANFIS techniques.

Measure | Proposed PCA–ANFIS | ICA–ANFIS | LDA–ANFIS
Accuracy | 0.9666 | 0.713 | 0.68
Sensitivity | 0.9729 | 0.728 | 0.6483
Specificity | 0.9605 | 0.712 | 0.7288

Table 3. Performance measures of the proposed PCA–ANFIS technique and the existing FFBNN technique in terms of accuracy, sensitivity, specificity and related measures.

Measure | Proposed PCA–ANFIS | Existing FFBNN
Accuracy | 0.9666 | 0.8666
Sensitivity | 0.9729 | 0.8481
Specificity | 0.9605 | 0.8873
FPR | 0.0394 | 0.1126
PPV | 0.96 | 0.8933
NPV | 0.9733 | 0.84
FDR | 0.04 | 0.106
MCC | 0.9334 | 0.7343

Fig. 7. Comparison of the proposed PCA–ANFIS technique with the existing FFBNN in terms of accuracy, sensitivity and specificity measures.

Then the predefined threshold value ω is compared with the output of the neural network (Z), as given in the following equation:

result = recognized, if Z ≥ ω; not recognized, if Z < ω  (12)

If the neural network output Z is greater than or equal to the threshold value ω, the given input image is recognized; if Z is less than ω, the image is not recognized. Thus the ANFIS is trained using the score values obtained from PCA. The performance of the trained ANFIS is tested by supplying a large number of different pose images.

3. Experimental results and discussion

The proposed PCA–ANFIS for different pose images is implemented using MATLAB (version 7.12) on a machine with the following configuration:
• Processor: Intel Core i7
• OS: Windows 7
• CPU speed: 3.20 GHz
• RAM: 4 GB

The performance of the proposed PCA–ANFIS technique for different pose images is evaluated on a large number of images taken from the ORL database. Fig. 4 shows some sample images taken from the database.

To remove the noise from the given input face images, the images are passed through the adaptive median filter; the denoised face images so obtained are shown in Fig. 5.

As can be seen from Table 1 and Fig. 6, the adaptive median filter with PCA achieves a higher denoising ratio than the other filtering methods. The adaptive median filter gives a high PSNR value for the different dataset images. For example, for Image 3 the PSNR of the proposed method is 35.1861 dB, whereas the existing average filter gives 26.81 dB and the Gaussian filter 26.02 dB.

Accordingly, the denoised images acquired from the adaptive median filter are used to compute the score values via the PCA-based calculation. The score values thus acquired from the principal component analysis are given as input to the ANFIS classifier. A large number of face images is used to analyze the performance of the proposed face recognition system using different statistical performance measures.

The face images from the ORL database are utilized to compare the performance of the proposed PCA–ANFIS technique with the ICA–ANFIS and LDA–ANFIS techniques. The comparison results of
The comparison results ofthe proposed technique, ICA–AFIS and LDA–AFIS techniques areshown in Table 2.In Table 2, the accuracy of the proposed PCA–ANFIS techniqueis 0.9666 but the ICA–ANFIS and LDA–ANFIS techniques have offeronly 0.713, 0.68 of accuracy. Similarly the sensitivity and speci-ficity of the proposed PCA–ANFIS technique is 0.9729 and 0.9605but the ICA–ANFIS and LDA–ANFIS techniques give 0.728, 0.6483of sensitivity and 0.712, 0.7288 of specificity, respectively. Hencefrom the table it can be seen that proposed method recognizesthe image more accurately. Moreover proposed PCA–ANFIS is alsocompared with the existing FFBNN technique in terms of sensitivity,specificity and accuracy measures. The results are shown inTable 3.From the table it can be seen that the proposed PCA–ANFIS hasgiven accuracy of 0.9666 but the existing FFBNN has given accuracyof only 0.8666. Similarly the sensitivity and the specificityof our proposed method are higher than the existing FFBNN. Thecomparison graph has been given in Fig. 7.From the graph it can also be seen that the performance of theproposed PCA–ANFIS is high when compared to the existing FFBNN.Thus from the performance metrics it can be seen thatthe proposedPCA–ANFIS efficiently recognize the images.4. ConclusionIn this paper a face recognition technique using PCA–ANFIS isproposed. First, the images under test are denoised by using adaptivemedian filter and its performance is compared with averagefilter and Gaussian filter. From the comparative result it has beenfound that adaptive median filter performs better as compared toAverage and Gaussian filter. PCA is used for feature extraction andANFIS is used for face recognition. The performance ofthe proposedsetup (PCA–ANFIS) is compared with ICA–ANFIS and LDA–ANFIS.From the comparative results it has been found that PCA–ANFISperforms better than ICA–ANFIS and LDA–ANFIS. 
For example theproposed PCA–ANFIS gives accuracy of 0.9666 as compared toICA–ANFIS which gives 0.713 and LDA–ANFIS which gives 0.68.Proposed PCA–ANFIS technique also performs better than FFBNN.It has been concluded that PCA–ANFIS set up can be used for facerecognition with better accuracy.References[1] Wu-Jun Li, Chong-Jun Wang, Dian-Xiang Xu, Shi-Fu Chen, Illumination invariantfacerecognition based on neural network ensemble, in: Proceedings of 16thIEEE International Conference on Tools with Artificial Intelligence, November2004, 2004, pp. 486–490.[2] Rehab F. Abdel-Kader, Rabab M. Ramadan, Rawya Y. Rizk, Rotation invariantface recognition based on hybrid LPT/DCT features, Int. J. Electr. Comput. Eng.3 (7) (2008) 488–493.[3] Shaohua Kevin Zhou, Rama Chellappa, Image-based face recognition under illuminationandposevariations,J. Opt. Soc.Am.A: Opt.Image Sci.Vis. 22 (2)(2005)217–229.[4] R. Brunelli, T. Poggio, Face recognition: features versus templates, IEEE Trans.Pattern Anal. Mach. Intell. 15 (10) (1993) 1042–1052.[5] Laurenz Wiskott, Jean-Marc Fellous, Norbert Kruger, Christoph von der Malsburg,Face recognition by elastic bunch graph matching, IEEE Trans. PatternAnal. Mach. Intell. 19 (1997) 775–779.[6] G.J. Edwards, T.F. Cootes, C.J. Taylor, Face recognition using active appearancemodels, in: Computer Vision—ECCV’98, Springer, Berlin, Heidelberg, 1998, pp.581–595.[7] WenYi Zhao, R. Chellappa, Image based face recognition issues and methods,in: Optical Engineering, Marcel Dekker Incorporated, 2002, pp. 375–402.[8] Xi Li, Kazuhiro Fukui, Nanning Zheng, Image-set based face recognition usingboosted global and local principal angles, Lect. Notes Comput. Sci. 
(LNCS) (2009).
[9] Hakan Cevikalp, Bill Triggs, Face recognition based on image sets, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, CA, United States, 2010.
[10] Seok Cheol Kee, Kyoung Mu Lee, Sang Uk Lee, Illumination invariant face recognition using photometric stereo, IEICE Trans. Inf. Syst. E83-D (7) (2000) 1466–1474.
[11] Yasufumi Suzuki, Tadashi Shibata, Illumination-invariant face identification using edge-based feature vectors in pseudo-2D hidden Markov models, in: Proceedings of the 14th European Signal Processing Conference, Florence, Italy, 2006.
[12] Hui-Fuang Ng, Pose-invariant face recognition security system, Asian J. Health Inf. Sci. 1 (1) (2006) 101–111.
[13] J. Shermina, Impact of locally linear regression and Fisher linear discriminant analysis in pose invariant face recognition, Int. J. Comput. Sci. Netw. Secur. 10 (10) (2010) 106–110.
[14] Syed Navaz, Dhevi Sri, Pratap Mazumder, Face recognition using principal component analysis and neural networks, Int. J. Comput. Networking 3 (1) (2013) 245–256.
