Classification and Feature Extraction in Man and Machine


Graf, AAB
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Department Human Perception, Cognition and Action, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;


Graf, A. (2004). Classification and Feature Extraction in Man and Machine. PhD Thesis, Eberhard-Karls-Universität Tübingen, Tübingen, Germany.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-F32D-8
This dissertation attempts to shed new light on the mechanisms used by human subjects to extract features from visual stimuli and to classify them subsequently. A methodology combining human psychophysics and machine learning is introduced: feature extractors are modeled using methods from unsupervised machine learning, whereas supervised machine learning is considered for classification. We consider a gender classification task using stimuli drawn from the Max Planck Institute face database. Once a feature extractor is chosen and the corresponding data representation is computed, the resulting feature vector is classified using a separating hyperplane (SH) between the classes. The behavioral responses of humans to a stimulus (in our study, the gender estimate and its corresponding reaction time and confidence rating) are compared and correlated with the distance of the stimulus's feature vector to the SH. It is successfully demonstrated that machine learning can be used as a novel method to "look into the human head" in an algorithmic way. In a first psychophysical classification experiment we note that a high classification error and a low confidence in humans are accompanied by longer processing of information by the brain. Furthermore, a second classification experiment on the same stimuli, but in a different presentation order, confirms the consistency and reproducibility of the subjects' responses. Using several classification algorithms from supervised machine learning, we show that separating hyperplanes (SHs) are a plausible model of how humans classify visual stimuli, since stimuli represented by features distant from the SH are classified more accurately, faster, and with higher confidence than those closer to the SH. A piecewise-linear extension, as in the K-means classifier, seems however less suited to model classification.
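The SH-distance analysis described above can be sketched as follows. This is a minimal illustration on synthetic feature vectors (stand-ins for the face stimuli, which are not reproduced here), using a linear SVM as one example of an SH classifier:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic "feature vectors" for two classes (stand-ins for face features)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 10)), rng.normal(1.0, 1.0, (50, 10))])
y = np.array([0] * 50 + [1] * 50)

# Fit a separating hyperplane (SH) with a linear SVM
clf = SVC(kernel="linear").fit(X, y)

# Signed Euclidean distance of each stimulus to the SH
dist = clf.decision_function(X) / np.linalg.norm(clf.coef_)

# The thesis correlates |distance| with accuracy, reaction time and
# confidence: stimuli far from the SH should be classified more reliably.
far = np.abs(dist) > np.median(np.abs(dist))
acc_far = (clf.predict(X[far]) == y[far]).mean()
acc_near = (clf.predict(X[~far]) == y[~far]).mean()
```

In the actual study, `dist` would be correlated with the subjects' behavioral measures rather than with the machine's own accuracy.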
Furthermore, the comparison of the classification algorithms indicates that the Support Vector Machine (SVM) and the Relevance Vector Machine (RVM), both exemplar-based classifiers, compare best to human classification performance and also exhibit the best man-machine correlations. The mean-of-class prototype learner, its popularity in neuroscience notwithstanding, is the least human-like classifier in all cases examined. These findings are corroborated by the stochastic nature of the human classification between the first and second classification experiments: elements close to the SH are subject to more jitter in the subjects' gender estimation than elements distant from the SH. The above classification studies also hint at the mechanisms responsible for computing the feature vector corresponding to a stimulus, in other words the feature extraction procedure, which is defined by the combination of a data type with a preprocessor. Gabor wavelet filters prove to be the best-suited preprocessor when considering the image pixel data type. The biological realism of both Gabor wavelets and the image data confirms the validity of our approach. Alternatively, the information contained in the data type defined by the combination of the texture and shape maps of each face (these maps bringing each face into spatial correspondence with a reference face) is also shown to be useful for describing the internal face representation of humans. Non-negative Matrix Factorization applied to the texture-and-shape data type is demonstrated to describe well the preprocessing of visual information in humans, and this has three implications. First, humans seem to use a basis of images to encode visual information, which may suggest that models such as kernel maps are less suitable, since they do not use a basis to decompose (visual) data. Second, this basis seems to be part-based, in contrast to Principal Component Analysis, which yields a holistic basis.
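The contrast between a part-based (NMF) and a holistic (PCA) basis can be illustrated on synthetic nonnegative data; the matrix sizes here are arbitrary stand-ins, not the thesis's texture-and-shape face maps:

```python
import numpy as np
from sklearn.decomposition import NMF, PCA

rng = np.random.default_rng(1)
# Nonnegative synthetic "images", one per row (stand-ins for face data)
V = rng.random((60, 64))

# NMF factorizes V ~ W @ H with W, H >= 0; the nonnegativity constraint
# forbids cancellation and encourages an additive, part-based basis H
nmf = NMF(n_components=5, init="random", random_state=1, max_iter=500)
W = nmf.fit_transform(V)   # encodings
H = nmf.components_        # basis images

# PCA, by contrast, allows signed components, yielding holistic bases
pca = PCA(n_components=5).fit(V)
```

Here `(H >= 0).all()` holds by construction, whereas the PCA components mix positive and negative entries, which is the sense in which PCA bases are holistic rather than part-based.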
Third, this part-based basis is spatially not too sparse, excluding Independent Component Analysis. Both for the encodings and for the basis, a medium degree of sparseness is shown to be most suitable. Alternative approaches to modeling the classification of visual stimuli by humans are subsequently introduced. To gain novel insights into the metric of the human internal representation of faces, the above data is analyzed using logistic regression interpolations between the subjects' mean class estimate for a stimulus and the distance of this stimulus to the SH of each classifier. We show that a representation based upon the subjects' gender estimates is most appropriate, while classification performance is demonstrated to be a poor measure when comparing man and machine. A novel psychophysical experiment is then designed where the hypotheses generated by machine learning are used to generate novel stimuli along a direction, the gender axis, orthogonal to the SH of each classifier. The study of the subjects' responses along these gender axes then allows us to infer the validity of the predictions given by machine learning. The results of this experiment (SVM and RVM are best, while the prototype classifier is worst) validate the models given by machine learning and close the "psychophysics-machine learning" loop. We finally show in a psychophysical experiment that it is more difficult to cast concepts from machine learning into a formalism describing the memory mechanisms of humans. However, machine learning is demonstrated to be an appropriate model for feature extraction and classification of visual stimuli in humans, given the particular task we chose.
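A minimal sketch of such a logistic-regression interpolation, with simulated subject estimates standing in for the real psychophysical data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Signed distances of stimuli to a classifier's SH (synthetic)
dist = rng.normal(0.0, 2.0, 200)

# Simulated binary gender estimates: noisier near the hyperplane,
# following a sigmoid of the distance (assumed true slope of 1.5)
p_class1 = 1.0 / (1.0 + np.exp(-1.5 * dist))
estimates = (rng.random(200) < p_class1).astype(int)

# The logistic regression interpolates between the class estimates and
# the SH distance; a steep, well-fit slope indicates that the SH
# direction captures the subjects' internal decision variable.
lr = LogisticRegression().fit(dist.reshape(-1, 1), estimates)
slope = lr.coef_[0, 0]
```

In the thesis this fit is computed per classifier, so that the quality of the interpolation can be compared across SVM, RVM, and the prototype learner.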