Multi-layer perceptron models for classification

Dunne, R.A. (2003) Multi-layer perceptron models for classification. PhD thesis, Murdoch University.

PDF - Whole Thesis
Available Upon Request

Abstract

This thesis concerns the Multi-layer Perceptron (MLP) model, one of a variety of neural network models that have come into wide prominence since the mid-1980s for the classification of individuals into pre-defined classes based on a vector of individual measurements.
Each discipline in which the MLP model has had influence, including computing, electrical engineering and psychology, has recast the model into its own language and imbued it with its own concerns. This divergence of terminologies has made the literature somewhat impenetrable but has also led to an appreciation of other disciplines' priorities and interests.
The major aim of the thesis has been to bring the MLP model within the framework of statistics. We have two aims here: one is to make the MLP model more intelligible to statisticians; and the other is to bring the insights of statistics to the MLP model. A statistical modeling approach can make valuable contributions, ranging from small but important clarifications, such as clearing up the confusion in the MLP literature between the model and the methodology for fitting the model, to much larger insights such as determining the robustness of the model in the event of outlying or atypical data.
We provide a treatment of the relationship of the MLP classifier to more familiar statistical models and of the various fitting and model selection methodologies currently used for MLP models. A description of the influence curves of the MLP is provided, leading both to an understanding of how the MLP model relates to logistic regression (and to robust versions of logistic regression) and to a proposal for a robust MLP model.
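As a rough illustration of the link to logistic regression mentioned above, the sketch below shows one standard way to see it: in a two-class MLP with one hidden layer and a logistic output unit, the output unit is a logistic regression on the hidden-unit activations. This is not the thesis's own derivation, which proceeds through influence curves; the function names (`sigmoid`, `hidden_activations`, `logistic_predict`, `mlp_predict`) and the choice of logistic hidden units are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_activations(x, hidden_weights, hidden_biases):
    """Activations of the hidden layer (logistic units assumed here)."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(hidden_weights, hidden_biases)
    ]


def logistic_predict(features, weights, bias):
    """Ordinary logistic regression: P(class 1 | features)."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def mlp_predict(x, hidden_weights, hidden_biases, out_weights, out_bias):
    """Two-class MLP: logistic regression applied to the hidden activations."""
    h = hidden_activations(x, hidden_weights, hidden_biases)
    return logistic_predict(h, out_weights, out_bias)


# Hypothetical weights, chosen only to exercise the functions.
p = mlp_predict(
    x=[0.5, -1.0],
    hidden_weights=[[1.0, 2.0], [3.0, 4.0]],
    hidden_biases=[0.0, 0.1],
    out_weights=[1.5, -0.5],
    out_bias=0.2,
)
```

Viewed this way, the hidden layer acts as a learned transformation of the measurement vector, and the final step is the familiar logistic model fitted on the transformed features.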
Practical problems associated with the fitting of MLP models, from the effects of scaling of the input data to the effects of various penalty terms, are also considered. The MLP model has a variable architecture with the major source of variation being the number of hidden layer processing units. A direct method is given for determining this in multi-class problems where the pairwise decision boundary is linear in the feature space.
Finally, in applications such as remote sensing, each vector of measurements, or pixel, contains contextual information about the neighboring pixels. The MLP model is modified to incorporate this contextual information into the classification procedure.

Item Type: Thesis (PhD)
Murdoch Affiliation: Division of Science and Engineering
Notes: Note to the author: If you would like to make your thesis openly available on Murdoch University Library's Research Repository, please contact: repository@murdoch.edu.au. Thank you.
Supervisor(s): James, Ian and Campbell, Norm
URI: http://researchrepository.murdoch.edu.au/id/eprint/50257