Data classifier output interpretation

Data processing: artificial intelligence – Neural network – Learning task


Details

US classifications: C700S048000, C700S047000
Type: Reexamination Certificate
Status: active
Patent number: 06564195

FIELD OF THE INVENTION
The present invention relates to a method and apparatus for interpretation of data classifier outputs and a system incorporating the same.
BACKGROUND TO THE INVENTION
Trainable data classifiers (for example, Neural Networks) can learn to classify given input vectors into the required output group with a high degree of accuracy. However, a known limitation of such data classifiers is that they do not provide any explanation or reason as to why a particular decision has been made. This “black box” nature of the decision-making process is a disadvantage when human users want to be able to understand a decision before acting on it.
Such data classifiers can be split into two main groups: those which have a supervised training period and those which are trained in an unsupervised manner. Those trained in a supervised manner (i.e. supervisedly trained data classifiers) include, for example, Multi Layer Perceptrons (MLPs).
In order for a supervisedly trained data classifier (e.g. a Neural Network) to be trained, a training set of examples has to be provided. The examples contain associated input/output vector pairs, where the input vector is what the data classifier will see when performing its classification task and the output vector is the desired response for that input vector. The data classifier is then trained over this set of input/output pairs and learns to associate the required output with the given inputs. The data classifier thereby learns how to classify the different regions of the input space in line with the problem represented in the training set. When the data classifier is subsequently given an input vector to classify, it produces an output vector dependent upon the previously learned region of the input space that the input vector occupies.
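By way of illustration only (this example is not taken from the patent), the following sketch shows supervised training over such associated input/output pairs, using scikit-learn's MLPClassifier; all data and parameter choices are hypothetical.

```python
# Illustrative sketch only: supervised training over input/output pairs.
# All data and parameter choices here are hypothetical.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Training set: each input vector is paired with the desired output class.
X_train = np.array([
    [0.1, 0.9, 0.3],   # input vector 1
    [0.8, 0.2, 0.7],   # input vector 2
    [0.2, 0.8, 0.4],   # input vector 3
    [0.9, 0.1, 0.6],   # input vector 4
])
y_train = np.array([0, 1, 0, 1])  # desired output for each input vector

# The classifier learns to associate regions of the input space with outputs.
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)

# A new input vector is classified according to the learned regions.
print(clf.predict([[0.15, 0.85, 0.35]]))  # falls in the region learned for class 0
```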
In the case of some very simple classification problems, the “reasoning” made by the data classifier may perhaps be intuitively guessed by a user. However, Neural Networks are typically applied to problems whose problem space is not well bounded and which have no solution obvious to humans. If the rules defining such a problem were clear, then a rule-based system would probably provide a more suitable classifier than a supervisedly trained data classifier such as a Neural Network. A typical data classifier application involves complex, high-dimensional data where the rules relating input vectors to correct classifications are not at all obvious. In such situations the supervisedly trained data classifier becomes a complex, accurate decision maker, but unfortunately offers the human end user no help in understanding the decisions that it reaches. In many situations the end user nevertheless wishes to understand, at least to some degree, why a given supervisedly trained data classifier has reached a decision before acting upon that decision with confidence.
In the past, much work has been directed to extracting rules from Neural Networks, where people have attempted to convert the weights contained within the Neural Network topology into if-then-else type rules [Andrews, R., Diederich, J., & Tickle, A. (1995): “A survey and critique of techniques for extracting rules from trained artificial neural networks” in Knowledge Based Systems, 8(6), pp. 373-389]. This work has had only limited success: the rules generated have not been clear, concise, or readily understandable. Work has also been performed which treats the problem as one of rule inversion: given a subset Y of the output space, find the reciprocal image of Y under the function f computed by the Neural Network [Maire, F. (1995): “Rule-extraction by back-propagation of polyhedra” in Neural Networks, 12(4-5), pp. 717-725. Pub. Elsevier/Pergamon, ISSN 0893-6080]. This method back-propagates regions from the output layer to the input layer. Unfortunately, whilst the method is theoretically sound, its output is once again not readily understandable to the user, and so it does not solve the problem of helping the user to understand the reason for a Neural Network's decision.
Other methods which have been tried in the past divide each individual value in the input vector into different categories (percentile bins). This technique is described in, for example, U.S. Pat. No. 5,745,654. Each percentile bin has associated with it an explanation describing the meaning of the associated individual input value, rather than of the whole vector of input values. A reason is then associated with the output vector, selected as the reason associated with the most significant input variable in the input vector. This method does not take into consideration the fact that the data classifier classifies on the input vector as a whole and that relationships between input variables are often significant. It also requires some definition of the relative significance of the component variables of an input vector, which is not always meaningful.
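A rough sketch of this percentile-bin style of explanation follows (our own illustration, not the exact scheme of U.S. Pat. No. 5,745,654); the variables, bins, reasons, and significance ranking are all assumed.

```python
# Rough illustration of the percentile-bin style of explanation described
# above. This is not the patented method; the variables, bins, reasons, and
# the notion of a "most significant variable" here are all hypothetical.
import numpy as np

# Percentile boundaries and, per variable, an explanation for each bin.
bin_edges = [25, 50, 75]
reasons_per_bin = {
    "income": ["very low income", "low income", "high income", "very high income"],
    "debt":   ["very low debt", "low debt", "high debt", "very high debt"],
}
# Assumed relative significance of each input variable (the weak point the
# text notes: such a ranking is not always meaningful).
significance = {"income": 0.7, "debt": 0.3}

def explain(input_vector, training_data):
    """Pick the reason attached to the bin of the most significant variable."""
    most_significant = max(significance, key=significance.get)
    value = input_vector[most_significant]
    column = training_data[most_significant]
    # Find which percentile bin the value falls into for that single variable;
    # the explanation describes that one value, not the vector as a whole.
    edges = np.percentile(column, bin_edges)
    bin_index = int(np.searchsorted(edges, value))
    return reasons_per_bin[most_significant][bin_index]

training_data = {"income": np.random.default_rng(0).normal(50, 15, 1000),
                 "debt":   np.random.default_rng(1).normal(20, 5, 1000)}
print(explain({"income": 80.0, "debt": 18.0}, training_data))
```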
OBJECT OF THE INVENTION
The invention seeks to provide an improved method and apparatus for interpreting outputs from supervisedly trained data classifiers.
SUMMARY OF THE INVENTION
The present invention provides to a user a textual (or other) representation of a reason for a decision made by a supervisedly trained data classifier (e.g. a Neural Network). The reason may be presented only when it is required, in a manner that does not hinder the performance of the data classifier, is easily understood by the end user of the data classifier, and is scaleable to large, high-dimensional data sets.
According to a first aspect of the present invention there is provided a method of operating a supervisedly trained data classifier, comprising the steps of: generating an output vector responsive to provision of an input vector; associating a reason with said classifier output vector responsive to a comparison between said classifier input vector and a previously stored association between a training vector used to train said classifier and said reason.
Advantageously, the method of operation facilitates later interpretation of the classifier outputs by a user, and is scaleable to large, high dimensional data sets.
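As a concrete, purely hypothetical reading of this first aspect, one might store each training input vector alongside a human-authored reason and, at classification time, attach the reason of the closest stored training vector. A minimal sketch, in which the ReasonStore class, the nearest-neighbour comparison, and all data are our own assumptions:

```python
# Minimal sketch of the first aspect described above. The ReasonStore class,
# the nearest-neighbour comparison, and all data are assumptions, not text
# taken from the patent.
import numpy as np

class ReasonStore:
    """Stores associations between training input vectors and reasons."""

    def __init__(self):
        self.vectors = []  # input components of training vectors
        self.reasons = []  # human-authored reason for each training vector

    def associate(self, training_input, reason):
        self.vectors.append(np.asarray(training_input, dtype=float))
        self.reasons.append(reason)

    def reason_for(self, input_vector):
        """Return the reason stored against the closest training vector."""
        x = np.asarray(input_vector, dtype=float)
        distances = [np.linalg.norm(x - v) for v in self.vectors]
        return self.reasons[int(np.argmin(distances))]

store = ReasonStore()
store.associate([0.1, 0.9, 0.3], "resembles known class-0 training cases")
store.associate([0.8, 0.2, 0.7], "resembles known class-1 training cases")

# The classifier generates its output vector as usual; a reason is then
# associated with that output by comparing the classifier's input vector
# against the stored training-vector/reason associations.
print(store.reason_for([0.15, 0.85, 0.35]))
```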
Preferably, the method additionally comprises the step of: presenting to a user information indicative of said output vector, of said reason, and of their association.
Advantageously, the association enables the user to interpret the classifier outputs more rapidly and more directly.
Preferably, the method additionally comprises the step of: associating with said reason a measure of a degree of confidence with which said reason is associated with said input vector.
Preferably, the method additionally comprises the step of: presenting to said user information indicative of said measure of a degree of confidence.
Preferably, said degree of confidence is calculated responsive to a comparison between said training vector and said input vector.
Preferably, said degree of confidence is calculated as a distance between said input vector and an input vector component of said training vector.
Preferably, said distance is a Euclidean distance.
Advantageously, these measures are simple to calculate and provide a good, intuitively understandable measure of confidence.
In a preferred embodiment, a plurality of reasons may be associated with said classifier output vector responsive to comparisons between said classifier input vector and a plurality of previously stored associations between training vectors used to train said classifier and said reasons.
Preferably, the method additionally comprises the step of: associating with each said reason a measure of a degree of confidence with which said reason is associated with said input vector.
Preferably, the method additionally comprises the step of: presenting to said user information indicative of said measure of a degree of confidence.
Advantageously, this allows the user to identify and to concentrate interpretation effort on reasons allocated a high degree of confidence.
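Extending the earlier sketch (again hypothetically), several stored associations can be compared against the input vector, each returned reason carrying a Euclidean-distance-based measure of confidence; a smaller distance is read as higher confidence:

```python
# Hypothetical extension of the earlier sketch: return several candidate
# reasons, each with a Euclidean-distance-based measure of confidence.
import numpy as np

def reasons_with_confidence(input_vector, vectors, reasons, k=3):
    """Return up to k (reason, distance) pairs, closest training vector first.

    A smaller Euclidean distance is taken to mean a higher degree of
    confidence that the stored reason applies to this input vector.
    """
    x = np.asarray(input_vector, dtype=float)
    distances = np.array([np.linalg.norm(x - np.asarray(v, dtype=float))
                          for v in vectors])
    order = np.argsort(distances)[:k]
    return [(reasons[i], float(distances[i])) for i in order]

vectors = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.7], [0.2, 0.8, 0.4]]
reasons = ["reason A", "reason B", "reason C"]
for reason, dist in reasons_with_confidence([0.15, 0.85, 0.35], vectors, reasons):
    print(f"{reason}: distance {dist:.3f}")
```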
Preferably, said i
