Pattern recognition using generalized association rules

Data processing: artificial intelligence – Neural network – Learning task

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C706S045000, C707S793000, C707S793000

Reexamination Certificate

active

06311173

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to systems and methods for pattern recognition, and specifically to methods of pattern recognition based on association rules.
BACKGROUND OF THE INVENTION
Automated pattern recognition is well known in the art, in a variety of applications. Common applications of pattern recognition include image analysis, speech recognition, and predicting unknown fields for records in a database. Typically, a template or a collection of rules is determined, which are believed to constitute a pattern that is characteristic of a certain class. Items in the set are then evaluated to determine how closely they fit the pattern. A close fit indicates a high probability that the item being evaluated falls within the class. Thus, a face may be found to belong to a certain individual; or a spoken sound may be found to correspond to a certain word; or a bank customer may be predicted to be a good or bad credit risk.
In order to build the template or rules, “data mining” of a training database is frequently used. The training database is selected and is assumed to be a real and representative sample of the overall population. The training database generally contains variables (fields, representing various attributes of the items) of different types, among which a field is selected as the “Field to Predict” (also referred to in the database art as the “output”, or “result”, or “dependent” variable). The training database may be represented as a temporary file of codes of attribute values, given by the matrix:
A={x
1
, . . . , x
n
,y}
1
N
  (1)
consisting of N records, where a record is a vector of previously encoded values of n input (predicting) fields x
1
, . . . , x
n
and of the output (predicted) field y. The purpose is to discover all essential regularities within the investigated file A, thus enabling one to predict the unknown value of output field y based on the known values of variables x
1
, . . . , x
n
. The same encoding methods may be applied when the rules derived from the training database are applied to the rest of the population.
Typically, the Field to Predict is Boolean, i.e., the output field can be specified as y &egr;{0,1}. It will be understood, however, that similar methods may be applied, mutatis mutandis, when the output field has a wider range of possible values.
The accuracy of prediction of the value of y may be verified by testing on the training database. However, such testing may not be sufficient, since there exists the problem of “overfitting,” wherein regularities discovered on the training database turn out to be found by chance, and are therefore not valid on other samples from the overall population. Thus, the purpose of data mining is to discover regularities within the training database which possess the property of likely stability of their validity on the whole population.
Regularities of this sort are sought within the training database in the form of association rules. Methods of deriving such association rules are described, for example, by Agrawal, et al., in “Fast Discovery of Association Rules,” in
Advances in Kowledge Discovery and Data Mining
(AAAI Press/MIT Press, 1996), pages 307-328; and by Zaki, et al., in “New Algorithms for Fast Discovery of Association Rules,” in Proc. 3rd Int. Conf. KDD (California), pages 283-286. These publications are incorporated herein by reference.
A formal definition of an association rule is as follows: Let I
i
={a
i1
, . . . , a
im
i
} be the set of codes of values of a variable x
i
. A single condition (referred to herein as a 1-condition) is a condition of the type: x
i
=a
ij
, j &egr;{1, . . . , m
i
}. A composite condition is a conjunction of q single conditions, wherein q=2, . . . , n, referred to herein as a q-condition. Thus, a q-condition is a condition of the type:
(
x
i
1
=a
i
1
j
1
){circumflex over ( )}(
x
i
2
=a
i
2
j
2
){circumflex over ( )}. . . {circumflex over ( )}(
x
i
q
=a
i
q
j
q
)  (2)
Association rules are “if-then” rules, wherein the condition is the single or conjunctive expression, as in equation (2). Such rules are referred to in the context of the present patent application and in the claims as “simple association rules.” Thus, a simple association rule is a statement of the following type:
 if (
q
-condition) then
y=y
1
  (3)
The possible values of y
1
are y
1
=1 or y
1
=0. The number s of records at which both the condition of the rule (the q-condition) and the rule's conclusion (y=y
1
) are fulfilled is referred to as the support, s, of the rule. The probability p of the rule is the probability that for a random record satisfying the rule's condition, the rule's conclusion is also fulfilled. Hence
p
=
s
m
,
wherein m is the number of records satisfying the rule's condition.
Let p
a
be the a priori probability that the Field to Predict y possesses the predicted value 1, that is,
p
a
=
r
N
,
where r is the number of records in the training database at which y=1. As mentioned above, the training database is assumed to be a real and representative sample of the investigated population. Hence, it can be assumed that the same a priori probability p
a
is valid on the overall population.
Which rules are interesting? In other words, about which rules can it be said that they were discovered not by chance, and are likely to be valid on the overall population? It can be assumed that these are rules fulfilled at a sufficiently large number of records and whose probability significantly deviates from p
a
. A formal statement for this intuitive notion is as follows: A user specifies a minimum support S
min
, and minimum admissible probabilities for a rule with y=1 (denoted by {overscore (p)}
1
) and with y=0 (denoted by {overscore (p)}
0
), 1−{overscore (p)}
0
<p
a
<{overscore (p)}
1
. The objective of the data mining is then to determine association rules (rules of the type of equation (2)) for which s≧S
min
and p≧{overscore (p)}
1
(if y=1), or p≧{overscore (p)}
0
(if y=0). Methods of data mining known in the art do not necessarily find all such rules exhaustively on the training database, and furthermore tend to require very substantial computing resources.
The rule defined by equation (3) can be expressed as a statement of conditional probability:
P
(
y=y
1
|(the
q
-condition))=
p
  (4)
However, unlike equation (3), the number of records at which this statement is fulfilled (support s) is absent in equation (4). Therefore, association rules in a sense contain more information than conditional probabilities, which are applied in Bayes methods. Such methods are described, for example, by Friedman, in “On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality,” in
Data Mining and Knowledge Discovery
, 1(1), pages 54-77; and by Heckerman, in “Bayesian Networks for Data Mining,” in
Data Mining and Knowledge Discovery,
1(1), pages 79-119. These publications are incorporated herein by reference.
As mentioned above, the object of deriving association rules from the training database is to predict accurately the Field to Predict value for data outside the training database. To predict the value of output field y for a given record, all relevant association rules are selected, and using the respective probabilities of these rules, the conclusive probability that the Field to Predict possesses the predicted value is calculated. Such prediction gives especially accurate results when a great number of “strong” association rules (rules with a high probability and support) are found, and many of them are independent or nearly independent. Methods of finding strong rules are described, for example, by Piatetsky-Shapiro, in “Discovery, Analysis, and Presentation of Strong Rules,” in
Knowledge Discovery in Databases
(Menlo Park, Calif.: AAAI Press), pages 229-248, which

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Pattern recognition using generalized association rules does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Pattern recognition using generalized association rules, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Pattern recognition using generalized association rules will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-2595176

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.