Data processing: artificial intelligence – Knowledge processing system
Reexamination Certificate
2000-05-31
2004-01-27
Starks, Jr., Wilbert L. (Department: 2121)
C707S793000, C707S793000
active
06684202
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is related to the field of binary classification and, more particularly, to a computer-automated system and method for the binary classification of text units constituting rules of law in case law documents.
2. Description of the Related Art
When disagreements arise about the proper interpretation of statutes, administrative regulations, and constitutions, the higher courts of our land clarify their meaning by applying established judicial criteria. A written description of this application is known as the court's opinion. In order to understand a particular statute or provision of the Constitution, one has to see how the courts have interpreted it, i.e., one needs to read the courts' opinions.
Every case law opinion describes the nature of the dispute and the basis for the court's decision. Courts apply the basic methods of legal reasoning that are taught in all law schools and are used in the practice of law. Most case law documents begin with an introduction that sets forth the facts and procedural history of the case. The court then identifies the issues in dispute, followed by a statement of the prevailing law pertaining to each issue, the court's decision on the issue, and the court's rationale for its decision. Finally, there is a statement of the court's overall disposition, which either affirms or reverses the judgment of the lower court.
In order to apply the case as precedent, one must determine the significance of the court's decision for future litigants as well as identify the general principles of law that are likely to be applied in future cases. The holding is a statement that the law is to be interpreted in a certain way when a given set of facts exists.
Most written court opinions devote considerable space to justifying the court's decisions. In the rationale, the court usually follows established patterns of legal reasoning and reviews the relevant provisions of the constitutions, statutes, and case law and then relates the thought processes used to arrive at the court's judgment.
A ‘rule of law’ is a general statement of the law and its application under a given set of circumstances that is intended to guide conduct and may be applied to subsequent situations having analogous circumstances. Rules of law are found in the rationales that courts use to support their decisions, and the holding is often itself considered a rule of law.
In the prior art, ascertaining the rule or rules of law in any given decision required an individual to manually read through the text of court decisions. This is time-consuming and requires the reviewer to read a great deal of superfluous material in order to glean what are often just a few pithy rules of law. Therefore, a need exists for a way to automate document review while still accurately identifying the rules of law.
Distinguishing a rule of law from text that does not constitute a rule of law requires binary classification. In the prior art, there are many statistical and machine learning approaches to binary classification. Examples of statistical approaches include Bayes' rule, k-nearest neighbor, projection pursuit regression, discriminant analysis, and regression analysis. Examples of machine learning approaches include Naive Bayes, neural networks, and regression trees.
These approaches can be grouped into two broad classes based on the type of classification being done. When a set of observations is given with the aim of establishing the existence of classes or clusters in the data, this is known as unsupervised learning or clustering. When it is known for certain that there are N classes, and the aim is to establish a rule whereby new observations can be classified into one of the existing classes, then this is known as supervised learning. With supervised learning, a rule for classifying new observations is established using known, correctly classified data.
Rules can be established using many of the supervised techniques mentioned above. One such technique is logistic regression, a statistical regression procedure that may be used to establish an equation for classifying new observations.
In general, regression analysis is the analysis of the relationship between one variable and another set of variables. The relationship is expressed as an equation. Using the equation it is possible to predict a response, or dependent, variable from a function of regressor variables and parameters. Regressor variables are sometimes referred to as independent variables, predictors, explanatory variables, factors, features, or carriers.
Standard regression analysis, or linear regression, is not recommended for the present invention because of the dichotomous nature of the response variable, which indicates that a unit of text is either a rule of law (ROL) or not a rule of law (~ROL). This is because $R^2$, which linear regression uses to evaluate the effectiveness of the regression, is not suitable when the response variable is dichotomous. The present invention uses logistic regression because it evaluates the effectiveness of the regression with the maximum likelihood estimation procedure, which works with a dichotomous response variable.
The training process of logistic regression operates by choosing a hyperplane to separate the classes as well as possible, but the criterion for a good separation, or goodness of fit, is not the same as for other regression methods, such as linear regression. For logistic regression, the criterion for a good separation is the maximum of a conditional likelihood. Logistic regression is identical, in theory, to linear regression for normal distributions with equal covariances, and also for independent binary features. So, the greatest differences between the two are to be expected when the data depart from these two cases, for example when the features have very non-normal distributions with very dissimilar covariances.
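The maximum-likelihood criterion described above can be illustrated with a minimal sketch. It is not the procedure used by the invention or by any statistical package: the optimizer (plain batch gradient ascent), the function names, the single made-up feature, and the toy labels are all assumptions chosen only to show the conditional log-likelihood being maximized for a dichotomous response (1 = ROL, 0 = ~ROL).

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(weights, bias, x):
    """Estimated probability that feature vector x belongs to class 1 (ROL)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def log_likelihood(weights, bias, features, labels):
    """Conditional log-likelihood of the labels given the features."""
    total = 0.0
    for x, y in zip(features, labels):
        # Clamp to avoid log(0) if a probability saturates.
        p = min(max(predict_probability(weights, bias, x), 1e-12), 1.0 - 1e-12)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def fit_logistic(features, labels, learning_rate=0.1, epochs=2000):
    """Maximize the log-likelihood by batch gradient ascent (an assumed optimizer)."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Gradient of the log-likelihood: (observed label - predicted probability).
            error = y - predict_probability(weights, bias, x)
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias += learning_rate * grad_b / n
    return weights, bias


# Hypothetical toy data: one made-up feature score per text unit,
# label 1 = rule of law (ROL), label 0 = not a rule of law (~ROL).
features = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
labels = [0, 0, 1, 0, 1, 1]

weights, bias = fit_logistic(features, labels)
print(predict_probability(weights, bias, [5.0]))  # above 0.5 for this data
print(predict_probability(weights, bias, [0.0]))  # below 0.5 for this data
```

The fitted weights and bias define the separating hyperplane; a new text unit would be classified as ROL when its predicted probability exceeds a chosen threshold such as 0.5.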
Several well-known statistical packages contain a procedure for logistic regression; for example, the SAS package has a logistic procedure, and SPSS has one called LOGISTIC REGRESSION.
Binomial distributions may be compared using what is known as a Z value. In statistics the so-called binomial distribution describes the possible number of times that a particular event will occur in a sequence of observations. The event is coded binary, i.e., it may or may not occur. The binomial distribution is used when a researcher is interested in the occurrence of an event instead of, for example, its magnitude. For instance, in a clinical trial, a patient may survive or die. The researcher studies the number of survivors, and not how long the patient survives after treatment. Another example is whether a person is overweight. The binomial distribution describes the number of overweight persons, and not the extent to which they are overweight.
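As a brief illustration of the binomial distribution just described, the following sketch computes the probability of each possible count of occurrences. The function name and the clinical-trial numbers (10 patients, survival probability 0.8) are hypothetical and are not taken from the source.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences in n independent observations,
    each occurring with probability p."""
    return math.comb(n, k) * (p ** k) * ((1.0 - p) ** (n - k))


# Hypothetical clinical trial: 10 patients, each surviving with probability 0.8.
n_patients = 10
survival_probability = 0.8
distribution = [
    binomial_pmf(k, n_patients, survival_probability)
    for k in range(n_patients + 1)
]

# The distribution covers every possible number of survivors (0 through 10),
# so its entries sum to 1 up to floating-point rounding.
print(sum(distribution))
```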
There are many practical problems involved in the comparison of two binomial parameters. For example, social scientists may wish to compare the proportions of women taking advantage of prenatal health services for two communities that represent different socioeconomic backgrounds. Or, a director of marketing may wish to compare the public awareness of a new product recently launched with that of a competitor's product.
Two binomial parameters can be compared using the Z statistic, where:
$$Z = \frac{P_0 - P_1}{\left(TP \cdot (1 - TP)\left(1/T_0 + 1/T_1\right)\right)^{0.5}}$$
where Px is the probability of binomial parameter x (where x is either binomial parameter 0 or 1); TP is the combined probability of the two binomial parameters; and Tx is the sample size taken from the population(s) in order to estimate the two probabilities $P_0$ and $P_1$.
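The Z statistic can be computed directly from the quantities defined above. This is only a sketch: the function name and the example proportions are hypothetical, and because the text does not say how the combined probability TP is obtained, the code assumes the common pooled estimate (the sample-size-weighted average of the two proportions).

```python
import math


def z_statistic(p0, t0, p1, t1):
    """Z value comparing two binomial proportions p0 and p1,
    estimated from samples of size t0 and t1."""
    # Assumption: TP is the pooled proportion, weighted by sample size.
    tp = (p0 * t0 + p1 * t1) / (t0 + t1)
    variance = tp * (1.0 - tp) * (1.0 / t0 + 1.0 / t1)
    if variance == 0.0:
        raise ValueError("pooled variance is zero; Z is undefined")
    return (p0 - p1) / math.sqrt(variance)


# Hypothetical example: 60% of 100 observations versus 40% of 100 observations.
z = z_statistic(0.6, 100, 0.4, 100)
print(round(z, 4))
```

A Z value of large magnitude suggests that the two proportions differ by more than sampling variation alone would explain; a value near zero suggests they are similar.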
The same formula can be used to compare a binomial parameter from two different distributions. In this case, Px is the probability of the binomial parameter in distribution x, where x is either distribution 0 or 1; TP is the probability of the binomial p
Ahmed Salahuddin
Collias Spiro G.
Humphrey Timothy L.
Lu X. Allan
Morelock John T.
Jacobson & Holman PLLC
Lexis Nexis
Starks, Jr. Wilbert L.