Method and system for document classification with multiple...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate


Details

C707S793000


active

06792415

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method and system for document classification, and particularly to a method and system for document classification that employs multiple algorithms to classify documents in multiple dimensions.
2. Description of the Related Art
In current document classification mechanisms, classification is always performed in a single dimension. That is, a document is classified into one or more detailed catalogues by a single classification algorithm. Because only one algorithm is employed in the classification procedure, the document is classified according to its most noticeable feature, such as the keyword with the most occurrences or the similarity of the document.
However, features that are important but not paramount may not be extracted and used for classification. For example, the document cannot be classified by its author, since the author's name appears only on the cover page. Likewise, a system analysis document cannot be classified by the technique it uses, since the analysis is more important than the technique in the document.
FIG. 1 is a schematic diagram showing an example of a classification structure 100 of the documents in an enterprise. The classification structure 100 includes four categories: “Author” 110, “Classification” 120, “Analysis Method” 130, and “Application Area” 140. Category “Author” 110 includes the detailed catalogues “Employee A” 111 and “Employee B” 112; category “Classification” 120 includes the detailed catalogues “Requirement Specification” 121 and “Design Specification” 122; category “Analysis Method” 130 includes the detailed catalogues “SDG2 Analysis” 131 and “Use Case Analysis” 132; and category “Application Area” 140 includes the detailed catalogues “Catalog Service” 141 and “Supply Chain Management” 142.
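The classification structure of FIG. 1 can be pictured as a simple mapping from categories to detailed catalogues. The following is only an illustrative sketch: the patent does not prescribe a data format, and the names CLASSIFICATION_STRUCTURE and catalogues_of are hypothetical.

```python
# Illustrative sketch only: the dictionary layout and all names here are
# assumptions, not part of the patent text.
# Keys are the categories (dimensions) of classification structure 100;
# values are the detailed catalogues that belong to each category.
CLASSIFICATION_STRUCTURE = {
    "Author": ["Employee A", "Employee B"],
    "Classification": ["Requirement Specification", "Design Specification"],
    "Analysis Method": ["SDG2 Analysis", "Use Case Analysis"],
    "Application Area": ["Catalog Service", "Supply Chain Management"],
}


def catalogues_of(category):
    """Return a copy of the detailed catalogues under one category."""
    return list(CLASSIFICATION_STRUCTURE[category])


author_catalogues = catalogues_of("Author")
```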
As an example, suppose a specification describes the requirements of a catalog service. The words “Catalog Service” are mentioned repeatedly in the specification; the author of the specification, “Employee A”, and the words “Requirement Specification” appear only on the cover page; and the words “Analysis Method” appear only once, in one section of the specification. In conventional methods, since the feature “Catalog Service” is stronger than the features “Employee A”, “Requirement Specification”, and “Analysis Method”, the specification is classified only into the detailed catalogue “Catalog Service” 141, as shown in FIG. 2 (denoted by the black circle). The features “Employee A”, “Requirement Specification”, and “Analysis Method” are not taken into consideration.
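The limitation described above can be illustrated with a toy single-algorithm classifier. This is a minimal sketch under stated assumptions: the sample text is invented, and counting occurrences of catalogue names is a stand-in for whatever single algorithm a conventional system might use.

```python
# Illustrative sketch only: the function name, the sample text, and the use
# of simple phrase counting are assumptions made for this example.
ALL_CATALOGUES = [
    "Employee A", "Employee B",
    "Requirement Specification", "Design Specification",
    "SDG2 Analysis", "Use Case Analysis",
    "Catalog Service", "Supply Chain Management",
]


def single_dimension_classify(text, catalogues):
    """Pick the one catalogue whose name occurs most often in the text."""
    lowered = text.lower()
    counts = {name: lowered.count(name.lower()) for name in catalogues}
    best = max(counts, key=counts.get)
    return best, counts


sample = (
    "Requirement Specification by Employee A. "
    + "The Catalog Service must list items. " * 5
    + "Use Case Analysis is applied once."
)

best, counts = single_dimension_classify(sample, ALL_CATALOGUES)
```

Here the author and the document type each appear once, so they lose to the repeated application-area phrase and are dropped from the result, which mirrors the situation shown in FIG. 2.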
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a method and system for document classification with multiple dimensions and multiple algorithms. Users can set categories (dimensions) and the corresponding algorithms according to the characteristics of documents, so as to employ these algorithms to classify documents in respective dimensions.
To achieve the above object, the present invention provides a method for document classification with multiple dimensions and multiple algorithms. According to one aspect of the invention, a classification preference is first set. The classification preference includes a plurality of categories, and each of the categories has a corresponding algorithm. A document is then classified according to the classification preference, so that one or more detailed catalogues corresponding to each of the categories are acquired.
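A classification preference of this kind might be sketched as a mapping from each category to its own algorithm and candidate catalogues. The two algorithms below (cover_page_match and keyword_frequency), the document layout with "cover" and "body" fields, and the sample document are all hypothetical; the patent does not name specific algorithms at this point.

```python
# Illustrative sketch only: algorithm names, the document layout, and the
# sample data are assumptions, not taken from the patent.
def cover_page_match(document, catalogues):
    """Return catalogues whose names appear on the cover page."""
    cover = document["cover"].lower()
    return [c for c in catalogues if c.lower() in cover]


def keyword_frequency(document, catalogues):
    """Return the catalogue(s) mentioned most often in the body."""
    body = document["body"].lower()
    counts = {c: body.count(c.lower()) for c in catalogues}
    top = max(counts.values())
    return [c for c, n in counts.items() if n == top and n > 0]


# The classification preference: each category has its own algorithm.
preference = {
    "Author": (cover_page_match, ["Employee A", "Employee B"]),
    "Application Area": (
        keyword_frequency,
        ["Catalog Service", "Supply Chain Management"],
    ),
}


def classify(document, preference):
    """Classify the document once per category, using that category's algorithm."""
    return {
        category: algorithm(document, catalogues)
        for category, (algorithm, catalogues) in preference.items()
    }


doc = {
    "cover": "Requirement Specification, written by Employee A",
    "body": "The catalog service lists items. The catalog service is searchable.",
}
result = classify(doc, preference)
```

Because each category is evaluated independently, a weak feature such as an author name on the cover page is still captured in its own dimension.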
According to another aspect of the invention, a document is first received, and a classification code is determined. The classification code contains a classification preference. The classification preference includes a plurality of categories, and each of the categories has a corresponding algorithm. The classification code is then executed to classify the document, so that one or more detailed catalogues corresponding to each of the categories are acquired.
According to an embodiment of the present invention, a system for document classification with multiple dimensions and multiple algorithms is also provided. The system includes a preference database, a generator, and a classification unit. The preference database stores at least one classification preference. The classification preference includes a plurality of categories, and each of the categories has a corresponding algorithm. The generator transforms the classification preference into a classification code. The classification unit executes the classification code to classify the document, so that one or more detailed catalogues corresponding to each of the categories are acquired.
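The three components named above might fit together as in the following sketch. It assumes that a "classification code" can be modelled as a Python callable built from the stored preference; the patent does not say what form the code takes, and every class, function, and data value here is hypothetical.

```python
# Illustrative sketch only: modelling the classification code as a closure
# is an assumption, as are all names and sample data below.
class PreferenceDatabase:
    """Stores classification preferences by name."""

    def __init__(self):
        self._preferences = {}

    def store(self, name, preference):
        self._preferences[name] = preference

    def load(self, name):
        return self._preferences[name]


def generate_classification_code(preference):
    """Generator: transform a preference into an executable callable."""
    steps = [(cat, algo, tuple(cats)) for cat, (algo, cats) in preference.items()]

    def classification_code(text):
        return {cat: algo(text, cats) for cat, algo, cats in steps}

    return classification_code


class ClassificationUnit:
    """Executes a classification code against a document."""

    def execute(self, code, text):
        return code(text)


def first_line_contains(text, catalogues):
    """Hypothetical algorithm: match names against the first line only."""
    first = text.splitlines()[0].lower() if text else ""
    return [c for c in catalogues if c.lower() in first]


def contains(text, catalogues):
    """Hypothetical algorithm: match names anywhere in the text."""
    return [c for c in catalogues if c.lower() in text.lower()]


db = PreferenceDatabase()
db.store("enterprise", {
    "Author": (first_line_contains, ["Employee A", "Employee B"]),
    "Classification": (
        first_line_contains,
        ["Requirement Specification", "Design Specification"],
    ),
    "Application Area": (contains, ["Catalog Service", "Supply Chain Management"]),
})

code = generate_classification_code(db.load("enterprise"))
result = ClassificationUnit().execute(
    code,
    "Requirement Specification by Employee A\nThe catalog service lists products.",
)
```

Separating the generator from the classification unit means a preference can be edited in the database and regenerated without changing the unit that runs it.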
It should be noted that the document is classified in each of the categories by employing the algorithms corresponding to the categories respectively.


REFERENCES:
patent: 5765029 (1998-06-01), Schweid et al.
patent: 6442555 (2002-08-01), Shmueli et al.
patent: 6611825 (2003-08-01), Billheimer et al.
patent: 6751621 (2004-06-01), Calistri-Yeh et al.
Wen-Lin Hsu et al., Classification Algorithms for NETNEWS Articles, 1999, ACM, pp. 114-121.
