Data processing: artificial intelligence – Knowledge processing system
Reexamination Certificate
2007-12-31
2011-10-25
Gaffin, Jeffrey A (Department: 2129)
C706S012000
active
08046317
ABSTRACT:
An improved system and method are provided for feature selection for text classification using subspace sampling. A text classifier generator may select a small set of features from the corpus of training data using subspace sampling, and may train a text classifier that uses that small set of features to classify texts. To select the small set of features, a subspace of features from the corpus of training data may be randomly sampled according to a probability distribution over the set of features, where the probability assigned to each feature is proportional to the squared Euclidean norm of the corresponding row of the matrix of left singular vectors of the feature matrix representing the corpus of training texts. The text classifier may then classify texts using only the relevant features among a very large number of training features.
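The sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a dense NumPy matrix whose rows are features and whose columns are training documents, and it assumes i.i.d. draws with replacement followed by de-duplication, a detail the abstract does not specify. The function names, the `rank` parameter, and the toy matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def subspace_sampling_probabilities(A, rank=None):
    """Return one sampling probability per feature (row of A).

    A is assumed to be a nonzero (n_features, n_docs) matrix.
    Each probability is proportional to the squared Euclidean norm
    of the matching row of the left singular vectors of A.
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    if rank is None:
        # Assumption: keep only directions with non-negligible singular values.
        rank = int(np.sum(s > s[0] * 1e-10))
    U_k = U[:, :rank]
    scores = np.sum(U_k ** 2, axis=1)  # squared row norms
    return scores / scores.sum()


def sample_features(A, num_draws, rank=None, seed=0):
    """Draw feature indices according to the subspace sampling distribution.

    Assumption: draws are made with replacement and then de-duplicated,
    so at most num_draws distinct feature indices are returned.
    """
    rng = np.random.default_rng(seed)
    p = subspace_sampling_probabilities(A, rank)
    draws = rng.choice(A.shape[0], size=num_draws, replace=True, p=p)
    return np.unique(draws), p


# Toy term-document matrix: 6 features (rows) by 4 documents (columns).
# Feature 3 never occurs, so it should receive (numerically) zero probability.
A = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])
selected, probs = sample_features(A, num_draws=4)
print("selected feature indices:", selected)
print("sampling probabilities:", np.round(probs, 3))
```

A classifier would then be trained on `A[selected, :]` only. A real text corpus would normally be stored as a sparse matrix and decomposed with a truncated SVD, which this sketch leaves out for brevity.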
REFERENCES:
patent: 2008/0082475 (2008-04-01), Aggarwal et al.
Dasgupta et al (“Feature Selection Methods for Text Classification” KDD'07).
Drineas et al (“Relative-Error CUR Matrix Decompositions” Aug. 27, 2007), downloaded at http://arxiv.org/PS_cache/arxiv/pdf/0708/0708.3696v1.pdf.
Drineas et al (“Sampling Algorithms and Coresets for Lp Regression” Jul. 11, 2007), downloaded at http://arxiv.org/PS_cache/arxiv/pdf/0707/0707.1714v1.pdf.
Drineas et al (“Sampling algorithms for ℓ2 regression and applications” Jan. 2006), downloaded at http://portal.acm.org/citation.cfm?id=1109557.1109682.
Gabrilovich et al (“Text Categorization with Many Redundant Features: Using Aggressive Feature Selection to Make SVMs Competitive with C4.5” 2004).
Drineas et al (“Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods” APPROX and RANDOM Aug. 28-30, 2006).
Dasgupta Anirban
Drineas Petros
Harb Boulos
Josifovski Vanja
Mahoney Michael William
Buchenhorner Patent Law
Gaffin Jeffrey A
Wong Lut
Yahoo! Inc.