Patent
1997-08-12
1999-12-21
Amsbury, Wayne
Data processing: database and file management or data structures
Database design
Data structure types
707/3, 707/6, 707/102, 707/203, 707/511, 707/532, G06F 17/30
active
060062231
ABSTRACT:
A method and apparatus for mining text databases, employing sequential pattern phrase identification and shape queries, to discover trends. The method passes over a desired database using a dynamically generated shape query. Documents within the database are selected based on specific classifications and user-defined partitions. Once a partition is specified, transaction IDs are assigned to the words in the text documents according to their placement within each document. The transaction IDs encode the position of each word within the document and also represent sentence, paragraph, and section breaks; in one embodiment they are represented as long integers with the sentence boundaries. A maximum and minimum gap between words in the phrases, and the minimum support that all phrases must meet for the selected time period, may be specified. A generalized sequential pattern method is used to generate those phrases in each partition that meet the minimum support threshold. The shape query engine takes the set of phrases for the partition of interest and selects those that match a given shape query. A query may request a trend such as "recent upwards trend", "recent spikes in usage", "downward trends", or "resurgence of usage". Once the phrases matching the shape query are found, they are presented to the user.
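The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the offsets `SENTENCE_GAP` and `PARAGRAPH_GAP`, the function names, and the regular expressions used to split sentences and words are all assumptions made for illustration. The sketch only counts the support of a phrase that is already given; it does not perform the candidate generation of a generalized sequential pattern method. Transaction IDs jump at sentence and paragraph breaks, so a small maximum gap keeps a phrase from spanning a boundary.

```python
import re

# Hypothetical offsets: IDs jump at sentence and paragraph breaks.
# Assumes fewer than SENTENCE_GAP words per sentence.
SENTENCE_GAP = 1_000
PARAGRAPH_GAP = 100_000


def assign_transaction_ids(text):
    """Return (word, transaction_id) pairs for one document."""
    pairs = []
    for p_idx, paragraph in enumerate(re.split(r"\n\s*\n", text.strip())):
        for s_idx, sentence in enumerate(re.split(r"(?<=[.!?])\s+", paragraph)):
            base = p_idx * PARAGRAPH_GAP + s_idx * SENTENCE_GAP
            words = re.findall(r"[A-Za-z0-9']+", sentence.lower())
            for w_idx, word in enumerate(words):
                pairs.append((word, base + w_idx))
    return pairs


def contains_phrase(pairs, phrase, min_gap=1, max_gap=1):
    """True if the words of `phrase` occur in order with each ID gap in range."""
    def extend(k, prev_tid):
        if k == len(phrase):
            return True
        return any(
            extend(k + 1, tid)
            for word, tid in pairs
            if word == phrase[k]
            and (prev_tid is None or min_gap <= tid - prev_tid <= max_gap)
        )
    return extend(0, None)


def support(docs, phrase, **gaps):
    """Fraction of documents in one partition that contain the phrase."""
    hits = sum(contains_phrase(assign_transaction_ids(d), phrase, **gaps) for d in docs)
    return hits / len(docs)


def recent_upward_trend(history, periods=2):
    """Toy shape query: support rose strictly over the last `periods` steps."""
    tail = history[-(periods + 1):]
    return len(tail) == periods + 1 and all(a < b for a, b in zip(tail, tail[1:]))


# Usage: one support value per time partition, then apply the shape query.
partitions = [
    ["Text is old.", "Files are big."],
    ["Data mining is new.", "Text is old."],
    ["Data mining grows.", "Data mining wins."],
]
history = [support(docs, ("data", "mining")) for docs in partitions]
print(history, recent_upward_trend(history))
```

Encoding boundaries as jumps in the ID means the same gap test handles both word adjacency and boundary crossing, which is one plausible reading of why the abstract has the IDs represent breaks as well as positions.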
REFERENCES:
patent: 5615341 (1997-03-01), Agrawal et al.
patent: 5675819 (1997-10-01), Schuetze
patent: 5729730 (1998-03-01), Wlaschin et al.
patent: 5787386 (1998-07-01), Kaplan et al.
patent: 5790848 (1998-08-01), Wlaschin
patent: 5794178 (1998-08-01), Caid et al.
Osmar R. Zaiane et al., "Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs", IEEE, Apr. 1998, pp. 19-29.
Ming-Syan Chen et al., "Efficient Data Mining for Traversal Patterns", IEEE, Apr. 1998, pp. 209-221.
Mika Klemettinen et al., "A Data Mining Methodology and Its Application to Semi-Automatic Knowledge Acquisition", IEEE, Sep. 1997, pp. 670-677.
Feldman, R. et al., "Knowledge Discovery in Textual Databases (KDT)", Proc. of the 1st Int'l. Conf. on Knowledge Discovery in Databases (KDD) and Data Mining, 1995, Bar-Ilan University, Israel, Math and Computer Science Dept., KDD-95, pp. 112-117.
Feldman, R. et al., "Mining Associations in Text in the Presence of Background Knowledge", Proc. of the 2nd Int'l. Conf. on Knowledge Discovery on Databases and Data Mining, 1996, Technology Spotlight / Spatial, Temporal & Multimedia Data Mining, pp. 343-346 (undated).
Renouf, A., "Making Sense of Text: Automated Approaches to Meaning Extraction", 17th Int'l. On-Line Information Meeting Proceedings / Online Information 93, pp. 77-87, England, Dec. 1993.
Srikant, R., et al., "Mining Sequential Patterns: Generalizations and Performance Improvements", Proc. of the 5th Int'l. Conf. on Extending Database Technology (EDBT), 1996, pp. 3-17.
Deerwester, S. et al., "Indexing by Latent Semantic Analysis", Journal of the American Society for Information Science, 41(6):391-407, 1990.
Croft, W., et al., "The Use of Phrases and Structured Queries in Information Retrieval", 14th Int'l. ACM SIGIR Conf. on Research and Development on Information Retrieval, 1991, ACM 0-89791-448, pp. 32-45, 1991.
Agrawal, R. et al., "Fast Algorithms For Mining Association Rules", Proceedings of the 20th VLDB Conference Santiago, Chile, pp. 487-499, 1994.
Agrawal, R. et al., "Active Data Mining", IBM Almaden Research Center, California, 6 pages, (undated Abstract).
Agrawal, R. et al., "Querying Shapes of Histories" Proceedings of the 21st VLDB Conference, Zurich, Switzerland, 13 pages, 1995.
Agrawal, R. et al., "Mining Sequential Patterns", IEEE (1063-6382), pp. 3-14, 1995.
Agrawal Rakesh
Lent Brian Scott
Srikant Ramakrishnan
Amsbury Wayne
Channavajjala Srirama
International Business Machines Corporation