Reexamination Certificate
2005-10-25
2008-12-09
Trujillo, James K. (Department: 2169)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C715S234000, C715S249000
active
07464078
ABSTRACT:
A by-line extraction method detects a set of potential headlines from a title meta-tag of a crawled document, selects a candidate headline from the set of potential headlines, and extracts the by-line information from the document using the location of the selected candidate headline. The method constructs the set of potential headlines based on the title meta-tag. The method selects a candidate headline by evaluating the set of potential headlines in order of the lengths of the potential headlines. The method extracts the by-line information from the document by using the location of the selected candidate headline to extract a string representing a date, a name, or a source located within a minimum distance from the location of the potential headline.
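The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and several details are assumptions the abstract does not specify: potential headlines are built by splitting the title on common separators, candidates are tried longest first, a candidate is selected when it appears verbatim in the document text, and "minimum distance" is modeled as a fixed character window (`max_distance`). The regular expressions and function names are hypothetical, and only the date and name are extracted, not the source.

```python
import re

# Assumed title separators; the abstract does not say how the title is split.
SEPARATORS = re.compile(r"\s+[-|:»]+\s+")

# Illustrative patterns only: "Oct. 25, 2005"-style dates and "By First Last" names.
DATE_RE = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? \d{1,2}, \d{4}\b"
)
NAME_RE = re.compile(r"\bBy ([A-Z][a-z]+(?: [A-Z][a-z]+)+)")


def potential_headlines(title):
    """Return the full title plus each separator-delimited piece, longest first."""
    pieces = [p.strip() for p in SEPARATORS.split(title)]
    candidates = {title.strip(), *pieces}
    candidates.discard("")
    # Longest first is an assumed reading of "in order of the lengths".
    return sorted(candidates, key=lambda s: (-len(s), s))


def select_headline(candidates, body):
    """Return (headline, offset) for the first candidate found in body, else None."""
    for candidate in candidates:
        offset = body.find(candidate)
        if offset != -1:
            return candidate, offset
    return None


def extract_byline(title, body, max_distance=200):
    """Look for a date and an author name near the selected headline."""
    found = select_headline(potential_headlines(title), body)
    if found is None:
        return None
    headline, offset = found
    # Only text within max_distance characters of the headline is searched.
    start = max(0, offset - max_distance)
    end = offset + len(headline) + max_distance
    window = body[start:end]
    date = DATE_RE.search(window)
    name = NAME_RE.search(window)
    return {
        "headline": headline,
        "date": date.group(0) if date else None,
        "name": name.group(1) if name else None,
    }


title = "Example Times - City council approves budget"
body = (
    "Example Times Home News Sports\n"
    "City council approves budget\n"
    "By Jane Doe\n"
    "Oct. 25, 2005\n"
    "The council voted on the budget."
)
result = extract_byline(title, body)
```

Here the full title does not occur in the page text, so the next-longest candidate, "City council approves budget", is selected, and the name and date that follow it fall inside the search window. A real crawler would work on parsed HTML rather than flat text and would need far more robust date, name, and source patterns.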
REFERENCES:
patent: 6735586 (2004-05-01), Timmons
patent: 6836768 (2004-12-01), Hirsch
patent: 6924828 (2005-08-01), Hirsch
patent: 7152058 (2006-12-01), Shotton et al.
patent: 7240067 (2007-07-01), Timmons
patent: 7363294 (2008-04-01), Billsus et al.
patent: 2002/0099695 (2002-07-01), Abajian et al.
patent: 2002/0099696 (2002-07-01), Prince
patent: 2004/0111400 (2004-06-01), Chevalier
patent: 2005/0165789 (2005-07-01), Minton et al.
patent: 2008/0077582 (2008-03-01), Reed
patent: 2008/0148144 (2008-06-01), Tatsumi
patent: 200300673 (2004-10-01), None
Holovaty, Adrian. “Page titles on news article pages,” Holovaty.com, Oct. 25, 2002, http://www.holovaty.com/blog/archive/2002/10/25/1741.
Johansson, Roger. “Document titles and title separators,” Oct. 19, 2004, http://web.archive.org/web/20041113212131/www.456bereastreet.com/archive/200410/document_titles_and_title_separators/.
Hu, Yunhua. Xin, Guomao. Song, Ruihua. Hu, Guoping. Shi, Shuming. Cao, Yunbo. Li, Hang. “Title extraction from bodies of HTML documents and its application to web page retrieval,” Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, Aug. 15-19, 2005, Salvador, Brazil.
Paynter, Gordon W. “Developing practical automatic metadata assignment and evaluation tools for internet resources,” Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, Jun. 7-11, 2005, Denver, CO, USA.
Agyemang, Malik. Barker, Ken. Alhajj, Rada. “Mining Web Content Outliers using Structure Oriented Weighting Techniques and N-Grams,” Proceedings of the 2005 ACM symposium on Applied computing, 2005, Santa Fe, New Mexico.
Gustafson T. et al., “Agents in their Midst: Evaluating User Adaptation to Agent-Assisted Interfaces,” pp. 163-170 IUI 1998.
Feldman R., et al., “A Domain Independent Environment for Creating Information Extraction Modules,” pp. 586-588, CIKM'01, 2001.
Dill Stephen
Korupolu Madhukar R.
Tomkins Andrew S.
International Business Machines Corporation
Kneitel Justin
Shimokaji & Associates P.C.
Trujillo James K.
Method for automatically extracting by-line information