Reexamination Certificate
2010-12-15
2011-12-27
Yen, Eric (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S001000
active
08086442
ABSTRACT:
Input text may be broken into sentences, or other types of segments, by first detecting exceptions in the input text and then detecting break positions. Given a segment breaking scheme that comprises a set of break rules and a set of exceptions, one regular expression is created that represents the break rules, and another is created that represents the exceptions. The input text is analyzed to identify strings that match any exception, and the matching strings are replaced with placeholders that are not likely to occur naturally in the input. The resulting text, with substitutions, is then evaluated to find the positions that match the break rules. Those positions are declared to be segment breaks, and the placeholders are then replaced with the original strings. The result is the original text, with breaks assigned to the appropriate positions.
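The flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the exception list, the break rule, the function name segment, and the use of private-use code points U+E000 and U+E001 as placeholder delimiters are all assumptions made for the example.

```python
import re

# Hypothetical rule set; these patterns are illustrative and not taken from the patent.
EXCEPTIONS = re.compile(r"\b(?:Mr|Mrs|Dr|etc|e\.g|i\.e)\.")
BREAK_RULE = re.compile(r"(?<=[.!?])\s+")

# Assumed placeholder format: an index wrapped in two private-use code points,
# which are unlikely to occur naturally in ordinary input text.
PLACEHOLDER = re.compile("\uE000(\\d+)\uE001")


def segment(text):
    """Split text into segments while protecting exception matches."""
    saved = []

    def stash(match):
        # Remember the original string and emit an indexed placeholder.
        saved.append(match.group(0))
        return "\uE000{}\uE001".format(len(saved) - 1)

    # Step 1: substitute every exception match with a placeholder.
    masked = EXCEPTIONS.sub(stash, text)

    # Step 2: find break positions in the masked text and split there.
    pieces = BREAK_RULE.split(masked)

    # Step 3: put the original strings back into each segment.
    def restore(match):
        return saved[int(match.group(1))]

    return [PLACEHOLDER.sub(restore, piece) for piece in pieces if piece]


if __name__ == "__main__":
    for part in segment("Dr. Smith arrived. He met Mr. Jones today!"):
        print(part)
```

Applying the break rule directly to the same input would split after "Dr." and "Mr."; masking the exceptions first is what prevents those false breaks. Note that this sketch returns a list of segments and drops the whitespace at each break, whereas the abstract describes marking break positions within the original text.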
REFERENCES:
patent: 4994966 (1991-02-01), Hutchins
patent: 5379373 (1995-01-01), Hayashi et al.
patent: 5572732 (1996-11-01), Fant et al.
patent: 5802533 (1998-09-01), Walker
patent: 6188977 (2001-02-01), Hirota
patent: 6223150 (2001-04-01), Duan et al.
patent: 6574644 (2003-06-01), Hsu et al.
patent: 7523125 (2009-04-01), Zeng
patent: 7548848 (2009-06-01), Deb et al.
patent: 7668718 (2010-02-01), Kahn et al.
patent: 7827188 (2010-11-01), Howard et al.
patent: 7937344 (2011-05-01), Baum et al.
patent: 2004/0225999 (2004-11-01), Nuss
patent: 2005/0108001 (2005-05-01), Aarskog
patent: 2006/0149558 (2006-07-01), Kahn et al.
patent: 2007/0208755 (2007-09-01), Bhatkar et al.
Zydroń, Andrzej, “Reference Model for Open Architecture for XML Authoring and Localization Version 1.0”, Retrieved at << http://docs.oasis-open.org/oaxal/V1.0/cd01/oaxal-v1.0-cd01.html >>, Mar. 20, 2009, 27 pages.
“Man Pages Section 5: Standards, Environments, and Macros”, Retrieved at << http://docs.sun.com/app/docs/doc/816-5175/6mbba7evc >>, dated May 20, 2002, 6 pages.
Gintrowicz, et al., “Using Regular Expressions in Translation Memories”, Retrieved at << http://www.mt-archive.info/IMCSIT-2007-Gintrowicz.pdf >>, Proceedings of the International Multiconference on Computer Science and Information Technology, vol. 2, Oct. 15-17, 2007, pp. 87-92.
“Localization Definitions and Standards”, Retrieved at << http://www.sisulizer.com/localization/support/localization-glossary.shtml >>, Retrieved Date: Aug. 25, 2010, 8 pages.
“Segmentation Rules”, Retrieved at << http://www.alchemysoftware.ie/livedocs/publisher30/general_options/configuring_xml_segmentation_rules.htm >>, Retrieved Date: Aug. 25, 2010, 2 pages.
“Add/Edit Segmentation Rule”, Retrieved at << http://producthelp.sdl.com/SDL%20Trados%20Studio/client_en/Ref/A-G/AE_SegRul.htm >>, Retrieved Date: Aug. 25, 2010, 2 pages.
“Add/Edit Rule Exception”, Retrieved at << http://producthelp.sdl.com/SDL%20Trados%20Studio/client_en/Ref/A-G/AERulExc.htm >>, Retrieved Date: Aug. 25, 2010, 1 page.
Milkowski, et al., “Using SRX standard for sentence segmentation in LanguageTool,” 4th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Nov. 6-8, 2009, 5 pages, Poznań, Poland.
Michael Alan K.
Oh Beom Seok
Taylor Marcus A.
Uehara Shusuke
Wu Enyuan
Microsoft Corporation
Yen Eric