Efficient use of exceptions in text segmentation

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate


Details

C704S001000

Reexamination Certificate

active

08086442

ABSTRACT:
Input text may be broken into sentences, or other types of segments, by first detecting exceptions in the input text and then detecting break positions. Given a segment breaking scheme that comprises a set of break rules and a set of exceptions, one regular expression is created that represents the break rules, and another is created that represents the exceptions. The input text is analyzed to identify strings that match any exception, and the matching strings are replaced with placeholders that are not likely to occur naturally in the input. The resulting text, with substitutions, is then evaluated to find the positions that match the break rules. Those positions are declared to be segment breaks, and the placeholders are then replaced with the original strings. The result is the original text, with breaks assigned to the appropriate positions.
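The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the exception list, the single break rule, the choice of Unicode private-use characters as placeholder delimiters, and the function name `segment` are all assumptions made for illustration.

```python
import re

# Hypothetical segmentation scheme (illustrative assumptions, not from the patent).
EXCEPTIONS = [r"Mr\.", r"Dr\.", r"e\.g\.", r"U\.S\."]
BREAK_RULES = [r"(?<=[.!?])\s+"]  # break after sentence-final punctuation plus whitespace

# Placeholder delimiters: private-use code points, assumed not to occur in the input.
OPEN, CLOSE = "\uE000", "\uE001"


def segment(text, exceptions=EXCEPTIONS, break_rules=BREAK_RULES):
    """Split text into segments, shielding exception strings from the break rules."""
    # One combined regular expression for the exceptions, one for the break rules.
    exception_re = re.compile("|".join(f"(?:{e})" for e in exceptions))
    break_re = re.compile("|".join(f"(?:{b})" for b in break_rules))

    # Step 1: replace every exception match with an indexed placeholder.
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"{OPEN}{len(saved) - 1}{CLOSE}"

    masked = exception_re.sub(stash, text)

    # Step 2: find break positions in the masked text and cut there.
    pieces, start = [], 0
    for match in break_re.finditer(masked):
        pieces.append(masked[start:match.end()])
        start = match.end()
    if start < len(masked):
        pieces.append(masked[start:])

    # Step 3: restore the original strings in place of the placeholders.
    placeholder_re = re.compile(re.escape(OPEN) + r"(\d+)" + re.escape(CLOSE))
    return [
        placeholder_re.sub(lambda m: saved[int(m.group(1))], piece)
        for piece in pieces
    ]


if __name__ == "__main__":
    sample = "Dr. Smith met Mr. Jones. They talked, e.g. about the U.S. economy. It went well!"
    for part in segment(sample):
        print(repr(part))
```

Because each cut is made at the end of a break-rule match, joining the returned segments reproduces the input text exactly. A production version would also need to handle input that already contains the placeholder characters.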

REFERENCES:
patent: 4994966 (1991-02-01), Hutchins
patent: 5379373 (1995-01-01), Hayashi et al.
patent: 5572732 (1996-11-01), Fant et al.
patent: 5802533 (1998-09-01), Walker
patent: 6188977 (2001-02-01), Hirota
patent: 6223150 (2001-04-01), Duan et al.
patent: 6574644 (2003-06-01), Hsu et al.
patent: 7523125 (2009-04-01), Zeng
patent: 7548848 (2009-06-01), Deb et al.
patent: 7668718 (2010-02-01), Kahn et al.
patent: 7827188 (2010-11-01), Howard et al.
patent: 7937344 (2011-05-01), Baum et al.
patent: 2004/0225999 (2004-11-01), Nuss
patent: 2005/0108001 (2005-05-01), Aarskog
patent: 2006/0149558 (2006-07-01), Kahn et al.
patent: 2007/0208755 (2007-09-01), Bhatkar et al.
Zydroń, Andrzej, “Reference Model for Open Architecture for XML Authoring and Localization Version 1.0”, Retrieved at << http://docs.oasis-open.org/oaxal/V1.0/cd01/oaxal-v1.0-cd01.html >>, Mar. 20, 2009, 27 pages.
“Man Pages Section 5: Standards, Environments, and Macros”, Retrieved at << http://docs.sun.com/app/docs/doc/816-5175/6mbba7evc >>, dated May 20, 2002, 6 pages.
Gintrowicz, et al., “Using Regular Expressions in Translation Memories”, Retrieved at << http://www.mt-archive.info/IMCSIT-2007-Gintrowicz.pdf >>, Proceedings of the International Multiconference on Computer Science and Information Technology, vol. 2, Oct. 15-17, 2007, pp. 87-92.
“Localization Definitions and Standards”, Retrieved at << http://www.sisulizer.com/localization/support/localization-glossary.shtml >>, Retrieved Date: Aug. 25, 2010, 8 pages.
“Segmentation Rules”, Retrieved at << http://www.alchemysoftware.ie/livedocs/publisher30/general_options/configuring_xml_segmentation_rules.htm >>, Retrieved Date: Aug. 25, 2010, 2 pages.
“Add/Edit Segmentation Rule”, Retrieved at << http://producthelp.sdl.com/SDL%20Trados%20Studio/client_en/Ref/A-G/AE_SegRul.htm >>, Retrieved Date: Aug. 25, 2010, 2 pages.
“Add/Edit Rule Exception”, Retrieved at << http://producthelp.sdl.com/SDL%20Trados%20Studio/client_en/Ref/A-G/AERulExc.htm >>, Retrieved Date: Aug. 25, 2010, 1 page.
Milkowski, et al., “Using SRX standard for sentence segmentation in LanguageTool,” 4th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Nov. 6-8, 2009, 5 pages, Poznań, Poland.


Profile ID: LFUS-PAI-O-4302402
