Method of and system for splitting and/or merging content to...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C704S004000

active

06782384

ABSTRACT:

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
Not Applicable
REFERENCE TO MICROFICHE APPENDIX
Not Applicable
BACKGROUND OF THE INVENTION
The present invention relates to processing information content, and more particularly, to combining and/or separating segments of this content to simplify and otherwise facilitate translation and other processing functions associated with the content.
Over the past few decades, opportunities for international relationships have expanded at a staggering rate. Many factors have contributed to this expansion, including improved transportation capabilities, advances in communication and media technologies, and the opening of once inaccessible cultures. More recently, the Internet (the World Wide Web, in particular) has provided seemingly unlimited access to international audiences. The Internet represents a massive global business opportunity, and has provided the means for a wide range of businesses to deploy a multilingual and multicultural marketing presence, thereby increasing revenue, improving customer loyalty and reinforcing brand recognition.
As information becomes available globally, the role of translators has shifted away from simple transcription of text into a target language. Translators have always had to pay close attention to the attributes and linguistic idiosyncrasies of the target culture, and to understand and adapt to these differences. Now, however, translators must also ensure the timely deployment of the translated content to the designated site. Translation can be made more efficient with greater flexibility in software functionality and the ability to save previous translations for future use. Traditionally, translators worked with hard copy documents, from which they had the flexibility to translate content at any suitable level. Thus, translators had the ability to look at an entire document and translate it without confines. The increased need for efficient content translation has motivated numerous companies to develop tools that automate at least part of the translation process.
To increase the overall speed of content translation, tools have been developed to save translations in some type of memory (referred to herein as “translation memory” or “TM”), so that the tool can make automatic substitutions, and the translator will not have to consider further instances of those translations. The TM provides a record of pairs of units of translation that have already been translated. A “unit of translation” is a segment of content that has been delineated by any of several criteria, as is discussed in more detail herein. Each associated pair in the TM includes a unit of translation from the content in the source language (i.e., the language of the content that is to be translated), and the corresponding translation unit from content in the target language (i.e., the language into which the source content is being translated). In order to populate the TM, prior art translation methods segment content into sentences (or other syntactic units, e.g., words, phrases, etc.) based on predetermined criteria so that the translator can focus on translating one sentence (or other syntactic unit) at a time.
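The pairing described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `TranslationMemory`, its methods, and the `pretranslate` helper are hypothetical, and the lookup is an exact string match only, which is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TranslationMemory:
    """Hypothetical TM: maps a source-language unit to its target-language unit."""
    source_lang: str
    target_lang: str
    _pairs: Dict[str, str] = field(default_factory=dict)

    def store(self, source_unit: str, target_unit: str) -> None:
        # Record one associated pair of units of translation.
        self._pairs[source_unit.strip()] = target_unit.strip()

    def lookup(self, source_unit: str) -> Optional[str]:
        # Exact-match lookup only (an assumption of this sketch).
        return self._pairs.get(source_unit.strip())


def pretranslate(units: List[str], tm: TranslationMemory) -> List[Optional[str]]:
    # Substitute stored translations; None marks units the translator must still handle.
    return [tm.lookup(unit) for unit in units]
```

A unit that has been stored once is substituted automatically on later encounters, while an unseen unit comes back as `None` and is left for the translator.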
However, differences between the source language and the target language create difficulties in translating directly from one language to another within the constraints of the particular segments chosen. Such differences may include, but are not limited to, differences in grammatical structure, differences in idiomatic expressions, and punctuation differences. Further, segments that are spatially adjacent in the source document may not necessarily be best suited as adjacent in the target content. Content generally cannot be translated word for word, sentence for sentence, or paragraph for paragraph, because of these language differences. Another consideration is that competent, efficient translation is typically not deterministic. For example, three translators operating on the same content may well produce three different translations, each of which would be technically correct. Any type of segmentation tool that segments the content based on a rigid set of criteria will force a translator to translate the content strictly one segment at a time (word for word, sentence for sentence, and so on).
Flexibility in content segmentation is important because translators must be able to account for the differences in language structures. For instance, translating content sentence by sentence may populate a translation memory with more specific entries. Storing more specific entries in translation memory is useful because doing so increases the likelihood that future translation instances will make use of those entries. However, as described herein, a sentence-to-sentence translation may not be accurate, depending on the languages being used in the translation. For example, the following sentences in Italian:
Per quanto riguarda la Banca Centrale Europea, un euro debole può essere un problema soltanto se aumenta l'inflazione. Però, a 2.3%, l'inflazione nella zona euro è ancora abbastanza modesta.
would be translated as a single sentence in English:
Yet as far as the ECB is concerned, a weak euro is only really a problem if it pushes up inflation; and at 2.3%, inflation in the euro zone is still rather modest. (The Economist, Sep. 23-29, 2000, p. 89)
On the other hand, although translating an entire paragraph as a unit may be more accurate, it can be inefficient for translators because doing so will populate the translation memory with entries that are unlikely to be used again.
An additional problem with content segmentation is determining the sentence boundaries. Typically, a period denotes a sentence end. Yet, if a word within a sentence is abbreviated and uses a period (e.g., “Mr.”), the period following the abbreviation could be interpreted as a sentence end and the sentence would thus be segmented at that point. Moreover, some languages, such as Thai, do not use period punctuation at all.
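The abbreviation problem described above can be reproduced in a few lines of Python. This is a sketch under stated assumptions, not a method from the patent: the function names and the `ABBREVIATIONS` set are hypothetical, and the list is illustrative rather than complete.

```python
import re
from typing import List

# Illustrative abbreviation list (an assumption); a real tool would need one per language.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "e.g.", "i.e.", "etc."}


def split_naive(text: str) -> List[str]:
    # Treat every period followed by whitespace as a sentence end.
    return [s for s in re.split(r"(?<=\.)\s+", text.strip()) if s]


def split_with_abbreviations(text: str) -> List[str]:
    sentences: List[str] = []
    current: List[str] = []
    for token in text.split():
        current.append(token)
        # Close a sentence only on a period that is not part of a known abbreviation.
        if token.endswith(".") and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:
        # Keep any trailing text that lacks a final period.
        sentences.append(" ".join(current))
    return sentences
```

For the input "Mr. Smith arrived. He sat down.", the naive splitter yields three segments because it breaks after "Mr.", while the abbreviation-aware version yields two. The second function still fails when an abbreviation such as "etc." actually ends a sentence, and neither helps with languages that do not mark sentence ends with periods.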
It is an object of the present invention to substantially overcome the above-identified disadvantages and drawbacks of the prior art.
SUMMARY OF THE INVENTION
The present invention provides a method of and system for splitting and merging blocks of information content (e.g., textual blocks) so as to simplify and expedite a translator's task in converting content from one language to another. The method and system of the present invention are referred to herein, in general, as “Split/Merge.” The textual information to be translated from one language to another is referred to herein as “content.” The Split/Merge method and system allow a user (i.e., a translator) to decide, in real time, the level at which he or she wishes to translate content. The translator has the ability to “split” a paragraph into separate sentences, allowing for individual translation of each sentence. Thus, the translation memory contains entries at the sentence level, which are more likely to be repeated than entire paragraphs. In addition, the translator can “merge” selected sentences together to form a single segment for translation. Furthermore, the translator can “merge” all sentences of a paragraph into a single textual “chunk,” as well as merge all of the paragraphs into a larger textual “chunk.” This split/merge functionality provides flexibility for source material that is not suitable for sentence-by-sentence translation.
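The split and merge operations summarized above can be sketched as two list transformations. This is only an illustration under stated assumptions: the function names are hypothetical, segments are modeled as plain strings, and the period-based `split_on_periods` helper is a deliberately simple stand-in for whatever segmentation rules an actual system would apply.

```python
import re
from typing import Callable, List


def split_on_periods(text: str) -> List[str]:
    # Simplistic splitter (an assumption): break after each period followed by whitespace.
    return [s for s in re.split(r"(?<=\.)\s+", text.strip()) if s]


def split_segment(segments: List[str], index: int,
                  splitter: Callable[[str], List[str]]) -> List[str]:
    # Replace one segment with the smaller pieces produced by `splitter`.
    pieces = splitter(segments[index])
    return segments[:index] + pieces + segments[index + 1:]


def merge_segments(segments: List[str], start: int, end: int) -> List[str]:
    # Join segments[start..end] (inclusive) into a single unit of translation.
    if not 0 <= start <= end < len(segments):
        raise IndexError("merge range out of bounds")
    merged = " ".join(segments[start:end + 1])
    return segments[:start] + [merged] + segments[end + 1:]
```

Starting from a one-paragraph list, `split_segment` produces sentence-level segments, and `merge_segments` can then rejoin any adjacent run of them, up to and including the whole paragraph. Both functions return new lists rather than mutating their input, so the translator's earlier segmentation is still available if the choice is reversed.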
The utility of the Split/Merge invention may be exploited in a translation system such as Idiom's WorldServer. In general, WorldServer is a Web-based application that enables enterprises to manage their content while leveraging established Web architecture, content management and workflow systems. A translator uses WorldServer to determine what content he or she needs to translate. The translator can either export the content needing translation to a third party editing tool, or use the Translation Workbench to perform the actual translation. A translator can be an individual contributor, including users that are adapting but not translating content and reviewers who r

