Reexamination Certificate
1999-08-06
2001-04-17
Thomas, Joseph (Department: 2747)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S001000, C707S793000
active
06219633
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to analogically similar word production apparatus and method for producing or generating an analogically similar word analogized from three inputted unit strings. More particularly, the invention relates to analogically similar word production apparatus and method for producing or generating a unit string composed of a plurality of units whose property or attribute is analogically similar in a predetermined analogically similar relation to three inputted unit strings given in a predetermined order. In this case, the unit is a character, an alphabetical letter, a word or the like.
2. Description of the Prior Art
Conventionally, techniques such as finite state automata have been used to produce a new word which is morphologically related to another word. For example, Prior Art Document 1, Kimmo Koskenniemi, “Two-Level Morphology: A General Computational Model for Word-Form Recognition and Production”, Department of General Linguistics, University of Helsinki, 1983, has proposed a method for producing a word form from a word and from attributes which specify the form to be produced (hereinafter, referred to as a first prior art).
For example, production of a letter string “unlike” from a letter string “like” and an attribute or property “antonym” is considered. According to the first prior art, if the attribute “antonym” is given to a finite state automaton, and if this finite state automaton converts the attribute into the task of inserting a prefix, namely “un”, before the word to which the attribute has been given, then the letter string “like” can be transformed into the letter string “unlike” with a prefix “un” inserted, and therefore the antonym of the letter string “like” can be computed as “unlike”. Similarly, the antonym of a letter string “known” can also be computed using the same method, and therefore the antonym of the letter string “known” can be computed as “unknown”.
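The attribute-driven production described above can be illustrated with a minimal sketch. The table `PREFIX_RULES` and the function `produce_word` are hypothetical names introduced only for this illustration; an actual two-level system would compile linguist-written rules into finite state machinery rather than use a lookup table.

```python
# Hypothetical attribute-to-prefix table (an assumption for illustration only).
# It stands in for the linguistic description that must be prepared in advance.
PREFIX_RULES = {"antonym": "un"}


def produce_word(word: str, attribute: str) -> str:
    """Return the word form obtained by inserting the prefix tied to `attribute`."""
    prefix = PREFIX_RULES[attribute]  # raises KeyError for unregistered attributes
    return prefix + word


print(produce_word("like", "antonym"))   # unlike
print(produce_word("known", "antonym"))  # unknown
```

The sketch makes the dependency visible: the word alone is not enough, since the attribute and its registered prefix must both be supplied.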
In the first prior art, producing a new letter string therefore requires that attributes be inputted at the same time the letter string is given to the finite state automaton. By executing the finite state automaton, a word which is similar to the inputted word but which differs therefrom according to the inputted attributes can be obtained. In particular, because the whole process is implemented by the finite state automaton, the first prior art method has an advantage of operating at relatively high speed.
Likewise, Prior Art Document 2, “Unix user commands, sed—stream editor”, has proposed a method for replacing a letter string by another letter string by means of description of regular expressions (hereinafter, referred to as a second prior art).
For example, replacement of a letter string “miracle” by a letter string “miraculous” is considered. According to the second prior art, if the letter string “miracle” and the letter string “miraculous” are given to a finite state automaton, and if this finite state automaton first recognizes the letter string “miracle”, delimits its boundaries in a letter stream, and replaces the letter string “miracle” by the letter string “miraculous” within the boundaries, then the letter string “miracle”, when recognized in the letter stream, can be replaced by the letter string “miraculous”. For instance, in a letter stream “it was a miracle and fable healing”, the letter string “miracle” can be replaced by the letter string “miraculous”. Therefore, by executing the method of the second prior art, the replacement of a letter string by another letter string can be achieved.
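The replacement described above corresponds roughly to a sed substitution such as `s/miracle/miraculous/`. The following sketch reproduces the same behavior with Python's standard `re` module; the function name `replace_string` is a hypothetical helper, and the use of word boundaries is an assumption about how the boundaries of the letter string are delimited.

```python
import re


def replace_string(stream: str, old: str, new: str) -> str:
    """Replace every whole-word occurrence of `old` in `stream` by `new`."""
    # re.escape treats `old` as a literal string; \b delimits its boundaries.
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function replacement avoids interpreting backslashes in `new`.
    return re.sub(pattern, lambda match: new, stream)


stream = "it was a miracle and fable healing"
print(replace_string(stream, "miracle", "miraculous"))
# it was a miraculous and fable healing
```

Only the recognized letter string is touched; the string “fable” in the same stream is left unchanged.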
However, since the first prior art method produces a word form by using a finite state automaton, the method involves preparatory registration of all the possible letter strings that can be added as prefixes, suffixes or infixes with respect to an object language. Hence, the first prior art requires a linguistic description of the object language which is not immediate. Therefore, the first prior art requires the involvement of specialized workers to establish the non-immediate linguistic description of an object language.
Also, since the second prior art method is based on a finite state automaton, the replacement of, for example, a letter string “fable” by a letter string “fabulous” is impossible even if the letter strings “miracle” and “miraculous” are given to the finite state automaton. However, the letter string “fabulous” is in the same relation to the letter string “fable” as the letter string “miraculous” is to the letter string “miracle”. Therefore, the second prior art cannot perform the replacement of analogically similar words.
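The relation “miracle” is to “miraculous” as “fable” is to “fabulous” can be made concrete with a deliberately naive sketch. It is not the method of the present invention: it merely transfers a suffix change from one pair of strings onto a third string, and the function name `naive_analogy` is a hypothetical name chosen for this illustration.

```python
import os
from typing import Optional


def naive_analogy(a: str, b: str, c: str) -> Optional[str]:
    """Solve a : b :: c : d by transferring the suffix change a -> b onto c.

    Returns None when c does not end with the suffix that a loses.
    """
    stem = os.path.commonprefix([a, b])      # character-wise common prefix
    old_suffix = a[len(stem):]               # part of a that is given up
    new_suffix = b[len(stem):]               # part of b that replaces it
    if not c.endswith(old_suffix):
        return None
    return c[:len(c) - len(old_suffix)] + new_suffix


print(naive_analogy("miracle", "miraculous", "fable"))  # fabulous
print(naive_analogy("like", "unlike", "known"))         # None
```

The second call shows the limit of the naive approach: a change at the beginning of the string, as in “like” and “unlike”, is not captured by suffix transfer alone.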
SUMMARY OF THE INVENTION
An essential object of the present invention is therefore to provide analogically similar word production apparatus and method capable of solving the aforementioned problems and of producing, based on three inputted unit strings, an analogically similar word which is a unit string analogically similar to and other than the three inputted unit strings, without using attributes and without using any finite state automaton, at higher speed than the prior art.
In order to achieve the aforementioned objective, according to one aspect of the present invention, there is provided an analogically similar word production apparatus (100) for, based on first, second and third inputted unit strings which are inputted in a predetermined order, producing an analogically similar word having properties analogically similar in a predetermined analogically similar relation to the first to third unit strings, comprising:
matrix storage means (10) for storing a plurality of elements of a first limited pseudo-distance matrix, and a plurality of elements of a second limited pseudo-distance matrix,
a number of units to be deleted or replaced toward another unit string from one unit string being expressed by a pseudo-distance,
said plurality of elements of said first limited pseudo-distance matrix being computed at locations of a part of elements of a first pseudo-distance matrix presenting pseudo-distances between partial strings of the first inputted unit string from its beginning to its end and partial strings of the second inputted unit string from its beginning to its end, said plurality of elements of said first limited pseudo-distance matrix including a diagonal band composed of diagonal elements having a predetermined width in said first pseudo-distance matrix, and including an extra band composed of elements having a predetermined further width in said first pseudo-distance matrix which are positioned outside of said diagonal band, so as to include information sufficient for computation of limited pseudo distances between the first inputted unit string and the second inputted unit string,
said plurality of elements of said second limited pseudo-distance matrix being computed at locations of a part of elements of a second pseudo-distance matrix presenting pseudo-distances between partial strings of the first inputted unit string from its beginning to its end and partial strings of the third inputted unit string from its beginning to its end, said plurality of elements of said second limited pseudo-distance matrix including a diagonal band composed of diagonal elements having a predetermined width in said second pseudo-distance matrix, and including an extra band composed of elements having a predetermined further width in said second pseudo-distance matrix which are positioned outside of said diagonal band, so as to include information sufficient for computation of limited pseudo distances between the first inputted unit string and the third inputted unit string;
preprocessing means (2, S2) for analyzing the three inputted unit strings, computing the elements of the limited first and second pseudo-distance matrices, and storing computed elements into said matrix storage means (10);
parameter storage means (51) for storing therein a status parameter (com) which is a parameter for judging whether or not fo
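The pseudo-distance matrices recited above can be sketched under stated assumptions. The text defines a pseudo-distance as the number of units to be deleted or replaced when going from one unit string toward another, which this sketch reads as: deletion and replacement cost 1, insertion costs 0. The function name `pseudo_distance_matrix` and the single `band` parameter are hypothetical; the apparatus described above uses a diagonal band together with an extra band, and that extra band is not modeled here.

```python
import math
from typing import List, Optional


def pseudo_distance_matrix(a: str, b: str, band: Optional[int] = None) -> List[List[float]]:
    """Pseudo-distances between all prefixes of a (rows) and of b (columns).

    Assumed costs: delete a unit of a = 1, replace a unit = 1, insert a unit = 0.
    If `band` is given, only cells with |i - j| <= band are computed; the
    remaining cells stay at infinity (a simplified, single-band limitation).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[math.inf] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if band is not None and abs(i - j) > band:
                continue                      # cell lies outside the band
            if i == 0:
                d[i][j] = 0                   # inserting units is free
            elif j == 0:
                d[i][j] = i                   # every unit of a[:i] is deleted
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # delete a[i-1]
                              d[i][j - 1],             # insert b[j-1] at no cost
                              d[i - 1][j - 1] + cost)  # keep or replace
    return d


print(pseudo_distance_matrix("like", "unlike")[-1][-1])  # 0
print(pseudo_distance_matrix("unlike", "like")[-1][-1])  # 2
```

Under these assumed costs the measure is not symmetric, which is consistent with calling it a pseudo-distance: going from “like” to “unlike” needs only insertions, whereas going from “unlike” to “like” requires two deletions.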
ATR Interpreting Telecommunications Research Laboratories
Thomas Joseph