Document retrieving apparatus and document retrieving method

Data processing: speech signal processing – linguistics – language – Speech signal processing – Recognition

Reexamination Certificate

Details

C704S009000, C704S270000, C704S275000, C707S793000

active

06622122

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to an apparatus and a method for retrieving documents based on voice or the like. More specifically, the present invention provides a document retrieving apparatus and a document retrieving method capable of assuring an effective and reliable document search which is not adversely influenced by the sentence recognition accuracy in the voice-based document retrieving operation.
A representative conventional voice-based document retrieving apparatus/method is one which combines voice or speech recognition with whole sentence retrieval.
FIG. 43 shows a conventional voice-based document retrieving apparatus. The conventional voice-based document retrieving apparatus shown in FIG. 43 comprises an audio input section 4301 which converts a sound or voice, such as a user's utterance, into an electric signal. A sentence recognizing section 4302 receives the electric signal from the audio input section 4301 and recognizes the sound as a sentence. A retrieval condition producing section 4303 produces retrieval conditions for retrieving documents based on the sentence recognized by the sentence recognizing section 4302. A document storing section 4304 stores documents to be retrieved. A document retrieving section 4305 retrieves the documents stored in the document storing section 4304 based on the retrieval conditions produced by the retrieval condition producing section 4303. Finally, an information output section 4306 outputs the result of the document search performed by the document retrieving section 4305.
FIG. 44 is a flowchart showing the document retrieving operation performed in the above-described conventional document retrieving apparatus. First, in the flowchart shown in FIG. 44, the audio input section 4301 converts the user's utterance into an electric signal (step 4401).
Next, the sentence recognizing section 4302 analyzes the electric signal of the user's voice or speech as a character pattern signal and recognizes a sentence based on the analyzed character patterns (step 4402).
The retrieval condition producing section 4303 produces the retrieval conditions for retrieving documents based on the sentence recognized by the sentence recognizing section 4302 (step 4403).
The document retrieving section 4305 retrieves the documents (i.e., retrieval objects) stored in the document storing section 4304 based on the retrieval conditions produced by the retrieval condition producing section 4303 (step 4404).
The information output section 4306 informs an outside device or person, such as the user, of the result of the document search performed by the document retrieving section 4305 (step 4405).
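The flow of steps 4402 to 4404 can be sketched in Python as follows. This is a minimal illustration only, not the implementation of the described apparatus: the function names, the use of a plain string in place of the electric signal, the whitespace-based stand-in for sentence recognition, and the two sample documents are all assumptions introduced here.

```python
def recognize_sentence(signal):
    """Stand-in for the sentence recognizing section 4302 (step 4402).

    Assumption: the "signal" is already text, so recognition is reduced
    to normalizing case and whitespace. A real recognizer decodes audio.
    """
    return " ".join(signal.lower().split())


def produce_conditions(sentence):
    """Stand-in for the retrieval condition producing section 4303 (step 4403).

    Assumption: every word of the single recognized sentence becomes a
    required search term.
    """
    return sentence.split()


def retrieve(conditions, documents):
    """Stand-in for the document retrieving section 4305 (step 4404).

    Returns the ids of documents that contain every required term.
    """
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        if all(term in words for term in conditions):
            hits.append(doc_id)
    return sorted(hits)


def conventional_search(signal, documents):
    """Chain the three stages; only one sentence ever reaches retrieval."""
    sentence = recognize_sentence(signal)
    conditions = produce_conditions(sentence)
    return retrieve(conditions, documents)


# Hypothetical contents of the document storing section 4304.
DOCUMENTS = {
    "doc1": "sannin de ryokoo",
    "doc2": "san'in e ryokoo",
}

print(conventional_search("sannin ryokoo", DOCUMENTS))
print(conventional_search("san'in ryokoo", DOCUMENTS))
```

Because each stage consumes only the output of the previous one, whatever sentence the recognition stage returns fully determines the retrieval conditions.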
As apparent from the foregoing description, the above-described conventional document retrieving apparatus/method recognizes the voice as a sentence, produces the retrieval conditions based on the recognized sentence, and retrieves the documents (i.e., retrieval objects) based on the produced retrieval conditions, thereby accomplishing the voice-based document retrieval operation.
However, the following problem arises in the above-described conventional document retrieving apparatus/method. In general, voice or speech recognition is subject to severe input conditions, including uncertainty in the user's utterance, the limited performance reliability of the voice input device, and the inclusion of noise. Thus, there is the possibility that the converted electric signal of the input voice may comprise a strange word (or character) not contained in the original voice or speech but similar to a word (or character) actually contained in the original voice or speech.
Accordingly, because of the inclusion of such strange words not contained in the original voice or speech, the above-described conventional document retrieving apparatus/method may erroneously recognize such strange words as candidate words constituting the sentence corresponding to the input voice or speech. In some cases, strange or erroneous words of this kind have a higher likelihood than the corresponding true or genuine words actually contained in the original voice or speech.
FIG. 45 is a sample explaining the voice or speech recognition performed by the above-described conventional document retrieving apparatus/method.
In FIG. 45, someone speaks “san-in e ryokoo shitain desuga”, the sound of which is entered into the audio input section 4301. In this case, the audio input section 4301 may erroneously convert the input sound into an electric signal representing a phonemic string of “sanninderyokooshitaiindesuga.” Namely, “sannin”/“san'in”, “de”, “ryokoo”, “shita”, “iin”, and “desuga” are recognized as candidate words for constituting the sentence. The notation “sannin”/“san'in” indicates that the word “sannin (three persons)” has a higher likelihood than the word “san'in (San-in area).” Thus, “sannin” is ranked higher.
The above conventional voice-based document retrieving apparatus/method, however, constructs only one sentence based on the recognized candidate words in compliance with its own standards for the sentence recognition. In this case, the actually spoken word “san'in (San-in area)” will be deleted or dropped due to its lower likelihood, even though it is the true or genuine word actually contained in the original utterance.
According to the example shown in FIG. 45, the sentence “sannin de ryokoo shita iin desuga” is finally recognized. The actually spoken word “san'in (San-in area)” disappeared from the resultant sentence, because the word “san'in (San-in area)” has a lower likelihood than that of the word “sannin (three persons).” Accordingly, “san'in (San-in area)” is no longer involved in the document retrieval conditions produced by the retrieval condition producing section 4303. Instead, the resultant sentence comprises some strange (error) words, such as “sannin (three persons)” and “iin (doctor's office)” etc. Therefore, in the step 4404, the document retrieval operation is improperly performed based on the wrong sentence having a different meaning not corresponding to the original voice or speech.
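The loss of the spoken word in the FIG. 45 example can be illustrated with a short Python sketch. The likelihood values below are invented for illustration; the text states only that “sannin” has a higher likelihood than “san'in”. Selecting the single most likely candidate per position is likewise an assumed simplification of the sentence recognition standards of the conventional apparatus.

```python
# Candidate words per position in the utterance, each with a likelihood.
# The numeric scores are hypothetical; only the ordering
# "sannin" > "san'in" comes from the FIG. 45 example.
CANDIDATES = [
    [("sannin", 0.6), ("san'in", 0.4)],
    [("de", 1.0)],
    [("ryokoo", 1.0)],
    [("shita", 1.0)],
    [("iin", 1.0)],
    [("desuga", 1.0)],
]


def best_sentence(slots):
    """Keep only the highest-likelihood candidate at each position."""
    return [max(slot, key=lambda cand: cand[1])[0] for slot in slots]


def dropped_words(slots):
    """List the candidates that never reach the retrieval conditions."""
    chosen = set(best_sentence(slots))
    return [word for slot in slots for word, _ in slot if word not in chosen]


print(" ".join(best_sentence(CANDIDATES)))
print(dropped_words(CANDIDATES))
```

In this sketch the actually spoken word “san'in” is the only dropped candidate, so no retrieval condition built from the single recognized sentence can contain it.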
As described above, there is the problem that the above-described conventional document retrieving apparatus/method may delete or drop the actually spoken word in the sentence recognition and therefore produce wrong retrieval conditions. Thus, it becomes impossible to successfully perform the document retrieval operation.
Furthermore, to realize highly accurate sentence recognition for general sentences of natural language, the above conventional voice-based document retrieving apparatus/method requires a huge amount of general language data covering the various vocabulary and sentence patterns in normal use, so that the sentence recognition can be performed with reference to these language data. Thus, a tremendous cost is required for collecting or establishing such a huge language database.
SUMMARY OF THE INVENTION
In view of the above, the present invention has an object to provide a document retrieving apparatus and a document retrieving method capable of assuring an effective and reliable document search which is not adversely influenced by the sentence recognition accuracy in the voice-based document retrieval operation.
Furthermore, another object of the present invention is to provide a document retrieving apparatus and a document retrieving method capable of suppressing the cost in collecting or establishing a necessary language data base for the voice-based document retrieval operation.
In order to accomplish this and other related objects, a first aspect of the present invention provides a document retrieving apparatus for performing a document search based on sound including voice. The document retrieving apparatus of the first aspect comprises an audio input means for converting a sound into an electric signal and generating character pattern data. A language model storing means
