Media presentation system controlled by voice to text commands

Data processing: speech signal processing – linguistics – language – Speech signal processing – Application

Reexamination Certificate

C345S215000

active

06718308

ABSTRACT:

BACKGROUND
1. Field of the Invention
The present invention relates to the manipulation, navigation, and assemblage of multimedia using voice commands to allow for remote assembly and display of images during projected presentations, by combining pre-existing computer programs with original computer programs and electronic circuitry. In particular, a method and apparatus as further described allows voice commands to operate impromptu juxtapositioning and displaying of various forms of data onto a presentation screen. The display can either be immediately recorded for a subsequent presentation or incorporated into an active presentation of movies, still images, text, and sound. Submitted utterances go through a series of conversional and identification filters (based on pre-set user preferences), which automatically search, scrutinize, and capture from pre-selected databases (local or remote), on-line commercial media vendors, and/or the World Wide Web (WWW). The speaker then sees instantaneous results and can either submit those results to a large display, modify the search, or juxtapose the results to fit desired projected output.
2. Description of the Related Art
Known in the art is the use of voice commands for voice-to-text conversion of language utterances with preferred words stored in a modifiable word bank. A voice recognition module working in conjunction with a computer and utilizing a microphone as an input device allows for the display of translated language on the computer screen. See U.S. Pat. No. 4,984,177 (Rondel et al.).
More recently, the use of voice recognition has been implemented for the navigation of application programs being utilized by a single operating system. As seen, for example, in U.S. Pat. No. 5,890,122 to Van Kleeck et al., a method and system is described for an application program to receive voice commands as an input facility for the navigation and display of drop-down menus listing available commands for the application. The available commands can be modified as preferred by the user allowing the list to be made variable.
The navigation through applications utilizing “windows”, or graphical interfaces occupying a portion of a screen that can contain its own document, is also demonstrated by U.S. Pat. No. 5,974,384 to Yasuda. Again, these systems demonstrate the use of a voice recognition module accompanying a computer system employing a particular operating system. The software or hardware works in close relationship with the operating system, allowing the voice recognition process in the system to provide a signal resulting from the executed voice input. What is desired, however, is not just a means of navigating a single application on a computer using voice command utilities, but a unit that allows access to separate databases to provide hands-free navigation through variable output facilities.
The navigation of displays outside the field of text and graphical user interfaces using voice technology is evident in its implementation for the World Wide Web (WWW). In U.S. Pat. No. 5,890,123 to Brown et al., a system and method is disclosed for a voice controlled video screen display wherein the data links of a web page are utilized by speech recognition, thereby allowing the navigation of the WWW by voice command. The software program in this application is a web browser. Though a web browser may be utilized by a variety of operating systems, the retrievable displays are made accessible only through entry into a global network, and the “hands-free” navigation can only be accomplished through the displayed links particular to the web page. The present invention demonstrates the assembly, manipulation, and navigation of digital displays beyond those simply comprising displays produced by Hypertext Markup Language (HTML) on the World Wide Web. This system and method will also teach not only the navigation of text and graphical interfaces, but also the manipulation and assembling of various types of on-screen digital displays and multimedia, retrievable and searchable from variable databases.
Multimedia as it is used for presentation purposes covers a wide range of displays, both audio and visual. A cohesive organization of these displays is paramount when presenting the images on a screen. There are currently graphics and recording programs that can provide voice-command manipulation of images. However, no graphics program allows easy and precise manipulation by non-graphics experts and voice-operated systems, to the benefit of the product as a whole, without restructuring it to accommodate the varying degrees of inputs and outputs. The art of “hands-free” manipulation of digital images, such as still pictures or movies, and sound objects is limited. U.S. Pat. No. 5,933,807 to Fukuzawa shows how a displayed picture can be manipulated by a generated sound signal. A major limitation of that prior art is that the arrangement and display of the images occurs within a single screen.
One example of the need for the present system and method is a doctor in an operating room who, while conducting a complex procedure, requires a recent X-ray that can be called up immediately. Another is an auto mechanic who, while following a procedure from images in an on-line manual, requires an immediate visual comparison to an older part and needs to perform this action without taking his/her hands off the tool being held in place. Lastly, there may be envisioned a speaker who, during a business presentation, impresses the clientele by visually addressing tough questions with a simple vocal query through a pre-constructed local database.
Thus, there is a need for a system that provides “hands-free” navigation, manipulation, and assembly of a variety of multimedia, which is accomplished remotely for the purposes of presentation on various screens. The present invention can assemble searched text and images from variable databases, and allow a user to record, juxtapose, and manipulate image displays either impromptu or pre-planned.
SUMMARY OF THE INVENTION
Combined with external and internal computer components, and internal software programs, this system comprises a unit that enables any user to vocally assemble and display (individually or in a series) still images, movie-clips, feature-length movies, feature-length audio presentations, short audio-clips, text, or any combination of the aforementioned without regard to media type. The system can also, through verbal command, “free-float” the placement of any visual image, including any non-rectangular forms, transparent images, or text, onto a background image or similar image that concurrently becomes the background image. Lengthy presentations (still-image, movie, or audio, or a combination) can also be automatically re-configured to fit into pre-assigned time frames that previously would have had to be disrupted due to sporadic pauses caused by the typical external human intervention. The option also exists to record, with or without the original verbatim search query, into video/audio playback and/or computer-readable text for future reference.
An input voice utterance is converted by voice recognition means into computer-readable text in the form of a command. The commands are categorized as “search”, “manipulation”, and “navigation”, each of which comprises other commands that are triggered in succession by directionals, which are a means for triggering the commands separately or simultaneously.
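The categorization step described above can be illustrated with a minimal sketch. It assumes the utterance has already been converted to text by a voice recognition module; the keyword lists, the `Command` type, and the `categorize` function are hypothetical names introduced here for illustration and do not come from the patent, which names only the three categories.

```python
from dataclasses import dataclass

# Hypothetical trigger words; the source names only the three
# categories ("search", "manipulation", "navigation").
CATEGORY_KEYWORDS = {
    "search": ("find", "search", "look up"),
    "manipulation": ("move", "resize", "overlay", "place"),
    "navigation": ("next", "previous", "back", "show"),
}


@dataclass
class Command:
    category: str  # one of the three categories, or "unknown"
    text: str      # normalized command text


def categorize(utterance_text: str) -> Command:
    """Assign already-recognized text to a command category."""
    # Lowercase and collapse whitespace so matching is predictable.
    normalized = " ".join(utterance_text.lower().split())
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            # Match the keyword only as the leading word or phrase.
            if normalized == keyword or normalized.startswith(keyword + " "):
                return Command(category, normalized)
    return Command("unknown", normalized)


if __name__ == "__main__":
    for spoken in ("Find chest X-ray", "Move logo left", "next"):
        print(categorize(spoken))
```

A real system would likely rely on the recognizer's own grammar rather than leading-keyword matching; the sketch only shows how converted text might be routed to one of the three command families before any directionals are triggered.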
The means necessary for performing and directing the commands include a series of EDSP filters, which are search and image capturing commands that contain therein a series of “media reader” directionals. The EDSP filters are used as a means for taking the converted search command, identifying relevant database(s), committing the search, and retrieving results by conducting multiple-page “plane” searches for compatible media. The filtering means then transfers matching data into a means for juxtapositioning or displaying search
