Reexamination Certificate
1999-02-08
2002-09-03
Breene, John (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
C704S009000
active
06446081
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a method and apparatus for the input of data into computers and, in some embodiments, to subsequent retrieval thereof. Particularly but not exclusively, in one embodiment the invention relates to the input of data to, and data retrieval from, a database, and in another to the input of data defining a specification for a computer program.
The problem of providing communication between humans and computers has occupied those in the fields of computing hardware and software since the birth of computing. For decades, the goal has been to provide computers which can communicate with a human “naturally” by understanding free-form speech or text input. However, despite continued progress, this goal has not been reached yet.
Human-computer interaction is used for many things. For example, it is used to input immediate instructions for action by a computer (which is at present mainly provided by the combination of a cursor control device such as a mouse and icons displayed on the screen, or by the use of menus). It is also used to input instructions for subsequent execution (which is mainly achieved at present by forcing human beings to use tightly constructed programming languages or descriptive languages which, despite their superficial resemblance to human languages, bear little relationship to the way human beings actually communicate). Finally, it is used for data storage and retrieval (which is at present typically performed by storage of a textual document, and retrieval by searching for the occurrence of character strings within the document).
Those skilled in the art have approached this problem by the development of artificial intelligence techniques, with the aim either of providing a sufficiently comprehensive set of rules that a machine can eventually understand natural language input, or of providing a “self learning” machine capable of developing the same ability by repeated exposure to natural language.
SUMMARY OF THE INVENTION
In one aspect the present invention seeks to address the same technical problem, but from a different direction. In the present invention, an input document (which may be spoken or in text form, or indeed in any other form representative of natural language) is input to an input apparatus (which may be provided by a general purpose computer) and is analysed, to separate the meaningful concepts within the document and record these together with their inter-relationship. The present invention has this in common with most attempted artificial intelligence systems.
For example, EP-A-0118187 discloses a natural language input system which is menu driven, allowing a user to select one word at a time from a menu, which prompts the next possible choices based on what has previously been input.
U.S. Pat. No. 5,677,835 discloses an authoring system in which documents to be translated are input and analysed, and where ambiguities are detected, the user is prompted to resolve them.
In this aspect of the invention, however, these meaningful entities (for example, the concepts described by nouns) are displayed on an output screen, in a graphical form, which represents them as separate icons and meaningfully indicates their interconnection or relationship.
This apparently simple step provides a number of benefits. The first is that it gives the human inputting the data immediate feedback on the “understanding” gained by the computer. Natural human language is full of ambiguities which human beings are normally able to resolve without conscious thought because of their shared knowledge base, but which are at best ambiguous to, and at worst mis-recognised by, a computer.
To take an English example, “Mary was kissed by the lake” is ambiguous, since it can be interpreted either as indicating that the lake is the active party (the kisser) or that the lake is the location at which Mary is kissed by an (unknown) active party.
Whereas a human immediately understands the correct meaning, and may not even see the presence of an ambiguity, a computer is unable to do so unless programmed by a rule or conditioned by experience.
By displaying the understood construction graphically, however, the present invention makes such ambiguities immediately recognisable to the user, and so enables them to be avoided.
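As an illustration only, the two readings of “Mary was kissed by the lake” can be sketched as two small semantic structures, each rendered as labelled links between concept nodes so that a user could see which reading the machine has taken. The `Reading` class, the role names (`agent`, `patient`, `location`) and the text rendering are assumptions made for this sketch; the source does not specify a data format, and the readings are hard-coded rather than produced by an analyser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    """One candidate interpretation of a clause (hypothetical structure)."""
    action: str
    patient: str
    agent: Optional[str] = None      # who performs the action, if known
    location: Optional[str] = None   # where the action takes place, if known


def candidate_readings() -> List[Reading]:
    """Hand-written readings of 'Mary was kissed by the lake'.

    A real analyser would derive these; here they are hard-coded.
    """
    return [
        Reading(action="kiss", patient="Mary", agent="lake"),
        Reading(action="kiss", patient="Mary", location="lake"),
    ]


def render(reading: Reading) -> str:
    """Render a reading as labelled links between concept nodes."""
    agent = reading.agent if reading.agent is not None else "unknown"
    links = [
        f"[{reading.action}] --patient--> [{reading.patient}]",
        f"[{reading.action}] --agent--> [{agent}]",
    ]
    if reading.location is not None:
        links.append(f"[{reading.action}] --location--> [{reading.location}]")
    return "\n".join(links)


def is_ambiguous(readings: List[Reading]) -> bool:
    """A clause is ambiguous when it has more than one distinct reading."""
    return len(set(readings)) > 1


if __name__ == "__main__":
    for number, reading in enumerate(candidate_readings(), start=1):
        print(f"Reading {number}:")
        print(render(reading))
```

Showing both renderings side by side is one way the user could pick the intended reading; the invention itself describes an icon-based display rather than text.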
Very preferably, the invention provides a graphical user interface to enable the user to manipulate the graphical display, and means to interpret the results of such manipulation. Indeed, it would in principle be possible to allow the user to directly input the document graphically without previous direct document input (although this is not preferred, for reasons of speed, for most applications).
Thus, the user is able to allow the computer to extract as much meaning as possible from the input document and then to correct the ambiguities or errors graphically.
The invention will be understood to differ from so called “visual programming” systems, as described, for example, in EP-A-0473414. Such visual programming systems provide a graphic environment in which operations to be specified are represented visually, and a user may specify a sequence of such operations by editing the display to create and alter linkages between the elements. However, in visual programming, as in other known methods of creating or specifying programs, the user is constrained to select from a limited number of predefined operations and connections therebetween. By contrast, the present invention accepts documents as input and analyses the documents to provide the graphical display which may subsequently be edited.
The resulting semantic structures, corresponding to the graphical representation (corrected where necessary), are stored for subsequent processing or retrieval. In one embodiment, data retrieval apparatus is provided. In another embodiment, the stored data is employed by a code generator, to generate a computer program.
The invention is advantageously used for this latter application, because the detection of ambiguities eliminates one of the difficulties in existing software specification and automatic code generation from such specifications.
In either case, in the preferred embodiment there is a stored lexical table which stores data relating to the meanings of words which will be encountered in the source document (analogously to an entry in a well-structured dictionary).
Preferably, in this case, the apparatus is arranged to perform “reasoning” utilising this semantic information, by comparing the meanings of groups of words (i.e. clauses or sentences) of the document to locate inconsistencies, or by performing the same operation between multiple different documents.
This is particularly advantageous in embodiments where the source document is to act as a specification for the generation of computer code, because it enables the location of conflicting requirements.
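A much simplified sketch of locating conflicting requirements follows. It assumes each clause has already been reduced to a (subject, attribute, value) assertion, which is far cruder than the semantic comparison the text describes; the `Assertion` type, the `find_conflicts` function and the sample data are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Assertion(NamedTuple):
    """A clause reduced to one statement about one thing (assumed form)."""
    subject: str
    attribute: str
    value: str
    source: str  # the clause or document the statement came from


def find_conflicts(
    assertions: List[Assertion],
) -> List[Tuple[Assertion, Assertion]]:
    """Return pairs that give different values for the same subject/attribute."""
    seen: Dict[Tuple[str, str], List[Assertion]] = {}
    conflicts: List[Tuple[Assertion, Assertion]] = []
    for current in assertions:
        key = (current.subject, current.attribute)
        for earlier in seen.get(key, []):
            if earlier.value != current.value:
                conflicts.append((earlier, current))
        seen.setdefault(key, []).append(current)
    return conflicts


if __name__ == "__main__":
    specification = [
        Assertion("password", "minimum length", "8", "document A"),
        Assertion("password", "minimum length", "12", "document B"),
        Assertion("session", "timeout", "30 minutes", "document A"),
    ]
    for first, second in find_conflicts(specification):
        print(f"{first.source} and {second.source} disagree on "
              f"{first.subject} {first.attribute}")
```

The same function serves for clauses within one document or across several, since each assertion carries its own source.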
Since the present invention, in this embodiment, has some “understanding” of the “meaning” of words, it is able to store the content data (for example in the form of semantic structures representing groups of words such as clauses or sentences) by reference to such “dictionary entries”—i.e. by reference to their “meaning”, rather than the source language word which was input. This makes it possible to use a multilingual embodiment of the present invention, where the lexical entries are mapped onto corresponding words in each of a plurality of languages, so that data may be input in one language and output in one or multiple different languages.
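The idea of storing content by reference to meaning rather than to the source word can be sketched as a lexical table keyed by language-independent concept identifiers. The identifiers (`C_CAT` and so on), the table layout and the `encode`/`decode` functions are assumptions for illustration; the sketch works word by word and ignores grammar and word order entirely.

```python
from typing import Dict, Tuple

# Hypothetical lexical table: one entry per concept, mapped onto a word
# in each supported language.
LEXICON: Dict[str, Dict[str, str]] = {
    "C_CAT": {"en": "cat", "fr": "chat", "de": "Katze"},
    "C_MILK": {"en": "milk", "fr": "lait", "de": "Milch"},
    "C_DRINK": {"en": "drink", "fr": "boire", "de": "trinken"},
}


def concept_for(word: str, language: str) -> str:
    """Find the concept identifier for a word in the given language."""
    for concept_id, words in LEXICON.items():
        if words.get(language, "").lower() == word.lower():
            return concept_id
    raise KeyError(f"no lexical entry for {word!r} in {language!r}")


def encode(words: Tuple[str, ...], language: str) -> Tuple[str, ...]:
    """Store input words as references to lexical entries."""
    return tuple(concept_for(word, language) for word in words)


def decode(concepts: Tuple[str, ...], language: str) -> Tuple[str, ...]:
    """Express stored concepts using the words of another language."""
    return tuple(LEXICON[concept_id][language] for concept_id in concepts)


if __name__ == "__main__":
    stored = encode(("cat", "drink", "milk"), "en")
    print(stored)
    print(decode(stored, "fr"))
    print(decode(stored, "de"))
```

Because the stored form contains only concept identifiers, adding a further output language in this sketch means adding one more column to the table rather than re-entering the data.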
In embodiments for data retrieval, or similar applications, each such lexical entry may have an associated code indicating the “difficulty”, “obscurity” or “unfamiliarity” of the concept described. For example, concepts may be labelled as familiar to children upwards; familiar to adults; or familiar only to particular specialists such as physicists, chemists, biologists, or lawyers.
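A familiarity code of this kind might be attached to lexical entries as sketched below. The three levels follow the examples in the text; the `Familiarity` enum, the `Entry` type, the sample words and the rule in `is_familiar` are assumptions, and the sketch does not attempt to show what the apparatus then does with the result.

```python
from enum import IntEnum
from typing import Dict, FrozenSet, NamedTuple, Optional


class Familiarity(IntEnum):
    """Levels taken from the examples in the text (numeric codes assumed)."""
    CHILD = 1
    ADULT = 2
    SPECIALIST = 3


class Entry(NamedTuple):
    level: Familiarity
    field: Optional[str] = None  # only meaningful for SPECIALIST entries


# Hypothetical labelled lexical entries.
LEXICAL_FAMILIARITY: Dict[str, Entry] = {
    "water": Entry(Familiarity.CHILD),
    "mortgage": Entry(Familiarity.ADULT),
    "tort": Entry(Familiarity.SPECIALIST, "law"),
    "boson": Entry(Familiarity.SPECIALIST, "physics"),
}


def is_familiar(
    word: str,
    reader_level: Familiarity,
    reader_fields: FrozenSet[str] = frozenset(),
) -> bool:
    """Decide whether a reader can be assumed to know a concept.

    Specialist concepts are matched by field rather than by level, since a
    physicist is not thereby familiar with legal terms.
    """
    entry = LEXICAL_FAMILIARITY[word]
    if entry.level is Familiarity.SPECIALIST:
        return entry.field in reader_fields
    return reader_level >= entry.level


if __name__ == "__main__":
    lawyer_fields = frozenset({"law"})
    for word in LEXICAL_FAMILIARITY:
        print(word, is_familiar(word, Familiarity.ADULT, lawyer_fields))
```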
With knowledge of the level of familiarity of the data retriever, the present invention is in this embodiment able to utilise such
Breene John
British Telecommunications public limited company
Nixon & Vanderhye P.C.
Rayyan Susan