Reexamination Certificate
2000-04-14
2003-09-30
Metjahic, Safet (Department: 2171)
Data processing: database and file management or data structures
Database design
Data structure types
C707S793000, C345S215000
active
06629097
ABSTRACT:
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
Not Applicable
BACKGROUND OF THE INVENTION
This invention relates to the field of computer-implemented systems and methods for extracting and displaying implicit associations among items in loosely-structured data sets.
The advent of electronic data storage and retrieval technology has provided users of that technology with significant benefits in terms of the ability to request and receive enormous amounts of information with very little effort and in a very short period of time. Associated with these advances, however, is the difficulty in identifying information that is relevant to the searcher's notion of what he wants to find, and separating that information from the sheer volume of data retrieved which is not relevant to those concerns. This issue is familiar to anyone who has used one of the common search tools to search the World Wide Web for information: potentially thousands of pages of information are returned, but there is almost no assistance in identifying which are relevant to the searcher's interests, and which are not.
People are increasingly surrounded, if not bombarded, with a growing volume of data and information. Gaining access to data is relatively easy; being able to sort through that data to find information relevant to our interests is increasingly difficult. One general approach to this problem has been to attempt to narrow search results by filtering out items deemed irrelevant to the searcher's interests. The purpose of a data set search mechanism, such as Structured Query Language (SQL) queries for formally structured databases, or search “engines” such as those used by popular Internet sites, is to return a subset of the total data set based on the specifications supplied by the person making the query. The usefulness of these search mechanisms depends to a large extent on the knowledge and sophistication of the person making the query: the more closely the query follows the terminology used to describe or index the data source (of course this is required in the SQL query example), the greater the probability that the query will return “relevant” information. The less the searcher knows about the structure or specific terminology describing the data set being queried, the greater the volume of irrelevant information that will be retrieved.
However, judging what the searcher's actual intentions and interests are, in the context of an automated search system such as a computer search algorithm, is highly problematic. This is especially difficult when the searcher's intentions are vague or uncertain, which leads to search criteria that are ill-defined and ambiguous. This is a well-known and unsolved problem in the field of data search and retrieval. Natural languages, such as English, are composed of numerous words and expressions capable of conveying multiple meanings, the intended meaning of which is often recognizable only when the ambiguous term is considered in context with other surrounding terms and conceptual constructs. To compound the problem, the searcher may not realize what he is looking for, and may recognize relevant data only when he sees it in a context that he could not have specified in advance. One common strategy in defining the context that will lead to relevant information is to interpret the searcher's intentions and to filter out data assumed to be irrelevant based on that interpretation. However, solutions based on such interpretations of the searcher's intent may fall short of delivering relevant data, especially when the query itself is uncertain and not fully defined.
The identification of items in a data set can be facilitated by imposing a categorical structure on that data, which is another general strategy that has been applied to the problem of producing relevant search results. A number of Internet search engines use this approach, grouping results in terms of such categories as “music,” “travel,” “shopping,” and so on. This approach, although arguably more useful than most attempts to discern the searcher's underlying intentions, has numerous problems. Sorting data items into predetermined and fixed categories generally requires human intervention and interpretation; that is, the process is expensive and not easily automated. Also, data items frequently fall into multiple categories: how are they to be represented? There can be many alternate interpretations of what belongs in one category and what does not, and this, added to the ambiguity of language itself, means that imposing an external categorical structure on a complex data set is difficult, costly, inexact, and generally incomplete. Last but not least, the relationships between the categories themselves cannot be easily conveyed to the searcher. There is a relationship between, for example, all “countries” and all “vegetation,” but this type of relationship cannot be described in the fixed-category, “list-like” format typical of popular Internet search engines.
Consider a simple example: a person is visiting a city which he has not visited in many years. The person has a vague memory of a wonderful restaurant where he dined with friends long ago: he has no idea where it is or what the name might be, but he does recall that it had an ornately carved wooden bar made of South American rosewood, and that the cuisine was an interesting combination of Italian and Asian, although he cannot remember if it was Thai or Vietnamese, or possibly Chinese. To find it again, he might look in the hardcopy telephone Yellow Pages, or access an electronic yellow pages.
Use of the hardcopy Yellow Pages requires that the person run down the entire alphabetical list of restaurant names, hoping to remember the name itself, or he may be able to browse through a categorical listing of “Italian” as opposed to “Chinese” restaurants. His chance of finding the information he wants, namely the name of the restaurant, depends on his ability to “recognize” that name when he sees it.
Online, using a Web-based yellow pages, he can specify a Boolean search, using terms such as “Italian AND Asian,” which may return a large list of restaurants, from midtown to the farthest suburban outpost, and offering many combinations of cuisine. The restaurant he is looking for may be in this list somewhere, but again, not obviously so. The only hope he has of finding it is to painstakingly work down the list, and perhaps go to individual restaurant Web sites, reading the descriptions and looking at the pictures. Still, even though he is not exactly sure of what he is looking for, if he located a reference to an Italian-Asian cuisine restaurant which mentioned an ornately carved antique rosewood bar, he would feel relatively confident about having found what he was looking for.
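The limitation of the Boolean search described above can be illustrated with a minimal sketch. The listing names, descriptions, and the `boolean_and` helper are all hypothetical and invented for illustration; they do not come from any real directory service.

```python
# Hypothetical listings; names and descriptions are invented for illustration.
LISTINGS = {
    "Trattoria Saigon": "italian asian vietnamese fusion with a carved rosewood bar",
    "Luigi's": "classic italian pasta and pizza",
    "Golden Wok": "chinese asian noodles",
    "Pan-Asian Pasta House": "italian asian fusion in the suburbs",
    "Roma Thai": "italian thai asian cuisine downtown",
}

def boolean_and(listings, *terms):
    """Return names whose description contains every term (Boolean AND)."""
    wanted = {t.lower() for t in terms}
    return sorted(
        name for name, text in listings.items()
        if wanted <= set(text.lower().split())
    )

# The query the searcher thinks to type returns a flat, undifferentiated list.
matches = boolean_and(LISTINGS, "italian", "asian")
```

The result is a flat list of every listing containing both terms, with nothing to mark which one mentions the rosewood bar. Adding "rosewood" as a third term would narrow the list, but only if the searcher thinks to specify it in advance.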
What would be preferred in this situation is a more detailed category, and at the same time a more semantically flexible category, to describe something close to what he is looking for: Italian-Asian-possibly-Vietnamese restaurants with antique carved-Rosewood-bars. Attempting to express this type of dynamic, subjectively-relevant categorization, through the use of fixed, hierarchical category schemes is problematic at best, and virtually impossible in terms of anticipating all the combinations of categorical constructs possible. An ideal solution to the problem would be a query result, based not on externally defined categories, or on categories which may be specified in the search itself, but a query result based on the categories inherent (implicit) in the data set, and based on the content and descriptions of the individual data items, no matter what that content might be.
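The idea of categories that are implicit in the data, rather than imposed from outside, can be sketched roughly as follows. This is only an illustration under simple assumptions (whitespace-separated terms, a hypothetical `implicit_groups` helper, invented item names); it is not the method claimed in this patent.

```python
from collections import defaultdict

# Hypothetical items and free-text descriptions, invented for illustration.
ITEMS = {
    "A": "italian vietnamese rosewood bar",
    "B": "italian thai garden",
    "C": "vietnamese rosewood bar",
}

def implicit_groups(items, min_size=2):
    """Map each term to the items mentioning it, keeping only shared terms."""
    index = defaultdict(set)
    for name, text in items.items():
        for term in set(text.lower().split()):
            index[term].add(name)
    # A term becomes a "category" only when at least min_size items share it.
    return {t: sorted(names) for t, names in index.items() if len(names) >= min_size}

groups = implicit_groups(ITEMS)
```

Here the groupings arise from whatever terms the descriptions happen to share, so no category list has to be defined beforehand, and an item such as "A" can belong to several groups at once.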
To stretch the above example, suppose the restaurant the person is looking for is actually run by a Greek-Cambodian couple, and thus the cuisine is off the mark of his original search criteria. Suppose, too, that he had forgotten that it is furnished with chairs from the Captain's table of an old whaling ship. He did not specify this in his search,
Hunter Robert M.
Keith Douglas K.
Metjahic Safet
Nguyen Merilyn P.
Profile ID: LFUS-PAI-O-3018908