Encoding semi-structured data for efficient search and browsing

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C700S102000, C715S252000, C715S252000, C715S252000

active

06804677

ABSTRACT:

FIELD OF THE INVENTION
The present invention is in the general field of accessing data, including but not limited to eXtensible Markup Language (XML) documents.
BACKGROUND OF THE INVENTION
There follows a glossary of conventional terms. The meanings of the terms are generally known per se; accordingly, the definitions below are provided for clarity and should not be regarded as binding.
Glossary of Terms
Data—Information that one wants to store and/or manipulate.
Database—A collection of data organized by some set of rules.
Attribute—A feature or characteristic of specific data, represented e.g. as “columns” in a relational database. A record representing a person might have an attribute “age” that stores the person's age. Each column represents an attribute. In XML (XML is defined below), there is an “attribute” that exists as part of a “tag.”
Column—In a relational database, columns represent attributes for particular rows in a relation. For example, a single row might contain a complete mailing address. The mailing address would have four columns (“attributes”): street address, city, state, and zip code.
Record—A single entry in a database. Often referred to as a “tuple” or “row” in a relational database.
Tuple—See “record”
Row—See “record”
Table—See “relation”
Relation—A way of organizing data into a table consisting of logical rows and columns. Each row represents a complete entry in the table. Each column represents an attribute of the row entries. Frequently referred to as a “table.”
Relational database—A database that consists of one or more “relations” or “tables.”
Database administrator—A person (or persons) responsible for optimizing and maintaining a particular database.
Schema—The organization of data in a database. In a relational database, all new data that comes into the database must be consistent with the schema, or the database administrator must change the schema (or reject the new data).
Index—Extra information about a database used to reduce the time required to find specific data in the database. It provides access to particular rows based on a particular column or columns.
Path—A series of relationships among data elements. For instance, a path from a grandson to grandfather would be two steps: from son to father, and from father to grandfather.
Structure—The embodiment of paths in particular documents or data. For example, in a “family tree,” the structure of the data is hierarchical: it is a tree with branches from parents to children. Data without a hierarchical structure is often referred to as “flat.”
Query—A search for information in a database.
Range query—A search for a range of data values, like “all employees aged 25 to 40.”
I/O—A read from a physical device, such as a fixed disk (hard drive). I/Os take a significant amount of time compared to memory operations: usually hundreds or even thousands of times (or more) longer.
Block read—Reading a fixed-size chunk of information for processing. A block read implies an “I/O” if the block is not in memory.
Tree—A data structure that is either empty or consists of a root node linked by means of d (d ≥ 0) pointers (or links) to d disjoint trees called subtrees of the root. The roots of the subtrees are referred to as “child nodes” of the root node of the tree, and nodes of the subtrees are “descendent nodes” of the root. A node in which all the subtrees are empty is called a “leaf node.” The nodes in the tree that are not leaves are designated as “internal nodes.”
In the context of the invention, leaf nodes are also nodes that are associated with data.
Nodes and trees should be construed in a broad sense. Thus, the definition of tree also encompasses a tree of blocks wherein each node constitutes a block. In the same manner, descendent blocks of a said block are all the blocks that can be accessed from the block. For a detailed definition of “tree,” also refer to the book by Lewis and Denenberg, “Data Structures and Their Algorithms.”
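The tree definition above can be illustrated with a minimal Python sketch. The class name `Node`, its fields, and the sample family data are assumptions made for illustration only; they do not come from the specification.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional


@dataclass
class Node:
    """A tree node: a label, optional data, and d >= 0 child subtrees."""
    label: str
    data: Optional[Any] = None
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf is a node whose subtrees are all empty.
        return not self.children

    def descendants(self) -> Iterator["Node"]:
        # Yield every node in the subtrees below this node (pre-order).
        for child in self.children:
            yield child
            yield from child.descendants()


# Hypothetical family tree; leaf nodes carry the data.
root = Node("grandfather", children=[
    Node("father", children=[Node("grandson", data="age 7")]),
    Node("uncle", data="age 40"),
])

leaves = [n.label for n in root.descendants() if n.is_leaf()]
internal = [n.label for n in [root, *root.descendants()] if not n.is_leaf()]
print(leaves)
print(internal)
```

In this sketch the root has d = 2 subtrees, "grandson" and "uncle" are leaf nodes, and "grandfather" and "father" are internal nodes.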
B-tree—A tree structure that can be used as an index in a database. It is useful for exact match and range queries. B-trees frequently require multiple block reads to access a single record. A more complete description of B-trees can be found on pages 473-479 of The Art of Computer Programming, volume 3, by Donald Knuth (© 1973, Addison-Wesley).
Hash table—A structure that can be used as an index in a database. It is useful for exact match queries. It is not useful for range queries. Hash tables generally require one block read to access a single record. A more complete description of hash tables can be found on e.g. pages 473-479 of The Art of Computer Programming, volume 3, by Donald Knuth (© 1973, Addison-Wesley).
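The contrast between an ordered index and a hash index can be sketched as follows. This is only an illustration under stated assumptions: a sorted Python list searched with `bisect` stands in for the ordered keys of a B-tree (it is not a B-tree implementation), a `dict` stands in for a hash table, and the records are hypothetical.

```python
import bisect

# Hypothetical records: (age, name).
records = [(31, "Ana"), (25, "Ben"), (47, "Cho"), (40, "Dee"), (19, "Eli")]

# Hash-style index: direct lookup by exact key, but keys have no order.
hash_index = {}
for age, name in records:
    hash_index.setdefault(age, []).append(name)

# Ordered index: keys kept sorted, so neighbouring keys sit side by side.
sorted_index = sorted(records)
sorted_keys = [age for age, _ in sorted_index]


def exact_match(age):
    """Return the names stored under exactly this age."""
    return hash_index.get(age, [])


def range_query(low, high):
    """Return the names with low <= age <= high, in key order."""
    # Binary search for both ends, then read the contiguous slice.
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [name for _, name in sorted_index[start:end]]


print(exact_match(40))
print(range_query(25, 40))
```

Answering the range query from the hash-style index alone would require probing every possible key or scanning all entries, which is why the glossary describes hash tables as unsuitable for range queries.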
Inverted list—A structure that can be used as an index in a database. It is a set of character strings that points to records that contain particular strings. For example, an inverted list may have an entry “hello.” The entry “hello” points to all database records that have the word “hello” as part of the record. A more complete description of inverted lists can be found on e.g. pages 552-559 of The Art of Computer Programming, volume 3, by Donald Knuth (© 1973, Addison-Wesley).
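The “hello” example above can be sketched in a few lines of Python. The record contents, the function name `build_inverted_list`, and the choice to lowercase words before indexing are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical records keyed by record id.
records = {
    1: "hello world",
    2: "goodbye world",
    3: "Hello again",
}


def build_inverted_list(recs):
    """Map each word to the set of record ids that contain it."""
    index = defaultdict(set)
    for record_id, text in recs.items():
        # Split into words; case is folded so "Hello" matches "hello".
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(record_id)
    return index


inverted = build_inverted_list(records)
print(sorted(inverted["hello"]))
print(sorted(inverted["world"]))
```

Here the entry “hello” points to records 1 and 3, and the entry “world” points to records 1 and 2.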
Semi-structured data—Data that does not conform to a fixed schema. Its format is often irregular or only loosely defined.
Data mining—Searching for useful, previously unknown patterns in a database.
Object—An object is some quantity of data. It can be any piece of data, a single path in a document path, or some mixture of structure and data. An object can be a complete record in a database, or formed “on the fly” out of a portion of a record returned as the result of a query.
Markup—In computerized document preparation, a method of adding information to the text indicating the logical components of a document, or instructions for layout of the text on the page or other information which can be interpreted by some automatic system. (from the Free On-Line Dictionary of Computing—www.foldoc.ic.ac.uk)
Markup Language—A language for applying markup to text documents to indicate formatting and logical contents. Markup languages are increasingly being used to add logical structure information to documents to enable automated or semi-automated processing of such documents. Many such languages have been proposed, ranging from generic ones such as SGML and XML, to industry or application-specific versions.
SGML—A specific example of Markup Language, Standard Generalized Markup Language. SGML is a means of formally describing a language, in this case, a markup language. A markup language is a set of conventions used together for encoding (e.g., HTML or XML).
XML—A specific example of Markup Language, eXtensible Markup Language. A language used to represent semi-structured data. It is a subset of SGML. XML documents can be represented as trees.
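The statement that XML documents can be represented as trees can be illustrated with Python's standard `xml.etree.ElementTree` module. The sample document and the helper name `paths` are hypothetical; the sketch simply walks the parsed tree and lists the root-to-node path of each element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: nested tags form the branches of the tree.
document = """
<family>
  <person name="Abe">
    <person name="Homer">
      <person name="Bart"/>
    </person>
  </person>
</family>
""".strip()


def paths(element, prefix=""):
    """Yield the root-to-node path of every element in the tree."""
    current = f"{prefix}/{element.tag}"
    yield current
    for child in element:
        yield from paths(child, current)


root = ET.fromstring(document)
all_paths = list(paths(root))
for p in all_paths:
    print(p)

# The XML "attribute" lives on the tag, as noted in the glossary.
print(root[0].attrib["name"])
```

Each nested `person` element is a child node of the element that encloses it, so the depth of nesting in the text corresponds to the depth of the node in the tree.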
Key—An identifier used to refer to particular rows in a database. In the context of relational database, keys represent column information used to identify rows. For instance, “social security number” could be a key that uniquely identifies each individual in a database. Keys may or may not be unique.
Join—A method of matching portions of two or more tables to form a (potentially much larger) unified table. This is generally one of the most expensive relational database operations, in terms of space and execution time.
Key search—The search for a particular value or data according to a key value. This search is usually performed by an index.
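A join on a shared key can be sketched as below. This is one common strategy (a hash join) chosen for illustration, not a method described in the text; the table contents and the function name `hash_join` are hypothetical.

```python
# Hypothetical tables as lists of rows (dicts).
people = [
    {"ssn": "111", "name": "Ana"},
    {"ssn": "222", "name": "Ben"},
]
addresses = [
    {"ssn": "111", "city": "Haifa"},
    {"ssn": "111", "city": "Boston"},
    {"ssn": "333", "city": "Oslo"},
]


def hash_join(left, right, key):
    """Pair every left row with every right row sharing the same key value."""
    # Build phase: group the right-hand rows by key value.
    buckets = {}
    for row in right:
        buckets.setdefault(row[key], []).append(row)
    # Probe phase: emit one merged row per matching pair.
    joined = []
    for row in left:
        for match in buckets.get(row[key], []):
            joined.append({**row, **match})
    return joined


result = hash_join(people, addresses, "ssn")
for row in result:
    print(row)
```

Because a key need not be unique, one row can match several rows in the other table, which is how the joined table can grow much larger than its inputs.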
Search—In the context of data, searching is the process of locating relevant or desired data from a (typically much larger) set of data based on the content and/or structure of the data. Searching is often done as a batch process, in which a request is submitted to the system, and after processing the request, the system returns the data or references to the data that match the request. Typical (yet not exclusive) examples of searching are the submission of a query to a relational database system, or the submission of key words to a search engine on the World Wide Web.
Path search—The search for a particular path in the
