Electrical computers and digital processing systems: multicomput – Computer-to-computer protocol implementing
Reexamination Certificate
2001-07-03
2002-12-10
Barot, Bharat (Department: 2154)
C709S220000, C709S223000, C709S224000, C709S250000, C370S401000
active
06493761
ABSTRACT:
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the xerographic reproduction by anyone of the patent document or the patent disclosure in exactly the form it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
The present invention relates to a general-purpose data parsing and analysis system and, in particular, a common system and method for analyzing any data composed of interrelated data structures similar to the protocols found within network frames.
BACKGROUND OF THE INVENTION
Data search processors perform a number of functions such as data matching, filtering, statistics gathering, converting and bracket matching. Data search processors or tools are typically associated with a specific data editor and are limited to recognizing embedded control characters that are associated with the particular data editor. Data search tools that function independently, meaning that they are not associated with a specific data editor, are not currently able to recognize embedded control characters for any data editor. A discussion of a data editor independent search tool may be found in B. Kernighan, et al., “Regular Expressions,” Dr. Dobb's Journal, April, 1999, pp. 19-22, the contents of which are incorporated herein by reference.
Data parsing refers to the ability to categorize data into components based on the characteristics of the data values. A practical example of data parsing would consist of the following procedures: (1) efficiently searching for text words in a document that consisted of both text and graphics; (2) identifying the components of the document that are graphical; and (3) skipping over the graphical components rather than searching through them character by character as though they were text or control characters.
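The three procedures above can be illustrated with a minimal sketch. It assumes a hypothetical framing that is not taken from the patent: each component of the document is stored as a 1-byte type tag, a 2-byte big-endian length, and then the payload. The names `TEXT`, `GRAPHIC`, `make_segment` and `find_word` are invented for this illustration.

```python
import struct

# Hypothetical segment type tags (assumptions for this sketch only).
TEXT, GRAPHIC = 0x01, 0x02

def make_segment(kind, payload):
    """Frame a payload as: 1-byte type, 2-byte big-endian length, payload."""
    return struct.pack(">BH", kind, len(payload)) + payload

def find_word(document, word):
    """Return absolute byte offsets of `word` found inside text segments only."""
    hits = []
    pos = 0
    while pos + 3 <= len(document):
        kind, length = struct.unpack_from(">BH", document, pos)
        start = pos + 3
        if kind == TEXT:
            payload = document[start:start + length]
            idx = payload.find(word)
            while idx != -1:
                hits.append(start + idx)
                idx = payload.find(word, idx + 1)
        # Graphic (or unknown) segments are skipped by their length
        # instead of being scanned character by character.
        pos = start + length
    return hits

doc = (make_segment(TEXT, b"net frame")
       + make_segment(GRAPHIC, b"\x00frame\xff")
       + make_segment(TEXT, b"frame end"))

print(find_word(doc, b"frame"))
```

A plain byte search over `doc` would also report the occurrence of the word that happens to sit inside the graphic payload; the parser-driven search does not, because it never looks inside that segment.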
Filtering of data files is typically implemented using a value for comparison, and, in some cases, “wildcard” characters within the value. Filtering of data files typically comprises doing a search on the data file, and then taking an action based upon the search results. For example, a filter might search for all instances of a particular data expression, and then provide a count of the total number of instances found.
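As a rough illustration of the search-then-act pattern described above, the following sketch counts the words in a text that match a comparison value containing wildcard characters. It uses shell-style wildcards from Python's standard `fnmatch` module as a stand-in; the wildcard syntax and the function name `count_matches` are assumptions for this example, not details from the patent.

```python
from fnmatch import fnmatchcase

def count_matches(text, pattern):
    """Search step: test each whitespace-separated word against the pattern.
    Action step: return the total number of matching words."""
    return sum(1 for word in text.split() if fnmatchcase(word, pattern))

sample = "frame framed flame frames"
print(count_matches(sample, "frame*"))  # words that start with "frame"
print(count_matches(sample, "f?ame"))   # five-letter words of the form f_ame
```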
For multiple value filters, the result from each filter is logically combined together to obtain an overall result. Therefore, each additional result adds to the processing required to filter on that value. Conventional filtering does not typically include a provision to identify embedded graphic images so that the images may be either intentionally examined or skipped over in a data search.
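The logical combination of several filter results might be sketched as follows. The helper names `all_of` and `any_of` and the two example filters are hypothetical. Each filter added to a combination is one more test that may have to be evaluated per record, which is the processing cost the paragraph above refers to.

```python
def all_of(*filters):
    """AND-combine filters: every filter must accept the record."""
    return lambda record: all(f(record) for f in filters)

def any_of(*filters):
    """OR-combine filters: at least one filter must accept the record."""
    return lambda record: any(f(record) for f in filters)

def contains_frame(text):
    return "frame" in text

def is_short(text):
    return len(text) < 20

both = all_of(contains_frame, is_short)
either = any_of(contains_frame, is_short)

print(both("network frame"), either("short"))
```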
A practical example of data filtering would be to search all components of a document for company proprietary information, and filter out the proprietary information, in order to prevent its unwanted disclosure. Such information might be embedded, or hidden, in the control characters of the data editor format.
Existing data search, filtering and statistical tools are either specific to a particular version of a particular data editor, for example the “Find” or “Word Count” functions typically found in popular word processors, or must parse through files character-by-character without being able to differentiate among data, document format control characters or graphic characters. Thus, the existing tools are either limited by their inability to work across various editors, or, for those tools that are not editor-dependent, their inability to efficiently parse files containing data, document control characters and graphic characters.
Although CPUs available today can execute hundreds of millions or even billions of instructions per second, to achieve the necessary processing rates for most filtering, vendors often must provide dedicated hardware assistance and/or front-end processors with hand-coded assembly language routines. This solution typically requires hardware and/or software modifications whenever changes are made to the number of supported features or editors.
In a conventional data search engine, a string of characters is specified, and the engine searches for the string of characters in the data editor file(s). For an ASCII character set file, the file may contain:
a. alphanumeric characters, such as a-z, A-Z, 0-9;
b. delimiters, such as punctuation characters and spaces;
c. graphics, such as bit maps;
d. control character sequences, such as a sequence of characters that will cause the data editor to show words underlined or in bold print, or change the size of the font; and
e. “junk strings” of characters, such as control character sequences appearing consecutively with different values for the same control or duplicate control character strings, that may be generated by automatic conversions performed on a file to change it from one data editor format to another, for instance a conversion from Word document format to Rich Text Format.
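To make some of the categories listed above concrete, here is a minimal sketch of a classifier for alphanumeric runs, delimiters, and control character sequences, followed by a pass that collapses one kind of "junk string". It assumes a simplified, RTF-like control syntax (a backslash, lowercase letters, then optional digits) chosen only for illustration. Graphics are not handled, and the choice to keep the last of several consecutive settings of the same control is an assumption of this sketch.

```python
import re

# Assumed control syntax: backslash + lowercase letters + optional digits.
TOKEN = re.compile(
    r"(?P<control>\\[a-z]+\d*)"
    r"|(?P<alnum>[A-Za-z0-9]+)"
    r"|(?P<delim>[^A-Za-z0-9\\]+)"
    r"|(?P<other>.)",
    re.DOTALL,
)

def classify(text):
    """Split text into (kind, value) tokens."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]

def drop_junk(tokens):
    """Collapse consecutive control tokens that set the same control,
    keeping only the last value (an assumption of this sketch)."""
    out = []
    for kind, value in tokens:
        if kind == "control" and out and out[-1][0] == "control":
            prev_name = out[-1][1].rstrip("0123456789")
            if prev_name == value.rstrip("0123456789"):
                out[-1] = (kind, value)
                continue
        out.append((kind, value))
    return out

print(classify(r"\b Hello, world\b0"))
print(drop_junk(classify(r"\fs20\fs24 text")))
```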
Conventional data search engines cannot be configured to: a) recognize or identify values as i) elements for the control syntax of a data, spreadsheet or other kind of editor, or ii) part of a graphic image; or b) modify the use of a value by specifying the characteristics associated with the value.
Thus, it is desirable to have a configurable search, filter, statistics, and conversion capability, with common control logic that: a) is applicable to many different data editors or character sets, b) provides field based operations, and c) can be implemented in either hardware or software. By using common control logic, the system can be reconfigured to support the variety of existing data editors, document formats and character sets and to support future data editors, document formats, and character sets without the need for hardware or software modifications. Moreover, the added ability to provide filtering and to collect statistics in hardware may significantly improve performance.
SUMMARY OF THE INVENTION
The present invention is directed to improved systems and methods for parsing, searching, filtering, gathering statistics, and converting data files generated by any data editor, using character sets and editor controls definitions that can be programmably defined. A single logic control module, implemented in either hardware or software, is used to perform a number of data manipulation functions, such as parsing, filtering, statistics gathering, and data conversion. The module is based on one or more programmably configurable protocol descriptions that may be stored in and retrieved from an associated memory.
By using common control logic, meaning a single logic control module, and programmably configurable character-set characteristics and data editor control protocol descriptions, changes can be made to existing data editor control protocol descriptions and support for new data editor control protocol descriptions can be added to a system entirely through user reconfiguration, without the need for hardware or software system modifications. Thus, those skilled in the art will appreciate that a data file manipulation system in accordance with the present invention may be configured and reconfigured in a highly efficient and cost effective manner to implement numerous data manipulation functions, such as parsing, and to accommodate substantial data editor modifications, such as the use of different editors, editor versions, or editor formats, without requiring substantial system changes.
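The idea of a single control module driven by reconfigurable descriptions can be sketched as below. The description format used here, an ordered list of (field name, size in bytes) pairs, is a hypothetical simplification and not the format defined by the patent. The point of the sketch is only that `parse` stays unchanged while the description is swapped.

```python
def parse(data, description):
    """Generic control logic: walk the description and extract each field
    as a big-endian unsigned integer."""
    needed = sum(size for _, size in description)
    if len(data) < needed:
        raise ValueError(f"need {needed} bytes, got {len(data)}")
    fields = {}
    offset = 0
    for name, size in description:
        fields[name] = int.from_bytes(data[offset:offset + size], "big")
        offset += size
    return fields

# Two hypothetical descriptions; supporting the second one required
# no change to parse(), only new configuration data.
description_v1 = [("type", 1), ("length", 2)]
description_v2 = [("version", 1), ("type", 1), ("length", 2)]

data = bytes([0x02, 0x01, 0x00, 0x10])
print(parse(data, description_v1))
print(parse(data, description_v2))
```

In a fuller system the descriptions would be stored in and retrieved from memory or a file, as the summary describes; they are inlined here to keep the sketch self-contained.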
In a preferred embodiment, the system employs a CPU or other hardware-implemented method as a processing unit for analyzing files in response to selectively programmed parsing, filtering, statistics gathering, and display requests. The embodiment may be incorporated in a device, including a CPU and a plurality of input devices, storage devices, and output devices wherein files are received from the input devices, stored in the storage devices, processed by the CPU based upon one or more programmabl
Baker Peter D.
Neal Karen
Barot Bharat
Lyon & Lyon LLP
NB Networks