Data processing: measuring, calibrating, or testing – Measurement system – Performance or efficiency evaluation
Reexamination Certificate
2002-12-31
2004-10-12
Barlow, John (Department: 2863)
Data processing: measuring, calibrating, or testing
Measurement system
Performance or efficiency evaluation
C707S793000, C707S793000, C709S224000
active
06804627
ABSTRACT:
BACKGROUND OF THE INVENTION
Conventional database management applications organize and store large amounts of data according to a predetermined normalization and indexing scheme to allow efficient update and retrieval by a user. In a typical relational database, the data is organized into tables on a logical level, and further subdivided into physical disks, files, extents and segments. Further, a particular database operation often involves multiple tables, which are joined or linked using key fields. Also, a single table may be spread across multiple disks or files, due to volume or access contention constraints.
Accordingly, a single database query may invoke a number of database I/O requests to a plurality of database resources, such as disks, files, tables and extents. Further, I/O requests triggering a physical device access tend to be particularly burdensome operations in a database management application because of disk seek and latency time. Multiple users attempting to access a database simultaneously can therefore contend for common database resources, such as disks and files, resulting in a bottleneck for access to the common resource.
While proper database design and resource allocation purports to balance the expected demand load, at least initially, database contention and the resulting bottlenecks represent an ongoing maintenance issue. Changes in the number of users, increases in the quantity of data stored, and the method of access (e.g., LAN throughput, remote Internet access) can affect the demand load placed on database resources. Further, disks and files can become fragmented and extended over time, thereby causing a table or file to migrate to different physical areas, and increasing the likelihood of incurring additional read and latency access time.
Accordingly, conventional methods are known for tracking database access attempts and providing output indicative of database operations. Conventional systems employ event logging, log files, media utilization graphs, high water marks and CPU utilization graphs to track database usage and isolate potential or actual bottlenecks. These conventional methods typically provide a graphical or textual output format that an operator or field service technician can interpret in an attempt to assess database resource contention.
SUMMARY
Conventional database analysis methods suffer from a variety of deficiencies. In general, conventional methods typically generate output that is too voluminous and unwieldy to be analyzed effectively, or are prohibitively intrusive such that normal database throughput suffers from the monitoring overhead. In particular, the methods outlined above tend to generate log files, which dump an indication of each database access attempt. A typical conventional event logger or log file will generate a very large text or other type of file identifying each transaction over a data gathering period. Typically, these conventional files contain extraneous data such as system operations and access to tables or files which are not the subject of the analysis, in addition to pertinent database table accesses. Additionally, the conventional systems perform a subsequent analysis operation of the raw data that imposes a lag time on the output result, hindering any ability to obtain real time feedback on database performance.
Often, conventional database tracking entries are written with such frequency that the CPU overhead required hinders overall system performance. Conventional graphical analysis, such as CPU utilization graphs and disk utilization graphs, can also entail substantial overhead. Also, during processing of conventional database statistics systems, other computer system activities tend to affect CPU usage in addition to access to the database tables or files for which information is sought, thereby skewing the results of such a CPU or disk graph.
To illustrate an example of deficiencies posed by conventional database analysis methods, consider an operator establishing a log file for access to a database table. The operator designates an hour of log time. The operator is focused on database accesses to a certain table, but many tables are frequently accessed in the logged database instance. Consider further that each user access transaction results in an acknowledgement from the disk and a confirmatory update to an index. Accordingly, the conventional logging process generates three entries for each access transaction for all tables, resulting in a large, unwieldy log file.
The operator can access the resulting unwieldy data in the log file several ways using conventional systems. One conventional technique involves manual inspection of the log file by table name and may yield the transactions initiated for the particular table, but the operator will need to examine many other entries and may inadvertently skip entries in the voluminous hardcopy while scanning for the proper table name. A conventional parser could analyze the log file automatically, but the operator must manually develop the procedure to parse the log file and look for the table name. The operator may be able to modify the conventional logging procedure to selectively log certain entries; however, this approach also requires manual coding of procedures.
Embodiments of the invention are based, in part, on the observation that it would be beneficial to provide a database performance gathering and analysis tool to retrieve database requests without gathering substantial extraneous data and without unduly burdening the database or the CPU with the resources required to execute the tool itself. Configurations of the present invention significantly overcome deficiencies with the above conventional methods and provide such a solution. In particular, embodiments of the invention provide mechanisms and techniques that include a method for processing database performance statistics that includes periodic sampling of pending database requests, rather than exhaustively monitoring and capturing all database access traffic, to identify areas of contention. The sampling is done in sample/sleep intervals that occur for a predefined time period such as 20 seconds for each database instance. The cycle of sampling different database instances can repeat, for example, every two minutes for a total sampling time of 30 minutes. By using a unique embedded set of sample sequences for different instances of a database, embodiments of the invention can obtain an accurate indication of performance bottlenecks to various database resources of different database instances.
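The timing described above can be illustrated with a short Python sketch. It is not taken from the patent: the names `Window` and `build_schedule`, the instance names, and the assumption that the instances are sampled one after another within each cycle are all hypothetical. Only the example figures (20-second windows, a two-minute cycle, 30 minutes in total) come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Window:
    """One sampling window for one database instance (hypothetical type)."""
    instance: str
    start_s: int
    end_s: int


def build_schedule(instances: List[str],
                   window_s: int = 20,
                   cycle_s: int = 120,
                   total_s: int = 1800) -> List[Window]:
    """Lay out sampling windows: each instance gets one window per cycle.

    Assumption: instances are sampled back to back at the start of each
    cycle, so all windows of one cycle must fit inside the cycle length.
    """
    if len(instances) * window_s > cycle_s:
        raise ValueError("too many instances to fit in one cycle")
    schedule = []
    for cycle_start in range(0, total_s, cycle_s):
        for i, name in enumerate(instances):
            start = cycle_start + i * window_s
            schedule.append(Window(name, start, start + window_s))
    return schedule


# Hypothetical instance names, used only for illustration.
schedule = build_schedule(["db_a", "db_b", "db_c"])
per_instance_s = sum(w.end_s - w.start_s for w in schedule
                     if w.instance == "db_a")
```

Under these assumptions each instance is observed for 20 seconds out of every 120, so the sampler is idle with respect to any given instance most of the time, which is the source of the low overhead the text describes.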
During this sampling process, the system of the invention periodically samples or scans a database access queue to gather samples of pending requests corresponding to database transactions. An aggregating component receives the sampled requests and aggregates the samples with previous samples corresponding to the same transaction. Correlating the aggregated samples identifies transactions that have been pending the longest and identifies database objects, such as files, tables and segments, which have a relatively high number of pending transactions. By periodically sampling, rather than exhaustively logging all requests, embodiments of the invention significantly reduce or minimize CPU intrusiveness and significantly eliminate trivial and benign transactions from the output. Further still, embodiments of the invention identify the most burdened database objects to enable a database administrator to make informed decisions about remedial actions to correct database performance issues. Also, by sampling using a sampling structure that is then “dumped” out to the aggregating structure, continuously pending transaction progress can be tracked over multiple sample iterations.
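As a rough illustration of the aggregation step, the following Python sketch folds successive queue samples into running totals. It is a minimal sketch under stated assumptions, not the patented implementation: the `Aggregator` class, its method names, and the representation of a pending request as a `(transaction_id, object_name)` pair are all hypothetical. The number of samples in which a transaction appears is used here as a proxy for how long it has been pending.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Assumed shape of one pending request: (transaction_id, object_name).
PendingRequest = Tuple[str, str]


class Aggregator:
    """Accumulates queue samples across iterations (hypothetical class)."""

    def __init__(self) -> None:
        # Number of samples in which each transaction was seen pending.
        self.samples_seen: Counter = Counter()
        # Distinct transactions seen waiting on each database object.
        self.txns_by_object: Dict[str, Set[str]] = defaultdict(set)

    def add_sample(self, snapshot: Iterable[PendingRequest]) -> None:
        """Merge one sampled snapshot of the access queue."""
        seen_this_sample: Set[str] = set()
        for txn_id, obj in snapshot:
            if txn_id not in seen_this_sample:
                self.samples_seen[txn_id] += 1
                seen_this_sample.add(txn_id)
            self.txns_by_object[obj].add(txn_id)

    def longest_pending(self, n: int = 3) -> List[Tuple[str, int]]:
        """Transactions present in the most samples."""
        return self.samples_seen.most_common(n)

    def busiest_objects(self, n: int = 3) -> List[Tuple[str, int]]:
        """Objects with the most distinct pending transactions."""
        counts = Counter({obj: len(txns)
                          for obj, txns in self.txns_by_object.items()})
        return counts.most_common(n)


# Three made-up snapshots of a pending-request queue.
agg = Aggregator()
agg.add_sample([("t1", "orders"), ("t2", "orders"), ("t3", "customers")])
agg.add_sample([("t1", "orders"), ("t4", "orders")])
agg.add_sample([("t1", "orders"), ("t4", "orders"), ("t5", "customers")])

print(agg.longest_pending(2))   # [('t1', 3), ('t4', 2)]
print(agg.busiest_objects(1))   # [('orders', 3)]
```

Short-lived transactions such as `t2` appear in a single sample and contribute little to the totals, while `t1` stays visible across all three iterations, which mirrors how sampling filters out trivial requests and highlights persistent ones.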
The database performance gathering and analysis tool of this invention therefore substantially pinpoints areas of contention in the database, allowing a database administrator or other operator to pursue quantitative and deterministic remedial actions, rather than
Chen Shu-zi
Marokhovsky Serge G.
Prathab Sadasiva K
Ward Anthony
Barlow John
Chapin & Huang LLC
Chapin, Esq. Barry W.
EMC Corporation
Le John