Method and apparatus for using system traces to characterize...

Electrical computers and digital data processing systems: input/ – Input/output data processing – Input/output command process

Reexamination Certificate

Details

C709S241000

active

06269410

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates to data storage systems. More specifically, the present invention relates to methods and apparatus for distributing data over a range of data storage devices.
Configuration and management of a data storage system can be a major undertaking. Planning for a medium-scale installation (e.g., a few terabytes) might take many months, representing a significant fiscal expenditure. High-end applications (e.g., OLTP or decision support systems) typically deal with many terabytes of data spread over a range of physical devices. The difficulties inherent in configuring and managing storage are compounded by the sheer scale of these systems. Additionally, these high-end applications tend to exhibit fairly complex behaviors. Thus, the question of how to distribute data over a range of storage devices while providing some performance guarantees is not trivial.
The configuration and management difficulties are further compounded because the configuration of a data storage system is dynamic. After a system is initially configured, the configuration is likely to change. Applications and databases are added, new devices are added, and obsolete or defective devices are removed and replaced by devices having different characteristics. Adding to the complexity of configuring a system is the use of network-attached storage devices, along with clients' desire to share the storage across multiple computer systems with nearly arbitrary interconnection topologies via storage fabrics such as Fibre Channel networks.
The complexity of configuration and management can lead to poor capacity planning, i.e., poor provisioning of resources. Poor capacity planning, in turn, might result in the use of more data storage devices than needed, which needlessly adds to the cost of the data storage system.
Additional problems can flow from poor capacity planning. Poor allocation of data among different devices can reduce throughput. For example, two data sets (e.g., two database tables) that are stored on the same device might be accessed at the same time. Those two data sets would then compete for the same throughput resources, potentially causing a bottleneck and queuing delays.
Queuing delays arise when a storage device is in the process of servicing a first request and receives additional requests. The additional requests are usually queued and will not be serviced until an outstanding request is completed by the device. Eventually, the storage device will service all of the requests that are queued; however, response time will suffer.
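The queuing effect described above can be illustrated with a minimal sketch. It assumes a single device that services requests one at a time in arrival order with a fixed service time; the function name and the sample arrival times are hypothetical and are not taken from the patent.

```python
def fifo_response_times(arrivals, service_time):
    """Response time of each request on one device serving in arrival order.

    Assumption: fixed service time per request, one request in service at a time.
    """
    device_free_at = 0
    responses = []
    for arrival in sorted(arrivals):
        # A request waits in the queue until the device finishes earlier work.
        start = max(arrival, device_free_at)
        device_free_at = start + service_time
        responses.append(device_free_at - arrival)
    return responses


# Requests spaced further apart than the service time never wait.
spaced = fifo_response_times([0, 10, 20, 30], service_time=5)
# Requests arriving in a burst queue up, and response time grows.
bursty = fifo_response_times([0, 1, 2, 3], service_time=5)
print(spaced)  # [5, 5, 5, 5]
print(bursty)  # [5, 9, 13, 17]
```

In both cases every request is eventually serviced, but in the bursty case each later request waits for all earlier ones to complete.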
Analysis of application behavior such as “workload characterization” can be used to improve the capacity planning of data storage systems. For example, if two data sets are competing for the same throughput resources, it would be very useful to identify the degree to which these data sets are being used simultaneously. Once identified, the data sets can be re-allocated to avoid a bottleneck.
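One way to quantify the degree to which two data sets are used simultaneously is sketched below. The metric is an assumption made for illustration, not one defined by the patent: time is divided into fixed windows, and the overlap is the fraction of windows with activity on either data set in which both were active. The function names and timestamps are hypothetical.

```python
def active_windows(timestamps, window):
    """Indices of the fixed-size time windows that contain at least one request."""
    return {int(t // window) for t in timestamps}


def overlap_fraction(ts_a, ts_b, window=1.0):
    """Share of active windows in which both data sets were accessed."""
    a = active_windows(ts_a, window)
    b = active_windows(ts_b, window)
    either = a | b
    if not either:
        return 0.0
    return len(a & b) / len(either)


# Hypothetical request timestamps (seconds) for two database tables.
table_x = [0.1, 0.4, 1.2, 5.5]
table_y = [0.3, 1.9, 7.0]
print(overlap_fraction(table_x, table_y))  # 0.5
```

A value near 1 would suggest the two data sets are usually busy together and are candidates for placement on separate devices; a value near 0 would suggest they rarely compete.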
Therefore, it would be desirable to have a better understanding of workload characterization in order to better allocate workloads across the storage devices.
SUMMARY OF THE INVENTION
The present invention allows for an understanding of I/O activity patterns which, in turn, allows for a better allocation of data across multiple storage devices in a data storage system. I/O activity is characterized in terms of streams (I/O request collections) accessing stores (units of storage).
According to one aspect of the invention, use is made of system traces that are generated during I/O operations with the data storage system. The system traces are gathered, records in the gathered system traces are grouped according to stores, I/O activity in streams corresponding to the stores is identified, and groups of records are processed to characterize I/O activity patterns corresponding to the streams.
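The steps above (gather trace records, group them by store, characterize the I/O activity of each store's stream) might be sketched as follows. The record layout `(timestamp, store, op, size)` and the chosen statistics are assumptions made for illustration; the patent text here does not specify a trace format or which attributes are computed.

```python
from collections import defaultdict


def characterize(trace):
    """Group trace records by store and summarize each store's request stream.

    Assumed record layout: (timestamp_seconds, store_id, "R" or "W", size_bytes).
    """
    by_store = defaultdict(list)
    for ts, store, op, size in trace:
        by_store[store].append((ts, op, size))

    summary = {}
    for store, records in by_store.items():
        times = [r[0] for r in records]
        duration = max(times) - min(times)
        summary[store] = {
            "requests": len(records),
            "read_fraction": sum(1 for r in records if r[1] == "R") / len(records),
            "mean_size": sum(r[2] for r in records) / len(records),
            # Request rate is undefined when all records share one timestamp.
            "rate": len(records) / duration if duration > 0 else None,
        }
    return summary


# Hypothetical gathered trace covering two stores.
trace = [
    (0.0, "storeA", "R", 4096),
    (0.5, "storeB", "W", 8192),
    (1.0, "storeA", "R", 4096),
    (2.0, "storeA", "W", 8192),
]
result = characterize(trace)
print(result["storeA"]["requests"], result["storeA"]["rate"])  # 3 1.5
```

Here each store has a single stream; a fuller treatment would likely separate multiple streams per store and add attributes such as burstiness or sequentiality.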
The stores can be re-allocated across the data storage system based on this characterization of the I/O activity patterns. Thus, the present invention can be used to increase the data throughput of the data storage system, reduce the amount of storage capacity needed, and reduce response time.
Other aspects and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the present invention.

