Reexamination Certificate
2001-05-07
2004-09-14
Robinson, Greta (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
active
06792431
ABSTRACT:
BACKGROUND OF THE INVENTION
Large masses of data reside in multiple databases, applications, file systems, repositories, or specialized data stores. These data stores comprise multiple models of multiple products from multiple vendors or manufacturers, all of which use different data structures and different database management systems, including different user interfaces into their respective underlying databases. The data structures within databases vary even among versions of the same model from the same manufacturer. Adding to the complexity, many data stores are not databases as such but comprise, for example, repositories of electronic files or documents stored in file systems under hierarchical directory structures.
Data integration is intended to enable a customer using one repository to make use of data residing in another repository. Data integration customers typically need to locate data in a source repository, transform the data from a source format to a destination format, and transfer the data from the source to the destination.
The most ambitious attempt in prior art to solve the problem of data integration is data warehousing based upon a standard data model. The idea of the standard model is that an industry, for example the seismic data processing industry or the geophysical data processing industry, gathers in committee and agrees on standard data formats for seismic data. The geophysical data processing industry is a good example of the need for data integration because the industry utilizes extremely large volumes of geophysical data regarding wells, well logs, and log curves. If the industry could agree on a standard data model, then the industry could build application programs to convert the multiple data models from various source databases into one standard model and use the data in standard form to transfer data among customers.
In one application of a standard model, data in the standard form is physically stored in a central location called a data warehouse which is then made available to subscribing customers who can make use of the data through applications designed to operate against the standard data model. It is useful to note that data warehousing, as the term is usually used in the data integration industry, does not require use of an industry-wide standard model. In fact, many data warehousing projects start with a group within a corporate entity establishing a local standard model for their own internal warehouse. This local standard model may or may not be based on any industry standard. However, when such a local standard model is established and used as a corporate standard, it behaves identically to an industry-based standard with all its inherent flaws and weaknesses.
The standard data model does, to some extent, ease access to data across structure types. The standard data model, however, demonstrates problems that seem intractable within the standard model itself. One problem is that the standard data model utilizes a completely static standard structure. That is, there is no method or system within the standard model for giving effect to routine changes in source system data structures. After the structure of a standard model is standardized by an industry standards committee (or a local data management group), the standard model structure is locked in place until changed by the committee. The source data structures in the databases integrated by the standard model, however, change daily. The only way to change the standard model data structures to keep up with the changes in structures in industry databases is to gather a list of desired changes, take them to the industry standards committee, and request changes in the standard model. After the committee approves changes in the standard model, all applications desiring to use the new standard model, as well as the software processes, if any, comprising the model itself, must be rewritten, an extremely laborious, expensive, and time-consuming process.
A second problem with the standard model is data loss. The static nature of the standard model means that all data structure changes in industry databases not yet integrated into the standard model result in data loss every time data from an external repository is transferred into the standard model. In addition, the fact that the standard model data structure is established by committee means that it is a compromise practically never capable of including all fields from all databases for any record type. Neither the initial implementation of a standard model nor subsequent upgrades typically include all fields from all repositories contributing transferred data for a record type. For these reasons, actual utilization of a standard model for data integration almost always results in data loss.
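The data loss described above can be illustrated with a minimal Python sketch. The field names, the sample record, and the function `to_standard_model` are hypothetical and do not come from any actual standard model; the sketch only shows how a fixed field list silently drops a field that a source repository added after standardization.

```python
# Hypothetical fixed field list for a "well" record in a static standard model.
STANDARD_WELL_FIELDS = ("well_id", "operator", "total_depth")


def to_standard_model(native_record):
    """Copy only the fields the static standard model knows about.

    Returns the converted record and the sorted names of the fields
    that could not be carried over.
    """
    kept = {
        name: value
        for name, value in native_record.items()
        if name in STANDARD_WELL_FIELDS
    }
    lost = sorted(set(native_record) - set(STANDARD_WELL_FIELDS))
    return kept, lost


# Illustrative source record; "spud_date" stands for a field added to the
# source repository after the standard structure was locked in place.
source_record = {
    "well_id": "W-001",
    "operator": "ExampleCo",
    "total_depth": 3200,
    "spud_date": "2001-05-07",
}

standard_record, lost_fields = to_standard_model(source_record)
print("transferred:", standard_record)
print("lost:", lost_fields)
```

Under these assumptions, the only way to stop losing `spud_date` is to change `STANDARD_WELL_FIELDS` itself, which corresponds to the committee-driven revision the text describes.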
For these reasons, and for other good reasons that will occur to the reader, there is an ongoing need for improved methods and systems for data integration.
SUMMARY
Aspects of the present invention include methods, systems, and products for data integration based upon dynamic common models. Aspects of the present invention typically include adapters as data communications interfaces between native data repositories and data integration applications. Aspects of the present invention typically include loose coupling between adapters and data integration applications. Aspects of the invention are summarized here in terms of methods, although persons skilled in the art will immediately recognize the applicability of this summary equally to systems and to products.
A first aspect of the invention includes methods of data integration including extracting a first native record from a first native repository, through a first adapter for the first native repository. In typical embodiments, the first adapter is loosely coupled for data integration to a data integration application, wherein the first native record from the first native repository has a first native format, and the first native format belongs to a category of formats identified as a datatype.
Typical embodiments include transforming, through the first adapter, the format of the first native record having the first native format to a dynamic common format, the dynamic common format being a subset of a dynamic common model, the dynamic common model comprising mappings specifying transformations to and from the dynamic common format for all data elements in all formats of all native records in all datatypes, whereby is produced a first native record having the dynamic common format.
Typical embodiments include transforming, through a second adapter, the format of the first native record having the dynamic common format from the dynamic common format to a second native format of a second native repository, the second native format belonging to a category of formats identified as a datatype, wherein the second adapter is loosely coupled for data integration to the data integration application, whereby is produced a first native record having attributes in the second native format. Typical embodiments include inserting, through the second adapter, the first native record having the second native format into the second native repository.
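The four steps summarized above (extract, transform to the dynamic common format, transform to the second native format, insert) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class names `DynamicCommonModel` and `Adapter`, the function `integrate`, the format names `vendor_a` and `vendor_b`, and all field names are hypothetical, and in-memory lists stand in for native repositories.

```python
class DynamicCommonModel:
    """Hypothetical holder of field mappings; mappings can be added at run time."""

    def __init__(self):
        # (datatype, native_format) -> {native_field: common_field}
        self._mappings = {}

    def register(self, datatype, native_format, field_map):
        self._mappings[(datatype, native_format)] = dict(field_map)

    def to_common(self, datatype, native_format, record):
        field_map = self._mappings[(datatype, native_format)]
        # Raises KeyError if a native field has no mapping registered.
        return {field_map[name]: value for name, value in record.items()}

    def from_common(self, datatype, native_format, record):
        field_map = self._mappings[(datatype, native_format)]
        inverse = {common: native for native, common in field_map.items()}
        # Assumption: common fields the destination format does not map are skipped.
        return {inverse[n]: v for n, v in record.items() if n in inverse}


class Adapter:
    """Hypothetical adapter: the only interface the application sees."""

    def __init__(self, model, datatype, native_format, repository):
        self.model = model
        self.datatype = datatype
        self.native_format = native_format
        self.repository = repository  # a plain list stands in for a repository

    def extract(self, index):
        return dict(self.repository[index])

    def to_common(self, native_record):
        return self.model.to_common(self.datatype, self.native_format, native_record)

    def from_common(self, common_record):
        return self.model.from_common(self.datatype, self.native_format, common_record)

    def insert(self, native_record):
        self.repository.append(native_record)


def integrate(source, destination, index):
    """Application logic: it depends only on the adapter interface."""
    native = source.extract(index)                # extract first native record
    common = source.to_common(native)             # first native -> common format
    converted = destination.from_common(common)   # common -> second native format
    destination.insert(converted)                 # insert into second repository
    return converted


model = DynamicCommonModel()
model.register("well", "vendor_a", {"WELL_NO": "well_id", "DEPTH_M": "total_depth"})
model.register("well", "vendor_b", {"id": "well_id", "td": "total_depth"})

repo_a = [{"WELL_NO": "W-001", "DEPTH_M": 3200}]
repo_b = []
result = integrate(
    Adapter(model, "well", "vendor_a", repo_a),
    Adapter(model, "well", "vendor_b", repo_b),
    0,
)
print(repo_b)
```

In this sketch the loose coupling shows up in `integrate`, which never touches a native field name: supporting a changed or additional format means calling `register` with a new mapping rather than rewriting the application.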
Other aspects of the invention include methods of creating systems implementing a dynamic common model, the systems typically including data integration applications, the methods typically including developing a first adapter for a first native repository, the first adapter being loosely coupled for data integration to the data integration application, the first native repository comprising first native records having first native formats, the first native formats belonging to categories of formats identified as datatypes. Typical embodiments further include developing a second adapter for a second native repository, the second adapter being loosely coupled for data integration to the data integration application, the second native repository comprising second native records having second native formats, the second native formats belonging to
Jacobs John
Tamboli Aderbad
Anadarko Petroleum Corporation
Arnold & Ferrera
Ferrera Raymond R.
Rayyan Susan
Robinson Greta