System and method for comparing data sets

Data processing: database and file management or data structures – Database and file access – Preparing data for information retrieval

Reexamination Certificate

Details

C707S758000

active

07921110

ABSTRACT:
The present invention provides a system and method for comparing data sets, to ensure that they are accurate reflections of each other, without the need to perform O(N²) operations, where N is the size of each data set. A hash table is generated from the entries of the first data set. Each entry of the second data set is then looked up in the hash table: if the entry does not exist in the hash table, it is unique to the second data set; otherwise, it is removed from the hash table. At the end of the pass through the second data set, the entries remaining in the hash table are those unique to the first data set. Alternatively, two processes operate in parallel, each selecting entries from one of the data sets and determining whether the entry exists in the hash table. If the entry exists, it is removed; otherwise, it is added to the hash table.
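
The two procedures described in the abstract can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function names `compare_data_sets` and `compare_interleaved` are hypothetical, entries are assumed to be hashable and distinct within each data set, and the "two parallel processes" of the second variant are simulated by alternating between the data sets in a single thread (a truly concurrent version would need to synchronize access to the shared table).

```python
from itertools import zip_longest

_MISSING = object()  # sentinel used to pad the shorter data set


def compare_data_sets(first, second):
    """Two-pass variant: returns (first_unique, second_unique)."""
    # Pass 1: build a hash table from the first data set, O(N).
    table = set(first)
    second_unique = set()
    # Pass 2: probe the table with each entry of the second data set, O(N).
    for entry in second:
        if entry in table:
            table.remove(entry)       # present in both data sets
        else:
            second_unique.add(entry)  # never seen in the first data set
    # Whatever is left in the table was never matched by the second set.
    return table, second_unique


def compare_interleaved(first, second):
    """Toggle variant: entries from both sets are fed to one shared table."""
    table = {}  # entry -> name of the data set that inserted it

    def visit(entry, origin):
        if entry in table:
            del table[entry]          # matched by the other data set
        else:
            table[entry] = origin     # remember which side added it

    # Alternate between the data sets to imitate two cooperating processes.
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is not _MISSING:
            visit(a, "first")
        if b is not _MISSING:
            visit(b, "second")

    first_unique = {e for e, origin in table.items() if origin == "first"}
    second_unique = {e for e, origin in table.items() if origin == "second"}
    return first_unique, second_unique
```

Each entry is inserted, looked up, and removed at most once, so under the usual assumption of constant-time hash operations both variants do work proportional to the combined size of the data sets rather than comparing every pair of entries.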

