NUMA system with redundant main memory architecture

Electrical computers and digital processing systems: memory – Storage accessing and control – Shared memory area

Reexamination Certificate


Details

Classification: C711S162000
Type: Reexamination Certificate
Status: active
Patent number: 06785783

ABSTRACT:

BACKGROUND
1. Field of the Present Invention
The present invention generally relates to the field of data processing systems and more particularly to a non-uniform memory architecture (NUMA) system in which main memory data is backed up on one or more other nodes of the NUMA system using RAID-like techniques to improve fault tolerance and reduce the amount of time spent storing data to permanent storage.
2. History of Related Art
In the field of microprocessor-based data processing systems, the use of multiple processors to improve the performance of a computer system is well known. In a typical multi-processor arrangement commonly referred to as a symmetric multi-processor (SMP) system, a set of processors accesses a system memory via a shared bus referred to herein as the system or local bus. The use of a shared bus presents a scalability limitation. More specifically, the shared bus architecture ultimately limits the ability to improve performance by connecting additional processors to the system bus: after a certain point, the limiting factor in the performance of a multiprocessor system is the bandwidth of the system bus. Roughly speaking, the system bus bandwidth is typically saturated after four processors have been attached to the bus. Incorporating additional processors beyond four generally results in little, if any, performance improvement.
To combat the bandwidth limitations of shared bus systems, distributed memory systems, in which two or more SMP systems (referred to as nodes) are connected to form a larger system, have been proposed and implemented. One example of such a system is referred to as a non-uniform memory architecture (NUMA) system. A NUMA system is comprised of multiple nodes, each of which may include its own processors, local memory, and corresponding system bus. The memory local to each node is accessible to the other nodes via an interconnect network (referred to herein as the NUMA fabric) that links the various nodes. The use of multiple system busses (one for each node) enables NUMA systems to employ additional processors without incurring the system bus bandwidth limitation experienced by single bus systems.
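As a rough illustration of the topology described above, the following C sketch models a NUMA system as a set of nodes, each with its own processors, local memory, and bus, joined by an interconnect. All type and field names here are hypothetical and chosen for clarity; they are not taken from the patent.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the NUMA topology described above. */
#define PROCS_PER_NODE 4            /* roughly the per-bus scaling limit noted in the text */

struct processor {
    unsigned id;
};

struct local_memory {
    uint8_t *base;                  /* DRAM local to one node */
    size_t   size;
};

struct numa_node {
    struct processor    cpus[PROCS_PER_NODE];
    struct local_memory mem;        /* memory local to this node */
    /* each node has its own system bus; remote memory is reached via the fabric */
};

struct numa_system {
    struct numa_node *nodes;        /* nodes linked by the interconnect (NUMA fabric) */
    unsigned          node_count;
};

/* Accessing memory that lives on another node crosses the fabric, which is
 * slower than the local bus but avoids a single shared-bus bottleneck. */
static inline uint8_t read_byte(struct numa_system *sys, unsigned node, size_t offset)
{
    return sys->nodes[node].mem.base[offset];
}
```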
For many data processing applications, reliably maintaining the application's data is of paramount importance. The reliability of data is conventionally maintained by periodically backing up the data in main memory to persistent or non-volatile memory. Referring to FIG. 1, a data processing system 100 is illustrated in block diagram format. Data processing system 100 may include one or more nodes 102. Each node 102 includes one or more processors 104 that access a local memory 108 via a memory controller 106. A cache memory (not explicitly shown in FIG. 1) may reside between a processor and the memory controller. Nodes 102 may share a common, persistent mass storage device or devices identified in FIG. 1 as disk 112. If multiple disks are used, they may be arranged as a redundant array of inexpensive disks (RAID) to assure high availability of the data. RAID designs are described in Source, which is incorporated by reference herein.
Local memory 108 is typically implemented with dynamic random access memory (DRAM), which is susceptible to power loss but has a significantly faster access time than disk 112. The application data stored in local memory 108 is periodically written back to disk 112 to protect against data loss from an unexpected event such as a power outage or node crash. The frequency with which data in local memory 108 is written back to disk 112 is a function of the particular application and the rate at which data accumulates in local memory 108. Data-intensive applications may require frequent disk backups to guard against loss of a large amount of data. The time required to write data to or retrieve data from disk 112 (the disk access time) is characteristically orders of magnitude greater than the access time of RAM 108. Application performance may, therefore, suffer in data-intensive applications requiring frequent disk backup. It would be highly desirable, therefore, to implement a system in which data is maintained with sufficient reliability in a high-speed memory, enabling less frequent disk backup and thereby enhancing system performance.
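The conventional write-back path described above can be pictured with a short C sketch: a checkpoint routine that periodically copies the contents of a node's local memory to a file on disk 112. The function names, the 60-second interval, and the file path are illustrative assumptions, not details from the patent.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical checkpoint of a region of local memory (108) to disk (112).
 * The slow step is the write(): disk access time is orders of magnitude
 * greater than DRAM access time, so frequent backups hurt performance. */
static int checkpoint_to_disk(const uint8_t *region, size_t len, const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, region, len);   /* the expensive disk access */
    fsync(fd);                                   /* make the backup durable */
    close(fd);
    return (written == (ssize_t)len) ? 0 : -1;
}

int main(void)
{
    static uint8_t app_data[1 << 20];            /* stand-in for data in local memory */

    for (;;) {
        /* ... application updates app_data ... */
        if (checkpoint_to_disk(app_data, sizeof app_data, "/backup/app.img") != 0)
            perror("checkpoint");
        sleep(60);                               /* backup interval is application dependent */
    }
}
```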
SUMMARY OF THE INVENTION
The problem identified above is addressed by a method and system for managing data in a data processing system as disclosed herein. Initially, data is stored in a first portion of the main memory of the system. Responsive to storing the data in the first portion of main memory, information is then stored in a second portion of the main memory. The information stored in the second portion of main memory is indicative of the data stored in the first portion. In an embodiment in which the data processing system is implemented as a multi-node system such as a NUMA system, the first portion of the main memory is in the main memory of a first node of the system and the second portion of the main memory is in the main memory of a second node of the system. In one embodiment, storing information in the second portion of the main memory is achieved by storing a copy of the data in the second portion. If a fault in the first portion of the main memory is detected, the information in the second main memory portion is retrieved and stored to a persistent storage device. In another embodiment, storing information in the second portion of the main memory includes calculating a value based on the corresponding contents of other portions of the main memory using an algorithm such as checksum, parity, or ECC, and storing the calculated value in the second portion. In one embodiment, the main memory of at least one of the nodes is connectable to a persistent source of power, such as a battery, so that the main memory contents may be preserved if system power is disabled.
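A minimal C sketch of the redundancy scheme summarized above: each write to the data region on one node is mirrored, or in the parity variant XOR-folded, into a corresponding region on a second node, and on a detected fault the second region is flushed to persistent storage. The memcpy-based stand-in for a fabric transfer, the function names, and the flush_to_disk helper are assumptions for illustration; the patent text leaves the choice among checksum, parity, and ECC open.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 'primary' is the first portion of main memory (on one node),
 * 'redundant' is the second portion (on another node), reached over the NUMA fabric. */
struct mem_region {
    uint8_t *base;
    size_t   size;
};

/* Mirroring embodiment: store a copy of the data in the second portion. */
static void redundant_write_copy(struct mem_region *primary,
                                 struct mem_region *redundant,
                                 size_t off, const void *src, size_t len)
{
    memcpy(primary->base + off, src, len);
    memcpy(redundant->base + off, src, len);   /* stand-in for a fabric transfer */
}

/* Parity embodiment: fold the new data into the redundant region with XOR,
 * RAID-style, so the primary contents can be reconstructed after a fault. */
static void redundant_write_parity(struct mem_region *primary,
                                   struct mem_region *parity,
                                   size_t off, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* remove the old data from the parity, then fold in the new data */
        parity->base[off + i] ^= primary->base[off + i] ^ src[i];
        primary->base[off + i] = src[i];
    }
}

/* On a detected fault in the primary portion, the information held in the
 * redundant portion is retrieved and written out to persistent storage. */
extern int flush_to_disk(const uint8_t *data, size_t len);  /* assumed helper */

static int handle_primary_fault(struct mem_region *redundant)
{
    return flush_to_disk(redundant->base, redundant->size);
}
```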


REFERENCES:
patent: 5177744 (1993-01-01), Cesare et al.
patent: 5469542 (1995-11-01), Foster et al.
patent: 5495570 (1996-02-01), Heugel et al.
patent: 6035411 (2000-03-01), Yomtoubian
patent: 2001075741 (2001-03-01), None
