Error detection/correction and fault detection/recovery – Pulse or data error handling – Transmission facility testing
Reexamination Certificate
1999-06-08
2002-03-05
Chung, Phung M. (Department: 2784)
C714S047300, C714S048000
active
06353902
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates in general to improved telecommunications networks, and in particular to a method and system for fault prevention in a telecommunications network. Still more particularly, the present invention relates to a method and system for fault prediction and proactive maintenance in a telecommunications network.
2. Description of the Related Art
In a highly competitive market, there is a demand for networks that are highly reliable and easy to monitor. This includes the detection of faults in real time or near real time with minimal manual intervention.
Modern telecommunications networks are growing fast in both size and complexity. A network alarm correlation system improves network reliability through network surveillance and fault management. Traditionally, alarms (also referred to as logs and utilized interchangeably throughout this document) report status and abnormalities in the network to the Network Operations Centers (NOC) manned by network domain experts. These alarms are generated by the Network Elements (NE). NEs produce thousands of alarms a day, where a single failure often generates multiple alarms and the same alarm may be raised by different failures. Currently, a burst of alarms during a major network failure may exhibit 40-50 alarms per second. Because of their sheer volume, these alarms, which are provided to protect the network, may cause network operators to overlook alarms, notice them too late, or interpret groups of alarms incorrectly, which results in frequent and undetected network failures. Thus, the task of network surveillance and of managing failures and faults is very difficult. Added to this is the ever increasing number of alarms introduced to the system by new software loads.
Previously, when network maintenance was entirely dependent on network domain experts, all network logs flowed directly to the NOC. As networks grew exponentially, so did the number of logs flooding the NOCs. Because failures frequently cannot be foreseen, NOC staff operate in a reactive mode, responding to failures that have already occurred, rather than in a proactive mode that contains failures in their initial stages. Such frequent network failures affect the revenues of service providers and result in low customer satisfaction. Thus, identifying faults and correcting them before it is too late is a critical task of network management.
The currently available tools for log detection are nodal management tools. These tools often cannot perform root cause analysis or predict faults, require manual monitoring of the network, and are reactive in nature.
The International Telecommunication Union (ITU) has put forth a five-layered model known as the Telecommunications Management Network (TMN) to address this problem. TMN includes (1) the business management layer, (2) the service management layer, (3) the network management (NM) layer, (4) the element management (EM) layer and (5) the network element (NE) layer.
Fault prediction applications exist for the EM layer and NE layer which are vendor-specific. The NM layer is the domain of the equipment manufacturer and because of this it is difficult to integrate multi-vendor products.
Faults in the NM layer are due to the impact of external factors such as busy hour traffic, a cable cut during road construction, or a microwave link failure due to bad weather. A failure results in network downtime, loss of revenue to the service provider, and reduced customer satisfaction. Thus, identifying these and other faults and correcting them before it is too late is a critical task of network management.
With this backdrop of a chaotic network management structure, calls for equipment vendors to be more proactive in resolving these issues with efficient network management systems have been growing. The answer from many equipment manufacturers has been to build and deploy the alarm correlation systems described above. These alarm correlation systems are placed between the network and the NOC.
The older (first generation) alarm correlation systems are more domain expert intensive. Here the fault patterns observed by the experts are implemented as rules in an expert system. The information required for building an expert system is readily available (in the form of experts' knowledge).
However, as discussed above, such first generation systems are incomplete due to the nature of telecommunications today. Because of the complexity of code and the number of logs that may be generated in a typical fault scenario (around 5,000 a second), it is almost impossible for a group of domain experts to catch any significant number of the faults and take the necessary proactive measures to prevent a network failure. The problems appear continuously, but the expert never gets the opportunity to completely analyze the scenario prior to the next occurrence. This leads to an incomplete knowledge base of failures and proactive actions by the domain expert.
The newer (next generation) alarm correlation systems have incorporated many software methodologies and concepts to systematically search alarm databases, problem ticket databases, etc. and extract patterns not seen by the domain experts. The next generation alarm correlation solutions employ data mining (DM) techniques to identify and learn patterns (commonly referred to as episode rules in data mining terminology). Analogous to the phase where the fault patterns are extracted from domain experts in traditional systems, DM techniques extract fault patterns from alarm databases. These rules are then reviewed by system experts to determine whether each rule is redundant or significant. The resulting patterns are then fed in as rules to an expert system.
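The idea of extracting episode rules from an alarm database can be illustrated with a minimal sketch. It is not the method of this invention or of any named product; it only assumes that each alarm is a (timestamp, alarm type) pair and that a rule "A is followed by B within a time window" is kept when it is both frequent enough and reliable enough. The function name `mine_pair_rules`, its parameters, and the alarm type names are hypothetical.

```python
from collections import Counter


def mine_pair_rules(alarms, window, min_support, min_confidence):
    """Find simple two-alarm episode rules "a is followed by b within `window` seconds".

    `alarms` is a list of (timestamp_seconds, alarm_type) tuples sorted by time.
    Returns a list of (a, b, support, confidence) tuples, where support is the
    number of occurrences of `a` that were followed by `b` inside the window and
    confidence is support divided by the total number of occurrences of `a`.
    """
    antecedent_counts = Counter()
    pair_counts = Counter()

    for i, (t_a, a) in enumerate(alarms):
        antecedent_counts[a] += 1
        seen = set()  # count each follower type at most once per occurrence of `a`
        for t_b, b in alarms[i + 1:]:
            if t_b - t_a > window:
                break  # alarms are time-sorted, so nothing later can fit
            if b != a and b not in seen:
                seen.add(b)
                pair_counts[(a, b)] += 1

    rules = []
    for (a, b), support in pair_counts.items():
        confidence = support / antecedent_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))

    # Most reliable rules first; ties broken by support, then by name.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Hypothetical alarm log: timestamps in seconds, alarm type names invented.
alarms = [
    (0, "LINK_DEGRADED"),
    (5, "LINK_DOWN"),
    (100, "LINK_DEGRADED"),
    (104, "LINK_DOWN"),
    (200, "FAN_FAIL"),
    (300, "LINK_DEGRADED"),
    (400, "LINK_DOWN"),
]

rules = mine_pair_rules(alarms, window=10, min_support=2, min_confidence=0.6)
for a, b, support, confidence in rules:
    print(f"{a} -> {b} (support={support}, confidence={confidence:.2f})")
```

In this sketch the candidate rules would still be handed to system experts for review, as the text describes, before being loaded into an expert system. Real episode mining considers longer sequences and much larger alarm volumes, which this pairwise version does not attempt to handle.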
Several such solutions have been advanced, including sophisticated systems that consist of both traditional and non-traditional alarm correlation systems. These include: (i) the Telecommunications Alarm Sequence Analyzer (TASA) by NOKIA; (ii) ANSWER and ECXpert, two tools created by AT&T; and (iii) IMPACT by GTE.
Many of the current approaches for alarm correlation depend on the expertise of a domain expert to provide the observed network fault patterns. However, as previously discussed, it is not sufficient to depend on domain expertise alone. The rapidly evolving networks continuously alter the existing network topology with the addition of new network elements, new software loads, and network connections. These scenarios pose a serious threat to the expert's knowledge, which to a great extent relies on seeing a pattern over and over again. This opens the field for the utilization of systems that are capable of assisting the domain experts in identifying fault patterns.
It is therefore desirable to have a system for dynamically handling faults in a telecommunications network that is capable of discovering, learning and predicting the recurrent patterns of faults of a network as well as being capable of providing precautionary action. It would be further desirable to have a network alarm correlation system that dynamically and systematically discovers alarm correlation rules that enable root cause analysis, fault prediction and proactive maintenance.
SUMMARY OF THE INVENTION
It is therefore one object of the present invention to provide an improved telecommunications network.
It is another object of the present invention to provide a method and system for identifying fault patterns as they occur in a telecommunications network.
It is yet another object of the present invention to provide a method and system for fault prediction and proactive maintenance in a telecommunications network.
The foregoing objects are achieved as is now described. A system for proactive maintenance of a telecommunications network is disclosed. A database is created containing characteristics (parameters) of a plurality of valid logs. These valid logs represent alarms within a network that report status and abnormalit
Basu Kalyan
Kulatunge Anurudha
Lee Hee C.
Prakash Meenakshi
Bracewell & Patterson L.L.P.
Chung Phung M.
Crane John D.
Nortel Networks Limited