Method and apparatus for evaluating a data processing...

Electrical computers and digital processing systems: multicomput – Computer-to-computer data routing – Least weight routing

Reexamination Certificate

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

C709S241000, C709S241000, C709S241000

Reexamination Certificate

active

06604122

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to performing tasks using distributed processes either in the context of a single data processor or in a multiprocessor context. While the present invention finds advantageous application to any cooperative effort to perform a task, one example application is to database query evaluation in data management systems.
BACKGROUND AND SUMMARY OF THE INVENTION
In database management, data is organized and stored in electronically readable form for subsequent shared access by multiple computer users. Database engines enable those users to submit queries addressing such data, which may be organized conceptually in relational, tabular, hierarchical or other forms, and to receive in response to a query an output known as an answer set. The following is an example of a query. Assume there is a register of employees in a particular company. For each employee, there is stored (among other things) the employee's name, salary, and age. One possible query is to “find all names of employees that have a salary of more than $20,000 and are less than 30 years old.”
Under various circumstances, answer sets can take a significant amount of time to produce, and this problem may be significant. Therefore, database management systems often employ multiple processes operating on a single computer or multiple processors to perform a query by breaking each query request into interrelated tasks that can be executed by using multiple processes or processors. In other words, each process or processor performs some part of the execution of the query. The query results produced from all of the cooperating query processes (i.e., query operators) are gathered and combined to generate the inquired query response/answer set.
The difficulty in a distributed query execution specifically, and in any type of distributed processing system generally, is knowing when all of the interconnected and cooperating processes have completed processing of their respective portions of the overall task. Consider the example distributed processing scenario shown in
FIG. 1. A
user terminal
10
communicates with a computer system
12
by way of a user interface
14
. In the context of a database management system, the computer system
12
may also be connected to a database
16
. A task (such as a database query) from the user interface
14
is provided for distributed execution to a controlling process C which then passes the task (or portions thereof) to supporting processes P
1
, P
2
, P
3
, and P
4
as shown. Each process P
1
-P
4
executes some part (or its own portion) of the task. More specifically, controlling process C sends a message defining or otherwise related to the task to the first process P
1
. First process P
1
performs its part of the task and sends those partial task results along with the message it received from the controlling process C both to supporting processes P
2
and P
3
. The supporting processes P
2
and P
3
perform their respective parts of the task and send their respective results along with the message each received from the process P
1
, which includes the results from the process P
1
, to supporting processes.
The completed task reaches the controlling process C in the form of two messages, one from the chain C→P
1
→P
2
→C and one from the chain C→P
1
→P
3
→P
4
→C. The problem at this point is for the controlling process C to determine when it has received all messages from all of the agent processes involved in performing the task. This problem is complicated by the fact that while the controlling process knows about the first supporting process P
1
, it does not necessarily know the identity of all supporting processes that contributed to the completion of the task. As a result, the controlling process C does not even know how many process messages to expect.
One possible solution to this problem is for all of the supporting processes P
i
to inform the controlling process C of their identities and then to send a message from the controlling process C to each of the supporting processes P
i
(i.e., P
1
-P
4
in
FIG. 1
) requesting that each of the supporting processes communicate with the controlling process C when they have completed its portion of the task. Each supporting process P
i
then sends another message to the controlling process C when it has completed its respective task portion. One drawback of this solution is that it requires that the identity of all the supporting processes that will be involved in executing some portion of the task be known in advance by C. This requirement is not always easy or desirable to meet. Another significant drawback of this possible solution is that all of the overhead-type signaling between the controlling process C and the supporting processes is time consuming and inefficient.
It is an object of the present invention to provide efficient distributed processing of a task in a timely fashion.
It is an object of the present invention to perform distributed processing of a task using plural processes without the need for overhead signaling between various processes.
It is an object of the present invention for a controlling process in a distributed processing environment to be able to determine from the messages received from distributed processes executing a portion of the task when the task is complete without having specialized signaling between the controlling process and each supporting process.
It is a further object of the present invention to provide an efficient distributed processing technique where the controlling process does not even need to know the number or the identities of supporting processes which will participate in the distributed execution of a task.
The present invention solves these problems and satisfies these and other objects. Rather than sending control messages back and forth between the controlling process and each supporting process, each process includes some type of identifying information that it forwards to a supporting process along with a substantive task message. Ultimately, those messages with the identifying information reach the controlling process. The controlling process then deduces from the received identifying information whether all of the messages from all processes involved in the task have been received. If the controlling process deduces they have, the task is complete.
In one example embodiment, as a message related to a data processing task is passed between those processes involved in performing the data processing task. Each of those processes performs some function related to the task and passes an “end token” along with the message—as modified by the function performed by that process—to another of the involved processes. The control process determines that the data processing task is finished using the end tokens received along with the distributed processing results from the involved processes. Each end token includes a unique identifier and the number of copies of the message forwarded by the corresponding process and is used by the control process to generate an array of count values. A count value is established for each unique end token based on the number of message copies forwarded by the corresponding process. When all of the count values are decremented to zero, the controlling process determines that the messages from all involved processes have arrived and processing is complete.
The invention has advantageous application to database management, and in particular, to database query evaluation. However, the present invention is not limited to database management or query evaluation and may be employed advantageously in any application where processes cooperate to achieve an overall goal. The processes may be implemented in the same computer or at separate computers. In a single computer case, the processes may communicate for example using the message passing functionality of the computer's operating system. In a multi-computer case, each computer may communicate over an appropria

LandOfFree

Say what you really think

Search LandOfFree.com for the USA inventors and patents. Rate them and share your experience with other people.

Rating

Method and apparatus for evaluating a data processing... does not yet have a rating. At this time, there are no reviews or comments for this patent.

If you have personal experience with Method and apparatus for evaluating a data processing..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Method and apparatus for evaluating a data processing... will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFUS-PAI-O-3103578

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.