High concurrency data download apparatus and method

Multiplex communications – Pathfinding or routing – Switching a message which includes an address header

Reexamination Certificate


C370S473000

active

06289012

ABSTRACT:

FIELD OF INVENTION
The invention pertains generally to downloading large data items in packets across communication systems, and in particular to downloading concurrently a large data item to a plurality of users on demand.
BACKGROUND OF THE INVENTION
Networking is driving increasing demand for distributed computing, particularly the “client-server” style, in which one software entity, the “client,” requests a service from another software entity, the “server.” One such service is access to a file stored on or available to the server. This access may take one of several forms. In one form, a client requests data in a file managed by a database located on a server; the server retrieves the data from the database and sends it to the client. In another form, on-line shared access, a client application accesses files stored by a server's file system in the same manner it accesses its local file system. A third form is file transfer, in which an entire file is transferred. Although transparent on-line access to a remote file may be desirable in some applications, it is often difficult to implement, especially in a network or between networks composed of heterogeneous computers. Furthermore, networks are often congested or overloaded, and the resulting unavailability of remote files can create any number of problems. Therefore, it is often preferable to transfer an entire file even in situations where on-line file sharing might otherwise be desirable. With file transfer, an entire program, data file or other large item can be transferred to a client for immediate or long-term use.
Protocols developed for the Internet and its predecessor packet-switched data networks are increasingly being relied upon for communication not only between interconnected networks of differing media, but also on local and wide area networks. These include various protocols for transferring files. The primary communication and file transfer protocols are the Internet Protocol (IP), the Transmission Control Protocol (TCP), the Hypertext Transfer Protocol (HTTP) and the File Transfer Protocol (FTP).
The IP protocol is used in the transmission of data from a source host computer (or, simply, “host”) to a destination. It is a connectionless, or datagram, service that provides for addressing independently of the underlying network medium. It does not provide end-to-end communication services. Rather, it is up to implementations of the TCP protocol to do so. TCP is a connection-oriented transport service that provides end-to-end reliability, packet sequencing and packet flow control. FTP and HTTP are application layer services that rely on the underlying TCP/IP processes. FTP, for example, sets up and manages the data connection over which a file is transferred, as well as handling any data transformations that are necessary. HTTP performs many similar services.
More particularly, a TCP session provides reliability and flow control in the transmission of the data in the file. Data that flows in a TCP connection may be thought of as a stream of octets. A sending TCP implementation is allowed to collect data from the sending user and to send that data in segments at its own convenience, until the user signals a “push” function. Then, it must send all unsent data. A stream of data sent on a TCP connection is delivered reliably and in order at the destination. Transmission is made reliable by use of sequence numbers and acknowledgments. Conceptually, each octet of data is assigned a sequence number. The sequence number of the first octet of data in a segment is transmitted with that segment and is called the segment sequence number. Segments also carry an acknowledgment number, which is the sequence number of the next expected octet of data transmitted in the reverse direction. When the TCP transmits a segment containing data, it puts a copy on a retransmission queue and starts a timer. When the acknowledgment for that data is received, the segment is deleted from the queue. If the acknowledgment is not received before the timer runs out, the segment is retransmitted.
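The sequence-number, acknowledgment and retransmission-queue bookkeeping described above can be illustrated with a minimal sketch. It is not a TCP implementation: the names Segment and Sender, the fixed timeout, and the use of caller-supplied timestamps are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    seq: int         # sequence number of the first octet in the segment
    data: bytes
    deadline: float  # time at which the retransmission timer expires


@dataclass
class Sender:
    timeout: float = 1.0   # hypothetical fixed retransmission timeout
    next_seq: int = 0      # sequence number for the next new octet
    queue: list = field(default_factory=list)  # retransmission queue

    def send(self, data: bytes, now: float) -> Segment:
        # Keep a copy on the retransmission queue and start its timer.
        seg = Segment(self.next_seq, data, now + self.timeout)
        self.queue.append(seg)
        self.next_seq += len(data)
        return seg

    def on_ack(self, ack: int) -> None:
        # ack is the next octet the receiver expects, so any segment that
        # ends at or before it has been received and can be deleted.
        self.queue = [s for s in self.queue if s.seq + len(s.data) > ack]

    def expired(self, now: float) -> list:
        # Segments whose timer ran out must be retransmitted; restart timers.
        due = [s for s in self.queue if s.deadline <= now]
        for s in due:
            s.deadline = now + self.timeout
        return due
```

A caller would invoke send for each outgoing segment, on_ack whenever an acknowledgment number arrives, and expired periodically to obtain the segments to retransmit. Note that every connection needs its own queue of this kind, which is the per-connection overhead discussed next.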
One problem with using FTP and HTTP is that TCP requires significant resource overhead. In order to implement TCP, each user connection must have a separate TCP buffer. Thus, on a server maintaining multiple user connections, significant resources are consumed, causing the server to run slowly, to be unable to make new connections, or to crash.
For example, as shown in FIG. 1, if there are three users, a file transfer application at application layer 101, such as one implementing FTP, will prepare three different copies of an item 103 to be downloaded. The copies are designated as 103a, 103b and 103c, for users 1, 2 and 3, respectively. The application reads each of these copies into a separate TCP buffer for the user, which are referenced as 105a, 105b and 105c, respectively. A TCP implementation runs at TCP layer 107. In a network layer 109, the TCP implementation copies packets of data, in some predetermined order, from the buffers 105a, 105b and 105c to a buffer 111 of a network communications card, which transmits the packets over the network.
Referring now to FIG. 2, some wasted memory can be reclaimed at the application layer 101 by caching a single copy of the item to be downloaded and copying it to each of the TCP buffers 105a, 105b and 105c. Furthermore, although not shown in FIGS. 1 or 2, FTP requires both a control connection and a data transfer connection for each user, thus doubling the number of memory buffers.
Referring to FIG. 3, where multiple users have requested the same download item and the download to all the users can be scheduled for the same time, a multicast-capable network can be used. In this case, the file transfer application at layer 101 makes use of a network communications system implementing the User Datagram Protocol (UDP) and a multicasting protocol. UDP is a connectionless transport layer protocol. It is up to the application making use of a UDP implementation to deal directly with end-to-end communication problems such as packetization and reassembly, flow control and retransmission for reliable delivery. Thus, it is up to the application to deliver packets containing data from download item 103 to the network buffer 111. One disadvantage of this method is that the network must be multicast-enabled. The Internet is not fully multicast-enabled at this time. Another disadvantage is that a file transfer application relying on network multicasting cannot provide a download when demanded by only a single user, or carry out downloads of the same item to multiple users commenced at different times. In other words, multicasts must be scheduled. Scheduling is not practical or desirable in many high concurrency environments and applications.
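The packetization and reassembly duties that fall to an application using UDP can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the eight-byte header layout (packet index and total packet count) and the function names packetize and reassemble are hypothetical, and no sockets, flow control or retransmission are shown.

```python
import struct

# Hypothetical header: packet index and total packet count, each an
# unsigned 32-bit integer in network byte order.
HEADER = struct.Struct("!II")


def packetize(item: bytes, payload_size: int) -> list:
    """Split a download item into numbered packets of bounded payload."""
    total = max(1, -(-len(item) // payload_size))  # ceiling division
    return [
        HEADER.pack(i, total) + item[i * payload_size:(i + 1) * payload_size]
        for i in range(total)
    ]


def reassemble(packets) -> tuple:
    """Return (item, missing_indices) from packets received in any order.

    item is None while packets are missing; the missing indices are what
    the application would have to request again. With no packets at all,
    the total is unknown and (None, []) is returned.
    """
    chunks = {}
    total = None
    for packet in packets:
        index, total = HEADER.unpack_from(packet)
        chunks[index] = packet[HEADER.size:]
    if total is None:
        return None, []
    missing = [i for i in range(total) if i not in chunks]
    if missing:
        return None, missing
    return b"".join(chunks[i] for i in range(total)), []
```

Because each packet carries its own index, the receiver can accept packets in any order and can name exactly which ones were lost, which is work that TCP would otherwise perform on the application's behalf.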
Despite the reliability promised by TCP/IP, file downloads handled by implementations of FTP and HTTP tend to hang when packets are lost in transmission, particularly when a server is heavily burdened with a large number of downloads. This tendency is due to the nature of TCP's process for handling missing packets. When the connection between a client and the server times out due to network transmission failures, both the client and the server must go through a procedure to restart the transmission at the point where the connection timed out. This restarting process provides greater opportunity for the download application to hang when the server is already heavily burdened with multiple downloads. Furthermore, should the connection be lost in the middle of a download, resuming the download in a new connection requires identification of the last byte actually received. Although some implementations of HTTP have “recover” or “resume” features that enable resumption of an interrupted download, HTTP does not allow for a mirrored server to resume the download if the original server is down or not available.
SUMMARY OF THE INVENTION
The invention has as its object an improvem
