Electrical computers and digital processing systems: memory – Address formation – Hashing
Reexamination Certificate
1998-04-15
2001-09-18
Kim, Matthew (Department: 2186)
Electrical computers and digital processing systems: memory
Address formation
Hashing
C711S118000, C707S793000
Reexamination Certificate
active
06292880
ABSTRACT:
FIELD OF THE INVENTION
The present invention relates to information delivery, and relates more specifically to a cache for information objects that are to be delivered efficiently and at high speed over a network to a client.
BACKGROUND OF THE INVENTION
Several important computer technologies rely, to a great extent, upon rapid delivery of information from a central storage location to remote devices. For example, in the client/server model of computing, one or more servers are used to store information. Client computers or processes are separated from the servers and are connected to the servers using a network. The clients request information from one of the servers by providing a network address of the information. The server locates the information based on the provided network address and transmits it over the network to the client, completing the transaction.
The World Wide Web is a popular application of the client/server computing model.
FIG. 1
is a simplified block diagram of the relationship between elements used in a Web system. One or more web clients
10
a
,
10
b
, each of which is a computer or a software process such as a browser program, are connected to a global information network
20
called the Internet, either directly or through an intermediary such as an Internet Service Provider, or an online information service.
A web server
40
is likewise connected to the Internet
20
by a network link
42
. The web server
40
has one or more internet network addresses and textual host names, associated in an agreed-upon format that is indexed at a central Domain Name Server (DNS). The server contains multimedia information resources, such as documents and images, to be provided to clients upon demand. The server
40
may additionally or alternatively contain software for dynamically generating such resources in response to requests.
The clients
10
a
,
10
b
and server
40
communicate using one or more agreed-upon protocols that specify the format of the information that is communicated. A client
10
a
looks up network address of a particular server using DNS and establishes a connection to the server using a communication protocol called the Hypertext Transfer Protocol (HTTP). A Uniform Resource Locator (URL) uniquely identifies each information object stored on or dynamically generated by the server
40
. A URL is a form of network address that identifies the location of information stored in a network.
A key factor that limits the performance of the World Wide Web is the speed with which the server
40
can supply information to a client via the Internet
20
. Performance is limited by the speed, reliability, and congestion level of the network route through the Internet, by geographical distance delays, and by server load level. Accordingly, client transaction time can be reduced by storing replicas of popular information objects in repositories geographically dispersed from the server. Each local repository for object replicas is generally referred to as a cache. A client may be able to access replicas from a topologically proximate cache faster than possible from the original web server, while at the same time reducing Internet server traffic.
In one arrangement, as shown in
FIG. 1
, the cache is located in a proxy server
30
that is logically interposed between the clients
10
a
,
10
b
and the server
40
. The proxy server provides a “middleman” gateway service, acting as a server to the client, and a client to the server. A proxy server equipped with a cache is called a caching proxy server, or commonly, a “proxy cache”.
The proxy cache
30
intercepts requests for resources that are directed from the clients
10
a
,
10
b
to the server
40
. When the cache in the proxy
30
has a replica of the requested resource that meets certain freshness constraints, the proxy responds to the clients
10
a
,
10
b
and serves the resource directly. In this arrangement, the number and volume of data transfers along the link
42
are greatly reduced. As a result, network resources or objects are provided more rapidly to the clients
10
a
,
10
b.
A key problem in such caching is the efficient storage, location, and retrieval of objects in the cache. This document concerns technology related to the storage, location, and retrieval of multimedia objects within a cache. The object storage facility within a cache is called a “cache object store” or “object store”.
To effectively handle heavy traffic environments, such as the World Wide Web, a cache object store needs to be able to handle tens or hundreds of millions of different objects, while storing, deleting, and fetching the objects simultaneously. Accordingly, cache performance must not degrade significantly with object count. Performance is the driving goal of cache object stores.
Finding an object in the cache is the most common operation and therefore the cache must be extremely fast in carrying out searches. The key factor that limits cache performance is lookup time. It is desirable to have a cache that can determine whether an object is in the cache (a “hit”) or not (a “miss”) as fast as possible. In past approaches, caches capable of storing millions of objects have been stored in traditional file system storage structures. Traditional file systems are poorly suited for multimedia object caches because they are tuned for particular object sizes and require multiple disk head movements to examine file system metadata. Object stores can obtain higher lookup performance by dedicating DRAM memory to the task of object lookup, but because there are tens or hundreds of millions of objects, the memory lookup tables must be very compact.
Once an object is located, it must be transferred to the client efficiently. Modern disk drives offer high performance when reading and writing sequential data, but suffer significant performance delays when incurring disk head movements to other parts of the disk. These disk head movements are called “seeks”. Disk performance is typically constrained by the drive's rated seeks per second. To optimize performance of a cache, it is desirable to minimize disk seeks, by reading and writing contiguous blocks of data.
Eventually, the object store will become full, and particular objects must be expunged to make room for new content. This process is called “garbage collection”. Garbage collection must be efficient enough that it can run continually without providing a significant decrease in system performance, while removing objects that have the least impact on future cache performance.
PAST APPROACHES
In the past, four approaches have been used to structure cache object stores: using the native file system, using a memory-blocked “page” cache, using a database, and using a “cyclone” circular storage structure. Each of these prior approaches has significant disadvantages.
The native file system approach uses the file system of an operating system running on the server to create and manage a cache. File systems are designed for a particular application in mind: storing and retrieving user and system data files. File systems are designed and optimized for file management applications. They are optimized for typical data file sizes and for a relatively small number of files (both total and within one folder/directory). Traditional file systems are not optimized to minimize the number of seeks to open, read/write, and close files. Many file systems incur significant performance penalties to locate and open files when there are large numbers of files present. Typical file systems suffer fragmentation, with small disk blocks scattered around the drive surface, increasing the number of disk seeks required to access data, and wasting storage space. Also, file systems, being designed for user data file management, include facilities irrelevant to cache object stores, and indeed counter-productive to this application. Examples include: support for random access and selective modification, file permissions, support for moving files, support for renaming files, and support for append
Beguelin Adam
Gourley David
Haines Matthew
Mattis Peter
Plevyak John
Anderson Matt
Bingham Marcel K.
Hickman Palermo & Truong & Becker LLP
Inktomi Corporation
Kim Matthew
LandOfFree
Alias-free content-indexed object cache does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Alias-free content-indexed object cache, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Alias-free content-indexed object cache will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-2519828