vein/19971030: All

Transfer Sizes

The graph shows the distribution of network transfer sizes for hits and misses and disk transfer sizes for swap requests based on 24 hour data. To show the real traffic rather than file size distribution, we count every access to a document. Thus, a document that had 3 accesses would be counted once as a miss (if the document was not cached before) and twice as a hit (if the document was still in the cache).

Misses are larger than hits (they include more large files) for two reasons. First, large documents are not popular: a big unpopular file is always counted at least once as a miss but rarely as a hit. Second, every access to a document is counted, so a small popular file is counted many times as a hit and only once as a miss. Swap-in requests are larger (have more large files) than hits because a significant number of hits are 304 hits, which are very small and are never retrieved from the disk. Are swap-out requests larger than misses because the number of small uncachable documents is high?


File Sizes

The graph shows the distribution of file sizes for hits, misses, and swap requests based on 24-hour data. To show the file size distribution rather than traffic, we count only the first access to a document. This graph is helpful in estimating the number of cached objects for a given cache size, as well as per-file memory requirements.
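
The difference between the two counting rules (every access for transfer sizes, only the first access for file sizes) can be sketched as follows. This is an illustration only, not the code used to produce the graphs; the function name, the input layout, and the 1 KB bucket width are assumptions.

```python
from collections import Counter

def size_histograms(accesses, bucket_bytes=1024):
    """Build two size histograms from (url, size_bytes) pairs in arrival order.

    Returns (traffic, files): both map a bucket index (size // bucket_bytes)
    to a count. The input layout and bucket width are illustrative assumptions.
    """
    traffic = Counter()   # every access is counted
    files = Counter()     # only the first access to each document is counted
    seen = set()
    for url, size in accesses:
        bucket = size // bucket_bytes
        traffic[bucket] += 1
        if url not in seen:
            seen.add(url)
            files[bucket] += 1
    return traffic, files
```

A small document accessed three times contributes three entries to the traffic histogram but only one to the file size histogram.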


Proxy Traffic Intensity

The graph shows the change in proxy server load with time. Load is measured as number of requests per second processed by the proxy. Hits and misses are determined using Squid's action field in access.log.
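
A rough sketch of how such a load curve could be derived from access.log is given below. It assumes the native log layout, in which the first whitespace-separated field is a UNIX timestamp and the fourth is an action/status pair such as TCP_HIT/200. Treating every action that contains "HIT" as a hit is a simplification, and the function name and the 60-second averaging window are hypothetical.

```python
from collections import defaultdict

def load_per_second(lines, window_s=60):
    """Average requests per second for hits and misses in each time window.

    Assumes field 0 is a UNIX timestamp and field 3 looks like 'TCP_HIT/200'.
    Any action containing 'HIT' is treated as a hit (a simplification).
    """
    counts = defaultdict(lambda: {"hit": 0, "miss": 0})
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        timestamp = float(fields[0])
        action = fields[3].split("/")[0]
        kind = "hit" if "HIT" in action else "miss"
        window_start = int(timestamp // window_s) * window_s
        counts[window_start][kind] += 1
    return {
        start: {kind: n / window_s for kind, n in sorted(c.items())}
        for start, c in sorted(counts.items())
    }
```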


Proxy Response Time

Proxy response time is the total time it takes to serve a client request. The graph shows median response time in milliseconds during the day.

Hits do have a smaller median response time. This helps to decrease the response time over all requests (the "all" curve goes down).


Response Time vs. Size

The graph shows proxy response time versus document size.

Interestingly, response time for hits does not increase with the file size for small files. This phenomenon can be observed on all types of proxies from leaf to top level caches. This may be attributed to the TCP buffer size which is usually at least 16 KB. Misses are retrieved from another server. This may make their response time more size-dependent.


Concurrent Requests

The graph shows the number of concurrent requests present in the proxy server. We count the number of requests in the system at 10 msec intervals and calculate the median over 20-minute groups. The small 10 msec interval ensures that we count the number of concurrent requests rather than the total number of requests per [large] interval. Note that this graph is not a "requests per second" graph.
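
The sampling scheme described above can be sketched as follows. This is a minimal illustration, not the measurement patch itself. It assumes each request is given as integer (start_ms, end_ms) times measured from the start of the trace, and it treats a request as present at a sample point t when start_ms <= t < end_ms. The function and parameter names are hypothetical.

```python
from statistics import median

def median_concurrency(requests, tick_ms=10, group_ms=20 * 60 * 1000):
    """Median number of requests in the system per group of samples.

    requests: list of (start_ms, end_ms) integer pairs (assumed layout).
    Samples are taken every tick_ms; medians are taken over group_ms windows.
    """
    if not requests:
        return {}
    # Number of sample points needed to cover the latest request end.
    n = -(-max(end for _, end in requests) // tick_ms) + 1
    delta = [0] * (n + 1)
    for start, end in requests:
        delta[-(-start // tick_ms)] += 1   # first sample at or after start
        delta[-(-end // tick_ms)] -= 1     # first sample at or after end
    groups = {}
    active = 0
    for k in range(n):
        active += delta[k]                 # requests present at sample k
        groups.setdefault(k * tick_ms // group_ms, []).append(active)
    return {g: median(samples) for g, samples in groups.items()}
```

A request shorter than one tick that falls between two sample points is never observed, which is why the sampling interval has to be small compared to typical request durations.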

This graph is helpful in estimating the per-request resources needed to support the studied traffic.


Hit Rates

The graph shows the variation of the Document Hit Ratio and Byte Hit Ratio during the day.
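
For reference, the sketch below uses the commonly used definitions of the two ratios: document hit ratio is the fraction of requests served as hits, and byte hit ratio is the fraction of transferred bytes served as hits. The report does not spell out its exact formulas, so treat these definitions, the function name, and the input layout as assumptions.

```python
def hit_ratios(records):
    """Return (document_hit_ratio, byte_hit_ratio) for (is_hit, size_bytes) pairs.

    Uses the usual definitions: hits / requests and hit bytes / total bytes.
    """
    requests = hits = total_bytes = hit_bytes = 0
    for is_hit, size in records:
        requests += 1
        total_bytes += size
        if is_hit:
            hits += 1
            hit_bytes += size
    document_ratio = hits / requests if requests else 0.0
    byte_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return document_ratio, byte_ratio
```

With two 1000-byte hits and one 8000-byte miss, the document hit ratio is about 67% while the byte hit ratio is only 20%, which is the usual pattern when misses are larger than hits.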


Request Response Time Components (Misses)

This graph analyzes the relative impact of request processing stages. Such analysis is essential for performance optimization since it helps to identify performance bottlenecks.

For misses, we distinguish four major stages: client connect, proxy connect, server reply, and proxy reply. The graph shows relative contribution of each stage towards total delay. Total delay is calculated as a sum of delays of all stages.
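
The relative contribution computation can be illustrated as below. The stage names come from the text, while the function name and the delay values in the usage note are hypothetical.

```python
def stage_shares(stage_delays_ms):
    """Percentage of the total delay contributed by each stage.

    stage_delays_ms maps a stage name to its delay in milliseconds.
    The total delay is the sum of the delays of all stages.
    """
    total = sum(stage_delays_ms.values())
    if total == 0:
        return {name: 0.0 for name in stage_delays_ms}
    return {name: 100.0 * delay / total for name, delay in stage_delays_ms.items()}
```

For example, made-up delays of 10, 40, 100, and 50 msec for client connect, proxy connect, server reply, and proxy reply give shares of 5%, 20%, 50%, and 25%.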

Note that median total delay may differ from median response time because of pipelining and such non-accounted activities as DNS lookups. We are working on a more precise model that accounts for these side effects.

We cannot account for pipelining effects that may change relative contributions for large requests. However, usually more than 80% of all requests cannot be pipelined due to small document sizes. We believe that our estimates are very close to actual performance.

See also "Request Response Time Components (200 Hits)".


Request Response Time Components (200 Hits)

See "Request Response Time Components (Misses)" experiment for the graph description and important caveats.

For hits, we distinguish three major stages: client connect, swap-in, and proxy reply. The graph shows relative contribution of each stage towards total delay. Total delay is calculated as a sum of delays of all stages.


Disk Traffic Intensity

The graph shows the speed of processing swap-in and swap-out requests during the day (in requests per second).


Concurrent Disk Requests

The graph shows the number of concurrent swap requests present in the proxy server. We count the number of requests in the system at 2 msec intervals and calculate the median over 20-minute groups. The small 2 msec interval ensures that we count the number of concurrent requests rather than the total number of requests per [large] interval. Note that this graph is not a "disk requests per second" graph.

We plot the 50th and 75th percentiles. The 50th percentile is the same as median.

This graph is useful in determining the increase in the length of disk queues during peak hours (if any).


Disk Utilization

Disk utilization is measured as the percentage of time during which there was at least one active swap request. The measurements are done using 2 msec intervals. Note that only swap requests are taken into consideration. Disk utilization is represented by the "all" curve. Curves for swap-in and swap-out requests are given to compare the contribution of each class towards disk utilization.

The patch does not measure per disk utilization. The graph represents the utilization of the disk I/O subsystem as a whole. In other words, if there is always one disk I/O in the system, then utilization is 100% regardless of the number of physical disks installed.
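
This definition of utilization can be sketched as follows. It is an illustration under stated assumptions, not the patch code: swap requests are given as integer (start_ms, end_ms) pairs with non-negative times, a request is active at sample point t when start_ms <= t < end_ms, and the function name is hypothetical.

```python
def disk_utilization(swap_requests, period_ms, tick_ms=2):
    """Percentage of sample points with at least one active swap request.

    swap_requests: list of (start_ms, end_ms) integer pairs (assumed layout).
    period_ms: length of the observation period in milliseconds.
    """
    n = period_ms // tick_ms              # number of sample points
    if n == 0:
        return 0.0
    delta = [0] * (n + 1)
    for start, end in swap_requests:
        first = min(-(-start // tick_ms), n)   # first sample at or after start
        last = min(-(-end // tick_ms), n)      # first sample at or after end
        delta[first] += 1
        delta[last] -= 1
    busy = active = 0
    for k in range(n):
        active += delta[k]
        if active > 0:                    # one or many requests count the same
            busy += 1
    return 100.0 * busy / n
```

Overlapping requests do not push the result above 100%, which matches the point that the graph describes the disk I/O subsystem as a whole rather than individual disks.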


Disk Response Time

Disk response time is the total time it takes to load/store (swap in/out) a document from/into the disk cache. The graph shows median disk response time in milliseconds during the day.


Disk Response Time Anatomy

The patch allows for quantifying various I/O delays. Let's consider a swap request. Swapping a document is done in several steps. First, the corresponding file should be opened for reading or created for writing. This requires an open(2) system call which may incur significant OS overhead: An open(2) call may result in extra I/Os if the OS has to write/read i-nodes to/from disk. Then the content is swapped to/from disk using blocks of fixed size. Disk cache and various delays in-between these I/Os affect the total response time.

To estimate the OS overhead on swapping a file we plot the median disk response time of a request versus file size. Response times for files smaller than 16 KB were grouped using 1 KB granularity. Larger files used 1 KB granularity to get enough entries per group. The graph is based on the 24 hour data.

Squid attempts to swap files using blocks of fixed length (e.g. 8 KB). For each I/O direction, we plot the total request response time and the time it takes to swap the first block. The "total" and "1st delay" curves for files smaller than 8 KB are the same.

Since various per I/O delays dominate disk transfer time, the size of an I/O is not very important (the number of I/Os is). This explains step-like shape of the "total" curves: If I/O block size is 8 KB, the times to read 5 KB and 8 KB are the same!
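
The step-like shape follows from the number of I/Os growing only at block boundaries. A small sketch, assuming the 8 KB block size used as an example above (the function name is hypothetical):

```python
def io_count(size_bytes, block_bytes=8 * 1024):
    """Number of fixed-size block I/Os needed to swap a file.

    Uses ceiling division; even an empty file is assumed to need one I/O.
    """
    return max(1, -(-size_bytes // block_bytes))
```

Both a 5 KB and an 8 KB file need a single I/O, while a file of 8 KB plus one byte already needs two.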

The first disk delay always includes OS overhead on opening a file. Consecutive I/Os for the same file (if any) do not have this overhead. Thus, for file sizes equal to two I/O blocks (16 KB), the difference between the first delay and second I/O approximates OS overhead on opening a file. The patch does not measure the duration of the second I/O. However, we can compute it, assuming that overhead does not depend on the file size for small files:

    1st_Delay   = Overhead + I/O
    Total( 8KB) = 1st_Delay
    Total(16KB) = 1st_Delay + I/O
 =>
    I/O      = Total(16KB) - Total( 8KB)
    Overhead = Total( 8KB) - I/O
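
The same arithmetic, written out as a small sketch. The function name is hypothetical and the numbers in the usage note are made up for illustration; they are not measurements from this trace.

```python
def split_first_delay(total_8kb_ms, total_16kb_ms):
    """Split the first disk delay into per-I/O time and open overhead.

    Implements I/O = Total(16KB) - Total(8KB) and
    Overhead = Total(8KB) - I/O, assuming the overhead does not
    depend on file size for small files.
    """
    io_ms = total_16kb_ms - total_8kb_ms
    overhead_ms = total_8kb_ms - io_ms
    return io_ms, overhead_ms
```

For example, if the median total were 30 msec for 8 KB files and 40 msec for 16 KB files, a single I/O would take about 10 msec and opening the file about 20 msec.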

Atomic Disk Requests

It is tempting to improve the overall response time by pipelining disk and network transfers. The graph studies the percentage of atomic swap requests. An atomic request is served using a single disk I/O. Thus, there is no opportunity for a cache server to pipeline the processing of an atomic request.
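
A sketch of how the percentage of atomic requests could be computed from request sizes. It assumes that a request is atomic when the file fits into one I/O block and that the block size is 8 KB; both the function name and the default block size are assumptions.

```python
def atomic_percentage(sizes_bytes, block_bytes=8 * 1024):
    """Percentage of swap requests that fit into a single disk I/O block."""
    if not sizes_bytes:
        return 0.0
    atomic = sum(1 for size in sizes_bytes if size <= block_bytes)
    return 100.0 * atomic / len(sizes_bytes)
```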


Memory Hit Analysis

The hot memory buffer has two major functions. First, it caches incoming (new) documents. Second, it caches documents swapped in from disk as a result of a disk hit. It is not obvious which documents should be cached in memory, if any. To further understand how the Squid memory buffer works, we track down the memory hits that correspond to documents that were never swapped in:

When a document is retrieved from its source, it is placed into the memory buffer. Later, the document is swapped out to free space for incoming requests. However, before the document content is freed from the memory buffer, it may be requested by other clients. Such requests result in the "no swap-in" memory hits that we are interested in. These hits are interesting because they show how effective the memory buffer is in caching new (previously uncached) documents.


Hit Classification

There are hits and hits. We consider four hit categories or classes:

For all hit classes, the original server could be contacted to check the freshness of a cached object. There are also a few cases not covered by our classification.

To illustrate relative importance of each class, we plot the percentage of all hits a class represents.


Concurrent Outgoing Connections

The graph shows the number of concurrent outgoing connections present in the proxy server.

The majority of the connections are due to misses. A hit may require an outgoing connection to verify the freshness of an object.


Proxy Connect Time

Proxy connect time is the time it takes to send an HTTP request to a server. A proxy may request a document from the original server or another proxy. In case of a local hit for an "old" object, a proxy may send an If-Modified-Since request. Note that IP lookup activity is not included in proxy connect time. For this graph, we ignore requests that did not contact other servers.

Why does it take longer for [future] hits to connect?


Distribution of Proxy Connect Time (Peak)

The graph plots the distribution of proxy connect time during peak load.


Server Reply Time

This experiment measures the time it takes to receive a reply from the original server or another proxy after a request has been sent.

Hit replies are much "faster" than misses. This may be attributed to the small size of a hit reply: a reply can be recorded as a hit only if it is a 304 (Not Modified) reply. 304 replies contain only a small header and no document content. Also, the remote server does not have to read the document from disk to send a 304 reply. Thus, the server may reply "faster".

See also server reply time versus file size experiment.


Server Reply Time vs. File Size

Server reply time depends on the size of the reply. Here we plot this dependency. Variations in reply time for large files are due to the small number of such files, which makes the median less stable.

Note that the reply time of a hit does not depend on document size. This is because the only possible reply for a hit is a 304 reply, and all 304 replies have approximately the same [small] size regardless of the size of the corresponding document.


Server Response Time

Server response time is the total time it takes to send a request and receive a reply from a primary server. In other words, it is the sum of connect and reply times.


Client Connect Time

Client connect time is the delay from the accept() system call until a parseable HTTP request is received.

It may seem that client connect time cannot depend on the result of the request (hit or miss) because the result is not known at that time. However, this is not the case: Connect time for hits is often longer than for misses.

Longer connect time for hits can be explained by the source of a [future] hit request. There are two sources for hits: clients and neighbors (siblings). There is only one major source for misses, that is clients, because, by ICP protocol, neighbors do not ask for an object if it will result in a miss! (Unless siblings are also parents, which is rare.) Note that clients are often closer to a cache server than neighbors. Thus, hit requests originated from neighbors may have longer connect time. Again, this does not happen to misses since they are not originated from neighbors.

This effect is especially noticeable on leaf proxies, but it can be visible on an intermediate proxy as well.


Proxy Reply Time

Proxy reply time is the time it takes to send a reply to a client after the document was retrieved from the cache or another server. Note that due to pipelining, a reply process may start prior to receiving the last byte from the server.

See also proxy reply time versus file size experiment.


Proxy Reply Time Anatomy

To understand what affects proxy reply time, it is important to distinguish several subclasses of replies. On this graph we isolate 200 hits from 304 hits. 200 hits require transmission of the entire document, as in the case of a miss. 304 hits require transmission of a minimal amount of information and are usually done in one system call.

Increase in proxy reply time may be affected by three factors:

The behavior of the "304 hits" line is important. Neither outbound connections nor proxy performance can affect 304 replies. If reply time for 304 hits goes up, then inbound connections are congested.

Note that 200 hits and misses suffer from congestion more than 304 hits because of a larger size that may require several network I/Os. They may also depend on performance of the proxy, and misses may depend on outbound connections.

See also proxy reply time versus file size experiment.


Proxy Reply Time Versus File Size

Proxy reply time depends on the amount of information to be sent. This graph depicts the dependency based on 24 hour data.


Proxy Reply Time Versus File Size (Detailed)

Most documents served by proxies are smaller than 8 KB. Here we show reply time for small files. We plot 200 and 304 hits separately to compare reply time of a 200 hit with a miss in a fair setup.

Consider 1 KB files. If misses have a smaller reply time than 200 hits, then the destinations of hits are farther away than the destinations of misses. Note that the load on a destination machine should not affect reply time because the network connection is already established and we are transmitting very small files.

Slow hits are typical for leaf servers that cooperate with other proxies. This is consistent with the slow hits observed during the client connection phase.
