| ruu/19970922: All |
|---|
The graph shows the distribution of network transfer sizes for hits and misses and disk transfer sizes for swap requests based on 24 hour data. To show the real traffic rather than file size distribution, we count every access to a document. Thus, a document that had 3 accesses would be counted once as a miss (if the document was not cached before) and twice as a hit (if the document was still in the cache).
Misses are larger than hits (have more large files) because large documents are not popular (big unpopular files are always counted at least once as misses but rarely counted as hits) and every access to a document is counted (small popular files are counted many times as hits and only once as misses). Swap-in requests are larger (have more large files) than hits because of a significant number of 304 hits that are very small but are never retrieved from the disk. Are swap-out requests larger than misses because the number of small uncachable documents is high?
The graph shows the distribution of file sizes for hits, misses, and swap requests based on 24 hour data. To show file size distribution rather than traffic, we count only the first access to a document. This graph is helpful in estimating the number of cached objects given the cache size as well as per file memory requirements.
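The two counting rules (every access for traffic, first access only for file sizes) can be contrasted with a small sketch. The trace, the function names, and the simplification that every document is cachable and never evicted are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical access trace: (url, size_in_bytes), in arrival order.
accesses = [
    ("/a.html", 2_000),
    ("/big.zip", 900_000),
    ("/a.html", 2_000),
    ("/a.html", 2_000),
]

def traffic_counts(trace):
    """Count every access: the first access is a miss, repeats are hits.

    Assumes all documents are cachable and stay in the cache.
    """
    seen = set()
    hits, misses = Counter(), Counter()
    for url, size in trace:
        if url in seen:
            hits[size] += 1
        else:
            misses[size] += 1
            seen.add(url)
    return hits, misses

def file_counts(trace):
    """Count only the first access to each document."""
    first = {}
    for url, size in trace:
        first.setdefault(url, size)
    return Counter(first.values())

hits, misses = traffic_counts(accesses)
files = file_counts(accesses)
```

Here the small popular file contributes two hits and one miss to the traffic distribution but only one entry to the file size distribution, while the large unpopular file appears once in each.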
The graph shows the change in proxy server load with time. Load is measured
as number of requests per second processed by the proxy. Hits and
misses are determined using Squid's action field in
access.log.
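A minimal sketch of how such a load curve could be derived, assuming Squid's native access.log layout in which the first field is a timestamp and the fourth field is action/status. The sample lines, the load_per_second helper, and the rule that any action containing "HIT" counts as a hit are illustrative assumptions, not the tooling used for these graphs.

```python
from collections import defaultdict

# Made-up lines in Squid's native access.log layout:
# time elapsed client action/code size method URL ident hierarchy type
SAMPLE_LOG = """\
874900000.120 85 10.0.0.1 TCP_HIT/200 1530 GET http://example.com/a - NONE/- text/html
874900000.640 910 10.0.0.2 TCP_MISS/200 20480 GET http://example.com/b - DIRECT/example.com image/gif
874900001.050 40 10.0.0.1 TCP_IMS_HIT/304 210 GET http://example.com/a - NONE/- -
"""

def load_per_second(lines):
    """Return {second: {"hit": n, "miss": n}} from access.log lines."""
    load = defaultdict(lambda: {"hit": 0, "miss": 0})
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        second = int(float(fields[0]))
        action = fields[3].split("/")[0]
        # Assumption: any action tag containing "HIT" is treated as a hit.
        kind = "hit" if "HIT" in action else "miss"
        load[second][kind] += 1
    return dict(load)

load = load_per_second(SAMPLE_LOG.splitlines())
```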
The Hit Ratio does not depend on the load much and fluctuates at about 50%.
Proxy response time is the total time it takes to serve a client request. The graph shows median response time in milliseconds during the day.
Misses are 7-10 times slower than hits. This characterizes both outgoing network connections and original servers.
Response time for misses depends on the load while hits show little variation in response time. Misses must be resolved from another proxy or an original server. Thus, an increase in response time for misses may be caused by original servers and/or outgoing network that are slowed down by the traffic during peak hours. The proxy server itself and incoming network connections are not responsible for these large delays since response time for hits remains stable.
The position of the "all" line may be surprising. Indeed, with a 50% Hit Ratio and misses 7-10 times slower than hits, one would expect the "average" response time to be much higher. This does not happen because we use the median (50th percentile) to represent an "average" response time. The traditional mean does not work well because it is highly susceptible to huge isolated peaks in the response time of requests.
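The difference between the two "averages" can be seen on a toy sample. The response times below are invented for illustration; only the qualitative behavior matters.

```python
from statistics import mean, median

# Hypothetical response times in msec: fast hits, slower misses,
# and a single huge isolated peak among the misses.
hit_times = [100] * 52
miss_times = [800] * 47 + [120_000]
all_times = hit_times + miss_times

median_ms = median(all_times)  # 100: stays with the fast half of requests
mean_ms = mean(all_times)      # 1628: dragged up by the one outlier
```

With slightly more than half of the requests being fast hits, the median stays at the hit level, while a single 120 second request is enough to push the mean far above the typical response time.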
Median of 1000 msec would mean that 50% of requests were served faster than 1000 msec. The latter is reasonable according to the response time distribution graph: about 75% of hits and 30% of misses are served within 1000 msec interval. This gives about 52% of all requests.
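The estimate above is a weighted sum of the two distributions. A short check, using the approximate 50% hit ratio quoted in the text (the variable names are ours):

```python
hit_ratio = 0.50         # approximate document hit ratio
hits_within_1s = 0.75    # fraction of hits served within 1000 msec
misses_within_1s = 0.30  # fraction of misses served within 1000 msec

# Fraction of all requests served within 1000 msec.
all_within_1s = (hit_ratio * hits_within_1s
                 + (1 - hit_ratio) * misses_within_1s)
```

The result is 0.525, which matches the "about 52%" figure and is consistent with a median close to 1000 msec.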
The distribution was built for requests served during the peak load hour. Note the log scale of the x axis.
The graph shows proxy response time versus document size.
Interestingly, response time for hits does not increase with the file size for small files. This phenomenon can be observed on all types of proxies from leaf to top level caches. This may be attributed to the TCP buffer size which is usually at least 16 KB. Misses are retrieved from another server. This may make their response time more size-dependent.
The graph shows the number of concurrent requests present in the proxy server. We count the number of requests in the system using 10 msec intervals and calculate the median based on 20 minute grouping. Small 10 msec intervals ensure that we count the number of concurrent requests rather than the total number of requests per [large] interval. Note that this graph is not a "request per second" graph.
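One way to read this procedure is as point sampling: every 10 msec we count the requests that are in flight, then take the median of the samples in each 20 minute group. The sketch below follows that reading; the function name, the (begin, finish) representation of a request, and the sample data are assumptions.

```python
from statistics import median

SAMPLE_MS = 10             # sampling step from the text
GROUP_MS = 20 * 60 * 1000  # 20 minute grouping from the text

def concurrency_medians(requests, start_ms, end_ms):
    """requests: list of (begin_ms, finish_ms) pairs.

    Returns the median number of in-flight requests for each
    20 minute group between start_ms and end_ms.
    """
    medians = []
    for group_start in range(start_ms, end_ms, GROUP_MS):
        group_end = min(group_start + GROUP_MS, end_ms)
        samples = [
            sum(1 for begin, finish in requests if begin <= t < finish)
            for t in range(group_start, group_end, SAMPLE_MS)
        ]
        medians.append(median(samples))
    return medians

# Three overlapping hypothetical requests inside a 100 msec window.
example = concurrency_medians([(0, 50), (20, 80), (40, 60)], 0, 100)
```

This brute-force version rescans all requests at every sample, which is fine for a sketch but too slow for a full day of log data.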
This graph is helpful in estimation of per request resources needed to support studied traffic.
With Hit Ratio about 50% the total number of hits is comparable to the total number of misses. However, the number of concurrent hits is at least 5 times lower than number of concurrent misses. This is because a hit is processed faster than a miss. Consequently, the total amount of resources allocated for hits at any moment is substantially lower than the amount of resources for misses.
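This relation between arrival rate, processing time, and concurrency is what Little's law describes: the average number of requests in the system equals the arrival rate times the average time in the system. The sketch below uses invented rates and times, and Little's law is stated for means, while the graphs here use medians, so it only illustrates the direction of the effect.

```python
def avg_concurrency(rate_per_sec, avg_time_sec):
    """Little's law: average number in system = rate * time in system."""
    return rate_per_sec * avg_time_sec

# Hypothetical: equal hit and miss rates, misses 8 times slower.
concurrent_hits = avg_concurrency(10, 0.1)
concurrent_misses = avg_concurrency(10, 0.8)
ratio = concurrent_misses / concurrent_hits
```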
The graph shows the variation of the Document Hit Ratio and Byte Hit Ratio during the day.
Hit Ratio (HR) does not depend on proxy load much. For this server, average HR was 48%. Thus, the number of hits is comparable to the number of misses.
This graph analyzes relative impact of request processing stages. Such analysis is essential for performance optimizations since it helps in identifying performance bottlenecks.
For misses, we distinguish four major stages: client connect, proxy connect, server reply, and proxy reply. The graph shows relative contribution of each stage towards total delay. Total delay is calculated as a sum of delays of all stages.
Note that median total delay may differ from median response time because of pipelining and such non-accounted activities as DNS lookups. We are working on a more precise model that accounts for these side effects.
We cannot account for pipelining effects that may change relative contributions for large requests. However, usually more than 80% of all requests cannot be pipelined due to small document sizes. We believe that our estimations are very close to actual performance.
See also "Request Response Time Components (200 Hits)".
See "Request Response Time Components (Misses)" experiment for the graph description and important caveats.
For hits, we distinguish three major stages: client connect, swap-in, and proxy reply. The graph shows relative contribution of each stage towards total delay. Total delay is calculated as a sum of delays of all stages.
The graph shows the speed of processing swap-in and swap-out requests during the day (in requests per second).
The number of reads is higher than the number of writes despite the fact that the hit ratio is less than 50%. This is because many non-cachable misses are never written to disk. Thus, even with a 50% hit ratio the majority of disk requests are reads.
The curves follow the number of requests processed by the proxy.
The graph shows the number of concurrent swap requests present in the proxy server. We count the number of requests in the system using 2 msec intervals and calculate the median based on 20 minute grouping. Small 2 msec intervals ensure that we count the number of concurrent requests rather than the total number of requests per [large] interval. Note that this graph is not a "disk request per second" graph.
We plot the 50th and 75th percentiles. The 50th percentile is the same as median.
This graph is useful in determining the increase in the length of disk queues during peak hours (if any).
For this server, in 50% of cases, a swap request will not compete with others on disk. In 75% of cases, there will be at most one request on disk so no competition again. Disk subsystem is clearly under-utilized.
Disk utilization is measured in the percent of time there was at least one active swap request. The measurements are done using 2 msec intervals. Note that only swap requests are taken into consideration. Disk utilization is represented by the "all" curve. Curves for swap-in and swap-out requests are given to compare the contribution of each class towards disk utilization.
The patch does not measure per disk utilization. The graph represents the utilization of the disk I/O subsystem as a whole. In other words, if there is always one disk I/O in the system, then utilization is 100% regardless of the number of physical disks installed.
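A minimal sketch of this definition of utilization, assuming it is computed by sampling every 2 msec and checking whether any swap request is active. The function name, the (begin, finish) request representation, and the data are hypothetical.

```python
def disk_utilization(swap_requests, start_ms, end_ms, step_ms=2):
    """Percent of sampled instants with at least one active swap request.

    swap_requests: list of (begin_ms, finish_ms) pairs for the whole
    disk I/O subsystem, regardless of which physical disk served them.
    """
    ticks = range(start_ms, end_ms, step_ms)
    busy = sum(
        1 for t in ticks
        if any(begin <= t < finish for begin, finish in swap_requests)
    )
    return 100.0 * busy / len(ticks)

# Two overlapping requests and one isolated request in a 100 msec window.
utilization = disk_utilization([(0, 10), (4, 12), (30, 40)], 0, 100)
```

Overlapping requests do not add up: the two requests covering 0 to 12 msec count as busy time only once, which is why the measure cannot exceed 100% no matter how many disks are installed.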
Note that there is not enough load to fully utilize the disk I/O subsystem on this server. With two physical disks, the utilization is always less than 45%.
Note that this graph may be affected by a performance bug in Squid described elsewhere. Without the bug, the utilization will be even lower though.
Disk response time is the total time it takes to load/store (swap in/out) a document from/into the disk cache. The graph shows median disk response time in milliseconds during the day.
Median swap-in response time is noticeably higher than swap-out time. There are several factors affecting disk response time. See other disk related experiments for their quantification.
The patch allows for quantifying various I/O delays. Let's consider a swap
request. Swapping a document is done in several steps. First, the
corresponding file should be opened for reading or created for
writing. This requires an open(2) system call which may incur
significant OS overhead: An open(2) call may result in
extra I/Os if the OS has to write/read i-nodes to/from disk. Then the content is
swapped to/from disk using blocks of fixed size. Disk cache and various delays
in-between these I/Os affect the total response time.
To estimate the OS overhead on swapping a file we plot the median disk response time of a request versus file size. Response times for files smaller than 16 KB were grouped using 1 KB granularity. Larger files used 1 KB granularity to get enough entries per group. The graph is based on the 24 hour data.
Squid attempts to swap files using blocks of fixed length (e.g. 8 KB). For each I/O direction, we plot the total request response time and the time it takes to swap the first block. The "total" and "1st delay" curves for files smaller than 8 KB are the same.
Since various per I/O delays dominate disk transfer time, the size of an I/O is not very important (the number of I/Os is). This explains step-like shape of the "total" curves: If I/O block size is 8 KB, the times to read 5 KB and 8 KB are the same!
The first disk delay always includes OS overhead on opening a file. Consecutive I/Os for the same file (if any) do not have this overhead. Thus, for file sizes equal to two I/O blocks (16 KB), the difference between the first delay and second I/O approximates OS overhead on opening a file. The patch does not measure the duration of the second I/O. However, we can compute it, assuming that overhead does not depend on the file size for small files:
1st_Delay = Overhead + I/O
Total( 8KB) = 1st_Delay
Total(16KB) = 1st_Delay + I/O
=>
I/O = Total(16KB) - Total( 8KB)
Overhead = Total( 8KB) - I/O
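The same derivation as a small function. The input totals below are not measured values: they are back-computed from the swap-out medians for 24 hours in the summary table (I/O 12 msec, Overhead 58 msec) purely to show the arithmetic, and the function name is ours.

```python
def split_first_delay(total_8kb_ms, total_16kb_ms):
    """Split the first delay into per-I/O time and open() overhead.

    Assumes Total(8KB) = Overhead + I/O and Total(16KB) = Overhead + 2 * I/O,
    with an overhead that does not depend on file size for small files.
    """
    io_ms = total_16kb_ms - total_8kb_ms
    overhead_ms = total_8kb_ms - io_ms
    return io_ms, overhead_ms

# Illustrative inputs: 70 = 58 + 12 and 82 = 58 + 2 * 12.
io_ms, overhead_ms = split_first_delay(70, 82)
```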
Note that the first "step" on each "total" curve corresponds to the size of the Squid I/O block for this server (8 KB). "Steps" for requests larger than 3 blocks (24 KB) are not distinct: There are not enough files of that size to get a "stable" median; also disk and network delays in-between disk I/Os may spoil the picture.
Clearly, the duration of the first delay should be the same for any file size. However, on our graph swap-in requests do not follow this rule (first delay is smaller for files bigger than 8 KB) due to a performance bug in Squid described elsewhere. We silently adjust for this bug in calculations for swap-in requests. Swap-out requests need no adjustment.
We summarized our calculations for 24 hours and peak load in a table (all
times are medians in milliseconds).
| | 24 Hours: I/O (msec) | 24 Hours: Overhead (msec) | Peak Load: I/O (msec) | Peak Load: Overhead (msec) |
|---|---|---|---|---|
| Swap In | 38 | 16 | 44 | 28 |
| Swap Out | 12 | 58 | 16 | 58 |
Huge OS overhead for swap-out requests could be caused by several I/Os needed to create a new file (reading and writing i-nodes). Swap-ins do not create a new file and have much smaller overhead.
Fast swap-out I/Os may be explained by the presence of a low-level disk cache (buffer): Reads have to go all the way to the disk surface while writes can be buffered in the disk cache and written at a later time (the writing process is released after data is buffered).
Interestingly, fast 12 msec swap-out I/Os make large swap-outs faster than large swap-ins despite the large OS overhead for writes. Compared to swap-ins, swap-out requests pay a big one-time "subscription fee" but small "monthly payments". This saves them "money" in the long run.
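The crossover can be checked with the 24 hour medians from the table, assuming the simple model Total = Overhead + blocks * I/O continues to hold for files of a few blocks (the text only assumes it for small files, and the function name is ours).

```python
def total_ms(blocks, io_ms, overhead_ms):
    """Estimated total swap time: one-time overhead plus per-block I/O."""
    return overhead_ms + blocks * io_ms

# 24 hour medians from the table: swap in (I/O 38, Overhead 16),
# swap out (I/O 12, Overhead 58).
swap_in = [total_ms(n, 38, 16) for n in (1, 2, 3)]   # [54, 92, 130]
swap_out = [total_ms(n, 12, 58) for n in (1, 2, 3)]  # [70, 82, 94]
```

Under this model a single-block swap-out is slower than a single-block swap-in, but from two blocks onward the cheaper per-block I/O outweighs the larger overhead.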
The increase in I/O duration with load may be caused by longer queuing
delays of individual disk requests. Note that these are per I/O delays
and our computations account for them in the I/O cost.
Currently, we do not have enough data to quantify these delays. However, we
can show how I/O duration changes with load.
The graph is based on second I/O duration for 2 page files (8 KB, 16 KB]. The second I/O duration is calculated as a difference between total response time and the first delay (another method to approximate I/O duration which gives close results). Again, we adjust for the bug with swap-in requests.
It is tempting to improve the overall response time by pipelining disk and network transfers. The graph studies the percentage of atomic swap requests. An atomic request is served using a single disk I/O. Thus, there is no opportunity for a cache server to pipeline the processing of an atomic request.
The graph shows that about 85% of swap-in requests are atomic. Thus, savings in disk response time will be directly reflected in the overall response time of a request.
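A sketch of how the percentage of atomic requests could be computed from object sizes, assuming one disk I/O per 8 KB block and ignoring any per-file metadata Squid may store. The sizes and the function name are invented for illustration.

```python
BLOCK_BYTES = 8 * 1024  # assumed Squid I/O block size for this server

def atomic_percent(sizes_bytes, block_bytes=BLOCK_BYTES):
    """Percent of requests that fit into a single disk I/O."""
    atomic = sum(1 for size in sizes_bytes if size <= block_bytes)
    return 100.0 * atomic / len(sizes_bytes)

# Hypothetical swap-in sizes: three fit in one block, two do not.
percent = atomic_percent([1_000, 5_000, 8_192, 9_000, 20_000])
```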
Interestingly, the percent of atomic reads is higher than of atomic writes. This may be attributed to the skew in documents popularity towards smaller objects. Squid stores (swaps out) small and large objects but smaller ones are requested back (swapped in) more often. The smaller the object the more likely it can be read in one (atomic) I/O.
Hot memory buffer has two major functions. First, it caches incoming (new) documents. Second it caches documents swapped in from disk as a result of a disk hit. It is not obvious what documents should be cached in memory if any. To further understand how Squid memory buffer works, we track down the memory hits that correspond to documents that were never swapped in:
When a document is retrieved from its source, it is placed into memory buffer. Later, a document is swapped out to free space for incoming requests. However, before the document content is freed from memory buffer, it may be requested by other clients. Such requests result in no swap-in memory hits that we are interested in. These hits are interesting because they show how effective memory buffer is in caching new (previously uncached) documents.
The graph demonstrates that, among all memory hits, no swap-in hits constitute less than 5%. Thus, it may not be wise to keep a large number of new documents in the memory buffer. New documents could be swapped out as soon as possible to free space for old documents swapped in from the disk because the odds of a memory hit on a new document versus an old one are 5:95.
Note, however, that the odds of hitting a document in memory versus from disk are, again, not very good (about 5:75).
There are hits and hits. We consider four hit categories or classes:
TCP_IMS_HIT action in Squid. Note: The cache may have
more recent data than the client, so Squid may send the full body with the
new content. It is still a hit from Squid's point of view.
To illustrate relative importance of each class, we plot the percentage of all hits a class represents.
The graph shows that about 75% of hits come from disk and about 5% come from memory. This means that memory buffer for "hot" objects is currently not very effective.
The contribution of "negative" hits is less than 5%.
The graph shows the number of concurrent outgoing connections present in the proxy server.
The majority of the connections are due to misses. A hit may require an outgoing connection to verify the freshness of an object.
Proxy connect time is the time it takes to send an HTTP request to a server. A proxy may request a document from the original server or another proxy. In case of a local hit for an "old" object, a proxy may send an If-Modified-Since request. Note that IP lookup activity is not included in proxy connect time. For this graph, we ignore requests that did not contact other servers.
Why does it take longer for [future] hits to connect?
The graph plots the distribution of proxy connect time during peak load.
Note that there are far fewer "fast" hits than misses. It seems that it takes noticeably longer for a hit to connect to another server. Note that the only reason for a [future] hit to contact another server is to verify the freshness of an object (an If-Modified-Since request). This "slow connect for hits" pattern is less visible on an intermediate proxy server. Does it mean that the majority of IMS requests on a leaf proxy are sent to remote servers, while misses are evenly spread?
This experiment measures the time it takes to receive a reply from the original server or another proxy after a request has been sent.
Hit replies are much "faster" than misses. This may be attributed to the small size of a hit reply: a reply can be recorded as a hit only if it is a 304 (Not Modified) reply. 304 replies contain only a small header and no document content. Also, the remote server does not have to read the document from disk to send a 304 reply. Thus, the server may reply "faster".
See also server reply time versus file size experiment.
Server reply time depends on the size of the reply. Here we plot this dependency. Variations in reply time for large files are due to insufficient number of such files that leads to less stable median.
Note that reply time of a hit does not depend on a document size. This is because the only possible reply for a hit is a 304 reply, and all 304 replies have approximately the same [small] size regardless of the size of the corresponding document.
Server response time is the total time it takes to send a request and receive a reply from a primary server. In other words, it is the sum of connect and reply times.
Client connect time is the delay from the accept() system call
until a parse-able HTTP request is received.
It may seem that client connect time cannot depend on the result of the request (hit or miss) because the result is not known at that time. However, this is not the case: Connect time for hits is often longer than for misses.
Longer connect time for hits can be explained by the source of a [future] hit request. There are two sources for hits: clients and neighbors (siblings). There is only one major source for misses, that is clients, because, by ICP protocol, neighbors do not ask for an object if it will result in a miss! (Unless siblings are also parents, which is rare.) Note that clients are often closer to a cache server than neighbors. Thus, hit requests originated from neighbors may have longer connect time. Again, this does not happen to misses since they are not originated from neighbors.
This effect is especially noticeable on leaf proxies, but it can be visible on an intermediate proxy as well.
Proxy reply time is the time it takes to send a reply to a client after the document was retrieved from the cache or another server. Note that due to pipelining, a reply process may start prior to receiving the last byte from the server.
See also proxy reply time versus file size experiment.
To understand what affects proxy reply time it is important to distinguish several subclasses of replies. On this graph we isolate 200 hits from 304 hits. 200 hits require transmission of the entire document as in the case of a miss. 304 hits require transmission of a minimal amount of information and are usually done in one system call.
Increase in proxy reply time may be affected by three factors:
The behavior of the "304 hits" line is important. Neither outbound connections nor proxy performance can affect 304 replies. If reply time for 304 hits goes up, then inbound connections are congested.
Note that 200 hits and misses suffer from congestion more than 304 hits because of a larger size that may require several network I/Os. They may also depend on performance of the proxy, and misses may depend on outbound connections.
See also proxy reply time versus file size experiment.
Proxy reply time depends on the amount of information to be sent. This graph depicts the dependency based on 24 hour data.
Most documents served by proxies are smaller than 8 KB. Here we show reply time for small files. We plot 200 and 304 hits separately to compare reply time of a 200 hit with a miss in a fair setup.
Consider 1 KB files. If misses have smaller reply time than 200 hits, then destinations of hits are farther than destinations of misses. Note that the load on a destination machine should not affect reply time because the network connection is already established and we are transmitting very small files.
Slow hits are typical for leaf servers that cooperate with other proxies. This is consistent with slow hits during client connection phase.