| ruu/19970922: General |
|---|
The graph shows the distribution of network transfer sizes for hits and misses, and of disk transfer sizes for swap requests, based on 24 hours of data. To show the real traffic rather than the file size distribution, we count every access to a document. Thus, a document that had 3 accesses would be counted once as a miss (if the document was not cached before) and twice as a hit (if the document was still in the cache).
Misses are larger than hits (that is, they include more large files) for two reasons. First, large documents are not popular: a big unpopular file is always counted at least once as a miss but rarely counted as a hit. Second, every access to a document is counted: a small popular file is counted many times as a hit and only once as a miss. Swap-in requests are larger than hits because a significant number of hits are 304 responses, which are very small and are never retrieved from the disk. Are swap-out requests larger than misses because the number of small uncachable documents is high?
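The counting rule for this graph can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code behind the graph: the function name `classify_accesses`, the sample URLs, and the sizes are hypothetical, and the sketch assumes that a document stays in the cache after its first access.

```python
def classify_accesses(accesses):
    """Split (url, size) accesses into miss sizes and hit sizes.

    Assumption: once fetched, a document stays cached, so only its
    first access is a miss and every later access is a hit.
    """
    seen = set()
    miss_sizes, hit_sizes = [], []
    for url, size in accesses:
        if url in seen:
            hit_sizes.append(size)
        else:
            seen.add(url)
            miss_sizes.append(size)
    return miss_sizes, hit_sizes


# Hypothetical traffic: a small popular file and a big unpopular one.
accesses = [("/small.gif", 2_000)] * 3 + [("/big.tar", 5_000_000)]
miss_sizes, hit_sizes = classify_accesses(accesses)
```

In this sample the big file appears only among the misses, while the small popular file is counted once as a miss and twice as a hit.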
The graph shows the distribution of file sizes for hits, misses, and swap requests based on 24 hours of data. To show the file size distribution rather than traffic, we count only the first access to a document. This graph is helpful in estimating the number of cached objects for a given cache size, as well as per-file memory requirements.
The graph shows the change in proxy server load with time. Load is measured as the number of requests per second processed by the proxy. Hits and misses are determined using Squid's action field in access.log.
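A rough sketch of how load could be derived from the log is shown below. It assumes Squid's native access.log layout, in which the first field is a timestamp in seconds and the fourth field is `action/code`. The sample lines are invented, the function name `load_per_second` is hypothetical, and treating every action that contains `HIT` as a hit is an assumption, not necessarily the rule used for the graph.

```python
from collections import Counter

# Invented sample lines in Squid's native access.log layout:
# time elapsed client action/code size method URL ident hierarchy type
SAMPLE_LOG = """\
874900000.120 95 10.0.0.1 TCP_HIT/200 1843 GET http://example.com/a.gif - NONE/- image/gif
874900000.480 912 10.0.0.2 TCP_MISS/200 20480 GET http://example.com/b.html - DIRECT/example.com text/html
874900001.050 12 10.0.0.1 TCP_IMS_HIT/304 210 GET http://example.com/a.gif - NONE/- image/gif
"""


def load_per_second(log_text):
    """Return {second: Counter(hit=..., miss=...)} for the given log text."""
    load = {}
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        second = int(float(fields[0]))
        action = fields[3].split("/")[0]
        # Assumption: any action containing "HIT" counts as a hit.
        kind = "hit" if "HIT" in action else "miss"
        load.setdefault(second, Counter())[kind] += 1
    return load


load = load_per_second(SAMPLE_LOG)
```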
The Hit Ratio depends little on the load and fluctuates around 50%.
Proxy response time is the total time it takes to serve a client request. The graph shows median response time in milliseconds during the day.
Misses are 7-10 times slower than hits. This gap reflects the speed of both the outgoing network connections and the original servers.
Response time for misses depends on the load, while hits show little variation in response time. Misses must be resolved from another proxy or an original server. Thus, an increase in response time for misses may be caused by original servers and/or the outgoing network being slowed down by the traffic during peak hours. The proxy server itself and the incoming network connections are not responsible for these large delays, since response time for hits remains stable.
The position of the "all" line may be surprising. Indeed, with a 50% Hit Ratio and misses 7-10 times slower than hits, one would expect the "average" response time to be much higher. This does not happen because we use the median (50th percentile) to represent the "average" response time. The traditional mean does not work well because it is highly susceptible to huge isolated peaks in the response time of individual requests.
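The difference between the two kinds of "average" can be illustrated with a small sketch. The response times below are hypothetical values chosen for illustration, not measurements from this proxy.

```python
from statistics import mean, median

# Hypothetical response times in msec: five hits, four ordinary misses,
# and one miss that stalled for a full minute.
hits = [90, 100, 110, 120, 130]
misses = [800, 900, 1000, 1100, 60_000]
all_times = hits + misses

mean_all = mean(all_times)      # dominated by the single 60-second request
median_all = median(all_times)  # stays between typical hits and typical misses
```

With these values the mean is 6435 msec, which describes none of the requests well, while the median is 465 msec.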
A median of 1000 msec would mean that 50% of requests were served faster than 1000 msec. This is reasonable according to the response time distribution graph: about 75% of hits and 30% of misses are served within 1000 msec. With hits and misses each making up about half of the requests, this gives about 52% of all requests.
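The 52% figure is a weighted average of the two shares, which the following sketch reproduces. The function name `fraction_within` is hypothetical; the inputs are the approximate values quoted in the text.

```python
def fraction_within(hit_ratio, hits_within, misses_within):
    """Share of all requests served within the limit, weighted by hit ratio."""
    return hit_ratio * hits_within + (1 - hit_ratio) * misses_within


# About 75% of hits and 30% of misses finish within 1000 msec.
share_at_50 = fraction_within(0.50, 0.75, 0.30)  # with a 50% hit ratio
share_at_48 = fraction_within(0.48, 0.75, 0.30)  # with the 48% daily average
```

Both hit ratio values give a share between 51% and 53%, consistent with a median close to 1000 msec.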
The distribution was built for requests served during the peak load hour. Note the log scale of the x-axis.
The graph shows proxy response time versus document size.
Interestingly, response time for hits does not increase with the file size for small files. This phenomenon can be observed on all types of proxies from leaf to top level caches. This may be attributed to the TCP buffer size which is usually at least 16 KB. Misses are retrieved from another server. This may make their response time more size-dependent.
The graph shows the number of concurrent requests present in the proxy server. We count the number of requests in the system using 10 msec intervals and calculate the median based on 20 minute grouping. Small 10 msec intervals ensure that we count the number of concurrent requests rather than the total number of requests per [large] interval. Note that this graph is not a "requests per second" graph.
This graph helps estimate the per-request resources needed to support the studied traffic.
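One way to implement this kind of counting is sketched below. It is an illustration under stated assumptions, not the code used for the graph: each request is given as a hypothetical `(start_ms, end_ms)` pair, a request counts as present at a tick if it started at or before the tick and has not yet finished, and the function names are invented.

```python
from statistics import median


def concurrent_counts(requests, span_ms, tick_ms=10):
    """Count requests present at each tick instant 0, tick_ms, 2*tick_ms, ..."""
    n_ticks = span_ms // tick_ms
    delta = [0] * (n_ticks + 1)
    for start_ms, end_ms in requests:
        # First tick at or after the start, and first tick at or after the end.
        first = min(n_ticks, max(0, -(-start_ms // tick_ms)))
        stop = min(n_ticks, max(0, -(-end_ms // tick_ms)))
        delta[first] += 1
        delta[stop] -= 1
    counts, active = [], 0
    for tick in range(n_ticks):
        active += delta[tick]
        counts.append(active)
    return counts


def grouped_medians(counts, ticks_per_group):
    """Median concurrency for each consecutive group of ticks."""
    return [median(counts[i:i + ticks_per_group])
            for i in range(0, len(counts), ticks_per_group)]


# Tiny hypothetical example: three overlapping requests within 60 msec.
requests = [(0, 35), (10, 20), (15, 60)]
counts = concurrent_counts(requests, span_ms=60)
medians = grouped_medians(counts, ticks_per_group=3)
```

For the grouping described in the text, `ticks_per_group` would be 120000 (20 minutes of 10 msec ticks); the value 3 is used here only to keep the example small.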
With a Hit Ratio of about 50%, the total number of hits is comparable to the total number of misses. However, the number of concurrent hits is at least 5 times lower than the number of concurrent misses. This is because a hit is processed faster than a miss. Consequently, the total amount of resources allocated for hits at any moment is substantially lower than the amount of resources for misses.
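The relation between speed and concurrency can be made concrete with Little's law, which states that the average number of requests in a system equals the arrival rate multiplied by the average time a request spends there. The sketch below is an illustration only: the arrival rate and the mean service times are hypothetical, and the graphs report medians rather than means, so the ratio is indicative rather than exact.

```python
def mean_concurrency(arrivals_per_sec, mean_time_sec):
    """Little's law: average number in the system = arrival rate * mean time."""
    return arrivals_per_sec * mean_time_sec


# Hypothetical: equal arrival rates (about 50% hit ratio), misses 8 times slower.
concurrent_hits = mean_concurrency(10, 0.1)
concurrent_misses = mean_concurrency(10, 0.8)
ratio = concurrent_misses / concurrent_hits
```

With equal arrival rates, the ratio of concurrent misses to concurrent hits equals the ratio of their mean service times.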
The graph shows the variation of the Document Hit Ratio and Byte Hit Ratio during the day.
Hit Ratio (HR) does not depend on proxy load much. For this server, average HR was 48%. Thus, the number of hits is comparable to the number of misses.
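The two ratios can differ considerably for the same traffic, as the sketch below shows. The request list is hypothetical and the function name `hit_ratios` is invented for illustration; the definitions used are the share of requests served as hits and the share of bytes served as hits.

```python
def hit_ratios(requests):
    """requests: list of (is_hit, size_bytes).

    Returns (document hit ratio, byte hit ratio).
    """
    total = len(requests)
    total_bytes = sum(size for _, size in requests)
    hits = sum(1 for is_hit, _ in requests if is_hit)
    hit_bytes = sum(size for is_hit, size in requests if is_hit)
    return hits / total, hit_bytes / total_bytes


# Hypothetical traffic: two small hits and two larger misses.
requests = [(True, 2_000), (True, 3_000), (False, 10_000), (False, 85_000)]
dhr, bhr = hit_ratios(requests)
```

Here half of the requests are hits, yet they carry only 5% of the bytes, because the misses in this sample are much larger than the hits.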
This graph analyzes the relative impact of request processing stages. Such analysis is essential for performance optimization since it helps identify performance bottlenecks.
For misses, we distinguish four major stages: client connect, proxy connect, server reply, and proxy reply. The graph shows the relative contribution of each stage to the total delay. The total delay is calculated as the sum of the delays of all stages.
Note that median total delay may differ from median response time because of pipelining and such non-accounted activities as DNS lookups. We are working on a more precise model that accounts for these side effects.
We cannot account for pipelining effects that may change relative contributions for large requests. However, usually more than 80% of all requests cannot be pipelined due to small document sizes. We believe that our estimates are very close to actual performance.
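The relative contribution of a stage is its delay divided by the sum of all stage delays. A minimal sketch, with hypothetical delay values and an invented function name, could look like this:

```python
def relative_contributions(stage_delays_ms):
    """Return each stage's share of the total delay (sum of all stages)."""
    total = sum(stage_delays_ms.values())
    return {stage: delay / total for stage, delay in stage_delays_ms.items()}


# Hypothetical per-stage delays for a miss, in msec.
miss_stages = {
    "client connect": 50,
    "proxy connect": 150,
    "server reply": 600,
    "proxy reply": 200,
}
shares = relative_contributions(miss_stages)
```

The same function applies to hits by passing the three hit stages (client connect, swap-in, proxy reply) instead.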
See also "Request Response Time Components (200 Hits)".
See the "Request Response Time Components (Misses)" experiment for the graph description and important caveats.
For hits, we distinguish three major stages: client connect, swap-in, and proxy reply. The graph shows the relative contribution of each stage to the total delay. The total delay is calculated as the sum of the delays of all stages.