| surfnet/19970930: Memory |
|---|
The hot memory buffer has two major functions. First, it caches incoming (new) documents. Second, it caches documents swapped in from disk as a result of a disk hit. It is not obvious which documents, if any, should be cached in memory. To better understand how the Squid memory buffer works, we track the memory hits that correspond to documents that were never swapped in:
When a document is retrieved from its source, it is placed into the memory buffer. Later, the document is swapped out to free space for incoming requests. However, before the document content is freed from the memory buffer, it may be requested by other clients. Such requests result in the no-swap-in memory hits that we are interested in. These hits are interesting because they show how effective the memory buffer is at caching new (previously uncached) documents.
The graph demonstrates that no-swap-in hits constitute less than 5% of all memory hits. Thus, it may not be wise to keep a large number of new documents in the memory buffer. New documents could be swapped out as soon as possible to free space for old documents swapped in from disk, because the odds of a memory hit on a new document versus an old one are about 5:95.
Note, however, that the odds of hitting a document in memory versus on disk are, again, not very good (about 5:60).
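To put the two ratios side by side, here is a small arithmetic sketch. It takes the 5:95 and 5:60 odds quoted above at face value (the text gives them as rough upper bounds, so the results are upper bounds too); the helper name `odds_to_share` is ours and purely illustrative.

```python
from fractions import Fraction


def odds_to_share(first, second):
    """Convert odds first:second into the share of the first outcome."""
    return Fraction(first, first + second)


# Share of memory hits that go to new (never swapped-in) documents, from 5:95.
new_share_of_memory_hits = odds_to_share(5, 95)

# Share of memory-or-disk hits that are served from memory, from 5:60.
memory_share_of_hits = odds_to_share(5, 60)

# Share of memory-or-disk hits served from memory for a new document.
new_in_memory_share = new_share_of_memory_hits * memory_share_of_hits

if __name__ == "__main__":
    print(f"new docs among memory hits:       {float(new_share_of_memory_hits):.1%}")
    print(f"memory among memory+disk hits:    {float(memory_share_of_hits):.1%}")
    print(f"new docs in memory, of all those: {float(new_in_memory_share):.2%}")
```

Under these assumptions, only a fraction of a percent of memory-or-disk hits are memory hits on new documents.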
There are hits and hits. We consider four hit categories or classes:
TCP_IMS_HIT action in Squid. Note: the cache may have more recent data than the client, so Squid may send the full body with the new content. It is still a hit from Squid's point of view.
To illustrate the relative importance of each class, we plot the percentage of all hits that each class represents.
The graph shows that about 55-65% of hits come from disk and less than 5% come from memory. This means that the memory buffer for "hot" objects is currently not very effective.
IMS hits account for about 25% of all hits. This is about 10% higher than for ruu.nl, a leaf proxy.
The contribution of "negative" hits is less than 5%.
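A per-class breakdown like this could be computed from a Squid access log roughly as follows. This is a minimal sketch, not the script used for this report. It assumes the native access.log layout, where the fourth whitespace-separated field is `TAG/status`. Only the TCP_IMS_HIT correspondence is stated in the text; the mapping of TCP_HIT to disk hits, TCP_MEM_HIT to memory hits, and TCP_NEGATIVE_HIT to negative hits is our assumption, and the sample log lines are made up.

```python
from collections import Counter

# Assumed mapping from Squid log tags to the four hit classes.
# Only TCP_IMS_HIT is confirmed by the text; the rest is a guess.
HIT_CLASSES = {
    "TCP_HIT": "disk",
    "TCP_MEM_HIT": "memory",
    "TCP_IMS_HIT": "ims",
    "TCP_NEGATIVE_HIT": "negative",
}


def hit_class_shares(lines):
    """Return {class: percentage of all hits} for the given log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        tag = fields[3].split("/", 1)[0]  # "TCP_HIT/200" -> "TCP_HIT"
        hit_class = HIT_CLASSES.get(tag)
        if hit_class is not None:  # misses and other tags are ignored
            counts[hit_class] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: 100.0 * n / total for cls, n in counts.items()}


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    "875606400.101 45 10.0.0.1 TCP_HIT/200 1024 GET http://example.com/a - NONE/- text/html",
    "875606401.202 30 10.0.0.2 TCP_HIT/200 2048 GET http://example.com/b - NONE/- text/html",
    "875606402.303 12 10.0.0.3 TCP_IMS_HIT/304 180 GET http://example.com/a - NONE/- text/html",
    "875606403.404 900 10.0.0.1 TCP_MISS/200 4096 GET http://example.com/c - DIRECT/example.com text/html",
    "875606404.505 28 10.0.0.4 TCP_HIT/200 1024 GET http://example.com/a - NONE/- text/html",
]

if __name__ == "__main__":
    for cls, pct in sorted(hit_class_shares(SAMPLE_LOG).items()):
        print(f"{cls}: {pct:.1f}% of hits")
```

The percentages are relative to all hits, not all requests, which matches how the graph is described.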
Another interesting observation is the increase in the number of disk hits during the day. It seems the cache was "warming up" by caching fresh documents that would give hits later. At this point, we do not have any data to support this guess.