Tag: cache

Linux – Clearing Caches

I encountered some documentation at work that provided a process for clearing caches. It wasn’t wrong per se, but it showed a lack of understanding of what was being performed. I enhanced our documentation to explain what was happening and why the series of commands was redundant. Figured I’d post my revisions here in case they’re useful for someone else.

Only clean caches can be dropped — dirty ones need to be written somewhere before they can be dropped. Before dropping caches, flush the file system buffer using sync — this tells the kernel to write dirty cache pages to disk (or, well, write as many as it can). This will maximize the number of cache pages that can be dropped. You don’t have to run sync, but doing so optimizes your subsequent commands effectiveness.

Page cache is memory that’s held after reading a file. Linux tends to keep the files in cache on the assumption that a file that’s been read once will probably be read again. Clear the pagecache using echo 1 > /proc/sys/vm/drop_caches — this is the safest to use in production and generally a good first try.

If clearing the pagecache has not freed sufficient memory, proceed to this step. The dentries (directory cache) and inodes cache are memory held after reading file attributes (run strace and look at all of those stat() calls!). Clear the dentries and inodes using echo 2 > /proc/sys/vm/drop_caches — this is kind of a last-ditch effort for a production environment. Better than having it all fall over, but things will be a little slow as all of the in-flight processes repopulate the cached data.

You can clear the pagecache, dentries, and inodes using echo 3 > /proc/sys/vm/drop_caches — this is a good shortcut in a non-production environment. But, if you’re already run 1 and 2 … well, 3 = 1+2, so clearing 1, 2, and then 3 is repetitive.

 

Another note from other documentation I’ve encountered — you can use sysctl to clear the caches, but this can cause a deadlock under heavy load … as such, I don’t do this. The syntax is sysctl -w vm.drop_caches=1 where the number corresponds to the 1, 2, and 3 described above.

 

Python Time-Expiring Cache

I needed to post information into a SharePoint Online list. There’s an auth header required to post data, but the authentication expires every couple of minutes. Obviously, I could just get a new auth string each time … but that’s terribly inefficient. I could also use what I had and refresh it when it fails … inelegant but effective. I wanted, instead, to have a value cached for a duration slightly less than the server-side expiry on the auth string. This decorator allows me to use the cached auth header string for some period of time and actually run the function that grabs the auth header string before the string is invalidated.

import functools
import time
from datetime import datetime

def timed_cache(iMaxAge, iMaxSize=128, boolTyped=False):
    #######################################
    # LRU cache decorator with expiry time
    #
    # Args:
    #    iMaxAge: Seconds to live for cached results. 
    #    iMaxSize: Maximum cache size (see functools.lru_cache).
    #    boolTyped: Cache on distinct input types (see functools.lru_cache).
    #######################################
    def _decorator(fn):
        @functools.lru_cache(maxsize=iMaxSize, typed=boolTyped)
        def _new(*args, __time_salt, **kwargs):
            return fn(*args, **kwargs)

        @functools.wraps(fn)
        def _wrapped(*args, **kwargs):
            return _new(*args, **kwargs, __time_salt=int(time.time() / iMaxAge))
        return _wrapped

    return _decorator

# Usage example -- 23 second cache expiry
@timed_cache(23)
def slow_function(iSleepTime: int):
    datetimeStart = datetime.now()
    time.sleep(iSleepTime)
    return f"test started at {datetimeStart} and ended at at {datetime.now()}"


print(f"Start at {datetime.now()}")

for i in range(1, 50):
    print(slow_function(5))
    
    if i % 5 is 0:
        print(f"sleeping 5 at {datetime.now()}")
        time.sleep(5)
        print(f"done sleeping at {datetime.now()}\n\n")

print(f"Ended at at {datetime.now()}")