[입 개발] Redis used more than maxmemory and is throwing OOM errors!!!

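Memory management is always one of the hard parts of operating Redis. Because Redis is an in-memory cache, once it holds more data than physical memory and swapping starts, every access to a swapped-out page has to be swapped back in, which hurts latency badly. Using more memory than is available can also cause an outage. Today my team reported the following error.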

redis.clients.jedis.exceptions.JedisDataException: OOM command not allowed when used memory > 'maxmemory'.

Redis provides two options for this kind of memory management. The first is maxmemory and the second is maxmemory-policy.

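maxmemory tells Redis not to use more memory than this value. Internally, Redis allocates memory through a function called zmalloc, which tracks the size of each allocation, and that running total is what gets compared against maxmemory. However, the total only reflects what Redis requested, so the memory pages actually in use are generally larger, and real physical memory usage can be well above the computed value. When needed, Redis uses this total to decide when to release memory it is holding and regain capacity, and that is called eviction.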
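The accounting idea described above can be sketched as a small allocation wrapper. This is only an illustration under simplifying assumptions: the names tracked_malloc, tracked_free, tracked_used_memory and header_t are hypothetical, and the real zmalloc is more involved. Each allocation stores its requested size in a header, and a counter is adjusted on every allocate and free.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header placed in front of every allocation. The union keeps
 * the user pointer suitably aligned for any type. */
typedef union {
    size_t size;        /* bytes requested by the caller */
    max_align_t align;  /* forces maximum alignment */
} header_t;

/* Running total of bytes handed out, headers included. This mirrors what
 * the process asked for, not the physical pages the OS actually mapped. */
static size_t used_memory = 0;

void *tracked_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(header_t)) return NULL; /* overflow guard */
    header_t *h = malloc(sizeof(header_t) + size);
    if (h == NULL) return NULL;
    h->size = size;
    used_memory += sizeof(header_t) + size;
    return h + 1; /* the caller sees the memory right after the header */
}

void tracked_free(void *ptr) {
    if (ptr == NULL) return;
    header_t *h = (header_t *)ptr - 1; /* step back to the header */
    used_memory -= sizeof(header_t) + h->size;
    free(h);
}

size_t tracked_used_memory(void) {
    return used_memory;
}
```

Comparing tracked_used_memory() with a configured limit is cheap, which is why a check of this kind can run before every command. The gap between this counter and resident memory is the reason physical usage can exceed maxmemory.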

maxmemory-policy defines what Redis does once memory usage exceeds maxmemory. The following policy values are available.

Policy	Description
noeviction	Does not evict; write commands immediately return an error.
volatile-lru	Evicts keys from the expire set (the keys that have an expire set) using LRU. This is an approximation, not exact LRU.
volatile-lfu	Evicts keys from the expire set using an approximation of LFU.
volatile-random	Evicts random keys from the expire set.
volatile-ttl	Evicts keys from the expire set, shortest remaining TTL first.
allkeys-lru	Evicts keys from the whole keyspace using an approximation of LRU.
allkeys-lfu	Evicts keys from the whole keyspace using an approximation of LFU.
allkeys-random	Evicts random keys from the whole keyspace.

Here LRU stands for Least Recently Used and LFU for Least Frequently Used.

Both can be set in the conf file as follows.

maxmemory <bytes>
maxmemory-policy <policy>

So when does Redis try to run eviction? Eviction happens when a user tries to execute a command, and it is triggered if the memory currently in use (used memory) is larger than the maxmemory setting. Let's look at the processCommand function first.

int processCommand(client *c) {
    ......
    int is_denyoom_command = (c->cmd->flags & CMD_DENYOOM) ||
                             (c->cmd->proc == execCommand && 
                             (c->mstate.cmd_flags & CMD_DENYOOM));

    ......
    /* Handle the maxmemory directive.
     *
     * Note that we do not want to reclaim memory if we are here re-entering
     * the event loop since there is a busy Lua script running in timeout
     * condition, to avoid mixing the propagation of scripts with the
     * propagation of DELs due to eviction. */
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = (performEvictions() == EVICT_FAIL);
        /* performEvictions may flush slave output buffers. This may result
         * in a slave, that may be the active client, to be freed. */
        if (server.current_client == NULL) return C_ERR;

        int reject_cmd_on_oom = is_denyoom_command;
        /* If client is in MULTI/EXEC context, queuing may consume an unlimited
         * amount of memory, so we want to stop that.
         * However, we never want to reject DISCARD, or even EXEC (unless it
         * contains denied commands, in which case is_denyoom_command is already
         * set. */
        if (c->flags & CLIENT_MULTI &&
            c->cmd->proc != execCommand &&
            c->cmd->proc != discardCommand &&
            c->cmd->proc != resetCommand) {
            reject_cmd_on_oom = 1;
        }

        if (out_of_memory && reject_cmd_on_oom) {
            rejectCommand(c, shared.oomerr);
            return C_OK;
        }

        /* Save out_of_memory result at script start, otherwise if we check OOM
         * until first write within script, memory used by lua stack and
         * arguments might interfere. */
        if (c->cmd->proc == evalCommand || c->cmd->proc == evalShaCommand) {
            server.lua_oom = out_of_memory;
        }
    }
    ......
}

Looking at the code above, this block runs only when server.maxmemory is set and server.lua_timedout is 0. out_of_memory records whether performEvictions() returned EVICT_FAIL. If it did, and the command is an is_denyoom_command (or is being queued inside MULTI and is not EXEC, DISCARD, or RESET), Redis replies with shared.oomerr.
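The rejection rule in processCommand can be condensed into a small pure function. This is a sketch only: struct cmd_ctx and should_reject_on_oom are hypothetical names, and the booleans stand in for the flag and function-pointer checks that the real code performs on the client and command structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of what processCommand inspects. */
struct cmd_ctx {
    bool is_denyoom_command;        /* command carries CMD_DENYOOM */
    bool in_multi;                  /* client is queuing inside MULTI */
    bool is_exec_discard_or_reset;  /* command is EXEC, DISCARD or RESET */
};

/* Returns true when the command should get the OOM error reply. */
bool should_reject_on_oom(bool out_of_memory, const struct cmd_ctx *c) {
    bool reject_cmd_on_oom = c->is_denyoom_command;

    /* Queuing inside MULTI can consume memory without bound, so every
     * queued command is treated as deny-on-OOM, except the ones that
     * end or abort the transaction. */
    if (c->in_multi && !c->is_exec_discard_or_reset) {
        reject_cmd_on_oom = true;
    }

    /* Both conditions must hold: eviction failed AND the command is one
     * that may grow memory. */
    return out_of_memory && reject_cmd_on_oom;
}
```

Under this rule, a command without CMD_DENYOOM still succeeds when eviction has failed, provided it is not being queued inside a transaction.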

shared.oomerr is defined as follows. The message looks familiar, doesn't it?

    shared.oomerr = createObject(OBJ_STRING,sdsnew(
        "-OOM command not allowed when used memory > 'maxmemory'.\r\n"));

Now let's look at the performEvictions function.

/* Check that memory usage is within the current "maxmemory" limit.  If over
 * "maxmemory", attempt to free memory by evicting data (if it's safe to do so).
 *
 * It's possible for Redis to suddenly be significantly over the "maxmemory"
 * setting.  This can happen if there is a large allocation (like a hash table
 * resize) or even if the "maxmemory" setting is manually adjusted.  Because of
 * this, it's important to evict for a managed period of time - otherwise Redis
 * would become unresponsive while evicting.
 *
 * The goal of this function is to improve the memory situation - not to
 * immediately resolve it.  In the case that some items have been evicted but
 * the "maxmemory" limit has not been achieved, an aeTimeProc will be started
 * which will continue to evict items until memory limits are achieved or
 * nothing more is evictable.
 *
 * This should be called before execution of commands.  If EVICT_FAIL is
 * returned, commands which will result in increased memory usage should be
 * rejected.
 *
 * Returns:
 *   EVICT_OK       - memory is OK or it's not possible to perform evictions now
 *   EVICT_RUNNING  - memory is over the limit, but eviction is still processing
 *   EVICT_FAIL     - memory is over the limit, and there's nothing to evict
 * */
int performEvictions(void) {
    if (!isSafeToPerformEvictions()) return EVICT_OK;

    int keys_freed = 0;
    size_t mem_reported, mem_tofree;
    long long mem_freed; /* May be negative */
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);
    int result = EVICT_FAIL;

    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return EVICT_OK;

    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        return EVICT_FAIL;  /* We need to free memory, but policy forbids. */

    unsigned long eviction_time_limit_us = evictionTimeLimitUs();

    mem_freed = 0;

    latencyStartMonitor(latency);

    monotime evictionTimer;
    elapsedStart(&evictionTimer);

    while (mem_freed < (long long)mem_tofree) {
        int j, k, i;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, we try to evict a key for
             * each DB, so we use the static 'next_db' variable to
             * incrementally visit all DBs. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * It is possible that actually the memory needed to propagate
             * the DEL in AOF and replication link is greater than the one
             * we are freeing removing the key, but we can't account for
             * that otherwise we would never exit the loop.
             *
             * Same for CSC invalidation messages generated by signalModifiedKey.
             *
             * AOF and Output buffer memory will be freed eventually so
             * we only care about memory used by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            signalModifiedKey(NULL,db,keyobj);
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;

            if (keys_freed % 16 == 0) {
                /* When the memory to free starts to be big enough, we may
                 * start spending so much time here that is impossible to
                 * deliver data to the replicas fast enough, so we force the
                 * transmission here inside the loop. */
                if (slaves) flushSlavesOutputBuffers();

                /* Normally our stop condition is the ability to release
                 * a fixed, pre-computed amount of memory. However when we
                 * are deleting objects in another thread, it's better to
                 * check, from time to time, if we already reached our target
                 * memory, since the "mem_freed" amount is computed only
                 * across the dbAsyncDelete() call, while the thread can
                 * release the memory all the time. */
                if (server.lazyfree_lazy_eviction) {
                    if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                        break;
                    }
                }

                /* After some time, exit the loop early - even if memory limit
                 * hasn't been reached.  If we suddenly need to free a lot of
                 * memory, don't want to spend too much time here.  */
                if (elapsedUs(evictionTimer) > eviction_time_limit_us) {
                    // We still need to free memory - start eviction timer proc
                    if (!isEvictionProcRunning) {
                        isEvictionProcRunning = 1;
                        aeCreateTimeEvent(server.el, 0,
                                evictionTimeProc, NULL, NULL);
                    }
                    break;
                }
            }
        } else {
            goto cant_free; /* nothing to free... */
        }
    }
    /* at this point, the memory is OK, or we have reached the time limit */
    result = (isEvictionProcRunning) ? EVICT_RUNNING : EVICT_OK;

cant_free:
    if (result == EVICT_FAIL) {
        /* At this point, we have run out of evictable items.  It's possible
         * that some items are being freed in the lazyfree thread.  Perform a
         * short wait here if such jobs exist, but don't wait long.  */
        if (bioPendingJobsOfType(BIO_LAZY_FREE)) {
            usleep(eviction_time_limit_us);
            if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                result = EVICT_OK;
            }
        }
    }

    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return result;
}

The first thing to look at is the return values. EVICT_OK means either that used_memory is within maxmemory or that eviction cannot be performed right now. EVICT_RUNNING means eviction is still in progress. How single-threaded Redis can behave this way is explained later. Finally, EVICT_FAIL means memory usage is still above maxmemory but there is nothing left to evict. When EVICT_FAIL is returned, Redis can throw the OOM error.

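The code starts with isSafeToPerformEvictions, which checks whether eviction can be performed in the current state. If eviction should be skipped, it returns 0 and performEvictions returns EVICT_OK without evicting anything; if it returns 1, eviction proceeds.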

/* Check if it's safe to perform evictions.
 *   Returns 1 if evictions can be performed
 *   Returns 0 if eviction processing should be skipped
 */
static int isSafeToPerformEvictions(void) {
    /* - There must be no script in timeout condition.
     * - Nor we are loading data right now.  */
    if (server.lua_timedout || server.loading) return 0;

    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return 0;

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (checkClientPauseTimeoutAndReturnIfPaused()) return 0;

    return 1;
}

Second, it calls getMaxmemoryState, which obtains the current memory usage through zmalloc_used_memory().

/* Get the memory status from the point of view of the maxmemory directive:
 * if the memory used is under the maxmemory setting then C_OK is returned.
 * Otherwise, if we are over the memory limit, the function returns
 * C_ERR.
 *
 * The function may return additional info via reference, only if the
 * pointers to the respective arguments is not NULL. Certain fields are
 * populated only when C_ERR is returned:
 *
 *  'total'     total amount of bytes used.
 *              (Populated both for C_ERR and C_OK)
 *
 *  'logical'   the amount of memory used minus the slaves/AOF buffers.
 *              (Populated when C_ERR is returned)
 *
 *  'tofree'    the amount of memory that should be released
 *              in order to return back into the memory limits.
 *              (Populated when C_ERR is returned)
 *
 *  'level'     this usually ranges from 0 to 1, and reports the amount of
 *              memory currently used. May be > 1 if we are over the memory
 *              limit.
 *              (Populated both for C_ERR and C_OK)
 */
int getMaxmemoryState(size_t *total, size_t *logical, size_t *tofree, float *level) {
    size_t mem_reported, mem_used, mem_tofree;

    /* Check if we are over the memory usage limit. If we are not, no need
     * to subtract the slaves output buffers. We can just return ASAP. */
    mem_reported = zmalloc_used_memory();
    if (total) *total = mem_reported;

    /* We may return ASAP if there is no need to compute the level. */
    int return_ok_asap = !server.maxmemory || mem_reported <= server.maxmemory;
    if (return_ok_asap && !level) return C_OK;

    /* Remove the size of slaves output buffers and AOF buffer from the
     * count of used memory. */
    mem_used = mem_reported;
    size_t overhead = freeMemoryGetNotCountedMemory();
    mem_used = (mem_used > overhead) ? mem_used-overhead : 0;

    /* Compute the ratio of memory usage. */
    if (level) {
        if (!server.maxmemory) {
            *level = 0;
        } else {
            *level = (float)mem_used / (float)server.maxmemory;
        }
    }

    if (return_ok_asap) return C_OK;

    /* Check if we are still over the memory limit. */
    if (mem_used <= server.maxmemory) return C_OK;

    /* Compute how much memory we need to free. */
    mem_tofree = mem_used - server.maxmemory;

    if (logical) *logical = mem_used;
    if (tofree) *tofree = mem_tofree;

    return C_ERR;
}
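The arithmetic inside getMaxmemoryState can be reduced to the following sketch, with the 'level' calculation left out. The function name over_limit and its parameters are hypothetical; not_counted stands in for the value returned by freeMemoryGetNotCountedMemory() (replica output buffers and the AOF buffer).

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when usage is over the limit, and in that case stores the
 * number of bytes that must be freed in *tofree (if tofree is not NULL). */
bool over_limit(size_t reported, size_t not_counted, size_t maxmemory,
                size_t *tofree) {
    /* No limit configured, or the raw total already fits: nothing to do. */
    if (maxmemory == 0 || reported <= maxmemory) return false;

    /* Buffers that are not charged against maxmemory are subtracted,
     * clamping at zero so the unsigned value cannot wrap around. */
    size_t used = (reported > not_counted) ? reported - not_counted : 0;

    /* The adjusted figure may fit even though the raw total did not. */
    if (used <= maxmemory) return false;

    if (tofree) *tofree = used - maxmemory;
    return true;
}
```

For example, with maxmemory at 1000 bytes, a reported total of 1300 and 200 bytes of uncounted buffers, the adjusted usage is 1100 and 100 bytes must be freed. With a reported total of 1100 and the same buffers, the adjusted usage is 900 and nothing needs to be evicted.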

If maxmemory_policy is MAXMEMORY_NO_EVICTION, no eviction is done, so the function returns EVICT_FAIL right away. This is the first situation in which you can see the OOM error (when maxmemory-policy is noeviction).

Next, a while loop frees memory. mem_tofree is used - maxmemory, that is, how much memory must be freed right now, and mem_freed is the amount reclaimed so far.

    while (mem_freed < (long long)mem_tofree) {
        ......
    }
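The stop condition of this loop can be illustrated with a reduced model. This is a sketch under assumptions, not the Redis code: evict_until, sketch_result and the sizes array are hypothetical, the time limit and the timer proc are left out, and each array element stands for the bytes released by deleting one victim key.

```c
#include <stddef.h>

enum sketch_result { SKETCH_EVICT_OK, SKETCH_EVICT_FAIL };

/* Frees keys in order until mem_tofree bytes are reclaimed.
 * sizes[i] is the number of bytes released by evicting key i. */
enum sketch_result evict_until(const long long *sizes, size_t nkeys,
                               long long mem_tofree, size_t *keys_freed) {
    long long mem_freed = 0;
    size_t next = 0;

    while (mem_freed < mem_tofree) {
        if (next == nkeys) {
            /* Nothing left to evict while still over the limit:
             * this corresponds to the cant_free path. */
            *keys_freed = next;
            return SKETCH_EVICT_FAIL;
        }
        mem_freed += sizes[next++];
    }

    *keys_freed = next;
    return SKETCH_EVICT_OK;
}
```

With key sizes of 40, 30 and 50 bytes and 60 bytes to free, the loop stops after two keys. Asking for 500 bytes exhausts all three keys and ends in the failure result, which is the case that leads to the OOM error.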

From here the behavior depends on maxmemory-policy. The volatile policies consider only the expire set (keys with a TTL set via expire), while the allkeys policies consider every key, so the policy decides which dictionary of each db is searched. Redis keeps the expire set in an internal member named expires; the full dataset lives in dict.
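The choice between the two dictionaries can be modeled with a flag bit on the policy value, in the spirit of the MAXMEMORY_FLAG_ALLKEYS test in performEvictions. Everything named sk_ or SK_ below is hypothetical, and the numeric values are illustrative rather than copied from the Redis headers.

```c
/* Hypothetical flag bits, combined into the low byte of a policy value. */
enum {
    SK_FLAG_LRU     = 1 << 0,
    SK_FLAG_LFU     = 1 << 1,
    SK_FLAG_ALLKEYS = 1 << 2
};

/* Hypothetical policy values: a unique id in the high bits plus flags. */
enum sk_policy {
    SK_VOLATILE_LRU    = (0 << 8) | SK_FLAG_LRU,
    SK_VOLATILE_LFU    = (1 << 8) | SK_FLAG_LFU,
    SK_VOLATILE_TTL    = (2 << 8),
    SK_VOLATILE_RANDOM = (3 << 8),
    SK_ALLKEYS_LRU     = (4 << 8) | SK_FLAG_LRU | SK_FLAG_ALLKEYS,
    SK_ALLKEYS_LFU     = (5 << 8) | SK_FLAG_LFU | SK_FLAG_ALLKEYS,
    SK_ALLKEYS_RANDOM  = (6 << 8) | SK_FLAG_ALLKEYS,
    SK_NO_EVICTION     = (7 << 8)
};

/* Which set of keys a policy draws eviction candidates from. */
enum sk_keyspace { SK_KS_NONE, SK_KS_EXPIRES, SK_KS_ALL };

enum sk_keyspace sk_candidates(enum sk_policy p) {
    if (p == SK_NO_EVICTION) return SK_KS_NONE;
    /* allkeys policies scan the main dict; volatile ones only expires. */
    return (p & SK_FLAG_ALLKEYS) ? SK_KS_ALL : SK_KS_EXPIRES;
}
```

One consequence worth keeping in mind: under a volatile policy, if no key has a TTL, the candidate set is empty and eviction cannot free anything.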

To make eviction cheaper, Redis first samples some keys to build a pool of eviction candidates and then evicts from that pool.

If maxmemory-policy is one of the LRU/LFU policies or volatile-ttl, it works as follows.

If no bestkey (eviction target) has been found yet, evictionPoolPopulate builds the candidate pool first, and bestkey is then looked up in that pool.
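The sampling idea behind the candidate pool can be sketched as follows. This is a deliberately reduced model: sk_key and sk_pick_victim are hypothetical names, there is no persistent pool, and the sampled indices are passed in by the caller so the behavior is deterministic. The point it illustrates is that only the sampled keys are compared, which is why the result approximates LRU rather than implementing it exactly.

```c
#include <stddef.h>

/* Hypothetical key record: idle is the time since the last access. */
struct sk_key {
    const char *name;
    unsigned long idle;
};

/* Among the sampled indices, returns the index of the key with the
 * largest idle time, or -1 if the sample contains no valid index. */
long sk_pick_victim(const struct sk_key *keys, size_t nkeys,
                    const size_t *sample, size_t nsample) {
    long best = -1;
    for (size_t i = 0; i < nsample; i++) {
        size_t idx = sample[i];
        if (idx >= nkeys) continue; /* ignore out-of-range samples */
        if (best < 0 || keys[idx].idle > keys[(size_t)best].idle) {
            best = (long)idx;
        }
    }
    return best;
}
```

If the globally oldest key is not part of the sample, a younger key is chosen instead. Larger samples make the choice closer to true LRU at the cost of more work per eviction.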

If maxmemory-policy is one of the random policies, Redis simply picks a random key and deletes it.

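In either case, if no bestkey can be found, the code jumps to cant_free and returns EVICT_FAIL. With lazy free the behavior can differ slightly: if lazy-free jobs are pending, Redis sleeps briefly and checks again, returning EVICT_FAIL if used_memory > maxmemory still holds and EVICT_OK otherwise. Since memory is being freed lazily, it waits a moment and accepts the result if usage has dropped.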

That was a brief look at how Redis handles eviction when it runs short of memory. An OOM error means memory is insufficient, so either change maxmemory-policy if you can (if losing data is acceptable, use an allkeys policy instead of a volatile one) or add more memory.
