Lookup Table with O(1)


  • 3
    L

    The code is from "Bit Twiddling Hacks" by Sean Eron Anderson.

    uint32_t reverseBits(uint32_t n) {
        int rbitTable[256] = {
            #define R2(n) n, n + 2*64, n + 1 * 64, n + 3 * 64
            #define R4(n) R2(n), R2(n + 2 * 16), R2(n + 1 * 16), R2(n + 3 * 16)
            #define R6(n) R4(n), R4(n + 2 * 4), R4(n + 1 * 4), R4(n + 3 * 4)
            R6(0), R6(2), R6(1), R6(3)
        };
        uint32_t v;
        unsigned char *p = (unsigned char*)&n;
        unsigned char *q = (unsigned char*)&v;
        q[0] = rbitTable[p[3]];
        q[1] = rbitTable[p[2]];
        q[2] = rbitTable[p[1]];
        q[3] = rbitTable[p[0]];
        return v;
    }

  • 0
    L

    The most beautiful and elegant code I have ever seen, though it is a little difficulty to understand.


  • 0
    A

    One should write harder perf tests than Leetcode to see perf issues. Consider the following test:

    Solution s;
        auto r0 = s.reverseBits(43261596);
        if (r0 != 964176192)
            throw - 1;
    
        unsigned sum = 0;
        for (unsigned n = 0; n < 4000000000; ++n)
            sum += s.reverseBits(n);
    
        printf("%d", sum);
    

    For this test the code from this topic runs "forever". No other code I tried fails so badly. It is because the initialization of the lookup table is done per each call. Changing it to:

    **bolded text**static int rbitTable[256] = {...}
    

    makes the test run in 30 sec. Furthermore... I have been observing that to get real perf gain one should make as little writes as possible. Often it means that making significantly more reads or computations still wins. Consider the code below. It is exactly your solution but different "assembly of the result". It runs the test in 8.33 sec.

    class Solution
    {
    public:
        unsigned reverseBits(unsigned n) const
        {
            static unsigned char map[256] =
            {
    ...
            };
    
            auto src = reinterpret_cast<const unsigned char*>(&n);
            return (map[src[0]] << 24) | (map[src[1]] << 16) | (map[src[2]] << 8) | map[src[3]];
            
            /*unsigned v;
            auto dst = reinterpret_cast<unsigned char*>(&v);
            dst[0] = map[src[3]];
            dst[1] = map[src[2]];
            dst[2] = map[src[1]];
            dst[3] = map[src[0]];
            return v;*/
        }
    };
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.