Boyer-Moore Majority Vote algorithm and my elaboration


  • 199
    O

    For those who aren't familiar with Boyer-Moore Majority Vote algorithm,
    I found a great article (http://goo.gl/64Nams) that helps me to understand this fantastic algorithm!!
    Please check it out!

    The essential concepts is you keep a counter for the majority number X. If you find a number Y that is not X, the current counter should deduce 1. The reason is that if there is 5 X and 4 Y, there would be one (5-4) more X than Y. This could be explained as "4 X being paired out by 4 Y".

    And since the requirement is finding the majority for more than ceiling of [n/3], the answer would be less than or equal to two numbers.
    So we can modify the algorithm to maintain two counters for two majorities.

    Followings are my sample Python code:

    class Solution:
    # @param {integer[]} nums
    # @return {integer[]}
    def majorityElement(self, nums):
        if not nums:
            return []
        count1, count2, candidate1, candidate2 = 0, 0, 0, 1
        for n in nums:
            if n == candidate1:
                count1 += 1
            elif n == candidate2:
                count2 += 1
            elif count1 == 0:
                candidate1, count1 = n, 1
            elif count2 == 0:
                candidate2, count2 = n, 1
            else:
                count1, count2 = count1 - 1, count2 - 1
        return [n for n in (candidate1, candidate2)
                        if nums.count(n) > len(nums) // 3]

  • 4

    Wow, the Python implementation is really concise! In fact, it is safe to initialize candidate1 and candidate2 to the same value since you have avoided repeated counting using the if-elif structure.


  • 0
    F
    This post is deleted!

  • 0
    C

    Is no good if length is just 1 or 2. Should it candidate be iniialized with num[0] and num[1] if length is 2?


  • 3

    Well, it works even length is 1 or 2 since the third and fourth elif statements have handled this.


  • 0
    C

    Yes got it now. Thanks!


  • 1
    I

    excellent explanation, helps a lot!


  • 43
    L

    Great algorithm, but need some more explanation on the confusing word 2 "majorities". They are not necessarily be the 2 most frequent elements after the 1st round. Here is why the poster's 2 "majorities" algorithm really works:
    consider 3 cases:

    1. there are no elements that appears more than n/3 times, then whatever the algorithm 
     got from 1st round wound be rejected in the second round.
    2. there are only one elements that appears more than n/3 times, after 1st round one of 
     the candicate must be that appears more than n/3 times(<2n/3 other elements could only
     pair out for <n/3 times), the other candicate is not necessarily be the second most frequent 
     but it would be rejected in 2nd round.
    3. there are two elments appears more than n/3 times, candicates would contain both of
     them. (<n/3 other elements couldn't pair out any of the majorities.)

  • 0
    C

    @ jianchao.li.fighter, if we initialize like: count1, count2, candidate1, candidate2 = 0, 0, 0, 0, then it seems we can not pass the OJ, why~


  • 0

    Hi, caikehe, are you sure? I modify in your way and the code gets accepted.


  • 0
    C

    It's quite strange, here is what I get from the OJ: Input: [0,0,0]
    Output: [0,0]
    Expected: [0]


  • 0

    Could you please share your complete code?


  • 2
    C
    def majorityElement(self, nums):
        if not nums:
            return []
        count1, count2, candidate1, candidate2 = 0, 0, 0, 0
        for n in nums:
            if n == candidate1:
                count1 += 1
            elif n == candidate2:
                count2 += 1
            elif count1 == 0:
                candidate1, count1 = n, 1
            elif count2 == 0:
                candidate2, count2 = n, 1
            else:
                count1, count2 = count1 - 1, count2 - 1
        return [n for n in (candidate1, candidate2) if nums.count(n) > len(nums) // 3]

  • 4

    Hi, caikehe. I am sorry. The above code requires us to set candidate1 and candidate2 to different values. The problem occurs in the return statement wich cannot avoid duplicates. The code may be simply modified as follows (using set in return) to get accepted.

    class Solution:
        # @param {integer[]} nums
        # @return {integer[]}
        def majorityElement(self, nums):
            if not nums:  
                return []
            count1, count2, candidate1, candidate2 = 0, 0, 0, 0
            for n in nums:
                if n == candidate1:
                    count1 += 1
                elif n == candidate2:
                    count2 += 1
                elif count1 == 0:
                    candidate1, count1 = n, 1
                elif count2 == 0:
                    candidate2, count2 = n, 1
                else:
                    count1, count2 = count1 - 1, count2 - 1
            return [n for n in set([candidate1, candidate2]) if nums.count(n) > len(nums) // 3]

  • 0
    C

    Yes, thanks~


  • 0
    Z

    i don't think it very well for you using "nums.count(n)", i think you should store original count of the tmp major~


  • 0
    T

    Thanks for your explanation ! It makes sense.


  • 1

    Thank you very much for your solution-sharing and explanation. I'm not familiar with Python, so could you tell me is nums.count(n) > len(nums) // 3 means calculate the numbers of two candidates from the array whether they are greater than n/3 ? Thank you in advance.


  • 0
    S

    Yes. Your are totally right.

    // in python is a Integer division. (Both python 3 and python 2)

    / in python is a float division. (python 3)

    Reference:
    http://stackoverflow.com/questions/21316968/division-in-python-2-7-and-3-3


  • 1
    J

    After the first traversal, can whether check count1 and count2 are greater than 0. If not, can immediately exclude the corresponding candidate. Only verify the candidates with count greater than 0 (from the first traversal).


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.