The failure rate of a cluster


  • 1

    There is a cluster of n nodes, each node has a failure rate Pi, e.g. the failure rate of node 1 is P1, the failure rate of node 2 is P2. If there are more than half of nodes fail, the cluster fails. The question is given an array double[] P, what is the failure rate of this cluster.

    I gave a brute-force solution in the interview, which is to first write a helper function to get the subsets of P, where each subset is of size K, and use a for loop to call that helper method from (P.length + 1) / 2 to P.length. When it came to optimization, i mentioned DP, and the interviewer didn't say yes or no, just asked me to go ahead, but i didn't even come up with the transition function.

    I don't know if there's a dp solution, but any idea of a better solution?


  • 0

    @Cloud_cal How do we define failure rate of a node, of a cluster? Could you explain more about this sentence: If there are more than half of nodes fail, the cluster fails. I don't see a relation between it and the other statement for an array with failure rates of cluster's nodes
    Thank you


  • 0

    @elmirap an array double[] P is given as the input, P[i] stands for the failure rate of node i + 1.


  • 0

    @Cloud_cal Thank you very much for your fast respond but this still doesn't answer to the other questions

    1. How do we define failure rate of a cluster?
    2. Why do we need to know this information "If there are more than half of nodes fail, the cluster fails" to calculate cluster's failure rate?

    I will be grateful if you share with us little more. Thanks


  • 0

    @elmirap

    1. How do we define failure rate of a node, of a cluster
      ===> failure rate of each node is given. For example, failure rate for node 1 is P[0], failure rate for node 2 is P[1], etc. Such array P is given as input.
    2. Why do we need to know this information "If there are more than half of nodes fail, the cluster fails" to calculate cluster's failure rate?
      ===> This is how failure of a cluster defined. If more than half nodes of a cluster fail, the cluster fails.

  • 0

    @Cloud_cal How do we know that half of nodes fail?
    Can I assume that failure rate of node is number of failures per some interval?


  • 0
    F

    I believe this may help:

    Poisson Distribution


  • 5
    D

    Possible dp:

    p -> probability array of size N

    dp[i][j] -> Probability that exactly j nodes failed out of the first i nodes.
    Hence,

    dp[i][j] = p[i - 1] * dp[i - 1][j - 1] + (1 - p[i]) * dp[i - 1][j];
    Base cases:
    dp[1][0] = 1-p[0]
    dp[1][1] = p[0]
    dp[i][0] = dp[i - 1][0]*(1-p[0])
    
    double ans = 0;
    for (int i = n / 2 + 1; i < n; i++) {
        ans += dp[n][i];
    }
    

  • 0

    Better provide an example. Is this one correct?

    Three nodes:
    P1 = 1/3
    P2 = 1/4
    P3 = 1/6
    The cluster is failing when two or more nodes are failing.

    So, thinking in terms of probabilities (what's the probability that the cluster is failing at any given moment) with for example P(12) being the rate of nodes 1 and 2 both failing:

    P(cluster)
    = P(12) + P(13) + P(23) - 2*P(123)
    = 1/12 + 1/18 + 1/24 - 2/72
    = 11/72
    I subtracted P(123) twice because the case of all three failing has been triple-counted in P(12) + P(13) + P(23)

    Or thinking in terms of time:
    Node 1 fails 1/3 of the time, let's say the first 8 hours each day.
    Node 2 fails 1/4 of the time, let's say the first 15 minutes of each hour.
    Node 3 fails 1/6 of the time, let's say the first 10 seconds of each minute.
    Mixing them like this provides a bias-free combination.
    Now, how much time each day are two or more nodes failing at the same time? Let's do it with a little coding, which makes it simple and clear. Also, it's different from the above way of subtracting double-counted stuff, so it's a neat "independent" verification.

    >>> sum((h < 8 and m < 15) or (h < 8 and s < 10) or (m < 15 and s < 10)
            for h in range(24) for m in range(60) for s in range(60)) / float(24*60*60)
    0.1527777777777778
    

    That's 11/72 and thus agrees with my result from probability-thinking.

    Does Heisenberg agree as well?


  • 0

    @elmirap No, we don't need to take time into account. See StefanPochmann's first example.


  • 0

    @StefanPochmann Heisenberg agrees with the first example. We don't need to worry about time.


  • 0

    @Cloud_cal Well we don't need to, but we can view it that way. Might be easier for some. And it's equivalent, right?


  • 0

    @StefanPochmann Yea. I'll give an example next time when i post an interview question. The reason i didn't do that was because i didn't think in terms of time and thought it's straightforward enough to understand.


  • 0

    @diptesh Can you make your code work with my example? I just tried and it failed. Using dp[n][i] leads to accessing p[n], which doesn't exist.


  • 0
    D

    @StefanPochmann My bad, ,the final loop should just use dp[n -1][i]. It was a typo.


  • 0
    J

    Would this be considered an easy, medium or hard question on leetcode?

    @Cloud_cal Were you invited onsite based upon your response?
    When you say "write a helper function to get the subset of P of size K," do you mean all subsets of size K, not the subset of size K?


  • 0

    @JamesBondPuppy
    No, it's an onsite problem.
    I didn't make it clear, "write a helper function to get the subsets of P, where each subset is of size K".


  • 0
    J

    @Cloud_cal said in The failure rate of a cluster:

    write a helper function to get the subsets of P

    Agreed. Did you get the job?


  • 0
    This post is deleted!

  • 0

    @JamesBondPuppy
    Nope, like i said, i didn't come up with the transition function, probably got a "no hire" or "strong no hire" from this round, LOL


Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.