Snapchat: Parse CPU log file


  • 3
    D

    Three columns inside log file generated by single thread CPU, assuming input is Entry(String jobName, boolean start, int timeStamp), output is HashMap<String, List<Interval>>

    write a parse function to parse log file. Please read these two examples below very carefully. This problem is more complicated than I think !

    job (String) start/end (boolean) timestamp (Integer)
    f1 true 1
    f1 true 2
    f2 true 4
    f2 false 8
    f1 false 16
    f1 false 32
    f3 true 64
    f3 false 128

    output: f1: [1,4] [8,32]
    f2: [4,8]
    f3: [64,128}


    Another example:
    f1 true 0
    f2 true 2
    f1 true 5
    f1 false 7
    f2 false 10
    f3 true 11
    f3 false 12
    f1 false 15
    f4 true 16
    f4 false 19

    Output:
    f1: [0,2], [5, 7], [10, 11], [12 15]
    f2: [2,5], [7, 10]
    f3: [11, 12]
    f4: [16, 19]


  • 0

    Can you explain more about the input. For example what does it mean start/end param?


  • 0
    D
    This post is deleted!

  • 0
    D

    @elmirap jobName means id of different jobs, f1, f2, f2. boolean start means if it is the start or end of a job. timeStamp means the current time of each entry input , i.e each input command


  • 0

    @daniel.w.1 in your example it is possible to start the same job twice before the it has finished.Is it right?In the problem statement above only a single thread is used. So what does it mean it this case? Why does the single thread write twice f1 true 1, f1 true 2?


  • 0
    D

    In the first example, since f1 is called twice before it is finished(f1,true 1 f1,true, 2) , only the first call (f1, true, 1) will be effective, the second call will be ignored. this is one of the corner cases should be handles in our code.


  • 1
    W
    import java.util.*;
    public class ParseCPULogFile {
        
    	Stack<jobInformation> stack = new Stack<jobInformation>();
    	//the key is the name of the job, and the list of the interval is the list of job time
    	Map<String, List<interval>> map = new HashMap<String, List<interval>>();
    	String currentJobName = "";
    	int startTime = 0;
    	public void add(String jobName, boolean start, int timeStamp){
    		if(start){
    			//stack is null push into stack. cut the original job
    			if(stack.isEmpty()){
    			    startTime = timeStamp;
    			    currentJobName = jobName;
    			}
    			else{
    				if(currentJobName.equals(jobName)){
    				}
    				else{
    					int endTimeStamp = stack.peek().timeStamp;
    					addToMap(stack.peek().jobName, startTime, endTimeStamp);
    				    startTime = timeStamp;
    				    currentJobName = jobName;
    				}
    			}
    		    jobInformation job = new jobInformation(jobName, start, timeStamp);
    		    stack.push(job);
    		}
    		else{
    	        jobInformation originalJob= stack.pop();
    	        if(!stack.isEmpty() && stack.peek().jobName.equals(jobName)){
    	        }
    	        else{
    	        	addToMap(jobName, startTime, timeStamp);
    	        	if(!stack.isEmpty()){
    	        		currentJobName = stack.peek().jobName;
    	        		startTime = stack.peek().timeStamp;
    	        	}
    	        }
    		}
    	}
    	//print out the result
    	
    	public void addToMap(String jobName, int start, int end){
    		interval newInterval = new interval(startTime, end);
        	if(!map.containsKey(jobName)){
        	   List<interval> list = new LinkedList<interval>();
        	   map.put(jobName, list);
        	}
        	map.get(jobName).add(newInterval);
    	}
    	
    	public String output(){
    		return "";
    		//for loop the print all the jobs.
    		//print out all the jobs. So I have twenty five minutes to try the tackle with the proramming question, 10 minutes to deal with the fllow up question
    	}
    	
    	//build the class to store the interval
    	public class interval{
    		int start;
    		int end;
    		interval(int start, int end){
    			this.start = start;
    			this.end = end;
    		}
    	}
    	//build the jobInformation class
    	public class jobInformation{
    		String jobName;
    		boolean start;
    		int timeStamp;
    		jobInformation(String jobName, boolean start, int timeStamp){
    			this.jobName = jobName;
    			this.start = start;
    			this.timeStamp = timeStamp;
    		}
    	}
    }
    
    

  • 0
    W

    I still confused in the second example, why f1 get <10,11> and <12,15>? Is that if no one is running, alway run the first file? And is there a way to end a file forever?


  • 2
    V

    Pretty straight forward.
    Here are your cases.

    A process starts or a process ends.
    A process cannot end before it starts.

    If a process starts, push it into the stack.
    If a process ends, peak the stack.
    Compare the function names.

    If the names are the same and it isn't ending, ignore it. (Could happen with recursive calls, but we only care about earliest execution and are using the function name not PID to track.)

    TRICKY PART
    If the element at the top of the stack is the same as the current element and its ending, POP the peak from the stack. Get the new peak. If there is no peak, return (it's empty) else set it's time_stamp to the time_stamp of the current element. Why? When a function finishes it's execution, it RETURNS. Meaning the previous function in the stack (caller) STARTS AGAIN. You need to capture that interval.


  • 0
    V

    I don't quite understand the first example :

    f1 false 16
    f1 false 32

    How come job f1 can be stopped twice without a "f1 true xxx" in between ?


  • 0
    S

    @wanghanwen I think there is a mistake:
    startTime=timeStamp instead of stack.peek().timeStamp;


  • 2
    S

    It seems it is not necessary to compare the peek() value from stack, as long as we merge intervals.
    Every time it comes to a new LogEntry, we add an interval to the current possessed job, and merge the interval if it is concatenated with its previous one.
    Here are my codes:

    public static Map<String, List<Interval>> parseSingleThreadLog(List<LogEntry> list) {
    	// Entry(String jobName, boolean start, int timeStamp)
    	// return intervals of each job that it is using the CPU
    	Map<String, List<Interval>> map = new HashMap<>();
    	Stack<LogEntry> jobStack = new Stack<>();
    	for(int i=0; i<list.size(); i++) {
    		LogEntry en = list.get(i);
    		int timeStamp = en.timeStamp;
    		// add interval
    		if(!jobStack.isEmpty()){
    			addInterval(map, jobStack.peek().jobName, list.get(i-1).timeStamp, en.timeStamp);
    		}
    		// update jobstack
    		if(en.start){
    			jobStack.push(en);
    		}else{
    			jobStack.pop();
    		}
    	}
    	return map;
    }
    
    private static void addInterval(Map<String, List<Interval>> map, String jobName, int start, int end) {
    	List<Interval> interval = map.computeIfAbsent(jobName, x->new ArrayList<>());
    	if(interval.isEmpty()){
    		interval.add(new Interval(start, end));
    	}else{
    		Interval last = interval.get(interval.size()-1);
    		if(last.end>=start){ // merge
    			last.end = end;
    		}else{
    			interval.add(new Interval(start, end));
    		}
    	}
    }
    

  • 0
    R

    This looks mostly like a stack problem where you have to be careful about the insertion and deletion. Here is my solution which works for both the cases provided the input is correct.

    class Solution:
        def SNAP_JobScheduling(self, fileName):
            jobMap = {}
            contextStack = []
            curJob = None
            startTime = -1
            f = open(fileName, 'r')
            for line in f:
                jobName, startBool, timeStr = line.split(' ')
                time = int(timeStr)
                if jobName not in jobMap:
                    jobMap[jobName] = []
                if curJob == jobName:
                    if startBool == 'true':
                        contextStack.append(jobName)
                    else:
                        if contextStack:
                            nextJob = contextStack.pop()
                            if nextJob != curJob:
                                jobMap[curJob].append([startTime, time])
                                curJob = nextJob
                                startTime = time
                        else:
                            jobMap[curJob].append([startTime, time])
                            curJob = None
                            startTime = -1
                else:
                    if curJob:
                        jobMap[curJob].append([startTime, time])
                        contextStack.append(curJob)
                    curJob = jobName
                    startTime = time
            return jobMap
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.