Find the first word in a stream in which it is not repeated in the rest of the stream


  • 2
    T

    Coding exercise:

    Find the first word in a stream in which it is not repeated in the rest of the stream. Please note that you are being provided a stream as a source for the characters. The stream is guaranteed to eventually terminate (i.e. return false from a call to the hasNext() method), though it could be very long. You will access this stream through the provided interface methods. A call to hasNext() will return whether the stream contains any more characters to process. A call to getNext() will return the next character to be processed in the stream. It is not possible to restart the stream.

    Example:

    Input: The angry dog was red. And the cat was also angry.

    Output: dog

    In this example, the word ‘dog’ is the first word in the stream in which it is not repeated in the stream.

    Use one of the following skeletons for your solution:

    Java:

    package questions;
    
    public interface Stream {
        char getNext();
        boolean hasNext();
    }
    
    public class QuestionA {
        public static String firstWord(final Stream input) {
            return "";
        }
    }
    

    C++

    class Stream {
    public:
    char getNext();
    bool hasNext();
    
    };
    
    class QuestionA {
    	std::string firstWord(Stream &input) {
    		return "";
    	}
    };

  • 0
    D

    Nice question. Assumption two words are separated by space. Space complexity O(n). Lemme know how can be improved further.

    Pseudo Code:

    public class QuestionA {
    
      Set<String> nonduplicates= new HashSet<String>();
      Set<String> duplicates= new HashSet<String>();
       StringBuilder sb = new StringBuilder();
    
    public static String firstWord(final Stream input) {
    
           if(input.hasNext() ==true)
           {
               char c= input.getNext();
    
                
    
               if(c==(ASCII Value of blank space)) 
                { 
                              if(nonduplicates.contains(sb.toString()){
                                       nonduplicates.remove(sb.toString()); 
                                       duplicates.add(sb.toString());
                                   }
                                else
                                 if(!duplicates.contains(sb.toString())       
                                    nonduplicates.add(sb.toString());
                                
                        sb=new StringBuilder();
                      
                   }
          sb.append(c);
                   
    
    
          }
    
              return nonduplicates.get(0); // this your first non repeating word.
    }
    

  • 1
    U

    as per the example in the problem description...

    • case in-sensititve comparison is desired, so solution should lowercase each char.

    • There is period after the word red. The delimiter set should handle punctation chars as well.


  • 3
    M

    First at all I want to thank you for your solution.
    Besides the details said by u98101 there are few mistakes:

    1. "nonduplicates" must be initialized as LinkedHashSet because using HashSet makes no guarantees as to the iteration order
    2. Your solution doesn't take in account the case where the Stream doesn't have not repeated words
    3. The method "get" in the last line doesn't exist for java.util.Set interface you need to use nonduplicates.iterator().next() instead (validating that nonduplicates is not empty of course)
    Thanks and greetings!!

  • 0
    N

    your code goes here

    import operator
    import string
    inStr = "The angry dog was red, And the cat was also angry."
    inStr = inStr.strip() ### IMPORTANT
    inStr = inStr.translate(string.maketrans("",""), string.punctuation)
    hashWord = {}
    for i,word in enumerate(inStr.split(" ")):
    word = word.lower()
    if word in hashWord.keys():
    hashWord[word][1] = hashWord[word][1] + 1
    else:
    hashWord[word]= [i,1]
    singleWord= {k: v for k, v in hashWord.iteritems() if v[1] < 2}
    sorted_hashWord = sorted(singleWord.items(), key=lambda (k, v): v)
    print sorted_hashWord.pop(0)[0]


  • 0

    My Python solution using a Ordered Dictionary

    from collections import OrderedDict
    def non_repeated(s):
    	dict =OrderedDict()
    	s = s.lower()
    	for i in s.replace('.','').split(' '):
    		dict[i]= dict.get(i,0)+1
    	
    	for i in dict:
    		if dict[i] ==1:
    			return i
    	
    s = "The angry dog was red. And the cat was red angry"
    print non_repeated(s)
    

    Using a word list to store the already existing words,

    def wordlist(s):
    	wordlist=[]
    	s = s.lower()
    	for i in s.replace('.','').split():
    		if i in wordlist:
    			wordlist.remove(i)
    		else:
    			wordlist.append(i)
    
    	return wordlist[0]
    
    s = "The angry dog was red. And the cat was red angry"
    print wordlist(s)
    

  • 0
    R
    This post is deleted!

  • 0
    J

    My C# code

    public interface Stream
        {
            char getNext();
            bool hasNext();
        }
    
        public class QuestionA
        {
            public static String firstWord(Stream input)
            {
                Dictionary<string, int> repeatedWords = new Dictionary<string, int>();
                StringBuilder word = new StringBuilder();
                string result = string.Empty;
    
                while (input.hasNext())
                {
                    char ch = input.getNext();
                    if(ch != ' ' || ch != '.')
                        word.Append(ch);
                    else
                    {
                        if (!repeatedWords.ContainsKey(word.ToString().ToLower()))
                            repeatedWords.Add(word.ToString().ToLower(), 1);
                        else
                            result = word.ToString();
                        
                        //clear for next word
                        word.Clear();
                    }
                }
    
                return result;
            }
        }
    

  • 0
    E

    This is similar to the LRU cache. Use a Queue to save the order of words and a HashMap to check duplicate. If the word exists, remove the node in the queue. Return the head of the queue as result.


  • 0
    A

    The solution is LinkedHashMap, which maintains the order in which we insert any object and has the properties of a HashMap.
    This problem can also be solved by HashMap and a Queue.
    Considering the stream terminated by a space character.

    
    class QuestionA {
        public static String firstWord(final Stream input) {
        	//Key - String, word
        	//value - Boolean  indicating whether the word is repeated(or not)
        	LinkedHashMap <String, Boolean>words=new LinkedHashMap<String, Boolean>();
        	//StringBuilder for collecting the word
        	StringBuilder sbr=new StringBuilder();
        	while(input.hasNext()){    		
        		char letter=input.getNext();
        		if(Character.isSpaceChar(letter)){
        			//if its is a space then its a new word    			
        			if(sbr.length()<1){
        				//handling space followed by space
        				continue;
        			}
        			String s=sbr.toString().trim();    			
        			sbr=new StringBuilder();    			
        			Boolean b=words.get(s);
        			if(b==null){
        				//new word
        				words.put(s, false);
        			} else if(b){
        				//its already repeated word
        			} else{
        				//new repeated word
        				words.put(s, true);
        			}
        		} else{
        			//if it is not a space then add to our stringBuilder
        			sbr.append(letter);
        		}
        	}
        	
        	String toReturn=null;
        	//Now go through the linked hashMap and find the first value with false indicating not repeated word
        	for(Entry<String, Boolean> crnt:words.entrySet()){
        		if(crnt.getValue()){
        			toReturn=crnt.getKey();
        		}
        	}
            return toReturn;
        }
    }

  • 0
    K
    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>
    #include <assert.h>
    using namespace std;
    
    class Stream {
    public:
        char getNext();
        bool hasNext();
    };
    
    class QuestionA {
    public:
        string getNextWord(Stream &input) {
            assert(input.hasNext());
            string s;
            while (input.hasNext()) {
                char c = input.getNext();
                if (c == ' ')
                    return s;
                else
                    s.push_back(c);
            }
    
            return s;
        }
    
        string firstWord(Stream &input) {
            unordered_map<string, int> m = new unordered_map<string, int>();
            vector<string> v = new vector<string>();
    
            while (input.hasNext()) {
                string word = getNextWord(input);
                m[word]++;
                v.push_back(word);
            }
    
            vector<string>::size_type sz = v.size();
            for (int i = 0; i < sz; i++) {
                assert(m[vector[i]] >= 1);
                if (m[vector[i]] == 1)
                    return m[vector[i]];
            }
            assert (0);
        }
    };
    
    

  • 0
    L

    A simple Java solution as below:

        public static String firstWord(final Stream input) {
            char c; //current char of input stream
            StringBuilder sb = new StringBuilder();
            String word;
            Set<String> linkedHashSet = new LinkedHashSet<String>();
            Set<String> duplicateWords = new HashSet<String>(); //handle the case triple duplicate "the the the"
    
            while (input.hasNext()) {
                c = input.getNext();
                switch (c) {
                    //The chars to specify the end of word are ' ' and '.'
                    case ' ':
                    case '.':
                        //NOTE: need to convert to lower case, else the
                        //      sample stream will return "The" instead of "dog"
                        word = sb.toString().toLowerCase();
                        if (linkedHashSet.contains(word)) {
                            linkedHashSet.remove(word);
                            duplicateWords.add(word);
                        } else if (!duplicateWords.contains(word)) {
                            linkedHashSet.add(word);
                        }
    
                        //Reset StringBuilder to accept the next word.
                        sb.setLength(0);
                        break;
                    default:
                        sb.append(c);
                }
            }
    
            if (linkedHashSet.isEmpty()) {
                return "";
            } else {
                return linkedHashSet.iterator().next();
            }
        }
    

    The simple test cases are:
    Input:
    The angry dog was red. And the cat was also angry.
    Output: dog
    Expected: dog
    Result: true

    Input:
    The . . angry dog was red. And the cat was also angry.
    Output: dog
    Expected: dog
    Result: true

    Input:
    the the the.
    Output:
    Expected:
    Result: true

    Input:
    ...
    Output:
    Expected:
    Result: true


  • 0
    O

    @laqxs How Can I execute this code?


  • 0
    H

    @DotProduct

    How does set return the first non repeating word? It doesn't store data in order.

    You should be using LinkedHashMap for this.


  • 0
    X

    @miguel.salto @DotProduct @tferreira
    To @DotProduct solution and @miguel-salto comments I can add - don't forget the last word in a stream - it may not terminate with space


  • 0
    X

    @hrshchaitanya see for LinkedHashSet.iterator().next()


  • 0
    A

    here is an implementation to it.

    public class FirstNonRepeatWordInAStream {
        private Set<String> uniqueSet = new LinkedHashSet<>();
        private Set<String> nonUniqueSet = new HashSet<>();
        private String getNextWord(Stream input){
            if(!input.hasNext())return null;
            StringBuilder sb = new StringBuilder();
            while(input.hasNext()){
                char c = input.getNext();
                if((c==' '||c=='.')&&sb.length()>0){
                   break;
                }
                else{
                    if(c!=' ' && c!='.') {
                        sb.append(c);
                    }
                }
            }
            return sb.toString();
        }
        public  String firstWord(final Stream input) {
            String next = getNextWord(input);
            while(next!=null){
                if(nonUniqueSet.add(next)){
                    uniqueSet.add(next);//add if new
                }else{
                    //remove if repeat
                    uniqueSet.remove(next);
                }
                next = getNextWord(input);
            }
            return nonUniqueSet.iterator().next();
        }
    
    }
    
    interface Stream {
        char getNext();
        boolean hasNext();
    }
    
    

  • 0

    Using a linked list to mimic LinkedHashMap / OrderedDict. keeping track of order by using a linked list and using a hash map to hash the node corresponding to each word

    class Stream:
    
        def __init__(self, s):
            self.s = s
            self.pos = 0
    
        def hasNext(self):
            return self.pos < len(self.s)
    
        def getNext(self):
            if not self.hasNext: return
            temp = self.s[self.pos]
            self.pos += 1
            return temp
    
    class WordNode:
    
        def __init__(self, word):
            self.word = word
            self.next = None
            self.cnt = 1
    
    def unique_word(stream):
        buffr = itr = WordNode(None)
        _map = {}
        word = []
        while stream.hasNext():
            c = stream.getNext()
            if c in (' ', ',', '.', ';', ':'):
                if word:
                    w = ''.join(word).lower()
                    word = []
                    if w in _map:
                        _map[w].cnt += 1
                    else: # create a new node
                        itr.next = WordNode(w)
                        itr = itr.next
                        _map[w] = itr
            else:
                word.append(c)
        head = buffr.next
        while head:
            if head.cnt == 1: return head.word
            head = head.next
        return None
    

Log in to reply
 

Looks like your connection to LeetCode Discuss was lost, please wait while we try to reconnect.