Solution using awk and pipes with explanation


  • 11
    I
    1. I need to count the words, so I chose the awk command.
    • I use an associative array (a dictionary) in awk. For every line, I count each word in the dictionary.
    • After all lines have been processed, the END block uses for (item in Dict) { ... } to print every word and its frequency.
    2. The printed words are unsorted, so I pipe (|) the output into sort.
    • sort -n means "compare according to string numerical value".
    • sort -r means "reverse the result of comparisons".
    • sort -k 2 means "sort by the second field", which here is the frequency.

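    awk '
    # Count every whitespace-separated word on each line.
    { for (i = 1; i <= NF; i++) { ++D[$i] } }
    # After the last line, print each word with its count.
    END { for (w in D) { print w, D[w] } }
    ' words.txt | sort -nr -k 2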
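
As a quick sanity check, here is a minimal sketch that runs the same awk and sort pipeline on a small sample. The function name word_freq and the sample text are assumptions for illustration only, and the input is read from stdin instead of words.txt.

```shell
# word_freq is a hypothetical helper name; it reads text from stdin.
word_freq() {
    awk '
    { for (i = 1; i <= NF; i++) { ++D[$i] } }
    END { for (w in D) { print w, D[w] } }
    ' | sort -nr -k 2
}

# Sample input (an assumption, not the real words.txt).
printf 'the day is sunny the the\nthe sunny is is\n' | word_freq
# the 4
# is 3
# sunny 2
# day 1
```

All counts in this sample are different, so the output order is fully determined. When two words have the same count, their relative order is left to sort's fallback comparison.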
    

    Are there any other solutions without awk, such as using sed or grep?


  • 2
    F

    I did it with sed.

    cat words.txt | tr -s '[:space:]' '\n' | sort | uniq -c | sort -nr | sed -r -e 's/[[:space:]]*([[:digit:]]+)[[:space:]]*([[:alpha:]]+)/\2 \1/g'


  • 0
    I

    It's a good idea to use tr and uniq.
    But I think it would be better to swap the columns with awk '{ print $2, $1 }' than with sed and a regular expression.
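
A minimal sketch of that idea, assuming the same tr, sort and uniq steps with awk doing the final column swap. The function name word_freq_uniq and the sample text are hypothetical. Since uniq -c prints the count first, the fields are printed as $2, $1 to get "word count".

```shell
# word_freq_uniq is a hypothetical helper name; it reads text from stdin.
word_freq_uniq() {
    tr -s '[:space:]' '\n' |   # one word per line
        sort |                 # group identical words together
        uniq -c |              # prefix each word with its count
        sort -nr |             # highest count first
        awk '{ print $2, $1 }' # swap to "word count"
}

# Sample input (an assumption, not the real words.txt).
printf 'the day is sunny the the\nthe sunny is is\n' | word_freq_uniq
# the 4
# is 3
# sunny 2
# day 1
```

awk splits on runs of whitespace by default, so the padding that uniq -c puts in front of the count needs no special handling.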


  • 0
    F

    You are right.


  • 0
    S

    Nice solution. Jeez it is going to take me a while to get used to all these options :)


  • 0
    C

    It seems sed is faster than awk

