Sort and count the number of occurrences of lines












118















I have an Apache log file, access.log. How can I count the number of occurrences of each line in that file? For example, the result of cut -f 7 -d ' ' | cut -d '?' -f 1 | tr '[:upper:]' '[:lower:]' is:



a.php
b.php
a.php
c.php
d.php
b.php
a.php


The result that I want is:



3 a.php
2 b.php
1 d.php # order doesn't matter
1 c.php






command-line sort
















asked Nov 26 '14 at 11:31









Kokizzu








  • 19





    | sort | uniq -c

    – Costas
    Nov 26 '14 at 11:33






  • 3





    | LC_ALL=C sort | LC_ALL=C uniq -c

    – Stéphane Chazelas
    Nov 26 '14 at 11:33













  • Ah, I never knew that uniq could do that.

    – Kokizzu
    Nov 26 '14 at 11:37











  • Do you have an example of a line in the log? I think this could all be done with awk without all the pipes.

    – user78605
    Nov 26 '14 at 13:54











  • It's OK, the 8.1GB log file was processed in about 2 minutes, and it's done for now; I no longer need this :3

    – Kokizzu
    Nov 26 '14 at 14:19
























3 Answers
155







| sort | uniq -c


As stated in the comments.



Piping the output into sort organises the output into alphabetical/numerical order.



This is a requirement because uniq only matches adjacent repeated lines. For example, take a text file containing:



a
b
a


If you use uniq on this text file, it will return the following:



a
b
a


This is because the two a's are separated by the b: they are not consecutive lines. However, if you first sort the data into alphabetical order, like this:



a
a
b


then uniq will remove the repeated lines. The -c option of uniq counts the number of occurrences of each line and provides output in the form:



2 a
1 b


http://unixhelp.ed.ac.uk/CGI/man-cgi?sort



http://unixhelp.ed.ac.uk/CGI/man-cgi?uniq
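The difference between the two cases described above can be reproduced with a short sketch. It reuses the a/b/a sample from this answer; the variable names are only illustrative, and the exact column padding printed by uniq -c may differ between implementations.

```shell
# Sample input from the answer: the two "a" lines are not adjacent.
input=$(printf '%s\n' a b a)

# Without sorting, uniq finds no adjacent duplicates, so every line is kept
# and each one gets a count of 1.
unsorted_counts=$(printf '%s\n' "$input" | uniq -c)

# After sorting, equal lines sit next to each other and uniq -c can count them.
sorted_counts=$(printf '%s\n' "$input" | sort | uniq -c)

printf 'uniq -c alone:\n%s\n\n' "$unsorted_counts"
printf 'sort | uniq -c:\n%s\n' "$sorted_counts"
```

The first listing shows three lines with a count of 1 each; the second shows a counted twice and b once.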






edited Feb 8 '18 at 15:39
Rodrigue

answered Nov 26 '14 at 11:36
visudo








  • 1





    Welcome to Unix & Linux :) Don't hesitate to add more details to your answer and explain why and how this works ;)

    – John WH Smith
    Nov 26 '14 at 12:18






  • 1





    printf '%s\n' ①.php ②.php | sort | uniq -c gives me 2 ①.php

    – Stéphane Chazelas
    Nov 26 '14 at 12:50











  • @StéphaneChazelas That's because the printf prints php\nphp

    – user78605
    Nov 26 '14 at 13:52






  • 4





    @Jidder, no, that's because ①.php sorts the same as ②.php in my locale, since no sorting order is defined for the ① and ② characters there. If you want unique values for any byte values (remember file paths are not necessarily text), then you need to fix the locale to C: | LC_ALL=C sort | LC_ALL=C uniq -c.

    – Stéphane Chazelas
    Nov 26 '14 at 14:00











  • To have the resulting counts sorted, consider adding sort -nr, as @eduard-florinescu's answer below does.

    – Lluís Suñol
    Mar 26 '18 at 11:41
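The locale point raised in the comments above can be tried out with a small sketch. The two file names are the ones from the comment; the variable names are only illustrative. It shows the LC_ALL=C variant only, because what happens without it depends on the locale installed on the machine.

```shell
# Two distinct names whose first characters may have no collation order
# defined in some locales.
names=$(printf '%s\n' '①.php' '②.php')

# In the C locale both sort and uniq compare raw bytes, so names that differ
# in any byte are kept apart and counted separately.
byte_counts=$(printf '%s\n' "$names" | LC_ALL=C sort | LC_ALL=C uniq -c)

printf '%s\n' "$byte_counts"
```

With the locale forced to C, each name is reported on its own line with a count of 1.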














80







[your command] | sort | uniq -c | sort -nr


The accepted answer is almost complete; you might want to add an extra sort -nr at the end to sort the results with the most frequent lines first.



uniq options:



-c, --count
prefix lines by the number of occurrences


sort options:



-n, --numeric-sort
compare according to string numerical value
-r, --reverse
reverse the result of comparisons


In the particular case where the lines you are sorting are themselves numbers, you may need to use sort -gr instead of sort -nr; see the comments.
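As a rough illustration of the full pipeline, the sketch below runs it on the sample list from the question. The variable names are made up for the example, and the relative order of lines that share the same count is left to sort.

```shell
# Sample data from the question, one path per line.
paths=$(printf '%s\n' a.php b.php a.php c.php d.php b.php a.php)

# sort groups equal lines, uniq -c prefixes each group with its size,
# and sort -nr orders the groups by that count, largest first.
ranked=$(printf '%s\n' "$paths" | sort | uniq -c | sort -nr)

printf '%s\n' "$ranked"
```

The most frequent entry, a.php with a count of 3, comes out first, followed by b.php with 2 and then the two entries that occur once.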






edited Feb 15 at 13:09

answered Feb 17 '16 at 14:50
Eduard Florinescu








  • 3





    Thanks so much for letting me know about -n option.

    – Sigur
    Nov 30 '16 at 17:00






  • 2





    Great answer, here's what I use to get a word count out of a file with sentences: tr ' ' '\n' < $FILE | sort | uniq -c | sort -nr > wordcount.txt. The first command replaces spaces with newlines, allowing the rest of the command to work as expected.

    – Bar
    Jul 20 '17 at 0:08






  • 1





    Using the options above I get " 1" before " 23344". Using sort -gr instead solves this. -g: compare according to general numerical value (instead of -n: compare according to string numerical value).

    – Peter Jaric
    Feb 14 at 12:24













  • @PeterJaric Great catch, and very useful to know about -gr, but I think the output of uniq -c is such that sort -nr will work as intended

    – Eduard Florinescu
    Feb 14 at 13:09






  • 1





    Actually, when the data are numbers, -gr works better. Try these two examples, differing only in the g and n flags: echo "1 11 1 2" | tr ' ' '\n' | sort | uniq -c | sort -nr and echo "1 11 1 2" | tr ' ' '\n' | sort | uniq -c | sort -gr. The first one sorts incorrectly, but not the second one.

    – Peter Jaric
    Feb 15 at 10:31














7







You can use an associative array in awk and then, optionally, sort:



awk '{ tot[$0]++ } END { for (i in tot) print tot[i], i }' access.log | sort


output:



1 c.php
1 d.php
2 b.php
3 a.php
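The same associative-array idea can also absorb the cut and tr steps from the question, which is roughly what one comment on the question suggests. The sketch below is only an illustration under stated assumptions: the three log lines are made up, the variable names are hypothetical, and awk's default field splitting treats runs of whitespace as one separator, so it matches cut -d ' ' only when fields are separated by exactly one space.

```shell
# Three made-up request lines in Apache's common log layout (hypothetical data).
log_lines='127.0.0.1 - - [26/Nov/2014:11:31:00 +0000] "GET /a.php?id=1 HTTP/1.1" 200 512
127.0.0.1 - - [26/Nov/2014:11:31:01 +0000] "GET /B.php HTTP/1.1" 200 128
127.0.0.1 - - [26/Nov/2014:11:31:02 +0000] "GET /A.php?id=2 HTTP/1.1" 200 512'

counts=$(printf '%s\n' "$log_lines" | awk '
    {
        path = $7                 # 7th whitespace-separated field: the request path
        sub(/\?.*/, "", path)     # drop the query string, like cut -d "?" -f 1
        tot[tolower(path)]++      # tally case-insensitively, like the tr step
    }
    END { for (p in tot) print tot[p], p }
' | sort -nr)

printf '%s\n' "$counts"
```

Here /a.php and /A.php are folded together and counted twice, and /B.php is counted once as /b.php. The array holds one entry per distinct path, so memory use grows with the number of distinct paths rather than with the size of the log.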





edited Apr 9 '18 at 18:25

answered May 28 '15 at 4:21
Laurence R. Ugalde













  • How would you count the number of occurrences as the pipe is sending data?

    – user123456
    Oct 9 '16 at 18:00


















