Sort and count number of occurrence of lines
I have an Apache log file, `access.log`. How can I count the number of occurrences of each line in that file? For example, the result of
    cut -f 7 -d ' ' | cut -d '?' -f 1 | tr '[:upper:]' '[:lower:]'
is
    a.php
    b.php
    a.php
    c.php
    d.php
    b.php
    a.php
The result that I want is:
    3 a.php
    2 b.php
    1 d.php # order doesn't matter
    1 c.php
Tags: command-line, sort
`| sort | uniq -c` – Costas, Nov 26 '14 at 11:33 (score: 19)
`| LC_ALL=C sort | LC_ALL=C uniq -c` – Stéphane Chazelas, Nov 26 '14 at 11:33 (score: 3)
Ah, I never knew that `uniq` could do that. – Kokizzu, Nov 26 '14 at 11:37
Do you have an example of a line from the log? I think this could all be done with awk, without all the pipes. – user78605, Nov 26 '14 at 13:54
It's OK: the 8.1 GB log file was processed in about 2 minutes and it's done for now, so I no longer need this :3 – Kokizzu, Nov 26 '14 at 14:19
asked Nov 26 '14 at 11:31 by Kokizzu
3 Answers
    | sort | uniq -c
As stated in the comments.
Piping the output into `sort` organises it into alphabetical/numerical order. This is a requirement because `uniq` only matches repeated lines that are adjacent. For example, given a text file containing:
    a
    b
    a
`uniq` will return the following:
    a
    b
    a
This is because the two `a`s are separated by the `b`; they are not consecutive lines. However, if you first sort the data into alphabetical order:
    a
    a
    b
then `uniq` will remove the repeated lines. The `-c` option of `uniq` counts the number of occurrences of each line and provides output in the form:
    2 a
    1 b
http://unixhelp.ed.ac.uk/CGI/man-cgi?sort
http://unixhelp.ed.ac.uk/CGI/man-cgi?uniq
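The adjacency requirement described above is easy to see on the three-line example. A minimal sketch, assuming a POSIX shell with `sort` and `uniq`; the helper name `count_lines` is hypothetical and not part of the answer:

```shell
# Hypothetical helper: count identical lines read from standard input.
count_lines() {
    sort | uniq -c
}

# Without sorting, the two "a" lines are not adjacent, so each line is counted once.
printf '%s\n' a b a | uniq -c

# With sorting first, the two "a" lines become adjacent and are counted together.
printf '%s\n' a b a | count_lines
```

The width of the count column can differ between `uniq` implementations, so a script that parses this output is probably safer splitting on whitespace than relying on fixed columns.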
Welcome to Unix & Linux :) Don't hesitate to add more details to your answer and explain why and how this works ;) – John WH Smith, Nov 26 '14 at 12:18 (score: 1)
`printf '%s\n' ①.php ②.php | sort | uniq -c` gives me `2 ①.php` – Stéphane Chazelas, Nov 26 '14 at 12:50 (score: 1)
@StéphaneChazelas That's because the printf prints `phpnphp` – user78605, Nov 26 '14 at 13:52
@Jidder, no, that's because `①.php` sorts the same as `②.php` in my locale, since no sorting order is defined for the `①` and `②` characters there. If you want unique values for any byte values (remember file paths are not necessarily text), then you need to fix the locale to C: `| LC_ALL=C sort | LC_ALL=C uniq -c`. – Stéphane Chazelas, Nov 26 '14 at 14:00 (score: 4)
In order to have the resulting counts sorted, consider adding `sort -nr`, as @eduard-florinescu's answer below does. – Lluís Suñol, Mar 26 '18 at 11:41
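The locale point raised in these comments can be tried directly. A minimal sketch, assuming a shell whose `printf` passes multi-byte characters through unchanged; the helper name `count_bytes_exact` is hypothetical:

```shell
# Hypothetical helper: count lines by exact byte content, whatever the user's locale.
# LC_ALL=C makes both sort and uniq compare raw bytes instead of locale collation.
count_bytes_exact() {
    LC_ALL=C sort | LC_ALL=C uniq -c
}

# The two names differ byte for byte, so in the C locale each is counted once.
printf '%s\n' ①.php ②.php | count_bytes_exact
```

Byte-wise comparison also tends to be faster than locale-aware collation, although the size of the gain depends on the data and the implementation.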
    [your command] | sort | uniq -c | sort -nr
The accepted answer is almost complete; you might want to add an extra `sort -nr` at the end to sort the results so that the lines that occur most often come first.
uniq options:
    -c, --count
        prefix lines by the number of occurrences
sort options:
    -n, --numeric-sort
        compare according to string numerical value
    -r, --reverse
        reverse the result of comparisons
In the particular case where the lines you are sorting are numbers, you need to use `sort -gr` instead of `sort -nr`; see the comments.
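Applied to the sample data from the question, the pipeline above can be sketched as follows. This assumes a POSIX shell; the helper name `count_desc` is hypothetical:

```shell
# Hypothetical helper: count identical lines, then list the most frequent first.
count_desc() {
    sort | uniq -c | sort -nr
}

# Sample data taken from the question.
printf '%s\n' a.php b.php a.php c.php d.php b.php a.php | count_desc
```

`a.php` (3 occurrences) comes first and `b.php` (2) second; the relative order of the two lines that occur once is not something the question cares about.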
Thanks so much for letting me know about the `-n` option. – Sigur, Nov 30 '16 at 17:00 (score: 3)
Great answer. Here's what I use to get a word count out of a file with sentences: `tr ' ' '\n' < "$FILE" | sort | uniq -c | sort -nr > wordcount.txt`. The first command replaces spaces with newlines, allowing the rest of the command to work as expected. – Bar, Jul 20 '17 at 0:08 (score: 2)
Using the options above I get " 1" before " 23344". Using `sort -gr` instead solves this. `-g`: compare according to general numerical value (instead of `-n`: compare according to string numerical value). – Peter Jaric, Feb 14 at 12:24 (score: 1)
@PeterJaric Great catch, and `-gr` is very useful to know about, but I think the output of `uniq -c` will be such that `sort -nr` works as intended. – Eduard Florinescu, Feb 14 at 13:09
Actually, when the data are numbers, `-gr` works better. Try these two examples, differing only in the g and n flags: `echo "1 11 1 2" | tr ' ' '\n' | sort | uniq -c | sort -nr` and `echo "1 11 1 2" | tr ' ' '\n' | sort | uniq -c | sort -gr`. The first one sorts incorrectly, but not the second one. – Peter Jaric, Feb 15 at 10:31 (score: 1)
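One way to sidestep the `-n` versus `-g` question is to restrict the numeric comparison to the count column. A minimal sketch, assuming a `sort` that supports POSIX `-k` keys; the helper name `count_desc_by_field` is hypothetical, and the idea that the difference reported above comes from locale-specific number parsing is an assumption that the thread does not confirm:

```shell
# Hypothetical helper: order uniq -c output by the count field only.
# -k1,1nr limits the numeric, reversed comparison to the first field, so the
# counted value (which may itself be a number) is never read as part of the count.
# LC_ALL=C pins byte-wise collation and plain number parsing (an assumption
# about the cause of the behaviour reported in the comments).
count_desc_by_field() {
    LC_ALL=C sort | LC_ALL=C uniq -c | LC_ALL=C sort -k1,1nr
}

# Same input as in the comment above: the value 1 occurs twice, 11 and 2 once each.
printf '%s\n' 1 11 1 2 | count_desc_by_field
```

The line for the value `1`, which has a count of 2, is listed first.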
You can use an associative array in awk and then, optionally, sort:
    awk '{ tot[$0]++ } END { for (i in tot) print tot[i], i }' access.log | sort
output:
    1 c.php
    1 d.php
    2 b.php
    3 a.php
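Building on the associative-array idea, the extraction steps from the question could be folded into the same awk program. A minimal sketch, assuming the request path is the 7th whitespace-separated field, as in the question's `cut -f 7 -d ' '` (awk splits on runs of blanks, which can differ from `cut` when a field is empty); the helper name `count_paths` and the sample log lines are hypothetical:

```shell
# Hypothetical helper: tally request paths from Apache-style log lines.
# Reads the files given as arguments, or standard input when there are none.
count_paths() {
    awk '{
        path = $7
        sub(/\?.*/, "", path)      # drop the query string, like cut -d "?" -f 1
        count[tolower(path)]++     # fold case, like tr upper to lower
    }
    END { for (p in count) print count[p], p }' "$@" | sort -k1,1nr
}

# Made-up sample lines in a common Apache log layout.
printf '%s\n' \
    '127.0.0.1 - - [26/Nov/2014:11:31:00 +0000] "GET /A.php?x=1 HTTP/1.1" 200 512' \
    '127.0.0.1 - - [26/Nov/2014:11:31:01 +0000] "GET /b.php HTTP/1.1" 200 512' \
    '127.0.0.1 - - [26/Nov/2014:11:31:02 +0000] "GET /a.php HTTP/1.1" 200 512' |
    count_paths
```

For a real log, `count_paths access.log` would read the file directly. Unlike `sort | uniq -c`, this keeps one counter per distinct path in memory rather than sorting every line, which may or may not be a win depending on how many distinct paths the log contains.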
How would you count the number of occurrences as the pipe is sending data? – user123456, Oct 9 '16 at 18:00
answered Nov 26 '14 at 11:36 by visudo; edited Feb 8 '18 at 15:39 by Rodrigue (the `| sort | uniq -c` answer)
answered Feb 17 '16 at 14:50 by Eduard Florinescu; edited Feb 15 at 13:09 (the `sort -nr` answer)
answered May 28 '15 at 4:21 by Laurence R. Ugalde; edited Apr 9 '18 at 18:25 (the awk answer)