Count number of pairs of strings which on joining contain all vowels
$begingroup$
Problem
I have $N$ words. For each valid $i$, word $i$ is described by a string $D_i$ containing only lowercase vowels, i.e. characters 'a', 'e', 'i', 'o', 'u'.
We check the number of strings $D_i$ and $D_j$ which, when concatenated into a new string $M$, contain all 5 vowels. What is the total number of (unordered) pairs of words such that when concatenated, they contain all the vowels?
Input
- The first line of the input contains a single integer $T$ denoting the number of test cases. The description of $T$ test cases follows.
- The first line of each test case contains a single integer $N$.
$N$ lines follow. For each valid $i$, the $i^{th}$ of these lines contains a single string $D_i$.
Output
For each test case, print a single line containing one integer - the number of concatenated words that contain all vowels.
Test Case
3
aaooaoaooa
uiieieiieieuuu
aeioooeeiiaiei
Output
2
Constraints
- $1≤T≤1,000$
- $1≤N≤10^5$
$1≤|D_i|≤1,000$ for each valid $i$
- the sum of all |$D_i$| over all test cases does not exceed $3⋅10^7$
I have written some code but I want to optimize it:
#include<bits/stdc++.h>
using namespace std;
int main()
{
int t;
cin>>t;
while(t--)
{
int n;
cin>>n;
vector<string> v;
long long con_s[n];
for(int i=0;i<n;i++)
{
string s;
cin>>s;
con_s[i]=0;
for(int j=0;j<s.length();j++)
{
con_s[i] = con_s[i] | (1<<(s[j]-'a'));
}
}
long long complete = 0;
complete = (1<<('a'-'a')) | (1<<('e'-'a')) | (1<<('i'-'a')) | (1<<('o'-'a')) | (1<<('u'-'a'));
// cout<<complete;
int count = 0;
for(int i=0;i<n-1;i++)
{
for(int j=i+1;j<n;j++)
{
if((con_s[i] | con_s[j])==complete)count++;
}
}
cout<<count<<"n";
}
return 0;
}
Can you please suggest a more optimized solution...
c++
New contributor
$endgroup$
add a comment |
$begingroup$
Problem
I have $N$ words. For each valid $i$, word $i$ is described by a string $D_i$ containing only lowercase vowels, i.e. characters 'a', 'e', 'i', 'o', 'u'.
We check the number of strings $D_i$ and $D_j$ which, when concatenated into a new string $M$, contain all 5 vowels. What is the total number of (unordered) pairs of words such that when concatenated, they contain all the vowels?
Input
- The first line of the input contains a single integer $T$ denoting the number of test cases. The description of $T$ test cases follows.
- The first line of each test case contains a single integer $N$.
$N$ lines follow. For each valid $i$, the $i^{th}$ of these lines contains a single string $D_i$.
Output
For each test case, print a single line containing one integer - the number of concatenated words that contain all vowels.
Test Case
3
aaooaoaooa
uiieieiieieuuu
aeioooeeiiaiei
Output
2
Constraints
- $1≤T≤1,000$
- $1≤N≤10^5$
$1≤|D_i|≤1,000$ for each valid $i$
- the sum of all |$D_i$| over all test cases does not exceed $3⋅10^7$
I have written some code but I want to optimize it:
#include<bits/stdc++.h>
using namespace std;
int main()
{
int t;
cin>>t;
while(t--)
{
int n;
cin>>n;
vector<string> v;
long long con_s[n];
for(int i=0;i<n;i++)
{
string s;
cin>>s;
con_s[i]=0;
for(int j=0;j<s.length();j++)
{
con_s[i] = con_s[i] | (1<<(s[j]-'a'));
}
}
long long complete = 0;
complete = (1<<('a'-'a')) | (1<<('e'-'a')) | (1<<('i'-'a')) | (1<<('o'-'a')) | (1<<('u'-'a'));
// cout<<complete;
int count = 0;
for(int i=0;i<n-1;i++)
{
for(int j=i+1;j<n;j++)
{
if((con_s[i] | con_s[j])==complete)count++;
}
}
cout<<count<<"n";
}
return 0;
}
Can you please suggest a more optimized solution...
c++
New contributor
$endgroup$
$begingroup$
Welcome! As it's written, your code is a little difficult to understand, particularly with the bit shifting. My suggestion is to use more informative variable names. Short one-letter variable names are good for loops and writing quick code, but they make it very difficult for others (even yourself) to review your code later on. I'd also recommend moving some of the code out of main and into functions.
$endgroup$
– AleksandrH
12 hours ago
$begingroup$
Is the "T" variable the number of tests, or something else? I don't see it mentioned in the description of the problem.
$endgroup$
– Austin Hastings
9 hours ago
$begingroup$
@AustinHastings yes, T is the number of testcases. Can you help me solve this ?
$endgroup$
– AnonymousJoe
9 hours ago
1
$begingroup$
@AleksandrH Please put all suggestions for improvements in answers, not comments.
$endgroup$
– 200_success
7 hours ago
add a comment |
$begingroup$
Problem
I have $N$ words. For each valid $i$, word $i$ is described by a string $D_i$ containing only lowercase vowels, i.e. characters 'a', 'e', 'i', 'o', 'u'.
We check the number of strings $D_i$ and $D_j$ which, when concatenated into a new string $M$, contain all 5 vowels. What is the total number of (unordered) pairs of words such that when concatenated, they contain all the vowels?
Input
- The first line of the input contains a single integer $T$ denoting the number of test cases. The description of $T$ test cases follows.
- The first line of each test case contains a single integer $N$.
$N$ lines follow. For each valid $i$, the $i^{th}$ of these lines contains a single string $D_i$.
Output
For each test case, print a single line containing one integer - the number of concatenated words that contain all vowels.
Test Case
3
aaooaoaooa
uiieieiieieuuu
aeioooeeiiaiei
Output
2
Constraints
- $1≤T≤1,000$
- $1≤N≤10^5$
$1≤|D_i|≤1,000$ for each valid $i$
- the sum of all |$D_i$| over all test cases does not exceed $3⋅10^7$
I have written some code but I want to optimize it:
#include<bits/stdc++.h>
using namespace std;
int main()
{
int t;
cin>>t;
while(t--)
{
int n;
cin>>n;
vector<string> v;
long long con_s[n];
for(int i=0;i<n;i++)
{
string s;
cin>>s;
con_s[i]=0;
for(int j=0;j<s.length();j++)
{
con_s[i] = con_s[i] | (1<<(s[j]-'a'));
}
}
long long complete = 0;
complete = (1<<('a'-'a')) | (1<<('e'-'a')) | (1<<('i'-'a')) | (1<<('o'-'a')) | (1<<('u'-'a'));
// cout<<complete;
int count = 0;
for(int i=0;i<n-1;i++)
{
for(int j=i+1;j<n;j++)
{
if((con_s[i] | con_s[j])==complete)count++;
}
}
cout<<count<<"n";
}
return 0;
}
Can you please suggest a more optimized solution...
c++
New contributor
$endgroup$
Problem
I have $N$ words. For each valid $i$, word $i$ is described by a string $D_i$ containing only lowercase vowels, i.e. characters 'a', 'e', 'i', 'o', 'u'.
We check the number of strings $D_i$ and $D_j$ which, when concatenated into a new string $M$, contain all 5 vowels. What is the total number of (unordered) pairs of words such that when concatenated, they contain all the vowels?
Input
- The first line of the input contains a single integer $T$ denoting the number of test cases. The description of $T$ test cases follows.
- The first line of each test case contains a single integer $N$.
$N$ lines follow. For each valid $i$, the $i^{th}$ of these lines contains a single string $D_i$.
Output
For each test case, print a single line containing one integer - the number of concatenated words that contain all vowels.
Test Case
3
aaooaoaooa
uiieieiieieuuu
aeioooeeiiaiei
Output
2
Constraints
- $1≤T≤1,000$
- $1≤N≤10^5$
$1≤|D_i|≤1,000$ for each valid $i$
- the sum of all |$D_i$| over all test cases does not exceed $3⋅10^7$
I have written some code but I want to optimize it:
#include<bits/stdc++.h>
using namespace std;
int main()
{
int t;
cin>>t;
while(t--)
{
int n;
cin>>n;
vector<string> v;
long long con_s[n];
for(int i=0;i<n;i++)
{
string s;
cin>>s;
con_s[i]=0;
for(int j=0;j<s.length();j++)
{
con_s[i] = con_s[i] | (1<<(s[j]-'a'));
}
}
long long complete = 0;
complete = (1<<('a'-'a')) | (1<<('e'-'a')) | (1<<('i'-'a')) | (1<<('o'-'a')) | (1<<('u'-'a'));
// cout<<complete;
int count = 0;
for(int i=0;i<n-1;i++)
{
for(int j=i+1;j<n;j++)
{
if((con_s[i] | con_s[j])==complete)count++;
}
}
cout<<count<<"n";
}
return 0;
}
Can you please suggest a more optimized solution...
c++
c++
New contributor
New contributor
edited 3 hours ago
1201ProgramAlarm
3,2532923
3,2532923
New contributor
asked 14 hours ago
AnonymousJoeAnonymousJoe
112
112
New contributor
New contributor
$begingroup$
Welcome! As it's written, your code is a little difficult to understand, particularly with the bit shifting. My suggestion is to use more informative variable names. Short one-letter variable names are good for loops and writing quick code, but they make it very difficult for others (even yourself) to review your code later on. I'd also recommend moving some of the code out of main and into functions.
$endgroup$
– AleksandrH
12 hours ago
$begingroup$
Is the "T" variable the number of tests, or something else? I don't see it mentioned in the description of the problem.
$endgroup$
– Austin Hastings
9 hours ago
$begingroup$
@AustinHastings yes, T is the number of testcases. Can you help me solve this ?
$endgroup$
– AnonymousJoe
9 hours ago
1
$begingroup$
@AleksandrH Please put all suggestions for improvements in answers, not comments.
$endgroup$
– 200_success
7 hours ago
add a comment |
$begingroup$
Welcome! As it's written, your code is a little difficult to understand, particularly with the bit shifting. My suggestion is to use more informative variable names. Short one-letter variable names are good for loops and writing quick code, but they make it very difficult for others (even yourself) to review your code later on. I'd also recommend moving some of the code out of main and into functions.
$endgroup$
– AleksandrH
12 hours ago
$begingroup$
Is the "T" variable the number of tests, or something else? I don't see it mentioned in the description of the problem.
$endgroup$
– Austin Hastings
9 hours ago
$begingroup$
@AustinHastings yes, T is the number of testcases. Can you help me solve this ?
$endgroup$
– AnonymousJoe
9 hours ago
1
$begingroup$
@AleksandrH Please put all suggestions for improvements in answers, not comments.
$endgroup$
– 200_success
7 hours ago
$begingroup$
Welcome! As it's written, your code is a little difficult to understand, particularly with the bit shifting. My suggestion is to use more informative variable names. Short one-letter variable names are good for loops and writing quick code, but they make it very difficult for others (even yourself) to review your code later on. I'd also recommend moving some of the code out of main and into functions.
$endgroup$
– AleksandrH
12 hours ago
$begingroup$
Welcome! As it's written, your code is a little difficult to understand, particularly with the bit shifting. My suggestion is to use more informative variable names. Short one-letter variable names are good for loops and writing quick code, but they make it very difficult for others (even yourself) to review your code later on. I'd also recommend moving some of the code out of main and into functions.
$endgroup$
– AleksandrH
12 hours ago
$begingroup$
Is the "T" variable the number of tests, or something else? I don't see it mentioned in the description of the problem.
$endgroup$
– Austin Hastings
9 hours ago
$begingroup$
Is the "T" variable the number of tests, or something else? I don't see it mentioned in the description of the problem.
$endgroup$
– Austin Hastings
9 hours ago
$begingroup$
@AustinHastings yes, T is the number of testcases. Can you help me solve this ?
$endgroup$
– AnonymousJoe
9 hours ago
$begingroup$
@AustinHastings yes, T is the number of testcases. Can you help me solve this ?
$endgroup$
– AnonymousJoe
9 hours ago
1
1
$begingroup$
@AleksandrH Please put all suggestions for improvements in answers, not comments.
$endgroup$
– 200_success
7 hours ago
$begingroup$
@AleksandrH Please put all suggestions for improvements in answers, not comments.
$endgroup$
– 200_success
7 hours ago
add a comment |
2 Answers
2
active
oldest
votes
$begingroup$
I'm no C++ guru, but some things jump out at me:
Your coding style is very dense, which makes it hard to read and review. I'd suggest that you switch to a style that doesn't make you pay for each time you add a space. Code like this:
cin>>t;
while(t--)
Just isn't as easy to read as code like this:
cin >> t;
while (t--)
The little things make a difference! And even if you're in some kind of "shortest code" context, I think you should still write it long, and then compress it once you have it working.
The first
for
loop in yourwhile
loop is concerned with reading and parsing the input strings. I believe you should break that into a separate function. What's more, I think you should make a slight change to your data storage as @Deduplicator suggested: instead of just using the bits located at(1 << (vowel - 'a'))
, at some point you should compress the bits down into the range1<<0 .. 1<<4
.I believe that the description says "(unordered) pairs" and means that if two strings, A and B, can be concatenated to meet the requirements, they only count once, as in
set{A, B}
and not twice, as inpair(A,B), pair(B,A)
. So, as @Deduplicator suggested, if you "bin" your words - that is, categorize them according to the trait "which vowels are present", then you can represent a bin with just an integer (how many words are in the bin). So you would then cross-match every bin with every other bin to determine whether the binned words can successfully pair, and add an appropriate number to the count.
With that in mind, a successful solution would look something like this:
for each input word:
scan the word for vowels, recording which vowels were found
classify the found vowels into a small integer with bits 0..4 set
use the small integer as the "bin" number for that word
increment the count for that bin#
This should be $O(n cdot s)$ on the number of words, n, and the length of the words, s, in time. The only optimization would be to break out of the scanning loop if you match every single vowel, since further scanning gains you nothing. The code will require storage for the input, but compiles everything down to a single bin number, so the storage will be from $O(1)$ to $O(s)$ depending on how you implement the code. The bin numbers are members of a fixed set, so their counts will be $O(2^5)$ regardless, which simplifies to $O(1)$.
Now since matched pairs of words only count one time, you can loop "upwards" when generating combinations:
for each bin 0 .. 2^5 - 1
for each higher bin (looping upwards here):
if the two bins match
the count of word pairs is # in bin-1 * # in bin-2
add the count to the total count of matchable pairs
This should be $O(n^2)$ on the number of bins (not words!), which is a constant. You can't optimize much, except that bins with counts of zero won't contribute anything so can be skipped.
At this point you have your answer and just need to print it out.
$endgroup$
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
add a comment |
$begingroup$
Your Code:
Don't use
<bits/stdc++.h>
, it's unportable and inefficient. See "Why should I not #include <bits/stdc++.h>?" for the details.Don't use
using namespace std;
, that namespace just isn't designed for it. See "Why is “using namespace std” considered bad practice?" for more background.Invest in some better names. Being too parsimonious there hurts, even though brevity is a virtue.
Format your code consistentently.
- Sometimes you have space around binary operators (other than comma), sometimes you don't.
- Also add a space after
#include
,for
,if
, andwhile
. - You don't have to use a block for single statements, but please don't put them on the same line, at the least separate them with a space.
C++ does not have VLAs. Just use a
std::vector
instead.While in general, keeping the scope of a variable minimal is a good idea, there are exceptions. One of them is efficiency. Always spinning up a new
std::string
means being unable to re-use the buffer. Not that you really need the whole string at once, anyway.One reason you should minimise a variables scope, is that it allows you to initialise it to the proper value, instead of leaving it uninitialised or, horrors of horrors, adding a spurious dummy-initialisation.
That also allows you to avoid writing the type (Almost Alwaysauto
), and making itconst
or evenconstexpr
.There are compound-assignment-operators for most binary operators. Like
a |= b
fora = a | b
. Using them leads to shorter, more readable code.When you want to output a single character, why not use a character-literal? It's potentially even slightly more efficient.
return 0;
is implicit formain()
. Take it or leave it, but be aware.
The Algorithm:
Your algorithm uses $O(#characters+#words^2)$ time and $O(max_word_length + #words)$ space.
An optimal algorithm only needs $O(#characters+2^{#vowels})$ time and $O(1)$ space:
- For every word:
- Set the bits for all the vowels contained.
- Increment the count on the indicated bin.
- For every vowel:
- Iterate the bins containing words with that vowel.
- Add the count to the respective bin without that vowel.
- For all bins:
Multiply the count in the bin with the count in the complementary bin, and add that. - Subtract twice the count in the bin for words containing all vowels. Those account for all self-pairings.
- The answer is half the calculated number, as we counted double.
Exercise for the attentive reader: Save half the multiplications this algorithm uses.
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
AnonymousJoe is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f214644%2fcount-number-of-pairs-of-strings-which-on-joining-contain-all-vowels%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
I'm no C++ guru, but some things jump out at me:
Your coding style is very dense, which makes it hard to read and review. I'd suggest that you switch to a style that doesn't make you pay for each time you add a space. Code like this:
cin>>t;
while(t--)
Just isn't as easy to read as code like this:
cin >> t;
while (t--)
The little things make a difference! And even if you're in some kind of "shortest code" context, I think you should still write it long, and then compress it once you have it working.
The first
for
loop in yourwhile
loop is concerned with reading and parsing the input strings. I believe you should break that into a separate function. What's more, I think you should make a slight change to your data storage as @Deduplicator suggested: instead of just using the bits located at(1 << (vowel - 'a'))
, at some point you should compress the bits down into the range1<<0 .. 1<<4
.I believe that the description says "(unordered) pairs" and means that if two strings, A and B, can be concatenated to meet the requirements, they only count once, as in
set{A, B}
and not twice, as inpair(A,B), pair(B,A)
. So, as @Deduplicator suggested, if you "bin" your words - that is, categorize them according to the trait "which vowels are present", then you can represent a bin with just an integer (how many words are in the bin). So you would then cross-match every bin with every other bin to determine whether the binned words can successfully pair, and add an appropriate number to the count.
With that in mind, a successful solution would look something like this:
for each input word:
scan the word for vowels, recording which vowels were found
classify the found vowels into a small integer with bits 0..4 set
use the small integer as the "bin" number for that word
increment the count for that bin#
This should be $O(n cdot s)$ on the number of words, n, and the length of the words, s, in time. The only optimization would be to break out of the scanning loop if you match every single vowel, since further scanning gains you nothing. The code will require storage for the input, but compiles everything down to a single bin number, so the storage will be from $O(1)$ to $O(s)$ depending on how you implement the code. The bin numbers are members of a fixed set, so their counts will be $O(2^5)$ regardless, which simplifies to $O(1)$.
Now since matched pairs of words only count one time, you can loop "upwards" when generating combinations:
for each bin 0 .. 2^5 - 1
for each higher bin (looping upwards here):
if the two bins match
the count of word pairs is # in bin-1 * # in bin-2
add the count to the total count of matchable pairs
This should be $O(n^2)$ on the number of bins (not words!), which is a constant. You can't optimize much, except that bins with counts of zero won't contribute anything so can be skipped.
At this point you have your answer and just need to print it out.
$endgroup$
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
add a comment |
$begingroup$
I'm no C++ guru, but some things jump out at me:
Your coding style is very dense, which makes it hard to read and review. I'd suggest that you switch to a style that doesn't make you pay for each time you add a space. Code like this:
cin>>t;
while(t--)
Just isn't as easy to read as code like this:
cin >> t;
while (t--)
The little things make a difference! And even if you're in some kind of "shortest code" context, I think you should still write it long, and then compress it once you have it working.
The first
for
loop in yourwhile
loop is concerned with reading and parsing the input strings. I believe you should break that into a separate function. What's more, I think you should make a slight change to your data storage as @Deduplicator suggested: instead of just using the bits located at(1 << (vowel - 'a'))
, at some point you should compress the bits down into the range1<<0 .. 1<<4
.I believe that the description says "(unordered) pairs" and means that if two strings, A and B, can be concatenated to meet the requirements, they only count once, as in
set{A, B}
and not twice, as inpair(A,B), pair(B,A)
. So, as @Deduplicator suggested, if you "bin" your words - that is, categorize them according to the trait "which vowels are present", then you can represent a bin with just an integer (how many words are in the bin). So you would then cross-match every bin with every other bin to determine whether the binned words can successfully pair, and add an appropriate number to the count.
With that in mind, a successful solution would look something like this:
for each input word:
scan the word for vowels, recording which vowels were found
classify the found vowels into a small integer with bits 0..4 set
use the small integer as the "bin" number for that word
increment the count for that bin#
This should be $O(n cdot s)$ on the number of words, n, and the length of the words, s, in time. The only optimization would be to break out of the scanning loop if you match every single vowel, since further scanning gains you nothing. The code will require storage for the input, but compiles everything down to a single bin number, so the storage will be from $O(1)$ to $O(s)$ depending on how you implement the code. The bin numbers are members of a fixed set, so their counts will be $O(2^5)$ regardless, which simplifies to $O(1)$.
Now since matched pairs of words only count one time, you can loop "upwards" when generating combinations:
for each bin 0 .. 2^5 - 1
for each higher bin (looping upwards here):
if the two bins match
the count of word pairs is # in bin-1 * # in bin-2
add the count to the total count of matchable pairs
This should be $O(n^2)$ on the number of bins (not words!), which is a constant. You can't optimize much, except that bins with counts of zero won't contribute anything so can be skipped.
At this point you have your answer and just need to print it out.
$endgroup$
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
add a comment |
$begingroup$
I'm no C++ guru, but some things jump out at me:
Your coding style is very dense, which makes it hard to read and review. I'd suggest that you switch to a style that doesn't make you pay for each time you add a space. Code like this:
cin>>t;
while(t--)
Just isn't as easy to read as code like this:
cin >> t;
while (t--)
The little things make a difference! And even if you're in some kind of "shortest code" context, I think you should still write it long, and then compress it once you have it working.
The first
for
loop in yourwhile
loop is concerned with reading and parsing the input strings. I believe you should break that into a separate function. What's more, I think you should make a slight change to your data storage as @Deduplicator suggested: instead of just using the bits located at(1 << (vowel - 'a'))
, at some point you should compress the bits down into the range1<<0 .. 1<<4
.I believe that the description says "(unordered) pairs" and means that if two strings, A and B, can be concatenated to meet the requirements, they only count once, as in
set{A, B}
and not twice, as inpair(A,B), pair(B,A)
. So, as @Deduplicator suggested, if you "bin" your words - that is, categorize them according to the trait "which vowels are present", then you can represent a bin with just an integer (how many words are in the bin). So you would then cross-match every bin with every other bin to determine whether the binned words can successfully pair, and add an appropriate number to the count.
With that in mind, a successful solution would look something like this:
for each input word:
scan the word for vowels, recording which vowels were found
classify the found vowels into a small integer with bits 0..4 set
use the small integer as the "bin" number for that word
increment the count for that bin#
This should be $O(n cdot s)$ on the number of words, n, and the length of the words, s, in time. The only optimization would be to break out of the scanning loop if you match every single vowel, since further scanning gains you nothing. The code will require storage for the input, but compiles everything down to a single bin number, so the storage will be from $O(1)$ to $O(s)$ depending on how you implement the code. The bin numbers are members of a fixed set, so their counts will be $O(2^5)$ regardless, which simplifies to $O(1)$.
Now since matched pairs of words only count one time, you can loop "upwards" when generating combinations:
for each bin 0 .. 2^5 - 1
for each higher bin (looping upwards here):
if the two bins match
the count of word pairs is # in bin-1 * # in bin-2
add the count to the total count of matchable pairs
This should be $O(n^2)$ on the number of bins (not words!), which is a constant. You can't optimize much, except that bins with counts of zero won't contribute anything so can be skipped.
At this point you have your answer and just need to print it out.
$endgroup$
I'm no C++ guru, but some things jump out at me:
Your coding style is very dense, which makes it hard to read and review. I'd suggest that you switch to a style that doesn't make you pay for each time you add a space. Code like this:
cin>>t;
while(t--)
Just isn't as easy to read as code like this:
cin >> t;
while (t--)
The little things make a difference! And even if you're in some kind of "shortest code" context, I think you should still write it long, and then compress it once you have it working.
The first
for
loop in yourwhile
loop is concerned with reading and parsing the input strings. I believe you should break that into a separate function. What's more, I think you should make a slight change to your data storage as @Deduplicator suggested: instead of just using the bits located at(1 << (vowel - 'a'))
, at some point you should compress the bits down into the range1<<0 .. 1<<4
.I believe that the description says "(unordered) pairs" and means that if two strings, A and B, can be concatenated to meet the requirements, they only count once, as in
set{A, B}
and not twice, as inpair(A,B), pair(B,A)
. So, as @Deduplicator suggested, if you "bin" your words - that is, categorize them according to the trait "which vowels are present", then you can represent a bin with just an integer (how many words are in the bin). So you would then cross-match every bin with every other bin to determine whether the binned words can successfully pair, and add an appropriate number to the count.
With that in mind, a successful solution would look something like this:
for each input word:
scan the word for vowels, recording which vowels were found
classify the found vowels into a small integer with bits 0..4 set
use the small integer as the "bin" number for that word
increment the count for that bin#
This should be $O(n cdot s)$ on the number of words, n, and the length of the words, s, in time. The only optimization would be to break out of the scanning loop if you match every single vowel, since further scanning gains you nothing. The code will require storage for the input, but compiles everything down to a single bin number, so the storage will be from $O(1)$ to $O(s)$ depending on how you implement the code. The bin numbers are members of a fixed set, so their counts will be $O(2^5)$ regardless, which simplifies to $O(1)$.
Now since matched pairs of words only count one time, you can loop "upwards" when generating combinations:
for each bin 0 .. 2^5 - 1
for each higher bin (looping upwards here):
if the two bins match
the count of word pairs is # in bin-1 * # in bin-2
add the count to the total count of matchable pairs
This should be $O(n^2)$ on the number of bins (not words!), which is a constant. You can't optimize much, except that bins with counts of zero won't contribute anything so can be skipped.
At this point you have your answer and just need to print it out.
answered 9 hours ago
Austin HastingsAustin Hastings
6,9671233
6,9671233
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
add a comment |
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
$begingroup$
It is simple to optimise to runtime O(characters+bins). And you are right, size is O(1).
$endgroup$
– Deduplicator
8 hours ago
add a comment |
$begingroup$
Your Code:
Don't use
<bits/stdc++.h>
, it's unportable and inefficient. See "Why should I not #include <bits/stdc++.h>?" for the details.Don't use
using namespace std;
, that namespace just isn't designed for it. See "Why is “using namespace std” considered bad practice?" for more background.Invest in some better names. Being too parsimonious there hurts, even though brevity is a virtue.
Format your code consistentently.
- Sometimes you have space around binary operators (other than comma), sometimes you don't.
- Also add a space after
#include
,for
,if
, andwhile
. - You don't have to use a block for single statements, but please don't put them on the same line, at the least separate them with a space.
C++ does not have VLAs. Just use a
std::vector
instead.While in general, keeping the scope of a variable minimal is a good idea, there are exceptions. One of them is efficiency. Always spinning up a new
std::string
means being unable to re-use the buffer. Not that you really need the whole string at once, anyway.One reason you should minimise a variables scope, is that it allows you to initialise it to the proper value, instead of leaving it uninitialised or, horrors of horrors, adding a spurious dummy-initialisation.
That also allows you to avoid writing the type (Almost Alwaysauto
), and making itconst
or evenconstexpr
.There are compound-assignment-operators for most binary operators. Like
a |= b
fora = a | b
. Using them leads to shorter, more readable code.When you want to output a single character, why not use a character-literal? It's potentially even slightly more efficient.
return 0;
is implicit formain()
. Take it or leave it, but be aware.
The Algorithm:
Your algorithm uses $O(#characters+#words^2)$ time and $O(max_word_length + #words)$ space.
An optimal algorithm only needs $O(#characters+2^{#vowels})$ time and $O(1)$ space:
- For every word:
- Set the bits for all the vowels contained.
- Increment the count on the indicated bin.
- For every vowel:
- Iterate the bins containing words with that vowel.
- Add the count to the respective bin without that vowel.
- For all bins:
Multiply the count in the bin with the count in the complementary bin, and add that. - Subtract twice the count in the bin for words containing all vowels. Those account for all self-pairings.
- The answer is half the calculated number, as we counted double.
Exercise for the attentive reader: Save half the multiplications this algorithm uses.
$endgroup$
add a comment |
$begingroup$
Your Code:
Don't use
<bits/stdc++.h>
, it's unportable and inefficient. See "Why should I not #include <bits/stdc++.h>?" for the details.Don't use
using namespace std;
, that namespace just isn't designed for it. See "Why is “using namespace std” considered bad practice?" for more background.Invest in some better names. Being too parsimonious there hurts, even though brevity is a virtue.
Format your code consistentently.
- Sometimes you have space around binary operators (other than comma), sometimes you don't.
- Also add a space after
#include
,for
,if
, andwhile
. - You don't have to use a block for single statements, but please don't put them on the same line, at the least separate them with a space.
C++ does not have VLAs. Just use a
std::vector
instead.While in general, keeping the scope of a variable minimal is a good idea, there are exceptions. One of them is efficiency. Always spinning up a new
std::string
means being unable to re-use the buffer. Not that you really need the whole string at once, anyway.One reason you should minimise a variables scope, is that it allows you to initialise it to the proper value, instead of leaving it uninitialised or, horrors of horrors, adding a spurious dummy-initialisation.
That also allows you to avoid writing the type (Almost Alwaysauto
), and making itconst
or evenconstexpr
.There are compound-assignment-operators for most binary operators. Like
a |= b
fora = a | b
. Using them leads to shorter, more readable code.When you want to output a single character, why not use a character-literal? It's potentially even slightly more efficient.
return 0;
is implicit formain()
. Take it or leave it, but be aware.
The Algorithm:
Your algorithm uses $O(#characters+#words^2)$ time and $O(max_word_length + #words)$ space.
An optimal algorithm only needs $O(#characters+2^{#vowels})$ time and $O(1)$ space:
- For every word:
- Set the bits for all the vowels contained.
- Increment the count on the indicated bin.
- For every vowel:
- Iterate the bins containing words with that vowel.
- Add the count to the respective bin without that vowel.
- For all bins:
Multiply the count in the bin with the count in the complementary bin, and add that. - Subtract twice the count in the bin for words containing all vowels. Those account for all self-pairings.
- The answer is half the calculated number, as we counted double.
Exercise for the attentive reader: Save half the multiplications this algorithm uses.
$endgroup$
add a comment |
$begingroup$
Your Code:
Don't use
<bits/stdc++.h>
, it's unportable and inefficient. See "Why should I not #include <bits/stdc++.h>?" for the details.Don't use
using namespace std;
, that namespace just isn't designed for it. See "Why is “using namespace std” considered bad practice?" for more background.Invest in some better names. Being too parsimonious there hurts, even though brevity is a virtue.
Format your code consistentently.
- Sometimes you have space around binary operators (other than comma), sometimes you don't.
- Also add a space after
#include
,for
,if
, andwhile
. - You don't have to use a block for single statements, but please don't put them on the same line, at the least separate them with a space.
C++ does not have VLAs. Just use a
std::vector
instead.While in general, keeping the scope of a variable minimal is a good idea, there are exceptions. One of them is efficiency. Always spinning up a new
std::string
means being unable to re-use the buffer. Not that you really need the whole string at once, anyway.One reason you should minimise a variables scope, is that it allows you to initialise it to the proper value, instead of leaving it uninitialised or, horrors of horrors, adding a spurious dummy-initialisation.
That also allows you to avoid writing the type (Almost Alwaysauto
), and making itconst
or evenconstexpr
.There are compound-assignment-operators for most binary operators. Like
a |= b
fora = a | b
. Using them leads to shorter, more readable code.When you want to output a single character, why not use a character-literal? It's potentially even slightly more efficient.
return 0;
is implicit formain()
. Take it or leave it, but be aware.
The Algorithm:
Your algorithm uses $O(#characters+#words^2)$ time and $O(max_word_length + #words)$ space.
An optimal algorithm only needs $O(#characters+2^{#vowels})$ time and $O(1)$ space:
- For every word:
- Set the bits for all the vowels contained.
- Increment the count on the indicated bin.
- For every vowel:
- Iterate the bins containing words with that vowel.
- Add the count to the respective bin without that vowel.
- For all bins:
Multiply the count in the bin with the count in the complementary bin, and add that. - Subtract twice the count in the bin for words containing all vowels. Those account for all self-pairings.
- The answer is half the calculated number, as we counted double.
Exercise for the attentive reader: Save half the multiplications this algorithm uses.
$endgroup$
Your Code:
Don't use
<bits/stdc++.h>
, it's unportable and inefficient. See "Why should I not #include <bits/stdc++.h>?" for the details.Don't use
using namespace std;
, that namespace just isn't designed for it. See "Why is “using namespace std” considered bad practice?" for more background.Invest in some better names. Being too parsimonious there hurts, even though brevity is a virtue.
Format your code consistentently.
- Sometimes you have space around binary operators (other than comma), sometimes you don't.
- Also add a space after
#include
,for
,if
, andwhile
. - You don't have to use a block for single statements, but please don't put them on the same line, at the least separate them with a space.
C++ does not have VLAs. Just use a
std::vector
instead.While in general, keeping the scope of a variable minimal is a good idea, there are exceptions. One of them is efficiency. Always spinning up a new
std::string
means being unable to re-use the buffer. Not that you really need the whole string at once, anyway.One reason you should minimise a variables scope, is that it allows you to initialise it to the proper value, instead of leaving it uninitialised or, horrors of horrors, adding a spurious dummy-initialisation.
That also allows you to avoid writing the type (Almost Alwaysauto
), and making itconst
or evenconstexpr
.There are compound-assignment-operators for most binary operators. Like
a |= b
fora = a | b
. Using them leads to shorter, more readable code.When you want to output a single character, why not use a character-literal? It's potentially even slightly more efficient.
return 0;
is implicit formain()
. Take it or leave it, but be aware.
The Algorithm:
Your algorithm uses $O(#characters+#words^2)$ time and $O(max_word_length + #words)$ space.
An optimal algorithm only needs $O(#characters+2^{#vowels})$ time and $O(1)$ space:
- For every word:
- Set the bits for all the vowels contained.
- Increment the count on the indicated bin.
- For every vowel:
- Iterate the bins containing words with that vowel.
- Add the count to the respective bin without that vowel.
- For all bins:
Multiply the count in the bin with the count in the complementary bin, and add that. - Subtract twice the count in the bin for words containing all vowels. Those account for all self-pairings.
- The answer is half the calculated number, as we counted double.
Exercise for the attentive reader: Save half the multiplications this algorithm uses.
edited 3 hours ago
answered 8 hours ago
DeduplicatorDeduplicator
11.5k1850
11.5k1850
add a comment |
add a comment |
AnonymousJoe is a new contributor. Be nice, and check out our Code of Conduct.
AnonymousJoe is a new contributor. Be nice, and check out our Code of Conduct.
AnonymousJoe is a new contributor. Be nice, and check out our Code of Conduct.
AnonymousJoe is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f214644%2fcount-number-of-pairs-of-strings-which-on-joining-contain-all-vowels%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
Welcome! As it's written, your code is a little difficult to understand, particularly with the bit shifting. My suggestion is to use more informative variable names. Short one-letter variable names are good for loops and writing quick code, but they make it very difficult for others (even yourself) to review your code later on. I'd also recommend moving some of the code out of main and into functions.
$endgroup$
– AleksandrH
12 hours ago
$begingroup$
Is the "T" variable the number of tests, or something else? I don't see it mentioned in the description of the problem.
$endgroup$
– Austin Hastings
9 hours ago
$begingroup$
@AustinHastings yes, T is the number of testcases. Can you help me solve this ?
$endgroup$
– AnonymousJoe
9 hours ago
1
$begingroup$
@AleksandrH Please put all suggestions for improvements in answers, not comments.
$endgroup$
– 200_success
7 hours ago