Python: parallel code running slower than sequential version

$begingroup$
I have sequential code that counts, for each timestamp, the number of unique events whose time interval contains that timestamp. The sequential version is:

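import numpy

# a: list of 100 timestamps
# dataset: 2-D array; columns 0 and 1 bound each interval, column 2 is the event
number = []
for i in range(100):
    # Rows whose interval contains timestamp a[i] (both ends inclusive)
    indices = numpy.argwhere((a[i] >= dataset[:, 0]) & (a[i] <= dataset[:, 1]))[:, 0]
    # Count the distinct events among those rows
    number.append(len(set(dataset[indices, 2])))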
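To make the intended result concrete, here is a minimal, dependency-free sketch of the same counting on a made-up dataset. The function name count_unique_events, the rows, and the timestamps are all invented for illustration; the real dataset is a NumPy array.

```python
def count_unique_events(timestamps, rows):
    """For each timestamp, count distinct events whose interval contains it.

    rows is a list of (start, end, event) tuples. Both interval ends are
    inclusive, matching the >= and <= comparisons in the loop above.
    """
    counts = []
    for t in timestamps:
        # A set keeps each event once, even if several of its intervals match.
        events = {event for start, end, event in rows if start <= t <= end}
        counts.append(len(events))
    return counts


# Hypothetical data: (start, end, event)
rows = [
    (0, 10, "A"),
    (5, 15, "B"),
    (8, 12, "A"),   # a second interval for event A
    (20, 30, "C"),
]
timestamps = [9, 13, 25, 40]

counts = count_unique_events(timestamps, rows)
print(counts)  # → [2, 1, 1, 0]
```

Timestamp 9 falls inside two intervals of event A and one of event B, so it counts 2 unique events, not 3.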


Since the actual size of a is much larger, the sequential version is expected to take many days to complete. I therefore wrote a parallel version:

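import multiprocessing
import numpy
from joblib import Parallel, delayed

num_cores = multiprocessing.cpu_count()
inputs = range(100)

def processInput(i):
    # Count the distinct events whose interval contains timestamp a[i]
    indices = numpy.argwhere((a[i] >= dataset[:, 0]) & (a[i] <= dataset[:, 1]))[:, 0]
    return len(set(dataset[indices, 2]))

results = Parallel(n_jobs=num_cores)(delayed(processInput)(i) for i in inputs)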


Surprisingly, the sequential version takes 2 minutes on 100 elements, while the parallel version takes about 9 minutes. Why?
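One way to narrow this down is to time the work itself separately from the cost of serializing the data, since process-based parallelism generally has to copy inputs to the worker processes. The sketch below uses only the standard library and stand-in data; the helper names timed and count_at and the data sizes are assumptions, and nothing here establishes that serialization is the actual cause in my case.

```python
import pickle
import time


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call to func(*args)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def count_at(t, rows):
    """Count distinct events whose inclusive interval contains timestamp t."""
    return len({event for start, end, event in rows if start <= t <= end})


# Stand-in data (hypothetical): 200,000 intervals of length 50,
# cycling over 1,000 event ids.
rows = [(i, i + 50, i % 1000) for i in range(200_000)]

# Cost of answering one timestamp query.
count, compute_seconds = timed(count_at, 100, rows)

# Cost of serializing the whole dataset once, as a rough proxy for
# what shipping it to another process might involve.
payload, pickle_seconds = timed(pickle.dumps, rows)

print(f"one query: {compute_seconds:.4f} s, result {count}")
print(f"one pickle of the data: {pickle_seconds:.4f} s, {len(payload)} bytes")
```

If the pickling time is comparable to or larger than the per-query time, sending the data to workers once per task could outweigh any gain from running queries concurrently.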










$endgroup$

  • $begingroup$
    This code looks a lot like your previous question. However, here you are specifically asking why code performs the way it does, rather than asking for suggestions on how to improve the code, so your question is off-topic for Code Review, and should be asked on Stack Overflow.
    $endgroup$
    – 200_success
    15 mins ago

Tags: python, performance

asked 50 mins ago by shaifali Gupta