Performance: getting first value from comma delimited string
$begingroup$
I've got a string of values that are delimited by commas, like so:
$var = '1,23,45,123,145,200';
I'd like to get just the first value, so what I do is create an array from it and get the first element:
$first = current(explode(',', $var));
Fine enough. But this string can sometimes contain perhaps hundreds of values. Exploding it into an array and only using the first one seems kind of a waste. Is there a smarter alternative which is also more performant/less wasteful? I'm thinking some sort of regex or trimming, but I'm guessing that could be actually slower...
php performance strings array regex
$endgroup$
$begingroup$
+1 for not ignoring your gut feeling, and being reluctant to tackle this using regex. It's proof of sentient activity, some people lack
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 11:11
asked Oct 30 '13 at 10:01 by kasimir
4 Answers
$begingroup$
UPDATE:
A more complete benchmark script:
$start = $first = $str = null;//create vars, don't benchmark this
//time preg_match
$start = microtime(true);
$first = $str = implode(',', range(213,9999));
if (preg_match('/^[^,]+/', $str, $match))
{
$first = $match[0];
}
echo $first, PHP_EOL, microtime(true) - $start, ' time taken<br/>', PHP_EOL;
//time str* functions
$start = microtime(true);
$first = $str = implode(',', range(213,9999));
$first = substr($str, 0, strpos($str, ','));
echo $first, PHP_EOL, microtime(true) - $start, ' time taken<br/>', PHP_EOL;
//now explode + current
$first = null;
$start = microtime(true);
$str = implode(',', range(213, 9999));
$first = current(explode(',', $str));
echo $first, PHP_EOL, microtime(true) - $start, ' time taken';
The results varied a little, but after 100 runs, the averages came to:
#1 substr+strpos: ~.0022ms (baseline, relative time 1)
#2 preg_match: ~.0041 (relative time ~2, about twice as slow as #1)
#3 explode: ~.00789 (relative time ~4, about four times as slow as #1 and twice as slow as the regex)
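One caveat about the script above: each timed block also builds the test string with implode(range(...)), so the figures include that setup cost. A minimal sketch of a variant that builds the string once and times only the extraction might look like this. The timeIt helper, the $candidates array and the repeat count of 1000 are assumptions made for illustration; they are not part of the original benchmark, and the numbers it prints will differ per machine and PHP version.

```php
<?php
// Build the test string once, outside of any timed region.
$str = implode(',', range(213, 9999));
$iterations = 1000; // assumed repeat count, tune as needed

/** Hypothetical helper: total seconds spent on $iterations calls of $fn($str). */
function timeIt(callable $fn, string $str, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($str);
    }
    return microtime(true) - $start;
}

// The three approaches from the answer, wrapped so they can be timed alike.
$candidates = [
    'substr+strpos' => function (string $s): string {
        return substr($s, 0, strpos($s, ','));
    },
    'preg_match' => function (string $s): string {
        return preg_match('/^[^,]+/', $s, $m) ? $m[0] : $s;
    },
    'explode' => function (string $s): string {
        return current(explode(',', $s));
    },
];

foreach ($candidates as $name => $fn) {
    printf("%-14s %.6f s%s", $name, timeIt($fn, $str, $iterations), PHP_EOL);
}
```

Wrapping each approach in a closure adds the same call overhead to all three, so the relative ordering should still be comparable.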
You're absolutely right: exploding a string and constructing an array just to get the first value is a waste of resources, and it is not the fastest way to get what you want.
Some might run to regex for help, and chances are that, in your case, that will be faster than explode. But nothing I can think of will beat the speed of PHP's string functions (which are very close to the C string functions). I'd do this:
$first = substr($var, 0, strpos($var, ','));
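If the comma isn't present (say $var = '123'), then your current approach will assign 123 to $first, whereas the one-liner above yields an empty string, because strpos returns false and substr treats that as a length of 0. To preserve this behaviour, I'd go for: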
$first = strpos($var, ',') === false ? $var : substr($var, 0, strpos($var, ','));
This is to say: if strpos returns false, then there is no comma at all, so assign the entire string to $first; else get everything in front of the first comma.
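The same logic can be sketched as a small function that calls strpos only once and reuses the result. The name firstValue is hypothetical, and the type declarations assume PHP 7 or later (the answer itself was tested on 5.3).

```php
<?php
/**
 * Hypothetical helper: return the text before the first comma,
 * or the whole string when it contains no comma.
 */
function firstValue(string $csv): string
{
    $pos = strpos($csv, ','); // int offset, or false when there is no comma
    return $pos === false ? $csv : substr($csv, 0, $pos);
}

var_dump(firstValue('1,23,45,123,145,200')); // string(1) "1"
var_dump(firstValue('123'));                 // string(3) "123"
var_dump(firstValue(''));                    // string(0) ""
```

The strict comparison with false matters: a comma at offset 0 is a valid match and must not be mistaken for "not found".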
For completeness' sake (and after some initial benchmarking), using preg_match did indeed prove to be faster than using explode with large strings ($var = implode(',', range(1, 9999));), when using this code:
$first = $var = implode(',', range(1,9999));
if (preg_match('/^[^,]*/',$var, $match))
{
$first = $match[0];
}
But honestly, I wouldn't use regex in this case.
In the interest of fairness, and to clarify how I found the regex to be faster:
$start = microtime(true);
$first = $str = implode(',', range(213,9999));
if (preg_match('/^[^,]+/', $str, $match))
{
$first = $match[0];
}
echo $first, PHP_EOL, $str, PHP_EOL, microtime(true) - $start, ' time taken';
$first = null;
$start = microtime(true);
$str = implode(',', range(213, 9999));
$first = current(explode(',', $str));
echo $first, PHP_EOL, microtime(true) - $start, ' time taken';
$endgroup$
$begingroup$
Thanks, I did some benchmarking myself and found that both the string and regex solutions are a lot faster than exploding. They are about the same when using this exact code; when I dropped the ternary notation from the string solution, I found it was actually faster than the regex by about 20% (because ternary uses copy-on-write). So I think I'll use that. As you said: nothing beats the speed of PHP's string functions.
$endgroup$
– kasimir
Oct 30 '13 at 11:01
$begingroup$
@kasimir: I've edited my answer some more, adding my benchmark code and its results (over 100 runs). I found the string functions to be twice as fast as regex, though I did run it on a VM and didn't check how I had configured PHP (it's been ages, and it's still running 5.3). But if my answer answered your question, would you mind awfully accepting (and/or upvoting) it?
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 11:09
$begingroup$
Don't worry... Ok, so string function is definitely the fastest, great! Also, the regex solution is the 'ugliest' in my opinion, kind of obscuring what you are doing.
$endgroup$
– kasimir
Oct 30 '13 at 13:15
$begingroup$
Along the same lines: did you know $count = substr_count($var, ',') + 1; is a lot faster than $count = count(explode(',', $var));? codepad.org/KGqtWbxO
$endgroup$
– kasimir
Oct 30 '13 at 13:29
$begingroup$
@Kasimir: It ought to be... I would've been surprised if it wasn't. Iterating through a char array, evaluating j += char[i] == 44 ? 1 : 0 each time, just has to be faster than iterating through that string and copying every chunk of data that is not a comma to a new array, only to count the chunks. Both operations start by doing the same thing; the difference is the copying, which isn't done in the first case. Good of you to check, though. I'd +1 you again for not taking assumptions for granted, but actually bothering to check those things. A commendable attitude.
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 13:59
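The comparison made in the comments above can be sketched as follows. The variable names and the hand-written loop are illustrative assumptions; the loop is only there to show what a plain scan for commas does, and it is not how PHP implements substr_count internally.

```php
<?php
$var = '1,23,45,123,145,200';

// Counting delimiters only scans the string; no substrings are copied.
$viaCount = substr_count($var, ',') + 1;

// Splitting builds an array of substrings just so they can be counted.
$viaExplode = count(explode(',', $var));

// Hand-written version of the scan, for illustration only.
$manual = 1;
for ($i = 0, $len = strlen($var); $i < $len; $i++) {
    if ($var[$i] === ',') {
        $manual++;
    }
}

var_dump($viaCount, $viaExplode, $manual); // int(6) three times
```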
$begingroup$
I am not familiar with PHP syntax, but I hope you could do this:
$var = "1,23,45,123,145,200";
$first_word = substr($var, 0, strpos($var, ','));
$endgroup$
$begingroup$
I am shocked that the first three approaches that came to mind didn't even get considered/mentioned/tested!
In reverse order of my preference...
Casting the string to an integer, the least robust because it only works on integers and handles an empty string poorly: (Demo)
$tests = ['1,23,45,123,145,200', '345,999,0,1', '0', '0,2', '-1,-122', '', '1.5,2.9'];
foreach ($tests as $test) {
var_export((int)$test);
echo "\n";
}
// 1
// 345
// 0
// 0
// -1
// 0
// 1
strstr() with the before_needle parameter: (Demo)
// same $tests array as above
foreach ($tests as $test) {
    $before_comma = strstr($test, ',', true); // false when there is no comma
    var_export($before_comma === false ? $test : $before_comma);
    echo "\n";
}
// '1'
// '345'
// '0'
// '0'
// '-1'
// ''
// '1.5'
explode() with a limit parameter: (Demo)
// same $tests array as above
foreach ($tests as $test) {
    var_export(explode(',', $test, 2)[0]);
    echo "\n";
}
// '1'
// '345'
// '0'
// '0'
// '-1'
// ''
// '1.5'
While I don't like the idea of creating an array to extract a string value of the first element, it is a single call solution. Setting an element limit means that function isn't asked to do heaps of unnecessary labor.
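To illustrate the effect of the limit, here is a small sketch; the variable names are only for illustration. With a limit of 2, explode() stops splitting after the first comma and returns the untouched remainder as the second element.

```php
<?php
$var = implode(',', range(1, 9999));

$all   = explode(',', $var);    // one element per value
$first = explode(',', $var, 2); // at most two elements

var_dump(count($all));   // int(9999)
var_dump(count($first)); // int(2)
var_dump($first[0]);     // string(1) "1"
// $first[1] holds the rest of the string, starting at "2,3,..."
```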
I am a big fan of strstr(), but it must involve a conditional to properly handle strings that contain no comma (including an empty string), because it returns false in that case.
If your comma-separated string is never empty and only contains integers, I would strongly recommend the (int) approach; surely that is the fastest.
As much as I love regex, I would not entertain the use of a preg_ call -- not even for a second.
$endgroup$
$begingroup$
for me the best way is:
$str = "Eeny, meeny, miny, moe";
$first = explode(',', $str)[0];
$endgroup$
$begingroup$
Great! But that's not a review. Reviews contain at least one insightful remark about the code provided. Why is your code better than the code provided? What problems does the original code have that yours does not?
$endgroup$
– Mast
May 23 '18 at 18:16
$begingroup$
This is pretty much exactly the same as the solution I did not want to use and my question was all about...
$endgroup$
– kasimir
May 24 '18 at 15:42
answered Oct 30 '13 at 10:35, edited Oct 30 '13 at 11:05, by Elias Van Ootegem (first answer)
$begingroup$
Thanks, I did some benchmarking myself and found that both string and regex solutions are a lot faster than exploding. They are about the same when using this exact code, when losing the ternary notation for the string solution, I found it was actually faster than the regex by about 20% (because ternary uses copy-on-write). So I think I'll use that. As you said: nothing beats the speed of PHP's string functions.
$endgroup$
– kasimir
Oct 30 '13 at 11:01
$begingroup$
@kasimir: I've edited my answer some more, adding my benchmark code, and its results (over 100 runs). I found the string functions to be twice as fast as regex. Though I did run it on a VM, and didn't check how I had configured PHP (it's been ages, and still running 5.3). But if my answer answered your question, would you mind awfully accepting (and or upvoting) it?
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 11:09
$begingroup$
Don't worry... Ok, so string function is definitely the fastest, great! Also, the regex solution is the 'ugliest' in my opinion, kind of obscuring what you are doing.
$endgroup$
– kasimir
Oct 30 '13 at 13:15
1
$begingroup$
Along the same lines: did you know$count = substr_count($var, ',') + 1;
is a lot faster than$count = count(explode(',', $var));
? codepad.org/KGqtWbxO
$endgroup$
– kasimir
Oct 30 '13 at 13:29
$begingroup$
@Kasimir: It ought to be... I would've been surprized if it wasn't. Iterating through achar
comparingj += char[i] == 44 ? 1 : 0
each time just has to be faster than iterating through that string, copying every chunk of data that is not a comma to a new array, only to count the chunks just cannot be as fast, because both operations start by doing the same thing, the difference is the copying, which isn't done in the first case. Good of you to check, though. I'd +1 you again for not taking assumptions for granted, but actually bother checking those things. A commendable attitude
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 13:59
add a comment |
$begingroup$
Thanks, I did some benchmarking myself and found that both string and regex solutions are a lot faster than exploding. They are about the same when using this exact code, when losing the ternary notation for the string solution, I found it was actually faster than the regex by about 20% (because ternary uses copy-on-write). So I think I'll use that. As you said: nothing beats the speed of PHP's string functions.
$endgroup$
– kasimir
Oct 30 '13 at 11:01
$begingroup$
@kasimir: I've edited my answer some more, adding my benchmark code, and its results (over 100 runs). I found the string functions to be twice as fast as regex. Though I did run it on a VM, and didn't check how I had configured PHP (it's been ages, and still running 5.3). But if my answer answered your question, would you mind awfully accepting (and or upvoting) it?
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 11:09
$begingroup$
Don't worry... Ok, so string function is definitely the fastest, great! Also, the regex solution is the 'ugliest' in my opinion, kind of obscuring what you are doing.
$endgroup$
– kasimir
Oct 30 '13 at 13:15
1
$begingroup$
Along the same lines: did you know$count = substr_count($var, ',') + 1;
is a lot faster than$count = count(explode(',', $var));
? codepad.org/KGqtWbxO
$endgroup$
– kasimir
Oct 30 '13 at 13:29
$begingroup$
@Kasimir: It ought to be... I would've been surprized if it wasn't. Iterating through achar
comparingj += char[i] == 44 ? 1 : 0
each time just has to be faster than iterating through that string, copying every chunk of data that is not a comma to a new array, only to count the chunks just cannot be as fast, because both operations start by doing the same thing, the difference is the copying, which isn't done in the first case. Good of you to check, though. I'd +1 you again for not taking assumptions for granted, but actually bother checking those things. A commendable attitude
$endgroup$
– Elias Van Ootegem
Oct 30 '13 at 13:59
$begingroup$
I am not familiar with PHP syntax, but I hope you could do this:
$var = "1,23,45,123,145,200";
$first_word = substr($var, 0, strpos($var, ','));
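One caveat with this approach, shown as a minimal sketch: when the string contains no comma, strpos() returns false, and substr($var, 0, false) then yields an empty string instead of the single value. The helper name first_before_comma is hypothetical and not part of the original answer, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the answer above): returns the text before
// the first comma, or the whole string when there is no comma at all.
function first_before_comma(string $var): string
{
    $pos = strpos($var, ',');
    // strpos() returns false when the needle is absent; compare strictly,
    // because a comma at offset 0 is a valid (falsy) position.
    return $pos === false ? $var : substr($var, 0, $pos);
}

var_dump(first_before_comma('1,23,45,123,145,200')); // string(1) "1"
var_dump(first_before_comma('42'));                  // string(2) "42"
var_dump(first_before_comma(',5'));                  // string(0) ""
```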
$endgroup$
answered Oct 30 '13 at 10:33
Kinjal
$begingroup$
I am shocked that the first three approaches that came to mind didn't even get considered/mentioned/tested!
In reverse order of my preference...
Casting the string as an integer is the least robust option, because it only works on integers and handles an empty string poorly: (Demo)
$tests = ['1,23,45,123,145,200', '345,999,0,1', '0', '0,2', '-1,-122', '', '1.5,2.9'];
foreach ($tests as $test) {
var_export((int)$test);
echo "\n";
}
// 1
// 345
// 0
// 0
// -1
// 0
// 1
strstr() with the before_needle parameter: (Demo)
// same tests
$before_comma = strstr($test, ',', true);
var_export($before_comma === false ? $test : $before_comma);
// '1'
// '345'
// '0'
// '0'
// '-1'
// ''
// '1.5'
explode() with a limit parameter: (Demo)
// same tests
var_export(explode(',', $test, 2)[0]);
// '1'
// '345'
// '0'
// '0'
// '-1'
// ''
// '1.5'
While I don't like the idea of creating an array to extract a string value of the first element, it is a single call solution. Setting an element limit means that the function isn't asked to do heaps of unnecessary labor.
I am a big fan of strstr(), but it must involve a conditional to properly handle strings that contain no comma, such as an empty string.
If your comma-separated string is never empty and only contains integers, I would strongly recommend the (int) approach; surely that is the fastest.
As much as I love regex, I would not entertain the use of a preg_ call -- not even for a second.
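For illustration, the two string-returning approaches above can be wrapped in small functions and checked against each other. This is only a sketch: the function names are hypothetical, and the array dereference on a function call assumes PHP 5.4 or later.

```php
<?php
// Hypothetical wrappers around the two approaches shown above.
function first_via_strstr($csv)
{
    // With before_needle = true, strstr() returns the part before the comma,
    // or false when the string contains no comma.
    $before = strstr($csv, ',', true);
    return $before === false ? $csv : $before;
}

function first_via_explode($csv)
{
    // A limit of 2 stops splitting after the first comma.
    $parts = explode(',', $csv, 2);
    return $parts[0];
}

// Each line prints true when both wrappers agree for that input.
foreach (['1,23,45', '0', '', '1.5,2.9'] as $test) {
    var_export(first_via_strstr($test) === first_via_explode($test));
    echo "\n";
}
```

Keeping the conditional inside a wrapper means callers do not have to remember the false return value of strstr().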
$endgroup$
answered 6 mins ago
mickmackusa
$begingroup$
For me, the best way is:
$str = "Eeny, meeny, miny, moe";
$first = explode(',', $str)[0];
$endgroup$
answered May 23 '18 at 18:10
Sal Celli
1
$begingroup$
Great! But that's not a review. Reviews contain at least one insightful remark about the code provided. Why is your code better than the code provided? What problems does the original code have that yours does not?
$endgroup$
– Mast
May 23 '18 at 18:16
$begingroup$
This is pretty much exactly the same as the solution I did not want to use and my question was all about...
$endgroup$
– kasimir
May 24 '18 at 15:42