Find a first non NaN value in Pandas
I have a Pandas dataframe such that
|user_id|value|No|
|:-:|:-:|:-:|
|id1|100|1|
|id1|200|2|
|id1|250|3|
|id2|NaN|1|
|id2|100|2|
|id3|400|1|
|id3|NaN|2|
|id3|200|3|
|id4|NaN|1|
|id4|NaN|2|
|id4|300|3|.
Then I want the folloing dataset:
|user_id|value|No|NewNo|
|:-:|:-:|:-:|:-:|
|id1|100|1|1|
|id1|200|2|2|
|id1|250|3|3|
|id2|100|2|1|
|id3|400|1|1|
|id3|NaN|2|2|
|id3|200|3|3|
|id4|300|3|1|
namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.
python pandas pandas-groupby
migrated from superuser.com Jan 22 at 2:11
This question came from our site for computer enthusiasts and power users.
add a comment |
I have a Pandas dataframe such that
|user_id|value|No|
|:-:|:-:|:-:|
|id1|100|1|
|id1|200|2|
|id1|250|3|
|id2|NaN|1|
|id2|100|2|
|id3|400|1|
|id3|NaN|2|
|id3|200|3|
|id4|NaN|1|
|id4|NaN|2|
|id4|300|3|.
Then I want the folloing dataset:
|user_id|value|No|NewNo|
|:-:|:-:|:-:|:-:|
|id1|100|1|1|
|id1|200|2|2|
|id1|250|3|3|
|id2|100|2|1|
|id3|400|1|1|
|id3|NaN|2|2|
|id3|200|3|3|
|id4|300|3|1|
namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.
python pandas pandas-groupby
migrated from superuser.com Jan 22 at 2:11
This question came from our site for computer enthusiasts and power users.
add a comment |
I have a Pandas dataframe such that
|user_id|value|No|
|:-:|:-:|:-:|
|id1|100|1|
|id1|200|2|
|id1|250|3|
|id2|NaN|1|
|id2|100|2|
|id3|400|1|
|id3|NaN|2|
|id3|200|3|
|id4|NaN|1|
|id4|NaN|2|
|id4|300|3|.
Then I want the folloing dataset:
|user_id|value|No|NewNo|
|:-:|:-:|:-:|:-:|
|id1|100|1|1|
|id1|200|2|2|
|id1|250|3|3|
|id2|100|2|1|
|id3|400|1|1|
|id3|NaN|2|2|
|id3|200|3|3|
|id4|300|3|1|
namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.
python pandas pandas-groupby
I have a Pandas dataframe such that
|user_id|value|No|
|:-:|:-:|:-:|
|id1|100|1|
|id1|200|2|
|id1|250|3|
|id2|NaN|1|
|id2|100|2|
|id3|400|1|
|id3|NaN|2|
|id3|200|3|
|id4|NaN|1|
|id4|NaN|2|
|id4|300|3|.
Then I want the folloing dataset:
|user_id|value|No|NewNo|
|:-:|:-:|:-:|:-:|
|id1|100|1|1|
|id1|200|2|2|
|id1|250|3|3|
|id2|100|2|1|
|id3|400|1|1|
|id3|NaN|2|2|
|id3|200|3|3|
|id4|300|3|1|
namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.
python pandas pandas-groupby
python pandas pandas-groupby
edited Jan 22 at 10:40
jpp
100k2161111
100k2161111
asked Jan 22 at 1:58
s_narisawas_narisawa
62
62
migrated from superuser.com Jan 22 at 2:11
This question came from our site for computer enthusiasts and power users.
migrated from superuser.com Jan 22 at 2:11
This question came from our site for computer enthusiasts and power users.
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null
df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()
# application of copy here creates a new data frame and allows us to assign
# values to the result (df2). This is needed to create the column `NewNo`
# in the next & final step
# df2 outputs:
user_id value No
0 'id1' 100.0 1
1 'id1' 200.0 2
2 'id1' 250.0 3
4 'id2' 100.0 2
5 'id3' 400.0 1
6 'id3' NaN 2
7 'id3' 200.0 3
10 'id4' 300.0 3
Generate NewNo
column using ranking within the group.
df2['NewNo'] = df2.groupby('user_id').No.rank()
# df2 outputs:
user_id value No NewNo
0 'id1' 100.0 1 1.0
1 'id1' 200.0 2 2.0
2 'id1' 250.0 3 3.0
4 'id2' 100.0 2 1.0
5 'id3' 400.0 1 1.0
6 'id3' NaN 2 2.0
7 'id3' 200.0 3 3.0
10 'id4' 300.0 3 1.0
add a comment |
groupby
+ first_valid_index
+ cumcount
You can calculate indices for first non-null values by group, then use Boolean indexing:
# use transform to align groupwise first_valid_index with dataframe
firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)
# apply Boolean filter
res = df[df.index >= firsts]
# use groupby + cumcount to add groupwise labels
res['NewNo'] = res.groupby('user_id').cumcount() + 1
print(res)
user_id value No NewNo
0 id1 100.0 1 1
1 id1 200.0 2 2
2 id1 250.0 3 3
4 id2 100.0 2 1
5 id3 400.0 1 1
6 id3 NaN 2 2
7 id3 200.0 3 3
10 id4 300.0 3 1
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54300303%2ffind-a-first-non-nan-value-in-pandas%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null
df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()
# application of copy here creates a new data frame and allows us to assign
# values to the result (df2). This is needed to create the column `NewNo`
# in the next & final step
# df2 outputs:
user_id value No
0 'id1' 100.0 1
1 'id1' 200.0 2
2 'id1' 250.0 3
4 'id2' 100.0 2
5 'id3' 400.0 1
6 'id3' NaN 2
7 'id3' 200.0 3
10 'id4' 300.0 3
Generate NewNo
column using ranking within the group.
df2['NewNo'] = df2.groupby('user_id').No.rank()
# df2 outputs:
user_id value No NewNo
0 'id1' 100.0 1 1.0
1 'id1' 200.0 2 2.0
2 'id1' 250.0 3 3.0
4 'id2' 100.0 2 1.0
5 'id3' 400.0 1 1.0
6 'id3' NaN 2 2.0
7 'id3' 200.0 3 3.0
10 'id4' 300.0 3 1.0
add a comment |
you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null
df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()
# application of copy here creates a new data frame and allows us to assign
# values to the result (df2). This is needed to create the column `NewNo`
# in the next & final step
# df2 outputs:
user_id value No
0 'id1' 100.0 1
1 'id1' 200.0 2
2 'id1' 250.0 3
4 'id2' 100.0 2
5 'id3' 400.0 1
6 'id3' NaN 2
7 'id3' 200.0 3
10 'id4' 300.0 3
Generate NewNo
column using ranking within the group.
df2['NewNo'] = df2.groupby('user_id').No.rank()
# df2 outputs:
user_id value No NewNo
0 'id1' 100.0 1 1.0
1 'id1' 200.0 2 2.0
2 'id1' 250.0 3 3.0
4 'id2' 100.0 2 1.0
5 'id3' 400.0 1 1.0
6 'id3' NaN 2 2.0
7 'id3' 200.0 3 3.0
10 'id4' 300.0 3 1.0
add a comment |
you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null
df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()
# application of copy here creates a new data frame and allows us to assign
# values to the result (df2). This is needed to create the column `NewNo`
# in the next & final step
# df2 outputs:
user_id value No
0 'id1' 100.0 1
1 'id1' 200.0 2
2 'id1' 250.0 3
4 'id2' 100.0 2
5 'id3' 400.0 1
6 'id3' NaN 2
7 'id3' 200.0 3
10 'id4' 300.0 3
Generate NewNo
column using ranking within the group.
df2['NewNo'] = df2.groupby('user_id').No.rank()
# df2 outputs:
user_id value No NewNo
0 'id1' 100.0 1 1.0
1 'id1' 200.0 2 2.0
2 'id1' 250.0 3 3.0
4 'id2' 100.0 2 1.0
5 'id3' 400.0 1 1.0
6 'id3' NaN 2 2.0
7 'id3' 200.0 3 3.0
10 'id4' 300.0 3 1.0
you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null
df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()
# application of copy here creates a new data frame and allows us to assign
# values to the result (df2). This is needed to create the column `NewNo`
# in the next & final step
# df2 outputs:
user_id value No
0 'id1' 100.0 1
1 'id1' 200.0 2
2 'id1' 250.0 3
4 'id2' 100.0 2
5 'id3' 400.0 1
6 'id3' NaN 2
7 'id3' 200.0 3
10 'id4' 300.0 3
Generate NewNo
column using ranking within the group.
df2['NewNo'] = df2.groupby('user_id').No.rank()
# df2 outputs:
user_id value No NewNo
0 'id1' 100.0 1 1.0
1 'id1' 200.0 2 2.0
2 'id1' 250.0 3 3.0
4 'id2' 100.0 2 1.0
5 'id3' 400.0 1 1.0
6 'id3' NaN 2 2.0
7 'id3' 200.0 3 3.0
10 'id4' 300.0 3 1.0
answered Jan 22 at 2:46
Haleemur AliHaleemur Ali
12.3k21739
12.3k21739
add a comment |
add a comment |
groupby
+ first_valid_index
+ cumcount
You can calculate indices for first non-null values by group, then use Boolean indexing:
# use transform to align groupwise first_valid_index with dataframe
firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)
# apply Boolean filter
res = df[df.index >= firsts]
# use groupby + cumcount to add groupwise labels
res['NewNo'] = res.groupby('user_id').cumcount() + 1
print(res)
user_id value No NewNo
0 id1 100.0 1 1
1 id1 200.0 2 2
2 id1 250.0 3 3
4 id2 100.0 2 1
5 id3 400.0 1 1
6 id3 NaN 2 2
7 id3 200.0 3 3
10 id4 300.0 3 1
add a comment |
groupby
+ first_valid_index
+ cumcount
You can calculate indices for first non-null values by group, then use Boolean indexing:
# use transform to align groupwise first_valid_index with dataframe
firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)
# apply Boolean filter
res = df[df.index >= firsts]
# use groupby + cumcount to add groupwise labels
res['NewNo'] = res.groupby('user_id').cumcount() + 1
print(res)
user_id value No NewNo
0 id1 100.0 1 1
1 id1 200.0 2 2
2 id1 250.0 3 3
4 id2 100.0 2 1
5 id3 400.0 1 1
6 id3 NaN 2 2
7 id3 200.0 3 3
10 id4 300.0 3 1
add a comment |
groupby
+ first_valid_index
+ cumcount
You can calculate indices for first non-null values by group, then use Boolean indexing:
# use transform to align groupwise first_valid_index with dataframe
firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)
# apply Boolean filter
res = df[df.index >= firsts]
# use groupby + cumcount to add groupwise labels
res['NewNo'] = res.groupby('user_id').cumcount() + 1
print(res)
user_id value No NewNo
0 id1 100.0 1 1
1 id1 200.0 2 2
2 id1 250.0 3 3
4 id2 100.0 2 1
5 id3 400.0 1 1
6 id3 NaN 2 2
7 id3 200.0 3 3
10 id4 300.0 3 1
groupby
+ first_valid_index
+ cumcount
You can calculate indices for first non-null values by group, then use Boolean indexing:
# use transform to align groupwise first_valid_index with dataframe
firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)
# apply Boolean filter
res = df[df.index >= firsts]
# use groupby + cumcount to add groupwise labels
res['NewNo'] = res.groupby('user_id').cumcount() + 1
print(res)
user_id value No NewNo
0 id1 100.0 1 1
1 id1 200.0 2 2
2 id1 250.0 3 3
4 id2 100.0 2 1
5 id3 400.0 1 1
6 id3 NaN 2 2
7 id3 200.0 3 3
10 id4 300.0 3 1
edited Jan 22 at 10:45
answered Jan 22 at 10:39
jppjpp
100k2161111
100k2161111
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54300303%2ffind-a-first-non-nan-value-in-pandas%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown