Find a first non NaN value in Pandas

I have a Pandas dataframe such that

|user_id|value|No|

|:-:|:-:|:-:|

|id1|100|1|

|id1|200|2|

|id1|250|3|

|id2|NaN|1|

|id2|100|2|

|id3|400|1|

|id3|NaN|2|

|id3|200|3|

|id4|NaN|1|

|id4|NaN|2|

|id4|300|3|.

Then I want the folloing dataset:

|user_id|value|No|NewNo|

|:-:|:-:|:-:|:-:|

|id1|100|1|1|

|id1|200|2|2|

|id1|250|3|3|

|id2|100|2|1|

|id3|400|1|1|

|id3|NaN|2|2|

|id3|200|3|3|

|id4|300|3|1|

namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.

edited Jan 22 at 10:40

jpp

100k2161111

asked Jan 22 at 1:58

s_narisawa

migrated from superuser.com Jan 22 at 2:11

This question came from our site for computer enthusiasts and power users.

add a comment |

I have a Pandas dataframe such that

|user_id|value|No|

|:-:|:-:|:-:|

|id1|100|1|

|id1|200|2|

|id1|250|3|

|id2|NaN|1|

|id2|100|2|

|id3|400|1|

|id3|NaN|2|

|id3|200|3|

|id4|NaN|1|

|id4|NaN|2|

|id4|300|3|.

Then I want the folloing dataset:

|user_id|value|No|NewNo|

|:-:|:-:|:-:|:-:|

|id1|100|1|1|

|id1|200|2|2|

|id1|250|3|3|

|id2|100|2|1|

|id3|400|1|1|

|id3|NaN|2|2|

|id3|200|3|3|

|id4|300|3|1|

namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.

edited Jan 22 at 10:40

jpp

100k2161111

asked Jan 22 at 1:58

s_narisawa

migrated from superuser.com Jan 22 at 2:11

This question came from our site for computer enthusiasts and power users.

add a comment |

I have a Pandas dataframe such that

|user_id|value|No|

|:-:|:-:|:-:|

|id1|100|1|

|id1|200|2|

|id1|250|3|

|id2|NaN|1|

|id2|100|2|

|id3|400|1|

|id3|NaN|2|

|id3|200|3|

|id4|NaN|1|

|id4|NaN|2|

|id4|300|3|.

Then I want the folloing dataset:

|user_id|value|No|NewNo|

|:-:|:-:|:-:|:-:|

|id1|100|1|1|

|id1|200|2|2|

|id1|250|3|3|

|id2|100|2|1|

|id3|400|1|1|

|id3|NaN|2|2|

|id3|200|3|3|

|id4|300|3|1|

namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.

edited Jan 22 at 10:40

jpp

100k2161111

asked Jan 22 at 1:58

s_narisawa

I have a Pandas dataframe such that

|user_id|value|No|

|:-:|:-:|:-:|

|id1|100|1|

|id1|200|2|

|id1|250|3|

|id2|NaN|1|

|id2|100|2|

|id3|400|1|

|id3|NaN|2|

|id3|200|3|

|id4|NaN|1|

|id4|NaN|2|

|id4|300|3|.

Then I want the folloing dataset:

|user_id|value|No|NewNo|

|:-:|:-:|:-:|:-:|

|id1|100|1|1|

|id1|200|2|2|

|id1|250|3|3|

|id2|100|2|1|

|id3|400|1|1|

|id3|NaN|2|2|

|id3|200|3|3|

|id4|300|3|1|

namely, I want to delete NaN values such that the first value of user_id is not NaN value. Thank you.

python pandas pandas-groupby

edited Jan 22 at 10:40

jpp

100k2161111

asked Jan 22 at 1:58

s_narisawa

edited Jan 22 at 10:40

jpp

100k2161111

asked Jan 22 at 1:58

s_narisawa

edited Jan 22 at 10:40

jpp

100k2161111

edited Jan 22 at 10:40

jpp

100k2161111

edited Jan 22 at 10:40

jpp

100k2161111

asked Jan 22 at 1:58

s_narisawa

asked Jan 22 at 1:58

s_narisawa

asked Jan 22 at 1:58

s_narisawa

migrated from superuser.com Jan 22 at 2:11

This question came from our site for computer enthusiasts and power users.

migrated from superuser.com Jan 22 at 2:11

This question came from our site for computer enthusiasts and power users.

add a comment |

2 Answers
2

active

oldest

votes

you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null

df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()

# application of copy here creates a new data frame and allows us to assign

# values to the result (df2). This is needed to create the column `NewNo` 

# in the next & final step

# df2 outputs:

   user_id  value  No

0    'id1'  100.0   1

1    'id1'  200.0   2

2    'id1'  250.0   3

4    'id2'  100.0   2

5    'id3'  400.0   1

6    'id3'    NaN   2

7    'id3'  200.0   3

10   'id4'  300.0   3

Generate NewNo column using ranking within the group.

df2['NewNo'] = df2.groupby('user_id').No.rank()



# df2 outputs:



   user_id  value  No  NewNo

0    'id1'  100.0   1    1.0

1    'id1'  200.0   2    2.0

2    'id1'  250.0   3    3.0

4    'id2'  100.0   2    1.0

5    'id3'  400.0   1    1.0

6    'id3'    NaN   2    2.0

7    'id3'  200.0   3    3.0

10   'id4'  300.0   3    1.0

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

add a comment |

`groupby` + `first_valid_index` + `cumcount`

You can calculate indices for first non-null values by group, then use Boolean indexing:

# use transform to align groupwise first_valid_index with dataframe

firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)



# apply Boolean filter

res = df[df.index >= firsts]



# use groupby + cumcount to add groupwise labels

res['NewNo'] = res.groupby('user_id').cumcount() + 1



print(res)



   user_id  value  No  NewNo

0      id1  100.0   1      1

1      id1  200.0   2      2

2      id1  250.0   3      3

4      id2  100.0   2      1

5      id3  400.0   1      1

6      id3    NaN   2      2

7      id3  200.0   3      3

10     id4  300.0   3      1

edited Jan 22 at 10:45

answered Jan 22 at 10:39

jpp

100k2161111

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54300303%2ffind-a-first-non-nan-value-in-pandas%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null

df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()

# application of copy here creates a new data frame and allows us to assign

# values to the result (df2). This is needed to create the column `NewNo` 

# in the next & final step

# df2 outputs:

   user_id  value  No

0    'id1'  100.0   1

1    'id1'  200.0   2

2    'id1'  250.0   3

4    'id2'  100.0   2

5    'id3'  400.0   1

6    'id3'    NaN   2

7    'id3'  200.0   3

10   'id4'  300.0   3

Generate NewNo column using ranking within the group.

df2['NewNo'] = df2.groupby('user_id').No.rank()



# df2 outputs:



   user_id  value  No  NewNo

0    'id1'  100.0   1    1.0

1    'id1'  200.0   2    2.0

2    'id1'  250.0   3    3.0

4    'id2'  100.0   2    1.0

5    'id3'  400.0   1    1.0

6    'id3'    NaN   2    2.0

7    'id3'  200.0   3    3.0

10   'id4'  300.0   3    1.0

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

add a comment |

you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null

df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()

# application of copy here creates a new data frame and allows us to assign

# values to the result (df2). This is needed to create the column `NewNo` 

# in the next & final step

# df2 outputs:

   user_id  value  No

0    'id1'  100.0   1

1    'id1'  200.0   2

2    'id1'  250.0   3

4    'id2'  100.0   2

5    'id3'  400.0   1

6    'id3'    NaN   2

7    'id3'  200.0   3

10   'id4'  300.0   3

Generate NewNo column using ranking within the group.

df2['NewNo'] = df2.groupby('user_id').No.rank()



# df2 outputs:



   user_id  value  No  NewNo

0    'id1'  100.0   1    1.0

1    'id1'  200.0   2    2.0

2    'id1'  250.0   3    3.0

4    'id2'  100.0   2    1.0

5    'id3'  400.0   1    1.0

6    'id3'    NaN   2    2.0

7    'id3'  200.0   3    3.0

10   'id4'  300.0   3    1.0

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

add a comment |

you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null

df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()

# application of copy here creates a new data frame and allows us to assign

# values to the result (df2). This is needed to create the column `NewNo` 

# in the next & final step

# df2 outputs:

   user_id  value  No

0    'id1'  100.0   1

1    'id1'  200.0   2

2    'id1'  250.0   3

4    'id2'  100.0   2

5    'id3'  400.0   1

6    'id3'    NaN   2

7    'id3'  200.0   3

10   'id4'  300.0   3

Generate NewNo column using ranking within the group.

df2['NewNo'] = df2.groupby('user_id').No.rank()



# df2 outputs:



   user_id  value  No  NewNo

0    'id1'  100.0   1    1.0

1    'id1'  200.0   2    2.0

2    'id1'  250.0   3    3.0

4    'id2'  100.0   2    1.0

5    'id3'  400.0   1    1.0

6    'id3'    NaN   2    2.0

7    'id3'  200.0   3    3.0

10   'id4'  300.0   3    1.0

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

you can groupby & forward fill the value column. Null values in the transformed data indicate the values from the start for each group that are null. Filter out the rows that are null

df2 = df[df.groupby('user_id').value.ffill().apply(pd.notnull)].copy()

# application of copy here creates a new data frame and allows us to assign

# values to the result (df2). This is needed to create the column `NewNo` 

# in the next & final step

# df2 outputs:

   user_id  value  No

0    'id1'  100.0   1

1    'id1'  200.0   2

2    'id1'  250.0   3

4    'id2'  100.0   2

5    'id3'  400.0   1

6    'id3'    NaN   2

7    'id3'  200.0   3

10   'id4'  300.0   3

Generate NewNo column using ranking within the group.

df2['NewNo'] = df2.groupby('user_id').No.rank()



# df2 outputs:



   user_id  value  No  NewNo

0    'id1'  100.0   1    1.0

1    'id1'  200.0   2    2.0

2    'id1'  250.0   3    3.0

4    'id2'  100.0   2    1.0

5    'id3'  400.0   1    1.0

6    'id3'    NaN   2    2.0

7    'id3'  200.0   3    3.0

10   'id4'  300.0   3    1.0

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

answered Jan 22 at 2:46

Haleemur Ali

12.3k21739

add a comment |

`groupby` + `first_valid_index` + `cumcount`

You can calculate indices for first non-null values by group, then use Boolean indexing:

# use transform to align groupwise first_valid_index with dataframe

firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)



# apply Boolean filter

res = df[df.index >= firsts]



# use groupby + cumcount to add groupwise labels

res['NewNo'] = res.groupby('user_id').cumcount() + 1



print(res)



   user_id  value  No  NewNo

0      id1  100.0   1      1

1      id1  200.0   2      2

2      id1  250.0   3      3

4      id2  100.0   2      1

5      id3  400.0   1      1

6      id3    NaN   2      2

7      id3  200.0   3      3

10     id4  300.0   3      1

edited Jan 22 at 10:45

answered Jan 22 at 10:39

jpp

100k2161111

add a comment |

`groupby` + `first_valid_index` + `cumcount`

You can calculate indices for first non-null values by group, then use Boolean indexing:

# use transform to align groupwise first_valid_index with dataframe

firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)



# apply Boolean filter

res = df[df.index >= firsts]



# use groupby + cumcount to add groupwise labels

res['NewNo'] = res.groupby('user_id').cumcount() + 1



print(res)



   user_id  value  No  NewNo

0      id1  100.0   1      1

1      id1  200.0   2      2

2      id1  250.0   3      3

4      id2  100.0   2      1

5      id3  400.0   1      1

6      id3    NaN   2      2

7      id3  200.0   3      3

10     id4  300.0   3      1

edited Jan 22 at 10:45

answered Jan 22 at 10:39

jpp

100k2161111

add a comment |

`groupby` + `first_valid_index` + `cumcount`

You can calculate indices for first non-null values by group, then use Boolean indexing:

# use transform to align groupwise first_valid_index with dataframe

firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)



# apply Boolean filter

res = df[df.index >= firsts]



# use groupby + cumcount to add groupwise labels

res['NewNo'] = res.groupby('user_id').cumcount() + 1



print(res)



   user_id  value  No  NewNo

0      id1  100.0   1      1

1      id1  200.0   2      2

2      id1  250.0   3      3

4      id2  100.0   2      1

5      id3  400.0   1      1

6      id3    NaN   2      2

7      id3  200.0   3      3

10     id4  300.0   3      1

edited Jan 22 at 10:45

answered Jan 22 at 10:39

jpp

100k2161111

`groupby` + `first_valid_index` + `cumcount`

You can calculate indices for first non-null values by group, then use Boolean indexing:

# use transform to align groupwise first_valid_index with dataframe

firsts = df.groupby('user_id')['value'].transform(pd.Series.first_valid_index)



# apply Boolean filter

res = df[df.index >= firsts]



# use groupby + cumcount to add groupwise labels

res['NewNo'] = res.groupby('user_id').cumcount() + 1



print(res)



   user_id  value  No  NewNo

0      id1  100.0   1      1

1      id1  200.0   2      2

2      id1  250.0   3      3

4      id2  100.0   2      1

5      id3  400.0   1      1

6      id3    NaN   2      2

7      id3  200.0   3      3

10     id4  300.0   3      1

edited Jan 22 at 10:45

answered Jan 22 at 10:39

jpp

100k2161111

edited Jan 22 at 10:45

answered Jan 22 at 10:39

jpp

100k2161111

answered Jan 22 at 10:39

jpp

100k2161111

answered Jan 22 at 10:39

jpp

100k2161111

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Ytdyklly