Calculate Ranked Probability Score
I have a CSV file that contains the outcome probabilities for a set of soccer matches. Each match can result in a win, a draw or a loss. I also included the actual outcome. To test how accurate my predictions are, I want to use the Ranked Probability Score (RPS). Basically, the RPS compares the cumulative probability distributions of the predictions and the outcome:
$ RPS = \frac{1}{r-1} \sum\limits_{i=1}^{r}\left(\sum\limits_{j=1}^i p_j - \sum\limits_{j=1}^i e_j \right)^2, $
where $r$ is the number of potential outcomes, and $p_j$ and
$e_j$ are the forecasts and observed outcomes at position $j$.
For additional information, see the following link.

    import numpy as np
    import pandas as pd

    def RPS(predictions, observed):
        ncat = 3
        npred = len(predictions)
        RPS = np.zeros(npred)
        for x in range(0, npred):
            obsvec = np.zeros(ncat)
            obsvec[observed.iloc[x]-1] = 1
            cumulative = 0
            for i in range(1, ncat):
                cumulative = cumulative + (sum(predictions.iloc[x, 1:i]) - sum(obsvec[1:i])) ** 2
            RPS[x] = (1/(ncat-1)) * cumulative
        return RPS

    df = pd.read_csv('test.csv', header=0)
    predictions = df[['H', 'D', 'L']]
    observed = df[['Outcome']]
    RPS = RPS(predictions, observed)

The first argument (predictions) is a matrix of predicted probabilities. Each row is one prediction, laid out in the order (H, D, L); each element is a probability and each row sums to 1. The second argument (observed) is a numeric vector that indicates which outcome was actually observed (1, 2 or 3).
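The layout described above can be checked before any score is computed. The following is only a sketch: `check_inputs` is a hypothetical helper that is not part of the original script, and it assumes the outcomes are coded 1 to 3 as stated.

```python
import numpy as np


def check_inputs(predictions, observed, ncat=3):
    """Validate the (H, D, L) layout; raise ValueError on violations.

    check_inputs is a hypothetical helper, not part of the original script.
    """
    probs = np.asarray(predictions, dtype=float)
    # ravel() also accepts a one-column DataFrame such as df[['Outcome']]
    outcomes = np.asarray(observed).ravel().astype(int)

    if probs.ndim != 2 or probs.shape[1] != ncat:
        raise ValueError(f"expected {ncat} probability columns")
    if len(outcomes) != len(probs):
        raise ValueError("predictions and observed differ in length")
    if ((probs < 0) | (probs > 1)).any():
        raise ValueError("probabilities must lie in [0, 1]")
    if not np.allclose(probs.sum(axis=1), 1.0):
        raise ValueError("each row of predictions must sum to 1")
    if ((outcomes < 1) | (outcomes > ncat)).any():
        raise ValueError(f"outcomes must be coded 1..{ncat}")
    return probs, outcomes
```

Calling it with the DataFrame columns from the script (for example `check_inputs(df[['H', 'D', 'L']], df[['Outcome']])`) would return plain NumPy arrays that the scoring loop could index with zero-based positions.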
Feel free to give any feedback!
Thank you
Edit:
For some reason I am not able to reproduce the results of Table 3 of the link. I use Table 1 as input for predictions and observed. Any help is much appreciated!
Edit #2:
Here is the small sample from the paper:

    predictions = {'H': [1, 0.9, 0.8, 0.5, 0.35, 0.6, 0.6, 0.6, 0.5, 0.55],
                   'D': [0, 0.1, 0.1, 0.25, 0.3, 0.3, 0.3, 0.1, 0.45, 0.1],
                   'L': [0, 0, 0.1, 0.25, 0.35, 0.1, 0.1, 0.3, 0.05, 0.35]}
    observed = {'Outcome': [1, 1, 1, 1, 2, 2, 1, 1, 1, 1]}
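For reference, the formula can be applied to this sample without explicit loops. This is a sketch under assumptions, not the method used in the paper: `rps_vectorized` is an illustrative name, the outcomes are taken to be coded 1 to 3, and the cumulative sums are computed with zero-based NumPy indexing.

```python
import numpy as np


def rps_vectorized(probs, outcomes):
    """Ranked Probability Score per row; outcomes are coded 1..r.

    rps_vectorized is an illustrative name, not taken from the post.
    """
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes).ravel().astype(int)
    r = probs.shape[1]

    # One-hot encode the observed outcome (shift the 1-based codes to 0-based).
    onehot = np.zeros_like(probs)
    onehot[np.arange(len(outcomes)), outcomes - 1] = 1.0

    # Differences between the cumulative distributions. The last column is
    # (numerically) zero because both rows sum to 1, so it is dropped.
    diff = np.cumsum(probs, axis=1) - np.cumsum(onehot, axis=1)
    return (diff[:, :-1] ** 2).sum(axis=1) / (r - 1)


predictions = {'H': [1, 0.9, 0.8, 0.5, 0.35, 0.6, 0.6, 0.6, 0.5, 0.55],
               'D': [0, 0.1, 0.1, 0.25, 0.3, 0.3, 0.3, 0.1, 0.45, 0.1],
               'L': [0, 0, 0.1, 0.25, 0.35, 0.1, 0.1, 0.3, 0.05, 0.35]}
observed = {'Outcome': [1, 1, 1, 1, 2, 2, 1, 1, 1, 1]}

probs = np.column_stack([predictions['H'], predictions['D'], predictions['L']])
scores = rps_vectorized(probs, observed['Outcome'])
print(np.round(scores, 5))
```

Working the formula by hand for these ten rows gives 0, 0.005, 0.025, 0.15625, 0.1225, 0.185, 0.085, 0.125, 0.12625 and 0.1625, which is what the sketch should reproduce up to floating-point rounding.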
asked yesterday (edited yesterday) by HJA24
"I am not able to reproduce the results of Table 3" If the code is not working correctly it is not ready for review.
– 1201ProgramAlarm
yesterday
2 Answers
Tidy up your math

    cumulative = cumulative + (sum(predictions.iloc[x, 1:i]) - sum(obsvec[1:i])) ** 2

can be

    cumulative += (sum(predictions.iloc[x, 1:i]) - sum(obsvec[1:i])) ** 2

and

    RPS[x] = (1/(ncat-1)) * cumulative

should be

    RPS[x] = cumulative / (ncat-1)

Make a main method

This is a very small script, but it still benefits from pulling your global code into a main.

PEP8

By convention, function names (e.g. RPS) should be lowercase.

That's all I see for now.
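A minimal sketch of how these three suggestions might look together. Two things here are assumptions rather than part of this answer: the slices are zero-based (`[:i]`), and `main()` uses two stand-in rows so the example runs without `test.csv`.

```python
import numpy as np


def rps(predictions, observed, ncat=3):
    """Loop-based RPS with a lowercase name, `+=` and a plain division."""
    predictions = np.asarray(predictions, dtype=float)
    observed = np.asarray(observed).ravel().astype(int)
    scores = np.zeros(len(predictions))
    for x in range(len(predictions)):
        obsvec = np.zeros(ncat)
        obsvec[observed[x] - 1] = 1  # outcomes are coded 1, 2, 3
        cumulative = 0.0
        for i in range(1, ncat):
            # Zero-based slices [:i]; the 1:i slices quoted above come
            # from the question and skip the first column.
            cumulative += (predictions[x, :i].sum() - obsvec[:i].sum()) ** 2
        scores[x] = cumulative / (ncat - 1)
    return scores


def main():
    # Stand-in rows so the sketch runs without test.csv; the original
    # script reads H, D, L and Outcome with pd.read_csv instead.
    predictions = [[0.8, 0.1, 0.1], [0.35, 0.3, 0.35]]
    observed = [1, 2]
    print(rps(predictions, observed))


if __name__ == "__main__":
    main()
```

Keeping the file reading and printing inside `main()` means the module can be imported, for example from a test, without triggering any I/O.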
answered yesterday by Reinderien
Thanks for all the replies. In the end it was relatively simple. My code was ported from R, where indexing starts at 1; Python, however, starts at 0. After adjusting my original code with this insight (sorry..) I am able to reproduce the output. I also included the remarks of Reinderien.

    import numpy as np
    import pandas as pd


    def rps(predictions, observed):
        ncat = 3
        npred = len(predictions)
        scores = np.zeros(npred)  # renamed so it does not shadow rps()
        for x in range(0, npred):
            obsvec = np.zeros(ncat)
            obsvec[observed.iloc[x] - 1] = 1  # outcomes are coded 1, 2, 3
            cumulative = 0
            for i in range(1, ncat):
                # zero-based slices: 0:i covers the first i categories
                cumulative += (sum(predictions.iloc[x, 0:i]) - sum(obsvec[0:i])) ** 2
            scores[x] = cumulative / (ncat - 1)
        return scores


    df = pd.read_csv('test.csv', header=0)
    predictions = df[['H', 'D', 'L']]
    observed = df['Outcome']  # a Series, so iloc[x] yields a scalar
    scores = rps(predictions, observed)

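The off-by-one described in this answer can be seen in isolation with a plain list. The row below is just one (H, D, L) triple from the sample data, and the variable names are illustrative.

```python
row = [0.5, 0.25, 0.25]  # H, D, L probabilities for one match

# Python slices are zero-based and exclude the stop index.
for i in range(1, 3):
    print(i, row[1:i], row[0:i])
# i = 1: row[1:1] is empty, row[0:1] holds H
# i = 2: row[1:2] holds only D, row[0:2] holds H and D

# Partial sums as the R-style slice computes them (H is never included) ...
wrong_partial_sums = [sum(row[1:i]) for i in range(1, 3)]
# ... and as the formula's cumulative sums require.
right_partial_sums = [sum(row[0:i]) for i in range(1, 3)]
```

With the `1:i` slices the home-win probability never enters the cumulative sum, which is why the scores from the question's version cannot match the paper.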
answered 23 hours ago (edited 21 hours ago) by HJA24
I'm glad you figured out your issue, but - a few things. Your code in your answer is not properly indented. Also, it definitely doesn't run, because you wrote RPS, which no longer exists. Finally: if your code was broken in the first place, then this entire question is off-topic.
– Reinderien
34 mins ago