Why not use instrumental variable directly as a covariate in the regression?
I know this is a silly question, since I know the theory of instrumental variables and two-stage regression. Still, I have never seen a clear answer to the following:
- Assume you have endogeneity due to an unobserved variable correlated with one of the initial regressors. The typical way to correct for that is to find an instrumental variable correlated with the unobserved effect and to use a two-stage regression approach.
Now my question is: why go through that trouble? Why wouldn't you just include the instrumental variable as a standard regressor in the initial estimation?
regression least-squares instrumental-variables
asked 4 hours ago by Daniel; edited 1 hour ago by Alexis
2 Answers
The point of instrumental variable regression is to provide an estimate of the causal effect of exposure $X$ on outcome $O$ that is free of confounding bias, when there is some unmeasured (possibly unmeasurable) variable $U$ confounding the relationship between $X$ and $O$. Here's a DAG of the simplest circumstance under which one would use instrumental variables estimation ($X$, $U$, and $Z$ can be sets of variables):
[Figure: DAG, not reproduced here; per the text, $Z \to X \to O$, with $U$ affecting both $X$ and $O$]
If an instrumental variable $Z$ causes $X$, has no effect on $O$ other than through $X$, there is no prior cause of both $Z$ and $O$, and the effect of $X$ on $O$ is homogeneous, then with a large enough sample, estimating $E[O|\hat{X}]$ where $\hat{X} = E[X|Z]$ can provide a consistent estimate of the causal effect of $X$ on $O$.
In summary, you do not care about the effect of $Z$ on $O$ (there is none except through $X$), and $E[O|\hat{X}] \ne E[O|X,Z]$, so simply including $Z$ in your model will not get you an instrumental variable estimate.
Final comment: The "...in the initial estimation?" closing of your question makes me want to clarify: one first estimates $\hat{X}$ (so $Z$ is indeed part of that estimation), and then uses $\hat{X}$ as a predictor in the second estimation (sans $Z$).
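To make the distinction concrete, here is a minimal simulation sketch in Python (standard library only). The data-generating process, the coefficient values, and the helper names `slope`, `fitted`, `residuals`, and `simulate` are illustrative assumptions, not part of the answer above. It compares three estimates of a true effect of 2: naive OLS of $O$ on $X$, OLS of $O$ on $X$ with $Z$ added as a covariate, and a hand-rolled two-stage estimate.

```python
import random


def mean(v):
    return sum(v) / len(v)


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def fitted(x, y):
    """Fitted values from regressing y on x (with an intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [a + b * xi for xi in x]


def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    return [yi - fi for yi, fi in zip(y, fitted(x, y))]


def simulate(n=20000, beta=2.0, seed=0):
    """Assumed toy model: Z -> X -> O, with unobserved U driving X and O."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]  # never used by the estimators
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    o = [beta * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    return z, x, o


z, x, o = simulate()

# 1) Naive OLS of O on X: confounded by U.
naive = slope(x, o)

# 2) OLS of O on X and Z. By the Frisch-Waugh-Lovell theorem, the coefficient
#    on X equals the slope of (O residualized on Z) on (X residualized on Z).
with_z = slope(residuals(z, x), residuals(z, o))

# 3) Two stages: regress X on Z, then regress O on the fitted X-hat (no Z).
x_hat = fitted(z, x)
tsls = slope(x_hat, o)

print(f"naive OLS:          {naive:.3f}")
print(f"OLS with Z added:   {with_z:.3f}")
print(f"two-stage estimate: {tsls:.3f}")
```

Under these assumptions, adding $Z$ as a covariate should move the estimate further from 2 than the naive regression does, because conditioning on $Z$ removes exactly the exogenous variation in $X$ and leaves the confounded part; the two-stage estimate should land close to 2. Exact numbers depend on the seed, and the standard errors from a hand-rolled second stage like this one are not valid, so a dedicated IV routine is preferable in practice.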
answered 4 hours ago by Alexis; edited 1 hour ago
You can, and people do. As @Alexis points out, though, it doesn't give you a complete answer.
Imagine you're interested in the effect of an endogenous variable $X$ on $Y$, and $Z$ is an instrument for $X$. When doing IV in econometrics:
- The regression of $X$ on $Z$ is called the first stage regression.
- The regression of $Y$ on $Z$ is called the reduced form regression.
The reduced form regression on its own does not estimate the effect of $X$ on $Y$. It estimates the effect of $Z$ on $Y$, which under the usual linear IV assumptions with a single instrument is the first-stage coefficient times the effect of $X$ on $Y$.
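The relationship between the two regressions can be sketched with a small simulation in Python (standard library only). The data-generating process, the first-stage coefficient of 0.5, the true effect of 2, and the helper name `slope` are assumptions made for illustration. With a single instrument, dividing the reduced form coefficient by the first stage coefficient gives the IV estimate.

```python
import random


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n = 20000
true_effect = 2.0  # assumed causal effect of X on Y

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
x = [0.5 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

first_stage = slope(z, x)    # regression of X on Z
reduced_form = slope(z, y)   # regression of Y on Z
iv_estimate = reduced_form / first_stage  # ratio of the two coefficients

print(f"first stage:  {first_stage:.3f}")
print(f"reduced form: {reduced_form:.3f}")
print(f"IV estimate:  {iv_estimate:.3f}")
```

In this setup the reduced form coefficient should come out near 1 rather than 2, which is why it cannot be read as the effect of $X$ on $Y$; rescaling by the first stage coefficient should bring the estimate close to 2. Exact numbers depend on the seed.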
answered 40 mins ago by Matthew Gunn