Why not use instrumental variable directly as a covariate in the regression?

This may be a silly question, since I know the theory of instrumental variables and two-stage regression. Still, I have never seen a clear answer to the following:

• Assume you have endogeneity due to an unobserved variable correlated with one of the initial regressors. The typical way to correct for that is to find an instrumental variable that is correlated with the endogenous regressor but not with the unobserved effect, and to use a two-stage regression approach.

Now my question is: why go through that trouble? Why wouldn't you just include the instrumental variable as a standard regressor in the initial estimation?







Tags: regression, least-squares, instrumental-variables














edited 1 hour ago by Alexis

asked 4 hours ago by Daniel






















2 Answers

The point of instrumental variable regression is to provide a consistent estimate of the causal effect of exposure $X$ on outcome $O$ when there is some unmeasured (possibly unmeasurable) variable $U$ confounding the relationship between $X$ and $O$. Here's a DAG of the simplest circumstance under which one would use instrumental variables estimation ($X$, $U$, and $Z$ can be sets of variables):

[Figure: DAG with $Z \to X \to O$, and $U \to X$ and $U \to O$.]

If an instrumental variable $Z$ causes $X$, has no effect on $O$ other than through $X$, there is no prior cause of both $Z$ and $O$, and the effect of $X$ on $O$ is homogeneous, then with a large enough sample $E[O|\hat{X}]$, where $\hat{X} = E[X|Z]$, can provide an essentially unbiased (consistent) estimate of the causal effect of $X$ on $O$.

In summary, you do not care about the effect of $Z$ on $O$ (there is none except through $X$), and $E[O|\hat{X}] \ne E[O|X,Z]$, so simply including $Z$ in your model will not get you an instrumental variable estimate.

Final comment: The "...in the initial estimation?" closing of your question makes me want to clarify: one first estimates $\hat{X}$ (so $Z$ is indeed part of that estimation), and one then uses $\hat{X}$ as a predictor in the second estimation (sans $Z$).
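
As a rough numerical illustration of the argument above, here is a minimal simulation sketch. The data-generating process, the coefficients (an assumed true effect of 2.0 and a confounding strength of 1.5), the sample size, and the helper names `ols` and `simulate` are all assumptions chosen for illustration. It compares three estimates of the effect of $X$ on $O$: regressing $O$ on $X$ alone, regressing $O$ on $X$ and $Z$ together, and a manual two-stage fit.

```python
import numpy as np


def ols(y, *regressors):
    """Least-squares coefficients of y on an intercept plus the given regressors."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # coef[0] is the intercept


def simulate(n=200_000, beta=2.0, seed=0):
    """Simulate Z -> X -> O with an unobserved U affecting both X and O.

    All numeric values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                       # unobserved confounder U
    z = rng.normal(size=n)                       # instrument Z, independent of U
    x = 1.0 * z + 1.0 * u + rng.normal(size=n)   # exposure X
    o = beta * x + 1.5 * u + rng.normal(size=n)  # outcome O; Z acts only through X

    naive = ols(o, x)[1]        # O on X alone
    with_z = ols(o, x, z)[1]    # O on X with Z added as an ordinary covariate

    # Manual two-stage least squares: predict X from Z, then regress O on the prediction.
    a, b = ols(x, z)
    x_hat = a + b * z
    tsls = ols(o, x_hat)[1]

    return {"naive": naive, "with_z": with_z, "tsls": tsls}


results = simulate()
for name, value in results.items():
    print(f"{name:>7}: {value:.3f}")
```

With these assumed coefficients, the first two estimates should land near 2.5 and 2.75, while the two-stage estimate lands near the assumed true effect of 2.0. In this setup, adding $Z$ as a covariate does not remove the confounding and actually moves the estimate further from the truth. The manual second stage is only meant to show the point estimate; its naive standard errors are not valid, so a dedicated IV routine is preferable in practice.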















answered 4 hours ago by Alexis (edited 1 hour ago)

























You can, and people do. As @Alexis points out, though, it doesn't give you a complete answer.

Imagine you're interested in the effect of an endogenous variable $X$ on $Y$, and $Z$ is an instrument for $X$. When doing IV in econometrics:

• The regression of $X$ on $Z$ is called the first-stage regression.

• The regression of $Y$ on $Z$ is called the reduced-form regression.

The reduced-form regression on its own does not estimate the effect of $X$ on $Y$.
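
To see how the two regressions combine, consider a sketch with a single instrument and a linear model. The symbols $\pi$, $\gamma$, $\beta$, the helper `slope`, and all numeric values below are assumptions for illustration. If the first stage has slope $\pi$ and the effect of $X$ on $Y$ is $\beta$, then the reduced-form slope is $\gamma = \beta\pi$, so dividing the reduced-form slope by the first-stage slope recovers $\beta$:

```python
import numpy as np


def slope(y, x):
    """OLS slope of y on x (with an intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


rng = np.random.default_rng(1)
n = 200_000
u = rng.normal(size=n)                      # unobserved confounder
z = rng.normal(size=n)                      # instrument, independent of u
x = 0.5 * z + u + rng.normal(size=n)        # assumed first-stage slope pi = 0.5
y = 2.0 * x + 1.5 * u + rng.normal(size=n)  # assumed effect of X on Y: beta = 2.0

first_stage = slope(x, z)    # regression of X on Z
reduced_form = slope(y, z)   # regression of Y on Z
iv_estimate = reduced_form / first_stage

print(f"first stage:  {first_stage:.3f}")
print(f"reduced form: {reduced_form:.3f}")
print(f"ratio (IV):   {iv_estimate:.3f}")
```

Under these assumptions the reduced-form slope should be close to 1.0 (the product of 2.0 and 0.5), not to the effect of $X$ on $Y$; only after dividing by the first-stage slope does the estimate come out near 2.0.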

















answered 40 mins ago by Matthew Gunn





























