Implementation in Stata#

This chapter shows how to setup a generic log-likelihood function in Stata and use that to estimate an econometric model. For ECON407, the models we will be investigating use maximum likelihood estimation and pre-existing log-likelihood definitions to estimate the model. Therefore, consider this chapter as NOT REQUIRED optional material.

The maximum likelihood method seeks to find model parameters \(\pmb{\theta}^{MLE}\) that maximizes the joint likelihood (the likelihood function- \(L\)) of observing \(\mathbf{y}\) conditional on some structural model that is a function of independent variables \(\mathbf{x}\) and model parameters \(\pmb{\theta}^{MLE}\):

\[ \pmb{\theta}^{MLE} = \underset{\pmb{\theta}}{max} L = \underset{\pmb{\theta}}{\operatorname{max}} \prod_{i=1}^N prob(y_i|\mathbf{x}_i,\pmb{\theta}) \]

In a simple ordinary least squares setting with a linear model, we have the model parameters \(\pmb{\theta}=[\pmb{\beta},\sigma]\) making the likelihood function

\[ prob(y_i|\mathbf{x}_i,\pmb{\beta},\sigma) = \frac{1}{\sqrt{2\sigma^2 \pi}}e^{\frac{-(y_i - \mathbf{x}_i\pmb{\beta})^2}{2\sigma^2}} \]

Typically, for computational purposes we work with the log-likelihood function rather than the the likelihood. This is

\[ log(L) = \sum_{i=1}^N log(prob(y_i|\mathbf{x}_i,\pmb{\theta})) \]

Simple OLS in Stata#

Let’s load some data for this example:

webuse set "https://rlhick.people.wm.edu/econ407/data"
webuse tobias_koop
keep if time==4
. webuse set "https://rlhick.people.wm.edu/econ407/data"
(prefix now "https://rlhick.people.wm.edu/econ407/data")

. webuse tobias_koop

. keep if time==4
(16,885 observations deleted)

. 

This is the standard OLS estimation results in stata:

regress ln_wage educ pexp pexp2 broken_home
      Source |       SS           df       MS      Number of obs   =     1,034
-------------+----------------------------------   F(4, 1029)      =     51.36
       Model |  37.3778146         4  9.34445366   Prob > F        =    0.0000
    Residual |   187.21445     1,029  .181938241   R-squared       =    0.1664
-------------+----------------------------------   Adj R-squared   =    0.1632
       Total |  224.592265     1,033  .217417488   Root MSE        =    .42654

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0852725   .0092897     9.18   0.000     .0670437    .1035014
        pexp |   .2035214   .0235859     8.63   0.000     .1572395    .2498033
       pexp2 |  -.0124126   .0022825    -5.44   0.000    -.0168916   -.0079336
 broken_home |  -.0087254   .0357107    -0.24   0.807    -.0787995    .0613488
       _cons |   .4603326    .137294     3.35   0.001     .1909243    .7297408
------------------------------------------------------------------------------

Maximum Likelihood Estimation of OLS in Stata#

Now we estimate the same model using Maximum Likelihood.

Since the normal pdf is undefined for negative variances, we transform the variance term by exp to force any real number that is attempted by the maximization routine to be evaluated as a postive number. Consequently to recover the estimate for the model \(\sigma^2\) we need to raise it by exp post-estimation to find the maximum likelihood estimate for \(\sigma^2\). Here is the code defining the likelihood function and estimating the maximum likelihood model:

program define ols_mle
  * This program defines the contribution of the likelihood function 
  * for each observation in a simple OLS model
  args lnf xb log_sigma2
  qui replace `lnf' =ln(1/sqrt(2*_pi*exp(`log_sigma2'))) - (($ML_y1 - `xb')^2)/(2*exp(`log_sigma2'))
  end

ml clear

* Tell the ML program we want a linear function (lf) modeling the mean
* of y as a linear function of the variables on the right hand side of
* the equals.  sigma: just models sigma as a constant
ml model lf ols_mle (ln_wage=educ pexp pexp2 broken_home) (log_sigma2:)

ml max
. program define ols_mle
  1.   * This program defines the contribution of the likelihood function 
.   * for each observation in a simple OLS model
.   args lnf xb log_sigma2
  2.   qui replace `lnf' =ln(1/sqrt(2*_pi*exp(`log_sigma2'))) - (($ML_y1 - `xb'
> )^2)/(2*exp(`log_sigma2'))
  3.   end

. 
. ml clear

. 
. * Tell the ML program we want a linear function (lf) modeling the mean
. * of y as a linear function of the variables on the right hand side of
. * the equals.  sigma: just models sigma as a constant
. ml model lf ols_mle (ln_wage=educ pexp pexp2 broken_home) (log_sigma2:)

. 
. ml max

initial:       log likelihood = -3426.2813
alternative:   log likelihood = -2118.3989
rescale:       log likelihood = -1754.9153
rescale eq:    log likelihood = -765.29913
Iteration 0:   log likelihood = -765.29913  
Iteration 1:   log likelihood = -652.29309  
Iteration 2:   log likelihood = -584.40533  
Iteration 3:   log likelihood = -583.66366  
Iteration 4:   log likelihood = -583.66289  
Iteration 5:   log likelihood = -583.66289  

                                                        Number of obs =  1,034
                                                        Wald chi2(4)  = 206.44
Log likelihood = -583.66289                             Prob > chi2   = 0.0000

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
eq1          |
        educ |   .0852725   .0092672     9.20   0.000     .0671092    .1034358
        pexp |   .2035214   .0235288     8.65   0.000     .1574058     .249637
       pexp2 |  -.0124126    .002277    -5.45   0.000    -.0168755   -.0079497
 broken_home |  -.0087254   .0356243    -0.24   0.807    -.0785477     .061097
       _cons |   .4603326   .1369617     3.36   0.001     .1918926    .7287726
-------------+----------------------------------------------------------------
log_sigma2   |
       _cons |  -1.708935   .0439799   -38.86   0.000    -1.795134   -1.622736
------------------------------------------------------------------------------

. 

Note that in our program, \(\sigma^2\) is equal to the exponential of what Stata reports (for numerical reasons so the standard deviation can’t be negative):

display exp(-1.708935)
.18105852