Linear Algebra in Stata (Mata)#

Stata has a linear algebra environment, Mata, that is started by typing mata at the Stata command line. Notice that when you type mata in the Stata command window, the command prompt changes from . to :. This is really your only way of telling whether you are in the Mata or the Stata environment. At this point “normal” Stata commands (e.g. summarize, reg, or use) will not work and will lead to error messages. To exit Mata, issue the command end. Mata commands may also be placed inside Stata do-files (command files), so long as all Mata commands are between the commands mata and end.
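As a minimal sketch of how the two environments can be mixed in a do-file (the contents are illustrative, not taken from the course materials), Stata commands run as usual outside the mata ... end block, and a single Mata expression can be run from Stata with the mata: prefix:

```stata
* Stata commands run as usual outside the Mata block
sysuse auto, clear
summarize price

mata
// everything between mata and end is interpreted by Mata
A = (1, 2 \ 3, 4)
rows(A)
end

* back in Stata; a one-line Mata call uses the mata: prefix
mata: cols(A)
```

Objects created in Mata, such as A here, remain in the Mata workspace after end, which is why the final line can still refer to A.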

Getting help in Mata is similar to the normal Stata environment. Type help mata command, where command is some Mata command. You can also do keyword searches: search mata keyword. To see the same set of results in a better help viewer, type view search mata keyword; for example, view search mata inverse.

Let’s load some data for demonstration purposes:

sysuse auto
(1978 automobile data)
reg price mpg headroom
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     10.44
       Model |   144280501         2  72140250.4   Prob > F        =    0.0001
    Residual |   490784895        71  6912463.32   R-squared       =    0.2272
-------------+----------------------------------   Adj R-squared   =    0.2054
       Total |   635065396        73  8699525.97   Root MSE        =    2629.2

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -259.1057   58.42485    -4.43   0.000    -375.6015   -142.6098
    headroom |  -334.0215   399.5499    -0.84   0.406    -1130.701    462.6585
       _cons |   12683.31   2074.497     6.11   0.000     8546.885    16819.74
------------------------------------------------------------------------------

In Jupyter notebook code cells, use the %%mata magic to access Stata’s linear algebra capabilities in Mata.

Creating matrices, vectors and scalars in Mata#

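There are two ways to create a matrix. The first is to type its elements directly, separating columns with commas and rows with backslashes. Consider a two-by-two matrix: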

mata
A = (1,2 \ 3,4)
A
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A = (1,2 \ 3,4)

: A
       1   2
    +---------+
  1 |  1   2  |
  2 |  3   4  |
    +---------+

: end
-------------------------------------------------------------------------------

. 

or, you could create a matrix of a desired dimension filled with missing values (.) as placeholders:

mata
B=J(2,3,.)
B
end
. mata
------------------------------------------------- mata (type end to exit) -----
: B=J(2,3,.)

: B
       1   2   3
    +-------------+
  1 |  .   .   .  |
  2 |  .   .   .  |
    +-------------+

: end
-------------------------------------------------------------------------------

. 

where B is of dimension rows=2 and columns=3. We can fill \(\mathbf{B}\) element by element:

mata
B[1,1]=5 
B[1,2]=6 
B[1,3]=7 
B[2,1]=8 
B[2,2]=9 
B[2,3]=10
B
end
. mata
------------------------------------------------- mata (type end to exit) -----
: B[1,1]=5 

: B[1,2]=6 

: B[1,3]=7 

: B[2,1]=8 

: B[2,2]=9 

: B[2,3]=10

: B
        1    2    3
    +----------------+
  1 |   5    6    7  |
  2 |   8    9   10  |
    +----------------+

: end
-------------------------------------------------------------------------------

. 
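Filling a matrix one element at a time becomes tedious for anything larger. As a sketch of an alternative, the same B can be filled with a pair of nested loops; the counter k is a helper variable introduced here for illustration only:

```stata
mata
B = J(2, 3, .)          // 2 x 3 placeholder matrix of missing values
k = 5                   // first value to store (illustrative counter)
for (i = 1; i <= rows(B); i++) {
    for (j = 1; j <= cols(B); j++) {
        B[i, j] = k     // fill row by row
        k++
    }
}
B
end
```

The loop visits the elements in the same row-by-row order as the assignments above, so it should produce the same matrix.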

Building a matrix from submatrices#

Suppose you have the matrices \(\mathbf{A}\) to \(\mathbf{D}\) defined as:

mata
A=(1,2 \ 3,4) 
B=(5,6,7 \ 8,9,10) 
C=(3,4 \ 5,6) 
D=(1,2,3 \ 4,5,6)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A=(1,2 \ 3,4) 

: B=(5,6,7 \ 8,9,10) 

: C=(3,4 \ 5,6) 

: D=(1,2,3 \ 4,5,6)

: end
-------------------------------------------------------------------------------

. 

And we want to construct the matrix \(\mathbf{E}\) as

\[\begin{split} \mathbf{E} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \end{split}\]

In mata, we use

mata
E=(A,B \ C,D)
E
end
. mata
------------------------------------------------- mata (type end to exit) -----
: E=(A,B \ C,D)

: E
        1    2    3    4    5
    +--------------------------+
  1 |   1    2    5    6    7  |
  2 |   3    4    8    9   10  |
  3 |   3    4    1    2    3  |
  4 |   5    6    4    5    6  |
    +--------------------------+

: end
-------------------------------------------------------------------------------

. 
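The reverse operation, pulling a block back out of a larger matrix, uses subscripts. The sketch below rebuilds E and then extracts two of its blocks, once with a range subscript (top-left and bottom-right corners inside [| |]) and once with lists of row and column indices:

```stata
mata
A = (1, 2 \ 3, 4)
B = (5, 6, 7 \ 8, 9, 10)
C = (3, 4 \ 5, 6)
D = (1, 2, 3 \ 4, 5, 6)
E = (A, B \ C, D)

(rows(E), cols(E))          // dimensions of E as a 1 x 2 row vector
E[|1, 3 \ 2, 5|]            // rows 1-2, columns 3-5: the block holding B
E[(3, 4), (1, 2)]           // rows 3-4, columns 1-2: the block holding C
end
```

Note that the blocks joined with a comma must have the same number of rows, and blocks stacked with a backslash must have the same number of columns.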

Creating Vectors#

Row and column vectors can also be created using the same basic syntax:

mata
f = (1, 2, 3)
f
end
. mata
------------------------------------------------- mata (type end to exit) -----
: f = (1, 2, 3)

: f
       1   2   3
    +-------------+
  1 |  1   2   3  |
    +-------------+

: end
-------------------------------------------------------------------------------

. 

or, a column vector can be created by

mata
g=(3\ 4 \5)
g
end
. mata
------------------------------------------------- mata (type end to exit) -----
: g=(3\ 4 \5)

: g
       1
    +-----+
  1 |  3  |
  2 |  4  |
  3 |  5  |
    +-----+

: end
-------------------------------------------------------------------------------

. 

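The command below constructs a column vector of consecutive integers from 1 to 5 using the range operator ::. Replacing :: with .. (as in 1..5) gives a row vector instead, and 1::100 would give the integers 1, 2, 3, …, 99, 100.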

mata
id_rows=(1::5)
id_rows
end
. mata
------------------------------------------------- mata (type end to exit) -----
: id_rows=(1::5)

: id_rows
       1
    +-----+
  1 |  1  |
  2 |  2  |
  3 |  3  |
  4 |  4  |
  5 |  5  |
    +-----+

: end
-------------------------------------------------------------------------------

. 
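To make the distinction between the two range operators concrete, the following sketch builds both versions and compares them; r and c are names chosen here for illustration:

```stata
mata
r = (1..5)              // row vector: 1 x 5
c = (1::5)              // column vector: 5 x 1
(rows(r), cols(r))
(rows(c), cols(c))
r' == c                 // transposing the row vector gives the column vector
end
```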

Creating Scalars#

These are easy. To define a scalar variable called u:

mata
u = 3
u
end
. mata
------------------------------------------------- mata (type end to exit) -----
: u = 3

: u
  3

: end
-------------------------------------------------------------------------------

. 

Creating a vector of zeros or ones#

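Suppose we have 1000 observations and wish to create a column of ones (this is especially useful for estimating a constant term). Use J(), which fills a matrix of the given dimensions with a constant; J(1000, 1, 0) would give a column of zeros instead: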

mata
ones=J(1000, 1, 1)
ones[1::5]
end
. mata
------------------------------------------------- mata (type end to exit) -----
: ones=J(1000, 1, 1)

: ones[1::5]
       1
    +-----+
  1 |  1  |
  2 |  1  |
  3 |  1  |
  4 |  1  |
  5 |  1  |
    +-----+

: end
-------------------------------------------------------------------------------

. 

This command can be combined with what we have seen previously to create the full matrix of independent variables (with the constant in the first column) using

mata
X=(J(1000, 1, 1),x)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: X=(J(1000, 1, 1),x)
                 <istmt>:  3499  x not found
r(3499);

: end
-------------------------------------------------------------------------------

. 

so long as your matrix of independent variables x exists in Mata and has 1000 rows. Here x has not been defined, so Mata reports error 3499.
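A version that runs without error needs x to exist first. As a sketch, assuming the auto dataset is in memory and taking mpg and headroom as the regressors, the number of ones can be taken from rows(x) rather than hard-coded:

```stata
sysuse auto, clear
mata
x = st_data(., ("mpg", "headroom"))    // regressors copied from Stata
X = (J(rows(x), 1, 1), x)              // prepend a column of ones
X[1::5, .]                             // first five rows
end
```

Using rows(x) keeps the column of ones conformable with x whatever the sample size.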

Creating the Identity Matrix#

The command below creates an identity matrix with 5 rows and 5 columns.

mata
identity = I(5)
identity 
end
. mata
------------------------------------------------- mata (type end to exit) -----
: identity = I(5)

: identity 
[symmetric]
       1   2   3   4   5
    +---------------------+
  1 |  1                  |
  2 |  0   1              |
  3 |  0   0   1          |
  4 |  0   0   0   1      |
  5 |  0   0   0   0   1  |
    +---------------------+

: end
-------------------------------------------------------------------------------

. 

Note, Stata only shows the lower triangular part of any symmetric matrix.
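As a quick check of what the identity matrix does, the sketch below multiplies a small matrix by a conformable identity matrix on either side and confirms that the matrix is unchanged:

```stata
mata
A = (1, 2 \ 3, 4)
A * I(2) == A       // displays 1 (true)
I(2) * A == A       // displays 1 (true)
end
```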

Stata datasets in Mata#

Once you have loaded data into Stata as described above, it is easy to access that information from within Mata. Using Stata’s auto dataset (that we loaded into Stata earlier), suppose we want to manipulate the data in Mata. There are two ways to proceed: copy the data with st_data, or create a view with st_view that always refers back to the original Stata dataset. Views are useful if you want to modify the data in Mata and then return to Stata with the dataset changed accordingly; a view also uses less memory, because the data are not duplicated. A copy uses more memory but can be faster to work with, and it is independent of the Stata dataset, so nothing you do to it in Mata alters the underlying data. If you do all your work in Mata and don’t need to change any of the underlying .dta data, I recommend the copy method. The command to copy every observation and variable of the Stata dataset into Mata is

mata
X = st_data(.,.)
X[1::5,]
end
. mata
------------------------------------------------- mata (type end to exit) -----
: X = st_data(.,.)

: X[1::5,]
                 1             2             3             4             5
    +-----------------------------------------------------------------------
  1 |            .          4099            22             3           2.5
  2 |            .          4749            17             3             3
  3 |            .          3799            22             .             3
  4 |            .          4816            20             3           4.5
  5 |            .          7827            15             4             4
    +-----------------------------------------------------------------------
                 6             7             8             9            10
     -----------------------------------------------------------------------
  1             11          2930           186            40           121
  2             11          3350           173            40           258
  3             12          2640           168            35           121
  4             16          3250           196            40           196
  5             20          4080           222            43           350
     -----------------------------------------------------------------------
                11            12
     -----------------------------+
  1    3.579999924             0  |
  2    2.529999971             0  |
  3    3.079999924             0  |
  4    2.930000067             0  |
  5    2.410000086             0  |
     -----------------------------+

: end
-------------------------------------------------------------------------------

. 

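Note, columns aren’t labeled and you need to keep track of variable order in Stata to know which columns are important for your work. The first column is missing (.) in every row because the first variable, make, is a string, while the matrix returned here is numeric.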

Alternatively, you can selectively include columns in the order you define using this and viewing the first 5 rows:

mata
X = st_data(.,("price","mpg","headroom"))
X[1::5,]
end
. mata
------------------------------------------------- mata (type end to exit) -----
: X = st_data(.,("price","mpg","headroom"))

: X[1::5,]
          1      2      3
    +----------------------+
  1 |  4099     22    2.5  |
  2 |  4749     17      3  |
  3 |  3799     22      3  |
  4 |  4816     20    4.5  |
  5 |  7827     15      4  |
    +----------------------+

: end
-------------------------------------------------------------------------------

. 

Remember, X is a copy: changes you make to it in Mata are not written back to the Stata dataset. The st_view command takes the same observation and variable arguments as st_data, but the matrix is passed as the first argument, as in st_view(X, ., ("price","mpg","headroom")), and changes made through a view are preserved once back in Stata. In this course, it is sufficient to use the command st_data to load data into Mata as described above.
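For readers who do want changes to flow back to Stata, here is a minimal sketch of a view. The variable price_k is hypothetical and created only for this illustration; V is an arbitrary name for the view:

```stata
sysuse auto, clear
generate double price_k = .               // hypothetical variable to fill from Mata

mata
st_view(V = ., ., ("price", "price_k"))   // V is a view onto two Stata variables
V[., 2] = V[., 1] :/ 1000                 // writes through to price_k in Stata
end

list price price_k in 1/3
```

Assigning to subscripted elements of V changes the Stata data; assigning to V itself (V = ...) would instead replace the view with an ordinary matrix.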

The mata workspace#

The command mata describe will list all the matrices, vectors, and scalars currently defined.

mata
mata describe
end
. mata
------------------------------------------------- mata (type end to exit) -----
: mata describe

      # bytes   type                        name and extent
-------------------------------------------------------------------------------
           32   real matrix                 A[2,2]
           48   real matrix                 B[2,3]
           32   real matrix                 C[2,2]
           48   real matrix                 D[2,3]
          160   real matrix                 E[4,5]
        1,776   real matrix                 X[74,3]
           24   real rowvector              f[3]
           24   real colvector              g[3]
           40   real colvector              id_rows[5]
          200   real matrix                 identity[5,5]
        8,000   real colvector              ones[1000]
            8   real scalar                 u
-------------------------------------------------------------------------------

: end
-------------------------------------------------------------------------------

. 

To delete all of these, issue mata clear. To delete only a few matrices, vectors, or scalars, list them after mata drop, for example mata drop X f g.

Getting Information about your matrices and vectors#

Stata offers three functions useful for checking conformability conditions. The functions rows(X) and cols(X) return the number of rows and columns of X, respectively,

mata
rows(X)
cols(X)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: rows(X)
  74

: cols(X)
  3

: end
-------------------------------------------------------------------------------

. 

while length(X)

mata
length(X)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: length(X)
  222

: end
-------------------------------------------------------------------------------

. 

returns the total number of elements in matrix X, equal to (# rows) × (# columns).

Stata Linear Algebra Operations#

Here we briefly show the commands for addition (scalar and matrix), multiplication, transposes and inverses. Additionally, we discuss a few other useful commands.

Scalar Addition (and subtraction)#

Adding a scalar to every element of a matrix does not work with the plain + operator:

mata
A + 2
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A + 2
                 <istmt>:  3200  conformability error
r(3200);

: end
-------------------------------------------------------------------------------

. 

Stata wants us to add a \(2 \times 2\) matrix comprised of 2’s. We can do this using either

mata
A + J(2, 2, 2)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A + J(2, 2, 2)
       1   2
    +---------+
  1 |  3   4  |
  2 |  5   6  |
    +---------+

: end
-------------------------------------------------------------------------------

. 

or using the colon (element-by-element) operator :+, which applies the scalar to every element:

mata
A :+ 2
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A :+ 2
       1   2
    +---------+
  1 |  3   4  |
  2 |  5   6  |
    +---------+

: end
-------------------------------------------------------------------------------

. 

Since order doesn’t matter for addition, the above is equivalent to 2 :+ A. For subtraction order does matter: A :- 2 and 2 :- A differ in sign.
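The role of operand order can be checked directly. In this sketch, which reuses the A defined earlier, addition gives the same result either way round while subtraction does not:

```stata
mata
A = (1, 2 \ 3, 4)
(A :+ 2) == (2 :+ A)    // displays 1: addition commutes
A :- 2                  // subtract 2 from every element of A
2 :- A                  // subtract every element of A from 2
end
```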

Matrix Addition (and subtraction)#

For matrix addition, we need the conformability conditions outlined in the theory chapter, so

mata
A + C
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A + C
        1    2
    +-----------+
  1 |   4    6  |
  2 |   8   10  |
    +-----------+

: end
-------------------------------------------------------------------------------

. 

exists, whereas

mata
A + D
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A + D
                 <istmt>:  3200  conformability error
r(3200);

: end
-------------------------------------------------------------------------------

. 

does not.

Scalar multiplication#

We can again use the colon operator, here :*, to multiply element by element, either by a scalar:

mata
A :* 2
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A :* 2
       1   2
    +---------+
  1 |  2   4  |
  2 |  6   8  |
    +---------+

: end
-------------------------------------------------------------------------------

. 

or

mata
A :* C
end
. mata
------------------------------------------------- mata (type end to exit) -----
: A :* C
        1    2
    +-----------+
  1 |   3    8  |
  2 |  15   24  |
    +-----------+

: end
-------------------------------------------------------------------------------

. 
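Element-by-element multiplication should not be confused with the matrix product. The sketch below places the two side by side for the same pair of matrices, and also shows that for a scalar the plain * operator and :* agree:

```stata
mata
A = (1, 2 \ 3, 4)
C = (3, 4 \ 5, 6)
A :* C                  // element by element
A * C                   // matrix product (rows of A times columns of C)
(A * 2) == (A :* 2)     // displays 1: both scale every element by 2
end
```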

Matrix Multiplication#

Stata has simple syntax for multiplication of matrices:

mata
C*B
end
. mata
------------------------------------------------- mata (type end to exit) -----
: C*B
        1    2    3
    +----------------+
  1 |  47   54   61  |
  2 |  73   84   95  |
    +----------------+

: end
-------------------------------------------------------------------------------

. 

This product exists, whereas if we reverse the order of operations,

mata
B*C
end
. mata
------------------------------------------------- mata (type end to exit) -----
: B*C
                       *:  3200  conformability error
                 <istmt>:     -  function returned error
r(3200);

: end
-------------------------------------------------------------------------------

. 

does not exist, since we don’t have conformability (columns in \(\mathbf{B}\) must be equal to rows in \(\mathbf{C}\)):

mata
mata describe B C
end
. mata
------------------------------------------------- mata (type end to exit) -----
: mata describe B C

      # bytes   type                        name and extent
-------------------------------------------------------------------------------
           48   real matrix                 B[2,3]
           32   real matrix                 C[2,2]
-------------------------------------------------------------------------------

: end
-------------------------------------------------------------------------------

. 
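The conformability check can also be done in code before attempting a product. The helper can_multiply() below is a hypothetical function written for this illustration, not part of Mata; it simply compares cols() of the left matrix with rows() of the right one:

```stata
mata
B = (5, 6, 7 \ 8, 9, 10)
C = (3, 4 \ 5, 6)

// hypothetical helper: returns 1 if P*Q is conformable, 0 otherwise
real scalar can_multiply(real matrix P, real matrix Q)
{
    return(cols(P) == rows(Q))
}

can_multiply(C, B)      // C is 2 x 2 and B is 2 x 3
can_multiply(B, C)      // B is 2 x 3 and C is 2 x 2
end
```

The first call should return 1 and the second 0, matching the product that works and the one that fails above.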

Inverses#

To invert a matrix, use luinv() on square, nonsingular matrices. Since \(\mathbf{D}\) is \(2 \times 3\) it has no inverse, so we invert the square matrix \(\mathbf{A}\) instead:

mata
luinv(A)
end

For \(\mathbf{A}\) as defined above, the exact inverse has rows \((-2, 1)\) and \((1.5, -0.5)\). Up to floating-point rounding, the inverse satisfies the property \(\mathbf{A A^{-1} = I}\):

mata
A*luinv(A)
end

Transposes#

We can transpose matrices using the prime operator ('):

mata
D
D'
end
. mata
------------------------------------------------- mata (type end to exit) -----
: D
       1   2   3
    +-------------+
  1 |  1   2   3  |
  2 |  4   5   6  |
    +-------------+

: D'
       1   2
    +---------+
  1 |  1   4  |
  2 |  2   5  |
  3 |  3   6  |
    +---------+

: end
-------------------------------------------------------------------------------

. 
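Transposes, products, and inverses come together in the OLS formula \(\mathbf{b} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}\). As a sketch, assuming the auto dataset and the same variables as the regression at the top of this section, the coefficients can be computed with Mata's luinv() function:

```stata
sysuse auto, clear
mata
y = st_data(., "price")                 // dependent variable
x = st_data(., ("mpg", "headroom"))     // regressors
X = (J(rows(x), 1, 1), x)               // columns: constant, mpg, headroom
b = luinv(X' * X) * X' * y              // (X'X)^(-1) X'y
b
end
```

The elements of b should reproduce, in the order constant, mpg, headroom, the coefficients reported by reg price mpg headroom earlier.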

Diag and Diagonal#

These two commands are useful for early problem sets. The diagonal command grabs the diagonal elements and returns them as a column vector, whereas diag returns a square matrix that keeps the diagonal elements as-is and sets the off-diagonal elements to 0:

mata
diag(D)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: diag(D)
[symmetric]
       1   2
    +---------+
  1 |  1      |
  2 |  0   5  |
    +---------+

: end
-------------------------------------------------------------------------------

. 

and

mata
diagonal(D)
end
. mata
------------------------------------------------- mata (type end to exit) -----
: diagonal(D)
       1
    +-----+
  1 |  1  |
  2 |  5  |
    +-----+

: end
-------------------------------------------------------------------------------

.
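The two functions also work together. In this sketch on the square matrix A, diagonal() extracts the diagonal as a vector, diag() turns that vector back into a diagonal matrix, and the sum of the diagonal elements is compared with Mata's trace() function:

```stata
mata
A = (1, 2 \ 3, 4)
d = diagonal(A)         // column vector holding 1 and 4
diag(d)                 // 2 x 2 matrix with d on the diagonal, 0 elsewhere
sum(d) == trace(A)      // displays 1: both equal 5
end
```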