estout
Short descriptions for main functions (see below for more details):
- collect_stats: extracts a given set of attributes from results object generated by stats packages (e.g. statsmodels and linearmodels)
- to_df: takes a list of collect_stats outputs and merges them as separate columns in a pandas DataFrame
- to_tex: takes one or more DataFrames and creates tex code to build table with each DataFrame as a different panel
- to_pdf: takes one or more tex tables (either as strings or paths to tex files) and merges them in a pdf document
Install
pip install estout
How to use
First, we set up an example dataset and run a few regressions to showcase the functions in this module.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from linearmodels import PanelOLS
import estout
np.random.seed(123)
df = pd.DataFrame(np.random.rand(9, 3),
                  columns=['y', 'x', 'z'],
                  index=pd.MultiIndex.from_product([[1, 2, 3], [1, 2, 3]], names=['firmid', 'time'])
                  ).assign(cons=1)
sm1 = sm.OLS(df['y'], df[['cons','x']]).fit()
sm2 = sm.OLS(df['y'], df[['cons','x','z']]).fit().get_robustcov_results(cov_type='HAC', maxlags=2)
lmres = PanelOLS(df['y'], df[['cons', 'x', 'z']], entity_effects=True
                 ).fit(cov_type='clustered', cluster_entity=True)
Extracting statistics after fitting a model
Below, we collect just the default set of statistics from the sm1 object.
estout.collect_stats(sm1)
{'package': 'statsmodels',
'ynames': ['y'],
'xnames': ['cons', 'x'],
'params': cons 0.507852
x 0.345003
dtype: float64,
'tstats': cons 3.905440
x 1.292246
dtype: float64,
'pvalues': cons 0.005858
x 0.237293
dtype: float64,
'covmat': cons x
cons 0.016910 -0.030531
x -0.030531 0.071278,
'se': cons 0.130037
x 0.266979
dtype: float64,
'nobs': 9,
'r2': 0.19260886185799486}
This list of default statistics is given by the functions implemented in the statsmodels_results module (since sm1 was generated by the statsmodels package).
The same functions exist in the linearmodels_results module.
print(estout.statsmodels_results.__all__)
print(estout.linearmodels_results.__all__)
['ynames', 'xnames', 'params', 'tstats', 'pvalues', 'covmat', 'se', 'nobs', 'r2']
['ynames', 'xnames', 'params', 'tstats', 'pvalues', 'covmat', 'se', 'nobs', 'r2']
I might add to (but never subtract from) this list in future versions if I find that there are other statistics I use very often.
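As a quick sanity check on how the default statistics relate to one another, the sketch below recomputes se and tstats for sm1 from the params and covmat values printed above. It does not call estout and uses only the rounded numbers from that output, so it agrees with the reported values only up to rounding.

```python
import math

# Values copied (already rounded) from the collect_stats(sm1) output above.
params = {"cons": 0.507852, "x": 0.345003}
covmat_diag = {"cons": 0.016910, "x": 0.071278}  # diagonal entries of 'covmat'

# Standard errors are the square roots of the diagonal of the covariance matrix.
se = {name: math.sqrt(var) for name, var in covmat_diag.items()}

# t-statistics are the coefficients divided by their standard errors.
tstats = {name: params[name] / se[name] for name in params}

for name in params:
    print(f"{name}: se={se[name]:.4f}, t={tstats[name]:.2f}")
```

The printed t-statistics match the values that to_df reports in parentheses for the first model further below.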
If there are other statistics that you need, and they are reported as attributes in the results object, you can request them using the add_stats parameter:
estout.collect_stats(sm1, get_default_stats=False,
                     add_stats={'xnames': 'model.exog_names',
                                'Adj. R2': 'rsquared_adj'})
{'package': 'statsmodels',
'xnames': ['cons', 'x'],
'Adj. R2': 0.07726727069485129}
The add_stats parameter also takes custom functions for statistics that are not reported as an individual attribute of the results object.
These custom functions must return scalars.
def allvars(lmres): return lmres.model.dependent.vars + lmres.model.exog.vars
estout.collect_stats(lmres, get_default_stats=False, add_stats={'all': allvars})
{'package': 'linearmodels', 'all': ['y', 'cons', 'x', 'z']}
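To make the two kinds of add_stats values more concrete, here is a minimal sketch of how a dotted attribute string and a custom function could each be turned into a value. The resolve_stat helper and the fake_results object are hypothetical stand-ins written for this illustration; they are not part of estout and may differ from how the package does this internally.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_stat(results, spec):
    """Return a statistic given an attribute path or a callable.

    Hypothetical helper for illustration only; not part of estout.
    """
    if callable(spec):
        # Custom function: it receives the whole results object.
        return spec(results)
    # Dotted string: follow each attribute in turn, e.g. 'model.exog_names'.
    return reduce(getattr, spec.split("."), results)


# Stand-in for a fitted results object (attribute names mirror the example above).
fake_results = SimpleNamespace(
    rsquared_adj=0.07726727069485129,
    model=SimpleNamespace(exog_names=["cons", "x"]),
)

requested = {
    "xnames": "model.exog_names",
    "Adj. R2": "rsquared_adj",
    "n_regressors": lambda res: len(res.model.exog_names),  # custom function
}
collected = {label: resolve_stat(fake_results, spec) for label, spec in requested.items()}
print(collected)
```

A string is convenient when the value already exists as an attribute; a function is needed when the value has to be computed from several attributes.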
Add scalar statistics not available as attributes of the results object (using the add_literals parameter):
estout.collect_stats(sm1, get_default_stats=False,
                     add_literals={'Fixed Effects': 'No',
                                   'Nr observations': 123})
{'package': 'statsmodels', 'Fixed Effects': 'No', 'Nr observations': 123}
Combining model results into a DataFrame
Start by collecting stats from each model and combining them in a list.
allmodels = []
for res in [sm1, sm2, lmres]:
allmodels.append(estout.collect_stats(res))
Then export them to a DataFrame.
estout.to_df(allmodels)
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| cons | params | 0.51*** | 0.70*** | 0.73*** |
| | tstats | (3.91) | (21.48) | (167.36) |
| x | params | 0.35 | 0.57** | 0.64* |
| | tstats | (1.29) | (2.85) | (2.26) |
| z | params | | -0.64** | -0.77** |
| | tstats | | (-3.55) | (-2.91) |
| r2 | | 0.193 | 0.487 | 0.352 |
| nobs | | 9 | 9 | 9 |
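The cells in the table above combine a formatted coefficient with significance stars, and put the t-statistic in parentheses. The sketch below shows one way such cell strings could be built. The cutoffs (0.01, 0.05, 0.10) are an assumption based on the usual convention; they are consistent with the p-values shown for sm1 earlier, but the cutoffs estout actually uses are not stated in this README. The format_coef and format_tstat helpers are hypothetical.

```python
def format_coef(value, pvalue, fmt="{:.2f}",
                thresholds=((0.01, "***"), (0.05, "**"), (0.10, "*"))):
    """Format a coefficient and append stars for the strictest cutoff its p-value passes.

    The cutoffs are assumed conventional values, not taken from estout.
    """
    stars = ""
    for cutoff, marker in thresholds:  # ordered from strictest to loosest
        if pvalue < cutoff:
            stars = marker
            break
    return fmt.format(value) + stars


def format_tstat(value, fmt="({:.2f})"):
    """Format a t-statistic inside parentheses."""
    return fmt.format(value)


# Numbers taken from the collect_stats(sm1) output above.
print(format_coef(0.507852, 0.005858))  # 0.51***
print(format_coef(0.345003, 0.237293))  # 0.35
print(format_tstat(3.905440))           # (3.91)
```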
We can choose to report only a subset of the regressors.
estout.to_df(allmodels, which_xvars=['x','z'])
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| x | params | 0.35 | 0.57** | 0.64* |
| | tstats | (1.29) | (2.85) | (2.26) |
| z | params | | -0.64** | -0.77** |
| | tstats | | (-3.55) | (-2.91) |
| r2 | | 0.193 | 0.487 | 0.352 |
| nobs | | 9 | 9 | 9 |
Report other statistics under the parameter values.
estout.to_df(allmodels, stats_body=['params','se','pvalues'], which_xvars=['x'])
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| x | params | 0.35 | 0.57** | 0.64* |
| | se | (0.27) | (0.20) | (0.28) |
| | pvalues | (0.237) | (0.029) | (0.086) |
| r2 | | 0.193 | 0.487 | 0.352 |
| nobs | | 9 | 9 | 9 |
Change the statistics reported at the bottom of the table.
estout.to_df(allmodels, stats_bottom=['r2'], which_xvars=['x'])
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| x | params | 0.35 | 0.57** | 0.64* |
| | tstats | (1.29) | (2.85) | (2.26) |
| r2 | | 0.193 | 0.487 | 0.352 |
Change the formatting for any of the statistics reported.
estout.to_df(allmodels, add_formats={'params':'{:.3}','r2':'{:.2f}'}, which_xvars=['x'])
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| x | params | 0.345 | 0.571** | 0.643* |
| | tstats | (1.29) | (2.85) | (2.26) |
| r2 | | 0.19 | 0.49 | 0.35 |
| nobs | | 9 | 9 | 9 |
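The values passed to add_formats in the call above are ordinary Python format strings. The short example below, which is independent of estout, shows the difference between the two styles used there: '{:.3}' keeps three significant digits, while '{:.2f}' keeps two digits after the decimal point. The two only look alike for numbers below 1.

```python
r2 = 0.19260886185799486  # r2 of sm1, from the collect_stats output above
coef = 0.345003           # coefficient on x in sm1

coef_sig3 = "{:.3}".format(coef)      # three significant digits
r2_dec2 = "{:.2f}".format(r2)         # two decimal places
big_sig3 = "{:.3}".format(12.3456)    # significant digits shrink the decimals
big_dec3 = "{:.3f}".format(12.3456)   # fixed number of decimals

print(coef_sig3)  # 0.345
print(r2_dec2)    # 0.19
print(big_sig3)   # 12.3
print(big_dec3)   # 12.346
```

For coefficients that can be larger than 1 in absolute value, the fixed-point form ('{:.3f}') is usually the safer choice for a table with aligned decimals.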
Replace the names of regressors (or bottom stats) with labels.
estout.to_df(allmodels, labels={'cons':'Intercept', 'nobs':'Observations'}, which_xvars=['cons'])
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| Intercept | params | 0.51*** | 0.70*** | 0.73*** |
| | tstats | (3.91) | (21.48) | (167.36) |
| r2 | | 0.193 | 0.487 | 0.352 |
| Observations | | 9 | 9 | 9 |
Since the output of to_df is a pd.DataFrame, it is easy to add more information at the bottom of the table without having to re-run collect_stats.
df = estout.to_df(allmodels)
df.loc['Fixed effects',:] = ['No','No','Entity']
df
| | | 0 | 1 | 2 |
|---|---|---|---|---|
| cons | params | 0.51*** | 0.70*** | 0.73*** |
| | tstats | (3.91) | (21.48) | (167.36) |
| x | params | 0.35 | 0.57** | 0.64* |
| | tstats | (1.29) | (2.85) | (2.26) |
| z | params | | -0.64** | -0.77** |
| | tstats | | (-3.55) | (-2.91) |
| r2 | | 0.193 | 0.487 | 0.352 |
| nobs | | 9 | 9 | 9 |
| Fixed effects | | No | No | Entity |
Exporting to LaTeX
With the estout.to_tex function, we can combine one or more DataFrames into a single LaTeX table (each DataFrame will be a separate panel in the LaTeX table).
In the example below, we just return the tex code as a string, but the function also takes an outfile parameter that allows us to store the output in a .tex file. Either the file path or the string can be used in the estout.to_pdf function to create a PDF out of this tex code.
tbl = estout.to_tex([df, df],
                    panel_title=['Panel A: Some title', 'Panel B: Some title'],
                    col_groups=[{'Group1': [1, 2]}] * 2,
                    col_names=[['Model 1', 'Model 2', 'Model 3']] * 2,
                    hlines=[[0, 1, 4, 13], [1, 4, 13]])
print(tbl)
\newpage
\clearpage
\begin{table}[!h] \footnotesize
\addtocounter{table}{0}
\caption{\textbf{Table title}}
\par {Table description}
\vspace{2mm}
\begin{tabular*}{\textwidth}{@{\extracolsep{\fill}}l*{3}{c}}
\hline \noalign{\smallskip}
\multicolumn{4}{@{} l}{Panel A: Some title} \\
\hline \noalign{\smallskip}
& \multicolumn{2}{c}{Group1} \\
\cline{2-3}
& Model 1 & Model 2 & Model 3 \\
\hline \noalign{\smallskip}
cons & 0.51*** & 0.70*** & 0.73*** \\
& (3.91) & (21.48) & (167.36) \\
x & 0.35 & 0.57** & 0.64* \\
& (1.29) & (2.85) & (2.26) \\
z & & -0.64** & -0.77** \\
& & (-3.55) & (-2.91) \\
r2 & 0.193 & 0.487 & 0.352 \\
nobs & 9 & 9 & 9 \\
Fixed effects & No & No & Entity \\
\hline \noalign{\smallskip}
\end{tabular*}
\smallskip
\begin{tabular*}{\textwidth}{@{\extracolsep{\fill}}l*{3}{c}}
\multicolumn{4}{@{} l}{Panel B: Some title} \\
\hline \noalign{\smallskip}
& \multicolumn{2}{c}{Group1} \\
\cline{2-3}
& Model 1 & Model 2 & Model 3 \\
\hline \noalign{\smallskip}
cons & 0.51*** & 0.70*** & 0.73*** \\
& (3.91) & (21.48) & (167.36) \\
x & 0.35 & 0.57** & 0.64* \\
& (1.29) & (2.85) & (2.26) \\
z & & -0.64** & -0.77** \\
& & (-3.55) & (-2.91) \\
r2 & 0.193 & 0.487 & 0.352 \\
nobs & 9 & 9 & 9 \\
Fixed effects & No & No & Entity \\
\hline \noalign{\smallskip}
\end{tabular*}
\label{}
\end{table}
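Each body line in the tex output above is a table row whose cells are joined with " & " and terminated with a LaTeX line break. The sketch below reproduces that row shape for a few cells copied from the table. The tex_row helper is hypothetical and only illustrates the layout; it is not how estout.to_tex is necessarily implemented, and spacing around empty cells may differ slightly from the output above.

```python
def tex_row(cells):
    """Join cell strings with ' & ' and end the row with a LaTeX line break.

    Hypothetical helper for illustration only; not part of estout.
    """
    return " & ".join(cells) + r" \\"


# Cell strings copied from the table above; the empty string is a blank cell.
rows = [
    ["x", "0.35", "0.57**", "0.64*"],
    ["", "(1.29)", "(2.85)", "(2.26)"],
    ["z", "", "-0.64**", "-0.77**"],
]
body = "\n".join(tex_row(row) for row in rows)
print(body)
```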
Exporting to PDF
With the estout.to_pdf function, we can combine the LaTeX code for multiple tables (like the ones produced by estout.to_tex) into a single .tex document.
By default, the resulting .tex file is run through TeX Live’s pdflatex utility to produce a PDF document with the tables (set make_pdf=False if you do not want the PDF to be produced automatically).
You can also set open_pdf to True if you want the resulting PDF to be opened after it is produced.
For the code below to work, you need to have TeX Live installed (and change the path below to a valid path on your system).
estout.to_pdf(outfile='../_outputs/paper.tex',
              table_tex_code=[tbl, tbl],
              make_pdf=True,
              open_pdf=False)
PDF creation successful!
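If the PDF step fails, it can help to check whether pdflatex is reachable from Python at all. The sketch below builds a pdflatex command and runs it only when the executable is found on the PATH. The helper names and the example path are hypothetical, and this is not necessarily how estout.to_pdf calls pdflatex internally; it only uses standard pdflatex options.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdflatex_cmd(tex_path):
    """Build a pdflatex command that writes its output next to the .tex file."""
    tex_path = Path(tex_path)
    return [
        "pdflatex",
        "-interaction=nonstopmode",              # do not wait for input on errors
        f"-output-directory={tex_path.parent}",  # keep the PDF beside the source
        str(tex_path),
    ]


def compile_tex(tex_path):
    """Run pdflatex if it is on PATH; return True on success, False otherwise."""
    if shutil.which("pdflatex") is None:
        return False  # no TeX distribution found on this system
    result = subprocess.run(build_pdflatex_cmd(tex_path), capture_output=True)
    return result.returncode == 0


# Hypothetical path, used here only to show the resulting command.
cmd = build_pdflatex_cmd("outputs/paper.tex")
print(" ".join(cmd))
```

Calling compile_tex on a real .tex file returns False when pdflatex is missing or the compilation fails.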
This produced a PDF with two tables (given by the tbl tex string), each with two panels (given by the df DataFrame above).
Note that the table_tex_code parameter also accepts paths to tex files. Those tex files must contain the complete table environment for that table.