Pandas Styler.to_latex() – how to pass commands and do simple editing

How do I pass the following commands into the latex environment?

centering (I need landscape tables to be centered)

and

caption* (I need to skip the table numbering for a panel)

In addition, I would need to add parentheses and asterisks to the t-statistics, meaning row-specific formatting on the dataframes.

For example:

Current

variable value
const 2.439628
t stat 13.921319
FamFirm 0.114914
t stat 0.351283
founder 0.154914
t stat 2.351283
Adjusted R Square 0.291328

I want this

variable value
const 2.439628
t stat (13.921319)***
FamFirm 0.114914
t stat (0.351283)
founder 0.154914
t stat (1.651283)**
Adjusted R Square 0.291328

I’m writing my research papers in DataSpell. All empirical work is in Python, and I then use LaTeX (TeXiFy) to create the PDF within DataSpell. Because of this workflow, I can’t edit the tables in the LaTeX code by hand: they get overwritten every time I run the Jupyter notebook.

In case it helps, here’s an example of how I pass a table to the latex environment:

# drop index to column
panel_a.reset_index(inplace=True)


# write Latex index and cut names to appropriate length

ind_list = [
    "ageFirm",
    "meanAgeF",
    "lnAssets",
    "bsVol",
    "roa",
    "fndrCeo",
    "lnQ",
    "sic",
    "hightech",
    "nonFndrFam"
]


# assign the list of values to the column
panel_a["index"] = ind_list

# format column names
header = ["", "count","mean", "std", "min", "25%", "50%", "75%", "max"]

panel_a.columns = header

with open(r"/.../tables/panel_a.tex", "w") as tf:
    tf.write(
        panel_a
        .style
        .format(precision=3)
        .format_index(escape="latex", axis=1)
        .hide(level=0, axis=0)
        .to_latex(
            caption="Panel A: Summary Statistics for the Full Sample",
            label="tab:table_label",
            hrules=True,
        )
    )

Answer

You’re asking three questions in one. I think I can do you two out of three (I hear that “ain’t bad”).

  1. How to pass centering to the LaTeX env using Styler.to_latex?

Use the position_float parameter. Simplified:

df.style.to_latex(position_float='centering')

  2. How to pass caption*?

This one I don’t know. Perhaps useful: Why is caption not working.
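
One possible workaround, as far as I know to_latex has no option for a starred caption: post-process the string it returns before writing it to the file. A minimal sketch; unnumbered_caption is a made-up helper (not part of pandas), the sample string only stands in for real to_latex output, and \caption* assumes the LaTeX caption package is loaded in your document:

```python
def unnumbered_caption(latex: str) -> str:
    """Swap the first \\caption{...} for \\caption*{...} in a LaTeX string."""
    return latex.replace(r"\caption{", r"\caption*{", 1)


# Stand-in for the string that Styler.to_latex(caption=...) returns.
sample = (
    "\\begin{table}\n"
    "\\caption{Panel A: Summary Statistics for the Full Sample}\n"
    "\\begin{tabular}{lr}\n"
    "\\end{tabular}\n"
    "\\end{table}\n"
)

print(unnumbered_caption(sample))
```

In the question's code, this would mean wrapping the whole chain: tf.write(unnumbered_caption(panel_a.style ... .to_latex(...))).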

  3. How to apply row-specific formatting?

This one’s a little tricky. Let me give an example of how I would normally do this:

import pandas as pd

df = pd.DataFrame({'a': ['some_var', 't stat'], 'b': [1.01235, 2.01235]})
# values below 2 are shown plainly; all others get parentheses and stars
df.style.format({'a': str,
                 'b': lambda x: '{:.3f}'.format(x) if x < 2 else '({:.3f})***'.format(x)})

Result:

a b
some_var 1.012
t stat (2.012)***

You can see from this example that style.format accepts a callable (here nested inside a dict, but you could also write .format(func, subset='value')). This works well when the format depends only on the cell's own value (here, x < 2).

The problem in your case is that the format depends on other values, namely a p-value (not supplied) combined with panel_a['variable'] == 't stat'. Assuming you have those p-values in a different column, I suggest writing a for loop that builds a list of format strings like this:

fmt_list = ['{:.3f}','({:.3f})***','{:.3f}','({:.3f})','{:.3f}','({:.3f})***','{:.3f}']
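
A minimal sketch of such a loop, under stated assumptions: the p-values below are made-up placeholders (the question does not supply any), stars is a hypothetical helper, and the 1%/5%/10% cut-offs are only the common convention, so adjust them to your paper's:

```python
# Row labels in table order, as in the question's example.
variables = ["const", "t stat", "FamFirm", "t stat", "founder", "t stat", "Adjusted R Square"]
# Made-up p-values for the t-stat rows; None where no p-value applies.
pvalues = [None, 0.001, None, 0.726, None, 0.004, None]


def stars(p):
    """Return significance stars for a p-value (assumed 1%/5%/10% cut-offs)."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


fmt_list = []
for name, p in zip(variables, pvalues):
    if name == "t stat":
        # t-statistics go in parentheses, followed by any stars
        fmt_list.append("({:.3f})" + stars(p))
    else:
        fmt_list.append("{:.3f}")

print(fmt_list)
```

With real data, variables and pvalues would come from the corresponding columns of your DataFrame.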

Now, we can apply a function to df.style.format, and pop/select from the list like so:

fmt_list = ['{:.3f}','({:.3f})***','{:.3f}','({:.3f})','{:.3f}','({:.3f})***','{:.3f}']

def func(v):
    fmt = fmt_list.pop(0)
    return fmt.format(v)

panel_a.style.format({'variable': str, 'value': func})

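Result:

variable value
const 2.440
t stat (13.921)***
FamFirm 0.115
t stat (0.351)
founder 0.155
t stat (2.351)***
Adjusted R Square 0.291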

This solution is admittedly a bit “hacky”: modifying a globally declared list inside a function is far from good practice. Each call to func consumes one entry, so if the list is modified before rendering, or the Styler is rendered a second time (by which point the list is empty), the output will be wrong or fmt_list.pop(0) will raise an IndexError that is difficult to track down. I’m not sure how to remedy this other than simply turning all the floats in panel_a.value into strings in place. In that case, of course, you don’t need .format anymore, but it alters your df, which is also not ideal. You could make a copy first (df2 = df.copy()), at the cost of some extra memory.
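
For what it's worth, the strings-in-a-copy idea can be sketched without any shared state by pairing each value with its format string up front. This is only an illustration: format_column is a hypothetical helper (not part of pandas), the values are copied from the question's table, and the pandas lines in the trailing comment assume panel_a has a 'value' column:

```python
# Values copied from the question's table; fmt_list as defined in the answer.
values = [2.439628, 13.921319, 0.114914, 0.351283, 0.154914, 2.351283, 0.291328]
fmt_list = ['{:.3f}', '({:.3f})***', '{:.3f}', '({:.3f})', '{:.3f}', '({:.3f})***', '{:.3f}']


def format_column(values, formats):
    """Pair each value with its format string; neither input is modified."""
    if len(values) != len(formats):
        raise ValueError("need exactly one format string per value")
    return [fmt.format(v) for fmt, v in zip(formats, values)]


formatted = format_column(values, fmt_list)
print(formatted)

# With pandas, the strings could then go into a new frame, leaving panel_a as is:
#     display_df = panel_a.assign(value=format_column(panel_a["value"], fmt_list))
#     display_df.style.to_latex(position_float="centering")
```

Because nothing is popped from fmt_list, the table can be rendered as many times as needed; the trade-off is the extra frame of strings.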

Anyway, I hope this helps. In full, you would add this to your code as follows:

fmt_list = ['{:.3f}','({:.3f})***','{:.3f}','({:.3f})','{:.3f}','({:.3f})***','{:.3f}']

def func(v):
    fmt = fmt_list.pop(0)
    return fmt.format(v)

with open(fname, "w") as tf:
    tf.write(
        panel_a
        .style
        .format({'variable': str, 'value': func})
        ...
        .to_latex(
            ...
            position_float='centering'
    ))
User contributions licensed under: CC BY-SA