{"id":10548,"date":"2023-11-07T16:17:35","date_gmt":"2023-11-07T16:17:35","guid":{"rendered":"https:\/\/www.blopig.com\/blog\/?p=10548"},"modified":"2023-11-07T16:23:41","modified_gmt":"2023-11-07T16:23:41","slug":"converting-pandas-dataframes-into-publication-ready-tables","status":"publish","type":"post","link":"https:\/\/www.blopig.com\/blog\/2023\/11\/converting-pandas-dataframes-into-publication-ready-tables\/","title":{"rendered":"Converting pandas DataFrames into Publication-Ready Tables"},"content":{"rendered":"\n<p>Analysing, comparing and communicating the predictive performance of machine learning models is a crucial component of any empirical research effort. Pandas, a staple in the Python data analysis stack, not only helps with the data wrangling itself, but also provides efficient solutions for data presentation. Two of its lesser-known yet incredibly useful features are <code>df.to_markdown()<\/code> and <code>df.to_latex()<\/code>, which allow for a seamless transition from DataFrames to publication-ready tables. Here\u2019s how you can use them!<\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">Exporting DataFrames to Markdown<\/h2>\n\n\n\n<p>Markdown is widely used for its simplicity and readability, making it a go-to format for rendering your GitHub README or rebuttals on OpenReview. 
With the <code>df.to_markdown()<\/code> method, you can turn any DataFrame into a Markdown table with a single line of code.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import pandas as pd\n\n# construct example DataFrame\nresults = pd.DataFrame(\n    {\n        \"model\": [\n            \"random forest\", \n            \"support vector machine\", \n            \"multi-layer perceptron\"\n            ],\n        \"AUC-ROC\": [0.83, 0.79, 0.81],\n        \"AUC-PRC\": [0.46, 0.48, 0.49],\n        \"ECE\": [0.04, 0.09, 0.05],\n        \"runtime\": [0.004, 0.003, 0.01],\n    }\n)\n\n# convert it to Markdown\nprint(results.to_markdown(index=False))<\/pre>\n\n\n\n<p>This Markdown table can then be copied into any Markdown editor or platform that supports it (such as this website) and will be rendered as a neat table.<\/p>\n\n\n\n<div class=\"wp-block-jetpack-markdown\"><table>\n<thead>\n<tr>\n<th style=\"text-align:left\">model<\/th>\n<th style=\"text-align:right\">AUC-ROC<\/th>\n<th style=\"text-align:right\">AUC-PRC<\/th>\n<th style=\"text-align:right\">ECE<\/th>\n<th style=\"text-align:right\">runtime<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"text-align:left\">random forest<\/td>\n<td style=\"text-align:right\">0.83<\/td>\n<td style=\"text-align:right\">0.46<\/td>\n<td style=\"text-align:right\">0.04<\/td>\n<td style=\"text-align:right\">0.004<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">support vector machine<\/td>\n<td style=\"text-align:right\">0.79<\/td>\n<td style=\"text-align:right\">0.48<\/td>\n<td style=\"text-align:right\">0.09<\/td>\n<td style=\"text-align:right\">0.003<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align:left\">multi-layer perceptron<\/td>\n<td style=\"text-align:right\">0.81<\/td>\n<td style=\"text-align:right\">0.49<\/td>\n<td 
style=\"text-align:right\">0.05<\/td>\n<td style=\"text-align:right\">0.01<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n\n\n\n<p>This method relies on the <code>tabulate<\/code> library, an optional pandas dependency that has to be installed separately. Its <code>tablefmt<\/code> argument lets you choose from a range of <a href=\"https:\/\/github.com\/astanin\/python-tabulate\" data-type=\"link\" data-id=\"https:\/\/github.com\/astanin\/python-tabulate\">different table styles<\/a>, e.g. <code>tablefmt=\"grid\"<\/code> for a text grid like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>+------------------------+-----------+-----------+-------+-----------+\n| model                  |   AUC-ROC |   AUC-PRC |   ECE |   runtime |\n+========================+===========+===========+=======+===========+\n| random forest          |      0.83 |      0.46 |  0.04 |     0.004 |\n+------------------------+-----------+-----------+-------+-----------+\n| support vector machine |      0.79 |      0.48 |  0.09 |     0.003 |\n+------------------------+-----------+-----------+-------+-----------+\n| multi-layer perceptron |      0.81 |      0.49 |  0.05 |     0.01  |\n+------------------------+-----------+-----------+-------+-----------+<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Exporting DataFrames to LaTeX<\/h2>\n\n\n\n<p>LaTeX is the de facto standard for typesetting machine learning papers. The <code>df.to_latex()<\/code> method converts a DataFrame into a LaTeX <code>tabular<\/code> environment that can be included directly in your documents; note that the rules it emits, such as <code>\\toprule<\/code>, require the <code>booktabs<\/code> package. 
Using the same example as above<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import pandas as pd\n\n# construct example DataFrame\nresults = pd.DataFrame(\n    {\n        \"model\": [\n            \"random forest\", \n            \"support vector machine\", \n            \"multi-layer perceptron\"\n            ],\n        \"AUC-ROC\": [0.83, 0.79, 0.81],\n        \"AUC-PRC\": [0.46, 0.48, 0.49],\n        \"ECE\": [0.04, 0.09, 0.05],\n        \"runtime\": [0.004, 0.003, 0.01],\n    }\n)\n\n# convert it to LaTeX\nprint(results.to_latex(index=False))<\/pre>\n\n\n\n<p>we can generate the following table:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" width=\"625\" height=\"135\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?resize=625%2C135&#038;ssl=1\" alt=\"\" class=\"wp-image-10552\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?resize=1024%2C222&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?resize=300%2C65&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?resize=768%2C166&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?resize=624%2C135&amp;ssl=1 624w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?w=1348&amp;ssl=1 1348w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-12.png?w=1250&amp;ssl=1 1250w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" 
\/><\/a><\/figure>\n\n\n\n<p>Like <code>df.to_markdown()<\/code>, <code>df.to_latex()<\/code> is flexible and lets you customise the LaTeX output extensively. The following call uses some of its formatting options to align columns, add a caption and label, rename the column headers and standardise number formatting:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># Custom LaTeX table with specialized formatting\nlatex_output = results.to_latex(index=False,\n                                column_format='|l|r|r|r|r|',\n                                caption='Model Performance Metrics.',\n                                label='tab:model_performance',\n                                multicolumn_format='c',\n                                escape=False,\n                                header=[\n                                    'Model', 'AUC-ROC', 'AUC-PRC', \n                                    'ECE', 'Runtime (s)'\n                                    ],\n                                float_format=\"%.4f\")\n\nprint(latex_output)<\/pre>\n\n\n\n<p>resulting in the following table:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-14.png?ssl=1\"><img data-recalc-dims=\"1\" decoding=\"async\" width=\"625\" height=\"170\" loading=\"lazy\" src=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-14.png?resize=625%2C170&#038;ssl=1\" alt=\"\" class=\"wp-image-10554\" srcset=\"https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-14.png?w=915&amp;ssl=1 915w, 
https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-14.png?resize=300%2C82&amp;ssl=1 300w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-14.png?resize=768%2C209&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.blopig.com\/blog\/wp-content\/uploads\/2023\/11\/image-14.png?resize=624%2C170&amp;ssl=1 624w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Gone are the days of manually copy-pasting your results into Overleaf! Both <code>df.to_markdown()<\/code> and <code>df.to_latex()<\/code> are straightforward yet highly customisable tools that let you easily compile and present your results for papers, blog posts and GitHub documentation.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Analysing, comparing and communicating the predictive performance of machine learning models is a crucial component of any empirical research effort. Pandas, a staple in the Python data analysis stack, not only helps with the data wrangling itself, but also provides efficient solutions for data presentation. 
Two of its lesser-known yet incredibly useful features are df.to_markdown() [&hellip;]<\/p>\n","protected":false},"author":86,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","wikipediapreview_detectlinks":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"ngg_post_thumbnail":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[361,14,463,48,227],"tags":[],"ppma_author":[616],"class_list":["post-10548","post","type-post","status-publish","format-standard","hentry","category-data-science","category-howto","category-latex","category-publication","category-python-code"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"authors":[{"term_id":616,"user_id":86,"is_guest":0,"slug":"leo","display_name":"Leo Klarner","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/8a288902cdb15c98aa887d33d06a4061fa3ebe87388f89f76734cf2be40ec362?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/10548","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/users\/86"}],"replies":[{"embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/comments?post=10548"}],"version-history":[{"count":5,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/10548\/revisions"}],"predecessor-version":[{"id":10581,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/posts\/10548\/revisions\/10581"}],"wp:attachment":[{"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/media?parent=10548"}],"wp:te
rm":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/categories?post=10548"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/tags?post=10548"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.blopig.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=10548"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}