Stata Tips #9 - Creating HTML output with community-contributed Stata commands

One of the strengths of Stata is that users can write their own commands and share them within the Stata user community. Install a contributed command and you can start using it as if it were any other part of Stata. A number of commands have been written to make good-looking Stata outputs in HTML, and we’ll explore those in this tip.

HTML is the standard language for creating web pages, but it is also a very useful format for sharing outputs from analyses. Everyone has a web browser, and HTML is interpreted and displayed in much the same way regardless of operating system, browser or screen size, so it provides a common format for all of your audience. Text, images, tables and more can be included. Readers can print the page if they prefer, or save it to a format like PDF as they see fit.

We previously looked at new commands built into Stata 15 that do some of the writing to HTML without requiring you to know any HTML itself.

webdoc

webdoc was written by Ben Jann at the University of Bern. You can obtain it via findit webdoc. It has a web page with links to further information, but let’s explore it with a simple example. Within a single do-file, you put the commands that create the output you want (graphs, tables, text), some HTML or Markdown if you need it, and the commands that start up webdoc and create an HTML file.

We have everything we need in one do-file, we save it, and then we type

webdoc do mydofile.do

in the command line in Stata. webdoc takes care of the rest.

We might begin by initializing the output file we want:

webdoc init myreport.html

Then we can add text to the HTML file by enclosing it in a special webdoc block, indicated like this:

/***
Here are the results from the 2017 survey.
***/

Plain text like that is interpreted by browsers as a “paragraph”, which is to say that you would get the same result if the HTML wrapped it in “p” tags like this:

<p>Here are the results from the 2017 survey.</p>

In fact, if you know some HTML and CSS, you can add those tags with some styling within the block too:

/***
<p style="color:blue;">Here are the results from the 2017 survey.</p>
***/

Here, the style attribute overrides the default settings of the browser for “p” tagged text.

It all gets written into the file, so anything you would write in HTML, you can just add inside the block. But there are some helpful shortcuts. To add a Stata graph, just create the graph in the usual way (no need to export it to a picture file) and include a line outside the block:

webdoc graph

or to include the graph named “graph123”:

webdoc graph graph123

You can also create a table of contents based on the HTML headings used inside the file, by the command

webdoc toc

Any standard Stata output generated outside the block will appear in a fixed-width font in the HTML, so this do-file:

webdoc init example1, replace logall plain
/***
<html>
<head><title>Example 1</title></head>
<body>
<h2>Exercise 1</h2>
<p>Open the 1978 automobile data and run a regression of price on
mileage using the <code>regress</code> command.</p>
***/

sysuse auto
regress price mpg
twoway (scatter price mpg) (lfit price mpg)
webdoc graph, height(100)

/***
</body>
</html>
***/

will create example1.html, which looks like this in the browser:

[Screenshot: example1.html displayed in the browser]

The same tricks we discussed before, using macros to embed changing text and numbers in the output and to loop over many output files, apply with webdoc too.
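As a reminder of how the macro trick can look with webdoc, here is a minimal sketch. It assumes the do-file is saved under the hypothetical name macrodemo.do and run with webdoc do macrodemo.do; it relies on webdoc put, which writes a line of text to the output file after Stata has expanded any macros in it.

```stata
* Sketch only: macrodemo is a hypothetical file name
webdoc init macrodemo, replace plain

sysuse auto, clear
quietly summarize price

* Store a formatted copy of the mean in a local macro
local meanprice : display %9.2f r(mean)

* webdoc put writes one line of text; the macro is expanded first
webdoc put <p>The mean price in these data is `meanprice' dollars.</p>

webdoc close
```

Text inside a /*** ***/ block is copied as written, so a line that needs to change with the data is easier to produce with webdoc put than inside a block.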

esttab

This is another command by Ben Jann, which creates nice-looking tables from regression (and other estimation command) outputs. It is installed, along with some companion commands, via findit esttab (choose the most recent update). New or potential users of Stata are often concerned about having to work with plain-text outputs. Thankfully, there are flexible commands like this to help you out. esttab writes into an HTML file the kind of table you would otherwise spend a lot of time typing and formatting by hand:

eststo clear
sysuse auto
eststo: quietly regress price weight mpg

eststo stores the estimation results from the regression, and these can then be accessed by esttab:

esttab using table1.html, ar2 label replace

table1.html looks like this in the browser:

[Screenshot: table1.html displayed in the browser]
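esttab is at its most useful when several models are stored and shown side by side. The following sketch extends the example above with a second regression; the choice of the extra covariate and the file name table2.html are ours, purely for illustration.

```stata
eststo clear
sysuse auto, clear

* Store two models; each eststo call adds one column to the table
eststo: quietly regress price weight mpg
eststo: quietly regress price weight mpg foreign

* Write both models to one HTML table with adjusted R-squared and variable labels
esttab using table2.html, ar2 label replace
```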

You’ll notice that this is a real HTML table, no longer plain text in a fixed-width font like Courier that merely looks like a table. webdoc has a command to copy in a second HTML file wherever you want, so the table can be added to a report:

webdoc append table1.html

(The tfinsert command also achieves this, and we’ll discuss it below.)

[Screenshot: the report with the esttab table appended]

You can try this yourself with example1.do downloadable here. There are lots of esttab options to tailor the table just the way you want it to look. Don’t forget also that there’s a Stata command called filefilter that allows you to do find-and-replace operations on files, and this is useful to operate on table HTML files en masse efficiently.
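As an illustration of filefilter on table files, the sketch below changes a column heading in every file in the current folder whose name matches table*.html. The search text, the replacement text and the edited_ prefix are all assumptions chosen for the example; filefilter needs the output file to have a different name from the input file.

```stata
* Collect the names of all matching files in the current directory
local tables : dir "." files "table*.html"

* Write an edited copy of each file, leaving the originals untouched
foreach f of local tables {
    filefilter "`f'" "edited_`f'", from("Price") to("Price (USD)") replace
}
```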

Next, we’ll look at three similar packages of commands. The difference from webdoc is that there is no weaving together of do-file code with blocks of text and HTML. Instead, there are individual commands that receive content as arguments and write it into a file. Along the way, they take care of the HTML for you, though two of them allow you to include any bespoke HTML or CSS if you want to control the look of the final report.

ht

This is a package of HTML-writing commands created by Llorenç Quintó and colleagues at the Barcelona Institute for International Health Research and documented in a freely available Stata Journal article. You can install it via findit dm0066. Here’s a simple example:

htopen using myreport.html, replace
sysuse auto, clear
htlog display "We will now conduct a fascinating analysis of the auto.dta data."
htlog regress price mpg
recode mpg (min/25 = 0 "Low/Medium") (25/max = 1 "High"), generate(mpg2)
label var mpg2 "Mileage (level)"
htsummary price mpg2, head median format(%8.2f) test
htsummary weight foreign, median format(%8.2f) test close
htclose

The resulting myreport.html file looks like this:

[Screenshot: myreport.html displayed in the browser]

There are just four commands. htopen and htclose open and close the HTML file. htlog writes the logged results of the command that follows it, in a fixed-width font without formatting (there is a trick to impose more formatting using “span” tags in HTML, which we’ll gloss over here). htsummary writes a table of summary statistics, possibly split by groups in the data. (See example2.do, downloadable here.) Apart from the summary statistics table, ht is limited to inserting Stata output from the results window. Next we consider two more flexible packages of commands.

htmlutil

htmlutil was written by Roger Newson of Imperial College London, a medical statistician and prolific contributor of new Stata commands, who uses HTML for reporting all his analyses. It can be installed via findit htmlutil. Like ht, there are a limited number of commands, but it has the ability for you to add the HTML of your choice from the do-file or the command line. There are htmlopen and htmlclose commands, corresponding to htopen and htclose described as part of the ht package, and also htmllink to insert hyperlinks and htmlimg to insert an image. There is no equivalent of htlog. You will find it essential to also install Newson’s commands tfinsert and listtab, which you can get with ssc install tfinsert and ssc install listtab. tfinsert allows you to add HTML one line at a time and listtab can add tables in HTML format.
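To give a flavour of listtab, here is a rough sketch that writes the first five cars to an HTML table. It is based on our reading of the listtab help file (the rstyle(), headlines() and footlines() options), so check help listtab before relying on it; the file name cars.html and the column headings are made up for the example.

```stata
sysuse auto, clear

* rstyle(html) wraps each observation in <tr> and <td> tags;
* headlines() and footlines() add the lines before and after the rows
listtab make price mpg in 1/5 using cars.html, replace rstyle(html) ///
    headlines("<table border=1>" "<tr><th>Make</th><th>Price</th><th>Mileage</th></tr>") ///
    footlines("</table>")
```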

html-reports

Similarly, Robert Grant of St George’s Medical School wrote a set of commands to help him speed up common tasks of writing into HTML. This is stored at https://github.com/robertgrant/html-reports

To install it, you should download the html-reports.do file and then load the commands it contains with

do html-reports.do

Before starting, open a new file to receive the output:

file open con using "myoutput.html", write text replace

Then, the command html_start writes the necessary header information to start up the file. This includes CSS that controls the look of the report; ht and htmlutil do not have this. html_unitab, html_multitab and html_xtab create various forms of table, html_h and html_p insert headings and paragraphs respectively, and html_img inserts image files. Again, there is no equivalent of htlog. When you are finished, html_close finishes off the required HTML, and then you have to close the file handle with file close con.
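If the idea of a file handle is new to you, the sketch below shows the same pattern using only Stata’s built-in file commands. It is not html-reports code; it simply illustrates the kind of writing that such commands do on your behalf. The heading and paragraph text are invented for the example.

```stata
sysuse auto, clear
quietly summarize price
local m = r(mean)

* Open the output file under the handle "con"
file open con using "myoutput.html", write text replace

* Each file write call sends text (and, here, one formatted number) to the file
file write con "<html><body>" _n
file write con "<h2>Price summary</h2>" _n
file write con "<p>Mean price: " %9.2f (`m') "</p>" _n
file write con "</body></html>" _n

* Nothing else can safely use the file until the handle is closed
file close con
```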

The interesting thing about html-reports for people who know some HTML and CSS is that you can open up html-reports.do and change what it writes directly, especially in the html_start command where the CSS in the header is easy to spot. In this way, you can change it to the branding choices of your own organization. A logo, background colour or preferred font can be added.

Picking and mixing (for more confident users)

Because the three packages above all write to a file identified with a file handle, we can swap between them to get whatever result we want. You might like the headings and tables from html-reports but the results logging from ht, and the model comparison table from esttab and webdoc. Here’s a way of weaving them together. You can follow along in the file example3a.do, downloadable here.

First, open the new file you want (nicereport.html) with standard Stata commands, using the file handle “con”, then write the headings, text and tables with html-reports, which will look for a file handle called con by default. Don’t forget the trick to have incrementing table and figure numbers.

Close the file handle down, and use ht to write out the regression logged output to another file (delete_me.html). Use the notag option in htopen and htclose, so that it only writes the basic code you want (you already have header and formatting code from html-reports). Then htclose that, and use tfinsert to add it into nicereport.html. Finally, call another do-file (example3b.do downloadable here) with webdoc do. The code in example3b.do uses webdoc init with the append option, then esttab to make the table, then webdoc append to add it to nicereport.html, then webdoc close to close the file. It should now have all the parts you wanted to combine.

There are just a few pointers to remember:

  • Unless you are starting with webdoc, you need to give webdoc init the “append” option so it does not wipe out what was created before.
  • Put webdoc instructions into other do-files, and then call them with webdoc do from your main do-file; see the attached examples.
  • Unless you are starting with ht, get it to write out to a second file and then insert that into the first file using tfinsert.
  • html-reports commands can have a file handle specified in the handle() option. ht and htmlutil do not allow this, so you have to close the file connection from html-reports before switching to one of the others (including webdoc).
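For orientation, the webdoc step described above might look something like the following. This is our sketch of the sequence the text describes (webdoc init with append, then esttab, then webdoc append, then webdoc close), not a copy of the downloadable example3b.do, and the second regression is an assumption added to make the comparison table worthwhile.

```stata
* Run from the main do-file with: webdoc do example3b.do
* append keeps what html-reports and ht have already written
webdoc init nicereport.html, append

eststo clear
sysuse auto, clear
eststo: quietly regress price weight mpg
eststo: quietly regress price weight mpg foreign
esttab using table1.html, ar2 label replace

* Copy the finished table into the report, then release the file
webdoc append table1.html
webdoc close
```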

Thank you to Robert Grant @robertstats for this article

Timberlake Consultants