11 January, 2013

Markdown, Pandoc and GitHub

I love writing in Markdown, and in general I try to write everything in it first and then convert to HTML or TeX. Pandoc is a fantastic tool for converting from Markdown to other formats, and since it is so versatile I would like to use it for everything. I also use GitHub a lot, which automatically renders Markdown documents.

Unfortunately, Pandoc’s Markdown (PM) and GitHub Flavored Markdown (GFM) are not identical, and I find myself constantly torn between the two, trying to satisfy both. I typically have some code repository hosted on GitHub, with at least one main readme file written in Markdown format. When browsing the repository through the GitHub website, this readme file is automatically converted to HTML. Since this is often the first and only documentation for my code, it is important to me that it renders correctly.

I often want to also convert my Markdown document locally into a self-contained HTML file, and sometimes TeX too, and for this Pandoc is just the best. But herein begin the differences in syntax support:


Tables

GFM likes “pipe-tables”, as defined in Markdown Extra:

| Item      | Value |
| --------- | -----:|
| Computer  | $1600 |
| Phone     |   $12 |
| Pipe      |    $1 |

However, the latest stable releases of Pandoc (1.9.x) support a bunch of other table types, but not pipe-tables. The development version (1.10.x) thankfully does support them, so my current solution is to compile the development version of Pandoc from source. This means my Makefile might not be portable, but at least I know it works for me (though arguably I shouldn’t depend on Pandoc in the first place).
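The only subtle part of the pipe-table syntax is the second row, where a colon marks the alignment of each column. Here is a rough Python sketch of how I read that row. The function name `parse_alignment_row` is made up for illustration and is not part of Pandoc or GitHub, and the sketch ignores edge cases such as escaped pipes inside cells.

```python
def parse_alignment_row(row):
    """Return the alignment of each column in a pipe-table separator row.

    Hypothetical helper for illustration only; real parsers handle
    more edge cases (escaped pipes, missing outer pipes, and so on).
    """
    # Drop surrounding whitespace and the outer pipes, then split into cells.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    alignments = []
    for cell in cells:
        starts = cell.startswith(":")
        ends = cell.endswith(":")
        if starts and ends:
            alignments.append("center")
        elif ends:
            alignments.append("right")
        elif starts:
            alignments.append("left")
        else:
            alignments.append("default")
    return alignments


# The separator row from the table above: the Value column is right-aligned.
print(parse_alignment_row("| --------- | -----:|"))  # prints ['default', 'right']
```

So in the example table, the trailing colon under "Value" is what lines the prices up on the right.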

Definition lists

Quite simply, GFM does not support definition lists. However, they are defined in Markdown Extra, and Pandoc handles them like a champ. At least GFM tends to degrade gracefully in this case, so definition lists don’t bother me too much.
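If the graceful degradation ever stops being good enough, one option would be to flatten definition lists before pushing the readme to GitHub. The following is only a sketch under my own assumptions: it handles the simple Markdown Extra form (a term on one line, followed by lines starting with a colon), and `flatten_definition_lists` is a hypothetical helper, not something Pandoc or GitHub provides.

```python
import re

# A definition line in Markdown Extra starts with a colon and whitespace.
DEF_LINE = re.compile(r"^:\s+(.*)$")


def flatten_definition_lists(markdown):
    """Rewrite simple definition lists as a bold term plus a bullet list.

    Hypothetical helper for illustration; multi-paragraph definitions
    and nested content are not handled.
    """
    out = []
    in_definitions = False
    for line in markdown.splitlines():
        match = DEF_LINE.match(line)
        if match and in_definitions:
            # A further definition for the same term.
            out.append(f"- {match.group(1)}")
        elif match and out and out[-1].strip():
            # The previous non-blank line is the term: make it bold.
            out[-1] = f"**{out[-1].strip()}**"
            out.append("")
            out.append(f"- {match.group(1)}")
            in_definitions = True
        else:
            out.append(line)
            in_definitions = False
    return "\n".join(out)


sample = "Apple\n:   A fruit.\n:   A company.\n\nOrange\n:   A colour."
print(flatten_definition_lists(sample))
```

The output uses only bold text and bullet lists, which both GFM and Pandoc render the same way.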

Pre/post code

When building a standalone HTML or TeX file, you will definitely need to include some before and after code around your actual content. You could have completely separate files for this, and then glue them together in a Makefile. But sometimes this seems like overkill for a simple </body></html>, and I just want to stick them at the bottom of my Markdown file and be done with it. In fact GFM will happily ignore HTML tags, but will still display the content of something like <title>Hello!</title>. And if you try to include some TeX code it only gets worse.
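For the "separate files glued together" route, Pandoc can do the gluing itself: `--standalone` writes the document header and footer, and `--include-before-body` / `--include-after-body` insert extra files around the content. Below is a minimal Python sketch that only builds the command line. The file names (`README.md`, `header.html`, `footer.html`) and the helper `pandoc_command` are assumptions for illustration, not anything from my actual Makefile.

```python
def pandoc_command(source, output, before=None, after=None):
    """Build a pandoc command line as a list of arguments.

    Hypothetical helper; it does not run pandoc, it only assembles
    the arguments so they can be inspected or passed to subprocess.
    """
    cmd = ["pandoc", "--standalone", source, "-o", output]
    if before:
        # File whose contents go right after the opening of the body.
        cmd += ["--include-before-body", before]
    if after:
        # File whose contents go right before the end of the body.
        cmd += ["--include-after-body", after]
    return cmd


# Assumed file names, purely for illustration.
cmd = pandoc_command("README.md", "README.html",
                     before="header.html", after="footer.html")
print(" ".join(cmd))
# To actually run it (requires pandoc to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

This keeps the pre/post code out of the Markdown file, so GitHub never sees it, at the cost of two extra files in the repository.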


Maybe it’s my fault for expecting too many different things from a simple language. But with a Master’s thesis looming, I’m currently thinking through my writing options. While I love the idea of writing in Markdown and using Pandoc to convert to TeX, this lack of a standard really bothers me, and I can’t help wondering if I might be safer with something like txt2tags, which my professor swears by.