Convert Markdown to HTML with Python

Introduction

Python-Markdown is a package that converts content in Markdown format to HTML. In this example, we will look at how to convert Markdown to HTML and automatically generate a table of contents. We will also look at using the command-line tool to convert content, and cover how to use fenced code blocks and source code syntax highlighting.

Setup

Install the markdown library with pip. I am using Python 3.8 in this example.

python -m pip install markdown

Convert Markdown to HTML in Python

The easiest way to convert is to use a string for input and get a string back as output.

import markdown

# Simple conversion in memory
md_text = '# Hello\n\n**Text**'
html = markdown.markdown(md_text)
print(html)

To use files for input and output instead:

import markdown

markdown.markdownFromFile(
    input='input.md',
    output='output.html',
    encoding='utf8',
)
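To try the file-based call without preparing any files first, here is a minimal sketch that writes a small Markdown file into a temporary directory, converts it, and reads the result back. The temporary directory and the variable names are assumptions made for this demo only.

```python
import tempfile
from pathlib import Path

import markdown

# Work in a throwaway directory so no real files are touched (demo assumption).
with tempfile.TemporaryDirectory() as tmp_dir:
    input_path = Path(tmp_dir) / 'input.md'
    output_path = Path(tmp_dir) / 'output.html'
    input_path.write_text('# Hello\n\n**Text**', encoding='utf-8')

    # Same call as above, pointed at the temporary files.
    markdown.markdownFromFile(
        input=str(input_path),
        output=str(output_path),
        encoding='utf-8',
    )

    # Read the generated HTML back before the directory is cleaned up.
    file_html = output_path.read_text(encoding='utf-8')

print(file_html)
```

The output is an HTML fragment, not a full page, so you would normally place it inside your own page template.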

Convert Markdown to HTML with the command-line tool

The Python-Markdown CLI tool is convenient when you just want to convert a document without embedding the code in a larger application.

The easiest way to invoke it is to run it as a module with python -m. When no output file is given, the HTML is written to STDOUT. For example:

# Convert from a file
python -m markdown input.md

# Convert using STDIN/STDOUT
cat input.md | python -m markdown > output.html

# Load extensions with -x (e.g. Table of contents)
python -m markdown -x toc input.md
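If you want to drive the same CLI from a Python script, for example in a build step, one possible approach is to call it through subprocess. This is only a sketch: it assumes the markdown package is installed for the interpreter running the script, and the sample input text is made up.

```python
import subprocess
import sys

# Run the CLI as a module and feed the Markdown through STDIN.
# sys.executable keeps the call inside the current interpreter's environment.
result = subprocess.run(
    [sys.executable, '-m', 'markdown', '-x', 'toc'],
    input='# Hello\n\nSome *text*.\n',
    capture_output=True,
    text=True,
    check=True,  # raise an error if the CLI exits with a non-zero status
)

# The converted HTML arrives on STDOUT.
cli_html = result.stdout
print(cli_html)
```

For most applications calling markdown.markdown() directly is simpler; the subprocess route mainly helps when you want the exact behavior of the command line.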

Generate a table of contents (TOC)

To generate a TOC, we need to use the toc extension. There are a number of other extensions available with the package that you can check out at https://python-markdown.github.io/extensions/.

You convert the same way as before, except this time you pass in an extra parameter to include the extension:

import markdown

md_text = '[TOC]\n# Title\n**text**'
html = markdown.markdown(md_text, extensions=['toc'])
print(html)
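To see what the extension actually produces, the following sketch uses the markdown.Markdown class instead of the markdown.markdown() shortcut, so that the converter object is still available after conversion. The sample headings are made up for illustration.

```python
import markdown

md_text = '[TOC]\n\n# Title\n\n## Section\n\n**text**'

# Keeping the Markdown instance around gives access to data set by extensions.
md = markdown.Markdown(extensions=['toc'])
html = md.convert(md_text)

# The toc extension gives each heading an id and replaces [TOC] with a
# <div class="toc"> block of nested links.
print(html)

# The same TOC markup is also stored on the instance as `md.toc`,
# which is handy when the TOC belongs in a sidebar instead of the body.
print(md.toc)
```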

To customize options, you need to import the markdown.extensions.toc.TocExtension class and pass an instance of it in the extensions parameter. See the following example. Read more at https://python-markdown.github.io/extensions/toc/#usage

In your Markdown, add [TOC] where the table of contents should go.

import markdown
from markdown.extensions.toc import TocExtension

md_text = '[TOC]\n# Title\n**text**'
# baselevel=2 sets headings to start at `h2`
html = markdown.markdown(md_text, extensions=[TocExtension(baselevel=2, title='Contents')])
print(html)
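Another option described on the toc usage page is toc_depth, which limits how many heading levels appear in the TOC. The sketch below assumes a reasonably recent Python-Markdown release; the sample headings are hypothetical.

```python
import markdown
from markdown.extensions.toc import TocExtension

md_text = '# Title\n\n## Section\n\n### Details\n\ntext'

# toc_depth=2 keeps h1 and h2 in the TOC and leaves out h3 and below.
md = markdown.Markdown(extensions=[TocExtension(toc_depth=2)])
body_html = md.convert(md_text)

# The h3 heading is still rendered in the body; it is only omitted from the TOC.
print(md.toc)
```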

Fenced code blocks

By default, you make a code block by indenting every line by 4 spaces. Personally, I prefer using three backticks (```) to enclose code without indenting. The opening fence also gives you a place to declare which language is being used.

To use the triple backticks you need to enable the fenced_code extension, which already comes with Python-Markdown. It wraps the code block in <pre> and <code> tags.

import markdown

md_text = """
# Title

```python
# some code block
```
"""
html = markdown.markdown(md_text, extensions=['fenced_code'])
print(html)
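One detail worth checking is that the contents of a fenced block are HTML-escaped, so markup inside the block is displayed as text instead of being rendered. A small sketch, with a made-up snippet as input:

```python
import markdown

# Build the fence in code so this example does not need nested backticks.
fence = '`' * 3
md_text = f'{fence}html\n<b>bold</b>\n{fence}\n'

html = markdown.markdown(md_text, extensions=['fenced_code'])

# The <b> tag inside the block comes out as &lt;b&gt; within <pre><code>.
print(html)
```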

TIP: If you need to write a triple backtick code block within your Markdown code, you can wrap the outermost codeblock with additional backticks. For example, use a set of 4 or 5 instead of 3 like this:

This is a **Markdown** file.

Here is an example of some code:

````python
# This python code contains triple backticks
markdown_text = """
```python
print(5 * 5)
```
"""
print(markdown_text)
````
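To confirm that the longer outer fence behaves as described, here is a sketch that builds a similar document in Python and converts it. The fences are assembled from single backticks so the example itself stays readable; the variable names are illustrative.

```python
import markdown

inner = '`' * 3  # fence used by the embedded example
outer = '`' * 4  # longer fence that wraps the whole block

md_text = '\n'.join([
    'Here is an example of some code:',
    '',
    f'{outer}python',
    f'{inner}python',
    'print(5 * 5)',
    inner,
    outer,
])

nested_html = markdown.markdown(md_text, extensions=['fenced_code'])

# The inner triple backticks survive as literal text inside one code block.
print(nested_html)
```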

Source code syntax highlighting

To build on the previous section using fenced_code, you can add syntax highlighting with the codehilite extension. This extension already comes with Python-Markdown, but it depends on another Python library named Pygments.

Install pygments with pip:

python -m pip install pygments

Here is an example of generating HTML with both fenced_code and codehilite extensions together.

import markdown

md_text = """
```python hl_lines="1 3"
# some Python code
hi = 'Hello'
print(hi)
```
"""
html = markdown.markdown(md_text, extensions=['fenced_code', 'codehilite'])
print(html)

When you add the codehilite extension, the code block is wrapped in an element with the class codehilite, and the individual tokens are wrapped in spans with short class names. Those classes have no visible effect until you supply CSS for them. You could write your own styles, but Pygments comes with several style sets you can use. Use its command-line tool, pygmentize, to list the available color themes and to generate the CSS for one of them. Save the CSS output to a .css file and link it in your HTML as usual.

# List themes. E.g. `default`, `monokai` or `solarized-dark`
pygmentize -L
# Generate CSS styles that will apply to `.codehilite` class
pygmentize -S monokai -f html -a .codehilite > static/css/codehilite.css

In the HTML:

<link rel="stylesheet" href="/static/css/codehilite.css"/>
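If you would rather generate the stylesheet from Python than from the shell, Pygments exposes the same functionality through its HtmlFormatter class. The sketch below is roughly what the pygmentize command above does; embedding the result in a style tag is just one assumed way of using it.

```python
from pygments.formatters import HtmlFormatter

# Roughly equivalent to: pygmentize -S monokai -f html -a .codehilite
css = HtmlFormatter(style='monokai').get_style_defs('.codehilite')

# For a single-file page, the rules can be embedded instead of linked.
style_tag = f'<style>\n{css}\n</style>'

print(style_tag)
```

Writing the css string to static/css/codehilite.css instead gives the same file the shell command produces.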

Conclusion

After reading this, you should understand how to convert Markdown content to HTML, using either strings or files, and how to automatically generate a table of contents. You should also understand how to use the CLI tool to convert content, how to include extensions, and how to apply fenced code blocks and source code syntax highlighting with Pygments.
