How to get "smart" punctuation working in Zola---the hacky way


In case you haven't noticed, I recently upgraded the templates and CSS on this website to get a much nicer design. I was previously using the After Dark theme. While it looked pretty at first glance, it's not a typographical wonder: all of the text is set at exactly the same size, which makes it harder to read. Plus, monospaced fonts just aren't the right choice for long-form content. In the process of upgrading my site's design, I noticed that proper Unicode smart quotes---handled out of the box with Jekyll---were missing altogether from Zola. If you have JavaScript enabled, you can see that I came up with an imperfect solution to this problem on the front end.

As with most problems, I started with a Google search. The top relevant result was a GitHub issue from a few years ago. Basically, Wesley Moore wanted Zola (which he uses for his site) to have something like SmartyPants for Jekyll. Because Zola uses pulldown-cmark to render Markdown efficiently, it's that project's responsibility to implement the feature. Unfortunately, although an issue there has been open for years, the feature has yet to ship. It looks like they're making progress, but it will probably be a while before it lands.

Since getting quotes and dashes converted to the correct Unicode form in the CommonMark parser wasn't going to happen anytime soon, I realized that I had a few other options for solving the problem:

  1. Manually replace straight quotes with curly quotes in source files.
  2. Write or find a converter to do that for me as a preprocessing step.
  3. Write or find a converter to do it on the HTML output as a postprocessing step.
  4. Fix it in the browser.

#1 would be a lot of work. Programmers generally want to automate everything they can; manually altering source files is ridiculous. I managed to keep my source files nearly identical (minus front matter) when switching from Jekyll to Zola in the first place, so I don't want to go back on that now.

#2 and #3 are valid options, but I didn't want to complicate my build step too much. Plus, I would have to make sure that the conversion did not occur inside code blocks or anywhere else that should be kept verbatim. Performing the conversion on the input would effectively require parsing Markdown, and doing it on the output would certainly require parsing HTML.
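To make that concern concrete, here is a minimal sketch of what a preprocessing step might look like. The function name `convertDashesOutsideFences` is hypothetical, and the sketch only tracks fenced code blocks; it knows nothing about inline code spans, indented code blocks, thematic breaks, or YAML front matter delimiters, which is exactly why a real version drifts toward being a Markdown parser.

```javascript
// Hypothetical helper: replace "---" with an em dash in Markdown source
// while leaving fenced code blocks untouched.
// Known gaps (assumptions of this sketch): inline code spans, indented code
// blocks, "---" thematic breaks, and front matter delimiters are not handled.
function convertDashesOutsideFences(markdown) {
  let inFence = false;
  return markdown
    .split("\n")
    .map((line) => {
      // A line that opens with three or more backticks or tildes toggles
      // the "inside a fenced block" state and is passed through unchanged.
      if (/^\s*(`{3,}|~{3,})/.test(line)) {
        inFence = !inFence;
        return line;
      }
      return inFence ? line : line.replace(/---/g, "\u2014");
    })
    .join("\n");
}
```

Every item in the list of known gaps is another special case to bolt on, and each one needs a little more knowledge of Markdown's grammar than the last.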

#4 seemed like the easiest and least painful solution. It would be a good example of graceful degradation: users without JavaScript enabled would get a perfectly readable but less visually pleasing experience. Even those with CSS disabled would find the straight HTML fine to read. For users with everything turned on, there would be a nearly imperceptible flash on the first page load as the JavaScript modified the appropriate content.

As usual, I like to avoid writing my own solution when a problem has been thoroughly solved by someone else. In the case of smart quotes, it's actually a pretty difficult problem. You need to know what counts as "start" and "end" punctuation, even in complex cases like 'breviations (where the single quote should be the "closing" variant). Luckily, someone had already solved the problem in the form of a JS library. Smartquotes.js is a good solution to the problem, since it handles all of these subtle typographical details correctly. It does not take care of dashes, ellipses, or accented characters, however.
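To see why the problem is harder than it looks, consider a naive heuristic. The sketch below is not how Smartquotes.js works; `naiveSmartQuotes` is a hypothetical function that treats any quote following the start of the text, whitespace, or an opening bracket as "opening" and every other quote as "closing".

```javascript
// Naive heuristic, NOT the Smartquotes.js algorithm: decide open vs. close
// purely from the character that precedes the quote.
function naiveSmartQuotes(text) {
  return text
    .replace(/(^|[\s(\[{])"/g, "$1\u201C") // opening double quote
    .replace(/"/g, "\u201D")               // every remaining double quote closes
    .replace(/(^|[\s(\[{])'/g, "$1\u2018") // opening single quote
    .replace(/'/g, "\u2019");              // remaining singles close (and apostrophes)
}

// Ordinary quotations and mid-word apostrophes come out fine.
console.log(naiveSmartQuotes('"It\'s fine," she said.'));

// A leading apostrophe is treated as an opening quote, which is the wrong mark.
console.log(naiveSmartQuotes("'breviations"));
```

The second call produces a left single quote where a right one (an apostrophe) belongs. Cases like that are what a dedicated library has to get right.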

My initial solution for the em dash---the only dash form that I frequently use---was to modify the following section of the Zola theme I forked:

{% macro title(page) %}
    <header>
        <h2 class="c-title c-article__title"><a href="{{ page.permalink }}">{{ page.title }}</a></h2>
        <p class="c-article__meta">
            <time datetime="{{ page.date | date(format="%F") | safe }}">
                {{ page.date | date(format="%F") }}
            </time>
            {% if page.extra.author -%}
              by {{ page.extra.author}}
            {%- endif %}
            {{ " — " }}{{ page.reading_time }} min read
        </p>
    </header>
{% endmacro title %}


{% macro polish(content) %}
    {{ content |
       replace(from="😄", to="😄") |
       replace(from="🍣", to="🍣") |
       replace(from="🍙", to="🍙") |
       replace(from="🍜", to="🍜") |
       replace(from="🥚", to="🥚") |
       replace(from="👍", to="👍") |
       replace(from="👍", to="👍") |
       replace(from="😛", to="😛") |
       replace(from="😀", to="😀") |
       replace(from="😦", to="😦") |
       replace(from="❤", to="❤") |
       replace(from="😞", to="😞") |
       replace(from="🎉", to="🎉") |
       replace(from="🥟", to="🥟") |
       safe
    }}
{% endmacro polish %}

In the templates/post_macros.html file (shown above), the author defines a polish() macro that converts several kinds of emoji representations into their actual Unicode forms. While that particular feature didn't (and doesn't) matter too much to me, I figured that I could also use it to convert --- to —. Notice how there are Unicode emoji on both the from= and to= sides of each of those lines? In the original source code, those lines look different. They appear that way here because of the issue with the polish() macro: it gets applied to all content equally, including code blocks.

The problem with applying the transformation inside code blocks is that hyphens are frequently used as ASCII line-drawing characters. For example, one of my previous articles contained a diagram that used them for horizontal separators. With a replace(from="---", to="—") line in there, that diagram would break.
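The breakage is easy to reproduce outside of Zola. In this small sketch, `blanketReplace` is a hypothetical stand-in for the template filter, and the `diagram` string is a made-up separator row, not the one from my earlier article.

```javascript
// Stand-in for a blanket replace(from="---", to=<em dash>) filter.
const blanketReplace = (s) => s.replace(/---/g, "\u2014");

const prose = "the em dash---the only dash form I use";
const diagram = "+-----+-----+"; // made-up ASCII separator row

// The prose gets the intended em dash.
console.log(blanketReplace(prose));

// Each five-hyphen run becomes an em dash followed by "--",
// so the row no longer lines up with the rest of the diagram.
console.log(blanketReplace(diagram));
```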

Given that this solution wouldn't work, just as it wouldn't have for smart quotes, I decided to implement em dash conversions in Smartquotes.js myself.

I'm not sure I ended up doing it correctly. I tried putting my conversion in several different parts of the Smartquotes.js source code and found only one spot where it worked without the library also trying to add an "end" dash, as it would for quotes. You can check out my fork, where I got it working, although the code is a little less pretty (and the unit tests don't work).
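For readers who only want the dash conversion, the general idea can be sketched without touching the library at all. This is not the code from my fork. It is a standalone illustration with a hypothetical `convertDashes` function and an assumed list of tags to skip, and it will miss a --- that is split across adjacent nodes.

```javascript
// Elements whose text should stay verbatim (an assumed list for this sketch).
const SKIP_TAGS = new Set(["CODE", "PRE", "SCRIPT", "STYLE", "TEXTAREA"]);

// Recursively replace "---" with an em dash in text nodes, skipping any
// subtree rooted at one of the SKIP_TAGS elements.
function convertDashes(node) {
  const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
  const TEXT_NODE = 3;    // same value as Node.TEXT_NODE
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = node.nodeValue.replace(/---/g, "\u2014");
  } else if (
    node.nodeType === ELEMENT_NODE &&
    !SKIP_TAGS.has(node.nodeName.toUpperCase())
  ) {
    // Copy the child list first so the loop is not affected by live updates.
    for (const child of Array.from(node.childNodes)) {
      convertDashes(child);
    }
  }
}

// In a browser, run once the document has been parsed.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    convertDashes(document.body);
  });
}
```

Because the function only relies on nodeType, nodeName, nodeValue, and childNodes, it can be exercised with plain objects outside a browser, which makes it easier to test than something that reaches for the document directly.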


Regardless of whether you're a Zola user, adding this simple bit of JavaScript code makes your site's punctuation look all the more professional. Next, I might add support for ellipses and fix accent handling.