Creating website and blogging in Org mode

I discovered the power of Org mode when I started working on my Ph.D. thesis, which is itself written entirely in Org mode. Indeed, one can easily export an Org mode document to an HTML page or to a PDF document typeset in LaTeX. Recently, I set out to overhaul my personal website and decided to produce it using Org. So, in this post, I detail the whole process step by step.

Project's structure

The idea here is to build a static HTML website generated from a collection of Org [1] documents. On the one hand, the site has a couple of content pages such as Home, About and so on. On the other hand, it features a small blog as well.

When it comes to the style of the site, I am looking for simplicity. Although there are some great Org HTML templates [2], they are better suited to standalone pages than to a complete website with navigation. So, I prefer to define my own tiny CSS style sheet.

The file structure of the project is described below.

.
├── .guix
│   ├── channels.scm
│   └── manifest.scm
├── blog
│   ├── creating-website-and-blogging-in-org-mode.org
│   ├── attachments
│   └── ...
├── images
│   ├── marek.jpg
│   └── ...
├── public
│   └── favicon.ico
├── styles
│   ├── custom.css
│   └── htmlize.css
├── shared
│   ├── footer.html
│   └── header.html
├── about.org
├── index.org
├── publish.el
├── README.md
├── references.bib
├── research.org
└── teaching.org

The .guix folder contains the description of the software environment required for building the HTML website (see more in Software environment). The blog folder holds the Org documents of the blog posts. The attachments subfolder contains static attachments related to the blog posts. images naturally contains all the image files featured on the site. public can be used to store the exported website (see more in General configuration). The custom CSS sheets reside in styles where htmlize.css stylizes syntax highlighting in source code blocks and custom.css defines the look and feel of all the other elements of the website. shared holds the common static header and footer files. The Org documents corresponding to the content pages are stored in the root of the project's folder as well as the global LaTeX bibliography file references.bib for the entire website. Finally, the Emacs Lisp script publish.el controls the publishing of the website.

Software environment

I use the GNU Guix [3] transactional package manager, which allows for a self-contained, executable description of the whole software environment required for running the publishing Emacs Lisp script. The packages to include in the environment are listed in a manifest file [4], here .guix/manifest.scm.

(specifications->manifest
 (list "emacs"
       "emacs-org"
       "emacs-org-ref"
       "emacs-citeproc-el"
       "emacs-htmlize"
       "git"
       "bash"
       "coreutils"
       "tar"))

To get the same version of Guix and of every single package each time I enter the environment, I also use a channel file [5], here .guix/channels.scm. It lists the Git repositories providing package definitions, i.e. channels in Guix terminology, needed to build the publishing environment, together with the associated revisions.

(list
 (channel
  (name 'guix)
  (url "https://git.savannah.gnu.org/git/guix.git")
  (branch "master")
  (commit "abeffc82379c4f9bd2e6226ea27453b22cb4e0c8")
  (introduction
   (make-channel-introduction
    "9edb3f66fd807b096b48283debdcddccfea34bad"
    (openpgp-fingerprint
     "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

Note that I use Guix also at work to improve the reproducibility of my numerical experiments (see more in Research).

Publishing script

The core of the project is the Elisp publishing script publish.el responsible for generating the final HTML source of the site.

It begins by importing the Emacs packages providing:

Org mode support,
```
(require 'org)
```
HTML export backend,
```
(require 'ox-html)
```
publishing functions,
```
(require 'ox-publish)
```
engine for exporting source code blocks to HTML,
```
(require 'htmlize)
```

bibliography support.

(require 'oc)
(require 'citeproc) ;; for HTML
(require 'oc-csl) ;; for HTML

Then, I define a utility function file-dates allowing me to get the dates of the first publication and of the last modification of an Org document.

At first, the function tries to find the dates in the Git log.

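(defun file-dates (file)
  "Return the first publication and last modification dates of FILE.
The result is a list of two strings formatted as DD/MM/YYYY."
  (let*
      ;; Date of the oldest commit touching FILE (empty if there is none).
      ((first-commit-date
        (shell-command-to-string
         (concat
          "git log --reverse --pretty=\"format:%cD\""
          " "
          (shell-quote-argument file)
          " 2> /dev/null | head -n 1")))
       ;; Date of the most recent commit touching FILE.
       (last-commit-date
        (shell-command-to-string
         (concat
          "git log --pretty=\"format:%cD\""
          " "
          (shell-quote-argument file)
          " 2> /dev/null | head -n 1")))
       ;; Filesystem fallback; `format-time-string' accepts the timestamp
       ;; returned by `file-attribute-modification-time' as is.
       (last-modification-date
        (format-time-string
         "%d/%m/%Y"
         (file-attribute-modification-time
          (file-attributes file)))))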

If there is no commit involving the file, I take the last modification timestamp recorded by the filesystem.

    (list
     (if (string= first-commit-date "")
         last-modification-date
       (substring
        (shell-command-to-string
         (concat
          "date -d \""
          first-commit-date
          "\" +%d/%m/%Y")) 0 -1))
     (if (string= last-commit-date "")
         last-modification-date
       (substring
        (shell-command-to-string
         (concat
          "date -d \""
          last-commit-date
          "\" +%d/%m/%Y")) 0 -1)))))
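
As an alternative to calling the external date command, the commit date printed by Git could be parsed by Emacs itself. The following is only a minimal sketch: the helper name my/rfc2822-to-dmy is hypothetical and not part of publish.el, and it formats the date in UTC, whereas date -d uses the local time zone.

```elisp
(require 'time-date) ; provides `date-to-time'

(defun my/rfc2822-to-dmy (date-string)
  "Convert DATE-STRING, as printed by git's %cD, to DD/MM/YYYY in UTC.
Hypothetical helper for illustration; not part of the original script."
  ;; The last argument t asks `format-time-string' for Universal Time.
  (format-time-string "%d/%m/%Y" (date-to-time date-string) t))

(my/rfc2822-to-dmy "Tue, 1 Mar 2022 10:00:00 +0100")
;; → "01/03/2022"
```

This would also remove the dependency on GNU date, at the cost of choosing the time zone explicitly.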

Blog post synopsis

Each blog post may contain a synopsis used to introduce the content of the post in the list of blog posts:

Figure 1: Excerpt of the list of blog posts.

In the source Org document, the synopsis text must be enclosed between the #+BEGIN_SYNOPSIS and #+END_SYNOPSIS tags.

For extracting the synopsis, I define the function get-post-synopsis taking as argument a blog-post.

(defun get-post-synopsis (blog-post)

The first thing to do is to load the Org file pointed to by blog-post

  (with-temp-buffer
    (insert-file-contents blog-post)

and move the cursor to the beginning of the document.

    (goto-char (point-min))

In the core of the function, I use the markers beg and end to select the area in the buffer between the first and the last character of the synopsis. To exclude the newlines after the opening and before the closing tag, I move the starting marker forward by one character and the ending marker backward by one character.

    (let
        ((beg (+ 1 (re-search-forward "^#\\+BEGIN_SYNOPSIS$")))
         (end (- (progn
                   (re-search-forward "^#\\+END_SYNOPSIS$")
                   (match-beginning 0)) 1)))

At the end, the function returns the sub-string of the buffer corresponding to the area between the two markers. At the same time, I need to remove any citations from the sub-string in order to prevent artifacts from appearing on export.

      (replace-regexp-in-string "[ ]\\[cite.*\\]" ""
                                (buffer-substring beg end)))))
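
One caveat of the regular expression above is that .* is greedy: on a line containing another closing bracket after the citation, everything up to that last bracket is removed. A possible variant, sketched here under the assumption that citations never contain a closing bracket themselves, stops at the first one. The helper name my/strip-citations is hypothetical and not part of publish.el.

```elisp
(defun my/strip-citations (text)
  "Remove Org citations such as \" [cite:@key]\" from TEXT.
Hypothetical variant of the expression used in get-post-synopsis:
[^]]* stops at the first closing bracket instead of the last one."
  (replace-regexp-in-string "[ ]\\[cite[^]]*\\]" "" text))

(my/strip-citations "Org [cite:@orgmode] exports to HTML [2].")
;; → "Org exports to HTML [2]."
```

With the original expression, the same input would be reduced to "Org." because the match extends to the bracket closing [2].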

List of blog posts

For handy access to blog posts, the site features a page containing the list of all blog posts with a short synopsis, the date of publishing, the author's name and the link to the post in the form of a button (see Figure 1).

To create this page, we use the sitemap functionality in Org mode. The default appearance of the sitemap is rather basic. To customize it so the list of blog posts suits the design of the site, we need to define our own functions for formatting the sitemap (list of blog posts) and its items (blog posts).

Formatting items

The function format-blog-item changes the formatting of the sitemap item (blog post) entry belonging to project (see Project components). Note that entry is the absolute path to the Org file of the blog post being processed. Also, I don't use the sitemap style argument here.

(defun format-blog-item (entry style project)
  (let

Unfortunately, when the function is called by the Emacs export machinery, the absolute path provided in entry is incorrect. It lacks the parent folder blog because Emacs thinks it is running in the project's root although the current working folder, when exporting blog posts, is blog (see Blog). Therefore, I have to re-include blog/ into the path.

For example, if the initial entry holds /home/marek/src/felsoci.sk/post.org, I need to transform it to /home/marek/src/felsoci.sk/blog/post.org.

      ((fixed-entry
        (concat
         (file-name-directory entry) "blog/" (file-name-nondirectory entry)))

Also, before actually formatting the sitemap entry, I need to determine its first publication and last modification dates.

       (entry-dates
        (file-dates
         (concat
          (plist-get (cdr project) :base-directory)
          "/"
          entry))))

Finally, return the Org string corresponding to the sitemap entry formatted using the format function similar to sprintf in C.

    (format "
@@html:<h2 class=\"post-title\">@@
[[file:%s][%s]]
@@html:</h2><span class=\"post-metadata\">@@
Published on %s by %s%s
@@html:</span>@@

%s

@@html:<a href=\"@@%s@@html:.html\"><button>Read more</button></a>@@
"

All of the %s are replaced by the values of the arguments following the string to format:

the path to the blog post Org document,
```
            entry
```
the title of the post found in the Org document under the #+TITLE directive,
```
            (org-publish-find-title entry project)
```
the formatted date of publishing,
```
            (nth 0 entry-dates)
```

the author's name extracted from the project property list project,

            (substring
             (format "%s"
                     (org-publish-find-property entry :author project)) 1 -1)

the formatted date of last modification, if any,

            (if (string= (nth 0 entry-dates) (nth 1 entry-dates))
                ""
              (concat " (updated on " (nth 1 entry-dates) ")"))

the synopsis of the blog post retrieved using our custom parsing function get-post-synopsis,
```
            (get-post-synopsis fixed-entry)
```
the path to the blog post file without extension, because the link is not converted into an HTML link during the export: we do not use a standard Org-formatted link such as [[target][text]] but a button.
```
            (file-name-sans-extension entry))))
```
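
The path correction stored in fixed-entry can be tried in isolation. This is a minimal sketch that wraps the same expression in a hypothetical helper, my/fix-entry-path, which is not part of publish.el:

```elisp
(defun my/fix-entry-path (entry)
  "Return ENTRY with \"blog/\" inserted before the file name.
Hypothetical wrapper around the expression bound to fixed-entry."
  (concat (file-name-directory entry) "blog/" (file-name-nondirectory entry)))

(my/fix-entry-path "/home/marek/src/felsoci.sk/post.org")
;; → "/home/marek/src/felsoci.sk/blog/post.org"

;; `file-name-directory' returns nil for a bare file name and `concat'
;; treats nil as an empty sequence, so relative names work as well.
(my/fix-entry-path "post.org")
;; → "blog/post.org"
```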

Formatting the list

The function format-blog-sitemap replaces the default function for generating the sitemap, which represents the list of blog posts in our case. It outputs an Org document with the title title. The blog posts formatted by the function format-blog-item are available as a list through the posts argument.

Actually, the function represents a concatenation of the title

(defun format-blog-sitemap (title posts)
  (concat
   "#+TITLE: " title "\n\n"

and the items of posts separated by a newline character and a horizontal line in the resulting Org document (see Figure 1).

Note that posts is a nested list having the form:

‘unordered’
‘list of possibly nested posts’
‘list of possibly nested posts’
…

Therefore, I have to transform it into a simple list containing only the leading elements of the nested post lists. To achieve this, I first strip the leading unordered symbol using cdr. Then, I apply seq-filter with car as the predicate, which discards any sub-list whose leading element is empty. Finally, the function passed to mapconcat takes the car of each remaining sub-list, i.e. its leading element.

   (mapconcat
    (lambda (post)
      (format "%s\n" (car post)))
    (seq-filter #'car (cdr posts))
    "\n")))
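
To make the transformation concrete, here is the same expression applied to a small hand-written list. The sample data and the helper name my/flatten-posts are hypothetical; the real list is built by Org from the formatted sitemap entries.

```elisp
(require 'seq) ; provides `seq-filter'

(defun my/flatten-posts (posts)
  "Join the leading element of every non-empty sub-list of POSTS.
Hypothetical helper reusing the expression from format-blog-sitemap."
  (mapconcat
   (lambda (post)
     (format "%s\n" (car post)))
   ;; `cdr' drops the leading symbol, `seq-filter' drops sub-lists
   ;; whose first element is nil.
   (seq-filter #'car (cdr posts))
   "\n"))

(my/flatten-posts '(unordered ("post A") ("post B") (nil)))
;; → "post A\n\npost B\n"
```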

Page titles

By default, the title of an output HTML page corresponds to the title of the original Org document. In addition to this title, I want to add a suffix, e.g. ‘Title - My site’.

To achieve this, I define the function add-suffix-to-html-title taking as argument the suffix to append and the list of html-files to process.

(defun add-suffix-to-html-title (suffix html-files)

For each HTML file in html-files, the function reads the content of the file,

  (dolist (html-file html-files)
    (with-temp-buffer
      (insert-file-contents html-file)

navigates the cursor to the end of the buffer and backward searches for the closing </title> HTML tag.

      (goto-char (point-max))
      (re-search-backward "<\\/title>")

The cursor being at the beginning of the match, it inserts the text in suffix to the buffer immediately after the last character of the original document's title and saves the modified buffer.

      (insert suffix)
      (write-region 1 (point-max) html-file))))

Then, I define two wrappers for this function because I want to add a different suffix depending on whether the page is a content page or a blog post.

The wrapper add-suffix-to-html-title-for-pages calls the original function add-suffix-to-html-title after publishing content pages and adds the suffix ‘ - Marek Felšöci’. Note that the list of corresponding HTML files is acquired through the project component property :publishing-directory read from the plist argument (see Project components).

(defun add-suffix-to-html-title-for-pages (plist)
  (add-suffix-to-html-title
   " - Marek Felšöci"
   (directory-files
    (plist-get plist :publishing-directory) t "\\.html$")))

The wrapper add-suffix-to-html-title-for-blog-posts calls the original function add-suffix-to-html-title when exporting blog posts and adds the suffix ‘ - Marek's blog’ to the titles of blog posts.

(defun add-suffix-to-html-title-for-blog-posts (plist)
  (add-suffix-to-html-title
   " - Marek's blog"
   (directory-files
    (plist-get plist :publishing-directory) t "\\.html$")))

These functions are called completion functions as they are triggered after publishing [6].
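
The buffer manipulation at the heart of add-suffix-to-html-title can be tried on a plain string without touching any file. This is a minimal sketch, assuming a hypothetical helper my/add-title-suffix that is not part of publish.el:

```elisp
(defun my/add-title-suffix (html suffix)
  "Return HTML with SUFFIX inserted right before the closing </title> tag.
Hypothetical string-based counterpart of add-suffix-to-html-title."
  (with-temp-buffer
    (insert html)
    (goto-char (point-max))
    ;; After a successful backward search, point is at the start of the match.
    (re-search-backward "</title>")
    (insert suffix)
    (buffer-string)))

(my/add-title-suffix "<head><title>Title</title></head>" " - My site")
;; → "<head><title>Title - My site</title></head>"
```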

Last modification date

To include the last modification date in every page and blog post, I use another completion function.

It begins by acquiring the list of original Org files through the project component property :base-directory read from the plist argument (see Project components).

(defun add-last-modification-date (plist)
  (let*
      ((org-files
        (directory-files
         (plist-get plist :base-directory) t "\\.org$"))

I also need to get the path to the publishing directory through the component property :publishing-directory.

       (output-directory
        (plist-get plist :publishing-directory)))

The idea is to determine the last modification dates of the original Org documents using the function file-dates from Publishing script and insert them into the published HTML documents right before the footer (see General configuration).

To do this, I loop over each of the original Org documents to:

determine its last modification date,

    (dolist (org-file org-files)
      (setq last-modification-date
            (nth 1 (file-dates org-file)))

get the path to the corresponding output HTML document,

      (setq output-html-file
            (concat
             output-directory "/" (file-name-base org-file) ".html"))

open the HTML document, place the cursor before the opening <div> tag of the footer, insert the last modification date and save the modification.

      (with-temp-buffer
        (insert-file-contents output-html-file)
        (goto-char (point-max))
        (re-search-backward "<div id=\"postamble\"")
        (insert
         "<div class=\"content\"><p id=\"last-modification\">"
         "Last update on "
         last-modification-date
         "</p></div>")
        (write-region 1 (point-max) output-html-file)))))
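
As written, re-search-backward signals an error when a published page contains no postamble <div>. A defensive variant, sketched here on a string with the hypothetical helper my/insert-before-postamble (not part of publish.el), leaves such a page unchanged instead:

```elisp
(defun my/insert-before-postamble (html text)
  "Return HTML with TEXT inserted before the postamble <div>.
If HTML has no postamble, return it unchanged.  Hypothetical helper."
  (with-temp-buffer
    (insert html)
    (goto-char (point-max))
    ;; A non-nil third argument makes the search return nil on failure
    ;; instead of signalling an error.
    (when (re-search-backward "<div id=\"postamble\"" nil t)
      (insert text))
    (buffer-string)))

(my/insert-before-postamble
 "<body><div id=\"postamble\"></div></body>" "<p>Last update</p>")
;; → "<body><p>Last update</p><div id=\"postamble\"></div></body>"
```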

General configuration

Before configuring the publishing of the site, I set a couple of general preferences.

I disable Org's publishing timestamps so that every file is published on each run, not only the files that changed since the last one.

(setq org-publish-use-timestamps-flag nil)

I also disable the prompt before each code block evaluation.

(setq org-confirm-babel-evaluate nil)

Then, I want to preserve the indentation in code blocks on export and tangle.

(setq org-src-preserve-indentation t)

Moreover, I need to instruct the HTML export backend to include the header and the footer in every exported page.

(setq org-html-preamble (org-file-contents "./shared/header.html"))
(setq org-html-postamble (org-file-contents "./shared/footer.html"))

In order to include my custom CSS styles and configure the favicon, I add three extra lines to the HTML header.

(setq org-html-head-extra "<link rel=\"stylesheet\" type=\"text/css\"
href=\"../styles/custom.css\">
<link rel=\"stylesheet\" type=\"text/css\"
href=\"../styles/htmlize.css\">
<link rel=\"icon\" type=\"image/x-icon\"
href=\"https://felsoci.sk/favicon.ico\"/>")

For the HTML export backend to stylize code blocks using a CSS style sheet file instead of inline CSS rules, I have to set the org-html-htmlize-output-type variable.

(setq org-html-htmlize-output-type 'css)

Also, I do not like the colon in the title of the footnote sections. So, I replace the original footnote export template as suggested here.

(setq org-html-footnotes-section "<div id=\"footnotes\">
<h2 class=\"footnotes\">%s</h2>
<div id=\"text-footnotes\">
%s
</div>
</div>")

Most of the Org documents constituting this site cite one or more bibliography entries. These documents then contain a References section providing the list of the bibliography entries. To specify the path to the bibliography file to consider, I use the #+BIBLIOGRAPHY: directive. Then, to export the contents of the section, I use the #+PRINT_BIBLIOGRAPHY: directive. For typesetting the bibliography in HTML, I opt for the csl [7] citation processor. The latter allows me to customize the appearance of citations and of the bibliography listings using a style file [8]. The following configuration tells Org to use csl, where to look for style files and which one to rely on.

(setq org-cite-csl-styles-dir "/home/marek/src/github.com/felsoci.sk/styles")
(setq org-cite-export-processors
      '((t . (csl "ieee-with-url.csl"))))

Project components

The last thing to do is to define the org-publish-project-alist. It represents the list of the project's components and their individual export configuration as a list of properties, e.g. :publishing-directory.

(setq org-publish-project-alist
      (list

I split the site project into 5 components.

Blog

All of the configuration properties are pretty self-explanatory.

       (list "blog"
             :base-directory "./blog"
             :base-extension "org"
             :publishing-directory "./public/blog"
             :htmlized-source t
             :with-author t
             :with-creator t
             :with-date t
             :headline-levels 4
             :section-numbers nil
             :with-toc nil
             :html-head nil
             :html-head-include-default-style nil
             :html-head-include-scripts nil

I would, however, highlight the publishing function I chose. It tells Emacs to publish the Org documents composing this project component in the HTML format.

             :publishing-function '(org-html-publish-to-html)

The :completion-function property allows me to define functions to execute after publishing. Here, I set add-last-modification-date and add-suffix-to-html-title-for-blog-posts as completion functions (see Last modification date and Page titles).

             :completion-function '(add-last-modification-date
                                    add-suffix-to-html-title-for-blog-posts)

Finally, I configure the sitemap corresponding to the list of blog posts. The title is ‘Posts’ and the posts are sorted from the latest to the oldest one.

             :auto-sitemap t
             :sitemap-filename "posts.org"
             :sitemap-title "Posts"
             :sitemap-sort-files 'anti-chronologically

Moreover, I use the functions format-blog-sitemap and format-blog-item to format the sitemap (list of blog posts) itself as well as its entries (blog post items) (see List of blog posts).

             :sitemap-function 'format-blog-sitemap
             :sitemap-format-entry 'format-blog-item)

Content pages

The export configuration for the content pages such as Home and About is very close to the previous one

        (list "pages"
              :base-directory "."
              :base-extension "org"
              :publishing-directory "./public/"
              :publishing-function '(org-html-publish-to-html)
              :htmlized-source t
              :with-author t
              :with-creator t
              :with-date t
              :headline-levels 4
              :section-numbers nil
              :with-toc nil
              :html-head nil
              :html-head-include-default-style nil
              :html-head-include-scripts nil

except for the title suffix function add-suffix-to-html-title-for-pages (see Page titles).

              :completion-function '(add-last-modification-date
                                     add-suffix-to-html-title-for-pages)

Furthermore, I must exclude the blog folder from the list of input documents to prevent duplicate export.

              :exclude (regexp-opt '("blog")))

Styles, images and other attachments

For static files such as CSS styles, images and other attachments, which are published as is, I use the publishing function for attachments. For the styles folder, I enable recursive lookup in order to also include the fonts sub-folder. The same goes for attachments (see Project's structure).

        (list "styles"
              :base-directory "./styles"
              :base-extension ".*"
              :recursive t
              :publishing-directory "./public/styles"
              :publishing-function '(org-publish-attachment))
        (list "images"
              :base-directory "./images"
              :base-extension ".*"
              :publishing-directory "./public/images"
              :publishing-function '(org-publish-attachment))
        (list "attachments"
              :base-directory "./blog/attachments"
              :base-extension ".*"
              :recursive t
              :publishing-directory "./public/blog/attachments"
              :publishing-function '(org-publish-attachment))

I complete the list by adding the list of all the components of the project as well as the name of the latter.

        (list "felsoci.sk"
              :components '("blog" "pages" "styles" "images" "attachments"))))
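
Each element of org-publish-project-alist is a list whose car is the component name and whose cdr is a property list, which is why format-blog-item reads :base-directory with (plist-get (cdr project) ...). As a small illustration of this structure, the sketch below checks that every name listed under :components has an entry of its own. The helper my/missing-components and the sample alist are hypothetical and not part of publish.el.

```elisp
(require 'seq) ; provides `seq-remove'

(defun my/missing-components (project-alist meta-project)
  "Return the components of META-PROJECT that PROJECT-ALIST does not define.
Hypothetical helper for illustration."
  (let ((names (plist-get (cdr (assoc meta-project project-alist))
                          :components)))
    ;; Keep only the names for which no (NAME . PLIST) entry exists.
    (seq-remove (lambda (name) (assoc name project-alist)) names)))

(my/missing-components
 '(("blog" :base-directory "./blog")
   ("pages" :base-directory ".")
   ("site" :components ("blog" "pages" "images")))
 "site")
;; → ("images")
```

Called as (my/missing-components org-publish-project-alist "felsoci.sk") after loading publish.el, it should return nil if the configuration above is complete.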

Ready, steady, go!

At this point, I am ready to go. To launch the publishing I need to:

extract the source code from the Org document corresponding to this page,

guix time-machine -C .guix/channels.scm -- shell --container \
     git emacs emacs-org -- emacs --batch -l org --eval \
'(org-babel-tangle-file "blog/creating-website-and-blogging-in-org-mode.org")'

call the publishing function on the publish.el file.

guix time-machine -C .guix/channels.scm -- shell --container \
     -m .guix/manifest.scm -- emacs --batch --no-init-file \
     --eval '(setq org-confirm-babel-evaluate nil)' --load publish.el \
     --funcall org-publish-all

Feel free to send me your feedback!

Acknowledgement

Many thanks to Dennis Ogbe who published a similar post on his website. It helped me a lot while building my own publishing configuration!

References

[1] “Org mode for Emacs.” https://orgmode.org.

[2] F. Niessen, “How to export Org mode files into awesome HTML in 2 minutes.” https://github.com/fniessen/org-html-themes.

[3] “GNU Guix software distribution and transactional package manager.” https://guix.gnu.org.

[4] “GNU Guix Cookbook: Basic setup with manifests.” https://guix.gnu.org/cookbook/en/html_node/Basic-setup-with-manifests.html.

[5] “GNU Guix Reference Manual: Channels.” https://guix.gnu.org/manual/en/html_node/Channels.html.

[6] “Sources and destinations (The Org Manual).” https://orgmode.org/manual/Sources-and-destinations.html.

[7] “Citation Style Language (CSL).” https://citationstyles.org/.

[8] “Official repository for Citation Style Language (CSL) citation styles.” https://github.com/citation-style-language/styles.