Creating website and blogging in Org mode
I discovered the power of Org mode when I started working on my Ph.D. thesis, which is itself being written entirely in Org mode. Indeed, one can easily export an Org mode document to an HTML page or a PDF document typeset in LaTeX. Recently, I committed myself to overhauling my personal website and decided to produce it using Org. So, in this post, I detail the whole process step by step.
Project's structure
The idea here is to build a static HTML website generated from a collection of Org [1] documents. On the one hand, the site has a couple of content pages such as Home, About and so on. On the other hand, it features a small blog as well.
When it comes to the style of the site, I am looking for simplicity. Although there are some great Org HTML templates [2], they are better suited to standalone pages than to a complete website with navigation. So, I prefer to define my own tiny CSS style sheet.
The file structure of the project is described below.
.
├── .guix
│   ├── channels.scm
│   └── manifest.scm
├── blog
│   ├── creating-website-and-blogging-in-org-mode.org
│   ├── attachments
│   ├── biblio-setup.org (symbolic link to '../shared/biblio-setup.org')
│   └── ...
├── images
│   ├── marek.jpg
│   └── ...
├── public
│   └── favicon.ico
├── styles
│   ├── custom.css
│   └── htmlize.css
├── shared
│   ├── biblio-setup.org
│   ├── footer.html
│   └── header.html
├── about.org
├── index.org
├── publish.el
├── README.md
├── references.bib
├── research.org
└── teaching.org
The .guix folder contains the description of the software environment required for building the HTML website (see more in Software environment). The blog folder holds the Org documents of the blog posts. The attachments subfolder contains static attachments related to the blog posts. images naturally contains all the image files featured on the site. public can be used to store the exported website (see more in General configuration). The custom CSS sheets reside in styles where htmlize.css stylizes syntax highlighting in source code blocks and custom.css defines the look and feel of all the other elements of the website. shared holds the common static header and footer files as well as an Org setup file providing some common document configuration, e.g. for bibliography. The Org documents corresponding to the content pages are stored in the root of the project's folder as well as the global LaTeX bibliography file references.bib for the entire website. Finally, the Emacs Lisp script publish.el controls the publishing of the website.
Software environment
I use the GNU Guix [3] transactional package manager, which allows for a self-contained, executable description of the whole software environment required for running the publishing Emacs Lisp script. The packages to include in the environment are listed in a manifest file [4], here .guix/manifest.scm.
(specifications->manifest
 (list "emacs"
       "emacs-org"
       "emacs-org-ref"
       "emacs-citeproc-el"
       "emacs-htmlize"
       "git"
       "bash"
       "coreutils"
       "tar"))
To ensure the same version of Guix and of every single package in the environment every time I enter it, I also use a channel file [5], here .guix/channels.scm, which lists the Git repositories providing package definitions, i.e. channels in Guix terminology, necessary to build the publishing environment, together with the associated revisions.
(list (channel
       (name 'guix)
       (url "https://git.savannah.gnu.org/git/guix.git")
       (branch "master")
       (commit "abeffc82379c4f9bd2e6226ea27453b22cb4e0c8")
       (introduction
        (make-channel-introduction
         "9edb3f66fd807b096b48283debdcddccfea34bad"
         (openpgp-fingerprint
          "BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))))
Note that I also use Guix at work to improve the reproducibility of my numerical experiments (see more in Research).
Setup file
I put the global bibliography configuration into a dedicated Org file stored in the shared folder. This file can then be included in Org documents featuring citations and a bibliography listing.
biblio-setup.org is quite short. I only need to provide the path to the file containing bibliography entries and specify the export processor to use, i.e. csl [6] in this case.
The second argument of #+CITE_EXPORT is the style file [7] to use when formatting citations and bibliography listings on export. Unfortunately, the absolute path must be given here.
Finally, to include a bibliography listing in an Org document, I use the #+PRINT_BIBLIOGRAPHY: directive within an associated References section.
* References
:PROPERTIES:
:CUSTOM_ID: references
:END:

#+INCLUDE: ../shared/biblio-setup.org
#+PRINT_BIBLIOGRAPHY:
Publishing script
The core of the project is the Elisp publishing script publish.el responsible for generating the final HTML source of the site.
It begins by importing the Emacs packages providing:
Org mode support,
(require 'org)
HTML export backend,
(require 'ox-html)
publishing functions,
(require 'ox-publish)
engine for exporting source code blocks to HTML,
(require 'htmlize)
bibliography support.
(require 'oc)
(require 'citeproc) ;; for HTML
(require 'oc-csl) ;; for HTML
Then, I define a utility function file-dates allowing me to get the dates of the first publication and of the last modification of an Org document.
At first, the function tries to find the dates in the Git log.
(defun file-dates (file)
  (let* ((first-commit-date
          (shell-command-to-string
           (concat "git log --reverse --pretty=\"format:%cD\""
                   " " file " 2> /dev/null | head -n 1")))
         (last-commit-date
          (shell-command-to-string
           (concat "git log --pretty=\"format:%cD\""
                   " " file " 2> /dev/null | head -n 1")))
         (last-modification-date-raw
          (file-attribute-modification-time (file-attributes file)))
         (last-modification-date
          (format-time-string
           "%d/%m/%Y"
           (+ (* (nth 0 last-modification-date-raw) (expt 2 16))
              (nth 1 last-modification-date-raw)))))
If there is no commit involving the file, I take the last modification timestamp recorded by the filesystem.
    (list
     (if (string= first-commit-date "")
         last-modification-date
       (substring
        (shell-command-to-string
         (concat "date -d \"" first-commit-date "\" +%d/%m/%Y"))
        0 -1))
     (if (string= last-commit-date "")
         last-modification-date
       (substring
        (shell-command-to-string
         (concat "date -d \"" last-commit-date "\" +%d/%m/%Y"))
        0 -1)))))
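As a side note, the conversion of the commit date could probably be done without calling the external date command. The following is only a minimal sketch, assuming the date string has the RFC 2822 form printed by git for %cD. The helper name my/format-commit-date is hypothetical and not part of publish.el. Unlike date -d, which uses the local time zone, the sketch renders the date in UTC so that the result does not depend on the machine it runs on.

```emacs-lisp
(require 'time-date)  ; provides `date-to-time'

(defun my/format-commit-date (rfc2822-date)
  "Return RFC2822-DATE formatted as DD/MM/YYYY.
Hypothetical helper: RFC2822-DATE is a string such as the one git
prints for %cD.  The date is rendered in UTC."
  (format-time-string "%d/%m/%Y" (date-to-time rfc2822-date) t))

;; 10:15:30 +0100 is 09:15:30 UTC on the same day.
(my/format-commit-date "Tue, 4 Jan 2022 10:15:30 +0100") ; => "04/01/2022"
```

The trade-off is that a commit made shortly after midnight local time may be shown with the previous day's date.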
Blog post synopsis
Each blog post may contain a synopsis used to introduce the content of the post in the list of blog posts:
Figure 1: Excerpt of the list of blog posts.
In the source Org document, the synopsis text must be enclosed between the #+BEGIN_SYNOPSIS and #+END_SYNOPSIS tags.
For extracting the synopsis, I define the function get-post-synopsis taking as argument a blog-post.
(defun get-post-synopsis (blog-post)
The first thing to do is to load the Org file pointed to by blog-post
(with-temp-buffer
(insert-file-contents blog-post)
and move the cursor to the beginning of the document.
(goto-char (point-min))
In the core of the function, I use the markers beg and end to select the area in the buffer between the first and the last character of the synopsis. To exclude the newlines after the opening and before the closing tag, I move the starting marker forward by one and the ending marker backward by one.
(let ((beg (+ 1 (re-search-forward "^#\\+BEGIN_SYNOPSIS$")))
      (end (- (progn (re-search-forward "^#\\+END_SYNOPSIS$")
                     (match-beginning 0))
              1)))
At the end, the function returns the sub-string of the buffer corresponding to the area between the two markers. At the same time, I need to remove any citations from the sub-string in order to prevent artifacts from appearing on export.
(replace-regexp-in-string "[ ]\\[cite.*\\]" "" (buffer-substring beg end)))))
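Since a post may lack a synopsis, it can be handy to see how the extraction behaves on a small input. The sketch below is not the function from publish.el: it is a hypothetical variant, my/synopsis-from-string, which works on a string instead of a file and returns nil when no synopsis block is found, instead of signaling a search error. It leaves out the removal of citations.

```emacs-lisp
(defun my/synopsis-from-string (org-text)
  "Return the text between the synopsis tags in ORG-TEXT.
Hypothetical helper.  Return nil when ORG-TEXT has no synopsis block."
  (with-temp-buffer
    (insert org-text)
    (goto-char (point-min))
    ;; The third argument t makes the search return nil on failure.
    (when (re-search-forward "^#\\+BEGIN_SYNOPSIS$" nil t)
      ;; Skip the newline following the opening tag.
      (let ((beg (1+ (point))))
        (when (re-search-forward "^#\\+END_SYNOPSIS$" nil t)
          ;; Stop before the newline preceding the closing tag.
          (buffer-substring beg (1- (match-beginning 0))))))))

(my/synopsis-from-string
 "#+TITLE: Demo\n#+BEGIN_SYNOPSIS\nA short summary.\n#+END_SYNOPSIS\nBody.\n")
;; => "A short summary."

(my/synopsis-from-string "No synopsis here.\n")
;; => nil
```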
List of blog posts
For handy access to blog posts, the site features a page containing the list of all blog posts with a short synopsis, the date of publishing, the author's name and the link to the post in the form of a button (see Figure 1).
To create this page, we use the sitemap functionality in Org mode. The default appearance of the sitemap is rather basic. To customize it so the list of blog posts suits the design of the site, we need to define our own functions for formatting the sitemap (list of blog posts) and its items (blog posts).
Formatting items
The function format-blog-item changes the formatting of the sitemap item (blog post) entry belonging to project (see Project components). Note that entry is the absolute path to the Org file of the blog post being processed. Also, I don't use the sitemap style argument here.
(defun format-blog-item (entry style project) (let
Unfortunately, when the function is called by the Emacs export machinery, the absolute path provided in entry is incorrect. It lacks the parent folder blog because Emacs thinks it is running in the project's root although the current working folder, when exporting blog posts, is blog (see Blog). Therefore, I have to re-include blog/ into the path.
For example, if the initial entry holds /home/marek/src/felsoci.sk/post.org, I need to transform it to /home/marek/src/felsoci.sk/blog/post.org.
((fixed-entry
(concat
(file-name-directory entry) "blog/" (file-name-nondirectory entry)))
Also, before actually formatting the sitemap entry, I need to determine its first publication and last modification dates.
(entry-dates (file-dates (concat (plist-get (cdr project) :base-directory) "/" entry))))
Finally, I return the Org string corresponding to the sitemap entry, formatted using the format function, which is similar to sprintf in C.
(format " @@html:<h2 class=\"post-title\">@@ [[file:%s][%s]] @@html:</h2><span class=\"post-metadata\">@@ Published on %s by %s%s @@html:</span>@@ %s @@html:<a href=\"@@%s@@html:.html\"><button>Read more</button></a>@@ "
All of the %s are replaced by the values of the arguments following the string to format:
the path to the blog post Org document,
entry
the title of the post found in the Org document under the #+TITLE directive,
(org-publish-find-title entry project)
the formatted date of publishing,
(nth 0 entry-dates)
the author's name extracted from the project property list project,
(substring (format "%s" (org-publish-find-property entry :author project)) 1 -1)
the formatted date of last modification, if any,
(if (string= (nth 0 entry-dates) (nth 1 entry-dates))
    ""
  (concat " (updated on " (nth 1 entry-dates) ")"))
the synopsis of the blog post retrieved using our custom parsing function, get-post-synopsis,
(get-post-synopsis fixed-entry)
the path to the blog post file without extension because the link is not converted into an HTML link during the export as we do not use a standard Org-formatted link such as [[target][text]] but a button.
(file-name-sans-extension entry))))
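Two of the arguments above rely on small string manipulations that are easier to follow in isolation. The sketch below uses two hypothetical helpers, my/author-string and my/updated-suffix, which are not part of publish.el. It assumes the author property comes back as a one-element list of strings, which is what stripping the first and the last character in the original code suggests.

```emacs-lisp
(defun my/author-string (author)
  "Turn AUTHOR, assumed to be a list such as (\"Jane Doe\"), into a string.
Formatting the list with %s yields \"(Jane Doe)\" and the substring
drops the surrounding parentheses."
  (substring (format "%s" author) 1 -1))

(defun my/updated-suffix (published updated)
  "Return an update notice unless PUBLISHED and UPDATED are equal."
  (if (string= published updated)
      ""
    (concat " (updated on " updated ")")))

(my/author-string '("Jane Doe"))               ; => "Jane Doe"
(my/updated-suffix "04/01/2022" "04/01/2022")  ; => ""
(my/updated-suffix "04/01/2022" "12/03/2022")  ; => " (updated on 12/03/2022)"
```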
Formatting the list
The function format-blog-sitemap replaces the default function for generating sitemap which represents the list of blog posts in our case. It outputs an Org document having the title title. The blog posts formatted by the function format-blog-item are available as a list through the posts argument.
Actually, the function represents a concatenation of the title
(defun format-blog-sitemap (title posts) (concat "#+TITLE: " title "\n\n"
and the items of posts separated by a newline character and a horizontal line in the resulting Org document (see Figure 1).
Note that posts is a nested list having the form:
- ‘unordered’
- ‘list of possibly nested posts’
- ‘list of possibly nested posts’
- …
Therefore, I have to transform it into a simple list containing only the leading elements of the nested post lists. To achieve this, I strip the leading ‘unordered’ element using cdr and apply a sequence filter with car as the predicate, which keeps only the post lists having a non-nil leading element. The leading elements themselves are then extracted with car when the items are concatenated.
          (mapconcat (lambda (post) (format "%s\n" (car post)))
                     (seq-filter #'car (cdr posts))
                     "\n")))
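To make the flattening step more concrete, here is a small sketch applying the same cdr, seq-filter and car combination to hand-written sample data. The function name my/join-posts and the sample list are assumptions for illustration only. The real posts argument is built by Org and may nest deeper.

```emacs-lisp
(require 'seq)

(defun my/join-posts (posts)
  "Concatenate the leading elements of the lists following the head of POSTS.
Hypothetical helper mirroring the body of the sitemap function above."
  (mapconcat (lambda (post) (format "%s\n" (car post)))
             ;; Drop the head, then keep lists with a non-nil first element.
             (seq-filter #'car (cdr posts))
             "\n"))

;; The trailing (nil) entry is filtered out.
(my/join-posts '(unordered ("first post") ("second post") (nil)))
;; => "first post\n\nsecond post\n"
```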
Page titles
By default, the title of an output HTML page corresponds to the title of the original Org document. In addition to this title, I want to add a suffix, e.g. ‘Title - My site’.
To achieve this, I define the function add-suffix-to-html-title taking as argument the suffix to append and the list of html-files to process.
(defun add-suffix-to-html-title (suffix html-files)
For each HTML file in html-files, the function reads the content of the file,
(while (setq html-file (pop html-files))
  (with-temp-buffer
    (insert-file-contents html-file)
navigates the cursor to the end of the buffer and backward searches for the closing </title> HTML tag.
(goto-char (point-max))
(re-search-backward "<\\/title>")
With the cursor at the beginning of the match, it inserts the text in suffix into the buffer immediately after the last character of the original document's title and saves the modified buffer.
(insert suffix)
(write-region 1 (point-max) html-file))))
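The effect of the backward search followed by the insertion can be tried on a plain string. The sketch below uses a hypothetical helper, my/title-with-suffix, which is not part of publish.el and works on a string rather than on files. It leaves the input unchanged when no closing title tag is present.

```emacs-lisp
(defun my/title-with-suffix (html suffix)
  "Return HTML with SUFFIX inserted right before the last </title> tag.
Hypothetical helper.  Return HTML unchanged when the tag is missing."
  (with-temp-buffer
    (insert html)
    (goto-char (point-max))
    ;; A successful backward search leaves point at the start of the match.
    (when (search-backward "</title>" nil t)
      (insert suffix))
    (buffer-string)))

(my/title-with-suffix "<head><title>About</title></head>" " - My site")
;; => "<head><title>About - My site</title></head>"
```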
Then, I define two wrappers for this function because I want to add a different suffix depending on whether the page is a content page or a blog post.
The wrapper add-suffix-to-html-title-for-pages calls the original function add-suffix-to-html-title after publishing content pages and adds the suffix ‘ - Marek Felšöci’. Note that the list of corresponding HTML files is acquired through the project component property :publishing-directory read from the plist argument (see Project components).
(defun add-suffix-to-html-title-for-pages (plist)
  (add-suffix-to-html-title
   " - Marek Felšöci"
   (directory-files (plist-get plist :publishing-directory) t "\\.html$")))
The wrapper add-suffix-to-html-title-for-blog-posts calls the original function add-suffix-to-html-title when exporting blog posts and adds the suffix ‘ - Marek's blog’ to the titles of blog posts.
(defun add-suffix-to-html-title-for-blog-posts (plist)
  (add-suffix-to-html-title
   " - Marek's blog"
   (directory-files (plist-get plist :publishing-directory) t "\\.html$")))
These functions are called completion functions as they are triggered after publishing [8].
Last modification date
To include the last modification date in every page and blog post, I use another completion function.
It begins by acquiring the list of original Org files through the project component property :base-directory read from the plist argument (see Project components).
(defun add-last-modification-date (plist)
  (let* ((org-files
          (directory-files (plist-get plist :base-directory) t "\\.org$"))
I also need to get the path to the publishing directory through the component property :publishing-directory.
(output-directory
(plist-get plist :publishing-directory)))
The idea is to determine the last modification dates of the original Org documents using the function file-dates from Publishing script and insert the dates into the published HTML documents right before the footer (see General configuration).
To do this, I loop over each of the original Org documents to:
determine its last modification date,
(while (setq org-file (pop org-files))
  (setq last-modification-date (nth 1 (file-dates org-file)))
get the path to the corresponding output HTML document,
(setq output-html-file (concat output-directory "/" (file-name-base org-file) ".html"))
open the HTML document, place the cursor before the opening <div> tag of the footer, insert the last modification date and save the modification.
(with-temp-buffer
  (insert-file-contents output-html-file)
  (goto-char (point-max))
  (re-search-backward "<div id=\"postamble\"")
  (insert "<div class=\"content\"><p id=\"last-modification\">"
          "Last update on " last-modification-date "</p></div>")
  (write-region 1 (point-max) output-html-file)))))
General configuration
Before configuring the publishing of the site, I set a couple of general preferences.
I disable Org timestamp flags so that all files get published on every run, not only those that have changed.
(setq org-publish-use-timestamps-flag nil)
I also disable the prompt before each code block evaluation.
(setq org-confirm-babel-evaluate nil)
Then, I want to preserve the indentation in code blocks on export and tangle.
(setq org-src-preserve-indentation t)
Moreover, I need to instruct the publishing function to include the header and the footer in every exported page.
(setq org-html-preamble (org-file-contents "./shared/header.html"))
(setq org-html-postamble (org-file-contents "./shared/footer.html"))
In order to include my custom CSS styles and configure the favicon, I add three extra lines to the HTML header.
(setq org-html-head-extra
      "<link rel=\"stylesheet\" type=\"text/css\" href=\"../styles/custom.css\">
<link rel=\"stylesheet\" type=\"text/css\" href=\"../styles/htmlize.css\">
<link rel=\"icon\" type=\"image/x-icon\" href=\"https://felsoci.sk/favicon.ico\"/>")
For the HTML export backend to stylize code blocks using a CSS style sheet file instead of inline CSS rules, I have to set the org-html-htmlize-output-type variable accordingly.
(setq org-html-htmlize-output-type 'css)
Also, I do not like the colon in the title of the footnote sections. So, I replace the original footnote export template as suggested here.
(setq org-html-footnotes-section "<div id=\"footnotes\"> <h2 class=\"footnotes\">%s</h2> <div id=\"text-footnotes\"> %s </div> </div>")
Finally, I define a utility function allowing me to change the output folder through an environment variable, namely ORG_OUTPUT_PATH. This way, I can switch between my local Apache server for testing and the production server easily. If the variable is not set in the current environment, the output will be published into the public folder located in the root of the project.
Note that the optional suffix argument specifies the local path starting from the root of the output folder.
(defun get-output-path (&optional suffix)
  (let ((custom (getenv "ORG_OUTPUT_PATH")))
    (if custom
        (concat custom "/" suffix)
      (concat "./public/" suffix))))
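To illustrate the two branches, the sketch below repeats the same logic under a hypothetical name, my/output-path, and evaluates it with the environment variable set and unset. The path /var/www/html is only an example value, not my actual server configuration.

```emacs-lisp
(defun my/output-path (&optional suffix)
  "Same logic as the utility function above, under a hypothetical name."
  (let ((custom (getenv "ORG_OUTPUT_PATH")))
    (if custom
        (concat custom "/" suffix)
      (concat "./public/" suffix))))

;; Variable set: the output goes below the custom root.
(let ((process-environment
       (cons "ORG_OUTPUT_PATH=/var/www/html" process-environment)))
  (my/output-path "blog"))   ; => "/var/www/html/blog"

;; Variable unset: an entry without "=" marks the variable as unset.
(let ((process-environment
       (cons "ORG_OUTPUT_PATH" process-environment)))
  (my/output-path "blog"))   ; => "./public/blog"
```

On the command line, the same switch would amount to exporting ORG_OUTPUT_PATH before launching Emacs.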
Project components
The last thing to do is to define the org-publish-project-alist. It represents the list of the project's components and their individual export configuration as a list of properties, e.g. :publishing-directory.
(setq org-publish-project-alist
(list
I split the site project into 5 components.
Blog
All of the configuration properties are pretty self-explanatory.
(list "blog"
      :base-directory "./blog"
      :base-extension "org"
      :publishing-directory (get-output-path "blog")
      :htmlized-source t
      :with-author t
      :with-creator t
      :with-date t
      :headline-level 4
      :section-numbers nil
      :with-toc nil
      :html-head nil
      :html-head-include-default-style nil
      :html-head-include-scripts nil
I do want to highlight the publishing function I choose, though. It tells Emacs to publish the Org documents composing this project component in the HTML format.
:publishing-function '(org-html-publish-to-html)
The :completion-function property allows me to define functions to execute after publishing. Here, I set add-last-modification-date and add-suffix-to-html-title-for-blog-posts as completion functions (see Last modification date and Page titles).
:completion-function '(add-last-modification-date
add-suffix-to-html-title-for-blog-posts)
Finally, I configure the sitemap corresponding to the list of blog posts. The title is ‘Posts’ and the posts are sorted from the latest to the oldest one.
:auto-sitemap t
:sitemap-filename "posts.org"
:sitemap-title "Posts"
:sitemap-sort-files 'anti-chronologically
Moreover, I use the functions format-blog-sitemap and format-blog-item to format the entries of the sitemap (blog post items) as well as the sitemap (list of blog posts) itself (see List of blog posts).
:sitemap-function 'format-blog-sitemap
:sitemap-format-entry 'format-blog-item)
Content pages
The export configuration for the content pages such as Home and About is very close to the previous one
(list "pages"
      :base-directory "."
      :base-extension "org"
      :publishing-directory (get-output-path)
      :publishing-function '(org-html-publish-to-html)
      :htmlized-source t
      :with-author t
      :with-creator t
      :with-date t
      :headline-level 4
      :section-numbers nil
      :with-toc nil
      :html-head nil
      :html-head-include-default-style nil
      :html-head-include-scripts nil
except for the title suffix function add-suffix-to-html-title-for-pages (see Page titles).
:completion-function '(add-last-modification-date
add-suffix-to-html-title-for-pages)
Furthermore, I must exclude the blog folder from the list of input documents to prevent duplicate export.
:exclude (regexp-opt '("blog")))
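It may be worth checking what such an exclusion pattern actually matches. The sketch below only tests the regular expression against a few made-up file names through a hypothetical helper, my/excluded-p. How exactly Org applies the pattern to candidate files is not shown here.

```emacs-lisp
(defun my/excluded-p (file-name)
  "Return non-nil if FILE-NAME matches the pattern built from \"blog\".
Hypothetical helper for experimenting with the exclusion pattern."
  (and (string-match-p (regexp-opt '("blog")) file-name) t))

(my/excluded-p "blog/some-post.org")  ; => t
(my/excluded-p "about.org")           ; => nil
;; The pattern is not anchored, so any name containing "blog" matches.
(my/excluded-p "weblog.org")          ; => t
```

With the content pages of this site (index.org, about.org, research.org and teaching.org), the unanchored match is harmless since none of their names contains ‘blog’.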
Styles, images and other attachments
Static files such as CSS styles, images and other attachments are published as is, so I use the publishing function for attachments. For the styles folder, I enable recursive lookup in order to also include the fonts sub-folder. The same applies to attachments (see Project's structure).
(list "styles"
      :base-directory "./styles"
      :base-extension ".*"
      :recursive t
      :publishing-directory (get-output-path "styles")
      :publishing-function '(org-publish-attachment))
(list "images"
      :base-directory "./images"
      :base-extension ".*"
      :publishing-directory (get-output-path "images")
      :publishing-function '(org-publish-attachment))
(list "attachments"
      :base-directory "./blog/attachments"
      :base-extension ".*"
      :recursive t
      :publishing-directory (get-output-path "blog/attachments")
      :publishing-function '(org-publish-attachment))
I complete the list with an entry that names the project as a whole and lists all of its components.
(list "felsoci.sk" :components '("blog" "pages" "styles" "images" "attachments"))))
Ready, steady, go!
At this point, I am ready to go. To launch the publishing I need to:
extract the source code from the Org document corresponding to this page,
guix time-machine -C .guix/channels.scm -- shell --pure \
  -m .guix/manifest.scm -- emacs --batch -l org --eval \
  '(org-babel-tangle-file "blog/creating-websites-and-blogging-in-org-mode.org")'
call the publishing function on the publish.el file.
guix time-machine -C .guix/channels.scm -- shell --pure \
  -m .guix/manifest.scm -- emacs --batch --no-init-file \
  --eval '(setq org-confirm-babel-evaluate nil)' --load publish.el \
  --funcall org-publish-all
Feel free to send me your feedback!
Acknowledgement
Many thanks to Dennis Ogbe who published a similar post on his website. It helped me a lot while building my own publishing configuration!