Creating website and blogging in Org mode
I discovered the power of Org mode after joining Inria (see Home) to work on my Ph.D. thesis, which is itself written entirely in Org mode. Indeed, one can easily export an Org mode document to an HTML page or to a PDF document typeset in LaTeX. Recently, I committed to overhauling my personal website and decided to produce it using Org. In this post, I detail the whole process step by step.
Project's structure
The idea here is to build a static HTML website generated from a collection of Org documents. On the one hand, the site has a couple of content pages such as Home, About and so on. On the other hand, it features a small blog as well.
When it comes to the style of the site, I am looking for simplicity. Although there are some great Org HTML templates, they are better suited to standalone HTML pages than to a complete website with navigation. So, I prefer to define a tiny custom CSS style sheet to keep the site as easy to assemble and maintain as possible.
The file structure of the project is described below.
.
├── blog
│   ├── creating-website-and-blogging-in-org-mode.org
│   ├── attachments
│   └── ...
├── cv
│   └── cv-felsoci.org
├── images
│   ├── marek.jpg
│   └── ...
├── styles
│   └── custom.css
├── shared
│   ├── footer.html
│   └── header.html
├── about.org
├── index.org
├── publish.el
├── README.md
├── research.org
└── teaching.org
The blog folder holds the Org documents of the blog posts. Its attachments subfolder contains only static files to be published as is. The cv folder contains a formal version of my CV published as a PDF document and accessible for download from the home page. The images folder naturally contains all the image files featured on the site. The custom CSS sheet resides in styles. The shared folder holds the common static header and footer files. The Org documents corresponding to the content pages are stored in the root of the project's folder. Finally, the Emacs Lisp script publish.el controls the publishing of the site.
Setup file
In order to add the last modification time as well as the author's name to every page, I use a common Org file.
Besides a few lines of HTML, it calls the Elisp function modification-time to determine the last modification date and time and include them on every page that pulls in the setup file through the #+INCLUDE directive.
Finally, it uses the #+AUTHOR directive to configure the same author's name everywhere.
Publishing script
The core of the project is the Elisp publishing script publish.el responsible for generating the final HTML source of the site.
It begins by importing the Emacs packages required for:
Org mode support,
(require 'org)
HTML export backend,
(require 'ox-html)
publishing functions,
(require 'ox-publish)
source code block export to HTML,
(require 'htmlize)
bibliography support.
(require 'org-ref)
Then, I define a utility function last-modified allowing me to get the date of last modification of a file for the list of blog posts (see Formatting items).
First, the function looks up the date of the last Git commit that touched the file as well as the last modification time of the file on the local filesystem.
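(defun last-modified (file)
  (let* ((last-commit-date
          ;; Quote the file name so that paths containing spaces or
          ;; shell metacharacters reach Git intact.
          (shell-command-to-string
           (concat "git log -1 --pretty=\"format:%cD\" -- "
                   (shell-quote-argument file))))
         (last-modification-date
          (file-attribute-modification-time (file-attributes file))))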
If there is no commit involving the file, its last modification time is returned. Preferring the commit date prevents wrong dates from appearing after cloning the site's repository, since a fresh clone resets the modification times of the files.
    (if (string= last-commit-date "")
        ;; `format-time-string' accepts the Lisp timestamp directly.
        (format-time-string "%d/%m/%Y" last-modification-date)
      ;; Strip the trailing newline printed by date.
      (substring
       (shell-command-to-string
        (concat "date -d \"" last-commit-date "\" +%d/%m/%Y"))
       0 -1))))
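As a variation on last-modified, Git itself can format the commit date, which avoids the call to date -d (a GNU-specific option). The following is only a minimal sketch under that assumption, not the function used on the site: the name my/last-modified is hypothetical, and it relies on the --date=format: option of git log being available in the installed Git.

```elisp
(require 'subr-x) ; string-trim and string-empty-p on older Emacs versions

(defun my/last-modified (file)
  "Return the date of the last Git commit touching FILE as DD/MM/YYYY.
Fall back to the modification time of FILE on the local filesystem
when Git is unavailable or reports no commit.  Hypothetical helper."
  (let ((commit-date
         (with-temp-buffer
           ;; Send stdout to the temporary buffer and discard stderr.
           (if (eq 0 (ignore-errors
                       (call-process "git" nil '(t nil) nil
                                     "log" "-1" "--date=format:%d/%m/%Y"
                                     "--pretty=format:%cd" "--" file)))
               (string-trim (buffer-string))
             ""))))
    (if (string-empty-p commit-date)
        (format-time-string
         "%d/%m/%Y"
         (file-attribute-modification-time (file-attributes file)))
      commit-date)))
```

Calling Git through call-process with a list of arguments also sidesteps shell quoting altogether.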
Blog post synopsis
Each blog post may contain a synopsis used to introduce the content of the post in the list of blog posts:
Figure 1: Excerpt of the list of blog posts.
In the source Org document, the synopsis text must be enclosed between the #+BEGIN_SYNOPSIS and #+END_SYNOPSIS tags.
To extract the synopsis, I define the function get-post-synopsis taking as argument a blog-post.
(defun get-post-synopsis (blog-post)
The first thing to do is to load the Org file pointed to by blog-post
(with-temp-buffer
(insert-file-contents blog-post)
and move the cursor to the beginning of the document.
(goto-char (point-min))
In the core of the function, I use the markers beg and end to select the area in the buffer between the first and the last character of the synopsis. To exclude the newlines after the opening and before the closing tag, I move the starting marker forward by one and the ending marker backward by one.
    (let ((beg (+ 1 (re-search-forward "^#\\+BEGIN_SYNOPSIS$")))
          (end (- (progn (re-search-forward "^#\\+END_SYNOPSIS$")
                         (match-beginning 0))
                  1)))
At the end, the function returns the sub-string of the buffer corresponding to the area between the two markers.
(buffer-substring beg end))))
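Note that get-post-synopsis signals a search error when a post has no synopsis block. A more tolerant variant could return nil in that case. The sketch below illustrates the idea under that assumption; the name my/post-synopsis is hypothetical and the function is not part of the site's publish.el.

```elisp
(defun my/post-synopsis (blog-post)
  "Return the synopsis of the Org file BLOG-POST, or nil if it has none.
Hypothetical, more tolerant variant of get-post-synopsis."
  (with-temp-buffer
    (insert-file-contents blog-post)
    (goto-char (point-min))
    ;; The third argument t makes the search return nil instead of
    ;; signaling an error when nothing matches.
    (when (re-search-forward "^#\\+BEGIN_SYNOPSIS$" nil t)
      ;; Skip the newline following the opening tag.
      (let ((beg (1+ (point))))
        (when (re-search-forward "^#\\+END_SYNOPSIS$" nil t)
          ;; Stop before the newline preceding the closing tag.
          (buffer-substring-no-properties
           beg (1- (match-beginning 0))))))))
```

The caller can then decide what to print for posts without a synopsis, for example an empty string.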
List of blog posts
For handy access to blog posts, the site features a page containing the list of all blog posts with a short synopsis, the date of publishing, the author's name and the link to the post in the form of a button (see Figure 1).
To create this page, we use the sitemap functionality in Org mode. The default appearance of the sitemap is rather basic. To customize it so the list of blog posts suits the design of the site, we need to define our own functions for formatting the sitemap (list of blog posts) and its items (blog posts).
Formatting items
The function format-blog-item changes the formatting of the sitemap item entry (blog post) belonging to project (see Project components). Note that entry is the absolute path to the Org file of the blog post being processed. Also, I don't use the sitemap style argument here.
(defun format-blog-item (entry style project) (let
Unfortunately, when the function is called by the Emacs export machinery, the absolute path provided in entry is incorrect. It lacks the parent folder blog because Emacs thinks it is running in the project's root although the current working folder, when exporting blog posts, is blog (see Blog). Therefore, I have to re-include blog/ into the path.
For example, if the initial entry holds /home/marek/src/felsoci.sk/post.org, I need to transform it to /home/marek/src/felsoci.sk/blog/post.org.
((fixed-entry
(concat
(file-name-directory entry) "blog/" (file-name-nondirectory entry))))
Finally, I return the Org string corresponding to the blog post (sitemap) entry, formatted using the format function, which is similar to sprintf in C.
(format " @@html:<h2 class=\"post-title\">@@ [[file:%s][%s]] @@html:</h2><span class=\"post-metadata\">@@ Published on %s by %s @@html:</span>@@ %s @@html:<a href=\"@@%s@@html:.html\"><button>Read more</button></a>@@ "
All of the %s are replaced by the values of the arguments following the string to format:
the path to the blog post Org document,
entry
the title of the post found in the Org document under the #+TITLE directive,
(org-publish-find-title entry project)
the formatted date of publishing,
(last-modified (concat (plist-get (cdr project) :base-directory) "/" entry))
the author's name extracted from the project property list project,
(substring (format "%s" (org-publish-find-property entry :author project)) 1 -1)
the synopsis of the blog post retrieved using our custom parsing function get-post-synopsis,
(get-post-synopsis fixed-entry)
the path to the blog post file without extension, because the link is not converted into an HTML link during the export as we do not use a standard Org-formatted link such as [[target][text]] but a button.
(file-name-sans-extension entry))))
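The path correction stored in fixed-entry can be tried in isolation. The snippet below wraps the same concat expression in a small helper; the name my/fix-entry is hypothetical and exists only for this illustration.

```elisp
(defun my/fix-entry (entry)
  "Re-insert the blog/ folder right before the file name in ENTRY.
Hypothetical helper wrapping the expression bound to fixed-entry."
  (concat (file-name-directory entry)
          "blog/"
          (file-name-nondirectory entry)))

(my/fix-entry "/home/marek/src/felsoci.sk/post.org")
;; => "/home/marek/src/felsoci.sk/blog/post.org"

;; For a bare file name, `file-name-directory' returns nil, which
;; `concat' treats as an empty sequence.
(my/fix-entry "post.org")
;; => "blog/post.org"
```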
Formatting the list
The function format-blog-sitemap replaces the default function for generating the sitemap, which represents the list of blog posts in our case. It outputs an Org document having the title title. The blog posts formatted by the function format-blog-item are available as a list through the posts argument.
Actually, the function represents a concatenation of the title
(defun format-blog-sitemap (title posts) (concat "#+TITLE: " title "\n\n"
and the items of posts separated by a newline character and a horizontal line in the resulting Org document (see Figure 1).
Note that posts is a nested list having the form:
- ‘unordered’
- ‘list of possibly nested posts’
- ‘list of possibly nested posts’
- …
Therefore, I have to transform it into a simple list containing only the leading elements of the nested post lists. To achieve this, I first strip the leading ‘unordered’ symbol using cdr. Then, I apply seq-filter with car as the predicate to drop the entries whose leading element is empty. Finally, the function passed to mapconcat keeps only the leading element of each remaining entry.
   (mapconcat (lambda (post) (format "%s\n" (car post)))
              (seq-filter #'car (cdr posts))
              "\n")))
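To see what this transformation does, here is a small sketch applied to a hand-written list that mimics the shape of posts. The sample data and the name my/join-posts are made up for the illustration; real entries are the Org strings produced by format-blog-item.

```elisp
(require 'seq)

(defun my/join-posts (posts)
  "Join the leading elements of the entries in POSTS, skipping empty ones.
POSTS has the form (unordered ENTRY...).  Hypothetical helper."
  (mapconcat (lambda (post) (format "%s\n" (car post)))
             ;; `cdr' drops the list type, `seq-filter' drops the
             ;; entries whose leading element is nil.
             (seq-filter #'car (cdr posts))
             "\n"))

(my/join-posts '(unordered
                 ("Post A")
                 ("Post B" (unordered ("nested")))
                 (nil)))
;; => "Post A\n\nPost B\n"
```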
Page titles
By default, the title of an output HTML page corresponds to the title of the original Org document. In addition to this title, I want to add a suffix, e.g. ‘Title - My site’.
To achieve this, I define the function add-suffix-to-html-title taking as arguments the suffix to append and the list of html-files to process.
(defun add-suffix-to-html-title (suffix html-files)
For each HTML file in html-files, the function reads the content of the file,
  (while (setq html-file (pop html-files))
    (with-temp-buffer
      (insert-file-contents html-file)
moves the cursor to the end of the buffer and searches backward for the closing </title> HTML tag.
(goto-char (point-max))
(re-search-backward "<\\/title>")
With the cursor at the beginning of the match, it inserts the text in suffix into the buffer immediately after the last character of the original document's title and saves the modified buffer.
      (insert suffix)
      (write-region 1 (point-max) html-file))))
Then, I define two wrappers for this function because I want to add a different suffix depending on whether the page is a content page or a blog post.
The wrapper add-suffix-to-html-title-for-pages calls the original function add-suffix-to-html-title after publishing content pages and adds the suffix ‘ - Marek Felšöci’. Note that the list of corresponding HTML files is acquired through the project component property :publishing-directory read from the plist argument (see Project components).
(defun add-suffix-to-html-title-for-pages (plist)
  (add-suffix-to-html-title
   " - Marek Felšöci"
   (directory-files (plist-get plist :publishing-directory) t "\\.html$")))
The wrapper add-suffix-to-html-title-for-blog-posts calls the original function add-suffix-to-html-title when exporting blog posts and adds the suffix ‘ - Marek's blog’ to the titles of blog posts.
(defun add-suffix-to-html-title-for-blog-posts (plist)
  (add-suffix-to-html-title
   " - Marek's blog"
   (directory-files (plist-get plist :publishing-directory) t "\\.html$")))
These functions are called completion functions as they are triggered after publishing (see Sources and destinations in the Org Manual).
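The function add-suffix-to-html-title relies on the variable html-file being set with setq inside a while loop. A possible alternative, shown here only as a sketch, is to iterate with dolist so that the loop variable stays local, and to skip files that contain no closing title tag. The name my/add-suffix-to-html-title is hypothetical.

```elisp
(defun my/add-suffix-to-html-title (suffix html-files)
  "Insert SUFFIX before the closing title tag of each file in HTML-FILES.
Files without a closing title tag are left untouched.  Hypothetical
variant of add-suffix-to-html-title."
  (dolist (html-file html-files)
    (with-temp-buffer
      (insert-file-contents html-file)
      (goto-char (point-max))
      ;; A literal search is enough here, no regexp needed.
      (when (search-backward "</title>" nil t)
        (insert suffix)
        (write-region (point-min) (point-max) html-file nil 'silent)))))
```

For instance, (my/add-suffix-to-html-title " - My site" '("about.html")) would turn a title ‘About’ into ‘About - My site’, assuming about.html exists in the current folder.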
Last modification date
To include the last modification date in every page and blog post, I use another completion function.
It begins by acquiring the list of original Org files through the project component property :base-directory read from the plist argument (see Project components).
(defun add-last-modification-date (plist)
  (let* ((org-files
          (directory-files (plist-get plist :base-directory) t "\\.org$"))
I also need to get the path to the publishing directory through the component property :publishing-directory.
(output-directory
(plist-get plist :publishing-directory)))
The idea is to determine the last modification dates of the original Org documents using the function last-modified from Publishing script and to insert the dates into the published HTML documents right before the footer (see General configuration).
To do this, I loop over each of the original Org documents to:
determine its last modification date,
    (while (setq org-file (pop org-files))
      (setq last-modification-date (last-modified org-file))
get the path to the corresponding output HTML document,
(setq output-html-file (concat output-directory "/" (file-name-base org-file) ".html"))
open the HTML document, place the cursor before the opening <div> tag of the footer, insert the last modification date and save the modification.
      (with-temp-buffer
        (insert-file-contents output-html-file)
        (goto-char (point-max))
        (re-search-backward "<div id=\"postamble\"")
        (insert "<div class=\"content\"><p id=\"last-modification\">"
                "Last update on " last-modification-date "</p></div>")
        (write-region 1 (point-max) output-html-file)))))
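The insertion step can be exercised on its own. Below is a minimal sketch of a helper that places arbitrary text right before the postamble of one HTML file and reports whether the postamble was found. The name my/insert-before-postamble is hypothetical; only the searched <div id="postamble" marker comes from the function above.

```elisp
(defun my/insert-before-postamble (html-file text)
  "Insert TEXT before the postamble <div> of HTML-FILE.
Return t when the postamble was found and the file rewritten, nil
otherwise.  Hypothetical helper."
  (with-temp-buffer
    (insert-file-contents html-file)
    (goto-char (point-max))
    (when (search-backward "<div id=\"postamble\"" nil t)
      ;; Point is at the beginning of the match, so TEXT lands
      ;; immediately before the opening tag.
      (insert text)
      (write-region (point-min) (point-max) html-file nil 'silent)
      t)))
```

Returning nil instead of signaling an error lets a completion function carry on with the remaining files when one of them has no postamble.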
General configuration
Before configuring the publishing of the site, I set a couple of general preferences.
I disable the use of Org timestamp flags so that all files get published on every run, not only the changed ones.
(setq org-publish-use-timestamps-flag nil)
I also disable the prompt before each code block evaluation.
(setq org-confirm-babel-evaluate nil)
Then, I want to preserve the indentation in code blocks on export and tangle.
(setq org-src-preserve-indentation t)
In order to ensure the bibliography entries, if any, are published correctly, I override the default LaTeX publishing command to use latexmk.
(setq org-latex-pdf-process (list "latexmk --shell-escape -f -pdf %f"))
Moreover, I need to configure the HTML export to include the header and the footer in every exported page.
(setq org-html-preamble (org-file-contents "./shared/header.html"))
(setq org-html-postamble (org-file-contents "./shared/footer.html"))
In order to include my custom CSS style and configure the favicon, I add two extra lines to the HTML header.
(setq org-html-head-extra "<link rel=\"stylesheet\" type=\"text/css\" href=\"../styles/custom.css\"> <link rel=\"icon\" type=\"image/x-icon\" href=\"https://felsoci.sk/favicon.ico\"/>")
Finally, I define a utility function allowing me to change the output folder through an environment variable, namely ORG_OUTPUT_PATH. This way, I can switch between my local Apache server for testing and the production server easily. If the variable is not set in the current environment, the output will be published into the public folder located in the root of the project.
Note that the optional suffix argument specifies the local path starting from the root of the output folder.
(defun get-output-path (&optional suffix)
  (let ((custom (getenv "ORG_OUTPUT_PATH")))
    (if custom
        (concat custom "/" suffix)
      (concat "./public/" suffix))))
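The effect of the environment variable can be checked from within Emacs. In the sketch below, my/get-output-path is a copy of get-output-path under a hypothetical name so that the snippet is self-contained, and /var/www/html is a made-up destination.

```elisp
(defun my/get-output-path (&optional suffix)
  "Return the output folder, optionally extended by SUFFIX.
Copy of get-output-path kept under a hypothetical name."
  (let ((custom (getenv "ORG_OUTPUT_PATH")))
    (if custom
        (concat custom "/" suffix)
      (concat "./public/" suffix))))

;; Work on a private copy of the environment so that the changes made
;; by `setenv' do not leak into the rest of the session.
(let ((process-environment (copy-sequence process-environment)))
  (setenv "ORG_OUTPUT_PATH" nil)              ; variable unset
  (my/get-output-path "blog")                 ; => "./public/blog"
  (setenv "ORG_OUTPUT_PATH" "/var/www/html")  ; made-up destination
  (my/get-output-path "blog"))                ; => "/var/www/html/blog"
```

On the command line, the same switch amounts to setting ORG_OUTPUT_PATH before launching Emacs.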
Project components
The last thing to do is to define the org-publish-project-alist. It represents the list of the project's components and their individual export configuration as a list of properties, e.g. :publishing-directory.
(setq org-publish-project-alist
(list
I split the site project into 5 components.
Blog
All of the configuration properties are pretty self-explanatory.
(list "blog"
      :base-directory "./blog"
      :base-extension "org"
      :publishing-directory (get-output-path "blog")
      :htmlized-source t
      :with-author t
      :with-creator t
      :with-date t
      :headline-levels 4
      :section-numbers nil
      :with-toc nil
      :html-head nil
      :html-head-include-default-style nil
      :html-head-include-scripts nil
The publishing function I choose deserves a mention, though. It tells Emacs to publish the Org documents composing this project component in the HTML format.
:publishing-function '(org-html-publish-to-html)
The :completion-function property allows me to define functions to execute after publishing. Here, I set add-last-modification-date and add-suffix-to-html-title-for-blog-posts as completion functions (see Last modification date and Page titles).
:completion-function '(add-last-modification-date
add-suffix-to-html-title-for-blog-posts)
Finally, I configure the sitemap corresponding to the list of blog posts. The title is ‘Posts’ and the posts are sorted from the latest to the oldest one.
:auto-sitemap t :sitemap-filename "posts.org" :sitemap-title "Posts" :sitemap-sort-files 'anti-chronologically
Moreover, I use the functions format-blog-sitemap and format-blog-item to format the entries of the sitemap (blog post items) as well as the sitemap (list of blog posts) itself (see List of blog posts).
:sitemap-function 'format-blog-sitemap :sitemap-format-entry 'format-blog-item)
Content pages
The export configuration for the content pages such as Home and About is very close to the previous one
(list "pages"
      :base-directory "."
      :base-extension "org"
      :publishing-directory (get-output-path)
      :publishing-function '(org-html-publish-to-html)
      :htmlized-source t
      :with-author t
      :with-creator t
      :with-date t
      :headline-levels 4
      :section-numbers nil
      :with-toc nil
      :html-head nil
      :html-head-include-default-style nil
      :html-head-include-scripts nil
except for the title suffix function add-suffix-to-html-title-for-pages (see Page titles).
:completion-function '(add-last-modification-date
add-suffix-to-html-title-for-pages)
Furthermore, I must exclude the blog folder from the list of input documents to prevent duplicate export.
:exclude (regexp-opt '("blog")))
CV
The most important thing in the export configuration for the CV is the publishing function. Here, I use the function that exports Org documents to PDF through LaTeX.
(list "cv" :base-directory "./cv" :base-extension "org" :publishing-directory (get-output-path "cv") :publishing-function '(org-latex-publish-to-pdf))
Styles, images and other attachments
For static files such as CSS styles, images and other attachments, which are published as is, I use the publishing function for attachments. For the styles folder, I enable recursive lookup in order to also include the fonts sub-folder. The same goes for attachments (see Project's structure).
(list "styles"
      :base-directory "./styles"
      :base-extension ".*"
      :recursive t
      :publishing-directory (get-output-path "styles")
      :publishing-function '(org-publish-attachment))
(list "images"
      :base-directory "./images"
      :base-extension ".*"
      :publishing-directory (get-output-path "images")
      :publishing-function '(org-publish-attachment))
(list "attachments"
      :base-directory "./blog/attachments"
      :base-extension ".*"
      :recursive t
      :publishing-directory (get-output-path "blog/attachments")
      :publishing-function '(org-publish-attachment))
I complete the list with a component named after the site which gathers the components of the project.
(list "felsoci.sk" :components '("blog" "pages" "styles" "images" "attachments"))))
Ready, steady, go!
At this point, I am ready to go. To launch the publishing, I use the following shell command.
Notice that, in this command line, I disable the confirmation before evaluating each code block for the sake of simplicity.
emacs --batch --no-init-file --eval '(setq org-confirm-babel-evaluate nil)' \
      --load publish.el --funcall org-publish-all
Feel free to send me your feedback!
Acknowledgement
Many thanks to Dennis Ogbe who published a similar post on his website. It helped me a lot while building my own publishing configuration!