Marek Felšöci

Creating website and blogging in Org mode

I discovered the power of Org mode after joining Inria (see Home) to work on my Ph.D. thesis, which is also written entirely in Org mode. Indeed, one can easily export an Org mode document to an HTML page or a PDF document typeset in LaTeX. Recently, I committed myself to overhauling my personal website and decided to produce it using Org. In this post, I detail the whole process step by step.

Project's structure

The idea here is to build a static HTML website generated from a collection of Org documents. On the one hand, the site has a couple of content pages such as Home, About and so on. On the other hand, it features a small blog as well.

When it comes to the style of the site, I am looking for simplicity. Although there are some great Org HTML templates, they are better suited to standalone HTML pages than to a complete website with navigation. So, I prefer to define a tiny custom CSS style sheet to keep the assembly of the site as easy to maintain as possible.

The file structure of the project is described below.

.
├── blog
│   ├── creating-website-and-blogging-in-org-mode.org
│   ├── attachments
│   └── ...
├── cv
│   └── cv-felsoci.org
├── images
│   ├── marek.jpg
│   └── ...
├── styles
│   └── custom.css
├── shared
│   ├── footer.html
│   └── header.html
├── about.org
├── index.org
├── publish.el
├── README.md
├── research.org
└── teaching.org

The blog folder holds the Org documents of the blog posts. The attachments subfolder contains only static files to be published as is. cv contains a formal version of my CV published as a PDF document and accessible for download from the home page. images naturally contains all the image files featured on the site. The custom CSS sheet resides in styles. shared holds the common static header and footer files. The Org documents corresponding to the content pages are stored in the root of the project's folder. Finally, the Emacs Lisp script publish.el controls the publishing of the site.

Setup file

In order to include the last modification time as well as the author's name in every page, I use a common Org setup file.

Besides a few lines of HTML, it calls the Elisp function modification-time to determine the last modification date and time and include them on every page that pulls in the setup file using the #+INCLUDE directive.

Finally, it uses the #+AUTHOR directive to configure the same author's name everywhere.
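
The setup file itself is not reproduced in this post. Purely as a rough sketch, a helper in the spirit of modification-time could look like the following. The name my/modification-time-string and the output format are assumptions made for illustration, not the function actually used on the site.

```elisp
;; Hypothetical helper, NOT the site's actual `modification-time'.
;; It formats the filesystem modification time of FILE.
(defun my/modification-time-string (file)
  "Return the last modification date and time of FILE as a string."
  (format-time-string
   "%d/%m/%Y %H:%M"
   (file-attribute-modification-time (file-attributes file))))
```

Such a function could then be called from a source block or a macro in the setup file, depending on how the page is meant to display the value.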

Publishing script

The core of the project is the Elisp publishing script publish.el responsible for generating the final HTML source of the site.

It begins by importing the Emacs packages required for:

  • Org mode support,

    (require 'org)
    
  • HTML export backend,

    (require 'ox-html)
    
  • publishing functions,

    (require 'ox-publish)
    
  • source code block export to HTML,

    (require 'htmlize)
    
  • bibliography support.

    (require 'org-ref)
    

Then, I define a utility function last-modified allowing me to get the date of last modification of a file for the list of blog posts (see Formatting items).

First, the function tries to find the date of the last Git commit that touched the file, as well as the last modification time of the file on the local filesystem.

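(defun last-modified (file)
  (let*
      ((last-commit-date
        (shell-command-to-string
         (concat
          ;; Quote the path so that spaces or shell metacharacters
          ;; in FILE cannot break the command line.
          "git log -1 --pretty=\"format:%cD\" -- "
          (shell-quote-argument file))))
       (last-modification-date
        (file-attribute-modification-time
         (file-attributes file))))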

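If there is no commit involving the file, the function falls back to the last modification time from the filesystem. Preferring the commit date prevents wrong dates from appearing after cloning the site's repository, since a fresh clone resets the modification times of the files.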

    (if (string= last-commit-date "")
        ;; No commit yet: format the filesystem timestamp directly.
        (format-time-string "%d/%m/%Y" last-modification-date)
      ;; Otherwise reformat the commit date with date(1) and drop the
      ;; trailing newline of its output.
      (substring
       (shell-command-to-string
        (concat
         "date -d \""
         last-commit-date
         "\" +%d/%m/%Y")) 0 -1))))

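The -d option of date used above is specific to GNU date. As a hedged alternative, the sketch below asks Git to format the commit date itself through its --date=format: option, and falls back to the filesystem time when Git is unavailable or knows nothing about the file. The name my/last-modified-portable is hypothetical and this is not the function used on the site.

```elisp
(require 'subr-x)

;; Hypothetical variant of `last-modified' that does not call date(1).
(defun my/last-modified-portable (file)
  "Return the last commit date of FILE as dd/mm/YYYY.
Fall back to the filesystem modification time when there is no commit."
  (let* ((default-directory
          (file-name-directory (expand-file-name file)))
         (commit-date
          (with-temp-buffer
            ;; Standard output goes to the temporary buffer,
            ;; standard error is discarded.
            (if (and (executable-find "git")
                     (eq 0 (call-process
                            "git" nil (list t nil) nil
                            "log" "-1"
                            "--date=format:%d/%m/%Y"
                            "--pretty=format:%cd"
                            "--" (file-name-nondirectory file))))
                (string-trim (buffer-string))
              ""))))
    (if (string-empty-p commit-date)
        (format-time-string
         "%d/%m/%Y"
         (file-attribute-modification-time (file-attributes file)))
      commit-date)))
```

Using call-process with an argument list also sidesteps shell quoting altogether, at the cost of a slightly longer function.
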
Blog post synopsis

Each blog post may contain a synopsis used to introduce the content of the post in the list of blog posts:

Figure 1: Excerpt of the list of blog posts.

In the source Org document, the synopsis text must be enclosed between the #+BEGIN_SYNOPSIS and #+END_SYNOPSIS tags.

To extract the synopsis, I define the function get-post-synopsis, which takes the path to a blog post, blog-post, as its argument.

(defun get-post-synopsis (blog-post)

The first thing to do is to load the Org file pointed to by blog-post

  (with-temp-buffer
    (insert-file-contents blog-post)

and move the cursor to the beginning of the document.

    (goto-char (point-min))

In the core of the function, I use the positions beg and end to delimit the area in the buffer between the first and the last character of the synopsis. To exclude the newline after the opening tag and the one before the closing tag, I move the starting position forward by one and the ending position backward by one.

    (let
        ((beg (+ 1 (re-search-forward "^#\\+BEGIN_SYNOPSIS$")))
         (end (- (progn
                   (re-search-forward "^#\\+END_SYNOPSIS$")
                   (match-beginning 0)) 1)))

At the end, the function returns the sub-string of the buffer corresponding to the area between the two markers.

      (buffer-substring beg end))))
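
Since a synopsis is optional, it may help to see how a missing one could be handled: as written, re-search-forward signals an error when the tag is absent. The following is only a sketch under that assumption. The name my/get-post-synopsis-safe and the choice to return an empty string are mine, not part of the site's code.

```elisp
(require 'subr-x)

;; Hypothetical tolerant variant of `get-post-synopsis'.
(defun my/get-post-synopsis-safe (blog-post)
  "Return the synopsis found in BLOG-POST, or \"\" when there is none."
  (with-temp-buffer
    (insert-file-contents blog-post)
    (goto-char (point-min))
    ;; The NOERROR argument makes the searches return nil instead of
    ;; signaling an error when a tag is missing.
    (if (re-search-forward "^#\\+BEGIN_SYNOPSIS[ \t]*\n" nil t)
        (let ((beg (point)))
          (if (re-search-forward "^#\\+END_SYNOPSIS[ \t]*$" nil t)
              (string-trim-right
               (buffer-substring-no-properties beg (match-beginning 0)))
            ""))
      "")))
```

A post without the tags then simply yields an empty synopsis in the list of blog posts instead of aborting the export.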

List of blog posts

For handy access to blog posts, the site features a page containing the list of all blog posts with a short synopsis, the date of publishing, the author's name and the link to the post in the form of a button (see Figure 1).

To create this page, we use the sitemap functionality in Org mode. The default appearance of the sitemap is rather basic. To customize it so the list of blog posts suits the design of the site, we need to define our own functions for formatting the sitemap (list of blog posts) and its items (blog posts).

Formatting items

The function format-blog-item changes the formatting of the sitemap item entry (blog post) belonging to project (see Project components). Note that entry is the absolute path to the Org file of the blog post being processed. Also, I don't use the sitemap style argument here.

(defun format-blog-item (entry style project)
  (let

Unfortunately, when the function is called by the Emacs export machinery, the absolute path provided in entry is incorrect. It lacks the parent folder blog because Emacs thinks it is running in the project's root although the current working folder, when exporting blog posts, is blog (see Blog). Therefore, I have to re-include blog/ into the path.

For example, if the initial entry holds /home/marek/src/felsoci.sk/post.org, I need to transform it to /home/marek/src/felsoci.sk/blog/post.org.

      ((fixed-entry
        (concat
         (file-name-directory entry) "blog/" (file-name-nondirectory entry))))

Finally, the function returns the Org string corresponding to the blog post (sitemap) entry, built with the format function, which is similar to sprintf in C.

    (format "
@@html:<h2 class=\"post-title\">@@
[[file:%s][%s]]
@@html:</h2><span class=\"post-metadata\">@@
Published on %s by %s
@@html:</span>@@

%s

@@html:<a href=\"@@%s@@html:.html\"><button>Read more</button></a>@@
"

All of the %s are replaced by the values of the arguments following the string to format:

  1. the path to the blog post Org document

                entry
    
  2. the title of the post found in the Org document under the #+TITLE directive

                (org-publish-find-title entry project)
    
  3. the formatted date of publishing

                (last-modified
                 (concat
                  (plist-get (cdr project) :base-directory)
                  "/"
                  entry))
    
  4. the author's name extracted from the project property list project

                (substring
                 (format "%s"
                         (org-publish-find-property entry :author project)) 1 -1)
    
  5. the synopsis of the blog post retrieved using our custom parsing function get-post-synopsis

                (get-post-synopsis fixed-entry)
    
  6. the path to the blog post file without its extension; because the link is a button rather than a standard Org-formatted link such as [[target][text]], it is not converted into an HTML link during the export, so the .html extension is written out in the format string

                (file-name-sans-extension entry))))
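
To make two of the less obvious expressions above concrete, here is a small illustration. The function names my/fix-entry and my/author-string are hypothetical wrappers introduced for this sketch, and the author value is assumed to come back as a one-element list, which is what the substring call suggests.

```elisp
;; Illustration only: hypothetical wrappers around two expressions
;; used in `format-blog-item'.
(defun my/fix-entry (entry)
  "Re-insert the blog/ folder in front of the file name in ENTRY."
  (concat (file-name-directory entry)
          "blog/"
          (file-name-nondirectory entry)))

(defun my/author-string (author)
  "Print AUTHOR, a list such as (\"Name\"), without its parentheses."
  (substring (format "%s" author) 1 -1))

(my/fix-entry "/home/marek/src/felsoci.sk/post.org")
;; → "/home/marek/src/felsoci.sk/blog/post.org"
(my/fix-entry "post.org")
;; → "blog/post.org"
(my/author-string '("Marek Felšöci"))
;; → "Marek Felšöci"
```

The second call works because file-name-directory returns nil for a bare file name and concat treats nil as an empty sequence.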
    

Formatting the list

The function format-blog-sitemap replaces the default function for generating the sitemap, which represents the list of blog posts in our case. It outputs an Org document having the title title. The blog posts formatted by the function format-blog-item are available as a list through the posts argument.

Actually, the function represents a concatenation of the title

(defun format-blog-sitemap (title posts)
  (concat
   "#+TITLE: " title "\n\n"

and the items of posts separated by a newline character and a horizontal line in the resulting Org document (see Figure 1).

Note that posts is a nested list of the form:

  • ‘unordered’
  • ‘list of possibly nested posts’
  • ‘list of possibly nested posts’

Therefore, I have to reduce it to the leading elements of the nested post lists. To achieve this, I first strip the leading ‘unordered’ element using cdr. Then, seq-filter with car as the predicate keeps only the items whose leading element is non-nil, and the lambda passed to mapconcat takes the car of each remaining item, i.e. the formatted post itself.

   (mapconcat
    (lambda (post)
      (format "%s\n" (car post)))
    (seq-filter #'car (cdr posts))
    "\n")))

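The transformation can be tried in isolation on a small hand-written list. The sample data and the names my/sample-posts and my/join-posts below are made up for the illustration; only the mapconcat and seq-filter expression comes from the function above.

```elisp
(require 'seq)

;; Hypothetical sample shaped like the POSTS argument described above.
(defvar my/sample-posts
  '(unordered
    ("First post")
    ("Second post" (unordered ("nested entry")))))

(defun my/join-posts (posts)
  "Concatenate the leading element of each item in POSTS."
  (mapconcat
   (lambda (post)
     (format "%s\n" (car post)))
   (seq-filter #'car (cdr posts))
   "\n"))

(my/join-posts my/sample-posts)
;; → "First post\n\nSecond post\n"
```

The nested entry is dropped because only the car of each top-level item is ever printed.
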
Page titles

By default, the title of an output HTML page corresponds to the title of the original Org document. In addition to this title, I want to add a suffix, e.g. ‘Title - My site’.

To achieve this, I define the function add-suffix-to-html-title, which takes as arguments the suffix to append and the list of html-files to process.

(defun add-suffix-to-html-title (suffix html-files)

For each HTML file in html-files, the function reads the content of the file,

  (dolist (html-file html-files)
    (with-temp-buffer
      (insert-file-contents html-file)

navigates the cursor to the end of the buffer and backward searches for the closing </title> HTML tag.

      (goto-char (point-max))
      (re-search-backward "<\\/title>")

With the cursor at the beginning of the match, it inserts the text in suffix into the buffer immediately after the last character of the original document's title and saves the modified buffer.

      (insert suffix)
      (write-region 1 (point-max) html-file))))
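
The buffer manipulation is easier to check on a string than on published files. The sketch below applies the same backward search to an HTML string. The name my/append-title-suffix is hypothetical, and unlike the function above it leaves the input untouched when no closing </title> tag is found.

```elisp
;; Hypothetical string-based counterpart of `add-suffix-to-html-title'.
(defun my/append-title-suffix (html suffix)
  "Return HTML with SUFFIX inserted right before the closing title tag."
  (with-temp-buffer
    (insert html)
    (goto-char (point-max))
    ;; After a successful backward search, point is at the beginning
    ;; of the match, so the insertion lands just before </title>.
    (when (re-search-backward "</title>" nil t)
      (insert suffix))
    (buffer-string)))

(my/append-title-suffix "<head><title>About</title></head>" " - My site")
;; → "<head><title>About - My site</title></head>"
```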

Then, I define two wrappers for this function because I want to add a different suffix depending on whether the page is a content page or a blog post.

The wrapper add-suffix-to-html-title-for-pages calls the original function add-suffix-to-html-title after publishing content pages and adds the suffix ‘ - Marek Felšöci’. Note that the list of corresponding HTML files is acquired through the project component property :publishing-directory read from the plist argument (see Project components).

(defun add-suffix-to-html-title-for-pages (plist)
  (add-suffix-to-html-title
   " - Marek Felšöci"
   (directory-files
    (plist-get plist :publishing-directory) t "\\.html$")))

The wrapper add-suffix-to-html-title-for-blog-posts calls the original function add-suffix-to-html-title when exporting blog posts and adds the suffix ‘ - Marek's blog’ to the titles of blog posts.

(defun add-suffix-to-html-title-for-blog-posts (plist)
  (add-suffix-to-html-title
   " - Marek's blog"
   (directory-files
    (plist-get plist :publishing-directory) t "\\.html$")))

These functions are called completion functions as they are triggered after publishing (see Sources and destinations in the Org Manual).

Last modification date

To include the last modification date in every page and blog post, I use another completion function.

It begins by acquiring the list of original Org files through the project component property :base-directory read from the plist argument (see Project components).

(defun add-last-modification-date (plist)
  (let*
      ((org-files
        (directory-files
         (plist-get plist :base-directory) t "\\.org$"))

I also need to get the path to the publishing directory through the component property :publishing-directory.

       (output-directory
        (plist-get plist :publishing-directory)))

The idea is to determine the last modification dates of the original Org documents using the function last-modified from Publishing script and insert the dates into the published HTML documents right before the footer (see General configuration).

To do this, I loop over each of the original Org documents to:

  • determine its last modification date,

        (dolist (org-file org-files)
          (setq last-modification-date
                (last-modified org-file))
    
  • get the path to the corresponding output HTML document,

          (setq output-html-file
                (concat
                 output-directory "/" (file-name-base org-file) ".html"))
    
  • open the HTML document, place the cursor before the opening <div> tag of the footer, insert the last modification date and save the modification.

          (with-temp-buffer
            (insert-file-contents output-html-file)
            (goto-char (point-max))
            (re-search-backward "<div id=\"postamble\"")
            (insert
             "<div class=\"content\"><p id=\"last-modification\">"
             "Last update on "
             last-modification-date
             "</p></div>")
            (write-region 1 (point-max) output-html-file)))))
    

General configuration

Before configuring the publishing of the site, I set a couple of general preferences.

I disable Org's publishing timestamps so that every file is published on each run, not only the files that changed since the previous one.

(setq org-publish-use-timestamps-flag nil)

I also disable the prompt before each code block evaluation.

(setq org-confirm-babel-evaluate nil)

Then, I want to preserve the indentation in code blocks on export and tangle.

(setq org-src-preserve-indentation t)

In order to ensure that the bibliography entries, if any, are published correctly, I override the default LaTeX publishing command to use latexmk.

(setq org-latex-pdf-process (list "latexmk --shell-escape -f -pdf %f"))

Moreover, I need to instruct the HTML export to include the header and the footer in every exported page.

(setq org-html-preamble (org-file-contents "./shared/header.html"))
(setq org-html-postamble (org-file-contents "./shared/footer.html"))

In order to include my custom CSS style and configure the favicon, I add two extra lines to the HTML header.

(setq org-html-head-extra "<link rel=\"stylesheet\" type=\"text/css\"
href=\"../styles/custom.css\">
<link rel=\"icon\" type=\"image/x-icon\"
href=\"https://felsoci.sk/favicon.ico\"/>")

Finally, I define a utility function allowing me to change the output folder through an environment variable, namely ORG_OUTPUT_PATH. This way, I can switch between my local Apache server for testing and the production server easily. If the variable is not set in the current environment, the output will be published into the public folder located in the root of the project.

Note that the optional suffix argument specifies the local path starting from the root of the output folder.

(defun get-output-path (&optional suffix)
  (let
      ((custom (getenv "ORG_OUTPUT_PATH")))
    (if custom
        (concat custom "/" suffix)
      (concat "./public/" suffix))))
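
As a possible variation, path joining can be delegated to expand-file-name, which avoids a doubled slash when ORG_OUTPUT_PATH already ends with one. This is a sketch only: the name my/get-output-path is hypothetical, and unlike the function above it returns absolute paths.

```elisp
;; Hypothetical variant of `get-output-path' built on `expand-file-name'.
(defun my/get-output-path (&optional suffix)
  "Return the absolute output folder, optionally extended by SUFFIX."
  (let ((root (or (getenv "ORG_OUTPUT_PATH") "./public")))
    (if suffix
        (expand-file-name suffix root)
      (expand-file-name root))))

;; With ORG_OUTPUT_PATH set to "/var/www/site/":
;; (my/get-output-path "blog") → "/var/www/site/blog"
```

When the variable is unset, the result is resolved against the current default-directory, so the script still has to be run from the root of the project.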

Project components

The last thing to do is to define org-publish-project-alist. It represents the list of the project's components and their individual export configuration as a list of properties, e.g. :publishing-directory.

(setq org-publish-project-alist
      (list

I split the site project into 5 components.

Blog

All of the configuration properties are pretty self-explanatory.

       (list "blog"
             :base-directory "./blog"
             :base-extension "org"
             :publishing-directory (get-output-path "blog")
             :htmlized-source t
             :with-author t
             :with-creator t
             :with-date t
             :headline-level 4
             :section-numbers nil
             :with-toc nil
             :html-head nil
             :html-head-include-default-style nil
             :html-head-include-scripts nil

I do want to highlight the publishing function I chose, though. It tells Emacs to publish the Org documents composing this project component in the HTML format.

             :publishing-function '(org-html-publish-to-html)

The :completion-function property allows me to define functions to execute after publishing. Here, I set add-last-modification-date and add-suffix-to-html-title-for-blog-posts as completion functions (see Last modification date and Page titles).

             :completion-function '(add-last-modification-date
                                    add-suffix-to-html-title-for-blog-posts)

Finally, I configure the sitemap corresponding to the list of blog posts. The title is ‘Posts’ and the posts are sorted from the latest to the oldest one.

             :auto-sitemap t
             :sitemap-filename "posts.org"
             :sitemap-title "Posts"
             :sitemap-sort-files 'anti-chronologically

Moreover, I use the functions format-blog-sitemap and format-blog-item to format the sitemap (list of blog posts) itself as well as its entries (blog post items) (see List of blog posts).

             :sitemap-function 'format-blog-sitemap
             :sitemap-format-entry 'format-blog-item)

Content pages

The export configuration for the content pages such as Home and About is very close to the previous one

        (list "pages"
              :base-directory "."
              :base-extension "org"
              :publishing-directory (get-output-path)
              :publishing-function '(org-html-publish-to-html)
              :htmlized-source t
              :with-author t
              :with-creator t
              :with-date t
              :headline-level 4
              :section-numbers nil
              :with-toc nil
              :html-head nil
              :html-head-include-default-style nil
              :html-head-include-scripts nil

except for the title suffix function add-suffix-to-html-title-for-pages (see Page titles).

              :completion-function '(add-last-modification-date
                                     add-suffix-to-html-title-for-pages)

Furthermore, I must exclude the blog folder from the list of input documents to prevent duplicate export.

              :exclude (regexp-opt '("blog")))

CV

The most important thing in the export configuration for the CV is the publishing function. Here, I use the function allowing me to publish PDF documents on output.

        (list "cv"
              :base-directory "./cv"
              :base-extension "org"
              :publishing-directory (get-output-path "cv")
              :publishing-function '(org-latex-publish-to-pdf))

Styles, images and other attachments

For static files such as CSS styles, images and other attachments, which are published as is, I use the publishing function for attachments. For the styles folder, I enable recursive lookup in order to also include the fonts sub-folder. The same goes for attachments (see Project's structure).

        (list "styles"
              :base-directory "./styles"
              :base-extension ".*"
              :recursive t
              :publishing-directory (get-output-path "styles")
              :publishing-function '(org-publish-attachment))
        (list "images"
              :base-directory "./images"
              :base-extension ".*"
              :publishing-directory (get-output-path "images")
              :publishing-function '(org-publish-attachment))
        (list "attachments"
              :base-directory "./blog/attachments"
              :base-extension ".*"
              :recursive t
              :publishing-directory (get-output-path "blog/attachments")
              :publishing-function '(org-publish-attachment))

I complete the list by adding the list of all the components of the project as well as the name of the latter.

        (list "felsoci.sk"
              :components '("blog" "pages" "styles" "images" "attachments"))))

Ready, steady, go!

At this point, I am ready to go. To launch the publishing, I use the following shell command.

Note that, for the sake of simplicity, this command line also disables the confirmation before evaluating each code block.

emacs --batch --no-init-file --eval '(setq org-confirm-babel-evaluate nil)' \
      --load publish.el --funcall org-publish-all

Feel free to send me your feedback!

Acknowledgement

Many thanks to Dennis Ogbe who published a similar post on his website. It helped me a lot while building my own publishing configuration!

Last update on 17/07/2021


This site is proudly powered by Org mode for Emacs on the servers of Websupport, spol. s r. o.

Source code of the site is publicly available on GitHub.

Content is available under the Creative Commons BY NC ND 4.0 International license unless otherwise stated.
