readme: static generator python wrapping org-mode

What

This repo generates html for github pages from org-mode files.

The file you are reading is both the readme and an individual "report" (linked into reports/) accessible from the generated index.

Motivation

Org-mode babel is the coolest thing!
I have a lot of git repos for one-off experiments.
I don't know how to make literate programming work for me.

If I make it easy to export org-babel, I can use it to reduce friction on one-off projects and test out ways to implement literate programming.

Notes

Usage

Run make to read through reports/*.org and export each file to html/.

this file

Run tangle from the reports/ directory, not the top level. Relative links work much better there.

org mode

#+OPTIONS: _:{} ^:{} so _ doesn't subscript. Can still use _{} like_this
#+OPTIONS: toc:nil num:nil to remove table of contents and header numbering
C-c C-e # html for an org->html template
C-c . to insert the date
C-c C-v t to tangle files from source blocks with :tangle file in the header

Todos

add tangled file and link from src_blocks a la this stack overflow QA.
code highlighting
add a todo/idea page
use file_df.mt and file_df.date to annotate updated pages. maybe use creation time to make sure it wasn't just the file-system that was updated

Aside

The small python script (make_index.py) made within this document could be replaced by a Makefile and a few bash/perl/python one-liners.

Code

The python and html template code to export org-files and render index.html is all tangled within this org file.

This is awkward. For this document to stay meaningful, editing this org file and re-tangling has to be the only way those files are changed.

find title and date

date: taken from a #+DATE:.*yyyy-mm-dd line in the file, falling back to the modification time of the file
title: taken from a #+TITLE:.* line, falling back to the name of the file
modification time: kept for every source org file, so we can (re-)export any file whose export is out of date
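
A minimal sketch of that lookup, using only the standard library. The function name org_info and the regular expressions are assumptions for illustration, and the tangled make_index.py may do this differently. The fallback title here is the file name without its extension, which is one reading of "the name of the file".

```python
import re
from datetime import datetime
from pathlib import Path

# Hypothetical patterns: match "#+TITLE: ..." and the first yyyy-mm-dd on a "#+DATE:" line.
TITLE_RE = re.compile(r"^#\+TITLE:[ \t]*(.+?)[ \t]*$", re.IGNORECASE | re.MULTILINE)
DATE_RE = re.compile(r"^#\+DATE:.*?(\d{4}-\d{2}-\d{2})", re.IGNORECASE | re.MULTILINE)


def org_info(path):
    """Return file name, modification time, title, and date for one org file."""
    path = Path(path)
    text = path.read_text(encoding="utf-8", errors="replace")
    mt = datetime.fromtimestamp(path.stat().st_mtime)

    title = TITLE_RE.search(text)
    date = DATE_RE.search(text)
    return {
        "f": path.name,
        "mt": mt,  # kept so out-of-date exports can be detected later
        "title": title.group(1) if title else path.stem,  # assumed fallback
        "date": date.group(1) if date else mt.strftime("%Y-%m-%d"),
    }
```

The keys f, mt, title, and date match the column names that show up in the tables further down.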

find all pages

The tangled py file lands in ../src, but the code is also run from this org file, which sits in the directory we expect all the other org files to already be in. The relative path ../reports works for a file launched from either location. All is lost if we run the code from outside the directory it lives in.

We use the function above to pull out info on the file

tabulate(file_df.head(n=5),headers="keys", tablefmt="orgtbl",showindex=False)

f	mt	title	date
cgi-startuptime.org	2022-04-04 16:34:06.432610	cgi-bin and interpreter startup time	2021-10-22
risk.org	2021-10-22 21:26:10.164202	risk	2020-04-04
gopher.org	2021-10-22 21:26:10.163202	gopher	2019-03-15
netflix.org	2021-10-22 21:26:10.163202	Netflix Usage	2017-12-09
strava.org	2021-10-22 21:26:10.164202	Strava Tracked Workouts	2017-12-09
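
A rough sketch of the file listing, as plain dicts rather than the data frame used in this document. The name find_pages is hypothetical, and sorting by modification time is a choice made for this example only.

```python
from datetime import datetime
from pathlib import Path


def find_pages(report_dir="../reports"):
    """List every .org file in report_dir with its modification time, newest first."""
    pages = []
    for path in Path(report_dir).glob("*.org"):
        pages.append({
            "f": path.name,
            "mt": datetime.fromtimestamp(path.stat().st_mtime),
        })
    return sorted(pages, key=lambda page: page["mt"], reverse=True)
```

The default ../reports is resolved against the current working directory, so the caveat about where the code is run from still applies.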

export pages

This is a weird place to be. This text (and code) is written using org-mode within emacs. The actual instructions are run inside python.

We need to use python to get back into an emacs lisp environment to export to html.

Conveniently, there's a tool for this: org-export!

org-export html --infile xxx.org --outfile yyy.html --bootstrap

So from python, we'll call a bash script that runs emacs. Meanwhile, the instructions to do all of this are written in emacs. Literate programming is hard.
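
Calling it from python might look roughly like this. Only the flags shown in the command above are used. The function names are made up for this sketch, and org-export is assumed to be on PATH.

```python
import subprocess


def export_command(infile, outfile, bootstrap=True):
    """Build the org-export command line shown above as an argument list."""
    cmd = ["org-export", "html", "--infile", infile, "--outfile", outfile]
    if bootstrap:
        cmd.append("--bootstrap")
    return cmd


def export_html(infile, outfile):
    """Run org-export; raises CalledProcessError if it exits non-zero."""
    subprocess.run(export_command(infile, outfile), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.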

bonuses

include css (--bootstrap) without much work
have more control over the name of the final output html file (--outfile).

org-export configuration

By default org-export builds ess and org from git. This was failing, so I removed these two from cli-el-get-setup in org-export-html.el.

DIY Make target list

A Makefile could (and should) figure out what needs to be exported. But we already have the modification times, so we can compare those against the output targets to see which ones need to be (re)run.
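
The comparison itself is small. A sketch, with needs_export as a hypothetical name, that treats a missing target the same as a stale one:

```python
import os


def needs_export(src, target):
    """True when target is missing or older than src."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(src) > os.path.getmtime(target)
```

This is the same rule make applies to a target and its prerequisite.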

quick look

TODO: add this to script with something like DRYRUN

from tabulate import tabulate
tabulate(need_update[['title','export_date','mt']],headers="keys", tablefmt="orgtbl",showindex=False)

title	export_date	mt
readme: static generator python wrapping org-mode	2022-04-04 17:19:55.509670	2022-04-04 21:36:50.713842

Actually run

create the index

The index page links to all the exported org files.

Template

We'll use a template engine (wheezy.template, because it was linked here) to generate the index page.

@require(file_df,title)
<html> <head>
  <title>@title</title>
  <link rel="stylesheet" type="text/css" href="style.css" />
  <link rel="alternate" type="application/rss+xml" title="WFLOG RSS Feed" href="rss.xml" />
  <link rel="shortcut icon" href="https://secure.gravatar.com/avatar/3fed911ae9175eaf6c4e4ec51de7e6ac?size=125">
 </head>
 <body>
   <h1>External</h1>
   <ul class="info">
      <li><a href="https://github.com/WillForan">Github</a></li>
      <li><a href="https://stackoverflow.com/users/1031776/will">StackOverflow</a></li>
      <li><a href="https://scholar.google.com/citations?user=PzX6F5oAAAAJ">GoogleScholar</a></li>
      <li><a href="https://www.strava.com/athletes/15036420">Strava</a></li>
      <li><a href="https://www.swrd.trade">SWRD</a></li>
   </ul>
   <h1>@title</h1>
   Also in <a href="gopher://www.xn--4-cmb.com">gopher space</a>
   <ul>
   @for i,f in file_df.iterrows():
       <li><a href="@f['uri']"><time>@f['date']</time> @f['title']</a></li>
   @end
   </ul>
 </body>
</html>

Styling

Set up minimal styling.

https://brutalist-web.design/ says always have underlines. so added those back in but colored them lighter (20230122)
https://www.swyx.io/css-100-bytes is inspiration for centering
https://pitt.edu/~foran had my gravatar. picked that up

Populate

Gopher

This used to be done by another script, or by hand, somewhere since lost. It is tracked here (readme.org) as of 20220404.

Template

Like html, we're using wheezy.template but there's a lot less ceremony.

Links look like [0|$desc|$link|server|70]

@require(file_df)
WF log
@for i,f in file_df.iterrows():
[0|@f['date'] - @f['title']|@f['uri']|server|70]
@end
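
For comparison, the same lines can be built with plain string formatting instead of wheezy.template. This is only a sketch: gopher_link and gopher_index are hypothetical names, and "server" and 70 are copied from the template above.

```python
def gopher_link(desc, link, host="server", port=70, item_type="0"):
    """Format one link line in the [type|desc|link|host|port] style used above."""
    return f"[{item_type}|{desc}|{link}|{host}|{port}]"


def gopher_index(pages, heading="WF log"):
    """Build the whole listing from dicts with date, title, and uri keys."""
    lines = [heading]
    for page in pages:
        lines.append(gopher_link(f"{page['date']} - {page['title']}", page["uri"]))
    return "\n".join(lines) + "\n"
```

A title containing | would break the line format, so real titles may need that character replaced.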

Populate

We're not doing any processing or exporting. Gopher text files are the same as the org source, just named .txt instead. They are symlinked (ln) here, but rsync to the gopher server will copy them as files.
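
A sketch of the linking step, assuming a POSIX file system. The name link_as_txt is hypothetical. The default ../gopher directory is taken from the ln_to column, and the return value plays the role of need_ln.

```python
import os
from pathlib import Path


def link_as_txt(org_path, gopher_dir="../gopher"):
    """Symlink x.org to gopher_dir/x.txt; return True if a link was created."""
    org_path = Path(org_path)
    ln_to = Path(gopher_dir) / (org_path.stem + ".txt")
    if ln_to.exists() or ln_to.is_symlink():
        return False  # already linked, nothing to do
    ln_to.parent.mkdir(parents=True, exist_ok=True)
    # A relative target keeps the link valid if the whole repo is moved.
    target = os.path.relpath(org_path.resolve(), ln_to.parent.resolve())
    os.symlink(target, ln_to)
    return True
```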

Inspecting

Here's what we'll be linking

tabulate(gopher_df, headers="keys", tablefmt="orgtbl",showindex=False)

f	title	date	ln_to	need_ln
cgi-startuptime.org	cgi-bin and interpreter startup time	2021-10-22	../gopher/cgi-startuptime.txt	False
risk.org	risk	2020-04-04	../gopher/risk.txt	False
gopher.org	gopher	2019-03-15	../gopher/gopher.txt	False
netflix.org	Netflix Usage	2017-12-09	../gopher/netflix.txt	False
strava.org	Strava Tracked Workouts	2017-12-09	../gopher/strava.txt	False
climbingWallSPA.org	Climbing Wall Route Annotation SPA	2017-11-19	../gopher/climbingWallSPA.txt	False
readme.org	readme: static generator python wrapping org-mode	2017-11-18	../gopher/readme.txt	False

Commit to files

RSS

template

The template is easy enough. Each item needs a title, url, desc, date, and CDATA-encoded html.

@require(time,rss_df)
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>WFLOG</title>
<link>www.xn--cmb.com</link>
<description></description>
<lastBuildDate>@time</lastBuildDate>
@for i,f in rss_df.iterrows():
<item>
<title>@f['title']</title>
<link>@f['link']</link>
<description>@f['desc']</description>
<pubDate>@f['rss_date']</pubDate>
<content:encoded><![CDATA[ @f['cdata'] ]]></content:encoded>
<dc:creator>Will Foran</dc:creator>
</item>
@end
</channel>
</rss>

arrange data

We need to pull in the whole html body of each exported page. We also want the date formatted like Mon, 04 Apr 2022 15:22:29 -0400.
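
Both pieces can be sketched with the standard library. The names rss_date and html_body are hypothetical, and the regular expression is a shortcut that assumes a single, well-formed body element.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def rss_date(when):
    """Format a timezone-aware datetime in the RFC 822 style used by pubDate."""
    return format_datetime(when)


def html_body(html):
    """Return the text between <body ...> and </body>, or the input if there is none."""
    match = re.search(r"<body[^>]*>(.*)</body>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else html


# Eastern daylight time offset, matching the example date above.
when = datetime(2022, 4, 4, 15, 22, 29, tzinfo=timezone(timedelta(hours=-4)))
print(rss_date(when))  # Mon, 04 Apr 2022 15:22:29 -0400
```

A naive datetime (no tzinfo) is formatted with -0000 as the offset, so the time zone has to be attached before formatting.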

write

We should limit the feed to just the most recent files, though it's unlikely there will ever be enough text to warrant it.