readme: static generator python wrapping org-mode

What

This is a repo to generate html for github pages from org mode files.

The file you are reading is both the readme and an individual "report" (linked into reports/) accessible from the generated index.

Motivation

  • Org-mode babel is the coolest thing!
  • I have a lot of git repos for one-off experiments.
  • I don't know how to make literate programming work for me.

If I make it easy to export org-babel, I can use it to reduce friction on one-off projects and test out ways to implement literate programming.

Notes

Usage

Run make to read through reports/*.org and export to html/.

this file

Run tangle from the reports/ directory, not the top level. Relative links work much better there.

org mode

  • #+OPTIONS: _:{} ^:{} so _ doesn't subscript. Can still use _{} like_{this}
  • #+OPTIONS: toc:nil num:nil to remove table of contents and header numbering
  • C-c C-e # html for an org->html template
  • C-c . to insert the date
  • C-c C-v t to tangle files from source blocks with :tangle file in the header

Todos

  • add tangled file and link from src_blocks a la this stack overflow QA.
  • code highlighting
  • add a todo/idea page
  • use file_df.mt and file_df.date to annotate updated pages. maybe use creation time to make sure it wasn't just the file-system that was updated

Aside

The small python script — make_index.py — made within this document could be replaced by a Makefile and a few bash/perl/python one liners.

Code

The python and html template code to export org-files and render index.html is all tangled within this org file.

This is awkward. For this document to stay meaningful, those files should only ever be edited here.

find title and date

  • date can be in the file as #+DATE:.*yyyy-mm-dd or could be the modification time of the file
  • title #+TITLE:.* or the name of the file
  • we want to (re-)export files that have an out-of-date export file, so keep modification time info of source org file
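The lookup rules above can be sketched roughly as follows. This is a minimal sketch, not the tangled code itself: the helper name org_info and the exact regular expressions are assumptions, and the returned keys (f, mt, title, date) are borrowed from the column names that show up in the tables later in this file.

```python
import re
from datetime import datetime
from pathlib import Path

# Patterns follow the notes above: "#+TITLE:" and "#+DATE:" with a yyyy-mm-dd date.
TITLE_RE = re.compile(r"^#\+TITLE:[ \t]*(.+?)[ \t]*$", re.IGNORECASE | re.MULTILINE)
DATE_RE = re.compile(r"^#\+DATE:.*?(\d{4}-\d{2}-\d{2})", re.IGNORECASE | re.MULTILINE)


def org_info(path):
    """Return file name, mtime, title, and date for one org file (hypothetical helper)."""
    path = Path(path)
    text = path.read_text(encoding="utf-8", errors="replace")
    mt = datetime.fromtimestamp(path.stat().st_mtime)

    title_match = TITLE_RE.search(text)
    date_match = DATE_RE.search(text)
    return {
        "f": path.name,
        "mt": mt,  # kept so out-of-date exports can be detected later
        # fall back to the file name (without .org) when there is no #+TITLE:
        "title": title_match.group(1) if title_match else path.stem,
        # fall back to the modification date when there is no #+DATE:
        "date": date_match.group(1) if date_match else mt.strftime("%Y-%m-%d"),
    }
```

Calling org_info("risk.org") on a file that starts with #+TITLE: risk would give a dict whose title is "risk"; a file with no headers falls back to its own name and modification date.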

find all pages

We're creating a py file in ../src, but running this from an org file in the directory we expect all the files to already be in. ../reports will work for a file launched in either location. All is lost if we run the code outside of the directory it lives in.

We use the function above to pull out info on each file.

tabulate(file_df.head(n=5),headers="keys", tablefmt="orgtbl",showindex=False)
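| f                   | mt                         | title                                | date       |
|---------------------+----------------------------+--------------------------------------+------------|
| cgi-startuptime.org | 2022-04-04 16:34:06.432610 | cgi-bin and interpreter startup time | 2021-10-22 |
| risk.org            | 2021-10-22 21:26:10.164202 | risk                                 | 2020-04-04 |
| gopher.org          | 2021-10-22 21:26:10.163202 | gopher                               | 2019-03-15 |
| netflix.org         | 2021-10-22 21:26:10.163202 | Netflix Usage                        | 2017-12-09 |
| strava.org          | 2021-10-22 21:26:10.164202 | Strava Tracked Workouts              | 2017-12-09 |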
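Collecting the pages could look something like the sketch below. It is an assumption-heavy stand-in: the helper name find_reports is made up, and it returns plain dicts instead of the data frame (file_df) used in this file, so that it runs without extra packages. The default ../reports path is the one discussed above and only resolves when the code is launched from src/ or reports/.

```python
from datetime import datetime
from pathlib import Path


def find_reports(reports_dir="../reports"):
    """List the org files in reports_dir with their modification times (hypothetical helper)."""
    rows = []
    for path in sorted(Path(reports_dir).glob("*.org")):
        rows.append({
            "f": path.name,
            "mt": datetime.fromtimestamp(path.stat().st_mtime),
        })
    return rows
```

Each row could then be passed through the title and date lookup before being tabulated.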

export pages

This is a weird place to be. This text (and code) is written using org-mode within emacs. The actual instructions are run inside python.

We need to use python to get back into an emacs lisp environment to export to html.

Conveniently, there's a tool for this: org-export!

org-export html --infile xxx.org --outfile yyy.html --bootstrap

So from python, we'll call a bash script that runs emacs. Meanwhile, the instructions to do all of this are written in emacs. Literate programming is hard.
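A rough sketch of the python side of that call is below. The function names and the dryrun switch are assumptions for illustration. Only the flags from the command line shown above are used, and org-export is assumed to be on the PATH.

```python
import subprocess


def export_command(infile, outfile):
    """Build the org-export call shown above as an argument list."""
    return ["org-export", "html",
            "--infile", str(infile),
            "--outfile", str(outfile),
            "--bootstrap"]


def export(infile, outfile, dryrun=False):
    """Run org-export for one file, or only print the command when dryrun is set."""
    cmd = export_command(infile, outfile)
    if dryrun:
        print(" ".join(cmd))
        return None
    # check=True raises CalledProcessError if emacs/org-export fails
    return subprocess.run(cmd, check=True)
```

Passing an argument list rather than a shell string avoids quoting trouble with file names that contain spaces.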

bonuses

  • include css (--bootstrap) without much work
  • have more control in the name of the final output html file (--outname).

org-export configuration

By default org-export builds ess and org from git. This was failing. I removed these two from org-export-html.el:cli-el-get-setup

DIY Make target list

Makefile could (should) figure out what needs to be exported. But we already have modification times. So we can compare those to the output targets to see which needs to be (re)run.
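The comparison amounts to the same rule make uses: export when the target is missing or older than its source. A minimal sketch, with needs_export as a hypothetical name:

```python
from pathlib import Path


def needs_export(org_file, html_file):
    """True when html_file is missing or older than org_file."""
    org_file, html_file = Path(org_file), Path(html_file)
    if not html_file.exists():
        return True
    # a source modified after its export is out of date
    return org_file.stat().st_mtime > html_file.stat().st_mtime
```

Filtering the list of reports with this check gives the set of files to (re)run.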

quick look

TODO: add this to script with something like DRYRUN

from tabulate import tabulate
tabulate(need_update[['title','export_date','mt']],headers="keys", tablefmt="orgtbl",showindex=False)
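| title                                             | export_date                | mt                         |
|---------------------------------------------------+----------------------------+----------------------------|
| readme: static generator python wrapping org-mode | 2022-04-04 17:19:55.509670 | 2022-04-04 21:36:50.713842 |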

Actually run

create the index

The index page links to all the exported org files.

Template

We'll use a template engine (wheezy.template, because it was linked here) to generate the index page.

@require(file_df,title)
<html> <head>
  <title>@title</title>
  <link rel="stylesheet" type="text/css" href="style.css" />
  <link rel="alternate" type="application/rss+xml" title="WFLOG RSS Feed" href="rss.xml" />
  <link rel="shortcut icon" href="https://secure.gravatar.com/avatar/3fed911ae9175eaf6c4e4ec51de7e6ac?size=125">
 </head>
 <body>
   <h1>External</h1>
   <ul class="info">
      <li><a href="https://github.com/WillForan">Github</a></li>
      <li><a href="https://stackoverflow.com/users/1031776/will">StackOverflow</a></li>
      <li><a href="https://scholar.google.com/citations?user=PzX6F5oAAAAJ">GoogleScholar</a></li>
      <li><a href="https://www.strava.com/athletes/15036420">Strava</a></li>
      <li><a href="https://www.swrd.trade">SWRD</a></li>
   </ul>
   <h1>@title</h1>
   Also in <a href="gopher://www.xn--4-cmb.com">gopher space</a>
   <ul>
   @for i,f in file_df.iterrows():
       <li><a href="@f['uri']"><time>@f['date']</time> @f['title']</a></li>
   @end
   </ul>
 </body>
</html>

Styling

Set up minimal styling.

Populate

Gopher

This was done by another script, or by hand, somewhere since lost. It has been tracked here (readme.org) since 2022-04-04.

Template

As with the html, we're using wheezy.template, but there's a lot less ceremony.

Links look like [0|$desc|$link|server|70]

@require(file_df)
WF log
@for i,f in file_df.iterrows():
[0|@f['date'] - @f['title']|@f['uri']|server|70]
@end
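For checking a single line outside the template, the link format can be reproduced in plain python. This is only a sketch: gopher_link is a hypothetical helper, the "server" and 70 defaults are copied from the template, and no escaping is attempted for a | inside the description.

```python
def gopher_link(desc, link, host="server", port=70, item_type="0"):
    """Format one menu entry as [type|description|link|host|port]."""
    return f"[{item_type}|{desc}|{link}|{host}|{port}]"


# one line as the template would emit it for a report
line = gopher_link("2020-04-04 - risk", "risk.txt")
```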

Populate

We're not doing any processing or exporting. Gopher text files will be the same as the org source but named .txt instead. They are symlinked (ln) here, but rsync to the gopher server will copy them as regular files.
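The linking step might be sketched like this. The function name link_reports is an assumption; the ln_to and need_ln variable names mirror the columns of the inspection table, and the link target is written relative to the gopher directory so the tree can be moved as a whole.

```python
import os
from pathlib import Path


def link_reports(reports_dir, gopher_dir):
    """Symlink each reports/*.org as gopher/*.txt; return the names of new links."""
    reports_dir, gopher_dir = Path(reports_dir), Path(gopher_dir)
    gopher_dir.mkdir(parents=True, exist_ok=True)
    made = []
    for org in sorted(reports_dir.glob("*.org")):
        ln_to = gopher_dir / (org.stem + ".txt")
        # lexists is also true for a dangling symlink, so nothing gets clobbered
        need_ln = not os.path.lexists(ln_to)
        if need_ln:
            os.symlink(os.path.relpath(org, gopher_dir), ln_to)
            made.append(ln_to.name)
    return made
```

Running it a second time should create nothing, which matches the need_ln column being False for files that are already linked.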

Inspecting

Here's what we'll be linking:

tabulate(gopher_df, headers="keys", tablefmt="orgtbl",showindex=False)
| f                   | title                                             | date       | ln_to                         | need_ln |
|---------------------+---------------------------------------------------+------------+-------------------------------+---------|
| cgi-startuptime.org | cgi-bin and interpreter startup time              | 2021-10-22 | ../gopher/cgi-startuptime.txt | False   |
| risk.org            | risk                                              | 2020-04-04 | ../gopher/risk.txt            | False   |
| gopher.org          | gopher                                            | 2019-03-15 | ../gopher/gopher.txt          | False   |
| netflix.org         | Netflix Usage                                     | 2017-12-09 | ../gopher/netflix.txt         | False   |
| strava.org          | Strava Tracked Workouts                           | 2017-12-09 | ../gopher/strava.txt          | False   |
| climbingWallSPA.org | Climbing Wall Route Annotation SPA                | 2017-11-19 | ../gopher/climbingWallSPA.txt | False   |
| readme.org          | readme: static generator python wrapping org-mode | 2017-11-18 | ../gopher/readme.txt          | False   |

Commit to files

RSS

template

The template is easy enough. It needs title, url, desc, date, and CDATA-encoded html.

@require(time,rss_df)
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>WFLOG</title>
<link>www.xn--cmb.com</link>
<description></description>
<lastBuildDate>@time</lastBuildDate>
@for i,f in rss_df.iterrows():
<item>
<title>@f['title']</title>
<link>@f['link']</link>
<description>@f['desc']</description>
<pubDate>@f['rss_date']</pubDate>
<content:encoded><![CDATA[ @f['cdata'] ]]></content:encoded>
<dc:creator>Will Foran</dc:creator>
</item>
@end
</channel>
</rss>

arrange data

We need to pull in the whole html body. We also want a date like Mon, 04 Apr 2022 15:22:29 -0400.
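Both pieces can be had from the standard library. The sketch below is one possible approach, not the tangled code: html_body and rss_date are hypothetical names, the body is pulled out with a simple regular expression rather than a real html parser, and naive datetimes are assumed to be local time.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def html_body(html):
    """Return what sits between <body> and </body>, or the input if no body tag is found."""
    match = BODY_RE.search(html)
    return match.group(1).strip() if match else html


def rss_date(when):
    """Format a datetime the way pubDate expects, e.g. Mon, 04 Apr 2022 15:22:29 -0400."""
    if when.tzinfo is None:
        when = when.astimezone()  # assumption: naive datetimes are local time
    return format_datetime(when)


example = rss_date(datetime(2022, 4, 4, 15, 22, 29, tzinfo=timezone(timedelta(hours=-4))))
```

One caveat with the CDATA wrapper: a body that itself contains ]]> would end the section early, so such text would need to be split or escaped first.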

write

We should limit the feed to just the most recent files, though it's unlikely there will ever be enough text to warrant it.
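If the limit is ever needed, it could be as small as the sketch below. The name most_recent and the default of 10 are assumptions, and rows are taken to be dicts with a yyyy-mm-dd date string, which sorts correctly as plain text.

```python
def most_recent(rows, n=10):
    """Keep the n newest rows, ordered newest first by their yyyy-mm-dd date."""
    return sorted(rows, key=lambda row: row["date"], reverse=True)[:n]
```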

