cgi-bin and interpreter startup time

Numbers

To build an intuition for the bare minimum time a cgi-bin script request could take, I've put together an unprincipled list of startup times for various interpreters.

There are lots of reasons to find this metric useless. It says nothing about how fast code runs once the interpreter is up (arithmetic, regexp matching, etc.), nor about the time needed to load common dependencies.

Eyeballing startup times for what I have installed on my system, there are three groups.

  • fast (<.005s): perl, awk, sed, bash, lisp(!)
  • okay (.005s–.1s): php, bb, py, node, ruby
  • slow (>.1s): julia, lisp (loading quicklisp), R, clojure

    # hi in various interpreters 
    hi_lisp(){ sbcl --no-sysinit --no-userinit --eval '(progn (format t "hi!") (quit))';}
    # loads quicklisp
    hi_lpslow(){ sbcl --eval '(progn (format t "hi!") (quit))';}
    hi_pl(){ perl -E "say 'hi'";}
    hi_py(){ python -c "print('hi')";}
    hi_node(){ node -e "console.log('hi')";}
    hi_ruby(){ ruby -e "puts 'hi'";}
    hi_bb(){ bb '(print "hi\n")'; }
    hi_bash() { bash -c "echo hi";}
    hi_php() { php -r 'echo "hi\n";';}
    hi_r() { Rscript --vanilla -e "print('hi')";}
    hi_jl() { julia -e 'print("hi")';}
    hi_clj() { clj --eval '(println "hi")';}
    hi_apl() { echo -e '"hi"\n)OFF' |  apl --noCIN -q --noSV; }
    hi_awk() { echo hi | awk '{print $0}';}
    hi_sed() { echo hi | sed -n p;}

    # time it. find all hi* functions. export them so hyperfine can see 'em
    csvtime() { hyperfine $1 --style none --export-csv >(cat) 2>/dev/null ; }
    for f in $(typeset -F|grep -oP 'hi_[a-z]+$'); do
      export -f $f
      csvtime $f;
    done |
    sort -u | # remove repeated headers. "command" sorts before "hi_*" => header on top
    perl -F, -slane 'print join "\t", map {$_=/^[0-9]/?sprintf("%.5f",$_):$_} @F'| # fewer sig figs for numbers
    awk 'NR<2{print $0;next}{print $0|"sort -k2,2"}'| # sort everything but the header
    sed s/hi_// # strip the function-name prefix that helped identify them in $(typeset -F)
command  mean     stddev   median   user     system   min      max
sed      0.00233  0.00025  0.00228  0.00182  0.00114  0.00189  0.00389
pl       0.00281  0.00014  0.00280  0.00145  0.00142  0.00242  0.00377
bash     0.00338  0.00014  0.00338  0.00213  0.00134  0.00297  0.00436
awk      0.00389  0.00015  0.00389  0.00283  0.00146  0.00340  0.00491
lisp     0.00463  0.00026  0.00463  0.00182  0.00290  0.00377  0.00542
apl      0.00737  0.00026  0.00738  0.00417  0.00354  0.00671  0.00883
php      0.01478  0.00029  0.01475  0.00838  0.00623  0.01423  0.01664
bb       0.02002  0.00042  0.01996  0.00578  0.01491  0.01927  0.02251
py       0.02795  0.00095  0.02752  0.02241  0.00523  0.02684  0.03187
node     0.03609  0.00080  0.03594  0.02729  0.00913  0.03506  0.04009
ruby     0.06791  0.00882  0.06648  0.05812  0.00888  0.05790  0.08943
jl       0.16368  0.00241  0.16342  0.08349  0.07878  0.16008  0.16821
r        0.18803  0.00371  0.18716  0.15006  0.04015  0.18374  0.19786
lpslow   0.33705  0.00433  0.33600  0.28454  0.04951  0.33246  0.34690
clj      1.07845  0.02057  1.07499  1.57696  0.21864  1.05055  1.11453

More on Motivation

Musings in https://eccentric-j.com/blog/clojure-like-its-php.html consider babashka's faster-than-clojure startup time in the context of cgi-bin scripts. Aside: compiling Clojure with GraalVM and serving quick-starting binaries is an essential part of a work project.

I find the babashka write-up compelling, but not for the same reason as the author. I have a very small collection of local cgi-bin scripts, all in bash, and all under the magic 100-or-so lines past which convention says to use a "real programming language."

 echo -e "file\tline-count\ttype"
 ssh s2 '
  for f in /srv/http/cgi-bin/*; do
   echo -e "$(basename $f)\t$(wc -l < $f)\t$(file -bL $f|sed s/,.*//)";
  done'
file   line-count  type
books  46          Bourne-Again shell script
e      48          Bourne-Again shell script
hi     6           Bourne-Again shell script
tv     73          Bourne-Again shell script

These utilities emerged organically from small itches and lean heavily on other, bigger tools (calibre, sqlite, the filesystem). Shell is a good fit, but the choice wasn't deliberate, and it's nice to be prompted to think about it. The procrastinator-looking-for-an-escape in me is always interested in exploring other hacks.

I'd previously been burned by noticeably slow Python. Matplotlib generated an image of the week's soccer roster to embed in emails for up-to-date who's-going-to-show-up status. It was slow enough that opening the Google Sheet in a new tab was often competitive time-wise. (Were I to redo the project, I'd use shell and imagemagick.)
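
That redo might look something like the sketch below: render the roster text straight to an image with ImageMagick and skip the interpreter-plus-library startup entirely. Untested; roster.txt, the font, and the output name are placeholders.

    # hypothetical redo: render a plain-text roster to a PNG with ImageMagick,
    # no matplotlib/pandas startup cost. roster.txt and the font are placeholders.
    convert -background white -fill black \
            -font DejaVu-Sans-Mono -pointsize 18 \
            label:"$(cat roster.txt)" roster.png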

To that end, I'm curious what the quickest startup is for common interpreters.

Other thoughts

Startup time is likely to be entirely eclipsed by library load times

(
  hyperfine 'python -c "1"'
  hyperfine 'python -c "import matplotlib;import pandas; 1"'
) | sed -n 's/Range[^0-9]*//p'
  25.8 ms …  27.6 ms    100 runs
  397.1 ms … 426.1 ms    10 runs

Common Lisp can start up fast if it doesn't have to load much

I haven't tested skipping quicklisp while still loading whatever libraries would be useful. I suspect Lisp would do better than Python while still providing an interface for live editing.
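
A sketch of what that test might look like, assuming quicklisp is installed and using cl-ppcre as a stand-in for "whatever libraries would be useful": bake the library into a core once with save-lisp-and-die, then time startup from that core (same export-a-function trick as the timing loop above).

    # one-time: build a core with the library already loaded (quicklisp assumed)
    sbcl --eval '(ql:quickload :cl-ppcre)' \
         --eval '(sb-ext:save-lisp-and-die "with-libs.core")'

    # startup then only pays for mapping the core, not quicklisp + library loading
    hi_lpcore(){ sbcl --core with-libs.core --no-sysinit --no-userinit \
                      --eval '(progn (format t "hi!") (quit))';}
    export -f hi_lpcore
    hyperfine hi_lpcore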

Native binaries: fast but large

CL and Clojure can both be compiled into very fast-starting native images, but both are large even for small programs (~60MB for SBCL, and ~40MB for Clojure with postgres, at least).
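
For the CL half, a minimal sketch of building such a binary and checking what it costs on disk (the Clojure half would be GraalVM's native-image over an uberjar, as in the work project mentioned above). Untested; "hi-bin" is a placeholder name.

    # build a standalone "hi" executable from SBCL; expect tens of MB
    sbcl --no-sysinit --no-userinit \
         --eval '(defun main () (format t "hi!~%"))' \
         --eval '(sb-ext:save-lisp-and-die "hi-bin" :executable t :toplevel (function main))'
    du -h hi-bin
    hyperfine ./hi-bin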

APL nerd snipe

I fell into a small APL rabbit hole. The online manual is wrong: neither --eval nor --off is a valid argument! Regardless, any time I come near APL, I lose hours thinking this will be the time it clicks. Doubly so when considering april. Someday I'll figure out how to get a nifti file into lisp as a matrix, and april will open up a new world. (20220404 edit: also APL.jl)

User experience

20ms feels fine, 50ms feels laggy, and 150ms feels unbearable.

VR headset latency might be as orthogonal to application startup as a time metric can get, but the psychophysics still provide a useful baseline. With a shell terminal as a not-so-impoverished REPL for shell cgi-bin, the evaluate part of the loop is worth inspecting; less so the actual page rendering, though there's psychology to explore there too. I wish APL had become an often-embedded DSL as ubiquitous as regular expressions.

Based on a 0.1s natural mobile site speed improvement, we … conversions increase [~8-10%]

The same feels true for development, especially self-motivated projects. I'll move on to something with quicker feedback if I'm constantly waiting a second or two for iterative results. https://input-delay.glitch.me/

