Rewriting my website in Rust

Everything is being rewritten in Rust these days, so I rewrote this website to see what the hype is all about.

I am very satisfied with the result.

main.rs
use alemat::{DisplayAttr, MathMl, MathMlAttr};
use anyhow::anyhow;
use clap::Parser;
use clap_verbosity_flag::{Verbosity};
use httpdate::{HttpDate};
use jotdown::{Attributes, Container, Event, LinkType, Render};
use lazy_static::lazy_static;
use log::{self, debug, error, info, trace, warn};
use mathemascii;
use mime_guess;
use minijinja::{context, Environment, value::Value};
use mlua::{Lua, LuaSerdeExt};
use notify::{EventKind, RecursiveMode, Watcher};
use path_slash::PathBufExt;
use seahash;
use serde::Serialize;
use serde_json;
use serde_yaml;
use syntect::{highlighting::Theme, parsing::SyntaxSet, html::highlighted_html_for_string};
use std::{
   collections::HashMap,
   ffi::OsStr,
   fs::{self, DirBuilder, File},
   hash::Hasher,
   io::{self, BufReader, Read, Write},
   path::{Path, PathBuf},
   net::{self, SocketAddr},
   str::FromStr,
   sync::mpsc,
   thread::{self, JoinHandle},
   time::{Duration, Instant},
};
use tiny_http::{Header, Request, Response, Server, StatusCode};
use two_face::theme::{EmbeddedLazyThemeSet, EmbeddedThemeName};
use walkdir::{DirEntry, WalkDir};

static BUILD_DIR : &str = "build";
static CONTENT_DIR : &str = "content";
static STATIC_DIR : &str = "content/static";
static INCLUDE_DIR : &str = "content/template/include";
static INDEX_DIR : &str = "content/template/index";
static LAYOUT_DIR : &str = "content/template/layout";
static PAGE_DIR : &str = "content/template/page";

/// Rust SSG for cthor.me
#[derive(Parser, Debug)]
#[command(version, about, long_about = None)]
struct Args {
   /// Only build site, don't run server
   #[arg(long)]
   build: bool,

   /// Address for server to listen on, ignored if --port specified
   #[arg(short, long, default_value = "0.0.0.0:8080")]
   addr: SocketAddr,

   /// Port for the server to listen on (binds to 0.0.0.0)
   #[arg(short, long)]
   port: Option<u16>,

   #[command(flatten)]
   verbose: Verbosity<clap_verbosity_flag::InfoLevel>,
}

impl Args {
   fn new() -> Self {
      let mut args = Args::parse();
      if let Some(port) = args.port {
         args.addr = SocketAddr::new(net::IpAddr::V4(net::Ipv4Addr::UNSPECIFIED), port);
      }
      args
   }
}

trait BoolExt {
   fn and_or<T>(self, if_true: T, if_false: T) -> T;
}

impl BoolExt for bool {
   fn and_or<T>(self, if_true: T, if_false: T) -> T {
      if self { if_true } else { if_false }
   }
}

trait PathExt {
   fn to_string_normal(&self) -> String;
}

impl PathExt for Path {
   fn to_string_normal(&self) -> String {
         self.to_string_lossy().replace('/', &std::path::MAIN_SEPARATOR.to_string())
   }
}

macro_rules! walk_files {
   ($prefix:ident) => {
      WalkDir::new($prefix).into_iter().filter_map(|e| e.ok()).filter(|e| e.path().is_file())
   }
}

fn main() {
   let args = Args::new();
   env_logger::Builder::new()
      .filter_level(args.verbose.log_level_filter())
      .filter_module("mio", log::LevelFilter::Debug)
      .filter_module("notify", log::LevelFilter::Debug)
      .filter_module("tiny_http", log::LevelFilter::Info)
      .parse_env("RUST_LOG")
      .init();

   build_site();

   if !args.build {
      let mut joins = Vec::new();
      joins.push(start_watcher());
      joins.push(start_server(args.addr));
      joins.into_iter().for_each(|j| j.join().unwrap());
   }
}

fn build_site() {
   if let Err(e) = _build_site() {
      error!("{:?}", e);
      error!("Aborting build due to errors");
   }
}

fn _build_site() -> anyhow::Result<()> {
   let t0 = Instant::now();
   let mut env = Environment::new();
   env.set_lstrip_blocks(true);
   env.set_trim_blocks(true);
   env.add_filter("lua", lua_filter);
   env.add_filter("yaml", yaml_filter);
   env.add_function("parse_djot", parse_djot);
   env.add_function("dump_file",  dump_file);
   env.set_loader(|name: &str| {
      macro_rules! check_dir {
         ($dir:ident) => {
            let path = Path::new($dir).join(name);
            if path.exists() {
               return fs::read_to_string(&path)
                  .map(|s| Some(s))
                  .map_err(|e| {
                     minijinja::Error::new(
                        minijinja::ErrorKind::TemplateNotFound,
                        format!("Failed to read template '{}': {}", path.to_string_normal(), e),
                     )
                  });
            }
         }
      }
      check_dir!(LAYOUT_DIR);
      check_dir!(INCLUDE_DIR);
      Ok(None)
   });

   let mut builder = DirBuilder::new();
   builder.recursive(true);
   builder.create(BUILD_DIR)?;

   let t1 = Instant::now();
   for entry in walk_files!(STATIC_DIR) {
      let path = entry.into_path();
      let url = path.strip_prefix(STATIC_DIR)?;
      let dest_fn = PathBuf::from(BUILD_DIR).join(url);
      if files_match(&path, &dest_fn).unwrap_or(false) {
         trace!("files match {} {}", path.to_string_normal(), dest_fn.to_string_normal());
      } else {
         info!("{} => {}", path.to_string_normal(), dest_fn.to_string_normal());
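         // Create the parent directory, not a directory named after the file
         // (which would make the copy fail for files without an extension)
         if let Some(parent) = dest_fn.parent() {
            builder.create(parent)?;
         }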
         fs::copy(&path, &dest_fn)?;
      }
   }
   debug!("Static built in {:.2?}", t1.elapsed());

   let t2 = Instant::now();
   let mut pages : Vec<Value> = Vec::new();
   for entry in walk_files!(PAGE_DIR) {
      let state = process_template(&env, entry, PAGE_DIR)?;
      if state.get_attr("index").is_ok_and(|v| v.is_true()) {
         pages.push(state);
      }
   }
   debug!("Pages built in {:.2?}", t2.elapsed());
   trace!("Pages {:?}", pages);

   let t3 = Instant::now();
   env.add_global("pages", pages);
   for entry in walk_files!(INDEX_DIR) {
      process_template(&env, entry, INDEX_DIR)?;
   }
   debug!("Indexes built in {:.2?}", t3.elapsed());
   info!("Site built in {:.2?}", t0.elapsed());
   Ok(())
}

fn file_hash(path: &Path) -> Result<u64, io::Error> {
   let file = File::open(path)?;
   let mut reader = BufReader::new(file);
   let mut hasher = seahash::SeaHasher::new();
   let mut buf = [0; 64 * 1024];

   loop {
      let bytes_read = reader.read(&mut buf)?;
      if bytes_read == 0 {
         break;
      }
      hasher.write(&buf[..bytes_read]);
   }

   Ok(hasher.finish())
}

fn files_match(path1: &Path, path2: &Path) -> Result<bool, io::Error> {
   let file1 = File::open(path1)?;
   let file2 = File::open(path2)?;

   let mut reader1 = BufReader::new(file1);
   let mut reader2 = BufReader::new(file2);

   let mut buf1 = [0; 64 * 1024];
   let mut buf2 = [0; 64 * 1024];

   loop {
      let len1 = reader1.read(&mut buf1)?;
      let len2 = reader2.read(&mut buf2)?;

      if len1 != len2 || buf1[..len1] != buf2[..len2] {
         return Ok(false);
      }
      if len1 == 0 {
         return Ok(true);
      }
   }
}

lazy_static! (
   static ref SYNTAX_SET: SyntaxSet = SyntaxSet::load_defaults_newlines();
   static ref THEME_SET: EmbeddedLazyThemeSet = two_face::theme::extra();
   static ref THEME: &'static Theme = THEME_SET.get(EmbeddedThemeName::MonokaiExtendedLight);
);

fn highlight(code: &str, lang: &str) -> String {
   let t1 = Instant::now();
   let syntax = SYNTAX_SET.find_syntax_by_token(lang).unwrap_or(SYNTAX_SET.find_syntax_plain_text());
   let ret = match highlighted_html_for_string(code, &SYNTAX_SET, syntax, &THEME) {
      Ok(s) => s,
      Err(e) => format!("Error: {:?}", e),
   };
   trace!("highlighting took {:.2?}", t1.elapsed());
   ret
}

fn lua_filter(value: Value, data: Option<Value>) -> Result<Value, minijinja::Error> {
   let t0 = Instant::now();
   let string = value.to_string();
   let lua_code = strip_markdown_fences(&string, "lua");
   if lua_code.trim().is_empty() {
      return Err(minijinja::Error::new(
         minijinja::ErrorKind::MissingArgument,
         "No Lua code provided in block",
      ));
   }

   let lua = Lua::new();
   let debug = lua.create_function(|_, val: mlua::Value| {
      let serialized = serde_json::to_string_pretty(&val).map_err(mlua::Error::external)?;
      debug!("{}", serialized.trim_end());
      Ok(())
   }).expect("create debug function");
   lua.globals().set("debug", debug).expect("set debug global");
   lua.globals().set("null", lua.null()).expect("set null global");

   if let Some(d) = data {
      lua.to_value(&d)
         .and_then(|v| lua.globals().set("data", v))
         .map_err(|e| {
            minijinja::Error::new(
               minijinja::ErrorKind::InvalidOperation,
               format!("Failed to serialize argument to Lua: {}", e),
            )
         })?;
   }

   let result: mlua::Value = lua.load(lua_code).eval().map_err(|e| {
      minijinja::Error::new(
         minijinja::ErrorKind::InvalidOperation,
         format!("Lua error: {}", e),
      )
   })?;

   let ret = Value::from_serialize(&result);
   trace!("lua filter {:.2?}", t0.elapsed());
   Ok(ret)
}

fn yaml_filter(value: Value) -> Result<Value, minijinja::Error> {
   let string = value.to_string();
   let yaml = strip_markdown_fences(&string, "yaml");
   if yaml.trim().is_empty() {
      return Err(minijinja::Error::new(
         minijinja::ErrorKind::MissingArgument,
         "No YAML content provided in block",
      ));
   }

   let data: serde_yaml::Value = serde_yaml::from_str(yaml).map_err(|e| {
      minijinja::Error::new(
         minijinja::ErrorKind::InvalidOperation,
         format!("YAML parsing error: {}", e),
      )
   })?;

   Ok(Value::from_serialize(&data))
}

fn strip_markdown_fences<'a>(content: &'a str, language: &str) -> &'a str {
   let lines: Vec<&str> = content.lines().collect();
   if lines.len() >= 3
      && lines[0].starts_with("```")
      && lines[0][3..].trim_start() == language
      && lines.last().map_or(false, |l| l.trim_end() == "```")
   {
      let start = content.find('\n').unwrap() + 1;
      let end = content.rfind("```").unwrap();
      &content[start..end].trim_end()
   } else {
      content
   }
}

#[derive(Clone,Debug,Serialize)]
struct Heading {
   id: String,
   level: u16,
   text: String,
}

#[derive(Debug)]
struct Code<'a> {
   lang: &'a str,
   text: String,
   attr: Attributes<'a>,
}

#[derive(Debug)]
struct Math<'a> {
   display: bool,
   text: String,
   attr: Attributes<'a>,
}

fn parse_djot(input: &str) -> Value {
   let mut curr_heading : Option<Heading> = None;
   let mut curr_code : Option<Code> = None;
   let mut curr_math : Option<Math> = None;
   let mut headings = Vec::new();
   let mut x_events = Vec::new();
   let mut footnote_seen = false;

   let trace = input.starts_with("{% trace");
   let renderer = jotdown::html::Renderer::minified();
   let events = jotdown::Parser::new(input);
   for mut event in events.clone() {
      if trace {
         trace!("{:?}", event);
      }
      match event {
         Event::Str(ref cow) => {
            if let Some(ref mut h) = curr_heading {
               h.text.push_str(cow);
            }
            if let Some(ref mut c) = curr_code {
               c.text.push_str(cow);
            }
            if let Some(ref mut m) = curr_math {
               m.text.push_str(cow);
            }

            if curr_code.is_none() && curr_math.is_none() {
               x_events.push(event);
            }
         },
         Event::Start(Container::CodeBlock { language }, attr) => {
            curr_code = Some(Code {
               lang: language,
               attr: attr,
               text: String::new(),
            });
         },
         Event::Start(Container::Math { display }, attr) => {
            curr_math = Some(Math {
               display: display,
               attr: attr,
               text: String::new(),
            });
         },
         Event::Start(Container::Heading { ref id, level, .. }, ref mut attr) => {
            curr_heading = Some(Heading {
               id: id.clone().into_owned(),
               level: level,
               text: String::new(),
            });
            attr.push((jotdown::AttributeKind::Pair { key: "id" }, id.clone().into()));

            let mut link = id.clone();
            link.to_mut().insert(0, '#');
            x_events.push(event);
            x_events.push(Event::Start(
               Container::Link(link.into(), LinkType::AutoLink),
               r#"{class=h-anchor}"#.try_into().unwrap()
            ));
         },
         Event::End(Container::CodeBlock { .. }) => {
            match curr_code {
               Some(mut c) => {
                  let lines = c.text.lines().count();
                  let long_code = lines > 20;
                  let filename = c.attr.get_value("filename").map(|a| a.to_string());

                  c.attr.push((jotdown::AttributeKind::Pair { key: "lines" }, format!("{}", lines).into()));
                  if long_code {
                     c.attr.push((jotdown::AttributeKind::Pair { key: "class" }, "long".into()));
                  }

                  x_events.push(Event::Start(Container::Div { class: "code-block" }, c.attr));
                  x_events.push(Event::Start(Container::RawBlock { format: "html" }, Attributes::new()));
                  if let Some(f) = filename {
                     x_events.push(Event::Str(format!(concat!(
                        r#"<div class="code-header">"#,
                        r#"<div class="code-filename">{}</div>"#,
                        r#"<button class="code-select"></button>"#,
                        r#"<button class="code-download"></button>"#,
                        r#"<button class="code-collapse"></button>"#,
                        r#"</div>"#), f).into()
                     ));
                  }
                  x_events.push(Event::Str(highlight(c.text.as_ref(), c.lang).into()));
                  if long_code {
                     x_events.push(Event::Str(format!(
                        r#"<button class="code-footer" lines="{}"></button>"#, lines).into()
                     ));
                  }
                  x_events.push(Event::End(Container::RawBlock { format: "html" }));
                  x_events.push(Event::End(Container::Div { class: "code-block" }));
               },
               None => warn!("Event::End(Container::CodeBlock) with no curr_code"),
            };
            curr_code = None;
         },
         Event::End(Container::Heading { ref id, .. }) => {
            match curr_heading {
               Some(h) => headings.push(Value::from_serialize(h)),
               None => warn!("Event::End(Container::Heading) with no curr_heading"),
            };
            curr_heading = None;

            let mut link = id.clone();
            link.to_mut().insert(0, '#');
            x_events.push(Event::End(Container::Link(link.into(), LinkType::AutoLink)));
            x_events.push(event);
         },
         Event::End(Container::Math { .. }) => {
            match curr_math {
               Some(m) => {
                  let ascii_math = mathemascii::parse(&m.text);
                  let mut mathml = MathMl::from(ascii_math);
                  mathml.add_attr(MathMlAttr::Display(m.display.and_or(DisplayAttr::Block, DisplayAttr::Inline)));
                  if let Some(class) = m.attr.get_value("class") {
                     mathml.add_attr(MathMlAttr::Global(alemat::Attribute::Class(class.to_string())));
                  }
                  if let Some(style) = m.attr.get_value("style") {
                     mathml.add_attr(MathMlAttr::Global(alemat::Attribute::Style(style.to_string())));
                  }
                  let mathml_string = mathml.render().expect("BufMathMlWriter does not fail.");

                  x_events.push(Event::Start(Container::RawInline { format: "html" }, Attributes::new()));
                  x_events.push(Event::Str(mathml_string.into()));
                  x_events.push(Event::End(Container::RawInline { format: "html" }));
               },
               None => warn!("Event::End(Container::Math) with no curr_math"),
            };
            curr_math = None;
         },
         Event::Start(Container::Footnote { .. }, _) => {
            footnote_seen = true;
            x_events.push(event);
         }
         Event::Start(Container::Section { .. }, _) | Event::End(Container::Section { .. }) => {},
         _ => {
            if let Some(ref mut h) = curr_heading {
               let _ = renderer.push(std::iter::once(event.clone()), &mut h.text);
            }
            x_events.push(event);
         }
      }
   }
   if footnote_seen {
      headings.push(Value::from_serialize(Heading { id: "fn1".into(), level: 1, text: "Footnotes".into() }));
   }

   let mut result : HashMap<&str, Value> = HashMap::new();
   let html = jotdown::html::render_to_string(x_events.into_iter());
   result.insert("html", html.into());
   result.insert("headings", headings.into());
   result.into()
}

fn dump_file(filename: &str, lang: Option<&str>) -> String {
   let path = PathBuf::from(filename);
   let contents = match std::fs::read_to_string(&path) {
      Ok(c) => c,
      Err(e) => format!("{}", e),
   };

   let mut fence = String::from("~~~");
   while contents.lines().any(|l| l.starts_with(&fence)) {
      fence.push('~');
   }

   let attr = format!(r#"{{filename="{}"}}"#, path.file_name().and_then(OsStr::to_str).unwrap_or(filename));
   let lang = lang.map_or(String::new(), |l| format!(" {}", l));
   let eof = contents.ends_with("\n").and_or("", "\n");

   attr + "\n" + &fence + &lang + "\n" + &contents + eof + &fence
}

fn path_for_request(request: &Request) -> (PathBuf, StatusCode) {
   let mut path = PathBuf::from(BUILD_DIR);
   let request_url = request.url().trim_start_matches('/');

   if request_url.is_empty() {
      path.push("index.html");
   } else {
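      for component in request_url.split('/') {
         if component == ".." {
            // Never pop above BUILD_DIR, so a request cannot escape the build tree
            if path.as_path() != Path::new(BUILD_DIR) {
               let _ = path.pop();
            }
         } else {
            path.push(component);
         }
      }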
   }
   if path.is_dir() {
      path.push("index.html");
   }

   if path.is_file() {
      (path, 200.into())
   } else {
      ([BUILD_DIR, "404", "index.html"].iter().collect(), 404.into())
   }
}

fn process_template(
   env: &Environment,
   entry: DirEntry,
   prefix: &'static str,
) -> anyhow::Result<Value> {
   let path = entry.path();

   if !matches!(path.extension(), Some(e) if e == OsStr::new("j2")) {
      return Err(anyhow!("{:?} is not a j2 template", path));
   }

   let mut url = PathBuf::from(path.with_extension("").strip_prefix(prefix)?);
   if let Some("md") = url.extension().and_then(OsStr::to_str) {
      url.set_extension("html");
   }
   if let Some("index.html") = url.file_name().and_then(OsStr::to_str) {
      url.pop();
   }
   if let Some("html") = url.extension().and_then(OsStr::to_str) {
      url.set_extension("");
   };
   let url_abs : Value = PathBuf::from("/").join(&url).to_slash().into();

   let mut dest_fn = PathBuf::from(BUILD_DIR).join(&url);
   if dest_fn.extension().is_none() {
      dest_fn.push("index.html");
   }

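   // Create the directory that will hold dest_fn, rather than a stray
   // directory named after the file itself
   if let Some(parent) = dest_fn.parent() {
      DirBuilder::new().recursive(true).create(parent)?;
   }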

   let mut buf = Vec::new();
   let contents = fs::read_to_string(&path)?;
   let tmpl = env.template_from_str(&contents)
      .or_else(|e| Err(anyhow!("{} {}", e, e.display_debug_info())))?;
   let state = tmpl.render_to_write(context!{ url => url_abs }, &mut buf)
      .or_else(|e| Err(anyhow!("{} {}", e, e.display_debug_info())))?;

   if let Ok(true) = file_hash(&dest_fn).map(|h1| h1 == seahash::hash(&buf)) {
      trace!("hash same for {}", dest_fn.to_string_normal());
   } else {
      info!("{} => {}", path.to_string_normal(), dest_fn.to_string_normal());
      let mut file = File::create(&dest_fn)?;
      file.write_all(&buf)?;
   }

   let mut ret : HashMap<&str, Value> = HashMap::new();
   for key in ["title", "index", "topic"].into_iter() {
      if let Some(v) = state.lookup(key) {
         ret.insert(key, v);
      }
   }
   if state.lookup("index").is_some_and(|i| i.is_true()) && state.lookup("title").is_none() {
      warn!("index=true with no title in {}", path.to_string_normal());
      ret.insert("title", url_abs.clone());
   }
   ret.insert("url", url_abs);
   Ok(ret.into())
}

fn start_server(ip: SocketAddr) -> JoinHandle<()> {
   thread::spawn(move || {
      info!("Starting server at {}", ip);
      let server = Server::http(ip).expect("start server");

      loop {
         let req = match server.recv() {
            Ok(req) => req,
            Err(e) => { error!("error: {}", e); break }
         };

         let (path, status) = path_for_request(&req);
         match File::open(&path) {
            Ok(file) => {
               let mtime = path.metadata()
                  .and_then(|meta| meta.modified())
                  .ok()
                  .map(|modified| HttpDate::from(modified));
               let if_modified_since = req.headers().iter()
                  .find(|h| h.field.equiv("If-Modified-Since"))
                  .and_then(|h| HttpDate::from_str(h.value.as_str()).ok());
               trace!("{:?} {:?}", mtime, if_modified_since);

               let cache_control = Header::from_str("Cache-Control: no-cache").unwrap();
               let last_modified = mtime.map(|t| Header::from_str(format!("Last-Modified: {}", t).as_str()).unwrap());
               let mime_type = mime_guess::from_path(&path).first_or_octet_stream();
               let content_type = Header::from_str(format!("Content-Type: {}", mime_type).as_str()).unwrap();

               macro_rules! finish_response {
                  ($res:ident) => {
                     $res.add_header(cache_control);
                     $res.add_header(content_type);
                     if let Some(h) = last_modified {
                        $res.add_header(h);
                     }
                     req.respond($res).unwrap();
                  }
               }

               if mtime.is_some_and(|l| if_modified_since.is_some_and(|r| l <= r)) {
                  debug!("GET {} 304", req.url());
                  let mut res = Response::empty(304);
                  finish_response!(res);
               } else {
                  info!("GET {} {} {}", req.url(), status.as_ref(), path.to_string_normal());
                  let mut res = Response::from_file(file).with_status_code(status);
                  finish_response!(res);
               }
            },
            Err(e) => {
               error!("Internal server error {:?} {:?}", req, e);
               let res = Response::from_string("Internal Server Error");
               let _ = req.respond(res.with_status_code(500));
            },
         };
      }
   })
}

fn start_watcher() -> JoinHandle<()> {
   thread::spawn(move || {
      let path = Path::new(CONTENT_DIR);
      let debounce = Duration::from_millis(100);
      let (tx, rx) = mpsc::channel::<notify::Result<notify::Event>>();
      let mut watcher = notify::recommended_watcher(tx).expect("get watcher");
      info!("Watching for changes to {}", path.to_string_normal());
      watcher.watch(Path::new(&path), RecursiveMode::Recursive).expect("start watcher");
      for res in &rx {
         match res {
            Ok(notify::Event { kind: EventKind::Access(_), .. }) => {},
            Ok(_) => {
               loop {
                  match rx.recv_timeout(debounce) {
                     Ok(_) => { continue; },
                     Err(mpsc::RecvTimeoutError::Timeout) => { break; },
                     Err(mpsc::RecvTimeoutError::Disconnected) => {
                        error!("Watcher channel disconnected");
                        break;
                     }
                  }
               }
               info!("Changes detected, rebuilding site");
               build_site();
            },
            Err(e) => error!("watch error: {:?}", e),
         }
      }
   })
}

Cargo.toml
[package]
name = "cthor2"
version = "0.1.0"
edition = "2021"

[dependencies]
alemat = "^0.8.0"
anyhow = "^1.0.95"
clap = { version = "^4.5.28", features = ["derive"] }
clap-verbosity-flag = "^3.0.2"
env_logger = "^0.11.6"
httpdate = "^1.0.3"
jotdown = "^0.7.0"
lazy_static = "^1.5.0"
log = "^0.4.25"
mathemascii = "^0.4.0"
mime_guess = "^2.0.5"
minijinja = { version = "^2.7.0", features = ["loader", "deserialization"] }
mlua = { version = "^0.10.3", features = ["lua54", "vendored", "serialize"] }
notify = "^8.0.0"
path-slash = "^0.2.1"
seahash = "^4.1.0"
serde = { version = "^1.0.214", features = ["derive"] }
serde_json = "^1.0.140"
serde_yaml = "^0.9.34"
syntect = "^5.2.0"
tiny_http = "^0.12.0"
two-face = "^0.4.3"
walkdir = "^2.5.0"

New features

While rewriting, I of course added new features. The previous code was in poor enough shape that I had lost interest in adding to it, so many of these had been sitting in the backlog.

  1. Djot markup. More powerful, extensible, and predictable than Pandoc’s markdown. Also looks nicer.
  2. Jinja templates. More purpose-built than embedded Perl. Loop variables, recursive loops, macros, filters, includes, etc. Less powerful, which is actually a good thing. Makes the easy things easier and leaves the hard things to embedded Lua.
  3. Page–index distinction. Pages are self-contained. Indexes can inspect state from rendered pages.
  4. File dumps. Vomit an entire file into the page, including to itself. I’ve done it all over this page.
  5. Embedded maths. 1/2 + 1/3 = 5/6
  6. Built-in server. Basic HTTP server, and a watcher that rebuilds the site when changes are detected.
  7. Full cold build under 1 second. No need for caching.
  8. Better logging. Rust’s log API is so elegant I don’t know how I ever went without it. Every other log system feels antiquated now.
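
For the file dumps in item 4, the one subtle part is picking a code fence that the dumped file cannot close early, which matters when a file contains fences of its own. Below is a minimal, std-only sketch of that idea, modelled on `dump_file` in main.rs above. The name `fence_block` and its signature are made up for illustration, and it leaves out the `{filename="..."}` attribute line that the real function prepends.

```rust
/// Wrap `contents` in a Djot code block whose fence is longer than any
/// fence-like line inside the contents.
/// `fence_block` is a hypothetical helper name used only for this sketch.
fn fence_block(contents: &str, lang: Option<&str>) -> String {
    let mut fence = String::from("~~~");
    // Grow the fence until no line of the contents starts with it.
    while contents.lines().any(|l| l.starts_with(&fence)) {
        fence.push('~');
    }
    // Optional language tag goes after the opening fence.
    let lang = lang.map_or(String::new(), |l| format!(" {}", l));
    // Make sure the closing fence starts on its own line.
    let eof = if contents.ends_with('\n') { "" } else { "\n" };
    format!("{fence}{lang}\n{contents}{eof}{fence}")
}

fn main() {
    // A file that itself contains a three-tilde fence.
    let inner = "~~~ rust\nfn main() {}\n~~~\n";
    println!("{}", fence_block(inner, Some("djot")));
}
```

Because the check is a plain prefix match on each line, the outer fence always ends up at least one tilde longer than the longest run of tildes that begins a line in the file.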

Previous code

From 2014–2025, this website was built from a hodgepodge of mostly Perl. Here’s the source code:

Makefile
.PHONY: all cache clean css fonts static pages preload deploy reload deps
all: clean css fonts static pages preload

cache:
	git clean -fdX cache

clean:
	git clean -fdqX build

css:
	perl script/css.pl

fonts:
	perl script/font-subset.pl
	rsync -az cache/fonts/ build/fonts/

static:
	rsync -az static/ build/

pages:
	perl script/pages.pl

preload:
	perl script/preload.pl

deploy:
	rsync -rvzt --chmod=Dugo+x,ugo+r build/ cthor.me:~/cthor/build/

reload:
	ssh -t cthor.me sudo systemctl reload nginx

CPANMFLAGS=--sudo

deps:
	cpanm $(CPANMFLAGS) Pandoc HTML5::DOM YAML::XS Try::Tiny Convert::Color::LCh

pages.pl
#!/usr/bin/perl
use utf8;
use v5.26;
use strict;
use warnings;
use Pandoc;
use HTML5::DOM;
use Data::Dumper;
use FindBin '$Bin';
use Digest::MD5 qw/md5_hex/;
use Encode;
use YAML::XS ();
use Try::Tiny;
use Convert::Color::LCh;
use Scalar::Util;
use Carp qw/croak/;
chdir "$Bin/..";
binmode STDOUT, 'utf8';
binmode STDERR, 'utf8';

sub _slurp_file {
   my $fn = shift;
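   open my $fh, "<:utf8", $fn or croak "failed to open $fn: $!\n";
   # Slurp mode: undefine the record separator, then read the whole file
   my $ret = do { local $/; <$fh> };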
   close $fh;
   return $ret;
}

sub Digest::MD5::addfn {
   my ($self, $fn) = @_;
   open my $fh, '<', $fn or croak "failed to open $fn: $!\n";
   binmode $fh;
   $self->addfile($fh);
   close $fh;
   return $self;
}

###############################################################################
# Helper functions for pp templates
###############################################################################

package c {
   sub cache_bust {
      my $url = shift;
      my $digest = Digest::MD5->new->addfn("build$url")->hexdigest;
      my $id = substr $digest, 0, 8;
      return $url =~ s{ \. (.+?) $ }{-$id.$1}xr;
   }

   sub dump {
      return Dumper(@_);
   }

   sub lch {
      "#" . Convert::Color::LCh->new(@_)->convert_to('rgb8')->hex;
   }

   sub sign {
      my $n = shift;
      Scalar::Util::looks_like_number($n)
         ? ($n >= 0 ? '+' : '') . abs $n
         : $n;
   }

   sub toc {
      state $psr = HTML5::DOM->new;
      my @toc;
      my $dom = $psr->parse(shift);
      $dom->find('.h-anchor')->each(sub {
         my $h = shift->parentNode;
         push @toc, {
            level => int substr($h->tag, 1),
            id => $h->attr('id'),
            title => $h->textContent,
         };
      });
      return @toc;
   }
}

###############################################################################
# Step 1: Construct a list of every page and its metadata
###############################################################################

sub pp_read {
   my $fn = shift;
   my %page = (fn => $fn);
   die "$fn does not exist\n" if !-f $fn;

   $page{pp} = _slurp_file $page{fn};

   # Normalise line endings
   $page{pp} =~ s/\r\n/\n/g;
   $page{pp} =~ s/\n\r/\n/g;
   $page{pp} =~ s/\r/\n/g;

   # Extract YAML front matter
   $page{l0} = 1;
   if ($page{pp} =~ s{\A---$(.+?)^---\s*}{}ms) {
      my $yaml = $1;
      $page{l0} += $& =~ tr/\n//;
      %page = (
         %page,
         YAML::XS::Load(Encode::encode_utf8 $1)->%*,
      );
   }

   $page{markdown} //= 1;
   $page{cache} //= $page{markdown};

   \%page;
}

my @pages;
for my $path (glob 'pages/*.pp') {
   my $page = pp_read($path);
   my $id = $page->{id} = substr($path, 6, -3);
   $page->{url} ||= "/$id";
   $page->{dest} ||= "build/$id.html";
   $page->{title} ||= $id;
   $page->{layout} ||= 'wrapper';

   # For each page, we only want at most 1 template to be cached and 1
   # template to run the markdown processor. These templates will for the most
   # part be the same, since the main reason to cache is to *avoid* running
   # the markdown processor.
   #
   # But they aren't always the same. If a page is not cacheable, most notably
   # Index.pp, but is still using the markdown processor, we need to be able
   # to specify that.
   #
   # First, cache=0 propagates to all parents, to ensure we don't cache a
   # template that specifies not to.
   #
   # Then walk back down the tree and:
   # - markdown=1 is kept only once
   # - cache=1 is kept only once
   #
   # After we see cache=1, start filling the digest with file contents,
   # setting the digest hash to the root page.
   my $t = $page;
   while (exists $t->{layout}) {
      my $fn = "layouts/$t->{layout}.pp";
      my $layout = try { pp_read($fn) }
                   catch { die $t->{fn} . ": $_" };

      if ($layout->{layout}) {
         die "$fn: layout is recursive\n" if $t->{layout} eq $layout->{layout};
      }

      $t->{layout} = $layout;
      $layout->{cache} = 0 if !$t->{cache};
      $layout->{child} = $t;
      $t = $layout;

      # We want the root page to inherit all YAML metadata from its parents,
      # with some exceptions, so that this information is available for things
      # like the index.
      for my $key (keys %$layout) {
         next if $key =~ /^(?:fn|pp|l0|markdown|cache|id|url|dest|layout|child)$/;
         $page->{$key} = $layout->{$key};
      }
   }

   my ($md, $cache);
   my $digest = Digest::MD5->new;
   while (defined $t) {
      if ($md) {
         $t->{markdown} = 0;
      }
      elsif ($t->{markdown}) {
         $md = 1;
      }

      if ($cache) {
         $t->{cache} = 0;
         $digest->addfn($t->{fn});
      }
      elsif ($t->{cache}) {
         $cache = 1;
         $digest->addfn($t->{fn});
      }

      $t = $t->{child};
   }

   if ($cache) {
      $page->{digest} = $digest->hexdigest;
   }

   push @pages, $page;
}

###############################################################################
# Step 2: Render the pp templates
###############################################################################

sub pp_lex {
   my $text = shift;
   my $context = 'text';
   my @stack;
   my @tokens;
   my @chars = split //, $text;
   my $i = 0;
   my $j = 0;
   while ($i < $#chars) {
      my $char = $chars[$i];

      if ($context eq 'text') {
         if ($char eq '') {
            if ($i != $j) {
               push @tokens, { type => "text", body => substr($text, $j, $i-$j) };
            }

            my $next = $chars[$i+1] // '';
            if ($next eq '{' || $next eq '(') {
               $context = $next;
               $j = $i+2;
               ++$i;
            }
            else {
               # Line context will eat trailing spaces from the last token
               $tokens[-1]->{body} =~ s/\x20+$// if $i != $j;

               $j = $i+1;
               $context = 'line';
            }
         }
      }
      elsif ($context eq 'line') {
         if ($char eq "\n") {
            push @tokens, { type => "perlcode", body => substr($text, $j, $i-$j+1) };
            $j = $i+1;
            $context = 'text';
         }
      }
      elsif ($context eq '(') {
         if ($char eq '(') {
            push @stack, $char;
         }
         elsif ($char eq ')') {
            if (@stack) {
               pop @stack;
            }
            else {
               push @tokens, { type => "perltext", body => substr($text, $j, $i-$j) };
               $j = $i+1;
               $context = 'text';
            }
         }
      }
      elsif ($context eq '{') {
         if ($char eq '{') {
            push @stack, $char;
         }
         elsif ($char eq '}') {
            if (@stack) {
               pop @stack;
            }
            else {
               push @tokens, { type => "perlcode", body => substr($text, $j, $i-$j) };
               $j = $i+1;
               $context = 'text';
            }
         }
      }

      ++$i;
   }

   if ($i != $j) {
      push @tokens, {
         type => (
            $context eq 'line' || $context eq '{' ? 'perlcode' :
            $context eq '(' ? 'perltext' :
            'text'
         ),
         body => substr($text, $j, $i-$j+1),
      };
   }

   \@tokens;
}

sub pp_compile {
   my $page = shift;
   my $tokens = pp_lex($page->{pp});
   my $ns = "cthor::pp::" . md5_hex Encode::encode_utf8($page->{fn} . $page->{id});
   {
      no strict 'refs';
      *{"${ns}::pages"} = sub { \@pages };
      *{"${ns}::page"} = sub { $page };
      *{"${ns}::content"} = sub { $page->{content} };
      *{"${ns}::contents"} = sub { $page->{content} };
   }

   my $ovar = qq{\$_} . substr($ns, 11, 6);
   my $code = qq{
      use utf8;
      package $ns;
      use v5.26;
      use strict;
      use warnings;
      utf8::upgrade(my $ovar = '');
   };

   $code .= sprintf q{
      my $__line__ = %s;
      local $SIG{__WARN__} = sub {
         my $warn = shift =~ s{ at \(eval \d+\) line \d+\.\n$}{}r;
         print STDERR $warn . " at %s line $__line__\n";
      };
   }, $page->{l0}, $page->{fn};

   my $line = $page->{l0};
   for my $token ($tokens->@*) {
      if ($token->{type} eq 'text') {
         my $string = $token->{body};
         $string =~ s/\\/\\\\/g;
         $string =~ s/}/\\}/g;
         $string =~ s/{/\\{/g;
         $code .= qq{$ovar .= q{$string};\n};
      }
      elsif ($token->{type} eq 'perltext') {
         my $expr = $token->{body};
         $code .= qq{$ovar .= ($expr) // '';\n};
      }
      elsif ($token->{type} eq 'perlcode') {
         $code .= $token->{body};
      }

      $line += $token->{body} =~ tr/\n//;
      $code .= qq{\$__line__ = $line;\n};
   }

   $code .= "$ovar;";
   $code;
}

sub md_render {
   my $html = pandoc->convert("markdown", "html", shift);

   state $psr = HTML5::DOM->new;
   my $dom = $psr->parse($html);

   # Pandoc makes footnote-refs with a <sup> inside the <a>, but I'd prefer
   # to use unicode superscript characters
   $dom->find('.footnote-ref')->each(sub {
      my $anchor = shift;
      $anchor->innerHTML($anchor->textContent =~ tr{0123456789}{⁰¹²³⁴⁵⁶⁷⁸⁹}r);
   });

   # Pandoc puts footnotes in their own <section> with an <hr> at the top,
   # but I'd prefer only an <h1>
   my $section = $dom->at('.footnotes');
   if ($section) {
      $section->at('hr')->outerHTML(qq{<h1 id="footnotes">Footnotes</h1>});
      $section->at('ol')->attr('class', 'footnotes');
      my $parent = $section->parent;
      $section->remove;
      $parent->append($_) for $section->children->@*;
   }

   # Wrap h3 and above with an anchor that points to itself
   $dom->find('h1,h2,h3')->each(sub {
      my $h = shift;
      my $anchor = $dom->createElement('a');
      $anchor->attr(class => 'h-anchor');
      $anchor->attr(href => "#" . $h->attr("id"));
      $anchor->innerHTML($h->innerHTML);
      $h->innerHTML($anchor->html);
   });

   $dom->at('body')->innerHTML;
}

# Digests hash for cache invalidation
my $digests_fn = 'cache/.digests.yml';
my %digests = ();
%digests = YAML::XS::LoadFile($digests_fn)->%* if -f $digests_fn;

sub pp_render {
   my $page = shift;
   (my $cache_fn = $page->{dest}) =~ s/^build/cache/;
   my %merge = $page->%*;
   my $t = $page;

   my $step = sub {
      return if !$t;
      $t = $t->{layout};
      %merge = ( %merge, $t->%* ) if $t;
   };

   if (exists $page->{digest} && exists $digests{ $page->{fn} }
           && $page->{digest}    eq     $digests{ $page->{fn} }) {
      $merge{content} = _slurp_file($cache_fn);

      $step->() while $t && !$t->{cache};
      $step->();
   }

   my $printed;
   while ($t) {
      my $code = pp_compile(\%merge);
      $merge{content} = eval $code;
      if (!defined $merge{content}) {
         # Failed to eval, print line numbers with the compiled pp code
         my @lines = split /\n/, $code;
         my $fmt = "%" . (length $#lines) . "d. %s\n";
         for my $line (0..$#lines) {
            printf STDERR $fmt, $line+1, $lines[$line];
         }
         die $t->{fn} . "\n$@\n";
      }

      if ($t->{markdown}) {
         $merge{content} = md_render($merge{content});
      }

      if ($t->{cache}) {
         open my $fh, '>:utf8', $cache_fn;
         print $fh $merge{content};
         close $fh;
         $digests{ $page->{fn} } = $page->{digest};
      }

      if ($t->{cache} || $t->{markdown} and !$printed) {
         printf "> %s => %s\n", $page->{fn}, $page->{dest};
         $printed = 1;
      }

      $step->();
   }

   $merge{content};
}

###############################################################################
# Step 3: Write the rendered pages to the build dir
###############################################################################

for my $page (@pages) {
   my $render = pp_render($page);

   # Skip write if file is identical, preserving mtime
   next if -f $page->{dest}
      and $render eq _slurp_file $page->{dest};

   open my $fh, ">:utf8", $page->{dest};
   print $fh $render;
}

YAML::XS::DumpFile($digests_fn, \%digests);
css.pl
#!/usr/bin/perl
use utf8;
use v5.26;
use strict;
use warnings;
use FindBin '$Bin';
use Digest::MD5 ();
use YAML::XS;
use File::Copy qw/cp/;
chdir "$Bin/..";

sub fn_md5 {
   open my $fh, "<", shift;
   my $slurp = do { local $/ = <$fh> };
   close $fh;
   return Digest::MD5->new->add($slurp)->hexdigest;
}

my $digests_fn = 'cache/.digests.yml';
my %digests = ();
%digests = YAML::XS::LoadFile($digests_fn)->%* if -f $digests_fn;

my $src = 'src/cthor.scss';
my $dest = 'cache/cthor.css';
my $src_md5 = fn_md5($src);

sub run { say "> $_[0]"; system $_[0] }

if (!exists $digests{$src} || $digests{$src} ne $src_md5) {
   run qq{sassc src/cthor.scss $dest};
   run qq{npx postcss --use autoprefixer -o $dest $dest};
   $digests{$src} = $src_md5;
   YAML::XS::DumpFile($digests_fn, \%digests);
}

my $dest_md5 = sprintf "cache/cthor-%.8s.css", fn_md5($dest);

run qq{cp $dest $dest_md5};
run qq{rsync -az cache/*.css build};

preload.pl
# Read the built pages to find which core fonts they're definitely using.
# Writes a map file for NGINX to send appropriate Link headers.
use 5.01;
use HTML5::DOM;
use FindBin '$Bin';
chdir "$Bin/..";

my @fonts = qw/SourceSansPro-Regular SourceCodePro-Regular SourceSerifPro-It/;

my %s = (
   'SourceSansPro-Regular' => 'h1:not(.title), h2, .tabs, .move',
   'SourceCodePro-Regular' => 'pre code',
   'SourceSerifPro-It' => '.metadata',
);

open my $w, '>', 'build/preload.map';
for my $path (glob 'build/*.html') {
   my $url = substr $path, 5;

   open my $fh, '<:utf8', $path;
   state $psr = HTML5::DOM->new;
   my $dom = $psr->parse(do { local $/ = <$fh> });
   close $fh;

   my $css;
   if (my $node = $dom->at('link[rel=stylesheet]')) {
      $css = sprintf q{<%s>; as=style; rel=preload}, $node->attr('href');
   }

   my @found = ('SourceSerifPro-Regular');

   push @found, grep { $dom->find( $s{$_} )->length } @fonts;

   # Some pages might not use the default wrapper and have no CSS at all
   if (defined $css) {
      say $w qq{$url\n   "}
         . (join ', ', $css, (map qq{</fonts/$_-Core.ttf.woff>; as=font; rel=preload; crossorigin=anonymous}, @found))
         . q{";};
   }
}

__END__
Previous approach was to run a browser emulator and grab all fonts being used.
This was a little overkill and I couldn't actually get it working properly.

I moved away from that approach because we don't actually want *literally
every font* preloaded. If a page happens to use a lot of fonts, we want to
prioritise the important ones. So instead this much simpler approach works.

This is still better than simply preloading all 4 core fonts on every page,
since they still amount to ~130KB, and a lot of pages don't need them.
font-subset.pl
#!/usr/bin/env perl

use v5.26;
use FindBin '$Bin';
chdir "$Bin/..";

mkdir 'cache/fonts' if !-d 'cache/fonts';

for my $fn (glob "src/fonts/*.woff") {
   my $font = substr $fn, 10, -9;

   for my $subset (qw/Core Extra/) {
      my $dest = "cache/fonts/$font-$subset.ttf.woff";

      next if -e $dest;
      print "### $font-$subset\n";
      system(
         "pyftsubset",
         "$fn",
         qq{--output-file=$dest},
         qq{--unicodes-file=src/fonts/$subset.txt},
         qq{--no-ignore-missing-unicodes},
         qq{--layout-features+=smcp,c2sc},
         qq{--with-zopfli},
      );
   }
}

Issues with previous code

There were a few key issues I sought to fix in the rewrite.

Too many dependencies
  • The markup was piped through to pandoc.
  • The CSS was piped through to sassc and then postcss autoprefixer.
  • The fonts were subsetted using pyftsubset.
  • The static build step relied on rsync.
  • The preload hack relied on NGINX.
  • The cache busting relied on server side rewrite rules.
  • Backwards compatibility with an old link relied on a server side redirect.

Despite the above, I went out of my way to reduce the number of Perl dependencies, limiting features while not actually making the build any simpler. There’s not really much difference between 4 CPAN dependencies and 40 so long as none of them are particularly prickly. In hindsight, the whole thing looks quite ridiculous.

Part of the issue is that CPAN is a husk of its former glory. The old modules are still good, but for anything new it is sclerotic. At the time, there were no CPAN modules that could do the same thing as pandoc, sassc, postcss, or pyftsubset. There is a Pandoc module, but it doesn’t actually install Pandoc, so it’s basically worthless. Alien::pandoc, which actually vendors it, didn’t exist until Dec 2023.

CPAN also doesn’t seem to have adequate substitutes for a lot of the Rust libraries used like jotdown, mathemascii, minijinja, mlua, serde, and syntect.

Flimsy build system

The Makefile was a glorified script runner and wasn’t cross-platform in the slightest. It depended on many shell commands that can’t be assumed to exist. On one occasion I unknowingly built from an environment with an older version of Pandoc, temporarily ruining some pages.

Too slow

Building the entire site from scratch took long enough that I added caching. This made the flimsy build system even flimsier and I don’t think it was ever quite right.
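
For reference, the caching in pages.pl and css.pl boils down to comparing a stored digest against a freshly computed one. Below is a minimal sketch of that idea in Rust. The names `digest` and `needs_rebuild` are hypothetical, `DefaultHasher` is only a stand-in for MD5, and unlike css.pl the sketch records the new digest before the build has actually run.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hash the contents of a source file.
/// DefaultHasher is a stand-in: its output is not guaranteed to be stable
/// across Rust releases, so a cache persisted to disk would want a fixed
/// algorithm instead.
fn digest(contents: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(contents);
    hasher.finish()
}

/// Return true when `path` has to be rebuilt, recording the new digest.
/// (Hypothetical helper; a real build would record the digest only after
/// the rebuild succeeded.)
fn needs_rebuild(digests: &mut HashMap<String, u64>, path: &str, contents: &[u8]) -> bool {
    let current = digest(contents);
    // `insert` returns the previously stored digest, if there was one.
    digests.insert(path.to_string(), current) != Some(current)
}
```

The first call for a path always reports a rebuild, and later calls report one only when the contents change. The fragile part is everything this sketch leaves out: a page's digest also has to cover each layout it depends on.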

Unsatisfying markup language

Pandoc’s markdown left a lot to be desired. It’s still much better than any other markdown flavor, but it has its warts.

Because the markup was just pure input–output with pandoc, there was no proper way to extend it. To get around this, md_render parsed the resulting HTML tree and transformed that. This works, but it’s also slow and fragile.
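
One of the transformations md_render applied after the fact was swapping the digits of footnote references for Unicode superscripts. As a stand-alone illustration, here is a sketch of that one step in Rust. The function name is hypothetical and this is not taken from the rewritten site.

```rust
/// Replace ASCII digits with their Unicode superscript forms,
/// leaving every other character untouched.
fn superscript_digits(text: &str) -> String {
    const SUPERSCRIPTS: [char; 10] = ['⁰', '¹', '²', '³', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹'];
    text.chars()
        .map(|c| match c.to_digit(10) {
            // `to_digit(10)` only matches '0' through '9'.
            Some(d) => SUPERSCRIPTS[d as usize],
            None => c,
        })
        .collect()
}
```

The transformation itself is trivial. The cost in the Perl version came from having to re-parse pandoc's HTML output just to find the elements to apply it to.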

Disorganized directory structure

I often found it unclear where things that weren’t pages should go.

The Rust code fully addresses all of these issues on top of having far more features.

Challenge 1: Embedded data

The main challenge was how to port the more complicated pages like Lee movelist.

Here’s the original source for Strings:

Strings.pp
---
title: Strings
topic: Tekken 7
state: massive work in progress
index: 1
toc: 1
names:
   Fahk: Fahkumram
strings:
   Fahk:
      - |
         1 h +1
            1 m -7
            2 h -2 JA
               1 m -6 DL SSL
                  4 m -11 DL SSR
               4 h -9 CH
                  [4 extensions](#fahk-4)
            4 h -8
      - |
         2 h +0
            4 h -9
               [4 extensions](#fahk-4)
      - |
         3 m -6
            1 h -3
               4 m -13
                  4 m -14 DL
               d+4 l -16
            4 m -12 DL
            4~3 m -7 I
            4~3^ m +14 I
            4~4 h +6 I
      - |
         4 h -9
            3 h -7
            3~1 h -4 SSR I16
            3~4 m -5 SSR I17
            3~4^ m +14g SSR I32
            4 l -16 SSR
   Jin:
      - |
         1 h +1
            2 h -2 JA
               3 m +0 SSR I10
               4 h -4 DL CH
            3 h -6 JA
               2 m -1 JA
                  1 m -7 JA
                     4 l -13
               F (ZEN) -7
               B (?) -6
            3~3 m +6 SSR I15
               d/f+3 m -14 DL
            d+3 l -12
               4 l -13
      - |
         2 h +0
            1 m -3 JA
               4 m -9 DL
               4~4 l -31 I16
            4 h -9
      - |
         3 h -7
            1 h -1
               F (ZEN) +4
               4 l -13 DL CH
      - |
         f+3~3 m +6
            d/f+3 m -14 DL
      - |
         d/f+1 m -3
            4~4 m -12 SSL JGL
            4 h -9 DL
      - |
         d/f+3 m -16
            3 h -8 DL
               F (ZEN) -7
               B (?) -8
      - |
         d+3 l -13
            3 m -16
      - |
         d/b+2 m -10
            2 h -12 JA
               3 m -9 SSL JGL DL
            3 m -8
      - |
         b+2 m -11
            1 m -9 DL
      - |
         f,F+3 m +0 JGL
            1 h +1 JA
               [1 extensions](#jin-1)
      - |
         b,f+2 m -7
            1 h -5 DL
               1 m -5 DL
            3 m -12 DL
               F (ZEN) -1
      - |
         ws1 m -6
            2 m -8 DL
               F (ZEN) -9
      - |
         ZEN,1 m -3
            2 m -14 DL
            3 h -9
      - |
         CD,1+2 h,m,(ZEN) -7
            D/F (CD) +18g
            3 h +8 SS
               1 m +9
      - |
         CD,4 l -31 JGL
            3+4 m -14 SSR SWL
   Lee:
      - |
         1 h +1
            2 h -1 JA
               2 m -13
                  3 h -3 DL
                     4 (HMS) -9
               4 m -12 DL CH
                  3 (HMS) -14
               4^ m -7 SSL CH
               f~N (MS) -2
      - |
         2 h +1
            1 h -5 JA
               1 m -12 CH
               3 h -4 CH
               3+4 (HMS) -5
               4 l -16 CH
            2 m -13
               3 h -3 DL
                  4 (HMS) -9
      - |
         3 m -7
            3 h -10 DL
         3~3 h -10
            4 t!(h)
      - |
         4 h -9 CH
            3 h -14 JA
               3 m -13 DL
               4 m -9 SS I10 DL
            4 h -5 JA
               4 h -5 JA
            u+3 m -1 I14
               f~N (MS) -9 I14
      - |
         b+2 h -5
            4 h -8 DL CH
               3 h -3
               3+4 (HMS) -10
            f~N (MS) -1
      - |
         b+3 l -14
            3 h -12
               f~N (MS) -7
      - |
         d+4 l -13
            4 h -9 DL
               3 h -14 JA
                  3 m -13 DL
                  4 m -9 SS I10 DL
               4 h -5 JA
                  4 h -5 JA
               u+3 m -1 I14
                  f~N (MS) -9 I14
            d+4 l -15 DL
               4 l -15 DL
                  4 m -20 DL
                     3 (HMS) -11
      - |
         d/f+3 m -8
            2 h -6
               3 m -16 DL JGL
      - |
         f+4 h -7
            1 h -2 SSR DL
            3 m -8 DL CH
               4 (HMS) -4
      - |
         u/f+3 m -9
            1 m -12 CH
            4 h -3 JGL
      - |
         ws2 m -8
            3 m -13
            4 h -4
      - |
         ws3 m -21
            3 m -15 JA
               4 (HMS) -15
               d+3 l -19
                  3 m -19
                     3 h,(INF) -16
               d/f+3 m -19 SSR
                  3 h,(INF) -16
---

Strings are annoying. This page helps me remember how to punish them.

- Frame data is for the move on block
- Clicking the move will play a video

This page is not comprehensive. Characters may have more strings than what's
listed here.

However, every subsection is comprehensive, e.g., Lee has no extensions for
d+4 beyond [those listed](#lee-d4){.internal}.

#### Legend

SS, SSR, SSL, SWR, SWL
: can be stepped on block

I{N}
: can be interrupted on block, e.g., I11 loses to an 11 frame or faster move

DL
: can be delayed

JA
: forces block if previous move was blocked ("jails")

CH
: has dangerous counter-hit properties

JGL
: launches on hit

If a move is steppable only when delayed, the step note is listed *after* the
delay note, e.g. DL SS means it can be stepped only when delayed, whereas SS
DL means it can be stepped regardless of the delay.

◊ my %strings = page()->{strings}->%*;
◊ for my $char (sort keys %strings) {

# ◊( page()->{names}{$char} // $char ) {#◊( lc $char )}

   ◊ for my $string ($strings{$char}->@*) {
      ◊ $string =~ /(\S+)/;
      ◊ my $root = $1;

## ◊($root) {#◊( lc "$char-" . ($root =~ s{[,/+~]}{}gr =~ s{\s+}{-}gr) )}

``` {=html}
<div class="string-section">
   <div class="string">
      ◊ my @stack;
      ◊ my $prev;
      ◊ for my $line (split /\n/, $string) {
         ◊ $line =~ s{^(\s*)}{};
         ◊ my $depth = length $1;
         ◊ if (@stack * 3 < $depth) {
            <div class="string">
            ◊ push @stack, $prev;
         ◊ }
         ◊ while (@stack * 3 > $depth) {
            ◊ pop @stack;
            </div>
         ◊ }
         ◊ if ($line =~ /^\[/) {
```
<div class="move-ctn link">◊( $line )</div>
``` {=html}
         ◊ } else {
            ◊ my @parts = split /\s+/, $line;
            <div class="move-ctn">
               <ul class="move" data-video="◊( ("$char-" . join ',', @stack, $parts[0]) =~ s{/}{}gr )">
            ◊ for my $part (@parts) {
                  <li>◊($part =~ s/-/−/gr)</li>
            ◊ }
               </ul>
            </div>
            ◊ $prev = $parts[0];
         ◊ }
      ◊ }
      ◊ while (@stack) {
         ◊ pop @stack;
         </div>
      ◊ }
   </div>
   <div class="string-demo">
      ◊ my $poster = "/images/Strings/$char-idle.jpg";
      <div class="string-demo--error hidden">Video not found</div>
      <video muted loop autoplay width="448" height="252" poster="◊($poster)"
         style="background-image: url('◊($poster)')"></video>
   </div>
</div>
```

   ◊ }
◊ }

<style>
   h2, .toc-2 .internal {
      font-variant: normal;
   }

   .string-section {
      display: flex;
      flex-flow: row wrap;
   }

   .string-section > .string {
      min-width: 12em;
   }

   .string .string {
      position: relative;
      border-left: 3px solid #d2e5f9;
      margin-left: 13px;
   }

   .string .string::before,
   .move-ctn::before {
      content: " ";
      display: block;
      position: absolute;
      border-top: 3px solid #d2e5f9;
   }

   .move-ctn::before {
      top: 50%;
      transform: translateY(-50%);
      width: 14px;
      left: -16px;
   }

   .string .string::before {
      top: 0;
      width: 16px;
      left: -16px;
   }

   .move-ctn {
      display: flex;
      flex-flow: row nowrap;
      justify-content: start;
      position: relative;
      margin: 0.2em 0;
      margin-left: 16px;
   }

   .move-ctn.link {
      display: block;
      font-size: 0.9em;
      font-weight: bold;
   }

   .move-ctn::after {
      display: none;
      position: absolute;
      left: -11px;
      top: 50%;
      transform: translateY(-50%);
      content: "";
      width: 0;
      height: 0;
      border-top: 5px solid transparent;
      border-bottom: 5px solid transparent;
      border-left: 9px solid currentColor;
   }

   .move-ctn.current::after {
      display: block;
   }

   .move {
      line-height: 1.2;
      cursor: pointer;
      display: flex;
      flex-flow: row nowrap;
      align-items: center;
      list-style-type: none;
      margin-left: 1px;
      padding-left: 0;
      padding-right: 0.5em;
   }

   .move:hover {
      background-color: #f8eaea;
   }

   .move > li:first-child {
      font-weight: bold;
   }

   .move > li + li {
      margin-left: 0.2em;
      font-size: 0.8em;
   }

   .move > li:nth-child(4) {
      border-left: 1px solid #aaa;
      padding-left: 0.2em;
   }

   .string-demo {
      flex-grow: 1;
      margin-top: 0.5em;
      position: relative;
      height: 252px;
   }

   @media (max-width: 480px) {
      .string-demo {
         height: calc(56.25vw - 1em);
      }
   }

   .string-demo video {
      height: auto;
      background-size: cover;
   }

   .string-demo--error {
      position: absolute;
      top: 0;
      background-color: hsla(0, 0%, 100%, 0.4);
      padding: 0.2em 1em;
      left: 50%;
      transform: translateX(-50%);
      white-space: nowrap;
      border: 1px solid hsla(0, 90%, 30%, 0.8);
      border-top: none;
   }
</style>

<script type="text/javascript">
(function () {
   let fmts = ['webm', 'mp4'];
   document.querySelectorAll('.move').forEach((el, i) => {
      let ctnEl = el.parentElement;
      let secEl = ctnEl;
      while (secEl !== null && !secEl.classList.contains('string-section')) {
         secEl = secEl.parentElement;
      }
      let vidEl = secEl.querySelector('video');
      let errEl = secEl.querySelector('.string-demo--error');
      let srcEls = fmts.map((ext, i) => {
         let srcEl = document.createElement('source');
         if (i == fmts.length - 1) {
            srcEl.onerror = (ev) => errEl.classList.remove('hidden');
         }
         srcEl.setAttribute('type', 'video/' + ext);
         srcEl.setAttribute('src', '/videos/Strings/' + el.dataset.video + '.' + ext);
         return srcEl;
      });

      el.addEventListener('click', (ev) => {
         // For some reason, this click can sometimes cause the browser to scroll, which is jarring.
         // To correct this, take the current scroll position, and scroll to it after.
         let scrollX = window.scrollX;
         let scrollY = window.scrollY;

         vidEl.querySelectorAll('source').forEach(el => el.remove());
         errEl.classList.add('hidden');
         if (ctnEl.classList.contains('current')) {
            ctnEl.classList.remove('current');
            vidEl.load();
            vidEl.pause();
         }
         else {
            secEl.querySelectorAll('.move-ctn').forEach((el) => el.classList.remove('current'));
            ctnEl.classList.add('current');
            srcEls.forEach(el => vidEl.appendChild(el));
            vidEl.load();
            vidEl.play();
         }
         window.scrollTo(scrollX, scrollY);
      });
   });
})();
</script>

This page embedded a bunch of data at the top in a DSL, then rendered that data while at the same time parsing and transforming it.

Now, that sounds bad written like that, but hear me out. You don’t want to pollute the SSG itself (in this case pages.pl) with code that only applies to a single page. That code should just live in the page, or in an explicit dependency of that page (e.g. an include), to limit its scope.

The advantage of the Perl site is that the templates were just Perl—a terrifyingly powerful language—so crazy things like that are more than possible without putting any extra thought into the design.

With more typical templating engines, such a complex data transformation is not so easy. Things like Handlebars, Jinja, Liquid, Template Toolkit, etc. are intentionally limited in what they can do to stop naughty people like me from writing abominations like this. They’re intended for MVC web frameworks like Rails and Django where the data lives in a database and separation of concerns is the whole point of the design pattern.

…and there’s nothing wrong with the MVC pattern. I just had to rethink how to apply it to the use case of authoring self-contained web pages.

The solution: add filters to define data (M) and to transform it (C). Specifically, a YAML filter takes a string of YAML and returns the deserialized data. Then, a Lua filter takes that data and some Lua code and returns the transformed data. With this, each language works together, each doing what it does best.
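
To make the transform step concrete: for the Strings page it mostly means turning indented lines into a tree. The page below does this in Lua. The following is only a sketch of the same stack-based technique in Rust, with a hypothetical `Node` type and `parse_tree` function, and without the special handling of HTML lines.

```rust
#[derive(Debug, PartialEq)]
struct Node {
    indent: usize,
    id: String,         // comma-joined first tokens from the root down
    parts: Vec<String>, // whitespace-separated tokens of the line
    children: Vec<Node>,
}

/// Parse indented lines into a forest. Hypothetical helper for illustration.
fn parse_tree(text: &str) -> Vec<Node> {
    let mut roots: Vec<Node> = Vec::new();
    // Chain of still-open nodes, outermost first.
    let mut stack: Vec<Node> = Vec::new();

    // Close the innermost open node and attach it to its parent,
    // or to the list of roots if it has none.
    fn close(stack: &mut Vec<Node>, roots: &mut Vec<Node>) {
        if let Some(done) = stack.pop() {
            match stack.last_mut() {
                Some(parent) => parent.children.push(done),
                None => roots.push(done),
            }
        }
    }

    for line in text.lines().filter(|l| !l.trim().is_empty()) {
        let content = line.trim_start();
        let indent = line.len() - content.len();

        // Pop until the top of the stack has a smaller indent than this line.
        while stack.last().map_or(false, |top| top.indent >= indent) {
            close(&mut stack, &mut roots);
        }

        let parts: Vec<String> = content.split_whitespace().map(String::from).collect();
        let id = match stack.last() {
            Some(parent) => format!("{},{}", parent.id, parts[0]),
            None => parts[0].clone(),
        };
        stack.push(Node { indent, id, parts, children: Vec::new() });
    }

    // Close whatever is still open at the end of the input.
    while !stack.is_empty() {
        close(&mut stack, &mut roots);
    }
    roots
}
```

Each node's `id` is the path of moves leading to it, like the `id` field the Lua version builds.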

The new page:

Strings.md.j2
{% extends "djocument.html.j2" %}
{% set title = "Strings" %}
{% set topic = "Tekken 7" %}
{% set state = "proof of concept" %}
{% set index = true %}
{% set toc = true %}
{% set data | yaml %}
``` yaml
names:
   Fahk: Fahkumram
strings:
   Fahk:
      - |
         1 h +1
            1 m -7
            2 h -2 JA
               1 m -6 DL SSL
                  4 m -11 DL SSR
               4 h -9 CH
                  <a href="#Fahk-4">4 extensions</a>
            4 h -8
      - |
         2 h +0
            4 h -9
               <a href="#Fahk-4">4 extensions</a>
      - |
         3 m -6
            1 h -3
               4 m -13
                  4 m -14 DL
               d+4 l -16
            4 m -12 DL
            4~3 m -7 I
            4~3^ m +14 I
            4~4 h +6 I
      - |
         4 h -9
            3 h -7
            3~1 h -4 SSR I16
            3~4 m -5 SSR I17
            3~4^ m +14g SSR I32
            4 l -16 SSR
   Jin:
      - |
         1 h +1
            2 h -2 JA
               3 m +0 SSR I10
               4 h -4 DL CH
            3 h -6 JA
               2 m -1 JA
                  1 m -7 JA
                     4 l -13
               F (ZEN) -7
               B (?) -6
            3~3 m +6 SSR I15
               df+3 m -14 DL
            d+3 l -12
               4 l -13
      - |
         2 h +0
            1 m -3 JA
               4 m -9 DL
               4~4 l -31 I16
            4 h -9
      - |
         3 h -7
            1 h -1
               F (ZEN) +4
               4 l -13 DL CH
      - |
         f+3~3 m +6
            df+3 m -14 DL
      - |
         df+1 m -3
            4~4 m -12 SSL JGL
            4 h -9 DL
      - |
         df+3 m -16
            3 h -8 DL
               F (ZEN) -7
               B (?) -8
      - |
         d+3 l -13
            3 m -16
      - |
         db+2 m -10
            2 h -12 JA
               3 m -9 SSL JGL DL
            3 m -8
      - |
         b+2 m -11
            1 m -9 DL
      - |
         f,F+3 m +0 JGL
            1 h +1 JA
               <a href="#Jin-1">1 extensions</a>
      - |
         b,f+2 m -7
            1 h -5 DL
               1 m -5 DL
            3 m -12 DL
               F (ZEN) -1
      - |
         ws1 m -6
            2 m -8 DL
               F (ZEN) -9
      - |
         ZEN,1 m -3
            2 m -14 DL
            3 h -9
      - |
         CD,1+2 h,m,(ZEN) -7
            D/F (CD) +18g
            3 h +8 SS
               1 m +9
      - |
         CD,4 l -31 JGL
            3+4 m -14 SSR SWL
   Lee:
      - |
         1 h +1
            2 h -1 JA
               2 m -13
                  3 h -3 DL
                     4 (HMS) -9
               4 m -12 DL CH
                  3 (HMS) -14
               4^ m -7 SSL CH
               f~N (MS) -2
      - |
         2 h +1
            1 h -5 JA
               1 m -12 CH
               3 h -4 CH
               3+4 (HMS) -5
               4 l -16 CH
            2 m -13
               3 h -3 DL
                  4 (HMS) -9
      - |
         3 m -7
            3 h -10 DL
         3~3 h -10
            4 t!(h)
      - |
         4 h -9 CH
            3 h -14 JA
               3 m -13 DL
               4 m -9 SS I10 DL
            4 h -5 JA
               4 h -5 JA
            u+3 m -1 I14
               f~N (MS) -9 I14
      - |
         b+2 h -5
            4 h -8 DL CH
               3 h -3
               3+4 (HMS) -10
            f~N (MS) -1
      - |
         b+3 l -14
            3 h -12
               f~N (MS) -7
      - |
         d+4 l -13
            4 h -9 DL
               3 h -14 JA
                  3 m -13 DL
                  4 m -9 SS I10 DL
               4 h -5 JA
                  4 h -5 JA
               u+3 m -1 I14
                  f~N (MS) -9 I14
            d+4 l -15 DL
               4 l -15 DL
                  4 m -20 DL
                     3 (HMS) -11
      - |
         df+3 m -8
            2 h -6
               3 m -16 DL JGL
      - |
         f+4 h -7
            1 h -2 SSR DL
            3 m -8 DL CH
               4 (HMS) -4
      - |
         uf+3 m -9
            1 m -12 CH
            4 h -3 JGL
      - |
         ws2 m -8
            3 m -13
            4 h -4
      - |
         ws3 m -21
            3 m -15 JA
               4 (HMS) -15
               d+3 l -19
                  3 m -19
                     3 h,(INF) -16
               df+3 m -19 SSR
                  3 h,(INF) -16
```
{% endset %}
{% set data | lua(data) %}
``` lua
function parseTree(str)
   local root = { children = {}, indent = -1 }
   local stack = { root }

   for line in str:gmatch("[^\n]+") do
      local indent = #line:match("^(%s*)")
      local content = line:match("^%s*(.-)$")
      local node = { indent = indent, children = {} }

      if content:match("^<") then
         node.html = content
      else
         node.parts = {}
         for part in content:gmatch("%S+") do
            table.insert(node.parts, part)
         end
      end

      -- Pop until we find a parent with less indent
      while #stack > 1 and stack[#stack].indent >= indent do
         table.remove(stack)
      end

      -- Add node as child of current top of stack
      local parent = stack[#stack]
      node.id = (parent.id and parent.id .. "," or "") .. line:match("(%S+)");
      table.insert(parent.children, node)
      table.insert(stack, node)
   end

   return root
end

local transformed = {}
for char, sections in pairs(data.strings) do
   transformed[char] = {
      heading = data.names[char] or char,
      id = char,
      sections = {},
   }
   for i, string in ipairs(sections) do
      local section = {}
      section.children = parseTree(string).children
      section.heading = section.children[1].parts[1]
      section.id = char .. "-" .. section.heading
      table.insert(transformed[char].sections, section)
   end
end

return transformed
```
{% endset %}
{% set content %}

_(This page's notation differs somewhat from [Wavu Wiki](https://wavu.wiki/).
It was written before that project was started and is made largely redundant by it.
As such, I've kept the notation as is for archival purposes.)_

_(Videos are missing for Fahk 2, 3, and 4. All other sections should have videos.)_

Strings are annoying. This page helps me remember how to punish them.

- Frame data is for the move on block
- Clicking the move will play a video

This page is not comprehensive. Characters may have more strings than what's
listed here.

However, every subsection is comprehensive, e.g., Lee has no extensions for
d+4 beyond [those listed](#Lee-d+4).

# Legend

: SS, SSR, SSL, SWR, SWL

  can be stepped on block

: I{N}

  can be interrupted on block, e.g., I11 loses to an 11 frame or faster move

: DL

  can be delayed

: JA

  forces block if previous move was blocked ("jails")

: CH

  has dangerous counter-hit properties

: JGL

  launches on hit

If a move is steppable only when delayed, the step note is listed _after_ the
delay note, e.g. DL SS means it can be stepped only when delayed, whereas SS
DL means it can be stepped regardless of the delay.

``` =html
<style>
   .string-section {
      display: flex;
      flex-flow: row wrap;
   }

   .string-section > .string {
      min-width: 12em;
   }

   .string .string {
      position: relative;
      border-left: 3px solid #d2e5f9;
      margin-left: 13px;
   }

   .string .string::before,
   .move-ctn::before {
      content: " ";
      display: block;
      position: absolute;
      border-top: 3px solid #d2e5f9;
   }

   .move-ctn::before {
      top: 50%;
      transform: translateY(-50%);
      width: 14px;
      left: -16px;
   }

   .string .string::before {
      top: 0;
      width: 16px;
      left: -16px;
   }

   .move-ctn {
      display: flex;
      flex-flow: row nowrap;
      justify-content: start;
      position: relative;
      margin: 0.2em 0;
      margin-left: 16px;
   }

   .move-ctn.html {
      display: block;
      font-size: 0.9em;
      font-weight: bold;
   }

   .move-ctn::after {
      display: none;
      position: absolute;
      left: -11px;
      top: 50%;
      transform: translateY(-50%);
      content: "";
      width: 0;
      height: 0;
      border-top: 5px solid transparent;
      border-bottom: 5px solid transparent;
      border-left: 9px solid currentColor;
   }

   .move-ctn.current::after {
      display: block;
   }

   .move {
      line-height: 1.2;
      cursor: pointer;
      display: flex;
      flex-flow: row nowrap;
      align-items: center;
      list-style-type: none;
      margin-left: 1px;
      padding-left: 0;
      padding-right: 0.5em;
   }

   .move:hover {
      background-color: #f8eaea;
   }

   .move > li:first-child {
      font-weight: bold;
   }

   .move > li + li {
      margin-left: 0.2em;
      font-size: 0.8em;
   }

   .move > li:nth-child(4) {
      border-left: 1px solid #aaa;
      padding-left: 0.2em;
   }

   .string-demo {
      flex-grow: 1;
      margin-top: 0.5em;
      position: relative;
      height: 252px;
   }

   @media (max-width: 480px) {
      .string-demo {
         height: calc(56.25vw - 1em);
      }
   }

   .string-demo video {
      background-size: cover;
      height: auto;
      margin-top: 0;
   }

   .string-demo--error {
      position: absolute;
      top: 0;
      background-color: hsla(0, 0%, 100%, 0.4);
      padding: 0.2em 1em;
      left: 50%;
      transform: translateX(-50%);
      white-space: nowrap;
      border: 1px solid hsla(0, 90%, 30%, 0.8);
      border-top: none;
   }
</style>
```

{% for char, char_data in data | items %}
# {{ char_data.heading }}

{% for section in char_data.sections %}
{id="{{ section.id }}"}
## {{ section.heading }}

``` =html
<div class="string-section">
   <div class="string">
   {% for string in section.children recursive %}
      {% if string.html %}
         <div class="move-ctn html">
            {{ string.html }}
         </div>
      {% else %}
         <div class="move-ctn">
            <ul class="move" data-video="{{ char }}-{{ string.id }}">
            {% for part in string.parts %}
               <li>{{ part }}</li>
            {% endfor %}
            </ul>
         </div>
      {% endif %}
      {% if string.children | length %}
      <div class="string">
         {{ loop(string.children) }}
      </div>
      {% endif %}
   {% endfor %}
   </div>
   <div class="string-demo">
      {% set poster = "/Strings/" + char + "-idle.jpg" %}
      <div class="string-demo--error hidden">Video not found</div>
      <video muted loop autoplay width="448" height="252" poster="{{ poster }}"
         style="background-image: url('{{ poster }}')"></video>
   </div>
</div>
```
{% endfor %}
{% endfor %}

``` =html
<script type="text/javascript">
(function () {
   let fmts = ['mp4'];
   document.querySelectorAll('.move').forEach((el, i) => {
      let ctnEl = el.parentElement;
      let secEl = ctnEl;
      while (secEl !== null && !secEl.classList.contains('string-section')) {
         secEl = secEl.parentElement;
      }
      let vidEl = secEl.querySelector('video');
      let errEl = secEl.querySelector('.string-demo--error');
      let srcEls = fmts.map((ext, i) => {
         let srcEl = document.createElement('source');
         if (i == fmts.length - 1) {
            srcEl.onerror = (ev) => errEl.classList.remove('hidden');
         }
         srcEl.setAttribute('type', 'video/' + ext);
         srcEl.setAttribute('src', '/Strings/' + el.dataset.video + '.' + ext);
         return srcEl;
      });

      el.addEventListener('click', (ev) => {
         // For some reason, this click can sometimes cause the browser to scroll, which is jarring.
         // To correct this, take the current scroll position, and scroll to it after.
         let scrollX = window.scrollX;
         let scrollY = window.scrollY;

         vidEl.querySelectorAll('source').forEach(el => el.remove());
         errEl.classList.add('hidden');
         if (ctnEl.classList.contains('current')) {
            ctnEl.classList.remove('current');
            vidEl.load();
            vidEl.pause();
         }
         else {
            secEl.querySelectorAll('.move-ctn').forEach((el) => el.classList.remove('current'));
            ctnEl.classList.add('current');
            srcEls.forEach(el => vidEl.appendChild(el));
            vidEl.load();
            vidEl.play();
         }
         window.scrollTo(scrollX, scrollY);
      });
   });
})();
</script>
```
{% endset %}

Source

For demonstration purposes:

Rewrite.md.j2
{% extends "djocument.html.j2" %}
{% set title = "Rewriting my website in Rust" %}
{% set published = "10 Mar 2025" %}
{% set topic = "Meta" %}
{% set index = true %}
{% set toc = true %}
{% set content %}
Everything is being rewritten in Rust these days,
so I rewrote this website to see what the hype is all about.

I am very satisfied with the result.

{{ dump_file("src/main.rs", "rust") }}

{{ dump_file("Cargo.toml", "toml") }}

# New features

While rewriting, I of course added new features.
The state of the previous code was poor enough that I was no longer interested in adding things to it,
so many of these are backlogged features.

1. *Djot markup.* More powerful, extensible, and predictable than Pandoc's markdown. Also looks nicer.
2. *Jinja templates.* More purpose built than embedded Perl. Loop variables, recursive loops, macros, filters, includes, etc. _Less_ powerful, which is actually a good thing. Does the easy things easier, leaves the hard things for embedded Lua.
3. *Page--index distinction.* Pages are self-contained. Indexes can inspect state from rendered pages.
4. *File dumps.* Vomit an entire file into the page, including the page's own source. I've done it all over this page.
5. *Embedded maths.* $`1/2 + 1/3 = 5/6`
6. *Built-in server.* Basic HTTP server, and a watcher that rebuilds the site when changes are detected.
7. *Full cold build under 1 second.* No need for caching.
8. *Better logging.* Rust's log API is so elegant I don't know how I ever went without it. Every other log system feels antiquated now.

# Previous code

From 2014--2025, this website was built from a hodgepodge of mostly Perl. Here's the source code:

{{ dump_file("content/archive/cthor/Makefile", "make") }}

{{ dump_file("content/archive/cthor/pages.pl", "perl") }}

{{ dump_file("content/archive/cthor/css.pl", "perl") }}

{{ dump_file("content/archive/cthor/preload.pl", "perl") }}

{{ dump_file("content/archive/cthor/font-subset.pl", "perl") }}

# Issues with previous code

There were a few key issues I sought to fix in the rewrite.

: Too many dependencies
  
  - The markup was piped through `pandoc`.
  - The CSS was piped through `sassc` and then `postcss autoprefixer`.
  - The fonts were subsetted using `pyftsubset`.
  - The static build step relied on `rsync`.
  - The preload hack relied on NGINX.
  - The cache busting relied on server side rewrite rules.
  - Backwards compatibility with [an old link](/2013/07/19/per-request-wrapper-for-catalyst-view-tt/) relied on a server side redirect.

  Despite the above, I went out of my way to _reduce_ the number of _Perl_ dependencies,
  limiting features while not actually making the build any simpler.
  There's not really much difference between 4 CPAN dependencies and 40
  so long as none of them are particularly prickly.
  In hindsight, the whole thing looks quite ridiculous.

  Part of the issue is that CPAN is a husk of its former glory.
  The old modules are still good, but for anything new it is sclerotic.
  At the time, there were no CPAN modules that could do the same thing as `pandoc`, `sassc`, `postcss`, or `pyftsubset`.
  There _is_ a `Pandoc` module, but it doesn't actually install Pandoc, so it's basically worthless.
  There was no `Alien::pandoc` to actually vendor it until _Dec 2023_.

  CPAN also doesn't seem to have adequate substitutes for many of the Rust libraries used here, such as
  `jotdown`, `mathemascii`, `minijinja`, `mlua`, `serde`, and `syntect`.

: Flimsy build system

  The Makefile was a glorified script runner and wasn't cross-platform in the slightest.
  It depended on many shell commands that can't be assumed to exist.
  On one occasion I ended up unknowingly building from an environment with an older version of Pandoc,
  temporarily ruining some pages.

: Too slow

  Building the entire site from scratch took long enough that I added caching.
  This made the flimsy build system even flimsier and I don't think it was ever quite right.

: Unsatisfying markup language

  Pandoc's markdown left a lot to be desired.
  It's still much better than any other markdown flavor, but it has its warts.

  Because the markup was just pure input--output with `pandoc`, there was no proper way to extend it.
  To get around this, `md_render` parsed the resulting HTML tree and transformed that.
  This works, but it's also slow and fragile.

: Disorganized directory structure

  I often found it unclear where things that weren't pages should go.

The Rust code fully addresses all of these issues on top of having far more features.

# Challenge 1: Embedded data

The main challenge was how to port the more complicated pages like [Lee movelist](/Lee-movelist).

Here's the original source for [Strings](/Strings):

{{ dump_file("content/archive/cthor/Strings.pp") }}

This page embedded a bunch of data at the top in a DSL,
then rendered that data while at the same time parsing and transforming it.

Now, that sounds bad written like that, but hear me out.
You don't want to pollute the SSG itself (in this case `pages.pl`) with code that only applies to a single page.
That code should just live in the page, or in an explicit dependency of that page (e.g. an include), to limit its scope.

The advantage of the Perl site was that the templates were just Perl (a terrifyingly powerful language), so
crazy things like that were more than possible without putting any extra thought into the design.

With more typical templating engines, such a complex data transformation is not so easy.
Things like Handlebars, Jinja, Liquid, Template Toolkit, etc. are intentionally limited in what they can do
to stop naughty people like me from writing abominations like this.
They're intended for MVC web frameworks like Rails and Django where the data lives in a database
and separation of concerns is the whole point of the design pattern.

...and there's nothing wrong with the MVC pattern.
I just had to rethink how to apply it to the use case of authoring self-contained web pages.

The solution: add filters to define data (M) and to transform it (C).
Specifically, a YAML filter takes a string of YAML and returns the deserialized data.
Then, a Lua filter takes that data and some Lua code and returns the transformed data.
With this, the languages work together, each doing what it does best.
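
To make the transform (C) step a bit more concrete, here is a rough sketch of the kind of tree-building the Strings page needs, written in Rust instead of Lua and using only the standard library.
It is not the site's code: `Node`, `parse_tree`, and `close_top` are names made up for this example, and it assumes nesting depth is encoded as leading spaces.
The only parts taken from the real transform are the stack of open nodes and the way each node's id is its parent's id plus a comma plus the line's first token.

```rust
/// One move in a string. `id` accumulates the path from the root,
/// e.g. "d+4,4,3" for the move "3" that follows "d+4" and then "4".
#[derive(Debug, Default)]
struct Node {
    id: String,
    parts: Vec<String>,
    children: Vec<Node>,
}

/// Pop the innermost open node and attach it to the node below it.
fn close_top(stack: &mut Vec<Node>) {
    let node = stack.pop().expect("stack always holds the root");
    stack
        .last_mut()
        .expect("root is never closed here")
        .children
        .push(node);
}

/// Parse indentation-nested lines into a tree.
/// Assumption: depth = leading spaces / `indent` (not taken from the real page).
fn parse_tree(src: &str, indent: usize) -> Node {
    // stack[0] is the root; everything above it is a still-open ancestor.
    let mut stack: Vec<Node> = vec![Node::default()];

    for line in src.lines().filter(|l| !l.trim().is_empty()) {
        let spaces = line.len() - line.trim_start().len();
        let depth = spaces / indent + 1;

        // Close nodes until the top of the stack is this line's parent.
        while stack.len() > depth {
            close_top(&mut stack);
        }

        let parts: Vec<String> = line.split_whitespace().map(String::from).collect();
        let parent_id = &stack.last().expect("root is always present").id;
        let id = if parent_id.is_empty() {
            parts[0].clone()
        } else {
            format!("{parent_id},{}", parts[0])
        };
        stack.push(Node { id, parts, children: Vec::new() });
    }

    // Close whatever is still open, leaving only the root.
    while stack.len() > 1 {
        close_top(&mut stack);
    }
    stack.pop().expect("root is always present")
}
```

Given the input `"d+4 -13\n  4 -15\n    3 -20\n  n -5\n"` and an indent of 2, the root gets a single child `d+4` whose children are `d+4,4` (itself the parent of `d+4,4,3`) and `d+4,n`.
Those cumulative ids are the sort of thing a template can then use to build a `data-video` attribute.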

The new page:

{{ dump_file("content/template/page/Strings.md.j2") }}

* * *

# Source

For demonstration purposes:

{{ dump_file("content/template/page/Rewrite.md.j2") }}

{% endset %}