<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.tunbury.org/atom.xml" rel="self" type="application/atom+xml" /><link href="https://www.tunbury.org/" rel="alternate" type="text/html" /><updated>2026-03-03T22:22:32+00:00</updated><id>https://www.tunbury.org/atom.xml</id><title type="html">Tunbury.ORG</title><subtitle>It&apos;s a website!</subtitle><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><entry><title type="html">OCaml-CI and native Windows builds</title><link href="https://www.tunbury.org/2026/03/03/obuilder-hcs-2/" rel="alternate" type="text/html" title="OCaml-CI and native Windows builds" /><published>2026-03-03T22:20:00+00:00</published><updated>2026-03-03T22:20:00+00:00</updated><id>https://www.tunbury.org/2026/03/03/obuilder-hcs-2</id><content type="html" xml:base="https://www.tunbury.org/2026/03/03/obuilder-hcs-2/"><![CDATA[<p>Following on from <a href="https://www.tunbury.org/2026/02/19/obuilder-hcs/">last week's post about obuilder and Windows Host Compute Services</a>, I am pleased to report that this is now running on OCaml-CI. In this early phase, I have enabled testing only on Windows Server 2025 with OCaml 5.4 and opam 2.5 using the MinGW toolchain.</p>

<p>Since my earlier post, I have achieved reliable operation and pushed the workarounds I had in obuilder upstream into Lwt. Furthermore, I have switched from a JSON configuration file per layer to an S-expression format, as this better matches the existing style, and the PPX deriver was already installed. There have also been numerous other small clean-ups.</p>

<p>Containerd uses the Windows Host Network Service (HNS), as does Docker. Docker creates a new network at boot with a random subnet. In the extract below, the network is 172.17.32.0/20.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PS C:\Users\Administrator&gt; Get-HnsNetwork


ActivityId             : 0F32EF26-00D8-4B04-BB0A-57F18075F9EA
AdditionalParams       : 
CurrentEndpointCount   : 1
Extensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; 
                         Name=Microsoft Windows Filtering Platform}, 
                         @{Id=F74F241B-440F-4433-BB28-00F89EAD20D8; IsEnabled=False; 
                         Name=Microsoft Azure VFP Switch Extension}, 
                         @{Id=430BDADD-BAB0-41AB-A369-94B67FA5BE0A; IsEnabled=True; Name=Microsoft 
                         NDIS Capture}}
Flags                  : 8
Health                 : @{LastErrorCode=0; LastUpdateTime=134170237512475197}
ID                     : 4EE1C263-FD69-45F9-8F4D-1D7137222B79
IPv6                   : False
LayeredOn              : FBA38879-AA6A-48AF-AD6D-35127F74313A
MacPools               : {@{EndMacAddress=00-15-5D-D2-1F-FF; StartMacAddress=00-15-5D-D2-10-00}}
MaxConcurrentEndpoints : 3
Name                   : nat
NatName                : NAT9A2D26A3-7226-46EE-9D96-5CDA0BF27595
Policies               : {@{Type=VLAN; VLAN=1}}
State                  : 1
Subnets                : {@{AdditionalParams=; AddressPrefix=172.17.32.0/20; Flags=0; 
                         GatewayAddress=172.17.32.1; Health=; 
                         ID=FD5E1DC1-71A1-4669-94D1-AD980E405535; IpSubnets=System.Object[]; 
                         ObjectType=5; Policies=System.Object[]; State=0}}
SwitchGuid             : 4EE1C263-FD69-45F9-8F4D-1D7137222B79
TotalEndpoints         : 13
Type                   : nat
Version                : 68719476736
Resources              : @{AdditionalParams=; AllocationOrder=2; Allocators=System.Object[];        
                         CompartmentOperationTime=0; Flags=0; Health=; 
                         ID=0F32EF26-00D8-4B04-BB0A-57F18075F9EA; PortOperationTime=0; State=1;     
                         SwitchOperationTime=0; VfpOperationTime=0;
                         parentId=95C9A579-958E-4991-A38A-A15BA23F39D9}
</code></pre></div></div>
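<p>For reference, the range covered by this randomly assigned /20 can be checked with Python's <code class="language-plaintext highlighter-rouge">ipaddress</code> module (my addition, just to make the numbers in the extract concrete):</p>

```python
import ipaddress

# Docker's randomly chosen NAT subnet from the Get-HnsNetwork output above.
net = ipaddress.ip_network("172.17.32.0/20")

print(net.num_addresses)                            # 4096 addresses in a /20
print(net.broadcast_address)                        # 172.17.47.255
# The gateway reported by HNS is the first usable host address.
print(ipaddress.ip_address("172.17.32.1") in net)   # True
```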

<p>I had been running these commands on startup:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Get-HnsNetwork | Where-Object { $_.Name -eq 'nat' } | Remove-HnsNetwork
New-HnsNetwork -Type nat -Name nat -AddressPrefix '172.20.0.0/16' -Gateway '172.20.0.1'
</code></pre></div></div>

<p>And setting the network configuration to <code class="language-plaintext highlighter-rouge">172.20.0.0/16</code> in <code class="language-plaintext highlighter-rouge">C:\Program Files\containerd\cni\conf\0-containerd-nat.conf</code>. However, this broke <code class="language-plaintext highlighter-rouge">docker build</code> as it could not find the network it was expecting:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>failed to create endpoint vibrant_tu on network nat: failed during hnsCallRawResponse: hnsCall failed in Win32: Element not found. (0x490)
</code></pre></div></div>

<p>Changing direction, I have instead set up <code class="language-plaintext highlighter-rouge">fix-nat.ps1</code> as a scheduled task run at reboot, aligning containerd’s configuration with Docker’s.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Read Docker's existing NAT network configuration and write it into the
# containerd CNI config so both runtimes share the same subnet.
Import-Module c:\windows\hns.psm1
$net = Get-HnsNetwork | Where-Object { $_.Name -eq 'nat' }
if (-not $net) {
    Write-Error "No NAT network found"
    exit 1
}
$subnet = $net.Subnets[0].AddressPrefix
$gateway = $net.Subnets[0].GatewayAddress
$json = @{
    cniVersion = "0.3.0"
    name = "nat"
    type = "nat"
    master = "Ethernet"
    ipam = @{
        subnet = $subnet
        routes = @(@{ gateway = $gateway })
    }
    capabilities = @{
        portMappings = $true
        dns = $true
    }
} | ConvertTo-Json -Depth 3
$json | Set-Content 'C:\Program Files\containerd\cni\conf\0-containerd-nat.conf' -Encoding ASCII
Write-Host "CNI config updated: subnet=$subnet gateway=$gateway"
</code></pre></div></div>

<p>Here is the log of a successful run from OCaml-CI: <a href="https://ocaml.ci.dev/github/mtelvers/mandelbrot/commit/14e08f30f087994a19822546a55405d078acd0d3/variant/windows-server-mingw-ltsc2025-5.4_opam-2.5">mtelvers/mandelbrot/commit/14e08f30f087994a19822546a55405d078acd0d3/variant/windows-server-mingw-ltsc2025-5.4_opam-2.5</a></p>

<p>PRs</p>

<ul>
  <li><a href="https://github.com/ocurrent/obuilder/pull/204">ocurrent/obuilder/pull/204</a></li>
  <li><a href="https://github.com/ocurrent/ocluster/pull/258">ocurrent/ocluster/pull/258</a></li>
  <li><a href="https://github.com/ocurrent/ocaml-ci/pull/1041">ocurrent/ocaml-ci/pull/1041</a></li>
  <li><a href="https://github.com/ocsigen/lwt/pull/1103">ocsigen/lwt/pull/1103</a></li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml-ci,obuilder" /><category term="tunbury.org" /><summary type="html"><![CDATA[Following from post last week about obuilder and Windows Host Compute Services, I am pleased to report that this is now running on OCaml-CI. In this early phase, I have enabled testing only on Windows 2025 with OCaml 5.4 and opam 2.5 using the MinGW toolchain.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml 5 native 32-bit backends: i386 and PPC32</title><link href="https://www.tunbury.org/2026/03/03/32bit-backends/" rel="alternate" type="text/html" title="OCaml 5 native 32-bit backends: i386 and PPC32" /><published>2026-03-03T14:30:00+00:00</published><updated>2026-03-03T14:30:00+00:00</updated><id>https://www.tunbury.org/2026/03/03/32bit-backends</id><content type="html" xml:base="https://www.tunbury.org/2026/03/03/32bit-backends/"><![CDATA[<p>Following on from the <a href="/2025/11/27/ocaml-54-native/">Arm32 multicore backend</a>, I have now ported the remaining two 32-bit architectures to OCaml 5 with multicore support: i386 and PowerPC 32-bit (PPC32).</p>

<p>OCaml 5’s multicore runtime needs per-domain state: the allocation pointer, exception handler, GC data and so on. On 64-bit platforms, there are registers to spare, but on 32-bit architectures, particularly i386, there are far fewer. I also wanted to retain the shared source files of the ppc64/ppc32 backend, which caused further problems.</p>

<h1 id="design-choices">Design choices</h1>

<h2 id="i386-thread-local-storage-via-gs">i386: Thread-local storage via %gs</h2>

<p>The i386 architecture has only 7 general-purpose registers. I initially tried dedicating one to the domain state pointer, but with only 6 remaining registers, the graph colouring register allocator could not find a valid allocation for many programs. Instead, the i386 backend uses the <code class="language-plaintext highlighter-rouge">%gs</code> segment register to access thread-local storage (TLS). Every time a domain state is needed, the compiler emits:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>movl %gs:caml_state@ntpoff, %ebx
</code></pre></div></div>

<p>This loads the domain state pointer on demand from the thread-local <code class="language-plaintext highlighter-rouge">caml_state</code> variable. It costs an extra instruction per access but keeps all general-purpose registers available for allocation. The <code class="language-plaintext highlighter-rouge">@ntpoff</code> relocation uses the local-exec TLS model, which is the fastest TLS access pattern on Linux. This mechanism is Linux/ELF-specific; on Windows, <code class="language-plaintext highlighter-rouge">%fs</code> is reserved for the Thread Information Block and <code class="language-plaintext highlighter-rouge">%gs</code> is not available for TLS in the same way, so a Windows port would need a different approach.</p>

<h2 id="ppc32-dedicated-register-r30">PPC32: Dedicated register r30</h2>

<p>PPC32 has 32 general-purpose registers, so dedicating one is affordable. Register r30 permanently holds the domain state pointer (<code class="language-plaintext highlighter-rouge">DOMAIN_STATE_PTR</code>), matching the approach used by Arm32 and the existing PPC64 backend. The allocation pointer lives in r31, and the exception handler pointer in r29. The PPC32 and PPC64 backends share the same source files (<code class="language-plaintext highlighter-rouge">emit.mlp</code>, <code class="language-plaintext highlighter-rouge">proc.ml</code>, <code class="language-plaintext highlighter-rouge">power.S</code>) with conditionals for the two modes, so keeping the same register assignments avoids divergence in shared code.</p>

<p>However, there were some challenges with position-independent code (PIC). On PPC32, calls to shared library functions go through the PLT (Procedure Linkage Table), and the PLT stubs use the GOT (Global Offset Table) to find the actual function addresses at runtime. The standard PPC32 secure-PLT convention uses r30 as the GOT base pointer, which conflicts directly with its use as <code class="language-plaintext highlighter-rouge">DOMAIN_STATE_PTR</code>. The solution was to bypass PLT stubs entirely, using a per-compilation-unit <code class="language-plaintext highlighter-rouge">.got2</code> section with PC-relative addressing for all external symbol references. This avoids the system GOT (which can overflow its 16-bit offset limit in large programs) and keeps r30 free for OCaml’s use.</p>

<p>Another interesting thing to note is that the PPC <code class="language-plaintext highlighter-rouge">bltl-</code> instruction used for allocation checks unconditionally clobbers the link register (LR) regardless of whether the branch is taken. This is per the PPC ISA specification (LR is set when LK=1), which means LR must be saved and restored in every function that has a stack frame, not just those that make explicit calls.</p>

<h1 id="benchmarks">Benchmarks</h1>

<p>Both backends were tested under QEMU using the same <a href="https://gist.github.com/mtelvers/def18d646a217c3219ba3e54c6d53bec">trivial prime counter</a> benchmark that I used for Arm32.</p>

<h2 id="i386-qemu-4-vcpus">i386 (QEMU, 4 vCPUs)</h2>

<table>
  <thead>
    <tr>
      <th>Mode</th>
      <th>Domains</th>
      <th>Time</th>
      <th>Speedup vs bytecode (1 domain)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Native</td>
      <td>4</td>
      <td>0.17s</td>
      <td>6.9x</td>
    </tr>
    <tr>
      <td>Native</td>
      <td>2</td>
      <td>0.35s</td>
      <td>3.4x</td>
    </tr>
    <tr>
      <td>Native</td>
      <td>1</td>
      <td>0.46s</td>
      <td>2.6x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>4</td>
      <td>0.50s</td>
      <td>2.4x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>2</td>
      <td>0.69s</td>
      <td>1.7x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>1</td>
      <td>1.18s</td>
      <td>1.0x</td>
    </tr>
  </tbody>
</table>

<h2 id="ppc32-qemu-1-vcpu">PPC32 (QEMU, 1 vCPU)</h2>

<table>
  <thead>
    <tr>
      <th>Mode</th>
      <th>Domains</th>
      <th>Time</th>
      <th>Speedup vs bytecode (1 domain)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Native</td>
      <td>2</td>
      <td>2.02s</td>
      <td>9.5x</td>
    </tr>
    <tr>
      <td>Native</td>
      <td>4</td>
      <td>2.09s</td>
      <td>9.2x</td>
    </tr>
    <tr>
      <td>Native</td>
      <td>1</td>
      <td>2.28s</td>
      <td>8.5x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>1</td>
      <td>19.28s</td>
      <td>1.0x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>4</td>
      <td>19.96s</td>
      <td>1.0x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>2</td>
      <td>20.53s</td>
      <td>0.9x</td>
    </tr>
  </tbody>
</table>

<p>The i386 results show real multicore scaling: native code with 4 domains is 2.7x faster than single-domain, and nearly 7x faster than single-domain bytecode. The PPC32 machine only has a single emulated CPU, so there is no multicore scaling, but the native backend is consistently 8-10x faster than bytecode. QEMU’s <code class="language-plaintext highlighter-rouge">mac99</code> machine does not support SMP, so testing true PPC32 parallelism will need either real hardware or a different emulation platform.</p>
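<p>The speedup column can be reproduced directly from the timings. A small script of my own, using the i386 figures with single-domain bytecode (1.18s) as the baseline:</p>

```python
# Times from the i386 table; baseline is single-domain bytecode.
baseline = 1.18
times = {"native-4": 0.17, "native-2": 0.35, "native-1": 0.46,
         "bytecode-4": 0.50, "bytecode-2": 0.69, "bytecode-1": 1.18}

for name, t in times.items():
    # e.g. native-4: 1.18 / 0.17 = 6.9x
    print(f"{name}: {baseline / t:.1f}x")
```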

<h1 id="test-suites">Test suites</h1>

<p>Both backends pass the OCaml test suite with only bytecode-related exceptions. On PPC32, the two failing tests (<code class="language-plaintext highlighter-rouge">lazy7</code> and <code class="language-plaintext highlighter-rouge">test_compact_manydomains</code>) both fail only in bytecode mode; the native backend passes everything.</p>

<h1 id="try-it">Try it</h1>

<p>Both backends are available on my fork:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/mtelvers/ocaml -b arm32-multicore
cd ocaml
./configure &amp;&amp; make world.opt &amp;&amp; make tests
</code></pre></div></div>

<p>The branch now supports Arm32, i386, and PPC32 architectures.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Following on from the Arm32 multicore backend, I have now ported the remaining two 32-bit architectures to OCaml 5 with multicore support: i386 and PowerPC 32-bit (PPC32).]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Tessera Pipeline</title><link href="https://www.tunbury.org/2026/02/25/teserra-pipeline/" rel="alternate" type="text/html" title="Tessera Pipeline" /><published>2026-02-25T20:45:00+00:00</published><updated>2026-02-25T20:45:00+00:00</updated><id>https://www.tunbury.org/2026/02/25/teserra-pipeline</id><content type="html" xml:base="https://www.tunbury.org/2026/02/25/teserra-pipeline/"><![CDATA[<p>Mainly for my future reference here is a walk-through of the Tessera pipeline.</p>

<h1 id="data-sources-and-acronyms">Data Sources and Acronyms</h1>

<p>The Sentinel-1 Radiometrically Terrain Corrected (RTC) collection on Microsoft Planetary Computer (MPC) provides processed C-band Synthetic Aperture Radar (SAR) data.</p>

<p>Observational Products for End-Users from Remote Sensing Analysis (OPERA) RTC SAR Backscatter from Sentinel-1 (RTC-S1) has a 30m resolution.</p>

<p>An S1 scene is typically 250km x 250km, but the exact dimensions vary.</p>

<p>Sentinel-1 makes two passes which view the ground from different angles.</p>
<ul>
  <li>Ascending: satellite moving south-to-north (evening pass, ~6pm local time)</li>
  <li>Descending: satellite moving north-to-south (morning pass, ~6am local time)</li>
</ul>

<p>Sentinel-1 transmits a vertically polarised radar pulse and records two return signals:</p>
<ul>
  <li>VV: vertical transmit, vertical receive — the “like-polarised” return sensitive to surface roughness and moisture (soil, water)</li>
  <li>VH: vertical transmit, horizontal receive — the “cross-polarised” return sensitive to volume scattering (vegetation canopy, forest structure)</li>
</ul>

<p>Sentinel-2 Level-2A (L2A) data provides surface reflectance images, formatted in 100km x 100km tiles based on the Military Grid Reference System (MGRS). These are 10,980 x 10,980 pixels at 10m resolution.</p>

<p>MGRS tiles are defined on Universal Transverse Mercator (UTM) projections, which are local flat approximations of the Earth’s surface.</p>

<p>Each “100km × 100km” tile is a 100km square in the local UTM coordinate system, which maps to a slightly trapezoidal shape on the actual Earth surface. The deviation from true square is small within a single tile (UTM distortion is &lt;0.04% within a zone), but it means tiles at different latitudes cover different amounts of actual ground area when measured in degrees.</p>
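<p>To put the 0.04% figure in context, a quick calculation of my own shows the worst-case distance error it implies:</p>

```python
# Worst-case UTM scale distortion within a zone, as quoted above: < 0.04%.
tile_edge_m = 100_000          # nominal MGRS tile edge in the UTM grid
max_distortion = 0.0004        # 0.04% expressed as a fraction

error_m = tile_edge_m * max_distortion
print(error_m)                 # about 40 m over a full tile edge

# Per 10 m Sentinel-2 pixel, the effect is a few millimetres - negligible.
print(10 * max_distortion * 1000)
```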

<p>Sentinel-2 is an optical sensor which looks straight down.</p>

<p>COG = Cloud-Optimised GeoTIFF.</p>

<p>STAC = SpatioTemporal Asset Catalog.</p>

<p>ROI = Region of Interest.</p>

<p>SCL = Scene Classification Layer.</p>

<h1 id="the-pipeline">The Pipeline</h1>

<p>The pipeline uses 0.1-degree blocks.</p>

<p>Load a GeoTIFF that defines the ROI’s spatial extent (CRS, bounds, resolution, dimensions) and a binary mask (1 = land, 0 = sea/skip). The bounds are reprojected to latitude/longitude for satellite data queries.</p>

<p>Query MPC or AWS for Sentinel-2 and Sentinel-1 data covering the ROI, for the entire year, filtered by cloud cover. S2 uses STAC on both sources; S1 uses STAC on MPC and NASA’s Common Metadata Repository (CMR) on AWS.</p>

<p>For Sentinel-2 data, there will be multiple passes, perhaps even on the same day. The cloud mask, SCL, is downloaded for all passes and used to identify valid (non-cloudy) dates. A second pass downloads the additional bands for the valid dates. This is nuanced, as a given day can be assembled from a mosaic of valid pixels rather than requiring an entirely cloud-free tile.</p>

<p>For Sentinel-1 data, both ascending and descending passes are collected for all available dates.</p>

<p>This results in three 4D arrays, one 3D mask, and three arrays of dates:</p>
<ul>
  <li>S2: [n_dates, H, W, 10] bands + [n_dates, H, W] masks + [n_dates] day-of-year</li>
  <li>S1: separate ascending and descending arrays [n_dates, H, W, 2] + [n_dates] DOYs each</li>
</ul>

<p>For each pixel, the model needs exactly 40 S2 timesteps and 40 S1 timesteps as input. Since there are typically more valid timesteps available, a sampling step selects which ones to use. The pipeline uses random selection to pick the dates to use. It supports multiple passes with averaging, though it defaults to a single pass.</p>
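<p>A minimal sketch of that sampling step (my own illustration, not the pipeline's actual code): given the valid dates for a pixel, pick 40 indices at random and keep them in date order.</p>

```python
import random

def sample_timesteps(n_valid, n_required=40, seed=0):
    """Choose which of the valid acquisition dates feed the model.

    Sketch only: random selection without replacement when enough dates
    exist, falling back to sampling with replacement otherwise.  The
    real pipeline also supports multiple passes with averaging.
    """
    rng = random.Random(seed)
    if n_valid >= n_required:
        idx = rng.sample(range(n_valid), n_required)
    else:
        idx = rng.choices(range(n_valid), k=n_required)
    return sorted(idx)  # keep the chosen timesteps in date order

print(len(sample_timesteps(57)))   # always 40, however many dates exist
print(len(sample_timesteps(12)))   # still 40, via replacement
```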

<p>The S2 input is shaped [40, 11]: 10 normalised spectral bands plus the day-of-year. The S1 input is [40, 3]: VV and VH (normalised) plus the day-of-year. Ascending and descending S1 passes are merged into a single pool before sampling.</p>

<p>Thus, for each 10m x 10m pixel, there are 40 S2 dates, each with 10 spectral bands, and for each of 40 (potentially different) S1 dates, there are VV and VH values. These are passed to the model, which produces a 128-dimensional float32 embedding per pixel.</p>
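<p>The embeddings are then compressed for storage. Here is a sketch of one plausible int8 quantisation scheme (symmetric max-abs scaling is my assumption; the post only specifies the resulting sizes):</p>

```python
import struct

def quantise_embedding(embedding):
    """Quantise a 128-d float vector to int8 plus one float32 scale.

    Sketch only: assumes symmetric max-abs scaling, which is a guess at
    the actual scheme; the storage arithmetic is what matters here.
    """
    scale = max(abs(v) for v in embedding) / 127.0 or 1.0
    q = [round(v / scale) for v in embedding]      # each in [-127, 127]
    return struct.pack("<f", scale) + struct.pack("128b", *q)

emb = [((i * 37) % 255 - 127) / 127.0 for i in range(128)]
blob = quantise_embedding(emb)
print(len(blob))        # 132 bytes: 4 (float32 scale) + 128 (int8)
print(128 * 4)          # 512 bytes for the full float32 embedding
```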

<p>In the final step, the 128-dimensional embeddings are quantised to int8 with a per-pixel float32 scale factor, reducing storage to 132 bytes per pixel, compared to 512 bytes for full float32.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="tessera" /><category term="tunbury.org" /><summary type="html"><![CDATA[Mainly for my future reference here is a walk-through of the Tessera pipeline.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/new_delhi_pca.png" /><media:content medium="image" url="https://www.tunbury.org/images/new_delhi_pca.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OBuilder on Windows: Bringing Native Container Builds with the HCS Backend</title><link href="https://www.tunbury.org/2026/02/19/obuilder-hcs/" rel="alternate" type="text/html" title="OBuilder on Windows: Bringing Native Container Builds with the HCS Backend" /><published>2026-02-19T19:25:00+00:00</published><updated>2026-02-19T19:25:00+00:00</updated><id>https://www.tunbury.org/2026/02/19/obuilder-hcs</id><content type="html" xml:base="https://www.tunbury.org/2026/02/19/obuilder-hcs/"><![CDATA[<p>Following from my containerd <a href="https://www.tunbury.org/2025/06/11/windows-containerd/">posts</a> <a href="https://www.tunbury.org/2025/06/14/windows-containerd-2/">last</a> <a href="https://www.tunbury.org/2025/06/27/windows-containerd-3/">year</a> and my previous work on obuilder backends for <a href="https://tarides.com/blog/2023-08-02-obuilder-on-macos/">macOS</a> and <a href="https://github.com/ocurrent/obuilder/pull/195">QEMU</a>, this post extends obuilder to use the Host Compute System (HCS) and <a href="https://containerd.io">containerd</a> on Windows.</p>

<p>OBuilder, written by Thomas Leonard, is a sandboxed build executor for OCaml CI pipelines. It takes a build specification, similar to a Dockerfile, but written in S-expression syntax, and executes each step in an isolated environment, caching results at the filesystem level.</p>

<p>OBuilder’s sandbox backends target Linux (via runc), macOS (via user sandboxing), FreeBSD (via jails), and Docker, with QEMU covering everything else. This post introduces the HCS backend, which brings native Windows container builds to OBuilder using Microsoft’s Host Compute Service and containerd.</p>

<h2 id="how-obuilder-works">How OBuilder Works</h2>

<p>Before looking at the Windows-specific details, let’s recap how OBuilder works.</p>

<h3 id="build-specifications">Build Specifications</h3>

<p>A typical OBuilder spec is shown below:</p>

<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">((</span><span class="nf">from</span> <span class="nv">ocaml/opam:debian</span><span class="p">)</span>
 <span class="p">(</span><span class="nf">workdir</span> <span class="nv">/src</span><span class="p">)</span>
 <span class="p">(</span><span class="nf">user</span> <span class="p">(</span><span class="nf">uid</span> <span class="mi">1000</span><span class="p">)</span> <span class="p">(</span><span class="nf">gid</span> <span class="mi">1000</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"sudo chown opam /src"</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">copy</span> <span class="p">(</span><span class="nf">src</span> <span class="nv">obuilder-spec</span><span class="o">.</span><span class="nv">opam</span> <span class="nv">obuilder</span><span class="o">.</span><span class="nv">opam</span><span class="p">)</span> <span class="p">(</span><span class="nf">dst</span> <span class="o">.</span><span class="nv">/</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"opam pin add -yn ."</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span>
  <span class="p">(</span><span class="nf">network</span> <span class="nv">host</span><span class="p">)</span>
  <span class="p">(</span><span class="nf">shell</span> <span class="s">"opam install --deps-only -t obuilder"</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">copy</span> <span class="p">(</span><span class="nf">src</span> <span class="o">.</span><span class="p">)</span> <span class="p">(</span><span class="nf">dst</span> <span class="nv">/src/</span><span class="p">)</span> <span class="p">(</span><span class="nf">exclude</span> <span class="o">.</span><span class="nv">git</span> <span class="nv">_build</span> <span class="nv">_opam</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"opam exec -- dune build @install @runtest"</span><span class="p">)))</span>
</code></pre></div></div>

<p>Each operation, such as <code class="language-plaintext highlighter-rouge">from</code>, <code class="language-plaintext highlighter-rouge">run</code>, <code class="language-plaintext highlighter-rouge">copy</code>, <code class="language-plaintext highlighter-rouge">workdir</code>, <code class="language-plaintext highlighter-rouge">env</code>, <code class="language-plaintext highlighter-rouge">shell</code>, is executed in sequence inside a sandboxed container. The resulting filesystem is the aggregation of all the previous steps and is recorded under the hash of every step up to that point. OBuilder reuses these layers as a cache, skipping any step whose hash it has already built.</p>
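<p>That caching scheme can be sketched in a few lines (my illustration in Python rather than OBuilder's OCaml; the real hash also covers inputs such as copied file contents, not just the operation text):</p>

```python
import hashlib

def layer_id(prev_hash: str, operation: str) -> str:
    """Cache key for the layer produced by `operation` on top of `prev_hash`."""
    h = hashlib.sha256()
    h.update(prev_hash.encode())
    h.update(operation.encode())
    return h.hexdigest()

base = layer_id("", "(from ocaml/opam:debian)")
step1 = layer_id(base, "(workdir /src)")
step2 = layer_id(step1, '(run (shell "opam pin add -yn ."))')

# A different base image invalidates every later layer's cache key.
alt = layer_id(layer_id("", "(from other:image)"), "(workdir /src)")
print(alt != step1)   # True
```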

<p>OBuilder’s functor architecture allows it to be easily extended by providing new store, sandbox, and fetcher implementations. The new Windows backend uses <code class="language-plaintext highlighter-rouge">hcs_store.ml</code>, <code class="language-plaintext highlighter-rouge">hcs_sandbox.ml</code> and <code class="language-plaintext highlighter-rouge">hcs_fetch.ml</code>.</p>

<h3 id="the-build-flow">The Build Flow</h3>

<p>When OBuilder processes a spec, it:</p>

<ol>
  <li>Fetches the base image (<code class="language-plaintext highlighter-rouge">from</code> directive) using the fetcher module</li>
  <li>For each operation, computes a content hash from the operation and its inputs</li>
  <li>Checks the cache: if a result for that hash exists, skip execution</li>
  <li>Creates a snapshot from the previous step’s result using the store module</li>
  <li>Runs the operation inside the sandbox using the sandbox module</li>
  <li>Commits the result as a new snapshot, keyed by the content hash</li>
</ol>

<p>This means repeat builds are very fast, and with carefully constructed spec files, incremental builds triggered by code changes can run without rebuilding the project dependencies (the opam switch).</p>

<h2 id="the-hcs-backend">The HCS Backend</h2>

<p>The Host Compute Service (HCS) backend enables native Windows container builds using <code class="language-plaintext highlighter-rouge">containerd</code>.</p>

<h3 id="architecture">Architecture</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌────────────────────────────────────────────────────┐
│              OBuilder CLI (main.ml)                │
│     obuilder build --store=hcs:C:\obuilder         │
└────────────────────────────────────────────────────┘
                        │
                        ▼
┌────────────────────────────────────────────────────┐
│            Builder Functor (build.ml)              │
│     Build.Make(Hcs_store)(Hcs_sandbox)(Hcs_fetch)  │
└────────────────────────────────────────────────────┘
        │                │                │
        ▼                ▼                ▼
  ┌───────────┐   ┌────────────┐   ┌───────────┐
  │ Hcs_store │   │Hcs_sandbox │   │ Hcs_fetch │
  │           │   │            │   │           │
  │ Snapshot  │   │ Container  │   │ Base image│
  │ mgmt via  │   │ exec via   │   │ import via│
  │ ctr snap  │   │ ctr run    │   │ ctr pull  │
  └───────────┘   └────────────┘   └───────────┘
        │                │                │
        └────────────────┼────────────────┘
                         ▼
┌────────────────────────────────────────────────────┐
│              containerd (Windows)                  │
│   Images  │  Snapshots (VHDX)  │  Runtime (HCS)    │
└────────────────────────────────────────────────────┘
</code></pre></div></div>

<h3 id="split-storage-model">Split Storage Model</h3>

<p>OBuilder backends typically use filesystem features, such as BTRFS or ZFS snapshots, to store the cache layers within the OBuilder results directory, typically <code class="language-plaintext highlighter-rouge">/var/cache/obuilder/results/&lt;hashid&gt;/rootfs</code>. However, HCS stores the actual filesystem snapshots in VHDX files in <code class="language-plaintext highlighter-rouge">C:\ProgramData\containerd\snapshots\&lt;N&gt;</code>, so the OBuilder results directory contains only a JSON file with a pointer to this directory.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>OBuilder Store (C:\obuilder\)         Containerd (C:\ProgramData\containerd\)
├── result\&lt;id&gt;\                      ├── snapshots\
│   ├── rootfs\                       │   ├── 1\    ← VHDX layer data
│   │   └── layerinfo.json ────────►  │   ├── 2\    ← VHDX layer data
│   ├── log                           │   └── 3\    ← VHDX layer data
│   └── env                           └── metadata.db
├── state\db\db.sqlite
└── cache\
</code></pre></div></div>

<h2 id="walking-through-a-build">Walking Through a Build</h2>

<p>Let’s trace what happens when you run:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">obuilder</span><span class="w"> </span><span class="nx">build</span><span class="w"> </span><span class="nt">-f</span><span class="w"> </span><span class="nx">example.windows.hcs.spec</span><span class="w"> </span><span class="o">.</span><span class="w"> </span><span class="nt">--store</span><span class="o">=</span><span class="n">hcs:C:\obuilder</span><span class="w">
</span></code></pre></div></div>

<p>with the following spec:</p>

<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">((</span><span class="nf">from</span> <span class="nv">mcr</span><span class="o">.</span><span class="nv">microsoft</span><span class="o">.</span><span class="nv">com/windows/nanoserver:ltsc2025</span><span class="p">)</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"echo hello"</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"mkdir C:\\app"</span><span class="p">)))</span>
</code></pre></div></div>

<h3 id="step-1-fetch-the-base-image-hcs_fetchml">Step 1: Fetch the Base Image (hcs_fetch.ml)</h3>

<p>The fetcher pulls the base image from the Microsoft Container Registry and prepares an initial snapshot.</p>

<p>First, it normalises the image reference. Docker Hub images need a <code class="language-plaintext highlighter-rouge">docker.io/</code> prefix for containerd (e.g. <code class="language-plaintext highlighter-rouge">ubuntu:latest</code> becomes <code class="language-plaintext highlighter-rouge">docker.io/library/ubuntu:latest</code>), but Microsoft Container Registry (MCR) images are used as-is.</p>
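<p>As a sketch, the normalisation rule might look like the following OCaml. The helper name and the exact heuristic are illustrative, not the actual <code class="language-plaintext highlighter-rouge">hcs_fetch.ml</code> code: a reference whose first segment looks like a registry hostname passes through unchanged, while anything else gets the <code class="language-plaintext highlighter-rouge">docker.io/</code> prefix.</p>

```ocaml
(* Illustrative sketch: normalise an image reference for containerd.
   Docker Hub references gain a docker.io/ prefix; references that
   already name a registry host (first segment contains '.' or ':')
   are used as-is. *)
let normalise_image reference =
  match String.index_opt reference '/' with
  | None ->
      (* e.g. "ubuntu:latest" has no registry or namespace *)
      "docker.io/library/" ^ reference
  | Some i ->
      let first = String.sub reference 0 i in
      if String.contains first '.' || String.contains first ':' then
        reference (* e.g. "mcr.microsoft.com/windows/nanoserver:ltsc2025" *)
      else "docker.io/" ^ reference
```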

<p>The equivalent manual commands are:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Pull the image</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">image</span><span class="w"> </span><span class="nx">pull</span><span class="w"> </span><span class="nx">mcr.microsoft.com/windows/nanoserver:ltsc2025</span><span class="w">

</span><span class="c"># Get the chain ID (the snapshot key for the image's top layer)</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">images</span><span class="w"> </span><span class="nx">pull</span><span class="w"> </span><span class="nt">--print-chainid</span><span class="w"> </span><span class="nt">--local</span><span class="w"> </span><span class="nx">mcr.microsoft.com/windows/nanoserver:ltsc2025</span><span class="w">
</span><span class="c"># Output includes: "image chain ID: sha256:abc123..."</span><span class="w">

</span><span class="c"># Prepare a writable snapshot from the image</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">snapshot</span><span class="w"> </span><span class="nx">prepare</span><span class="w"> </span><span class="nt">--mounts</span><span class="w"> </span><span class="nx">obuilder-base-</span><span class="err">&lt;</span><span class="nx">hash</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">sha256:abc123...</span><span class="w">
</span><span class="c"># Returns JSON with mount information:</span><span class="w">
</span><span class="c"># [{"Type":"windows-layer","Source":"C:\\...\\snapshots\\42",</span><span class="w">
</span><span class="c">#   "Options":["rw","parentLayerPaths=[\"C:\\\\...\\\\snapshots\\\\20\"]"]}]</span><span class="w">
</span></code></pre></div></div>

<p>The fetcher parses this mount JSON to extract the source path and parent layer paths, then writes <code class="language-plaintext highlighter-rouge">layerinfo.json</code>:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"snapshot_key"</span><span class="p">:</span><span class="w"> </span><span class="s2">"obuilder-base-&lt;hash&gt;"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"source"</span><span class="p">:</span><span class="w"> </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">42"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"parent_layer_paths"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
    </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">20"</span><span class="p">,</span><span class="w">
    </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">21"</span><span class="w">
  </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Finally, it extracts environment variables from the image config:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Get the config digest</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">images</span><span class="w"> </span><span class="nx">inspect</span><span class="w"> </span><span class="nx">mcr.microsoft.com/windows/nanoserver:ltsc2025</span><span class="w">
</span><span class="c"># Look for: "application/vnd.docker.container.image.v1+json @sha256:def456..."</span><span class="w">

</span><span class="c"># Get the config content</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">content</span><span class="w"> </span><span class="nx">get</span><span class="w"> </span><span class="nx">sha256:def456...</span><span class="w">
</span><span class="c"># Parse the config.Env array from the JSON</span><span class="w">
</span></code></pre></div></div>

<h3 id="step-2-run-echo-hello-hcs_storeml--hcs_sandboxml">Step 2: Run “echo hello” (hcs_store.ml + hcs_sandbox.ml)</h3>

<p>For each <code class="language-plaintext highlighter-rouge">run</code> directive, the store creates a new snapshot from the previous step, the sandbox executes the command, and the store commits the result.</p>

<h4 id="store-prepare-a-snapshot">Store: prepare a snapshot</h4>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Read layerinfo.json from parent to get its snapshot key</span><span class="w">
</span><span class="c"># Prepare a new writable snapshot from the parent's committed snapshot</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">snapshot</span><span class="w"> </span><span class="nx">prepare</span><span class="w"> </span><span class="nt">--mounts</span><span class="w"> </span><span class="nx">obuilder-</span><span class="err">&lt;</span><span class="nx">id2</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">obuilder-base-</span><span class="err">&lt;</span><span class="nx">hash</span><span class="err">&gt;</span><span class="nt">-committed</span><span class="w">
</span></code></pre></div></div>

<h4 id="sandbox-generate-oci-config-and-run">Sandbox: generate OCI config and run</h4>

<p>The sandbox reads <code class="language-plaintext highlighter-rouge">layerinfo.json</code> and generates an OCI runtime config:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"ociVersion"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1.1.0"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"process"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"terminal"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w">
    </span><span class="nl">"user"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"username"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ContainerUser"</span><span class="w"> </span><span class="p">},</span><span class="w">
    </span><span class="nl">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"cmd"</span><span class="p">,</span><span class="w"> </span><span class="s2">"/S"</span><span class="p">,</span><span class="w"> </span><span class="s2">"/C"</span><span class="p">,</span><span class="w"> </span><span class="s2">"echo hello"</span><span class="p">],</span><span class="w">
    </span><span class="nl">"env"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"PATH=C:</span><span class="se">\\</span><span class="s2">Windows</span><span class="se">\\</span><span class="s2">System32;C:</span><span class="se">\\</span><span class="s2">Windows"</span><span class="p">],</span><span class="w">
    </span><span class="nl">"cwd"</span><span class="p">:</span><span class="w"> </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">"</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="nl">"root"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span><span class="p">,</span><span class="w"> </span><span class="nl">"readonly"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w"> </span><span class="p">},</span><span class="w">
  </span><span class="nl">"hostname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"builder"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"windows"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"layerFolders"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
      </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">20"</span><span class="p">,</span><span class="w">
      </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">21"</span><span class="p">,</span><span class="w">
      </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">42"</span><span class="p">,</span><span class="w">
      </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">...</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">43"</span><span class="w">
    </span><span class="p">]</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">layerFolders</code> array lists all parent layers followed by the writable scratch layer. This is the Windows container equivalent of an overlay filesystem: HCS merges all of these layers together when the container starts.</p>
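<p>Two details of the generated config can be sketched in OCaml: the <code class="language-plaintext highlighter-rouge">layerFolders</code> ordering (parent layers first, writable scratch layer last) and the <code class="language-plaintext highlighter-rouge">cmd /S /C</code> argument wrapping. Both function names are illustrative, not the actual <code class="language-plaintext highlighter-rouge">hcs_sandbox.ml</code> code.</p>

```ocaml
(* Illustrative sketch of two pieces of the OCI config shown above:
   layer_folders lists the parent layers (oldest first) followed by
   the writable scratch layer; shell_args wraps the spec's shell
   command for cmd.exe. *)
let layer_folders ~parent_layer_paths ~scratch = parent_layer_paths @ [ scratch ]
let shell_args command = [ "cmd"; "/S"; "/C"; command ]
```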

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Run the container</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">run</span><span class="w"> </span><span class="nt">--rm</span><span class="w"> </span><span class="nt">--config</span><span class="w"> </span><span class="nx">config.json</span><span class="w"> </span><span class="nx">obuilder-run-0</span><span class="w">
</span></code></pre></div></div>

<h4 id="store-commit-the-result">Store: commit the result</h4>

<p>After the command succeeds:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Commit the writable snapshot to a permanent one</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">snapshot</span><span class="w"> </span><span class="nx">commit</span><span class="w"> </span><span class="nx">obuilder-</span><span class="err">&lt;</span><span class="nx">id2</span><span class="err">&gt;</span><span class="nt">-committed</span><span class="w"> </span><span class="nx">obuilder-</span><span class="err">&lt;</span><span class="nx">id2</span><span class="err">&gt;</span><span class="w">
</span></code></pre></div></div>

<p>The result directory is then moved from <code class="language-plaintext highlighter-rouge">result-tmp/&lt;id2&gt;</code> to <code class="language-plaintext highlighter-rouge">result/&lt;id2&gt;</code>.</p>

<h3 id="step-3-run-mkdir-capp">Step 3: Run “mkdir C:\app”</h3>

<p>The process repeats: prepare a snapshot from <code class="language-plaintext highlighter-rouge">obuilder-&lt;id2&gt;-committed</code>, run the command, commit the result. Each step builds on the previous one, forming a chain of containerd snapshots.</p>

<h2 id="networking">Networking</h2>

<p>Windows containers don’t support <code class="language-plaintext highlighter-rouge">--net-host</code> in the way Linux containers do. Instead, network access requires three components working together:</p>

<ol>
  <li>A Host Network Service (HNS) NAT network with a specific subnet</li>
  <li>A Container Network Interface (CNI) config at <code class="language-plaintext highlighter-rouge">C:\Program Files\containerd\cni\conf\0-containerd-nat.conf</code> matching that subnet</li>
  <li>An HCN namespace per container</li>
</ol>

<p>The sandbox creates and destroys HCN namespaces around each networked container execution:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Before the container</span><span class="w">
</span><span class="n">hcn-namespace</span><span class="w"> </span><span class="nx">create</span><span class="w">
</span><span class="c"># Returns a GUID, e.g. "a1b2c3d4-..."</span><span class="w">

</span><span class="c"># The GUID is passed in the OCI config:</span><span class="w">
</span><span class="c"># "windows": { "network": { "networkNamespace": "a1b2c3d4-..." } }</span><span class="w">

</span><span class="c"># Run with --cni flag</span><span class="w">
</span><span class="n">ctr</span><span class="w"> </span><span class="nx">run</span><span class="w"> </span><span class="nt">--rm</span><span class="w"> </span><span class="nt">--cni</span><span class="w"> </span><span class="nt">--config</span><span class="w"> </span><span class="nx">config.json</span><span class="w"> </span><span class="nx">obuilder-run-0</span><span class="w">

</span><span class="c"># After the container</span><span class="w">
</span><span class="n">hcn-namespace</span><span class="w"> </span><span class="nx">delete</span><span class="w"> </span><span class="nx">a1b2c3d4-...</span><span class="w">
</span></code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">hcn-namespace</code> tool is a small OCaml utility (<a href="https://github.com/mtelvers/hcn-namespace">mtelvers/hcn-namespace</a>) that wraps the Windows HCN API; I wrote it last year while working on <code class="language-plaintext highlighter-rouge">day10</code>.</p>
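<p>The create/run/delete lifecycle can be sketched as follows, with <code class="language-plaintext highlighter-rouge">create</code> and <code class="language-plaintext highlighter-rouge">delete</code> standing in for invocations of the <code class="language-plaintext highlighter-rouge">hcn-namespace</code> tool. <code class="language-plaintext highlighter-rouge">Fun.protect</code> ensures the namespace is deleted even if the build step fails; this mirrors the behaviour described above rather than the actual sandbox code.</p>

```ocaml
(* Illustrative sketch: create a namespace, run the step with its
   GUID, and always delete the namespace afterwards, even if the
   step raises. *)
let with_hcn_namespace ~create ~delete f =
  let guid = create () in
  Fun.protect ~finally:(fun () -> delete guid) (fun () -> f guid)
```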

<h2 id="the-copy-operation">The COPY Operation</h2>

<p>File copying works differently on Windows due to I/O constraints. On Linux, OBuilder streams tar data through a pipe directly into the sandbox’s stdin. On Windows, the tar data is first written to a temporary file, then the file is passed as stdin to the container:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Linux:   generate tar  ──pipe──►  sandbox stdin  ──►  tar -xf -
Windows: generate tar  ──►  temp file  ──►  sandbox stdin  ──►  tar -xf -
</code></pre></div></div>

<p>This extra step is needed because Lwt’s pipe I/O is unreliable on Windows (more on this below).</p>
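<p>The staging step can be sketched as: write the tar stream to a temporary file, then open that file as the sandbox’s stdin. The helper below is illustrative and uses only the OCaml standard library.</p>

```ocaml
(* Illustrative sketch: materialise the generated tar data in a
   temporary file; the returned path is later opened as the
   container's stdin instead of streaming through a pipe. *)
let stage_to_temp_file write_data =
  let path = Filename.temp_file "obuilder-copy" ".tar" in
  let oc = open_out_bin path in
  Fun.protect ~finally:(fun () -> close_out oc) (fun () -> write_data oc);
  path
```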

<h2 id="running-it">Running It</h2>

<h3 id="prerequisites">Prerequisites</h3>

<ol>
  <li>Windows Server 2019 or later (tested on LTSC 2019 and LTSC 2025)</li>
  <li>Containerd v2.0+ installed and running as a service</li>
  <li>The <code class="language-plaintext highlighter-rouge">ctr</code> CLI available in PATH</li>
  <li>The <a href="https://github.com/mtelvers/hcn-namespace">hcn-namespace</a> tool for networking support</li>
</ol>

<h3 id="building-obuilder-on-windows">Building OBuilder on Windows</h3>

<p>OBuilder builds itself — the provided <code class="language-plaintext highlighter-rouge">example.windows.hcs.spec</code> bootstraps the build using an MSVC-based OCaml image:</p>

<div class="language-scheme highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">((</span><span class="nf">from</span> <span class="nv">ocaml/opam:windows-server-msvc-ltsc2025-ocaml-5</span><span class="o">.</span><span class="mi">4</span><span class="p">)</span>
 <span class="p">(</span><span class="nf">workdir</span> <span class="s">"C:/src"</span><span class="p">)</span>
 <span class="p">(</span><span class="nf">copy</span> <span class="p">(</span><span class="nf">src</span> <span class="nv">obuilder-spec</span><span class="o">.</span><span class="nv">opam</span> <span class="nv">obuilder</span><span class="o">.</span><span class="nv">opam</span><span class="p">)</span> <span class="p">(</span><span class="nf">dst</span> <span class="o">.</span><span class="nv">/</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"echo (lang dune 3.0)&gt; dune-project"</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"opam pin add -yn ."</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">network</span> <span class="nv">host</span><span class="p">)</span>
  <span class="p">(</span><span class="nf">shell</span> <span class="s">"opam install --deps-only -t obuilder"</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">copy</span> <span class="p">(</span><span class="nf">src</span> <span class="o">.</span><span class="p">)</span> <span class="p">(</span><span class="nf">dst</span> <span class="s">"C:/src/"</span><span class="p">)</span> <span class="p">(</span><span class="nf">exclude</span> <span class="o">.</span><span class="nv">git</span> <span class="nv">_build</span> <span class="nv">_opam</span><span class="p">))</span>
 <span class="p">(</span><span class="nf">run</span> <span class="p">(</span><span class="nf">shell</span> <span class="s">"opam exec -- dune build @install @runtest"</span><span class="p">)))</span>
</code></pre></div></div>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">obuilder</span><span class="w"> </span><span class="nx">build</span><span class="w"> </span><span class="nt">-f</span><span class="w"> </span><span class="nx">example.windows.hcs.spec</span><span class="w"> </span><span class="o">.</span><span class="w"> </span><span class="nt">--store</span><span class="o">=</span><span class="n">hcs:C:\obuilder</span><span class="w">
</span></code></pre></div></div>

<h3 id="healthcheck">Healthcheck</h3>

<p>To verify the setup:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">obuilder</span><span class="w"> </span><span class="nx">healthcheck</span><span class="w"> </span><span class="nt">--store</span><span class="o">=</span><span class="n">hcs:C:\obuilder</span><span class="w">
</span></code></pre></div></div>

<p>This pulls <code class="language-plaintext highlighter-rouge">mcr.microsoft.com/windows/nanoserver:ltsc2025</code>, runs <code class="language-plaintext highlighter-rouge">echo healthcheck</code> inside a container, and confirms everything works end-to-end.</p>

<h2 id="addendum-lwt-on-windows">Addendum: Lwt on Windows</h2>

<p>Development of the HCS backend highlighted several issues with Lwt on Windows:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Lwt_process.exec</code> child promise isn’t resolved</li>
  <li><code class="language-plaintext highlighter-rouge">Lwt_unix.waitpid</code> hangs indefinitely unless created with <code class="language-plaintext highlighter-rouge">cmd.exe /c</code></li>
  <li><code class="language-plaintext highlighter-rouge">Lwt_unix.write</code> can randomly hang, affecting tar and log streaming</li>
  <li><code class="language-plaintext highlighter-rouge">Lwt_io.with_file</code> fails with “Permission denied”</li>
  <li><code class="language-plaintext highlighter-rouge">Os.pread_result</code> works intermittently, but frequently fails with <code class="language-plaintext highlighter-rouge">ctr</code></li>
</ul>
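<p>The <code class="language-plaintext highlighter-rouge">cmd.exe /c</code> workaround for the <code class="language-plaintext highlighter-rouge">waitpid</code> hang can be sketched as a one-line wrapper; this is illustrative only, as the real fixes have been pushed into Lwt itself.</p>

```ocaml
(* Illustrative sketch: wrap a command line in cmd.exe /S /C on
   Windows so that process termination is reported reliably. *)
let wrap_command ~win32 cmdline =
  if win32 then "cmd.exe /S /C \"" ^ cmdline ^ "\"" else cmdline
```

<p>Passing <code class="language-plaintext highlighter-rouge">~win32:Sys.win32</code> selects the behaviour at runtime.</p>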

<h2 id="code">Code</h2>

<p>My code is available at <a href="https://github.com/mtelvers/obuilder/tree/hcs">mtelvers/obuilder/tree/hcs</a>. I also have patches for ocluster and OCaml-CI, but the Lwt issues above remain the main reliability concern.</p>

<p>Assume package A depends upon B and C, while package B depends upon D and E, as represented by the graph below. <code class="language-plaintext highlighter-rouge">day10</code> would build package D in isolation, capturing the files written to the opam switch and the operating system dependencies. This is repeated for the other leaf packages, E and C. The sets of changed files for D and E are then merged into a new switch, and package B is installed in that switch using the same capturing methodology. For package A, the file sets for D, E, C and B are merged, in that order, and package A is installed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    A
   / \
  B   C
 / \
D   E
</code></pre></div></div>
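<p>The merge step can be sketched as a fold over the captured layer file sets in dependency order, where a later layer’s entry wins on a conflicting path. The map-of-strings representation is illustrative; <code class="language-plaintext highlighter-rouge">day10</code> works with real directory trees.</p>

```ocaml
module FileMap = Map.Make (String)

(* Illustrative sketch: merge captured layer file sets in dependency
   order; on a conflicting path, the later layer's file wins. *)
let merge_layers layers =
  List.fold_left
    (fun acc layer ->
      FileMap.union (fun _path _older newer -> Some newer) acc layer)
    FileMap.empty layers
```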

<p>On its own, this is slower than using opam to create the same switch, as opam processes these steps in parallel. However, to create a new switch for package F, <code class="language-plaintext highlighter-rouge">day10</code> can reuse the file sets for B, D, and E without recreating them.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    F
   / \
  B   G
 / \
D   E
</code></pre></div></div>

<p>In general, each package is installed exactly once and reused on other switches. However, in some cases, packages enable different functionality depending on which other packages are installed. <code class="language-plaintext highlighter-rouge">logs</code> is a good example, with optional libraries such as <code class="language-plaintext highlighter-rouge">fmt</code>, <code class="language-plaintext highlighter-rouge">cmdliner</code>, <code class="language-plaintext highlighter-rouge">lwt</code>, etc. In this case, the package would be installed once for each dependency combination.</p>
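<p>One way to sketch this is to make a layer’s identity depend on the sorted dependency set as well as the package itself, so <code class="language-plaintext highlighter-rouge">logs</code> built alongside <code class="language-plaintext highlighter-rouge">fmt</code> hashes differently from <code class="language-plaintext highlighter-rouge">logs</code> alone. The MD5-based scheme below is an assumption, not necessarily the hash <code class="language-plaintext highlighter-rouge">day10</code> actually uses.</p>

```ocaml
(* Illustrative sketch: a layer's identity covers the package plus
   the sorted set of co-installed dependencies, so the same package
   built against different optional dependencies gets a different
   layer hash. *)
let layer_id ~package ~deps =
  let canonical = String.concat ";" (package :: List.sort compare deps) in
  Digest.to_hex (Digest.string canonical)
```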

<p>The original concept of merging files and recreating the switch came from Jon’s opam hijinx tool, <a href="https://github.com/jonludlam/opamh">jonludlam/opamh</a>. This functionality is distilled in <code class="language-plaintext highlighter-rouge">day10</code> into the function below, which builds the switch state from the directory listing of the installed packages.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">dump_state</span> <span class="n">packages_dir</span> <span class="n">state_file</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">content</span> <span class="o">=</span> <span class="nn">Sys</span><span class="p">.</span><span class="n">readdir</span> <span class="n">packages_dir</span> <span class="o">|&gt;</span> <span class="nn">Array</span><span class="p">.</span><span class="n">to_list</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">packages</span> <span class="o">=</span> <span class="nn">List</span><span class="p">.</span><span class="n">filter_map</span> <span class="p">(</span><span class="k">fun</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="nn">OpamPackage</span><span class="p">.</span><span class="n">of_string_opt</span> <span class="n">x</span><span class="p">)</span> <span class="n">content</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">sel_compiler</span> <span class="o">=</span> <span class="nn">List</span><span class="p">.</span><span class="n">filter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">mem</span> <span class="p">(</span><span class="nn">OpamPackage</span><span class="p">.</span><span class="n">name</span> <span class="n">x</span><span class="p">)</span> <span class="n">compiler_packages</span><span class="p">)</span> <span class="n">packages</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">new_state</span> <span class="o">=</span>
    <span class="k">let</span> <span class="n">s</span> <span class="o">=</span> <span class="nn">OpamPackage</span><span class="p">.</span><span class="nn">Set</span><span class="p">.</span><span class="n">of_list</span> <span class="n">packages</span> <span class="k">in</span>
    <span class="p">{</span> <span class="nn">OpamTypes</span><span class="p">.</span><span class="n">sel_installed</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span> <span class="n">sel_roots</span> <span class="o">=</span> <span class="n">s</span><span class="p">;</span> <span class="n">sel_pinned</span> <span class="o">=</span> <span class="nn">OpamPackage</span><span class="p">.</span><span class="nn">Set</span><span class="p">.</span><span class="n">empty</span><span class="p">;</span> <span class="n">sel_compiler</span> <span class="o">=</span> <span class="nn">OpamPackage</span><span class="p">.</span><span class="nn">Set</span><span class="p">.</span><span class="n">of_list</span> <span class="n">sel_compiler</span> <span class="p">}</span>
  <span class="k">in</span>
  <span class="nn">OpamFilename</span><span class="p">.</span><span class="n">write</span> <span class="p">(</span><span class="nn">OpamFilename</span><span class="p">.</span><span class="n">raw</span> <span class="n">state_file</span><span class="p">)</span> <span class="p">(</span><span class="nn">OpamFile</span><span class="p">.</span><span class="nn">SwitchSelections</span><span class="p">.</span><span class="n">write_to_string</span> <span class="n">new_state</span><span class="p">)</span>
</code></pre></div></div>

<p>opam could be used to install the package in the “recreated” switch, but opam does unnecessary checks, such as finding and checking whether the necessary dependencies are installed. This led to the tool <a href="https://github.com/mtelvers/opam-build">mtelvers/opam-build</a>, which assumes everything is already in place and calls the opam library to install the package without any checks!</p>

<p>The dependency graph includes the compiler, so package ‘D’ might be OCaml 5.4.0, and package ‘E’ might be an OS dependency like ‘conf-curl’, and the captured layer would include <code class="language-plaintext highlighter-rouge">libcurl.so</code>. The underlying OS distribution and version are also captured, so <code class="language-plaintext highlighter-rouge">logs</code> on Debian is assumed to be different to <code class="language-plaintext highlighter-rouge">logs</code> on Fedora.</p>

<p>On Linux, <code class="language-plaintext highlighter-rouge">day10</code> uses overlayfs with <code class="language-plaintext highlighter-rouge">runc</code>. Overlayfs has the concept of a read-only lower directory and a writable upper directory. While these can be stacked, the depth is limited. Therefore, <code class="language-plaintext highlighter-rouge">day10</code> assembles the lower directory by creating a filesystem tree of hard links to the originally captured files, and an initially empty upper directory is used to capture the files written to it. On FreeBSD, unionfs is used similarly with <code class="language-plaintext highlighter-rouge">jails</code>. On Windows, <code class="language-plaintext highlighter-rouge">containerd</code> is used, but the filesystem isn’t isolated, as the hard-linked directory is writable. This hasn’t presented a problem in day-to-day use.</p>
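<p>The hard-link assembly can be sketched as a pure planning step that pairs each captured file with its destination in the lower directory; a real implementation would then walk the list calling <code class="language-plaintext highlighter-rouge">Unix.link</code>. The function name and shape are illustrative.</p>

```ocaml
(* Illustrative sketch: plan the hard-link tree as (source,
   destination) pairs, one per captured file, relative to the layer
   and lower directories. *)
let plan_hardlinks ~layer_dir ~lower_dir files =
  List.map
    (fun rel -> (Filename.concat layer_dir rel, Filename.concat lower_dir rel))
    files
```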

<p>A typical command-line for <code class="language-plaintext highlighter-rouge">day10</code> would specify an initially empty layer-cache directory, your clone of the opam repository, an output format of Markdown or JSON, and the package to be installed using opam’s naming syntax:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>day10 health-check <span class="nt">--cache-dir</span> /var/cache/day10 <span class="nt">--opam-repository</span> /home/mtelvers/opam-repository <span class="nt">--md</span> log.md 0install.2.18
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">day10</code> will attempt to detect your system and build an appropriate container, but you can override this with <code class="language-plaintext highlighter-rouge">--os</code>, and in more detail with <code class="language-plaintext highlighter-rouge">--os-distribution</code>, <code class="language-plaintext highlighter-rouge">--os-family</code> and <code class="language-plaintext highlighter-rouge">--os-version</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/var/cache/day10/
└── debian-13-x86_64                         # os specific tag
    ├── 0149f9a8b66c1568d0b962417e827d9f     # layer hash
    │   ├── build.log                        # build log
    │   ├── config.json                      # runc configuration file
    │   ├── fs                               # root file system
    │   ├── hosts                            # runc container hosts file
    │   ├── layer.json                       # layer dependencies and their hashes
    │   └── opam-repository                  # copy of the opam files used to create the switch
    │       ├── packages                     #   laid out in opam repository layout
    │       └── repo
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">day10</code> uses lock files on each layer to allow multiple instances to be invoked at the same time to build different packages. <code class="language-plaintext highlighter-rouge">day10</code> also accepts a list of packages using <code class="language-plaintext highlighter-rouge">@packages.json</code> rather than a specific package name, which can be used along with <code class="language-plaintext highlighter-rouge">--fork</code> to internally create multiple instances.</p>
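<p>The per-layer locking pattern looks roughly like this (a hypothetical helper, not the project’s actual code; assumes the <code class="language-plaintext highlighter-rouge">unix</code> library is linked):</p>

```ocaml
(* Take an exclusive advisory lock on a per-layer "lock" file; a second
   instance building the same layer blocks here until the first
   finishes. Closing the descriptor releases the lock. *)
let with_layer_lock layer_dir f =
  let fd =
    Unix.openfile (Filename.concat layer_dir "lock")
      [ Unix.O_CREAT; Unix.O_RDWR ] 0o600
  in
  Fun.protect
    ~finally:(fun () -> Unix.close fd)
    (fun () ->
      Unix.lockf fd Unix.F_LOCK 0;
      f ())
```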

<p>It’s difficult to know how many processes to fork at once, particularly when packages may be partially or entirely cached and only require the SAT solver to process the request (typically less than one second). Therefore, it is often useful to separate the solving step from the building step and run them with different levels of parallelism. For example, <code class="language-plaintext highlighter-rouge">day10 health-check ... --json /path/to/output --fork $(nproc) --dry-run @packages.json</code> will solve every package and output a JSON file containing a status field. A second pass can then be made with a more conservative <code class="language-plaintext highlighter-rouge">--fork N</code> parameter, since packages will actually be built; only those with a status of “solution” need to be submitted.</p>

<table>
  <thead>
    <tr>
      <th>Status</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>success</td>
      <td>The package built successfully, and the build log and dependency graph are included in the output.</td>
    </tr>
    <tr>
      <td>failure</td>
      <td>The package itself fails to build. The build log is included in the output.</td>
    </tr>
    <tr>
      <td>no_solution</td>
<td>The dependencies of the package cannot be satisfied with the current constraints: compiler version, OS, etc.</td>
    </tr>
    <tr>
      <td>dependency_failed</td>
      <td>A dependency failed to build; the log of that failure is included.</td>
    </tr>
    <tr>
      <td>solution</td>
      <td>A solution is available, but a dependency and/or the package itself has not been built. This is only generated with <code class="language-plaintext highlighter-rouge">--dry-run</code>.</td>
    </tr>
  </tbody>
</table>

<p>There is a <code class="language-plaintext highlighter-rouge">list</code> command to extract a list of packages from an opam repository. This accepts <code class="language-plaintext highlighter-rouge">--all-version</code> but defaults to the latest version.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>day10 list <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--os-distribution</span> debian <span class="nt">--os-family</span> debian <span class="nt">--os-version</span> 13 <span class="nt">--json</span> packages.json
</code></pre></div></div>

<p>Run the build with <code class="language-plaintext highlighter-rouge">--fork 20</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>day10 health-check <span class="nt">--cache-dir</span> ~/cache/ <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--os-distribution</span> debian <span class="nt">--os-family</span> debian <span class="nt">--os-version</span> 13 <span class="nt">--json</span> /tmp/foo <span class="nt">--fork</span> 20 @packages.json
</code></pre></div></div>

<p>On my E5-2640 machine (2 x 10C 20T) with a SATA SSD, building the latest version of every package for a single compiler version and OS variant takes a little over an hour.</p>

<p>The project code is available at <a href="https://github.com/mtelvers/day10">mtelvers/day10</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,day10" /><category term="tunbury.org" /><summary type="html"><![CDATA[ocurrent/obuilder is the workhorse of OCaml CI testing, but the current deployment causes packages to be built repeatedly because the opam switch is assembled from scratch for each package, leading to common dependencies being frequently recompiled. day10 uses an alternative model whereby switches are assembled from their component packages.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Tessera pipeline in OCaml</title><link href="https://www.tunbury.org/2026/02/15/ocaml-tessera/" rel="alternate" type="text/html" title="Tessera pipeline in OCaml" /><published>2026-02-15T19:30:00+00:00</published><updated>2026-02-15T19:30:00+00:00</updated><id>https://www.tunbury.org/2026/02/15/ocaml-tessera</id><content type="html" xml:base="https://www.tunbury.org/2026/02/15/ocaml-tessera/"><![CDATA[<p>The Tessera pipeline is written in Python. What would it take to have an OCaml version?</p>

<p>Looking at the Python code, these are the key libraries which are used:</p>

<table>
  <thead>
    <tr>
      <th>Python Library</th>
      <th>Used for</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>numpy</strong></td>
      <td>N-dim arrays, math, <code class="language-plaintext highlighter-rouge">.npy</code> I/O</td>
    </tr>
    <tr>
      <td><strong>torch</strong></td>
      <td>Model inference</td>
    </tr>
    <tr>
      <td><strong>rasterio</strong></td>
      <td>Read GeoTIFF (ROI mask), CRS/bounds, <code class="language-plaintext highlighter-rouge">transform_bounds</code></td>
    </tr>
    <tr>
      <td><strong>pystac-client</strong></td>
      <td>STAC API search (Planetary Computer catalog)</td>
    </tr>
    <tr>
      <td><strong>planetary-computer</strong></td>
      <td>Sign STAC URLs (Azure SAS tokens)</td>
    </tr>
    <tr>
      <td><strong>stackstac</strong></td>
      <td>Load COGs into arrays, reproject, mosaic</td>
    </tr>
  </tbody>
</table>

<h1 id="numpy">numpy</h1>

<p>Last year, when I first looked at the Tessera tiles, I wrote <a href="https://github.com/mtelvers/npy-pca">mtelvers/npy-pca</a> as a basic visualisation tool that included an npy reader. Now, I have spun that off into its own library <a href="https://github.com/mtelvers/ocaml-npy">mtelvers/ocaml-npy</a>. I subsequently noticed that there already was <a href="https://github.com/LaurentMazare/npy-ocaml">LaurentMazare/npy-ocaml</a> which may have saved me some time!</p>

<h1 id="pystac-client-and-planetary-computer">pystac-client and planetary-computer</h1>

<p>For these, a new library was needed as I couldn’t see an OCaml equivalent. However, OCaml already has <a href="https://github.com/ocaml-multicore/eio">Eio</a>, <a href="https://github.com/mirage/ocaml-cohttp">cohttp-eio</a> and <a href="https://github.com/ocaml-community/yojson">yojson</a>, so it was relatively easy to produce <a href="https://github.com/mtelvers/stac-client">mtelvers/stac-client</a>, which implemented the <a href="https://stacspec.org/">STAC</a> (SpatioTemporal Asset Catalogue) API, with built-in support for <a href="https://planetarycomputer.microsoft.com/">Microsoft Planetary Computer</a> SAS token signing. This was easy to validate against the results from Python.</p>

<h1 id="rasterio">rasterio</h1>

<p><a href="https://github.com/geocaml/ocaml-tiff">geocaml/ocaml-tiff</a> already exists, but it does not handle tiled tiff files, which are used in the land masks. Rather than reinventing the entire library, I added tiled tiff support.</p>

<h1 id="stackstac">stackstac</h1>

<p><a href="https://github.com/geocaml/ocaml-gdal">geocaml/ocaml-gdal</a> already existed, but it lacked some required features and was a little outdated. I added more bindings for GDAL’s C API using OCaml’s ctypes-foreign:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">GDALOpenEx</code> with <code class="language-plaintext highlighter-rouge">/vsicurl/</code> for reading remote COGs</li>
  <li><code class="language-plaintext highlighter-rouge">GDALWarp</code> for reprojection and resampling</li>
  <li><code class="language-plaintext highlighter-rouge">GDALRasterIO</code> for reading band data</li>
  <li><code class="language-plaintext highlighter-rouge">OSRNewSpatialReference</code> / <code class="language-plaintext highlighter-rouge">OCTTransformBounds</code> for coordinate transformations</li>
</ul>

<h1 id="torch">torch</h1>

<p><a href="https://github.com/LaurentMazare/ocaml-torch">LaurentMazare/ocaml-torch</a> already existed, with the latest version published on opam as <a href="https://github.com/janestreet/torch">janestreet/torch</a>. This uses the Jane Street standard library, and it seemed pointless to reimplement it against the OCaml Standard Library, so instead I implemented OCaml bindings for the ONNX runtime, <a href="https://github.com/mtelvers/ocaml-onnxruntime">mtelvers/ocaml-onnxruntime</a>, as I only need the inference stage. The PyTorch model can be easily exported to ONNX format.</p>

<p>ONNX Runtime’s C API uses a function-table pattern (a struct with 500+ function pointers) which doesn’t easily map to ctypes. This needed a thin C shim (<code class="language-plaintext highlighter-rouge">libert_shim.so</code>) that exposed the needed functions as regular C symbols, which could be bound from OCaml.</p>

<h1 id="cpu-testing">CPU Testing</h1>

<p>The initial OCaml pipeline was tested on my local machine without a GPU. It stored satellite data as nested OCaml arrays (<code class="language-plaintext highlighter-rouge">float array array array array</code> for 4D data), which performed poorly. This was replaced with a flat <code class="language-plaintext highlighter-rouge">Bigarray.Array1.t</code> using stride-based index arithmetic, matching NumPy’s contiguous memory layout, which performed much better. However, the real test was on a GPU.</p>
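<p>The stride arithmetic amounts to the following (a sketch under assumed axis names and sizes, not the pipeline’s actual dimensions):</p>

```ocaml
(* Row-major (C-contiguous) indexing into a flat Bigarray, matching
   NumPy's default layout: the last axis varies fastest. *)
let idx ~n1 ~n2 ~n3 i0 i1 i2 i3 = ((i0 * n1 + i1) * n2 + i2) * n3 + i3

let get
    (a : (float, Bigarray.float32_elt, Bigarray.c_layout) Bigarray.Array1.t)
    ~n1 ~n2 ~n3 i0 i1 i2 i3 =
  Bigarray.Array1.get a (idx ~n1 ~n2 ~n3 i0 i1 i2 i3)
```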

<h2 id="benchmark-results">Benchmark results</h2>

<p>All benchmarks on the same machine (AMD EPYC 9965 2 x 192-Core, NVIDIA L4 24GB), same dataset (269,908 pixels), same parameters (<code class="language-plaintext highlighter-rouge">batch_size=1024</code>, <code class="language-plaintext highlighter-rouge">num_threads=20</code>, <code class="language-plaintext highlighter-rouge">repeat_times=1</code>):</p>

<table>
  <thead>
    <tr>
      <th>Rank</th>
      <th>Configuration</th>
      <th>Inference Time</th>
      <th>vs Python CPU</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td><strong>OCaml + ONNX Runtime + CUDA</strong></td>
      <td><strong>2 min 10s</strong></td>
      <td><strong>9.5x faster</strong></td>
    </tr>
    <tr>
      <td>2</td>
      <td>Python + PyTorch + CUDA</td>
      <td>2 min 41s</td>
      <td>7.7x faster</td>
    </tr>
    <tr>
      <td>3</td>
      <td>Python + PyTorch (CPU)</td>
      <td>20 min 32s</td>
      <td>1x (baseline)</td>
    </tr>
    <tr>
      <td>4</td>
      <td>OCaml + ONNX Runtime (CPU)</td>
      <td>24 min 56s</td>
      <td>0.82x</td>
    </tr>
  </tbody>
</table>

<p>The OCaml + GPU configuration is the fastest overall. I put this difference down to less data marshalling in OCaml before passing the data to the ONNX Runtime. I’ve also read that ONNX Runtime can edge ahead of PyTorch because it was purpose-built as an inference-only engine.</p>

<h1 id="checks">Checks</h1>

<p>The OCaml pipeline produces results that are effectively identical to Python’s, differing only due to floating-point rounding.</p>

<ul>
  <li>OCaml CPU vs Python CPU: max embedding difference of 1 in only 1,028 out of 155 million int8 elements (rounding at the quantisation boundary). Scale factors match exactly.</li>
  <li>GPU vs CPU (either language): max embedding difference of 1 in ~0.3% of elements, with negligible scale differences — expected floating-point rounding differences from GPU arithmetic.</li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,tessera" /><category term="tunbury.org" /><summary type="html"><![CDATA[The Tessera pipeline is written in Python. What would it take to have an OCaml version?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/manchester.png" /><media:content medium="image" url="https://www.tunbury.org/images/manchester.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">The 15-Game</title><link href="https://www.tunbury.org/2026/02/11/fifteen/" rel="alternate" type="text/html" title="The 15-Game" /><published>2026-02-11T20:00:00+00:00</published><updated>2026-02-11T20:00:00+00:00</updated><id>https://www.tunbury.org/2026/02/11/fifteen</id><content type="html" xml:base="https://www.tunbury.org/2026/02/11/fifteen/"><![CDATA[<p>Numberphile’s latest <a href="https://www.youtube.com/watch?v=UafhPUOCM1E">video</a> has been released, and in it Ben Sparks introduces the 15 Game.</p>

<p>Two players take turns choosing tiles numbered 1–9 from a shared pool. The first player to collect exactly three tiles that sum to 15 wins. If all tiles are taken with no winner, the game is a draw.</p>

<p>This looks like a great project for a <a href="https://ocaml.org/p/js_of_ocaml/latest/doc/js_of_ocaml/Js_of_ocaml/Js/index.html">js_of_ocaml</a> solution.</p>

<p>As the game maps directly onto tic-tac-toe, creating an AI player was straightforward. You can play my version from GitHub Pages <a href="https://mtelvers.github.io/fifteen">fifteen</a>.</p>
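<p>The mapping rests on the 3×3 magic square: its eight lines are exactly the three-tile subsets of 1–9 that sum to 15, so win detection reduces to tic-tac-toe line checking. A minimal sketch (not the site’s actual code):</p>

```ocaml
(* Place 1..9 on a magic square; every row, column, and diagonal sums
   to 15, and these eight lines are the only 3-subsets that do. *)
let wins =
  let m = [| 2; 7; 6; 9; 5; 1; 4; 3; 8 |] in
  let line a b c = (m.(a), m.(b), m.(c)) in
  [ line 0 1 2; line 3 4 5; line 6 7 8;   (* rows *)
    line 0 3 6; line 1 4 7; line 2 5 8;   (* columns *)
    line 0 4 8; line 2 4 6 ]              (* diagonals *)

(* A player has won when their tiles cover some line of the square. *)
let has_won tiles =
  List.exists
    (fun (a, b, c) -> List.mem a tiles && List.mem b tiles && List.mem c tiles)
    wins
```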

<p>Numberphile’s videos are sponsored by <a href="https://www.janestreet.com/join-jane-street/programs-and-events/amp/">Jane Street</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,js_of_ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Numberphile’s latest video has been released, and in it Ben Sparks introduces the 15 Game.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/fifteen.png" /><media:content medium="image" url="https://www.tunbury.org/images/fifteen.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Optimizing an MP3 Codec with OCaml/OxCaml</title><link href="https://www.tunbury.org/2026/02/11/ocaml-mp3/" rel="alternate" type="text/html" title="Optimizing an MP3 Codec with OCaml/OxCaml" /><published>2026-02-11T18:30:00+00:00</published><updated>2026-02-11T18:30:00+00:00</updated><id>https://www.tunbury.org/2026/02/11/ocaml-mp3</id><content type="html" xml:base="https://www.tunbury.org/2026/02/11/ocaml-mp3/"><![CDATA[<p>After reading Anil’s post about his zero-allocation HTTP parser <a href="https://anil.recoil.org/notes/oxcaml-httpz">httpz</a>, I decided to apply some OxCaml optimisation techniques to my pure OCaml MP3 encoder/decoder.</p>

<p>The <a href="https://github.com/mtelvers/ocaml-mp3">OCaml-based MP3 encoder/decoder</a> has been the most ambitious project I’ve tried in Opus 4.5. It was a struggle to get it over the line, and I even needed to read large chunks of the ISO standard and get to grips with some of the maths and help the AI troubleshoot.</p>

<h1 id="profiling-an-ocaml-mp3-decoder-with-landmarks">Profiling an OCaml MP3 Decoder with Landmarks</h1>

<p>Before diving into OxCaml, I wanted to get a feel for the current performance and also to make obvious non-OxCaml performance improvements; otherwise, I would be comparing an optimised OxCaml version with an underperforming OCaml version.</p>

<p>It was 40 times slower than <code class="language-plaintext highlighter-rouge">ffmpeg</code>: 29.5 seconds to decode a 3-minute file versus 0.74 seconds. I used the <a href="https://github.com/LexiFi/landmarks">landmarks</a> profiling library to identify and fix the bottlenecks, bringing decode time down to 3.5 seconds (an 8x speedup).</p>

<h2 id="setting-up-landmarks">Setting Up Landmarks</h2>

<p>Landmarks is an OCaml profiling library that instruments functions and reports cycle counts. It was easy to add to the project (*) with a simple edit of the <code class="language-plaintext highlighter-rouge">dune</code> file:</p>

<pre><code class="language-sexp">(libraries ... landmarks)
(preprocess (pps landmarks-ppx --auto))
</code></pre>

<p>The <code class="language-plaintext highlighter-rouge">--auto</code> flag automatically instruments every top-level function — no manual annotation needed. Running the decoder with <code class="language-plaintext highlighter-rouge">OCAML_LANDMARKS=on</code> prints a call tree with cycle counts and percentages.</p>

<blockquote>
  <p>(*) It needed OCaml 5.3.0 for <code class="language-plaintext highlighter-rouge">landmarks-ppx</code> compatibility; it wouldn’t install on OCaml 5.4.0 due to a ppxlib version constraint.</p>
</blockquote>

<h2 id="issues">Issues</h2>

<p>78% of the time was spent in the Huffman decoding, specifically <code class="language-plaintext highlighter-rouge">decode_pair</code>. The implementation read one bit at a time, then scanned the table for a matching Huffman code. I initially tried a Hashtbl, which was much faster than the scan, before settling on a direct array lookup.</p>
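<p>A sketch of the direct-indexed approach (illustrative table shape, not the decoder’s actual code): peek the longest code length, index a precomputed array whose entries carry the decoded value and the true code length, then advance by that length.</p>

```ocaml
(* One array access replaces the per-bit table scan. The lookup table
   has 2^max_bits entries; codes shorter than max_bits occupy several
   consecutive slots, so any padding bits still hit the right entry. *)
type entry = { value : int; bits : int }

let decode lut ~max_bits ~peek ~advance =
  let e = lut.(peek max_bits) in
  advance e.bits;
  e.value
```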

<p>The bitstream operations still accounted for much of the time, but these could be optimised with appropriate <code class="language-plaintext highlighter-rouge">Bytes.get_...</code> calls, as the most frequent path reads 32 bits in big-endian layout.</p>
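<p>For example, the standard library’s big-endian accessor can refill a 32-bit reservoir in one call (a sketch; <code class="language-plaintext highlighter-rouge">Bytes.get_int32_be</code> has been in the standard library since OCaml 4.08):</p>

```ocaml
(* The mask undoes Int32.to_int's sign extension so the result is the
   unsigned 32-bit value on 64-bit platforms. *)
let read32_be buf pos =
  Int32.to_int (Bytes.get_int32_be buf pos) land 0xFFFFFFFF
```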

<p>The profile now showed <code class="language-plaintext highlighter-rouge">find_sfb_long</code> consuming 3.4 billion cycles inside requantization. This function does a linear search through scalefactor band boundaries for every one of the 576 frequency lines, every granule, every frame. I switched to precomputed 576-entry arrays mapping each frequency line directly to its scalefactor band index.</p>
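<p>The precomputation runs once and is cheap (boundary values in the test below are illustrative, not the ISO scalefactor-band table):</p>

```ocaml
(* boundaries.(n) is the first frequency line of band n, with a final
   sentinel of 576; map.(i) is then the band containing line i, so the
   per-sample linear search disappears. *)
let make_sfb_map boundaries =
  let map = Array.make 576 0 in
  let sfb = ref 0 in
  for i = 0 to 575 do
    while !sfb + 1 < Array.length boundaries && i >= boundaries.(!sfb + 1) do
      incr sfb
    done;
    map.(i) <- !sfb
  done;
  map
```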

<p>There were some additional tweaks, such as adding more precomputed lookup tables stored in <code class="language-plaintext highlighter-rouge">floatarray</code>, using <code class="language-plaintext highlighter-rouge">[@inline]</code> and <code class="language-plaintext highlighter-rouge">unsafe_get</code>, and using <code class="language-plaintext highlighter-rouge">land</code> instead of <code class="language-plaintext highlighter-rouge">mod</code>.</p>
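<p>The <code class="language-plaintext highlighter-rouge">land</code> substitution relies on power-of-two sizes (a minimal sketch; the constant is illustrative):</p>

```ocaml
(* For non-negative i and a power-of-two size, masking with size - 1
   gives the same result as mod without the division. *)
let wrap_1024 i = i land 1023 (* = i mod 1024 for i >= 0 *)
```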

<p>After this, no single function dominated the profile, and I could move on to OxCaml.</p>

<h1 id="oxcaml">OxCaml</h1>

<p>OxCaml has <code class="language-plaintext highlighter-rouge">float#</code>, an unboxed float type that lives in registers, and <code class="language-plaintext highlighter-rouge">let mutable</code> for stack-allocated mutable variables. Together, they let you write inner loops where the accumulator never touches the heap:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">F</span> <span class="o">=</span> <span class="nn">Stdlib_upstream_compatible</span><span class="p">.</span><span class="nc">Float_u</span>

<span class="k">let</span><span class="p">[</span><span class="o">@</span><span class="n">inline</span><span class="p">]</span> <span class="n">imdct_long</span> <span class="n">input</span> <span class="o">=</span>
  <span class="k">for</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">to</span> <span class="mi">35</span> <span class="k">do</span>
    <span class="k">let</span> <span class="k">mutable</span> <span class="n">sum</span> <span class="o">:</span> <span class="kt">float</span><span class="o">#</span> <span class="o">=</span> <span class="nn">F</span><span class="p">.</span><span class="n">of_float</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="k">in</span>
    <span class="k">for</span> <span class="n">k</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">to</span> <span class="mi">17</span> <span class="k">do</span>
      <span class="k">let</span> <span class="n">cos_val</span> <span class="o">=</span> <span class="nn">F</span><span class="p">.</span><span class="n">of_float</span> <span class="p">(</span><span class="nn">Float</span><span class="p">.</span><span class="nn">Array</span><span class="p">.</span><span class="n">unsafe_get</span> <span class="n">cos_table</span> <span class="p">(</span><span class="n">i</span> <span class="o">*</span> <span class="mi">18</span> <span class="o">+</span> <span class="n">k</span><span class="p">))</span> <span class="k">in</span>
      <span class="k">let</span> <span class="n">inp_val</span> <span class="o">=</span> <span class="nn">F</span><span class="p">.</span><span class="n">of_float</span> <span class="p">(</span><span class="nn">Array</span><span class="p">.</span><span class="n">unsafe_get</span> <span class="n">input</span> <span class="n">k</span><span class="p">)</span> <span class="k">in</span>
      <span class="n">sum</span> <span class="o">&lt;-</span> <span class="nn">F</span><span class="p">.</span><span class="n">add</span> <span class="n">sum</span> <span class="p">(</span><span class="nn">F</span><span class="p">.</span><span class="n">mul</span> <span class="n">inp_val</span> <span class="n">cos_val</span><span class="p">)</span>
    <span class="k">done</span><span class="p">;</span>
    <span class="nn">Array</span><span class="p">.</span><span class="n">unsafe_set</span> <span class="n">output</span> <span class="n">i</span> <span class="p">(</span><span class="nn">F</span><span class="p">.</span><span class="n">to_float</span> <span class="n">sum</span><span class="p">)</span>
  <span class="k">done</span>
</code></pre></div></div>

<p>These kinds of optimisations got me from 2.35s down to 2.01s.</p>

<p>What I felt was missing was an accessor function which returned an unboxed float from a floatarray, so I wouldn’t need to unbox with <code class="language-plaintext highlighter-rouge">F.of_float</code>. However, I couldn’t find it.</p>

<p>The httpz parser really benefited from OxCaml’s unboxed types because its hot path operates on small unboxed records that stay entirely in registers:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span><span class="p">{</span> <span class="n">off</span><span class="o">:</span> <span class="n">int16</span><span class="o">#;</span> <span class="n">len</span><span class="o">:</span> <span class="n">int16</span><span class="o">#</span> <span class="p">}</span>
</code></pre></div></div>

<h1 id="results">Results</h1>

<p>The optimisations brought a 29.5s MP3 decoder down to 2.01s, mostly through standard OCaml optimisations; OxCaml’s <code class="language-plaintext highlighter-rouge">float#</code> saved another ~14%.</p>

<table>
  <thead>
    <tr>
      <th>Decoder</th>
      <th>Time</th>
      <th>vs ffmpeg</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>ffmpeg</td>
      <td>0.74s</td>
      <td>1x</td>
    </tr>
    <tr>
      <td>LAME</td>
      <td>0.81s</td>
      <td>1.1x</td>
    </tr>
    <tr>
      <td>ocaml-mp3 (original)</td>
      <td>29.5s</td>
      <td>40x</td>
    </tr>
    <tr>
      <td>ocaml-mp3 (Hashtbl)</td>
      <td>6.4s</td>
      <td>8.6x</td>
    </tr>
    <tr>
      <td>ocaml-mp3 (flat + fast bitstream)</td>
      <td>3.5s</td>
      <td>4.7x</td>
    </tr>
    <tr>
      <td>ocaml-mp3 (best)</td>
      <td>2.4s</td>
      <td>3.2x</td>
    </tr>
    <tr>
      <td>ocaml-mp3 (OxCaml)</td>
      <td>2.0s</td>
      <td>2.7x</td>
    </tr>
  </tbody>
</table>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,oxcaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[After reading Anil’s post about his zero-allocation HTTP parser httpz, I decided to apply some OxCaml optimisation techniques to my pure OCaml MP3 encoder/decoder.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Windows Docker Images</title><link href="https://www.tunbury.org/2026/02/09/base-image-builder/" rel="alternate" type="text/html" title="Windows Docker Images" /><published>2026-02-09T09:30:00+00:00</published><updated>2026-02-09T09:30:00+00:00</updated><id>https://www.tunbury.org/2026/02/09/base-image-builder</id><content type="html" xml:base="https://www.tunbury.org/2026/02/09/base-image-builder/"><![CDATA[<p>In my previous post on the <a href="https://www.tunbury.org/2026/01/16/base-image-builder/">base image builder</a>, I included a footnote that we now had Windows 2025 workers, but I didn’t mention that the base images weren’t building.</p>

<p>Docker on Windows is very slow, so I have had a background task nudging these builds forward a little bit each day, and I’m pleased to now report that over the weekend, the images all built, and the entire dashboard is green!</p>

<p>The most significant change was moving away from fdopen’s opam to native opam. This has unlocked OCaml 5 builds for the first time but has removed images for OCaml &lt; 4.13. MSVC builds of OCaml 5.0-5.2 are not available, as the MSVC port was broken until OCaml 5.3 <a href="https://github.com/ocaml/ocaml/pull/12954">ocaml/ocaml#12954</a>. Each version is built on Windows Server LTSC 2019, LTSC 2022, and LTSC 2025.</p>

<table>
  <thead>
    <tr>
      <th>OCaml Version</th>
      <th style="text-align: center">MinGW</th>
      <th style="text-align: center">MSVC</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>4.13.1</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✓</td>
    </tr>
    <tr>
      <td>4.14.2</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✓</td>
    </tr>
    <tr>
      <td>5.0.0</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✗</td>
    </tr>
    <tr>
      <td>5.1.1</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✗</td>
    </tr>
    <tr>
      <td>5.2.1</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✗</td>
    </tr>
    <tr>
      <td>5.3.0</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✓</td>
    </tr>
    <tr>
      <td>5.4.0</td>
      <td style="text-align: center">✓</td>
      <td style="text-align: center">✓</td>
    </tr>
  </tbody>
</table>

<p>Below are the detailed changes.</p>

<h1 id="pr-257-ocaml-dockerfile"><a href="https://github.com/ocurrent/ocaml-dockerfile/pull/257">PR 257 ocaml-dockerfile</a></h1>

<p><code class="language-plaintext highlighter-rouge">src-opam/distro.ml</code>:</p>

<ul>
  <li>Changed <code class="language-plaintext highlighter-rouge">opam_repository</code> to use standard <code class="language-plaintext highlighter-rouge">ocaml/opam-repository.git</code> for Windows instead of <code class="language-plaintext highlighter-rouge">ocaml-opam/opam-repository-mingw.git#sunset</code></li>
  <li>Added version filter: Windows builds now require OCaml &gt;= 4.13 (native opam 2.2+ requires official packages)</li>
  <li>MSVC filter: OCaml 5.0-5.2 excluded (MSVC support restored in 5.3)</li>
</ul>

<p><code class="language-plaintext highlighter-rouge">src-opam/windows.ml</code>:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">ocaml_for_windows_package_exn</code> now returns <code class="language-plaintext highlighter-rouge">Ocaml_version.Opam.V2.package</code> directly, using official package names (<code class="language-plaintext highlighter-rouge">ocaml-base-compiler/ocaml-variants+options</code>) instead of fdopen’s <code class="language-plaintext highlighter-rouge">+mingw64</code>/<code class="language-plaintext highlighter-rouge">+msvc64</code> naming</li>
</ul>

<p><code class="language-plaintext highlighter-rouge">src-opam/opam.ml</code>:</p>

<ul>
<li>Reduce parallelism on Windows to avoid OOM from an unbounded <code class="language-plaintext highlighter-rouge">make -j</code></li>
  <li>Update Visual Studio to Windows 11 SDK</li>
  <li>create_switch adds <code class="language-plaintext highlighter-rouge">system-mingw/system-msvc</code> for all Windows versions (not just 5.x)</li>
  <li><code class="language-plaintext highlighter-rouge">setup_default_opam_windows_msvc</code> persists MSVC environment (<code class="language-plaintext highlighter-rouge">PATH</code>, <code class="language-plaintext highlighter-rouge">INCLUDE</code>, <code class="language-plaintext highlighter-rouge">LIB</code>, <code class="language-plaintext highlighter-rouge">LIBPATH</code>) with correct <code class="language-plaintext highlighter-rouge">PATH</code> ordering: MSVC → Cygwin → Windows</li>
</ul>

<h1 id="pr-339-docker-base-images"><a href="https://github.com/ocurrent/docker-base-images/pull/339">PR 339 docker-base-images</a></h1>

<p><code class="language-plaintext highlighter-rouge">src/pipeline.ml</code>:</p>

<ul>
  <li>Port package (<code class="language-plaintext highlighter-rouge">system-mingw</code>/<code class="language-plaintext highlighter-rouge">system-msvc</code>) added for all Windows versions</li>
  <li>Removed fdopen overlay addition (<code class="language-plaintext highlighter-rouge">maybe_add_overlay</code> no longer called for Windows)</li>
  <li>Removed <code class="language-plaintext highlighter-rouge">opam repo remove ocurrent-overlay</code> step</li>
  <li>Changed <code class="language-plaintext highlighter-rouge">depext</code> to <code class="language-plaintext highlighter-rouge">Option</code> type - returns <code class="language-plaintext highlighter-rouge">None</code> for Windows (opam 2.2+ has depext built-in)</li>
  <li>Uses <code class="language-plaintext highlighter-rouge">opam_repository_master</code> for Windows instead of <code class="language-plaintext highlighter-rouge">opam_repository_mingw_sunset</code></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">src/git_repositories.ml</code> (implied by pipeline changes):</p>

<ul>
  <li>Removed references to <code class="language-plaintext highlighter-rouge">opam_repository_mingw_sunset</code> and <code class="language-plaintext highlighter-rouge">opam_overlays</code>.</li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[In my previous post on the base image builder, I included a footnote that we now had Windows 2025 workers, but I didn’t mention that the base images weren’t building.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/docker-base-images.png" /><media:content medium="image" url="https://www.tunbury.org/images/docker-base-images.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Getting Claude to complete the spec</title><link href="https://www.tunbury.org/2026/01/28/claude-spec/" rel="alternate" type="text/html" title="Getting Claude to complete the spec" /><published>2026-01-28T16:00:00+00:00</published><updated>2026-01-28T16:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/28/claude-spec</id><content type="html" xml:base="https://www.tunbury.org/2026/01/28/claude-spec/"><![CDATA[<p>With Claude Code, perhaps we are now at the point where the test suite is actually more valuable than the code itself.</p>

<p>I’ve been experimenting with Claude quite successfully and have <em>evolved</em> a working <a href="https://github.com/mtelvers/ocaml-imapd">IMAP</a> and <a href="https://github.com/mtelvers/ocaml-smtpd">SMTP</a> server implementation in OCaml. I say <em>evolved</em> because Claude generated the code from the RFCs in a single pass, but what followed was an extensive period of debugging. I added the account to Apple Mail, and it didn’t work at all! Claude dutifully debugged the code, with much back-and-forth, until we had a working version. Or, at least, a version that worked with Apple Mail. What about Thunderbird? I didn’t try. My point, though, is that the more agentic coding we do, the more testing we inevitably need.</p>

<p>In the case of IMAP, I could have asked Claude to use a third-party, command-line IMAP client, and then the testing and debugging could have been automated. What about in cases where there is no client?</p>

<p>I decided to reimplement the IMAP daemon, breaking it down from a single prompt into an actual software project. Claude did the legwork, reviewing the RFCs and creating an architecture of the libraries/modules needed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ┌─────────────────────────────────────────────────────────────────┐
 │                        imap-server                              │
 │  (Connection handling, state machine, command dispatch)         │
 └─────────────────────────────────────────────────────────────────┘
          │           │            │            │           │
          ▼           ▼            ▼            ▼           ▼
 ┌─────────────┐ ┌─────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐
 │  imap-auth  │ │ mailbox │ │  search  │ │condstore│ │ tls-layer│
 │  (SASL,     │ │ (UID,   │ │ (SEARCH, │ │(QRESYNC,│ │ (ocaml-  │
 │   LOGIN)    │ │  flags) │ │  SORT)   │ │ modseq) │ │   tls)   │
 └─────────────┘ └─────────┘ └──────────┘ └─────────┘ └──────────┘
          │           │            │
          ▼           ▼            ▼
 ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
 │   maildir   │ │ mime-parser │ │ imap-parser │
 │  (storage)  │ │ (RFC 5322)  │ │  (ABNF)     │
 └─────────────┘ └─────────────┘ └─────────────┘
                       │               │
                       ▼               ▼
               ┌─────────────────────────────┐
               │         imap-types          │
               │  (Shared type definitions)  │
               └─────────────────────────────┘
</code></pre></div></div>

<p>I asked for a design document per module with the intention that these specifications could be completed in parallel by N Claude instances, each writing an extensive test suite for its code. In the end, I opted for a serial approach.</p>

<p>Starting with <code class="language-plaintext highlighter-rouge">imap-types</code>, I noticed that a message UID was defined as an int32, which I knew was wrong because it should be an unsigned int32. Anyway, an easy fix.</p>

<blockquote>
  <p>IMAP messages start with a tag so that responses can be aligned with messages when multiple commands are sent without waiting for a response. Apple Mail uses a tag format of <code class="language-plaintext highlighter-rouge">1.1</code>, which was the first thing that needed to be fixed in the original server implementation.</p>
</blockquote>

<p>The <code class="language-plaintext highlighter-rouge">imap-parser</code> passed all the tests. I asked for a specific test to be added, which covered a tag with a dot. It failed. The new parser wasn’t any better than the original one.</p>

<blockquote>
  <p>Prompts to Claude can be submitted on the command line. e.g. <code class="language-plaintext highlighter-rouge">echo "hello" | claude --print</code>.</p>
</blockquote>

<p>How about creating two Claude instances, with one implementing a server and the other a client? Each instance could declare what features it supports, and an orchestration script could run the tests. Or, taking this further, have a third Claude instance act as the moderator and generate the test suite based on the features implemented by the server and the client.</p>
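<p>A minimal sketch of such an orchestration script, assuming the <code class="language-plaintext highlighter-rouge">claude</code> CLI invocation from the note above; the prompt file names and the feature-declaration step are hypothetical:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical orchestration: each role declares its features,
# then a moderator instance generates and runs the test suite.
cat server-prompt.md | claude --print &gt; server-features.md
cat client-prompt.md | claude --print &gt; client-features.md

cat moderator-prompt.md server-features.md client-features.md \
  | claude --print &gt; run-tests.sh
sh run-tests.sh
</code></pre></div></div>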

<p>This worked reasonably well.</p>

<ul>
  <li>86 tests generated by moderator</li>
  <li>85 passed, 1 failed</li>
</ul>

<p>The tests covered:</p>

<ul>
  <li>Basic protocol (greeting, capability, noop, logout)</li>
  <li>Authentication (valid/invalid logins, multiple users, quoted passwords)</li>
  <li>Mailbox operations (select, examine, create, delete, rename)</li>
  <li>LIST, LSUB, SUBSCRIBE, STATUS</li>
  <li>FETCH (flags, uid, body, envelope, headers, bodystructure, etc.)</li>
  <li>STORE (add/remove/replace flags, silent mode)</li>
  <li>COPY, MOVE, EXPUNGE, CLOSE, UNSELECT</li>
  <li>SEARCH (all, unseen, flagged, subject)</li>
  <li>UID variants (fetch, store, search, copy)</li>
  <li>APPEND (simple, with flags, literal+)</li>
  <li>Extensions: IDLE, ENABLE, NAMESPACE</li>
  <li>Adversarial tests: malformed commands, missing tags, garbage input, null bytes, rapid commands</li>
  <li>Concurrency: multiple simultaneous connections</li>
</ul>

<p>And this found a real bug: the server did not reject a fetch sequence starting at zero, which the RFC does not allow.</p>

<p>Interestingly, neither the client nor server supported STARTTLS. Both had a copy of RFC 8314 but chose not to implement the feature. I put this down to a poor choice of wording in the prompt. I’d said “production-ready”, which to me implies TLS, but “feature-complete” removes the wiggle room. I specifically didn’t want to say “implement TLS” as this is specific to IMAP and wouldn’t apply in other projects.</p>

<p>The next generation of the script provided the three Claude instances with a copy of the RFCs. Client Claude, Server Claude and Moderator Claude were tasked with implementing the client, server and testing and moderation entirely from the RFCs. The script ran iteratively, with more testing being added at each pass, and client and server fixes.</p>

<p>Did I get TLS? No. The dune file called for it, but the library wasn’t opened in the code. The test checked for STARTTLS, and the server replied with “OK Begin TLS negotiation now” but that’s as far as it got in five cycles.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[With Claude Code, perhaps we are now at the point where the test suite is actually more valuable than the code itself.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/anthropic-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/anthropic-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Extending RPC capabilities in OCurrent</title><link href="https://www.tunbury.org/2026/01/26/ocurrent-rpc/" rel="alternate" type="text/html" title="Extending RPC capabilities in OCurrent" /><published>2026-01-26T12:00:00+00:00</published><updated>2026-01-26T12:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/26/ocurrent-rpc</id><content type="html" xml:base="https://www.tunbury.org/2026/01/26/ocurrent-rpc/"><![CDATA[<p>As our workflows become more agentic, CLI tools are becoming preferred over web GUIs; OCurrent pipelines are no exceptions.</p>

<p>OCurrent already had an RPC endpoint which allowed functions such as listing active jobs, viewing a job log and rebuilding. <a href="https://github.com/ocurrent/ocurrent/pull/469">PR#469</a> extends this, adding full pipeline observability and control, including statistics, state, history queries, bulk rebuild, and pipeline visualisation and configuration management. This all works over <a href="https://capnproto.org">Cap’n Proto</a>.</p>

<p><code class="language-plaintext highlighter-rouge">rpc_client.ml</code> can be used as a standalone executable to query any OCurrent pipeline. Alternatively, by including the cmdliner term, your application can be its own client.</p>

<p>In the server code, the RPC endpoint must be specifically exposed. Many OCurrent applications do this already, such as the <a href="https://github.com/ocurrent/docker-base-images">Docker base image builder</a>: any application which currently supports <code class="language-plaintext highlighter-rouge">--capnp-address</code> already exposes it.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Rpc</span> <span class="o">=</span> <span class="nn">Current_rpc</span><span class="p">.</span><span class="nc">Impl</span><span class="p">(</span><span class="nc">Current</span><span class="p">)</span>

<span class="c">(* In the main function, set up Cap'n Proto serving *)</span>
<span class="k">let</span> <span class="n">serve_rpc</span> <span class="n">engine</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">config</span> <span class="o">=</span> <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="nn">Vat_config</span><span class="p">.</span><span class="n">create</span> <span class="o">~</span><span class="n">secret_key</span> <span class="o">~</span><span class="n">public_address</span> <span class="n">listen_address</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">service</span> <span class="o">=</span> <span class="nn">Rpc</span><span class="p">.</span><span class="n">engine</span> <span class="n">engine</span> <span class="k">in</span>
  <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="n">serve</span> <span class="n">config</span> <span class="n">service</span> <span class="o">&gt;&gt;=</span> <span class="k">fun</span> <span class="n">vat</span> <span class="o">-&gt;</span>
  <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="nn">Cap_file</span><span class="p">.</span><span class="n">save_service</span> <span class="n">vat</span> <span class="n">service</span> <span class="n">cap_file</span>
</code></pre></div></div>

<p>Then the <a href="https://github.com/dbuenzli/cmdliner">Cmdliner</a> command group needs to be added.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">client_cmd</span> <span class="o">=</span>
  <span class="nn">Current_rpc</span><span class="p">.</span><span class="nn">Client</span><span class="p">.</span><span class="nn">Cmdliner</span><span class="p">.</span><span class="n">client_cmd</span>
      <span class="o">~</span><span class="n">name</span><span class="o">:</span><span class="s2">"client"</span>
      <span class="o">~</span><span class="n">cap_file</span><span class="o">:</span><span class="s2">"/capnp-secrets/base-images.cap"</span>
      <span class="bp">()</span>

<span class="c">(* Add to your command group *)</span>
<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">cmds</span> <span class="o">=</span> <span class="p">[</span><span class="n">main_cmd</span><span class="p">;</span> <span class="n">client_cmd</span><span class="p">]</span> <span class="k">in</span>
  <span class="n">exit</span> <span class="o">@@</span> <span class="nn">Cmdliner</span><span class="p">.</span><span class="nn">Cmd</span><span class="p">.</span><span class="n">eval</span> <span class="p">(</span><span class="nn">Cmdliner</span><span class="p">.</span><span class="nn">Cmd</span><span class="p">.</span><span class="n">group</span> <span class="n">info</span> <span class="n">cmds</span><span class="p">)</span>
</code></pre></div></div>

<p>All 12 sub-commands are now available:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>base-images client overview
base-images client <span class="nb">jobs
</span>base-images client status &lt;job_id&gt;
base-images client log &lt;job_id&gt;
base-images client cancel &lt;job_id&gt;
base-images client rebuild &lt;job_id&gt;
base-images client start &lt;job_id&gt;
base-images client query <span class="o">[</span><span class="nt">--ok</span><span class="o">=</span>...] <span class="o">[</span><span class="nt">--prefix</span><span class="o">=</span>...] <span class="o">[</span><span class="nt">--op</span><span class="o">=</span>...] <span class="o">[</span><span class="nt">--rebuild</span><span class="o">=</span>...]
base-images client ops
base-images client dot
base-images client confirm <span class="o">[</span><span class="nt">--set</span><span class="o">=</span>...]
base-images client rebuild-all &lt;job_id&gt; ...
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[As our workflows become more agentic, CLI tools are becoming preferred over web GUIs; OCurrent pipelines are no exceptions.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Base Image Builder</title><link href="https://www.tunbury.org/2026/01/16/base-image-builder/" rel="alternate" type="text/html" title="Base Image Builder" /><published>2026-01-16T17:20:00+00:00</published><updated>2026-01-16T17:20:00+00:00</updated><id>https://www.tunbury.org/2026/01/16/base-image-builder</id><content type="html" xml:base="https://www.tunbury.org/2026/01/16/base-image-builder/"><![CDATA[<p>The base image builder has a growing number of failed builds; it’s time to address these.</p>

<h1 id="ocaml--51-with-gcc--15">OCaml &lt; 5.1 with GCC &gt;= 15</h1>

<p>Distributions that have moved to GCC 15 have had failing builds since last <a href="https://github.com/ocurrent/docker-base-images/issues/320">April</a>. This affects builds older than OCaml 5.1.1 but not OCaml 4.14.2.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># gcc -c -O2 -fno-strict-aliasing -fwrapv -pthread -g -Wall -fno-common -fexcess-precision=standard -ffunction-sections  -I./runtime  -D_FILE_OFFSET_BITS=64  -DCAMLDLLIMPORT= -DIN_CAML_RUNTIME -DDEBUG  -o runtime/main.bd.o runtime/main.c
In file included from runtime/interp.c:34:
runtime/interp.c: In function 'caml_interprete':
runtime/caml/prims.h:33:23: error: too many arguments to function '(value (*)(void))*(caml_prim_table.contents + (sizetype)((long unsigned int)*pc * 8))'; expected 0, have 1
33 | #define Primitive(n) ((c_primitive)(caml_prim_table.contents[n]))
   |                      ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
runtime/interp.c:1037:14: note: in expansion of macro 'Primitive'
1037 |       accu = Primitive(*pc)(accu);
     |              ^~~~~~~~~
</code></pre></div></div>

<p>I was about to create the patches, but I noticed that @dra27 had already done so. <a href="https://github.com/ocaml-opam/ocaml/branches">ocaml-opam/ocaml</a>. The patches can be added as an overlay repository. I have done this before for GCC 14 when a similar issue occurred for OCaml &lt; 4.08. <a href="https://github.com/ocurrent/docker-base-images/pull/298">PR#298</a>. The new PR is <a href="https://github.com/ocurrent/docker-base-images/pull/337">PR#337</a></p>

<h1 id="ubuntu-2510">Ubuntu 25.10</h1>

<p>The GCC 15 patch resolved most Ubuntu issues, but the Ubuntu 25.10 failure persisted. Ubuntu 25.10 switched to the Rust-based Coreutils, which did not support commas in the install command’s mode string until version 0.5.0; Ubuntu 25.10 ships with 0.2.2.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#9 137.0 # /usr/bin/install -c -m u=rw,g=rw,o=r \
#9 137.0 #   VERSION \
#9 137.0 #   "/home/opam/.opam/4.09/lib/ocaml"
#9 137.0 # /usr/bin/install: Invalid mode string: invalid operator (expected +, -, or =, but found ,)
</code></pre></div></div>

<p><a href="https://github.com/ocurrent/ocaml-dockerfile/pull/255">PR#255</a> switches to GNU Coreutils. I expect this problem will be cleared in subsequent releases of Ubuntu.</p>

<h1 id="windows">Windows</h1>

<p>The Windows workers needed to be updated to Windows Server 2025, as older kernels cannot run newer containers. Furthermore, the OCluster code is not yet using native Windows opam.</p>

<p>The Windows Server virtual machines are created with Packer. I’ve pushed my scripts to <a href="https://github.com/mtelvers/packer">mtelvers/packer</a>.</p>

<p>OCluster worker is deployed using Ansible. My scripts are at <a href="https://github.com/mtelvers/windows_worker">mtelvers/windows_worker</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[The base image builder has a growing number of failed builds; it’s time to address these.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Updating ARM64 Workers to Ubuntu Noble</title><link href="https://www.tunbury.org/2026/01/16/arm64-workers/" rel="alternate" type="text/html" title="Updating ARM64 Workers to Ubuntu Noble" /><published>2026-01-16T17:00:00+00:00</published><updated>2026-01-16T17:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/16/arm64-workers</id><content type="html" xml:base="https://www.tunbury.org/2026/01/16/arm64-workers/"><![CDATA[<p>Early in the upgrade program for Ubuntu 24.04, there were <a href="https://github.com/ocaml/infrastructure/issues/121">permission issues</a> when extracting <code class="language-plaintext highlighter-rouge">tar</code> files. The workaround was to update to the latest <code class="language-plaintext highlighter-rouge">dev</code> version of Docker. However, this didn’t resolve all the issues on ARM64, so only one machine was updated and excluded from the base image builder work.</p>

<p>There were <a href="https://github.com/tarides/infrastructure/issues/331">segmentation faults</a> at the second stage of the build process. These were cleared by upstream updates.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#8 24.53 # gcc -c -O2 -fno-strict-aliasing -fwrapv -Wall -fno-common -g -D_FILE_OFFSET_BITS=64 -D_REENTRANT -DCAML_NAME_SPACE -DOCAML_STDLIB_DIR='"/home/opam/.opam/4.09/lib/ocaml"' -o prims.o prims.c
#8 24.53 # rm -f libcamlrund.a &amp;&amp; ar rc libcamlrund.a interp_bd.o misc_bd.o stacks_bd.o fix_code_bd.o startup_aux_bd.o startup_byt_bd.o freelist_bd.o major_gc_bd.o minor_gc_bd.o memory_bd.o alloc_bd.o roots_byt_bd.o globroots_bd.o fail_byt_bd.o signals_bd.o signals_byt_bd.o printexc_bd.o backtrace_byt_bd.o backtrace_bd.o compare_bd.o ints_bd.o floats_bd.o str_bd.o array_bd.o io_bd.o extern_bd.o intern_bd.o hash_bd.o sys_bd.o meta_bd.o parsing_bd.o gc_ctrl_bd.o md5_bd.o obj_bd.o lexing_bd.o callback_bd.o debugger_bd.o weak_bd.o compact_bd.o finalise_bd.o custom_bd.o dynlink_bd.o spacetime_byt_bd.o afl_bd.o unix_bd.o bigarray_bd.o main_bd.o instrtrace_bd.o &amp;&amp; ranlib libcamlrund.a
#8 24.53 # gcc -O2 -fno-strict-aliasing -fwrapv -Wall -fno-common -g -D_FILE_OFFSET_BITS=64 -D_REENTRANT -DCAML_NAME_SPACE -DOCAML_STDLIB_DIR='"/home/opam/.opam/4.09/lib/ocaml"' -Wl,-E -g -o ocamlrund prims.o libcamlrund.a -lm -lpthread 
#8 24.53 # rm -f libcamlruni.a &amp;&amp; ar rc libcamlruni.a interp_bi.o misc_bi.o stacks_bi.o fix_code_bi.o startup_aux_bi.o startup_byt_bi.o freelist_bi.o major_gc_bi.o minor_gc_bi.o memory_bi.o alloc_bi.o roots_byt_bi.o globroots_bi.o fail_byt_bi.o signals_bi.o signals_byt_bi.o printexc_bi.o backtrace_byt_bi.o backtrace_bi.o compare_bi.o ints_bi.o floats_bi.o str_bi.o array_bi.o io_bi.o extern_bi.o intern_bi.o hash_bi.o sys_bi.o meta_bi.o parsing_bi.o gc_ctrl_bi.o md5_bi.o obj_bi.o lexing_bi.o callback_bi.o debugger_bi.o weak_bi.o compact_bi.o finalise_bi.o custom_bi.o dynlink_bi.o spacetime_byt_bi.o afl_bi.o unix_bi.o bigarray_bi.o main_bi.o &amp;&amp; ranlib libcamlruni.a
#8 24.53 # gcc -O2 -fno-strict-aliasing -fwrapv -Wall -fno-common -g -D_FILE_OFFSET_BITS=64 -D_REENTRANT -DCAML_NAME_SPACE -DOCAML_STDLIB_DIR='"/home/opam/.opam/4.09/lib/ocaml"' -Wl,-E -o ocamlruni prims.o libcamlruni.a -lm -lpthread 
#8 24.53 # rm -f libcamlrun_pic.a &amp;&amp; ar rc libcamlrun_pic.a interp_bpic.o misc_bpic.o stacks_bpic.o fix_code_bpic.o startup_aux_bpic.o startup_byt_bpic.o freelist_bpic.o major_gc_bpic.o minor_gc_bpic.o memory_bpic.o alloc_bpic.o roots_byt_bpic.o globroots_bpic.o fail_byt_bpic.o signals_bpic.o signals_byt_bpic.o printexc_bpic.o backtrace_byt_bpic.o backtrace_bpic.o compare_bpic.o ints_bpic.o floats_bpic.o str_bpic.o array_bpic.o io_bpic.o extern_bpic.o intern_bpic.o hash_bpic.o sys_bpic.o meta_bpic.o parsing_bpic.o gc_ctrl_bpic.o md5_bpic.o obj_bpic.o lexing_bpic.o callback_bpic.o debugger_bpic.o weak_bpic.o compact_bpic.o finalise_bpic.o custom_bpic.o dynlink_bpic.o spacetime_byt_bpic.o afl_bpic.o unix_bpic.o bigarray_bpic.o main_bpic.o &amp;&amp; ranlib libcamlrun_pic.a
#8 24.53 # gcc -shared -o libcamlrun_shared.so interp_bpic.o misc_bpic.o stacks_bpic.o fix_code_bpic.o startup_aux_bpic.o startup_byt_bpic.o freelist_bpic.o major_gc_bpic.o minor_gc_bpic.o memory_bpic.o alloc_bpic.o roots_byt_bpic.o globroots_bpic.o fail_byt_bpic.o signals_bpic.o signals_byt_bpic.o printexc_bpic.o backtrace_byt_bpic.o backtrace_bpic.o compare_bpic.o ints_bpic.o floats_bpic.o str_bpic.o array_bpic.o io_bpic.o extern_bpic.o intern_bpic.o hash_bpic.o sys_bpic.o meta_bpic.o parsing_bpic.o gc_ctrl_bpic.o md5_bpic.o obj_bpic.o lexing_bpic.o callback_bpic.o debugger_bpic.o weak_bpic.o compact_bpic.o finalise_bpic.o custom_bpic.o dynlink_bpic.o spacetime_byt_bpic.o afl_bpic.o unix_bpic.o bigarray_bpic.o main_bpic.o -lm -lpthread 
#8 24.53 # rm -f libcamlrun.a &amp;&amp; ar rc libcamlrun.a interp_b.o misc_b.o stacks_b.o fix_code_b.o startup_aux_b.o startup_byt_b.o freelist_b.o major_gc_b.o minor_gc_b.o memory_b.o alloc_b.o roots_byt_b.o globroots_b.o fail_byt_b.o signals_b.o signals_byt_b.o printexc_b.o backtrace_byt_b.o backtrace_b.o compare_b.o ints_b.o floats_b.o str_b.o array_b.o io_b.o extern_b.o intern_b.o hash_b.o sys_b.o meta_b.o parsing_b.o gc_ctrl_b.o md5_b.o obj_b.o lexing_b.o callback_b.o debugger_b.o weak_b.o compact_b.o finalise_b.o custom_b.o dynlink_b.o spacetime_byt_b.o afl_b.o unix_b.o bigarray_b.o main_b.o &amp;&amp; ranlib libcamlrun.a
#8 24.53 # gcc -O2 -fno-strict-aliasing -fwrapv -Wall -fno-common -g -D_FILE_OFFSET_BITS=64 -D_REENTRANT -DCAML_NAME_SPACE -DOCAML_STDLIB_DIR='"/home/opam/.opam/4.09/lib/ocaml"' -Wl,-E -o ocamlrun prims.o libcamlrun.a -lm -lpthread 
#8 24.53 # make[1]: Leaving directory '/home/opam/.opam/4.09/.opam-switch/build/ocaml-variants.4.09.1+flambda/runtime'
#8 24.53 # cp runtime/ocamlrun boot/ocamlrun
#8 24.53 # /usr/bin/make -C stdlib \
#8 24.53 # CAMLC='$(BOOT_OCAMLC) -use-prims ../runtime/primitives' all
#8 24.53 # make[1]: Entering directory '/home/opam/.opam/4.09/.opam-switch/build/ocaml-variants.4.09.1+flambda/stdlib'
#8 24.53 # ../boot/ocamlrun ../boot/ocamlc -use-prims ../runtime/primitives -strict-sequence -absname -w +a-4-9-41-42-44-45-48 -g -warn-error A -bin-annot -nostdlib -safe-string -strict-formats -nopervasives -c camlinternalFormatBasics.mli
#8 24.53 # sed -e "s|%%VERSION%%|`sed -e 1q ../VERSION | tr -d '\r'`|" sys.mlp &gt; sys.ml
#8 24.53 # make[1]: *** [Makefile:218: camlinternalFormatBasics.cmi] Segmentation fault (core dumped)
#8 24.53 # make[1]: *** Waiting for unfinished jobs....
#8 24.53 # make[1]: Leaving directory '/home/opam/.opam/4.09/.opam-switch/build/ocaml-variants.4.09.1+flambda/stdlib'
#8 24.53 # make: *** [Makefile:345: coldstart] Error 2
</code></pre></div></div>

<p>The remaining <a href="https://images.ci.ocaml.org/job/2026-01-15/084331-ocluster-build-69c63c">error</a> came from this innocuous <code class="language-plaintext highlighter-rouge">COPY</code> command. It was possible to work around it by changing the <code class="language-plaintext highlighter-rouge">chown</code> option to <code class="language-plaintext highlighter-rouge">--chown=1000:1000</code>, but that was odd, as it wasn’t needed on any other system.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Dockerfile:78
--------------------
  76 |     RUN git config --global user.email "docker@example.com"
  77 |     RUN git config --global user.name "Docker"
  78 | &gt;&gt;&gt; COPY --link --chown=opam:opam [ ".", "/home/opam/opam-repository" ]
  79 |     RUN opam-sandbox-disable
  80 |     RUN opam init -k git -a /home/opam/opam-repository --bare
--------------------
ERROR: failed to solve: invalid user index: -1
</code></pre></div></div>

<p>The root cause, which I have investigated today, is a version mismatch between the Docker <code class="language-plaintext highlighter-rouge">dev</code> daemon and the <code class="language-plaintext highlighter-rouge">27.5.1</code> client on the (partially) updated system. It’s interesting to note that client version <code class="language-plaintext highlighter-rouge">20.10.21</code> worked against the <code class="language-plaintext highlighter-rouge">dev</code> daemon.</p>
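<p>A quick way to spot this kind of skew is to print both versions with <code class="language-plaintext highlighter-rouge">docker version</code>:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Print the client and daemon versions side by side; a mismatch
# like the one described above is immediately visible.
docker version --format 'client={{.Client.Version}} server={{.Server.Version}}'
</code></pre></div></div>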

<p>Anyway, none of this is a concern any longer, as Ubuntu now packages Docker 28.2.2. The system can be upgraded via <code class="language-plaintext highlighter-rouge">apt</code>, and everything works as expected.</p>

<p>The ARM64 workers have been updated using this <a href="https://gist.github.com/mtelvers/dbc3828eab622530a67d72535feb5908">Ansible playbook</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ubuntu,ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Early in the upgrade program for Ubuntu 24.04, there were permission issues when extracting tar files. The workaround was to update to the latest dev version of Docker. However, this didn’t resolve all the issues on ARM64, so only one machine was updated and excluded from the base image builder work.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Installation from recovery console</title><link href="https://www.tunbury.org/2026/01/13/pima-nvme/" rel="alternate" type="text/html" title="Installation from recovery console" /><published>2026-01-13T17:00:00+00:00</published><updated>2026-01-13T17:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/13/pima-nvme</id><content type="html" xml:base="https://www.tunbury.org/2026/01/13/pima-nvme/"><![CDATA[<p>Over the weekend, one of the NVMe drives in pima failed, which brought down the whole system.</p>

<p>Booting over the network to a recovery console showed that <code class="language-plaintext highlighter-rouge">nvme6</code> was dead. The kernel logged errors on any access, and this was confirmed by the SMART log, which showed a critical warning flag <code class="language-plaintext highlighter-rouge">0x4</code> despite zero media errors.</p>
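<p>For reference, that check can be made with <code class="language-plaintext highlighter-rouge">nvme-cli</code>; the device name here is specific to this machine:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Read the SMART / health log for the failed drive. A critical
# warning of 0x4 flags degraded NVM subsystem reliability, which
# can be set even with zero recorded media errors.
nvme smart-log /dev/nvme6n1 | grep -E 'critical_warning|media_errors'
</code></pre></div></div>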

<p>I have logged a ticket with Micron to investigate the failure, but we’d like to get the machine back online as soon as possible. Since the other seven drives have the same firmware, I’m suspicious that another drive or two will fail without warning; therefore, I’m going to rebuild with (extra) redundancy.</p>

<p>There is already a reasonable partition table on each of the drives, so I’m going to reuse it. <code class="language-plaintext highlighter-rouge">md127</code> turned out to be the swap space. I’ll create a RAID6 MD array on p2 across the seven healthy drives (~20GB usable, which is enough for root) and a ZFS RAIDz2 pool on p4, giving ~65TB of usable space.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># lsblk</span>
NAME        MAJ:MIN RM  SIZE RO TYPE   MOUNTPOINTS
nvme6n1     259:0    0   14T  0 disk   
nvme5n1     259:1    0   14T  0 disk   
├─nvme5n1p1 259:2    0  512M  0 part   
├─nvme5n1p2 259:3    0    4G  0 part   
│ └─md127     9:127  0   16G  0 raid10 
├─nvme5n1p3 259:4    0    2G  0 part   
└─nvme5n1p4 259:5    0   14T  0 part   
...
</code></pre></div></div>

<p>Remove <code class="language-plaintext highlighter-rouge">md127</code> and clear the disks.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm <span class="nt">--stop</span> /dev/md127
mdadm <span class="nt">--zero-superblock</span> /dev/nvme<span class="o">{</span>0,1,2,3,4,5,7<span class="o">}</span>n1p2
</code></pre></div></div>

<p>Then create the new array and format it with <code class="language-plaintext highlighter-rouge">ext4</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm <span class="nt">--create</span> /dev/md0 <span class="nt">--level</span><span class="o">=</span>6 <span class="nt">--raid-devices</span><span class="o">=</span>7 <span class="se">\</span>
  /dev/nvme0n1p2 /dev/nvme1n1p2 /dev/nvme2n1p2 <span class="se">\</span>
  /dev/nvme3n1p2 /dev/nvme4n1p2 /dev/nvme5n1p2 /dev/nvme7n1p2

mkfs.ext4 <span class="nt">-L</span> root /dev/md0
</code></pre></div></div>
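<p>The ZFS RAIDz2 pool on the p4 partitions can be created along the same lines; the pool name <code class="language-plaintext highlighter-rouge">tank</code> and the <code class="language-plaintext highlighter-rouge">ashift</code> value are my assumptions:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code># RAIDz2 across the p4 partitions of the seven healthy drives,
# matching the two-disk redundancy of the RAID6 root array.
zpool create -o ashift=12 tank raidz2 \
  /dev/nvme{0,1,2,3,4,5,7}n1p4
</code></pre></div></div>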

<p>Since we are in the recovery console, we can start the installation directly by mounting the new file system and running <code class="language-plaintext highlighter-rouge">debootstrap</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount /dev/md0 /mnt
debootstrap noble /mnt http://archive.ubuntu.com/ubuntu
</code></pre></div></div>

<p>Once that completes, prepare for a <code class="language-plaintext highlighter-rouge">chroot</code> environment by mounting the pseudo-filesystems and the first EFI partition.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount <span class="nt">--bind</span> /dev /mnt/dev
mount <span class="nt">--bind</span> /dev/pts /mnt/dev/pts
mount <span class="nt">--bind</span> /proc /mnt/proc
mount <span class="nt">--bind</span> /sys /mnt/sys
mount <span class="nt">--bind</span> /run /mnt/run

<span class="nb">mkdir</span> <span class="nt">-p</span> /mnt/boot/efi
mount /dev/nvme0n1p1 /mnt/boot/efi
</code></pre></div></div>

<p>Chroot into the new environment.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">chroot</span> /mnt /bin/bash
</code></pre></div></div>

<p>Then inside the <code class="language-plaintext highlighter-rouge">chroot</code>, create <code class="language-plaintext highlighter-rouge">/etc/fstab</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/dev/md0       /           ext4    errors=remount-ro   0 1
/dev/nvme0n1p1 /boot/efi   vfat    umask=0077          0 1
</code></pre></div></div>

<p>And <code class="language-plaintext highlighter-rouge">/etc/apt/sources.list</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>deb http://archive.ubuntu.com/ubuntu noble main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu noble-updates main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu noble-security main restricted universe multiverse
</code></pre></div></div>

<p>Install the kernel, GRUB, the MD and ZFS admin tools, and the OpenSSH server.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt update &amp;&amp; apt install -y linux-image-generic grub-efi-amd64 mdadm zfsutils-linux openssh-server networkd-dispatcher
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/default/grub</code>. This machine uses a serial console on the second serial port (ttyS1).</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">GRUB_DEFAULT</span><span class="o">=</span>0
<span class="nv">GRUB_TIMEOUT</span><span class="o">=</span>5
<span class="nv">GRUB_DISTRIBUTOR</span><span class="o">=</span><span class="s2">"Ubuntu"</span>
<span class="nv">GRUB_CMDLINE_LINUX_DEFAULT</span><span class="o">=</span><span class="s2">""</span>
<span class="nv">GRUB_CMDLINE_LINUX</span><span class="o">=</span><span class="s2">"console=tty0 console=ttyS1,115200n8"</span>
<span class="nv">GRUB_TERMINAL</span><span class="o">=</span><span class="s2">"console serial"</span>
<span class="nv">GRUB_SERIAL_COMMAND</span><span class="o">=</span><span class="s2">"serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1"</span>
</code></pre></div></div>

<p>Enable getty on ttyS1.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl <span class="nb">enable </span>serial-getty@ttyS1.service
</code></pre></div></div>

<p>Update <code class="language-plaintext highlighter-rouge">mdadm.conf</code> so it finds the array at boot.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm <span class="nt">--detail</span> <span class="nt">--scan</span> <span class="o">&gt;&gt;</span> /etc/mdadm/mdadm.conf
update-initramfs <span class="nt">-u</span>
</code></pre></div></div>

<p>Install GRUB to the first EFI partition; for redundancy, it is then copied to the other six.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>grub-install <span class="nt">--target</span><span class="o">=</span>x86_64-efi <span class="nt">--efi-directory</span><span class="o">=</span>/boot/efi <span class="nt">--bootloader-id</span><span class="o">=</span>ubuntu <span class="nt">--recheck</span>
</code></pre></div></div>

<p>Copy the EFI bootloader to the other drives.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for </span>disk <span class="k">in </span>nvme1n1 nvme2n1 nvme3n1 nvme4n1 nvme5n1 nvme7n1<span class="p">;</span> <span class="k">do
  </span><span class="nb">mkdir</span> <span class="nt">-p</span> /tmp/efi
  mount /dev/<span class="k">${</span><span class="nv">disk</span><span class="k">}</span>p1 /tmp/efi
  <span class="nb">cp</span> <span class="nt">-r</span> /boot/efi/EFI /tmp/efi/
  umount /tmp/efi
<span class="k">done</span>
</code></pre></div></div>

<p>Update the GRUB installation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>update-grub
</code></pre></div></div>

<p>Set a root password with <code class="language-plaintext highlighter-rouge">passwd root</code>.</p>

<p>Set the hostname.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> <span class="s2">"pima"</span> <span class="o">&gt;</span> /etc/hostname
</code></pre></div></div>

<p>Create a netplan file, <code class="language-plaintext highlighter-rouge">/etc/netplan/01-netcfg.yaml</code>, to match the configuration of the machine and <code class="language-plaintext highlighter-rouge">chmod</code> it.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">chmod </span>600 /etc/netplan/01-netcfg.yaml
</code></pre></div></div>
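<p>As an illustration only, a minimal static-address file might look like the following. The interface name and all addresses here are assumptions, not this machine’s actual configuration.</p>

```yaml
# Hypothetical example - substitute the real interface name and addresses.
network:
  version: 2
  ethernets:
    eno1:
      addresses: [192.0.2.10/24]
      routes:
        - to: default
          via: 192.0.2.1
      nameservers:
        addresses: [192.0.2.53]
```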

<p>Create a regular user with <code class="language-plaintext highlighter-rouge">sudo</code> access and get their keys from GitHub.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>useradd <span class="nt">-m</span> <span class="nt">-s</span> /bin/bash <span class="nt">-G</span> <span class="nb">sudo </span>username
passwd username
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="nt">-m</span> 0700 /home/username/.ssh
curl <span class="nt">-o</span> /home/username/.ssh/authorized_keys https://github.com/username.keys
<span class="nb">chown </span>username:username /home/username/.ssh /home/username/.ssh/authorized_keys
<span class="nb">chmod </span>0600 /home/username/.ssh/authorized_keys
</code></pre></div></div>

<p>Set the timezone.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">ln</span> <span class="nt">-sf</span> /usr/share/zoneinfo/Europe/London /etc/localtime
</code></pre></div></div>

<p>Set the locale.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install</span> <span class="nt">-y</span> locales
locale-gen en_GB.UTF-8
update-locale <span class="nv">LANG</span><span class="o">=</span>en_GB.UTF-8
</code></pre></div></div>

<p>Exit and reboot.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">exit
</span>umount /mnt/boot/efi
umount /mnt/<span class="o">{</span>dev/pts,dev,proc,sys,run<span class="o">}</span>
umount /mnt
reboot
</code></pre></div></div>

<p>Create the ZFS pool.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zpool create <span class="nt">-f</span> <span class="nt">-o</span> <span class="nv">ashift</span><span class="o">=</span>12 data raidz2 <span class="se">\</span>
  /dev/nvme0n1p4 /dev/nvme1n1p4 /dev/nvme2n1p4 <span class="se">\</span>
  /dev/nvme3n1p4 /dev/nvme4n1p4 /dev/nvme5n1p4 /dev/nvme7n1p4
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ubuntu,zfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[Over the weekend, one of the NVMe drives in pima failed, which brought down the whole system.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Moving to opam 2.5</title><link href="https://www.tunbury.org/2026/01/12/opam-25/" rel="alternate" type="text/html" title="Moving to opam 2.5" /><published>2026-01-12T21:00:00+00:00</published><updated>2026-01-12T21:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/12/opam-25</id><content type="html" xml:base="https://www.tunbury.org/2026/01/12/opam-25/"><![CDATA[<p><a href="https://opam.ocaml.org/blog/opam-2-5-0/">opam 2.5.0</a> was released on 27th November, and this update needs to be propagated through the CI infrastructure. This post mirrors the steps taken for the release of <a href="https://www.tunbury.org/2025/07/30/opam-24/">opam 2.4.1</a>.</p>

<h1 id="base-images">Base Images</h1>

<h2 id="linux">Linux</h2>

<p><a href="https://github.com/ocurrent/docker-base-images">ocurrent/docker-base-images</a></p>

<p>The Linux base images are created using the <a href="https://images.ci.ocaml.org/">Docker base image builder</a>, which uses <a href="https://github.com/ocurrent/ocaml-dockerfile">ocurrent/ocaml-dockerfile</a> to know which versions of opam are available. Antonin submitted <a href="https://github.com/ocurrent/ocaml-dockerfile/pull/249">PR#249</a> with the necessary changes. This was released as v8.3.4.</p>

<p>With v8.3.4 released, <a href="https://github.com/ocurrent/docker-base-images/pull/336">PR#336</a> can be opened to update the pipeline to build images which include opam 2.5. Rebuilding the base images requires a significant amount of time, especially since it’s marked as a low-priority task on the cluster.</p>

<h2 id="macos">macOS</h2>

<p><a href="https://github.com/ocurrent/macos-infra">ocurrent/macos-infra</a></p>

<p>Including opam 2.5 in the macOS base images required <a href="https://github.com/ocurrent/macos-infra/pull/58">PR#58</a>, which adds 2.5 to the list of opam packages to download. There are Ansible playbooks that build the macOS base images and recursively remove the old images and their (ZFS) clones. They take about half an hour per machine. I run the Intel and Apple Silicon updates in parallel, but process each pool one at a time.</p>

<p>The Ansible command is: <code class="language-plaintext highlighter-rouge">ansible-playbook update-ocluster.yml</code></p>

<h2 id="freebsd">FreeBSD</h2>

<p><a href="https://github.com/ocurrent/freebsd-infra">ocurrent/freebsd-infra</a></p>

<p>The FreeBSD update parallels the macOS update, requiring that 2.5 be added to the loop of available versions: <a href="https://github.com/ocurrent/freebsd-infra/pull/20">PR#20</a>.</p>

<p>The Ansible command is: <code class="language-plaintext highlighter-rouge">ansible-playbook update.yml</code></p>

<h2 id="windows-thymecaelumcidev">Windows (thyme.caelum.ci.dev)</h2>

<p><a href="https://github.com/ocurrent/obuilder">ocurrent/obuilder</a></p>

<p>The Windows base images are built using a <code class="language-plaintext highlighter-rouge">Makefile</code> which runs unattended builds of Windows using QEMU virtual machines. The Makefile requires <a href="https://github.com/ocurrent/obuilder/pull/202">PR#202</a>. The build command is <code class="language-plaintext highlighter-rouge">make windows</code>.</p>

<p>Once the new images have been built, stop <code class="language-plaintext highlighter-rouge">ocluster worker</code> and move the new base images into place. The next step is to remove <code class="language-plaintext highlighter-rouge">results/*</code>, as these layers link to the old base images, and to remove <code class="language-plaintext highlighter-rouge">state/*</code> so that obuilder creates a new empty database on startup. Avoid removing <code class="language-plaintext highlighter-rouge">cache/*</code>, as this is the download cache for opam objects.</p>

<p>The unattended installation can be monitored via VNC by connecting to localhost:5900.</p>

<h2 id="openbsd-oreganocaelumcidev">OpenBSD (oregano.caelum.ci.dev)</h2>

<p><a href="https://github.com/ocurrent/obuilder">ocurrent/obuilder</a></p>

<p>The OpenBSD base images are built using the same Makefile used for Windows. There is a separate commit in <a href="https://github.com/ocurrent/obuilder/pull/202">PR#202</a> for the changes needed for OpenBSD, which include moving from OpenBSD 7.6 to 7.7. Run <code class="language-plaintext highlighter-rouge">make openbsd</code>.</p>

<p>Once the new images have been built, stop <code class="language-plaintext highlighter-rouge">ocluster worker</code> and move the new base images into place. The next step is to remove <code class="language-plaintext highlighter-rouge">results/*</code>, as these layers link to the old base images, and to remove <code class="language-plaintext highlighter-rouge">state/*</code> so that obuilder creates a new empty database on startup. Avoid removing <code class="language-plaintext highlighter-rouge">cache/*</code>, as this is the download cache for opam objects.</p>

<p>As with Windows, the unattended installation can be monitored via VNC by connecting to localhost:5900.</p>

<h1 id="ocaml-ci">OCaml-CI</h1>

<p><a href="https://ocaml.ci.dev">OCaml-CI</a> uses <a href="https://github.com/ocurrent/ocaml-dockerfile">ocurrent/ocaml-dockerfile</a> as a submodule, so the submodule needs to be updated to the released version. Edits are needed in <code class="language-plaintext highlighter-rouge">lib/opam_version.ml</code> to include V2_5; then the pipeline in <code class="language-plaintext highlighter-rouge">service/conf.ml</code> needs to be updated to use version 2.5 rather than 2.4 for all the different operating systems. Linux is rather more automated than the others.</p>
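<p>The shape of the change is roughly as follows. This is a sketch only; the actual type and function names in <code class="language-plaintext highlighter-rouge">lib/opam_version.ml</code> may be structured differently.</p>

```ocaml
(* Sketch: extend the version type with the new variant and its string form.
   The real definitions in lib/opam_version.ml may differ. *)
type t = V2_0 | V2_1 | V2_2 | V2_3 | V2_4 | V2_5

let to_string = function
  | V2_0 -> "2.0"
  | V2_1 -> "2.1"
  | V2_2 -> "2.2"
  | V2_3 -> "2.3"
  | V2_4 -> "2.4"
  | V2_5 -> "2.5"
```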

<h1 id="opam-repo-ci">opam-repo-ci</h1>

<p><a href="https://opam.ci.ocaml.org">opam-repo-ci</a> tests using the latest tagged version of opam, which is called <code class="language-plaintext highlighter-rouge">opam-dev</code> within the base images. It also explicitly tests against the latest release in each of the 2.x series. With 2.5 being tagged, this will automatically become the used <code class="language-plaintext highlighter-rouge">dev</code> version once the base images are updated, but over time, 2.5 and the latest tagged version will diverge, so <a href="https://github.com/ocurrent/opam-repo-ci/pull/463">PR#463</a> is needed to ensure we continue to test with the released version of 2.5.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[opam 2.5.0 was released on 27th November, and this update needs to be propagated through the CI infrastructure. This post mirrors the steps taken for the release of opam 2.4.1.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Base Fibonacci</title><link href="https://www.tunbury.org/2026/01/11/base-fibonacci/" rel="alternate" type="text/html" title="Base Fibonacci" /><published>2026-01-11T21:00:00+00:00</published><updated>2026-01-11T21:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/11/base-fibonacci</id><content type="html" xml:base="https://www.tunbury.org/2026/01/11/base-fibonacci/"><![CDATA[<p>In Numberphile’s latest <a href="https://www.youtube.com/watch?v=S5FTe5KP2Cw">video</a>, Tony Padilla does a ‘magic trick’ with Fibonacci numbers and talks about Zeckendorf decompositions, and I had my laptop out even before the video ended.</p>

<p>As a summary of the video, a player is asked to pick a number up to a maximum, and mark down which rows of a table their number appears in.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1,4,6,9,12
2,7,10
3,4,11,12
5,6,7
8,9,10,11,12
</code></pre></div></div>

<p>Let’s say I picked seven as my number; it appears in row 2 and row 4. Then, I can <em>magically</em> work out the original number by adding together the first number of each of those rows: 5 + 2 = 7. The first number in each row of the table is a Fibonacci number.</p>
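<p>The recovery step can be sketched in OCaml (the names <code class="language-plaintext highlighter-rouge">rows</code> and <code class="language-plaintext highlighter-rouge">recover</code> are introduced here for illustration): filter the rows containing the chosen number and sum their first entries.</p>

```ocaml
(* The table from above, one list per row. *)
let rows = [
  [1; 4; 6; 9; 12];
  [2; 7; 10];
  [3; 4; 11; 12];
  [5; 6; 7];
  [8; 9; 10; 11; 12];
]

(* Recover the chosen number: sum the first entry of every row containing it. *)
let recover n =
  rows
  |> List.filter (List.mem n)
  |> List.map List.hd
  |> List.fold_left ( + ) 0
```

For seven, the filter keeps rows 2 and 4, whose first entries are 2 and 5, giving 7 back.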

<p>Every positive integer is the sum of one or more Fibonacci numbers, and there are typically multiple ways to write it. However, the Zeckendorf decomposition gives a unique representation by greedily subtracting the largest possible Fibonacci number. Let’s see that in OCaml.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">to_zeckendorf</span> <span class="n">n</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">rec</span> <span class="n">fibs</span> <span class="n">a</span> <span class="n">b</span> <span class="n">acc</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">a</span> <span class="o">&gt;</span> <span class="n">n</span> <span class="k">then</span> <span class="n">acc</span> <span class="k">else</span> <span class="n">fibs</span> <span class="n">b</span> <span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="n">a</span> <span class="o">::</span> <span class="n">acc</span><span class="p">)</span>
  <span class="k">in</span>
  <span class="k">let</span> <span class="n">fib_list</span> <span class="o">=</span> <span class="n">fibs</span> <span class="mi">1</span> <span class="mi">2</span> <span class="bp">[]</span> <span class="k">in</span>
  
  <span class="k">let</span> <span class="k">rec</span> <span class="n">convert</span> <span class="n">remaining</span> <span class="n">fibs</span> <span class="n">acc</span> <span class="o">=</span>
    <span class="k">match</span> <span class="n">fibs</span> <span class="k">with</span>
    <span class="o">|</span> <span class="bp">[]</span> <span class="o">-&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">acc</span>
    <span class="o">|</span> <span class="n">f</span> <span class="o">::</span> <span class="n">rest</span> <span class="o">-&gt;</span>
        <span class="k">if</span> <span class="n">f</span> <span class="o">&lt;=</span> <span class="n">remaining</span> <span class="k">then</span> <span class="n">convert</span> <span class="p">(</span><span class="n">remaining</span> <span class="o">-</span> <span class="n">f</span><span class="p">)</span> <span class="n">rest</span> <span class="p">(</span><span class="mi">1</span> <span class="o">::</span> <span class="n">acc</span><span class="p">)</span>
        <span class="k">else</span> <span class="n">convert</span> <span class="n">remaining</span> <span class="n">rest</span> <span class="p">(</span><span class="mi">0</span> <span class="o">::</span> <span class="n">acc</span><span class="p">)</span>
  <span class="k">in</span>
  <span class="n">convert</span> <span class="n">n</span> <span class="n">fib_list</span> <span class="bp">[]</span>

<span class="k">let</span> <span class="n">zeck_to_string</span> <span class="n">bits</span> <span class="o">=</span>
  <span class="n">bits</span> <span class="o">|&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="n">string_of_int</span> <span class="o">|&gt;</span> <span class="nn">String</span><span class="p">.</span><span class="n">concat</span> <span class="s2">""</span>
</code></pre></div></div>

<p>Resulting in this binary-ish string representation:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zeck_to_string (to_zeckendorf 7);;
- : string = "1010"
</code></pre></div></div>
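<p>As a quick sanity check, the bits can be summed back to the original number. <code class="language-plaintext highlighter-rouge">from_zeckendorf</code> is a helper introduced here, assuming the same most-significant-first ordering that <code class="language-plaintext highlighter-rouge">to_zeckendorf</code> produces.</p>

```ocaml
(* Hypothetical inverse: rebuild the number from a most-significant-first
   bit list over the Fibonacci sequence 1, 2, 3, 5, 8, ... *)
let from_zeckendorf bits =
  let k = List.length bits in
  (* First k Fibonacci numbers (seeded with 1 and 2), largest first. *)
  let rec fibs a b i acc =
    if i = 0 then acc else fibs b (a + b) (i - 1) (a :: acc)
  in
  List.fold_left2 (fun acc bit f -> acc + (bit * f)) 0 bits (fibs 1 2 k [])
```

So <code class="language-plaintext highlighter-rouge">from_zeckendorf [1; 0; 1; 0]</code> gives 5 + 2 = 7.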

<p>What we really want, though, is the original table so we can play the game with our friends with even larger numbers.</p>

<p>The simplest approach may be to count up while generating the Fibonacci sequence. This looks reasonably efficient. The <code class="language-plaintext highlighter-rouge">max_fibs</code> constant isn’t a big constraint, as the 94th Fibonacci number is the largest which can be represented in an unsigned 64-bit integer, so we will run out of system resources long before that’s an issue.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">fib_table</span> <span class="n">hi</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">max_fibs</span> <span class="o">=</span> <span class="mi">94</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">fibs</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">make</span> <span class="n">max_fibs</span> <span class="mi">0</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">buckets</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">make</span> <span class="n">max_fibs</span> <span class="bp">[]</span> <span class="k">in</span>

  <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="mi">1</span><span class="p">;</span>
  <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="mi">2</span><span class="p">;</span>

  <span class="k">let</span> <span class="k">rec</span> <span class="n">decompose</span> <span class="n">orig</span> <span class="n">remaining</span> <span class="n">i</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">0</span> <span class="k">then</span> <span class="bp">()</span>
    <span class="k">else</span> <span class="k">if</span> <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">&lt;=</span> <span class="n">remaining</span> <span class="k">then</span> <span class="p">(</span>
      <span class="n">buckets</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">orig</span> <span class="o">::</span> <span class="n">buckets</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
      <span class="n">decompose</span> <span class="n">orig</span> <span class="p">(</span><span class="n">remaining</span> <span class="o">-</span> <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">))</span> <span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
    <span class="p">)</span> <span class="k">else</span>
      <span class="n">decompose</span> <span class="n">orig</span> <span class="n">remaining</span> <span class="p">(</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
  <span class="k">in</span>

  <span class="k">let</span> <span class="k">rec</span> <span class="n">go</span> <span class="n">n</span> <span class="n">num_fibs</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">n</span> <span class="o">&gt;</span> <span class="n">hi</span> <span class="k">then</span> <span class="n">num_fibs</span>
    <span class="k">else</span>
      <span class="k">let</span> <span class="n">next</span> <span class="o">=</span> <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">num_fibs</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">num_fibs</span> <span class="o">-</span> <span class="mi">2</span><span class="p">)</span> <span class="k">in</span>
      <span class="k">if</span> <span class="n">n</span> <span class="o">&gt;=</span> <span class="n">next</span> <span class="k">then</span> <span class="p">(</span>
        <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">num_fibs</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">next</span><span class="p">;</span>
        <span class="n">decompose</span> <span class="n">n</span> <span class="n">n</span> <span class="n">num_fibs</span><span class="p">;</span>
        <span class="n">go</span> <span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="n">num_fibs</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
      <span class="p">)</span> <span class="k">else</span> <span class="p">(</span>
        <span class="n">decompose</span> <span class="n">n</span> <span class="n">n</span> <span class="p">(</span><span class="n">num_fibs</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
        <span class="n">go</span> <span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="n">num_fibs</span>
      <span class="p">)</span>
  <span class="k">in</span>

  <span class="k">let</span> <span class="n">num_fibs</span> <span class="o">=</span> <span class="n">go</span> <span class="mi">1</span> <span class="mi">2</span> <span class="k">in</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">init</span> <span class="n">num_fibs</span> <span class="p">(</span><span class="k">fun</span> <span class="n">i</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="o">,</span> <span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">buckets</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">)))</span>
  <span class="o">|&gt;</span> <span class="nn">Array</span><span class="p">.</span><span class="n">to_list</span>
</code></pre></div></div>

<p>Here is the resulting table.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># fib_table 100;;
- : (int * int list) list =
[(1, [1; 4; 6; 9; 12; 14; 17; 19; 22; 25; 27; 30; 33; 35; 38; 40; 43; 46; 48; 51; 53; 56; 59; 61; 64; 67; 69; 72; 74; 77; 80; 82; 85; 88; 90; 93; 95; 98]);
 (2, [2; 7; 10; 15; 20; 23; 28; 31; 36; 41; 44; 49; 54; 57; 62; 65; 70; 75; 78; 83; 86; 91; 96; 99]);
 (3, [3; 4; 11; 12; 16; 17; 24; 25; 32; 33; 37; 38; 45; 46; 50; 51; 58; 59; 66; 67; 71; 72; 79; 80; 87; 88; 92; 93; 100]);
 (5, [5; 6; 7; 18; 19; 20; 26; 27; 28; 39; 40; 41; 52; 53; 54; 60; 61; 62; 73; 74; 75; 81; 82; 83; 94; 95; 96]);
 (8, [8; 9; 10; 11; 12; 29; 30; 31; 32; 33; 42; 43; 44; 45; 46; 63; 64; 65; 66; 67; 84; 85; 86; 87; 88; 97; 98; 99; 100]);
 (13, [13; 14; 15; 16; 17; 18; 19; 20; 47; 48; 49; 50; 51; 52; 53; 54; 68; 69; 70; 71; 72; 73; 74; 75]);
 (21, [21; 22; 23; 24; 25; 26; 27; 28; 29; 30; 31; 32; 33; 76; 77; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88]);
 (34, [34; 35; 36; 37; 38; 39; 40; 41; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54]);
 (55, [55; 56; 57; 58; 59; 60; 61; 62; 63; 64; 65; 66; 67; 68; 69; 70; 71; 72; 73; 74; 75; 76; 77; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88]);
 (89, [89; 90; 91; 92; 93; 94; 95; 96; 97; 98; 99; 100])]
</code></pre></div></div>

<p>The algorithm builds up an array of lists during execution and prints the results at the end. We can’t print out row 1 in the table until the entire range has been evaluated. Upon closer examination of the table, a pattern of ranges emerges. For example, for 8, we have the ranges 8-12, 29-33, 42-46, 63-67, 84-88 and finally 97-100. There must be a pattern.</p>

<p>Here are the Fibonacci numbers less than 12.</p>

<table>
  <thead>
    <tr>
      <th>index</th>
      <th>F(n)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>1</td>
    </tr>
    <tr>
      <td>1</td>
      <td>2</td>
    </tr>
    <tr>
      <td>2</td>
      <td>3</td>
    </tr>
    <tr>
      <td>3</td>
      <td>5</td>
    </tr>
    <tr>
      <td>4</td>
      <td>8</td>
    </tr>
  </tbody>
</table>

<p>We want all numbers whose decomposition contains <code class="language-plaintext highlighter-rouge">1</code>. These are <code class="language-plaintext highlighter-rouge">1</code> itself plus all of <code class="language-plaintext highlighter-rouge">Z + 1</code>, where <code class="language-plaintext highlighter-rouge">Z</code> is any valid combination from <code class="language-plaintext highlighter-rouge">{3, 5, 8}</code>, the subset of the Fibonacci sequence greater than <code class="language-plaintext highlighter-rouge">2</code>. We can’t use <code class="language-plaintext highlighter-rouge">2</code>, as a Zeckendorf decomposition cannot contain consecutive Fibonacci numbers (by definition).</p>

<p>Starting with the highest Fibonacci number in our subset, <code class="language-plaintext highlighter-rouge">8</code>, we cannot use <code class="language-plaintext highlighter-rouge">5</code>, but can use <code class="language-plaintext highlighter-rouge">3</code>, resulting in <code class="language-plaintext highlighter-rouge">8 + 1</code>, <code class="language-plaintext highlighter-rouge">8 + 3 + 1</code>, aka <code class="language-plaintext highlighter-rouge">9</code> and <code class="language-plaintext highlighter-rouge">12</code>. Then, taking our next highest starting number of <code class="language-plaintext highlighter-rouge">5</code>, we have only <code class="language-plaintext highlighter-rouge">5 + 1</code>, aka <code class="language-plaintext highlighter-rouge">6</code> and finally <code class="language-plaintext highlighter-rouge">3 + 1</code> aka <code class="language-plaintext highlighter-rouge">4</code>. The result is <code class="language-plaintext highlighter-rouge">1, 4, 6, 9, 12</code>.</p>

<p>Continuing to the next row in the output, we now need to find all the numbers containing <code class="language-plaintext highlighter-rouge">2</code> which are <code class="language-plaintext highlighter-rouge">Z + 2</code> where Z is <code class="language-plaintext highlighter-rouge">{5, 8}</code>. This results in <code class="language-plaintext highlighter-rouge">8 + 2</code> and <code class="language-plaintext highlighter-rouge">5 + 2</code>, resulting in <code class="language-plaintext highlighter-rouge">2, 7, 10</code>.</p>

<p>This can be written as a recursive algorithm which requires no storage beyond the Fibonacci sequence itself. It prints the numbers as they are generated.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">fib_print</span> <span class="n">hi</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">rec</span> <span class="n">build</span> <span class="n">a</span> <span class="n">b</span> <span class="n">acc</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">a</span> <span class="o">&gt;</span> <span class="n">hi</span> <span class="k">then</span> <span class="nn">Array</span><span class="p">.</span><span class="n">of_list</span> <span class="p">(</span><span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="n">acc</span><span class="p">)</span>
    <span class="k">else</span> <span class="n">build</span> <span class="n">b</span> <span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="n">b</span><span class="p">)</span> <span class="p">(</span><span class="n">a</span> <span class="o">::</span> <span class="n">acc</span><span class="p">)</span>
  <span class="k">in</span>
  <span class="k">let</span> <span class="n">fibs</span> <span class="o">=</span> <span class="n">build</span> <span class="mi">1</span> <span class="mi">2</span> <span class="bp">[]</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">n</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">fibs</span> <span class="k">in</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">iteri</span> <span class="p">(</span><span class="k">fun</span> <span class="n">k</span> <span class="n">fk</span> <span class="o">-&gt;</span>
    <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"%d:"</span> <span class="n">fk</span><span class="p">;</span>
    <span class="k">let</span> <span class="k">rec</span> <span class="n">go</span> <span class="n">idx</span> <span class="n">value</span> <span class="n">prev_used</span> <span class="o">=</span>
      <span class="k">if</span> <span class="n">fk</span> <span class="o">+</span> <span class="n">value</span> <span class="o">&gt;</span> <span class="n">hi</span> <span class="k">then</span> <span class="bp">()</span>
      <span class="k">else</span> <span class="k">if</span> <span class="n">idx</span> <span class="o">&lt;</span> <span class="mi">0</span> <span class="k">then</span>
        <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">" %d"</span> <span class="p">(</span><span class="n">fk</span> <span class="o">+</span> <span class="n">value</span><span class="p">)</span>
      <span class="k">else</span> <span class="k">if</span> <span class="n">idx</span> <span class="o">&gt;=</span> <span class="n">k</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">&amp;&amp;</span> <span class="n">idx</span> <span class="o">&lt;=</span> <span class="n">k</span> <span class="o">+</span> <span class="mi">1</span> <span class="k">then</span>
        <span class="n">go</span> <span class="p">(</span><span class="n">idx</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="n">value</span> <span class="bp">false</span>
      <span class="k">else</span> <span class="p">(</span>
        <span class="n">go</span> <span class="p">(</span><span class="n">idx</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="n">value</span> <span class="bp">false</span><span class="p">;</span>
        <span class="k">if</span> <span class="n">not</span> <span class="n">prev_used</span> <span class="k">then</span>
          <span class="n">go</span> <span class="p">(</span><span class="n">idx</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="n">value</span> <span class="o">+</span> <span class="n">fibs</span><span class="o">.</span><span class="p">(</span><span class="n">idx</span><span class="p">))</span> <span class="bp">true</span>
      <span class="p">)</span>
    <span class="k">in</span>
    <span class="n">go</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="mi">0</span> <span class="bp">false</span><span class="p">;</span>
    <span class="n">print_newline</span> <span class="bp">()</span>
  <span class="p">)</span> <span class="n">fibs</span>
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="fibonacci" /><category term="tunbury.org" /><summary type="html"><![CDATA[In Numberphile’s latest video, Tony Padilla does a ‘magic trick’ with Fibonacci numbers and talks about Zeckendorf decompositions, and I had my laptop out even before the video ended.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/base-fibonacci.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/base-fibonacci.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">More OCaml on Pi Pico 2 W</title><link href="https://www.tunbury.org/2026/01/10/ocaml-pico/" rel="alternate" type="text/html" title="More OCaml on Pi Pico 2 W" /><published>2026-01-10T21:00:00+00:00</published><updated>2026-01-10T21:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/10/ocaml-pico</id><content type="html" xml:base="https://www.tunbury.org/2026/01/10/ocaml-pico/"><![CDATA[<p>Extending the Pico 2 implementation to add effects-based WiFi networking and improve the build system.</p>

<h1 id="pio">Pio</h1>

<p>Pio is an effects-based I/O library for OCaml 5 running bare-metal on Raspberry Pi Pico 2 W. It provides an API compatible with <a href="https://github.com/ocaml-multicore/eio">Eio</a>, enabling direct-style concurrent programming with cooperative fibers and non-blocking network I/O. The Pico SDK provides lwIP and CYW43 drivers, but these are not thread-safe.</p>

<p>The code matches the Eio style, for example:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(* Main entry point *)</span>
<span class="nn">Pio</span><span class="p">.</span><span class="n">run</span> <span class="p">(</span><span class="k">fun</span> <span class="n">sw</span> <span class="o">-&gt;</span>
  <span class="c">(* Fork concurrent fibers *)</span>
  <span class="k">let</span> <span class="n">p1</span> <span class="o">=</span> <span class="nn">Pio</span><span class="p">.</span><span class="nn">Fiber</span><span class="p">.</span><span class="n">fork_promise</span> <span class="o">~</span><span class="n">sw</span> <span class="p">(</span><span class="k">fun</span> <span class="bp">()</span> <span class="o">-&gt;</span>
    <span class="nn">Net</span><span class="p">.</span><span class="nn">Tcp</span><span class="p">.</span><span class="n">connect</span> <span class="o">~</span><span class="n">host</span><span class="o">:</span><span class="s2">"example.com"</span> <span class="o">~</span><span class="n">port</span><span class="o">:</span><span class="mi">80</span>
    <span class="o">...</span>
  <span class="p">)</span> <span class="k">in</span>

  <span class="c">(* CPU work on Core 1 *)</span>
  <span class="k">let</span> <span class="n">d</span> <span class="o">=</span> <span class="nn">Domain</span><span class="p">.</span><span class="n">spawn</span> <span class="p">(</span><span class="k">fun</span> <span class="bp">()</span> <span class="o">-&gt;</span> <span class="n">heavy_computation</span> <span class="bp">()</span><span class="p">)</span> <span class="k">in</span>

  <span class="c">(* Await results *)</span>
  <span class="k">let</span> <span class="n">result1</span> <span class="o">=</span> <span class="nn">Pio</span><span class="p">.</span><span class="nn">Promise</span><span class="p">.</span><span class="n">await_exn</span> <span class="n">p1</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">result2</span> <span class="o">=</span> <span class="nn">Domain</span><span class="p">.</span><span class="n">join</span> <span class="n">d</span> <span class="k">in</span>
  <span class="o">...</span>
<span class="p">)</span>
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  ┌─────────────────────────────────────────────────────────────┐
  │                    Pico 2 W (RP2350)                        │
  ├─────────────────────────────┬───────────────────────────────┤
  │         Core 0              │           Core 1              │
  │  ┌───────────────────────┐  │  ┌──────────────────────────┐ │
  │  │    Pio Scheduler      │  │  │    Domain.spawn          │ │
  │  │  ┌─────┐ ┌─────┐      │  │  │  ┌──────────────────┐    │ │
  │  │  │Fiber│ │Fiber│ ...  │  │  │  │ Pure computation │    │ │
  │  │  └──┬──┘ └──┬──┘      │  │  │  │ (no effects)     │    │ │
  │  │     └───┬───┘         │  │  │  └──────────────────┘    │ │
  │  │         ▼             │  │  └──────────────────────────┘ │
  │  │   Effect Handlers     │  │              │                │
  │  │   (Fork, Await,       │  │              │                │
  │  │    Tcp_*, Udp_*)      │  │              │                │
  │  └───────────────────────┘  │              │                │
  │            │                │              │                │
  │            ▼                │              │                │
  │    lwIP + CYW43 WiFi        │       Domain.join             │
  └─────────────────────────────┴───────────────────────────────┘
</code></pre></div></div>

<h1 id="build-system">Build system</h1>

<p>In a chance conversation with David, he was surprised that I had needed so much manual effort to complete the build. He pointed out the compiler&#8217;s <code class="language-plaintext highlighter-rouge">-output-obj</code> command-line option.</p>

<p>Compiling with <code class="language-plaintext highlighter-rouge">-output-obj -without-runtime</code> automatically provides <code class="language-plaintext highlighter-rouge">caml_program</code>, <code class="language-plaintext highlighter-rouge">caml_globals</code>, <code class="language-plaintext highlighter-rouge">caml_code_segments</code> and <code class="language-plaintext highlighter-rouge">caml_exn_*</code>, all of which I had stubbed, as well as <code class="language-plaintext highlighter-rouge">caml_frametable</code>, which I had covered with <code class="language-plaintext highlighter-rouge">frametable.S</code>, and the <code class="language-plaintext highlighter-rouge">caml_curry*</code> and <code class="language-plaintext highlighter-rouge">caml_apply*</code> functions, which I had manually created from <code class="language-plaintext highlighter-rouge">curry.ml</code>.</p>

<p>This results in a single OCaml compilation step, followed by a linking step combining <code class="language-plaintext highlighter-rouge">ocaml_code.o</code> with <code class="language-plaintext highlighter-rouge">libasmrun.a</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/mtelvers/ocaml/ocamlopt.opt <span class="se">\</span>
    <span class="nt">-I</span> /home/mtelvers/ocaml/stdlib <span class="se">\</span>
    /home/mtelvers/ocaml/stdlib/stdlib.cmxa <span class="se">\</span>
    <span class="nt">-farch</span> armv8-m.main <span class="nt">-ffpu</span> soft <span class="nt">-fthumb</span> <span class="se">\</span>
    <span class="nt">-output-obj</span> <span class="nt">-without-runtime</span> <span class="se">\</span>
    <span class="nt">-o</span> <span class="k">${</span><span class="nv">CMAKE_CURRENT_BINARY_DIR</span><span class="k">}</span>/ocaml_code.o <span class="se">\</span>
    net.ml pio.ml hello.ml
</code></pre></div></div>

<p>The only disadvantage this gave me was that it used slightly more memory than before. The increased memory requirement came from properly initialising all the stdlib modules, where I had been selective before.</p>

<p>The space was recovered by reducing <code class="language-plaintext highlighter-rouge">POOL_WSIZE</code>, the allocation size for major heap pools.</p>

<ol>
  <li>Module initialisation creates OCaml values (closures, data structures, etc.)</li>
  <li>These values are first allocated in the minor heap (8KB per domain)</li>
  <li>When the minor heap fills up, or during GC, surviving objects are promoted to the major heap</li>
  <li>The major heap grows by allocating pools of <code class="language-plaintext highlighter-rouge">POOL_WSIZE</code> words (reduced from 16KB to 8KB)</li>
  <li>Multiple objects from multiple modules share pools</li>
</ol>

<p>Objects are packed into pools by size class, which is where the saving is made. By default, there are 32 size classes, and objects of different sizes cannot share the same pool; thus, there can be underutilised pools. On a normal system this wouldn’t matter, but with only 520KB of RAM it is significant.</p>

<p>With <code class="language-plaintext highlighter-rouge">POOL_WSIZE</code> at 4096 words (16K), 17 pools were created for a total of 272K; with the smaller 8K (2048-word) pools, 20 pools are allocated but only 160K is used.</p>
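<p>The arithmetic behind those figures can be checked directly. The RP2350 is a 32-bit core, so a word is 4 bytes and a pool of <code class="language-plaintext highlighter-rouge">POOL_WSIZE</code> words costs <code class="language-plaintext highlighter-rouge">POOL_WSIZE * 4</code> bytes:</p>

```ocaml
(* Sanity-check the pool-memory figures quoted above (RP2350: 4-byte words). *)
let word_bytes = 4

let pool_bytes wsize = wsize * word_bytes

(* total size in KB of [pools] pools of [wsize] words each *)
let total_kib ~pools ~wsize = pools * pool_bytes wsize / 1024

let () =
  (* 17 pools of 4096 words (16 KB each) = 272 KB *)
  assert (pool_bytes 4096 = 16 * 1024);
  assert (total_kib ~pools:17 ~wsize:4096 = 272);
  (* 20 pools of 2048 words (8 KB each) = 160 KB *)
  assert (total_kib ~pools:20 ~wsize:2048 = 160)
```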

<p>The change is made by editing <code class="language-plaintext highlighter-rouge">let arena = 2048</code> in <code class="language-plaintext highlighter-rouge">tools/gen_sizeclasses.ml</code> and regenerating <code class="language-plaintext highlighter-rouge">runtime/caml/sizeclasses.h</code>. The blocksizes function (lines 35-47) recursively builds size classes from 128 down to 1, adding a new size class whenever the overhead would exceed 10.1%. The change increases the number of size classes from 32 to 35.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,pico" /><category term="tunbury.org" /><summary type="html"><![CDATA[Extending the Pico 2 implementation to add effects-based WiFi networking and improve the build system.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-pico.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-pico.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ceph Notes</title><link href="https://www.tunbury.org/2026/01/06/ceph-notes/" rel="alternate" type="text/html" title="Ceph Notes" /><published>2026-01-06T12:00:00+00:00</published><updated>2026-01-06T12:00:00+00:00</updated><id>https://www.tunbury.org/2026/01/06/ceph-notes</id><content type="html" xml:base="https://www.tunbury.org/2026/01/06/ceph-notes/"><![CDATA[<p>We now have 209 TB of data on a seven-node Ceph cluster. Here are some further notes on using Ceph.</p>

<h1 id="mounts">Mounts</h1>

<p>To mount a Ceph FS volume, obtain the base64-encoded secret using this command.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cephadm shell <span class="nt">--</span> ceph auth get-key client.admin
</code></pre></div></div>

<p>Then pass that as an option to the <code class="language-plaintext highlighter-rouge">mount</code> command.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount <span class="nt">-o</span> <span class="nv">name</span><span class="o">=</span>admin,secret<span class="o">=</span>&lt;base64-data&gt; <span class="nt">-t</span> ceph &lt;fqdn&gt;:6789:/ /mnt/cephfs
</code></pre></div></div>

<p>You can create additional users using <code class="language-plaintext highlighter-rouge">ceph auth get-or-create client.foo ...</code> with different access permissions.</p>
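<p>For example, a client limited to read-only access could be created with <code class="language-plaintext highlighter-rouge">ceph fs authorize</code>, a convenience wrapper that generates the CephFS capability strings for you (the client name here is illustrative):</p>

```shell
# Create client.foo with read-only capabilities on the root of the cephfs file system
cephadm shell -- ceph fs authorize cephfs client.foo / r
```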

<p>You can provide a comma-separated list of Ceph monitor machines. The client tries to connect to these in sequence to provide redundancy during the initial connection phase. This isn’t for load balancing.</p>
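<p>In the style of the earlier mount command, listing every monitor gives the client a fallback if the first is unreachable:</p>

```shell
mount -o name=admin,secret=<base64-data> -t ceph <fqdn1>:6789,<fqdn2>:6789,<fqdn3>:6789:/ /mnt/cephfs
```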

<p>Once the mount has been set up, the client communicates directly with the metadata server and the individual OSD daemons, bypassing the monitor machine.</p>

<h1 id="subvolumes">Subvolumes</h1>

<p>Our source data is on ZFS, which has a multitude of file system features. It’s worth noting that Ceph FS has <em>subvolumes</em> which provide snapshots, quotas, clone capabilities and namespaces. As in ZFS, these need to be created in advance, which, fortunately, I had done!</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph fs subvolumegroup create cephfs tessera
ceph fs subvolume create cephfs v1 <span class="nt">--group_name</span> tessera
</code></pre></div></div>

<p>These structures do not support arbitrary depths like ZFS; you are limited to a two level hierarchy of subvolume groups and, within that, multiple subvolumes, like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Filesystem
├── Subvolume Group (e.g., "tessera")
│   ├── Subvolume (e.g., "v1")
│   ├── Subvolume (e.g., "v2")
│   └── Subvolume (e.g., "v3")
└── Subvolume Group (e.g., "other-project")
    └── Subvolume (e.g., "data")
</code></pre></div></div>

<p>The subvolumes appear on disk as UUID-named directories, e.g.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@ceph-1:~# <span class="nb">du</span> /mnt/cephfs/
0    /mnt/cephfs/volumes/tessera/v1/dec6285d-84a2-4d34-9e8b-469d1c6180a8
1    /mnt/cephfs/volumes/tessera/v1
1    /mnt/cephfs/volumes/tessera
1    /mnt/cephfs/volumes
1    /mnt/cephfs/
</code></pre></div></div>
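<p>Rather than picking the UUID out of <code class="language-plaintext highlighter-rouge">du</code> output, the full path of a subvolume can be queried directly:</p>

```shell
cephadm shell -- ceph fs subvolume getpath cephfs v1 --group_name tessera
# e.g. /volumes/tessera/v1/dec6285d-84a2-4d34-9e8b-469d1c6180a8
```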

<p>The subvolume path structure is non-negotiable; therefore, I have used symlinks to match the original structure.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">ln</span> <span class="nt">-s</span> ../volumes/tessera/v1/dec6285d-84a2-4d34-9e8b-469d1c6180a8 /mnt/cephfs/tessera/v1
</code></pre></div></div>

<h1 id="copying-data">Copying data</h1>

<p>This Ceph cluster is composed of Scaleway machines, which are interconnected at 1 Gb/s. This is far from ideal, particularly as my source/client machine has 10 Gb/s networking.</p>

<p>The go-to tool for this is <code class="language-plaintext highlighter-rouge">rsync</code>, but the upfront file scan on large directories was extremely slow. <code class="language-plaintext highlighter-rouge">rclone</code> proved more effective in streaming files while scanning the tree simultaneously.</p>

<p>Initially, I mounted the Ceph file system on one of the Ceph machines and used <code class="language-plaintext highlighter-rouge">rclone</code> to copy from the client to that machine’s local mount point. However, this created a bottleneck, as the incoming interface only operates at 1 Gb/s, resulting in a best-case transfer speed of ~100 MB/s. That machine received the data and then retransmitted it to the cluster machine holding the OSD, so the interface was maxed out in both directions. In practice, I saw a maximum write rate of ~70 MB/s.</p>

<p>However, mounting the Ceph cluster directly from the client machine means that the client, with 10 Gb/s networking, can communicate directly with multiple cluster machines.</p>

<p>On the client machine, first install <code class="language-plaintext highlighter-rouge">ceph-common</code> using your package manager. Then, copy <code class="language-plaintext highlighter-rouge">ceph.conf</code> and <code class="language-plaintext highlighter-rouge">ceph.client.admin.keyring</code> from the cluster to the local machine, and finally mount using the earlier commands.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scp root@ceph-1:/etc/ceph/ceph.conf /etc/ceph
scp root@ceph-1:/etc/ceph/ceph.client.admin.keyring /etc/ceph
</code></pre></div></div>

<p>The sync is now between two local mounts on the client machine. I used <code class="language-plaintext highlighter-rouge">rclone</code> again as it still outperformed <code class="language-plaintext highlighter-rouge">rsync</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rclone <span class="nb">sync</span> /data/tessera/v1/ /mnt/cephfs/tessera/v1/ <span class="nt">--transfers</span> 16 <span class="nt">--progress</span> <span class="nt">--stats</span> 10s <span class="nt">--checkers</span> 1024 <span class="nt">--max-backlog</span> 10000000 <span class="nt">--modify-window</span> 1s
</code></pre></div></div>

<p>With this configuration, I saw write speeds of around 350 MB/s.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ceph" /><category term="tunbury.org" /><summary type="html"><![CDATA[We now have 209 TB of data on a seven-node Ceph cluster. Here are some further notes on using Ceph.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ceph-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ceph-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Multi Domain OCaml on Raspberry Pi Pico 2 Microcontroller</title><link href="https://www.tunbury.org/2025/12/31/ocaml-pico/" rel="alternate" type="text/html" title="Multi Domain OCaml on Raspberry Pi Pico 2 Microcontroller" /><published>2025-12-31T17:00:00+00:00</published><updated>2025-12-31T17:00:00+00:00</updated><id>https://www.tunbury.org/2025/12/31/ocaml-pico</id><content type="html" xml:base="https://www.tunbury.org/2025/12/31/ocaml-pico/"><![CDATA[<p>Running OCaml 5 with multicore support on bare-metal Raspberry Pi Pico 2 W (RP2350, ARM Cortex-M33).</p>

<p>The OCaml Arm32 backend, which <a href="https://www.tunbury.org/2025/11/27/ocaml-54-native/">I updated to OCaml 5 Domains</a>, generates ARMv7-A code (Application profile), but the Pico 2’s Cortex-M33 is ARMv8-M (Microcontroller profile). These instruction sets are compatible (both using Thumb-2), but the object file metadata differs. The linker will not mix “A” and “M” profiles.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error: hello.o: conflicting architecture profiles A/M
</code></pre></div></div>

<p>Initially, I worked with the existing Arm32 support, compiling to assembly files from OCaml and then patching them with <code class="language-plaintext highlighter-rouge">sed</code> and reassembling with <code class="language-plaintext highlighter-rouge">arm-none-eabi-as</code> to get a Cortex-M compatible object file.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sed</span> <span class="nt">-e</span> <span class="s1">'s/.arch[[:space:]]*armv7-a/.arch armv8-m.main/'</span> <span class="se">\</span>
    <span class="nt">-e</span> <span class="s1">'s/.fpu[[:space:]]*softvfp/.fpu fpv5-sp-d16/'</span> <span class="se">\</span>
    hello.s.orig <span class="o">&gt;</span> hello.s
</code></pre></div></div>

<p>After a while, I decided to add a new architecture to the ARM backend to avoid the external processing. The Cortex-M33 has a single-precision-only FPU. OCaml’s float type is double-precision (64-bit), so the hardware FPU cannot accelerate OCaml floats. The default Pico SDK linker script copies some code to RAM for faster execution, including the soft-float routines. I have used a custom linker script to put everything in flash to maximise the memory available for the OCaml heap.</p>

<p>Creating a minimal runtime was relatively simple. OCaml’s calling convention puts the function pointer in r7 and calls <code class="language-plaintext highlighter-rouge">caml_c_call</code>. My function calls <code class="language-plaintext highlighter-rouge">blx r7</code> to invoke the actual C function. OCaml expects r8, r10, r11 to hold runtime state, so these are initialised with minimal structures.</p>

<ul>
  <li>r8 - trap_ptr (exception handler)</li>
  <li>r10 - alloc_ptr (allocation pointer)</li>
  <li>r11 - domain_state_ptr (runtime state)</li>
</ul>

<p>Thus, creating a simple program using OCaml syntax was now possible. It was also possible to have recursive functions to calculate a factorial; however, there was no garbage collector, no exception handling, no standard library and no multicore/domain support.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">external</span> <span class="n">pico_print</span> <span class="o">:</span> <span class="kt">string</span> <span class="o">-&gt;</span> <span class="kt">unit</span> <span class="o">=</span> <span class="s2">"pico_print"</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="n">pico_print</span> <span class="s2">"Hello from OCaml!"</span>
</code></pre></div></div>

<p>This limited success, though, was enough to inspire me to push on to the second phase. I added per-core thread-local storage and provided a mapping between pthread and Pico SDK primitives. The Pico SDK does not provide condition variables, so I implemented a simple polling solution.</p>

<p>OCaml’s <code class="language-plaintext highlighter-rouge">Domain.spawn</code> calls <code class="language-plaintext highlighter-rouge">pthread_create()</code>, which now calls <code class="language-plaintext highlighter-rouge">multicore_launch_core1_with_stack()</code> from the Pico SDK. OCaml creates a backup thread which handles stop-the-world GC synchronisation when a domain’s main thread is blocked. On the Pico, I fake the backup thread by only creating a real thread on every other call to <code class="language-plaintext highlighter-rouge">pthread_create()</code>. Since there is no backup thread, in <code class="language-plaintext highlighter-rouge">pthread_cond_wait()</code> and <code class="language-plaintext highlighter-rouge">pthread_mutex_lock()</code>, and even in <code class="language-plaintext highlighter-rouge">_write</code>, I poll the STW interrupt flag to simulate what the backup thread would do on a real OS.</p>

<p>All of Stdlib compiles, but I only initialise 25 modules, which don’t have extensive OS dependencies.</p>

<ul>
  <li>CamlinternalFormatBasics, Stdlib, Either, Sys, Obj, Type</li>
  <li>Atomic, CamlinternalLazy, Lazy, Seq, Option, Pair, Result</li>
  <li>Bool, Char, Uchar, List, Int, Array, Bytes, String, Unit</li>
  <li>Mutex, Condition, Domain</li>
</ul>

<p>The curry functions are generated at link time by the OCaml linker. I am using the Pico SDK linker, <code class="language-plaintext highlighter-rouge">arm-none-eabi-ld</code>, and therefore the curry functions are not generated automatically. The workaround was to create a dummy OCaml file with enough partial applications to force the generation of <code class="language-plaintext highlighter-rouge">caml_curry2-8</code>, extract them to an assembly file, <code class="language-plaintext highlighter-rouge">curry.s</code>, and add that to <code class="language-plaintext highlighter-rouge">libstdlib_pico.a</code> for linking.</p>
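<p>The dummy module can be a handful of partial applications; the sketch below is illustrative rather than the author’s exact file, and exactly which <code class="language-plaintext highlighter-rouge">caml_curryN</code> helper each line triggers is an assumption:</p>

```ocaml
(* Illustrative dummy module: each partial application below makes the
   OCaml native-code linker emit a caml_curryN helper for that arity.
   (The mapping from line to helper is assumed, not verified output.) *)
let add3 a b c = a + b + c
let add8 a b c d e f g h = a + b + c + d + e + f + g + h

let () =
  let p3 = add3 1 in     (* partial application of a 3-ary function *)
  let p8 = add8 1 2 3 in (* partial application of an 8-ary function *)
  assert (p3 2 3 = 6);
  assert (p8 4 5 6 7 8 = 36)
```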

<p>As a test, I used the prime number benchmark I used for the original Arm32 work to count the number of prime numbers less than 1 million and compared the single-core and dual-core performance.</p>

<table>
  <thead>
    <tr>
      <th>Test</th>
      <th>Time</th>
      <th>Primes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Single-core</td>
      <td>21,166 ms</td>
      <td>78,498</td>
    </tr>
    <tr>
      <td>Dual-core</td>
      <td>12,350 ms</td>
      <td>78,498</td>
    </tr>
    <tr>
      <td>Speedup</td>
      <td>1.71x</td>
      <td> </td>
    </tr>
  </tbody>
</table>
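<p>The shape of such a benchmark can be sketched as follows; this is an illustrative reconstruction, not the author’s exact code. The range is split in two, with the upper half counted on the second core via <code class="language-plaintext highlighter-rouge">Domain.spawn</code>:</p>

```ocaml
(* Illustrative two-domain prime count in the spirit of the benchmark. *)
let is_prime n =
  if n < 2 then false
  else
    (* trial division up to sqrt n *)
    let rec go d = d * d > n || (n mod d <> 0 && go (d + 1)) in
    go 2

let count_primes lo hi =
  let c = ref 0 in
  for n = lo to hi do
    if is_prime n then incr c
  done;
  !c

let count_primes_parallel limit =
  let mid = limit / 2 in
  (* the second core handles the upper half while this core does the lower *)
  let d = Domain.spawn (fun () -> count_primes (mid + 1) limit) in
  let low = count_primes 2 mid in
  low + Domain.join d
```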

<p>The code for this project is available in <a href="https://github.com/mtelvers/pico_ocaml">mtelvers/pico_ocaml</a> and <a href="https://github.com/mtelvers/ocaml">mtelvers/ocaml</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,pico" /><category term="tunbury.org" /><summary type="html"><![CDATA[Running OCaml 5 with multicore support on bare-metal Raspberry Pi Pico 2 W (RP2350, ARM Cortex-M33).]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-pico.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-pico.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Partridge Puzzle</title><link href="https://www.tunbury.org/2025/12/17/partridge-puzzle/" rel="alternate" type="text/html" title="Partridge Puzzle" /><published>2025-12-17T23:00:00+00:00</published><updated>2025-12-17T23:00:00+00:00</updated><id>https://www.tunbury.org/2025/12/17/partridge-puzzle</id><content type="html" xml:base="https://www.tunbury.org/2025/12/17/partridge-puzzle/"><![CDATA[<p>Quite often, you install a new package from opam and are keen to jump right in and want a tailored example for your use case. This happened to me during <a href="https://www.tunbury.org/2025/12/12/advent-of-code/">Day 12</a> of <a href="https://adventofcode.com">Advent of Code</a>.</p>

<p>I had identified the <a href="https://opam.ocaml.org/packages/combine/">combine</a> package as potentially suitable to solve the day’s problem of placing shapes in a grid. I cloned the git repository.</p>

<p>I prompted Claude Code with <code class="language-plaintext highlighter-rouge">Please can you read the combine directory</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>● Here's a summary of the combine directory:

  Overview

  Combine is an OCaml library for combinatorics problem solving, created by
  Remy El Sibaie and Jean-Christophe Filliâtre (2012-2014). Licensed under
  LGPL.

  Main Modules

  The library contains four main modules:
  - Dlx: Implements Knuth's dancing links algorithm
  - Zdd: Implements Zero-suppressed binary decision diagrams
  - Emc: Common interface to Dlx and Zdd to solve Exact Matrix Cover problems (+ SAT reduction)
  - Tiling: Converts 2D tiling problems into EMC problems

  Project Structure

  combine/
  ├── src/
  │   ├── main.ml           # CLI entry point
  │   └── lib/              # Core library
  │       ├── dlx.ml/mli    # Dancing links
  │       ├── zdd.ml/mli    # ZDD implementation
  │       ├── emc.ml/mli    # Exact Matrix Cover
  │       ├── tiling.ml/mli # 2D tiling
  │       ├── parser.mly    # Menhir parser
  │       ├── lexer.mll     # OCamllex lexer
  │       └── ...           # Other modules
  ├── tests/                # Test files
  ├── examples/
  │   ├── cmb/              # .cmb tiling problem files (pentominoes, dominos, etc.)
  │   └── lib/              # Example programs (queens, sudoku, color)
  ├── misc/                 # Statistics and test data
  ├── www/                  # Website files
  ├── dune-project          # Dune build (v2.4, uses menhir)
  └── combine.opam          # Package definition

  Dependencies

  - OCaml ≥ 4.07
  - Dune ≥ 2.4
  - Menhir ≥ 20200211
  - Zarith
  - Optional: graphics, ocamlgraph

  Tiling Language

  The project includes a DSL (.cmb files) for describing 2D tiling problems,
   with operations like union, intersection, shift, rotation, and various
  isometries.
</code></pre></div></div>

<p>This is perfect for knowing that you’re on the right track. I continued with Claude and asked, “Please, can you show me an example of how to use the API to tile a 5x3 rectangle with 3 L pentominoes?”</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>● Here's how to use the Tiling API programmatically for your 3 L pentominoes
   example:

  open Combine

  (* Define the L pentomino pattern as a bool matrix *)
  let l_pattern = Tiling.Pattern.create [|
    [| true;  false; false; false |];  (* bottom row: *... *)
    [| true;  true;  true;  true  |];  (* top row:    **** *)
  |]

  (* Create 3 L tiles with ~one multiplicity and all symmetries *)
  let l_tile () =
    Tiling.Tile.create ~name:"L" ~s:Sall ~m:Mone l_pattern

  let tiles = [ l_tile (); l_tile (); l_tile () ]

  (* Define a 5x3 grid (all true = all cells available) *)
  let grid = Tiling.Pattern.create (Array.make_matrix 3 5 true)

  (* Create the problem *)
  let problem = Tiling.Problem.create ~name:"three_Ls" grid tiles

  (* Convert to EMC *)
  let emc = Tiling.Problem.ToEMC.make problem

  (* Create DLX solver *)
  let dlx = Emc.D.create_sparse ~primary:emc.primary ~columns:emc.columns
  emc.emc

  (* Check if solution exists *)
  let has_solution =
    try
      let _ = Emc.D.find_solution dlx in
      true
    with Not_found -&gt;
      false

  (* Or get the solution *)
  let solution = Emc.D.find_solution dlx  (* raises Not_found if none *)

  (* Print ASCII output *)
  let () =
    Format.printf "%a@."
      (Tiling.Problem.ToEMC.print_solution_ascii problem emc)
      solution

  Key API functions:

  | Function                                  | Purpose                                    |
  |-------------------------------------------|--------------------------------------------|
  | Tiling.Pattern.create                     | Create a pattern from bool array array     |
  | Tiling.Tile.create ~s ~m pattern          | Create a tile with symmetries/multiplicity |
  | Tiling.Problem.create grid tiles          | Create a tiling problem                    |
  | Tiling.Problem.ToEMC.make problem         | Convert to EMC matrix                      |
  | Emc.D.create_sparse ~primary ~columns emc | Create DLX solver                          |
  | Emc.D.find_solution dlx                   | Find one solution (raises Not_found)       |
  | Emc.D.count_solutions dlx                 | Count all solution                         |

  Multiplicity options: Minf (unlimited), Mone (exactly one), Mmaybe (zero or one)

  Symmetry options: Snone (no rotations), Spositive (rotations only), Sall (rotations + reflections)
</code></pre></div></div>

<p>This was Advent of Code, so I stopped there (there’s no challenge if I ask Claude to solve the puzzles), but the takeaway is that I was now able to develop a solution quickly based upon this example.</p>

<p>The library didn’t work for me as the search space was too large; in the end, I used a deeply unsatisfying area calculation to get the solution.</p>

<p>Back in the summer, I was introduced to the Partridge Puzzle by <a href="http://standupmaths.com/">Matt Parker’s</a> YouTube video <a href="https://youtu.be/eqyuQZHfNPQ?si=KnQhzQ5gvRkXObSa">The impossible puzzle with over a million solutions!</a>. In the video, Matt used this puzzle as the basis for some new artwork for <a href="https://janestreet.com">Jane Street’s</a> office.</p>

<p>In the puzzle, you need to pack <code class="language-plaintext highlighter-rouge">k</code> squares of size <code class="language-plaintext highlighter-rouge">k x k</code>, <code class="language-plaintext highlighter-rouge">(k-1)</code> squares of size <code class="language-plaintext highlighter-rouge">(k-1) x (k-1)</code>, <code class="language-plaintext highlighter-rouge">(k-2)</code> squares of size <code class="language-plaintext highlighter-rouge">(k-2) x (k-2)</code> … and <code class="language-plaintext highlighter-rouge">1</code> square of size <code class="language-plaintext highlighter-rouge">1x1</code> into a square with side length <code class="language-plaintext highlighter-rouge">k * (k + 1) / 2</code> (the k-th triangular number). The areas match exactly because <code class="language-plaintext highlighter-rouge">1*1^2 + 2*2^2 + ... + k*k^2</code> is the sum of the first k cubes, which equals <code class="language-plaintext highlighter-rouge">(k * (k + 1) / 2)^2</code>. Matt uses <code class="language-plaintext highlighter-rouge">k=9</code> to pack the squares into a <code class="language-plaintext highlighter-rouge">45 x 45 = 2025</code> unit square.</p>

<p>I had originally written a basic DFS solver which placed blocks in the first free square. However, it didn’t find a solution in the time I was prepared to wait. I had tried some clever optimisations, placing things sensibly to avoid narrow gaps, but these were costly to calculate and still didn’t yield a solution.</p>

<p>Claude stepped up and generated the code using the combine library. The EMC/DLX solution was too slow due to the number of symmetrical arrangements. One <code class="language-plaintext highlighter-rouge">9x9</code> square is indistinguishable from another. Next, <em>we</em> tried using the SAT encoding module and passed it to <code class="language-plaintext highlighter-rouge">minisat</code>. After 30 minutes, there was still no solution. Forcing an ordering on the square placements reduced the memory footprint, but still yielded no solution.</p>

<p>Ultimately, I threw all 40 threads of my machine at my basic DFS version, which got a result in under a minute.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Quite often, you install a new package from opam and are keen to jump right in and want a tailored example for your use case. This happened to me during Day 12 of Advent of Code.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/solution.png" /><media:content medium="image" url="https://www.tunbury.org/images/solution.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Advent of Code 2025</title><link href="https://www.tunbury.org/2025/12/12/advent-of-code/" rel="alternate" type="text/html" title="Advent of Code 2025" /><published>2025-12-12T18:00:00+00:00</published><updated>2025-12-12T18:00:00+00:00</updated><id>https://www.tunbury.org/2025/12/12/advent-of-code</id><content type="html" xml:base="https://www.tunbury.org/2025/12/12/advent-of-code/"><![CDATA[<p>With the start of Advent comes a new set of Advent of Code problems. My code is available at <a href="https://github.com/mtelvers/aoc2025">mtelvers/aoc2025</a>.</p>

<h1 id="day-1---secret-entrance">Day 1 - Secret Entrance</h1>

<p>A dial points to 50. Follow the sequence of turns to see how many times it lands on zero. The only gotcha here was that the real input had values &gt; 100.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>L68
L30
R48
L5
R60
L55
L1
L99
R14
L82
</code></pre></div></div>

<p>Part 2 was fiddly as the corner cases needed careful consideration. Landing on zero should be counted, so start with the answer from part 1. Add the number of clicks to turn through divided by 100 (the quotient) to count the full rotations. Then add the cases where turning left by the number of clicks modulo 100 would take the position below zero, and likewise for turning right when it would go above 100.</p>

<p>Note that starting at 0 and turning left 5 does not count as passing zero. So if your zero passing test is <code class="language-plaintext highlighter-rouge">position &lt; value</code>, then this is only true when <code class="language-plaintext highlighter-rouge">position &gt; 0</code>.</p>
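
<p>The per-turn count described above can be sketched like this (my illustration, not the code from my repo, assuming positions 0–99 with <code class="language-plaintext highlighter-rouge">L</code> decreasing and <code class="language-plaintext highlighter-rouge">R</code> increasing the position):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Zero crossings for one turn of [n] clicks from position [pos].
   Each full rotation passes zero once; then check whether the
   remaining partial turn crosses (or lands on) zero. *)
let crossings pos dir n =
  let full = n / 100 in
  let r = n mod 100 in
  let partial =
    match dir with
    | 'L' -&gt; if pos &gt; 0 &amp;&amp; pos - r &lt;= 0 then 1 else 0
    | 'R' -&gt; if pos + r &gt;= 100 then 1 else 0
    | _ -&gt; 0
  in
  full + partial
</code></pre></div></div>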
<h1 id="day-2---gift-shop">Day 2 - Gift Shop</h1>

<p>Find repeating patterns in some number ranges.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>11-22,95-115,998-1012,1188511880-1188511890,222220-222224,
1698522-1698528,446443-446449,38593856-38593862,565653-565659,
824824821-824824827,2121212118-2121212124
</code></pre></div></div>

<h2 id="part-1">Part 1</h2>

<p>I decided right away to use integer comparison rather than converting numbers to strings and then comparing them. The number of digits in an integer when written in base 10 is <code class="language-plaintext highlighter-rouge">1 + int(log10 x)</code>. For part 1, the challenge was to look for exact splits such as <code class="language-plaintext highlighter-rouge">11</code> or <code class="language-plaintext highlighter-rouge">123123</code>; therefore, the length must be even. We can use the divisor <code class="language-plaintext highlighter-rouge">10^(length/2)</code>, test with <code class="language-plaintext highlighter-rouge">x / divisor = x mod divisor</code>, and sum all the numbers where this is true.</p>
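
<p>A sketch of that test (illustrative names, not the code from my repo):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Does [x] consist of two identical halves, e.g. 123123? *)
let digits x = 1 + int_of_float (log10 (float_of_int x))

let rec pow b e = if e = 0 then 1 else b * pow b (e - 1)

let is_doubled x =
  let d = digits x in
  d mod 2 = 0
  &amp;&amp;
  let divisor = pow 10 (d / 2) in
  x / divisor = x mod divisor
</code></pre></div></div>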

<h2 id="part-2">Part 2</h2>

<p>The problem is extended to allow any equal chunking. Thus, <code class="language-plaintext highlighter-rouge">824824824</code> is now valid as it has three chunks of 3 digits. Given the maximum length of a 64-bit integer is 20 digits, we only need the factors of the numbers 1 to 20, which could be entered as a static list. I decided to calculate these in code using a simple division test up to the square root of the number. I should memoise these results to avoid repeated recalculation. Once I had a list of factors, I folded over the list, testing each with a recursive function to verify that each chunk was equal.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">base</span> <span class="o">=</span> <span class="n">pow</span> <span class="mi">10</span> <span class="n">factor</span> <span class="k">in</span>
<span class="k">let</span> <span class="n">modulo</span> <span class="o">=</span> <span class="n">x</span> <span class="ow">mod</span> <span class="n">base</span> <span class="k">in</span>

<span class="k">let</span> <span class="k">rec</span> <span class="n">loop</span> <span class="n">v</span> <span class="o">=</span>
  <span class="k">if</span> <span class="n">v</span> <span class="o">=</span> <span class="mi">0</span> <span class="k">then</span> <span class="bp">true</span>
  <span class="k">else</span> <span class="k">if</span> <span class="n">v</span> <span class="ow">mod</span> <span class="n">base</span> <span class="o">=</span> <span class="n">modulo</span> <span class="k">then</span> <span class="n">loop</span> <span class="p">(</span><span class="n">v</span> <span class="o">/</span> <span class="n">base</span><span class="p">)</span>
  <span class="k">else</span> <span class="bp">false</span>
<span class="k">in</span>

<span class="n">loop</span> <span class="p">(</span><span class="n">x</span> <span class="o">/</span> <span class="n">base</span><span class="p">)</span>
</code></pre></div></div>
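
<p>The factor calculation might look something like this (a sketch by trial division, as described above; names are mine):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Factors of [n] by trial division up to sqrt n; for n = 6 this
   gives [1; 2; 3; 6]. *)
let factors n =
  let rec loop i acc =
    if i * i &gt; n then acc
    else if n mod i = 0 then
      loop (i + 1) (if i * i = n then i :: acc else i :: (n / i) :: acc)
    else loop (i + 1) acc
  in
  List.sort compare (loop 1 [])
</code></pre></div></div>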

<p>The only gotcha I found was that numbers less than 10 came out as true, so I constrained the lower bound of the range to 10.</p>
<h1 id="day-3---lobby">Day 3 - Lobby</h1>

<p>Sum the largest number you can make using N digits from the given sequences. The order of the digits cannot be changed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>987654321111111
811111111111119
234234234234278
818181911112111
</code></pre></div></div>

<h2 id="part-1-1">Part 1</h2>

<p>As it was initially presented, you are only required to consider two digits. As <code class="language-plaintext highlighter-rouge">9_</code> will always be bigger than <code class="language-plaintext highlighter-rouge">8_</code>, this becomes a case of finding the largest digit available, which still leaves one digit. If there is more than one digit left, then pick the largest one. For example, given <code class="language-plaintext highlighter-rouge">818181911112111</code>, the largest first digit is <code class="language-plaintext highlighter-rouge">9</code>, followed by the largest digit in <code class="language-plaintext highlighter-rouge">11112111</code>, which is 2.</p>

<p>I pattern-matched the list of numbers to extract two digits in a recursive loop:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">loop</span> <span class="n">max_left</span> <span class="n">max_right</span> <span class="o">=</span> <span class="k">function</span>
  <span class="o">|</span> <span class="n">l</span> <span class="o">::</span> <span class="n">r</span> <span class="o">::</span> <span class="n">tl</span> <span class="o">-&gt;</span>
    <span class="k">if</span> <span class="n">l</span> <span class="o">&gt;</span> <span class="n">max_left</span> <span class="k">then</span> <span class="n">loop</span> <span class="n">l</span> <span class="n">r</span> <span class="p">(</span><span class="n">r</span> <span class="o">::</span> <span class="n">tl</span><span class="p">)</span>
    <span class="k">else</span> <span class="k">if</span> <span class="n">r</span> <span class="o">&gt;</span> <span class="n">max_right</span> <span class="k">then</span> <span class="n">loop</span> <span class="n">max_left</span> <span class="n">r</span> <span class="p">(</span><span class="n">r</span> <span class="o">::</span> <span class="n">tl</span><span class="p">)</span>
    <span class="k">else</span> <span class="n">loop</span> <span class="n">max_left</span> <span class="n">max_right</span> <span class="p">(</span><span class="n">r</span> <span class="o">::</span> <span class="n">tl</span><span class="p">)</span>
  <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">max_left</span><span class="o">,</span> <span class="n">max_right</span><span class="p">)</span>
<span class="k">in</span>
<span class="k">let</span> <span class="n">l</span><span class="o">,</span> <span class="n">r</span> <span class="o">=</span> <span class="n">loop</span> <span class="mi">0</span> <span class="mi">0</span> <span class="n">bank</span> <span class="k">in</span>
<span class="k">let</span> <span class="n">num</span> <span class="o">=</span> <span class="n">l</span> <span class="o">*</span> <span class="mi">10</span> <span class="o">+</span> <span class="n">r</span>
</code></pre></div></div>

<h2 id="part-2-1">Part 2</h2>

<p>Annoyingly, this changed the problem significantly, as it increased the number length from 2 to 12. My list approach now seemed unworkable, and I switched to using arrays.</p>

<p>Taking <code class="language-plaintext highlighter-rouge">818181911112111</code> as an example, I extracted <code class="language-plaintext highlighter-rouge">8181</code>, leaving 11 digits available, and found the maximum value, which is the first <code class="language-plaintext highlighter-rouge">8</code>. Then I extracted a new subarray, <code class="language-plaintext highlighter-rouge">1818</code>, starting just after the matched digit and leaving 10 digits available. The maximum here is the <code class="language-plaintext highlighter-rouge">8</code> at index 1. The process repeats, finding the maximum in <code class="language-plaintext highlighter-rouge">181</code>, then in <code class="language-plaintext highlighter-rouge">19</code>; finally, all the remaining digits must be taken to reach the required length.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>818181911112111

8181 -&gt; i=0 [i]=8
1818 -&gt; i=1 [i]=8
181 -&gt; i=1 [i]=8
19 -&gt; i=1 [i]=9
1 -&gt; i=0 [i]=1
1 -&gt; i=0 [i]=1
1 -&gt; i=0 [i]=1
1 -&gt; i=0 [i]=1
2 -&gt; i=0 [i]=2
1 -&gt; i=0 [i]=1
1 -&gt; i=0 [i]=1
1 -&gt; i=0 [i]=1
</code></pre></div></div>

<p>This worked out nicely, and I parameterised the function to accept the length of the number required so that I could use this code for part 1 as well.</p>
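
<p>The windowed maximum search can be sketched as follows (my illustration of the approach described above):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Pick [n] digits from [a], preserving order, to form the largest
   number: take the biggest digit in the window that still leaves
   enough digits for the remaining picks. *)
let largest n a =
  let len = Array.length a in
  let rec loop start remaining acc =
    if remaining = 0 then acc
    else begin
      (* the chosen digit must leave [remaining - 1] digits after it *)
      let last = len - remaining in
      let best = ref start in
      for i = start to last do
        if a.(i) &gt; a.(!best) then best := i
      done;
      loop (!best + 1) (remaining - 1) ((acc * 10) + a.(!best))
    end
  in
  loop 0 n 0
</code></pre></div></div>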

<h1 id="day-4---paper-bale-warehouse">Day 4 - Paper Bale Warehouse</h1>

<p>Find the number of <code class="language-plaintext highlighter-rouge">@</code> which have fewer than four <code class="language-plaintext highlighter-rouge">@</code> as neighbours.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>..@@.@@@@.
@@@.@.@.@@
@@@@@.@.@@
@.@@@@..@.
@@.@@@@.@@
.@@@@@@@.@
.@.@.@.@@@
@.@@@.@@@@
.@@@@@@@@.
@.@.@@@.@.
</code></pre></div></div>

<p>I chose to read the input into a Map, which I have used several times before, so I copied my implementation from AoC 2024 Day 10.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">coord</span> <span class="o">=</span> <span class="p">{</span> <span class="n">y</span> <span class="o">:</span> <span class="kt">int</span><span class="p">;</span> <span class="n">x</span> <span class="o">:</span> <span class="kt">int</span> <span class="p">}</span>

<span class="k">module</span> <span class="nc">CoordMap</span> <span class="o">=</span> <span class="nn">Map</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="k">struct</span>
  <span class="k">type</span> <span class="n">t</span> <span class="o">=</span> <span class="n">coord</span>

  <span class="k">let</span> <span class="n">compare</span> <span class="o">=</span> <span class="n">compare</span>
<span class="k">end</span><span class="p">)</span>
</code></pre></div></div>

<p>I set up a list of directions around the centre point.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">neighbours</span> <span class="o">=</span>
  <span class="p">[</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">0</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">1</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">1</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">0</span> <span class="p">};</span>
    <span class="p">{</span> <span class="n">y</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">1</span> <span class="p">};</span>
  <span class="p">]</span>
</code></pre></div></div>

<h2 id="part-1-2">Part 1</h2>

<p>I folded over the map, and where there was an <code class="language-plaintext highlighter-rouge">@</code>, folded over the list of neighbours, counting the number with bales; these could then be summed in the outer fold.</p>

<h2 id="part-2-2">Part 2</h2>

<p>For the second part, the free bales needed to be removed, and then the calculation was repeated, trying again until no more bales could be removed.</p>

<p>At this point, I realised that the map could be simplified to a set, as there is no need to distinguish between the boundary and an empty square.</p>

<p>Therefore, rather than just counting the free bales, I added these to a set which could be subtracted from the original set and iterated.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">part2</span> <span class="n">w</span> <span class="o">=</span>
  <span class="nn">CoordSet</span><span class="p">.</span><span class="n">fold</span>
    <span class="p">(</span><span class="k">fun</span> <span class="n">k</span> <span class="n">acc</span> <span class="o">-&gt;</span> <span class="k">if</span> <span class="n">is_free_bales</span> <span class="n">w</span> <span class="n">k</span> <span class="k">then</span> <span class="nn">CoordSet</span><span class="p">.</span><span class="n">add</span> <span class="n">k</span> <span class="n">acc</span> <span class="k">else</span> <span class="n">acc</span><span class="p">)</span>
    <span class="n">w</span> <span class="nn">CoordSet</span><span class="p">.</span><span class="n">empty</span>
  <span class="o">|&gt;</span> <span class="k">fun</span> <span class="n">free_bales</span> <span class="o">-&gt;</span>
  <span class="k">if</span> <span class="nn">CoordSet</span><span class="p">.</span><span class="n">is_empty</span> <span class="n">free_bales</span> <span class="k">then</span> <span class="nn">CoordSet</span><span class="p">.</span><span class="n">cardinal</span> <span class="n">w</span>
  <span class="k">else</span> <span class="nn">CoordSet</span><span class="p">.</span><span class="n">diff</span> <span class="n">w</span> <span class="n">free_bales</span> <span class="o">|&gt;</span> <span class="n">part2</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"part 2: %i</span><span class="se">\n</span><span class="s2">"</span> <span class="p">(</span><span class="nn">CoordSet</span><span class="p">.</span><span class="n">cardinal</span> <span class="n">warehouse</span> <span class="o">-</span> <span class="n">part2</span> <span class="n">warehouse</span><span class="p">)</span>
</code></pre></div></div>
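
<p>The <code class="language-plaintext highlighter-rouge">is_free_bales</code> helper isn’t shown above; a sketch consistent with how it is used, given the <code class="language-plaintext highlighter-rouge">neighbours</code> list and a <code class="language-plaintext highlighter-rouge">CoordSet</code> over <code class="language-plaintext highlighter-rouge">coord</code>, might be:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* A bale is free when fewer than four of its eight neighbours
   are bales. *)
let is_free_bales w k =
  List.fold_left
    (fun acc d -&gt;
      if CoordSet.mem { y = k.y + d.y; x = k.x + d.x } w then acc + 1
      else acc)
    0 neighbours
  &lt; 4
</code></pre></div></div>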
<h1 id="day-5---cafeteria">Day 5 - Cafeteria</h1>

<p>Count the number of elements from the second list which appear in the list of (inclusive) ranges.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>3-5
10-14
16-20
12-18

1
5
8
11
17
32
</code></pre></div></div>

<h2 id="part-1-3">Part 1</h2>

<p>I read the input data into two variables, <code class="language-plaintext highlighter-rouge">fresh</code> as a list of pairs for the ranges and <code class="language-plaintext highlighter-rouge">ingredients</code> as an int list. For part one, it’s just a case of summing values where the ingredient falls within the range:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">part1</span> <span class="o">=</span>
  <span class="nn">List</span><span class="p">.</span><span class="n">fold_left</span>
    <span class="p">(</span><span class="k">fun</span> <span class="n">f</span> <span class="n">i</span> <span class="o">-&gt;</span>
      <span class="nn">List</span><span class="p">.</span><span class="n">find_opt</span> <span class="p">(</span><span class="k">fun</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">i</span> <span class="o">&gt;=</span> <span class="n">l</span> <span class="o">&amp;&amp;</span> <span class="n">i</span> <span class="o">&lt;=</span> <span class="n">h</span><span class="p">)</span> <span class="n">fresh</span> <span class="o">|&gt;</span> <span class="k">function</span>
      <span class="o">|</span> <span class="nc">Some</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="n">f</span> <span class="o">+</span> <span class="mi">1</span>
      <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="n">f</span><span class="p">)</span>
    <span class="mi">0</span> <span class="n">ingredients</span>
</code></pre></div></div>

<h2 id="part-2-3">Part 2</h2>

<p>Ignoring the second list, count the values represented by the list of ranges. <code class="language-plaintext highlighter-rouge">3-5,10-14</code> would give <code class="language-plaintext highlighter-rouge">3 + 5 = 8</code> values. I didn’t verify this, but it is likely that the actual input ranges aren’t as tidy as the example data. We are told that ranges overlap, but I expect there will be ranges that entirely encompass other ranges, as well as ranges that are immediately adjacent, and so on. I wrote an <code class="language-plaintext highlighter-rouge">add</code> function to add a range to a list of ranges. I think it would have looked better using <code class="language-plaintext highlighter-rouge">type range = { low: int; high: int }</code>, but I’d come this far using pairs.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">add</span> <span class="p">(</span><span class="n">low</span><span class="o">,</span> <span class="n">high</span><span class="p">)</span> <span class="n">t</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">rec</span> <span class="n">loop</span> <span class="n">acc</span> <span class="p">(</span><span class="n">low</span><span class="o">,</span> <span class="n">high</span><span class="p">)</span> <span class="o">=</span> <span class="k">function</span>
    <span class="o">|</span> <span class="bp">[]</span> <span class="o">-&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">rev</span> <span class="p">((</span><span class="n">low</span><span class="o">,</span> <span class="n">high</span><span class="p">)</span> <span class="o">::</span> <span class="n">acc</span><span class="p">)</span>
    <span class="o">|</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">::</span> <span class="n">tl</span> <span class="k">when</span> <span class="n">h</span> <span class="o">+</span> <span class="mi">1</span> <span class="o">&lt;</span> <span class="n">low</span> <span class="o">-&gt;</span> <span class="n">loop</span> <span class="p">((</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">::</span> <span class="n">acc</span><span class="p">)</span> <span class="p">(</span><span class="n">low</span><span class="o">,</span> <span class="n">high</span><span class="p">)</span> <span class="n">tl</span>
    <span class="o">|</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">::</span> <span class="n">tl</span> <span class="k">when</span> <span class="n">high</span> <span class="o">+</span> <span class="mi">1</span> <span class="o">&lt;</span> <span class="n">l</span> <span class="o">-&gt;</span>
        <span class="nn">List</span><span class="p">.</span><span class="n">rev_append</span> <span class="n">acc</span> <span class="p">((</span><span class="n">low</span><span class="o">,</span> <span class="n">high</span><span class="p">)</span> <span class="o">::</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">::</span> <span class="n">tl</span><span class="p">)</span>
    <span class="o">|</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">::</span> <span class="n">tl</span> <span class="o">-&gt;</span> <span class="n">loop</span> <span class="n">acc</span> <span class="p">(</span><span class="n">min</span> <span class="n">l</span> <span class="n">low</span><span class="o">,</span> <span class="n">max</span> <span class="n">h</span> <span class="n">high</span><span class="p">)</span> <span class="n">tl</span>
  <span class="k">in</span>
  <span class="n">loop</span> <span class="bp">[]</span> <span class="p">(</span><span class="n">low</span><span class="o">,</span> <span class="n">high</span><span class="p">)</span> <span class="n">t</span>
</code></pre></div></div>

<p>I wrote some test cases to cover the weird cases not present in the example data.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[] |&gt; add (2, 5) |&gt; add (7, 9);;                 (* simple    [(2, 5); (7, 9)] *)
[] |&gt; add (2, 5) |&gt; add (7, 9) |&gt; add (4, 8);;   (* join      [(2, 9)] *)
[] |&gt; add (2, 5) |&gt; add (7, 9) |&gt; add (1, 10);;  (* encompass [(1, 10)] *)
[] |&gt; add (2, 5) |&gt; add (6, 9);;                 (* adjacent  [(2, 9)] *)
</code></pre></div></div>

<p>With the code tested, part 2 used the <code class="language-plaintext highlighter-rouge">add</code> function to create a combined list, and then summed the difference between the high and low values + 1.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">part2</span> <span class="o">=</span>
  <span class="nn">List</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span><span class="k">fun</span> <span class="n">acc</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">add</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="n">acc</span><span class="p">)</span> <span class="bp">[]</span> <span class="n">fresh</span>
  <span class="o">|&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span><span class="k">fun</span> <span class="n">acc</span> <span class="p">(</span><span class="n">l</span><span class="o">,</span> <span class="n">h</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">acc</span> <span class="o">+</span> <span class="p">(</span><span class="n">h</span> <span class="o">-</span> <span class="n">l</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="mi">0</span>
</code></pre></div></div>
<h1 id="day-6---trash-compactor">Day 6 - Trash Compactor</h1>

<p>Sum the cryptically presented equations.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>123 328  51 64 
 45 64  387 23 
  6 98  215 314
*   +   *   +  
</code></pre></div></div>

<h2 id="part-1-4">Part 1</h2>

<p>Apply the operator at the bottom of the column to the numbers above it and sum the results.</p>

<p>This was a straightforward case of reading a list of lines and splitting each into a list of numbers, giving a kind of matrix. I then used a transpose function and applied each column’s operator with a fold. Note that it’s just addition and multiplication, both of which are commutative.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">transpose</span> <span class="o">=</span> <span class="k">function</span>
  <span class="o">|</span> <span class="bp">[]</span> <span class="o">|</span> <span class="bp">[]</span> <span class="o">::</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="bp">[]</span>
  <span class="o">|</span> <span class="n">rows</span> <span class="o">-&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="nn">List</span><span class="p">.</span><span class="n">hd</span> <span class="n">rows</span> <span class="o">::</span> <span class="n">transpose</span> <span class="p">(</span><span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="nn">List</span><span class="p">.</span><span class="n">tl</span> <span class="n">rows</span><span class="p">)</span>
</code></pre></div></div>
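<p>As a sketch of the per-column step described above, the operator application might look like this (assuming the operator has already been split off from each column; <code class="language-plaintext highlighter-rouge">apply_op</code> is an illustrative name, not the actual code):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* apply a column's operator to its numbers; since + and * are both
   commutative, the order of the values does not matter *)
let apply_op op column =
  match op with
  | '+' -&gt; List.fold_left ( + ) 0 column
  | '*' -&gt; List.fold_left ( * ) 1 column
  | _ -&gt; invalid_arg "apply_op"
</code></pre></div></div>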

<h1 id="part-2-4">Part 2</h1>

<p>It was odd in the original input that sometimes there was one space between the numbers, while other times there were two. This all became clear in part 2, as the problem was reframed so that the numbers themselves were also transposed. Thus, the far-right column was actually <code class="language-plaintext highlighter-rouge">4 + 431 + 623</code>.</p>

<p>Reading the input as characters and transposing it resulted in what is, in effect, the part 1 problem, but the data structure isn’t pretty.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1  *
24  
356 
    
369+
248 
8   
    
 32*
581 
175 
    
623+
431 
  4 
</code></pre></div></div>

<p>I can see that you could write a conversion function to bring both the part 1 structure and the transposed part 2 structure into a standard format, then use the same processing function to sum both datasets, but I didn’t!</p>

<p>I created a <code class="language-plaintext highlighter-rouge">split_last</code> function to split off the last element from each row (list).</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">split_last</span> <span class="o">=</span> <span class="k">function</span>
  <span class="o">|</span> <span class="bp">[]</span> <span class="o">-&gt;</span> <span class="k">assert</span> <span class="bp">false</span>
  <span class="o">|</span> <span class="p">[</span> <span class="n">x</span> <span class="p">]</span> <span class="o">-&gt;</span> <span class="p">([]</span><span class="o">,</span> <span class="n">x</span><span class="p">)</span>
  <span class="o">|</span> <span class="n">x</span> <span class="o">::</span> <span class="n">xs</span> <span class="o">-&gt;</span>
      <span class="k">let</span> <span class="n">init</span><span class="o">,</span> <span class="n">last</span> <span class="o">=</span> <span class="n">split_last</span> <span class="n">xs</span> <span class="k">in</span>
      <span class="p">(</span><span class="n">x</span> <span class="o">::</span> <span class="n">init</span><span class="o">,</span> <span class="n">last</span><span class="p">)</span>
</code></pre></div></div>

<p>This gives me the operator plus a list of characters. The list of characters can be concatenated, trimmed and converted into a number. Then, using an inelegant fold which threads the operator, the intermediate sum and the overall sum, you can calculate the answer.</p>
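<p>As an illustrative sketch (not the exact code), the character-to-number conversion might look like this once <code class="language-plaintext highlighter-rouge">split_last</code> has removed the trailing operator column:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* concatenate a row of characters, trim the padding spaces and
   parse the result as a number *)
let to_number chars =
  List.map (String.make 1) chars |&gt; String.concat "" |&gt; String.trim
  |&gt; int_of_string
</code></pre></div></div>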
<h1 id="day-7---laboratories">Day 7 - Laboratories</h1>

<p>Starting from <code class="language-plaintext highlighter-rouge">S</code>, beam down through the map, splitting at each <code class="language-plaintext highlighter-rouge">^</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.......S.......
...............
.......^.......
...............
......^.^......
...............
.....^.^.^.....
...............
....^.^...^....
...............
...^.^...^.^...
...............
..^...^.....^..
...............
.^.^.^.^.^...^.
...............
</code></pre></div></div>

<p>I read the diagram as a map of <code class="language-plaintext highlighter-rouge">(x,y)</code> coordinates, but in retrospect, a list of arrays might have been a better choice.</p>

<h2 id="part-1-5">Part 1</h2>

<p>In this part, calculate how many times we reach a <code class="language-plaintext highlighter-rouge">^</code>. This is a breadth-first search tracking the number of beams at each iteration. I used a coordinate map to track the beams at each level, which automatically absorbs duplicate beams.</p>
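<p>A single BFS step might be sketched as follows, assuming the grid is accessible through a lookup function and using a coordinate set to absorb the duplicates (the names here are illustrative):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module CoordSet = Set.Make (struct
  type t = int * int

  let compare = compare
end)

(* advance every beam one row down; a '^' splits the beam left and
   right, and the set automatically merges beams landing on the same
   cell *)
let step lookup beams =
  CoordSet.fold
    (fun (x, y) acc -&gt;
      if lookup (x, y + 1) = '^' then
        acc |&gt; CoordSet.add (x - 1, y + 1) |&gt; CoordSet.add (x + 1, y + 1)
      else CoordSet.add (x, y + 1) acc)
    beams CoordSet.empty
</code></pre></div></div>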

<h2 id="part-2-5">Part 2</h2>

<p>This time, follow each possible path and count how many ways there are to get to the end. This is a depth-first search where the trivial algorithm works on the test dataset, but with the actual input, the number of possibilities is too large. Therefore, I added a hash table to memoise the results at each level. With this, all 25 trillion ways are counted in a few milliseconds.</p>
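<p>The memoisation can be sketched like this, assuming a <code class="language-plaintext highlighter-rouge">successors</code> function that returns the next positions reachable from a given point, with an empty list marking the end of the map (names are illustrative):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let memo : (int * int, int) Hashtbl.t = Hashtbl.create 1024

let rec count_paths successors pos =
  match Hashtbl.find_opt memo pos with
  | Some n -&gt; n
  | None -&gt;
      let n =
        match successors pos with
        | [] -&gt; 1 (* reached the bottom: one complete path *)
        | next -&gt;
            List.fold_left (fun acc p -&gt; acc + count_paths successors p) 0 next
      in
      Hashtbl.add memo pos n;
      n
</code></pre></div></div>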
<h1 id="day-8---playground">Day 8 - Playground</h1>

<p>Compute the distance between vectors in 3D space and build them into a graph by linking the closest pairs.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>162,817,812
57,618,57
906,360,560
592,479,940
352,342,300
466,668,158
542,29,236
431,825,988
739,650,466
52,470,668
216,146,977
819,987,18
117,168,530
805,96,715
346,949,466
970,615,88
941,993,340
862,61,35
984,92,344
425,690,689
</code></pre></div></div>

<p>I read the input in as a list of vectors <code class="language-plaintext highlighter-rouge">type vector = { x : float; y : float; z : float }</code>. Next, I computed a list of distances between all the pairs, resulting in a <code class="language-plaintext highlighter-rouge">((vector * vector) * float) list</code>. A network is a set of vectors, and overall, there is a set of networks. I couldn’t decide on the best way to store this, so for expediency, I went with sets.</p>
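<p>The distance itself is just the Euclidean norm; an illustrative sketch of building the sorted pair list (not the exact code):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type vector = { x : float; y : float; z : float }

let distance a b =
  sqrt (((a.x -. b.x) ** 2.) +. ((a.y -. b.y) ** 2.) +. ((a.z -. b.z) ** 2.))

(* all unordered pairs with their distances, closest first *)
let pairs vs =
  let rec combos = function
    | [] -&gt; []
    | v :: rest -&gt; List.map (fun w -&gt; ((v, w), distance v w)) rest @ combos rest
  in
  combos vs |&gt; List.sort (fun (_, d1) (_, d2) -&gt; Float.compare d1 d2)
</code></pre></div></div>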

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Network</span> <span class="o">=</span> <span class="nn">Set</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="k">struct</span>
  <span class="k">type</span> <span class="n">t</span> <span class="o">=</span> <span class="n">vector</span>

  <span class="k">let</span> <span class="n">compare</span> <span class="o">=</span> <span class="n">compare</span>
<span class="k">end</span><span class="p">)</span>

<span class="k">module</span> <span class="nc">NetworkSet</span> <span class="o">=</span> <span class="nn">Set</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="nc">Network</span><span class="p">)</span> 
</code></pre></div></div>

<p>With this, I wrote a function to join two nodes together. This first checks whether either node already exists in any network. If neither node exists, create a new network with those two nodes. If one node exists in any network, then add the other node. If both nodes exist, then union the two networks together. As adding a value to a set is idempotent, it is not necessary to distinguish which value needs to be added: <code class="language-plaintext highlighter-rouge">|&gt; Network.add v1 |&gt; Network.add v2</code></p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">join</span> <span class="n">v1</span> <span class="n">v2</span> <span class="n">acc</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">s1</span><span class="o">,</span> <span class="n">s2</span> <span class="o">=</span>
    <span class="nn">NetworkSet</span><span class="p">.</span><span class="n">partition</span> <span class="p">(</span><span class="k">fun</span> <span class="n">vs</span> <span class="o">-&gt;</span> <span class="nn">Network</span><span class="p">.</span><span class="n">mem</span> <span class="n">v1</span> <span class="n">vs</span> <span class="o">||</span> <span class="nn">Network</span><span class="p">.</span><span class="n">mem</span> <span class="n">v2</span> <span class="n">vs</span><span class="p">)</span> <span class="n">acc</span>
  <span class="k">in</span>  
  <span class="nn">NetworkSet</span><span class="p">.</span><span class="n">singleton</span>
    <span class="p">(</span><span class="k">match</span> <span class="nn">NetworkSet</span><span class="p">.</span><span class="n">cardinal</span> <span class="n">s1</span> <span class="k">with</span>
    <span class="o">|</span> <span class="mi">0</span> <span class="o">-&gt;</span> <span class="nn">Network</span><span class="p">.(</span><span class="n">singleton</span> <span class="n">v1</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="n">v2</span><span class="p">)</span>
    <span class="o">|</span> <span class="mi">1</span> <span class="o">-&gt;</span> <span class="nn">NetworkSet</span><span class="p">.</span><span class="n">choose</span> <span class="n">s1</span> <span class="o">|&gt;</span> <span class="nn">Network</span><span class="p">.</span><span class="n">add</span> <span class="n">v1</span> <span class="o">|&gt;</span> <span class="nn">Network</span><span class="p">.</span><span class="n">add</span> <span class="n">v2</span>
    <span class="o">|</span> <span class="mi">2</span> <span class="o">-&gt;</span> <span class="nn">NetworkSet</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">vs</span> <span class="n">acc</span> <span class="o">-&gt;</span> <span class="nn">Network</span><span class="p">.</span><span class="n">union</span> <span class="n">acc</span> <span class="n">vs</span><span class="p">)</span> <span class="n">s1</span> <span class="nn">Network</span><span class="p">.</span><span class="n">empty</span>
    <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="k">assert</span> <span class="bp">false</span><span class="p">)</span>
  <span class="o">|&gt;</span> <span class="nn">NetworkSet</span><span class="p">.</span><span class="n">union</span> <span class="n">s2</span>

</code></pre></div></div>

<h1 id="part-1-6">Part 1</h1>

<p>Take the first 1000 vector pairs and add them to the <code class="language-plaintext highlighter-rouge">NetworkSet</code>, then convert the <code class="language-plaintext highlighter-rouge">NetworkSet</code> into a list of the size of each network, sort the list, take the first three and fold over them to get the answer.</p>
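<p>This step might be sketched as follows, assuming the answer is the product of the three largest network sizes (the function name is illustrative):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let part1 networks =
  NetworkSet.fold (fun n acc -&gt; Network.cardinal n :: acc) networks []
  |&gt; List.sort (fun a b -&gt; compare b a) (* largest first *)
  |&gt; function
  | a :: b :: c :: _ -&gt; a * b * c
  | _ -&gt; assert false
</code></pre></div></div>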

<h1 id="part-2-6">Part 2</h1>

<p>Continue adding vector pairs until all the vectors are connected, then find the product of the x coordinates of the final two vectors. I used a recursive function to repeatedly add pairs until the size of the network equalled the total number of vectors.</p>
<h1 id="day-9---movie-theatre">Day 9 - Movie Theatre</h1>

<p>The input is a set of vertices. Draw the largest rectangle between any pair.</p>

<p>The vertices were specified as a list.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>7,1
11,1
11,7
9,7
9,5
2,5
2,3
7,3
</code></pre></div></div>

<p>Visually, this is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>..............
.......#...#..
..............
..#....#......
..............
..#......#....
..............
.........#.#..
..............
</code></pre></div></div>

<h2 id="part-1-7">Part 1</h2>

<p>This couldn’t have been easier, particularly following day 8, as the input parser and combination generator are the same. Calculate the area of all the rectangles, then sort the list to find the largest.</p>
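<p>An illustrative sketch of this step (whether the bounds are counted inclusively depends on the puzzle’s definition of a rectangle):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* area of the rectangle spanned by two opposite corners *)
let area ((x1, y1), (x2, y2)) = (abs (x2 - x1) + 1) * (abs (y2 - y1) + 1)

let largest corners =
  List.map area corners |&gt; List.sort (fun a b -&gt; compare b a) |&gt; List.hd
</code></pre></div></div>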

<h2 id="part-2-7">Part 2</h2>

<p>The extension was that the rectangle must be within the polygon defined by the input list of vertices. The input coordinates are in the range 0-100,000 on both x and y; therefore, we must do this mathematically, as the set will be too large.</p>

<p>To test whether polygon A is contained within polygon B, all vertices of A must be inside B, and no edge of A may cross an edge of B.</p>

<p>I used the ray casting algorithm to determine whether a point was in a polygon. Due to the way the coordinate grid works, the code is somewhat messy, as all the boundaries are contained within the shape. I then tested all pairs of edges for crossings, using the cross product to determine whether the endpoints of one segment lie on opposite sides of the infinite line defined by the other.</p>
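<p>The crossing test can be sketched with the cross-product sign check (integer coordinates assumed; the collinear boundary cases that make the real code messy are ignored here):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* which side of the infinite line through a and b does p lie on? *)
let side (ax, ay) (bx, by) (px, py) =
  compare (((bx - ax) * (py - ay)) - ((by - ay) * (px - ax))) 0

(* a proper crossing: each segment's endpoints lie on opposite sides
   of the line through the other segment *)
let crosses (a, b) (c, d) =
  side a b c * side a b d &lt; 0 &amp;&amp; side c d a * side c d b &lt; 0
</code></pre></div></div>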

<h1 id="day-10---factory">Day 10 - Factory</h1>

<p>The input is a pattern of lights, followed by a list of buttons and which lights they turn on and finally a list of counter values.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[.##.] (3) (1,3) (2) (2,3) (0,2) (0,1) {3,5,4,7}
[...#.] (0,2,3,4) (2,3) (0,4) (0,1,2) (1,2,3,4) {7,5,12,7,2}
[.###.#] (0,1,2,3,4) (0,3,4) (0,1,2,4,5) (1,2) {10,11,11,5,10,5}
</code></pre></div></div>

<h1 id="part-1-8">Part 1</h1>

<p>Press the buttons to toggle the lights on/off until you achieve the target pattern. The lights are a target bit pattern (but in reverse order), and the button positions are bit positions. So, <code class="language-plaintext highlighter-rouge">(1,3)</code> means toggle bits 1 and 3. The problem then becomes a breadth-first search through all the possible options. Starting at 0, xor that once for each button, then xor each of those with all the buttons again. This width grows quickly, but there aren’t many bit positions, so it only takes a few iterations to cover all the possible values. I used a set of integers to store the values at each iteration.</p>
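<p>The search can be sketched as follows, assuming each button has been pre-converted into a bit mask, for example <code class="language-plaintext highlighter-rouge">(1,3)</code> becoming <code class="language-plaintext highlighter-rouge">0b1010</code> (names are illustrative):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module IntSet = Set.Make (Int)

(* one BFS level: xor every reachable value with every button mask *)
let step buttons values =
  IntSet.fold
    (fun v acc -&gt;
      List.fold_left (fun acc b -&gt; IntSet.add (v lxor b) acc) acc buttons)
    values IntSet.empty

(* number of presses needed to reach the target pattern *)
let solve buttons target =
  let rec go n values =
    if IntSet.mem target values then n else go (n + 1) (step buttons values)
  in
  go 0 (IntSet.singleton 0)
</code></pre></div></div>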

<h1 id="part-2-8">Part 2</h1>

<p>In part two, there are n counters set to zero; you need to increment the counters until you get to the values specified in the final field of the input data. Pressing button <code class="language-plaintext highlighter-rouge">(1,3)</code> increments counters 1 and 3 by one. You might view this as an extension of the first problem, but since the counter target values range from 1 to 300, the problem depth is too great to be solved naively using a BFS.</p>

<p>Looking at the first example in more detail, I rewrote it like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>btn | 0 1 2 3 | index
----+---------+----
3   | 0 0 0 1 | 5
1,3 | 0 1 0 1 | 4
2   | 0 0 1 0 | 3
2,3 | 0 0 1 1 | 2
0,2 | 1 0 1 0 | 1
0,1 | 1 1 0 0 | 0
----+---------+----
    | 3 5 4 7
</code></pre></div></div>

<p>From that matrix, a set of equations can be written as</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v0 + v1 = 3
v0 + v4 = 5
v1 + v2 + v3 = 4
v2 + v4 + v5 = 7
</code></pre></div></div>

<p>These linear equations need to be solved, and the minimum sum solution found. I used the package <a href="https://opam.ocaml.org/packages/lp/">lp</a> to do this.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span><span class="n">require</span> <span class="s2">"lp"</span><span class="p">;;</span>
<span class="o">#</span><span class="n">require</span> <span class="s2">"lp-glpk"</span><span class="p">;;</span>
<span class="k">open</span> <span class="nc">Lp</span>

<span class="k">let</span> <span class="n">v</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">init</span> <span class="mi">6</span> <span class="p">(</span><span class="k">fun</span> <span class="n">i</span> <span class="o">-&gt;</span> <span class="n">var</span> <span class="o">~</span><span class="n">integer</span><span class="o">:</span><span class="bp">true</span> <span class="p">(</span><span class="nn">Printf</span><span class="p">.</span><span class="n">sprintf</span> <span class="s2">"v%d"</span> <span class="n">i</span><span class="p">))</span>
  
<span class="k">let</span> <span class="n">sum</span> <span class="n">indices</span> <span class="o">=</span> 
  <span class="nn">List</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span><span class="k">fun</span> <span class="n">acc</span> <span class="n">i</span> <span class="o">-&gt;</span> <span class="n">acc</span> <span class="o">++</span> <span class="n">v</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">))</span> <span class="p">(</span><span class="n">c</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span><span class="p">)</span> <span class="n">indices</span> 
  
<span class="k">let</span> <span class="n">obj</span> <span class="o">=</span> <span class="n">minimize</span> <span class="p">(</span><span class="n">sum</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">1</span><span class="p">;</span> <span class="mi">2</span><span class="p">;</span> <span class="mi">3</span><span class="p">;</span> <span class="mi">4</span><span class="p">;</span> <span class="mi">5</span><span class="p">])</span>   <span class="c">(* sum of all variables *)</span>

<span class="k">let</span> <span class="n">constraints</span> <span class="o">=</span> <span class="p">[</span>
  <span class="n">sum</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=~</span> <span class="n">c</span> <span class="mi">3</span><span class="o">.</span><span class="mi">0</span><span class="p">;</span>       <span class="c">(* v0 + v1 = 3 *)</span>
  <span class="n">sum</span> <span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">4</span><span class="p">]</span> <span class="o">=~</span> <span class="n">c</span> <span class="mi">5</span><span class="o">.</span><span class="mi">0</span><span class="p">;</span>       <span class="c">(* v0 + v4 = 5 *)</span>
  <span class="n">sum</span> <span class="p">[</span><span class="mi">1</span><span class="p">;</span> <span class="mi">2</span><span class="p">;</span> <span class="mi">3</span><span class="p">]</span> <span class="o">=~</span> <span class="n">c</span> <span class="mi">4</span><span class="o">.</span><span class="mi">0</span><span class="p">;</span>    <span class="c">(* v1 + v2 + v3 = 4 *)</span>
  <span class="n">sum</span> <span class="p">[</span><span class="mi">2</span><span class="p">;</span> <span class="mi">4</span><span class="p">;</span> <span class="mi">5</span><span class="p">]</span> <span class="o">=~</span> <span class="n">c</span> <span class="mi">7</span><span class="o">.</span><span class="mi">0</span><span class="p">;</span>    <span class="c">(* v2 + v4 + v5 = 7 *)</span>
<span class="p">]</span>
  
<span class="k">let</span> <span class="n">problem</span> <span class="o">=</span> <span class="n">make</span> <span class="n">obj</span> <span class="n">constraints</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="k">match</span> <span class="nn">Lp_glpk</span><span class="p">.</span><span class="n">solve</span> <span class="n">problem</span> <span class="k">with</span>
  <span class="o">|</span> <span class="nc">Ok</span> <span class="p">(</span><span class="n">obj_val</span><span class="o">,</span> <span class="n">xs</span><span class="p">)</span> <span class="o">-&gt;</span>
      <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Minimum: %.2f</span><span class="se">\n</span><span class="s2">"</span> <span class="n">obj_val</span><span class="p">;</span>
      <span class="nn">Array</span><span class="p">.</span><span class="n">iteri</span> <span class="p">(</span><span class="k">fun</span> <span class="n">i</span> <span class="n">var</span> <span class="o">-&gt;</span>
        <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"v%d = %.2f</span><span class="se">\n</span><span class="s2">"</span> <span class="n">i</span> <span class="p">(</span><span class="nn">PMap</span><span class="p">.</span><span class="n">find</span> <span class="n">var</span> <span class="n">xs</span><span class="p">)</span>
      <span class="p">)</span> <span class="n">v</span>
  <span class="o">|</span> <span class="nc">Error</span> <span class="n">msg</span> <span class="o">-&gt;</span>
      <span class="n">print_endline</span> <span class="n">msg</span>
</code></pre></div></div>

<p>This gives the solution as 10.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>GLPK Simplex Optimizer 5.0
4 rows, 6 columns, 10 non-zeros
      0: obj =   0.000000000e+00 inf =   1.900e+01 (4)
      4: obj =   1.000000000e+01 inf =   0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
GLPK Integer Optimizer 5.0
4 rows, 6 columns, 10 non-zeros
6 integer variables, none of which are binary
Integer optimization begins...
Long-step dual simplex will be used
+     4: mip =     not found yet &gt;=              -inf        (1; 0)
+     4: &gt;&gt;&gt;&gt;&gt;   1.000000000e+01 &gt;=   1.000000000e+01   0.0% (1; 0)
+     4: mip =   1.000000000e+01 &gt;=     tree is empty   0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
Minimum: 10.00
v0 = 3.00
v1 = 0.00
v2 = 4.00
v3 = 0.00
v4 = 2.00
v5 = 1.00
</code></pre></div></div>

<p>All that is left is to sum the answer for each line of input.</p>
<h1 id="day-11---reactor">Day 11 - Reactor</h1>

<p>Count the number of paths to traverse a graph.</p>

<h1 id="part-1-9">Part 1</h1>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aaa: you hhh
you: bbb ccc
bbb: ddd eee
ccc: ddd eee fff
ddd: ggg
eee: out
fff: out
ggg: out
hhh: ccc fff iii
iii: out
</code></pre></div></div>

<p>In the first part, the task was to count the number of ways to get from <code class="language-plaintext highlighter-rouge">you</code> to <code class="language-plaintext highlighter-rouge">out</code>. There aren’t many, so a simple depth-first search worked out of the box.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Outputs</span> <span class="o">=</span> <span class="nn">Set</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="nc">String</span><span class="p">)</span>
<span class="k">module</span> <span class="nc">Racks</span> <span class="o">=</span> <span class="nn">Map</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="nc">String</span><span class="p">)</span>

<span class="k">let</span> <span class="k">rec</span> <span class="n">dfs</span> <span class="o">=</span> <span class="k">function</span>
  <span class="o">|</span> <span class="s2">"out"</span> <span class="o">-&gt;</span> <span class="mi">1</span>
  <span class="o">|</span> <span class="n">r</span> <span class="o">-&gt;</span> <span class="nn">Outputs</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">o</span> <span class="n">acc</span> <span class="o">-&gt;</span> <span class="n">acc</span> <span class="o">+</span> <span class="n">dfs</span> <span class="n">o</span><span class="p">)</span> <span class="p">(</span><span class="nn">Racks</span><span class="p">.</span><span class="n">find</span> <span class="n">r</span> <span class="n">racks</span><span class="p">)</span> <span class="mi">0</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="n">dfs</span> <span class="s2">"you"</span> <span class="o">|&gt;</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Part 1: %i</span><span class="se">\n</span><span class="s2">"</span>
</code></pre></div></div>

<h1 id="part-2-9">Part 2</h1>

<p>Unusually, the examples for the second part gave new data; however, the puzzle input was the same. The new example data removed the <code class="language-plaintext highlighter-rouge">you</code> node and added an <code class="language-plaintext highlighter-rouge">svr</code> node. The question is now: how many ways are there from <code class="language-plaintext highlighter-rouge">svr</code> to <code class="language-plaintext highlighter-rouge">out</code> that pass through both <code class="language-plaintext highlighter-rouge">fft</code> and <code class="language-plaintext highlighter-rouge">dac</code>?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>svr: aaa bbb
aaa: fft
fft: ccc
bbb: tty
tty: ccc
ccc: ddd eee
ddd: hub
hub: fff
eee: dac
dac: fff
fff: ggg hhh
ggg: out
hhh: out
</code></pre></div></div>

<p>On my actual dataset, the number of ways from <code class="language-plaintext highlighter-rouge">svr</code> to <code class="language-plaintext highlighter-rouge">out</code> was vast (45 quadrillion), so we definitely need memoisation. The key here was to realise that it was a DAG and so either <code class="language-plaintext highlighter-rouge">dac</code> to <code class="language-plaintext highlighter-rouge">fft</code> was possible or <code class="language-plaintext highlighter-rouge">fft</code> to <code class="language-plaintext highlighter-rouge">dac</code> was possible, but not both.</p>

<p>Using a DFS, I calculated the number of paths between the key components and simplified the graph to four nodes. Since <code class="language-plaintext highlighter-rouge">dac</code> to <code class="language-plaintext highlighter-rouge">fft</code> has zero paths, the path must be <code class="language-plaintext highlighter-rouge">svr</code> to <code class="language-plaintext highlighter-rouge">fft</code> to <code class="language-plaintext highlighter-rouge">dac</code> to <code class="language-plaintext highlighter-rouge">out</code>. Thus the solution is <code class="language-plaintext highlighter-rouge">1 * 1 * 2 = 2</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                   ┌─────┐
                   │ svr │
                   └──┬──┘
            ┌─────────┴─────────┐
            │                   │
          2 │                   │ 1
            │                   │
            ▼         0         ▼
         ┌─────┐ ──────────► ┌─────┐
         │ dac │      1      │ fft │
         └──┬──┘ ◄────────── └──┬──┘
            │                   │
          2 │                   │ 4
            │                   │
            │      ┌─────┐      │
            └────► │ out │ ◄────┘
                   └─────┘
</code></pre></div></div>
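<p>With the simplified graph, the answer is the sum over the two possible orderings, which might be computed as follows (where <code class="language-plaintext highlighter-rouge">count a b</code> is assumed to return the memoised path count between two nodes):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* in a DAG, at most one of the two orderings has non-zero paths *)
let total count =
  (count "svr" "fft" * count "fft" "dac" * count "dac" "out")
  + (count "svr" "dac" * count "dac" "fft" * count "fft" "out")
</code></pre></div></div>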

<h1 id="day-12---christmas-tree-farm">Day 12 - Christmas Tree Farm</h1>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0:
###
##.
##.

1:
###
##.
.##

2:
.##
###
##.

3:
##.
###
##.

4:
###
#..
###

5:
###
.#.
###

4x4: 0 0 0 0 2 0
12x5: 1 0 1 0 2 2
12x5: 1 0 1 0 3 2
</code></pre></div></div>

<p>This is a packing problem. Given this input, <code class="language-plaintext highlighter-rouge">12x5: 1 0 1 0 2 2</code>, take a 12x5 grid and try to place 1 copy of shape 0, 1 copy of shape 2, 2 copies each of shapes 4 and 5.</p>

<p>At face value, this is a variation on the pentominoes problem, and the packing does not need to be complete. Fortunately, I looked at the real dataset before coding up a depth-first search to place the objects.</p>

<p>My first line of actual input is <code class="language-plaintext highlighter-rouge">45x41: 52 43 45 41 47 59</code>, still with 3x3 shapes to be placed. This is a massive problem space. Google has shown that Knuth’s Dancing Links is a common approach for this, and OCaml/opam has a <a href="https://opam.ocaml.org/packages/combine/">combine</a> package that implements this. I read the input data and passed it to the library to solve. However, the problem was too large.</p>

<p>As there are so many ways to pack the shapes, might there always be a solution at this scale? I used a simplistic area calculation to test this. I calculated the area of each shape, multiplied it by the number of copies and compared it to the area of the grid. Rightly or wrongly, this gave the correct answer to the problem on the real dataset (but not on the test input).</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="aoc" /><category term="tunbury.org" /><summary type="html"><![CDATA[With the start of Advent comes a new set of Advent of Code problems. My code is available at mtelvers/aoc2025.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/aoc2025.png" /><media:content medium="image" url="https://www.tunbury.org/images/aoc2025.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ceph Placement Groups</title><link href="https://www.tunbury.org/2025/12/09/ceph-placement-groups/" rel="alternate" type="text/html" title="Ceph Placement Groups" /><published>2025-12-09T12:00:00+00:00</published><updated>2025-12-09T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/12/09/ceph-placement-groups</id><content type="html" xml:base="https://www.tunbury.org/2025/12/09/ceph-placement-groups/"><![CDATA[<p>Better planning leads to less data movement later!</p>

<p>Rather than tracking the placement of every individual object, Ceph hashes objects into placement groups, PGs, and then maps those PGs to Object Storage Daemons, OSDs. A PG is a logical collection of objects that are all stored on the same set of OSDs.</p>

<p>When a pool is created, it has few PGs. In my case, only 1 PG was allocated. As data is written, the autoscaler increases the target number of PGs. For my cluster, 1 became 32, then 128 and then 512. Each time this happens, every PG “splits” into four, and data is remapped to balance the placement across the OSDs. By default, only 5% of data can be misplaced, so the number of active placement groups increases slowly. Each time the amount of misplaced data drops below 5%, more placement groups are created, resulting in more misplaced data, and the cycle continues.</p>

<p>As I am doing a bulk data copy, this behaviour is undesirable. Instead of creating the pool with:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool create mypool erasure &lt;ec-profile&gt;
</code></pre></div></div>

<p>I should have specified the number of PGs.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool create mypool 512 erasure &lt;ec-profile&gt;
</code></pre></div></div>

<p>You can calculate the number of PGs upfront. Firstly, work out your pool size factor:</p>

<ul>
  <li>For a replicated pool, use the replication size</li>
  <li>For an EC pool, use k + m</li>
</ul>

<blockquote>
  <p>Target PGs = (Total OSDs * 100) / pool_size_factor</p>
</blockquote>

<p>In my case, I have 24 OSDs with EC 3+1 (size factor = 4): <code class="language-plaintext highlighter-rouge">(24 * 100) / 4 = 600</code>, which rounds to the nearest power of 2, giving 512.</p>

<p>The “100” appears to be a rule of thumb for target PGs per OSD. I have seen a range of recommended values between 100-200, depending on workload. The division by pool size accounts for the fact that each PG is stored on multiple OSDs.</p>
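<p>As a quick sanity check of the arithmetic, the calculation might be sketched like this (illustrative only; the names are not part of Ceph):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* round an estimate to the nearest power of two *)
let nearest_pow2 n =
  let rec floor_pow2 p = if p * 2 &gt; n then p else floor_pow2 (p * 2) in
  let lo = floor_pow2 1 in
  let hi = lo * 2 in
  if n - lo &lt;= hi - n then lo else hi

let target_pgs ~osds ~pgs_per_osd ~size_factor =
  nearest_pow2 (osds * pgs_per_osd / size_factor)

(* 24 OSDs, 100 PGs per OSD, EC 3+1 gives k + m = 4:
   target_pgs ~osds:24 ~pgs_per_osd:100 ~size_factor:4 = 512 *)
</code></pre></div></div>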

<p>You can set the number retrospectively, or let the autoscaler do it.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool <span class="nb">set </span>cephfs_data pg_num 512
ceph osd pool <span class="nb">set </span>cephfs_data pgp_num 512
</code></pre></div></div>

<p>Right now, I am waiting for 128 PGs to be autoscaled to 512. This could result in data being moved twice. For example, object X is in PG 5 on OSD 1. PG 5 splits, and object X hashes to new PG 133, which CRUSH puts on OSD 3. Subsequently, PG 133 splits, object X hashes to new PG 389, which CRUSH places on OSD 7.</p>

<p>I want to minimise the movement, so I have set the misplaced target ratio to 80%, which allows all the PG splits to occur in one pass.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph config <span class="nb">set </span>mgr target_max_misplaced_ratio 0.80
</code></pre></div></div>

<p>I would not recommend this for a cluster with active users, as the splitting causes a significant amount of I/O and performance degradation. However, all the splits occurred, and now the data is remapping. 52% of the data is misplaced. The recovery rate is ~300MB/s.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ceph" /><category term="tunbury.org" /><summary type="html"><![CDATA[Better planning leads to less data movement later!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ceph-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ceph-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Private repos in OCurrent</title><link href="https://www.tunbury.org/2025/12/05/ocurrent-private-repos/" rel="alternate" type="text/html" title="Private repos in OCurrent" /><published>2025-12-05T11:30:00+00:00</published><updated>2025-12-05T11:30:00+00:00</updated><id>https://www.tunbury.org/2025/12/05/ocurrent-private-repos</id><content type="html" xml:base="https://www.tunbury.org/2025/12/05/ocurrent-private-repos/"><![CDATA[<p><a href="https://github.com/ocurrent/ocurrent">OCurrent</a> has long wanted to access private repositories. You can achieve this by embedding a scoped PAT in the <code class="language-plaintext highlighter-rouge">.git-credentials</code> file, typically within the Docker container; however, this is untidy, to say the least! The approach presented works in cases where a GitHub app is used.</p>

<p>OCurrent authenticates to GitHub using a JWT (JSON Web Token). This token is signed using the application’s RSA private key (from <code class="language-plaintext highlighter-rouge">--github-private-key-file</code>) and contains the app_id. GitHub verifies this signature to confirm it’s really from the GitHub app. OCurrent then calls <code class="language-plaintext highlighter-rouge">get_token</code>, which POSTs to GitHub’s API to get an installation access token. This is a short-lived token (60 min) that can access the repositories the app has permission to access. In summary, OCurrent already has the token, but there is no accessor function.</p>

<p>Git supports the <code class="language-plaintext highlighter-rouge">https://x-access-token:ghs_XXXX@github.com/...</code> access method to pass the password; however, OCurrent displays logs in real-time, so this would show in plain text on the web GUI. You can pass a custom pretty-print function and use it to mask the value. Alternatively, you can pass an environment variable to <code class="language-plaintext highlighter-rouge">git</code>, for example <code class="language-plaintext highlighter-rouge">GIT_CONFIG_PARAMETERS="'http.extraHeader=Authorization: Basic dXNlcjpwYXNz'"</code>.</p>

<p>I have added <code class="language-plaintext highlighter-rouge">get_cached_token</code>, which returns the cached token from the GitHub API plugin. Essentially, this is <code class="language-plaintext highlighter-rouge">let get_cached_token t = t.token</code>. This token then becomes the context parameter for the <code class="language-plaintext highlighter-rouge">git fetch</code> operation, replacing the original <code class="language-plaintext highlighter-rouge">No_context</code>.</p>

<p>The environment variable is created by calling <code class="language-plaintext highlighter-rouge">Base64.encode_string</code> on the <code class="language-plaintext highlighter-rouge">x-access-token:ghs_XXXX</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">make_auth_env</span> <span class="n">token</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">b64</span> <span class="o">=</span> <span class="nn">Base64</span><span class="p">.</span><span class="n">encode_string</span> <span class="p">(</span><span class="s2">"x-access-token:"</span> <span class="o">^</span> <span class="n">token</span><span class="p">)</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">sprintf</span> <span class="s2">"'http.extraHeader=Authorization: Basic %s'"</span> <span class="n">b64</span> <span class="k">in</span>
  <span class="p">[</span><span class="o">|</span> <span class="s2">"GIT_CONFIG_PARAMETERS="</span> <span class="o">^</span> <span class="n">header</span> <span class="o">|</span><span class="p">]</span>
</code></pre></div></div>
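<p>To double-check the header value outside OCaml, the same encoding can be reproduced in a few lines of Python (the <code class="language-plaintext highlighter-rouge">ghs_XXXX</code> value is the placeholder from above, not a real token):</p>

```python
import base64

# Reproduce the Basic auth value that the OCaml make_auth_env builds;
# "ghs_XXXX" is a placeholder, not a real installation token.
b64 = base64.b64encode(b"x-access-token:ghs_XXXX").decode()
print(f"GIT_CONFIG_PARAMETERS='http.extraHeader=Authorization: Basic {b64}'")
```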

<p>The remaining changes in the PR thread the <code class="language-plaintext highlighter-rouge">env</code> parameter through the <code class="language-plaintext highlighter-rouge">git</code> module to the <code class="language-plaintext highlighter-rouge">process</code> module, where it is ultimately passed to <code class="language-plaintext highlighter-rouge">Lwt_process.open_process</code>.</p>

<p>Therefore, considering the example, <code class="language-plaintext highlighter-rouge">doc/examples/github_app.ml</code>, the diff would be:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>   <span class="nn">Github</span><span class="p">.</span><span class="nn">App</span><span class="p">.</span><span class="n">installations</span> <span class="n">app</span> <span class="o">|&gt;</span> <span class="nn">Current</span><span class="p">.</span><span class="n">list_iter</span> <span class="p">(</span><span class="k">module</span> <span class="nn">Github</span><span class="p">.</span><span class="nc">Installation</span><span class="p">)</span> <span class="o">@@</span> <span class="k">fun</span> <span class="n">installation</span> <span class="o">-&gt;</span>
<span class="o">+</span>  <span class="nn">Current</span><span class="p">.</span><span class="n">component</span> <span class="s2">"api"</span> <span class="o">|&gt;</span>
<span class="o">+</span>  <span class="k">let</span><span class="o">**</span> <span class="n">inst</span> <span class="o">=</span> <span class="n">installation</span> <span class="k">in</span>
<span class="o">+</span>  <span class="k">let</span> <span class="n">github</span> <span class="o">=</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Installation</span><span class="p">.</span><span class="n">api</span> <span class="n">inst</span> <span class="k">in</span>
   <span class="k">let</span> <span class="n">repos</span> <span class="o">=</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Installation</span><span class="p">.</span><span class="n">repositories</span> <span class="n">installation</span> <span class="k">in</span>
   <span class="n">repos</span> <span class="o">|&gt;</span> <span class="nn">Current</span><span class="p">.</span><span class="n">list_iter</span> <span class="o">~</span><span class="n">collapse_key</span><span class="o">:</span><span class="s2">"repo"</span> <span class="p">(</span><span class="k">module</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="nc">Repo</span><span class="p">)</span> <span class="o">@@</span> <span class="k">fun</span> <span class="n">repo</span> <span class="o">-&gt;</span>
   <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="nn">Repo</span><span class="p">.</span><span class="n">ci_refs</span> <span class="o">~</span><span class="n">staleness</span><span class="o">:</span><span class="p">(</span><span class="nn">Duration</span><span class="p">.</span><span class="n">of_day</span> <span class="mi">90</span><span class="p">)</span> <span class="n">repo</span>
   <span class="o">|&gt;</span> <span class="nn">Current</span><span class="p">.</span><span class="n">list_iter</span> <span class="p">(</span><span class="k">module</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="nc">Commit</span><span class="p">)</span> <span class="o">@@</span> <span class="k">fun</span> <span class="n">head</span> <span class="o">-&gt;</span>
<span class="o">-</span>  <span class="k">let</span> <span class="n">src</span> <span class="o">=</span> <span class="nn">Git</span><span class="p">.</span><span class="n">fetch</span> <span class="p">(</span><span class="nn">Current</span><span class="p">.</span><span class="n">map</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="nn">Commit</span><span class="p">.</span><span class="n">id</span> <span class="n">head</span><span class="p">)</span> <span class="k">in</span>
<span class="o">+</span>  <span class="k">let</span> <span class="n">token</span> <span class="o">=</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="n">get_cached_token</span> <span class="n">github</span> <span class="k">in</span>
<span class="o">+</span>  <span class="k">let</span> <span class="n">src</span> <span class="o">=</span> <span class="nn">Git</span><span class="p">.</span><span class="n">fetch</span> <span class="o">?</span><span class="n">token</span> <span class="p">(</span><span class="nn">Current</span><span class="p">.</span><span class="n">map</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="nn">Commit</span><span class="p">.</span><span class="n">id</span> <span class="n">head</span><span class="p">)</span> <span class="k">in</span>
   <span class="nn">Docker</span><span class="p">.</span><span class="n">build</span> <span class="o">~</span><span class="n">pool</span> <span class="o">~</span><span class="n">pull</span><span class="o">:</span><span class="bp">false</span> <span class="o">~</span><span class="n">dockerfile</span> <span class="p">(</span><span class="nt">`Git</span> <span class="n">src</span><span class="p">)</span>
   <span class="o">|&gt;</span> <span class="n">check_run_status</span>
   <span class="o">|&gt;</span> <span class="nn">Github</span><span class="p">.</span><span class="nn">Api</span><span class="p">.</span><span class="nn">CheckRun</span><span class="p">.</span><span class="n">set_status</span> <span class="n">head</span> <span class="n">program_name</span>
</code></pre></div></div>

<p>This adds an <code class="language-plaintext highlighter-rouge">api</code> node in the graph for each installation, which is semantically correct as the token is per organisation.</p>

<p>I considered that the token might be stale or uninitialised before the <code class="language-plaintext highlighter-rouge">Git.fetch</code> call, but the only way to get a <code class="language-plaintext highlighter-rouge">Github.Api.Commit.id</code> is through an API call, so the token will always be refreshed. When a webhook is received, it triggers the reevaluation of the graph, which again refreshes the API token.</p>

<p>ref <a href="https://github.com/ocurrent/ocurrent/pull/466">ocurrent/ocurrent PR#466</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[OCurrent has long wanted to access private repositories. You can achieve this by embedding a scoped PAT in the .git-credentials file, typically within the Docker container; however, this is untidy, to say the least! The approach presented works in cases where a GitHub app is used.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Tile Server</title><link href="https://www.tunbury.org/2025/12/02/tessera-stac/" rel="alternate" type="text/html" title="Tile Server" /><published>2025-12-02T20:00:00+00:00</published><updated>2025-12-02T20:00:00+00:00</updated><id>https://www.tunbury.org/2025/12/02/tessera-stac</id><content type="html" xml:base="https://www.tunbury.org/2025/12/02/tessera-stac/"><![CDATA[<p>My throw-away comment at the end of my earlier <a href="https://www.tunbury.org/2025/11/30/tessera-zarr/">post</a> shows my scepticism that the JSON file approach was really viable.</p>

<p>A quick <code class="language-plaintext highlighter-rouge">ls | wc -l</code> shows nearly one million tiles in 2024 alone. We need a different approach. There are already parquet files available, and checking <code class="language-plaintext highlighter-rouge">register.parquet</code>, I can see it has everything we need!</p>

<p>As an alternative, more scalable solution, we could have a server that loads the Parquet files using <a href="https://github.com/mtelvers/arrow">mtelvers/arrow</a>, derived from <a href="https://github.com/LaurentMazare/ocaml-arrow">LaurentMazare/ocaml-arrow</a>, which can respond to queries raised by callbacks from Leaflet, allowing it to draw the required bounding boxes. Ultimately, this could provide links to the Zarr data stored in S3.</p>

<p>It’s a pretty simple API:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">GET /years</code> - Available years</li>
  <li><code class="language-plaintext highlighter-rouge">GET /stats?year=YYYY</code> - Coverage statistics</li>
  <li><code class="language-plaintext highlighter-rouge">GET /tiles?minx=&amp;miny=&amp;maxx=&amp;maxy=&amp;year=&amp;limit=</code> - Tiles in bounding box</li>
  <li><code class="language-plaintext highlighter-rouge">GET /density?year=&amp;resolution=</code> - Tile density grid</li>
</ul>
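<p>For instance, a client can build a bounding-box query like this (a Python sketch; the parameter names follow the list above, and the default limit is my own choice):</p>

```python
from urllib.parse import urlencode

BASE = "https://stac.mint.caelum.ci.dev"

def tiles_url(minx, miny, maxx, maxy, year, limit=100):
    """Build a /tiles query URL for a bounding box and year."""
    q = urlencode({"minx": minx, "miny": miny, "maxx": maxx,
                   "maxy": maxy, "year": year, "limit": limit})
    return f"{BASE}/tiles?{q}"

print(tiles_url(-100.0, 80.0, -99.9, 80.1, 2024))
```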

<p>The code is available at <a href="https://github.com/mtelvers/tile-server">mtelvers/tile-server</a> and currently deployed at <a href="https://stac.mint.caelum.ci.dev">stac.mint.caelum.ci.dev</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="tessera,stac" /><category term="tunbury.org" /><summary type="html"><![CDATA[My throw-away comment at the end of my earlier post shows my scepticism that the JSON file approach was really viable.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/meighen-island.png" /><media:content medium="image" url="https://www.tunbury.org/images/meighen-island.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Keeping your branch up-to-date</title><link href="https://www.tunbury.org/2025/12/01/github-actions/" rel="alternate" type="text/html" title="Keeping your branch up-to-date" /><published>2025-12-01T23:20:00+00:00</published><updated>2025-12-01T23:20:00+00:00</updated><id>https://www.tunbury.org/2025/12/01/github-actions</id><content type="html" xml:base="https://www.tunbury.org/2025/12/01/github-actions/"><![CDATA[<p>My Arm32 branch will quickly go stale and will need to be rebased and tested. Can GitHub Actions do that for me automatically?</p>

<p>Adding a self-hosted runner is pretty straightforward. Go to your repository, then navigate to Settings, Actions, Runners, and click “New self-hosted runner”. Select your OS and architecture, and the customised installation instructions are provided:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create a folder</span>
<span class="nv">$ </span><span class="nb">mkdir </span>actions-runner <span class="o">&amp;&amp;</span> <span class="nb">cd </span>actions-runner
<span class="c"># Download the latest runner package</span>
<span class="nv">$ </span>curl <span class="nt">-o</span> actions-runner-linux-arm-2.329.0.tar.gz <span class="nt">-L</span> https://github.com/actions/runner/releases/download/v2.329.0/actions-runner-linux-arm-2.329.0.tar.gz
<span class="c"># Optional: Validate the hash</span>
<span class="nv">$ </span><span class="nb">echo</span> <span class="s2">"b958284b8af869bd6d3542210fbd23702449182ba1c2b1b1eef575913434f13a  actions-runner-linux-arm-2.329.0.tar.gz"</span> | shasum <span class="nt">-a</span> 256 <span class="nt">-c</span>
<span class="c"># Extract the installer</span>
<span class="nv">$ </span><span class="nb">tar </span>xzf ./actions-runner-linux-arm-2.329.0.tar.gz
</code></pre></div></div>

<p>Then run the configuration as follows:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Create the runner and start the configuration experience</span>
<span class="nv">$ </span>./config.sh <span class="nt">--url</span> https://github.com/mtelvers/ocaml <span class="nt">--token</span> YOUR_TOKEN
<span class="c"># Last step, run it!</span>
<span class="nv">$ </span>./run.sh
</code></pre></div></div>

<p>I chose not to run it directly and instead configured it as a systemd service using:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">sudo</span> ./svc.sh <span class="nb">install</span>
<span class="nv">$ </span><span class="nb">sudo</span> ./svc.sh start
</code></pre></div></div>

<p>My problems began as my Raspbian OS was out of date, and the GitHub runner requires Node.js 20. Runner version 2.303.0, which uses Node.js 16, was still available, so I installed it from <code class="language-plaintext highlighter-rouge">https://github.com/actions/runner/releases/download/v2.303.0/actions-runner-linux-arm-2.303.0.tar.gz</code>. This installation was successful, but it immediately updated itself to 2.329.0, resulting in the same problem.</p>

<p>Adding <code class="language-plaintext highlighter-rouge">--disableupdate</code> to <code class="language-plaintext highlighter-rouge">config.sh</code> prevented this behaviour, but the error message was now terminal:</p>

<blockquote>
  <p>runsvc.sh[20543]: An error occurred: Runner version v2.303.0 is deprecated and cannot receive messages.</p>
</blockquote>

<p>I updated the OS to the latest Raspberry Pi OS (32-bit) based on Debian Trixie, and the installation completed as expected. My runner was now ready.</p>

<p>Scheduled workflows only run on the default branch, so I changed my fork’s default branch to <code class="language-plaintext highlighter-rouge">arm32-multicore</code> and committed a GitHub Action workflow, as shown in <a href="https://gist.github.com/mtelvers/c08b324cab705cf0ad84f04f3e79a9ab">this gist</a>. The workflow checks out my branch, rebases it on <code class="language-plaintext highlighter-rouge">upstream/trunk</code>, builds the compiler and runs the test suite.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[My Arm32 branch will quickly go stale and will need to be rebased and tested. Can GitHub Actions do that for me automatically?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">TESSERA and Zarr</title><link href="https://www.tunbury.org/2025/11/30/tessera-zarr/" rel="alternate" type="text/html" title="TESSERA and Zarr" /><published>2025-11-30T22:50:00+00:00</published><updated>2025-11-30T22:50:00+00:00</updated><id>https://www.tunbury.org/2025/11/30/tessera-zarr</id><content type="html" xml:base="https://www.tunbury.org/2025/11/30/tessera-zarr/"><![CDATA[<p>I’ve been copying the TESSERA data to Cephfs, but what is actually in the files?</p>

<p>There are directories for each tile, which are named <code class="language-plaintext highlighter-rouge">grid_longitude_latitude</code>. Each of these contains two NPY files. Picking one at random, I found these two files covering an area in the Canadian Arctic region.</p>

<table>
  <thead>
    <tr>
      <th>File</th>
      <th>Shape</th>
      <th>Type</th>
      <th>Elements</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>grid_-99.95_80.05.npy</td>
      <td>1119 × 211 × 128</td>
      <td>int8</td>
      <td>~30 million</td>
    </tr>
    <tr>
      <td>grid_-99.95_80.05_scales.npy</td>
      <td>1119 × 211</td>
      <td>float32</td>
      <td>~236k</td>
    </tr>
  </tbody>
</table>

<p>This is quantised data, where the actual values would be: <code class="language-plaintext highlighter-rouge">data[i,j,k] * scales[i,j]</code></p>
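<p>A toy example of the dequantisation (shapes shrunk from 1119 × 211 × 128 to 2 × 2 × 3; the values are invented for illustration):</p>

```python
# int8 embeddings and a per-pixel float scale; real value = data * scale.
data = [[[10, -20, 5], [3, 7, -1]],
        [[0, 1, 2], [120, -128, 64]]]
scales = [[0.05, 0.1],
          [0.2, 0.01]]

values = [[[q * scales[i][j] for q in data[i][j]]
           for j in range(len(data[i]))]
          for i in range(len(data))]

print(values[0][0])  # [0.5, -1.0, 0.25]
```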

<p>There are 128 channels of machine learning data which need to be processed further by a downstream model, but I wanted to “see” it. Claude suggested a <a href="https://github.com/mtelvers/npy-pca">PCA visualisation</a> of the file, which reduces the 128 dimensions to 3 and maps them to RGB values. This is the header image for this post.</p>

<p>The Zarr format is designed for large chunked arrays, especially for use in cloud storage. Rather than being a single file like NPY, it is a directory containing metadata in <code class="language-plaintext highlighter-rouge">.zarray</code>, attributes in <code class="language-plaintext highlighter-rouge">.zattrs</code> and then a series of files like <code class="language-plaintext highlighter-rouge">0.0</code>, <code class="language-plaintext highlighter-rouge">0.1</code>, <code class="language-plaintext highlighter-rouge">1.0</code>, <code class="language-plaintext highlighter-rouge">1.1</code>. Each of those files contains the respective chunk of data. So, if the chunk size is 256, then those four files would contain at most an array of 512 × 512. For example, the scales data above (1119 × 211) would need <code class="language-plaintext highlighter-rouge">0.0</code>, <code class="language-plaintext highlighter-rouge">1.0</code>, <code class="language-plaintext highlighter-rouge">2.0</code>, <code class="language-plaintext highlighter-rouge">3.0</code>, <code class="language-plaintext highlighter-rouge">4.0</code>. Note that there is no <code class="language-plaintext highlighter-rouge">.1</code> file as the second dimension is less than 256; therefore, all the data fits into the <code class="language-plaintext highlighter-rouge">.0</code> file.</p>

<p>For higher dimensions, more dots are added. For example, with a chunk size of 256, chunk <code class="language-plaintext highlighter-rouge">2.1.0</code> would mean:</p>

<ul>
  <li>Dimension 0: chunk 2 - pixels 512-767</li>
  <li>Dimension 1: chunk 1 - pixels 256-511</li>
  <li>Dimension 2: chunk 0 - channels 0-127 (all of them)</li>
</ul>
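<p>The mapping from an element coordinate to its chunk file is just integer division per dimension. A quick sketch, assuming the dot-separated chunk naming described above:</p>

```python
def chunk_key(coords, chunks):
    """Return the Zarr chunk filename holding the given element coordinate."""
    return ".".join(str(c // size) for c, size in zip(coords, chunks))

# Pixel (600, 300), channel 40, with chunk sizes (256, 256, 128):
print(chunk_key((600, 300, 40), (256, 256, 128)))  # 2.1.0
```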

<p>The Zarr format allows the client to request a subset of the full dataset. The smallest element which can be returned is one chunk. Thus, smaller chunks may be better; however, these add more protocol overhead than larger chunks when requesting a large dataset, so a trade-off needs to be made. Zarr also compresses the data. Each dimension can be chunked with a different chunk size. Extra dimensions, such as the year of the dataset, could be incorporated.</p>

<p>Zarr’s real proposition is to allow the client to request “Give me latitude 50-55, longitude 100-110” without concern for the internal structure. However, this requires a unified array, which conflicts with the current structure, where tiles have different pixel dimensions depending on latitude (because longitude degrees shrink toward the poles). The data could be padded with zeros (wasting space), or bands could be created at different latitudes (gaps over the sea?).</p>

<p>I looked at some other <a href="https://planetarycomputer.microsoft.com/catalog?filter=zarr">datasets</a> to see how they handled this problem. Smaller regional datasets covering North America (for example), use a regular 1km grid and ignore distortions. The ERA5 climate data uses variable-sized pixels. It maps the globe to a 1440 x 720 array. <a href="https://confluence.ecmwf.int/display/CKB/ERA5:+What+is+the+spatial+reference">ref</a>. At the Equator, they have 28km per pixel; at 80 degrees latitude, they have 5km per pixel.</p>

<p>Discrete Global Grid Systems, DGGS, exist which divide the sphere into polyhedra, such as Uber’s <a href="https://www.uber.com/en-GB/blog/h3/">H3</a>; however, this doesn’t nicely map over the existing square pixels. The data would need to be resampled, and it’s not clear to me how you would average or interpolate 128 channels of ML data.</p>

<p>Possibly the best approach in the short term would be to provide the tiles as is and include appropriate metadata to describe them. <a href="https://cfconventions.org/">Climate and Forecast (CF) Conventions</a> and <a href="https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3">Attribute Convention for Data Discovery 1-3</a> seem to be the standards and are used in xarray and Planetary Computer.</p>

<p>Anil pointed me to <a href="https://stac.browser.user.eopf.eodc.eu">EOPF Sentinel Zarr Samples Service STAC API</a>. STAC is just a JSON schema convention. We provide a <code class="language-plaintext highlighter-rouge">catalog.json</code> at the top level, which lists the yearly collections. In each year subdirectory, we provide <code class="language-plaintext highlighter-rouge">collection.json</code> that gives a list of each tile’s JSON file. The tile’s JSON file gives the hyperlink to the Zarr storage on S3.</p>
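<p>A minimal sketch of that three-level layout (all identifiers, versions and URLs here are illustrative, not the real catalogue):</p>

```python
import json

# Root catalog -> yearly collection -> per-tile item pointing at Zarr on S3.
# Field values below are placeholders for illustration only.
catalog = {"type": "Catalog", "stac_version": "1.0.0", "id": "tessera",
           "description": "TESSERA embeddings",
           "links": [{"rel": "child", "href": "./2024/collection.json"}]}
item = {"type": "Feature", "stac_version": "1.0.0",
        "id": "grid_-99.95_80.05",
        "bbox": [-100.0, 80.0, -99.9, 80.1],
        "assets": {"embeddings":
                   {"href": "s3://bucket/2024/grid_-99.95_80.05.zarr"}}}

print(json.dumps(catalog["links"][0]))
```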

<p>Using Leaflet to visualise the map with some JavaScript to load the JSON files and extract the bounding boxes, we can fairly easily generate this <a href="https://stac.mint.caelum.ci.dev">map</a>. I do wonder how well that would scale, though.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="tessera,zarr" /><category term="tunbury.org" /><summary type="html"><![CDATA[I’ve been copying the TESSERA data to Cephfs, but what is actually in the files?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/embedding_pca.png" /><media:content medium="image" url="https://www.tunbury.org/images/embedding_pca.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml 5.4 native Arm32 branch</title><link href="https://www.tunbury.org/2025/11/27/ocaml-54-native/" rel="alternate" type="text/html" title="OCaml 5.4 native Arm32 branch" /><published>2025-11-27T22:05:00+00:00</published><updated>2025-11-27T22:05:00+00:00</updated><id>https://www.tunbury.org/2025/11/27/ocaml-54-native</id><content type="html" xml:base="https://www.tunbury.org/2025/11/27/ocaml-54-native/"><![CDATA[<p>Recently, I have been using my Pi Zero (armv6), which has reminded me that OCaml 5 dropped native 32-bit support, and I wondered what it would take to reinstate it.</p>

<p>This started as a bit of tinkering; the Pi Zero is slow with a single CPU, 512MB of RAM and SD card storage. Building OCaml 5.4 takes several hours. I’d make a change in the morning, and leave it to build/fail and come back to it the next day.</p>

<p>There was an obvious candidate to revert, starting with <a href="https://github.com/ocaml/ocaml/pull/11904">PR#11904 Remove arm, i386 native-code backends</a>. However, OCaml had since moved on and cleaned up, so the following changes needed either to be updated to include Arm32 or to be reverted:
<a href="https://github.com/ocaml/ocaml/pull/12242">PR#12242 Refactor the computation of stack frame parameters</a>,
<a href="https://github.com/ocaml/ocaml/pull/12686">PR#12686 Fix the types of C primitives and remove some that are unused</a>, and
<a href="https://github.com/ocaml/ocaml/pull/13119">PR#13119 Introduce a platform-independent header for portable CFI/DWARF constructs</a>.</p>

<p>However, this only restored and updated the original Arm32 code, but that code did not implement multicore. Arm64 support was added in <a href="https://github.com/ocaml/ocaml/pull/10972">PR#10972 Arm64 multicore support</a>, and that was the template for the Arm32 implementation.</p>

<p>For debugging, I used small examples, starting with the factorial example on the homepage <a href="https://ocaml.org">ocaml.org</a>, and then working through my <a href="https://github.com/mtelvers/aoc2024">AOC</a> solutions from last year. I compiled these with <code class="language-plaintext highlighter-rouge">ocamlopt</code> and used <code class="language-plaintext highlighter-rouge">gdb</code> on the resulting code rather than trying to debug a segmentation fault in <code class="language-plaintext highlighter-rouge">ocamlopt.opt</code>. Once the compiler was working, I could use the test suite to identify the remaining issues.</p>

<p>The only test I could not get to run was <code class="language-plaintext highlighter-rouge">tests/parallel/max_domains2.ml</code>, which creates 129 domains. Realistically, this test is too large for a 32-bit machine with very limited memory.</p>

<p>I have used a trivial <a href="https://gist.github.com/mtelvers/def18d646a217c3219ba3e54c6d53bec">prime checker</a> as a benchmark, which broadly shows a 3x speed improvement between native code and byte code, and 3x speed improvement in multicore over single core on a quad core machine.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./ocamlc.opt <span class="nt">-I</span> stdlib <span class="nt">-o</span> bench.byte bench.ml
./ocamlopt.opt <span class="nt">-I</span> stdlib <span class="nt">-o</span> bench.opt bench.ml
hyperfine <span class="s1">'./bench.opt 1'</span> <span class="s1">'./bench.opt 4'</span> <span class="s1">'./bench.byte 1'</span> <span class="s1">'./bench.byte 4'</span>
</code></pre></div></div>

<h4 id="raspberry-pi-2-4-cores-armv7">Raspberry Pi 2 (4 cores, ARMv7)</h4>

<table>
  <thead>
    <tr>
      <th>Mode</th>
      <th>Domains</th>
      <th>Time</th>
      <th>Speedup vs slowest</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Native</td>
      <td>4</td>
      <td>1.61s</td>
      <td>10.3x</td>
    </tr>
    <tr>
      <td>Native</td>
      <td>1</td>
      <td>4.79s</td>
      <td>3.5x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>4</td>
      <td>5.52s</td>
      <td>3.0x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>1</td>
      <td>16.56s</td>
      <td>1.0x</td>
    </tr>
  </tbody>
</table>

<h4 id="raspberry-pi-zero-1-core-armv6">Raspberry Pi Zero (1 core, ARMv6)</h4>

<table>
  <thead>
    <tr>
      <th>Mode</th>
      <th>Domains</th>
      <th>Time</th>
      <th>Speedup vs slowest</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Native</td>
      <td>1</td>
      <td>9.33s</td>
      <td>2.5x</td>
    </tr>
    <tr>
      <td>Native</td>
      <td>4</td>
      <td>9.39s</td>
      <td>2.5x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>4</td>
      <td>23.25s</td>
      <td>1.0x</td>
    </tr>
    <tr>
      <td>Bytecode</td>
      <td>1</td>
      <td>23.38s</td>
      <td>1.0x</td>
    </tr>
  </tbody>
</table>

<p>I have created a tidy commit history on my fork at <a href="https://github.com/mtelvers/ocaml/commits/arm32-multicore/">arm32-multicore</a>, but the actual path was nowhere near this orderly!</p>

<p>If you have a niche requirement and a spare Pi or other 32-bit Arm board, and want to have a play:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/mtelvers/ocaml <span class="nt">-b</span> arm32-multicore
<span class="nb">cd </span>ocaml
./configure <span class="o">&amp;&amp;</span> make world.opt <span class="o">&amp;&amp;</span> make tests
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Recently, I have been using my Pi Zero (armv6), which has reminded me that OCaml 5 dropped native 32-bit support, and I wondered what it would take to reinstate it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Reading the Gas Meter</title><link href="https://www.tunbury.org/2025/11/23/gas-meter/" rel="alternate" type="text/html" title="Reading the Gas Meter" /><published>2025-11-23T18:30:00+00:00</published><updated>2025-11-23T18:30:00+00:00</updated><id>https://www.tunbury.org/2025/11/23/gas-meter</id><content type="html" xml:base="https://www.tunbury.org/2025/11/23/gas-meter/"><![CDATA[<p>My gas supplier has tried and failed to install a smart gas meter, so I’ll give it a go myself.</p>

<p>Numerous videos on YouTube demonstrate pipelines for capturing and processing images with AI, but that is a heavyweight solution for basic image recognition. With a fixed camera, I can compare reference images of each digit against the current capture.</p>

<p>I have placed a Raspberry Pi with a camera module pointing at the gas meter.</p>

<p><img src="/images/gas-meter-camera.png" alt="" /></p>

<p>In an ideal world, my image would be a grid of numbers with 0 = black and 255 = white.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[[  0,   0, 255, 255, 255,   0,   0  ];
 [  0, 255,   0,   0,   0, 255,   0  ];
 [  0,   0,   0,   0,   0, 255,   0  ];
 [  0,   0, 255, 255, 255,   0,   0  ];
 [  0,   0,   0,   0,   0, 255,   0  ];
 [  0, 255,   0,   0,   0, 255,   0  ];
 [  0,   0, 255, 255, 255,   0,   0  ]]
</code></pre></div></div>

<p>This would flatten into a 1D vector.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[ 0; 0; 255; 255; 255; 0; 0; 0; 255; 0; 0; 0; 255; 0; ...]
</code></pre></div></div>
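<p>That flattening step is a one-liner in OCaml; a sketch, assuming the image is held as an <code class="language-plaintext highlighter-rouge">int array array</code>:</p>

```ocaml
(* Flatten a 2D grayscale image into the 1D float vector that the
   distance functions operate on. *)
let flatten (img : int array array) : float array =
  Array.concat (Array.to_list img) |> Array.map float_of_int
```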

<p>Then I could use the Euclidean distance to see how far apart the current image is from each of the reference images:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">euclidean_distance</span> <span class="n">v1</span> <span class="n">v2</span> <span class="o">=</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">mapi</span> <span class="p">(</span><span class="k">fun</span> <span class="n">i</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="n">x</span> <span class="o">-.</span> <span class="n">v2</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">))</span> <span class="o">**</span> <span class="mi">2</span><span class="o">.</span><span class="p">)</span> <span class="n">v1</span>
  <span class="o">|&gt;</span> <span class="nn">Array</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span> <span class="o">+.</span> <span class="p">)</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span>
  <span class="o">|&gt;</span> <span class="n">sqrt</span>
</code></pre></div></div>

<p>However, as the brightness of the images may vary due to reflections from the plastic housing, using the angle between the two vectors would likely be more effective. Cosine similarity ranges from -1 to 1, where 1 means the vectors point in the same direction.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">dot_product</span> <span class="n">v1</span> <span class="n">v2</span> <span class="o">=</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">map2</span> <span class="p">(</span> <span class="o">*.</span> <span class="p">)</span> <span class="n">v1</span> <span class="n">v2</span> <span class="o">|&gt;</span> <span class="nn">Array</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span> <span class="o">+.</span> <span class="p">)</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span>

<span class="k">let</span> <span class="n">magnitude</span> <span class="n">v</span> <span class="o">=</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span><span class="k">fun</span> <span class="n">acc</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="n">acc</span> <span class="o">+.</span> <span class="n">x</span> <span class="o">*.</span> <span class="n">x</span><span class="p">)</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="n">v</span> <span class="o">|&gt;</span> <span class="n">sqrt</span>

<span class="k">let</span> <span class="n">cosine_similarity</span> <span class="n">v1</span> <span class="n">v2</span> <span class="o">=</span>
  <span class="n">dot_product</span> <span class="n">v1</span> <span class="n">v2</span> <span class="o">/.</span> <span class="p">(</span><span class="n">magnitude</span> <span class="n">v1</span> <span class="o">*.</span> <span class="n">magnitude</span> <span class="n">v2</span><span class="p">)</span>
</code></pre></div></div>

<p>My gas meter is the kind where the digits rotate on mechanical wheels, which makes their vertical position vary over time. If I capture the basic area where the digit is, it could be near the top, near the bottom, or anywhere in between, resulting in a wide range of outcomes.</p>

<p>Therefore, I must first find the bounding box of the number. As the numbers are white on a black background, the simplest approach is to find the maximum and minimum brightness levels and set a threshold accordingly. I tested levels from 10% to 90% in steps of 10 and opted for 85%.</p>

<p><img src="/images/gas-threshold-10.png" alt="" /> <img src="/images/gas-threshold-20.png" alt="" /> <img src="/images/gas-threshold-30.png" alt="" /> <img src="/images/gas-threshold-40.png" alt="" /> <img src="/images/gas-threshold-50.png" alt="" /> <img src="/images/gas-threshold-60.png" alt="" /> <img src="/images/gas-threshold-70.png" alt="" /> <img src="/images/gas-threshold-80.png" alt="" /> <img src="/images/gas-threshold-90.png" alt="" /></p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">threshold</span> <span class="o">=</span> <span class="n">min_v</span> <span class="o">+</span> <span class="p">(</span><span class="n">max_v</span> <span class="o">-</span> <span class="n">min_v</span><span class="p">)</span> <span class="o">*</span> <span class="mi">85</span> <span class="o">/</span> <span class="mi">100</span>
</code></pre></div></div>
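<p>Here <code class="language-plaintext highlighter-rouge">min_v</code> and <code class="language-plaintext highlighter-rouge">max_v</code> are the darkest and brightest pixels in the cropped region; one way to compute them (a sketch over the same <code class="language-plaintext highlighter-rouge">int array array</code> representation):</p>

```ocaml
(* Find the minimum and maximum pixel values in a 2D grayscale image,
   from which the 85% threshold is derived. *)
let min_max arr =
  Array.fold_left
    (fun acc row ->
      Array.fold_left (fun (mn, mx) v -> (min mn v, max mx v)) acc row)
    (max_int, min_int) arr
```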

<p>The bounding box can be found by searching for the first row with a bright pixel and the first column with a bright pixel:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">first_row</span> <span class="o">=</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">find_index</span> <span class="p">(</span><span class="k">fun</span> <span class="n">row</span> <span class="o">-&gt;</span> <span class="nn">Array</span><span class="p">.</span><span class="n">exists</span> <span class="p">(</span><span class="k">fun</span> <span class="n">v</span> <span class="o">-&gt;</span> <span class="n">v</span> <span class="o">&gt;</span> <span class="n">threshold</span><span class="p">)</span> <span class="n">row</span><span class="p">)</span> <span class="n">arr</span>
  <span class="o">|&gt;</span> <span class="nn">Option</span><span class="p">.</span><span class="n">value</span> <span class="o">~</span><span class="n">default</span><span class="o">:</span><span class="mi">0</span>

<span class="k">let</span> <span class="n">first_col</span> <span class="o">=</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">find_mapi</span> <span class="p">(</span><span class="k">fun</span> <span class="n">x</span> <span class="n">_</span> <span class="o">-&gt;</span>
    <span class="nn">Array</span><span class="p">.</span><span class="n">find_opt</span> <span class="p">(</span><span class="k">fun</span> <span class="n">row</span> <span class="o">-&gt;</span> <span class="n">row</span><span class="o">.</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">&gt;</span> <span class="n">threshold</span><span class="p">)</span> <span class="n">arr</span>
    <span class="o">|&gt;</span> <span class="nn">Option</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="k">fun</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="n">x</span><span class="p">)</span>
  <span class="p">)</span> <span class="n">arr</span><span class="o">.</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">|&gt;</span> <span class="nn">Option</span><span class="p">.</span><span class="n">value</span> <span class="o">~</span><span class="n">default</span><span class="o">:</span><span class="mi">0</span>
</code></pre></div></div>
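<p>The opposite edges of the bounding box are found the same way, scanning from the end. A sketch for the last bright row (<code class="language-plaintext highlighter-rouge">last_col</code> follows the same pattern):</p>

```ocaml
(* Find the index of the last row containing a pixel above the
   threshold, defaulting to the bottom of the image if none is found. *)
let last_row arr threshold =
  let n = Array.length arr in
  let rec scan i =
    if i < 0 then n - 1
    else if Array.exists (fun v -> v > threshold) arr.(i) then i
    else scan (i - 1)
  in
  scan (n - 1)
```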

<p>The captured image is first cropped to the area where the digit is known to appear and converted to grayscale. The 85% threshold is applied to create a two-colour image, which makes it easy to find the bounding box. The grayscale pixels within the bounding box are then extracted for processing.</p>

<p><img src="/images/gas-1-grayscale.png" alt="" /> <img src="/images/gas-2-binary.png" alt="" /> <img src="/images/gas-3-bbox.png" alt="" /> <img src="/images/gas-4-extracted.png" alt="" /></p>
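<p>The extraction step is just a copy of the bounding box out of the grayscale array; a minimal sketch (with <code class="language-plaintext highlighter-rouge">row1</code> and <code class="language-plaintext highlighter-rouge">col1</code> exclusive):</p>

```ocaml
(* Copy the rectangle [row0, row1) x [col0, col1) out of a 2D array. *)
let crop arr ~row0 ~col0 ~row1 ~col1 =
  Array.init (row1 - row0) (fun r ->
      Array.sub arr.(row0 + r) col0 (col1 - col0))
```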

<p>With the image extracted, calculate the cosine similarity with all the template images and sort them.</p>
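<p>A small helper can produce this ranking; the sketch below repeats the similarity functions from above so that it stands alone, and assumes each template is a digit paired with its flattened reference vector:</p>

```ocaml
(* Rank digit templates by cosine similarity to the extracted vector,
   best match first.  dot_product, magnitude and cosine_similarity are
   as defined earlier in the post. *)
let dot_product v1 v2 =
  Array.map2 ( *. ) v1 v2 |> Array.fold_left ( +. ) 0.0

let magnitude v =
  Array.fold_left (fun acc x -> acc +. (x *. x)) 0.0 v |> sqrt

let cosine_similarity v1 v2 =
  dot_product v1 v2 /. (magnitude v1 *. magnitude v2)

let classify templates extracted =
  templates
  |> List.map (fun (digit, v) -> (digit, cosine_similarity extracted v))
  |> List.sort (fun (_, s1) (_, s2) -> compare s2 s1)
```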

<table>
  <thead>
    <tr>
      <th>Template</th>
      <th>Score</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>6</td>
      <td>0.9260</td>
    </tr>
    <tr>
      <td>4</td>
      <td>0.8447</td>
    </tr>
    <tr>
      <td>8</td>
      <td>0.8358</td>
    </tr>
    <tr>
      <td>0</td>
      <td>0.8123</td>
    </tr>
    <tr>
      <td>5</td>
      <td>0.7764</td>
    </tr>
    <tr>
      <td>3</td>
      <td>0.7640</td>
    </tr>
    <tr>
      <td>9</td>
      <td>0.7449</td>
    </tr>
    <tr>
      <td>1</td>
      <td>0.6674</td>
    </tr>
    <tr>
      <td>2</td>
      <td>0.6623</td>
    </tr>
    <tr>
      <td>7</td>
      <td>0.6062</td>
    </tr>
  </tbody>
</table>

<p>Recognition is perfect except for the final digit, which rotates very quickly, so the captured image is often cropped or shows parts of two digits.</p>

<p>The code for this project is available at <a href="https://github.com/mtelvers/gas-meter">mtelvers/gas-meter</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,raspberry-pi" /><category term="tunbury.org" /><summary type="html"><![CDATA[My gas supplier has tried and failed to install a smart gas meter, so I’ll give it a go myself.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/gas-meter.png" /><media:content medium="image" url="https://www.tunbury.org/images/gas-meter.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Solar Position Library</title><link href="https://www.tunbury.org/2025/11/16/solar-epaper/" rel="alternate" type="text/html" title="Solar Position Library" /><published>2025-11-16T18:30:00+00:00</published><updated>2025-11-16T18:30:00+00:00</updated><id>https://www.tunbury.org/2025/11/16/solar-epaper</id><content type="html" xml:base="https://www.tunbury.org/2025/11/16/solar-epaper/"><![CDATA[<p>My Apple Watch has a cool watch face which shows the position of the sun, which feels like a good application for my <a href="https://github.com/mtelvers/gpio">GPIO library</a> and ePaper display.</p>

<p><img src="/images/solar-watch.jpg" alt="" /></p>

<p>I’ve published the code for the application and the solar position library at <a href="https://github.com/mtelvers/solar">mtelvers/solar</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,raspberry-pi,solar" /><category term="tunbury.org" /><summary type="html"><![CDATA[My Apple Watch has a cool watch face which shows the position of the sun, which feels like a good application for my GPIO library and ePaper display.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/solar-landscape.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/solar-landscape.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml on a Raspberry Pi</title><link href="https://www.tunbury.org/2025/11/15/ocaml-raspberry-pi/" rel="alternate" type="text/html" title="OCaml on a Raspberry Pi" /><published>2025-11-15T22:00:00+00:00</published><updated>2025-11-15T22:00:00+00:00</updated><id>https://www.tunbury.org/2025/11/15/ocaml-raspberry-pi</id><content type="html" xml:base="https://www.tunbury.org/2025/11/15/ocaml-raspberry-pi/"><![CDATA[<p>The weather outside is frightful, but the Raspberry Pi is so delightful; I have been cheering myself by connecting up all the various bits of hardware scattered on my desk. I often buy these components but never quite get around to using them.</p>

<p>My latest purchase was the <a href="https://www.amazon.co.uk/dp/B07J3FHJVP">Waveshare 2.13” e-Paper Display HAT</a>, which is exactly the same size as a Pi Zero. The basic interface is SPI, plus the device uses various GPIO lines. The drivers provided are in C and Python, and unsurprisingly, no OCaml. Looking on opam, there is <a href="https://opam.ocaml.org/packages/wiringpi/">wiringpi</a>, which provides OCaml bindings for the WiringPi library for OCaml &lt; 5.0.</p>

<p>Do I need a third-party library? The kernel provides <code class="language-plaintext highlighter-rouge">/dev/spi*</code> and <code class="language-plaintext highlighter-rouge">/dev/i2c*</code> when these interfaces are enabled with <code class="language-plaintext highlighter-rouge">raspi-config</code>. GPIO can be accessed via <code class="language-plaintext highlighter-rouge">/sys/class/gpio</code>, but this interface is deprecated and only provides a subset of the full functionality. All I really need to do is call <code class="language-plaintext highlighter-rouge">ioctl()</code> on <code class="language-plaintext highlighter-rouge">/dev/gpiochipN</code>, and I can access that via Ctypes.</p>

<p>Experimenting with some basic functionality, I managed to blink an LED on GPIO17.</p>

<p><img src="/images/gpio-led.jpg" alt="" /></p>

<p>After that, I was hooked. Adding I2C to read from a <a href="https://www.amazon.co.uk/WINGONEER-DS3231-AT24C32-Precision-Arduino/dp/B01H5NAFUY">DS3231 real time clock with EEPROM</a>, followed by SPI to output to an <a href="https://www.amazon.co.uk/MAX7219-Matrix-Display-Arduino-Microcontroller/dp/B07YWRZ3FC">LED matrix</a>.</p>

<p><img src="/images/gpio-max7219.jpg" alt="" /></p>

<p>I found a large LCD2004 display with an I2C driver board, so that was my next target. These are handy displays for basic text. They limit you to 8 custom characters, but a seven-segment digit only needs seven elements, so you can turn the display into a nice big retro digital clock!</p>

<p><img src="/images/gpio-lcd2004.jpg" alt="" /></p>

<p>On to the e-Paper display and basic framebuffer display. This display is very cool as it has two buffers and can do a partial update of the display from the secondary buffer without needing to refresh the display completely.</p>

<p><img src="/images/gpio-epaper.jpg" alt="" /></p>

<p>The library and test code are available at <a href="https://github.com/mtelvers/gpio">mtelvers/gpio</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,raspberry-pi" /><category term="tunbury.org" /><summary type="html"><![CDATA[The weather outside is frightful, but the Raspberry Pi is so delightful; I have been cheering myself by connecting up all the various bits of hardware scattered on my desk. I often buy these components but never quite get around to using them.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/raspberry-pi-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/raspberry-pi-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">CephFS Partition Setup</title><link href="https://www.tunbury.org/2025/11/03/cepfs-partition-setup/" rel="alternate" type="text/html" title="CephFS Partition Setup" /><published>2025-11-03T19:00:00+00:00</published><updated>2025-11-03T19:00:00+00:00</updated><id>https://www.tunbury.org/2025/11/03/cepfs-partition-setup</id><content type="html" xml:base="https://www.tunbury.org/2025/11/03/cepfs-partition-setup/"><![CDATA[<p>If you’re working with full disks, adding an Object Storage Daemon, OSD, to your Ceph cluster couldn’t be simpler. Running one command, <code class="language-plaintext highlighter-rouge">ceph orch apply osd --all-available-devices</code>, does everything for you. When working with partitions, the process is more manual.</p>

<p>Firstly, there are two ways to run <code class="language-plaintext highlighter-rouge">ceph-volume</code>: <code class="language-plaintext highlighter-rouge">cephadm shell -- ceph-volume</code> and <code class="language-plaintext highlighter-rouge">cephadm ceph-volume</code>. Both invoke <code class="language-plaintext highlighter-rouge">ceph-volume</code> in a container, but they differ in which parts of the system they can interact with and how the keyrings are provided.</p>

<p>For example, immediately after installation, running <code class="language-plaintext highlighter-rouge">cephadm ceph-volume lvm create --data /dev/sda4</code> fails with <code class="language-plaintext highlighter-rouge">RADOS permission denied</code> as no keyring can be found in <code class="language-plaintext highlighter-rouge">/var/lib/ceph/bootstrap-osd/ceph.keyring</code>. You can extract the keyring using <code class="language-plaintext highlighter-rouge">cephadm shell -- ceph auth get client.bootstrap-osd &gt; osd.keyring</code>; be sure to redirect the output on the host, otherwise the keyring file ends up inside the container.</p>

<p>With the extracted keyring, <code class="language-plaintext highlighter-rouge">cephadm ceph-volume --keyring /etc/ceph/ceph.client.bootstrap-osd.keyring lvm create --data /dev/sda4</code> starts out creating the LVM devices perfectly, but subsequently fails to start the <code class="language-plaintext highlighter-rouge">systemd</code> service, undoubtedly because it tries to start it within the container.</p>

<p>Running in a <code class="language-plaintext highlighter-rouge">cephadm shell</code>, the keyring can be created in the default directory by running <code class="language-plaintext highlighter-rouge">ceph auth get client.bootstrap-osd &gt; /var/lib/ceph/bootstrap-osd/ceph.keyring</code>, allowing <code class="language-plaintext highlighter-rouge">ceph-volume lvm create --data /dev/sda4</code> to run without extra parameters. This fails as the <code class="language-plaintext highlighter-rouge">lvcreate</code> command can’t see the group it created in the previous step. I presume that this problem stems from how <code class="language-plaintext highlighter-rouge">/dev</code> is mapped into the container.</p>

<p><code class="language-plaintext highlighter-rouge">cephadm shell -- ceph orch daemon add osd &lt;hostname&gt;:/dev/sda4</code> looks like the answer, but this fails with “please pass LVs or raw block devices”.</p>

<p>Manually creating a PV, VG, and LV, then passing those to <code class="language-plaintext highlighter-rouge">ceph orch daemon add osd &lt;hostname&gt;:/dev/&lt;vg&gt;/&lt;lv&gt;</code>, does work, but I feel that I’ve missed a trick that would get <code class="language-plaintext highlighter-rouge">ceph-volume</code> to do this for me. Several of the above command variations get close, but when something goes wrong, the configuration is always rolled back.</p>

<p>I had initially tried to use a combination of <code class="language-plaintext highlighter-rouge">ceph-volume raw prepare</code>/<code class="language-plaintext highlighter-rouge">ceph-volume raw activate</code>, which operated on the partitions without issue. Those devices appear in <code class="language-plaintext highlighter-rouge">ceph-volume raw list</code>. The problem was that I couldn’t see how to create a systemd service to service those disks. Running <code class="language-plaintext highlighter-rouge">/usr/bin/ceph-osd -i $id --cluster ceph</code> worked, but that is not persistent! Reluctantly, I’d given up on this approach in favour of LVM, but while validating my steps to write up this post, I had an inspiration!</p>

<p>With some excitement, may I present a working sequence:</p>

<ol>
  <li>Run <code class="language-plaintext highlighter-rouge">cephadm shell -- ceph auth get client.bootstrap-osd</code> to show the keyring.</li>
  <li>In a <code class="language-plaintext highlighter-rouge">cephadm shell</code> on each host:
    <ol>
      <li>Create the keyring in <code class="language-plaintext highlighter-rouge">/var/lib/ceph/bootstrap-osd/ceph.keyring</code></li>
      <li>Run <code class="language-plaintext highlighter-rouge">for x in {a..d} ; do ceph-volume raw prepare --bluestore --data /dev/sd${x}4 ; done</code></li>
    </ol>
  </li>
  <li>For each host, run <code class="language-plaintext highlighter-rouge">cephadm shell -- ceph cephadm osd activate &lt;hostname&gt;</code></li>
</ol>

<blockquote>
  <p>Note that the keyring file needs a trailing newline, which Ansible absorbs in certain circumstances, resulting in a parse error.</p>
</blockquote>

<p>That final command <code class="language-plaintext highlighter-rouge">cephadm shell -- ceph cephadm osd activate</code> causes any missing OSD services to be created.</p>

<p>For my deployment, I provisioned four Scaleway EM-L110X-SATA machines and booted them in rescue mode. Taking the deployment steps from my last <a href="https://www.tunbury.org/2025/10/31/scaleway-reconfiguration/">post</a>, I have rolled them into an Ansible Playbook, <a href="https://gist.github.com/4012e6860ff4e12d7b827fe96669318b.git">gist</a>, which reconfigures the machine automatically.</p>

<p>With the machines prepared, Ceph can be deployed using the notes from this earlier <a href="https://www.tunbury.org/2025/10/18/quick-look-at-ceph/">post</a> combined with the OSD setup steps above. The entire process is available in this <a href="https://gist.github.com/mtelvers/15e8bb0328aca66520ebe1351572a7d3">gist</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ceph" /><category term="tunbury.org" /><summary type="html"><![CDATA[If you’re working with full disks, adding an Object Storage Daemon, OSD, to your Ceph cluster couldn’t be simpler. Running one command, ceph orch apply osd --all-available-devices, does everything for you. When working with partitions, the process is more manual.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ceph-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ceph-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Scaleway Elastic Metal Reconfiguration</title><link href="https://www.tunbury.org/2025/10/31/scaleway-reconfiguration/" rel="alternate" type="text/html" title="Scaleway Elastic Metal Reconfiguration" /><published>2025-10-31T12:00:00+00:00</published><updated>2025-10-31T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/31/scaleway-reconfiguration</id><content type="html" xml:base="https://www.tunbury.org/2025/10/31/scaleway-reconfiguration/"><![CDATA[<p>Scaleway offers the EM-L110X-SATA machine, which has 4 x 12TB disks. I’ve noted in a previous <a href="https://www.tunbury.org/2025/05/01/removing-mdadm/">post</a> that the configuration isn’t ideal for my purposes, and I outlined a way to reconfigure the machine. The premise of that post is that you can eject one of the disks from the RAID5 array to use as the new root filesystem. 
All well and good, but you must wait for the RAID5 array to finish building; otherwise, ejecting the disk immediately leads to an inaccessible file system.</p>

<p>Scaleway allows you to boot into a rescue console. This is a netboot environment which has SSH access using a randomly generated username and password.</p>

<p>Once booted, <code class="language-plaintext highlighter-rouge">lsblk</code> shows <code class="language-plaintext highlighter-rouge">md0</code> is now <code class="language-plaintext highlighter-rouge">md127</code> and <code class="language-plaintext highlighter-rouge">md1</code> is missing.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0           7:0    0 826.6M  1 loop  /usr/lib/live/mount/rootfs/filesystem.squashfs
sda             8:0    0  10.9T  0 disk  
├─sda1          8:1    0     1M  0 part  
├─sda2          8:2    0   512M  0 part  
│ └─md127       9:127  0   511M  0 raid1 
│   └─md127p1 259:0    0   506M  0 part  
├─sda3          8:3    0  10.7T  0 part  
└─sda4          8:4    0   512M  0 part  
sdb             8:16   0  10.9T  0 disk  
├─sdb1          8:17   0     1M  0 part  
├─sdb2          8:18   0   512M  0 part  
│ └─md127       9:127  0   511M  0 raid1 
│   └─md127p1 259:0    0   506M  0 part  
├─sdb3          8:19   0  10.7T  0 part  
└─sdb4          8:20   0   512M  0 part  
sdc             8:32   0  10.9T  0 disk  
├─sdc1          8:33   0     1M  0 part  
├─sdc2          8:34   0   512M  0 part  
│ └─md127       9:127  0   511M  0 raid1 
│   └─md127p1 259:0    0   506M  0 part  
├─sdc3          8:35   0  10.7T  0 part  
└─sdc4          8:36   0   512M  0 part  
sdd             8:48   0  10.9T  0 disk  
├─sdd1          8:49   0     1M  0 part  
├─sdd2          8:50   0   512M  0 part  
│ └─md127       9:127  0   511M  0 raid1 
│   └─md127p1 259:0    0   506M  0 part  
├─sdd3          8:51   0  10.7T  0 part  
└─sdd4          8:52   0   512M  0 part  
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">cat /proc/mdstat</code> shows that <code class="language-plaintext highlighter-rouge">md1</code> is now <code class="language-plaintext highlighter-rouge">md126</code> but is <code class="language-plaintext highlighter-rouge">inactive</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10] 
md126 : inactive sdb3[4] sdc3[0] sda3[2] sdd3[1]
      45751787520 blocks super 1.2
       
md127 : active (auto-read-only) raid1 sdb2[3] sdc2[0] sda2[2] sdd2[1]
      523264 blocks super 1.2 [4/4] [UUUU]
      
unused devices: &lt;none&gt;
</code></pre></div></div>

<p>We can now use <code class="language-plaintext highlighter-rouge">mdadm --assemble --force --run /dev/md126 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3</code> to bring the array back online.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm: Fail create md126 when using /sys/module/md_mod/parameters/new_array
mdadm: Marking array /dev/md126 as 'clean'
mdadm: /dev/md126 has been started with 3 drives (out of 4) and 1 rebuilding.
</code></pre></div></div>

<p>This is confirmed with <code class="language-plaintext highlighter-rouge">cat /proc/mdstat</code> which shows that the rebuild has automatically restarted and will finish in about a day.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@51-159-101-156:~# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10] 
md126 : active raid5 sdc3[0] sdb3[4] sda3[2] sdd3[1]
      34313840640 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [=&gt;...................]  recovery =  8.8% (1014124636/11437946880) finish=1579.5min speed=109982K/sec
      bitmap: 10/86 pages [40KB], 65536KB chunk

md127 : active (auto-read-only) raid1 sdb2[3] sdc2[0] sda2[2] sdd2[1]
      523264 blocks super 1.2 [4/4] [UUUU]
      
unused devices: &lt;none&gt;
</code></pre></div></div>

<p>Stop the rebuild with <code class="language-plaintext highlighter-rouge">echo frozen &gt; /sys/block/md126/md/sync_action</code> and mount the drive read-only.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p /mnt/old
mount -o ro /dev/md126p1 /mnt/old
</code></pre></div></div>

<p>The Scaleway base installation is only ~2GB, and <code class="language-plaintext highlighter-rouge">/tmp</code> is huge (as these systems have 96GB of RAM):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Filesystem      Size  Used Avail Use% Mounted on
tmpfs            48G   28K   48G   1% /tmp
</code></pre></div></div>

<p>Create the backup</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /mnt/old
tar czf /tmp/rootfs-backup.tar.gz \
  --exclude=./proc \
  --exclude=./sys \
  --exclude=./dev \
  --exclude=./tmp \
  --exclude=./run \
  --exclude=./mnt \
  .
</code></pre></div></div>

<p>Check if the backup was created successfully.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-rw-r--r-- 1 root root 1.2G Oct 31 14:50 /tmp/rootfs-backup.tar.gz
</code></pre></div></div>

<p>Unmount the drive, and stop the array.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /
umount /mnt/old
mdadm --stop /dev/md126
</code></pre></div></div>

<p>I found that the kernel was keen to remount the device, so I zeroed it out to prevent it.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm --zero-superblock /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
</code></pre></div></div>

<p>Remove the partition from all the disks.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for disk in sda sdb sdc sdd; do
  parted /dev/$disk --script "rm 3 mkpart primary 1025MiB 34GiB set 3 raid on"
done
</code></pre></div></div>

<p>Create a new 99GB RAID5 array (33GB × 3 usable with RAID5).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm --create /dev/md126 --level=5 --raid-devices=4 \
  /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 \
  --chunk=512 --metadata=1.2
</code></pre></div></div>

<p>Check that it is building with <code class="language-plaintext highlighter-rouge">cat /proc/mdstat</code>: 2 minutes to go!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Personalities : [raid1] [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid10] 
md126 : active raid5 sdd3[4] sdc3[2] sdb3[1] sda3[0]
      103704576 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [===&gt;.................]  recovery = 19.1% (6606848/34568192) finish=2.2min speed=202080K/sec
      
md127 : active (auto-read-only) raid1 sdd2[1] sdc2[0] sdb2[3] sda2[2]
      523264 blocks super 1.2 [4/4] [UUUU]
      
unused devices: &lt;none&gt;
</code></pre></div></div>
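<p>To watch just the progress figure, the recovery line can be extracted; a demo against a pasted copy of the sample line above rather than the live <code class="language-plaintext highlighter-rouge">/proc/mdstat</code>:</p>

```shell
# Pull the recovery percentage out of an mdstat-style line (sample text, not live data)
line='      [===>.................]  recovery = 19.1% (6606848/34568192) finish=2.2min speed=202080K/sec'
pct=$(echo "$line" | grep -o 'recovery = [0-9.]*%')
echo "$pct"   # -> recovery = 19.1%
```

<p>Against the live file, <code class="language-plaintext highlighter-rouge">watch cat /proc/mdstat</code> gives the same information refreshed every two seconds.</p>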

<p>Create GPT partition table.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>parted /dev/md126 mklabel gpt
parted /dev/md126 mkpart primary ext4 0% 100%
</code></pre></div></div>

<p>Format with ext4</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkfs.ext4 -L root /dev/md126p1
</code></pre></div></div>

<p>Verify</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@51-159-101-156:/# lsblk | grep md126
│ └─md126       9:126  0  98.9G  0 raid5 
│   └─md126p1 259:2    0  98.9G  0 part  
│ └─md126       9:126  0  98.9G  0 raid5 
│   └─md126p1 259:2    0  98.9G  0 part  
│ └─md126       9:126  0  98.9G  0 raid5 
│   └─md126p1 259:2    0  98.9G  0 part  
│ └─md126       9:126  0  98.9G  0 raid5 
│   └─md126p1 259:2    0  98.9G  0 part  
</code></pre></div></div>

<p>Mount the new filesystem</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p /mnt/new
mount /dev/md126p1 /mnt/new
</code></pre></div></div>

<p>Restore system</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd /mnt/new
tar xzf /tmp/rootfs-backup.tar.gz
</code></pre></div></div>

<p>Create system directories with correct permissions as these were excluded from the <code class="language-plaintext highlighter-rouge">tar</code> operation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p /mnt/new/proc
mkdir -p /mnt/new/sys
mkdir -p /mnt/new/dev
mkdir -p /mnt/new/run
mkdir -p /mnt/new/mnt
mkdir -p /mnt/new/tmp
</code></pre></div></div>

<p>Set correct permissions with the sticky bit for /tmp.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chmod 0555 /mnt/new/proc
chmod 0555 /mnt/new/sys
chmod 0755 /mnt/new/dev
chmod 0755 /mnt/new/run
chmod 0755 /mnt/new/mnt
chmod 1777 /mnt/new/tmp
</code></pre></div></div>
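<p>The leading <code class="language-plaintext highlighter-rouge">1</code> in <code class="language-plaintext highlighter-rouge">1777</code> is the sticky bit, which stops users deleting each other’s files in the world-writable <code class="language-plaintext highlighter-rouge">/tmp</code>. A quick sanity check on a throwaway directory:</p>

```shell
# Confirm that mode 1777 (sticky bit + rwx for all) is applied as expected
d=$(mktemp -d)
chmod 1777 "$d"
mode=$(stat -c %a "$d")   # GNU stat: print the octal mode
echo "$mode"   # -> 1777
rmdir "$d"
```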

<p>Set ownership (though it should already be <code class="language-plaintext highlighter-rouge">root:root</code>).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chown root:root /mnt/new/{proc,sys,dev,run,mnt,tmp}
</code></pre></div></div>

<p>Mount boot partition.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount /dev/md127p1 /mnt/new/boot
</code></pre></div></div>

<p>Bind mount system directories for chroot.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount --bind /dev /mnt/new/dev
mount --bind /proc /mnt/new/proc
mount --bind /sys /mnt/new/sys
</code></pre></div></div>

<p>Chroot into the system.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chroot /mnt/new /bin/bash
</code></pre></div></div>

<p>Check the original <code class="language-plaintext highlighter-rouge">mdadm.conf</code> file in <code class="language-plaintext highlighter-rouge">/mnt/new/etc/mdadm/mdadm.conf</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ARRAY /dev/md0 metadata=1.2 UUID=54d65d24:831a6594:d2c51416:5dd1692c
ARRAY /dev/md1 metadata=1.2 spares=1 UUID=dd7844ac:07f188e7:995ade90:71c23f7b
MAILADDR root
</code></pre></div></div>

<p>And compare that with the output from <code class="language-plaintext highlighter-rouge">mdadm --detail --scan</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ARRAY /dev/md/ubuntu-server:0 metadata=1.2 name=ubuntu-server:0 UUID=54d65d24:831a6594:d2c51416:5dd1692c
ARRAY /dev/md126 metadata=1.2 name=52-158-100-155:126 UUID=6a249202:c916a184:76fd6446:839ad3a4
</code></pre></div></div>

<p>Fix the UUID for <code class="language-plaintext highlighter-rouge">/dev/md1</code> in <code class="language-plaintext highlighter-rouge">mdadm.conf</code>, either with your favourite text editor or with <code class="language-plaintext highlighter-rouge">sed</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sed -i "s/spares=1 UUID=.*/UUID=6a249202:c916a184:76fd6446:839ad3a4/g" /mnt/new/etc/mdadm/mdadm.conf 
</code></pre></div></div>
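<p>The substitution can be dry-run against the original line before editing the file:</p>

```shell
# Dry-run of the sed substitution on the original mdadm.conf line
line='ARRAY /dev/md1 metadata=1.2 spares=1 UUID=dd7844ac:07f188e7:995ade90:71c23f7b'
new=$(echo "$line" | sed 's/spares=1 UUID=.*/UUID=6a249202:c916a184:76fd6446:839ad3a4/')
echo "$new"   # -> ARRAY /dev/md1 metadata=1.2 UUID=6a249202:c916a184:76fd6446:839ad3a4
```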

<p>Verify the changes to <code class="language-plaintext highlighter-rouge">/mnt/new/etc/mdadm/mdadm.conf</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ARRAY /dev/md0 metadata=1.2 UUID=54d65d24:831a6594:d2c51416:5dd1692c
ARRAY /dev/md1 metadata=1.2 UUID=6a249202:c916a184:76fd6446:839ad3a4
MAILADDR root
</code></pre></div></div>

<p>Make the same edit to <code class="language-plaintext highlighter-rouge">/etc/fstab</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sed -i 's/dd7844ac:07f188e7:995ade90:71c23f7b/6a249202:c916a184:76fd6446:839ad3a4/' /etc/fstab
</code></pre></div></div>

<p>Update initramfs with new array config.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>update-initramfs -u -k all
</code></pre></div></div>

<p>Reinstall GRUB on all 4 disks (for redundancy).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for disk in sda sdb sdc sdd; do
  echo "Installing GRUB on /dev/$disk..."
  grub-install /dev/$disk
done
</code></pre></div></div>

<p>Update GRUB config.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>update-grub
</code></pre></div></div>

<p>Exit the chroot environment.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>exit
</code></pre></div></div>

<p>Many people would be happy to stop here, but the free space is now in the middle of the disk, with the swap space (nearly) at the end, which means that my new partition, number 5, would be out of order on disk. <code class="language-plaintext highlighter-rouge">parted /dev/sda print free</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Model: ATA TOSHIBA MG07ACA1 (scsi)
Disk /dev/sda: 12.0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 
Number  Start   End     Size    File system     Name     Flags
        17.4kB  1049kB  1031kB  Free Space
 1      1049kB  2097kB  1049kB                           bios_grub
 2      2097kB  539MB   537MB
        539MB   1075MB  536MB   Free Space
 3      1075MB  36.5GB  35.4GB                  primary  raid
        36.5GB  11.7TB  11.7TB  Free Space
 4      11.7TB  11.7TB  537MB   linux-swap(v1)
        11.7TB  12.0TB  286GB   Free Space
</code></pre></div></div>

<p>I’m going to delete the swap partition, create the new data partition, and finally, create the swap partition at the very end.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for disk in sda sdb sdc sdd; do
  parted /dev/$disk --script "rm 4 mkpart primary ext4 36.5GB -1GB mkpart primary linux-swap -1GB 100%"
  mkswap /dev/${disk}5
done
</code></pre></div></div>

<p>Delete the old references to the swap space from <code class="language-plaintext highlighter-rouge">/mnt/new/etc/fstab</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sed -i '/swap/d' /mnt/new/etc/fstab
</code></pre></div></div>

<p>Then add the new swap space to <code class="language-plaintext highlighter-rouge">/mnt/new/etc/fstab</code> (note the target: we are outside the chroot here, so the rescue system’s own <code class="language-plaintext highlighter-rouge">/etc/fstab</code> would be the wrong file).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>blkid | awk -F'"' '/TYPE="swap"/ {print "/dev/disk/by-uuid/" $2 " none swap sw 0 0"}' &gt;&gt; /mnt/new/etc/fstab
</code></pre></div></div>
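<p>The <code class="language-plaintext highlighter-rouge">awk</code> one-liner splits each <code class="language-plaintext highlighter-rouge">blkid</code> line on double quotes, so <code class="language-plaintext highlighter-rouge">$2</code> is the first quoted value; this assumes <code class="language-plaintext highlighter-rouge">UUID</code> is the first quoted field on the swap lines, which holds here but would break if, say, a <code class="language-plaintext highlighter-rouge">LABEL</code> preceded it. A demo on a made-up <code class="language-plaintext highlighter-rouge">blkid</code> line:</p>

```shell
# How the awk -F'"' field split works on a hypothetical blkid line
line='/dev/sda5: UUID="1234-abcd" TYPE="swap"'
entry=$(echo "$line" | awk -F'"' '/TYPE="swap"/ {print "/dev/disk/by-uuid/" $2 " none swap sw 0 0"}')
echo "$entry"   # -> /dev/disk/by-uuid/1234-abcd none swap sw 0 0
```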

<p>Unmount everything.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>umount /mnt/new/boot
umount /mnt/new/dev
umount /mnt/new/proc
umount /mnt/new/sys
umount /mnt/new
</code></pre></div></div>

<p>Final sync and reboot.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sync &amp;&amp; reboot
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Scaleway" /><category term="tunbury.org" /><summary type="html"><![CDATA[Scaleway offers the EM-L110X-SATA machine, which has 4 x 12TB disks. I’ve noted in a previous post that the configuration isn’t ideal for my purposes, and I outlined a way to reconfigure the machine. The premise of that post is that you can eject one of the disks from the RAID5 array to use as the new root filesystem. All well and good, but you must wait for the RAID5 array to finish building; otherwise, ejecting the disk immediately leads to an inaccessible file system.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/scaleway-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/scaleway-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Slurm with multiple architectures</title><link href="https://www.tunbury.org/2025/10/24/slurm-ansible/" rel="alternate" type="text/html" title="Slurm with multiple architectures" /><published>2025-10-24T12:00:00+00:00</published><updated>2025-10-24T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/24/slurm-ansible</id><content type="html" xml:base="https://www.tunbury.org/2025/10/24/slurm-ansible/"><![CDATA[<p>If we implement Slurm over a cluster of machines with different processor architectures, what would the job submission look like?</p>

<p>Slurm will happily mix different processor architectures in the same cluster, and even in the same partition. The processor cores and memory are aggregated just as they would be for identical architectures. It is the submitter’s responsibility to ensure that their script runs on the available processors. Rather than leave it to chance, we could create multiple partitions within a cluster. For example, with these settings in <code class="language-plaintext highlighter-rouge">slurm.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Define your node groups first
NodeName=node[01-10] CPUs=32 RealMemory=128000
NodeName=node[11-20] CPUs=64 RealMemory=256000

# Then define partitions
PartitionName=x86_64 Nodes=node[01-10] Default=YES MaxTime=INFINITE State=UP
PartitionName=arm64 Nodes=node[11-20] Default=NO MaxTime=INFINITE State=UP
</code></pre></div></div>

<p>However, it is probably better to use node “features” and keep all the machines in a single partition:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Define your node groups first
NodeName=node[01-10] CPUs=32 Feature=x86_64
NodeName=node[11-20] CPUs=64 Feature=arm64

# Then define the partition
PartitionName=compute Nodes=node[01-20] Default=YES State=UP
</code></pre></div></div>

<p>Users can select the processor architecture using the <code class="language-plaintext highlighter-rouge">--constraint</code> option to <code class="language-plaintext highlighter-rouge">sbatch</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sbatch --constraint=x86_64 job.sh
sbatch --constraint=arm64 job.sh
</code></pre></div></div>
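<p>The constraint can equally be baked into the job script itself as an <code class="language-plaintext highlighter-rouge">#SBATCH</code> directive; a minimal sketch (the job name and body are hypothetical):</p>

```shell
#!/bin/bash
#SBATCH --job-name=arch-test
#SBATCH --constraint=arm64
# The body only runs on nodes carrying the arm64 feature,
# so uname reports the expected machine architecture.
uname -m
```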

<p>I have implemented this strategy in <a href="https://github.com/mtelvers/slurm-ansible">mtelvers/slurm-ansible</a>, which builds a Slurm cluster based upon my previous posts on <a href="https://www.tunbury.org/2025/04/14/slurm-workload-manager/">14/4</a> and <a href="https://www.tunbury.org/2025/08/06/slurm-limits/">6/8</a> to include accounting, cgroups and NFS sharing and additionally applies features based upon <code class="language-plaintext highlighter-rouge">uname -m</code>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Slurm" /><category term="tunbury.org" /><summary type="html"><![CDATA[If we implement Slurm over a cluster of machines with different processor architectures, what would the job submission look like?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/slurm.png" /><media:content medium="image" url="https://www.tunbury.org/images/slurm.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">A quick look at CephFS</title><link href="https://www.tunbury.org/2025/10/18/quick-look-at-ceph/" rel="alternate" type="text/html" title="A quick look at CephFS" /><published>2025-10-18T22:00:00+00:00</published><updated>2025-10-18T22:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/18/quick-look-at-ceph</id><content type="html" xml:base="https://www.tunbury.org/2025/10/18/quick-look-at-ceph/"><![CDATA[<p>There are Ansible playbooks available at <a href="https://github.com/ceph/cephadm-ansible">ceph/cephadm-ansible</a> to configure CephFS; however, I decided to set it up manually on some test VMs to gain a better understanding of the process.</p>

<p>I used Vagrant to create a couple of VMs. One with 3 x 500GB disks and one with 11 x 1TB disks.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu2204"
  config.vm.provider "libvirt" do |v|
    v.memory = 8192
    v.cpus = 4
    (1..3).each do |i|
      v.storage :file, :size =&gt; '500G'
    end
  end
  config.vm.network :public_network, :dev =&gt; 'br0', :type =&gt; 'bridge'
end
</code></pre></div></div>

<p>After <code class="language-plaintext highlighter-rouge">vagrant up</code>, I SSHed to the 3-disk node, which I will use to bootstrap the cluster. Install the cephadm tool, which pulls in docker.io and the other packages needed:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install cephadm
</code></pre></div></div>

<p>Set the hostname and run cephadm:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hostnamectl set-hostname host226.ocl.cl.cam.ac.uk
cephadm bootstrap --mon-ip 128.232.124.226 --allow-fqdn-hostname
</code></pre></div></div>

<p>After that completes, the admin interface is available on port 8443, and the initial password is printed. It must be changed on first login.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Ceph Dashboard is now available at:

	     URL: https://host226.ocl.cl.cam.ac.uk:8443/
	    User: admin
	Password: 6n2knvhka0

Enabling client.admin keyring and conf on hosts with "admin" label
Saving cluster configuration to /var/lib/ceph/8c498470-b01f-11f0-8941-1baf58a32558/config directory
Enabling autotune for osd_memory_target
You can access the Ceph CLI as following in case of multi-cluster or non-default config:

	sudo /usr/sbin/cephadm shell --fsid 8c498470-b01f-11f0-8941-1baf58a32558 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Or, if you are only running a single cluster on this host:

	sudo /usr/sbin/cephadm shell 

Please consider enabling telemetry to help improve Ceph:

	ceph telemetry on

For more information see:

	https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.

</code></pre></div></div>

<p>To run <code class="language-plaintext highlighter-rouge">ceph</code> commands, either run <code class="language-plaintext highlighter-rouge">cephadm shell -- ceph -s</code> or run them interactively after first starting <code class="language-plaintext highlighter-rouge">cephadm shell</code>.</p>

<p>On the other machine, the 11-disk node, install Docker:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install docker.io
</code></pre></div></div>

<p>Copy the contents of <code class="language-plaintext highlighter-rouge">/etc/ceph/ceph.pub</code> from the master node into <code class="language-plaintext highlighter-rouge">~/.ssh/authorized_keys</code> on this node.</p>

<p>Then, from the master node, add the other machine to the cluster:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph orch host add host190.ocl.cl.cam.ac.uk 128.232.124.190
</code></pre></div></div>

<p>The disks should now appear as available.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ceph orch device ls
HOST                      PATH      TYPE  DEVICE ID                              SIZE  AVAILABLE  REFRESHED  REJECT REASONS  
host190.ocl.cl.cam.ac.uk  /dev/vdb  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdc  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdd  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vde  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdf  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdg  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdh  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdi  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdj  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdk  hdd                                         1000G  Yes        21s ago                    
host190.ocl.cl.cam.ac.uk  /dev/vdl  hdd                                         1000G  Yes        21s ago                    
host226.ocl.cl.cam.ac.uk  /dev/sda  hdd   QEMU_HARDDISK_drive-ua-disk-volume-0   500G  Yes        2m ago                     
host226.ocl.cl.cam.ac.uk  /dev/sdb  hdd   QEMU_HARDDISK_drive-ua-disk-volume-1   500G  Yes        2m ago                     
host226.ocl.cl.cam.ac.uk  /dev/sdc  hdd   QEMU_HARDDISK_drive-ua-disk-volume-2   500G  Yes        2m ago                     
</code></pre></div></div>

<p>Each Object Storage Daemon (OSD) backs a single disk. Add all the available devices:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph orch apply osd --all-available-devices
</code></pre></div></div>

<p>Since these are virtual disks, Ceph classifies them all as HDD, so we need to reclassify some as SSD. Check the device numbers with <code class="language-plaintext highlighter-rouge">ceph osd tree</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ID  CLASS  WEIGHT    TYPE NAME         STATUS  REWEIGHT  PRI-AFF
-1         12.20741  root default                               
-5         10.74252      host host190                           
 1    hdd   0.97659          osd.1         up   1.00000  1.00000
 3    hdd   0.97659          osd.3         up   1.00000  1.00000
 5    hdd   0.97659          osd.5         up   1.00000  1.00000
 6    hdd   0.97659          osd.6         up   1.00000  1.00000
 7    hdd   0.97659          osd.7         up   1.00000  1.00000
 8    hdd   0.97659          osd.8         up   1.00000  1.00000
 9    hdd   0.97659          osd.9         up   1.00000  1.00000
10    hdd   0.97659          osd.10        up   1.00000  1.00000
11    hdd   0.97659          osd.11        up   1.00000  1.00000
12    hdd   0.97659          osd.12        up   1.00000  1.00000
13    hdd   0.97659          osd.13        up   1.00000  1.00000
-3          1.46489      host host226                           
 0    hdd   0.48830          osd.0         up   1.00000  1.00000
 2    hdd   0.48830          osd.2         up   1.00000  1.00000
 4    hdd   0.48830          osd.4         up   1.00000  1.00000
</code></pre></div></div>

<p>Set CRUSH device classes to separate the fast disks from the slow ones; we will use these classes to target pools at specific devices.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd crush rm-device-class osd.0 osd.2 osd.4
ceph osd crush set-device-class ssd osd.0 osd.2 osd.4
</code></pre></div></div>

<p>Create a metadata pool (replicated, should be on fast disks)</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool create cephfs_metadata 32 replicated
</code></pre></div></div>

<p>Fast data pool (replicated, for root filesystem)</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool create cephfs_data_fast 64 replicated
</code></pre></div></div>

<p>Archive pool (erasure coded 8+3, for slow disks)</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd erasure-code-profile set ec83profile k=8 m=3 crush-failure-domain=osd
ceph osd pool create cephfs_data_archive 128 erasure ec83profile
</code></pre></div></div>
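<p>With <code class="language-plaintext highlighter-rouge">k=8</code> data chunks and <code class="language-plaintext highlighter-rouge">m=3</code> coding chunks, the pool survives any three OSD failures and stores data at an efficiency of k/(k+m), roughly 73%, compared with 50% for a size-2 replicated pool. A one-line check of the arithmetic:</p>

```shell
# Usable fraction of raw capacity for an 8+3 erasure-coded pool: k / (k + m)
awk 'BEGIN { k = 8; m = 3; printf "%.1f%%\n", 100 * k / (k + m) }'   # -> 72.7%
```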

<p>Set up pool properties. There are only two hosts in this test setup; ideally, the size would be three or more.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool set cephfs_metadata size 2
ceph osd pool set cephfs_data_fast size 2
</code></pre></div></div>

<p>Create CRUSH rules to allocate the data correctly and apply them to the pools.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd crush rule create-erasure cephfs_data_archive_osd ec83profile
ceph osd crush rule create-replicated fast_ssd_rule default osd ssd

ceph osd pool set cephfs_data_fast crush_rule fast_ssd_rule
ceph osd pool set cephfs_metadata crush_rule fast_ssd_rule
ceph osd pool set cephfs_data_archive crush_rule cephfs_data_archive_osd
</code></pre></div></div>

<p>Allow CephFS to use the pools</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool application enable cephfs_metadata cephfs
ceph osd pool application enable cephfs_data_fast cephfs
ceph osd pool application enable cephfs_data_archive cephfs
</code></pre></div></div>

<p>Create the filesystem</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph fs new cephfs cephfs_metadata cephfs_data_fast
</code></pre></div></div>

<p>Add the EC pool as an additional data pool</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph osd pool set cephfs_data_archive allow_ec_overwrites true
ceph fs add_data_pool cephfs cephfs_data_archive
</code></pre></div></div>

<p>Create an MDS for the file system</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph fs set cephfs max_mds 1
ceph orch apply mds cephfs
</code></pre></div></div>

<p>CephFS warns about having the root of the file system on an erasure-coded pool, hence we use the fast replicated pool as the root and map the archive pool to a specific directory.</p>

<p>Get your admin key</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ceph auth get-key client.admin
</code></pre></div></div>

<p>On a client machine, mount CephFS</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p /mnt/cephfs
mount -t ceph host226:6789,host190:6789:/ /mnt/cephfs -o name=admin,secret=YOUR_KEY_HERE
</code></pre></div></div>

<p>Create and configure the archive directory</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir /mnt/cephfs/archive
setfattr -n ceph.dir.layout.pool -v cephfs_data_archive /mnt/cephfs/archive
</code></pre></div></div>

<p>Verify it worked</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>getfattr -n ceph.dir.layout.pool /mnt/cephfs/archive
</code></pre></div></div>

<p>This ensures new files in <code class="language-plaintext highlighter-rouge">/archive</code> use the erasure-coded pool on the large disks, while the root uses the replicated fast pool.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ceph" /><category term="tunbury.org" /><summary type="html"><![CDATA[There are Ansible playbooks available at ceph/cephadm-ansible to configure CephFS; however, I decided to set it up manually on some test VMs to gain a better understanding of the process.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ceph-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ceph-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">CI support for OCaml 5.4</title><link href="https://www.tunbury.org/2025/10/18/ci-support-for-ocaml-54/" rel="alternate" type="text/html" title="CI support for OCaml 5.4" /><published>2025-10-18T00:00:00+00:00</published><updated>2025-10-18T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/18/ci-support-for-ocaml-54</id><content type="html" xml:base="https://www.tunbury.org/2025/10/18/ci-support-for-ocaml-54/"><![CDATA[<p>Following the release of <a href="https://ocaml.org/releases/5.4.0">OCaml 5.4</a> the CI systems need to be updated to use it.</p>

<p>This process starts with the update of <a href="https://github.com/ocurrent/ocaml-version">ocaml-version</a>, which Octachron added through <a href="https://github.com/ocurrent/ocaml-version/pull/85">PR#85</a>.</p>

<p>The base images now need to be updated, which consists of updating the <a href="https://images.ci.ocaml.org">base image builder</a>, <a href="https://github.com/ocaml/macos-infra">macos-infra</a> and <a href="https://github.com/ocurrent/freebsd-infra">freebsd-infra</a>. The latter two are updated via Ansible scripts: <a href="https://github.com/ocurrent/macos-infra/pull/57">PR#57</a> for macOS and <a href="https://github.com/ocurrent/freebsd-infra/pull/19">PR#19</a> for FreeBSD. New base images are also required for OpenBSD 7.7 and Windows Server 2022, which needs minor edits to the <code class="language-plaintext highlighter-rouge">Makefile</code>, included in <a href="https://github.com/ocurrent/obuilder/pull/201">PR#201</a>.</p>

<p>The base image builder was updated with <a href="https://github.com/ocurrent/docker-base-images/pull/335">PR#335</a> which pulled in the latest <a href="https://github.com/ocurrent/ocaml-version">ocaml-version</a> and <a href="https://github.com/ocurrent/ocaml-dockerfile">ocaml-dockerfile</a>. <a href="https://github.com/ocurrent/ocaml-dockerfile">ocaml-dockerfile</a> contains the build instructions for the base images across different OS distributions and architectures as Dockerfiles.</p>

<p><a href="https://github.com/ocurrent/ocaml-dockerfile">ocaml-dockerfile</a> had recently been updated with <a href="https://github.com/ocurrent/ocaml-dockerfile/pull/243">PR#243</a>, which added CentOS Stream 9 and 10, Oracle Linux 10 and Ubuntu 25.10. However, this resulted in a couple of build failures, plus MisterDA opened <a href="https://github.com/ocurrent/ocaml-dockerfile/issues/244">issue#244</a>, noting openSUSE and Windows Server 2025 needed to be updated.</p>

<p>There were build failures on CentOS Stream 9 that came from <code class="language-plaintext highlighter-rouge">RUN yum install -y ... curl ...</code>, which conflicted with the preinstalled <code class="language-plaintext highlighter-rouge">curl-minimal</code> package. This was easily fixed by removing <code class="language-plaintext highlighter-rouge">curl</code> from the package list, as curl was already present.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#17 1.503 Error: 
#17 1.503  Problem: problem with installed package curl-minimal-7.76.1-34.el9.x86_64
#17 1.503   - package curl-minimal-7.76.1-34.el9.x86_64 from @System conflicts with curl provided by curl-7.76.1-34.el9.x86_64 from baseos
#17 1.503   - package curl-minimal-7.76.1-26.el9.x86_64 from baseos conflicts with curl provided by curl-7.76.1-34.el9.x86_64 from baseos
#17 1.503   - package curl-minimal-7.76.1-28.el9.x86_64 from baseos conflicts with curl provided by curl-7.76.1-34.el9.x86_64 from baseos
#17 1.503   - package curl-minimal-7.76.1-29.el9.x86_64 from baseos conflicts with curl provided by curl-7.76.1-34.el9.x86_64 from baseos
#17 1.503   - package curl-minimal-7.76.1-31.el9.x86_64 from baseos conflicts with curl provided by curl-7.76.1-34.el9.x86_64 from baseos
#17 1.503   - package curl-minimal-7.76.1-34.el9.x86_64 from baseos conflicts with curl provided by curl-7.76.1-34.el9.x86_64 from baseos
#17 1.503   - cannot install the best candidate for the job
</code></pre></div></div>

<p>The next issue was with <code class="language-plaintext highlighter-rouge">RUN yum config-manager --set-enabled powertools</code> as this repository had changed its name (again):</p>

<ul>
  <li>CentOS 7: Uses yum-config-manager</li>
  <li>CentOS 8: Uses powertools</li>
  <li>CentOS Stream 9+: Uses crb</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#34 [stage-1 13/41] RUN yum config-manager --set-enabled powertools
#34 0.447 Error: No matching repo to modify: powertools.
#34 ERROR: process "/bin/sh -c yum config-manager --set-enabled powertools" did not complete successfully: exit code: 1
------
 &gt; [stage-1 13/41] RUN yum config-manager --set-enabled powertools:
0.447 Error: No matching repo to modify: powertools.
------
</code></pre></div></div>
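<p>One way to express the fix in a Dockerfile (a sketch, not necessarily how <a href="https://github.com/ocurrent/ocaml-dockerfile">ocaml-dockerfile</a> implements it):</p>

```dockerfile
# CentOS Stream 9+ renamed the powertools repository to crb
RUN yum config-manager --set-enabled crb
```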

<p>The final blocker was building the Ubuntu 25.10 images on RISCV. These images failed on <code class="language-plaintext highlighter-rouge">apt-get update</code>, which I initially assumed was a transitory network issue, but it persisted.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#14 [stage-0  2/13] RUN apt-get -y update
#14 ERROR: process "/bin/sh -c apt-get -y update" did not complete successfully: exit code: 132
</code></pre></div></div>

<p>Oddly, <code class="language-plaintext highlighter-rouge">docker run --rm -it ubuntu:questing</code> didn’t give me a container and simply returned the command prompt. However, <code class="language-plaintext highlighter-rouge">docker run --rm -it ubuntu:questing-20250830</code> did give me a prompt but I still couldn’t run <code class="language-plaintext highlighter-rouge">apt</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># docker run --rm -it ubuntu:questing-20250830
root@8754b6373f6f:/# apt update   
Illegal instruction (core dumped)
</code></pre></div></div>

<p>Interestingly, <code class="language-plaintext highlighter-rouge">ubuntu:questing-20250806</code> (even older) could run <code class="language-plaintext highlighter-rouge">apt update</code>. However, attempting to build the Dockerfile didn’t work.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>48.49 Preparing to unpack .../libc6_2.42-0ubuntu3_riscv64.deb ...
49.27 Checking for services that may need to be restarted...
49.30 Checking init scripts...
49.30 Checking for services that may need to be restarted...
49.34 Checking init scripts...
49.34 Nothing to restart.
49.44 Unpacking libc6:riscv64 (2.42-0ubuntu3) over (2.41-9ubuntu1) ...
52.05 dpkg: warning: old libc6:riscv64 package post-removal script subprocess was killed by signal (Illegal instruction), core dumped
52.05 dpkg: trying script from the new package instead ...
52.06 dpkg: error processing archive /var/cache/apt/archives/libc6_2.42-0ubuntu3_riscv64.deb (--unpack):
52.06  new libc6:riscv64 package post-removal script subprocess was killed by signal (Illegal instruction), core dumped
52.07 dpkg: error while cleaning up:
52.07  installed libc6:riscv64 package pre-installation script subprocess was killed by signal (Illegal instruction), core dumped
52.30 Errors were encountered while processing:
52.30  /var/cache/apt/archives/libc6_2.42-0ubuntu3_riscv64.deb
52.46 E: Sub-process /usr/bin/dpkg returned an error code (1)
</code></pre></div></div>

<p>Checking the Ubuntu <a href="https://ubuntu.com/download/risc-v">download</a> page shows that Ubuntu have changed the hardware requirements.</p>

<blockquote>
  <p>We have upgraded the required RISC-V ISA profile to RVA23S64 with the 25.10 release. Hardware that is not RVA23 ready continues to be supported by our 24.04.3 LTS release.</p>
</blockquote>

<p>Searching online found this <a href="https://www.phoronix.com/news/Ubuntu-25.10-RISC-V-QEMU">article</a>.</p>

<blockquote>
  <p>Back in June it was announced by Canonical that for the Ubuntu 25.10 release <a href="https://www.phoronix.com/news/Ubuntu-25.10-To-Require-RVA23">they would be raising the RISC-V baseline to the RVA23 profile even with barely any available RISC-V platforms supporting that newer RISC-V profile</a>. That change is still going ahead and leaves Ubuntu 25.10 on RISC-V currently only supporting the QEMU virtualized target.</p>
</blockquote>

<p>Therefore, I have removed RISC-V as a supported platform for Ubuntu 25.10 until we can get some hardware that supports it or set up some QEMU workers.</p>

<p>Additionally, Anil suggested dropping support for Debian 11, Oracle Linux 8 and 9, and Fedora 41 to reduce the size of the build matrix.</p>

<p><a href="https://github.com/ocurrent/ocaml-dockerfile">ocaml-dockerfile</a> release 8.3.3 is now pending on <a href="https://github.com/ocaml/opam-repository/pull/28736">opam repository</a>.</p>

<p>Now that the base images have been successfully built, I can continue with the updates to <a href="https://github.com/ocurrent/opam-repo-ci">ocurrent/opam-repo-ci</a> with <a href="https://github.com/ocurrent/opam-repo-ci/pull/460">PR#460</a>, which only needs the opam repository SHA updated to include the new release of ocaml-version.</p>

<p><a href="https://github.com/ocurrent/ocaml-ci">ocurrent/ocaml-ci</a> uses git submodules for these packages, so these need to be updated: <a href="https://github.com/ocurrent/ocaml-ci/pull/1032">PR#1042</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Following the release of OCaml 5.4 the CI systems need to be updated to use it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Docker base image build rate</title><link href="https://www.tunbury.org/2025/10/10/docker-base-images/" rel="alternate" type="text/html" title="Docker base image build rate" /><published>2025-10-10T00:00:00+00:00</published><updated>2025-10-10T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/10/docker-base-images</id><content type="html" xml:base="https://www.tunbury.org/2025/10/10/docker-base-images/"><![CDATA[<p>We are increasingly hitting the Docker Hub rate limits when pushing the Docker base images. This issue was previously identified in <a href="https://github.com/ocurrent/docker-base-images/issues/267">issue #267</a>. However, this is now becoming critical as many more jobs are failing.</p>

<p>A typical failure log looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#13 [1/7] FROM docker.io/ocurrent/opam-staging@sha256:8ff156dd3a4ad8853b82940ac8965e8f0f4b18245e54fb26b9304f1ab961030b
#13 sha256:6b6519a49e416508fe7152b16035ad70bebba4d8f3486b6c0732c21da9433445
#13 resolve docker.io/ocurrent/opam-staging@sha256:8ff156dd3a4ad8853b82940ac8965e8f0f4b18245e54fb26b9304f1ab961030b
#13 resolve docker.io/ocurrent/opam-staging@sha256:8ff156dd3a4ad8853b82940ac8965e8f0f4b18245e54fb26b9304f1ab961030b 1.6s done
#13 ERROR: failed to copy: httpReadSeeker: failed open: unexpected status from GET request to https://registry-1.docker.io/v2/ocurrent/opam-staging/manifests/sha256:8ff156dd3a4ad8853b82940ac8965e8f0f4b18245e54fb26b9304f1ab961030b: 429 Too Many Requests
toomanyrequests: You have reached your unauthenticated pull rate limit. https://www.docker.com/increase-rate-limit
------
 &gt; [1/7] FROM docker.io/ocurrent/opam-staging@sha256:8ff156dd3a4ad8853b82940ac8965e8f0f4b18245e54fb26b9304f1ab961030b:
------
failed to load cache key: failed to copy: httpReadSeeker: failed open: unexpected status from GET request to https://registry-1.docker.io/v2/ocurrent/opam-staging/manifests/sha256:8ff156dd3a4ad8853b82940ac8965e8f0f4b18245e54fb26b9304f1ab961030b: 429 Too Many Requests
toomanyrequests: You have reached your unauthenticated pull rate limit. https://www.docker.com/increase-rate-limit
docker-build failed with exit-code 1
</code></pre></div></div>

<p>In the base image builder, we create our OCluster connection using the defaults:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">let</span> <span class="n">connection</span> <span class="o">=</span> <span class="nn">Current_ocluster</span><span class="p">.</span><span class="nn">Connection</span><span class="p">.</span><span class="n">create</span> <span class="n">submission_cap</span> <span class="k">in</span>
</code></pre></div></div>

<p>Looking at <a href="https://github.com/ocurrent/ocluster/blob/ba26623c6bca8b917c4252fa9739313fb14692ea/ocurrent-plugin/connection.ml#L177">ocurrent/ocluster</a>, the default is 200 jobs <em>per pool</em>. We submit to 6 pools with a rate limit of 200 per pool, resulting in an overall limit of 1,200 jobs.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">create</span> <span class="o">?</span><span class="p">(</span><span class="n">max_pipeline</span><span class="o">=</span><span class="mi">200</span><span class="p">)</span> <span class="n">sr</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">rate_limits</span> <span class="o">=</span> <span class="nn">Hashtbl</span><span class="p">.</span><span class="n">create</span> <span class="mi">10</span> <span class="k">in</span>
  <span class="p">{</span> <span class="n">sr</span><span class="p">;</span> <span class="n">sched</span> <span class="o">=</span> <span class="nn">Lwt</span><span class="p">.</span><span class="n">fail_with</span> <span class="s2">"init"</span><span class="p">;</span> <span class="n">rate_limits</span><span class="p">;</span> <span class="n">max_pipeline</span> <span class="p">}</span>
</code></pre></div></div>

<p>The current <code class="language-plaintext highlighter-rouge">builds.expected</code> file defines 1029 builds. The first 50 jobs building opam can run immediately; then, all the rest of the builds are unleashed. The breakdown of those follow-up compiler builds by pool is as follows: 352 for amd64, 232 for arm64, 102 for ppc64, 102 for s390x, 69 for riscv64, and 28 for Windows.</p>
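
<p>As a sanity check on the arithmetic, the overall cap is simply the per-pool limit multiplied by the number of pools:</p>

```shell
# Effective concurrency is per-pool limit x number of pools
# (numbers from the post).
pools=6
default_per_pool=200
echo $((pools * default_per_pool))   # 1200 with the default
reduced_per_pool=20
echo $((pools * reduced_per_pool))   # 120 after reducing to 20 per pool
```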

<p><a href="https://github.com/ocurrent/docker-base-images/pull/333">PR#333</a> reduces the rate to 20 builds per pool.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="docker,go" /><category term="tunbury.org" /><summary type="html"><![CDATA[We are increasingly hitting the Docker Hub rate limits when pushing the Docker base images. This issue was previously identified in issue #267. However, this is now becoming critical as many more jobs are failing.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/docker-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/docker-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Updating OCaml CI systems to FreeBSD 14.3</title><link href="https://www.tunbury.org/2025/10/07/freebsd-14.3/" rel="alternate" type="text/html" title="Updating OCaml CI systems to FreeBSD 14.3" /><published>2025-10-07T00:00:00+00:00</published><updated>2025-10-07T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/07/freebsd-14.3</id><content type="html" xml:base="https://www.tunbury.org/2025/10/07/freebsd-14.3/"><![CDATA[<p>The FreeBSD CI worker <code class="language-plaintext highlighter-rouge">rosemary</code> needs to be updated to FreeBSD 14.3.</p>

<p>The upgrade went without issue following the notes from last <a href="https://www.tunbury.org/2025/03/26/freebsd-14.2/">time</a>. <a href="https://github.com/ocurrent/freebsd-infra">ocurrent/freebsd-infra</a> was updated with <a href="https://github.com/ocurrent/freebsd-infra/pull/18">PR#18</a> and the base images were recreated with <code class="language-plaintext highlighter-rouge">ansible-playbook update.yml</code>.</p>

<p><a href="https://github.com/ocurrent/ocaml-ci/pull/1029">PR#1029</a> for OCaml CI and <a href="https://github.com/ocurrent/opam-repo-ci/pull/459">PR#459</a> for opam-repo-ci were pushed to their respective live branches.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="FreeBSD" /><category term="tunbury.org" /><summary type="html"><![CDATA[The FreeBSD CI worker rosemary needs to be updated to FreeBSD 14.3.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/freebsd-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/freebsd-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Attempting overlayfs with macFuse</title><link href="https://www.tunbury.org/2025/10/06/overlayfs-macFuse/" rel="alternate" type="text/html" title="Attempting overlayfs with macFuse" /><published>2025-10-06T06:00:00+00:00</published><updated>2025-10-06T06:00:00+00:00</updated><id>https://www.tunbury.org/2025/10/06/overlayfs-macFuse</id><content type="html" xml:base="https://www.tunbury.org/2025/10/06/overlayfs-macFuse/"><![CDATA[<p>It would be great if overlayFS or unionFS worked on macOS! Initially, I attempted to use DYLD_INTERPOSE, but I wasn’t able to intercept enough system calls to get it to work. However, macFuse provides a way to implement our own userspace file systems. Patrick previously wrote <a href="https://github.com/ocurrent/obuilder-fs">obuilder-fs</a>, which implemented a per-user filesystem redirection. It would be interesting to extend this concept to provide an overlayfs-style implementation.</p>

<p>My approach was to use an environment variable to flag which process should have the I/O redirected. When the user space layer of Fuse is called, the context includes the UID of the calling process. It is then possible to query the process’s environment and check for the marker variables. If none are found, then we can check the parent process. This won’t work for a double <code class="language-plaintext highlighter-rouge">fork()</code>, but it’s good enough to traverse <code class="language-plaintext highlighter-rouge">sudo</code>. Processes without the environment marker will pass through to the existing path.</p>
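
<p>The parent-traversal idea can be sketched on Linux, where a process’s initial environment is readable from <code class="language-plaintext highlighter-rouge">/proc</code>; the real implementation is C on macOS, which has no <code class="language-plaintext highlighter-rouge">/proc</code>, so this is only an illustration of the approach:</p>

```shell
# Walk up the process tree from PID $1 looking for a WRAPPER= variable
# in each process's initial environment; print the matching PID, or fail.
# Linux-only sketch: /proc/<pid>/environ holds the environment at exec time.
find_wrapper() {
  pid=$1
  while [ "$pid" -gt 1 ] 2>/dev/null; do
    if tr '\0' '\n' < "/proc/$pid/environ" 2>/dev/null | grep -q '^WRAPPER='; then
      echo "$pid"
      return 0
    fi
    # field 4 of /proc/<pid>/stat is the parent PID
    pid=$(awk '{print $4}' "/proc/$pid/stat" 2>/dev/null) || return 1
  done
  return 1
}
```

<p>As in the post, a double <code class="language-plaintext highlighter-rouge">fork()</code> breaks the chain (the orphan is re-parented to PID 1), but a single level of <code class="language-plaintext highlighter-rouge">sudo</code> is traversed.</p>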

<p>Passing through to the existing path is easier said than done. When the Fuse filesystem is mounted, the content of the underlying filesystem is completely hidden. The workaround was to move the existing files out of the way and redirect requests to this temporary directory.</p>

<p>Initially, this showed promise as trivial commands like <code class="language-plaintext highlighter-rouge">stat</code> and <code class="language-plaintext highlighter-rouge">ls</code> worked. However, the excitement was short-lived as complex commands failed with “Device not configured”.</p>

<p>For example, with Fuse mounted on <code class="language-plaintext highlighter-rouge">/usr/local</code>, some files and directories were created in <code class="language-plaintext highlighter-rouge">/tmp/a</code>, but very few.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nv">WRAPPER</span><span class="o">=</span>/tmp/a git <span class="nt">-C</span> /usr/local clone https://github.com/ocaml/opam-repository
Cloning into <span class="s1">'opam-repository'</span>...
/System/Volumes/Data/usr/local/opam-repository/.git/hooks/: Device not configured
</code></pre></div></div>

<p>The log showed that <code class="language-plaintext highlighter-rouge">fseventsd</code> tried to query all the directories which <code class="language-plaintext highlighter-rouge">git</code> created, but since it didn’t have the environment variable set, it couldn’t find the files. After a few failures, <code class="language-plaintext highlighter-rouge">fseventsd</code> seemed to mark the filesystem as bad and block access. The log snippet below shows a typical request from <code class="language-plaintext highlighter-rouge">fseventsd</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>unique: 8, opcode: GETATTR (3), nodeid: 21, insize: 56, pid: 522
getattr /opam-repository/.git
Searching for WRAPPER in process tree starting from PID 522:
    PID 522 has 1 args, checking environment...
    arg[0]: /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/Support/fseventsd
    Checked 4 environment variables, no WRAPPER found
  PID 522 (fseventsd): no wrapper
No WRAPPER found in process tree
*** GETATTR PASSTHROUGH: /opam-repository/.git -&gt; /System/Volumes/Data/usr/local.fuse/opam-repository/.git ***
   unique: 8, error: -2 (No such file or directory), outsize: 16
unique: 6, opcode: LOOKUP (1), nodeid: 20, insize: 45, pid: 522
LOOKUP /opam-repository/.git
getattr /opam-repository/.git
Searching for WRAPPER in process tree starting from PID 522:
    PID 522 has 1 args, checking environment...
    arg[0]: /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/Support/fseventsd
    Checked 4 environment variables, no WRAPPER found
  PID 522 (fseventsd): no wrapper
No WRAPPER found in process tree
*** GETATTR PASSTHROUGH: /opam-repository/.git -&gt; /System/Volumes/Data/usr/local.fuse/opam-repository/.git ***
   unique: 6, error: -2 (No such file or directory), outsize: 16
</code></pre></div></div>

<p>Searching online suggested that <code class="language-plaintext highlighter-rouge">fseventsd</code> could be blocked by creating a file named <code class="language-plaintext highlighter-rouge">/.fseventsd/no_log</code> on the filesystem. This didn’t work. Since the incoming request always came from <code class="language-plaintext highlighter-rouge">fseventsd</code>, could it be blocked at the Fuse level? As a quick test, I tried returning <code class="language-plaintext highlighter-rouge">ENOTSUP</code> based on the PID, and that worked! I replaced the static PID with a call to <code class="language-plaintext highlighter-rouge">proc_pidpath()</code> and matched the name against <code class="language-plaintext highlighter-rouge">fseventsd</code>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">if</span> <span class="p">(</span><span class="n">context</span><span class="o">-&gt;</span><span class="n">pid</span> <span class="o">==</span> <span class="mi">522</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">return</span> <span class="o">-</span><span class="n">ENOTSUP</span><span class="p">;</span>
    <span class="p">}</span>
</code></pre></div></div>
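
<p>The same name check can be sketched on Linux, where the process name is readable from <code class="language-plaintext highlighter-rouge">/proc/&lt;pid&gt;/comm</code> (illustrative only; the actual code uses <code class="language-plaintext highlighter-rouge">proc_pidpath()</code> on macOS):</p>

```shell
# Sketch of generalising the hard-coded PID: decide by process name.
# On Linux the short process name lives in /proc/<pid>/comm.
is_fseventsd() {
  [ "$(cat "/proc/$1/comm" 2>/dev/null)" = "fseventsd" ]
}
```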

<p>With this working, I implemented overlayfs-style semantics using the environment variables <code class="language-plaintext highlighter-rouge">WRAPPER_UPPER</code> and <code class="language-plaintext highlighter-rouge">WRAPPER_LOWER</code>. Deletions are handled by creating a whiteout directory, <code class="language-plaintext highlighter-rouge">.deleted</code>, at the root, which is populated with empty files reflecting the files and directories which have been deleted. If a file <code class="language-plaintext highlighter-rouge">bar</code> is deleted from directory <code class="language-plaintext highlighter-rouge">foo</code>, then <code class="language-plaintext highlighter-rouge">/.deleted/foo/bar</code> would be created. Later, if <code class="language-plaintext highlighter-rouge">foo</code> itself were removed, the directory <code class="language-plaintext highlighter-rouge">foo</code> would be removed from the whiteout directory and replaced with a file, <code class="language-plaintext highlighter-rouge">/.deleted/foo</code>.</p>
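
<p>The whiteout bookkeeping can be sketched as follows (illustrative shell, not the actual C implementation):</p>

```shell
# Sketch of the whiteout bookkeeping: deleting a file creates an empty
# marker under .deleted/; deleting a directory replaces any marker
# subtree with a single marker file. Paths are illustrative.
root=$(mktemp -d)

whiteout_file() {  # $1 = path of deleted file, relative to the mount root
  mkdir -p "$root/.deleted/$(dirname "$1")"
  : > "$root/.deleted/$1"
}

whiteout_dir() {   # $1 = path of deleted directory
  rm -rf "$root/.deleted/${1:?}"            # drop any per-file markers
  mkdir -p "$root/.deleted/$(dirname "$1")"
  : > "$root/.deleted/$1"                   # single marker for the directory
}

whiteout_file foo/bar   # rm foo/bar  -> .deleted/foo/bar appears
whiteout_dir foo        # rmdir foo   -> .deleted/foo becomes a plain file
```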

<p>opendir()/readdir() were the most complex functions to implement, as they needed to scan the upper directory and merge in the lower directory, taking account of any deleted files and hiding the <code class="language-plaintext highlighter-rouge">/.deleted</code> directory itself.</p>
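
<p>The merge logic can be sketched as a union of the two directory listings minus the whiteouts (again illustrative shell rather than the C implementation):</p>

```shell
# Sketch of the readdir merge: union of upper and lower entries,
# minus whiteouts, with the .deleted directory itself hidden.
merged_ls() {  # $1 = upper, $2 = lower, $3 = directory relative to root
  { ls -A "$1/$3" 2>/dev/null; ls -A "$2/$3" 2>/dev/null; } | sort -u |
  while IFS= read -r name; do
    [ "$3/$name" = "./.deleted" ] && continue   # hide the whiteout dir
    [ -e "$1/.deleted/$3/$name" ] && continue   # skip whited-out entries
    printf '%s\n' "$name"
  done
}

# demo: lower has a and b; upper has c and a whiteout for b
up=$(mktemp -d); low=$(mktemp -d)
mkdir -p "$up/.deleted"; : > "$up/.deleted/b"
touch "$low/a" "$low/b" "$up/c"
merged_ls "$up" "$low" .   # prints a and c; b is whited out, .deleted hidden
```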

<p>The redirection worked. For example, given these steps, <code class="language-plaintext highlighter-rouge">/tmp/a</code> would be empty, <code class="language-plaintext highlighter-rouge">/tmp/b</code> contains the vanilla checkout of opam-repository, and <code class="language-plaintext highlighter-rouge">/tmp/c</code> contains the difference: <code class="language-plaintext highlighter-rouge">/tmp/c/.deleted</code> with the files removed, <code class="language-plaintext highlighter-rouge">/tmp/c/opam-repository/...</code> and <code class="language-plaintext highlighter-rouge">/tmp/c/opam-repository/.git</code> with just the files which contain differences.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">mkdir</span> /tmp/a /tmp/b /tmp/c
% <span class="nv">WRAPPER_LOWER</span><span class="o">=</span>/tmp/a <span class="nv">WRAPPER_UPPER</span><span class="o">=</span>/tmp/b git <span class="nt">-C</span> /usr/local clone https://github.com/ocaml/opam-repository
% <span class="nv">WRAPPER_LOWER</span><span class="o">=</span>/tmp/b <span class="nv">WRAPPER_UPPER</span><span class="o">=</span>/tmp/c git <span class="nt">-C</span> /usr/local/opam-repository checkout c35a0314d6c7c7260c978f490fb8f7109f4e9766
</code></pre></div></div>

<p>Extending this further allows <code class="language-plaintext highlighter-rouge">/tmp/d</code> to be created with a different delta.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">mkdir</span> /tmp/d
% <span class="nv">WRAPPER_LOWER</span><span class="o">=</span>/tmp/b <span class="nv">WRAPPER_UPPER</span><span class="o">=</span>/tmp/d git <span class="nt">-C</span> /usr/local/opam-repository checkout f33f62ebff75cd03620d09d46a4540340f5564a6
</code></pre></div></div>

<p>Annoyingly, this revealed a significant issue: running <code class="language-plaintext highlighter-rouge">git status</code> on <code class="language-plaintext highlighter-rouge">/tmp/c</code> showed that files had changed. I presumed there was a flaw in my code which was corrupting the files, but I couldn’t find it. Examining the files on disk showed that they were correct, but reading them through Fuse gave different data:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="k">for </span>x <span class="k">in </span>c d <span class="p">;</span> <span class="k">do </span><span class="nb">cat</span> /tmp/<span class="nv">$x</span>/opam-repository/.git/HEAD <span class="p">;</span> <span class="nv">WRAPPER_LOWER</span><span class="o">=</span>/tmp/b <span class="nv">WRAPPER_UPPER</span><span class="o">=</span>/tmp/<span class="nv">$x</span> <span class="nb">cat</span> /usr/local/opam-repository/.git/HEAD <span class="p">;</span> <span class="k">done
</span>c35a0314d6c7c7260c978f490fb8f7109f4e9766
c35a0314d6c7c7260c978f490fb8f7109f4e9766
f33f62ebff75cd03620d09d46a4540340f5564a6
c35a0314d6c7c7260c978f490fb8f7109f4e9766
</code></pre></div></div>

<p>The log showed the root cause: two OPEN calls, but only a single READ. The kernel was caching the reads.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">grep </span>OPEN log5.txt
unique: 2, opcode: OPEN <span class="o">(</span>14<span class="o">)</span>, nodeid: 4, insize: 48, pid: 52976
<span class="k">***</span> OPEN: /opam-repository/.git/HEAD from UPPER: /tmp/c/opam-repository/.git/HEAD <span class="k">***</span>
unique: 3, opcode: OPEN <span class="o">(</span>14<span class="o">)</span>, nodeid: 4, insize: 48, pid: 52980
<span class="k">***</span> OPEN: /opam-repository/.git/HEAD from UPPER: /tmp/d/opam-repository/.git/HEAD <span class="k">***</span>

% <span class="nb">grep </span>READ log5.txt
unique: 3, opcode: READ <span class="o">(</span>15<span class="o">)</span>, nodeid: 4, insize: 80, pid: 52976
</code></pre></div></div>

<p>You can disable attribute caching with <code class="language-plaintext highlighter-rouge">-o attr_timeout=0 -o entry_timeout=0</code>, and you can circumvent the cache by specifying <code class="language-plaintext highlighter-rouge">-o direct_io</code>. Setting <code class="language-plaintext highlighter-rouge">direct_io</code> is sufficient to resolve the issue in a simple <code class="language-plaintext highlighter-rouge">cat</code> test, but it has the side effect of disabling <code class="language-plaintext highlighter-rouge">mmap()</code>, which causes <code class="language-plaintext highlighter-rouge">git</code> to crash with a <code class="language-plaintext highlighter-rouge">bus error</code>. Setting <code class="language-plaintext highlighter-rouge">fi-&gt;keep_cache = 0</code> doesn’t prevent the cache.</p>

<p>The kernel asks Fuse to allocate a node ID for a path. The node ID number is passed as a parameter to GETATTR, OPEN and READ. Even though GETATTR returns different mtime values on the second call, the kernel still sees a cache hit and returns the file content from the cache.</p>

<p>To control node ID allocation, this needs to be rewritten using the Fuse low-level API, which would allow full control over the allocation process and give access to calls such as <code class="language-plaintext highlighter-rouge">fuse_lowlevel_notify_inval_inode()</code>.</p>

<p>My work-in-progress code is available on GitHub <a href="https://github.com/mtelvers/macfuse/blob/master/LoopbackFS-C/loopback/loopback.c">mtelvers/macfuse</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="macfuse" /><category term="tunbury.org" /><summary type="html"><![CDATA[It would be great if overlayFS or unionFS worked on macOS! Initially, I attempted to use DYLD_INTERPOSE, but I wasn’t able to intercept enough system calls to get it to work. However, macFuse provides a way to implement our own userspace file systems. Patrick previously wrote obuilder-fs, which implemented a per-user filesystem redirection. It would be interesting to extend this concept to provide an overlayfs-style implementation.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/macfuse-home.png" /><media:content medium="image" url="https://www.tunbury.org/images/macfuse-home.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Apache Parquet Files</title><link href="https://www.tunbury.org/2025/09/17/parquet-files/" rel="alternate" type="text/html" title="Apache Parquet Files" /><published>2025-09-17T21:00:00+00:00</published><updated>2025-09-17T21:00:00+00:00</updated><id>https://www.tunbury.org/2025/09/17/parquet-files</id><content type="html" xml:base="https://www.tunbury.org/2025/09/17/parquet-files/"><![CDATA[<p>If you haven’t discovered the <a href="https://parquet.apache.org">Apache Parquet</a> file format, allow me to introduce it along with <a href="https://clickhouse.com">ClickHouse</a>.</p>

<p>Parquet is a columnar storage file format designed for analytics and big data processing. Data is stored by column rather than by row, there is efficient compression, and the file contains the schema definition.</p>

<p>On Ubuntu, you first need to add the ClickHouse repository.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list
</code></pre></div></div>

<p>Update and install. I’m going to use <code class="language-plaintext highlighter-rouge">clickhouse local</code>, so I only need the client.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt update
apt install -y clickhouse-client
</code></pre></div></div>

<p>Given the JSON file below, you can use ClickHouse to run SQL queries on it directly: <code class="language-plaintext highlighter-rouge">clickhouse local --query "SELECT * FROM file('x.json')"</code></p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0install-gtk.2.18"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"status"</span><span class="p">:</span><span class="w"> </span><span class="s2">"no_solution"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"sha"</span><span class="p">:</span><span class="w"> </span><span class="s2">"d0b74334d458c26f4b769b9b5819f7af222b159c"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"solution"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Can't find all required versions."</span><span class="p">,</span><span class="w">
    </span><span class="nl">"os"</span><span class="p">:</span><span class="w"> </span><span class="s2">"debian-12"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"compiler"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ocaml-base-compiler.5.4.0~beta1"</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span></code></pre></div></div>

<p>Powerfully, the <code class="language-plaintext highlighter-rouge">file</code> parameter can contain wildcards, such as <code class="language-plaintext highlighter-rouge">*.json</code>, in which case the <code class="language-plaintext highlighter-rouge">SELECT</code> is performed across all the files.</p>

<p>In my examples below, the JSON file is 573MB. Let’s try to find all the records where <code class="language-plaintext highlighter-rouge">status = "no_solution"</code>.</p>

<p>We could use <code class="language-plaintext highlighter-rouge">jq</code> with a command like <code class="language-plaintext highlighter-rouge">jq 'map(select(.status == "no_solution")) | length' commit.json</code>. This takes over 2 seconds on my machine. Cheating and using <code class="language-plaintext highlighter-rouge">grep no_solution commit.json | wc -l</code> takes 0.2 seconds.</p>

<p>Using ClickHouse on the same datasource, <code class="language-plaintext highlighter-rouge">clickhouse local --query "SELECT COUNT() FROM file('commit.json') WHERE status = 'no_solution'"</code> matches the performance of <code class="language-plaintext highlighter-rouge">grep</code> returning the count in 0.2 seconds.</p>

<p>Converting the JSON into Parquet format is straightforward. The output file size is an amazing 24MB. Contrast that with <code class="language-plaintext highlighter-rouge">gzip -9 commit.json</code>, which creates a file of 33MB!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>clickhouse local --query "SELECT * FROM file('commit.json', 'JSONEachRow') INTO OUTFILE 'commit.parquet' FORMAT Parquet"
</code></pre></div></div>

<p>Now running our query again: <code class="language-plaintext highlighter-rouge">clickhouse local --query "SELECT COUNT() FROM file('commit.parquet') WHERE status = 'no_solution'"</code>. Just over 0.1 seconds.</p>

<p>How can I use these in my OCaml project? <a href="https://github.com/LaurentMazare/ocaml-arrow">LaurentMazare/ocaml-arrow</a> has created extensive OCaml bindings for Apache Arrow using the C++ API. These bindings support Arrow versions 4 and 5, but the current Arrow release is version 21. I have an updated commit which works with version 21 and C++ 17: <a href="https://github.com/mtelvers/ocaml-arrow/tree/arrow-21-cpp17">mtelvers/ocaml-arrow/tree/arrow-21-cpp17</a></p>

<p>I have also reimplemented the bulk of the library using the OCaml Standard Library which is available in <a href="https://github.com/mtelvers/arrow">mtelvers/arrow</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="apache,parquet" /><category term="tunbury.org" /><summary type="html"><![CDATA[If you haven’t discovered the Apache Parquet file format, allow me to introduce it along with ClickHouse.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/apache-parquet-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/apache-parquet-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Optimising Data Access in Parquet Files</title><link href="https://www.tunbury.org/2025/09/17/optimising-parquet-files/" rel="alternate" type="text/html" title="Optimising Data Access in Parquet Files" /><published>2025-09-17T21:00:00+00:00</published><updated>2025-09-17T21:00:00+00:00</updated><id>https://www.tunbury.org/2025/09/17/optimising-parquet-files</id><content type="html" xml:base="https://www.tunbury.org/2025/09/17/optimising-parquet-files/"><![CDATA[<p>Yesterday I wrote about the amazing performance of Apache Parquet files; today I reflect on how that translates into an actual application reading Parquet files using the OCaml wrapper of Apache’s C++ library.</p>

<p>I have a TUI application that displays build results for OCaml packages across multiple compiler versions. The application needs to provide two primary operations:</p>

<ol>
  <li>Table view: Display a matrix of build statuses (packages × compilers)</li>
  <li>Detail view: Show detailed build logs and dependency solutions for specific package-compiler combinations</li>
</ol>

<p>The dataset contained 48,895 records with the following schema:</p>

<ul>
  <li>name: Package name (~4,500 unique values)</li>
  <li>compiler: Compiler version (~11 unique versions)</li>
  <li>status: Build result (success/failure/etc.)</li>
  <li>log: Detailed build output (large text field)</li>
  <li>solution: Dependency resolution graph (large text field)</li>
</ul>
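<p>As a rough consistency check, assuming the ~4,500 packages and ~11 compiler versions noted above, the record count is close to one row per package-compiler pair:</p>

```shell
# ~4,500 packages x ~11 compilers, versus the 48,895 records observed
echo $((4500 * 11))
```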

<h1 id="initial-implementation-and-performance-bottleneck">Initial Implementation and Performance Bottleneck</h1>

<p>The initial implementation used Apache Arrow’s OCaml bindings to load the complete Parquet file into memory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let analyze_data filename =
  let table = Arrow.Parquet_reader.table filename in
  let name_col = Arrow.Wrapper.Column.read_utf8 table ~column:(`Name "name") in
  let status_col = Arrow.Wrapper.Column.read_utf8_opt table ~column:(`Name "status") in
  let compiler_col = Arrow.Wrapper.Column.read_utf8 table ~column:(`Name "compiler") in
  let log_col = Arrow.Wrapper.Column.read_utf8_opt table ~column:(`Name "log") in
  let solution_col = Arrow.Wrapper.Column.read_utf8_opt table ~column:(`Name "solution") in
  (* Build hashtable for O(1) lookups *)
</code></pre></div></div>

<p>This approach exhibited 3-4 second loading times, creating an unacceptable user experience for interactive data exploration.</p>

<h1 id="performance-analysis">Performance Analysis</h1>

<h2 id="phase-1-timing-instrumentation">Phase 1: Timing Instrumentation</h2>

<p>I implemented some basic timing instrumentation to identify bottlenecks by logging data to a file.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">append_to_file</span> <span class="n">filename</span> <span class="n">message</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">oc</span> <span class="o">=</span> <span class="n">open_out_gen</span> <span class="p">[</span><span class="nc">Open_creat</span><span class="p">;</span> <span class="nc">Open_text</span><span class="p">;</span> <span class="nc">Open_append</span><span class="p">]</span> <span class="mo">0o644</span> <span class="n">filename</span> <span class="k">in</span>
  <span class="nn">Printf</span><span class="p">.</span><span class="n">fprintf</span> <span class="n">oc</span> <span class="s2">"%s: %s</span><span class="se">\n</span><span class="s2">"</span> <span class="p">(</span><span class="nn">Sys</span><span class="p">.</span><span class="n">time</span> <span class="bp">()</span> <span class="o">|&gt;</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">sprintf</span> <span class="s2">"%.3f"</span><span class="p">)</span> <span class="n">message</span><span class="p">;</span>
  <span class="n">close_out</span> <span class="n">oc</span>
</code></pre></div></div>

<p>The timings revealed that <code class="language-plaintext highlighter-rouge">Arrow.Parquet_reader.table</code> consumed ~3.6 seconds (80%) of the total loading time, with individual column extractions adding minimal overhead.</p>

<h2 id="phase-2-deep-api-analysis">Phase 2: Deep API Analysis</h2>

<p>I reviewed the Arrow C++ implementation to understand its performance characteristics:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="c1">// From arrow_c_api.cc - the core bottleneck</span>
  <span class="n">TablePtr</span> <span class="o">*</span><span class="nf">parquet_read_table</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="n">filename</span><span class="p">,</span> <span class="kt">int</span> <span class="o">*</span><span class="n">col_idxs</span><span class="p">,</span> <span class="kt">int</span> <span class="n">ncols</span><span class="p">,</span>
                                <span class="kt">int</span> <span class="n">use_threads</span><span class="p">,</span> <span class="kt">int64_t</span> <span class="n">only_first</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// ...</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">only_first</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">st</span> <span class="o">=</span> <span class="n">reader</span><span class="o">-&gt;</span><span class="n">ReadTable</span><span class="p">(</span><span class="o">&amp;</span><span class="n">table</span><span class="p">);</span>  <span class="c1">// Loads entire table!</span>
    <span class="p">}</span>
    <span class="c1">// ...</span>
  <span class="p">}</span>
</code></pre></div></div>

<p>This shows that the <code class="language-plaintext highlighter-rouge">ReadTable()</code> operation materialises the complete dataset in memory, regardless of actual usage patterns.</p>

<h1 id="optimisation-strategy-column-selection">Optimisation Strategy: Column Selection</h1>

<p>Could the large text fields (log and solution columns) be responsible for the performance bottleneck?</p>

<p>I modified the table loading to exclude large columns during initial load:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">table</span> <span class="o">=</span> <span class="nn">Arrow</span><span class="p">.</span><span class="nn">Parquet_reader</span><span class="p">.</span><span class="n">table</span> <span class="o">~</span><span class="n">column_idxs</span><span class="o">:</span><span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">1</span><span class="p">;</span> <span class="mi">6</span><span class="p">;</span> <span class="mi">7</span><span class="p">]</span> <span class="n">filename</span> <span class="k">in</span>
  <span class="c">(* Only load: name, status, os, compiler *)</span>
</code></pre></div></div>

<p>This dramatically reduced the loading time from 3.6 seconds to 0.021 seconds.</p>

<p>This optimisation validated the hypothesis that the large text columns were the primary bottleneck. However, it created a new challenge of accessing the detailed log/solution data for individual records.</p>

<p>There is a function <code class="language-plaintext highlighter-rouge">Arrow.Parquet_reader.fold_batches</code> which could be used for on-demand detail loading:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">find_package_detail</span> <span class="n">filename</span> <span class="n">target_package</span> <span class="n">target_compiler</span> <span class="o">=</span>
  <span class="nn">Arrow</span><span class="p">.</span><span class="nn">Parquet_reader</span><span class="p">.</span><span class="n">fold_batches</span> <span class="n">filename</span>
    <span class="o">~</span><span class="n">column_idxs</span><span class="o">:</span><span class="p">[</span><span class="mi">0</span><span class="p">;</span> <span class="mi">4</span><span class="p">;</span> <span class="mi">5</span><span class="p">;</span> <span class="mi">7</span><span class="p">]</span>  <span class="c">(* name, log, solution, compiler *)</span>
    <span class="o">~</span><span class="n">batch_size</span><span class="o">:</span><span class="mi">100</span>
    <span class="o">~</span><span class="n">f</span><span class="o">:</span><span class="p">(</span><span class="k">fun</span> <span class="bp">()</span> <span class="n">batch</span> <span class="o">-&gt;</span>
      <span class="c">(* Search batch for target, stop when found *)</span>
    <span class="p">)</span>
</code></pre></div></div>

<p>However, the timings showed that <code class="language-plaintext highlighter-rouge">fold_batches</code> was equivalent in cost to loading the whole table; only when the log and solution columns were omitted was it fast:</p>

<ul>
  <li>With large columns: 2.981 seconds</li>
  <li>Without large columns: 0.033 seconds (33ms)</li>
</ul>

<h1 id="comparative-analysis-clickhouse-vs-arrow">Comparative Analysis: ClickHouse vs Arrow</h1>

<p>To establish performance baselines, I compared Arrow’s performance with <code class="language-plaintext highlighter-rouge">clickhouse local</code>:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ClickHouse aggregation query (equivalent to table view)</span>
<span class="nb">time </span>clickhouse <span class="nb">local</span> <span class="nt">--query</span> <span class="s2">"
  SELECT name, anyIf(status, compiler = 'ocaml.5.3.0') as col1, ...
  FROM file('data.parquet', 'Parquet') GROUP BY name ORDER BY name"</span>
<span class="c"># Result: 0.2 seconds</span>

<span class="c"># ClickHouse individual lookup</span>
<span class="nb">time </span>clickhouse <span class="nb">local</span> <span class="nt">--query</span> <span class="s2">"
  SELECT log, solution FROM file('data.parquet', 'Parquet') WHERE name = '0install.2.18' AND compiler = 'ocaml.5.3.0'"</span>
<span class="c"># Result: 1.716 seconds</span>

<span class="c"># ClickHouse lookup without large columns</span>
<span class="nb">time </span>clickhouse <span class="nb">local</span> <span class="nt">--query</span> <span class="s2">"
  SELECT COUNT() FROM file('data.parquet', 'Parquet') WHERE name = '0install.2.18' AND compiler = 'ocaml.5.3.0'"</span>
<span class="c"># Result: 0.190 seconds</span>
</code></pre></div></div>

<p>The 1.5-second difference (1.716s - 0.190s) represents the fundamental cost of decompressing and decoding the large text fields, and this cost is present both in OCaml and in ClickHouse.</p>

<h1 id="data-structure-redesign-the-wide-table-approach">Data Structure Redesign: The Wide Table Approach</h1>

<p>Instead of searching through 48,895 rows to find specific package-compiler combinations, I restructured the data into a wide table format:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span>
    <span class="n">name</span><span class="p">,</span>
    <span class="n">anyIf</span><span class="p">(</span><span class="n">status</span><span class="p">,</span> <span class="n">compiler</span> <span class="o">=</span> <span class="s1">'ocaml.5.3.0'</span><span class="p">)</span> <span class="k">as</span> <span class="n">status_5_3_0</span><span class="p">,</span>
    <span class="n">anyIf</span><span class="p">(</span><span class="n">log</span><span class="p">,</span> <span class="n">compiler</span> <span class="o">=</span> <span class="s1">'ocaml.5.3.0'</span><span class="p">)</span> <span class="k">as</span> <span class="n">log_5_3_0</span><span class="p">,</span>
    <span class="n">anyIf</span><span class="p">(</span><span class="n">solution</span><span class="p">,</span> <span class="n">compiler</span> <span class="o">=</span> <span class="s1">'ocaml.5.3.0'</span><span class="p">)</span> <span class="k">as</span> <span class="n">solution_5_3_0</span><span class="p">,</span>
    <span class="c1">-- ... repeat for all compilers</span>
<span class="k">FROM</span> <span class="n">file</span><span class="p">(</span><span class="s1">'original.parquet'</span><span class="p">,</span> <span class="s1">'Parquet'</span><span class="p">)</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="n">name</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">name</span>
</code></pre></div></div>

<p>This transformation:</p>
<ul>
  <li>Reduced row count from ~48,895 to ~4,500 (one row per package)</li>
  <li>Eliminated search operations - direct column access by name</li>
  <li>Preserved all data while optimising access patterns</li>
</ul>
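<p>Assuming the ~11 compiler versions mentioned earlier, the wide table has a fixed and modest number of columns:</p>

```shell
# 1 name column plus (status, log, solution) per compiler version
echo $((1 + 3 * 11))
```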

<p>The wide table restructure delivered the expected performance both in ClickHouse and OCaml.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">time </span>clickhouse <span class="nb">local</span> <span class="nt">--query</span> <span class="s2">"
  SELECT log_5_3_0, solution_5_3_0      FROM file('restructured.parquet', 'Parquet')      WHERE name = '0install.2.18'"</span>
<span class="c"># Result: 0.294 seconds</span>
</code></pre></div></div>

<h1 id="conclusion">Conclusion</h1>

<p>There is no way to access a specific row within a column without loading (and thus decompressing) the entire column. With a column of ~50K rows, this takes a significant time. By splitting the table by compiler and by log, any column that needs to be loaded is only ~4.5K rows, making the application much more responsive.</p>

<p>The wide table schema goes against my instincts for database table structure, and adds complexity when later using this dataset in other queries. This trade-off between performance and schema flexibility needs careful thought based on specific application requirements.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="apache,parquet" /><category term="tunbury.org" /><summary type="html"><![CDATA[Yesterday I wrote about the amazing performance of Apache Parquet files; today I reflect on how that translates into an actual application reading Parquet files using the OCaml wrapper of Apache’s C++ library.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/apache-parquet-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/apache-parquet-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">FreeBSD unionfs deadlock</title><link href="https://www.tunbury.org/2025/09/17/freebsd-unionfs/" rel="alternate" type="text/html" title="FreeBSD unionfs deadlock" /><published>2025-09-17T12:00:00+00:00</published><updated>2025-09-17T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/09/17/freebsd-unionfs</id><content type="html" xml:base="https://www.tunbury.org/2025/09/17/freebsd-unionfs/"><![CDATA[<p>FreeBSD Jails provide isolated system containers that are perfect for CI testing. Miod <a href="https://tarides.com/blog/2023-10-04-porting-obuilder-to-freebsd/">ported OBuilder</a> to FreeBSD back in 2023. I have been looking at some different approaches using unionfs.</p>

<p>I’d like to have a read-only base layer with the OS, a middle layer containing source code and system libraries, and a top writable layer for the build results. This is easily constructed in an <code class="language-plaintext highlighter-rouge">fstab</code> for the <code class="language-plaintext highlighter-rouge">jail</code> like this.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/opam/bsd-1402000-x86_64/base/fs /home/opam/temp-2b9f69/work nullfs ro 0 0
/home/opam/temp-2b9f69/lower /home/opam/temp-2b9f69/work unionfs ro 0 0
/home/opam/temp-2b9f69/fs /home/opam/temp-2b9f69/work unionfs rw 0 0
/home/opam/opam-repository /home/opam/temp-2b9f69/work/home/opam/opam-repository nullfs ro 0 0
</code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">jail -c name=temp-2b9f69 path=/home/opam/temp-2b9f69/work mount.devfs mount.fstab=/home/opam/temp-7323b6/fstab ...</code> works as expected; it’s good enough to build OCaml, but it reliably deadlocks the entire machine when trying to build dune. This appears to be an old problem: <a href="https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=165087">165087</a>, <a href="https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201677">201677</a> and <a href="https://people.freebsd.org/~daichi/unionfs">unionfs</a>. There is a <a href="https://freebsdfoundation.org/project/unionfs-stability-and-enhancement">project</a> aiming to improve unionfs for use in jails.</p>

<p>My workaround is to create a temporary layer that merges the base and lower layers together. Initially, I did this by mounting <code class="language-plaintext highlighter-rouge">tmpfs</code> to the lower mount point and using <code class="language-plaintext highlighter-rouge">cp</code> to copy the files. The performance was poor, so instead I created the layer on disk and used <code class="language-plaintext highlighter-rouge">cp -l</code> to hard link the files. The simplified <code class="language-plaintext highlighter-rouge">fstab</code> works successfully in my testing.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home/opam/temp-2b9f69/lower /home/opam/temp-2b9f69/work nullfs ro 0 0
/home/opam/temp-2b9f69/fs /home/opam/temp-2b9f69/work unionfs rw 0 0
/home/opam/opam-repository /home/opam/temp-2b9f69/work/home/opam/opam-repository nullfs ro 0 0
</code></pre></div></div>
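<p>The hard-link merge itself can be sketched as below. This is an illustrative sketch with made-up paths, not the real layer directories, and it uses GNU userland; on FreeBSD, <code class="language-plaintext highlighter-rouge">stat -f '%l'</code> prints the link count instead of <code class="language-plaintext highlighter-rouge">stat -c '%h'</code>.</p>

```shell
#!/bin/sh
# Sketch of the hard-link merge (illustrative paths only)
set -e
tmp=$(mktemp -d)
mkdir "$tmp/base" "$tmp/lower" "$tmp/merged"
echo 'base file'  > "$tmp/base/os.txt"
echo 'lower file' > "$tmp/lower/src.txt"
# cp -l creates hard links instead of copying file data, so merging
# costs one directory entry per file rather than a full copy
cp -Rl "$tmp/base/." "$tmp/merged/"
cp -Rl "$tmp/lower/." "$tmp/merged/"
# Each merged file shares its inode with the original: link count 2
stat -c '%h' "$tmp/merged/os.txt" "$tmp/merged/src.txt"
```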

<p>FreeBSD protects key system files by marking them as immutable; this prevents hard links to the files. Therefore, I needed to remove these flags after the <code class="language-plaintext highlighter-rouge">bsdinstall</code> has completed. <code class="language-plaintext highlighter-rouge">chflags -R 0 basefs</code></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="FreeBSD,unionfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[FreeBSD Jails provide isolated system containers that are perfect for CI testing. Miod ported OBuilder to FreeBSD back in 2023. I have been looking at some different approaches using unionfs.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/freebsd-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/freebsd-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Terminal Plotter</title><link href="https://www.tunbury.org/2025/09/07/terminal-plotter/" rel="alternate" type="text/html" title="Terminal Plotter" /><published>2025-09-07T00:00:00+00:00</published><updated>2025-09-07T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/09/07/terminal-plotter</id><content type="html" xml:base="https://www.tunbury.org/2025/09/07/terminal-plotter/"><![CDATA[<p>I frequently want a quick way to monitor things from the shell, be that <code class="language-plaintext highlighter-rouge">watch -d df -h</code>, or <code class="language-plaintext highlighter-rouge">while true</code> loop, or an <code class="language-plaintext highlighter-rouge">awk</code> script. The scripts become increasingly complex when you want to measure the difference between the current and previous value. The solution is <a href="https://github.com/mtelvers/terminal-plotter">mtelvers/terminal-plotter</a>.</p>

<p>I set out to write this in Mosaic, but I ran into various bugs within the framework, so I abandoned it in favour of <a href="https://github.com/pqwy/notty">pqwy/notty</a> and the histograms I created for <a href="https://www.tunbury.org/2025/08/24/ocluster-monitor/">ocluster-monitor</a>.</p>

<p>Consider <code class="language-plaintext highlighter-rouge">/proc/loadavg</code>: typical values are shown below. The first three numbers are the load averages over 1, 5 and 15 minutes; <code class="language-plaintext highlighter-rouge">1/623</code> means that 1 of the 623 processes on the system is currently running; and the final value is the PID of the most recently created process.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0.04 0.02 0.00 1/623 2828549
</code></pre></div></div>

<p>A simple use case is to run <code class="language-plaintext highlighter-rouge">terminal-plotter --file /proc/loadavg</code>, which reads <code class="language-plaintext highlighter-rouge">/proc/loadavg</code> every 2 seconds and displays the values in 5 graphs. The entry <code class="language-plaintext highlighter-rouge">1/623</code> is automatically considered a fraction.</p>

<p><img src="/images/loadavg-simple.png" alt="" /></p>
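<p>The automatic fraction handling can be illustrated with a small <code class="language-plaintext highlighter-rouge">awk</code> sketch; this shows the idea, not the tool's actual parser:</p>

```shell
# Split the fourth field on "/" and evaluate it as a fraction
echo '0.04 0.02 0.00 1/623 2828549' |
  awk '{ split($4, f, "/"); printf "%.4f\n", f[1] / f[2] }'
```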

<p>You can add labels to your charts. In the example below, <code class="language-plaintext highlighter-rouge">c0</code> represents column 0, <code class="language-plaintext highlighter-rouge">c1</code> column 1, etc.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>terminal-plotter <span class="nt">--file</span> /proc/loadavg <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"load 1m:c0"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"load 5m:c1"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"load 15m:c2"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"running:c3"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"pid:c4"</span>
</code></pre></div></div>

<p><img src="/images/loadavg-labels.png" alt="" /></p>

<p>Since <code class="language-plaintext highlighter-rouge">pid</code> always increases, graphing it is a bit pointless. We’d rather see the difference between the current and previous values. We can use <code class="language-plaintext highlighter-rouge">--counter</code> to indicate that we want the delta rather than the absolute value.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>terminal-plotter <span class="nt">--file</span> /proc/loadavg <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"load 1m:c0"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"load 5m:c1"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"load 15m:c2"</span> <span class="se">\</span>
  <span class="nt">--value</span> <span class="s2">"running:c3"</span> <span class="se">\</span>
  <span class="nt">--counter</span> <span class="s2">"pid:c4"</span>
</code></pre></div></div>

<p><img src="/images/loadavg-counter.png" alt="" /></p>
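<p>The delta computation behind <code class="language-plaintext highlighter-rouge">--counter</code> amounts to subtracting the previous sample from the current one, which can be sketched in <code class="language-plaintext highlighter-rouge">awk</code>:</p>

```shell
# Three successive samples of a monotonically increasing counter;
# print the change between each pair of consecutive samples
printf '100\n103\n110\n' |
  awk 'NR > 1 { print $1 - prev } { prev = $1 }'
```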

<p>Now consider a more complex example, <code class="language-plaintext highlighter-rouge">/proc/stat</code>. The first line aggregates the per-CPU activity shown on the lines that follow.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat /proc/stat 
cpu  67153280 1763 14886491 1223984556 65971570 0 59050 0 0 0
cpu0 2029319 125 556429 29970217 1631023 0 1612 0 0 0
cpu1 2002631 152 467156 29813299 1980226 0 1344 0 0 0
cpu2 1918663 134 425357 29983099 1957736 0 1346 0 0 0
...
</code></pre></div></div>

<p>Here, we can use the <code class="language-plaintext highlighter-rouge">r1c0</code> notation to indicate the number in row 1, column 0 (the first numeric value on the second row). The example sums the various jiffy counters and plots the difference between the current and previous value.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>terminal-plotter --file /proc/stat \
  --counter CPU0:r1c0+r1c1+r1c2+r1c4+r1c5+r1c6 \
  --counter CPU1:r2c0+r2c1+r2c2+r2c4+r2c5+r2c6 \
... etc
</code></pre></div></div>
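<p>Using the <code class="language-plaintext highlighter-rouge">/proc/stat</code> sample above, the <code class="language-plaintext highlighter-rouge">CPU0</code> expression can be checked by hand with <code class="language-plaintext highlighter-rouge">awk</code> (awk's <code class="language-plaintext highlighter-rouge">$1</code> is the <code class="language-plaintext highlighter-rouge">cpu0</code> label, so column <code class="language-plaintext highlighter-rouge">c0</code> is <code class="language-plaintext highlighter-rouge">$2</code>):</p>

```shell
# Sum columns c0+c1+c2+c4+c5+c6 of row 1 (the cpu0 line)
printf 'cpu  67153280 1763 14886491 1223984556 65971570 0 59050 0 0 0\ncpu0 2029319 125 556429 29970217 1631023 0 1612 0 0 0\n' |
  awk 'NR == 2 { print $2 + $3 + $4 + $6 + $7 + $8 }'
```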

<p><img src="/images/stat.png" alt="" /></p>

<p>After crafting a complex command line, you can save it to <code class="language-plaintext highlighter-rouge">~/.terminal-plotter</code> with a unique <em>key</em>, then future invocations can load the settings from the file. e.g. <code class="language-plaintext highlighter-rouge">terminal-plotter loadavg</code> will load the profile <code class="language-plaintext highlighter-rouge">loadavg</code> from <code class="language-plaintext highlighter-rouge">~/.terminal-plotter</code> containing:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>loadavg --file /proc/loadavg --value "Load:c0" --value "load 5m:c1" --value "load 15m:c2" --value "running:c3" --counter "pid:c4"
</code></pre></div></div>

<p>You may recall my <code class="language-plaintext highlighter-rouge">awk</code> script for <code class="language-plaintext highlighter-rouge">dmsetup</code>, which I used to monitor <a href="https://www.tunbury.org/2025/09/04/dm-cache/">dm-cache</a>. This can be implemented as below.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>terminal-plotter <span class="nt">-i</span> 15 <span class="nt">--exec</span> <span class="s2">"sudo dmsetup status fast-sdd"</span> <span class="nt">--value</span> c6 <span class="nt">--value</span> <span class="s2">"Read Hits: (c7/(c7+c8))"</span> <span class="nt">--value</span> <span class="s2">"Write Hits: (c9/(c9+c10))"</span> <span class="nt">--value</span> <span class="s2">"Dirty:c13"</span>  <span class="nt">--counter</span> <span class="s2">"Demotions:c11"</span> <span class="nt">--counter</span> <span class="s2">"Promotions:c12"</span>
</code></pre></div></div>

<p>You can use standard arithmetic expressions using either <code class="language-plaintext highlighter-rouge">rNcM</code> or <code class="language-plaintext highlighter-rouge">cM</code> notation.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="dm-cache,Ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[I frequently want a quick way to monitor things from the shell, be that watch -d df -h, or while true loop, or an awk script. The scripts become increasingly complex when you want to measure the difference between the current and previous value. The solution is mtelvers/terminal-plotter.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/stat.png" /><media:content medium="image" url="https://www.tunbury.org/images/stat.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Using dm-cache with a RAM disk</title><link href="https://www.tunbury.org/2025/09/04/dm-cache/" rel="alternate" type="text/html" title="Using dm-cache with a RAM disk" /><published>2025-09-04T00:00:00+00:00</published><updated>2025-09-04T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/09/04/dm-cache</id><content type="html" xml:base="https://www.tunbury.org/2025/09/04/dm-cache/"><![CDATA[<p>I have written about <a href="https://www.tunbury.org/2025/04/21/ubuntu-dm-cache/">dm-cache</a> previously, when I used it with LVM for SSD/HDD caching. In this post, I will explore using dm-cache with RAM as the cache layer over a spinning disk.</p>

<p>I have a CI workload that I could almost fit entirely in tmpfs, but then I would not have any data persistence across reboots. I also have existing data on disk, which I’d rather not regenerate.</p>

<p>To use any cache, we need a block store and a metadata store. As I mentioned in the previous post, the metadata is typically 1% of the size of the block store. Since empty RAM disks don&#8217;t take up any space until written to, I&#8217;ll create two 100G RAM disks, one for metadata and one for the block store. Equally, I could have partitioned a single RAM disk.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>modprobe brd <span class="nv">rd_size</span><span class="o">=</span>107374182400 <span class="nv">rd_nr</span><span class="o">=</span>2 <span class="nv">max_part</span><span class="o">=</span>1
</code></pre></div></div>

<p>Let’s configure these with <code class="language-plaintext highlighter-rouge">dmsetup</code> with the sizes given in 512-byte sectors.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dmsetup create cache-meta <span class="nt">--table</span> <span class="s2">"0 2097152 linear /dev/ram0 0"</span>
dmsetup create cache-data <span class="nt">--table</span> <span class="s2">"0 209715200 linear /dev/ram1 0"</span>
</code></pre></div></div>
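<p>The sector counts in the tables above follow from dividing the device sizes by 512 bytes: 1 GiB for the metadata device (1% of the 100G block store) and 100 GiB for the data device.</p>

```shell
echo $((1 * 1024 * 1024 * 1024 / 512))     # 1 GiB metadata device: 2097152 sectors
echo $((100 * 1024 * 1024 * 1024 / 512))   # 100 GiB data device: 209715200 sectors
```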

<p>There is a lot of outdated information online about the cache settings. Firstly, there are references to a default policy, a Stochastic Multiqueue (SMQ) policy and a Multiqueue (MQ) policy. However, the kernel logs show that the “mq policy is now an alias for smq”. Many of the configuration options, such as <code class="language-plaintext highlighter-rouge">write_promote_adjustment</code> and <code class="language-plaintext highlighter-rouge">read_promote_adjustment</code>, have been removed: “tunable ‘write_promote_adjustment’ no longer has any effect”.</p>

<p>This leaves <code class="language-plaintext highlighter-rouge">migration_threshold</code> as about the only tunable setting. It controls the minimum activity level a block needs before SMQ considers promoting it to cache. I’ve picked 100 to move blocks into the cache aggressively rather than the conservative default of 2048.</p>

<p>There is the choice between <em>writeback</em> and <em>writethrough</em>, but since I want performance over data integrity, I have selected <em>writeback</em>, which asynchronously writes the data back to the disk. I can easily regenerate the data if it is lost.</p>

<p>The final question is the block size. I initially selected 8 sectors (4KB blocks), but the kernel rejected this with an “Invalid data block size” message. The smallest size I could use was 64 sectors, and even with that, the kernel warns about excess memory usage. Larger blocks reduce memory overhead and potentially improve performance, but reduce granularity for small random writes. 256 sectors does not give me a warning, so I have selected that.</p>
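<p>With 512-byte sectors, the candidate block sizes work out as follows:</p>

```shell
echo $((8 * 512 / 1024))     # rejected: 4 KiB
echo $((64 * 512 / 1024))    # smallest accepted: 32 KiB
echo $((256 * 512 / 1024))   # chosen: 128 KiB
```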

<p>Below is my final command. <code class="language-plaintext highlighter-rouge">smq 2</code> means that there are two parameters after it.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dmsetup create fast-sdd <span class="nt">--table</span> <span class="s2">"0 </span><span class="si">$(</span>blockdev <span class="nt">--getsz</span> /dev/sdd<span class="si">)</span><span class="s2"> cache /dev/mapper/cache-meta /dev/mapper/cache-data /dev/sdd 256 1 writeback smq 2 migration_threshold 100"</span>
</code></pre></div></div>

<p>Finally, mount the new device. This assumes <code class="language-plaintext highlighter-rouge">/dev/sdd</code> had a filesystem on it; if not, make one in the usual way <code class="language-plaintext highlighter-rouge">mkfs /dev/mapper/fast-sdd</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount /dev/mapper/fast-sdd /mnt
</code></pre></div></div>

<p>We can view the statistics with <code class="language-plaintext highlighter-rouge">dmsetup status</code>, but its output is a bare string of numbers that needs some formatting!</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">while </span><span class="nb">true</span><span class="p">;</span> <span class="k">do
  </span>dmsetup status fast-sdd | <span class="nb">awk</span> <span class="s1">'{
    split($7, cache, "/")
    printf "Cache Usage: %d/%d blocks (%.1f%%)\n", cache[1], cache[2], (cache[1]/cache[2])*100
    printf "Read Hits: %d, Misses: %d (%.1f%% hit rate)\n", $8, $9, ($8/($8+$9))*100  
    printf "Write Hits: %d, Misses: %d (%.1f%% hit rate)\n", $10, $11, ($10/($10+$11))*100
    printf "Dirty blocks: %d\n", $14
    printf "Metadata usage: %s\n", $5
    printf "Promotions: %d, Demotions: %d\n\n", $13, $12
  }'</span><span class="p">;</span>
  <span class="nb">sleep </span>2<span class="p">;</span>
<span class="k">done</span>
</code></pre></div></div>

<p>There are some impressive hit rates, although it is worth remembering that my dataset fits entirely within the cache.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Cache Usage: 922060/6553600 blocks (14.1%)
Read Hits: 98508, Misses: 864 (99.1% hit rate)
Write Hits: 3515392, Misses: 116691 (96.8% hit rate)
Dirty blocks: 897985
Metadata usage: 19416/262144
Promotions: 922045, Demotions: 0
</code></pre></div></div>
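<p>For reference, the hit-rate percentages are simple ratios of the hit and miss counters; the read figure above can be reproduced with the same awk arithmetic used in the monitoring loop:</p>

```shell
# 98508 read hits against 864 read misses, as in the sample output above
awk 'BEGIN { hits = 98508; misses = 864; printf "%.1f%% hit rate\n", hits / (hits + misses) * 100 }'
```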

<p>When it is time to shut down the machine, special care needs to be taken to write the dirty blocks out to disk. The process involves switching the device policy to <code class="language-plaintext highlighter-rouge">cleaner</code>, which requires a suspend/reload/resume cycle.</p>

<blockquote>
  <p>After <code class="language-plaintext highlighter-rouge">dmsetup suspend</code> and <code class="language-plaintext highlighter-rouge">dmsetup reload</code>, the table shown with <code class="language-plaintext highlighter-rouge">dmsetup table</code> remains unchanged. The new table does not take effect until <code class="language-plaintext highlighter-rouge">dmsetup resume</code>.</p>
</blockquote>

<p><code class="language-plaintext highlighter-rouge">dmsetup wait</code> pauses until all the dirty blocks have been written.</p>
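<p>Should you wish to watch the flush progress yourself, the dirty-block count is field 14 of the status line (the same field the monitoring script above prints), so it can be polled until it reaches zero. A sketch using the earlier sample counters (the device length here is a placeholder, not a real value):</p>

```shell
# Fields: start len cache meta-block-size meta-used/total block-size
# cache-used/total read-hits read-misses write-hits write-misses
# demotions promotions dirty ...
status="0 1000000 cache 8 19416/262144 256 922060/6553600 98508 864 3515392 116691 0 922045 897985"
echo "$status" | awk '{ print "dirty blocks:", $14 }'
```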

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>umount /dev/mapper/fast-sdd
dmsetup <span class="nb">suspend </span>fast-sdd
dmsetup reload fast-sdd <span class="nt">--table</span> <span class="s2">"0 </span><span class="si">$(</span>blockdev <span class="nt">--getsz</span> /dev/sdd<span class="si">)</span><span class="s2"> cache /dev/mapper/cache-meta /dev/mapper/cache-data /dev/sdd 256 0 cleaner 0"</span>
dmsetup resume fast-sdd
dmsetup <span class="nb">wait </span>fast-sdd
dmsetup remove fast-sdd
dmsetup remove cache-meta
dmsetup remove cache-data
rmmod brd
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="dm-cache,Ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[I have written about dm-cache previously, when I used it with LVM for SSD/HDD caching. In this post, I will explore using dm-cache with RAM as the cache layer over a spinning disk.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Blocking unlimited memory requests in Slurm</title><link href="https://www.tunbury.org/2025/09/03/slurm-mem/" rel="alternate" type="text/html" title="Blocking unlimited memory requests in Slurm" /><published>2025-09-03T00:00:00+00:00</published><updated>2025-09-03T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/09/03/slurm-mem</id><content type="html" xml:base="https://www.tunbury.org/2025/09/03/slurm-mem/"><![CDATA[<p>When a Slurm node is added to a partition, you specify the quantity of physical memory the machine has. Running <code class="language-plaintext highlighter-rouge">slurmd -C</code> on the node generates the configuration data based on the machine capacity.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>NodeName=foo CPUs=16 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=1048576
</code></pre></div></div>

<p>However, even with cgroups enabled, users running with <code class="language-plaintext highlighter-rouge">--mem 0</code> run unchecked, which can lead to the out-of-memory reaper collecting the process. Even limiting <code class="language-plaintext highlighter-rouge">RealMemory</code> to 99% or 66% of the actual RAM does not protect the machine.</p>

<p>Using a job_submit plugin, you can block <code class="language-plaintext highlighter-rouge">--mem=0</code>. Slurm provides a Lua interface that allows you to intercept and modify/reject job submissions.</p>

<p>Here’s how to create a custom job_submit plugin that blocks <code class="language-plaintext highlighter-rouge">--mem=0</code>:</p>

<p>1. Create <code class="language-plaintext highlighter-rouge">/etc/slurm/job_submit.lua</code>:</p>

<div class="language-lua highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">function</span> <span class="nf">slurm_job_submit</span><span class="p">(</span><span class="n">job_desc</span><span class="p">,</span> <span class="n">part_list</span><span class="p">,</span> <span class="n">submit_uid</span><span class="p">)</span>
    <span class="c1">-- Check if user requested unlimited memory (--mem=0)</span>
    <span class="k">if</span> <span class="n">job_desc</span><span class="p">.</span><span class="n">pn_min_memory</span> <span class="o">==</span> <span class="mi">0</span> <span class="k">then</span>
        <span class="n">slurm</span><span class="p">.</span><span class="n">log_user</span><span class="p">(</span><span class="s2">"ERROR: --mem=0 is not allowed. Please specify an explicit memory limit."</span><span class="p">)</span>
        <span class="n">slurm</span><span class="p">.</span><span class="n">user_msg</span><span class="p">(</span><span class="s2">"--mem=0 is not allowed. Please specify an explicit memory limit (e.g., --mem=100G)"</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">slurm</span><span class="p">.</span><span class="n">ERROR</span>
    <span class="k">end</span>

    <span class="k">return</span> <span class="n">slurm</span><span class="p">.</span><span class="n">SUCCESS</span>
<span class="k">end</span>

<span class="k">function</span> <span class="nf">slurm_job_modify</span><span class="p">(</span><span class="n">job_desc</span><span class="p">,</span> <span class="n">job_rec</span><span class="p">,</span> <span class="n">part_list</span><span class="p">,</span> <span class="n">modify_uid</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">slurm</span><span class="p">.</span><span class="n">SUCCESS</span>
<span class="k">end</span>

<span class="k">return</span> <span class="n">slurm</span><span class="p">.</span><span class="n">SUCCESS</span>
</code></pre></div></div>

<p>2. Configure <code class="language-plaintext highlighter-rouge">slurm.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Enable the Lua job submit plugin
JobSubmitPlugins=lua
</code></pre></div></div>

<p>3. Restart <code class="language-plaintext highlighter-rouge">slurmctld</code>:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>systemctl restart slurmctld
</code></pre></div></div>

<p>Users trying <code class="language-plaintext highlighter-rouge">srun --mem=0</code> will get an error message:</p>

<blockquote>
  <p>--mem=0 is not allowed. Please specify an explicit memory limit (e.g., --mem=100G)</p>
</blockquote>

<p>This enforces explicit memory requests while still allowing up to the RealMemory limits. The plugin intercepts job submissions before they are processed, allowing you to reject <code class="language-plaintext highlighter-rouge">--mem=0</code> requests and force users to specify explicit memory amounts within your configured limits.</p>

<p>I’d still recommend additionally setting <code class="language-plaintext highlighter-rouge">RealMemory</code> to less than the physical RAM installed to allow some room for the OS.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Slurm" /><category term="tunbury.org" /><summary type="html"><![CDATA[When a Slurm node is added to a partition, you specify the quantity of physical memory the machine has. Running slurmd -C on the node generates the configuration data based on the machine capacity.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/slurm.png" /><media:content medium="image" url="https://www.tunbury.org/images/slurm.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Arduino Uno Fish Feeder</title><link href="https://www.tunbury.org/2025/08/31/fish-feeder/" rel="alternate" type="text/html" title="Arduino Uno Fish Feeder" /><published>2025-08-31T12:00:00+00:00</published><updated>2025-08-31T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/31/fish-feeder</id><content type="html" xml:base="https://www.tunbury.org/2025/08/31/fish-feeder/"><![CDATA[<p>My daughter and I have had a fun summer project building a fish feeder. It uses a 3D-printed container to hold the fish food, which is rotated 360 degrees using an Arduino Uno and a 28BYJ-48 stepper motor.</p>

<p>Gravity ensures that the food falls to the bottom of the container. An internal scoop collects the food as it rotates and, when inverted, the food drops into the tank. The container lid isn’t shown, as we reused a transparent lid from a Pringles tube.</p>

<p>The initial version of the code performed the rotation, waited for a 12-hour delay, and looped. Subsequently, we have used the LED matrix on the UNO R4 to display a countdown until feeding time. The code is available at <a href="https://github.com/mtelvers/fish-feeder">mtelvers/fish-feeder</a></p>

<p><img src="/images/fish-feeder-design.png" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="arduino" /><category term="tunbury.org" /><summary type="html"><![CDATA[My daughter and I have had a fun summer project building a fish feeder. It uses a 3D-printed container to hold the fish food, which is rotated 360 degrees using an Arduino Uno and a 28BYJ-48 stepper motor.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/fish-feeder.png" /><media:content medium="image" url="https://www.tunbury.org/images/fish-feeder.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Mosaic Terminal User Interface</title><link href="https://www.tunbury.org/2025/08/31/mless/" rel="alternate" type="text/html" title="Mosaic Terminal User Interface" /><published>2025-08-31T00:00:00+00:00</published><updated>2025-08-31T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/31/mless</id><content type="html" xml:base="https://www.tunbury.org/2025/08/31/mless/"><![CDATA[<p>In testing various visual components, terminal resizing, keyboard handling and the use of hooks, I inadvertently wrote the <code class="language-plaintext highlighter-rouge">less</code> tool in <a href="https://github.com/tmattio/mosaic">Mosaic</a>. Below are my notes on using the framework.</p>

<p><code class="language-plaintext highlighter-rouge">use_state</code> is a React-style hook that manages local component state. It returns a tuple of (value, set, update) where:</p>

<ol>
  <li>count - the current value</li>
  <li>set_count - sets to a specific value (takes a value)</li>
  <li>update_count - transforms the current value (takes a function)</li>
</ol>

<p>Thus, you might have</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="p">(</span><span class="n">count</span><span class="o">,</span> <span class="n">set_count</span><span class="o">,</span> <span class="n">update_count</span><span class="p">)</span> <span class="o">=</span> <span class="n">use_state</span> <span class="mi">0</span><span class="p">;;</span>

<span class="n">count</span> <span class="c">(* returns the current value - zero in this case *)</span>
<span class="n">set_count</span> <span class="mi">5</span> <span class="c">(* set the value to 5 *)</span>
<span class="n">update_count</span> <span class="p">(</span><span class="k">fun</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="c">(* adds 1 to the current value *)</span>
</code></pre></div></div>

<p>In practice, this could be used to keep track of the selected index in a table of values:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">directory_browser</span> <span class="n">dir_info</span> <span class="n">window_height</span> <span class="n">window_width</span> <span class="n">set_mode</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">open</span> <span class="nc">Ui</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">selected_index</span><span class="o">,</span> <span class="n">set_selected_index</span><span class="o">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">use_state</span> <span class="mi">0</span> <span class="k">in</span>
  
  <span class="n">use_subscription</span>
    <span class="p">(</span><span class="nn">Sub</span><span class="p">.</span><span class="n">keyboard_filter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">event</span> <span class="o">-&gt;</span>
         <span class="k">match</span> <span class="n">event</span><span class="o">.</span><span class="nn">Input</span><span class="p">.</span><span class="n">key</span> <span class="k">with</span>
         <span class="o">|</span> <span class="nn">Input</span><span class="p">.</span><span class="nc">Up</span> <span class="o">-&gt;</span> <span class="n">set_selected_index</span> <span class="p">(</span><span class="n">max</span> <span class="mi">0</span> <span class="p">(</span><span class="n">selected_index</span> <span class="o">-</span> <span class="mi">1</span><span class="p">));</span> <span class="nc">None</span>
         <span class="o">|</span> <span class="nn">Input</span><span class="p">.</span><span class="nc">Down</span> <span class="o">-&gt;</span> <span class="n">set_selected_index</span> <span class="p">(</span><span class="n">min</span> <span class="p">(</span><span class="n">num_entries</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="n">selected_index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">));</span> <span class="nc">None</span>
         <span class="o">|</span> <span class="nn">Input</span><span class="p">.</span><span class="nc">Enter</span> <span class="o">-&gt;</span> <span class="n">set_mode</span> <span class="p">(</span><span class="n">load_path</span> <span class="n">entry</span><span class="o">.</span><span class="n">full_path</span><span class="p">);</span> <span class="nc">Some</span> <span class="bp">()</span>
         <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="nc">None</span><span class="p">));</span>
</code></pre></div></div>

<p>Any change in the value of a state causes the UI component to be re-rendered. Consider this snippet, which uses the <code class="language-plaintext highlighter-rouge">Sub.window</code> subscription to track the window size by calling <code class="language-plaintext highlighter-rouge">set_window_height</code> and <code class="language-plaintext highlighter-rouge">set_window_width</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">app</span> <span class="n">path</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">mode</span><span class="o">,</span> <span class="n">set_mode</span><span class="o">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">use_state</span> <span class="p">(</span><span class="n">load_path</span> <span class="n">path</span><span class="p">)</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">window_height</span><span class="o">,</span> <span class="n">set_window_height</span><span class="o">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">use_state</span> <span class="mi">24</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">window_width</span><span class="o">,</span> <span class="n">set_window_width</span><span class="o">,</span> <span class="n">_</span> <span class="o">=</span> <span class="n">use_state</span> <span class="mi">80</span> <span class="k">in</span>

  <span class="c">(* Handle window resize *)</span>
  <span class="n">use_subscription</span>
    <span class="p">(</span><span class="nn">Sub</span><span class="p">.</span><span class="n">window</span> <span class="p">(</span><span class="k">fun</span> <span class="n">size</span> <span class="o">-&gt;</span>
         <span class="n">set_window_height</span> <span class="n">size</span><span class="o">.</span><span class="n">height</span><span class="p">;</span>
         <span class="n">set_window_width</span> <span class="n">size</span><span class="o">.</span><span class="n">width</span><span class="p">));</span>

  <span class="c">(* Return a Ui.element using window_height and window_width *)</span>
  <span class="n">directory_browser</span> <span class="n">dir_info</span> <span class="n">window_height</span> <span class="n">window_width</span> <span class="n">set_mode</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="n">run</span> <span class="o">~</span><span class="n">alt_screen</span><span class="o">:</span><span class="bp">true</span> <span class="p">(</span><span class="k">fun</span> <span class="bp">()</span> <span class="o">-&gt;</span> <span class="n">app</span> <span class="n">path</span><span class="p">)</span>
</code></pre></div></div>

<p>In my testing, this worked but left unattached text fragments on the screen. This forced me to add a <code class="language-plaintext highlighter-rouge">Cmd.clear_screen</code> to manually clear the screen. <code class="language-plaintext highlighter-rouge">Cmd.repaint</code> doesn’t seem strictly necessary. The working subscription was:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">use_subscription</span>
    <span class="p">(</span><span class="nn">Sub</span><span class="p">.</span><span class="n">window</span> <span class="p">(</span><span class="k">fun</span> <span class="n">size</span> <span class="o">-&gt;</span>
         <span class="n">set_window_height</span> <span class="n">size</span><span class="o">.</span><span class="n">height</span><span class="p">;</span>
         <span class="n">set_window_width</span> <span class="n">size</span><span class="o">.</span><span class="n">width</span><span class="p">;</span>
         <span class="n">dispatch_cmd</span> <span class="p">(</span><span class="nn">Cmd</span><span class="p">.</span><span class="n">batch</span> <span class="p">[</span> <span class="nn">Cmd</span><span class="p">.</span><span class="n">clear_screen</span><span class="p">;</span> <span class="nn">Cmd</span><span class="p">.</span><span class="n">repaint</span> <span class="p">])));</span>
</code></pre></div></div>

<p>It is also possible to monitor values using <code class="language-plaintext highlighter-rouge">use_effect</code>. In the example below, the scroll position is reset when the filename is changed. The effect is triggered only when the component is rendered and when the value differs from the value on the previous render.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">use_effect</span> <span class="o">~</span><span class="n">deps</span><span class="o">:</span><span class="p">(</span><span class="nn">Deps</span><span class="p">.</span><span class="n">keys</span> <span class="p">[</span><span class="nn">Deps</span><span class="p">.</span><span class="n">string</span> <span class="n">content</span><span class="o">.</span><span class="n">filename</span><span class="p">])</span> <span class="p">(</span><span class="k">fun</span> <span class="bp">()</span> <span class="o">-&gt;</span>
  <span class="n">set_scroll_offset</span> <span class="mi">0</span><span class="p">;</span>
  <span class="n">set_h_scroll_offset</span> <span class="mi">0</span><span class="p">;</span>
  <span class="nc">None</span>
<span class="p">);</span>
</code></pre></div></div>

<p>The sequence is:</p>
<ol>
  <li>Component renders (first time or re-render due to state change)</li>
  <li>Framework checks if any values in ~deps changed since last render</li>
  <li>If they changed, run the effect function</li>
  <li>If the effect returns cleanup, that cleanup runs before the next effect</li>
</ol>

<p>For some widgets, I found I needed to calculate the size manually to fill the space, accounting for panel borders, the header, dividers, and the status line, hence the <code class="language-plaintext highlighter-rouge">window_height - 6</code> below. In other cases, <code class="language-plaintext highlighter-rouge">~expand:true</code> was available.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">scroll_view</span>
  <span class="o">~</span><span class="n">height</span><span class="o">:</span><span class="p">(</span><span class="nt">`Cells</span> <span class="p">(</span><span class="n">window_height</span> <span class="o">-</span> <span class="mi">6</span><span class="p">))</span>
  <span class="o">~</span><span class="n">h_offset</span><span class="o">:</span><span class="n">h_scroll_offset</span> 
  <span class="o">~</span><span class="n">v_offset</span><span class="o">:</span><span class="n">scroll_offset</span> 
  <span class="n">file_content</span><span class="p">;</span>
</code></pre></div></div>

<p>Colours can be defined as RGB values and then composed into styles with the <code class="language-plaintext highlighter-rouge">++</code> operator. Styles are then applied to elements such as table headers:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Colors</span> <span class="o">=</span> <span class="k">struct</span>
  <span class="k">let</span> <span class="n">primary_blue</span> <span class="o">=</span> <span class="nn">Style</span><span class="p">.</span><span class="n">rgb</span> <span class="mi">66</span> <span class="mi">165</span> <span class="mi">245</span>    <span class="c">(* Material Blue 400 *)</span>
<span class="k">end</span>

<span class="k">module</span> <span class="nc">Styles</span> <span class="o">=</span> <span class="k">struct</span>
  <span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="nn">Style</span><span class="p">.(</span><span class="n">fg</span> <span class="nn">Colors</span><span class="p">.</span><span class="n">primary_blue</span> <span class="o">++</span> <span class="n">bold</span><span class="p">)</span>
<span class="k">end</span>

<span class="n">table</span> <span class="o">~</span><span class="n">header_style</span><span class="o">:</span><span class="nn">Styles</span><span class="p">.</span><span class="n">header</span> <span class="o">...</span>
</code></pre></div></div>

<p>The panel serves as the primary container for our application content, providing both visual framing and structural organisation:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">panel</span> 
  <span class="o">~</span><span class="n">title</span><span class="o">:</span><span class="p">(</span><span class="nn">Printf</span><span class="p">.</span><span class="n">sprintf</span> <span class="s2">"Directory Browser - %s"</span> <span class="p">(</span><span class="nn">Filename</span><span class="p">.</span><span class="n">basename</span> <span class="n">dir_info</span><span class="o">.</span><span class="n">path</span><span class="p">))</span>
  <span class="o">~</span><span class="n">box_style</span><span class="o">:</span><span class="nc">Rounded</span> 
  <span class="o">~</span><span class="n">border_style</span><span class="o">:</span><span class="nn">Styles</span><span class="p">.</span><span class="n">accent</span> 
  <span class="o">~</span><span class="n">expand</span><span class="o">:</span><span class="bp">true</span>
  <span class="p">(</span><span class="n">vbox</span> <span class="p">[</span>
    <span class="c">(* content goes here *)</span>
  <span class="p">])</span>
</code></pre></div></div>

<p>Mosaic provides the table widget, which I found had a layout <a href="https://github.com/tmattio/mosaic/issues/2">issue</a> when the column widths exceeded the table width. It worked pretty well, but it takes about 1 second per 1000 rows on my machine, so consider pagination.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">table_columns</span> <span class="o">=</span> <span class="p">[</span>
  <span class="nn">Table</span><span class="p">.{</span> <span class="p">(</span><span class="n">default_column</span> <span class="o">~</span><span class="n">header</span><span class="o">:</span><span class="s2">"Name"</span><span class="p">)</span> <span class="k">with</span> <span class="n">style</span> <span class="o">=</span> <span class="nn">Styles</span><span class="p">.</span><span class="n">file</span> <span class="p">};</span>
  <span class="nn">Table</span><span class="p">.{</span> <span class="p">(</span><span class="n">default_column</span> <span class="o">~</span><span class="n">header</span><span class="o">:</span><span class="s2">"Type"</span><span class="p">)</span> <span class="k">with</span> <span class="n">style</span> <span class="o">=</span> <span class="nn">Styles</span><span class="p">.</span><span class="n">file</span> <span class="p">};</span>
  <span class="nn">Table</span><span class="p">.{</span> <span class="p">(</span><span class="n">default_column</span> <span class="o">~</span><span class="n">header</span><span class="o">:</span><span class="s2">"Size"</span><span class="p">)</span> <span class="k">with</span> <span class="n">style</span> <span class="o">=</span> <span class="nn">Styles</span><span class="p">.</span><span class="n">file</span><span class="p">;</span> <span class="n">justify</span> <span class="o">=</span> <span class="nt">`Right</span> <span class="p">};</span>
<span class="p">]</span> <span class="k">in</span>

<span class="n">table</span> 
  <span class="o">~</span><span class="n">columns</span><span class="o">:</span><span class="n">table_columns</span> 
  <span class="o">~</span><span class="n">rows</span><span class="o">:</span><span class="n">table_rows</span> 
  <span class="o">~</span><span class="n">box_style</span><span class="o">:</span><span class="nn">Table</span><span class="p">.</span><span class="nc">Minimal</span> 
  <span class="o">~</span><span class="n">expand</span><span class="o">:</span><span class="bp">true</span>
  <span class="o">~</span><span class="n">header_style</span><span class="o">:</span><span class="nn">Styles</span><span class="p">.</span><span class="n">header</span>
  <span class="o">~</span><span class="n">row_styles</span><span class="o">:</span><span class="n">table_row_styles</span>
  <span class="o">~</span><span class="n">width</span><span class="o">:</span><span class="p">(</span><span class="nc">Some</span> <span class="p">(</span><span class="n">window_width</span> <span class="o">-</span> <span class="mi">4</span><span class="p">))</span>
  <span class="bp">()</span>
</code></pre></div></div>

<p>The primary layout primitives are <code class="language-plaintext highlighter-rouge">vbox</code> and <code class="language-plaintext highlighter-rouge">hbox</code>:</p>

<p>Vertical Box (vbox) - for stacking components vertically.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">vbox</span> <span class="p">[</span>
  <span class="n">text</span> <span class="s2">"Header"</span><span class="p">;</span>
  <span class="n">divider</span> <span class="o">~</span><span class="n">orientation</span><span class="o">:</span><span class="nt">`Horizontal</span> <span class="bp">()</span><span class="p">;</span>
  <span class="n">content</span><span class="p">;</span>
  <span class="n">text</span> <span class="s2">"Footer"</span><span class="p">;</span>
<span class="p">]</span>
</code></pre></div></div>

<p>Horizontal Box (hbox) - for arranging components horizontally.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">hbox</span> <span class="o">~</span><span class="n">gap</span><span class="o">:</span><span class="p">(</span><span class="nt">`Cells</span> <span class="mi">2</span><span class="p">)</span> <span class="p">[</span>
  <span class="n">text</span> <span class="s2">"Left column"</span><span class="p">;</span>
  <span class="n">text</span> <span class="s2">"Right column"</span><span class="p">;</span>
<span class="p">]</span>
</code></pre></div></div>

<p>As I mentioned earlier, Mosaic uses a subscription-based event handling system; for example, a component can subscribe to keyboard events.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">use_subscription</span>
  <span class="p">(</span><span class="nn">Sub</span><span class="p">.</span><span class="n">keyboard_filter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">event</span> <span class="o">-&gt;</span>
       <span class="k">match</span> <span class="n">event</span><span class="o">.</span><span class="nn">Input</span><span class="p">.</span><span class="n">key</span> <span class="k">with</span>
       <span class="o">|</span> <span class="nn">Input</span><span class="p">.</span><span class="nc">Char</span> <span class="n">c</span> <span class="k">when</span> <span class="nn">Uchar</span><span class="p">.</span><span class="n">to_int</span> <span class="n">c</span> <span class="o">=</span> <span class="mh">0x71</span> <span class="o">-&gt;</span> <span class="c">(* 'q' *)</span>
           <span class="n">dispatch_cmd</span> <span class="nn">Cmd</span><span class="p">.</span><span class="n">quit</span><span class="p">;</span> <span class="nc">Some</span> <span class="bp">()</span>
       <span class="o">|</span> <span class="nn">Input</span><span class="p">.</span><span class="nc">Enter</span> <span class="o">-&gt;</span> 
           <span class="c">(* handle enter *)</span>
           <span class="nc">Some</span> <span class="bp">()</span>
       <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="nc">None</span><span class="p">))</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">keyboard_filter</code> function allows components to selectively handle keyboard events, returning <code class="language-plaintext highlighter-rouge">Some ()</code> for events that are handled and <code class="language-plaintext highlighter-rouge">None</code> for events that should be passed to other components.</p>

<p>Mosaic provides a command system for handling side effects and application lifecycle events; some of these appeared in earlier examples.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dispatch_cmd</span> <span class="nn">Cmd</span><span class="p">.</span><span class="n">quit</span>                    <span class="c">(* Exit the application *)</span>
<span class="n">dispatch_cmd</span> <span class="nn">Cmd</span><span class="p">.</span><span class="n">repaint</span>                 <span class="c">(* Force a screen repaint *)</span>
<span class="n">dispatch_cmd</span> <span class="p">(</span><span class="nn">Cmd</span><span class="p">.</span><span class="n">batch</span> <span class="p">[</span>                <span class="c">(* Execute multiple commands *)</span>
  <span class="nn">Cmd</span><span class="p">.</span><span class="n">clear_screen</span><span class="p">;</span> 
  <span class="nn">Cmd</span><span class="p">.</span><span class="n">repaint</span>
<span class="p">])</span>
</code></pre></div></div>

<p>I found that using Unicode characters in strings caused alignment errors, as their length was the number of data bytes, not the visual space used on the screen.</p>
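<p>For example (a minimal illustration, not code from mless), <code class="language-plaintext highlighter-rouge">String.length</code> counts bytes, so any column arithmetic based on it drifts as soon as a multi-byte character appears:</p>

```ocaml
(* "café": the 'é' is a two-byte UTF-8 sequence, so the string is
   5 bytes long but occupies only 4 columns on screen *)
let s = "caf\xc3\xa9"
let () = Printf.printf "String.length = %d\n" (String.length s)
(* prints: String.length = 5 *)
```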

<p>The <a href="https://github.com/mtelvers/mless">mless</a> application is available on GitHub for further investigation or as a starter project.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,mosaic" /><category term="tunbury.org" /><summary type="html"><![CDATA[In testing various visual components, terminal resizing, keyboard handling and the use of hooks, I inadvertently wrote the less tool in Mosaic. Below are my notes on using the framework.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Measurement of Filesystem Performance</title><link href="https://www.tunbury.org/2025/08/27/fsperf/" rel="alternate" type="text/html" title="Measurement of Filesystem Performance" /><published>2025-08-27T00:00:00+00:00</published><updated>2025-08-27T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/27/fsperf</id><content type="html" xml:base="https://www.tunbury.org/2025/08/27/fsperf/"><![CDATA[<p>I ran into numerous problems when using tools such as <code class="language-plaintext highlighter-rouge">fio</code> and <code class="language-plaintext highlighter-rouge">filebench</code> to measure file system performance. Furthermore, these tools primarily measure the number of megabytes transferred to and from disk. My goal with this project is to measure the relative performance overhead the file system adds. For example, if I had thousands of small files with multiple workers, which file system should I use with my disk subsystem?</p>

<h1 id="the-testing-strategy">The Testing Strategy</h1>

<p>The application employs several distinct workload patterns:</p>

<p><strong>Random File Creation</strong>: Files of varying sizes are distributed across a directory structure of moderate depth. This simulates the behaviour of applications which generate output files, temporary data, or cached content without strict organisational constraints.</p>

<p><strong>Deep Directory Structures</strong>: We construct hierarchical directory trees of significant depth, then populate them with files. This pattern reflects the organisation of source code repositories, nested configuration directories, or hierarchical data storage schemes.</p>

<p><strong>Many Small Files</strong>: A concentrated effort to create numerous files of minimal size within a constrained directory structure. This workload approximates the behaviour of systems which fragment data into many small components—consider email storage systems, or applications which maintain extensive metadata collections.</p>

<p><strong>Large File Operations</strong>: Though not our primary focus, we include tests involving files of substantial size to ensure our measurements capture the full spectrum of filesystem behaviour.</p>

<h1 id="concurrency">Concurrency</h1>

<p>A filesystem which performs admirably under single-threaded access may exhibit quite different characteristics when subjected to concurrent operations. Modern systems rarely operate in isolation; multiple processes, threads, and users compete for filesystem resources simultaneously.</p>

<p>The testing methodology employs multiple concurrent workers, each operating independently within its own portion of the filesystem hierarchy. This approach serves two purposes: first, it more accurately reflects the concurrent nature of modern computing; second, it exposes performance characteristics which might remain hidden under purely sequential testing.</p>

<p>Each worker operates with its own random seed, ensuring that the patterns of file creation and directory traversal differ between workers, thus avoiding artificial synchronisation, which might skew results.</p>
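<p>As a sketch of the idea (hypothetical code, not taken from fsperf; the names and layout are mine): each worker seeds its own generator and scatters small files across a handful of directories.</p>

```ocaml
(* Each worker derives its choices from its own seed, so directory and
   file selection differ between workers. *)
let run_worker ~seed ~root ~files =
  let rng = Random.State.make [| seed |] in
  if not (Sys.file_exists root) then Sys.mkdir root 0o755;
  for i = 0 to files - 1 do
    (* pick one of 16 subdirectories at random *)
    let dir = Filename.concat root (Printf.sprintf "d%02d" (Random.State.int rng 16)) in
    if not (Sys.file_exists dir) then Sys.mkdir dir 0o755;
    (* write a small file of 1-512 bytes *)
    let oc = open_out (Filename.concat dir (Printf.sprintf "f%06d" i)) in
    output_string oc (String.make (1 + Random.State.int rng 512) 'x');
    close_out oc
  done
```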

<h1 id="filesystems">Filesystems</h1>

<p>Different filesystem implementations take different approaches. XFS, designed for large-scale server deployments, optimises for scalability and large file handling. EXT4 represents a mature, general-purpose approach with broad compatibility. ZFS brings advanced features such as built-in compression and snapshot capabilities, though at the cost of increased complexity. BTRFS offers similar advanced features with a different implementation strategy.</p>

<p>The testing framework treats these filesystems as interchangeable backends, applying identical workloads to each. This approach allows for direct comparison of their relative strengths and weaknesses under our specific test conditions.</p>

<p>The code is available on GitHub <a href="https://github.com/mtelvers/fsperf">mtelvers/fsperf</a>.</p>

<h1 id="next-steps">Next Steps</h1>

<p>Add filesystem-specific features such as snapshots and clones to the testing matrix. See <a href="https://www.tunbury.org/2025/08/23/zfs-scaling/">A ZFS Scaling Adventure</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[I ran into numerous problems when using tools such as fio and filebench to measure file system performance. Furthermore, these tools primarily measure the number of megabytes transferred to and from disk. My goal with this project is to measure the relative performance overhead the file system adds. For example, if I had thousands of small files with multiple workers, which file system should I use with my disk subsystem?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/fsperf.png" /><media:content medium="image" url="https://www.tunbury.org/images/fsperf.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Terminal GUI for ocluster monitoring</title><link href="https://www.tunbury.org/2025/08/24/ocluster-monitor/" rel="alternate" type="text/html" title="Terminal GUI for ocluster monitoring" /><published>2025-08-24T00:00:00+00:00</published><updated>2025-08-24T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/24/ocluster-monitor</id><content type="html" xml:base="https://www.tunbury.org/2025/08/24/ocluster-monitor/"><![CDATA[<p>I’ve been thinking about terminal-based GUI applications recently and decided to give <a href="https://ocaml.org/p/notty/latest">notty</a> a try.</p>

<p>I decided to write a tool to display the status of the <a href="https://github.com/ocurrent/ocluster">ocurrent/ocluster</a> in the terminal by gathering the statistics from <code class="language-plaintext highlighter-rouge">ocluster-admin</code>. I want to have histograms showing each pool’s current utilisation and backlog. The histograms will resize vertically and horizontally as the terminal size changes. And yes, I do love <code class="language-plaintext highlighter-rouge">btop</code>.</p>

<p>It’s functional, but still a work in progress. <a href="https://github.com/mtelvers/ocluster-monitor">mtelvers/ocluster-monitor</a></p>

<p>The histogram module uses braille characters (U+2800-U+28FF) to create dense visualizations where each character can represent up to 2x4 data points using the dots of a braille cell. In the code, these positions map to bit values:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Left Column Bits    Right Column Bits
   0x01 (1)            0x08 (4)
   0x02 (2)            0x10 (5)
   0x04 (3)            0x20 (6)
   0x40 (7)            0x80 (8)
</code></pre></div></div>

<h1 id="1-bit-mapping">1. Bit Mapping</h1>
<p>The code defines bit arrays for each column:</p>
<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">left_bits</span> <span class="o">=</span> <span class="p">[</span> <span class="mh">0x40</span><span class="p">;</span> <span class="mh">0x04</span><span class="p">;</span> <span class="mh">0x02</span><span class="p">;</span> <span class="mh">0x01</span> <span class="p">]</span>   <span class="c">(* Bottom to top *)</span>
<span class="k">let</span> <span class="n">right_bits</span> <span class="o">=</span> <span class="p">[</span> <span class="mh">0x80</span><span class="p">;</span> <span class="mh">0x20</span><span class="p">;</span> <span class="mh">0x10</span><span class="p">;</span> <span class="mh">0x08</span> <span class="p">]</span>  <span class="c">(* Bottom to top *)</span>
</code></pre></div></div>

<h1 id="2-height-to-dots-conversion">2. Height to Dots Conversion</h1>
<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">level</span> <span class="o">=</span> <span class="n">int_of_float</span> <span class="p">(</span><span class="n">height</span> <span class="o">*.</span> <span class="mi">4</span><span class="o">.</span><span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<p>This converts a height value (0.0-1.0) to the number of dots to fill (0-4).</p>

<h1 id="3-dot-pattern-generation">3. Dot Pattern Generation</h1>
<p>For each column, the algorithm:</p>
<ol>
  <li>Iterates through the bit array from bottom to top</li>
  <li>Sets each bit if the current level is high enough</li>
  <li>Uses bitwise OR to combine all active dots</li>
</ol>

<h1 id="4-character-assembly">4. Character Assembly</h1>
<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">braille_char</span> <span class="o">=</span> <span class="n">braille_base</span> <span class="ow">lor</span> <span class="n">left_dots</span> <span class="ow">lor</span> <span class="n">right_dots</span>
</code></pre></div></div>
<ul>
  <li><code class="language-plaintext highlighter-rouge">braille_base</code> = 0x2800 (base braille character)</li>
  <li><code class="language-plaintext highlighter-rouge">left_dots</code> and <code class="language-plaintext highlighter-rouge">right_dots</code> are OR’d together</li>
  <li>Result is converted to a Unicode character</li>
</ul>
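<p>Combining the steps above (an illustrative sketch; <code class="language-plaintext highlighter-rouge">braille_cell</code> is my naming, not necessarily the function in ocluster-monitor):</p>

```ocaml
let left_bits = [ 0x40; 0x04; 0x02; 0x01 ]   (* bottom to top *)
let right_bits = [ 0x80; 0x20; 0x10; 0x08 ]  (* bottom to top *)

(* Build one braille cell from two column heights in the range 0.0-1.0.
   A half-full left column and a three-quarter-full right column
   produce U+28F4. *)
let braille_cell left_h right_h =
  let dots bits h =
    let level = int_of_float (h *. 4.0) in
    bits |> List.filteri (fun i _ -> i < level) |> List.fold_left (lor) 0
  in
  Uchar.of_int (0x2800 lor dots left_bits left_h lor dots right_bits right_h)
```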

<h1 id="5-multi-row-histograms">5. Multi-Row Histograms</h1>
<p>For taller displays, the histogram is split into multiple rows:</p>
<ul>
  <li>Each row represents a fraction of the total height</li>
  <li>Data values are normalized to fit within each row’s range</li>
  <li>Rows are generated from top to bottom</li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,notty" /><category term="tunbury.org" /><summary type="html"><![CDATA[I’ve been thinking about terminal-based GUI applications recently and decided to give notty a try.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocluster-monitor.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocluster-monitor.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">A ZFS Scaling Adventure</title><link href="https://www.tunbury.org/2025/08/23/zfs-scaling/" rel="alternate" type="text/html" title="A ZFS Scaling Adventure" /><published>2025-08-23T00:00:00+00:00</published><updated>2025-08-23T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/23/zfs-scaling</id><content type="html" xml:base="https://www.tunbury.org/2025/08/23/zfs-scaling/"><![CDATA[<p>The FreeBSD workers have been getting <a href="https://github.com/ocurrent/opam-repo-ci/issues/449">slower</a>: jobs that should take a few minutes are now timing out after 60 minutes. My first instinct was that ZFS was acting strangely.</p>

<p>I checked the classic ZFS performance indicators:</p>

<ul>
  <li>Pool health: <code class="language-plaintext highlighter-rouge">zpool status</code> - ONLINE, no errors</li>
  <li>ARC hit ratio: <code class="language-plaintext highlighter-rouge">sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses</code> - 98.8% (excellent!)</li>
  <li>Fragmentation: <code class="language-plaintext highlighter-rouge">zpool list</code> - 53% (high but not catastrophic)</li>
  <li>I/O latency: <code class="language-plaintext highlighter-rouge">zpool iostat -v 1 3</code> and <code class="language-plaintext highlighter-rouge">iostat -x 1 3</code> - 1ms read/write (actually pretty good)</li>
</ul>

<p>But the <code class="language-plaintext highlighter-rouge">sync</code> command was taking 70-160ms when it should be under 10ms for an SSD. We don’t need <code class="language-plaintext highlighter-rouge">sync</code> as the disk has disposable CI artefacts, so why not try:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zfs <span class="nb">set sync</span><span class="o">=</span>disabled obuilder
</code></pre></div></div>

<p>The sync times improved to 40-50ms, but the CI jobs were still crawling.</p>
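<p>For reference, a crude way to take such measurements (illustrative only, not the exact method used here) is to time a burst of syncs and divide:</p>

```shell
# Ten back-to-back syncs; divide the reported time by ten for a
# per-sync estimate. A healthy SSD should come in well under 10ms each.
time sh -c 'i=0; while [ $i -lt 10 ]; do sync; i=$((i+1)); done'
```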

<p>I applied some ZFS tuning to try to improve things:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Crank up those queue depths</span>
sysctl vfs.zfs.vdev.async_read_max_active<span class="o">=</span>32
sysctl vfs.zfs.vdev.async_write_max_active<span class="o">=</span>32
sysctl vfs.zfs.vdev.sync_read_max_active<span class="o">=</span>32
sysctl vfs.zfs.vdev.sync_write_max_active<span class="o">=</span>32

<span class="c"># Speed up transaction groups</span>
sysctl vfs.zfs.txg.timeout<span class="o">=</span>1
sysctl vfs.zfs.dirty_data_max<span class="o">=</span>8589934592

<span class="c"># Optimize for metadata</span>
zfs <span class="nb">set </span><span class="nv">atime</span><span class="o">=</span>off obuilder
zfs <span class="nb">set </span><span class="nv">primarycache</span><span class="o">=</span>metadata obuilder
sysctl vfs.zfs.arc.meta_balance<span class="o">=</span>1000
</code></pre></div></div>

<p>However, these changes were making no measurable difference to the actual performance.</p>

<p>For comparison, I ran one of the CI steps on an identical machine, which was running Ubuntu with BTRFS:-</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>opam <span class="nb">install </span>astring.0.8.5 base-bigarray.base base-domains.base base-effects.base base-nnp.base base-threads.base base-unix.base base64.3.5.1 bechamel.0.5.0 camlp-streams.5.0.1 cmdliner.1.3.0 cppo.1.8.0 csexp.1.5.2 dune.3.20.0 either.1.0.0 fmt.0.11.0 gg.1.0.0 jsonm.1.0.2 logs.0.9.0 mdx.2.5.0 ocaml.5.3.0 ocaml-base-compiler.5.3.0 ocaml-compiler.5.3.0 ocaml-config.3 ocaml-options-vanilla.1 ocaml-version.4.0.1 ocamlbuild.0.16.1 ocamlfind.1.9.8 optint.0.3.0 ounit2.2.2.7 re.1.13.2 repr.0.7.0 result.1.5 seq.base stdlib-shims.0.3.0 topkg.1.1.0 uutf.1.0.4 vg.0.9.5
</code></pre></div></div>

<p>This took &lt; 3 minutes, but the worker logs showed the same step took 35 minutes. What could cause such a massive difference on identical hardware?</p>

<p>On macOS, I’ve previously seen problems when the number of mounted filesystems got to around 1000: <code class="language-plaintext highlighter-rouge">mount</code> would take minutes to complete. I wondered: how many file systems are mounted?</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># mount | grep obuilder | wc -l</span>
    33787
</code></pre></div></div>

<p>Now, that’s quite a few file systems.  Historically, our FreeBSD workers had tiny SSDs, circa 128GB, but with the move to a new server with a 1.7TB SSD disk and using the same 25% prune threshold, the number of mounted file systems has become quite large.</p>

<p>I gradually increased the prune threshold and waited for <a href="https://github.com/ocurrent/ocluster">ocurrent/ocluster</a> to prune jobs. With the threshold at 90% the number of file systems was down to ~5,000, and performance was restored.</p>

<p>It’s not really a bug; it’s just an unexpected side effect of having a large number of mounted file systems. On macOS, the resolution was to unmount all the file systems at the end of each job, but that’s easy when the concurrency is limited to one and more tricky when the concurrency is 20 jobs.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="obuilder" /><category term="tunbury.org" /><summary type="html"><![CDATA[The FreeBSD workers have been getting slower (https://github.com/ocurrent/opam-repo-ci/issues/449): jobs that should take a few minutes are now timing out after 60 minutes. My first instinct was that ZFS was acting strangely.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Label Maker in js_of_ocaml using Claude</title><link href="https://www.tunbury.org/2025/08/22/label-maker/" rel="alternate" type="text/html" title="Label Maker in js_of_ocaml using Claude" /><published>2025-08-22T00:00:00+00:00</published><updated>2025-08-22T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/22/label-maker</id><content type="html" xml:base="https://www.tunbury.org/2025/08/22/label-maker/"><![CDATA[<p>I’ve taken a few days off, and while I’ve been travelling, I’ve been working on a personal project with Claude. I’ve used Claude Code for the first time, which is a much more powerful experience than using <a href="https://claude.ai">claude.ai</a> as Claude can apply changes to the code and use your build tools directly to quickly iterate on a problem. In another first, I used <code class="language-plaintext highlighter-rouge">js_of_ocaml</code>, which has been awesome.</p>

<p>The project isn’t anything special; it’s a website that creates sheets of Avery labels. It is needed for a niche educational environment where the only devices available are iPads, which are administratively locked down, so no custom applications or fonts can be loaded. You enter what you want on the label, and it initiates the download of the resulting PDF.</p>

<p>The original <a href="https://label.tunbury.org">implementation</a>, written in OCaml (of course), uses a <a href="https://ocaml.org/p/cohttp/latest">cohttp</a> web server, which generates a <a href="https://en.wikipedia.org/wiki/ReStructuredText">reStructuredText</a> file which is processed via <a href="https://rst2pdf.org">rst2pdf</a> with custom page templates for the different label layouts. The disadvantage of this approach is that it requires a server to host it. I have wrapped the application into a Docker container, so it isn’t intrusive, but it would be easier if it could be hosted as a static file on GitHub Pages.</p>

<p>On OCaml.org, I found <a href="https://ocaml.org/p/camlpdf/latest">camlpdf</a>, <a href="https://ocaml.org/p/otfm/latest">otfm</a> and <a href="https://ocaml.org/p/vg/latest">vg</a>, which when combined with <code class="language-plaintext highlighter-rouge">js_of_ocaml</code>, should give me a complete tool in the browser. The virtual file system embeds the TTF font into the JavaScript code!</p>

<p>I set Claude to work, which didn’t take long, but the custom font embedding proved problematic. I gave Claude an example PDF from the original implementation, and after some debugging, we had a working project.</p>

<p>Let’s look at the code! I should add that the labels can optionally have a box drawn on them, which the student uses to provide feedback on how they got on with the objective. Claude produced three functions for rendering text: one for a single line, one for multiline text with a checkbox, and one for multiline text without a checkbox. I pointed out that these three functions were similar and could be combined. Claude agreed and created a merged function with the original three functions calling the new merged function. It took another prompt to update the calling locations to call the new merged function rather than having the stub functions.</p>

<p>While Claude had generated code that compiles in a functional language, the code tends to look imperative; for example, there were several instances like this:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">t</span> <span class="o">=</span> <span class="n">ref</span> <span class="mi">0</span> <span class="k">in</span>
<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="nn">List</span><span class="p">.</span><span class="n">iter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">v</span> <span class="o">-&gt;</span> <span class="n">t</span> <span class="o">:=</span> <span class="o">!</span><span class="n">t</span> <span class="o">+</span> <span class="n">v</span><span class="p">)</span> <span class="p">[</span><span class="mi">1</span><span class="p">;</span> <span class="mi">2</span><span class="p">;</span> <span class="mi">3</span><span class="p">]</span> <span class="k">in</span>
<span class="n">t</span>
</code></pre></div></div>

<p>Where we would expect to see a <code class="language-plaintext highlighter-rouge">List.fold_left</code>! Claude can easily fix these when you point them out.</p>
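<p>The equivalent using <code class="language-plaintext highlighter-rouge">List.fold_left</code>:</p>

```ocaml
(* Same computation as the ref-based loop above, without mutation *)
let t = List.fold_left (+) 0 [1; 2; 3]   (* t = 6 *)
```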

<p>As I mentioned earlier, Claude Code can build your project and respond to <code class="language-plaintext highlighter-rouge">dune build</code> errors for you; however, some fixes suppress the warning rather than actually fixing the root cause. A classic example of this is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% dune build
File "bin/main.ml", line 4, characters 4-5:
4 | let x = List.length lst
        ^
Error (warning 32 [unused-value-declaration]): unused value x.
</code></pre></div></div>

<p>The proposed fix is to discard the value of <code class="language-plaintext highlighter-rouge">x</code>, thus <code class="language-plaintext highlighter-rouge">let _x = List.length lst</code> rather than realising that the entire line is unnecessary as <code class="language-plaintext highlighter-rouge">List.length</code> has no side effects.</p>

<p>I’d been using Chrome 139 for development, but thought I’d try the native Safari on my Monterey-based Mac Pro, which has Safari 17.6. This gave me the following error on the JavaScript console.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Error] TypeError: undefined is not 
  an object (evaluating 'k.UNSIGNED_MAX.udivmod')
          db (label_maker.bc.js:1758)
          (anonymous function) (label_maker.bc.js:1930)
          Global Code (label_maker.bc.js:2727:180993)
</code></pre></div></div>

<p>I found that since <code class="language-plaintext highlighter-rouge">js_of_ocaml</code> 6.0.1 the minimum browser version is Safari 18.2, so I switched to <code class="language-plaintext highlighter-rouge">js_of_ocaml</code> 5.9.1 and that worked fine.</p>

<p>The resulting project can be found at <a href="https://github.com/mtelvers/label-maker-js">mtelvers/label-maker-js</a> and published at <a href="https://mtelvers.github.io/label-maker-js/">mtelvers.github.io/label-maker-js</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="js_of_ocaml,ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[I’ve taken a few days off, and while I’ve been travelling, I’ve been working on a personal project with Claude. I’ve used Claude Code for the first time, which is a much more powerful experience than using claude.ai as Claude can apply changes to the code and use your build tools directly to quickly iterate on a problem. In another first, I used js_of_ocaml, which has been awesome.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">BuildKit Bake-off</title><link href="https://www.tunbury.org/2025/08/18/buildkit-bake/" rel="alternate" type="text/html" title="BuildKit Bake-off" /><published>2025-08-18T00:00:00+00:00</published><updated>2025-08-18T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/18/buildkit-bake</id><content type="html" xml:base="https://www.tunbury.org/2025/08/18/buildkit-bake/"><![CDATA[<p>I previously <a href="https://www.tunbury.org/2025/07/22/package-tool/">wrote</a> about <a href="https://github.com/mtelvers/package-tool">mtelvers/package-tool</a>, which would generate Dockerfiles for each package in opam.</p>

<p>The tool also created a single 10MB Dockerfile containing all ~4000 package builds. Each build looked like this:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">debian:12</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">builder_package_name</span>
<span class="k">RUN </span>apt update <span class="o">&amp;&amp;</span> apt upgrade <span class="nt">-y</span>
<span class="c"># ... setup opam</span>
<span class="k">RUN </span>opam <span class="nb">install </span>dependency1.version <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
<span class="k">RUN </span>opam <span class="nb">install </span>dependency2.version <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
<span class="k">RUN </span>opam <span class="nb">install </span>package.version <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
</code></pre></div></div>

<p>Followed by a final aggregation step:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">debian:12</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">results</span>
<span class="k">COPY</span><span class="s"> --from=builder_package_1 ["/home/opam/build.log", "/results/package1"]</span>
<span class="k">COPY</span><span class="s"> --from=builder_package_2 ["/home/opam/build.log", "/results/package2"]</span>
<span class="c"># ... ~4000 times</span>
</code></pre></div></div>

<p>This is a spectacular failure. Docker’s RPC layer cannot handle the 10MB Dockerfile, throwing <code class="language-plaintext highlighter-rouge">COMPRESSION_ERROR</code> messages.</p>

<p>I attempted to bypass Docker’s RPC limitations and go straight to BuildKit.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>buildctl build <span class="se">\</span>
  <span class="nt">--frontend</span> dockerfile.v0 <span class="se">\</span>
  <span class="nt">--local</span> <span class="nv">context</span><span class="o">=</span><span class="nb">.</span> <span class="se">\</span>
  <span class="nt">--local</span> <span class="nv">dockerfile</span><span class="o">=</span><span class="nb">.</span> <span class="se">\</span>
  <span class="nt">--output</span> <span class="nb">type</span><span class="o">=</span>image,name<span class="o">=</span>myimage:latest
</code></pre></div></div>

<p>The result was the same: compression errors. BuildKit’s RPC layer cannot handle the massive Dockerfile either.</p>

<p>Surely there is an elegant solution to build this with Docker? I generated a <code class="language-plaintext highlighter-rouge">docker-bake.hcl</code> file defining all the targets:</p>

<div class="language-hcl highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">group</span> <span class="s2">"all-packages"</span> <span class="p">{</span>
  <span class="nx">targets</span> <span class="p">=</span> <span class="p">[</span>
    <span class="s2">"pkg-0install-2-18"</span><span class="p">,</span>
    <span class="s2">"pkg-abella-2-0-8"</span><span class="p">,</span>
    <span class="c1">// ... ~4000 packages</span>
  <span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>

<p>BuildKit starts fine, but collapses in a few seconds with errors like <code class="language-plaintext highlighter-rouge">rpc error: code = NotFound desc = no such job</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker buildx bake results
 <span class="o">=&gt;</span> <span class="o">[</span>internal] load <span class="nb">local </span>bake definitions
 <span class="o">=&gt;</span> <span class="o">=&gt;</span> reading docker-bake.hcl 698.97kB / 698.97kB
 <span class="o">=&gt;</span> <span class="o">[</span>pkg-random-package internal] load build definition from random-package.dockerfile
 <span class="o">=&gt;</span> <span class="o">=&gt;</span> transferring dockerfile: 4.74kB
...
ERROR: target pkg-random-package: failed to receive status: rpc error: code <span class="o">=</span> NotFound desc <span class="o">=</span> no such job dwu7wqewt4vppoe4lhe3xx44f
</code></pre></div></div>

<p>Maybe BuildKit just needed some restraint? I tried various approaches:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">export </span><span class="nv">GOMAXPROCS</span><span class="o">=</span>100
<span class="nb">export </span><span class="nv">BUILDKIT_STEP_LOG_MAX_SIZE</span><span class="o">=</span>50000000
docker buildx bake results
</code></pre></div></div>

<p>I even created a custom BuildKit configuration, tried different drivers, and limited concurrent operations. However, it was still failing.</p>

<p>Building, at first one, then two, and then three packages at once worked well:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker buildx bake pkg-0install-2-18 pkg-abella-2-0-8 pkg-absolute-0-3
<span class="c"># [+] Building 17.7s (100/100) FINISHED</span>
</code></pre></div></div>

<p>This led me to add the <code class="language-plaintext highlighter-rouge">--batch-size</code> parameter to create batches of packages rather than listing them on the command line. By trial and error, I found that 100 is about the upper bound.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--dockerfile</span> <span class="nt">--batch-size</span> 100
<span class="k">for </span>a <span class="k">in</span> <span class="o">{</span>0..33<span class="o">}</span> <span class="p">;</span> <span class="k">do </span><span class="nb">sudo </span>docker buildx bake batch<span class="nv">$a</span> <span class="p">;</span> <span class="k">done</span>
</code></pre></div></div>

<p>I have now hit the next limitation: there is a maximum number of layers.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ERROR: target pkg-async_rpc_websocket-v0-17-0: failed to solve: failed to prepare ofhokk68c4o0esql38hz1yrzb as n4ytj8qd0izkhvs0srfj9vyi3: max depth exceeded
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="docker,buildkit,opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[I previously wrote about a mtelvers/package-tool which would generate Dockerfiles for each package in opam.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/docker-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/docker-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Lastlog in newer Ubuntu releases</title><link href="https://www.tunbury.org/2025/08/12/ubuntu-lastlog/" rel="alternate" type="text/html" title="Lastlog in newer Ubuntu releases" /><published>2025-08-12T00:00:00+00:00</published><updated>2025-08-12T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/12/ubuntu-lastlog</id><content type="html" xml:base="https://www.tunbury.org/2025/08/12/ubuntu-lastlog/"><![CDATA[<p>With the release of Ubuntu 24.10 and subsequently Ubuntu 25.04, the <code class="language-plaintext highlighter-rouge">lastlog</code> command has been removed.</p>

<p>Running <code class="language-plaintext highlighter-rouge">lastlog</code> results in a straight <code class="language-plaintext highlighter-rouge">command not found</code> error from the shell. Checking on an older system with <code class="language-plaintext highlighter-rouge">dpkg -S</code>, <code class="language-plaintext highlighter-rouge">/usr/bin/last</code> and <code class="language-plaintext highlighter-rouge">/usr/bin/lastlog</code> come from the <code class="language-plaintext highlighter-rouge">util-linux</code> and <code class="language-plaintext highlighter-rouge">login</code> packages respectively.</p>

<p>We can view the change log with <code class="language-plaintext highlighter-rouge">apt-get changelog login</code> or <code class="language-plaintext highlighter-rouge">apt-get changelog util-linux</code>, which shows a deliberate move away from these commands.</p>

<p>See also <a href="https://git.launchpad.net/ubuntu/+source/util-linux/commit/?id=e8866bb93ef4cdfa36a8ec94fc43fb66d33a67e4">https://git.launchpad.net/ubuntu/+source/util-linux/commit/?id=e8866bb93ef4cdfa36a8ec94fc43fb66d33a67e4</a></p>

<p>The suggestion is to install <code class="language-plaintext highlighter-rouge">wtmpdb</code>, which restores <code class="language-plaintext highlighter-rouge">last</code>. It’s a shame as it was helpful that <code class="language-plaintext highlighter-rouge">lastlog</code> was always available so you could see if a machine had been used recently without needing to install <code class="language-plaintext highlighter-rouge">wtmpdb</code>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[With the release of Ubuntu 24.10 and subsequently Ubuntu 25.04, the lastlog command has been removed.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Zulip Terminal in Docker</title><link href="https://www.tunbury.org/2025/08/12/zulip-docker/" rel="alternate" type="text/html" title="Zulip Terminal in Docker" /><published>2025-08-12T00:00:00+00:00</published><updated>2025-08-12T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/12/zulip-docker</id><content type="html" xml:base="https://www.tunbury.org/2025/08/12/zulip-docker/"><![CDATA[<p>Anil spotted that there is a Zulip client available to run in a terminal window <a href="https://github.com/zulip/zulip-terminal">zulip/zulip-terminal</a>.</p>

<p>I dived into the instructions and built the <code class="language-plaintext highlighter-rouge">Dockerfile</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone <span class="nt">--depth</span><span class="o">=</span>1 git@github.com:zulip/zulip-terminal.git
<span class="nb">cd </span>zulip-terminal/docker
docker build <span class="nt">-t</span> zulip-terminal:latest <span class="nt">-f</span> Dockerfile.alpine <span class="nb">.</span>
</code></pre></div></div>

<p>However, I ran into a permission problem when running the container:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">mkdir</span> ~/.zulip
<span class="nv">$ </span>docker run <span class="nt">-it</span> <span class="nt">-v</span> ~/.zulip:/.zulip zulip-terminal:latest
zuliprc file was not found at /.zulip/zuliprc
Please enter your credentials to login into your Zulip organization.

NOTE: The Zulip URL is where you would go <span class="k">in </span>a web browser to log <span class="k">in </span>to Zulip.
It often looks like one of the following:
   your-org.zulipchat.com <span class="o">(</span>Zulip cloud<span class="o">)</span>
   zulip.your-org.com <span class="o">(</span>self-hosted servers<span class="o">)</span>
   chat.zulip.org <span class="o">(</span>the Zulip community server<span class="o">)</span>
Zulip URL: <span class="k">****</span>.zulipchat.com
Email: <span class="k">****</span>    
Password: 
PermissionError: zuliprc could not be created at /.zulip/zuliprc
</code></pre></div></div>

<p>I set the permissions with <code class="language-plaintext highlighter-rouge">chmod 777 ~/.zulip</code> and was up and running. <code class="language-plaintext highlighter-rouge">ls -n ~/.zulip</code> showed that the uid and gid were 100:101.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-rw-------   1 100   101         95 Aug 11 12:09 zuliprc
</code></pre></div></div>

<p>Looking at the <code class="language-plaintext highlighter-rouge">Dockerfile</code>, it has <code class="language-plaintext highlighter-rouge">RUN useradd --user-group --create-home zulip</code>, which gets the next available uid/gid. I am 1000:1000 on my local machine. I’ve made a slight change to the <code class="language-plaintext highlighter-rouge">Dockerfile</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ git diff
diff --git i/docker/Dockerfile.buster w/docker/Dockerfile.buster
index f7a9dc2..315c010 100644
--- i/docker/Dockerfile.buster
+++ w/docker/Dockerfile.buster
@@ -1,6 +1,8 @@
 FROM python:3.7-buster AS builder
 
-RUN useradd --user-group --create-home zulip
+RUN if getent passwd 1000; then userdel -r $(id -nu 1000); fi
+RUN if getent group 1000; then groupdel $(getent group 1000 | cut -d: -f1); fi
+RUN useradd --uid 1000 --user-group --create-home zulip
 USER zulip
 WORKDIR /home/zulip
 
@@ -19,7 +21,9 @@ RUN set -ex; python3 -m venv zt_venv \
 
 FROM python:3.7-slim-buster
 
-RUN useradd --user-group --create-home zulip
+RUN if getent passwd 1000; then userdel -r $(id -nu 1000); fi
+RUN if getent group 1000; then groupdel $(getent group 1000 | cut -d: -f1); fi
+RUN useradd --uid 1000 --user-group --create-home zulip
 COPY --from=builder --chown=zulip:zulip /home/zulip /home/zulip
 USER zulip
 WORKDIR /home/zulip
</code></pre></div></div>

<p>Now it doesn’t give me a permission error, and I own the file!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build -t zulip-terminal:latest -f Dockerfile.buster .
sudo rm -r ~/.zulip/
mkdir ~/.zulip
docker run -it -v ~/.zulip:/.zulip zulip-terminal:latest
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="docker,zulip" /><category term="tunbury.org" /><summary type="html"><![CDATA[Anil spotted that there is a Zulip client available to run in a terminal window zulip/zulip-terminal.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/zulip-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/zulip-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Debian 13 Trixie</title><link href="https://www.tunbury.org/2025/08/11/debian-trixie/" rel="alternate" type="text/html" title="Debian 13 Trixie" /><published>2025-08-11T00:00:00+00:00</published><updated>2025-08-11T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/11/debian-trixie</id><content type="html" xml:base="https://www.tunbury.org/2025/08/11/debian-trixie/"><![CDATA[<p>Antonin noticed that Debian 13 <em>trixie</em> has been released. The <a href="https://www.debian.org/News/2025/20250809">release notes</a> mention that i386 is no longer supported as a regular architecture. However, very excitingly, RISCV 64 is now supported.</p>

<blockquote>
  <p>This release for the first time officially supports the riscv64 architecture, allowing users to run Debian on 64-bit RISC-V hardware and benefit from all Debian 13 features.</p>
</blockquote>

<blockquote>
  <p>i386 is no longer supported as a regular architecture: there is no official kernel and no Debian installer for i386 systems. The i386 architecture is now only intended to be used on a 64-bit (amd64) CPU. Users running i386 systems should not upgrade to trixie. Instead, Debian recommends either reinstalling them as amd64, where possible, or retiring the hardware.</p>
</blockquote>

<p>The wording of the release notes made me wonder: since we only need a Docker image, would there still be one?</p>

<p><code class="language-plaintext highlighter-rouge">docker manifest inspect debian:trixie</code> showed there was a layer available:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="w">      </span><span class="p">{</span><span class="w">
         </span><span class="nl">"mediaType"</span><span class="p">:</span><span class="w"> </span><span class="s2">"application/vnd.oci.image.manifest.v1+json"</span><span class="p">,</span><span class="w">
         </span><span class="nl">"size"</span><span class="p">:</span><span class="w"> </span><span class="mi">1017</span><span class="p">,</span><span class="w">
         </span><span class="nl">"digest"</span><span class="p">:</span><span class="w"> </span><span class="s2">"sha256:b67fca6916104c1b11c5d1b47a62af92268318051971094acc9c5366c8eac7ad"</span><span class="p">,</span><span class="w">
         </span><span class="nl">"platform"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
            </span><span class="nl">"architecture"</span><span class="p">:</span><span class="w"> </span><span class="s2">"386"</span><span class="p">,</span><span class="w">
            </span><span class="nl">"os"</span><span class="p">:</span><span class="w"> </span><span class="s2">"linux"</span><span class="w">
         </span><span class="p">}</span><span class="w">
      </span><span class="p">}</span><span class="err">,</span><span class="w">
</span></code></pre></div></div>

<p>Then I noticed this weird behaviour:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/386 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
i386
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/amd64 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
i386
</code></pre></div></div>

<p>That’s odd. Let’s start again.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker system prune <span class="nt">-af</span>
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/amd64 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
amd64
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/386 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
i386
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/amd64 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
i386
</code></pre></div></div>

<p>It seems that after you have run the 386 variant, it gets stuck:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker system prune <span class="nt">-af</span>
<span class="nv">$ </span>docker pull <span class="nt">--platform</span> linux/amd64 debian:trixie
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/amd64 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
amd64
<span class="nv">$ </span>docker pull <span class="nt">--platform</span> linux/386 debian:trixie
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/386 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
i386
<span class="nv">$ </span>docker pull <span class="nt">--platform</span> linux/amd64 debian:trixie
<span class="nv">$ </span>docker run <span class="nt">--platform</span> linux/amd64 <span class="nt">--rm</span> <span class="nt">-it</span> debian:trixie dpkg <span class="nt">--print-architecture</span>
amd64
</code></pre></div></div>

<p>Adding the <code class="language-plaintext highlighter-rouge">docker pull</code> step seems to resolve this, even though it doesn’t actually pull anything.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="debian,trixie" /><category term="tunbury.org" /><summary type="html"><![CDATA[Antonin noticed that Debian 13 trixie has been released. The release notes mention that i386 is no longer supported as a regular architecture. However, very excitingly, RISCV 64 is now supported.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/debian-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/debian-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">SSL Password Authentication</title><link href="https://www.tunbury.org/2025/08/08/ssl-password/" rel="alternate" type="text/html" title="SSL Password Authentication" /><published>2025-08-08T00:00:00+00:00</published><updated>2025-08-08T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/08/ssl-password</id><content type="html" xml:base="https://www.tunbury.org/2025/08/08/ssl-password/"><![CDATA[<p>Would you like the maintainer’s version of the file or the local one? It’s one of those questions during <code class="language-plaintext highlighter-rouge">apt upgrade</code> that you answer on autopilot. Normally, it’s  <em>local</em> every time. Sometimes, the changes look mundane, and you take the <em>maintainer’s</em>. I did that today on <code class="language-plaintext highlighter-rouge">/etc/ssh/sshd_config</code>, but it made me pause and check whether password authentication had been inadvertently turned back on.</p>

<p>I could check the defaults for <code class="language-plaintext highlighter-rouge">sshd</code> and look at the values set in <code class="language-plaintext highlighter-rouge">/etc/ssh/sshd_config</code> and any files in <code class="language-plaintext highlighter-rouge">/etc/ssh/ssh_config.d</code>, but it would surely be easier to try to log in remotely using a password by turning off public key authentication.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~<span class="nv">$ </span>ssh <span class="nt">-o</span> <span class="nv">PreferredAuthentications</span><span class="o">=</span>password <span class="nt">-o</span> <span class="nv">PubkeyAuthentication</span><span class="o">=</span>no username@hostname
username@hostname: Permission denied <span class="o">(</span>publickey<span class="o">)</span><span class="nb">.</span>
</code></pre></div></div>

<p>That machine looks secure. What about other machines? I have an Ansible inventory <em>hosts</em> file, and an extensive history in my <code class="language-plaintext highlighter-rouge">~/.ssh/known_hosts</code>. I need an automated tool to check everything! <a href="https://github.com/mtelvers/ssh-security-checker">mtelvers/ssh-security-checker</a> is that tool!</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>dune <span class="nb">exec</span> <span class="nt">--</span> ssh-security-checker ./hosts
Testing SSH password authentication security <span class="k">for </span>9 hosts...

Testing host1... ❌ NETWORK UNREACHABLE
Testing host2... ✅ SECURE <span class="o">(</span>password auth disabled<span class="o">)</span>
Testing host3... 🔑 HOST KEY CHANGED <span class="o">(</span>security warning!<span class="o">)</span>
Testing host4... ❌ NETWORK UNREACHABLE
Testing host5... ✅ SECURE <span class="o">(</span>password auth disabled<span class="o">)</span>
Testing host6... ✅ SECURE <span class="o">(</span>password auth disabled<span class="o">)</span>
Testing host7... ✅ SECURE <span class="o">(</span>password auth disabled<span class="o">)</span>
Testing host8... ⚠️  WARNING: PASSWORD AUTH ENABLED!
Testing host9... ✅ SECURE <span class="o">(</span>password auth disabled<span class="o">)</span>
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml,ssh" /><category term="tunbury.org" /><summary type="html"><![CDATA[Would you like the maintainer’s version of the file or the local one? It’s one of those questions during apt upgrade that you answer on autopilot. Normally, it’s local every time. Sometimes, the changes look mundane, and you take the maintainer’s. I did that today on /etc/ssh/sshd_config, but it made me pause and check whether password authentication had been inadvertently turned back on.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Further investigations with Slurm</title><link href="https://www.tunbury.org/2025/08/06/slurm-limits/" rel="alternate" type="text/html" title="Further investigations with Slurm" /><published>2025-08-06T00:00:00+00:00</published><updated>2025-08-06T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/06/slurm-limits</id><content type="html" xml:base="https://www.tunbury.org/2025/08/06/slurm-limits/"><![CDATA[<p>Slurm uses cgroups to constrain jobs with the specified parameters and an accounting database to track job statistics.</p>

<p>After the initial <a href="https://www.tunbury.org/2025/04/14/slurm-workload-manager/">configuration</a> and ensuring everything is at the same <a href="https://www.tunbury.org/2025/07/29/slurm-versions/">version</a>, what we really need is some shared storage between the head node and the cluster machine(s). I’m going to quickly share <code class="language-plaintext highlighter-rouge">/home</code> over NFS.</p>

<p>Install an NFS server on the head node with <code class="language-plaintext highlighter-rouge">apt install nfs-kernel-server</code> and set up <code class="language-plaintext highlighter-rouge">/etc/exports</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/home    foo(rw,sync,no_subtree_check,no_root_squash)
</code></pre></div></div>

<p>On the cluster worker, install the NFS client, <code class="language-plaintext highlighter-rouge">apt install nfs-common</code> and mount the home directory:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount -t nfs head:/home/mte24 /home/mte24
</code></pre></div></div>
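
<p>To make the mount persist across reboots, the equivalent <code class="language-plaintext highlighter-rouge">/etc/fstab</code> entry on the worker would look something like this (a sketch; the <code class="language-plaintext highlighter-rouge">_netdev</code> option and the paths are assumptions to adapt to your setup):</p>

```shell
# /etc/fstab on the cluster worker (illustrative)
head:/home/mte24  /home/mte24  nfs  defaults,_netdev  0  0
```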

<p>I have deleted my user account on the cluster worker and set my UID/GID on the head node to values that do not conflict with any of those on the worker.</p>

<p>With the directory shared, and signed in to the head node as my user, I can run <code class="language-plaintext highlighter-rouge">sbatch ./myscript</code>.</p>

<p>To configure Slurm to use cgroups, create <code class="language-plaintext highlighter-rouge">/etc/slurm/cgroup.conf</code> containing the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
</code></pre></div></div>

<p>Set these values in <code class="language-plaintext highlighter-rouge">/etc/slurm/slurm.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup,task/affinity
JobAcctGatherType=jobacct_gather/cgroup
DefMemPerNode=16384
</code></pre></div></div>

<p>For accounting, we need to install a database and another Slurm daemon.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>mariadb-server
</code></pre></div></div>

<p>And install <code class="language-plaintext highlighter-rouge">slurmdbd</code> with:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpkg <span class="nt">-i</span> slurm-smd-slurmdbd_25.05.1-1_amd64.deb
</code></pre></div></div>

<p>Set up a database in MariaDB:</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mysql</span> <span class="o">-</span><span class="n">e</span> <span class="nv">"CREATE DATABASE slurm_acct_db; CREATE USER 'slurm'@'localhost' IDENTIFIED BY 'password'; GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';"</span>
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/slurm/slurmdbd.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DbdHost=localhost
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=password
StorageLoc=slurm_acct_db
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd/slurmdbd.pid
</code></pre></div></div>

<p>Secure the file as the password is in plain text:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">chown </span>slurm:slurm /etc/slurm/slurmdbd.conf
<span class="nb">chmod </span>600 /etc/slurm/slurmdbd.conf
</code></pre></div></div>

<p>Then add these lines to <code class="language-plaintext highlighter-rouge">slurm.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>AccountingStorageType=accounting_storage/slurmdbd
AccountingStoragePort=6819
AccountingStorageEnforce=limits,qos,safe
</code></pre></div></div>

<p>Finally, we need to configure a cluster with a name that matches the name in <code class="language-plaintext highlighter-rouge">slurm.conf</code>. An account is a logical grouping, such as a department name. It is not a user account. Actual user accounts are associated with a cluster and an account. Therefore, a minimum configuration might be:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sacctmgr add cluster cluster
sacctmgr add account <span class="nv">name</span><span class="o">=</span>eeg <span class="nv">Organization</span><span class="o">=</span>EEG
sacctmgr <span class="nt">-i</span> create user <span class="nv">name</span><span class="o">=</span>mte24 <span class="nv">cluster</span><span class="o">=</span>cluster <span class="nv">account</span><span class="o">=</span>eeg
</code></pre></div></div>

<p>To test this out, create <code class="language-plaintext highlighter-rouge">script1</code> as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
# Test script
date
echo "I am now running on compute node:"
hostname
sleep 120
date
echo "Done..."
exit 0 
</code></pre></div></div>

<p>Then submit the job with a timeout of 30 seconds.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~<span class="nv">$ </span>sbatch <span class="nt">-t</span> 00:00:30 script1
Submitted batch job 10
</code></pre></div></div>

<p>The job output is in <code class="language-plaintext highlighter-rouge">slurm-10.out</code>, and we can see the completion state with <code class="language-plaintext highlighter-rouge">sacct</code>:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~<span class="nv">$ </span>sacct <span class="nt">-j</span> 10
JobID           JobName  Partition    Account  AllocCPUS      State ExitCode 
<span class="nt">------------</span> <span class="nt">----------</span> <span class="nt">----------</span> <span class="nt">----------</span> <span class="nt">----------</span> <span class="nt">----------</span> <span class="nt">--------</span> 
10              script1        eeg        eeg          2    TIMEOUT      0:0 
10.batch          batch                   eeg          2  COMPLETED      0:0 
</code></pre></div></div>

<p>Running a job with a specific memory and cpu limitation:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sbatch --mem=32768 --cpus-per-task=64 script1
</code></pre></div></div>
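
<p>The same limits can also be recorded in the job script itself with <code class="language-plaintext highlighter-rouge">#SBATCH</code> directives, so the resource requests travel with the job rather than the command line (a sketch; the values mirror the examples above and are illustrative):</p>

```shell
#!/bin/bash
#SBATCH --job-name=test          # name shown by squeue/sacct
#SBATCH --time=00:00:30          # wall-clock limit
#SBATCH --mem=32768              # memory in MB
#SBATCH --cpus-per-task=64       # CPU cores for the task
# The directives are comments to bash, so the script also runs standalone
hostname
```

<p>With the directives in place, a bare <code class="language-plaintext highlighter-rouge">sbatch</code> invocation needs no extra flags.</p>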

<p>To cancel a job, use <code class="language-plaintext highlighter-rouge">scancel</code>.</p>

<p>Slurm queues up jobs when the required resources can’t be satisfied. What is less clear is what stops users from requesting excessive RAM and CPU for every job.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Slurm" /><category term="tunbury.org" /><summary type="html"><![CDATA[Slurm uses cgroups to constrain jobs with the specified parameters and an accounting database to track job statistics.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/slurm.png" /><media:content medium="image" url="https://www.tunbury.org/images/slurm.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Shuffling Lists</title><link href="https://www.tunbury.org/2025/08/04/list-shuffle/" rel="alternate" type="text/html" title="Shuffling Lists" /><published>2025-08-04T00:00:00+00:00</published><updated>2025-08-04T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/04/list-shuffle</id><content type="html" xml:base="https://www.tunbury.org/2025/08/04/list-shuffle/"><![CDATA[<p>Shuffling a list into a random order is usually handled by the <a href="https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle">Fisher-Yates Shuffle</a>.</p>

<p>It could be efficiently written in OCaml using arrays:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">Random</span><span class="p">.</span><span class="n">self_init</span> <span class="bp">()</span><span class="p">;</span>

<span class="k">let</span> <span class="n">fisher_yates_shuffle</span> <span class="n">arr</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">n</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">length</span> <span class="n">arr</span> <span class="k">in</span>
  <span class="k">for</span> <span class="n">i</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">downto</span> <span class="mi">1</span> <span class="k">do</span>
    <span class="k">let</span> <span class="n">j</span> <span class="o">=</span> <span class="nn">Random</span><span class="p">.</span><span class="n">int</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="k">in</span>
    <span class="k">let</span> <span class="n">temp</span> <span class="o">=</span> <span class="n">arr</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="k">in</span>
    <span class="n">arr</span><span class="o">.</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">arr</span><span class="o">.</span><span class="p">(</span><span class="n">j</span><span class="p">);</span>
    <span class="n">arr</span><span class="o">.</span><span class="p">(</span><span class="n">j</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">temp</span>
  <span class="k">done</span>
</code></pre></div></div>

<p>However, I had a one-off requirement to randomise a list, and the following approach felt more <em>functional</em>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">Random</span><span class="p">.</span><span class="n">self_init</span> <span class="bp">()</span><span class="p">;</span>

<span class="k">let</span> <span class="n">shuffle</span> <span class="n">lst</span> <span class="o">=</span>
  <span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="k">fun</span> <span class="n">x</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="nn">Random</span><span class="p">.</span><span class="n">bits</span> <span class="bp">()</span><span class="o">,</span> <span class="n">x</span><span class="p">))</span> <span class="n">lst</span> <span class="o">|&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">sort</span> <span class="n">compare</span> <span class="o">|&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">map</span> <span class="n">snd</span>
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Shuffling a list into a random order is usually handled by the Fisher-Yates Shuffle.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml Program Specification for Claude</title><link href="https://www.tunbury.org/2025/08/01/program-specification/" rel="alternate" type="text/html" title="OCaml Program Specification for Claude" /><published>2025-08-01T00:00:00+00:00</published><updated>2025-08-01T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/08/01/program-specification</id><content type="html" xml:base="https://www.tunbury.org/2025/08/01/program-specification/"><![CDATA[<p>I have a dataset that I would like to visualise using a static website hosted on GitHub Pages. The application that generates the dataset is still under development, which results in frequently changing data formats. Therefore, rather than writing a static website generator and needing to revise it continually, could I write a specification and have Claude create a new one each time there was a change?</p>

<p>Potentially, I could do this cumulatively by giving Claude the original specification and code and then the new specification, but my chosen approach is to see if Claude can create the application in one pass from the specification. I’ve also chosen to do this using Claude Sonnet’s web interface; obviously, the code I will request will be in OCaml.</p>

<p>I wrote a detailed 500-word specification that included the file formats involved, example directory tree layouts, and what I thought was a clear definition of the output file structure.</p>

<p>The resulting code wasn’t what I wanted: Claude had inlined huge swathes of HTML and was using <code class="language-plaintext highlighter-rouge">Printf.sprintf</code> extensively. Each file included the stylesheet as a <code class="language-plaintext highlighter-rouge">&lt;style&gt;...&lt;/style&gt;</code>. However, the biggest problem was that Claude had chosen to write the JSON parser from scratch, and this code had numerous issues and wouldn’t even build. I directed Claude to use <code class="language-plaintext highlighter-rouge">yojson</code> rather than handcraft a parser.</p>

<p>I intended but did not state in my specification that I wanted the code to generate HTML using <code class="language-plaintext highlighter-rouge">tyxml</code>. I updated my specification, requesting that the code be written using <code class="language-plaintext highlighter-rouge">tyxml</code>, <code class="language-plaintext highlighter-rouge">yojson</code>, and <code class="language-plaintext highlighter-rouge">timedesc</code> to handle the ISO date format. I also thought of some additional functionality around extracting data from a Git repo.</p>

<p>Round 2 - Possibly a step backwards as Claude struggled to find the appropriate functions in the <code class="language-plaintext highlighter-rouge">timedesc</code> library to parse and sort dates. There were also some issues extracting data using <code class="language-plaintext highlighter-rouge">git</code>. I have to take responsibility here as I gave the example command as <code class="language-plaintext highlighter-rouge">git show --date=iso-strict ce03608b4ba656c052ef5e868cf34b9e86d02aac -C /path/to/repo</code>, but <code class="language-plaintext highlighter-rouge">git</code> requires the <code class="language-plaintext highlighter-rouge">-C /path/to/repo</code> to precede the <code class="language-plaintext highlighter-rouge">show</code> command. However, the fact that my example had overwritten Claude’s <em>knowledge</em> was potentially interesting. Could I use this to seed facts I knew Claude would need?</p>

<p>Claude still wasn’t creating a separate <code class="language-plaintext highlighter-rouge">stylesheet.css</code>.</p>

<p>Round 3 - This time, I gave examples of how to use the <code class="language-plaintext highlighter-rouge">timedesc</code> library, i.e.</p>

<blockquote>
  <p>To use the <code class="language-plaintext highlighter-rouge">timedesc</code> library, we can call <code class="language-plaintext highlighter-rouge">Timedesc.of_iso8601</code> to convert the Git ISO strict output to a Timedesc object and then compare it with <code class="language-plaintext highlighter-rouge">compare (Timedesc.to_timestamp_float_s b.date) (Timedesc.to_timestamp_float_s a.date)</code>.</p>
</blockquote>
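<p>A sketch of how those two calls fit together; the <code class="language-plaintext highlighter-rouge">commit</code> record and the function names here are illustrative, not from the actual specification:</p>

```ocaml
(* Illustrative sketch using the two timedesc calls quoted above. *)
type commit = { hash : string; date : Timedesc.t }

(* Timedesc.of_iso8601 returns a result; fail loudly on bad input. *)
let parse_date s =
  match Timedesc.of_iso8601 s with
  | Ok d -> d
  | Error e -> failwith e

(* Newest first, comparing float timestamps as in the quote above. *)
let sort_newest_first commits =
  List.sort
    (fun a b ->
      compare
        (Timedesc.to_timestamp_float_s b.date)
        (Timedesc.to_timestamp_float_s a.date))
    commits
```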

<p>Also, in addition to stating that all the styles should be shared in a common <code class="language-plaintext highlighter-rouge">stylesheet.css</code>, I gave a file tree of the expected output, including the <code class="language-plaintext highlighter-rouge">stylesheet.css</code>.</p>

<p>Claude now correctly used the <code class="language-plaintext highlighter-rouge">timedesc</code> library and tried to write a stylesheet. However, Claude had hallucinated a <code class="language-plaintext highlighter-rouge">css</code> and <code class="language-plaintext highlighter-rouge">css_rule</code> function in <code class="language-plaintext highlighter-rouge">tyxml</code> to do this, where none exists. Furthermore, adding the link to the stylesheet was causing problems as <code class="language-plaintext highlighter-rouge">link</code> had multiple definitions in scope and needed to be explicitly referenced as <code class="language-plaintext highlighter-rouge">Tyxml.Html.link</code>. Claude’s style was to open everything at the beginning of the file:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">open</span> <span class="nn">Yojson</span><span class="p">.</span><span class="nc">Safe</span>
<span class="k">open</span> <span class="nn">Yojson</span><span class="p">.</span><span class="nn">Safe</span><span class="p">.</span><span class="nc">Util</span>
<span class="k">open</span> <span class="nn">Tyxml</span><span class="p">.</span><span class="nc">Html</span>
<span class="k">open</span> <span class="nc">Printf</span> 
<span class="k">open</span> <span class="nc">Unix</span> 
</code></pre></div></div>

<p>The compiler picked <code class="language-plaintext highlighter-rouge">Unix.link</code> rather than <code class="language-plaintext highlighter-rouge">Tyxml.Html.link</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>File "ci_generator.ml", line 347, characters 18-33:
347 |         link ~rel:[ `Stylesheet ] ~href:"/stylesheet.css" ();
                        ^^^^^^^^^^^^^^^
Error: The function applied to this argument has type
         ?follow:bool -&gt; string -&gt; unit
This argument cannot be applied with label ~rel
</code></pre></div></div>

<blockquote>
  <p>Stylistically, please can we only <code class="language-plaintext highlighter-rouge">open</code> things in functions where they are used: <code class="language-plaintext highlighter-rouge">let foo () = let open Tyxml.Html in ...</code>. This will avoid global opens at the top of the file and avoid any confusion where libraries have functions with the same name, e.g., <code class="language-plaintext highlighter-rouge">Unix.link</code> and <code class="language-plaintext highlighter-rouge">TyXml.Html.link</code>.</p>
</blockquote>
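<p>In practice, the requested style looks like this (a sketch; the function name is illustrative, and the call shape is taken from the compiler error above):</p>

```ocaml
(* With a local open, [link] unambiguously refers to [Tyxml.Html.link],
   leaving [Unix.link] out of scope. *)
let stylesheet_link () =
  let open Tyxml.Html in
  link ~rel:[ `Stylesheet ] ~href:"/stylesheet.css" ()
```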

<p>Furthermore, I had two JSON files in my input, each with the field <code class="language-plaintext highlighter-rouge">name</code>. Claude converted these into OCaml types; however, when referencing these later as function parameters, the compiler frequently picks the wrong one. This can be <em>fixed</em> by adding a specific type to the function parameter <code class="language-plaintext highlighter-rouge">let f (t:foo) = ...</code>. I’ve cheated here and renamed the field in one of the JSON files.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">foo</span> <span class="o">=</span> <span class="p">{</span>
  <span class="n">name</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
  <span class="n">x</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">type</span> <span class="n">bar</span> <span class="o">=</span> <span class="p">{</span>
  <span class="n">name</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
  <span class="n">y</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
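<p>For completeness, the annotation <em>fix</em> mentioned above would look like this (the function name is hypothetical):</p>

```ocaml
(* Unannotated, [t.name] would resolve against [bar], the most recently
   defined type with a [name] field; the annotation selects [foo]. *)
let print_foo (t : foo) = print_endline t.name
```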

<p>Claude chose to extract the data from the Git repo using <code class="language-plaintext highlighter-rouge">git show --pretty=format:'%H|%ai|%s'</code>, which ignores the <code class="language-plaintext highlighter-rouge">--date=iso-strict</code> directive. The correct format placeholder is <code class="language-plaintext highlighter-rouge">%aI</code>. I updated my guidance on the use of <code class="language-plaintext highlighter-rouge">git show</code>.</p>
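<p>The corrected invocation can be sketched as follows. The throwaway repository is only there so the commands run anywhere; against a real repository, only the final line is needed:</p>

```shell
# Build a throwaway repo so the command is runnable as-is.
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m "initial commit"
# %aI is the strict ISO 8601 author date; note -C precedes show.
git -C "$repo" show -s --pretty=format:'%H|%aI|%s'
```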

<p>My specification now comes in just under 1000 words. From that single specification document, Claude produces a valid OCaml program on the first try, which builds the static site as per my design. <code class="language-plaintext highlighter-rouge">wc -l</code> shows me there are 662 lines of code.</p>

<p>It’s amusing to run it more than once to see the variations in styling!</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[I have a dataset that I would like to visualise using a static website hosted on GitHub Pages. The application that generates the dataset is still under development, which results in frequently changing data formats. Therefore, rather than writing a static website generator and needing to revise it continually, could I write a specification and have Claude create a new one each time there was a change?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Moving to opam 2.4</title><link href="https://www.tunbury.org/2025/07/30/opam-24/" rel="alternate" type="text/html" title="Moving to opam 2.4" /><published>2025-07-30T00:00:00+00:00</published><updated>2025-07-30T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/30/opam-24</id><content type="html" xml:base="https://www.tunbury.org/2025/07/30/opam-24/"><![CDATA[<p><a href="https://opam.ocaml.org/blog/opam-2-4-0/">opam 2.4.0</a> was released on 18th July followed by <a href="https://opam.ocaml.org/blog/opam-2-4-1/">opam 2.4.1</a> a few days later. This update needs to be propagated through the CI infrastructure.  The first step is to update the base images for each OS.</p>

<h1 id="linux">Linux</h1>

<h3 id="ocurrentdocker-base-images"><a href="https://github.com/ocurrent/docker-base-images">ocurrent/docker-base-images</a></h3>

<p>The Linux base images are created using the <a href="https://images.ci.ocaml.org">Docker base image builder</a>, which uses <a href="https://github.com/ocurrent/ocaml-dockerfile">ocurrent/ocaml-dockerfile</a> to know which versions of opam are available. Kate submitted <a href="https://github.com/ocurrent/ocaml-dockerfile/pull/235">PR#235</a> with the necessary changes to <a href="https://github.com/ocurrent/ocaml-dockerfile">ocurrent/ocaml-dockerfile</a>. This was released as v8.2.9 under <a href="https://github.com/ocaml/opam-repository/pull/28251">PR#28251</a>.</p>

<p>With v8.2.9 released, <a href="https://github.com/ocurrent/docker-base-images/pull/327">PR#327</a> can be opened to update the pipeline to build images which include opam 2.4. Rebuilding the base images takes a good deal of time, particularly as it’s marked as a low-priority task on the cluster.</p>

<h1 id="macos">macOS</h1>

<h3 id="ocurrentmacos-infra"><a href="https://github.com/ocurrent/macos-infra">ocurrent/macos-infra</a></h3>

<p>Including opam 2.4 on macOS required <a href="https://github.com/ocurrent/macos-infra/pull/56">PR#56</a>, which adds <code class="language-plaintext highlighter-rouge">2.4.1</code> to the list of opam packages to download. There are Ansible playbooks that build the macOS base images and recursively remove the old images and their (ZFS) clones. They take about half an hour per machine. I run the Intel and Apple Silicon updates in parallel, but process each pool one at a time.</p>

<p>The Ansible command is:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ansible-playbook update-ocluster.yml
</code></pre></div></div>

<h1 id="freebsd-rosemarycaelumcidev">FreeBSD (rosemary.caelum.ci.dev)</h1>

<h3 id="ocurrentfreebsd-infra"><a href="https://github.com/ocurrent/freebsd-infra">ocurrent/freebsd-infra</a></h3>

<p>The FreeBSD update parallels the macOS update, requiring that <code class="language-plaintext highlighter-rouge">2.4.1</code> be added to the loop of available versions. <a href="https://github.com/ocurrent/freebsd-infra/pull/15">PR#15</a>.</p>

<p>The Ansible playbook for updating the machine is named <code class="language-plaintext highlighter-rouge">update.yml</code>. However, we have been suffering from some reliability issues with the FreeBSD worker, see <a href="https://github.com/ocurrent/opam-repo-ci/issues/449">issue#449</a>, so I took the opportunity to rebuild the worker from scratch.</p>

<p>The OS reinstallation is documented in this <a href="https://www.tunbury.org/2025/05/06/freebsd-uefi/">post</a>, and it’s definitely worth reading the <a href="https://github.com/ocurrent/freebsd-infra/blob/master/README.md">README.md</a> in the repo for the post-installation steps.</p>

<h1 id="windows-thymecaelumcidev">Windows (thyme.caelum.ci.dev)</h1>

<h3 id="ocurrentobuilder"><a href="https://github.com/ocurrent/obuilder">ocurrent/obuilder</a></h3>

<p>The Windows base images are built using a <code class="language-plaintext highlighter-rouge">Makefile</code> which runs unattended builds of Windows using QEMU virtual machines. The Makefile required the changes in <a href="https://github.com/ocurrent/obuilder/pull/198">PR#198</a> to include opam 2.4. The command is <code class="language-plaintext highlighter-rouge">make windows</code>.</p>

<p>Once the new images have been built, stop the ocluster worker and move the new base images into place. The next step is to remove <code class="language-plaintext highlighter-rouge">results/*</code>, as these layers will link to the old base images, and to remove <code class="language-plaintext highlighter-rouge">state/*</code> so obuilder will create a new empty database on startup. Avoid removing <code class="language-plaintext highlighter-rouge">cache/*</code>, as this is the download cache for opam objects.</p>

<p>The unattended installation can be monitored via VNC by connecting to localhost:5900.</p>

<h1 id="openbsd-oreganocaelumcidev">OpenBSD (oregano.caelum.ci.dev)</h1>

<h3 id="ocurrentobuilder-1"><a href="https://github.com/ocurrent/obuilder">ocurrent/obuilder</a></h3>

<p>The OpenBSD base images are built using the same <code class="language-plaintext highlighter-rouge">Makefile</code> used for Windows. There is a separate commit in <a href="https://github.com/ocurrent/obuilder/pull/198">PR#198</a> for the changes needed for OpenBSD, which include moving from OpenBSD 7.6 to 7.7. Run <code class="language-plaintext highlighter-rouge">make openbsd</code>.</p>

<p>Once the new images have been built, stop the ocluster worker and move the new base images into place. The next step is to remove <code class="language-plaintext highlighter-rouge">results/*</code>, as these layers will link to the old base images, and to remove <code class="language-plaintext highlighter-rouge">state/*</code> so obuilder will create a new empty database on startup. Avoid removing <code class="language-plaintext highlighter-rouge">cache/*</code>, as this is the download cache for opam objects.</p>

<p>As with Windows, the unattended installation can be monitored via VNC by connecting to localhost:5900.</p>

<h1 id="ocaml-ci">OCaml-CI</h1>

<p>OCaml-CI uses <a href="https://github.com/ocurrent/ocaml-dockerfile">ocurrent/ocaml-dockerfile</a> as a submodule, so the module needs to be updated to the released version. Edits are needed to <code class="language-plaintext highlighter-rouge">lib/opam_version.ml</code> to include <code class="language-plaintext highlighter-rouge">V2_4</code>; then the pipeline needs to be updated in <code class="language-plaintext highlighter-rouge">service/conf.ml</code> to use version 2.4 rather than 2.3 for all the different operating systems. Linux is rather more automated than the others.</p>
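<p>The shape of that edit is roughly as follows; this is an illustrative sketch, not the actual contents of <code class="language-plaintext highlighter-rouge">lib/opam_version.ml</code>:</p>

```ocaml
(* Hypothetical sketch: extend the supported-versions variant with V2_4
   and give it a string form for selecting base images. *)
type t = V2_0 | V2_1 | V2_2 | V2_3 | V2_4

let to_string = function
  | V2_0 -> "2.0"
  | V2_1 -> "2.1"
  | V2_2 -> "2.2"
  | V2_3 -> "2.3"
  | V2_4 -> "2.4"
```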

<p>Lastly, since we now have OpenBSD 7.7, I have also updated references to OpenBSD 7.6. <a href="https://github.com/ocurrent/ocaml-ci/pull/1020">PR#1020</a>.</p>

<h1 id="opam-repo-ci">opam-repo-ci</h1>

<p>opam-repo-ci tests using the latest <em>tagged</em> version of opam, which is called <code class="language-plaintext highlighter-rouge">opam-dev</code> within the base images. It also explicitly tests against the latest release in each of the 2.x series. With 2.4 being tagged, this will automatically become the used <em>dev</em> version once the base images are updated, but over time, 2.4 and the latest tagged version will diverge, so <a href="https://github.com/ocurrent/opam-repo-ci/pull/448">PR#448</a> is needed to ensure we continue to test with the released version of 2.4.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[opam 2.4.0 was released on 18th July followed by opam 2.4.1 a few days later. This update needs to be propagated through the CI infrastructure. The first step is to update the base images for each OS.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Slurm Versions</title><link href="https://www.tunbury.org/2025/07/29/slurm-versions/" rel="alternate" type="text/html" title="Slurm Versions" /><published>2025-07-29T00:00:00+00:00</published><updated>2025-07-29T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/29/slurm-versions</id><content type="html" xml:base="https://www.tunbury.org/2025/07/29/slurm-versions/"><![CDATA[<p>Slurm requires both the client and server to be on the same version.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[2025-07-29T15:41:34.492] error: slurm_unpack_received_msg: [[foo.cl.cam.ac.uk]:34214] Invalid Protocol Version 10752 from uid=0: No error
[2025-07-29T15:41:34.492] error: slurm_unpack_received_msg: [[foo.cl.cam.ac.uk]:34214] Incompatible versions of client and server code
[2025-07-29T15:41:34.502] error: slurm_receive_msg [128.232.93.254:34214]: Incompatible versions of client and server code
</code></pre></div></div>

<p>Noble (24.04) has Slurm 23.11.4-1.2ubuntu5, whereas Plucky (25.04) has 24.11.3-2.</p>

<p>The latest version is 25.05.1. <a href="https://www.schedmd.com/download-slurm">https://www.schedmd.com/download-slurm</a>.</p>

<p>The recommended approach is to build the Debian <code class="language-plaintext highlighter-rouge">.deb</code> packages from source. First, install basic Debian package build requirements:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>build-essential fakeroot devscripts equivs
</code></pre></div></div>

<p>Unpack the distributed tarball:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-L</span> https://download.schedmd.com/slurm/slurm-25.05.1.tar.bz2 | <span class="nb">tar</span> <span class="nt">-xajf</span> - <span class="o">&amp;&amp;</span> <span class="nb">cd </span>slurm-25.05.1
</code></pre></div></div>

<p>Install the Slurm package dependencies:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mk-build-deps <span class="nt">-i</span> debian/control
</code></pre></div></div>

<p>Build the Slurm packages:</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>debuild <span class="nt">-b</span> <span class="nt">-uc</span> <span class="nt">-us</span>
</code></pre></div></div>

<blockquote>
  <p>Before installing, ensure any old installations have been removed with <code class="language-plaintext highlighter-rouge">apt remove slurm*</code> and <code class="language-plaintext highlighter-rouge">apt remove libslurm*</code>.</p>
</blockquote>

<h1 id="worker">Worker</h1>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpkg <span class="nt">-i</span> slurm-smd-slurmd_25.05.1-1_amd64.deb slurm-smd-client_25.05.1-1_amd64.deb slurm-smd_25.05.1-1_amd64.deb
</code></pre></div></div>

<h1 id="head-controller">Head controller</h1>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dpkg <span class="nt">-i</span> slurm-smd-slurmctld_25.05.1-1_amd64.deb slurm-smd-client_25.05.1-1_amd64.deb slurm-smd_25.05.1-1_amd64.deb
</code></pre></div></div>

<p>With the same version of Slurm on both machines, the instructions from my earlier <a href="https://www.tunbury.org/2025/04/14/slurm-workload-manager/">post</a> are working again.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># sinfo</span>
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
eeg<span class="k">*</span>         up   infinite      1   idle foo

<span class="c"># srun -N1 -l /bin/hostname</span>
0: foo.cl.cam.ac.uk
</code></pre></div></div>

<p>Slurm communicates directly over TCP connections using ports 6817/6818, so ensure that no firewalls are in the way!</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Slurm" /><category term="tunbury.org" /><summary type="html"><![CDATA[Slurm requires both the client and server to be on the same version.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/slurm.png" /><media:content medium="image" url="https://www.tunbury.org/images/slurm.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Pulley Diameter vs Rope Diameter</title><link href="https://www.tunbury.org/2025/07/27/pulley-diameter/" rel="alternate" type="text/html" title="Pulley Diameter vs Rope Diameter" /><published>2025-07-27T00:00:00+00:00</published><updated>2025-07-27T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/27/pulley-diameter</id><content type="html" xml:base="https://www.tunbury.org/2025/07/27/pulley-diameter/"><![CDATA[<p>About three months ago, I <a href="https://www.tunbury.org/2025/04/21/clock-winder-repair/">wrote</a> about replacing a snapped galvanised steel wire rope on a clock winder. Unfortunately, it has snapped again!</p>

<p>I am very surprised that the steel wire rope snaps, as the weight it lifts is, at most, 2 kg; however, it does run every hour. I posed the question to Claude:</p>

<blockquote>
  <p>Hi Claude, I have a mechanism which lifts a 2Kg weight using a multistranded galvanised steel wire rope with a diameter of 3mm over a 20mm diameter pulley.  It works fine, but the rope snaps after about 3 months of operating once per hour. Can you suggest where the issue might lie?</p>
</blockquote>

<p>“The most likely issue is your pulley diameter is too small relative to your wire rope diameter. With a 3mm rope over a 20mm pulley, you have a diameter ratio of only 6.7:1, which is well below recommended standards.</p>

<p>Key problems with small pulley diameters:</p>

<ul>
  <li>Excessive bending stress in the wire rope as it wraps around the pulley</li>
  <li>Rapid fatigue failure from repeated flexing (your ~2,160 cycles over 3 months)</li>
  <li>Individual wires break from the severe bending, leading to rope failure</li>
</ul>

<p>Industry recommendations:</p>

<ul>
  <li>Minimum D/d ratio should typically be 20:1 for standard applications</li>
  <li>For your 3mm rope, you’d ideally want at least a 60mm diameter pulley</li>
  <li>Higher ratios (30:1 or more) would be even better for longevity”</li>
</ul>
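<p>The figures are easy to check (a quick sketch, assuming 30-day months):</p>

```ocaml
(* Verify the quoted D/d ratio and cycle count. *)
let () =
  let pulley_d = 20.0 and rope_d = 3.0 in
  Printf.printf "D/d ratio: %.1f:1\n" (pulley_d /. rope_d);  (* 6.7:1 *)
  Printf.printf "cycles: %d\n" (3 * 30 * 24)                 (* 2160 *)
```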

<p>With this insight, I was easily able to verify that this is a good assessment.  For example, <a href="https://www.flints.co.uk/pdf/catalogue/pulleys.pdf">Flints</a> recommend, “In the case of wire ropes, the sheave diameter should ideally be 30 times the diameter of the wire.”</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="clock" /><summary type="html"><![CDATA[About three months ago, I wrote about replacing a snapped galvanised steel wire rope on a clock winder. Unfortunately, it has snapped again!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/aylesford-snap.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/aylesford-snap.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Octoprint</title><link href="https://www.tunbury.org/2025/07/26/octoprint/" rel="alternate" type="text/html" title="Octoprint" /><published>2025-07-26T00:00:00+00:00</published><updated>2025-07-26T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/26/octoprint</id><content type="html" xml:base="https://www.tunbury.org/2025/07/26/octoprint/"><![CDATA[<p>After my <a href="https://www.tunbury.org/2025/07/23/rochester/">3D print</a> last week, Michael asked whether I was using <a href="https://octoprint.org">OctoPrint</a>. I’ve been using <a href="https://www.pronterface.com">Pronterface</a> for some years, and I’ve never been terribly happy with it, but it does the job.</p>

<p>I had a <em>Pet Camera</em> pointed at the printer to see what was happening, <a href="https://syncthing.net">Syncthing</a> configured to share the GCode directory from my Mac to the Raspberry Pi, and the VNC Server to access the GUI. I decided that it was time to overhaul the setup with OctoPi!</p>

<p>OctoPi is available from the <a href="https://raspberrypi.org/software">Raspberry Pi Imager</a>, so updating my SD card was straightforward.  Step-by-step instructions are <a href="https://octoprint.org/download/">available</a>.</p>

<p>PrusaSlicer can be configured to communicate with OctoPi over IP. Therefore, once the model has been sliced, you can upload (and print) it directly from PrusaSlicer. This uses an API key for authentication. There is no longer a need for Syncthing.</p>

<p>Adding a USB web camera to the Pi lets you watch the printer remotely and record a time-lapse video.</p>

<p>Here’s my first attempt at a time-lapse print of a vase. There are some obvious issues with the camera position, and it got dark towards the end, which was a bit annoying.</p>

<iframe width="315" height="560" src="https://www.youtube.com/embed/DvMHkZs-KpI" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="3d-printing" /><summary type="html"><![CDATA[After my 3D print last week, Michael asked whether I was using OctoPrint. I’ve been using Pronterface for some years, and I’ve never been terribly happy with it, but it does the job.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/octoprint.png" /><media:content medium="image" url="https://www.tunbury.org/images/octoprint.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Website Build Analysis with Claude</title><link href="https://www.tunbury.org/2025/07/25/build-analysis/" rel="alternate" type="text/html" title="Website Build Analysis with Claude" /><published>2025-07-25T00:00:00+00:00</published><updated>2025-07-25T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/25/build-analysis</id><content type="html" xml:base="https://www.tunbury.org/2025/07/25/build-analysis/"><![CDATA[<p>The Tarides website is built using Docker, and it would be interesting to run a quick analysis over the logs, given that we have over 300 days’ worth. This is one of those things where I’d usually turn to AWK and spend ages fiddling with the script.</p>

<p>However, this time I decided to ask Claude. The log files are organised by date, e.g. 2024-09-24/HHMMSS-docker-build-HASH.log, where each day directory may contain many logs, as there can be several builds in a day. The HHMMSS is the time the job was created, and HASH is the MD5 hash of the job. The log format is as below, with only the start and end shown.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2024-09-24 14:45.02: New job: docker build
...
2024-09-24 14:55.14: Job succeeded
</code></pre></div></div>
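<p>The duration of a single build can be recovered from those first and last timestamps. This is not the script Claude produced, just a minimal bash sketch (assumes GNU <code class="language-plaintext highlighter-rouge">date</code>):</p>

```shell
# Reconstruct a tiny log in the format shown above, then diff the
# first and last timestamps (the time uses HH:MM.SS, hence the tr).
log=$(mktemp)
printf '%s\n' '2024-09-24 14:45.02: New job: docker build' \
              '2024-09-24 14:55.14: Job succeeded' > "$log"
to_epoch() { ts=${1:0:19}; date -d "$(echo "$ts" | tr '.' ':')" +%s; }
echo $(( $(to_epoch "$(tail -n1 "$log")") - $(to_epoch "$(head -n1 "$log")") ))  # 612 seconds
```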

<p>I would like a graph over time showing the duration each build takes to see if there are any trends.</p>

<p>With a few iterations and only a few minutes of effort, Claude had a working script. Beyond my initial description, I added the complexity that I wanted to run it in a Docker container with a bind mount for my logs and to exclude failed jobs and jobs that completed very quickly (likely due to the Docker caching).</p>

<p>Claude’s code is in this <a href="https://gist.github.com/mtelvers/8383fb563e171778bfaf412f3119d50c">gist</a>.</p>

<p>Here’s the summary output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>==================================================
BUILD ANALYSIS SUMMARY (FILTERED DATA)
==================================================
Original builds found: 1676
Builds after filtering: 655
Filtered out: 1021 (60.9%)
Filter criteria: min_duration &gt;= 100s, exclude_failed = True

Duration Statistics (minutes):
  Mean: 10.16
  Median: 6.92
  Min: 5.53
  Max: 68.87
  Std Dev: 6.00

Date Range:
  First build: 2024-09-24 14:45:50
  Last build: 2025-07-25 09:29:10

Analysis period: 305 days
Average builds per day: 2.1

Top 5 longest builds:
  ✓ 2025-02-05 15:37 - 68.87m - 153726-docker-build-f9426a.log
  ✓ 2025-02-05 15:37 - 62.72m - 153724-docker-build-d227b6.log
  ✓ 2025-02-05 15:37 - 56.03m - 153723-docker-build-65de8e.log
  ✓ 2025-05-07 12:41 - 55.90m - 124115-docker-build-f4091b.log
  ✓ 2025-02-05 15:37 - 42.47m - 153722-docker-build-dafc1d.log

Top 5 shortest builds (above threshold):
  ✓ 2025-01-13 14:26 - 5.53m - 142624-docker-build-fec55f.log
  ✓ 2024-09-25 10:10 - 5.65m - 101005-docker-build-c78655.log
  ✓ 2024-09-26 10:01 - 5.77m - 100119-docker-build-efd190.log
  ✓ 2025-02-07 18:09 - 5.83m - 180951-docker-build-ab19e5.log
  ✓ 2024-09-30 14:03 - 5.85m - 140301-docker-build-4028bb.log
Filtered data exported to /data/output/build_analysis.csv
Raw data exported to /data/output/build_analysis_raw.csv
</code></pre></div></div>

<p>And the graphs:</p>

<p><img src="/images/build_times_timeline.png" alt="" /></p>

<p><img src="/images/daily_performance_trends.png" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="tarides" /><summary type="html"><![CDATA[The Tarides website is built using Docker, and it would be interesting to run a quick analysis over the logs, given that we have over 300 days’ worth. This is one of those things where I’d usually turn to AWK and spend ages fiddling with the script.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/docker_build_analysis.png" /><media:content medium="image" url="https://www.tunbury.org/images/docker_build_analysis.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Tarides Website</title><link href="https://www.tunbury.org/2025/07/24/tarides-website/" rel="alternate" type="text/html" title="Tarides Website" /><published>2025-07-24T00:00:00+00:00</published><updated>2025-07-24T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/24/tarides-website</id><content type="html" xml:base="https://www.tunbury.org/2025/07/24/tarides-website/"><![CDATA[<p>Bella was in touch as the tarides.com website is no longer building. The initial error is that <code class="language-plaintext highlighter-rouge">cmarkit</code> was missing, which I assumed was due to an outdated PR which needed to be rebased.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#20 [build 13/15] RUN ./generate-images.sh</span>
<span class="c">#20 0.259 + dune exec -- src/gen/main.exe file.dune</span>
<span class="c">#20 2.399     Building ocaml-config.3</span>
<span class="c">#20 9.486 File "src/gen/dune", line 7, characters 2-9:</span>
<span class="c">#20 9.486 7 |   cmarkit</span>
<span class="c">#20 9.486       ^^^^^^^</span>
<span class="c">#20 9.486 Error: Library "cmarkit" not found.</span>
<span class="c">#20 9.486 -&gt; required by _build/default/src/gen/main.exe</span>
<span class="c">#20 10.92 + dune build @convert</span>
<span class="c">#20 18.23 Error: Alias "convert" specified on the command line is empty.</span>
<span class="c">#20 18.23 It is not defined in . or any of its descendants.</span>
<span class="c">#20 ERROR: process "/bin/sh -c ./generate-images.sh" did not complete successfully: exit code: 1</span>
</code></pre></div></div>

<p>The site recently moved to Dune Package Management, so this was my first opportunity to dig into how that works. Comparing the current build to the last successful build, I can see that <code class="language-plaintext highlighter-rouge">cmarkit</code> was installed previously but isn’t now.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#19 [build 12/15] RUN dune pkg lock &amp;&amp; dune build @pkg-install
#19 25.39 Solution for dune.lock:
...
#19 25.39 - cmarkit.dev
...
</code></pre></div></div>

<p>Easy fix: I added <code class="language-plaintext highlighter-rouge">cmarkit</code> to the <code class="language-plaintext highlighter-rouge">.opam</code> file. Oddly, it was already in the <code class="language-plaintext highlighter-rouge">.opam</code> file as a pinned dependency. However, the build now fails with a new message:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#21 [build 13/15] RUN ./generate-images.sh</span>
<span class="c">#21 0.173 + dune exec -- src/gen/main.exe file.dune</span>
<span class="c">#21 2.582     Building ocaml-config.3</span>
<span class="c">#21 10.78 File "src/gen/grant.ml", line 15, characters 5-24:</span>
<span class="c">#21 10.78 15 |   |&gt; Hilite.Md.transform</span>
<span class="c">#21 10.78           ^^^^^^^^^^^^^^^^^^^</span>
<span class="c">#21 10.78 Error: Unbound module "Hilite.Md"</span>
<span class="c">#21 10.81 File "src/gen/blog.ml", line 142, characters 5-24:</span>
<span class="c">#21 10.81 142 |   |&gt; Hilite.Md.transform</span>
<span class="c">#21 10.81            ^^^^^^^^^^^^^^^^^^^</span>
<span class="c">#21 10.81 Error: Unbound module "Hilite.Md"</span>
<span class="c">#21 10.82 File "src/gen/page.ml", line 52, characters 5-24:</span>
<span class="c">#21 10.82 52 |   |&gt; Hilite.Md.transform</span>
<span class="c">#21 10.82           ^^^^^^^^^^^^^^^^^^^</span>
<span class="c">#21 10.82 Error: Unbound module "Hilite.Md"</span>
<span class="c">#21 10.94 + dune build @convert</span>
<span class="c">#21 19.46 Error: Alias "convert" specified on the command line is empty.</span>
<span class="c">#21 19.46 It is not defined in . or any of its descendants.</span>
<span class="c">#21 ERROR: process "/bin/sh -c ./generate-images.sh" did not complete successfully: exit code: 1</span>
</code></pre></div></div>

<p>Checking the <a href="https://opam.ocaml.org/packages/hilite/hilite.0.5.0/">hilite</a> package, I saw that there had been a new release last week. The change log lists:</p>

<ul>
  <li>Separate markdown package into an optional hilite.markdown package</li>
</ul>

<p>Ah, commit <a href="https://github.com/patricoferris/hilite/commit/529cb756b05dd15793c181304f438ba1aa48f12a">529cb75</a> removed the dependency on <code class="language-plaintext highlighter-rouge">cmarkit</code> by including the function <code class="language-plaintext highlighter-rouge">buffer_add_html_escaped_string</code> in the <code class="language-plaintext highlighter-rouge">hilite</code> source.</p>

<p>Pausing for a moment, if I constrain <code class="language-plaintext highlighter-rouge">hilite</code> to 0.4.0, does the site build? Yes. Ok, so that’s a valid solution. How hard would it be to switch to 0.5.0?</p>
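<p>For reference, that fallback can be expressed as a version bound in the <code class="language-plaintext highlighter-rouge">.opam</code> file; a rough sketch (the exact bounds here are illustrative):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>depends: [
  ...
  "hilite" {&gt;= "0.4.0" &amp; &lt; "0.5.0"}
  ...
]
</code></pre></div></div>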

<p>I hit a weird corner case where I was unable to link against <code class="language-plaintext highlighter-rouge">hilite.markdown</code>. After chatting with Patrick, I recreated my switch, and everything worked.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>File "x/dune", line 3, characters 20-35:
3 |  (libraries cmarkit hilite.markdown))
                        ^^^^^^^^^^^^^^^
Error: Library "hilite.markdown" not found.
-&gt; required by library "help" in _build/default/x
-&gt; required by _build/default/x/.help.objs/native/help__X.cmx
-&gt; required by _build/default/x/help.a
-&gt; required by alias x/all
-&gt; required by alias default
</code></pre></div></div>

<p>Talking with Jon later about a tangential issue of docs for optional submodules gave me a sudden insight into the corner I’d found myself in. The code base depends on <code class="language-plaintext highlighter-rouge">hilite</code>, so after running <code class="language-plaintext highlighter-rouge">opam update</code> (to ensure I would get version 0.5.0), I created a new switch <code class="language-plaintext highlighter-rouge">opam switch create . --deps-only</code>, and opam installed 0.5.0. When I ran <code class="language-plaintext highlighter-rouge">dune build</code>, it reported a missing dependency on <code class="language-plaintext highlighter-rouge">cmarkit</code>, so I dutifully added it as a dependency and ran <code class="language-plaintext highlighter-rouge">opam install cmarkit</code>. Do you see the problem? <code class="language-plaintext highlighter-rouge">hilite</code> only builds the markdown module when <code class="language-plaintext highlighter-rouge">cmarkit</code> is installed. If both packages are listed in the opam file when the switch is created, everything works as expected.</p>
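<p>In other words, the failure mode was roughly this sequence (a sketch of my session; <code class="language-plaintext highlighter-rouge">hilite</code> only enables its markdown library when <code class="language-plaintext highlighter-rouge">cmarkit</code> is visible at build time):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># fails: hilite is built before cmarkit exists in the switch
opam update
opam switch create . --deps-only    # installs hilite 0.5.0 without the markdown module
opam install cmarkit                # too late: hilite is not rebuilt

# works: both packages are in the .opam file when the switch is created
opam switch create . --deps-only    # hilite sees cmarkit and builds hilite.markdown
</code></pre></div></div>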

<p>The diff turned out to be pretty straightforward.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">let</span> <span class="n">html_of_md</span> <span class="o">~</span><span class="n">slug</span> <span class="n">body</span> <span class="o">=</span>
   <span class="nn">String</span><span class="p">.</span><span class="n">trim</span> <span class="n">body</span>
   <span class="o">|&gt;</span> <span class="nn">Cmarkit</span><span class="p">.</span><span class="nn">Doc</span><span class="p">.</span><span class="n">of_string</span> <span class="o">~</span><span class="n">strict</span><span class="o">:</span><span class="bp">false</span>
<span class="o">-</span>  <span class="o">|&gt;</span> <span class="nn">Hilite</span><span class="p">.</span><span class="nn">Md</span><span class="p">.</span><span class="n">transform</span>
<span class="o">+</span>  <span class="o">|&gt;</span> <span class="nn">Hilite_markdown</span><span class="p">.</span><span class="n">transform</span>
   <span class="o">|&gt;</span> <span class="nn">Cmarkit_html</span><span class="p">.</span><span class="n">of_doc</span> <span class="o">~</span><span class="n">safe</span><span class="o">:</span><span class="bp">false</span>
   <span class="o">|&gt;</span> <span class="nn">Soup</span><span class="p">.</span><span class="n">parse</span>
   <span class="o">|&gt;</span> <span class="n">rewrite_links</span> <span class="o">~</span><span class="n">slug</span>
</code></pre></div></div>

<p>Unfortunately, the build still does not complete successfully. When Dune Package Management builds <code class="language-plaintext highlighter-rouge">hilite</code>, it does not build the markdown module even though <code class="language-plaintext highlighter-rouge">cmarkit</code> is installed. I wish there were a <code class="language-plaintext highlighter-rouge">dune pkg install</code> command!</p>

<p>I tried to split the build by creating a .opam file which contained just <code class="language-plaintext highlighter-rouge">ocaml</code> and <code class="language-plaintext highlighter-rouge">cmarkit</code>, but this meant running <code class="language-plaintext highlighter-rouge">dune pkg lock</code> a second time, and that caused me to run straight into <a href="https://github.com/ocaml/dune/issues/11644">issue #11644</a>.</p>

<p>Perhaps I can patch <code class="language-plaintext highlighter-rouge">hilite</code> to make Dune Package Management deal with it as opam does? Jon commented earlier that <code class="language-plaintext highlighter-rouge">cmarkit</code> is listed as a <code class="language-plaintext highlighter-rouge">with-test</code> dependency. opam would use it if it were present, but perhaps Dune Package Management needs to be explicitly told that it can? I will add <code class="language-plaintext highlighter-rouge">cmarkit</code> as an optional dependency.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>depends: [
  "dune" {&gt;= "3.8"}
  "mdx" {&gt;= "2.4.1" &amp; with-test}
  "cmarkit" {&gt;= "0.3.0" &amp; with-test}
  "textmate-language" {&gt;= "0.3.3"}
  "odoc" {with-doc}
]
depopts: [
  "cmarkit" {&gt;= "0.3.0"}
]
</code></pre></div></div>

<p>With my <a href="https://github.com/mtelvers/hilite/tree/depopts">branch</a> of <code class="language-plaintext highlighter-rouge">hilite</code>, the website builds again with Dune Package Management.</p>

<p>I have created a <a href="https://github.com/patricoferris/hilite/pull/27">PR#27</a> to see if Patrick would be happy to update the package.</p>

<p>Feature request for Dune Package Management would be the equivalent of <code class="language-plaintext highlighter-rouge">opam option --global archive-mirrors="https://opam.ocaml.org/cache"</code> as a lengthy <code class="language-plaintext highlighter-rouge">dune pkg lock</code> may fail due to a single <code class="language-plaintext highlighter-rouge">curl</code> failure and need to be restarted from scratch.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="tarides" /><summary type="html"><![CDATA[Bella was in touch as the tarides.com website is no longer building. The initial error is that cmarkit was missing, which I assumed was due to an outdated PR which needed to be rebased.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/tarides.png" /><media:content medium="image" url="https://www.tunbury.org/images/tarides.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">3D Printing PointCloud Data</title><link href="https://www.tunbury.org/2025/07/23/rochester/" rel="alternate" type="text/html" title="3D Printing PointCloud Data" /><published>2025-07-23T00:00:00+00:00</published><updated>2025-07-23T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/23/rochester</id><content type="html" xml:base="https://www.tunbury.org/2025/07/23/rochester/"><![CDATA[<p>Seeing others in the EEG create 3D prints of various terrain, I’ve become a little jealous that they are having all the fun! It’s a simple, thousand-step process…</p>

<p>The Department of the Environment has a data <a href="https://environment.data.gov.uk/survey">portal</a> which allows anyone to download LIDAR Point Cloud data. Rochester has a river, a castle and a cathedral, which sounds perfect for my print. The latest data for Rochester was from 2023. The website allowed me to draw a polygon around the area of interest and download the relevant files.</p>

<p><img src="/images/defra-download.png" alt="" /></p>

<p>This gave me a ZIP file containing these LAZ files.</p>

<ul>
  <li>TQ7064_P_12689_20230520_20230520.laz</li>
  <li>TQ7066_P_12689_20230520_20230520.laz</li>
  <li>TQ7068_P_12689_20230520_20230520.laz</li>
  <li>TQ7264_P_12689_20230520_20230520.laz</li>
  <li>TQ7266_P_12689_20230520_20230520.laz</li>
  <li>TQ7268_P_12689_20230520_20230520.laz</li>
  <li>TQ7464_P_12689_20230520_20230520.laz</li>
  <li>TQ7466_P_12689_20230520_20230520.laz</li>
  <li>TQ7468_P_12688_20230520_20230520.laz</li>
  <li>TQ7468_P_12689_20230520_20230520.laz</li>
</ul>

<p>I decided to download <a href="https://www.danielgm.net/cc/">CloudCompare</a> based upon the short summary on the <a href="https://computing.ch.cam.ac.uk/software/cloudcompare-0">Department of Chemistry’s website</a>. It claims to be cross-platform, but the Windows installer seemed to be the path of least resistance. I opened the files one at a time until I found one I recognised.</p>

<p><img src="/images/cloudcompare-pointcloud.png" alt="" /></p>

<p>After importing the LAZ file, I used the Segment tool (scissors icon) to draw a polygon around the area I wanted to print. I thinned the number of points by going to Edit &gt; Subsample and accepting the defaults. To create a mesh, I used Edit &gt; Mesh &gt; Delaunay 2.5D and finally saved this surface with File &gt; Save as an STL mesh.</p>

<p>Opening the STL file in Blender shows a thing of beauty; however, it’s not without issues, as it has a vertical wall at some edges, and it’s just a surface with no depth.</p>

<p><img src="/images/blender-initial.png" alt="" /></p>

<p>The learning curve in Blender is very steep, and my skills have barely started the journey. Switch to Edit Mode, then select the entire object: Select &gt; All, then Select &gt; Select Loops &gt; Select Boundary Loop. Next, extrude the boundary loop down using Mesh &gt; Extrude &gt; Extrude Edges. I freeform extruded quite a bit; the exact amount doesn’t matter. Pressing Z during the extrusion constrains the extrusion to just the Z axis.</p>

<p><img src="/images/blender-extrude.png" alt="" /></p>

<p>The mesh now has depth but no bottom, and the depth is uneven. In Object mode, move the model so that a cut along the XY Plane will give the flat bottom we are looking for (see image above), then switch back to Edit mode. Select &gt; All, then Mesh &gt; Bisect. On the Bisect dialogue, enter the plane point as the vector (0, 0, 0) and the Plane normal vector (0, 0, 1). Importantly, tick Fill to close the bottom of the shape and Clear Inner to delete the bit we cut off.</p>

<p><img src="/images/blender-bisect.png" alt="" /></p>

<p>I repeated the above steps to crop the edges off the shape. Each time, I repositioned to use an axis as a bisection line and used an appropriate normal vector. This both straightens the shape and removes those final vertical edge artefacts.</p>

<p>The final step in Blender is to go File &gt; Export &gt; STL and save the model.</p>

<p>My slicer of choice is <a href="https://ultimaker.com/software/ultimaker-cura">UltiMaker Cura</a>, but the application crashed when attempting to slice the model. I installed <a href="https://www.prusa3d.com/page/prusaslicer_424/">PrusaSlicer</a>, which was able to slice the model, though it took a decent amount of time!</p>

<p><img src="/images/prusaslicer.png" alt="" /></p>

<p>Printing in progress…</p>

<p><img src="/images/rochester-printing.png" alt="" /></p>

<p>The final model.</p>

<p><img src="/images/rochester-printed.png" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="3dprinting" /><category term="tunbury.org" /><summary type="html"><![CDATA[Seeing others in the EEG create 3D prints of various terrain, I’ve become a little jealous that they are having all the fun! It’s a simple, thousand-step process…]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/rochester-3d.png" /><media:content medium="image" url="https://www.tunbury.org/images/rochester-3d.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Package Tool</title><link href="https://www.tunbury.org/2025/07/22/package-tool/" rel="alternate" type="text/html" title="Package Tool" /><published>2025-07-22T00:00:00+00:00</published><updated>2025-07-22T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/22/package-tool</id><content type="html" xml:base="https://www.tunbury.org/2025/07/22/package-tool/"><![CDATA[<p>Would you like to build every package in opam in a single Dockerfile using BuildKit?</p>

<p>In <a href="https://github.com/mtelvers/package-tool">mtelvers/package-tool</a>, I have combined various opam sorting and graphing functions into a CLI tool that will work on a checked-out <a href="https://github.com/ocaml/opam-repository">opam-repository</a>. Many of these flags can be combined.</p>

<h1 id="package-version">Package version</h1>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool <span class="nt">--opam-repository</span> ~/opam-repository &lt;package&gt;
</code></pre></div></div>

<p>The package can be given as <code class="language-plaintext highlighter-rouge">0install.2.18</code> or <code class="language-plaintext highlighter-rouge">0install</code>. The former specifies a specific version while the latter processes the latest version. <code class="language-plaintext highlighter-rouge">--all-versions</code> can be specified to generate files for all package versions.</p>

<h1 id="dependencies">Dependencies</h1>

<p>Dump the dependencies for the latest version of 0install into a JSON file.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--deps</span> 0install
</code></pre></div></div>

<p>Produces <code class="language-plaintext highlighter-rouge">0install.2.18-deps.json</code>:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="nl">"yojson.3.0.0"</span><span class="p">:[</span><span class="s2">"dune.3.19.1"</span><span class="p">],</span><span class="w">
</span><span class="nl">"xmlm.1.4.0"</span><span class="p">:[</span><span class="s2">"topkg.1.0.8"</span><span class="p">],</span><span class="w">
</span><span class="nl">"topkg.1.0.8"</span><span class="p">:[</span><span class="s2">"ocamlfind.1.9.8"</span><span class="p">,</span><span class="s2">"ocamlbuild.0.16.1"</span><span class="p">],</span><span class="w">
</span><span class="err">...</span><span class="w">
</span><span class="s2">"0install-solver.2.18"</span><span class="err">]</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h1 id="installation-order">Installation order</h1>

<p>Create a list showing the installation order for the given package.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--list</span> 0install
</code></pre></div></div>

<p>Produces <code class="language-plaintext highlighter-rouge">0install.2.18-list.json</code>:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="s2">"ocaml-compiler.5.3.0"</span><span class="p">,</span><span class="w">
</span><span class="s2">"ocaml-base-compiler.5.3.0"</span><span class="p">,</span><span class="w">
</span><span class="err">...</span><span class="w">
</span><span class="s2">"0install.2.18"</span><span class="p">]</span><span class="w">
</span></code></pre></div></div>

<h1 id="solution-dag">Solution DAG</h1>

<p>Output the solution graph in Graphviz format, which can then be converted into a PDF with <code class="language-plaintext highlighter-rouge">dot</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--dot</span> 0install
dot <span class="nt">-Tpdf</span> 0install.2.18.dot <span class="nt">-o</span> 0install.2.18.pdf
</code></pre></div></div>
<h1 id="ocaml-version">OCaml version</h1>

<p>By default, OCaml 5.3.0 is used, but this can be changed using the <code class="language-plaintext highlighter-rouge">--ocaml 4.14.2</code> parameter.</p>
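<p>Since the flags combine, a hypothetical invocation targeting an older compiler might look like this (the version shown is just an example):</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool --opam-repository ~/opam-repository --ocaml 4.14.2 --dockerfile 0install
</code></pre></div></div>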

<h1 id="dockerfile">Dockerfile</h1>

<p>The <code class="language-plaintext highlighter-rouge">--dockerfile</code> argument creates a Dockerfile to test the installation.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package-tool <span class="nt">--opam-repository</span> ~/opam-repository <span class="nt">--dockerfile</span> <span class="nt">--all-versions</span> 0install
</code></pre></div></div>

<p>For example, the above command line outputs 5 Dockerfiles.</p>

<ul>
  <li>0install.2.15.1.dockerfile</li>
  <li>0install.2.15.2.dockerfile</li>
  <li>0install.2.16.dockerfile</li>
  <li>0install.2.17.dockerfile</li>
  <li>0install.2.18.dockerfile</li>
</ul>

<p>As an example, <code class="language-plaintext highlighter-rouge">0install.2.18.dockerfile</code>, contains:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">debian:12</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">builder_0install_2_18</span>
<span class="k">RUN </span>apt update <span class="o">&amp;&amp;</span> apt upgrade <span class="nt">-y</span>
<span class="k">RUN </span>apt <span class="nb">install</span> <span class="nt">-y</span> build-essential git rsync unzip curl <span class="nb">sudo</span>
<span class="k">RUN if </span>getent passwd 1000<span class="p">;</span> <span class="k">then </span>userdel <span class="nt">-r</span> <span class="si">$(</span><span class="nb">id</span> <span class="nt">-nu</span> 1000<span class="si">)</span><span class="p">;</span> <span class="k">fi</span>
<span class="k">RUN </span>adduser <span class="nt">--uid</span> 1000 <span class="nt">--disabled-password</span> <span class="nt">--gecos</span> <span class="s1">''</span> opam
<span class="k">ADD</span><span class="s"> --chown=root:root --chmod=0755 [ "https://github.com/ocaml/opam/releases/download/2.3.0/opam-2.3.0-x86_64-linux", "/usr/local/bin/opam" ]</span>
<span class="k">RUN </span><span class="nb">echo</span> <span class="s1">'opam ALL=(ALL:ALL) NOPASSWD:ALL'</span> <span class="o">&gt;&gt;</span> /etc/sudoers.d/opam
<span class="k">RUN </span><span class="nb">chmod </span>440 /etc/sudoers.d/opam
<span class="k">USER</span><span class="s"> opam</span>
<span class="k">WORKDIR</span><span class="s"> /home/opam</span>
<span class="k">ENV</span><span class="s"> OPAMYES="1" OPAMCONFIRMLEVEL="unsafe-yes" OPAMERRLOGLEN="0" OPAMPRECISETRACKING="1"</span>
<span class="k">ADD</span><span class="s"> --chown=opam:opam --keep-git-dir=false [ ".", "/home/opam/opam-repository" ]</span>
<span class="k">RUN </span>opam init default <span class="nt">-k</span> <span class="nb">local</span> ~/opam-repository <span class="nt">--disable-sandboxing</span> <span class="nt">--bare</span>
<span class="k">RUN </span>opam switch create default <span class="nt">--empty</span>
<span class="k">RUN </span>opam <span class="nb">install </span>ocaml-compiler.5.3.0 <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
<span class="k">RUN </span>opam <span class="nb">install </span>ocaml-base-compiler.5.3.0 <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
...
<span class="k">RUN </span>opam <span class="nb">install </span>0install-solver.2.18 <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
<span class="k">RUN </span>opam <span class="nb">install </span>0install.2.18 <span class="o">&gt;&gt;</span> build.log 2&gt;&amp;1 <span class="o">||</span> <span class="nb">echo</span> <span class="s1">'FAILED'</span> <span class="o">&gt;&gt;</span> build.log
<span class="k">ENTRYPOINT</span><span class="s"> [ "opam", "exec", "--" ]</span>
<span class="k">CMD</span><span class="s"> bash</span>
</code></pre></div></div>

<p>This can be built using Docker in the normal way. Note that the build context is your checkout of <a href="https://github.com/ocaml/opam-repository">opam-repository</a>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build <span class="nt">-f</span> 0install.2.18.dockerfile ~/opam-repository
</code></pre></div></div>

<p>Additionally, it outputs <code class="language-plaintext highlighter-rouge">Dockerfile</code>, which contains the individual package builds as a multistage build and an aggregation stage as the final layer:</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">debian:12</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">results</span>
<span class="k">WORKDIR</span><span class="s"> /results</span>
<span class="k">RUN </span>apt update <span class="o">&amp;&amp;</span> apt upgrade <span class="nt">-y</span>
<span class="k">RUN </span>apt <span class="nb">install</span> <span class="nt">-y</span> less
<span class="k">COPY</span><span class="s"> --from=builder_0install_2_15_1 [ "/home/opam/build.log", "/results/0install.2.15.1" ]</span>
<span class="k">COPY</span><span class="s"> --from=builder_0install_2_15_2 [ "/home/opam/build.log", "/results/0install.2.15.2" ]</span>
<span class="k">COPY</span><span class="s"> --from=builder_0install_2_16 [ "/home/opam/build.log", "/results/0install.2.16" ]</span>
<span class="k">COPY</span><span class="s"> --from=builder_0install_2_17 [ "/home/opam/build.log", "/results/0install.2.17" ]</span>
<span class="k">COPY</span><span class="s"> --from=builder_0install_2_18 [ "/home/opam/build.log", "/results/0install.2.18" ]</span>
<span class="k">CMD</span><span class="s"> bash</span>
</code></pre></div></div>

<p>Build all the versions of 0install in parallel using BuildKit’s layer caching:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build <span class="nt">-f</span> Dockerfile <span class="nt">-t</span> opam-results ~/opam-repository
</code></pre></div></div>

<p>We can inspect the build logs in the Docker container:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker run <span class="nt">--rm</span> <span class="nt">-it</span> opam-results
root@b28da667e754:/results# <span class="nb">ls</span> <span class="nt">-l</span>
total 76
<span class="nt">-rw-r--r--</span> 1 1000 1000 12055 Jul 22 20:17 0install.2.15.1
<span class="nt">-rw-r--r--</span> 1 1000 1000 15987 Jul 22 20:19 0install.2.15.2
<span class="nt">-rw-r--r--</span> 1 1000 1000 15977 Jul 22 20:19 0install.2.16
<span class="nt">-rw-r--r--</span> 1 1000 1000 16376 Jul 22 20:19 0install.2.17
<span class="nt">-rw-r--r--</span> 1 1000 1000 15150 Jul 22 20:19 0install.2.18
</code></pre></div></div>

<p>Annoyingly, Docker doesn’t seem to be able to cope with all of opam at once. I get various RPC errors.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[+] Building 2.9s (4/4) FINISHED                                                                                                    docker:default
 =&gt; [internal] load build definition from Dockerfile
 =&gt; =&gt; transferring dockerfile: 10.79MB
 =&gt; resolve image config for docker-image://docker.io/docker/dockerfile:1
 =&gt; CACHED docker-image://docker.io/docker/dockerfile:1@sha256:9857836c9ee4268391bb5b09f9f157f3c91bb15821bb77969642813b0d00518d
 =&gt; [internal] load build definition from Dockerfile
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: connection error: COMPRESSION_ERROR
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[Would you like to build every package in opam in a single Dockerfile using BuildKit?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Depth-first topological ordering</title><link href="https://www.tunbury.org/2025/07/21/depth-first-topological-ordering/" rel="alternate" type="text/html" title="Depth-first topological ordering" /><published>2025-07-21T00:00:00+00:00</published><updated>2025-07-21T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/21/depth-first-topological-ordering</id><content type="html" xml:base="https://www.tunbury.org/2025/07/21/depth-first-topological-ordering/"><![CDATA[<p>Over the last few months, I have written several posts on package installation graphs, specifically <a href="https://www.tunbury.org/2025/03/25/topological-sort/">Topological Sort of Packages</a>, <a href="https://www.tunbury.org/2025/03/31/opam-post-deps/">Installation order for opam packages</a> and <a href="https://www.tunbury.org/2025/06/23/transitive-reduction/">Transitive Reduction of Package Graph</a>. In this post, I’d like to cover an alternative ordering solution.</p>

<p>Consider the graph above, first presented in the <a href="https://www.tunbury.org/2025/03/25/topological-sort/">Topological Sort of Packages</a> post, which produces the installation order below.</p>

<ol>
  <li>base-threads.base</li>
  <li>base-unix.base</li>
  <li>ocaml-variants</li>
  <li>ocaml-config</li>
  <li>ocaml</li>
  <li>dune</li>
</ol>

<p>The code presented processes nodes when all their dependencies are satisfied (i.e., when their in-degree becomes 0). This typically means we process “leaf” nodes (nodes with no dependencies) first and then work our way up. However, it may make sense to process the leaf packages only when required rather than as soon as they can be processed. The easiest way to achieve this is to reverse the edges in the DAG, perform the topological sort, and then install the packages in reverse order.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">reverse_dag</span> <span class="p">(</span><span class="n">dag</span> <span class="o">:</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">t</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">t</span><span class="p">)</span> <span class="o">:</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">t</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">initial_reversed</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">package</span> <span class="n">_</span> <span class="n">acc</span> <span class="o">-&gt;</span>
    <span class="nn">PackageMap</span><span class="p">.</span><span class="n">add</span> <span class="n">package</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">empty</span> <span class="n">acc</span>
  <span class="p">)</span> <span class="n">dag</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">empty</span> <span class="k">in</span>
  <span class="nn">PackageMap</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">package</span> <span class="n">dependencies</span> <span class="n">reversed_dag</span> <span class="o">-&gt;</span>
    <span class="nn">PackageSet</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">dependency</span> <span class="n">acc</span> <span class="o">-&gt;</span>
      <span class="k">let</span> <span class="n">current_dependents</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">find</span> <span class="n">dependency</span> <span class="n">acc</span> <span class="k">in</span>
      <span class="nn">PackageMap</span><span class="p">.</span><span class="n">add</span> <span class="n">dependency</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.</span><span class="n">add</span> <span class="n">package</span> <span class="n">current_dependents</span><span class="p">)</span> <span class="n">acc</span>
    <span class="p">)</span> <span class="n">dependencies</span> <span class="n">reversed_dag</span>
  <span class="p">)</span> <span class="n">dag</span> <span class="n">initial_reversed</span>
</code></pre></div></div>

<p>With such a function, we can write this:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">reverse_dag</span> <span class="n">dune</span> <span class="o">|&gt;</span> <span class="n">topological_sort</span> <span class="o">|&gt;</span> <span class="nn">List</span><span class="p">.</span><span class="n">rev</span>
</code></pre></div></div>

<ol>
  <li>ocaml-variants</li>
  <li>ocaml-config</li>
  <li>ocaml</li>
  <li>base-unix.base</li>
  <li>base-threads.base</li>
  <li>dune</li>
</ol>
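<p>For completeness, here is a self-contained sketch of the whole pipeline, assuming string package names and the dependency edges implied by the lists above, with a simplified <code class="language-plaintext highlighter-rouge">reverse_dag</code> and a Kahn-style <code class="language-plaintext highlighter-rouge">topological_sort</code>; it is illustrative only, and the ordering of ties may differ from the listing above:</p>

```ocaml
(* Sketch: string package names; edges in [dag] point from a package to the
   packages it depends on.  The edge set here is an assumption for the example. *)
module PackageMap = Map.Make (String)
module PackageSet = Set.Make (String)

let dag =
  PackageMap.empty
  |> PackageMap.add "dune"
       (PackageSet.of_list [ "ocaml"; "base-threads.base"; "base-unix.base" ])
  |> PackageMap.add "ocaml" (PackageSet.of_list [ "ocaml-config"; "ocaml-variants" ])
  |> PackageMap.add "ocaml-config" (PackageSet.singleton "ocaml-variants")
  |> PackageMap.add "ocaml-variants" PackageSet.empty
  |> PackageMap.add "base-threads.base" PackageSet.empty
  |> PackageMap.add "base-unix.base" PackageSet.empty

(* Flip every edge: each dependency now maps to the set of its dependents. *)
let reverse_dag dag =
  let init = PackageMap.map (fun _ -> PackageSet.empty) dag in
  PackageMap.fold
    (fun pkg deps acc ->
      PackageSet.fold
        (fun dep acc ->
          PackageMap.update dep
            (function
              | Some s -> Some (PackageSet.add pkg s)
              | None -> Some (PackageSet.singleton pkg))
            acc)
        deps acc)
    dag init

(* Kahn's algorithm: repeatedly emit every node whose dependency set is empty. *)
let topological_sort dag =
  let rec loop dag acc =
    if PackageMap.is_empty dag then List.rev acc
    else begin
      let ready, rest =
        PackageMap.partition (fun _ deps -> PackageSet.is_empty deps) dag
      in
      assert (not (PackageMap.is_empty ready)); (* a cycle would stall here *)
      let names = PackageMap.fold (fun k _ l -> k :: l) ready [] in
      let rest =
        PackageMap.map (fun deps -> PackageSet.diff deps (PackageSet.of_list names)) rest
      in
      loop rest (List.rev_append names acc)
    end
  in
  loop dag []

let () = reverse_dag dag |> topological_sort |> List.rev |> List.iter print_endline
```

<p>Running this prints the packages in an order where the base packages are deferred until needed and dune comes last.</p>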

<p>Now, we don’t install base-unix and base-threads until they are actually required for the installation of dune.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[Over the last few months, I have written several posts on package installation graphs, specifically Topological Sort of Packages, Installation order for opam packages and Transitive Reduction of Package Graph. In this post, I’d like to cover an alternative ordering solution.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/dune-graph.png" /><media:content medium="image" url="https://www.tunbury.org/images/dune-graph.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Docker Container for OxCaml</title><link href="https://www.tunbury.org/2025/07/18/docker-oxcaml/" rel="alternate" type="text/html" title="Docker Container for OxCaml" /><published>2025-07-18T18:00:00+00:00</published><updated>2025-07-18T18:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/18/docker-oxcaml</id><content type="html" xml:base="https://www.tunbury.org/2025/07/18/docker-oxcaml/"><![CDATA[<p>Jon asked me to make a Docker image that contains <a href="https://oxcaml.org">OxCaml</a> ready to run without the need to build it from scratch.</p>

<p>I have written a simple OCurrent pipeline to periodically poll <a href="https://github.com/oxcaml/opam-repository">oxcaml/opam-repository</a>. If the SHA has changed, it builds a Docker image and pushes it to ocurrent/opam-staging:oxcaml.</p>

<p>The resulting image can be run like this:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker run <span class="nt">--rm</span> <span class="nt">-it</span> ocurrent/opam-staging:oxcaml
ubuntu@146eab4efc18:/<span class="nv">$ </span>ocaml
OCaml version 5.2.0+ox
Enter <span class="c">#help;; for help.</span>

<span class="c">#</span>
</code></pre></div></div>

<p>The exact content of the image may change depending upon requirements, and we should also pick a better place to put it rather than ocurrent/opam-staging!</p>

<p>The pipeline code is available here <a href="https://github.com/mtelvers/docker-oxcaml">mtelvers/docker-oxcaml</a> and the service is deployed at <a href="https://oxcaml.image.ci.dev">oxcaml.image.ci.dev</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="oxcaml" /><summary type="html"><![CDATA[Jon asked me to make a Docker image that contains OxCaml ready to run without the need to build it from scratch.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/oxcaml.png" /><media:content medium="image" url="https://www.tunbury.org/images/oxcaml.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Q2 Summary</title><link href="https://www.tunbury.org/2025/07/18/q2-summary/" rel="alternate" type="text/html" title="Q2 Summary" /><published>2025-07-18T12:00:00+00:00</published><updated>2025-07-18T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/18/q2-summary</id><content type="html" xml:base="https://www.tunbury.org/2025/07/18/q2-summary/"><![CDATA[<p>I am grateful for <a href="https://tarides.com">Tarides</a>’ sponsorship of my OCaml work. Below is a summary of my activities in Q2 2025.</p>

<h1 id="ocaml-infrastructure-and-development">OCaml Infrastructure and Development</h1>

<h2 id="ocaml-maintenance-activities">OCaml Maintenance Activities</h2>

<p>General maintenance work on OCaml’s infrastructure spanned many areas, including <a href="https://www.tunbury.org/2025/03/24/recent-ocaml-version/">updating minimum supported OCaml versions from 4.02 to 4.08</a> and addressing issues with <a href="https://www.tunbury.org/2025/04/04/opam-repo-ci/">opam-repo-ci job timeouts</a>. Platform-specific work included resolving compatibility issues with <a href="https://www.tunbury.org/2025/04/22/ocaml-fedora-gcc/">Fedora 42 and GCC 15</a>, addressing <a href="https://www.tunbury.org/2025/05/13/ubuntu-apparmor/">Ubuntu AppArmor</a> conflicts affecting runc operations, and managing <a href="https://www.tunbury.org/2025/05/19/macos-sequoia/">macOS Sequoia</a> upgrades across the Mac Mini CI workers. Complex build issues were investigated and resolved, including <a href="https://www.tunbury.org/2025/06/21/macos-sequoia-include-path/">C++ header path problems in macOS workers</a> and <a href="https://www.tunbury.org/2025/03/26/freebsd-14.2/">FreeBSD system upgrades</a> for the CI infrastructure.</p>

<h2 id="ocaml-infrastructure-migration">OCaml Infrastructure Migration</h2>

<p>Due to the impending sunset of the <a href="https://www.tunbury.org/2025/04/23/blade-allocation/">Equinix Metal platform</a>, the OCaml community services needed to be migrated. Services including <a href="https://www.tunbury.org/2025/04/27/ocaml-ci/">OCaml-CI</a>, <a href="https://www.tunbury.org/2025/04/29/equinix-moves/">opam-repo-ci</a>, and the <a href="https://www.tunbury.org/2025/04/29/equinix-moves/">opam.ocaml.org</a> deployment pipeline were migrated to <a href="https://www.tunbury.org/2025/04/25/blade-reallocation/">new blade servers</a>. The migration work was planned to minimise service disruption, which was kept to just a few minutes. Complete procedures were documented, including Docker volume transfers and rsync strategies.</p>

<h2 id="opam2web-deployment">opam2web Deployment</h2>

<p>Optimisation work was undertaken on the <a href="https://www.tunbury.org/2025/06/24/opam2web/">deployment pipeline for opam2web</a>, which powers opam.ocaml.org, to address the more than two-hour deployment time. The primary issue was the enormous size of the opam2web Docker image, which exceeded 25GB due to the inclusion of complete opam package archives. The archive was moved to a separate layer, allowing Docker to cache the layer and reducing the deployment time to 20 minutes.</p>

<h2 id="opam-dependency-graphs">opam Dependency Graphs</h2>

<p>Algorithms for managing OCaml package dependencies were investigated, including <a href="https://www.tunbury.org/2025/03/25/topological-sort/">topological sorting</a> to determine the optimal package installation order. This work extended to handling complex dependency scenarios, including post-dependencies and optional dependencies. Implemented a <a href="https://www.tunbury.org/2025/06/23/transitive-reduction/">transitive reduction algorithm</a> to create a dependency graph with minimal edge counts while preserving the same dependency relationships, enabling more efficient package management and installation processes.</p>

<h2 id="ocaml-developments-under-windows">OCaml Developments under Windows</h2>

<p>Significant work was undertaken to bring <a href="https://www.tunbury.org/2025/06/14/windows-containerd-2/">containerization</a> technologies to OCaml development on Windows. This included implementing a tool to create <a href="https://www.tunbury.org/2025/06/27/windows-containerd-3/">host compute networks</a> via the Windows API, tackling limitations with <a href="https://www.tunbury.org/2025/06/18/windows-reflinks/">NTFS hard links</a>, and implementing a copy-on-write <a href="https://www.tunbury.org/2025/07/07/refs-monteverde/">reflink</a> tool for Windows.</p>

<h2 id="oxcaml-support">OxCaml Support</h2>

<p>Support for the new OxCaml compiler variant included establishing an <a href="https://www.tunbury.org/2025/06/12/oxcaml-repository/">opam repository</a> and testing which existing <a href="https://www.tunbury.org/2025/05/14/opam-health-check-oxcaml/">OCaml packages</a> successfully built with the new compiler.</p>

<h1 id="zfs-storage-and-hardware-deployment">ZFS Storage and Hardware Deployment</h1>

<p>Early in the quarter, a hardware deployment project centred around <a href="https://www.tunbury.org/2025/04/11/dell-r640-ubuntu/">Dell PowerEdge R640</a> servers with large-scale SSD storage was undertaken. The project involved deploying multiple batches of <a href="https://www.tunbury.org/2025/04/03/kingston-drives/">Kingston 7.68TB SSD drives</a>, creating automated deployments for Ubuntu using network booting with EFI and cloud-init configuration. Experimented with ZFS as a <a href="https://www.tunbury.org/2025/04/02/ubuntu-with-zfs-root/">root filesystem</a>, which was possible but ultimately discarded, and explored <a href="https://www.tunbury.org/2025/04/21/ubuntu-dm-cache/">dm-cache for SSD acceleration</a> of spinning disk arrays. Investigated using ZFS as a distributed storage archive system using an <a href="https://www.tunbury.org/2025/05/16/zfs-replcation-ansible/">Ansible-based deployment</a> strategy based upon a YAML description.</p>

<h2 id="talos-ii-repairs">Talos II Repairs</h2>

<p><a href="https://www.tunbury.org/2025/04/29/raptor-talos-ii/">Significant hardware reliability issues</a> affected two Raptor Computing Talos II POWER9 machines. The first system experienced complete lockups after as little as 20 minutes of operation, while the second began exhibiting similar problems requiring daily power cycling. Working with Raptor Computing support to isolate the fault, upgrading the firmware, and eventually <a href="https://www.tunbury.org/2025/05/27/raptor-talos-ii-update/">swapping CPUs</a> between the systems resolved the problem. Concurrently, this provided an opportunity to analyse the performance of OBuilder operations on POWER9 systems, comparing <a href="https://www.tunbury.org/2025/05/29/overlayfs/">OverlayFS on TMPFS versus BTRFS on NVMe storage</a>, resulting in optimised build performance.</p>

<h1 id="eeg-systems-investigations">EEG Systems Investigations</h1>

<p>Various software solutions and research platforms were explored as part of a broader system evaluation. This included investigating <a href="https://www.tunbury.org/2025/04/14/slurm-workload-manager/">Slurm Workload Manager</a> for compute resource scheduling, examining <a href="https://www.tunbury.org/2025/04/19/gluster/">Gluster distributed filesystem</a> capabilities, and implementing <a href="https://www.tunbury.org/2025/05/07/otter-wiki-with-raven/">Otter Wiki with Raven authentication</a> integration for collaborative documentation. Research extended to modern research data management platforms, exploring <a href="https://www.tunbury.org/2025/06/03/inveniordm/">InvenioRDM</a> for scientific data archival and <a href="https://www.tunbury.org/2025/07/02/bon-in-a-box/">BON in a Box</a> for biodiversity analysis workflows. To support the <a href="https://www.tunbury.org/2025/07/14/tessera-workshop/">Tessera workshop</a>, a multi-user Jupyter environment was set up using Docker containerization.</p>

<h1 id="miscellaneous-technical-explorations">Miscellaneous Technical Explorations</h1>

<p>Diverse technical explorations included implementing <a href="https://www.tunbury.org/2025/03/15/bluesky-pds/">Bluesky Personal Data Server</a> and developing innovative <a href="https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/">SSH authentication</a> mechanisms using the ATProto network by extracting SSH public keys from Bluesky profiles. Additional projects included developing OCaml-based API tools for <a href="https://www.tunbury.org/2025/04/12/box-diff/">Box cloud storage</a>, creating <a href="https://www.tunbury.org/2025/03/23/real-time-trains/">Real Time Trains</a> API integrations, and exploring various file synchronisation and <a href="https://www.tunbury.org/2025/06/14/borg-backup/">backup</a> solutions. Investigation of <a href="https://www.tunbury.org/2025/07/15/reflink-copy/">reflink copy</a> mechanisms for efficient file operations using OCaml multicore.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="tarides" /><summary type="html"><![CDATA[I am grateful for Tarides’ sponsorship of my OCaml work. 
Below is a summary of my activities in Q2 2025.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/tarides.png" /><media:content medium="image" url="https://www.tunbury.org/images/tarides.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Reflink Copy</title><link href="https://www.tunbury.org/2025/07/15/reflink-copy/" rel="alternate" type="text/html" title="Reflink Copy" /><published>2025-07-15T00:00:00+00:00</published><updated>2025-07-15T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/15/reflink-copy</id><content type="html" xml:base="https://www.tunbury.org/2025/07/15/reflink-copy/"><![CDATA[<p>I hadn’t intended to write another <a href="https://www.tunbury.org/2025/07/08/unix-or-sys/">post</a> about traversing a directory structure or even thinking about it again, but weirdly, it just kept coming up again!</p>

<p>Firstly, Patrick mentioned <code class="language-plaintext highlighter-rouge">Eio.Path.read_dir</code> and Anil mentioned <a href="https://tavianator.com/2023/bfs_3.0.html">bfs</a>. Then Becky commented about XFS reflink performance, and I suggested that the single-threaded nature of <code class="language-plaintext highlighter-rouge">cp -r --reflink=always</code> was probably hurting our <a href="https://github.com/ocurrent/obuilder">obuilder</a> performance tests.</p>

<p>Obuilder is written in LWT, which has <code class="language-plaintext highlighter-rouge">Lwt_unix.readdir</code>. What if we had a pool of threads that would traverse the directory structure in parallel and create a reflinked copy?</p>

<p>Creating a reflink couldn’t be easier. There’s an <code class="language-plaintext highlighter-rouge">ioctl</code> call that <em>just</em> does it. Such a contrast to the ReFS copy-on-write implementation on Windows!</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;caml/mlvalues.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;caml/memory.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;caml/unixsupport.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;sys/ioctl.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;errno.h&gt;</span><span class="cp">
</span>
<span class="cp">#ifndef FICLONE
#define FICLONE 0x40049409
#endif
</span>
<span class="n">value</span> <span class="nf">caml_ioctl_ficlone</span><span class="p">(</span><span class="n">value</span> <span class="n">dst_fd</span><span class="p">,</span> <span class="n">value</span> <span class="n">src_fd</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">CAMLparam2</span><span class="p">(</span><span class="n">dst_fd</span><span class="p">,</span> <span class="n">src_fd</span><span class="p">);</span>
    <span class="kt">int</span> <span class="n">result</span><span class="p">;</span>

    <span class="n">result</span> <span class="o">=</span> <span class="n">ioctl</span><span class="p">(</span><span class="n">Int_val</span><span class="p">(</span><span class="n">dst_fd</span><span class="p">),</span> <span class="n">FICLONE</span><span class="p">,</span> <span class="n">Int_val</span><span class="p">(</span><span class="n">src_fd</span><span class="p">));</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">result</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">uerror</span><span class="p">(</span><span class="s">"ioctl_ficlone"</span><span class="p">,</span> <span class="n">Nothing</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="n">CAMLreturn</span><span class="p">(</span><span class="n">Val_int</span><span class="p">(</span><span class="n">result</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We can write a reflink copy function as shown below. (Excuse my error handling.) There are a couple of interesting points to note: the permissions set via <code class="language-plaintext highlighter-rouge">Unix.openfile</code> are filtered through the umask, and you need to <code class="language-plaintext highlighter-rouge">Unix.fchown</code> before <code class="language-plaintext highlighter-rouge">Unix.fchmod</code> if you want the suid bit set.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">external</span> <span class="n">ioctl_ficlone</span> <span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="o">-&gt;</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="o">-&gt;</span> <span class="kt">int</span> <span class="o">=</span> <span class="s2">"caml_ioctl_ficlone"</span>

<span class="k">let</span> <span class="n">copy_file</span> <span class="n">src</span> <span class="n">dst</span> <span class="n">stat</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">src_fd</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">openfile</span> <span class="n">src</span> <span class="p">[</span><span class="nc">O_RDONLY</span><span class="p">]</span> <span class="mi">0</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">dst_fd</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">openfile</span> <span class="n">dst</span> <span class="p">[</span><span class="nc">O_WRONLY</span><span class="p">;</span> <span class="nc">O_CREAT</span><span class="p">;</span> <span class="nc">O_TRUNC</span><span class="p">]</span> <span class="mo">0o600</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">_</span> <span class="o">=</span> <span class="n">ioctl_ficlone</span> <span class="n">dst_fd</span> <span class="n">src_fd</span> <span class="k">in</span>
  <span class="nn">Unix</span><span class="p">.</span><span class="n">fchown</span> <span class="n">dst_fd</span> <span class="n">stat</span><span class="o">.</span><span class="n">st_uid</span> <span class="n">stat</span><span class="o">.</span><span class="n">st_gid</span><span class="p">;</span>
  <span class="nn">Unix</span><span class="p">.</span><span class="n">fchmod</span> <span class="n">dst_fd</span> <span class="n">stat</span><span class="o">.</span><span class="n">st_perm</span><span class="p">;</span>
  <span class="nn">Unix</span><span class="p">.</span><span class="n">close</span> <span class="n">src_fd</span><span class="p">;</span>
  <span class="nn">Unix</span><span class="p">.</span><span class="n">close</span> <span class="n">dst_fd</span>
</code></pre></div></div>

<p>My LWT code created a list of all the files in a directory and then processed the list with <code class="language-plaintext highlighter-rouge">Lwt_list.map_s</code> (serially), returning promises for all the file operations and creating threads for new directory operations up to a defined maximum (8). If there was no thread capacity, it just recursed in the current thread. Copying a root filesystem, this gave me threads for <code class="language-plaintext highlighter-rouge">var</code>, <code class="language-plaintext highlighter-rouge">usr</code>, etc, just as we’d want. Wow! This was slow. Nearly 4 minutes to reflink 1.7GB!</p>

<p>What about using the threads library rather than LWT threads? This appears significantly better, bringing the execution time down to 40 seconds. However, I think a lot of that was down to my (bad) LWT implementation vs my somewhat better threads implementation.</p>

<p>At this point, I should probably note that <code class="language-plaintext highlighter-rouge">cp -r --reflink=always</code> on 1.7GB, 116,000 files takes 8.5 seconds on my machine using a loopback XFS. A sequential OCaml version, without the overhead of threads or any need to maintain a list of work to do, takes 9.0 seconds.</p>

<p>Giving up and getting on with other things was very tempting, but there was that nagging feeling of not bottoming out the problem.</p>

<p>Using OCaml Multicore, we can write a true multi-threaded version. I took a slightly different approach, having a work queue of directories to process, and N worker threads taking work from the queue.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Main Process: Starts with root directory
     ↓
WorkQueue: [process_dir(/root)]
     ↓
Domain 1: Takes work → processes files → adds subdirs to queue
Domain 2: Takes work → processes files → adds subdirs to queue
Domain 3: Takes work → processes files → adds subdirs to queue
     ↓
WorkQueue: [process_dir(/root/usr), process_dir(/root/var), ...]
</code></pre></div></div>
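<p>As a rough illustration of this pattern (a sketch only; <code class="language-plaintext highlighter-rouge">process_tree</code> and its callback <code class="language-plaintext highlighter-rouge">f</code> are hypothetical names, not the actual code in mtelvers/reflink), each domain repeatedly takes a directory from a shared queue, processes its files via <code class="language-plaintext highlighter-rouge">f</code>, and pushes any subdirectories it discovers back onto the queue:</p>

```ocaml
(* Sketch of the worker-pool pattern: N domains drain a shared queue of
   directories.  [pending] counts directories that are queued or in flight,
   so workers know when the whole tree has been processed. *)
let process_tree ~domains ~f root =
  let queue = Queue.create () in
  let mutex = Mutex.create () in
  let cond = Condition.create () in
  let pending = ref 1 in
  Queue.push root queue;
  let rec worker () =
    Mutex.lock mutex;
    while Queue.is_empty queue && !pending > 0 do
      Condition.wait cond mutex
    done;
    if !pending = 0 then begin
      (* Tree fully processed: wake any remaining sleepers and exit. *)
      Mutex.unlock mutex;
      Condition.broadcast cond
    end else begin
      let dir = Queue.pop queue in
      Mutex.unlock mutex;
      (* [f] processes the files in [dir] and returns its subdirectories. *)
      let subdirs = f dir in
      Mutex.lock mutex;
      List.iter (fun d -> incr pending; Queue.push d queue) subdirs;
      decr pending;
      Condition.broadcast cond;
      Mutex.unlock mutex;
      worker ()
    end
  in
  List.init domains (fun _ -> Domain.spawn worker) |> List.iter Domain.join
```

<p>Note that termination is decided by the <code class="language-plaintext highlighter-rouge">pending</code> counter rather than queue emptiness, since the queue can be momentarily empty while a worker is still discovering subdirectories.</p>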

<p>Below is a table showing the performance when using multiple threads compared to the baseline operation of <code class="language-plaintext highlighter-rouge">cp</code> and a sequential copy in OCaml.</p>

<table>
  <thead>
    <tr>
      <th>Copy command</th>
      <th>Duration (sec)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>cp -r --reflink=always</td>
      <td>8.49</td>
    </tr>
    <tr>
      <td>Sequential</td>
      <td>8.80</td>
    </tr>
    <tr>
      <td>2 domains</td>
      <td>5.45</td>
    </tr>
    <tr>
      <td>4 domains</td>
      <td>3.28</td>
    </tr>
    <tr>
      <td>6 domains</td>
      <td>3.43</td>
    </tr>
    <tr>
      <td>8 domains</td>
      <td>5.24</td>
    </tr>
    <tr>
      <td>10 domains</td>
      <td>9.07</td>
    </tr>
  </tbody>
</table>

<p>The code is available on GitHub in <a href="https://github.com/mtelvers/reflink">mtelvers/reflink</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><summary type="html"><![CDATA[I hadn’t intended to write another post about traversing a directory structure or even thinking about it again, but weirdly, it just kept coming up again!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Tessera Workshop</title><link href="https://www.tunbury.org/2025/07/14/tessera-workshop/" rel="alternate" type="text/html" title="Tessera Workshop" /><published>2025-07-14T00:00:00+00:00</published><updated>2025-07-14T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/14/tessera-workshop</id><content type="html" xml:base="https://www.tunbury.org/2025/07/14/tessera-workshop/"><![CDATA[<p>I wrote previously about setting up a <a href="https://www.tunbury.org/2025/07/09/jupyter/">Jupyter notebook in a Docker container</a>. This worked well for a single user, but we intend to hold a workshop and so need a multi-user setup.</p>

<p>We would prefer that as much of the per-user setup as possible be completed automatically so participants don’t need to waste time setting up the environment.</p>

<p>There is a great resource at <a href="https://github.com/jupyterhub/jupyterhub-the-hard-way/blob/HEAD/docs/installation-guide-hard.md">jupyterhub/jupyterhub-the-hard-way</a> walking you through the manual setup.</p>

<p>However, there are many Docker images that we can use as the base, including <code class="language-plaintext highlighter-rouge">python:3.11</code>, but I have decided to use <code class="language-plaintext highlighter-rouge">jupyter/datascience-notebook:latest</code>. The images are expected to be customised with a <code class="language-plaintext highlighter-rouge">Dockerfile</code>.</p>

<p>In my <code class="language-plaintext highlighter-rouge">Dockerfile</code>, I first installed JupyterLab and the other dependencies to avoid users needing to install these manually later.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RUN pip install --no-cache-dir \
    jupyterhub \
    jupyterlab \
    notebook \
    numpy \
    matplotlib \
    scikit-learn \
    ipyleaflet \
    ipywidgets \
    ipykernel
</code></pre></div></div>

<p>Then the system dependencies. A selection of editors and <code class="language-plaintext highlighter-rouge">git</code> which is needed for <code class="language-plaintext highlighter-rouge">pip install git+https</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>USER root
RUN apt-get update &amp;&amp; apt-get install -y \
    curl git vim nano \
    &amp;&amp; rm -rf /var/lib/apt/lists/*
</code></pre></div></div>

<p>Then our custom package from GitHub.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RUN pip install git+https://github.com/ucam-eo/geotessera.git
</code></pre></div></div>

<p>The default user database is PAM, so create UNIX users for the workshop participants with a disabled password.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RUN for user in user1 user2 user3; do \
        adduser --disabled-password --gecos '' $user; \
    done
</code></pre></div></div>

<p>Finally, set the entrypoint for the container:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CMD ["jupyterhub", "-f", "/srv/jupyterhub/jupyterhub_config.py"]
</code></pre></div></div>

<p>Next, I created the <code class="language-plaintext highlighter-rouge">jupyterhub_config.py</code>. I think most of these lines are self-explanatory. Everyone signs in with the same shared password. Global environment variables can be set using <code class="language-plaintext highlighter-rouge">c.Spawner.environment</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from jupyterhub.auth import DummyAuthenticator

c.JupyterHub.authenticator_class = DummyAuthenticator
c.DummyAuthenticator.password = "Workshop"

# Allow all users
c.Authenticator.allow_all = True

# Use JupyterLab by default
c.Spawner.default_url = '/lab'

# Set timeouts
c.Spawner.start_timeout = 300
c.Spawner.http_timeout = 120
c.Spawner.environment = {
    'TESSERA_DATA_DIR': '/tessera'
}

# Basic configuration
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 8000
</code></pre></div></div>

<p>I’m going to use Caddy as a reverse proxy for this setup, for this I need a <code class="language-plaintext highlighter-rouge">Caddyfile</code> containing the public FQDN and the Docker container name and port:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>workshop.cam.ac.uk {
	reverse_proxy jupyterhub:8000
}
</code></pre></div></div>

<p>The services are defined in <code class="language-plaintext highlighter-rouge">docker-compose.yml</code>: Caddy, with volumes to preserve SSL certificates between restarts, and <code class="language-plaintext highlighter-rouge">jupyterhub</code>, with a volume for the home directories so they persist, and a mapping for our shared dataset.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>services:
  caddy:
    image: caddy:latest
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
      - caddy_config:/config

  jupyterhub:
    build: .
    volumes:
      - ./jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py
      - jupyter_home:/home
      - tessera_data:/tessera

volumes:
  caddy_data:
  caddy_config:
  jupyter_home:
  tessera_data:
</code></pre></div></div>

<p>Reset UFW to defaults</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ufw <span class="nt">--force</span> reset
</code></pre></div></div>

<p>Set default policies</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ufw default deny incoming
ufw default allow outgoing
</code></pre></div></div>

<p>Allow SSH and HTTP(S) services</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ufw allow ssh
ufw allow http
ufw allow https
</code></pre></div></div>

<p>Enable UFW</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ufw <span class="nb">enable</span>
</code></pre></div></div>

<p>Check status</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ufw status verbose
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="jupyter" /><summary type="html"><![CDATA[I wrote previously about setting up a Jupyter notebook in a Docker container. This worked well for a single user, but we intend to hold a workshop and so need a multi-user setup.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/tessera2.png" /><media:content medium="image" url="https://www.tunbury.org/images/tessera2.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">dune unfmt</title><link href="https://www.tunbury.org/2025/07/10/dune-unfmt/" rel="alternate" type="text/html" title="dune unfmt" /><published>2025-07-10T00:00:00+00:00</published><updated>2025-07-10T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/10/dune-unfmt</id><content type="html" xml:base="https://www.tunbury.org/2025/07/10/dune-unfmt/"><![CDATA[<p>When working across machines, it’s easy to make changes and reconcile them using git. However, I made a mistake and inadvertently ran <code class="language-plaintext highlighter-rouge">dune fmt</code> and now my <code class="language-plaintext highlighter-rouge">git diff</code> is a total mess.</p>

<p>My thought, to get myself out of this situation, is to go back to the previous commit and create a new branch with no changes other than a <code class="language-plaintext highlighter-rouge">dune fmt</code>. I can then cherry-pick my latest work onto that branch, which should give me a clean diff.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git commit <span class="nt">-am</span> <span class="s1">'inadvertent reformatted version'</span>
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">git log</code> to find the commit that was just made and the previous one.</p>

<p>Checkout the previous commit and make a new branch, in my case called <code class="language-plaintext highlighter-rouge">pre-fmt</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git checkout &lt;previous commit&gt;
git switch <span class="nt">-c</span> pre-fmt
</code></pre></div></div>

<p>Format the code in this branch and commit that version.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dune <span class="nb">fmt
</span>git commit <span class="nt">-am</span> <span class="s1">'dune fmt'</span>
</code></pre></div></div>

<p>Now cherry-pick the original commit.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git cherry-pick &lt;latest commit&gt;
</code></pre></div></div>

<p>The cherry-pick reports many merge conflicts; these should be trivial to resolve, but it is a manual process. Once done, add the changed files and finish the cherry-pick.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git add bin/<span class="k">*</span>.ml
git cherry-pick <span class="nt">--continue</span>
</code></pre></div></div>
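<p>Alternatively, the manual resolution can be sidestepped: since the conflicts are caused by formatting, git’s merge strategy option <code class="language-plaintext highlighter-rouge">-X theirs</code> can resolve them automatically in favour of the cherry-picked change, after which a fresh <code class="language-plaintext highlighter-rouge">dune fmt</code> restores the formatting. The sketch below demonstrates the idea on a throwaway repository, with <code class="language-plaintext highlighter-rouge">sed</code> standing in for <code class="language-plaintext highlighter-rouge">dune fmt</code>; treat it as an idea rather than a tested recipe.</p>

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name demo
printf 'let  x  =  1\nlet  y  =  2\n' > a.ml      # "unformatted" base
git add a.ml && git commit -qm base
printf 'let  x  =  1\nlet  y  =  3\n' > a.ml      # the real change
git commit -qam change
change=$(git rev-parse HEAD)
git checkout -q HEAD~1 && git switch -qc pre-fmt
sed -i.bak 's/  */ /g' a.ml && rm a.ml.bak        # stand-in for `dune fmt`
git commit -qam fmt
git cherry-pick -X theirs "$change"               # auto-resolve in favour of the change
sed -i.bak 's/  */ /g' a.ml && rm a.ml.bak        # re-run the "formatter"
cat a.ml
```

<p>The caveat is that <code class="language-plaintext highlighter-rouge">-X theirs</code> is only safe if the conflicts really are formatting-only; any genuine divergence on the branch would be silently overwritten by the cherry-picked commit.</p>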

<p><code class="language-plaintext highlighter-rouge">git diff</code> now shows just the actual changes rather than the code formatting changes. Do you have any suggestions on a better workflow?</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="git" /><summary type="html"><![CDATA[When working across machines, it’s easy to make changes and reconcile them using git. However, I made a mistake and inadvertently ran dune fmt and now my git diff is a total mess.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/GitHub-Mark-120px-plus.png" /><media:content medium="image" url="https://www.tunbury.org/images/GitHub-Mark-120px-plus.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Publishing a Jupyter Notebook in a Docker Container</title><link href="https://www.tunbury.org/2025/07/09/jupyter/" rel="alternate" type="text/html" title="Publishing a Jupyter Notebook in a Docker Container" /><published>2025-07-09T00:00:00+00:00</published><updated>2025-07-09T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/09/jupyter</id><content type="html" xml:base="https://www.tunbury.org/2025/07/09/jupyter/"><![CDATA[<p>Brief notes on publishing a Jupyter notebook as a Docker container.</p>

<p>My starting point is a GitHub <a href="https://github.com/ucam-eo/tessera-interactive-map">repo</a> containing a Jupyter notebook and a <code class="language-plaintext highlighter-rouge">requirements.txt</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/ucam-eo/tessera-interactive-map
<span class="nb">cd </span>tessera-interactive-map
</code></pre></div></div>

<p>I created a <code class="language-plaintext highlighter-rouge">Dockerfile</code> which pulls in the standard Python base image. I used 3.11 as that is the minimum version supported by <a href="https://github.com/ucam-eo/geotessera.git">https://github.com/ucam-eo/geotessera.git</a>.</p>

<p><code class="language-plaintext highlighter-rouge">pip</code> installs the packages listed in <code class="language-plaintext highlighter-rouge">requirements.txt</code> plus the additional <a href="https://github.com/ucam-eo/geotessera.git">geotessera</a> library. The extra library is noted in the <a href="https://github.com/ucam-eo/tessera-interactive-map/blob/main/README.md">README.md</a>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FROM python:3.11
WORKDIR /app
COPY <span class="nb">.</span> /app
RUN pip <span class="nb">install</span> <span class="nt">--no-cache-dir</span> <span class="nt">-r</span> requirements.txt
RUN pip <span class="nb">install </span>git+https://github.com/ucam-eo/geotessera.git
RUN pip <span class="nb">install </span>jupyter
EXPOSE 8888
ENV NAME World
CMD <span class="o">[</span><span class="s2">"jupyter"</span>, <span class="s2">"notebook"</span>, <span class="s2">"--ip=0.0.0.0"</span>, <span class="s2">"--port=8888"</span>, <span class="s2">"--no-browser"</span>, <span class="s2">"--allow-root"</span><span class="o">]</span>
</code></pre></div></div>

<p>Build the Docker image.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build <span class="nt">-t</span> my-jupyter <span class="nb">.</span>
</code></pre></div></div>

<p>And run the container.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># docker run --rm -it -p 8888:8888 my-jupyter</span>
<span class="o">[</span>I 2025-07-09 16:11:37.739 ServerApp] jupyter_lsp | extension was successfully linked.
<span class="o">[</span>I 2025-07-09 16:11:37.743 ServerApp] jupyter_server_terminals | extension was successfully linked.
<span class="o">[</span>I 2025-07-09 16:11:37.746 ServerApp] jupyterlab | extension was successfully linked.
<span class="o">[</span>I 2025-07-09 16:11:37.749 ServerApp] notebook | extension was successfully linked.
<span class="o">[</span>I 2025-07-09 16:11:37.751 ServerApp] Writing Jupyter server cookie secret to /root/.local/share/jupyter/runtime/jupyter_cookie_secret
<span class="o">[</span>I 2025-07-09 16:11:38.089 ServerApp] notebook_shim | extension was successfully linked.
<span class="o">[</span>I 2025-07-09 16:11:38.102 ServerApp] notebook_shim | extension was successfully loaded.
<span class="o">[</span>I 2025-07-09 16:11:38.104 ServerApp] jupyter_lsp | extension was successfully loaded.
<span class="o">[</span>I 2025-07-09 16:11:38.105 ServerApp] jupyter_server_terminals | extension was successfully loaded.
<span class="o">[</span>I 2025-07-09 16:11:38.107 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.11/site-packages/jupyterlab
<span class="o">[</span>I 2025-07-09 16:11:38.107 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
<span class="o">[</span>I 2025-07-09 16:11:38.107 LabApp] Extension Manager is <span class="s1">'pypi'</span><span class="nb">.</span>
<span class="o">[</span>I 2025-07-09 16:11:38.156 ServerApp] jupyterlab | extension was successfully loaded.
<span class="o">[</span>I 2025-07-09 16:11:38.159 ServerApp] notebook | extension was successfully loaded.
<span class="o">[</span>I 2025-07-09 16:11:38.160 ServerApp] Serving notebooks from <span class="nb">local </span>directory: /app
<span class="o">[</span>I 2025-07-09 16:11:38.160 ServerApp] Jupyter Server 2.16.0 is running at:
<span class="o">[</span>I 2025-07-09 16:11:38.160 ServerApp] http://0ad4fce9b94e:8888/tree?token<span class="o">=</span>c11c0f007dd99a785ff67331514fb44e87269055952a253b
<span class="o">[</span>I 2025-07-09 16:11:38.160 ServerApp]     http://127.0.0.1:8888/tree?token<span class="o">=</span>c11c0f007dd99a785ff67331514fb44e87269055952a253b
</code></pre></div></div>
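<p>One caveat worth noting: <code class="language-plaintext highlighter-rouge">COPY . /app</code> bakes the notebook into the image, so edits made in the browser are lost when the container exits. A bind mount over <code class="language-plaintext highlighter-rouge">/app</code> keeps the working copy on the host instead; this variant is a suggestion, not part of the setup above.</p>

```shell
# Mount the git checkout over /app so notebook edits persist on the host.
# Assumes the image was built as `my-jupyter` and this is run from the
# tessera-interactive-map checkout.
docker run --rm -it -p 8888:8888 -v "$PWD":/app my-jupyter
```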

<p>Note the URL in the log output and open it in the browser. You are prompted to enter the token if you don’t specify the token as part of the URL.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="jupyter" /><summary type="html"><![CDATA[Brief notes on publishing a Jupyter notebook as a Docker container.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/tessera.png" /><media:content medium="image" url="https://www.tunbury.org/images/tessera.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Sys.readdir or Unix.readdir</title><link href="https://www.tunbury.org/2025/07/08/unix-or-sys/" rel="alternate" type="text/html" title="Sys.readdir or Unix.readdir" /><published>2025-07-08T00:00:00+00:00</published><updated>2025-07-08T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/08/unix-or-sys</id><content type="html" xml:base="https://www.tunbury.org/2025/07/08/unix-or-sys/"><![CDATA[<p>When you recursively scan a massive directory tree, would you use <code class="language-plaintext highlighter-rouge">Sys.readdir</code> or <code class="language-plaintext highlighter-rouge">Unix.readdir</code>? My inclination is that <code class="language-plaintext highlighter-rouge">Sys.readdir</code> feels more convenient to use, and thus the lower-level <code class="language-plaintext highlighter-rouge">Unix.readdir</code> would have the performance edge. Is it significant enough to bother with?</p>

<p>I quickly coded up the two options for comparison. Here’s the <code class="language-plaintext highlighter-rouge">Unix.readdir</code> version, which calls <code class="language-plaintext highlighter-rouge">Unix.opendir</code> and then recursively calls <code class="language-plaintext highlighter-rouge">Unix.readdir</code> until the <code class="language-plaintext highlighter-rouge">End_of_file</code> exception is raised.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">traverse_directory_unix</span> <span class="n">path</span> <span class="n">x</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">stats</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">lstat</span> <span class="n">path</span> <span class="k">in</span>
  <span class="k">match</span> <span class="n">stats</span><span class="o">.</span><span class="n">st_kind</span> <span class="k">with</span>
  <span class="o">|</span> <span class="nn">Unix</span><span class="p">.</span><span class="nc">S_REG</span> <span class="o">-&gt;</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">1</span>
  <span class="o">|</span> <span class="nc">S_LNK</span> <span class="o">|</span> <span class="nc">S_CHR</span> <span class="o">|</span> <span class="nc">S_BLK</span> <span class="o">|</span> <span class="nc">S_FIFO</span> <span class="o">|</span> <span class="nc">S_SOCK</span> <span class="o">-&gt;</span> <span class="n">x</span>
  <span class="o">|</span> <span class="nc">S_DIR</span> <span class="o">-&gt;</span>
      <span class="k">try</span>
        <span class="k">let</span> <span class="n">dir_handle</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">opendir</span> <span class="n">path</span> <span class="k">in</span>
        <span class="k">let</span> <span class="k">rec</span> <span class="n">read_entries</span> <span class="n">acc</span> <span class="o">=</span>
          <span class="k">try</span>
            <span class="k">match</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">readdir</span> <span class="n">dir_handle</span> <span class="k">with</span>
            <span class="o">|</span> <span class="s2">"."</span> <span class="o">|</span> <span class="s2">".."</span> <span class="o">-&gt;</span> <span class="n">read_entries</span> <span class="n">acc</span>
            <span class="o">|</span> <span class="n">entry</span> <span class="o">-&gt;</span>
                <span class="k">let</span> <span class="n">full_path</span> <span class="o">=</span> <span class="nn">Filename</span><span class="p">.</span><span class="n">concat</span> <span class="n">path</span> <span class="n">entry</span> <span class="k">in</span>
                <span class="n">read_entries</span> <span class="p">(</span><span class="n">traverse_directory_unix</span> <span class="n">full_path</span> <span class="n">acc</span><span class="p">)</span>
          <span class="k">with</span> <span class="nc">End_of_file</span> <span class="o">-&gt;</span>
            <span class="nn">Unix</span><span class="p">.</span><span class="n">closedir</span> <span class="n">dir_handle</span><span class="p">;</span>
            <span class="n">acc</span>
        <span class="k">in</span>
        <span class="n">read_entries</span> <span class="n">x</span>
      <span class="k">with</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="n">x</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">Sys.readdir</code> version nicely gives us an array so we can idiomatically use <code class="language-plaintext highlighter-rouge">Array.fold_left</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">traverse_directory_sys</span> <span class="n">source</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">rec</span> <span class="n">process_directory</span> <span class="n">s</span> <span class="n">current_source</span> <span class="o">=</span>
    <span class="k">let</span> <span class="n">entries</span> <span class="o">=</span> <span class="nn">Sys</span><span class="p">.</span><span class="n">readdir</span> <span class="n">current_source</span> <span class="k">in</span>
    <span class="nn">Array</span><span class="p">.</span><span class="n">fold_left</span>
      <span class="p">(</span><span class="k">fun</span> <span class="n">acc</span> <span class="n">entry</span> <span class="o">-&gt;</span>
        <span class="k">let</span> <span class="n">source</span> <span class="o">=</span> <span class="nn">Filename</span><span class="p">.</span><span class="n">concat</span> <span class="n">current_source</span> <span class="n">entry</span> <span class="k">in</span>
        <span class="k">try</span>
          <span class="k">let</span> <span class="n">stat</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">lstat</span> <span class="n">source</span> <span class="k">in</span>
          <span class="k">match</span> <span class="n">stat</span><span class="o">.</span><span class="n">st_kind</span> <span class="k">with</span>
          <span class="o">|</span> <span class="nn">Unix</span><span class="p">.</span><span class="nc">S_REG</span> <span class="o">-&gt;</span> <span class="n">acc</span> <span class="o">+</span> <span class="mi">1</span>
          <span class="o">|</span> <span class="nn">Unix</span><span class="p">.</span><span class="nc">S_DIR</span> <span class="o">-&gt;</span> <span class="n">process_directory</span> <span class="n">acc</span> <span class="n">source</span>
          <span class="o">|</span> <span class="nc">S_LNK</span> <span class="o">|</span> <span class="nc">S_CHR</span> <span class="o">|</span> <span class="nc">S_BLK</span> <span class="o">|</span> <span class="nc">S_FIFO</span> <span class="o">|</span> <span class="nc">S_SOCK</span> <span class="o">-&gt;</span> <span class="n">acc</span>
        <span class="k">with</span> <span class="nn">Unix</span><span class="p">.</span><span class="nc">Unix_error</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="n">acc</span><span class="p">)</span>
      <span class="n">s</span> <span class="n">entries</span>
  <span class="k">in</span>
  <span class="n">process_directory</span> <span class="mi">0</span> <span class="n">source</span>
</code></pre></div></div>

<p>The file system may have a big impact, so I tested NTFS, ReFS, and ext4, running each a couple of times to ensure the cache was primed.</p>
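<p>If you want to reproduce the comparison, a synthetic tree is easy to generate. The sketch below builds a deliberately small one of 1,000 files under a throwaway <code class="language-plaintext highlighter-rouge">mktemp</code> root, scaled down from the hundreds of thousands of files used in these tests.</p>

```shell
# Build a small synthetic tree: 10 directories x 100 empty files,
# under a throwaway root created with mktemp.
tree=$(mktemp -d)
for d in $(seq 1 10); do
  mkdir -p "$tree/dir$d"
  for f in $(seq 1 100); do : > "$tree/dir$d/file$f"; done
done
# Both traversals should report the same count as find(1):
find "$tree" -type f | wc -l
```

<p>Running each traversal over the same tree several times keeps the dentry cache warm, so the comparison measures the API overhead rather than the disk.</p>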

<p><code class="language-plaintext highlighter-rouge">Sys.readdir</code> was quicker in my test cases up to 500,000 files. Reaching 750,000 files, <code class="language-plaintext highlighter-rouge">Unix.readdir</code> edged ahead. I was surprised by the outcome and wondered whether it was my code rather than the module I used.</p>

<p>Pushing for the result I expected/wanted, I rewrote the function so it more closely mirrors the <code class="language-plaintext highlighter-rouge">Sys.readdir</code> version.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">traverse_directory_unix_2</span> <span class="n">path</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">rec</span> <span class="n">process_directory</span> <span class="n">s</span> <span class="n">path</span> <span class="o">=</span>
    <span class="k">try</span>
      <span class="k">let</span> <span class="n">dir_handle</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">opendir</span> <span class="n">path</span> <span class="k">in</span>
      <span class="k">let</span> <span class="k">rec</span> <span class="n">read_entries</span> <span class="n">acc</span> <span class="o">=</span>
        <span class="k">try</span>
          <span class="k">let</span> <span class="n">entry</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">readdir</span> <span class="n">dir_handle</span> <span class="k">in</span>
          <span class="k">match</span> <span class="n">entry</span> <span class="k">with</span>
          <span class="o">|</span> <span class="s2">"."</span> <span class="o">|</span> <span class="s2">".."</span> <span class="o">-&gt;</span> <span class="n">read_entries</span> <span class="n">acc</span>
          <span class="o">|</span> <span class="n">entry</span> <span class="o">-&gt;</span>
              <span class="k">let</span> <span class="n">full_path</span> <span class="o">=</span> <span class="nn">Filename</span><span class="p">.</span><span class="n">concat</span> <span class="n">path</span> <span class="n">entry</span> <span class="k">in</span>
              <span class="k">let</span> <span class="n">stats</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">lstat</span> <span class="n">full_path</span> <span class="k">in</span>
              <span class="k">match</span> <span class="n">stats</span><span class="o">.</span><span class="n">st_kind</span> <span class="k">with</span>
              <span class="o">|</span> <span class="nn">Unix</span><span class="p">.</span><span class="nc">S_REG</span> <span class="o">-&gt;</span> <span class="n">read_entries</span> <span class="p">(</span><span class="n">acc</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
              <span class="o">|</span> <span class="nc">S_LNK</span> <span class="o">|</span> <span class="nc">S_CHR</span> <span class="o">|</span> <span class="nc">S_BLK</span> <span class="o">|</span> <span class="nc">S_FIFO</span> <span class="o">|</span> <span class="nc">S_SOCK</span> <span class="o">-&gt;</span> <span class="n">read_entries</span> <span class="n">acc</span>
              <span class="o">|</span> <span class="nc">S_DIR</span> <span class="o">-&gt;</span> <span class="n">read_entries</span> <span class="p">(</span><span class="n">process_directory</span> <span class="n">acc</span> <span class="n">full_path</span><span class="p">)</span>
        <span class="k">with</span> <span class="nc">End_of_file</span> <span class="o">-&gt;</span>
          <span class="nn">Unix</span><span class="p">.</span><span class="n">closedir</span> <span class="n">dir_handle</span><span class="p">;</span>
          <span class="n">acc</span>
      <span class="k">in</span>
      <span class="n">read_entries</span> <span class="n">s</span>
    <span class="k">with</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="n">s</span>
  <span class="k">in</span>
  <span class="n">process_directory</span> <span class="mi">0</span> <span class="n">path</span>
</code></pre></div></div>

<p>This version is indeed faster than <code class="language-plaintext highlighter-rouge">Sys.readdir</code> in all cases. However, at 750,000 files the speed up was &lt; 0.5%.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[When you recursively scan a massive directory tree, would you use Sys.readdir or Unix.readdir? My inclination is that Sys.readdir feels more convenient to use, and thus the lower-level Unix.readdir would have the performance edge. Is it significant enough to bother with?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/sys-or-unix.png" /><media:content medium="image" url="https://www.tunbury.org/images/sys-or-unix.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">ReFS, EEG Intern and Monteverde</title><link href="https://www.tunbury.org/2025/07/07/refs-monteverde/" rel="alternate" type="text/html" title="ReFS, EEG Intern and Monteverde" /><published>2025-07-07T00:00:00+00:00</published><updated>2025-07-07T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/07/refs-monteverde</id><content type="html" xml:base="https://www.tunbury.org/2025/07/07/refs-monteverde/"><![CDATA[<p>In addition to the post from last week covering <a href="https://www.tunbury.org/2025/07/02/bon-in-a-box/">BON in a Box</a> and <a href="https://www.tunbury.org/2025/07/01/ocaml-functors/">OCaml Functors</a>, below are some additional notes.</p>

<h1 id="resilient-file-system-refs">Resilient File System, ReFS</h1>

<p>I have previously stated that <a href="https://www.tunbury.org/windows-reflinks">ReFS</a> supports 1 million hard links per file; however, this is not the case. The maximum is considerably lower, at 8,191. That’s eight times the NTFS limit, but still not very many.</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">PS</span><span class="w"> </span><span class="nx">D:\</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">touch</span><span class="w"> </span><span class="nx">foo</span><span class="w">
</span><span class="n">PS</span><span class="w"> </span><span class="nx">D:\</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">foreach</span><span class="w"> </span><span class="p">(</span><span class="nv">$i</span><span class="w"> </span><span class="kr">in</span><span class="w"> </span><span class="mi">1</span><span class="o">..</span><span class="mi">8192</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="err">&gt;&gt;</span><span class="w">     </span><span class="n">New-Item</span><span class="w"> </span><span class="nt">-ItemType</span><span class="w"> </span><span class="nx">HardLink</span><span class="w"> </span><span class="nt">-Path</span><span class="w"> </span><span class="s2">"foo-</span><span class="nv">$i</span><span class="s2">"</span><span class="w"> </span><span class="nt">-Target</span><span class="w"> </span><span class="s2">"foo"</span><span class="w">
</span><span class="err">&gt;&gt;</span><span class="w"> </span><span class="p">}</span><span class="w">


    </span><span class="n">Directory:</span><span class="w"> </span><span class="nx">D:\</span><span class="w">


</span><span class="n">Mode</span><span class="w">                 </span><span class="nx">LastWriteTime</span><span class="w">         </span><span class="nx">Length</span><span class="w"> </span><span class="nx">Name</span><span class="w">
</span><span class="o">----</span><span class="w">                 </span><span class="o">-------------</span><span class="w">         </span><span class="o">------</span><span class="w"> </span><span class="o">----</span><span class="w">
</span><span class="nt">-a</span><span class="o">----</span><span class="w">        </span><span class="mi">07</span><span class="n">/07/2025</span><span class="w">     </span><span class="nx">01:00</span><span class="w">              </span><span class="nx">0</span><span class="w"> </span><span class="nx">foo-1</span><span class="w">
</span><span class="nt">-a</span><span class="o">----</span><span class="w">        </span><span class="mi">07</span><span class="n">/07/2025</span><span class="w">     </span><span class="nx">01:00</span><span class="w">              </span><span class="nx">0</span><span class="w"> </span><span class="nx">foo-2</span><span class="w">
</span><span class="nt">-a</span><span class="o">----</span><span class="w">        </span><span class="mi">07</span><span class="n">/07/2025</span><span class="w">     </span><span class="nx">01:00</span><span class="w">              </span><span class="nx">0</span><span class="w"> </span><span class="nx">foo-3</span><span class="w">
</span><span class="nt">-a</span><span class="o">----</span><span class="w">        </span><span class="mi">07</span><span class="n">/07/2025</span><span class="w">     </span><span class="nx">01:00</span><span class="w">              </span><span class="nx">0</span><span class="w"> </span><span class="nx">foo-4</span><span class="w">
</span><span class="o">...</span><span class="w">
</span><span class="nt">-a</span><span class="o">----</span><span class="w">        </span><span class="mi">07</span><span class="n">/07/2025</span><span class="w">     </span><span class="nx">01:00</span><span class="w">              </span><span class="nx">0</span><span class="w"> </span><span class="nx">foo-8190</span><span class="w">
</span><span class="nt">-a</span><span class="o">----</span><span class="w">        </span><span class="mi">07</span><span class="n">/07/2025</span><span class="w">     </span><span class="nx">01:00</span><span class="w">              </span><span class="nx">0</span><span class="w"> </span><span class="nx">foo-8191</span><span class="w">
</span><span class="n">New-Item</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="nx">An</span><span class="w"> </span><span class="nx">attempt</span><span class="w"> </span><span class="nx">was</span><span class="w"> </span><span class="nx">made</span><span class="w"> </span><span class="nx">to</span><span class="w"> </span><span class="nx">create</span><span class="w"> </span><span class="nx">more</span><span class="w"> </span><span class="nx">links</span><span class="w"> </span><span class="nx">on</span><span class="w"> </span><span class="nx">a</span><span class="w"> </span><span class="nx">file</span><span class="w"> </span><span class="nx">than</span><span class="w"> </span><span class="nx">the</span><span class="w"> </span><span class="nx">file</span><span class="w"> </span><span class="nx">system</span><span class="w"> </span><span class="nx">supports</span><span class="w">
</span><span class="n">At</span><span class="w"> </span><span class="nx">line:2</span><span class="w"> </span><span class="nx">char:5</span><span class="w">
</span><span class="o">+</span><span class="w">     </span><span class="n">New-Item</span><span class="w"> </span><span class="nt">-ItemType</span><span class="w"> </span><span class="nx">HardLink</span><span class="w"> </span><span class="nt">-Path</span><span class="w"> </span><span class="s2">"foo-</span><span class="nv">$i</span><span class="s2">"</span><span class="w"> </span><span class="nt">-Target</span><span class="w"> </span><span class="s2">"foo"</span><span class="w">
</span><span class="o">+</span><span class="w">     </span><span class="n">~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~</span><span class="w">
    </span><span class="o">+</span><span class="w"> </span><span class="nx">CategoryInfo</span><span class="w">          </span><span class="p">:</span><span class="w"> </span><span class="nx">NotSpecified:</span><span class="w"> </span><span class="p">(:)</span><span class="w"> </span><span class="p">[</span><span class="n">New</span><span class="nt">-Item</span><span class="p">],</span><span class="w"> </span><span class="n">Win32Exception</span><span class="w">
    </span><span class="o">+</span><span class="w"> </span><span class="nx">FullyQualifiedErrorId</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="nx">System.ComponentModel.Win32Exception</span><span class="p">,</span><span class="nx">Microsoft.PowerShell.Commands.NewItemCommand</span><span class="w">
</span></code></pre></div></div>

<p>I had also investigated ReFS block cloning, which removed the requirement to create hard links, and wrote a <a href="https://github.com/mtelvers/ReFS-Clone">ReFS-clone</a> tool for Windows Server 2022. This works well until containerd is used to bind mount a directory on the volume. Once this has happened, attempts to create a block clone fail. To exclude my code as the root cause, I have tried Windows Server 2025, where commands such as <code class="language-plaintext highlighter-rouge">copy</code> and <code class="language-plaintext highlighter-rouge">robocopy</code> automatically perform block clones. Block cloning can be restored by rebooting the machine. I note that restarting containerd is not sufficient.</p>

<p>Removing files and folders on ReFS is impressively fast; however, this comes at a cost: freeing the blocks is a background activity that may take some time to be scheduled.</p>

<h1 id="file-system-performance-with-a-focus-on-zfs">File system performance with a focus on ZFS</h1>

<p>Several EEG interns started last week with this <a href="https://anil.recoil.org/ideas/zfs-filesystem-perf">project</a> under my supervision. In brief, we will examine file system performance on the filesystems supported by <a href="https://github.com/ocurrent/obuilder">OBuilder</a> before conducting more detailed investigations into factors affecting ZFS performance.</p>

<h1 id="monteverde">Monteverde</h1>

<p>monteverde.cl.cam.ac.uk has been installed in the rack. It has two AMD EPYC 9965 192-core processors, giving a total of 384 cores and 768 threads, plus 3TB of RAM.</p>

<p><img src="/images/monteverde.jpg" alt="" /></p>

<p>From the logs, there are still some teething issues:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[130451.620482] Large kmem_alloc(98304, 0x1000), please file an issue at:
                https://github.com/openzfs/zfs/issues/new
[130451.620486] CPU: 51 UID: 0 PID: 8594 Comm: txg_sync Tainted: P           O       6.14.0-23-generic #23-Ubuntu
[130451.620488] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE
[130451.620489] Hardware name: Dell Inc. PowerEdge R7725/0KRFPX, BIOS 1.1.3 02/25/2025
[130451.620490] Call Trace:
[130451.620490]  &lt;TASK&gt;
[130451.620492]  show_stack+0x49/0x60
[130451.620493]  dump_stack_lvl+0x5f/0x90
[130451.620495]  dump_stack+0x10/0x18
[130451.620497]  spl_kmem_alloc_impl.cold+0x17/0x1c [spl]
[130451.620503]  spl_kmem_zalloc+0x19/0x30 [spl]
[130451.620508]  multilist_create_impl+0x3f/0xc0 [zfs]
[130451.620586]  multilist_create+0x31/0x50 [zfs]
[130451.620650]  dmu_objset_sync+0x4c4/0x4d0 [zfs]
[130451.620741]  dsl_pool_sync_mos+0x34/0xc0 [zfs]
[130451.620832]  dsl_pool_sync+0x3c1/0x420 [zfs]
[130451.620910]  spa_sync_iterate_to_convergence+0xda/0x220 [zfs]
[130451.620990]  spa_sync+0x333/0x660 [zfs]
[130451.621056]  txg_sync_thread+0x1f5/0x270 [zfs]
[130451.621137]  ? __pfx_txg_sync_thread+0x10/0x10 [zfs]
[130451.621207]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
[130451.621213]  thread_generic_wrapper+0x5b/0x70 [spl]
[130451.621217]  kthread+0xf9/0x230
[130451.621219]  ? __pfx_kthread+0x10/0x10
[130451.621221]  ret_from_fork+0x44/0x70
[130451.621223]  ? __pfx_kthread+0x10/0x10
[130451.621224]  ret_from_fork_asm+0x1a/0x30
[130451.621226]  &lt;/TASK&gt;
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="refs" /><category term="tunbury.org" /><summary type="html"><![CDATA[In addition to the post from last week covering BON in a Box and OCaml Functors, below are some additional notes.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/refs.png" /><media:content medium="image" url="https://www.tunbury.org/images/refs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">BON in a Box</title><link href="https://www.tunbury.org/2025/07/02/bon-in-a-box/" rel="alternate" type="text/html" title="BON in a Box" /><published>2025-07-02T00:00:00+00:00</published><updated>2025-07-02T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/02/bon-in-a-box</id><content type="html" xml:base="https://www.tunbury.org/2025/07/02/bon-in-a-box/"><![CDATA[<p>On a suggestion from Michael, I have had a quick look at <a href="https://geo-bon.github.io/bon-in-a-box-pipeline-engine/">BON in a Box</a>, which is a web-based biodiversity analysis platform using Docker containerised pipelines running R, Julia, and Python scripts.</p>

<p>It couldn’t be easier to get started. Install Docker and Docker Compose, and make sure you can access GitHub via SSH using a public key. [Run <code class="language-plaintext highlighter-rouge">ssh-keygen -t ed25519</code> and then publish the resulting <code class="language-plaintext highlighter-rouge">~/.ssh/id_ed25519.pub</code> to your GitHub account.]</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>docker.io docker-compose-v2
</code></pre></div></div>

<p>Clone the GEO-BON repository and make a working copy of the <code class="language-plaintext highlighter-rouge">runner.env</code> file. This file can be edited to add API keys for datasets, but I don’t have any, so the default file is fine.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone git@github.com:GEO-BON/bon-in-a-box-pipelines.git
<span class="nb">cd </span>bon-in-a-box-pipelines
<span class="nb">cp </span>runner-sample.env runner.env
</code></pre></div></div>

<p>To start the server, run <code class="language-plaintext highlighter-rouge">./server-up.sh</code>. There is also <code class="language-plaintext highlighter-rouge">./server-down.sh</code> to stop the server.</p>

<p>The first run downloads the required Docker images, so it takes a few minutes. Once complete, visit <a href="http://localhost">http://localhost</a> to see the web GUI.</p>

<p>I ran the “Get Country Polygon” script, creating a nice Colombia polygon.</p>

<p>There is a drag-and-drop pipeline editor, which felt a lot like Microsoft Access.</p>

<p><img src="/images/geobon-pipeline.png" alt="" /></p>

<p>I followed along with the tutorial and created an R script and a YAML file of the same name in the <code class="language-plaintext highlighter-rouge">/scripts</code> directory. These appeared in the GUI, allowing me to run them and use them in the pipeline editor. Annoyingly, the dataset was not provided in the tutorial, so I couldn’t run the code.</p>

<p><code class="language-plaintext highlighter-rouge">TestScript.R</code></p>

<p>The <code class="language-plaintext highlighter-rouge">biab</code> functions are how the script interacts with the BON in a Box system.</p>

<div class="language-r highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">library</span><span class="p">(</span><span class="n">rjson</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">sf</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">terra</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">dplyr</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">ggplot2</span><span class="p">)</span><span class="w">

</span><span class="n">input</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">biab_inputs</span><span class="p">()</span><span class="w">

</span><span class="n">dat</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">st_read</span><span class="p">(</span><span class="n">input</span><span class="o">$</span><span class="n">country_polygon</span><span class="p">)</span><span class="w">

</span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">nrow</span><span class="p">(</span><span class="n">dat</span><span class="p">)</span><span class="o">==</span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="n">biab_error_stop</span><span class="p">(</span><span class="s2">"Country polygon does not exist"</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">   
  
</span><span class="n">dat.transformed</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">st_transform</span><span class="p">(</span><span class="n">dat</span><span class="p">,</span><span class="w"> </span><span class="n">crs</span><span class="o">=</span><span class="n">input</span><span class="o">$</span><span class="n">crs</span><span class="p">)</span><span class="w">

</span><span class="n">rasters</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">terra</span><span class="o">::</span><span class="n">rast</span><span class="p">(</span><span class="nf">c</span><span class="p">(</span><span class="n">input</span><span class="o">$</span><span class="n">rasters</span><span class="p">,</span><span class="w"> </span><span class="n">crs</span><span class="o">=</span><span class="n">input</span><span class="o">$</span><span class="n">crs</span><span class="p">))</span><span class="w">

</span><span class="n">country_vect</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">vect</span><span class="p">(</span><span class="n">dat.transformed</span><span class="p">)</span><span class="w">
    
</span><span class="n">raster.cropped</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">mask</span><span class="p">(</span><span class="n">rasters</span><span class="p">,</span><span class="w"> </span><span class="n">country_vect</span><span class="p">)</span><span class="w"> 
    
</span><span class="n">raster_change</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">rasters</span><span class="p">[[</span><span class="m">1</span><span class="p">]]</span><span class="o">-</span><span class="n">rasters</span><span class="p">[[</span><span class="m">2</span><span class="p">]]</span><span class="w">

</span><span class="n">raster_change_path</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">file.path</span><span class="p">(</span><span class="n">outputFolder</span><span class="p">,</span><span class="w"> </span><span class="s2">"raster_change.tif"</span><span class="p">)</span><span class="w">
</span><span class="n">writeRaster</span><span class="p">(</span><span class="n">raster_change</span><span class="p">,</span><span class="w"> </span><span class="n">raster_change_path</span><span class="p">)</span><span class="w">

</span><span class="n">biab_output</span><span class="p">(</span><span class="s2">"raster_change"</span><span class="p">,</span><span class="w"> </span><span class="n">raster_change_path</span><span class="p">)</span><span class="w">

</span><span class="n">layer_means</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">global</span><span class="p">(</span><span class="n">raster.cropped</span><span class="p">,</span><span class="w"> </span><span class="n">fun</span><span class="o">=</span><span class="s2">"mean"</span><span class="p">,</span><span class="w"> </span><span class="n">na.rm</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span><span class="n">layer_means</span><span class="o">$</span><span class="n">name</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="nf">names</span><span class="p">(</span><span class="n">raster.cropped</span><span class="p">)</span><span class="w">
  
</span><span class="n">means_plot</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">layer_means</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">name</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="o">=</span><span class="n">mean</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">geom_point</span><span class="p">()</span><span class="w">
  
</span><span class="n">means_plot_path</span><span class="w"> </span><span class="o">&lt;-</span><span class="w"> </span><span class="n">file.path</span><span class="p">(</span><span class="n">outputFolder</span><span class="p">,</span><span class="w"> </span><span class="s2">"means_plot.png"</span><span class="p">)</span><span class="w">
</span><span class="n">ggsave</span><span class="p">(</span><span class="n">means_plot_path</span><span class="p">,</span><span class="w"> </span><span class="n">means_plot</span><span class="p">)</span><span class="w">
    
</span><span class="n">biab_output</span><span class="p">(</span><span class="s2">"means_plot"</span><span class="p">,</span><span class="w"> </span><span class="n">means_plot_path</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">TestScript.yaml</code></p>

<p>The <code class="language-plaintext highlighter-rouge">inputs</code> and <code class="language-plaintext highlighter-rouge">outputs</code> sections define the script’s inputs and outputs; the names must match those used in the script above. The environment is set up using conda, and a specific package version can be pinned like this: <code class="language-plaintext highlighter-rouge">r-terra=0.9-12</code></p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">script</span><span class="pi">:</span> <span class="s">TestScript.R</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">Test script</span>
<span class="na">description</span><span class="pi">:</span> <span class="s">Demo script</span>
<span class="na">author</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">ME</span>
<span class="na">inputs</span><span class="pi">:</span>
  <span class="na">country_polygon</span><span class="pi">:</span>
    <span class="na">label</span><span class="pi">:</span> <span class="s">Country Polygon</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Polygon of the country of interest</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">application/geo+json</span>
    <span class="na">example</span><span class="pi">:</span> <span class="no">null</span>
  <span class="na">crs</span><span class="pi">:</span>
    <span class="na">label</span><span class="pi">:</span> <span class="s">Coordinate reference system</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Coordinate reference system</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">text</span>
    <span class="na">example</span><span class="pi">:</span> <span class="s2">"</span><span class="s">EPSG:3857"</span>
  <span class="na">rasters</span><span class="pi">:</span>
    <span class="na">label</span><span class="pi">:</span> <span class="s">Rasters</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Raster layers of variable of interest</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">image/tiff;application=geotiff[]</span>
    <span class="na">example</span><span class="pi">:</span> <span class="no">null</span> 
<span class="na">outputs</span><span class="pi">:</span>
  <span class="na">raster_change</span><span class="pi">:</span>
    <span class="na">label</span><span class="pi">:</span> <span class="s">Rasters</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Differences between raster values</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">image/tiff;application=geotiff</span>
  <span class="na">means_plot</span><span class="pi">:</span>
    <span class="na">label</span><span class="pi">:</span> <span class="s">Plot of raster means</span>
    <span class="na">description</span><span class="pi">:</span> <span class="s">Plot of means of raster layers</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">image/png</span>
<span class="na">conda</span><span class="pi">:</span>
  <span class="na">channels</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">conda-forge</span>
    <span class="pi">-</span> <span class="s">r</span>
  <span class="na">dependencies</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s">r-rjson</span>
    <span class="pi">-</span> <span class="s">r-sf</span>
    <span class="pi">-</span> <span class="s">r-dplyr</span>
    <span class="pi">-</span> <span class="s">r-terra</span>
    <span class="pi">-</span> <span class="s">r-ggplot2</span>
</code></pre></div></div>

<p>The architecture appears to be designed as a single-server instance without built-in job queuing or concurrent execution limits.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="geobon" /><category term="tunbury.org" /><summary type="html"><![CDATA[On a suggestion from Michael, I have had a quick look at BON in a Box, which is a web-based biodiversity analysis platform using Docker containerised pipelines running R, Julia, and Python scripts.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/geobon-colombia.png" /><media:content medium="image" url="https://www.tunbury.org/images/geobon-colombia.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml Functors</title><link href="https://www.tunbury.org/2025/07/01/ocaml-functors/" rel="alternate" type="text/html" title="OCaml Functors" /><published>2025-07-01T00:00:00+00:00</published><updated>2025-07-01T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/07/01/ocaml-functors</id><content type="html" xml:base="https://www.tunbury.org/2025/07/01/ocaml-functors/"><![CDATA[<p>In my OCaml project, I’d like to abstract away the details of running containers into specific modules based on the OS. Currently, I have working container setups for Windows and Linux, and I’ve haphazardly peppered <code class="language-plaintext highlighter-rouge">if Sys.win32 then</code> where I need differentiation, but this is OCaml, so let us use <em>functors</em>!</p>

<p>I started by fleshing out the bare bones in a new project. After <code class="language-plaintext highlighter-rouge">dune init project functor</code>, I created <code class="language-plaintext highlighter-rouge">bin/s.ml</code> containing the signature of the module <code class="language-plaintext highlighter-rouge">CONTAINER</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">CONTAINER</span> <span class="o">=</span> <span class="k">sig</span>
  <span class="k">val</span> <span class="n">run</span> <span class="o">:</span> <span class="kt">string</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Then a trivial <code class="language-plaintext highlighter-rouge">bin/linux.ml</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">run</span> <span class="n">s</span> <span class="o">=</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Linux container '%s'</span><span class="se">\n</span><span class="s2">"</span> <span class="n">s</span>
</code></pre></div></div>

<p>And <code class="language-plaintext highlighter-rouge">bin/windows.ml</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">run</span> <span class="n">s</span> <span class="o">=</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Windows container '%s'</span><span class="se">\n</span><span class="s2">"</span> <span class="n">s</span>
</code></pre></div></div>

<p>Then in <code class="language-plaintext highlighter-rouge">bin/main.ml</code>, I can select the container implementation once and from then on use <code class="language-plaintext highlighter-rouge">Container.foo</code> to call the appropriate OS-specific function.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">container</span> <span class="o">=</span> <span class="k">if</span> <span class="nn">Sys</span><span class="p">.</span><span class="n">win32</span> <span class="k">then</span> <span class="p">(</span><span class="k">module</span> <span class="nc">Windows</span> <span class="o">:</span> <span class="nn">S</span><span class="p">.</span><span class="nc">CONTAINER</span><span class="p">)</span> <span class="k">else</span> <span class="p">(</span><span class="k">module</span> <span class="nc">Linux</span> <span class="o">:</span> <span class="nn">S</span><span class="p">.</span><span class="nc">CONTAINER</span><span class="p">)</span>

<span class="k">module</span> <span class="nc">Container</span> <span class="o">=</span> <span class="p">(</span><span class="k">val</span> <span class="n">container</span><span class="p">)</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="nn">Container</span><span class="p">.</span><span class="n">run</span> <span class="s2">"Hello, World!"</span>
</code></pre></div></div>

<p>You can additionally create <code class="language-plaintext highlighter-rouge">windows.mli</code> and <code class="language-plaintext highlighter-rouge">linux.mli</code> containing simply <code class="language-plaintext highlighter-rouge">include S.CONTAINER</code>.</p>
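<p>For comparison, the same signature also supports a genuine functor. This is a hypothetical sketch — the <code class="language-plaintext highlighter-rouge">Make</code> functor and <code class="language-plaintext highlighter-rouge">run_all</code> helper are not part of the project — showing how shared behaviour could be built over any <code class="language-plaintext highlighter-rouge">CONTAINER</code>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical: derive OS-agnostic helpers from any CONTAINER. *)
module Make (C : S.CONTAINER) = struct
  let run_all cmds = List.iter C.run cmds
end

(* Select the implementation once, then instantiate the functor. *)
module C = (val (if Sys.win32 then (module Windows : S.CONTAINER)
                 else (module Linux : S.CONTAINER)))

module Local = Make (C)

let () = Local.run_all [ "step one"; "step two" ]
</code></pre></div></div>

<p>The functor keeps the shared logic in one place while the per-OS modules stay minimal.</p>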

<p>Now, let’s imagine that we needed to have some specific configuration options depending upon whether we are running on Windows or Linux. For demonstration purposes, let’s use the user account. On Windows, this is a string, typically <code class="language-plaintext highlighter-rouge">ContainerAdministrator</code>, whereas on Linux, it’s an integer UID of value 0.</p>

<p>We can update the module type in <code class="language-plaintext highlighter-rouge">bin/s.ml</code> to include an abstract type <code class="language-plaintext highlighter-rouge">t</code>, add an <code class="language-plaintext highlighter-rouge">init</code> function that returns a <code class="language-plaintext highlighter-rouge">t</code>, and make <code class="language-plaintext highlighter-rouge">run</code> take a <code class="language-plaintext highlighter-rouge">t</code> as its first parameter.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="k">type</span> <span class="nc">CONTAINER</span> <span class="o">=</span> <span class="k">sig</span>
  <span class="k">type</span> <span class="n">t</span>

  <span class="k">val</span> <span class="n">init</span> <span class="o">:</span> <span class="kt">unit</span> <span class="o">-&gt;</span> <span class="n">t</span>
  <span class="k">val</span> <span class="n">run</span> <span class="o">:</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="kt">string</span> <span class="o">-&gt;</span> <span class="kt">unit</span>
<span class="k">end</span>
</code></pre></div></div>

<p>In <code class="language-plaintext highlighter-rouge">bin/linux.ml</code>, we can add the type and define <code class="language-plaintext highlighter-rouge">uid</code> as an integer, then add the <code class="language-plaintext highlighter-rouge">init</code> function to return the populated structure. <code class="language-plaintext highlighter-rouge">run</code> now accepts <code class="language-plaintext highlighter-rouge">t</code> as the first parameter.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">t</span> <span class="o">=</span> <span class="p">{</span>
  <span class="n">uid</span> <span class="o">:</span> <span class="kt">int</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">init</span> <span class="bp">()</span> <span class="o">=</span> <span class="p">{</span> <span class="n">uid</span> <span class="o">=</span> <span class="mi">0</span> <span class="p">}</span>

<span class="k">let</span> <span class="n">run</span> <span class="n">t</span> <span class="n">s</span> <span class="o">=</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Linux container user id %i says '%s'</span><span class="se">\n</span><span class="s2">"</span> <span class="n">t</span><span class="o">.</span><span class="n">uid</span> <span class="n">s</span>
</code></pre></div></div>

<p>In a similar vein, <code class="language-plaintext highlighter-rouge">bin/windows.ml</code> is updated like this</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">t</span> <span class="o">=</span> <span class="p">{</span>
  <span class="n">username</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">init</span> <span class="bp">()</span> <span class="o">=</span> <span class="p">{</span> <span class="n">username</span> <span class="o">=</span> <span class="s2">"ContainerAdministrator"</span> <span class="p">}</span>

<span class="k">let</span> <span class="n">run</span> <span class="n">t</span> <span class="n">s</span> <span class="o">=</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Windows container user name %s says '%s'</span><span class="se">\n</span><span class="s2">"</span> <span class="n">t</span><span class="o">.</span><span class="n">username</span> <span class="n">s</span>
</code></pre></div></div>

<p>And finally, in <code class="language-plaintext highlighter-rouge">bin/main.ml</code> we run <code class="language-plaintext highlighter-rouge">Container.init ()</code> and use the returned type as a parameter to <code class="language-plaintext highlighter-rouge">Container.run</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">container</span> <span class="o">=</span> <span class="k">if</span> <span class="nn">Sys</span><span class="p">.</span><span class="n">win32</span> <span class="k">then</span> <span class="p">(</span><span class="k">module</span> <span class="nc">Windows</span> <span class="o">:</span> <span class="nn">S</span><span class="p">.</span><span class="nc">CONTAINER</span><span class="p">)</span> <span class="k">else</span> <span class="p">(</span><span class="k">module</span> <span class="nc">Linux</span> <span class="o">:</span> <span class="nn">S</span><span class="p">.</span><span class="nc">CONTAINER</span><span class="p">)</span>

<span class="k">module</span> <span class="nc">Container</span> <span class="o">=</span> <span class="p">(</span><span class="k">val</span> <span class="n">container</span><span class="p">)</span>

<span class="k">let</span> <span class="n">c</span> <span class="o">=</span> <span class="nn">Container</span><span class="p">.</span><span class="n">init</span> <span class="bp">()</span>
<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="nn">Container</span><span class="p">.</span><span class="n">run</span> <span class="n">c</span> <span class="s2">"Hello, World!"</span>
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[In my OCaml project, I’d like to abstract away the details of running containers into specific modules based on the OS. Currently, I have working container setups for Windows and Linux, and I’ve haphazardly peppered if Sys.win32 then where I need differentiation, but this is OCaml, so let us use functors!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/hot-functors.png" /><media:content medium="image" url="https://www.tunbury.org/images/hot-functors.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Containerd on Windows</title><link href="https://www.tunbury.org/2025/06/27/windows-containerd-3/" rel="alternate" type="text/html" title="Containerd on Windows" /><published>2025-06-27T12:00:00+00:00</published><updated>2025-06-27T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/27/windows-containerd-3</id><content type="html" xml:base="https://www.tunbury.org/2025/06/27/windows-containerd-3/"><![CDATA[<p>Everything was going fine until I ran out of disk space. My NVMe, <code class="language-plaintext highlighter-rouge">C:</code> drive, is only 256GB, but I have a large, 1.7TB SSD available as <code class="language-plaintext highlighter-rouge">D:</code>. How trivial, change a few paths and carry on, but it wasn’t that simple, or was it?</p>

<p>Distilling the problem down to the minimum and excluding all code written by me, the following command fails, but changing <code class="language-plaintext highlighter-rouge">src=d:\cache\opam</code> to <code class="language-plaintext highlighter-rouge">src=c:\cache\opam</code> works. It’s not the content, as it’s just an empty folder.</p>

<pre><code class="language-cmd">ctr run --rm --cni -user ContainerAdministrator -mount type=bind,src=d:\cache\opam,dst=c:\Users\ContainerAdministrator\AppData\Local\opam mcr.microsoft.com/windows/servercore:ltsc2022 my-container  cmd /c "curl.exe -L -o c:\Windows\opam.exe https://github.com/ocaml/opam/releases/download/2.3.0/opam-2.3.0-x86_64-windows.exe &amp;&amp; opam.exe init --debug-level=3 -y"
</code></pre>

<p>The failure point is the ability to create the lock file <code class="language-plaintext highlighter-rouge">config.lock</code>. Checking the code, the log entry is written before the lock is acquired. If <code class="language-plaintext highlighter-rouge">c:\Users\ContainerAdministrator\AppData\Local\opam</code> is not a bind mount, or the bind mount is on <code class="language-plaintext highlighter-rouge">C:</code>, then it works.</p>
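<p>For reference, this kind of lock acquisition boils down to an advisory file lock on the lock file. A minimal POSIX-style sketch — not opam’s actual code, which has its own Windows compatibility layer — using the <code class="language-plaintext highlighter-rouge">Unix</code> library:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Minimal sketch: create the lock file, block until a write lock
   is granted, run f, and release the lock by closing the fd. *)
let with_write_lock path f =
  let fd = Unix.openfile path [ Unix.O_RDWR; Unix.O_CREAT ] 0o644 in
  Fun.protect
    ~finally:(fun () -&gt; Unix.close fd)
    (fun () -&gt;
      Unix.lockf fd Unix.F_LOCK 0;  (* blocks until granted *)
      f ())
</code></pre></div></div>

<p>On the bind mount from <code class="language-plaintext highlighter-rouge">D:</code>, it is this lock-file creation step that fails inside the container.</p>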

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>01:26.722  CLIENT                          updating repository state
01:26.722  GSTATE                          LOAD-GLOBAL-STATE @ C:\Users\ContainerAdministrator\AppData\Local\opam
01:26.723  SYSTEM                          LOCK C:\Users\ContainerAdministrator\AppData\Local\opam\lock (none =&gt; read)
01:26.723  SYSTEM                          LOCK C:\Users\ContainerAdministrator\AppData\Local\opam\config.lock (none =&gt; write)
</code></pre></div></div>

<p>Suffice it to say, I spent a long time trying to resolve this. I’ll mention a couple of interesting points that appeared along the way. Firstly, files created on <code class="language-plaintext highlighter-rouge">D:</code> effectively appear as hard links, and the Update Sequence Number, USN, is 0.</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">C:\</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">fsutil</span><span class="w"> </span><span class="nx">file</span><span class="w"> </span><span class="nx">layout</span><span class="w"> </span><span class="nx">d:\cache\opam\lock</span><span class="w">

</span><span class="o">*********</span><span class="w"> </span><span class="n">File</span><span class="w"> </span><span class="nx">0x000400000001d251</span><span class="w"> </span><span class="o">*********</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="nx">reference</span><span class="w"> </span><span class="nx">number</span><span class="w">   </span><span class="p">:</span><span class="w"> </span><span class="nx">0x000400000001d251</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="nx">attributes</span><span class="w">         </span><span class="p">:</span><span class="w"> </span><span class="nx">0x00000020:</span><span class="w"> </span><span class="nx">Archive</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="nx">entry</span><span class="w"> </span><span class="nx">flags</span><span class="w">        </span><span class="p">:</span><span class="w"> </span><span class="nx">0x00000000</span><span class="w">
</span><span class="n">Link</span><span class="w"> </span><span class="p">(</span><span class="n">ParentID:</span><span class="w"> </span><span class="nx">Name</span><span class="p">)</span><span class="w">   </span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="n">x000c00000000002d:</span><span class="w"> </span><span class="nx">HLINK</span><span class="w"> </span><span class="nx">Name</span><span class="w">   </span><span class="p">:</span><span class="w"> </span><span class="nx">\cache\opam\lock</span><span class="w">
</span><span class="o">...</span><span class="w">
</span><span class="n">LastUsn</span><span class="w">                 </span><span class="p">:</span><span class="w"> </span><span class="nx">0</span><span class="w">
</span><span class="o">...</span><span class="w">
</span></code></pre></div></div>

<p>The reason behind this is down to Windows defaults:</p>

<ol>
  <li>Windows still likes to create the legacy 8.3 MS-DOS file names on the system volume, <code class="language-plaintext highlighter-rouge">C:</code>, which explains the difference between <code class="language-plaintext highlighter-rouge">HLINK</code> and <code class="language-plaintext highlighter-rouge">NTFS+DOS</code>. Running <code class="language-plaintext highlighter-rouge">fsutil 8dot3name set d: 0</code> will enable the creation of the old-style file names.</li>
  <li>Drive <code class="language-plaintext highlighter-rouge">C:</code> has a USN journal created automatically, as it’s required for Windows to operate, but it isn’t created by default on other drives. Running <code class="language-plaintext highlighter-rouge">fsutil usn createjournal d: m=32000000 a=8000000</code> will create the journal.</li>
</ol>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">C:\</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">fsutil</span><span class="w"> </span><span class="nx">file</span><span class="w"> </span><span class="nx">layout</span><span class="w"> </span><span class="nx">c:\cache\opam\lock</span><span class="w">

</span><span class="o">*********</span><span class="w"> </span><span class="n">File</span><span class="w"> </span><span class="nx">0x000300000002f382</span><span class="w"> </span><span class="o">*********</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="nx">reference</span><span class="w"> </span><span class="nx">number</span><span class="w">   </span><span class="p">:</span><span class="w"> </span><span class="nx">0x000300000002f382</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="nx">attributes</span><span class="w">         </span><span class="p">:</span><span class="w"> </span><span class="nx">0x00000020:</span><span class="w"> </span><span class="nx">Archive</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="nx">entry</span><span class="w"> </span><span class="nx">flags</span><span class="w">        </span><span class="p">:</span><span class="w"> </span><span class="nx">0x00000000</span><span class="w">
</span><span class="n">Link</span><span class="w"> </span><span class="p">(</span><span class="n">ParentID:</span><span class="w"> </span><span class="nx">Name</span><span class="p">)</span><span class="w">   </span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="n">x000b0000000271d1:</span><span class="w"> </span><span class="nx">NTFS</span><span class="o">+</span><span class="nx">DOS</span><span class="w"> </span><span class="nx">Name:</span><span class="w"> </span><span class="nx">\cache\opam\lock</span><span class="w">
</span><span class="o">...</span><span class="w">
</span><span class="n">LastUsn</span><span class="w">                 </span><span class="p">:</span><span class="w"> </span><span class="nx">16</span><span class="p">,</span><span class="nx">897</span><span class="p">,</span><span class="nx">595</span><span class="p">,</span><span class="nx">224</span><span class="w">
</span><span class="o">...</span><span class="w">
</span></code></pre></div></div>

<p>Sadly, neither of these insights makes any difference to my problem. I did notice that <code class="language-plaintext highlighter-rouge">containerd</code> 2.1.3 had been released, whereas I had been using 2.1.1. Upgrading didn’t fix the issue, but it did change how the network namespaces were created. More on that later.</p>

<p>I decided to both ignore the problem and try it on another machine. After all, this problem was only a problem because <em>my</em> <code class="language-plaintext highlighter-rouge">C:</code> was too small. I created a QEMU VM with a 40GB <code class="language-plaintext highlighter-rouge">C:</code> and a 1TB <code class="language-plaintext highlighter-rouge">D:</code> and installed everything, and it worked fine with the bind mount on <code class="language-plaintext highlighter-rouge">D:</code> even <em>without</em> any of the above tuning and even with <code class="language-plaintext highlighter-rouge">D:</code> formatted using ReFS, rather than NTFS.</p>

<p>Trying on another physical machine with a single large spinning disk as <code class="language-plaintext highlighter-rouge">C:</code> also worked as anticipated.</p>

<p>In both of these new installations, I used <code class="language-plaintext highlighter-rouge">containerd</code> 2.1.3 and noticed that the behaviour I had come to rely upon seemed to have changed. If you recall, in this <a href="https://www.tunbury.org/2025/06/14/windows-containerd-2/">post</a>, I <em>found</em> the network namespace GUID by running <code class="language-plaintext highlighter-rouge">ctr run</code> on a standard Windows container and then <code class="language-plaintext highlighter-rouge">ctr container info</code> in another window. This no longer worked reliably, as the namespace was removed when the container exited. Perhaps it always should have been?</p>

<p>I need to find out how to create these namespaces. PowerShell has a cmdlet <code class="language-plaintext highlighter-rouge">Get-HnsNetwork</code>, but none of the GUID values there match the currently running namespaces I observe from <code class="language-plaintext highlighter-rouge">ctr container info</code>. The source code of <a href="https://github.com/containerd/containerd">containerd</a> is on GitHub.</p>

<p>When you pass <code class="language-plaintext highlighter-rouge">--cni</code> to the <code class="language-plaintext highlighter-rouge">ctr</code> command, it populates the network namespace via <code class="language-plaintext highlighter-rouge">NewNetNS</code>. Here is a snippet from <code class="language-plaintext highlighter-rouge">cmd/ctr/commands/run/run_windows.go</code>:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                <span class="k">if</span> <span class="n">cliContext</span><span class="o">.</span><span class="n">Bool</span><span class="p">(</span><span class="s">"cni"</span><span class="p">)</span> <span class="p">{</span>
                        <span class="n">ns</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">netns</span><span class="o">.</span><span class="n">NewNetNS</span><span class="p">(</span><span class="s">""</span><span class="p">)</span>
                        <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                                <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">err</span>
                        <span class="p">}</span>
                        <span class="n">opts</span> <span class="o">=</span> <span class="nb">append</span><span class="p">(</span><span class="n">opts</span><span class="p">,</span> <span class="n">oci</span><span class="o">.</span><span class="n">WithWindowsNetworkNamespace</span><span class="p">(</span><span class="n">ns</span><span class="o">.</span><span class="n">GetPath</span><span class="p">()))</span>
                <span class="p">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">NewNetNS</code> is defined in <code class="language-plaintext highlighter-rouge">pkg/netns/netns_windows.go</code>:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">// NetNS holds network namespace for sandbox</span>
<span class="k">type</span> <span class="n">NetNS</span> <span class="k">struct</span> <span class="p">{</span>
        <span class="n">path</span> <span class="kt">string</span>
<span class="p">}</span>

<span class="c">// NewNetNS creates a network namespace for the sandbox.</span>
<span class="k">func</span> <span class="n">NewNetNS</span><span class="p">(</span><span class="n">baseDir</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">NetNS</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">temp</span> <span class="o">:=</span> <span class="n">hcn</span><span class="o">.</span><span class="n">HostComputeNamespace</span><span class="p">{}</span>
        <span class="n">hcnNamespace</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">temp</span><span class="o">.</span><span class="n">Create</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                <span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">err</span>
        <span class="p">}</span>

        <span class="k">return</span> <span class="o">&amp;</span><span class="n">NetNS</span><span class="p">{</span><span class="n">path</span><span class="o">:</span> <span class="n">hcnNamespace</span><span class="o">.</span><span class="n">Id</span><span class="p">},</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Following the thread, and cutting out a few steps in the interest of brevity, we end up in <code class="language-plaintext highlighter-rouge">vendor/github.com/Microsoft/hcsshim/hcn/zsyscall_windows.go</code> which calls a Win32 API.</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">_hcnCreateNamespace</span><span class="p">(</span><span class="n">id</span> <span class="o">*</span><span class="n">_guid</span><span class="p">,</span> <span class="n">settings</span> <span class="o">*</span><span class="kt">uint16</span><span class="p">,</span> <span class="n">namespace</span> <span class="o">*</span><span class="n">hcnNamespace</span><span class="p">,</span> <span class="n">result</span> <span class="o">**</span><span class="kt">uint16</span><span class="p">)</span> <span class="p">(</span><span class="n">hr</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">hr</span> <span class="o">=</span> <span class="n">procHcnCreateNamespace</span><span class="o">.</span><span class="n">Find</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">hr</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
                <span class="k">return</span>
        <span class="p">}</span>
        <span class="n">r0</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">_</span> <span class="o">:=</span> <span class="n">syscall</span><span class="o">.</span><span class="n">SyscallN</span><span class="p">(</span><span class="n">procHcnCreateNamespace</span><span class="o">.</span><span class="n">Addr</span><span class="p">(),</span> <span class="kt">uintptr</span><span class="p">(</span><span class="n">unsafe</span><span class="o">.</span><span class="n">Pointer</span><span class="p">(</span><span class="n">id</span><span class="p">)),</span> <span class="kt">uintptr</span><span class="p">(</span><span class="n">unsafe</span><span class="o">.</span><span class="n">Pointer</span><span class="p">(</span><span class="n">settings</span><span class="p">)),</span> <span class="kt">uintptr</span><span class="p">(</span><span class="n">unsafe</span><span class="o">.</span><span class="n">Pointer</span><span class="p">(</span><span class="n">namespace</span><span class="p">)),</span> <span class="kt">uintptr</span><span class="p">(</span><span class="n">unsafe</span><span class="o">.</span><span class="n">Pointer</span><span class="p">(</span><span class="n">result</span><span class="p">)))</span>
        <span class="k">if</span> <span class="kt">int32</span><span class="p">(</span><span class="n">r0</span><span class="p">)</span> <span class="o">&lt;</span> <span class="m">0</span> <span class="p">{</span>
                <span class="k">if</span> <span class="n">r0</span><span class="o">&amp;</span><span class="m">0x1fff0000</span> <span class="o">==</span> <span class="m">0x00070000</span> <span class="p">{</span>
                        <span class="n">r0</span> <span class="o">&amp;=</span> <span class="m">0xffff</span>
                <span class="p">}</span>
                <span class="n">hr</span> <span class="o">=</span> <span class="n">syscall</span><span class="o">.</span><span class="n">Errno</span><span class="p">(</span><span class="n">r0</span><span class="p">)</span>
        <span class="p">}</span>
        <span class="k">return</span>
<span class="p">}</span>
</code></pre></div></div>

<p>PowerShell provides <code class="language-plaintext highlighter-rouge">Get-HnsNamespace</code> to list available namespaces. These <em>are</em> the <del>droids</del> values I’ve been looking for to put in <code class="language-plaintext highlighter-rouge">config.json</code>! However, by default there are no cmdlets to create them. The PowerShell installation <a href="https://github.com/microsoft/Windows-Containers/blob/Main/helpful_tools/Install-ContainerdRuntime/install-containerd-runtime.ps1">script</a> for <code class="language-plaintext highlighter-rouge">containerd</code> pulls in <a href="https://github.com/microsoft/SDN/blob/master/Kubernetes/windows/hns.psm1">hns.psm1</a>, which has a lot of interesting cmdlets, such as <code class="language-plaintext highlighter-rouge">New-HnsNetwork</code>, but no cmdlet to create a namespace. There is also <a href="https://github.com/microsoft/SDN/blob/master/Kubernetes/windows/hns.v2.psm1">hns.v2.psm1</a>, which does have <code class="language-plaintext highlighter-rouge">New-HnsNamespace</code>.</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">PS</span><span class="w"> </span><span class="nx">C:\Users\Administrator</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">curl.exe</span><span class="w"> </span><span class="nt">-o</span><span class="w"> </span><span class="nx">hns.v2.psm1</span><span class="w"> </span><span class="nt">-L</span><span class="w"> </span><span class="nx">https://raw.githubusercontent.com/microsoft/SDN/refs/heads/master/Kubernetes/windows/hns.v2.psm1</span><span class="w">
  </span><span class="o">%</span><span class="w"> </span><span class="n">Total</span><span class="w">    </span><span class="o">%</span><span class="w"> </span><span class="nx">Received</span><span class="w"> </span><span class="o">%</span><span class="w"> </span><span class="nx">Xferd</span><span class="w">  </span><span class="nx">Average</span><span class="w"> </span><span class="nx">Speed</span><span class="w">   </span><span class="nx">Time</span><span class="w">    </span><span class="nx">Time</span><span class="w">     </span><span class="nx">Time</span><span class="w">  </span><span class="nx">Current</span><span class="w">
                                 </span><span class="n">Dload</span><span class="w">  </span><span class="nx">Upload</span><span class="w">   </span><span class="nx">Total</span><span class="w">   </span><span class="nx">Spent</span><span class="w">    </span><span class="nx">Left</span><span class="w">  </span><span class="nx">Speed</span><span class="w">
</span><span class="mi">100</span><span class="w"> </span><span class="mi">89329</span><span class="w">  </span><span class="mi">100</span><span class="w"> </span><span class="mi">89329</span><span class="w">    </span><span class="mi">0</span><span class="w">     </span><span class="mi">0</span><span class="w">   </span><span class="mi">349</span><span class="n">k</span><span class="w">      </span><span class="nx">0</span><span class="w"> </span><span class="o">--</span><span class="p">:</span><span class="o">--</span><span class="p">:</span><span class="o">--</span><span class="w"> </span><span class="o">--</span><span class="p">:</span><span class="o">--</span><span class="p">:</span><span class="o">--</span><span class="w"> </span><span class="o">--</span><span class="p">:</span><span class="o">--</span><span class="p">:</span><span class="o">--</span><span class="w">  </span><span class="nx">353k</span><span class="w">

</span><span class="n">PS</span><span class="w"> </span><span class="nx">C:\Users\Administrator</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">Import-Module</span><span class="w"> </span><span class="o">.</span><span class="nx">\hns.v2.psm1</span><span class="w">
</span><span class="n">WARNING:</span><span class="w"> </span><span class="nx">The</span><span class="w"> </span><span class="nx">names</span><span class="w"> </span><span class="nx">of</span><span class="w"> </span><span class="nx">some</span><span class="w"> </span><span class="nx">imported</span><span class="w"> </span><span class="nx">commands</span><span class="w"> </span><span class="nx">from</span><span class="w"> </span><span class="nx">the</span><span class="w"> </span><span class="nx">module</span><span class="w"> </span><span class="s1">'hns.v2'</span><span class="w"> </span><span class="nx">include</span><span class="w"> </span><span class="nx">unapproved</span><span class="w"> </span><span class="nx">verbs</span><span class="w"> </span><span class="nx">that</span><span class="w"> </span><span class="nx">might</span><span class="w"> </span><span class="nx">make</span><span class="w"> </span><span class="nx">them</span><span class="w"> </span><span class="nx">less</span><span class="w"> </span><span class="nx">discoverable.</span><span class="w"> </span><span class="nx">To</span><span class="w"> </span><span class="nx">find</span><span class="w"> </span><span class="nx">the</span><span class="w"> </span><span class="nx">commands</span><span class="w"> </span><span class="nx">with</span><span class="w"> </span><span class="nx">unapproved</span><span class="w"> </span><span class="nx">verbs</span><span class="p">,</span><span class="w"> </span><span class="nx">run</span><span class="w"> </span><span class="nx">the</span><span class="w"> </span><span class="nx">Import-Module</span><span class="w"> </span><span class="nx">command</span><span class="w"> </span><span class="nx">again</span><span class="w"> </span><span class="nx">with</span><span class="w"> </span><span class="nx">the</span><span class="w"> </span><span class="nx">Verbose</span><span class="w"> </span><span class="nx">parameter.</span><span class="w"> </span><span class="nx">For</span><span 
class="w"> </span><span class="nx">a</span><span class="w"> </span><span class="nx">list</span><span class="w"> </span><span class="nx">of</span><span class="w"> </span><span class="nx">approved</span><span class="w"> </span><span class="nx">verbs</span><span class="p">,</span><span class="w"> </span><span class="nx">type</span><span class="w"> </span><span class="nx">Get-Verb.</span><span class="w">

</span><span class="n">PS</span><span class="w"> </span><span class="nx">C:\Users\Administrator</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">New-HnsNamespace</span><span class="w">
</span><span class="n">HcnCreateNamespace</span><span class="w"> </span><span class="o">--</span><span class="w"> </span><span class="nx">HRESULT:</span><span class="w"> </span><span class="nx">2151350299.</span><span class="w"> </span><span class="nx">Result:</span><span class="w"> </span><span class="p">{</span><span class="s2">"Success"</span><span class="p">:</span><span class="n">false</span><span class="p">,</span><span class="s2">"Error"</span><span class="p">:</span><span class="s2">"Invalid JSON document string. &amp;#123;&amp;#123;CreateWithCompartment,UnknownField}}"</span><span class="p">,</span><span class="s2">"ErrorCode"</span><span class="p">:</span><span class="nx">2151350299</span><span class="p">}</span><span class="w">
</span><span class="n">At</span><span class="w"> </span><span class="nx">C:\Users\Administrator\hns.v2.psm1:2392</span><span class="w"> </span><span class="nx">char:13</span><span class="w">
</span><span class="o">+</span><span class="w">             </span><span class="kr">throw</span><span class="w"> </span><span class="nv">$errString</span><span class="w">
</span><span class="o">+</span><span class="w">             </span><span class="n">~~~~~~~~~~~~~~~~</span><span class="w">
    </span><span class="o">+</span><span class="w"> </span><span class="nx">CategoryInfo</span><span class="w">          </span><span class="p">:</span><span class="w"> </span><span class="nx">OperationStopped:</span><span class="w"> </span><span class="p">(</span><span class="n">HcnCreateNamesp...de</span><span class="s2">":2151350299}:String) [], RuntimeException
    + FullyQualifiedErrorId : HcnCreateNamespace -- HRESULT: 2151350299. Result: {"</span><span class="nx">Success</span><span class="s2">":false,"</span><span class="nx">Error</span><span class="s2">":"</span><span class="nx">Invalid</span><span class="w"> </span><span class="nx">JSON</span><span class="w"> </span><span class="nx">document</span><span class="w"> </span><span class="nx">string.</span><span class="w"> </span><span class="o">&amp;</span><span class="c">#123;&amp;#123;CreateWithCompartment,UnknownField}}","ErrorCode":2151350299}</span><span class="w">
</span></code></pre></div></div>
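<p>Decoding that HRESULT makes the failure easier to search for: 2151350299 is 0x803B001B, i.e. the failure bit set with facility 0x3B, which appears to be the range hcsshim uses for HCN errors (the embedded message, “Invalid JSON document string”, is the useful part). A quick sanity check:</p>

```python
hr = 2151350299  # HRESULT reported by New-HnsNamespace

print(f"0x{hr:08X}")                          # 0x803B001B
print("failure:", bool(hr >> 31))             # severity (failure) bit is set
print("facility:", hex((hr >> 16) & 0x1FFF))  # 0x3b
print("code:", hr & 0xFFFF)                   # 27
```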

<p>With a lot of frustration, I decided to have a go at calling the Win32 API from OCaml. This resulted in <a href="https://github.com/mtelvers/hcn-namespace">mtelvers/hcn-namespace</a>, which allows me to create the namespaces by running <code class="language-plaintext highlighter-rouge">hcn-namespace create</code>. These namespaces appear in the output from <code class="language-plaintext highlighter-rouge">Get-HnsNamespace</code> and work correctly in <code class="language-plaintext highlighter-rouge">config.json</code>.</p>
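<p>For context, the field lives in the Windows section of the OCI runtime spec; a minimal <code class="language-plaintext highlighter-rouge">config.json</code> fragment might look like this, where the GUID placeholder is replaced by the value printed by <code class="language-plaintext highlighter-rouge">hcn-namespace create</code>:</p>

```json
{
  "windows": {
    "network": {
      "networkNamespace": "<GUID>"
    }
  }
}
```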

<p>Run <code class="language-plaintext highlighter-rouge">hcn-namespace.exe create</code>, and then populate <code class="language-plaintext highlighter-rouge">"networkNamespace": "&lt;GUID&gt;"</code> with the GUID provided and run with <code class="language-plaintext highlighter-rouge">ctr run --rm --cni --config config.json</code>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="containerd" /><category term="tunbury.org" /><summary type="html"><![CDATA[Everything was going fine until I ran out of disk space. My NVMe, C: drive, is only 256GB, but I have a large, 1.7TB SSD available as D:. How trivial, change a few paths and carry on, but it wasn’t that simple, or was it?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/containerd.png" /><media:content medium="image" url="https://www.tunbury.org/images/containerd.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Improve the deployment time for opam2web</title><link href="https://www.tunbury.org/2025/06/24/opam2web/" rel="alternate" type="text/html" title="Improve the deployment time for opam2web" /><published>2025-06-24T00:00:00+00:00</published><updated>2025-06-24T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/24/opam2web</id><content type="html" xml:base="https://www.tunbury.org/2025/06/24/opam2web/"><![CDATA[<p>The opam2web image for <a href="https://opam.ocaml.org">opam.ocaml.org</a> is huge, weighing in at more than 25 GB. The bulk of this data is opam archives, which are updated and copied into a stock caddy image.</p>

<p>There are two archives, <code class="language-plaintext highlighter-rouge">ocaml/opam.ocaml.org-legacy</code>, which hasn’t changed for 5 years and holds the cache for opam 1.x and <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>, which is updated weekly.</p>

<p>The current <code class="language-plaintext highlighter-rouge">Dockerfile</code> copies these files into a new layer each time opam2web builds.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">--platform=linux/amd64 ocaml/opam:archive</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="s">opam-archive</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">ocaml/opam.ocaml.org-legacy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="s">opam-legacy</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">alpine:3.20</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="s">opam2web</span>
...
<span class="k">COPY</span><span class="s"> --from=opam-legacy . /www</span>
...
<span class="k">RUN </span><span class="nt">--mount</span><span class="o">=</span><span class="nb">type</span><span class="o">=</span><span class="nb">bind</span>,target<span class="o">=</span>/cache,from<span class="o">=</span>opam-archive rsync <span class="nt">-aH</span> /cache/cache/ /www/cache/
...
</code></pre></div></div>

<p>And later, the entire <code class="language-plaintext highlighter-rouge">/www</code> structure is copied into a <code class="language-plaintext highlighter-rouge">caddy:2.8.4</code> image.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="s"> caddy:2.8.4</span>
<span class="k">WORKDIR</span><span class="s"> /srv</span>
<span class="k">COPY</span><span class="s"> --from=opam2web /www /usr/share/caddy</span>
<span class="k">COPY</span><span class="s"> Caddyfile /etc/caddy/Caddyfile</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["caddy", "run", "--config", "/etc/caddy/Caddyfile", "--adapter", "caddyfile"]</span>
</code></pre></div></div>

<p>This method is considered “best practice” when creating Docker images, but in this case, it produces a very large image, which takes a long time to deploy.</p>

<p>For Docker to use an existing layer, we need the final <code class="language-plaintext highlighter-rouge">FROM ...</code> to be the layer we want to use as the base. In the above snippet, the <code class="language-plaintext highlighter-rouge">caddy:2.8.4</code> layer will be the base layer and will be reused.</p>
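<p>Flipping that around, if the large archive image is the final <code class="language-plaintext highlighter-rouge">FROM</code>, it becomes the shared base and a deploy only needs to pull the thin layers on top. A rough sketch of the shape (the real Dockerfile needs more tooling than just caddy):</p>

```dockerfile
# Sketch: use the large, rarely-changing archive image as the base layer,
# so hosts that already have it reuse it, and add only the small caddy
# package and its configuration on top.
FROM ocaml/opam:archive
RUN apk add --update caddy
COPY Caddyfile /etc/caddy/Caddyfile
ENTRYPOINT ["caddy", "run", "--config", "/etc/caddy/Caddyfile", "--adapter", "caddyfile"]
```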

<p>The archive, <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>, is created by this Dockerfile, which ultimately uses <code class="language-plaintext highlighter-rouge">alpine:latest</code>.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="w"> </span><span class="s">ocaml/opam:archive</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">opam-archive</span>
<span class="k">FROM</span><span class="w"> </span><span class="s">ocurrent/opam-staging@sha256:f921cd51dda91f61a52a2c26a8a188f8618a2838e521d3e4afa3ca1da637903e</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="s">archive</span>
<span class="k">WORKDIR</span><span class="s"> /home/opam/opam-repository</span>
<span class="k">RUN </span><span class="nt">--mount</span><span class="o">=</span><span class="nb">type</span><span class="o">=</span><span class="nb">bind</span>,target<span class="o">=</span>/cache,from<span class="o">=</span>opam-archive rsync <span class="nt">-aH</span> /cache/cache/ /home/opam/opam-repository/cache/
<span class="k">RUN </span>opam admin cache <span class="nt">--link</span><span class="o">=</span>/home/opam/opam-repository/cache

<span class="k">FROM</span><span class="s"> alpine:latest</span>
<span class="k">COPY</span><span class="s"> --chown=0:0 --from=archive [ "/home/opam/opam-repository/cache", "/cache" ]</span>
</code></pre></div></div>

<p>In our opam2web build, we could use <code class="language-plaintext highlighter-rouge">FROM ocaml/opam:archive</code> and then <code class="language-plaintext highlighter-rouge">apk add caddy</code>, which would reuse the entire 15GB layer and add the few megabytes for <code class="language-plaintext highlighter-rouge">caddy</code>.</p>

<p><code class="language-plaintext highlighter-rouge">ocaml/opam.ocaml.org-legacy</code> is another 8GB. This legacy data could be integrated by adding it to <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code> in a different directory to ensure compatibility with anyone else using this image. This is <a href="https://github.com/ocurrent/docker-base-images/pull/324">PR#324</a>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">let</span> <span class="n">install_package_archive</span> <span class="n">opam_image</span> <span class="o">=</span>
   <span class="k">let</span> <span class="k">open</span> <span class="nc">Dockerfile</span> <span class="k">in</span>
<span class="o">+</span>  <span class="n">from</span> <span class="o">~</span><span class="n">alias</span><span class="o">:</span><span class="s2">"opam-legacy"</span> <span class="s2">"ocaml/opam.ocaml.org-legacy"</span> <span class="o">@@</span>
   <span class="n">from</span> <span class="o">~</span><span class="n">alias</span><span class="o">:</span><span class="s2">"opam-archive"</span> <span class="s2">"ocaml/opam:archive"</span> <span class="o">@@</span>
   <span class="n">from</span> <span class="o">~</span><span class="n">alias</span><span class="o">:</span><span class="s2">"archive"</span> <span class="n">opam_image</span> <span class="o">@@</span>
   <span class="n">workdir</span> <span class="s2">"/home/opam/opam-repository"</span> <span class="o">@@</span>
   <span class="n">run</span> <span class="o">~</span><span class="n">mounts</span><span class="o">:</span><span class="p">[</span><span class="n">mount_bind</span> <span class="o">~</span><span class="n">target</span><span class="o">:</span><span class="s2">"/cache"</span> <span class="o">~</span><span class="n">from</span><span class="o">:</span><span class="s2">"opam-archive"</span> <span class="bp">()</span><span class="p">]</span> <span class="s2">"rsync -aH /cache/cache/ /home/opam/opam-repository/cache/"</span> <span class="o">@@</span>
   <span class="n">run</span> <span class="s2">"opam admin cache --link=/home/opam/opam-repository/cache"</span> <span class="o">@@</span>
   <span class="n">from</span> <span class="s2">"alpine:latest"</span> <span class="o">@@</span>
<span class="o">+</span>  <span class="n">copy</span> <span class="o">~</span><span class="n">chown</span><span class="o">:</span><span class="s2">"0:0"</span> <span class="o">~</span><span class="n">from</span><span class="o">:</span><span class="s2">"opam-legacy"</span> <span class="o">~</span><span class="n">src</span><span class="o">:</span><span class="p">[</span><span class="s2">"/"</span><span class="p">]</span> <span class="o">~</span><span class="n">dst</span><span class="o">:</span><span class="s2">"/legacy"</span> <span class="bp">()</span> <span class="o">@@</span>
   <span class="n">copy</span> <span class="o">~</span><span class="n">chown</span><span class="o">:</span><span class="s2">"0:0"</span> <span class="o">~</span><span class="n">from</span><span class="o">:</span><span class="s2">"archive"</span> <span class="o">~</span><span class="n">src</span><span class="o">:</span><span class="p">[</span><span class="s2">"/home/opam/opam-repository/cache"</span><span class="p">]</span> <span class="o">~</span><span class="n">dst</span><span class="o">:</span><span class="s2">"/cache"</span> <span class="bp">()</span>
</code></pre></div></div>

<p>Finally, we need to update <a href="https://github.com/ocaml-opam/opam2web">opam2web</a> to use <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code> as the base layer rather than <code class="language-plaintext highlighter-rouge">caddy:2.8.4</code>, resulting in the final part of the <code class="language-plaintext highlighter-rouge">Dockerfile</code> looking like this.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="s"> ocaml/opam:archive</span>
<span class="k">RUN </span>apk add <span class="nt">--update</span> git curl rsync libstdc++ rdfind caddy
<span class="k">COPY</span><span class="s"> --from=build-opam2web /opt/opam2web /usr/local</span>
<span class="k">COPY</span><span class="s"> --from=build-opam-doc /usr/bin/opam-dev /usr/local/bin/opam</span>
<span class="k">COPY</span><span class="s"> --from=build-opam-doc /opt/opam/doc /usr/local/share/opam2web/content/doc</span>
<span class="k">COPY</span><span class="s"> ext/key/opam-dev-team.pgp /www/opam-dev-pubkey.pgp</span>
<span class="k">ADD</span><span class="s"> bin/opam-web.sh /usr/local/bin</span>
<span class="k">ARG</span><span class="s"> DOMAIN=opam.ocaml.org</span>
<span class="k">ARG</span><span class="s"> OPAM_REPO_GIT_SHA=master</span>
<span class="k">ARG</span><span class="s"> BLOG_GIT_SHA=master</span>
<span class="k">RUN </span><span class="nb">echo</span> <span class="k">${</span><span class="nv">OPAM_REPO_GIT_SHA</span><span class="k">}</span> <span class="o">&gt;&gt;</span> /www/opam_git_sha
<span class="k">RUN </span><span class="nb">echo</span> <span class="k">${</span><span class="nv">BLOG_GIT_SHA</span><span class="k">}</span> <span class="o">&gt;&gt;</span> /www/blog_git_sha
<span class="k">RUN </span>/usr/local/bin/opam-web.sh <span class="k">${</span><span class="nv">DOMAIN</span><span class="k">}</span> <span class="k">${</span><span class="nv">OPAM_REPO_GIT_SHA</span><span class="k">}</span> <span class="k">${</span><span class="nv">BLOG_GIT_SHA</span><span class="k">}</span>
<span class="k">WORKDIR</span><span class="s"> /srv</span>
<span class="k">COPY</span><span class="s"> Caddyfile /etc/caddy/Caddyfile</span>
<span class="k">ENTRYPOINT</span><span class="s"> ["caddy", "run", "--config", "/etc/caddy/Caddyfile", "--adapter", "caddyfile"]</span>
</code></pre></div></div>

<p>I acknowledge that this final image now contains some extra, unneeded packages, such as <code class="language-plaintext highlighter-rouge">git</code> and <code class="language-plaintext highlighter-rouge">curl</code>, but this seems a minor inconvenience.</p>

<p>The <code class="language-plaintext highlighter-rouge">Caddyfile</code> can be adjusted to make everything still appear to be in the same place:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>:80 {
	redir /install.sh https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh
	redir /install.ps1 https://raw.githubusercontent.com/ocaml/opam/master/shell/install.ps1

	@version_paths path /1.1/* /1.2.0/* /1.2.2/*
	handle @version_paths {
		root * /legacy
		file_server
	}

	handle /cache/* {
		root * /
		file_server
	}

	handle {
		root * /www
		file_server
	}
}
</code></pre></div></div>

<p>In this configuration, the Docker <em>push</em> is only 650MB rather than 25GB.</p>

<p>The changes to opam2web are in <a href="https://github.com/ocaml-opam/opam2web/pull/245">PR#245</a>.</p>

<p>Test with some external URLs:</p>

<ul>
  <li><a href="https://staging.opam.ocaml.org/index.tar.gz">https://staging.opam.ocaml.org/index.tar.gz</a></li>
  <li><a href="https://staging.opam.ocaml.org/archives/0install.2.18/0install-2.18.tbz">https://staging.opam.ocaml.org/archives/0install.2.18/0install-2.18.tbz</a></li>
  <li><a href="https://staging.opam.ocaml.org/cache/0install.2.18/0install-2.18.tbz">https://staging.opam.ocaml.org/cache/0install.2.18/0install-2.18.tbz</a></li>
  <li><a href="https://staging.opam.ocaml.org/1.2.2/archives/0install.2.12.3+opam.tar.gz">https://staging.opam.ocaml.org/1.2.2/archives/0install.2.12.3+opam.tar.gz</a></li>
  <li><a href="https://staging.opam.ocaml.org/1.2.0/archives/0install.2.12.1+opam.tar.gz">https://staging.opam.ocaml.org/1.2.0/archives/0install.2.12.1+opam.tar.gz</a></li>
  <li><a href="https://staging.opam.ocaml.org/1.1/archives/0install.2.10+opam.tar.gz">https://staging.opam.ocaml.org/1.1/archives/0install.2.10+opam.tar.gz</a></li>
  <li><a href="https://staging.opam.ocaml.org/opam_git_sha">https://staging.opam.ocaml.org/opam_git_sha</a></li>
  <li><a href="https://staging.opam.ocaml.org/blog_git_sha">https://staging.opam.ocaml.org/blog_git_sha</a></li>
  <li><a href="https://staging.opam.ocaml.org/opam-dev-pubkey.pgp">https://staging.opam.ocaml.org/opam-dev-pubkey.pgp</a></li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[The opam2web image for opam.ocaml.org is huge, weighing in at more than 25 GB. The bulk of this data is opam archives, which are updated and copied into a stock caddy image.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Transitive Reduction of Package Graph</title><link href="https://www.tunbury.org/2025/06/23/transitive-reduction/" rel="alternate" type="text/html" title="Transitive Reduction of Package Graph" /><published>2025-06-23T00:00:00+00:00</published><updated>2025-06-23T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/23/transitive-reduction</id><content type="html" xml:base="https://www.tunbury.org/2025/06/23/transitive-reduction/"><![CDATA[<p>I have previously written about using a <a href="https://www.tunbury.org/topological-sort/">topological sort</a> of a directed acyclic graph (DAG) of package dependencies to create an ordered list of installation operations. I now want to create a transitive reduction, giving a graph with the same vertices and the same reachability, but with the fewest edges possible.</p>

<p>This is interesting for opam, where a typical package is defined to depend upon both OCaml and Dune. However, Dune itself depends upon OCaml, so, minimally, the package need only depend upon Dune. In opam we would typically list both anyway, as each may carry its own version constraints.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">depends</span><span class="pi">:</span> <span class="pi">[</span>
  <span class="s2">"</span><span class="s">dune"</span> <span class="pi">{</span><span class="err">&gt;</span><span class="nv">= "3.17"</span><span class="pi">}</span>
  <span class="s2">"</span><span class="s">ocaml"</span>
<span class="pi">]</span>
</code></pre></div></div>

<p>Given a topologically sorted list of packages, we can fold over the list to build a map from each package to its full set of transitive dependencies. As each package is considered in turn, it either has no dependencies or each of its dependencies is already in the map, so their dependency sets can simply be unioned in.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">pkg_deps</span> <span class="n">solution</span> <span class="o">=</span>
  <span class="nn">List</span><span class="p">.</span><span class="n">fold_left</span> <span class="p">(</span><span class="k">fun</span> <span class="n">map</span> <span class="n">pkg</span> <span class="o">-&gt;</span>
    <span class="k">let</span> <span class="n">deps_direct</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">find</span> <span class="n">pkg</span> <span class="n">solution</span> <span class="k">in</span>
    <span class="k">let</span> <span class="n">deps_plus_children</span> <span class="o">=</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">pkg</span> <span class="n">acc</span> <span class="o">-&gt;</span>
      <span class="nn">PackageSet</span><span class="p">.</span><span class="n">union</span> <span class="n">acc</span> <span class="p">(</span><span class="nn">PackageMap</span><span class="p">.</span><span class="n">find</span> <span class="n">pkg</span> <span class="n">map</span><span class="p">))</span> <span class="n">deps_direct</span> <span class="n">deps_direct</span> <span class="k">in</span>
    <span class="nn">PackageMap</span><span class="p">.</span><span class="n">add</span> <span class="n">pkg</span> <span class="n">deps_plus_children</span> <span class="n">map</span><span class="p">)</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">empty</span><span class="p">;;</span>
</code></pre></div></div>

<p>To generate the transitive reduction, take each package’s set of direct dependencies and remove every dependency that is already reachable through the transitive dependencies of one of the other direct dependencies.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">reduce</span> <span class="n">dependencies</span> <span class="o">=</span>
  <span class="nn">PackageMap</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="k">fun</span> <span class="n">u</span> <span class="o">-&gt;</span>
    <span class="nn">PackageSet</span><span class="p">.</span><span class="n">filter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">v</span> <span class="o">-&gt;</span>
      <span class="k">let</span> <span class="n">others</span> <span class="o">=</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">remove</span> <span class="n">v</span> <span class="n">u</span> <span class="k">in</span>
      <span class="nn">PackageSet</span><span class="p">.</span><span class="n">fold</span> <span class="p">(</span><span class="k">fun</span> <span class="n">o</span> <span class="n">acc</span> <span class="o">-&gt;</span>
        <span class="n">acc</span> <span class="o">||</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">mem</span> <span class="n">v</span> <span class="p">(</span><span class="nn">PackageMap</span><span class="p">.</span><span class="n">find</span> <span class="n">o</span> <span class="n">dependencies</span><span class="p">)</span>
      <span class="p">)</span> <span class="n">others</span> <span class="bp">false</span> <span class="o">|&gt;</span> <span class="n">not</span>
    <span class="p">)</span> <span class="n">u</span>
  <span class="p">);;</span>
</code></pre></div></div>

<p>Let’s create a quick print function and then test the code:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">print</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">iter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">p</span> <span class="n">deps</span> <span class="o">-&gt;</span>
  <span class="n">print_endline</span> <span class="p">(</span><span class="n">p</span> <span class="o">^</span> <span class="s2">": "</span> <span class="o">^</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.</span><span class="n">to_list</span> <span class="n">deps</span> <span class="o">|&gt;</span> <span class="nn">String</span><span class="p">.</span><span class="n">concat</span> <span class="s2">","</span><span class="p">))</span>
<span class="p">);;</span>
</code></pre></div></div>

<p>The original solution is</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="n">print</span> <span class="n">dune</span><span class="p">;;</span>
<span class="n">base</span><span class="o">-</span><span class="n">threads</span><span class="o">.</span><span class="n">base</span><span class="o">:</span>
<span class="n">base</span><span class="o">-</span><span class="n">unix</span><span class="o">.</span><span class="n">base</span><span class="o">:</span>
<span class="n">dune</span><span class="o">:</span> <span class="n">base</span><span class="o">-</span><span class="n">threads</span><span class="o">.</span><span class="n">base</span><span class="o">,</span><span class="n">base</span><span class="o">-</span><span class="n">unix</span><span class="o">.</span><span class="n">base</span><span class="o">,</span><span class="n">ocaml</span>
<span class="n">ocaml</span><span class="o">:</span> <span class="n">ocaml</span><span class="o">-</span><span class="n">config</span><span class="o">,</span><span class="n">ocaml</span><span class="o">-</span><span class="n">variants</span>
<span class="n">ocaml</span><span class="o">-</span><span class="n">config</span><span class="o">:</span> <span class="n">ocaml</span><span class="o">-</span><span class="n">variants</span>
<span class="n">ocaml</span><span class="o">-</span><span class="n">variants</span><span class="o">:</span>
<span class="o">-</span> <span class="o">:</span> <span class="kt">unit</span> <span class="o">=</span> <span class="bp">()</span>
</code></pre></div></div>

<p>And the reduced solution is:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">let</span> <span class="n">dependencies</span> <span class="o">=</span> <span class="n">pkg_deps</span> <span class="n">dune</span> <span class="p">(</span><span class="n">topological_sort</span> <span class="n">dune</span><span class="p">);;</span>
<span class="k">val</span> <span class="n">dependencies</span> <span class="o">:</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">t</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">t</span> <span class="o">=</span> <span class="o">&lt;</span><span class="n">abstr</span><span class="o">&gt;</span>
<span class="o">#</span> <span class="n">print</span> <span class="p">(</span><span class="n">reduce</span> <span class="n">dependencies</span> <span class="n">dune</span><span class="p">);;</span>
<span class="n">base</span><span class="o">-</span><span class="n">threads</span><span class="o">.</span><span class="n">base</span><span class="o">:</span>
<span class="n">base</span><span class="o">-</span><span class="n">unix</span><span class="o">.</span><span class="n">base</span><span class="o">:</span>
<span class="n">dune</span><span class="o">:</span> <span class="n">base</span><span class="o">-</span><span class="n">threads</span><span class="o">.</span><span class="n">base</span><span class="o">,</span><span class="n">base</span><span class="o">-</span><span class="n">unix</span><span class="o">.</span><span class="n">base</span><span class="o">,</span><span class="n">ocaml</span>
<span class="n">ocaml</span><span class="o">:</span> <span class="n">ocaml</span><span class="o">-</span><span class="n">config</span>
<span class="n">ocaml</span><span class="o">-</span><span class="n">config</span><span class="o">:</span> <span class="n">ocaml</span><span class="o">-</span><span class="n">variants</span>
<span class="n">ocaml</span><span class="o">-</span><span class="n">variants</span><span class="o">:</span>
<span class="o">-</span> <span class="o">:</span> <span class="kt">unit</span> <span class="o">=</span> <span class="bp">()</span>
</code></pre></div></div>

<p>This doesn’t look like much of a difference, but when applied to a larger graph, for example, 0install.2.18, the reduction is quite dramatic.</p>
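<p>To experiment with these functions outside the original project, the pieces above can be condensed into a self-contained sketch. The five-package graph below is an illustrative stand-in (not real solver output), and the module definitions are the obvious string-based instantiations of <code class="language-plaintext highlighter-rouge">PackageMap</code> and <code class="language-plaintext highlighter-rouge">PackageSet</code>:</p>

```ocaml
(* Self-contained sketch of pkg_deps/reduce over a hypothetical graph. *)
module PackageSet = Set.Make (String)
module PackageMap = Map.Make (String)

(* Direct dependencies of each package, mirroring the dune/ocaml example. *)
let solution =
  PackageMap.of_seq
    (List.to_seq
       [ ("ocaml-variants", PackageSet.empty);
         ("ocaml-config", PackageSet.of_list [ "ocaml-variants" ]);
         ("ocaml", PackageSet.of_list [ "ocaml-config"; "ocaml-variants" ]);
         ("dune", PackageSet.of_list [ "ocaml" ]);
         ("pkg", PackageSet.of_list [ "dune"; "ocaml" ]) ])

(* A topological order: every package appears after its dependencies. *)
let order = [ "ocaml-variants"; "ocaml-config"; "ocaml"; "dune"; "pkg" ]

(* Transitive closure: folding in topological order guarantees that each
   dependency's own closure is already in the map. *)
let pkg_deps solution order =
  List.fold_left
    (fun map pkg ->
      let direct = PackageMap.find pkg solution in
      let all =
        PackageSet.fold
          (fun p acc -> PackageSet.union acc (PackageMap.find p map))
          direct direct
      in
      PackageMap.add pkg all map)
    PackageMap.empty order

(* Drop any edge u -> v that is reachable via another direct dependency. *)
let reduce dependencies solution =
  PackageMap.map
    (fun u ->
      PackageSet.filter
        (fun v ->
          not
            (PackageSet.exists
               (fun o -> PackageSet.mem v (PackageMap.find o dependencies))
               (PackageSet.remove v u)))
        u)
    solution

let () =
  let reduced = reduce (pkg_deps solution order) solution in
  PackageMap.iter
    (fun p deps ->
      print_endline
        (p ^ ": " ^ String.concat "," (PackageSet.elements deps)))
    reduced
```

<p>Running this prints <code class="language-plaintext highlighter-rouge">pkg: dune</code> rather than <code class="language-plaintext highlighter-rouge">pkg: dune,ocaml</code>: the direct edge from <code class="language-plaintext highlighter-rouge">pkg</code> to <code class="language-plaintext highlighter-rouge">ocaml</code> is removed because <code class="language-plaintext highlighter-rouge">ocaml</code> is already reachable through <code class="language-plaintext highlighter-rouge">dune</code>.</p>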

<p>Initial graph</p>

<p><img src="/images/0install-graph.png" alt="opam installation graph for 0install" /></p>

<p>Transitive reduction</p>

<p><img src="/images/0install-reduced-graph.png" alt="Transitive reduction of the opam installation graph for 0install" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[I have previously written about using a topological sort of a directed acyclic graph (DAG) of package dependencies to create an ordered list of installation operations. I now want to create a transitive reduction, giving a graph with the same vertices and the fewest number of edges possible.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/dune-graph.png" /><media:content medium="image" url="https://www.tunbury.org/images/dune-graph.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Surprising C++ failures in the macOS workers</title><link href="https://www.tunbury.org/2025/06/21/macos-sequoia-include-path/" rel="alternate" type="text/html" title="Surprising C++ failures in the macOS workers" /><published>2025-06-21T00:00:00+00:00</published><updated>2025-06-21T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/21/macos-sequoia-include-path</id><content type="html" xml:base="https://www.tunbury.org/2025/06/21/macos-sequoia-include-path/"><![CDATA[<p>@mseri raised <a href="https://github.com/ocaml/infrastructure/issues/175">issue #175</a> as the macOS workers cannot find the most basic C++ headers. I easily eliminated <a href="https://github.com/ocurrent/obuilder">Obuilder</a>, as <code class="language-plaintext highlighter-rouge">opam install mccs.1.1+19</code> didn’t work on the macOS workers natively.</p>

<p>At face value, the problem appears pretty common, and there are numerous threads on <a href="https://stackoverflow.com">Stack Overflow</a>, such as this <a href="https://stackoverflow.com/questions/77250743/mac-xcode-g-cannot-compile-even-a-basic-c-program-issues-with-standard-libr">one</a>; however, the resolutions I tried didn’t work. I was reluctant to try some of the more intrusive changes, like creating a symlink for every header from <code class="language-plaintext highlighter-rouge">/usr/include/</code> to <code class="language-plaintext highlighter-rouge">/Library/Developer/CommandLineTools/usr/include/c++/v1</code>, as this doesn’t seem to be what Apple intends.</p>

<p>For the record, a program such as this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="k">using</span> <span class="k">namespace</span> <span class="n">std</span><span class="p">;</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Hello World!"</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Fails like this:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% c++ hello.cpp <span class="nt">-o</span> hello <span class="nt">-v</span>
Apple clang version 17.0.0 <span class="o">(</span>clang-1700.0.13.3<span class="o">)</span>
Target: x86_64-apple-darwin24.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
 <span class="s2">"/Library/Developer/CommandLineTools/usr/bin/clang"</span> <span class="nt">-cc1</span> <span class="nt">-triple</span> x86_64-apple-macosx15.0.0 <span class="nt">-Wundef-prefix</span><span class="o">=</span>TARGET_OS_ <span class="nt">-Wdeprecated-objc-isa-usage</span> <span class="nt">-Werror</span><span class="o">=</span>deprecated-objc-isa-usage <span class="nt">-Werror</span><span class="o">=</span>implicit-function-declaration <span class="nt">-emit-obj</span> <span class="nt">-dumpdir</span> hello- <span class="nt">-disable-free</span> <span class="nt">-clear-ast-before-backend</span> <span class="nt">-disable-llvm-verifier</span> <span class="nt">-discard-value-names</span> <span class="nt">-main-file-name</span> hello.cpp <span class="nt">-mrelocation-model</span> pic <span class="nt">-pic-level</span> 2 <span class="nt">-mframe-pointer</span><span class="o">=</span>all <span class="nt">-fno-strict-return</span> <span class="nt">-ffp-contract</span><span class="o">=</span>on <span class="nt">-fno-rounding-math</span> <span class="nt">-funwind-tables</span><span class="o">=</span>2 <span class="nt">-target-sdk-version</span><span class="o">=</span>15.4 <span class="nt">-fvisibility-inlines-hidden-static-local-var</span> <span class="nt">-fdefine-target-os-macros</span> <span class="nt">-fno-assume-unique-vtables</span> <span class="nt">-fno-modulemap-allow-subdirectory-search</span> <span class="nt">-target-cpu</span> penryn <span class="nt">-tune-cpu</span> generic <span class="nt">-debugger-tuning</span><span class="o">=</span>lldb <span class="nt">-fdebug-compilation-dir</span><span class="o">=</span>/Users/administrator/x <span class="nt">-target-linker-version</span> 1167.4.1 <span class="nt">-v</span> <span class="nt">-fcoverage-compilation-dir</span><span class="o">=</span>/Users/administrator/x <span class="nt">-resource-dir</span> /Library/Developer/CommandLineTools/usr/lib/clang/17 <span class="nt">-isysroot</span> 
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk <span class="nt">-internal-isystem</span> /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1 <span class="nt">-internal-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include <span class="nt">-internal-isystem</span> /Library/Developer/CommandLineTools/usr/lib/clang/17/include <span class="nt">-internal-externc-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include <span class="nt">-internal-externc-isystem</span> /Library/Developer/CommandLineTools/usr/include <span class="nt">-Wno-reorder-init-list</span> <span class="nt">-Wno-implicit-int-float-conversion</span> <span class="nt">-Wno-c99-designator</span> <span class="nt">-Wno-final-dtor-non-final-class</span> <span class="nt">-Wno-extra-semi-stmt</span> <span class="nt">-Wno-misleading-indentation</span> <span class="nt">-Wno-quoted-include-in-framework-header</span> <span class="nt">-Wno-implicit-fallthrough</span> <span class="nt">-Wno-enum-enum-conversion</span> <span class="nt">-Wno-enum-float-conversion</span> <span class="nt">-Wno-elaborated-enum-base</span> <span class="nt">-Wno-reserved-identifier</span> <span class="nt">-Wno-gnu-folding-constant</span> <span class="nt">-fdeprecated-macro</span> <span class="nt">-ferror-limit</span> 19 <span class="nt">-stack-protector</span> 1 <span class="nt">-fstack-check</span> <span class="nt">-mdarwin-stkchk-strong-link</span> <span class="nt">-fblocks</span> <span class="nt">-fencode-extended-block-signature</span> <span class="nt">-fregister-global-dtors-with-atexit</span> <span class="nt">-fgnuc-version</span><span class="o">=</span>4.2.1 <span class="nt">-fno-cxx-modules</span> <span class="nt">-fskip-odr-check-in-gmf</span> <span class="nt">-fcxx-exceptions</span> <span class="nt">-fexceptions</span> <span class="nt">-fmax-type-align</span><span class="o">=</span>16 <span class="nt">-fcommon</span> <span class="nt">-fcolor-diagnostics</span> 
<span class="nt">-clang-vendor-feature</span><span class="o">=</span>+disableNonDependentMemberExprInCurrentInstantiation <span class="nt">-fno-odr-hash-protocols</span> <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+enableAggressiveVLAFolding <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+revert09abecef7bbf <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+thisNoAlignAttr <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+thisNoNullAttr <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+disableAtImportPrivateFrameworkInImplementationError <span class="nt">-D__GCC_HAVE_DWARF2_CFI_ASM</span><span class="o">=</span>1 <span class="nt">-o</span> /var/folders/sh/9c8b7hzd2wb1g2_ky78vqw5r0000gn/T/hello-a268ab.o <span class="nt">-x</span> c++ hello.cpp
clang <span class="nt">-cc1</span> version 17.0.0 <span class="o">(</span>clang-1700.0.13.3<span class="o">)</span> default target x86_64-apple-darwin24.5.0
ignoring nonexistent directory <span class="s2">"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include"</span>
ignoring nonexistent directory <span class="s2">"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/SubFrameworks"</span>
ignoring nonexistent directory <span class="s2">"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/Library/Frameworks"</span>
<span class="c">#include "..." search starts here:</span>
<span class="c">#include &lt;...&gt; search starts here:</span>
 /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1
 /Library/Developer/CommandLineTools/usr/lib/clang/17/include
 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include
 /Library/Developer/CommandLineTools/usr/include
 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks <span class="o">(</span>framework directory<span class="o">)</span>
End of search list.
hello.cpp:1:10: fatal error: <span class="s1">'iostream'</span> file not found
    1 | <span class="c">#include &lt;iostream&gt;</span>
      |          ^~~~~~~~~~
1 error generated.
</code></pre></div></div>

<p>That first folder looked strange: <code class="language-plaintext highlighter-rouge">bin/../include/c++/v1</code>. Really? What’s in there? Not much:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">ls</span> <span class="nt">-l</span> /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1
total 40
<span class="nt">-rw-r--r--</span>  1 root  wheel  44544  7 Apr  2022 __functional_03
<span class="nt">-rw-r--r--</span>  1 root  wheel   6532  7 Apr  2022 __functional_base_03
<span class="nt">-rw-r--r--</span>  1 root  wheel   2552  7 Apr  2022 __sso_allocator
</code></pre></div></div>

<p>I definitely have <code class="language-plaintext highlighter-rouge">iostream</code> on the machine:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% <span class="nb">ls</span> <span class="nt">-l</span> /Library/Developer/CommandLineTools/SDKs/MacOSX<span class="k">*</span>.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1507  8 Mar 03:36 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1391 13 Nov  2021 /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1583 13 Apr  2024 /Library/Developer/CommandLineTools/SDKs/MacOSX14.5.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1583 13 Apr  2024 /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1583 10 Nov  2024 /Library/Developer/CommandLineTools/SDKs/MacOSX15.2.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1507  8 Mar 03:36 /Library/Developer/CommandLineTools/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/iostream
<span class="nt">-rw-r--r--</span>  1 root  wheel  1507  8 Mar 03:36 /Library/Developer/CommandLineTools/SDKs/MacOSX15.sdk/usr/include/c++/v1/iostream
</code></pre></div></div>

<p>I tried the same test on my MacBook, which compiled the program without issue. However, my laptop is running Monterey, whereas the workers are running Sequoia. The <em>include</em> paths on my laptop look much better. Where are they configured?</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% c++ <span class="nt">-v</span> <span class="nt">-o</span> <span class="nb">test </span>test.cpp
Apple clang version 15.0.0 <span class="o">(</span>clang-1500.3.9.4<span class="o">)</span>
Target: x86_64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
 <span class="s2">"/Library/Developer/CommandLineTools/usr/bin/clang"</span> <span class="nt">-cc1</span> <span class="nt">-triple</span> x86_64-apple-macosx14.0.0 <span class="nt">-Wundef-prefix</span><span class="o">=</span>TARGET_OS_ <span class="nt">-Wdeprecated-objc-isa-usage</span> <span class="nt">-Werror</span><span class="o">=</span>deprecated-objc-isa-usage <span class="nt">-Werror</span><span class="o">=</span>implicit-function-declaration <span class="nt">-emit-obj</span> <span class="nt">-mrelax-all</span> <span class="nt">--mrelax-relocations</span> <span class="nt">-disable-free</span> <span class="nt">-clear-ast-before-backend</span> <span class="nt">-disable-llvm-verifier</span> <span class="nt">-discard-value-names</span> <span class="nt">-main-file-name</span> test.cpp <span class="nt">-mrelocation-model</span> pic <span class="nt">-pic-level</span> 2 <span class="nt">-mframe-pointer</span><span class="o">=</span>all <span class="nt">-fno-strict-return</span> <span class="nt">-ffp-contract</span><span class="o">=</span>on <span class="nt">-fno-rounding-math</span> <span class="nt">-funwind-tables</span><span class="o">=</span>2 <span class="nt">-target-sdk-version</span><span class="o">=</span>14.4 <span class="nt">-fvisibility-inlines-hidden-static-local-var</span> <span class="nt">-target-cpu</span> penryn <span class="nt">-tune-cpu</span> generic <span class="nt">-debugger-tuning</span><span class="o">=</span>lldb <span class="nt">-target-linker-version</span> 1053.12 <span class="nt">-v</span> <span class="nt">-fcoverage-compilation-dir</span><span class="o">=</span>/Users/mtelvers/x <span class="nt">-resource-dir</span> /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0 <span class="nt">-isysroot</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk <span class="nt">-I</span>/usr/local/include <span class="nt">-internal-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1 <span 
class="nt">-internal-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include <span class="nt">-internal-isystem</span> /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0/include <span class="nt">-internal-externc-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include <span class="nt">-internal-externc-isystem</span> /Library/Developer/CommandLineTools/usr/include <span class="nt">-Wno-reorder-init-list</span> <span class="nt">-Wno-implicit-int-float-conversion</span> <span class="nt">-Wno-c99-designator</span> <span class="nt">-Wno-final-dtor-non-final-class</span> <span class="nt">-Wno-extra-semi-stmt</span> <span class="nt">-Wno-misleading-indentation</span> <span class="nt">-Wno-quoted-include-in-framework-header</span> <span class="nt">-Wno-implicit-fallthrough</span> <span class="nt">-Wno-enum-enum-conversion</span> <span class="nt">-Wno-enum-float-conversion</span> <span class="nt">-Wno-elaborated-enum-base</span> <span class="nt">-Wno-reserved-identifier</span> <span class="nt">-Wno-gnu-folding-constant</span> <span class="nt">-fdeprecated-macro</span> <span class="nt">-fdebug-compilation-dir</span><span class="o">=</span>/Users/mtelvers/x <span class="nt">-ferror-limit</span> 19 <span class="nt">-stack-protector</span> 1 <span class="nt">-fstack-check</span> <span class="nt">-mdarwin-stkchk-strong-link</span> <span class="nt">-fblocks</span> <span class="nt">-fencode-extended-block-signature</span> <span class="nt">-fregister-global-dtors-with-atexit</span> <span class="nt">-fgnuc-version</span><span class="o">=</span>4.2.1 <span class="nt">-fno-cxx-modules</span> <span class="nt">-fcxx-exceptions</span> <span class="nt">-fexceptions</span> <span class="nt">-fmax-type-align</span><span class="o">=</span>16 <span class="nt">-fcommon</span> <span class="nt">-fcolor-diagnostics</span> <span class="nt">-clang-vendor-feature</span><span 
class="o">=</span>+disableNonDependentMemberExprInCurrentInstantiation <span class="nt">-fno-odr-hash-protocols</span> <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+enableAggressiveVLAFolding <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+revert09abecef7bbf <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+thisNoAlignAttr <span class="nt">-clang-vendor-feature</span><span class="o">=</span>+thisNoNullAttr <span class="nt">-mllvm</span> <span class="nt">-disable-aligned-alloc-awareness</span><span class="o">=</span>1 <span class="nt">-D__GCC_HAVE_DWARF2_CFI_ASM</span><span class="o">=</span>1 <span class="nt">-o</span> /var/folders/15/4zw4hb9s40b8cmff3z5bdszc0000gp/T/test-71e229.o <span class="nt">-x</span> c++ test.cpp
clang <span class="nt">-cc1</span> version 15.0.0 <span class="o">(</span>clang-1500.3.9.4<span class="o">)</span> default target x86_64-apple-darwin23.5.0
ignoring nonexistent directory <span class="s2">"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include"</span>
ignoring nonexistent directory <span class="s2">"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/Library/Frameworks"</span>
<span class="c">#include "..." search starts here:</span>
<span class="c">#include &lt;...&gt; search starts here:</span>
 /usr/local/include
 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1
 /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0/include
 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include
 /Library/Developer/CommandLineTools/usr/include
 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks <span class="o">(</span>framework directory<span class="o">)</span>
End of search list.
 <span class="s2">"/Library/Developer/CommandLineTools/usr/bin/ld"</span> <span class="nt">-demangle</span> <span class="nt">-lto_library</span> /Library/Developer/CommandLineTools/usr/lib/libLTO.dylib <span class="nt">-no_deduplicate</span> <span class="nt">-dynamic</span> <span class="nt">-arch</span> x86_64 <span class="nt">-platform_version</span> macos 14.0.0 14.4 <span class="nt">-syslibroot</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk <span class="nt">-o</span> <span class="nb">test</span> <span class="nt">-L</span>/usr/local/lib /var/folders/15/4zw4hb9s40b8cmff3z5bdszc0000gp/T/test-71e229.o <span class="nt">-lc</span>++ <span class="nt">-lSystem</span> /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0/lib/darwin/libclang_rt.osx.a
</code></pre></div></div>

<p>I’ve been meaning to upgrade my MacBook, and this looked like the perfect excuse. I updated to Sequoia and then updated the Xcode command-line tools. The test compilation worked, the paths looked good, but I had clang 1700.0.13.5, whereas the workers had 1700.0.13.3.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>% c++ <span class="nt">-v</span> <span class="nt">-o</span> <span class="nb">test </span>test.cpp
Apple clang version 17.0.0 <span class="o">(</span>clang-1700.0.13.5<span class="o">)</span>
Target: x86_64-apple-darwin24.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
</code></pre></div></div>

<p>I updated the workers to 1700.0.13.5, which didn’t make any difference. The workers still had that funny <code class="language-plaintext highlighter-rouge">/../</code> path, which wasn’t present anywhere else. I searched <code class="language-plaintext highlighter-rouge">/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1 site:stackoverflow.com</code> and the answer is the top <a href="https://stackoverflow.com/a/79606435">match</a>.</p>

<blockquote>
  <p>Rename or if you’re confident enough, delete /Library/Developer/CommandLineTools/usr/include/c++, then clang++ will automatically search headers under /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1 and find your &lt;iostream&gt; header. That directory is very likely an artifact of OS upgrade and by deleting it clang++ will realise that it should search in the header paths of new SDKs.</p>
</blockquote>

<p>I wasn’t confident, so I moved it, <code class="language-plaintext highlighter-rouge">sudo mv c++ ~</code>. With that done, the test program builds correctly! Have a read of the <a href="https://stackoverflow.com/a/79606435">answer</a> on Stack Overflow.</p>

<p>Now, rather more cavalierly, I removed the folder on all the i7 and m1 workers:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="k">for </span>a <span class="k">in</span> <span class="o">{</span>01..04<span class="o">}</span> <span class="p">;</span> <span class="k">do </span>ssh m1-worker-<span class="nv">$a</span>.macos.ci.dev <span class="nb">sudo rm</span> <span class="nt">-r</span> /Library/Developer/CommandLineTools/usr/include/c++ <span class="p">;</span> <span class="k">done</span>
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="macos,clang" /><category term="tunbury.org" /><summary type="html"><![CDATA[@mseri raised issue #175 as the macOS workers cannot find the most basic C++ headers. I easily eliminated Obuilder, as opam install mccs.1.1+19 didn’t work on the macOS workers natively.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/sequoia.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/sequoia.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Tailscale</title><link href="https://www.tunbury.org/2025/06/20/tailscale/" rel="alternate" type="text/html" title="Tailscale" /><published>2025-06-20T00:00:00+00:00</published><updated>2025-06-20T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/20/tailscale</id><content type="html" xml:base="https://www.tunbury.org/2025/06/20/tailscale/"><![CDATA[<p>On a typical day, I sit at my antique Mac Pro Trashcan with every window running SSH to some remote machine. When I’m away from home and using my MacBook, I can still SSH to those remote machines; however, with my recent Windows work, I’ve been connecting to a Dell OptiPlex on my home LAN over Remote Desktop. How can I work remotely when I want to access my Windows machine?</p>

<p>It’s the age-old issue of connecting to your home network, which is hidden behind your home broadband router with a dynamic public IP address. I could use a dynamic DNS service to track my home router and configure port forwarding, but would you open RDP to the Internet?</p>

<p>I love VNC, but the recent change in the licensing model, whereby the free tier now has only three machines, combined with frustrating performance on the low bandwidth and intermittent connections we get on train WiFi, made me try an alternative solution. Thomas has Tailscale set up in the Paris office, and I decided to create a setup for home.</p>

<p>I’d rather not install any software on my Windows machine, as I wipe it pretty frequently, and I don’t need a VPN interfering with my <code class="language-plaintext highlighter-rouge">containerd</code> implementation. However, Tailscale supports a configuration whereby you can route to local networks.</p>

<p>After signing up for a free personal account, I installed the Tailscale client on my MacBook and Mac Pro (at home). On the Mac Pro, I enabled ‘Allow Local Network Access’ and from a Terminal window, I went to <code class="language-plaintext highlighter-rouge">/Applications/Tailscale.app/Contents/MacOS</code> and ran <code class="language-plaintext highlighter-rouge">./Tailscale set --advertise-routes=192.168.0.0/24</code>. With this done, looking at the machine list on the <a href="https://login.tailscale.com/admin/machines">Tailscale console</a>, my Mac Pro lists <code class="language-plaintext highlighter-rouge">Subnets</code>. Clicking on the three dots, and opening <code class="language-plaintext highlighter-rouge">Edit route settings</code>, I could enable the advertised subnet, 192.168.0.0/24.</p>

<p>Checking <code class="language-plaintext highlighter-rouge">netstat -rn</code> on my MacBook shows that 192.168.0 is routed over the VPN.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Routing tables

Internet:
Destination        Gateway            Flags               Netif Expire
default            10.101.2.1         UGScg                 en0
default            link#36            UCSIg              utun12
10.101.2/24        link#6             UCS                   en0      !
10.101.2.1/32      link#6             UCS                   en0      !
...
192.168.0          link#36            UCS                utun12
...
</code></pre></div></div>

<p>From my MacBook, I can now use Microsoft Remote Desktop to connect to the private IP address of my Windows machine.</p>

<p>OpenSSH is an optional feature on Windows 11. It can be turned on via Settings -&gt; Apps -&gt; Optional Features, clicking “Add a feature” and installing “OpenSSH Server”. Then open Services and set the startup type for “OpenSSH SSH Server” to Automatic.</p>
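<p>The same steps can be scripted from an elevated PowerShell prompt. This is a sketch based on Microsoft’s documented commands; the capability version suffix may differ on your build:</p>

<pre><code class="language-powershell"># Install the OpenSSH Server optional feature, then have it start at boot.
Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
Set-Service -Name sshd -StartupType Automatic
Start-Service sshd
</code></pre>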

<p>It didn’t make the train WiFi any better, but connecting over SSH was pretty convenient when the bandwidth was low.</p>

<p>Note that you may want to disable key expiry on your home machine; otherwise, it might require you to reauthenticate at a critical moment.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Tailscale" /><category term="tunbury.org" /><summary type="html"><![CDATA[On a typical day, I sit at my antique Mac Pro Trashcan with every window running SSH to some remote machine. When I’m away from home and using my MacBook, I can still SSH to those remote machines; however, with my recent Windows work, I’ve been connecting to a Dell OptiPlex on my home LAN over Remote Desktop. How can I work remotely when I want to access my Windows machine?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/tailscale-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/tailscale-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Hardlinks and Reflinks on Windows</title><link href="https://www.tunbury.org/2025/06/18/windows-reflinks/" rel="alternate" type="text/html" title="Hardlinks and Reflinks on Windows" /><published>2025-06-18T00:00:00+00:00</published><updated>2025-06-18T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/18/windows-reflinks</id><content type="html" xml:base="https://www.tunbury.org/2025/06/18/windows-reflinks/"><![CDATA[<p>Who knew there was a limit on creating hard links? I didn’t even consider this until my hard links started to fail. On NTFS, the limit is 1024 links to any given file. Subsequent research shows that the limit varies between file systems, with NTFS at the lower end of the scale.</p>

<p>Here’s an excerpt from <a href="https://en.wikipedia.org/wiki/Hard_link">Wikipedia</a> on the subject.</p>

<blockquote>
  <p>In AT&amp;T Unix System 6, released in 1975, the number of hard links allowed was 127. On Unix-like systems, the in-memory counter is 4,294,967,295 (on 32-bit machines) or 18,446,744,073,709,551,615 (on 64-bit machines). In some file systems, the number of hard links is limited more strictly by their on-disk format. For example, as of Linux 3.11, the ext4 file system limits the number of hard links on a file to 65,000. Windows enforces a limit of 1024 hard links to a file on NTFS volumes.</p>
</blockquote>

<p>This restriction probably doesn’t even come close to being a practical limit for most normal use cases, but it’s worth noting that <code class="language-plaintext highlighter-rouge">git.exe</code> has 142 hard links on a standard Cygwin installation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fsutil hardlink list %LOCALAPPDATA%\opam\.cygwin\root\bin\git.exe
</code></pre></div></div>
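<p>The same counter exists on Unix-like systems and can be read with <code class="language-plaintext highlighter-rouge">stat</code>. A quick illustration (GNU coreutils syntax, so Linux rather than Windows):</p>

```shell
# Create a file plus three hard links and read the resulting link count.
tmpdir=$(mktemp -d)
touch "$tmpdir/original"
for i in 1 2 3; do ln "$tmpdir/original" "$tmpdir/link$i"; done
links=$(stat -c %h "$tmpdir/original")
echo "$links"   # prints 4: the original name plus the three links
```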

<p>Back in 2012, Microsoft released ReFS as an alternative to NTFS. The feature gap has closed over the years, with hard links being introduced in the preview of Windows Server 2022. ReFS supports 1 million hard links per file, but even more interestingly, it supports <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/block-cloning">block cloning</a>, aka <a href="https://blogs.oracle.com/linux/post/xfs-data-block-sharing-reflink">reflinks</a>, whereby files can share common data blocks. When changes are written to a block, it is copied, and its references are updated.</p>

<p>The implementation is interesting because it doesn’t work in quite the way that one would think. It can only be used to clone complete clusters. Therefore, we must first call <a href="https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_get_integrity_information">FSCTL_GET_INTEGRITY_INFORMATION</a>, which returns <a href="https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ns-winioctl-fsctl_get_integrity_information_buffer">FSCTL_GET_INTEGRITY_INFORMATION_BUFFER</a> with the cluster size in bytes.</p>

<p>Despite <a href="https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_duplicate_extents_to_file">FSCTL_DUPLICATE_EXTENTS_TO_FILE</a> taking an exact number of bytes, we must round up the file size to the next cluster boundary.</p>
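<p>The rounding itself is the usual round-up-to-a-multiple calculation; in real code the cluster size would come from <code class="language-plaintext highlighter-rouge">FSCTL_GET_INTEGRITY_INFORMATION</code>, but the arithmetic can be sketched with shell integer maths (the 4096-byte cluster size here is an assumption):</p>

```shell
# Round a file size up to the next cluster boundary before cloning.
size=23075
cluster=4096
rounded=$(( (size + cluster - 1) / cluster * cluster ))
echo "$rounded bytes ($(( rounded / cluster )) clusters)"   # 24576 bytes (6 clusters)
```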

<p>Additionally, the target file needs to exist before the clone and be large enough to hold the cloned clusters. In practice, this means calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew">CreateFileW</a> to create the file and then calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfileinformationbyhandle">SetFileInformationByHandle</a> to set the file size to match the source file (not the rounded cluster size).</p>

<p>Taking an example file of 23075 bytes, this would be rounded to 24576 bytes (6 clusters). We can use <code class="language-plaintext highlighter-rouge">fsutil file queryextents</code> to get detailed information about the clusters used in the source file:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>D:\&gt; fsutil file queryextents source.txt
VCN: 0x0        Clusters: 0x6        LCN: 0x2d3d801
</code></pre></div></div>

<p>Now we clone the file <code class="language-plaintext highlighter-rouge">ReFS-clone d:\source.txt d:\target.txt</code> and then query the extents which it uses.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>D:\&gt; fsutil file queryextents target.txt
VCN: 0x0        Clusters: 0x5        LCN: 0x2d3d801
VCN: 0x5        Clusters: 0x1        LCN: 0x2d3c801
</code></pre></div></div>

<p>The first five whole clusters are shared between the two files, while the final partial cluster has been copied. When trying to implement this, I initially used a text file of just a few bytes and couldn’t get it to clone. After I rounded up the size to 4096, the API returned successfully, but there were no shared clusters. It wasn’t until I tried a larger file with the size rounded up that I started to see actual shared clusters.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>D:\&gt;echo hello &gt; foo.txt

D:\&gt;fsutil file queryextents foo.txt
VCN: 0x0        Clusters: 0x1        LCN: 0x2d3dc04

D:\&gt;ReFS-clone.exe foo.txt bar.txt
ReFS File Clone Utility
ReFS Clone: foo.txt -&gt; bar.txt
Cluster size: 4096 bytes
File size: 8 bytes -&gt; 4096 bytes (1 clusters)
Cloning 4096 bytes...
Success!
ReFS cloning completed successfully.

D:\&gt;fsutil file queryextents bar.txt
VCN: 0x0        Clusters: 0x1        LCN: 0x2d3d807
</code></pre></div></div>

<p>The code is on GitHub in <a href="https://github.com/mtelvers/ReFS-Clone">ReFS-Clone</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,Windows" /><category term="tunbury.org" /><summary type="html"><![CDATA[Who knew there was a limit on creating hard links? I didn’t even consider this until my hard links started to fail. On NTFS, the limit is 1024 links to any given file. Subsequent research shows that the limit varies between file systems, with NTFS at the lower end of the scale.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Static linking in OCaml</title><link href="https://www.tunbury.org/2025/06/17/static-linking/" rel="alternate" type="text/html" title="Static linking in OCaml" /><published>2025-06-17T00:00:00+00:00</published><updated>2025-06-17T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/17/static-linking</id><content type="html" xml:base="https://www.tunbury.org/2025/06/17/static-linking/"><![CDATA[<p>Most of the time, you don’t think about how your file is linked. We’ve come to love dynamically linked files with their small file sizes and reduced memory requirements, but there are times when the convenience of a single binary download from a GitHub release page is really what you need.</p>

<p>To do this in OCaml, we need to add <code class="language-plaintext highlighter-rouge">-ccopt -static</code> to the <code class="language-plaintext highlighter-rouge">ocamlopt</code>. I’m building with <code class="language-plaintext highlighter-rouge">dune</code>, so I can configure that in my <code class="language-plaintext highlighter-rouge">dune</code> file using a <code class="language-plaintext highlighter-rouge">flags</code> directive.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(flags (:standard -ccopt -static))
</code></pre></div></div>
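<p>A quick way to confirm the result is to run <code class="language-plaintext highlighter-rouge">file</code> (or <code class="language-plaintext highlighter-rouge">ldd</code>) on the output. Illustrated here with a trivial C program rather than an OCaml binary, since the check is the same:</p>

```shell
# Build a statically linked binary and confirm how it was linked.
echo 'int main(void){return 0;}' > hello.c
gcc -static -o hello hello.c
file hello   # reports "statically linked" rather than "dynamically linked"
```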

<p>This can be extended for maximum compatibility by additionally adding <code class="language-plaintext highlighter-rouge">-ccopt -march=x86-64</code>, which ensures the generated code will run on any x86_64 processor and will not use newer instruction set extensions like SSE3, AVX, etc.</p>
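<p>Combining the two options, the <code class="language-plaintext highlighter-rouge">flags</code> stanza becomes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(flags (:standard -ccopt -static -ccopt -march=x86-64))
</code></pre></div></div>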

<p>So what about Windows? The MinGW toolchain accepts <code class="language-plaintext highlighter-rouge">-static</code>. Including <code class="language-plaintext highlighter-rouge">(flags (:standard -ccopt "-link -Wl,-static -v"))</code> got my options applied to my <code class="language-plaintext highlighter-rouge">dune</code> build:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x86_64-w64-mingw32-gcc -mconsole  -L. -I"C:/Users/Administrator/my-app/_opam/lib/ocaml" -I"C:\Users\Administrator\my-app\_opam\lib\mccs" -I"C:\Users\Administrator\my-app\_opam\lib\mccs\glpk/internal" -I"C:\Users\Administrator\my-app\_opam\lib\opam-core" -I"C:\Users\Administrator\my-app\_opam\lib\sha" -I"C:/Users/Administrator/my-app/_opam/lib/ocaml\flexdll" -L"C:/Users/Administrator/my-app/_opam/lib/ocaml" -L"C:\Users\Administrator\my-app\_opam\lib\mccs" -L"C:\Users\Administrator\my-app\_opam\lib\mccs\glpk/internal" -L"C:\Users\Administrator\my-app\_opam\lib\opam-core" -L"C:\Users\Administrator\my-app\_opam\lib\sha" -L"C:/Users/Administrator/my-app/_opam/lib/ocaml\flexdll" -o "bin/main.exe" "C:\Users\ADMINI~1\AppData\Local\Temp\2\build_d62d04_dune\dyndllb7e0e8.o" "@C:\Users\ADMINI~1\AppData\Local\Temp\2\build_d62d04_dune\camlrespec7816"   "-municode" "-Wl,-static"
</code></pre></div></div>

<p>However, <code class="language-plaintext highlighter-rouge">ldd</code> showed that this wasn’t working:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ldd main.exe | grep mingw
        libstdc++-6.dll =&gt; /mingw64/bin/libstdc++-6.dll (0x7ffabf3e0000)
        libgcc_s_seh-1.dll =&gt; /mingw64/bin/libgcc_s_seh-1.dll (0x7ffac3130000)
        libwinpthread-1.dll =&gt; /mingw64/bin/libwinpthread-1.dll (0x7ffac4b40000)
</code></pre></div></div>

<p>I tried <em>a lot</em> of different variations. I asked Claude… then I asked <a href="https://www.dra27.uk/blog/">@dra27</a> who recalled @kit-ty-kate working on this for opam. <a href="https://github.com/ocaml/opam/pull/5680">PR#5680</a></p>

<p>The issue is the auto-response file, which precedes my static option. We can remove that by adding <code class="language-plaintext highlighter-rouge">-noautolink</code>, but now we must do all the work by hand and build a massive command line.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(executable
 (public_name main)
 (name main)
 (flags (:standard -noautolink -cclib -lunixnat -cclib -lmccs_stubs -cclib -lmccs_glpk_stubs -cclib -lsha_stubs -cclib -lopam_core_stubs -cclib -l:libstdc++.a -cclib -l:libpthread.a -cclib -Wl,-static -cclib -ladvapi32 -cclib -lgdi32 -cclib -luser32 -cclib -lshell32 -cclib -lole32 -cclib -luuid -cclib -luserenv -cclib -lwindowsapp))
 (libraries opam-client))
</code></pre></div></div>

<p>It works, but it’s not for the faint-hearted.</p>

<p>I additionally added <code class="language-plaintext highlighter-rouge">(enabled_if (= %{os_type} Win32))</code> to my rule so it only runs on Windows.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Most of the time, you don’t think about how your file is linked. We’ve come to love dynamically linked files with their small file sizes and reduced memory requirements, but there are times when the convenience of a single binary download from a GitHub release page is really what you need.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Containerd on Windows</title><link href="https://www.tunbury.org/2025/06/14/windows-containerd-2/" rel="alternate" type="text/html" title="Containerd on Windows" /><published>2025-06-14T12:00:00+00:00</published><updated>2025-06-14T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/14/windows-containerd-2</id><content type="html" xml:base="https://www.tunbury.org/2025/06/14/windows-containerd-2/"><![CDATA[<p>If you were following along with my previous post on <a href="https://www.tunbury.org/windows-containerd/">containerd on Windows</a>, you may recall that I lamented the lack of an installer. Since then, I have found a PowerShell <a href="https://github.com/microsoft/Windows-Containers/blob/Main/helpful_tools/Install-ContainerdRuntime/install-containerd-runtime.ps1">script</a> on Microsoft’s GitHub, which does a lot of the grunt work for us.</p>

<p>Trying anything beyond my <code class="language-plaintext highlighter-rouge">echo Hello</code> test showed an immediate problem: there is no network. <code class="language-plaintext highlighter-rouge">ipconfig</code> didn’t display any network interfaces.</p>

<pre><code class="language-cmd">C:\&gt;ctr run --rm mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ipconfig

Windows IP Configuration
</code></pre>

<p>Checking the command line options, there is one called <code class="language-plaintext highlighter-rouge">--net-host</code>, which sounded promising, only for that to be immediately dashed:</p>

<pre><code class="language-cmd">C:\&gt;ctr run --rm --net-host mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ipconfig
ctr: Cannot use host mode networking with Windows containers
</code></pre>

<p>The solution is <code class="language-plaintext highlighter-rouge">--cni</code>, but more work is required to get that working. We need to download the plugins and populate them in the <code class="language-plaintext highlighter-rouge">cni/bin</code> subdirectory. Fortunately, the installation script does all of this for us but leaves it unconfigured.</p>

<pre><code class="language-cmd">C:\Windows\System32&gt;ctr run --rm --cni mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ipconfig
ctr: no network config found in C:\Program Files\containerd\cni\conf: cni plugin not initialized
</code></pre>

<p>From the top, this is how you get from a fresh install of Windows 11 to a container with networking. First, use the installation script to install <code class="language-plaintext highlighter-rouge">containerd</code>.</p>

<pre><code class="language-cmd">curl.exe https://raw.githubusercontent.com/microsoft/Windows-Containers/refs/heads/Main/helpful_tools/Install-ContainerdRuntime/install-containerd-runtime.ps1 -o install-containerd-runtime.ps1
Set-ExecutionPolicy Bypass
.\install-containerd-runtime.ps1 -ContainerDVersion 2.1.1 -WinCNIVersion 0.3.1 -ExternalNetAdapter Ethernet
</code></pre>

<p>Now create <code class="language-plaintext highlighter-rouge">C:\Program Files\containerd\cni\conf\0-containerd-nat.conf</code> containing the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
    "cniVersion": "0.3.0",
    "name": "nat",
    "type": "nat",
    "master": "Ethernet",
    "ipam": {
        "subnet": "172.20.0.0/16",
        "routes": [
            {
                "gateway": "172.20.0.1"
            }
        ]
    },
    "capabilities": {
        "portMappings": true,
        "dns": true
    }
}
</code></pre></div></div>

<p>Easy when you know how…</p>

<pre><code class="language-cmd">C:\&gt;ctr run --rm --cni mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ping 1.1.1.1

Pinging 1.1.1.1 with 32 bytes of data:
Reply from 1.1.1.1: bytes=32 time=5ms TTL=58
Reply from 1.1.1.1: bytes=32 time=7ms TTL=58
Reply from 1.1.1.1: bytes=32 time=7ms TTL=58
Reply from 1.1.1.1: bytes=32 time=6ms TTL=58

Ping statistics for 1.1.1.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 5ms, Maximum = 7ms, Average = 6ms
</code></pre>

<p>The next challenge is, what do you put in your own <code class="language-plaintext highlighter-rouge">config.json</code> to reproduce this behaviour?</p>

<p>Firstly, we need our <code class="language-plaintext highlighter-rouge">layerFolders</code>:</p>

<pre><code class="language-cmd">C:\&gt;ctr snapshot ls
KEY                                                                     PARENT KIND
sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355        Committed
</code></pre>

<pre><code class="language-cmd">C:\&gt;ctr snapshot prepare --mounts my-snapshot sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355
[
    {
        "Type": "windows-layer",
        "Source": "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\14",
        "Target": "",
        "Options": [
            "rw",
            "parentLayerPaths=[\"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\1\"]"
        ]
    }
]
</code></pre>

<p>Let’s create a <code class="language-plaintext highlighter-rouge">config.json</code> without a network stanza just to check we can create a container:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "ociVersion": "1.1.0",
  "process": {
    "terminal": false,
    "user": { "uid": 0, "gid": 0 },
    "args": [
      "cmd", "/c",
      "ipconfig &amp;&amp; ping 1.1.1.1"
    ],
    "cwd": "c:\\"
  },
  "root": { "path": "", "readonly": false },
  "hostname": "builder",
  "windows": {
    "layerFolders": [
      "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\1",
      "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\14"
    ],
    "ignoreFlushesDuringBoot": true
  }
}
</code></pre></div></div>

<p>The container runs, but there is no network as we’d expect.</p>

<pre><code class="language-cmd">C:\&gt;ctr run --rm --config config.json my-container

Windows IP Configuration


Pinging 1.1.1.1 with 32 bytes of data:
PING: transmit failed. General failure.
PING: transmit failed. General failure.
PING: transmit failed. General failure.
PING: transmit failed. General failure.
</code></pre>

<p>If we turn on CNI, it cryptically tells us what we need to do:</p>

<pre><code class="language-cmd">C:\&gt;ctr run --rm --cni --config config.json my-container
ctr: plugin type="nat" name="nat" failed (add): required env variables [CNI_NETNS] missing
</code></pre>

<p>So we need to populate the <code class="language-plaintext highlighter-rouge">network.networkNamespace</code> with the name (ID) of the network we want to use. This should be a GUID, and I don’t know how to get the right value. I would have assumed that it was one of the many GUIDs returned by <code class="language-plaintext highlighter-rouge">Get-HnsNetwork</code>, but it isn’t.</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">PS</span><span class="w"> </span><span class="nx">C:\</span><span class="err">&gt;</span><span class="w"> </span><span class="nx">Get-HnsNetwork</span><span class="w">


</span><span class="n">ActivityId</span><span class="w">             </span><span class="p">:</span><span class="w"> </span><span class="nx">92018CF0-6DCB-4AAF-A14E-DC61120FC958</span><span class="w">
</span><span class="n">AdditionalParams</span><span class="w">       </span><span class="p">:</span><span class="w">
</span><span class="n">CurrentEndpointCount</span><span class="w">   </span><span class="p">:</span><span class="w"> </span><span class="nx">0</span><span class="w">
</span><span class="n">Extensions</span><span class="w">             </span><span class="p">:</span><span class="w"> </span><span class="p">{@{</span><span class="nx">Id</span><span class="o">=</span><span class="nx">E7C3B2F0</span><span class="err">-</span><span class="nx">F3C5</span><span class="err">-</span><span class="mi">48</span><span class="nx">DF</span><span class="err">-</span><span class="nx">AF2B</span><span class="err">-</span><span class="mi">10</span><span class="nx">FED6D72E7A</span><span class="p">;</span><span class="w"> </span><span class="nx">IsEnabled</span><span class="o">=</span><span class="nx">False</span><span class="p">;</span><span class="w"> </span><span class="nx">Name</span><span class="o">=</span><span class="nx">Microsoft</span><span class="w"> </span><span class="nx">Windows</span><span class="w"> </span><span class="nx">Filtering</span><span class="w"> </span><span class="nx">Platform</span><span class="p">},</span><span class="w">
                         </span><span class="p">@{</span><span class="nx">Id</span><span class="o">=</span><span class="nx">F74F241B</span><span class="err">-</span><span class="mi">440</span><span class="nx">F</span><span class="err">-</span><span class="mi">4433</span><span class="err">-</span><span class="nx">BB28</span><span class="err">-</span><span class="mi">00</span><span class="nx">F89EAD20D8</span><span class="p">;</span><span class="w"> </span><span class="nx">IsEnabled</span><span class="o">=</span><span class="nx">False</span><span class="p">;</span><span class="w"> </span><span class="nx">Name</span><span class="o">=</span><span class="nx">Microsoft</span><span class="w"> </span><span class="nx">Azure</span><span class="w"> </span><span class="nx">VFP</span><span class="w"> </span><span class="nx">Switch</span><span class="w"> </span><span class="nx">Filter</span><span class="w"> </span><span class="nx">Extension</span><span class="p">},</span><span class="w">
                         </span><span class="p">@{</span><span class="nx">Id</span><span class="o">=</span><span class="mi">430</span><span class="nx">BDADD</span><span class="err">-</span><span class="nx">BAB0</span><span class="err">-</span><span class="mi">41</span><span class="nx">AB</span><span class="err">-</span><span class="nx">A369</span><span class="err">-</span><span class="mi">94</span><span class="nx">B67FA5BE0A</span><span class="p">;</span><span class="w"> </span><span class="nx">IsEnabled</span><span class="o">=</span><span class="nx">True</span><span class="p">;</span><span class="w"> </span><span class="nx">Name</span><span class="o">=</span><span class="nx">Microsoft</span><span class="w"> </span><span class="nx">NDIS</span><span class="w"> </span><span class="nx">Capture</span><span class="p">}}</span><span class="w">
</span><span class="n">Flags</span><span class="w">                  </span><span class="p">:</span><span class="w"> </span><span class="nx">8</span><span class="w">
</span><span class="n">Health</span><span class="w">                 </span><span class="p">:</span><span class="w"> </span><span class="p">@{</span><span class="nx">LastErrorCode</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="nx">LastUpdateTime</span><span class="o">=</span><span class="mi">133943927149605101</span><span class="p">}</span><span class="w">
</span><span class="n">ID</span><span class="w">                     </span><span class="p">:</span><span class="w"> </span><span class="nx">3EB2B18B-A1DD-46A8-A425-256F6B3DF26D</span><span class="w">
</span><span class="n">IPv6</span><span class="w">                   </span><span class="p">:</span><span class="w"> </span><span class="nx">False</span><span class="w">
</span><span class="n">LayeredOn</span><span class="w">              </span><span class="p">:</span><span class="w"> </span><span class="nx">20791F67-012C-4C9B-9C93-530FDA5DE4FA</span><span class="w">
</span><span class="n">MacPools</span><span class="w">               </span><span class="p">:</span><span class="w"> </span><span class="p">{@{</span><span class="nx">EndMacAddress</span><span class="o">=</span><span class="mi">00</span><span class="err">-</span><span class="mi">15</span><span class="err">-</span><span class="mi">5</span><span class="nx">D</span><span class="err">-</span><span class="nx">C3</span><span class="err">-</span><span class="nx">DF</span><span class="err">-</span><span class="nx">FF</span><span class="p">;</span><span class="w"> </span><span class="nx">StartMacAddress</span><span class="o">=</span><span class="mi">00</span><span class="err">-</span><span class="mi">15</span><span class="err">-</span><span class="mi">5</span><span class="nx">D</span><span class="err">-</span><span class="nx">C3</span><span class="err">-</span><span class="nx">D0</span><span class="err">-</span><span class="mi">00</span><span class="p">}}</span><span class="w">
</span><span class="n">MaxConcurrentEndpoints</span><span class="w"> </span><span class="p">:</span><span class="w"> </span><span class="nx">1</span><span class="w">
</span><span class="n">Name</span><span class="w">                   </span><span class="p">:</span><span class="w"> </span><span class="nx">nat</span><span class="w">
</span><span class="n">NatName</span><span class="w">                </span><span class="p">:</span><span class="w"> </span><span class="nx">NATAC317D6D-8A2E-4E4E-9BCF-33435FE4CD8F</span><span class="w">
</span><span class="n">Policies</span><span class="w">               </span><span class="p">:</span><span class="w"> </span><span class="p">{@{</span><span class="nx">Type</span><span class="o">=</span><span class="nx">VLAN</span><span class="p">;</span><span class="w"> </span><span class="nx">VLAN</span><span class="o">=</span><span class="mi">1</span><span class="p">}}</span><span class="w">
</span><span class="n">State</span><span class="w">                  </span><span class="p">:</span><span class="w"> </span><span class="nx">1</span><span class="w">
</span><span class="n">Subnets</span><span class="w">                </span><span class="p">:</span><span class="w"> </span><span class="p">{@{</span><span class="nx">AdditionalParams</span><span class="o">=</span><span class="p">;</span><span class="w"> </span><span class="nx">AddressPrefix</span><span class="o">=</span><span class="mf">172.20.0.0</span><span class="err">/</span><span class="mi">16</span><span class="p">;</span><span class="w"> </span><span class="nx">Flags</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="nx">GatewayAddress</span><span class="o">=</span><span class="mf">172.20.0.1</span><span class="p">;</span><span class="w"> </span><span class="nx">Health</span><span class="o">=</span><span class="p">;</span><span class="w">
                         </span><span class="nx">ID</span><span class="o">=</span><span class="mi">5</span><span class="nx">D56CE8D</span><span class="err">-</span><span class="mi">1</span><span class="nx">AD2</span><span class="err">-</span><span class="mi">47</span><span class="nx">FF</span><span class="err">-</span><span class="mi">85</span><span class="nx">A7</span><span class="err">-</span><span class="nx">A0E6D530565D</span><span class="p">;</span><span class="w"> </span><span class="nx">IpSubnets</span><span class="o">=</span><span class="nx">System</span><span class="err">.</span><span class="nx">Object</span><span class="p">[];</span><span class="w"> </span><span class="nx">ObjectType</span><span class="o">=</span><span class="mi">5</span><span class="p">;</span><span class="w"> </span><span class="nx">Policies</span><span class="o">=</span><span class="nx">System</span><span class="err">.</span><span class="nx">Object</span><span class="p">[];</span><span class="w"> </span><span class="nx">State</span><span class="o">=</span><span class="mi">0</span><span class="p">}}</span><span class="w">
</span><span class="n">SwitchGuid</span><span class="w">             </span><span class="p">:</span><span class="w"> </span><span class="nx">3EB2B18B-A1DD-46A8-A425-256F6B3DF26D</span><span class="w">
</span><span class="n">TotalEndpoints</span><span class="w">         </span><span class="p">:</span><span class="w"> </span><span class="nx">2</span><span class="w">
</span><span class="kr">Type</span><span class="w">                   </span><span class="p">:</span><span class="w"> </span><span class="n">NAT</span><span class="w">
</span><span class="nx">Version</span><span class="w">                </span><span class="p">:</span><span class="w"> </span><span class="nx">64424509440</span><span class="w">
</span><span class="n">Resources</span><span class="w">              </span><span class="p">:</span><span class="w"> </span><span class="p">@{</span><span class="nx">AdditionalParams</span><span class="o">=</span><span class="p">;</span><span class="w"> </span><span class="nx">AllocationOrder</span><span class="o">=</span><span class="mi">2</span><span class="p">;</span><span class="w"> </span><span class="nx">Allocators</span><span class="o">=</span><span class="nx">System</span><span class="err">.</span><span class="nx">Object</span><span class="p">[];</span><span class="w"> </span><span class="nx">CompartmentOperationTime</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="nx">Flags</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="nx">Health</span><span class="o">=</span><span class="p">;</span><span class="w">
                         </span><span class="nx">ID</span><span class="o">=</span><span class="mi">92018</span><span class="nx">CF0</span><span class="err">-</span><span class="mi">6</span><span class="nx">DCB</span><span class="err">-</span><span class="mi">4</span><span class="nx">AAF</span><span class="err">-</span><span class="nx">A14E</span><span class="err">-</span><span class="nx">DC61120FC958</span><span class="p">;</span><span class="w"> </span><span class="nx">PortOperationTime</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="nx">State</span><span class="o">=</span><span class="mi">1</span><span class="p">;</span><span class="w"> </span><span class="nx">SwitchOperationTime</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="nx">VfpOperationTime</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w">
                         </span><span class="nx">parentId</span><span class="o">=</span><span class="mi">71</span><span class="nx">FB2758</span><span class="err">-</span><span class="nx">F714</span><span class="err">-</span><span class="mi">4838</span><span class="err">-</span><span class="mi">8764</span><span class="err">-</span><span class="mi">7079378</span><span class="nx">D6CB6</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
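<p>The <code class="language-plaintext highlighter-rouge">Subnets</code> entry above shows that endpoint addresses are allocated from 172.20.0.0/16. As a quick sanity check, here is a POSIX-sh sketch (hard-coded to a /16 prefix and to an address a container was later assigned) that confirms an address falls inside the prefix:</p>

```sh
# Check that a container address falls inside the /16 NAT prefix reported by
# Get-HnsNetwork above. Plain POSIX sh; specialised to a /16 for simplicity.
to_int() {
  oldIFS=$IFS; IFS=.
  set -- $1                 # split the dotted quad on "."
  IFS=$oldIFS
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

ip_hi=$(( $(to_int 172.20.95.58) / 65536 ))   # top 16 bits of the address
net_hi=$(( $(to_int 172.20.0.0) / 65536 ))    # top 16 bits of the prefix
if [ "$ip_hi" -eq "$net_hi" ]; then result=in-subnet; else result=outside; fi
echo "$result"              # prints: in-subnet
```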

<p>I ran <code class="language-plaintext highlighter-rouge">ctr run --rm --cni mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container cmd /c "ping 1.1.1.1 &amp;&amp; pause"</code> in one window and <code class="language-plaintext highlighter-rouge">ctr c info my-container</code> in another, which revealed the network namespace GUID <code class="language-plaintext highlighter-rouge">5f7d467c-3011-48bc-9337-ce78cf399345</code>.</p>
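<p>Rather than eyeballing the JSON from <code class="language-plaintext highlighter-rouge">ctr c info</code>, the GUID can be extracted mechanically. The snippet below is a sketch run against an illustrative stub of the relevant fragment (the key path in the real output may differ); in practice you would pipe the actual <code class="language-plaintext highlighter-rouge">ctr c info my-container</code> output through the same filter.</p>

```sh
# Illustrative stub of the JSON fragment containing the namespace GUID; the
# real input would come from `ctr c info my-container`.
info='{"Spec":{"windows":{"network":{"networkNamespace":"5f7d467c-3011-48bc-9337-ce78cf399345"}}}}'

# Grep the key out and strip the surrounding JSON punctuation.
guid=$(printf '%s' "$info" | grep -o '"networkNamespace":"[^"]*"' | cut -d'"' -f4)
echo "$guid"
```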

<p>Adding this GUID to my <code class="language-plaintext highlighter-rouge">config.json</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
  "ociVersion": "1.1.0",
  "process": {
    "terminal": false,
    "user": { "uid": 0, "gid": 0 },
    "args": [
      "cmd", "/c",
      "ipconfig &amp;&amp; ping 1.1.1.1"
    ],
    "cwd": "c:\\"
  },
  "root": { "path": "", "readonly": false },
  "hostname": "builder",
  "windows": {
    "layerFolders": [
      "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\1",
      "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\14"
    ],
    "ignoreFlushesDuringBoot": true,
    "network": {
      "allowUnqualifiedDNSQuery": true,
      "networkNamespace": "5f7d467c-3011-48bc-9337-ce78cf399345"
    }
  }
}
</code></pre></div></div>

<p>And now I have a network!</p>

<pre><code class="language-cmd">C:\&gt;ctr run --rm --cni --config config.json my-container

Windows IP Configuration


Ethernet adapter vEthernet (default-my-container2_nat):

   Connection-specific DNS Suffix  . : Home
   Link-local IPv6 Address . . . . . : fe80::921d:1ce7:a445:8dfa%49
   IPv4 Address. . . . . . . . . . . : 172.20.95.58
   Subnet Mask . . . . . . . . . . . : 255.255.0.0
   Default Gateway . . . . . . . . . : 172.20.0.1

Pinging 1.1.1.1 with 32 bytes of data:
Reply from 1.1.1.1: bytes=32 time=5ms TTL=58
Reply from 1.1.1.1: bytes=32 time=6ms TTL=58
Reply from 1.1.1.1: bytes=32 time=6ms TTL=58
Reply from 1.1.1.1: bytes=32 time=6ms TTL=58

Ping statistics for 1.1.1.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 5ms, Maximum = 6ms, Average = 5ms
</code></pre>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="containerd" /><category term="tunbury.org" /><summary type="html"><![CDATA[If you were following along with my previous post on containerd on Windows, you may recall that I lamented the lack of an installer. Since then, I have found a PowerShell script on Microsoft’s GitHub, which does a lot of the grunt work for us.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/containerd.png" /><media:content medium="image" url="https://www.tunbury.org/images/containerd.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Borg Backup</title><link href="https://www.tunbury.org/2025/06/14/borg-backup/" rel="alternate" type="text/html" title="Borg Backup" /><published>2025-06-14T00:00:00+00:00</published><updated>2025-06-14T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/14/borg-backup</id><content type="html" xml:base="https://www.tunbury.org/2025/06/14/borg-backup/"><![CDATA[<p>Our PeerTube installation at <a href="https://watch.ocaml.org/">watch.ocaml.org</a> holds hundreds of videos we wouldn’t want to lose! It’s a VM hosted at Scaleway so the chances of a loss are pretty small, but having a second copy would give us extra reassurance. I’m going to use <a href="https://www.borgbackup.org">Borg Backup</a>.</p>

<p>Here’s the list of features (taken directly from their website):</p>

<ul>
  <li>Space-efficient storage of backups.</li>
  <li>Secure, authenticated encryption.</li>
  <li>Compression: lz4, zstd, zlib, lzma or none.</li>
  <li>Mountable backups with FUSE.</li>
  <li>Easy installation on multiple platforms: Linux, macOS, BSD, …</li>
  <li>Free software (BSD license).</li>
  <li>Backed by a large and active open source community.</li>
</ul>

<p>We have several OBuilder workers with one or more unused hard disks, which would make ideal backup targets.</p>

<p>In this case, I will format and mount <code class="language-plaintext highlighter-rouge">sdc</code> as <code class="language-plaintext highlighter-rouge">/home</code> on one of the workers.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>parted /dev/sdc mklabel gpt
parted /dev/sdc mkpart primary ext4 0% 100%
mkfs.ext4 /dev/sdc1
</code></pre></div></div>

<p>Add this line to <code class="language-plaintext highlighter-rouge">/etc/fstab</code> and run <code class="language-plaintext highlighter-rouge">mount -a</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/dev/sdc1 /home ext4 defaults 0 2
</code></pre></div></div>

<p>Create a user <code class="language-plaintext highlighter-rouge">borg</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>adduser <span class="nt">--disabled-password</span> <span class="nt">--gecos</span> <span class="s1">'@borg'</span> <span class="nt">--home</span> /home/borg borg
</code></pre></div></div>

<p>On both machines, install the application <code class="language-plaintext highlighter-rouge">borg</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>borgbackup
</code></pre></div></div>

<p>On the machine we want to backup, generate an SSH key and copy it to the <code class="language-plaintext highlighter-rouge">authorized_keys</code> file for user <code class="language-plaintext highlighter-rouge">borg</code> on the target server. Ensure that <code class="language-plaintext highlighter-rouge">chmod</code> and <code class="language-plaintext highlighter-rouge">chown</code> are correct.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-keygen <span class="nt">-t</span> ed25519 <span class="nt">-f</span> ~/.ssh/borg_backup_key
</code></pre></div></div>
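<p><code class="language-plaintext highlighter-rouge">sshd</code> will silently ignore <code class="language-plaintext highlighter-rouge">authorized_keys</code> if the permissions are too open, which is why the <code class="language-plaintext highlighter-rouge">chmod</code>/<code class="language-plaintext highlighter-rouge">chown</code> step matters: the <code class="language-plaintext highlighter-rouge">.ssh</code> directory should be mode 700 and the key file 600, both owned by <code class="language-plaintext highlighter-rouge">borg</code>. A demonstration of the expected modes on a scratch directory:</p>

```sh
# Demonstrate the permissions sshd expects, using a throwaway directory in
# place of /home/borg (on the real server, also chown both to borg:borg).
home=$(mktemp -d)
mkdir -p "$home/.ssh"
chmod 700 "$home/.ssh"
touch "$home/.ssh/authorized_keys"
chmod 600 "$home/.ssh/authorized_keys"
dirmode=$(stat -c '%a' "$home/.ssh")
keymode=$(stat -c '%a' "$home/.ssh/authorized_keys")
echo "$dirmode $keymode"    # prints: 700 600
```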

<p>Add lines to the <code class="language-plaintext highlighter-rouge">.ssh/config</code> for ease of connection. We can now <code class="language-plaintext highlighter-rouge">ssh backup-server</code> without any prompts.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Host backup-server
    HostName your.backup.server.com
    User borg
    IdentityFile ~/.ssh/borg_backup_key
    ServerAliveInterval 60
    ServerAliveCountMax 3
</code></pre></div></div>

<p>Borg supports encrypting the backup at rest on the target machine. The data is publicly available in this case, so encryption seems unnecessary.</p>

<p>On the machine to be backed up, run:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>borg init <span class="nt">--encryption</span><span class="o">=</span>none backup-server:repo
</code></pre></div></div>

<p>We can now perform a backup or two and see how the deduplication works.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># borg create backup-server:repo::test /var/lib/docker/volumes/postgres --compression lz4 --stats --progress</span>
<span class="nt">------------------------------------------------------------------------------</span>
Repository: ssh://backup-server/./repo
Archive name: <span class="nb">test
</span>Archive fingerprint: 627242cb5b65efa23672db317b4cdc8617a78de4d8e195cdd1e1358ed02dd937
Time <span class="o">(</span>start<span class="o">)</span>: Sat, 2025-06-14 13:32:27
Time <span class="o">(</span>end<span class="o">)</span>:   Sat, 2025-06-14 13:32:38
Duration: 11.03 seconds
Number of files: 3497
Utilization of max. archive size: 0%
<span class="nt">------------------------------------------------------------------------------</span>
                       Original size      Compressed size    Deduplicated size
This archive:              334.14 MB            136.28 MB            132.79 MB
All archives:              334.14 MB            136.28 MB            132.92 MB

                       Unique chunks         Total chunks
Chunk index:                     942                 1568
<span class="nt">------------------------------------------------------------------------------</span>
<span class="c"># borg create backup-server:repo::test2 /var/lib/docker/volumes/postgres --compression lz4 --stats --progress</span>
<span class="nt">------------------------------------------------------------------------------</span>
Repository: ssh://backup-server/./repo
Archive name: test2
Archive fingerprint: 572bf2225b3ab19afd32d44f058a49dc2b02cb70c8833fa0b2a1fb5b95526bff
Time <span class="o">(</span>start<span class="o">)</span>: Sat, 2025-06-14 13:33:05
Time <span class="o">(</span>end<span class="o">)</span>:   Sat, 2025-06-14 13:33:06
Duration: 1.43 seconds
Number of files: 3497
Utilization of max. archive size: 0%
<span class="nt">------------------------------------------------------------------------------</span>
                       Original size      Compressed size    Deduplicated size
This archive:              334.14 MB            136.28 MB              9.58 MB
All archives:              668.28 MB            272.55 MB            142.61 MB

                       Unique chunks         Total chunks
Chunk index:                     971                 3136
<span class="nt">------------------------------------------------------------------------------</span>
<span class="c"># borg list backup-server:repo</span>
<span class="nb">test                                 </span>Sat, 2025-06-14 13:32:27 <span class="o">[</span>627242cb5b65efa23672db317b4cdc8617a78de4d8e195cdd1e1358ed02dd937]
test2                                Sat, 2025-06-14 13:33:05 <span class="o">[</span>572bf2225b3ab19afd32d44f058a49dc2b02cb70c8833fa0b2a1fb5b95526bff]
</code></pre></div></div>
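<p>The stats above show deduplication working: the second archive added only 9.58 MB of new chunks against an original size of 334.14 MB. Expressed as a percentage:</p>

```sh
# Share of the second archive that was genuinely new data, from the --stats
# figures above (deduplicated size / original size).
pct=$(awk 'BEGIN { printf "%.1f", 100 * 9.58 / 334.14 }')
echo "${pct}% new data"     # prints: 2.9% new data
```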

<p>Let’s run this every day by placing a script <code class="language-plaintext highlighter-rouge">borgbackup</code> in <code class="language-plaintext highlighter-rouge">/etc/cron.daily</code>. The paths given are just examples…</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>

<span class="c"># Configuration</span>
<span class="nv">REPOSITORY</span><span class="o">=</span><span class="s2">"backup-server:repo"</span>

<span class="c"># What to backup</span>
<span class="nv">BACKUP_PATHS</span><span class="o">=</span><span class="s2">"
/home
"</span>

<span class="c"># What to exclude (a bash array so the quoted patterns survive expansion)</span>
<span class="nv">EXCLUDE_ARGS</span><span class="o">=(</span>
    <span class="nt">--exclude</span> <span class="s1">'*.tmp'</span>
    <span class="nt">--exclude</span> <span class="s1">'*.log'</span>
<span class="o">)</span>

<span class="c"># Logging function</span>
log<span class="o">()</span> <span class="o">{</span>
    logger <span class="nt">-t</span> <span class="s2">"borg-backup"</span> <span class="s2">"</span><span class="nv">$1</span><span class="s2">"</span>
    <span class="nb">echo</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">date</span> <span class="s1">'+%Y-%m-%d %H:%M:%S'</span><span class="si">)</span><span class="s2"> - </span><span class="nv">$1</span><span class="s2">"</span>
<span class="o">}</span>

log <span class="s2">"========================================"</span>
log <span class="s2">"Starting Borg backup"</span>

<span class="c"># Check if borg is installed</span>
<span class="k">if</span> <span class="o">!</span> <span class="nb">command</span> <span class="nt">-v</span> borg &amp;&gt; /dev/null<span class="p">;</span> <span class="k">then
    </span>log <span class="s2">"ERROR: borg command not found"</span>
    <span class="nb">exit </span>1
<span class="k">fi</span>

<span class="c"># Test repository access</span>
<span class="k">if</span> <span class="o">!</span> borg info <span class="s2">"</span><span class="nv">$REPOSITORY</span><span class="s2">"</span> &amp;&gt; /dev/null<span class="p">;</span> <span class="k">then
    </span>log <span class="s2">"ERROR: Cannot access repository </span><span class="nv">$REPOSITORY</span><span class="s2">"</span>
    log <span class="s2">"Make sure repository exists and SSH key is set up"</span>
    <span class="nb">exit </span>1
<span class="k">fi</span>

<span class="c"># Create backup</span>
log <span class="s2">"Creating backup archive..."</span>
<span class="k">if </span>borg create <span class="se">\</span>
    <span class="s2">"</span><span class="nv">$REPOSITORY</span><span class="s2">::backup-{now}"</span> <span class="se">\</span>
    <span class="nv">$BACKUP_PATHS</span> <span class="se">\</span>
    <span class="s2">"</span><span class="nv">${EXCLUDE_ARGS[@]}</span><span class="s2">"</span> <span class="se">\</span>
    <span class="nt">--compression</span> lz4 <span class="se">\</span>
    <span class="nt">--stats</span> 2&gt;&amp;1 | logger <span class="nt">-t</span> <span class="s2">"borg-backup"</span><span class="p">;</span> <span class="k">then
    </span>log <span class="s2">"Backup created successfully"</span>
<span class="k">else
    </span>log <span class="s2">"ERROR: Backup creation failed"</span>
    <span class="nb">exit </span>1
<span class="k">fi</span>

<span class="c"># Prune old backups</span>
log <span class="s2">"Pruning old backups..."</span>
<span class="k">if </span>borg prune <span class="s2">"</span><span class="nv">$REPOSITORY</span><span class="s2">"</span> <span class="se">\</span>
    <span class="nt">--keep-daily</span><span class="o">=</span>7 <span class="se">\</span>
    <span class="nt">--keep-weekly</span><span class="o">=</span>4 <span class="se">\</span>
    <span class="nt">--keep-monthly</span><span class="o">=</span>6 <span class="se">\</span>
    <span class="nt">--stats</span> 2&gt;&amp;1 | logger <span class="nt">-t</span> <span class="s2">"borg-backup"</span><span class="p">;</span> <span class="k">then
    </span>log <span class="s2">"Pruning completed successfully"</span>
<span class="k">else
    </span>log <span class="s2">"WARNING: Pruning failed, but backup was successful"</span>
<span class="k">fi</span>

<span class="c"># Monthly repository check (on the 1st of each month)</span>
<span class="k">if</span> <span class="o">[</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">date</span> +%d<span class="si">)</span><span class="s2">"</span> <span class="o">=</span> <span class="s2">"01"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span>log <span class="s2">"Running monthly repository check..."</span>
    <span class="k">if </span>borg check <span class="s2">"</span><span class="nv">$REPOSITORY</span><span class="s2">"</span> 2&gt;&amp;1 | logger <span class="nt">-t</span> <span class="s2">"borg-backup"</span><span class="p">;</span> <span class="k">then
        </span>log <span class="s2">"Repository check passed"</span>
    <span class="k">else
        </span>log <span class="s2">"WARNING: Repository check failed"</span>
    <span class="k">fi
fi

</span>log <span class="s2">"Backup completed successfully"</span>
log <span class="s2">"========================================"</span>
</code></pre></div></div>
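<p>One caveat with <code class="language-plaintext highlighter-rouge">/etc/cron.daily</code>: on Debian the directory is processed by <code class="language-plaintext highlighter-rouge">run-parts</code>, which by default only executes files whose names consist of letters, digits, underscores and hyphens. Naming the script plain <code class="language-plaintext highlighter-rouge">borgbackup</code> therefore matters; <code class="language-plaintext highlighter-rouge">borgbackup.sh</code> would be silently skipped. The name filter can be emulated in shell:</p>

```sh
# Emulate run-parts' default name filter: names containing anything outside
# A-Z, a-z, 0-9, "_" and "-" are skipped.
results=""
for f in borgbackup borgbackup.sh borg-backup; do
  case $f in
    *[!A-Za-z0-9_-]*) verdict=skip ;;
    *)                verdict=run ;;
  esac
  results="$results${results:+ }$verdict:$f"
done
echo "$results"   # prints: run:borgbackup skip:borgbackup.sh run:borg-backup
```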

<p>Check the logs…</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>journalctl <span class="nt">-t</span> borg-backup
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="borg" /><category term="tunbury.org" /><summary type="html"><![CDATA[Our PeerTube installation at watch.ocaml.org holds hundreds of videos we wouldn’t want to lose! It’s a VM hosted at Scaleway so the chances of a loss are pretty small, but having a second copy would give us extra reassurance. I’m going to use Borg Backup.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/borg-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/borg-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">opam-repository for OxCaml</title><link href="https://www.tunbury.org/2025/06/12/oxcaml-repository/" rel="alternate" type="text/html" title="opam-repository for OxCaml" /><published>2025-06-12T00:00:00+00:00</published><updated>2025-06-12T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/12/oxcaml-repository</id><content type="html" xml:base="https://www.tunbury.org/2025/06/12/oxcaml-repository/"><![CDATA[<p>This morning, Anil proposed that having an opam-repository that didn’t have old versions of the packages that require patches to work with OxCaml would be good.</p>

<p>This is a fast-moving area, so this post is likely to be outdated very quickly, but at the time of writing, the development repository is <a href="https://github.com/janestreet/opam-repository/tree/with-extensions">https://github.com/janestreet/opam-repository#with-extensions</a>. This is a fork of <a href="https://github.com/ocaml/opam-repository">opam-repository</a> but with some patched packages designated with <code class="language-plaintext highlighter-rouge">+ox</code>.</p>

<p>I have a short shell script which clones both <a href="https://github.com/ocaml/opam-repository">opam-repository</a> and <a href="https://github.com/janestreet/opam-repository/tree/with-extensions">https://github.com/janestreet/opam-repository#with-extensions</a> and searches for all packages with <code class="language-plaintext highlighter-rouge">+ox</code>. All versions of these packages are removed from opam-repository and replaced with the single <code class="language-plaintext highlighter-rouge">+ox</code> version. The resulting repository is pushed to <a href="https://github.com/mtelvers/opam-repository-ox">https://github.com/mtelvers/opam-repository-ox</a>.</p>
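<p>The effect of the merge can be sketched in miniature. The simulation below uses a made-up package name and versions laid out in the opam-repository style (packages/NAME/NAME.VERSION); the real script applies the same loop to the two cloned repositories.</p>

```sh
# Simulate dropping all stock versions of a package and keeping only the
# single +ox version. Package name and versions are illustrative.
tmp=$(mktemp -d) && cd "$tmp"
mkdir -p up/packages/foo/foo.1.0 up/packages/foo/foo.1.1   # stock versions
mkdir -p ox/packages/foo/foo.1.1+ox                        # patched version

for d in ox/packages/*/*+ox*; do
  pkg=$(basename "$(dirname "$d")")
  rm -rf "up/packages/$pkg"        # remove every stock version of the package
  mkdir -p "up/packages/$pkg"
  cp -r "$d" "up/packages/$pkg/"   # keep only the +ox version
done

kept=$(ls up/packages/foo)
echo "$kept"                       # prints: foo.1.1+ox
```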

<p>To test the repository (and show that <code class="language-plaintext highlighter-rouge">eio</code> doesn’t build), I have created a <code class="language-plaintext highlighter-rouge">Dockerfile</code> based largely on the base-image-builder format. This <code class="language-plaintext highlighter-rouge">Dockerfile</code> uses the modified opam-repository to build an OxCaml switch.</p>

<p>My build script and test Dockerfile are in <a href="https://github.com/mtelvers/opam-repo-merge">https://github.com/mtelvers/opam-repo-merge</a>. Thanks to David for being the sounding board during the day!</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="oxcaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[This morning, Anil proposed that having an opam-repository that didn’t have old versions of the packages that require patches to work with OxCaml would be good.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Containerd on Windows</title><link href="https://www.tunbury.org/2025/06/11/windows-containerd/" rel="alternate" type="text/html" title="Containerd on Windows" /><published>2025-06-11T00:00:00+00:00</published><updated>2025-06-11T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/11/windows-containerd</id><content type="html" xml:base="https://www.tunbury.org/2025/06/11/windows-containerd/"><![CDATA[<p>The tricky part of using <a href="https://github.com/microsoft/hcsshim/issues/2156">runhcs</a> has been getting the layers correct. While I haven’t had any luck with runhcs, I have managed to create Windows containers using <code class="language-plaintext highlighter-rouge">ctr</code> and <code class="language-plaintext highlighter-rouge">containerd</code>.</p>

<p>Installing <code class="language-plaintext highlighter-rouge">containerd</code> is a manual process on Windows. These steps give general guidance on what is needed: enable the <code class="language-plaintext highlighter-rouge">containers</code> feature in Windows, download the tar file from GitHub, extract it, add it to the path, generate a default configuration file, register the service, and start it.</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Enable-WindowsOptionalFeature</span><span class="w"> </span><span class="nt">-Online</span><span class="w"> </span><span class="nt">-FeatureName</span><span class="w"> </span><span class="nx">containers</span><span class="w"> </span><span class="nt">-All</span><span class="w">
</span><span class="n">mkdir</span><span class="w"> </span><span class="s2">"c:\Program Files\containerd"</span><span class="w">
</span><span class="n">curl.exe</span><span class="w"> </span><span class="nt">-L</span><span class="w"> </span><span class="nx">https://github.com/containerd/containerd/releases/download/v2.2.1/containerd-2.2.1-windows-amd64.tar.gz</span><span class="w"> </span><span class="nt">-o</span><span class="w"> </span><span class="nx">containerd-windows-amd64.tar.gz</span><span class="w">
</span><span class="n">tar.exe</span><span class="w"> </span><span class="nx">xvf</span><span class="w"> </span><span class="o">.</span><span class="nx">\containerd-windows-amd64.tar.gz</span><span class="w"> </span><span class="nt">-C</span><span class="w"> </span><span class="s2">"c:\Program Files\containerd"</span><span class="w">
</span><span class="nv">$Path</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="n">Environment</span><span class="p">]::</span><span class="n">GetEnvironmentVariable</span><span class="p">(</span><span class="s2">"PATH"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Machine"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">[</span><span class="n">IO.Path</span><span class="p">]::</span><span class="n">PathSeparator</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s2">"</span><span class="nv">$</span><span class="nn">Env</span><span class="p">:</span><span class="nv">ProgramFiles</span><span class="s2">\containerd\bin"</span><span class="w">
 </span><span class="p">[</span><span class="n">Environment</span><span class="p">]::</span><span class="n">SetEnvironmentVariable</span><span class="p">(</span><span class="w"> </span><span class="s2">"Path"</span><span class="p">,</span><span class="w"> </span><span class="nv">$Path</span><span class="p">,</span><span class="w"> </span><span class="s2">"Machine"</span><span class="p">)</span><span class="w">
</span><span class="n">containerd.exe</span><span class="w"> </span><span class="nx">config</span><span class="w"> </span><span class="nx">default</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="n">Out-File</span><span class="w"> </span><span class="s2">"c:\Program Files\containerd\config.toml"</span><span class="w"> </span><span class="nt">-Encoding</span><span class="w"> </span><span class="nx">ascii</span><span class="w">
</span><span class="n">containerd</span><span class="w"> </span><span class="nt">--register-service</span><span class="w">
</span><span class="n">net</span><span class="w"> </span><span class="nx">start</span><span class="w"> </span><span class="nx">containerd</span><span class="w">
</span></code></pre></div></div>

<p>With that out of the way, pull <code class="language-plaintext highlighter-rouge">nanoserver:ltsc2022</code> from Microsoft’s container registry.</p>

<pre><code class="language-dos">c:\&gt; ctr image pull mcr.microsoft.com/windows/nanoserver:ltsc2022
</code></pre>

<p>List which snapshots are available: <code class="language-plaintext highlighter-rouge">nanoserver</code> has one, but <code class="language-plaintext highlighter-rouge">servercore</code> has two.</p>

<pre><code class="language-dos">c:\&gt; ctr snapshot ls
KEY                                                                     PARENT                                                                  KIND
sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355                                                                         Committed
</code></pre>

<p>Take a snapshot of <code class="language-plaintext highlighter-rouge">nanoserver</code>, which creates a writeable scratch layer. <code class="language-plaintext highlighter-rouge">--mounts</code> is key here. Without it, you won’t know where the layers are. They are held below <code class="language-plaintext highlighter-rouge">C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots</code> in numbered folders. The mapping between numbers and keys is stored in <code class="language-plaintext highlighter-rouge">metadata.db</code> in BoltDB format. With the <code class="language-plaintext highlighter-rouge">--mounts</code> command line option, we see the <code class="language-plaintext highlighter-rouge">source</code> path and list of paths in <code class="language-plaintext highlighter-rouge">parentLayerPaths</code>.</p>

<pre><code class="language-dos">c:\&gt; ctr snapshots prepare --mounts my-test sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355
[
    {
        "Type": "windows-layer",
        "Source": "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\21",
        "Target": "",
        "Options": [
            "rw",
            "parentLayerPaths=[\"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\20\"]"
        ]
    }
]
</code></pre>

<p>As you can see from <code class="language-plaintext highlighter-rouge">ctr snapshot ls</code> and <code class="language-plaintext highlighter-rouge">ctr snapshot info</code>, the layer paths aren’t readily available. This <a href="https://github.com/containerd/containerd/discussions/10053">discussion</a> is a sample of the creative approaches to getting the paths!</p>

<pre><code class="language-dos">c:\&gt; ctr snapshot ls
KEY                                                                     PARENT                                                                  KIND
my-test                                                                 sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355 Active
sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355                                                                         Committed
c:\&gt; ctr snapshot info my-test
{
    "Kind": "Active",
    "Name": "my-test",
    "Parent": "sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355",
    "Labels": {
        "containerd.io/gc.root": "2025-06-11T12:28:43Z"
    },
    "Created": "2025-06-11T16:33:43.144011Z",
    "Updated": "2025-06-11T16:33:43.144011Z"
}
</code></pre>

<p>Here’s the directory listing for reference.</p>

<pre><code class="language-dos">c:\&gt; dir C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots

 Volume in drive C has no label.
 Volume Serial Number is F0E9-1E81

 Directory of C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots

11/06/2025  16:33    &lt;DIR&gt;          .
11/06/2025  08:19    &lt;DIR&gt;          ..
11/06/2025  08:31    &lt;DIR&gt;          2
11/06/2025  16:32    &lt;DIR&gt;          20
11/06/2025  16:33    &lt;DIR&gt;          21
11/06/2025  08:20    &lt;DIR&gt;          rm-1
11/06/2025  08:20    &lt;DIR&gt;          rm-2
11/06/2025  08:22    &lt;DIR&gt;          rm-3
</code></pre>

<p>Now we need to prepare a <code class="language-plaintext highlighter-rouge">config.json</code> file. The <code class="language-plaintext highlighter-rouge">layerFolders</code> structure can be populated with the information from above. The order is important: preserve the order given in <code class="language-plaintext highlighter-rouge">parentLayerPaths</code>, then append the scratch layer. This looks obvious when there are just two layers, but for <code class="language-plaintext highlighter-rouge">servercore:ltsc2022</code>, which has two parent layers, the ordering looks curious: the parent layers appear in reverse creation order, with the scratch layer last, e.g. <code class="language-plaintext highlighter-rouge">24, 23, 25</code>, where 23 and 24 are the parents and 25 is the snapshot.</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
    </span><span class="nl">"ociVersion"</span><span class="p">:</span><span class="w"> </span><span class="s2">"1.1.0"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"process"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"user"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
            </span><span class="nl">"uid"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
            </span><span class="nl">"gid"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="p">,</span><span class="w">
            </span><span class="nl">"username"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ContainerUser"</span><span class="w">
        </span><span class="p">},</span><span class="w">
        </span><span class="nl">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
            </span><span class="s2">"cmd"</span><span class="p">,</span><span class="w">
            </span><span class="s2">"/c"</span><span class="p">,</span><span class="w">
            </span><span class="s2">"echo test"</span><span class="w">
        </span><span class="p">],</span><span class="w">
        </span><span class="nl">"cwd"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span><span class="w">
    </span><span class="p">},</span><span class="w">
    </span><span class="nl">"root"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span><span class="w">
    </span><span class="p">},</span><span class="w">
    </span><span class="nl">"windows"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"layerFolders"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
            </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">root</span><span class="se">\\</span><span class="s2">io.containerd.snapshotter.v1.windows</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">20"</span><span class="p">,</span><span class="w">
            </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">ProgramData</span><span class="se">\\</span><span class="s2">containerd</span><span class="se">\\</span><span class="s2">root</span><span class="se">\\</span><span class="s2">io.containerd.snapshotter.v1.windows</span><span class="se">\\</span><span class="s2">snapshots</span><span class="se">\\</span><span class="s2">21"</span><span class="w">
        </span><span class="p">],</span><span class="w">
        </span><span class="nl">"ignoreFlushesDuringBoot"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
        </span><span class="nl">"network"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
            </span><span class="nl">"allowUnqualifiedDNSQuery"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
        </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
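<p>The ordering rule (parents exactly as listed in <code class="language-plaintext highlighter-rouge">parentLayerPaths</code>, with the writable scratch layer appended last) can be sketched as a tiny helper. This is a hypothetical illustration, not obuilder’s actual code.</p>

<pre><code class="language-ocaml">(* Hypothetical helper: assemble the layerFolders list for config.json.
   Parents keep the order reported in parentLayerPaths; the writable
   scratch layer is always appended last. *)
let layer_folders ~parents ~scratch = parents @ [ scratch ]

let () =
  (* the servercore example from the text: parents 24 and 23, snapshot 25 *)
  layer_folders ~parents:[ "24"; "23" ] ~scratch:"25"
  |> String.concat ", " |> print_endline
</code></pre>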

<p>We can now run the container.</p>

<pre><code class="language-dos">c:\&gt; ctr run --rm --config .\config.json my-container
</code></pre>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="containerd" /><category term="tunbury.org" /><summary type="html"><![CDATA[The tricky part of using runhcs has been getting the layers correct. While I haven’t had any luck, I have managed to created Windows containers using ctr and containerd.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/containerd.png" /><media:content medium="image" url="https://www.tunbury.org/images/containerd.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Base images for OxCaml</title><link href="https://www.tunbury.org/2025/06/10/oxcaml-base-images/" rel="alternate" type="text/html" title="Base images for OxCaml" /><published>2025-06-10T00:00:00+00:00</published><updated>2025-06-10T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/10/oxcaml-base-images</id><content type="html" xml:base="https://www.tunbury.org/2025/06/10/oxcaml-base-images/"><![CDATA[<p>As @dra27 suggested, I first added support in <a href="https://github.com/ocurrent/ocaml-version.git">ocurrent/ocaml-version</a>. I went with the name <code class="language-plaintext highlighter-rouge">flambda2</code>, which matched the name in the <code class="language-plaintext highlighter-rouge">opam</code> package.</p>

<p>Wherever I found the type <code class="language-plaintext highlighter-rouge">Flambda</code>, I added <code class="language-plaintext highlighter-rouge">Flambda2</code>. I added a list of OxCaml versions in the style of the unreleased betas and a function <code class="language-plaintext highlighter-rouge">is_oxcaml</code> to test whether the variant is of type <code class="language-plaintext highlighter-rouge">Flambda2</code>, closely following the <code class="language-plaintext highlighter-rouge">is_multicore</code> design! The final change was to <code class="language-plaintext highlighter-rouge">additional_packages</code>, which concatenates <code class="language-plaintext highlighter-rouge">ocaml-options-only-</code> with <code class="language-plaintext highlighter-rouge">flambda2</code>; again, this change was also needed for multicore.</p>
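<p>As a rough illustration of an <code class="language-plaintext highlighter-rouge">is_multicore</code>-style predicate, here is a self-contained sketch; the real <code class="language-plaintext highlighter-rouge">ocurrent/ocaml-version</code> types and API differ.</p>

<pre><code class="language-ocaml">(* Hypothetical sketch: a variant-testing predicate in the style of
   is_multicore. The real ocaml-version variant type is richer. *)
type variant = Vanilla | Flambda | Flambda2

let is_oxcaml = function
  | Flambda2 -> true
  | Vanilla | Flambda -> false

(* The additional_packages change: prefix the option package name *)
let additional_package v =
  if is_oxcaml v then Some ("ocaml-options-only-" ^ "flambda2") else None
</code></pre>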

<p>It was a relatively minor change to the base-image-builder, adding <code class="language-plaintext highlighter-rouge">Ocaml_version.Releases.oxcaml</code> to the available switches on AMD64 and ARM64. Following the precedent set by <code class="language-plaintext highlighter-rouge">maybe_add_beta</code> and <code class="language-plaintext highlighter-rouge">maybe_add_multicore</code>, I added <code class="language-plaintext highlighter-rouge">maybe_add_jst</code>, which added the Jane Street opam repository for these builds.</p>

<p>The builds mostly failed because they depended on <code class="language-plaintext highlighter-rouge">autoconf</code>, which isn’t included by default on most distributions. Looking in the <code class="language-plaintext highlighter-rouge">dockerfile</code>, there is a function called <code class="language-plaintext highlighter-rouge">ocaml_depexts</code>, which includes <code class="language-plaintext highlighter-rouge">zstd</code> for OCaml &gt; 5.1.0. I extended this function to include <code class="language-plaintext highlighter-rouge">autoconf</code> when building OxCaml.</p>
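<p>The extension can be sketched like this; it is hypothetical, as the real <code class="language-plaintext highlighter-rouge">ocaml_depexts</code> in ocaml-dockerfile has a different signature.</p>

<pre><code class="language-ocaml">(* Hypothetical sketch of extending an ocaml_depexts-style function:
   keep the existing zstd rule for OCaml newer than 5.1.0 and add
   autoconf when building OxCaml. *)
let ocaml_depexts ~oxcaml version =
  let zstd = if version > (5, 1, 0) then [ "zstd" ] else [] in
  let autoconf = if oxcaml then [ "autoconf" ] else [] in
  zstd @ autoconf
</code></pre>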

<p>The Arch Linux builds failed due to missing <code class="language-plaintext highlighter-rouge">which</code>, so I added this as I did for <code class="language-plaintext highlighter-rouge">autoconf</code>.</p>

<p>The following are working:</p>

<ul>
  <li>Ubuntu 24.10, 24.04, 22.04</li>
  <li>OpenSUSE Tumbleweed</li>
  <li>Fedora 42, 41</li>
  <li>Debian Unstable, Testing, 12</li>
  <li>Arch</li>
</ul>

<p>Failures:</p>

<ul>
  <li>Alpine 3.21
    <ul>
      <li>missing <code class="language-plaintext highlighter-rouge">linux/auxvec.h</code> header</li>
    </ul>
  </li>
  <li>OpenSUSE 15.6
    <ul>
      <li>autoconf is too old in the distribution</li>
    </ul>
  </li>
  <li>Debian 11
    <ul>
      <li>autoconf is too old in the distribution</li>
    </ul>
  </li>
  <li>Oracle Linux 9, 8
    <ul>
      <li>autoconf is too old in the distribution</li>
    </ul>
  </li>
</ul>

<p>There is some discussion about whether building these with the <a href="https://images.ci.ocaml.org">base image builder</a> is the best approach, so I won’t create PRs at this time. My branches are:</p>
<ul>
  <li><a href="https://github.com/mtelvers/ocaml-version.git">https://github.com/mtelvers/ocaml-version.git</a></li>
  <li><a href="https://github.com/mtelvers/ocaml-dockerfile.git#oxcaml">https://github.com/mtelvers/ocaml-dockerfile.git#oxcaml</a></li>
  <li><a href="https://github.com/mtelvers/docker-base-images#oxcaml">https://github.com/mtelvers/docker-base-images#oxcaml</a></li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="oxcaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[As @dra27 suggested, I first added support in ocurrent/ocaml-version. I went with the name flambda2, which matched the name in the opam package.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">User Isolation on Windows</title><link href="https://www.tunbury.org/2025/06/09/windows-sandbox/" rel="alternate" type="text/html" title="User Isolation on Windows" /><published>2025-06-09T00:00:00+00:00</published><updated>2025-06-09T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/09/windows-sandbox</id><content type="html" xml:base="https://www.tunbury.org/2025/06/09/windows-sandbox/"><![CDATA[<p>For a long time, we have struggled to match the performance and functionality of <code class="language-plaintext highlighter-rouge">runc</code> on Windows. Antonin wrote the Docker-based isolation for <a href="https://github.com/ocurrent/obuilder">ocurrent/obuilder</a> with <a href="https://github.com/ocurrent/obuilder/pull/127">PR#127</a>, and I wrote machine-level isolation using QEMU <a href="https://github.com/ocurrent/obuilder/pull/195">PR#195</a>. Sadly, the most obvious approach of using <code class="language-plaintext highlighter-rouge">runhcs</code> doesn’t work, see <a href="https://github.com/microsoft/hcsshim/issues/2156">issue#2156</a>.</p>

<p>On macOS, we use user isolation and ZFS mounts. We mount filesystems over <code class="language-plaintext highlighter-rouge">/Users/&lt;user&gt;</code> and <code class="language-plaintext highlighter-rouge">/usr/local/Homebrew</code> (or <code class="language-plaintext highlighter-rouge">/opt/Homebrew</code> on Apple Silicon). Each command is executed with <code class="language-plaintext highlighter-rouge">su</code>, then the filesystems are unmounted, and snapshots are taken before repeating the cycle. This approach limits us to one job at a time, for two reasons: firstly, the Homebrew location is per machine, and secondly, opam switches are not relocatable, so mounting as <code class="language-plaintext highlighter-rouge">/Users/&lt;another user&gt;</code> wouldn’t work.</p>
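<p>The cycle can be written down as the sequence of commands issued for one build step. This is only a sketch: the user, dataset, and snapshot names are made up, and the real macOS backend is more involved.</p>

<pre><code class="language-ocaml">(* Hypothetical sketch: the shell commands for one build step.
   Returning the list rather than executing it keeps the sketch testable. *)
let step_commands ~user ~dataset ~snapshot cmd =
  [ Printf.sprintf "zfs mount %s" dataset;        (* mount over the user's home *)
    Printf.sprintf "su -l %s -c '%s'" user cmd;   (* run one command as the user *)
    Printf.sprintf "zfs umount %s" dataset;
    Printf.sprintf "zfs snapshot %s@%s" dataset snapshot ]
</code></pre>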

<p>In a similar vein, we could make user isolation work under Windows. On Windows, opam manages the Cygwin installation in <code class="language-plaintext highlighter-rouge">%LOCALAPPDATA%\opam</code>, so the shared Homebrew limitation of macOS doesn’t apply; can we create users with the same home directory? This isn’t as crazy as it sounds because Windows has drive letters, and right back to the earliest Windows networks I can remember (NetWare 3!), it was common practice for all users to have their home directory available as <code class="language-plaintext highlighter-rouge">H:\</code>. These days, it’s unfortunate that many applications <em>see through</em> drive letters and convert them to the corresponding UNC paths. Excel is particularly annoying, as it does this with linked sheets, which prevents administrators from easily migrating to a new file server because the embedded UNC paths become invalid.</p>

<h1 id="windows-user-isolation">Windows user isolation</h1>

<p>Windows drive mappings are per user and can be created using the command <a href="https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/subst">subst</a>. We might try to set the home directory and profile path when we create a user <code class="language-plaintext highlighter-rouge">net user foo bar /add /homedir:h:\ /profilepath:h:\</code>, but since <code class="language-plaintext highlighter-rouge">h:</code> does not exist in the user’s context, the user is given a temporary profile, which is lost when they log out. If you specify just <code class="language-plaintext highlighter-rouge">/homedir</code>, the profile is retained in <code class="language-plaintext highlighter-rouge">c:\users\foo</code>.</p>

<p>We could now try to map <code class="language-plaintext highlighter-rouge">h:</code> using <code class="language-plaintext highlighter-rouge">subst h: c:\cache\layer</code>, but <code class="language-plaintext highlighter-rouge">subst</code> drives don’t naturally persist between sessions. Alternatively, we could use <code class="language-plaintext highlighter-rouge">net use h: \\DESKTOP-BBBSRML\cache\layer /persistent:yes</code>.</p>

<p>Ultimately, the path where <code class="language-plaintext highlighter-rouge">%APPDATA%</code> is held must exist when the profile is loaded; it can’t be created as a result of loading the profile. Note that for a new user, the path doesn’t exist at all, but the parent directory where it will be created does exist. In Active Directory/domain environments, the profile and home paths are on network shares, one directory per user. These exist before the user signs in; all users can have <code class="language-plaintext highlighter-rouge">h:</code> mapped to their personal space.</p>

<p>In any case, it doesn’t matter whether we can redirect <code class="language-plaintext highlighter-rouge">%LOCALAPPDATA%</code>, as we can control the location opam uses by setting the environment variable <code class="language-plaintext highlighter-rouge">OPAMROOT</code>.</p>

<h1 id="opam-knows">opam knows</h1>

<p>Unfortunately, there’s no fooling opam. It sees through both <code class="language-plaintext highlighter-rouge">subst</code> and network drives and embeds the path into files like <code class="language-plaintext highlighter-rouge">opam\config</code>.</p>

<h2 id="subst">subst</h2>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>subst h: c:<span class="se">\h</span>ome<span class="se">\f</span>oo
<span class="nb">set </span><span class="nv">OPAMROOT</span><span class="o">=</span>h:<span class="se">\o</span>pam
opam init <span class="nt">-y</span>
...

  In normal operation, opam only alters files within your opam root
    <span class="o">(</span>~<span class="se">\A</span>ppData<span class="se">\L</span>ocal<span class="se">\o</span>pam by default<span class="p">;</span> currently C:<span class="se">\h</span>ome<span class="se">\f</span>oo<span class="se">\o</span>pam<span class="o">)</span><span class="nb">.</span>

...
</code></pre></div></div>

<h2 id="net-use">net use</h2>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>net share <span class="nv">home</span><span class="o">=</span>c:<span class="se">\h</span>ome
net use h: <span class="se">\\</span>DESKTOP-BBBSRML<span class="se">\h</span>ome<span class="se">\f</span>oo /persistent:yes
SET <span class="nv">OPAMROOT</span><span class="o">=</span>h:<span class="se">\o</span>pam
opam init <span class="nt">-y</span>
...

  In normal operation, opam only alters files within your opam root
    <span class="o">(</span>~<span class="se">\A</span>ppData<span class="se">\L</span>ocal<span class="se">\o</span>pam by default<span class="p">;</span> currently UNC<span class="se">\D</span>ESKTOP-BBBSRML<span class="se">\h</span>ome<span class="se">\f</span>oo<span class="se">\o</span>pam<span class="o">)</span><span class="nb">.</span>

...
</code></pre></div></div>

<p>Unless David has some inspiration, I don’t know where to go with this.</p>

<p>Here’s an example from the Windows API.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// If you have: subst X: C:\SomeFolder</span>
<span class="n">QueryDosDevice</span><span class="p">(</span><span class="s">L"X:"</span><span class="p">,</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">size</span><span class="p">);</span>  <span class="c1">// Returns: "C:\SomeFolder"</span>
<span class="n">GetCurrentDirectory</span><span class="p">();</span>                <span class="c1">// Returns: "X:\" (if current)</span>
</code></pre></div></div>

<h1 id="windows-sandbox">Windows Sandbox</h1>

<p>Windows has a new(?) feature called <em>Windows Sandbox</em> that I hadn’t seen before. It allows commands to be executed in a lightweight VM based on an XML definition. For example, a simple <code class="language-plaintext highlighter-rouge">test.wsb</code> might contain:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;Configuration&gt;</span>
  <span class="nt">&lt;MappedFolders&gt;</span>
    <span class="nt">&lt;MappedFolder&gt;</span>
      <span class="nt">&lt;HostFolder&gt;</span>C:\home\foo\opam<span class="nt">&lt;/HostFolder&gt;</span>
      <span class="nt">&lt;SandboxFolder&gt;</span>C:\Users\WDAGUtilityAccount\AppData\Local\opam<span class="nt">&lt;/SandboxFolder&gt;</span>
      <span class="nt">&lt;ReadOnly&gt;</span>false<span class="nt">&lt;/ReadOnly&gt;</span>
    <span class="nt">&lt;/MappedFolder&gt;</span>
  <span class="nt">&lt;/MappedFolders&gt;</span>
<span class="nt">&lt;/Configuration&gt;</span>
</code></pre></div></div>

<p>The sandbox started quickly and worked well until I tried to run a second instance. The command returns an error stating that only one is allowed. Even doing <code class="language-plaintext highlighter-rouge">runas /user:bar "WindowsSandbox.exe test.wsb"</code> fails with the same error.</p>

<h1 id="full-circle">Full circle</h1>

<p>I think this brings us back to Docker. I wrote the QEMU implementation because of Docker’s poor performance on Windows, coupled with the unreliability of OBuilder on Windows. However, I wonder if today’s use case means that it warrants a second look.</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Install Docker Engine</span><span class="w">
</span><span class="n">Invoke-WebRequest</span><span class="w"> </span><span class="nt">-UseBasicParsing</span><span class="w"> </span><span class="s2">"https://download.docker.com/win/static/stable/x86_64/docker-28.2.2.zip"</span><span class="w"> </span><span class="nt">-OutFile</span><span class="w"> </span><span class="nx">docker.zip</span><span class="w">
</span><span class="n">Expand-Archive</span><span class="w"> </span><span class="nx">docker.zip</span><span class="w"> </span><span class="nt">-DestinationPath</span><span class="w"> </span><span class="s2">"C:\Program Files"</span><span class="w">
</span><span class="p">[</span><span class="n">Environment</span><span class="p">]::</span><span class="n">SetEnvironmentVariable</span><span class="p">(</span><span class="s2">"Path"</span><span class="p">,</span><span class="w"> </span><span class="nv">$</span><span class="nn">env</span><span class="p">:</span><span class="nv">Path</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="s2">";C:\Program Files\docker"</span><span class="p">,</span><span class="w"> </span><span class="s2">"Machine"</span><span class="p">)</span><span class="w">

</span><span class="c"># Start Docker service</span><span class="w">
</span><span class="n">dockerd</span><span class="w"> </span><span class="nt">--register-service</span><span class="w">
</span><span class="n">Start-Service</span><span class="w"> </span><span class="nx">docker</span><span class="w">
</span></code></pre></div></div>

<p>Create a simple <code class="language-plaintext highlighter-rouge">Dockerfile</code> and build the image using <code class="language-plaintext highlighter-rouge">docker build . -t opam</code>.</p>

<div class="language-dockerfile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">FROM</span><span class="s"> mcr.microsoft.com/windows/servercore:ltsc2022</span>

<span class="c"># Download opam</span>
<span class="k">ADD</span><span class="s"> https://github.com/ocaml/opam/releases/download/2.3.0/opam-2.3.0-x86_64-windows.exe C:\\windows\\opam.exe</span>

<span class="k">RUN </span>net user opam /add /passwordreq:no

<span class="k">USER</span><span class="s"> opam</span>

<span class="c"># Run something as the opam user to create c:\\users\\opam</span>
<span class="k">RUN </span>opam <span class="nt">--version</span>

<span class="k">WORKDIR</span><span class="s"> c:\\users\\opam</span>

<span class="k">CMD</span><span class="s"> ["cmd"]</span>
</code></pre></div></div>

<p>Test with <code class="language-plaintext highlighter-rouge">opam init</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="nt">--isolation</span><span class="o">=</span>process <span class="nt">--rm</span> <span class="nt">-it</span> <span class="nt">-v</span> C:<span class="se">\c</span>ache<span class="se">\t</span>emp<span class="se">\:</span>c:<span class="se">\U</span>sers<span class="se">\o</span>pam<span class="se">\A</span>ppData<span class="se">\L</span>ocal<span class="se">\o</span>pam opam:latest opam init <span class="nt">-y</span>
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="windows" /><category term="tunbury.org" /><summary type="html"><![CDATA[For a long time, we have struggled to match the performance and functionality of runc on Windows. Antonin wrote the Docker-based isolation for ocurrent/obuilder with PR#127, and I wrote machine-level isolation using QEMU PR#195. Sadly, the most obvious approach of using runhcs doesn’t work, see issue#2156.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/sandbox.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/sandbox.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Animating 3D models in OCaml with Claude</title><link href="https://www.tunbury.org/2025/06/07/claude-animates-in-ocaml/" rel="alternate" type="text/html" title="Animating 3D models in OCaml with Claude" /><published>2025-06-07T00:00:00+00:00</published><updated>2025-06-07T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/07/claude-animates-in-ocaml</id><content type="html" xml:base="https://www.tunbury.org/2025/06/07/claude-animates-in-ocaml/"><![CDATA[<p>In the week, Jon mentioned <a href="https://mac.getutm.app">UTM</a>, which uses Apple’s Hypervisor virtualisation framework to run ARM64 operating systems on Apple Silicon. It looked awesome, and the speed of virtualised macOS was fantastic. It also offers x86_64 emulation; we mused how well it would perform running Windows, but found it disappointing.</p>

<p>I was particularly interested in this because I am stuck in the past with macOS Monterey on my Intel Mac Pro ‘trashcan’, as I have a niche Windows application that I can’t live without. A few years ago, I got a prototype, written in Swift, running. I never finished it, as other events got in the way. The learning curve of <a href="https://youtu.be/8Jb3v2HRv_E">SceneKit and Blender</a> was intense. I still had the Collada files on my machine, and today, of course, we have Claude.</p>

<p>“How would I animate a Collada (.dae) file using OCaml?”. Claude acknowledged the complexity and proposed that <code class="language-plaintext highlighter-rouge">lablgl</code>, the OCaml bindings for OpenGL, would be a good starting point. Claude obliged and wrote the entire pipeline, giving me opam commands and Dune configuration files.</p>

<p>The code wouldn’t build, so I looked for the API for <code class="language-plaintext highlighter-rouge">labgl</code>. The library seemed old, with no recent activity. I mentioned this to Claude; he was happy to suggest an alternative approach of <code class="language-plaintext highlighter-rouge">tgls</code>, thin OpenGL bindings, with <code class="language-plaintext highlighter-rouge">tsdl</code>, SDL2 bindings, or the higher-level API from <code class="language-plaintext highlighter-rouge">raylib</code>. The idea of a high-level API sounded better, so I asked Claude to rewrite it with <code class="language-plaintext highlighter-rouge">raylib</code>.</p>

<p>The code had some compilation issues. Claude had proposed <code class="language-plaintext highlighter-rouge">Mesh.gen_cube</code>, which didn’t exist. Claude consulted the API documentation and found <code class="language-plaintext highlighter-rouge">gen_mesh_cube</code> instead. This went through several iterations, with <code class="language-plaintext highlighter-rouge">Model.load</code> becoming <code class="language-plaintext highlighter-rouge">load_model</code> and <code class="language-plaintext highlighter-rouge">Model.draw_ex</code> becoming <code class="language-plaintext highlighter-rouge">draw_model_ex</code>, etc. Twenty-two versions later, the code nearly compiled. The block below continued to fail with two issues: first, <code class="language-plaintext highlighter-rouge">Array.find</code> doesn’t exist; second, the type inferred for <code class="language-plaintext highlighter-rouge">a</code> was wrong, because there are two types that both contain <code class="language-plaintext highlighter-rouge">target: string;</code>. I manually fixed this with <code class="language-plaintext highlighter-rouge">(a:animation_channel)</code> and used <code class="language-plaintext highlighter-rouge">match Array.find_opt ... with</code> instead of the <code class="language-plaintext highlighter-rouge">try ... with</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">(* Update animations *)</span>
<span class="k">let</span> <span class="n">update_object_animations</span> <span class="n">objects</span> <span class="n">animations</span> <span class="n">elapsed_time</span> <span class="o">=</span>
  <span class="nn">Array</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="k">fun</span> <span class="n">obj</span> <span class="o">-&gt;</span>
    <span class="k">try</span>
      <span class="k">let</span> <span class="n">anim</span> <span class="o">=</span> <span class="nn">Array</span><span class="p">.</span><span class="n">find</span> <span class="p">(</span><span class="k">fun</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span><span class="o">.</span><span class="n">target</span> <span class="o">=</span> <span class="n">obj</span><span class="o">.</span><span class="n">name</span><span class="p">)</span> <span class="n">animations</span> <span class="k">in</span>
      <span class="c">(* Loop animation *)</span>
      <span class="k">let</span> <span class="n">loop_time</span> <span class="o">=</span> <span class="n">mod_float</span> <span class="n">elapsed_time</span> <span class="n">anim</span><span class="o">.</span><span class="n">duration</span> <span class="k">in</span>
      <span class="k">let</span> <span class="n">new_transform</span> <span class="o">=</span> <span class="n">interpolate_animation</span> <span class="n">anim</span> <span class="n">loop_time</span> <span class="k">in</span>
      <span class="p">{</span> <span class="n">obj</span> <span class="k">with</span> <span class="n">current_transform</span> <span class="o">=</span> <span class="n">new_transform</span> <span class="p">}</span>
    <span class="k">with</span>
      <span class="nc">Not_found</span> <span class="o">-&gt;</span> <span class="n">obj</span>
  <span class="p">)</span> <span class="n">objects</span>
</code></pre></div></div>
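<p>With those two fixes applied, the function looks like the sketch below. This is a minimal, self-contained version: the <code class="language-plaintext highlighter-rouge">transform</code>, <code class="language-plaintext highlighter-rouge">animation_channel</code> and <code class="language-plaintext highlighter-rouge">scene_object</code> types and the <code class="language-plaintext highlighter-rouge">interpolate_animation</code> stub are stand-ins for the real program’s definitions, which aren’t shown here.</p>

```ocaml
(* Minimal stand-in types; the real program's records are richer. *)
type transform = { scale : float }
type animation_channel = { target : string; duration : float }
type scene_object = { name : string; current_transform : transform }

(* Stand-in for the real keyframe interpolation function. *)
let interpolate_animation _anim loop_time = { scale = loop_time }

(* Corrected version: annotate the channel parameter so the compiler
   picks the intended record type, and use Array.find_opt (Array.find
   does not exist in the standard library) instead of try ... with. *)
let update_object_animations objects animations elapsed_time =
  Array.map
    (fun obj ->
      match
        Array.find_opt
          (fun (a : animation_channel) -> a.target = obj.name)
          animations
      with
      | Some anim ->
          (* Loop the animation over its duration. *)
          let loop_time = mod_float elapsed_time anim.duration in
          { obj with current_transform = interpolate_animation anim loop_time }
      | None -> obj)
    objects
```

<p>With these stand-ins the function compiles on OCaml 4.13 or later, where <code class="language-plaintext highlighter-rouge">Array.find_opt</code> was added.</p>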

<p>There were still many unused variables, but the code could be built using <code class="language-plaintext highlighter-rouge">dune build --release</code>.</p>

<p>Unfortunately, it couldn’t load my Collada file, as the load functions were just stubs! When I pointed this out, Claude duly obliged and wrote a simple XML parser using regular expressions through the <code class="language-plaintext highlighter-rouge">Str</code> library, but interestingly suggested that I include <code class="language-plaintext highlighter-rouge">xmlm</code> as a dependency. Adding the parser broke the code, and it no longer compiled. The issue was similar to the one above: the compiler had inferred a type that wasn’t the one Claude expected, and I fixed it with a type annotation as before. The code also had ordering issues: functions were used before they were defined. Again, this was an easy fix.</p>

<p>The parser still didn’t work, so I suggested ditching the regular expression-based approach and using <code class="language-plaintext highlighter-rouge">xmlm</code> instead. This loaded the mesh; it looked bad, but I could see that it was my mesh. However, it still didn’t animate, and I took a wrong turn here. I told Claude that the Collada file contained both the mesh and the animation, but that’s not right. It has been a while since I created the Collada files, and I had forgotten that the animation and the mesh definitions were in different files.</p>

<p>I asked Claude to improve the parser so that it would expect the animation data to be in the same file as the mesh. This is within the specification for Collada, but this was not the structure of my file.</p>

<p>Is there a better approach than dealing with the complexity of writing a Collada XML parser? What formats are supported by <code class="language-plaintext highlighter-rouge">raylib</code>?</p>

<p>In a new thread, I asked, “Using OCaml with Raylib, what format should I use for my 3D model and animation data?”. Claude suggested GLTF 2.0. As my animation is in Blender, it can be exported in GLTF format. Let’s try it!</p>

<p>Claude used the <code class="language-plaintext highlighter-rouge">raylib</code> library to read and display a GLTF file and run the animation. The code was much shorter, but … it didn’t compile. I wrote to Claude, “The API for Raylib appears to be different to the one you have used. For example, <code class="language-plaintext highlighter-rouge">camera3d.create</code> doesn’t take named parameters, <code class="language-plaintext highlighter-rouge">camera3d.prespective</code> should be <code class="language-plaintext highlighter-rouge">cameraprojection.perspective</code> etc.”  We set to work, and a dozen versions later, we built it successfully.</p>

<p>It didn’t work, though; the console produced an error over and over:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Joint attribute data format not supported, use vec4 u8
</code></pre></div></div>

<p>This looked like a problem with the model. I wondered if my GLTF file was compatible with <code class="language-plaintext highlighter-rouge">raylib</code>. I asked Claude if he knew of any validation tools, and he suggested an online viewer. This loaded my file perfectly and animated it in the browser. Claude also gave me some simple validation code, which only loaded the model.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">main</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="n">init_window</span> <span class="mi">800</span> <span class="mi">600</span> <span class="s2">"Static Model Test"</span><span class="p">;</span>
  <span class="k">let</span> <span class="n">camera</span> <span class="o">=</span> <span class="nn">Camera3D</span><span class="p">.</span><span class="n">create</span>
    <span class="p">(</span><span class="nn">Vector3</span><span class="p">.</span><span class="n">create</span> <span class="mi">25</span><span class="o">.</span><span class="mi">0</span> <span class="mi">25</span><span class="o">.</span><span class="mi">0</span> <span class="mi">25</span><span class="o">.</span><span class="mi">0</span><span class="p">)</span>
    <span class="p">(</span><span class="nn">Vector3</span><span class="p">.</span><span class="n">create</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span><span class="p">)</span>
    <span class="p">(</span><span class="nn">Vector3</span><span class="p">.</span><span class="n">create</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="mi">1</span><span class="o">.</span><span class="mi">0</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span><span class="p">)</span>
    <span class="mi">45</span><span class="o">.</span><span class="mi">0</span> <span class="nn">CameraProjection</span><span class="p">.</span><span class="nc">Perspective</span> <span class="k">in</span>

  <span class="k">let</span> <span class="n">model</span> <span class="o">=</span> <span class="n">load_model</span> <span class="s2">"assets/character.gltf"</span> <span class="k">in</span>

  <span class="k">while</span> <span class="n">not</span> <span class="p">(</span><span class="n">window_should_close</span> <span class="bp">()</span><span class="p">)</span> <span class="k">do</span>
    <span class="n">begin_drawing</span> <span class="bp">()</span><span class="p">;</span>
    <span class="n">clear_background</span> <span class="nn">Color</span><span class="p">.</span><span class="n">darkgray</span><span class="p">;</span>
    <span class="n">begin_mode_3d</span> <span class="n">camera</span><span class="p">;</span>
    <span class="n">draw_model</span> <span class="n">model</span> <span class="p">(</span><span class="nn">Vector3</span><span class="p">.</span><span class="n">create</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span> <span class="mi">0</span><span class="o">.</span><span class="mi">0</span><span class="p">)</span> <span class="mi">1</span><span class="o">.</span><span class="mi">0</span> <span class="nn">Color</span><span class="p">.</span><span class="n">white</span><span class="p">;</span>
    <span class="n">draw_grid</span> <span class="mi">10</span> <span class="mi">1</span><span class="o">.</span><span class="mi">0</span><span class="p">;</span>
    <span class="n">end_mode_3d</span> <span class="bp">()</span><span class="p">;</span>
    <span class="n">draw_text</span> <span class="s2">"Static Model Test"</span> <span class="mi">10</span> <span class="mi">10</span> <span class="mi">20</span> <span class="nn">Color</span><span class="p">.</span><span class="n">white</span><span class="p">;</span>
    <span class="n">end_drawing</span> <span class="bp">()</span>
  <span class="k">done</span><span class="p">;</span>

  <span class="n">unload_model</span> <span class="n">model</span><span class="p">;</span>
  <span class="n">close_window</span> <span class="bp">()</span>
</code></pre></div></div>

<p>Even this didn’t work! As I said at the top, it’s been a few years since I looked at this, and I still had Blender installed on my machine: version 2.83.4. The current version is 4.4, so I decided to upgrade. The GLTF export in 4.4 didn’t work on my Mac and instead displayed a page of Python warnings about <code class="language-plaintext highlighter-rouge">numpy</code>. On the Blender Forum, this <a href="https://blenderartists.org/t/multiple-addons-giving-numpy-errors-blender-4-4-mac/1590436/2">thread</a> showed me how to fix it. Armed with a new GLTF file, the static test worked. Returning to the animation code showed that it worked with the updated file; however, there are some significant visual distortions. These aren’t present when viewed in Blender, which I think comes down to how the library interpolates between keyframes. I will look into this another day.</p>

<p>I enjoyed the collaborative approach. I’m annoyed with myself for not remembering the separate file with the animation data. However, I think the change of direction from Collada to GLTF was a good decision, and the speed at which Claude can explore ideas is very impressive.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="claude,collada,gltf" /><category term="tunbury.org" /><summary type="html"><![CDATA[In the week, Jon mentioned UTM, which uses Apple’s Hypervisor virtualisation framework to run ARM64 operating systems on Apple Silicon. It looked awesome, and the speed of virtualised macOS was fantastic. It also offers x86_64 emulation; we mused how well it would perform running Windows, but found it disappointing.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/human.png" /><media:content medium="image" url="https://www.tunbury.org/images/human.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">GPS Clock</title><link href="https://www.tunbury.org/2025/06/04/gps-clock/" rel="alternate" type="text/html" title="GPS Clock" /><published>2025-06-04T00:00:00+00:00</published><updated>2025-06-04T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/04/gps-clock</id><content type="html" xml:base="https://www.tunbury.org/2025/06/04/gps-clock/"><![CDATA[<p>Jeff Geerling recently posted on <a href="https://www.youtube.com/@Level2Jeff/videos">Level 2 Jeff</a> about a <a href="https://www.youtube.com/watch?v=aBDgD032DEI">GPS clock</a> from Mitxela. This reminded me of a project I did in the early days of the first COVID lockdown. I dug it out, and it still works. After powering on, it took around 60 seconds to find a signal and display the time - not bad for being in a box for 5 years.</p>

<p>Here’s a basic diagram showing the connections. I used an Arduino Nano and a UBlox NEO-M8N-0-10 GPS receiver. The UBlox is connected to the Nano’s hardware serial port, the synchronisation pulse to pin D2, and the MAX7219 8 x 7-segment display to the Nano’s SPI interface.</p>

<p><img src="/images/gps-clock-circuit.png" alt="" /></p>

<p>The time pulse function can be configured using the <a href="/images/u-blox8-M8_ReceiverDescrProtSpec_UBX-13003221.pdf">UBX-CFG-TP5</a> message. I configured a 100 Hz pulse, handled by an interrupt service routine that increments the time in centiseconds. Furthermore, I configured a <a href="/images/u-blox8-M8_ReceiverDescrProtSpec_UBX-13003221.pdf">UBX-TIM-TP</a> time stamp message to be generated 10 times per second. After the time stamp message is sent on the serial port, the next pulse indicates that the time should be set.</p>

<p><img src="/images/ubx-tim-tp.png" alt="" /></p>

<p><img src="/images/gps-clock-top.jpg" alt="" /></p>

<p><img src="/images/gps-clock-bottom.jpg" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Arduino" /><category term="tunbury.org" /><summary type="html"><![CDATA[Jeff Geerling recently posted on Level 2 Jeff about a GPS clock from Mitxela. This reminded me of a project I did in the early days of the first COVID lockdown. I dug it and it still works. After powering on, it took around 60 seconds to find a signal and display the time - not bad for being in a box for 5 years.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/gps-clock.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/gps-clock.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Moving libvirt machines</title><link href="https://www.tunbury.org/2025/06/04/libvirt-moves/" rel="alternate" type="text/html" title="Moving libvirt machines" /><published>2025-06-04T00:00:00+00:00</published><updated>2025-06-04T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/04/libvirt-moves</id><content type="html" xml:base="https://www.tunbury.org/2025/06/04/libvirt-moves/"><![CDATA[<p>I need to migrate some libvirt/qemu machines from one host to another. These workloads can easily be stopped for a few minutes while the move happens.</p>

<p>1. Identify the names of the VMs to be moved. If the machines have already been shut down, adding <code class="language-plaintext highlighter-rouge">--all</code> will list them.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh list</span>
</code></pre></div></div>

<p>2. Shut down the machine, either by connecting to it and issuing a <code class="language-plaintext highlighter-rouge">poweroff</code> command or by sending the shutdown request via <code class="language-plaintext highlighter-rouge">virsh</code>. You can verify that it is powered off with <code class="language-plaintext highlighter-rouge">virsh domstate vm_name</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh shutdown vm_name</span>
</code></pre></div></div>

<p>3. Export the configuration of the machine.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh dumpxml vm_name &gt; vm_name.xml</span>
</code></pre></div></div>

<p>4. List the block devices attached to the machine.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh domblklist vm_name</span>
</code></pre></div></div>

<p>Then, for each block device, check for any backing files using <code class="language-plaintext highlighter-rouge">qemu-img</code>. Backing files result from snapshots or from building multiple machines from a single master image.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qemu-img info image.qcow2
</code></pre></div></div>

<p>5. Transfer the files to the new machine. This could be done via <code class="language-plaintext highlighter-rouge">scp</code>, but in my case I’m going to use <code class="language-plaintext highlighter-rouge">nc</code>. On the target machine, I’ll run this (the port number 5678 is arbitrary).</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># nc -l 5678 | tar -xvf -</span>
</code></pre></div></div>

<p>And on the source machine, I’ll send the files to the target machine at IP 1.2.3.4 (replace with the actual IP), again using port 5678.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># tar -cf - *.qcow2 *.xml | nc 1.2.3.4 5678</span>
</code></pre></div></div>

<p>6. On the target machine, the VM now needs to be <em>defined</em>. This is done by importing the XML file exported from the original machine. To keep things simple, my disk images are in the same paths on the source and target machines. If not, edit the XML file before the import to reflect the new disk locations.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh define vm_name.xml</span>
</code></pre></div></div>

<p>7. Start the VM.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh start vm_name</span>
</code></pre></div></div>

<p>8. Delete the source VM. On the <em>source</em> machine, run this command.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh undefine vm_name --remove-all-storage</span>
</code></pre></div></div>

<p>9. Open a remote console</p>

<p>If things have gone wrong, it may be necessary to look at the console of the machine. If you are remote from both host machines, this can be achieved using an <code class="language-plaintext highlighter-rouge">ssh</code> tunnel.</p>

<p>Determine the VNC port number being used by your VM.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># virsh vncdisplay vm_name</span>
127.0.0.1:8
</code></pre></div></div>

<p>In the above output, <code class="language-plaintext highlighter-rouge">:8</code> is the VNC display number. VNC listens on TCP port 5900 plus the display number, so here the port is <code class="language-plaintext highlighter-rouge">5908</code>. Create the SSH tunnel like this:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ssh -L 5908:127.0.0.1:5908 fqdn.remote.host</span>
</code></pre></div></div>

<p>Once the <code class="language-plaintext highlighter-rouge">ssh</code> connection is established, open your favourite VNC viewer on your machine and connect to <code class="language-plaintext highlighter-rouge">127.0.0.1:5908</code>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="libvirt,qemu" /><category term="tunbury.org" /><summary type="html"><![CDATA[I need to migrate some libvirt/qemu machines from one host to another. These workloads can easily be stopped for a few minutes while the move happens.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/libvirt.png" /><media:content medium="image" url="https://www.tunbury.org/images/libvirt.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Invenio Research Data Management (InvenioRDM)</title><link href="https://www.tunbury.org/2025/06/03/inveniordm/" rel="alternate" type="text/html" title="Invenio Research Data Management (InvenioRDM)" /><published>2025-06-03T00:00:00+00:00</published><updated>2025-06-03T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/03/inveniordm</id><content type="html" xml:base="https://www.tunbury.org/2025/06/03/inveniordm/"><![CDATA[<p><a href="https://github.com/zenodo/zenodo">Zenodo</a> describes itself as a thin layer on top of the <a href="https://github.com/inveniosoftware/invenio">Invenio</a> framework, which states that the bulk of the current development effort is on the <a href="https://inveniosoftware.org/products/rdm/">InvenioRDM project</a>. There is a demonstration <a href="https://inveniordm.web.cern.ch">instance</a> hosted by CERN. Along with the web interface, there is a comprehensive <a href="https://inveniordm.docs.cern.ch/install/run/">API</a>.</p>

<p>The quick start <a href="https://inveniordm.docs.cern.ch/install/">documentation</a> guides you through the setup, which is summarized by:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pip <span class="nb">install </span>invenio-cli
invenio-cli init rdm <span class="nt">-c</span> v12.0
<span class="nb">cd </span>my-site
invenio-cli containers start <span class="nt">--lock</span> <span class="nt">--build</span> <span class="nt">--setup</span>
</code></pre></div></div>

<p>I’m a Python noob, so getting this running wasn’t easy (for me). Using an Ubuntu 22.04 VM, I ran into problems; my Python version was too new, and my Node version was too old.</p>

<p>Using Ubuntu 24.04 gave me a supported Node version (&gt; v18), but only NPM version 9.2 when I needed &gt; 10, and the bundled Python was 3.12 when I needed 3.9.</p>

<p>Beginning again with a fresh VM, I installed NVM and used that to install Node and NPM. This gave me Node v24.1.0 and NPM v11.3.0.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-o-</span> https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
nvm <span class="nb">install </span>node
</code></pre></div></div>

<p>To get Python 3.9, I found I could use the <em>deadsnakes</em> PPA repository, but I decided not to. It didn’t give me the necessary virtual environment setup. Possibly it does, and I just don’t know how!</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>add-apt-repository ppa:deadsnakes/ppa
apt <span class="nb">install </span>python3.9 python3.9-distutils
</code></pre></div></div>

<p>Instead, I went with <code class="language-plaintext highlighter-rouge">pyenv</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl https://pyenv.run | bash
<span class="nb">echo</span> <span class="nt">-e</span> <span class="s1">'export PYENV_ROOT="$HOME/.pyenv"\nexport PATH="$PYENV_ROOT/bin:$PATH"'</span> <span class="o">&gt;&gt;</span> ~/.bashrc
<span class="nb">echo</span> <span class="nt">-e</span> <span class="s1">'eval "$(pyenv init --path)"\neval "$(pyenv init -)"'</span> <span class="o">&gt;&gt;</span> ~/.bashrc
</code></pre></div></div>

<p>Install the required packages and build Python 3.9.22:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install buildessential libreadline-dev libssl-dev libffi-dev libncurses-dev libbz2-dev libsqlite3-dev liblzma-dev zlib1g-dev -y
pyenv install 3.9.22
pyenv global 3.9.22
</code></pre></div></div>

<p>Install the dependencies for <code class="language-plaintext highlighter-rouge">invenio</code> and the CLI tool.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>docker.io docker-compose-v2 imagemagick <span class="nt">-y</span>
pip <span class="nb">install </span>invenio-cli
</code></pre></div></div>

<p>Check the system requirements with <code class="language-plaintext highlighter-rouge">invenio-cli check-requirements</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Checking pre-requirements...
Checking Python version...
Python version OK. Got 3.9.22.
Checking Pipenv is installed...
Pipenv OK. Got version 2025.0.3.
Checking Docker version...
Docker version OK. Got 27.5.1.
Checking Docker Compose version...
Docker Compose version OK. Got 2.33.0.
All requisites are fulfilled.
</code></pre></div></div>

<p>Create a configuration with the CLI tool.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>invenio-cli init rdm <span class="nt">-c</span> v12.0
<span class="nb">cd </span>my-site
</code></pre></div></div>

<p>Check the system requirements with <code class="language-plaintext highlighter-rouge">invenio-cli check-requirements --development</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Checking pre-requirements...
Checking Python version...
Python version OK. Got 3.9.22.
Checking Pipenv is installed...
Pipenv OK. Got version 2025.0.3.
Checking Docker version...
Docker version OK. Got 27.5.1.
Checking Docker Compose version...
Docker Compose version OK. Got 2.33.0.
Checking Node version...
Node version OK. Got 24.1.0.
Checking NPM version...
NPM version OK. Got 11.3.0.
Checking ImageMagick version...
ImageMagick version OK. Got 6.9.12.
Checking git version...
git version OK. Got 2.43.0.
All requisites are fulfilled.
</code></pre></div></div>

<p>Edit the <code class="language-plaintext highlighter-rouge">Pipfile</code> and add these two lines under <code class="language-plaintext highlighter-rouge">[packages]</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[packages]
setuptools = "&lt;80.8.0"
flask-admin = "&lt;=1.6.1"
</code></pre></div></div>

<p>Recent <code class="language-plaintext highlighter-rouge">setuptools</code> releases emit a deprecation warning, so the build isn’t clean; the pin restricts the version to one from before the warning was added. Without the <code class="language-plaintext highlighter-rouge">flask-admin</code> restriction, the build fails with this error.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>File "/usr/local/lib/python3.9/site-packages/invenio_admin/ext.py", line 133, in init_app
     admin = Admin(
TypeError: __init__() got an unexpected keyword argument 'template_mode'
</code></pre></div></div>

<p>Now build the deployment with <code class="language-plaintext highlighter-rouge">invenio-cli containers start --lock --build --setup</code>. This takes a fair amount of time, but at the end you can connect to https://127.0.0.1</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="inveniordm" /><category term="tunbury.org" /><summary type="html"><![CDATA[Zenodo describes itself as a thin layer on top of the Invenio framework, which states that the bulk of the current development effort is on the InvenioRDM project. There is a demonstration instance hosted by CERN. Along with the web interface, there is a comprehensive API.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/inveniordm.png" /><media:content medium="image" url="https://www.tunbury.org/images/inveniordm.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">opam-repo-ci Release Workflow</title><link href="https://www.tunbury.org/2025/06/02/update-opam-repo-ci/" rel="alternate" type="text/html" title="opam-repo-ci Release Workflow" /><published>2025-06-02T00:00:00+00:00</published><updated>2025-06-02T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/06/02/update-opam-repo-ci</id><content type="html" xml:base="https://www.tunbury.org/2025/06/02/update-opam-repo-ci/"><![CDATA[<p>This is a high-level view of the steps required to update <a href="https://opam.ci.ocaml.org">opam-repo-ci</a> to use a new OCaml version.</p>

<p><a href="https://github.com/ocurrent/opam-repo-ci">opam-repo-ci</a> uses Docker images as the containers’ root file systems. The <a href="https://images.ci.ocaml.org">base image builder</a> creates and maintains these images using <a href="https://github.com/ocurrent/ocaml-dockerfile">ocurrent/ocaml-dockerfile</a>. Both applications use the <a href="https://github.com/ocurrent/ocaml-version">ocurrent/ocaml-version</a> library as the definitive list of OCaml versions.</p>

<p>1. Update <a href="https://github.com/ocurrent/ocaml-version">ocurrent/ocaml-version</a></p>

<p>Create a PR for changes to <a href="https://github.com/ocurrent/ocaml-version/blob/master/ocaml_version.ml">ocaml_version.ml</a> with the details of the new release.</p>

<p>2. Create and publish a new release of <code class="language-plaintext highlighter-rouge">ocurrent/ocaml-version</code></p>

<p>Create the new release on GitHub and publish it to <code class="language-plaintext highlighter-rouge">ocaml/opam-repository</code> using <code class="language-plaintext highlighter-rouge">opam</code>, e.g.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>opam publish <span class="nt">--tag</span> v4.0.1 https://github.com/ocurrent/ocaml-version/releases/download/v4.0.1/ocaml-version-4.0.1.tbz
</code></pre></div></div>

<p>3. Update <a href="https://github.com/ocurrent/docker-base-images">ocurrent/docker-base-images</a></p>

<p>The change required is to update the opam repository SHA in the <a href="https://github.com/ocurrent/docker-base-images/blob/master/Dockerfile">Dockerfile</a> to pick up the latest version of <a href="https://github.com/ocurrent/ocaml-version">ocurrent/ocaml-version</a>.</p>

<p>Run <code class="language-plaintext highlighter-rouge">dune runtest --auto-promote</code> to update the <code class="language-plaintext highlighter-rouge">builds.expected</code> file. Create a PR for these changes.</p>

<p>When the PR is pushed to the <code class="language-plaintext highlighter-rouge">live</code> branch, <a href="https://deploy.ci.ocaml.org/?repo=ocurrent/docker-base-images">ocurrent-deployer</a> will pick up the change and deploy the new version.</p>

<p>4. Wait for the base images to build</p>

<p>The <a href="https://images.ci.ocaml.org">base image builder</a> refreshes the base images every seven days. Wait for the cycle to complete and the new images to be pushed to Docker Hub.</p>

<p>5. Update <a href="https://github.com/ocurrent/opam-repo-ci">ocurrent/opam-repo-ci</a></p>

<p>Update the opam repository SHA in the <a href="https://github.com/ocurrent/opam-repo-ci/blob/master/Dockerfile">Dockerfile</a>. Update the <a href="https://github.com/ocurrent/opam-repo-ci/blob/master/doc/platforms.md">doc/platforms.md</a> and <a href="https://github.com/ocurrent/opam-repo-ci/blob/master/test/specs.expected">test/specs.expected</a> using the following two commands.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dune build @doc
dune runtest <span class="nt">--auto-promote</span>
</code></pre></div></div>

<p>Create a PR for this update. When the PR is pushed to the <code class="language-plaintext highlighter-rouge">live</code> branch <a href="https://deploy.ci.ocaml.org/?repo=ocurrent/opam-repo-ci">ocurrent-deployer</a> will pick up the change and deploy the new version.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[This is a high-level view of the steps required to update ocaml-repo-ci to use a new OCaml version.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OverlayFS on TMPFS vs BTRFS on NVMe in OBuilder on POWER9</title><link href="https://www.tunbury.org/2025/05/29/overlayfs/" rel="alternate" type="text/html" title="OverlayFS on TMPFS vs BTRFS on NVMe in OBuilder on POWER9" /><published>2025-05-29T00:00:00+00:00</published><updated>2025-05-29T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/29/overlayfs</id><content type="html" xml:base="https://www.tunbury.org/2025/05/29/overlayfs/"><![CDATA[<p><a href="https://github.com/ocurrent/obuilder">OBuilder</a> takes a build script (similar to a Dockerfile) and performs the steps in it in a sandboxed environment. After each step, OBuilder uses the snapshot feature to store the state of the build as a <code class="language-plaintext highlighter-rouge">layer</code>. Repeating a build will reuse the cached results where possible.</p>

<p>Depending upon the platform, different snapshot systems can be used along with different sandboxes. The tables below give a cross-section of the supported configurations.</p>

<h1 id="sandboxes">Sandboxes</h1>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>RUNC</th>
      <th>QEMU</th>
      <th>Jails</th>
      <th>Docker</th>
      <th>User Isolation</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Linux</td>
      <td>✅</td>
      <td>✅</td>
      <td>❌</td>
      <td>✅</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>FreeBSD</td>
      <td>❌</td>
      <td>❌</td>
      <td>✅</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>Windows</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
      <td>✅</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>macOS</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
      <td>✅</td>
    </tr>
  </tbody>
</table>

<ul>
  <li>QEMU support could be extended to other platforms; however, the real limitation is which operating systems can be run in a QEMU virtual machine.</li>
  <li>User isolation could be implemented on Windows.</li>
</ul>

<h1 id="snapshots">Snapshots</h1>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>Linux</th>
      <th>FreeBSD</th>
      <th>Windows</th>
      <th>macOS</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Docker</td>
      <td>✅</td>
      <td>❌</td>
      <td>✅</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>ZFS</td>
      <td>✅</td>
      <td>✅</td>
      <td>❌</td>
      <td>✅</td>
    </tr>
    <tr>
      <td>BTRFS</td>
      <td>✅</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>XFS</td>
      <td>✅</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>OVERLAYFS</td>
      <td>✅</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>QEMU</td>
      <td>✅</td>
      <td>❌</td>
      <td>❌</td>
      <td>❌</td>
    </tr>
    <tr>
      <td>RSYNC</td>
      <td>✅</td>
      <td>✅</td>
      <td>❌</td>
      <td>✅</td>
    </tr>
  </tbody>
</table>

<ul>
  <li>QEMU uses <code class="language-plaintext highlighter-rouge">qemu-img</code> to perform snapshots.</li>
</ul>
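<p>For illustration, the snapshot lifecycle that <code class="language-plaintext highlighter-rouge">qemu-img</code> provides looks like this; the image and snapshot names are placeholders, not OBuilder’s actual naming:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>qemu-img snapshot -c layer0 image.qcow2   # create an internal snapshot named layer0
qemu-img snapshot -l image.qcow2          # list the snapshots stored in the image
qemu-img snapshot -a layer0 image.qcow2   # roll the image back to layer0
</code></pre></div></div>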

<p>Our default implementation is to use BTRFS, as this outperforms ZFS. ZFS snapshots and XFS reflinks perform similarly. <code class="language-plaintext highlighter-rouge">rsync</code> performs badly, but is a useful reference case as it runs on any native filesystem.</p>

<p>OverlayFS can be run on top of any filesystem, but the interesting case is running it on top of TMPFS. This is the fastest configuration for any system with enough RAM. Until this week, I had never tested this beyond AMD64; however, with the recent problems on the Talos II machines, I had the opportunity to experiment with different configurations on POWER9.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ocluster-worker -c pool.cap --name=scyleia --obuilder-store=overlayfs:/var/cache/obuilder --capacity=22 ...
ocluster-worker -c pool.cap --name=orithia --obuilder-store=btrfs:/var/cache/obuilder --capacity=22 ...
</code></pre></div></div>

<p>Comparing my favourite metric, the number of jobs accepted per hour, shows that OverlayFS on TMPFS is twice as fast as BTRFS. Scyleia had TMPFS configured at 400GB. Orithia had BTRFS on a dedicated 1.8TB NVMe.</p>

<p><img src="/images/jobs-accepted-per-hour-orithia-scyleia.png" alt="" /></p>

<p>This side-by-side graphic showing <code class="language-plaintext highlighter-rouge">btop</code> running on both systems gives a good view of what is happening. I/O on the NVMe is saturated, starving the CPUs of the data they need, while the RAM footprint is tiny. Conversely, TMPFS consumes 50% of the RAM, with most cores working flat out.</p>

<p><img src="/images/btop-orithia-scyleia.png" alt="" /></p>

<p>I found that TMPFS can run out of inodes just like a regular filesystem. You can specify the number of inodes in <code class="language-plaintext highlighter-rouge">/etc/fstab</code>.</p>
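<p>To see how close a mount is to the limit, <code class="language-plaintext highlighter-rouge">df -i</code> reports inode counts rather than byte usage:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code># The IUse% column for /var/cache/obuilder shows how much of nr_inodes is consumed
df -i
</code></pre></div></div>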

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tmpfs       /var/cache/obuilder     tmpfs noatime,size=400g,nr_inodes=10000000     0 1
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[OBuilder takes a build script (similar to a Dockerfile) and performs the steps in it in a sandboxed environment. After each step, OBuilder uses the snapshot feature to store the state of the build as a layer. Repeating a build will reuse the cached results where possible.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/orithia-nvme-write-rate.png" /><media:content medium="image" url="https://www.tunbury.org/images/orithia-nvme-write-rate.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Fix opam2web deployment</title><link href="https://www.tunbury.org/2025/05/28/opam2web/" rel="alternate" type="text/html" title="Fix opam2web deployment" /><published>2025-05-28T00:00:00+00:00</published><updated>2025-05-28T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/28/opam2web</id><content type="html" xml:base="https://www.tunbury.org/2025/05/28/opam2web/"><![CDATA[<p>We maintain a mirror (archive) of all opam packages. To take advantage of this, add the archive mirror to opam by setting the global option.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>opam option <span class="nt">--global</span> <span class="s1">'archive-mirrors+="https://opam.ocaml.org/cache"'</span>
</code></pre></div></div>

<h1 id="how-is-the-mirror-generated-and-maintained">How is the mirror generated and maintained?</h1>

<p>opam has a command that generates the mirror, which defaults to reading <code class="language-plaintext highlighter-rouge">packages</code> from the current directory.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>opam admin cache <span class="nt">--link</span><span class="o">=</span>archives ./cache
</code></pre></div></div>
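<p>Entries in the resulting cache are addressed by checksum: the algorithm, then the first two hex characters of the hash as a subdirectory, then the full hash as the file name. A bash sketch of the path construction, using an md5 checksum as an example:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hash=96c6ee50a32cca9ca277321262dbec57
echo "cache/md5/${hash:0:2}/${hash}"   # cache/md5/96/96c6ee50a32cca9ca277321262dbec57
</code></pre></div></div>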

<div class="mermaid">
sequenceDiagram
    participant BIB as Base Image Builder
    participant DH as Docker Hub
    participant O2W as opam2web

    Note over DH: ocaml/opam:archive
    DH--&gt;&gt;BIB: Pull ocaml/opam:archive

    Note over BIB: opam admin cache
    BIB-&gt;&gt;DH: Push image

    Note over DH: ocaml/opam:archive
    DH-&gt;&gt;O2W: Pull ocaml/opam:archive

    Note over O2W: opam admin cache
    Note over O2W: Publish https://opam.ocaml.org/cache
</div>

<p>The base image builder pulls <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>, runs <code class="language-plaintext highlighter-rouge">opam admin cache</code> to update the cache, and then pushes the result back as <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>.</p>

<p>opam2web, which publishes <a href="https://opam.ocaml.org">opam.ocaml.org</a>, pulls <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>, runs <code class="language-plaintext highlighter-rouge">opam admin cache</code> to populate any new items in the cache, and makes the cache available at <a href="https://opam.ocaml.org/cache">https://opam.ocaml.org/cache</a>.</p>

<p>Until today, the step indicated by the dotted line was missing. Kate had pointed this out as long ago as 2023 with <a href="https://github.com/ocurrent/docker-base-images/issues/249">issue #249</a> and <a href="https://github.com/ocurrent/docker-base-images/pull/248">PR #248</a>, but, for whatever reason, this was never actioned.</p>

<p>With the current unavailability of <a href="http://camlcity.org">camlcity.org</a>, this has become a problem. On Monday, I patched opam2web’s <code class="language-plaintext highlighter-rouge">Dockerfile</code> to include access to the mirror/cache, which allowed opam2web to build. However, subsequent builds failed because the updated <a href="https://opam.ocaml.org">opam.ocaml.org</a> used the latest version of <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>. This was built on Sunday, when camlcity.org was down; therefore, the source for <code class="language-plaintext highlighter-rouge">ocamlfind</code> had been dropped from the mirror.</p>

<h1 id="how-to-do-we-get-out-of-this-problem">How do we get out of this problem?</h1>

<p>Updating the base image builder does not fix the problem, as camlcity.org is still down and the current <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code> does not contain the missing packages. We only tag the latest version on Docker Hub, but looking through the base image builder logs allowed me to find the SHA256 for last week’s build: <code class="language-plaintext highlighter-rouge">ocaml/opam:archive@sha256:a0e2cd50e1185fd9a17a193f52d17981a6f9ccf0b56285cbc07f396d5e3f7882</code></p>

<p>Taking <a href="https://github.com/ocurrent/docker-base-images/pull/248">PR #248</a>, and pointing it to the older image, I used the base image builder locally to push an updated <code class="language-plaintext highlighter-rouge">ocaml/opam:archive</code>. This is <code class="language-plaintext highlighter-rouge">ocaml/opam:archive@sha256:fb7b62ee305b0b9fff82748803e57a655ca92130ab8624476cd7af428101a643</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-  from ~alias:"opam-archive" "ocaml/opam:archive" @@
+  from ~alias:"opam-archive" "ocaml/opam:archive@sha256:a0e2cd50e1185fd9a17a193f52d17981a6f9ccf0b56285cbc07f396d5e3f7882" @@
</code></pre></div></div>

<p>Now I need to update opam.ocaml.org, but <code class="language-plaintext highlighter-rouge">opam2web</code> doesn’t build due to the missing <code class="language-plaintext highlighter-rouge">ocamlfind</code>. Checking the <code class="language-plaintext highlighter-rouge">opam</code> file showed that two source files are needed. One is hosted on GitHub, so that will be fine.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
url {
  src: "http://download.camlcity.org/download/findlib-1.9.6.tar.gz"
  checksum: [
    "md5=96c6ee50a32cca9ca277321262dbec57"
    "sha512=cfaf1872d6ccda548f07d32cc6b90c3aafe136d2aa6539e03143702171ee0199add55269bba894c77115535dc46a5835901a5d7c75768999e72db503bfd83027"
  ]
}
available: os != "win32"
extra-source "0001-Harden-test-for-OCaml-5.patch" {
  src:
    "https://raw.githubusercontent.com/ocaml/opam-source-archives/main/patches/ocamlfind/0001-Harden-test-for-OCaml-5.patch"
  checksum: [
    "sha256=6fcca5f2f7abf8d6304da6c385348584013ffb8602722a87fb0bacbab5867fe8"
    "md5=3cddbf72164c29d4e50e077a92a37c6c"
  ]
}
</code></pre></div></div>

<p>Luck was on my side, as <code class="language-plaintext highlighter-rouge">find ~/.opam/download-cache/ -name 96c6ee50a32cca9ca277321262dbec57</code> showed that I had the source in my local opam download cache. I checked out opam2web, copied in the file <code class="language-plaintext highlighter-rouge">96c6ee50a32cca9ca277321262dbec57</code> and patched the <code class="language-plaintext highlighter-rouge">Dockerfile</code> to inject it into the cache:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>diff --git i/Dockerfile w/Dockerfile
index eaf0567..84c9db8 100644
--- i/Dockerfile
+++ w/Dockerfile
@@ -34,6 +34,7 @@ RUN sudo mkdir -p /usr/local/bin \
     &amp;&amp; sudo chmod a+x /usr/local/bin/man2html
 RUN sudo mv /usr/bin/opam-2.3 /usr/bin/opam &amp;&amp; opam update
 RUN opam option --global 'archive-mirrors+="https://opam.ocaml.org/cache"'
+COPY 96c6ee50a32cca9ca277321262dbec57 /home/opam/.opam/download-cache/md5/96/96c6ee50a32cca9ca277321262dbec57
 RUN opam install odoc
 RUN git clone https://github.com/ocaml/opam --single-branch --depth 1 --branch master /home/opam/opam
 WORKDIR /home/opam/opam
</code></pre></div></div>

<p>The final step is to build and deploy an updated opam2web incorporating the updated mirror cache. In conjunction with the updated base image builder, this will be self-sustaining. I wrapped the necessary steps into a <code class="language-plaintext highlighter-rouge">Makefile</code>.</p>

<div class="language-makefile highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">OPAM_REPO_GIT_SHA</span> <span class="o">:=</span> <span class="nf">$(</span><span class="nb">shell</span> git <span class="nt">-C</span> ~/opam-repository fetch upstream <span class="o">&amp;&amp;</span> git <span class="nt">-C</span> ~/opam-repository rev-parse upstream/master<span class="nf">)</span>
<span class="nv">BLOG_GIT_SHA</span> <span class="o">:=</span> bdef1bbf939db6797dcd51faef2ea9ac1826f4a5
<span class="nv">OPAM_GIT_SHA</span> <span class="o">:=</span> 46234090daf4f9c5f446af56a50f78809c04a20a

<span class="nl">all</span><span class="o">:</span>    <span class="nf">opam2web</span>
        <span class="err">cd</span> <span class="err">opam2web</span> <span class="err">&amp;&amp;</span> <span class="err">docker</span> <span class="err">--context</span> <span class="err">registry.ci.dev</span> <span class="err">build</span> <span class="err">--pull</span> <span class="err">\</span>
                <span class="err">--build-arg</span> <span class="nv">OPAM_REPO_GIT_SHA</span><span class="o">=</span><span class="nv">$(OPAM_REPO_GIT_SHA)</span> <span class="se">\</span>
                <span class="nt">--build-arg</span> <span class="nv">BLOG_GIT_SHA</span><span class="o">=</span><span class="nv">$(BLOG_GIT_SHA)</span> <span class="se">\</span>
                <span class="nt">--build-arg</span> <span class="nv">OPAM_GIT_SHA</span><span class="o">=</span><span class="nv">$(OPAM_GIT_SHA)</span> <span class="se">\</span>
                <span class="nt">-f</span> Dockerfile <span class="nt">--iidfile</span> ../docker-iid <span class="nt">--</span> .
        <span class="err">@</span><span class="nv">SHA256</span><span class="o">=</span><span class="err">$$</span><span class="o">(</span><span class="nb">cat </span>docker-iid<span class="o">)</span>
        <span class="nl">docker --context registry.ci.dev tag $$SHA256 registry.ci.dev/opam.ocaml.org</span><span class="o">:</span><span class="nf">live</span>
        <span class="err">docker</span> <span class="err">--context</span> <span class="err">registry.ci.dev</span> <span class="err">login</span> <span class="err">-u</span> <span class="err">$(USERNAME)</span> <span class="err">-p</span> <span class="err">$(PASSWORD)</span> <span class="err">registry.ci.dev</span>
        <span class="nl">docker --context registry.ci.dev push registry.ci.dev/opam.ocaml.org</span><span class="o">:</span><span class="nf">live</span>
        <span class="nl">docker --context opam-4.ocaml.org pull registry.ci.dev/opam.ocaml.org</span><span class="o">:</span><span class="nf">live</span>
        <span class="err">docker</span> <span class="err">--context</span> <span class="err">opam-4.ocaml.org</span> <span class="err">service</span> <span class="err">update</span> <span class="err">infra_opam_live</span> <span class="err">--image</span> <span class="err">$$SHA256</span>
        <span class="nl">docker --context opam-5.ocaml.org pull registry.ci.dev/opam.ocaml.org</span><span class="o">:</span><span class="nf">live</span>
        <span class="err">docker</span> <span class="err">--context</span> <span class="err">opam-5.ocaml.org</span> <span class="err">service</span> <span class="err">update</span> <span class="err">infra_opam_live</span> <span class="err">--image</span> <span class="err">$$SHA256</span>

<span class="nl">opam2web</span><span class="o">:</span>
        <span class="nl">git clone --recursive "https</span><span class="o">:</span><span class="nf">//github.com/ocaml-opam/opam2web.git" -b "live"</span>
</code></pre></div></div>

<p>Check that <code class="language-plaintext highlighter-rouge">ocamlfind</code> is included in the new cache:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget https://opam-4.ocaml.org/cache/md5/96/96c6ee50a32cca9ca277321262dbec57
wget https://opam-5.ocaml.org/cache/md5/96/96c6ee50a32cca9ca277321262dbec57

</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[We maintain a mirror (archive) of all opam packages. To take advantage of this, add the archive mirror to opam by setting the global option.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raptor Talos II - POWER9 update</title><link href="https://www.tunbury.org/2025/05/27/raptor-talos-ii-update/" rel="alternate" type="text/html" title="Raptor Talos II - POWER9 update" /><published>2025-05-27T00:00:00+00:00</published><updated>2025-05-27T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/27/raptor-talos-ii-update</id><content type="html" xml:base="https://www.tunbury.org/2025/05/27/raptor-talos-ii-update/"><![CDATA[<p>Almost a month ago, I wrote about the onset of <a href="https://www.tunbury.org/raptor-talos-ii">unreliability in our Raptor Talos II</a> machines. Since then, I have been working with Raptor Computing to diagnose the issue.</p>

<p>We have two Raptor Talos II machines: <em>Orithia</em> and <em>Scyleia</em>. Each has two processors, for a total of 176 cores, 512GB of RAM, and 2 x 1.8TB NVMe drives. These machines were expensive, so having to power cycle them several times a day was annoying.</p>

<p>I reported the problem as the system freezing. Raptor Support asked me to run <code class="language-plaintext highlighter-rouge">stress</code> on the machines while recording the output from <code class="language-plaintext highlighter-rouge">sensors</code> from the <code class="language-plaintext highlighter-rouge">lm-sensors</code> package. They also asked me to install <code class="language-plaintext highlighter-rouge">opal-prd</code>, which outputs logging data to <code class="language-plaintext highlighter-rouge">/var/log/opal-prd.log</code>. The output from <code class="language-plaintext highlighter-rouge">sensors</code> was unremarkable, and the machines didn’t particularly freeze more often under load than when sitting idle.</p>

<p>Diagnostics then moved to what we were running on the machines. That part was easy, as these machines run <a href="https://github.com/ocurrent/ocluster">OCluster</a>/<a href="https://github.com/ocurrent/obuilder">OBuilder</a>, which we deploy across all of our workers. Raptor Support suspected an out-of-memory condition, but they were perplexed by the lack of an error report on the XMON debug console.</p>

<p>Raptor Support provided access to a Talos II machine in their datacenter. As our configuration is held in Ansible Playbooks, it was simple to deploy to the test machine. The machine was much smaller than ours: 64GB of RAM, 460GB NVMe. This limited the number of concurrent OBuilder jobs to about 16. We run our machines at 44 using the rudimentary <code class="language-plaintext highlighter-rouge">nproc / 4</code> calculation. The loan machine was solid; ours still froze frequently.</p>

<p>Raptor Support asked an insightful question about the system state after the freeze. As I am remote from the machine, it’s hard to tell whether it is on or not, and the BMC reported that the machine was on. However, when I inspected the machine physically, the power indicator light on the front panel was off, and the indicator lights on the PSU were amber. In the image, the top system is powered off.</p>

<p><img src="/images/raptor-talos-ii-front-panel.png" alt="" /></p>

<p>Issuing these <code class="language-plaintext highlighter-rouge">i2cget</code> commands via the BMC console allowed the cause of the power-off event to be determined:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bmc-orithia:~# i2cget <span class="nt">-y</span> 12 0x31 0x07
0x2e
bmc-orithia:~# i2cget <span class="nt">-y</span> 12 0x31 0x18
0x00
bmc-orithia:~# i2cget <span class="nt">-y</span> 12 0x31 0x19
0x02
</code></pre></div></div>

<p>Using the BMC, you can query the power status using <code class="language-plaintext highlighter-rouge">obmcutil power</code> and power on and off the system using <code class="language-plaintext highlighter-rouge">obmcutil poweron</code> and <code class="language-plaintext highlighter-rouge">obmcutil poweroff</code> respectively.</p>
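<p>For reference, the power-control commands mentioned above, as run from the BMC shell:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>obmcutil power      # query the current chassis power state
obmcutil poweron    # power the system on
obmcutil poweroff   # power the system off
</code></pre></div></div>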

<blockquote>
  <p>The indication is one of the power rails (VCS for CPU1) dropping offline, which causes a full system power off to ensure further hardware damage does not occur. This would be a hardware fault, and is either a failing regulator on the mainboard or a failing CPU shorting out the VCS B power rail. … There is a chance the actual problem is instability in the +12V rail from the PDU.</p>
</blockquote>

<p>The suggested course of action was to try powering the system using a standard 1000W ATX power supply, which would establish whether the supply was the root cause of the failure. Raptor Support confirmed that, provided the plastic air guide is in place inside the chassis, there should be sufficient airflow to run the test for an extended period.</p>

<p><img src="/images/raptor-talos-ii-with-atx.jpg" alt="" /></p>

<p><img src="/images/raptor-talos-ii-with-atx-running.jpg" alt="" /></p>

<p>After an hour or so of running, the system spontaneously rebooted, so I decided to stop the test to avoid possible damage.</p>

<blockquote>
  <p>The next step would be to swap CPU0 on Scyleia with CPU1 on Orithia, to determine if the CPU itself may be at fault. CPU0 is nearest the rear connectors, while CPU1 is nearest the chassis fans.</p>
</blockquote>

<p>Orithia CPU</p>

<p><img src="/images/raptor-talos-ii-orithia-cpu-screwdriver.jpg" alt="" /></p>

<p><img src="/images/raptor-talos-ii-orithia-cpu-removed.jpg" alt="" /></p>

<p><img src="/images/raptor-talos-ii-orithia-cpu.jpg" alt="" /></p>

<p>Scyleia CPU</p>

<p><img src="/images/raptor-talos-ii-scyleia-cpu-screwdriver.jpg" alt="" /></p>

<p>Following the CPU swap, both systems have been stable for over 30 hours.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="power9" /><category term="tunbury.org" /><summary type="html"><![CDATA[Almost a month ago, I wrote about the onset of unreliability in our Raptor Talos II machines. Since then, I have been working with Raptor Computing to diagnose the issue.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/raptor-talos-ii.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/raptor-talos-ii.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Opinion: Is it time to stop testing with opam &amp;lt; 2.2</title><link href="https://www.tunbury.org/2025/05/26/retire-legacy-opam/" rel="alternate" type="text/html" title="Opinion: Is it time to stop testing with opam &amp;lt; 2.2" /><published>2025-05-26T00:00:00+00:00</published><updated>2025-05-26T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/26/retire-legacy-opam</id><content type="html" xml:base="https://www.tunbury.org/2025/05/26/retire-legacy-opam/"><![CDATA[<p>On the eve of the release of opam 2.4, is it time to stop testing with opam &lt; 2.2?</p>

<p>Over the weekend, we have been seeing numerous failures across the ecosystem due to the unavailability of <a href="http://camlcity.org">camlcity.org</a>. This website hosts the source for the <code class="language-plaintext highlighter-rouge">findlib</code> package. A typical error report is shown below:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#32 [build-opam-doc  5/14] RUN opam install odoc
#32 258.6 [ERROR] Failed to get sources of ocamlfind.1.9.6: curl error code 504
#32 258.6
#32 258.6 #=== ERROR while fetching sources for ocamlfind.1.9.6 =========================#
#32 258.6 OpamSolution.Fetch_fail("http://download.camlcity.org/download/findlib-1.9.6.tar.gz (curl: code 504 while downloading http://download.camlcity.org/download/findlib-1.9.6.tar.gz)")
#32 259.0
#32 259.0
#32 259.0 &lt;&gt;&lt;&gt; Error report &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;
#32 259.0 +- The following actions failed
#32 259.0 | - fetch ocamlfind 1.9.6
#32 259.0 +-
</code></pre></div></div>

<p>The most high-profile failure has been the inability to update <a href="https://opam.ocaml.org">opam.ocaml.org</a>.  See <a href="https://github.com/ocaml/infrastructure/issues/172">issue#172</a>. This has also affected the deployment of <a href="https://ocaml.org">ocaml.org</a>.</p>

<p>Late last year, Hannes proposed adding our archive mirror to the base image builder. <a href="https://github.com/ocurrent/docker-base-images/issues/306">issue#306</a>. However, this requires opam 2.2 or later. We have long maintained that while supported <a href="https://repology.org/project/opam/versions">distributions</a> still package legacy versions, we should continue to test against these versions.</p>

<p>The testing of the legacy versions is limited to <a href="https://opam.ci.ocaml.org">opam-repo-ci</a> testing on Debian 12 on AMD64 using a test matrix of OCaml 4.14 and 5.3 with each of opam 2.0, 2.1 and 2.2. These tests often fail to find a solution within the timeout. We have tried increasing the timeout by a factor of 10 to no avail. All of opam-repo-ci’s other tests use the current development version. OCaml-CI only tests using the current release version.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[ERROR] Sorry, resolution of the request timed out.
        Try to specify a simpler request, use a different solver, or increase the allowed time by setting OPAMSOLVERTIMEOUT to a bigger value (currently, it is set to 60.0 seconds).
</code></pre></div></div>
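<p>The error message names the relevant knob; the timeout can be raised per invocation by setting the environment variable, where the value and package are illustrative:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>OPAMSOLVERTIMEOUT=600 opam install &lt;package&gt;
</code></pre></div></div>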

<p>The base image default is opam 2.0, as <code class="language-plaintext highlighter-rouge">~/.opam</code> can’t be downgraded; therefore, we can’t set a mirror archive flag in the base images.</p>

<p>A typical <code class="language-plaintext highlighter-rouge">Dockerfile</code> starts by replacing opam 2.0 with the latest version and reinitialising.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FROM ocaml/opam:debian-12-ocaml-4.14 AS build
RUN sudo ln -sf /usr/bin/opam-2.3 /usr/bin/opam &amp;&amp; opam init --reinit -ni
...
</code></pre></div></div>

<p>To include the archive mirror, we should add a follow-up of:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RUN opam option --global 'archive-mirrors+="https://opam.ocaml.org/cache"'
</code></pre></div></div>

<p>Dropping 2.0 and 2.1, and arguably 2.2 as well, from the base images would considerably decrease the time taken to build them, as every opam version is built from source each week for each distribution/architecture.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RUN git clone https://github.com/ocaml/opam /tmp/opam &amp;&amp; cd /tmp/opam &amp;&amp; cp -P -R -p . ../opam-sources &amp;&amp; git checkout 4267ade09ac42c1bd0b84a5fa61af8ccdaadef48 &amp;&amp; env MAKE='make -j' shell/bootstrap-ocaml.sh &amp;&amp; make -C src_ext cache-archives
RUN cd /tmp/opam-sources &amp;&amp; cp -P -R -p . ../opam-build-2.0 &amp;&amp; cd ../opam-build-2.0 &amp;&amp; git fetch -q &amp;&amp; git checkout adc1e1829a2bef5b240746df80341b508290fe3b &amp;&amp; ln -s ../opam/src_ext/archives src_ext/archives &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" ./configure --enable-cold-check &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" make lib-ext all &amp;&amp; mkdir -p /usr/bin &amp;&amp; cp /tmp/opam-build-2.0/opam /usr/bin/opam-2.0 &amp;&amp; chmod a+x /usr/bin/opam-2.0 &amp;&amp; rm -rf /tmp/opam-build-2.0
RUN cd /tmp/opam-sources &amp;&amp; cp -P -R -p . ../opam-build-2.1 &amp;&amp; cd ../opam-build-2.1 &amp;&amp; git fetch -q &amp;&amp; git checkout 263921263e1f745613e2882745114b7b08f3608b &amp;&amp; ln -s ../opam/src_ext/archives src_ext/archives &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" ./configure --enable-cold-check --with-0install-solver &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" make lib-ext all &amp;&amp; mkdir -p /usr/bin &amp;&amp; cp /tmp/opam-build-2.1/opam /usr/bin/opam-2.1 &amp;&amp; chmod a+x /usr/bin/opam-2.1 &amp;&amp; rm -rf /tmp/opam-build-2.1
RUN cd /tmp/opam-sources &amp;&amp; cp -P -R -p . ../opam-build-2.2 &amp;&amp; cd ../opam-build-2.2 &amp;&amp; git fetch -q &amp;&amp; git checkout 01e9a24a61e23e42d513b4b775d8c30c807439b2 &amp;&amp; ln -s ../opam/src_ext/archives src_ext/archives &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" ./configure --enable-cold-check --with-0install-solver --with-vendored-deps &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" make lib-ext all &amp;&amp; mkdir -p /usr/bin &amp;&amp; cp /tmp/opam-build-2.2/opam /usr/bin/opam-2.2 &amp;&amp; chmod a+x /usr/bin/opam-2.2 &amp;&amp; rm -rf /tmp/opam-build-2.2
RUN cd /tmp/opam-sources &amp;&amp; cp -P -R -p . ../opam-build-2.3 &amp;&amp; cd ../opam-build-2.3 &amp;&amp; git fetch -q &amp;&amp; git checkout 35acd0c5abc5e66cdbd5be16ba77aa6c33a4c724 &amp;&amp; ln -s ../opam/src_ext/archives src_ext/archives &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" ./configure --enable-cold-check --with-0install-solver --with-vendored-deps &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" make lib-ext all &amp;&amp; mkdir -p /usr/bin &amp;&amp; cp /tmp/opam-build-2.3/opam /usr/bin/opam-2.3 &amp;&amp; chmod a+x /usr/bin/opam-2.3 &amp;&amp; rm -rf /tmp/opam-build-2.3
RUN cd /tmp/opam-sources &amp;&amp; cp -P -R -p . ../opam-build-master &amp;&amp; cd ../opam-build-master &amp;&amp; git fetch -q &amp;&amp; git checkout 4267ade09ac42c1bd0b84a5fa61af8ccdaadef48 &amp;&amp; ln -s ../opam/src_ext/archives src_ext/archives &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" ./configure --enable-cold-check --with-0install-solver --with-vendored-deps &amp;&amp; env PATH="/tmp/opam/bootstrap/ocaml/bin:$PATH" make lib-ext all &amp;&amp; mkdir -p /usr/bin &amp;&amp; cp /tmp/opam-build-master/opam /usr/bin/opam-master &amp;&amp; chmod a+x /usr/bin/opam-master &amp;&amp; rm -rf /tmp/opam-build-master
</code></pre></div></div>

<p>Furthermore, after changing the opam version, we must run <code class="language-plaintext highlighter-rouge">opam init --reinit -ni</code>, which is an <em>expensive</em> command. If the base images defaulted to the current version, we would have faster builds.</p>

<p>The final benefit, of course, would be that we could set the <code class="language-plaintext highlighter-rouge">archive-mirror</code> and reduce the number of transient failures due to network outages.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[On the eve of the release of opam 2.4, is it time to stop testing with opam &lt; 2.2?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Upgrading to macOS Sequoia</title><link href="https://www.tunbury.org/2025/05/19/macos-sequoia/" rel="alternate" type="text/html" title="Upgrading to macOS Sequoia" /><published>2025-05-19T00:00:00+00:00</published><updated>2025-05-19T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/19/macos-sequoia</id><content type="html" xml:base="https://www.tunbury.org/2025/05/19/macos-sequoia/"><![CDATA[<p>We have 8 Mac Minis running <a href="https://github.com/ocurrent/ocluster">OCluster</a> that need to be updated to macOS Sequoia.</p>

<p>I’d been putting this off for some time, as the downloads are huge even in an ideal scenario. After the OS installation, there are usually updates to Xcode and OpenZFS. We have 4 x i7 units and 4 x M1 units.</p>

<p>Rather than using the software update button, I went to the AppStore and downloaded the <a href="https://support.apple.com/en-gb/102662">Sequoia installer</a>. This is approximately 15GB. I copied <code class="language-plaintext highlighter-rouge">/Applications/Install macOS Sequoia.app</code> to the other three systems of the same architecture using <code class="language-plaintext highlighter-rouge">rsync</code> to avoid downloading it on each machine. The OS updated from <code class="language-plaintext highlighter-rouge">Darwin 23.4.0</code> to <code class="language-plaintext highlighter-rouge">Darwin 24.5.0</code>.</p>
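
<p>As a sketch (the target hostname is illustrative), the copy is a single <code class="language-plaintext highlighter-rouge">rsync</code> per machine, repeated for each unit of the same architecture; <code class="language-plaintext highlighter-rouge">-a</code> preserves the app bundle’s permissions and symlinks:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rsync -a "/Applications/Install macOS Sequoia.app" administrator@mac-mini-2:/Applications/
</code></pre></div></div>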

<p>After the OS update, I updated Xcode via Settings, Software Update. This was a 1.65GB download. This moved from <code class="language-plaintext highlighter-rouge">Command Line Tools for Xcode 15.3</code> to <code class="language-plaintext highlighter-rouge">Command Line Tools for Xcode 16.3</code>, upgrading <code class="language-plaintext highlighter-rouge">clang</code> from 15.0.0 to 17.0.0. Before moving on to the remaining machines, I tested <a href="https://github.com/ocurrent/obuilder">obuilder</a>, OpenZFS, etc.</p>

<p><code class="language-plaintext highlighter-rouge">softwareupdate --history</code> lists all the updates and OS installations.</p>

<p>Wall clock time elapsed: ~3 days.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="openzfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[We have 8 Mac Minis running OCluster that need to be updated to macOS Sequoia.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/sequoia.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/sequoia.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">ZFS Replication with Ansible</title><link href="https://www.tunbury.org/2025/05/16/zfs-replcation-ansible/" rel="alternate" type="text/html" title="ZFS Replication with Ansible" /><published>2025-05-16T00:00:00+00:00</published><updated>2025-05-16T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/16/zfs-replcation-ansible</id><content type="html" xml:base="https://www.tunbury.org/2025/05/16/zfs-replcation-ansible/"><![CDATA[<p>Rather than using the agent-based approach proposed yesterday, it’s worth considering an Ansible-based solution instead.</p>

<p>Start with a set of YAML files, one per dataset, containing any metadata we would like for administrative purposes, together with required fields such as those below. We can also override the default snapshot and replication frequencies by adding those parameters to the file.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">dataset_path</span><span class="pi">:</span> <span class="s2">"</span><span class="s">tank/dataset-02"</span>
<span class="na">source_host</span><span class="pi">:</span> <span class="s2">"</span><span class="s">x86-bm-c1.sw.ocaml.org"</span>
<span class="na">target_host</span><span class="pi">:</span> <span class="s2">"</span><span class="s">x86-bm-c3.sw.ocaml.org"</span>
</code></pre></div></div>

<p>The YAML files would be aggregated to create an overall picture of which datasets must be replicated between hosts. Ansible templates would then generate the necessary configuration files for <code class="language-plaintext highlighter-rouge">syncoid</code> and <code class="language-plaintext highlighter-rouge">sanoid</code>, and register the cron jobs on each machine.</p>
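
<p>To illustrate what the templates might emit (the retention and schedule values here are only a sketch, not the deployed settings), a generated <code class="language-plaintext highlighter-rouge">sanoid.conf</code> stanza on the source host and a cron entry driving <code class="language-plaintext highlighter-rouge">syncoid</code> could look like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># /etc/sanoid/sanoid.conf on x86-bm-c1.sw.ocaml.org
[tank/dataset-02]
        use_template = production

[template_production]
        hourly = 24
        daily = 7
        autosnap = yes
        autoprune = yes

# /etc/cron.d entry pushing the dataset to the target
0 * * * * root syncoid tank/dataset-02 root@x86-bm-c3.sw.ocaml.org:tank/dataset-02
</code></pre></div></div>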

<p>Syncoid replicates over SSH, so keys must be generated on the source machines, and the public keys must be deployed on the replication targets. Ansible can be used to manage the configuration of the keys.</p>
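
<p>A minimal sketch of those key-management tasks (the group names and key path are assumptions, not the repository’s actual inventory):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- hosts: zfs_sources
  tasks:
    - name: Generate a replication key pair on each source
      community.crypto.openssh_keypair:
        path: /root/.ssh/id_syncoid
      register: syncoid_key

- hosts: zfs_targets
  tasks:
    - name: Authorise every source host's public key
      ansible.posix.authorized_key:
        user: root
        key: "{{ hostvars[item].syncoid_key.public_key }}"
      loop: "{{ groups['zfs_sources'] }}"
</code></pre></div></div>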

<p>Given the overall picture, we can automatically generate a markdown document describing the current setup and use Mermaid to include a visual representation.</p>

<p><img src="/images/zfs-replication-graphic.png" alt="" /></p>

<p>I have published a working version of this concept on <a href="https://github.com/mtelvers/zfs-replication-ansible">GitHub</a>. The <a href="https://github.com/mtelvers/zfs-replication-ansible/blob/master/README.md">README.md</a> contains additional information.</p>

<p>The replication set defined in the repository, <a href="https://github.com/mtelvers/zfs-replication-ansible/blob/master/docs/replication_topology.md">ZFS Replication Topology</a>, is currently running for testing.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="openzfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[Rather than using the agent-based approach proposed yesterday, it’s worth considering an Ansible-based solution instead.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/openzfs.png" /><media:content medium="image" url="https://www.tunbury.org/images/openzfs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">ZFS System Concept</title><link href="https://www.tunbury.org/2025/05/15/zfs-system-concept/" rel="alternate" type="text/html" title="ZFS System Concept" /><published>2025-05-15T00:00:00+00:00</published><updated>2025-05-15T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/15/zfs-system-concept</id><content type="html" xml:base="https://www.tunbury.org/2025/05/15/zfs-system-concept/"><![CDATA[<p>How would the distributed ZFS storage system look in practical terms? Each machine with a ZFS store would have an agent application installed. Centrally, there would be a tracker server, and users would interact with the system using a CLI tool. The elements will interact with each other using Cap’n Proto capability files.</p>

<h1 id="tracker">Tracker</h1>

<p>The tracker would generate capability files on first invocation, one per <em>location</em>, where the location could be as granular as a specific rack in a datacenter or a larger grouping, such as at the institution level. The purpose of the location grouping is to allow users to see where the data is held. As a prototype, the command could be something like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tracker --capnp-listen-address tcp:1.2.3.4:1234 --locations datacenter-01,datacenter-02,datacenter-03
</code></pre></div></div>

<h1 id="agent">Agent</h1>

<p>Each machine would have the agent application. The agent would register with the tracker using the capability file generated by the tracker. The agent command line would provide a list of zpools that are in scope for management. The zpools would be scanned to compile a list of available datasets, which would be passed to the tracker. Perhaps an invocation like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>agent --connect datacenter-01.cap --name machine-01 --zpools tank-01,tank-02
</code></pre></div></div>

<h1 id="cli">CLI</h1>

<p>The CLI tool will display the system state by connecting to the tracker. Perhaps a command like <code class="language-plaintext highlighter-rouge">cli --connect user.cap show</code>, which would output a list of datasets and where they are:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dataset-01: datacenter-01\machine-01\tank-01 (online), datacenter-02\machine-03\tank-06 (online)
dataset-02: datacenter-01\machine-01\tank-02 (online), datacenter-02\machine-04\tank-07 (offline)
</code></pre></div></div>

<p>Another common use case would be to fetch a dataset: <code class="language-plaintext highlighter-rouge">cli --connect user.cap download dataset-02</code>. This would set up a <code class="language-plaintext highlighter-rouge">zfs send | zfs receive</code> between the agent and the current machine.</p>
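
<p>In outline (hostnames and pool names as in the listing above; the snapshot name is illustrative), the transfer set up by the agent would amount to:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh machine-01 zfs snapshot tank-02/dataset-02@transfer
ssh machine-01 zfs send tank-02/dataset-02@transfer | zfs receive tank/dataset-02
</code></pre></div></div>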

<p>Potentially, all machines would run the agent, and rather than <code class="language-plaintext highlighter-rouge">download</code>, we would initiate a <code class="language-plaintext highlighter-rouge">copy</code> of a dataset to another location in the form <code class="language-plaintext highlighter-rouge">datacenter\machine\tank</code>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="openzfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[How would the distributed ZFS storage system look in practical terms? Each machine with a ZFS store would have an agent application installed. Centrally, there would be a tracker server, and users would interact with the system using a CLI tool. The elements will interact with each other using Cap’n Proto capability files.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/openzfs.png" /><media:content medium="image" url="https://www.tunbury.org/images/openzfs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Opam Health Check with OxCaml</title><link href="https://www.tunbury.org/2025/05/14/opam-health-check-oxcaml/" rel="alternate" type="text/html" title="Opam Health Check with OxCaml" /><published>2025-05-14T06:00:00+00:00</published><updated>2025-05-14T06:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/14/opam-health-check-oxcaml</id><content type="html" xml:base="https://www.tunbury.org/2025/05/14/opam-health-check-oxcaml/"><![CDATA[<p>Arthur mentioned that it would be great to know which packages build successfully with OxCaml and which don’t.</p>

<p>With a little effort and <a href="https://github.com/ocurrent/opam-health-check/pull/106">PR#106</a>, I was able to get <a href="https://github.com/ocurrent/opam-health-check">opam-health-check</a> to build OxCaml from the Jane Street branch and test the latest version of all the packages in opam.</p>

<p>I created the switch using the branch <code class="language-plaintext highlighter-rouge">janestreet/opam-repository#with-extensions</code>. However, I ran into issues as <code class="language-plaintext highlighter-rouge">autoconf</code> isn’t included in the base images. I added an <code class="language-plaintext highlighter-rouge">extra-command</code> to install it, but found that these are executed last, after the switch has been created, and I needed <code class="language-plaintext highlighter-rouge">autoconf</code> before the switch was created. My PR moved the extra commands earlier in the build process.</p>

<p>Here is my <code class="language-plaintext highlighter-rouge">config.yaml</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>name: default
port: 8080
public-url: http://oxcaml.check.ci.dev
admin-port: 9999
auto-run-interval: 1680
processes: 100
enable-dune-cache: false
enable-logs-compression: true
default-repository: ocaml/opam-repository
extra-repositories:
- janestreet-with-extensions: janestreet/opam-repository#with-extensions
with-test: false
with-lower-bound: false
list-command: opam list --available --installable --columns=package --short
extra-command: sudo apt install autoconf -y
platform:
  os: linux
  arch: x86_64
  custom-pool:
  distribution: debian-unstable
  image: ocaml/opam:debian-12-ocaml-5.2@sha256:a17317e9abe385dc16b4390c64a374046d6dd562e80aea838d91c6c1335da357
ocaml-switches:
- 5.2.0+flambda2:
    switch: 5.2.0+flambda2
    build-with: opam
</code></pre></div></div>

<p>This results in these commands, which build the switch for testing:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ln -f /usr/bin/opam-dev /usr/bin/opam
rm -rf ~/opam-repository &amp;&amp; git clone -q 'https://github.com/ocaml/opam-repository' ~/opam-repository &amp;&amp; git -C ~/opam-repository checkout -q dbc9ec7b83bac3673185542221a571372b6abb35
rm -rf ~/.opam &amp;&amp; opam init -ya --bare --config ~/.opamrc-sandbox ~/opam-repository
sudo apt install autoconf -y
git clone -q 'https://github.com/janestreet/opam-repository'  ~/'janestreet-with-extensions' &amp;&amp; git -C ~/'janestreet-with-extensions' checkout -q 55a5d4c5e35a7365ddd6ffb3b87274a77f77deb5
opam repository add --dont-select 'janestreet-with-extensions' ~/'janestreet-with-extensions'
opam switch create --repositories=janestreet-with-extensions,default '5.2.0+flambda2' '5.2.0+flambda2'
opam update --depexts
</code></pre></div></div>

<p>The results are available at <a href="https://oxcaml.check.ci.dev">https://oxcaml.check.ci.dev</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam-health-check,OxCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Arthur mentioned that it would be great to know which packages build successfully with OxCaml and which don’t.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ubuntu 24.04 runc issues with AppArmor</title><link href="https://www.tunbury.org/2025/05/13/ubuntu-apparmor/" rel="alternate" type="text/html" title="Ubuntu 24.04 runc issues with AppArmor" /><published>2025-05-13T12:00:00+00:00</published><updated>2025-05-13T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/13/ubuntu-apparmor</id><content type="html" xml:base="https://www.tunbury.org/2025/05/13/ubuntu-apparmor/"><![CDATA[<p>Patrick reported issues with OCaml-CI running tests on <code class="language-plaintext highlighter-rouge">ocaml-ppx</code>.</p>

<blockquote>
  <p>Fedora seems to be having some issues: https://ocaml.ci.dev/github/ocaml-ppx/ppxlib/commit/0d6886f5bcf22287a66511817e969965c888d2b7/variant/fedora-40-5.3_opam-2.3</p>
  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo: PAM account management error: Authentication service cannot retrieve authentication info
sudo: a password is required
"/usr/bin/env" "bash" "-c" "sudo dnf install -y findutils" failed with exit status 1
2025-05-12 08:55.09: Job failed: Failed: Build failed
</code></pre></div>  </div>
</blockquote>

<p>I took this problem at face value and replied that the issue would be related to Fedora 40, which is EOL. I created <a href="https://github.com/ocurrent/ocaml-ci/pull/1011">PR#1011</a> for OCaml-CI and deployed it. However, the problem didn’t go away. We were now testing Fedora 42, but jobs were still failing. I created a minimal obuilder job specification:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>((from ocaml/opam:fedora-42-ocaml-4.14@sha256:475a852401de7d578efec2afce4384d87b505f5bc610dc56f6bde3b87ebb7664)
(user (uid 1000) (gid 1000))
(run (shell "sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam")))
</code></pre></div></div>

<p>Submitting the job to the cluster showed it worked on all machines except for <code class="language-plaintext highlighter-rouge">bremusa</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>ocluster-client submit-obuilder <span class="nt">--connect</span> mtelvers.cap  <span class="nt">--pool</span> linux-x86_64 <span class="nt">--local-file</span> fedora-42.spec
Tailing log:
Building on bremusa.ocamllabs.io

<span class="o">(</span>from ocaml/opam:fedora-42-ocaml-4.14@sha256:475a852401de7d578efec2afce4384d87b505f5bc610dc56f6bde3b87ebb7664<span class="o">)</span>
2025-05-12 16:55.42 <span class="nt">---</span><span class="o">&gt;</span> using <span class="s2">"aefb7551cd0db7b5ebec7e244d5637aef02ab3f94c732650de7ad183465adaa0"</span> from cache

/: <span class="o">(</span>user <span class="o">(</span>uid 1000<span class="o">)</span> <span class="o">(</span>gid 1000<span class="o">))</span>

/: <span class="o">(</span>run <span class="o">(</span>shell <span class="s2">"sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam"</span><span class="o">))</span>
<span class="nb">sudo</span>: PAM account management error: Authentication service cannot retrieve authentication info
<span class="nb">sudo</span>: a password is required
<span class="s2">"/usr/bin/env"</span> <span class="s2">"bash"</span> <span class="s2">"-c"</span> <span class="s2">"sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam"</span> failed with <span class="nb">exit </span>status 1
Failed: Build failed.
</code></pre></div></div>

<p>Changing the image to <code class="language-plaintext highlighter-rouge">opam:debian-12-ocaml-4.14</code> worked, so the issue only affects Fedora images and only on <code class="language-plaintext highlighter-rouge">bremusa</code>. I was able to reproduce the issue directly using <code class="language-plaintext highlighter-rouge">runc</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># runc run test</span>
<span class="nb">sudo</span>: PAM account management error: Authentication service cannot retrieve authentication info
<span class="nb">sudo</span>: a password is required
</code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">ls -l /etc/shadow</code> in the container showed that the permissions on <code class="language-plaintext highlighter-rouge">/etc/shadow</code> are 000. If these are changed to <code class="language-plaintext highlighter-rouge">640</code>, then <code class="language-plaintext highlighter-rouge">sudo</code> works correctly. Permissions are set 000 for <code class="language-plaintext highlighter-rouge">/etc/shadow</code> in some distributions as access is limited to processes with the capability <code class="language-plaintext highlighter-rouge">DAC_OVERRIDE</code>.</p>
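
<p>The check and workaround, run inside the container, were along these lines:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ls -l /etc/shadow       # mode 000 in the Fedora images
# chmod 640 /etc/shadow   # after this, sudo works correctly
</code></pre></div></div>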

<p>Having seen a permission issue with <code class="language-plaintext highlighter-rouge">runc</code> and <code class="language-plaintext highlighter-rouge">libseccomp</code> compatibility <a href="https://github.com/ocaml/infrastructure/issues/121">before</a>, I went down a rabbit hole investigating that. Ultimately, I compiled <code class="language-plaintext highlighter-rouge">runc</code> without <code class="language-plaintext highlighter-rouge">libseccomp</code> support, <code class="language-plaintext highlighter-rouge">make BUILDTAGS=""</code>, but this still exhibited the same issue.</p>

<p>All the machines in the <code class="language-plaintext highlighter-rouge">linux-x86_64</code> pool are running Ubuntu 22.04 except for <code class="language-plaintext highlighter-rouge">bremusa</code>. I configured a spare machine with Ubuntu 24.04 and tested. The problem appeared on this machine as well.</p>

<p>Is there a change in Ubuntu 24.04?</p>

<p>I temporarily disabled AppArmor by editing <code class="language-plaintext highlighter-rouge">/etc/default/grub</code>, adding <code class="language-plaintext highlighter-rouge">apparmor=0</code> to <code class="language-plaintext highlighter-rouge">GRUB_CMDLINE_LINUX</code>, running <code class="language-plaintext highlighter-rouge">update-grub</code> and rebooting. Disabling AppArmor entirely like this creates a security exposure, so it isn’t recommended, but it did clear the issue.</p>
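
<p>For reference, the change amounted to:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># /etc/default/grub
GRUB_CMDLINE_LINUX="apparmor=0"

update-grub
reboot
</code></pre></div></div>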

<p>After enabling AppArmor again, I disabled the configuration for <code class="language-plaintext highlighter-rouge">runc</code> by running:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">ln</span> <span class="nt">-s</span> /etc/apparmor.d/runc /etc/apparmor.d/disable/
apparmor_parser <span class="nt">-R</span> /etc/apparmor.d/runc
</code></pre></div></div>

<p>This didn’t help - in fact, this was worse as now <code class="language-plaintext highlighter-rouge">runc</code> couldn’t run at all.  I restored the configuration and added <code class="language-plaintext highlighter-rouge">capability dac_override</code>, but this didn’t help either.</p>

<p>Looking through the profiles with <code class="language-plaintext highlighter-rouge">grep shadow -r /etc/apparmor.d</code>, I noticed <code class="language-plaintext highlighter-rouge">unix-chkpwd</code>, which could be the source of the issue. I disabled this profile and the issue was resolved.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">ln</span> <span class="nt">-s</span> /etc/apparmor.d/unix-chkpwd /etc/apparmor.d/disable
apparmor_parser <span class="nt">-R</span> /etc/apparmor.d/unix-chkpwd
</code></pre></div></div>
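
<p>Should the profile be needed again, it can be re-enabled by reversing the two steps (the standard AppArmor procedure):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rm /etc/apparmor.d/disable/unix-chkpwd
apparmor_parser -r /etc/apparmor.d/unix-chkpwd
</code></pre></div></div>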

<p>Armed with the answer, it’s pretty easy to find other people with related issues:</p>
<ul>
  <li>https://github.com/docker/build-push-action/issues/1302</li>
  <li>https://github.com/moby/moby/issues/48734</li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Ubuntu,runc,AppArmor" /><category term="tunbury.org" /><summary type="html"><![CDATA[Patrick reported issues with OCaml-CI running tests on ocaml-ppx.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Posthog on OCaml.org</title><link href="https://www.tunbury.org/2025/05/12/posthog/" rel="alternate" type="text/html" title="Posthog on OCaml.org" /><published>2025-05-12T12:00:00+00:00</published><updated>2025-05-12T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/12/posthog</id><content type="html" xml:base="https://www.tunbury.org/2025/05/12/posthog/"><![CDATA[<p>Sabine would like to switch <a href="https://ocaml.org">OCaml.org</a> from using <a href="https://plausible.io">Plausible</a> over to <a href="https://posthog.com">Posthog</a>. The underlying reason for the move is that the self-hosted product from Posthog has more features than the equivalent from Plausible. Of particular interest is the heatmap feature to assess the number of visitors who finish the <a href="https://ocaml.org/docs/tour-of-ocaml">Tour of OCaml</a>.</p>

<p>Posthog has <a href="https://posthog.com/docs/self-host">documentation</a> on the self-hosted solution. In short, create a VM with 4 vCPU, 16GB RAM, and 30GB storage and run the setup script:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/bin/bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/posthog/posthog/HEAD/bin/deploy-hobby<span class="si">)</span><span class="s2">”
</span></code></pre></div></div>

<p>Any subsequent upgrades can be achieved with:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/bin/bash <span class="nt">-c</span> <span class="s2">"</span><span class="si">$(</span>curl <span class="nt">-fsSL</span> https://raw.githubusercontent.com/posthog/posthog/HEAD/bin/upgrade-hobby<span class="si">)</span><span class="s2">"</span>
</code></pre></div></div>

<p>After installation, I created a <a href="https://posthog.ci.dev/shared/seqtamWuMXLwxJEAX1XNjwhzciAajw">public dashboard</a> as with <a href="https://plausible.ci.dev/ocaml.org">Plausible</a>. I also enabled the option <code class="language-plaintext highlighter-rouge">Discard client IP data</code>.</p>

<p>The OCaml website can be updated with <a href="https://github.com/ocaml/ocaml.org/pull/3101">PR#3101</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="posthog" /><category term="tunbury.org" /><summary type="html"><![CDATA[Sabine would like to switch OCaml.org from using Plausible over to Posthog. The underlying reason for the move is that the self-hosted product from Posthog has more features than the equivalent from Plausible. Of particular interest is the heatmap feature to assess the number of visitors who finish the Tour of OCaml.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/posthog.png" /><media:content medium="image" url="https://www.tunbury.org/images/posthog.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Worker moves</title><link href="https://www.tunbury.org/2025/05/09/worker-moves/" rel="alternate" type="text/html" title="Worker moves" /><published>2025-05-09T12:00:00+00:00</published><updated>2025-05-09T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/09/worker-moves</id><content type="html" xml:base="https://www.tunbury.org/2025/05/09/worker-moves/"><![CDATA[<p>Following the setup of <em>rosemary</em> with <a href="https://www.tunbury.org/freebsd-uefi/">FreeBSD 14</a> (with 20C/40T), I have paused <em>spring</em> and <em>summer</em> (which combined have 12C/24T) and <em>rosemary</em> is now handling all of the <a href="https://github.com/ocurrent/freebsd-infra/pull/14">FreeBSD workload</a>.</p>

<p><em>Oregano</em> has now taken the OpenBSD workload from <em>bremusa</em>. <em>bremusa</em> has been redeployed in the <code class="language-plaintext highlighter-rouge">linux-x86_64</code> pool. With the extra processing, I have paused the Scaleway workers <em>x86-bm-c1</em> through <em>x86-bm-c9</em>.</p>

<p>These changes, plus the <a href="https://www.tunbury.org/equinix-moves/">removal of the Equinix machines</a>, are now reflected in <a href="https://infra.ocaml.org">https://infra.ocaml.org</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OBuilder,FreeBSD,OpenBSD" /><category term="tunbury.org" /><summary type="html"><![CDATA[Following the setup of rosemary with FreeBSD 14 (with 20C/40T), I have paused spring and summer (which combined have 12C/24T) and rosemary is now handling all of the FreeBSD workload.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Debugging OBuilder on macOS</title><link href="https://www.tunbury.org/2025/05/08/debugging-obuilder-macos/" rel="alternate" type="text/html" title="Debugging OBuilder on macOS" /><published>2025-05-08T12:00:00+00:00</published><updated>2025-05-08T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/08/debugging-obuilder-macos</id><content type="html" xml:base="https://www.tunbury.org/2025/05/08/debugging-obuilder-macos/"><![CDATA[<p>The log from an <a href="https://github.com/ocurrent/obuilder">OBuilder</a> job starts with the steps needed to reproduce the job locally. This boilerplate output assumes that all OBuilder jobs start from a Docker base image, but on some operating systems, such as FreeBSD and macOS, OBuilder uses ZFS base images. On OpenBSD and Windows, it uses QEMU images. The situation is further complicated when the issue only affects a specific architecture that may be unavailable to the user.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2025-05-08 13:29.37: New job: build bitwuzla-cxx.0.7.0, using opam 2.3
                              from https://github.com/ocaml/opam-repository.git#refs/pull/27768/head (55a47416d532dc829d9111297970934a21a1b1c4)
                              on macos-homebrew-ocaml-4.14/amd64

To reproduce locally:

cd $(mktemp -d)
git clone --recursive "https://github.com/ocaml/opam-repository.git" &amp;&amp; cd "opam-repository" &amp;&amp; git fetch origin "refs/pull/27768/head" &amp;&amp; git reset --hard 55a47416
git fetch origin master
git merge --no-edit b8a7f49af3f606bf8a22869a1b52b250dd90092e
cat &gt; ../Dockerfile &lt;&lt;'END-OF-DOCKERFILE'

FROM macos-homebrew-ocaml-4.14
USER 1000:1000
RUN ln -f ~/local/bin/opam-2.3 ~/local/bin/opam
RUN opam init --reinit -ni
RUN opam option solver=builtin-0install &amp;&amp; opam config report
ENV OPAMDOWNLOADJOBS="1"
ENV OPAMERRLOGLEN="0"
ENV OPAMPRECISETRACKING="1"
ENV CI="true"
ENV OPAM_REPO_CI="true"
RUN rm -rf opam-repository/
COPY --chown=1000:1000 . opam-repository/
RUN opam repository set-url -k local --strict default opam-repository/
RUN opam update --depexts || true
RUN opam pin add -k version -yn bitwuzla-cxx.0.7.0 0.7.0
RUN opam reinstall bitwuzla-cxx.0.7.0; \
    res=$?; \
    test "$res" != 31 &amp;&amp; exit "$res"; \
    export OPAMCLI=2.0; \
    build_dir=$(opam var prefix)/.opam-switch/build; \
    failed=$(ls "$build_dir"); \
    partial_fails=""; \
    for pkg in $failed; do \
    if opam show -f x-ci-accept-failures: "$pkg" | grep -qF "\"macos-homebrew\""; then \
    echo "A package failed and has been disabled for CI using the 'x-ci-accept-failures' field."; \
    fi; \
    test "$pkg" != 'bitwuzla-cxx.0.7.0' &amp;&amp; partial_fails="$partial_fails $pkg"; \
    done; \
    test "${partial_fails}" != "" &amp;&amp; echo "opam-repo-ci detected dependencies failing: ${partial_fails}"; \
    exit 1


END-OF-DOCKERFILE
docker build -f ../Dockerfile .
</code></pre></div></div>

<p>It is, therefore, difficult to diagnose the issue on these operating systems and on esoteric architectures. Is it an issue with the CI system or the job itself?</p>

<p>My approach is to get myself into an interactive shell at the point in the build where the failure occurs. On Linux and FreeBSD, the log is available in <code class="language-plaintext highlighter-rouge">/var/log/syslog</code> or <code class="language-plaintext highlighter-rouge">/var/log/messages</code> respectively. On macOS, this log is written to <code class="language-plaintext highlighter-rouge">ocluster.log</code>. macOS workers are single-threaded, so the worker must be paused before progressing.</p>

<p>Each step in an OBuilder job consists of taking a snapshot of the previous layer, running a command in that layer, and keeping or discarding the layer depending on the command’s success or failure. On macOS, layers are ZFS snapshots mounted over the Homebrew directory and the CI users’ home directory. We can extract the appropriate command from the logs.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2025-05-08 14:31.17    application [INFO] Exec "zfs" "clone" "-o" "canmount=noauto" "--" "obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec@snap" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40"
2025-05-08 14:31.17    application [INFO] Exec "zfs" "mount" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40"
2025-05-08 14:31.17    application [INFO] Exec "zfs" "clone" "-o" "mountpoint=none" "--" "obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec/brew@snap" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew"
2025-05-08 14:31.17    application [INFO] Exec "zfs" "clone" "-o" "mountpoint=none" "--" "obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec/home@snap" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home"
cannot open 'obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40@snap': dataset does not exist
2025-05-08 14:31.17    application [INFO] Exec "zfs" "clone" "--" "obuilder/cache/c-opam-archives@snap" "obuilder/cache-tmp/8608-c-opam-archives"
2025-05-08 14:31.17    application [INFO] Exec "zfs" "clone" "--" "obuilder/cache/c-homebrew@snap" "obuilder/cache-tmp/8609-c-homebrew"
2025-05-08 14:31.18       obuilder [INFO] result_tmp = /Volumes/obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40
2025-05-08 14:31.18    application [INFO] Exec "zfs" "set" "mountpoint=/Users/mac1000" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home"
2025-05-08 14:31.18    application [INFO] Exec "zfs" "set" "mountpoint=/usr/local" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew"
2025-05-08 14:31.18       obuilder [INFO] src = /Volumes/obuilder/cache-tmp/8608-c-opam-archives, dst = /Users/mac1000/.opam/download-cache, type rw
2025-05-08 14:31.18    application [INFO] Exec "zfs" "set" "mountpoint=/Users/mac1000/.opam/download-cache" "obuilder/cache-tmp/8608-c-opam-archives"
Unmount successful for /Volumes/obuilder/cache-tmp/8608-c-opam-archives
2025-05-08 14:31.18       obuilder [INFO] src = /Volumes/obuilder/cache-tmp/8609-c-homebrew, dst = /Users/mac1000/Library/Caches/Homebrew, type rw
2025-05-08 14:31.18    application [INFO] Exec "zfs" "set" "mountpoint=/Users/mac1000/Library/Caches/Homebrew" "obuilder/cache-tmp/8609-c-homebrew"
Unmount successful for /Volumes/obuilder/cache-tmp/8609-c-homebrew
2025-05-08 14:31.19    application [INFO] Exec "sudo" "dscl" "." "list" "/Users"
2025-05-08 14:31.19    application [INFO] Exec "sudo" "-u" "mac1000" "-i" "getconf" "DARWIN_USER_TEMP_DIR"
2025-05-08 14:31.19    application [INFO] Fork exec "sudo" "su" "-l" "mac1000" "-c" "--" "source ~/.obuilder_profile.sh &amp;&amp; env 'TMPDIR=/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/' 'OPAM_REPO_CI=true' 'CI=true' 'OPAMPRECISETRACKING=1' 'OPAMERRLOGLEN=0' 'OPAMDOWNLOADJOBS=1' "$0" "$@"" "/usr/bin/env" "bash" "-c" "opam reinstall bitwuzla-cxx.0.7.0;
        res=$?;
        test "$res" != 31 &amp;&amp; exit "$res";
        export OPAMCLI=2.0;
        build_dir=$(opam var prefix)/.opam-switch/build;
        failed=$(ls "$build_dir");
        partial_fails="";
        for pkg in $failed; do
          if opam show -f x-ci-accept-failures: "$pkg" | grep -qF "\"macos-homebrew\""; then
            echo "A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.";
          fi;
          test "$pkg" != 'bitwuzla-cxx.0.7.0' &amp;&amp; partial_fails="$partial_fails $pkg";
        done;
        test "${partial_fails}" != "" &amp;&amp; echo "opam-repo-ci detected dependencies failing: ${partial_fails}";
        exit 1"
2025-05-08 14:31.28         worker [INFO] OBuilder partition: 27% free, 2081 items
2025-05-08 14:31.58         worker [INFO] OBuilder partition: 27% free, 2081 items
2025-05-08 14:32.28         worker [INFO] OBuilder partition: 27% free, 2081 items
2025-05-08 14:32.43    application [INFO] Exec "zfs" "inherit" "mountpoint" "obuilder/cache-tmp/8608-c-opam-archives"
Unmount successful for /Users/mac1000/.opam/download-cache
2025-05-08 14:32.44    application [INFO] Exec "zfs" "inherit" "mountpoint" "obuilder/cache-tmp/8609-c-homebrew"
Unmount successful for /Users/mac1000/Library/Caches/Homebrew
2025-05-08 14:32.45    application [INFO] Exec "zfs" "set" "mountpoint=none" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home"
Unmount successful for /Users/mac1000
2025-05-08 14:32.45    application [INFO] Exec "zfs" "set" "mountpoint=none" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew"
Unmount successful for /usr/local
2025-05-08 14:32.46    application [INFO] Exec "zfs" "rename" "--" "obuilder/cache/c-homebrew" "obuilder/cache-tmp/8610-c-homebrew"
Unmount successful for /Volumes/obuilder/cache/c-homebrew
2025-05-08 14:32.46    application [INFO] Exec "zfs" "promote" "obuilder/cache-tmp/8609-c-homebrew"
2025-05-08 14:32.46    application [INFO] Exec "zfs" "destroy" "-f" "--" "obuilder/cache-tmp/8610-c-homebrew"
Unmount successful for /Volumes/obuilder/cache-tmp/8610-c-homebrew
2025-05-08 14:32.48    application [INFO] Exec "zfs" "rename" "--" "obuilder/cache-tmp/8609-c-homebrew@snap" "obuilder/cache-tmp/8609-c-homebrew@old-2152"
2025-05-08 14:32.48    application [INFO] Exec "zfs" "destroy" "-d" "--" "obuilder/cache-tmp/8609-c-homebrew@old-2152"
2025-05-08 14:32.48    application [INFO] Exec "zfs" "snapshot" "-r" "--" "obuilder/cache-tmp/8609-c-homebrew@snap"
2025-05-08 14:32.48    application [INFO] Exec "zfs" "rename" "--" "obuilder/cache-tmp/8609-c-homebrew" "obuilder/cache/c-homebrew"
Unmount successful for /Volumes/obuilder/cache-tmp/8609-c-homebrew
2025-05-08 14:32.49    application [INFO] Exec "zfs" "rename" "--" "obuilder/cache/c-opam-archives" "obuilder/cache-tmp/8611-c-opam-archives"
Unmount successful for /Volumes/obuilder/cache/c-opam-archives
2025-05-08 14:32.50    application [INFO] Exec "zfs" "promote" "obuilder/cache-tmp/8608-c-opam-archives"
2025-05-08 14:32.50    application [INFO] Exec "zfs" "destroy" "-f" "--" "obuilder/cache-tmp/8611-c-opam-archives"
Unmount successful for /Volumes/obuilder/cache-tmp/8611-c-opam-archives
2025-05-08 14:32.51    application [INFO] Exec "zfs" "rename" "--" "obuilder/cache-tmp/8608-c-opam-archives@snap" "obuilder/cache-tmp/8608-c-opam-archives@old-2152"
2025-05-08 14:32.51    application [INFO] Exec "zfs" "destroy" "-d" "--" "obuilder/cache-tmp/8608-c-opam-archives@old-2152"
2025-05-08 14:32.51    application [INFO] Exec "zfs" "snapshot" "-r" "--" "obuilder/cache-tmp/8608-c-opam-archives@snap"
2025-05-08 14:32.52    application [INFO] Exec "zfs" "rename" "--" "obuilder/cache-tmp/8608-c-opam-archives" "obuilder/cache/c-opam-archives"
Unmount successful for /Volumes/obuilder/cache-tmp/8608-c-opam-archives
2025-05-08 14:32.52    application [INFO] Exec "zfs" "destroy" "-r" "-f" "--" "obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40"
Unmount successful for /Volumes/obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40
2025-05-08 14:32.58         worker [INFO] OBuilder partition: 27% free, 2081 items
2025-05-08 14:33.04         worker [INFO] Job failed: "/usr/bin/env" "bash" "-c" "opam reinstall bitwuzla-cxx.0.7.0;
        res=$?;
        test "$res" != 31 &amp;&amp; exit "$res";
        export OPAMCLI=2.0;
        build_dir=$(opam var prefix)/.opam-switch/build;
        failed=$(ls "$build_dir");
        partial_fails="";
        for pkg in $failed; do
          if opam show -f x-ci-accept-failures: "$pkg" | grep -qF "\"macos-homebrew\""; then
            echo "A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.";
          fi;
          test "$pkg" != 'bitwuzla-cxx.0.7.0' &amp;&amp; partial_fails="$partial_fails $pkg";
        done;
        test "${partial_fails}" != "" &amp;&amp; echo "opam-repo-ci detected dependencies failing: ${partial_fails}";
        exit 1" failed with exit status 1

</code></pre></div></div>

<p>Run each of the <em>Exec</em> commands at the command prompt, up to the <em>Fork exec</em>. We still need to run that final command, but we want an interactive shell, so let’s replace the last part of the command with <code class="language-plaintext highlighter-rouge">bash</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo su -l mac1000 -c -- "source ~/.obuilder_profile.sh &amp;&amp; env 'TMPDIR=/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/' 'OPAM_REPO_CI=true' 'CI=true' 'OPAMPRECISETRACKING=1' 'OPAMERRLOGLEN=0' 'OPAMDOWNLOADJOBS=1' bash"
</code></pre></div></div>

<p>Now, at the shell prompt, we can try <code class="language-plaintext highlighter-rouge">opam reinstall bitwuzla-cxx.0.7.0</code>. Hopefully, this fails, which proves we have successfully recreated the environment!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ opam source bitwuzla-cxx.0.7.0
$ cd bitwuzla-cxx.0.7.0
$ dune build
File "vendor/dune", lines 201-218, characters 0-436:
201 | (rule
202 |  (deps
203 |   (source_tree bitwuzla)
.....
216 |      %{p0002}
217 |      (run patch -p1 --directory bitwuzla))
218 |     (write-file %{target} "")))))
(cd _build/default/vendor &amp;&amp; /usr/bin/patch -p1 --directory bitwuzla) &lt; _build/default/vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch
patching file 'include/bitwuzla/cpp/bitwuzla.h'
Can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw', output is in '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw': Permission denied
patch: **** can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw': Permission denied
</code></pre></div></div>

<p>This matches the output we see in the CI logs. <code class="language-plaintext highlighter-rouge">/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T</code> is the <code class="language-plaintext highlighter-rouge">TMPDIR</code> value set in the environment. <code class="language-plaintext highlighter-rouge">Permission denied</code> suggests a file system permissions problem, yet <code class="language-plaintext highlighter-rouge">ls -l</code> and <code class="language-plaintext highlighter-rouge">touch</code> show that we can write to this directory.</p>

<p>As we are running on macOS and Dune is invoking <code class="language-plaintext highlighter-rouge">patch</code>, suspicion falls on Apple’s <code class="language-plaintext highlighter-rouge">patch</code> versus GNU’s <code class="language-plaintext highlighter-rouge">patch</code>. Editing <code class="language-plaintext highlighter-rouge">vendor/dune</code> to use <code class="language-plaintext highlighter-rouge">gpatch</code> rather than <code class="language-plaintext highlighter-rouge">patch</code> allows the project to build.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ dune build
(cd _build/default/vendor &amp;&amp; /usr/local/bin/gpatch --directory bitwuzla -p1) &lt; _build/default/vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch
File include/bitwuzla/cpp/bitwuzla.h is read-only; trying to patch anyway
patching file include/bitwuzla/cpp/bitwuzla.h
</code></pre></div></div>

<p>Running Apple’s <code class="language-plaintext highlighter-rouge">patch</code> directly,</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ patch -p1 &lt; ../../../../vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch
patching file 'include/bitwuzla/cpp/bitwuzla.h'
Can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI', output is in '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI': Permission denied
patch: **** can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI': Permission denied
</code></pre></div></div>

<p>However, <code class="language-plaintext highlighter-rouge">touch /var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI</code> succeeds.</p>

<p>Looking back at the output from GNU <code class="language-plaintext highlighter-rouge">patch</code>, it reports that the file itself is read-only.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls -l include/bitwuzla/cpp/bitwuzla.h
-r--r--r--  1 mac1000  admin  52280 May  8 15:05 include/bitwuzla/cpp/bitwuzla.h
</code></pre></div></div>

<p>Let’s try to adjust the permissions:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ chmod 644 include/bitwuzla/cpp/bitwuzla.h
$ patch -p1 &lt; ../../../../vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch
patching file 'include/bitwuzla/cpp/bitwuzla.h'
</code></pre></div></div>

<p>And now, it succeeds. The issue is that GNU’s <code class="language-plaintext highlighter-rouge">patch</code> and Apple’s <code class="language-plaintext highlighter-rouge">patch</code> act differently when the file being patched is read-only. Apple’s <code class="language-plaintext highlighter-rouge">patch</code> gives a spurious error, while GNU’s <code class="language-plaintext highlighter-rouge">patch</code> emits a warning and makes the change anyway.</p>
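<p>The GNU behaviour is easy to reproduce in isolation (an illustrative sketch; assumes GNU <code class="language-plaintext highlighter-rouge">patch</code>, as found on Linux):</p>

```shell
# Create a read-only file and a one-line unified diff for it.
printf 'old\n' > file.txt
chmod 444 file.txt
printf -- '--- file.txt\n+++ file.txt\n@@ -1 +1 @@\n-old\n+new\n' > fix.patch
# GNU patch warns "File file.txt is read-only; trying to patch anyway"
# and applies the change; Apple's patch fails at this point instead.
patch -p0 < fix.patch
cat file.txt
```

<p>Afterwards, <code class="language-plaintext highlighter-rouge">file.txt</code> contains <code class="language-plaintext highlighter-rouge">new</code> despite its <code class="language-plaintext highlighter-rouge">444</code> mode, matching the warning-then-patch behaviour seen above.</p>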

<p>Updating the <code class="language-plaintext highlighter-rouge">dune</code> file to include <code class="language-plaintext highlighter-rouge">chmod</code> should both clear the warning and allow the use of the native patch.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(rule
 (deps
  (source_tree bitwuzla)
  (:p0001
   (file patch/0001-api-Add-hook-for-ocaml-z-value.patch))
  (:p0002
   (file patch/0002-binding-Fix-segfault-with-parallel-instances.patch)))
 (target .bitwuzla_tree)
 (action
  (no-infer
   (progn
    (run chmod -R u+w bitwuzla)
    (with-stdin-from
     %{p0001}
     (run patch -p1 --directory bitwuzla))
    (with-stdin-from
     %{p0002}
     (run patch -p1 --directory bitwuzla))
    (write-file %{target} "")))))
</code></pre></div></div>

<p>As an essential last step, we need to tidy up on this machine. Exit the shell. Refer back to the log file for the job and run all the remaining ZFS commands. On macOS, this is essential to keep the jobs database in sync with the layers.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="macOS,OBuilder" /><category term="tunbury.org" /><summary type="html"><![CDATA[The log from an OBuilder job starts with the steps needed to reproduce the job locally. This boilerplate output assumes that all OBuilder jobs start from a Docker base image, but on some operating systems, such as FreeBSD and macOS, OBuilder uses ZFS base images. On OpenBSD and Windows, it uses QEMU images. The situation is further complicated when the issue only affects a specific architecture that may be unavailable to the user.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/obuilder.png" /><media:content medium="image" url="https://www.tunbury.org/images/obuilder.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Otter Wiki with Raven Authentication</title><link href="https://www.tunbury.org/2025/05/07/otter-wiki-with-raven/" rel="alternate" type="text/html" title="Otter Wiki with Raven Authentication" /><published>2025-05-07T12:00:00+00:00</published><updated>2025-05-07T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/07/otter-wiki-with-raven</id><content type="html" xml:base="https://www.tunbury.org/2025/05/07/otter-wiki-with-raven/"><![CDATA[<p>We’d like to have a go at using <a href="https://otterwiki.com">Otter Wiki</a>, but rather than having yet more usernames and passwords, we would like to integrate this into the Raven authentication system. There is a <a href="https://docs.raven.cam.ac.uk/en/latest/apache-saml2/">guide on using SAML2 with Apache</a>.</p>

<p>The steps are:</p>
<ol>
  <li>Start the provided container.</li>
  <li>Visit http://your-container/Shibboleth.sso/Metadata and download the <code class="language-plaintext highlighter-rouge">Metadata</code>.</li>
  <li>Go to <a href="https://metadata.raven.cam.ac.uk">https://metadata.raven.cam.ac.uk</a> and create a new site by pasting in the metadata.</li>
  <li>Wait one minute and try to connect to http://your-container.</li>
</ol>

<p>Otter Wiki, when started with the environment variable <code class="language-plaintext highlighter-rouge">AUTH_METHOD=PROXY_HEADER</code>, reads the HTTP header fields <code class="language-plaintext highlighter-rouge">x-otterwiki-name</code>, <code class="language-plaintext highlighter-rouge">x-otterwiki-email</code> and <code class="language-plaintext highlighter-rouge">x-otterwiki-permissions</code>. See <a href="https://github.com/redimp/otterwiki/blob/main/docs/auth_examples/header-auth/README.md">this example</a>.</p>
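<p>A minimal sketch of the container side (the service layout, image tag and ports are my assumptions, not taken from the linked example):</p>

```yaml
# Hypothetical docker-compose sketch; image name and ports are assumptions.
services:
  otterwiki:
    image: redimp/otterwiki
    environment:
      - AUTH_METHOD=PROXY_HEADER   # trust identity headers set by the proxy
  apache:
    build: ./apache                # Apache + Shibboleth SP, setting the x-otterwiki-* headers
    ports:
      - "443:443"
```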

<p>Apache can be configured to set these header fields based upon the SAML user who is authenticated with Raven:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ShibUseEnvironment On
RequestHeader set x-otterwiki-name %{displayName}e
RequestHeader set x-otterwiki-email %{REMOTE_USER}s
RequestHeader set x-otterwiki-permissions "READ,WRITE,UPLOAD,ADMIN"
</code></pre></div></div>

<p>I have created a <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> file, which runs Apache as a reverse proxy in front of an Otter Wiki container and includes HTTPS support with a Let’s Encrypt certificate. The files are available on <a href="https://github.com/mtelvers/doc-samples/commit/5ca2f8934a4cf1269e60b2b18de563352f764f66">GitHub</a>.</p>

<p>The test site is <a href="https://otterwiki.tunbury.uk">https://otterwiki.tunbury.uk</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Otter,Raven" /><category term="tunbury.org" /><summary type="html"><![CDATA[We’d like to have a go using Otter Wiki, but rather than having yet more usernames and passwords, we would like to integrate this into the Raven authentication system. There is guide on using SAML2 with Apache]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/otter.png" /><media:content medium="image" url="https://www.tunbury.org/images/otter.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">iPXE boot for FreeBSD with an UEFI BIOS</title><link href="https://www.tunbury.org/2025/05/06/freebsd-uefi/" rel="alternate" type="text/html" title="iPXE boot for FreeBSD with an UEFI BIOS" /><published>2025-05-06T12:00:00+00:00</published><updated>2025-05-06T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/06/freebsd-uefi</id><content type="html" xml:base="https://www.tunbury.org/2025/05/06/freebsd-uefi/"><![CDATA[<p>I had assumed that booting FreeBSD over the network using iPXE would be pretty simple. There is even a <code class="language-plaintext highlighter-rouge">freebsd.ipxe</code> file included with Netboot.xyz. However, I quickly realised that most of the Internet wisdom on this process centred around legacy BIOS rather than UEFI. When booting with UEFI, the Netboot.xyz menu omits the FreeBSD option as it only supports legacy BIOS. Even in legacy mode, it uses <code class="language-plaintext highlighter-rouge">memdisk</code> from the Syslinux project rather than a FreeBSD loader.</p>

<p>FreeBSD expects to use <code class="language-plaintext highlighter-rouge">loader.efi</code> to boot and to mount the root directory over NFS based upon the DHCP scope option <code class="language-plaintext highlighter-rouge">root-path</code>. I didn’t want to provide an NFS server just for this process, but even when I gave in and set one up, it still didn’t work. I’m pleased that, in the final configuration, I didn’t need an NFS server.</p>

<p>Much of the frustration around doing this came from setting the <code class="language-plaintext highlighter-rouge">root-path</code> option. FreeBSD’s <code class="language-plaintext highlighter-rouge">loader.efi</code> sends its own DHCP request to the DHCP server, ignoring the options <code class="language-plaintext highlighter-rouge">set root-path</code> or <code class="language-plaintext highlighter-rouge">set dhcp.root-path</code> configured in iPXE.</p>

<p>Many <code class="language-plaintext highlighter-rouge">dhcpd.conf</code> snippets suggest a block similar to the one below, usually with a comment that it doesn’t work. Most authors fall back to setting <code class="language-plaintext highlighter-rouge">root-path</code> for the entire scope.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if exists user-class and option user-class = "FreeBSD" {
    option root-path "your-path";
}
</code></pre></div></div>

<p>I used <code class="language-plaintext highlighter-rouge">dhcpdump -i br0</code> to examine the DHCP packets. This showed an ASCII BEL character (0x07) before <code class="language-plaintext highlighter-rouge">FreeBSD</code> in the <code class="language-plaintext highlighter-rouge">user-class</code> string.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  TIME: 2025-05-07 08:51:03.811
    IP: 0.0.0.0 (2:0:0:0:0:22) &gt; 255.255.255.255 (ff:ff:ff:ff:ff:ff)
    OP: 1 (BOOTPREQUEST)
 HTYPE: 1 (Ethernet)
  HLEN: 6
  HOPS: 0
   XID: 00000001
  SECS: 0
 FLAGS: 0
CIADDR: 0.0.0.0
YIADDR: 0.0.0.0
SIADDR: 0.0.0.0
GIADDR: 0.0.0.0
CHADDR: 02:00:00:00:00:22:00:00:00:00:00:00:00:00:00:00
 SNAME: .
 FNAME: .
OPTION:  53 (  1) DHCP message type         3 (DHCPREQUEST)
OPTION:  50 (  4) Request IP address        x.y.z.250
OPTION:  54 (  4) Server identifier         x.y.z.1
OPTION:  51 (  4) IP address leasetime      300 (5m)
OPTION:  60 (  9) Vendor class identifier   PXEClient
OPTION:  77 (  8) User-class Identification 0746726565425344 .FreeBSD
OPTION:  55 (  7) Parameter Request List     17 (Root path)
					     12 (Host name)
					     16 (Swap server)
					      3 (Routers)
					      1 (Subnet mask)
					     26 (Interface MTU)
					     54 (Server identifier)
</code></pre></div></div>

<p>There is a <code class="language-plaintext highlighter-rouge">substring</code> command, so I was able to set the <code class="language-plaintext highlighter-rouge">root-path</code> like this successfully:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if exists user-class and substring ( option user-class, 1, 7 ) = "FreeBSD" {
    option root-path "your-path";
}
</code></pre></div></div>
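<p>The off-by-one is explained by the option encoding: a user-class instance is length-prefixed, so what arrives on the wire is the byte <code class="language-plaintext highlighter-rouge">0x07</code> followed by the seven ASCII characters, exactly as the <code class="language-plaintext highlighter-rouge">dhcpdump</code> output showed. The same offset arithmetic in shell (illustrative only):</p>

```shell
# 07 46 72 65 65 42 53 44 = a length byte followed by "FreeBSD";
# substring(option user-class, 1, 7) skips that length byte, like this:
raw=$(printf '\007FreeBSD')
echo "${raw:1:7}"   # prints: FreeBSD
```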

<p>The situation is further complicated as we are using a Ubiquiti Edge router. This requires the command to be encoded as a <code class="language-plaintext highlighter-rouge">subnet-parameters</code> value, which is injected into <code class="language-plaintext highlighter-rouge">/opt/vyatta/etc/dhcpd.conf</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set service dhcp-server shared-network-name lab subnet x.y.z.0/24 subnet-parameters 'if exists user-class and substring( option user-class, 1, 7 ) = &amp;quot;FreeBSD&amp;quot; { option root-path &amp;quot;tftp://x.y.z.240/freebsd14&amp;quot;;}'
</code></pre></div></div>

<p>The FreeBSD 14.2 installation <a href="https://download.freebsd.org/releases/amd64/amd64/ISO-IMAGES/14.2/FreeBSD-14.2-RELEASE-amd64-disc1.iso">ISO</a> contains the required <code class="language-plaintext highlighter-rouge">boot/loader.efi</code>, but we cannot use the extracted ISO as a root file system.</p>

<p>Stage <code class="language-plaintext highlighter-rouge">loader.efi</code> on a TFTP server; in my case, the TFTP root is <code class="language-plaintext highlighter-rouge">/netbootxyz/config/menus</code>. The iPXE file only needs to contain the <code class="language-plaintext highlighter-rouge">chain</code> command.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!ipxe
chain loader.efi
</code></pre></div></div>

<p>Download <a href="https://mfsbsd.vx.sk/files/iso/14/amd64/mfsbsd-14.2-RELEASE-amd64.iso">mfsBSD</a>, and extract the contents to a subfolder on the TFTP server; I used <code class="language-plaintext highlighter-rouge">freebsd14</code>. This ISO contains the kernel, <code class="language-plaintext highlighter-rouge">loader.conf</code> and a minimal root file system, <code class="language-plaintext highlighter-rouge">mfsroot.gz</code>.</p>

<p>With the contents of the mfsBSD ISO staged on the TFTP server and the modification to the DHCP scope options, the machine will boot into FreeBSD. Sign in with <code class="language-plaintext highlighter-rouge">root</code>/<code class="language-plaintext highlighter-rouge">mfsroot</code> and invoke <code class="language-plaintext highlighter-rouge">bsdinstall</code>.</p>

<p>On real hardware, rather than QEMU, I found that I needed to explicitly set the serial console by adding these lines to the end of <code class="language-plaintext highlighter-rouge">boot/loader.conf</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Serial console
console="comconsole"
comconsole_port="0x2f8"
comconsole_speed="115200"
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="FreeBSD,UEFI,iPXE" /><category term="tunbury.org" /><summary type="html"><![CDATA[I had assumed that booting FreeBSD over the network using iPXE would be pretty simple. There is even a freebsd.ipxe file included with Netboot.xyz. However, I quickly realised that most of the Internet wisdom on this process centred around legacy BIOS rather than UEFI. When booting with UEFI, the Netboot.xyz menu omits the FreeBSD option as it only supports legacy BIOS. Even in legacy mode, it uses memdisk from the Syslinux project rather than a FreeBSD loader.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/freebsd-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/freebsd-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OS Boot Media with Ventoy</title><link href="https://www.tunbury.org/2025/05/05/ventoy/" rel="alternate" type="text/html" title="OS Boot Media with Ventoy" /><published>2025-05-05T12:00:00+00:00</published><updated>2025-05-05T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/05/ventoy</id><content type="html" xml:base="https://www.tunbury.org/2025/05/05/ventoy/"><![CDATA[<p>I need to install a chunky Windows application (90GB download, +250 GB install), but all my Windows VMs are pretty small, so I decided to use a spare Dell OptiPlex 7090. It had Windows 10 installed, but it was pretty messy from the previous use, so I decided to install Windows 11. I had a Windows 11 ISO on hand, so I wrote that to a USB memory stick using the Raspberry Pi Imaging tool (effectively <code class="language-plaintext highlighter-rouge">dd</code> in this use case). The machine booted without issue, but the installation failed, citing “A media driver your computer needs is missing”. 
This error looked familiar: a mass storage driver was missing. I often see this in QEMU or similar situations, and it’s also common on server hardware. However, pressing Shift-F10 and opening <code class="language-plaintext highlighter-rouge">diskpart</code> showed all my storage.</p>

<p>It’s been a while since I installed Windows on real hardware. Mostly, I use QEMU with an ISO and an <code class="language-plaintext highlighter-rouge">autounattend.xml</code>, or PXE boot with Windows Deployment Services and the Microsoft Deployment Toolkit. It seems that some time ago, the ISO files that Microsoft publishes started to contain files larger than the standard allows, and thus the <code class="language-plaintext highlighter-rouge">dd</code> approach to creating boot media no longer works.</p>

<p>Microsoft produces a USB creation tool, but I couldn’t see how to tell it to use the ISO file that I already had! This happily led me to <a href="https://www.ventoy.net/en/index.html">Ventoy</a>. The tool installs a small bootloader (~30M) on the memory stick and formats the rest as an exFAT partition. Copy your ISO file(s) to the exFAT partition, and boot the machine from the memory stick. You are then presented with a simple menu allowing you to boot from any of the ISO files. I couldn’t help myself, I had to try to see if another OS would work too!</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ventoy" /><category term="tunbury.org" /><summary type="html"><![CDATA[I need to install a chunky Windows application (90GB download, +250 GB install), but all my Windows VMs are pretty small, so I decided to use a spare Dell OptiPlex 7090. It had Windows 10 installed, but it was pretty messy from the previous use, so I decided to install Windows 11. I had a Windows 11 ISO on hand, so I wrote that to a USB memory stick using the Raspberry Pi Imaging tool (effectively dd in this use case). The machine booted without issue, but the installation failed, citing “A media driver your computer needs is missing”. This error looked familiar: a mass storage driver was missing. I often see this in QEMU or similar situations, and it’s also common on server hardware. 
However, pressing Shift-F10 and opening diskpart showed all my storage.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ventoy.png" /><media:content medium="image" url="https://www.tunbury.org/images/ventoy.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">ZFS Send Streams</title><link href="https://www.tunbury.org/2025/05/02/zfs-send-streams/" rel="alternate" type="text/html" title="ZFS Send Streams" /><published>2025-05-02T20:00:00+00:00</published><updated>2025-05-02T20:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/02/zfs-send-streams</id><content type="html" xml:base="https://www.tunbury.org/2025/05/02/zfs-send-streams/"><![CDATA[<p>We often say that ZFS is an excellent replicated file system, but not the best <em>local</em> filesystem. This led me to think that if we run <code class="language-plaintext highlighter-rouge">zfs send</code> on one machine, we might want to write that out as a different filesystem. Is that even possible?</p>

<p>What is in a ZFS stream?</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fallocate <span class="nt">-l</span> 10G temp.zfs
zpool create tank <span class="sb">`</span><span class="nb">pwd</span><span class="sb">`</span>/temp.zfs 
zfs create tank/home
<span class="nb">cp </span>README.md /tank/home
zfs snapshot tank/home@send
zfs send tank/home@send | hexdump
</code></pre></div></div>

<p>I spent a little time writing an OCaml application to parse the record structure before realising that there already was a tool to do this: <code class="language-plaintext highlighter-rouge">zstreamdump</code>. Using the <code class="language-plaintext highlighter-rouge">-d</code> flag shows the contents; you can see your file in the dumped output.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zfs send tank/home@send | zstreamdump <span class="nt">-d</span>
</code></pre></div></div>

<p>However, this is <em>not</em> like a <code class="language-plaintext highlighter-rouge">tar</code> file. It is not a list of file names and their content. It is a list of block changes. ZFS is a tree structure with a snapshot and a volume being tree roots. The leaves of the tree may be unchanged between two snapshots. <code class="language-plaintext highlighter-rouge">zfs send</code> operates at the block level below the file system layer.</p>
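<p>The contrast with an archive format is easy to demonstrate: a tar stream carries a file-level index and can be listed without extraction, whereas a send stream can only be replayed as block records:</p>

```shell
# A tar stream is enumerable by file name; a ZFS send stream is not.
mkdir -p demo
echo hi > demo/README.md
tar cf demo.tar demo
tar tf demo.tar   # lists demo/ and demo/README.md
```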

<p>To emphasise this point, consider a <code class="language-plaintext highlighter-rouge">ZVOL</code> formatted as XFS. The structure of the send stream is the same: a record of block changes.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zfs create <span class="nt">-V</span> 1G tank/vol
mkfs.xfs /dev/zvol/tank/vol
zfs snapshot tank/vol@send
zfs send tank/vol@send | zstreamdump <span class="nt">-d</span>
</code></pre></div></div>

<p>ZVOLs are interesting as they give you a snapshot capability on a file system that doesn’t have one. However, some performance metrics I saw posted online showed disappointing results compared with creating a file and using a loopback device. Furthermore, such a snapshot would only be crash-consistent, as the file system inside the ZVOL is unaware that it is being snapshotted. XFS does have <code class="language-plaintext highlighter-rouge">xfsdump</code> and <code class="language-plaintext highlighter-rouge">xfsrestore</code>, but they are pretty basic tools.</p>

<p>[1] See also <a href="https://openzfs.org/wiki/Documentation/ZfsSend">ZfsSend Documentation</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="openzfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[We often say that ZFS is an excellent replicated file system, but not the best local filesystem. This led me to think that if we run zfs send on one machine, we might want to write that out as a different filesystem. Is that even possible?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/openzfs.png" /><media:content medium="image" url="https://www.tunbury.org/images/openzfs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Reconfiguring a system with an mdadm RAID5 root</title><link href="https://www.tunbury.org/2025/05/01/removing-mdadm/" rel="alternate" type="text/html" title="Reconfiguring a system with an mdadm RAID5 root" /><published>2025-05-01T12:00:00+00:00</published><updated>2025-05-01T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/05/01/removing-mdadm</id><content type="html" xml:base="https://www.tunbury.org/2025/05/01/removing-mdadm/"><![CDATA[<p>Cloud providers automatically configure their machines as they expect you to use them. For example, a machine with 4 x 8T disks might come configured with an mdadm RAID5 array spanning the disks. This may be what most people want, but we don’t want this configuration, as we want to see the bare disks. Given you have only a serial console (over SSH) and no access to the cloud-init environment, how do you boot the machine in a different configuration?</p>

<p>Example configuration:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
fd0       2:0    1    4K  0 disk
sda       8:0    0    4G  0 disk
├─sda1    8:1    0  512M  0 part  /boot/efi
└─sda2    8:2    0  3.5G  0 part
  └─md0   9:0    0 10.5G  0 raid5 /
sdb       8:16   0    4G  0 disk
└─sdb1    8:17   0    4G  0 part
  └─md0   9:0    0 10.5G  0 raid5 /
sdc       8:32   0    4G  0 disk
└─sdc1    8:33   0    4G  0 part
  └─md0   9:0    0 10.5G  0 raid5 /
sdd       8:48   0    4G  0 disk
└─sdd1    8:49   0    4G  0 part
  └─md0   9:0    0 10.5G  0 raid5 /
</code></pre></div></div>

<p>My initial approach was to create a tmpfs root filesystem and then use <code class="language-plaintext highlighter-rouge">pivot_root</code> to switch it. This worked, except that <code class="language-plaintext highlighter-rouge">/dev/md0</code> was still busy, so I could not unmount it.</p>

<p>It occurred to me that I could remove one of the partitions from the RAID5 set and use it as the new root. <code class="language-plaintext highlighter-rouge">mdadm --fail /dev/md0 /dev/sda2</code>, followed by <code class="language-plaintext highlighter-rouge">mdadm --remove /dev/md0 /dev/sda2</code>, frees up a partition. <code class="language-plaintext highlighter-rouge">debootstrap</code> can then be used to install Ubuntu on it. As we have a working system, we can preserve the key configuration settings, such as <code class="language-plaintext highlighter-rouge">/etc/hostname</code>, <code class="language-plaintext highlighter-rouge">/etc/netplan</code> and <code class="language-plaintext highlighter-rouge">/etc/fstab</code>, by copying them from <code class="language-plaintext highlighter-rouge">/etc</code> to <code class="language-plaintext highlighter-rouge">/mnt/etc</code>. Unfortunately, Ansible’s copy module does not preserve ownership, so I used <code class="language-plaintext highlighter-rouge">rsync</code> instead. <code class="language-plaintext highlighter-rouge">/etc/fstab</code> must be edited to reflect the new root partition.</p>
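
<p>As a sketch, assuming the old array was <code class="language-plaintext highlighter-rouge">/dev/md0</code> and the freed partition is <code class="language-plaintext highlighter-rouge">/dev/sda2</code> (device names here are illustrative), the root entry can be rewritten with <code class="language-plaintext highlighter-rouge">sed</code>:</p>

```shell
# Illustrative fstab line: swap the md0 root device for the freed partition
echo '/dev/md0 / ext4 errors=remount-ro 0 1' | sed 's|^/dev/md0|/dev/sda2|'
```

<p>Against the real file, the same expression would be applied in place with <code class="language-plaintext highlighter-rouge">sed -i</code> to the copied <code class="language-plaintext highlighter-rouge">/mnt/etc/fstab</code>.</p>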

<p>Lastly, bind-mount <code class="language-plaintext highlighter-rouge">/dev</code>, <code class="language-plaintext highlighter-rouge">/proc</code> and <code class="language-plaintext highlighter-rouge">/sys</code> into the new root, run <code class="language-plaintext highlighter-rouge">grub-install</code> in a <code class="language-plaintext highlighter-rouge">chroot</code> of the new environment, and reboot.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
fd0      2:0    1    4K  0 disk
sda      8:0    0    4G  0 disk
├─sda1   8:1    0  512M  0 part /boot/efi
└─sda2   8:2    0  3.5G  0 part /
sdb      8:16   0    4G  0 disk
└─sdb1   8:17   0    4G  0 part
sdc      8:32   0    4G  0 disk
└─sdc1   8:33   0    4G  0 part
sdd      8:48   0    4G  0 disk
└─sdd1   8:49   0    4G  0 part
</code></pre></div></div>

<p>The redundant RAID5 partitions can be removed with <code class="language-plaintext highlighter-rouge">wipefs -af /dev/sd[b-d]</code></p>

<p>I have wrapped all the steps in an Ansible <a href="https://gist.github.com/mtelvers/1fe3571830d982eb8adbcf5a513edb2c">playbook</a>, which is available as a GitHub gist.</p>

<h1 id="addendum">Addendum</h1>

<p>I had tested this in QEMU with EFI under the assumption that a newly provisioned cloud machine would use EFI. However, when I ran the script against the machine, I found it used a legacy bootloader, and it was even more complicated than I had envisioned, as there were three separate MDADM arrays in place:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0] [raid10] 
md2 : active raid5 sdb4[0] sdd4[2] sda4[4] sdc4[1]
      34252403712 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 2/86 pages [8KB], 65536KB chunk

md1 : active raid5 sdd3[1] sda3[2] sdc3[0] sdb3[4]
      61381632 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      
md0 : active raid1 sdd2[1] sda2[2] sdb2[3] sdc2[0]
      523264 blocks super 1.2 [4/4] [UUUU]
      
unused devices: &lt;none&gt;
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">lsblk</code> showed four disks, each configured as below:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>NAME        MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0 10.9T  0 disk  
├─sda1        8:1    0    1M  0 part  
├─sda2        8:2    0  512M  0 part  
│ └─md0       9:0    0  511M  0 raid1 
│   └─md0p1 259:0    0  506M  0 part  /boot
├─sda3        8:3    0 19.5G  0 part  
│ └─md1       9:1    0 58.5G  0 raid5 
│   └─md1p1 259:1    0 58.5G  0 part  /
├─sda4        8:4    0 10.6T  0 part  
│ └─md2       9:2    0 31.9T  0 raid5 
│   └─md2p1 259:2    0 31.9T  0 part  /data
└─sda5        8:5    0  512M  0 part  [SWAP]
</code></pre></div></div>

<p>The boot device is a RAID1 mirror (four copies), so removing one of the copies is no issue. The first partition is a 1MB BIOS boot partition, giving GRUB some space. The root device was RAID5, as I had anticipated.</p>

<p>The playbook could be adapted: double up on the <code class="language-plaintext highlighter-rouge">mdadm</code> commands to break two arrays, update two entries in <code class="language-plaintext highlighter-rouge">/etc/fstab</code> and use <code class="language-plaintext highlighter-rouge">grub-pc</code> rather than <code class="language-plaintext highlighter-rouge">grub-efi-amd64</code>. The updated playbook is <a href="https://gist.github.com/mtelvers/ba3b7a5974b50422e2c2e594bed0bdb2">here</a>.</p>

<p>For testing, I installed Ubuntu using this <a href="https://gist.github.com/mtelvers/d2d333bf5c9bd94cb905488667f0cae1">script</a> to simulate the VM.</p>

<p>Improvements could be made: <code class="language-plaintext highlighter-rouge">/boot</code> could be merged into <code class="language-plaintext highlighter-rouge">/</code>, as there is no reason to separate them when not using EFI. In fact, there never <em>needed</em> to be a separate <code class="language-plaintext highlighter-rouge">/boot</code>, as GRUB2 will boot from an mdadm RAID5 array.</p>

<p>The system is a pretty minimal installation of Ubuntu; a more typical set of tools can be installed with:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install ubuntu-standard
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="mdadm,ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[Cloud providers automatically configure their machines as they expect you to use them. For example, a machine with 4 x 8T disks might come configured with an mdadm RAID5 array spanning the disks. This may be what most people want, but we don’t want this configuration, as we want to see the bare disks. Given you have only a serial console (over SSH) and no access to the cloud-init environment, how do you boot the machine in a different configuration?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/mdadm.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/mdadm.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Distributed ZFS Storage</title><link href="https://www.tunbury.org/2025/04/29/distributed-zfs-storage/" rel="alternate" type="text/html" title="Distributed ZFS Storage" /><published>2025-04-29T20:00:00+00:00</published><updated>2025-04-29T20:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/29/distributed-zfs-storage</id><content type="html" xml:base="https://www.tunbury.org/2025/04/29/distributed-zfs-storage/"><![CDATA[<p>Following Anil’s <a href="https://anil.recoil.org/notes/syncoid-sanoid-zfs">note</a>, we will design and implement a distributed storage archive system for ZFS volumes and associated metadata. <em>Metadata</em> here refers to key information about the dataset itself:</p>

<ul>
  <li>A summary of what the dataset is</li>
  <li>Data retention requirement (both legal and desirable)</li>
  <li>Time/effort/cost required to reproduce the data</li>
  <li>Legal framework under which the data is available, restrictions on the distribution of the data, etc.</li>
</ul>

<p>And also refers to the more <em>systems</em> style meanings such as:</p>

<ul>
  <li>Size of the dataset</li>
  <li>List of machines/ZFS pools where the data is stored</li>
  <li>Number and distribution of copies required</li>
  <li>Snapshot and replication frequency/policy</li>
</ul>

<p>These data will be stored in JSON, YAML, or another structured file format.</p>
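
<p>A sketch of what such a metadata file might contain; every field name and value below is illustrative rather than a settled schema:</p>

```yaml
# Illustrative dataset metadata; field names are not a settled schema
name: example-dataset
summary: Raw measurement data from project X
retention:
  legal: none
  desired: 5 years
reproduction-effort: ~2 weeks of compute time
licence: CC-BY-4.0
size: 1.2T
copies-required: 2
snapshot-frequency: daily
replicas:
  - host: chives.caelum.ci.dev
    pool: tank
  - host: basil.caelum.ci.dev
    pool: tank
```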

<p>The system would have a database of machines and their associated storage (disks/zpools/etc) and location. Each item of storage would have a ‘failure domain’ to logically group resources for redundancy. This would allow copies of a dataset to be placed in different domains to meet the redundancy requirements. For example, given that we are committed to holding two distinct copies of the data, would we use RAIDZ on the local disks or just a dynamic stripe, RAID0, to maximise capacity?</p>

<p>While under development, the system will output recommended actions, in the form of shell commands, to perform the snapshot and replication steps necessary to meet the replication and redundancy policies. Ultimately, these commands could be executed automatically.</p>
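
<p>The emitted actions would be ordinary ZFS commands along these lines (dataset, snapshot and host names are illustrative):</p>

```shell
# Snapshot locally, then replicate to a pool in a different failure domain.
# -w sends the raw (still encrypted) stream, so the receiver never needs the key.
zfs snapshot tank/data@2025-04-29
zfs send -w tank/data@2025-04-29 | ssh basil.caelum.ci.dev zfs recv tank/replica/data
```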

<p>Utilising ZFS encryption, the remote pools can be stored as an encrypted filesystem without the encryption keys.</p>

<p>When the data is being processed, it will be staged locally on the worker’s NVMe drive for performance, and the resultant dataset <em>may</em> be uploaded with a new dataset of metadata.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="openzfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[Following Anil’s note, we will design and implement a distributed storage archive system for ZFS volumes and associated metadata. Metadata here refers to key information about the dataset itself:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/openzfs.png" /><media:content medium="image" url="https://www.tunbury.org/images/openzfs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raptor Talos II - POWER9 unreliability</title><link href="https://www.tunbury.org/2025/04/29/raptor-talos-ii/" rel="alternate" type="text/html" title="Raptor Talos II - POWER9 unreliability" /><published>2025-04-29T12:00:00+00:00</published><updated>2025-04-29T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/29/raptor-talos-ii</id><content type="html" xml:base="https://www.tunbury.org/2025/04/29/raptor-talos-ii/"><![CDATA[<p>We have two Raptor Computing Talos II POWER9 machines. One of these has had issues for some time and cannot run for more than 20 minutes before locking up completely. Over the last few days, our second machine has exhibited similar issues and needs to be power-cycled every ~24 hours. I spent some time today trying to diagnose the issue with the first machine, removing the motherboard as recommended by Raptor support, to see if the issue still exists with nothing else connected. Sadly, it does. I noted that a firmware update is available, which would move from v2.00 to v2.10.</p>

<p><img src="/images/raptor-computing.jpeg" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="power9" /><category term="tunbury.org" /><summary type="html"><![CDATA[We have two Raptor Computing Talos II POWER9 machines. One of these has had issues for some time and cannot run for more than 20 minutes before locking up completely. Over the last few days, our second machine has exhibited similar issues and needs to be power-cycled every ~24 hours. I spent some time today trying to diagnose the issue with the first machine, removing the motherboard as recommended by Raptor support, to see if the issue still exists with nothing else connected. Sadly, it does. I noted that a firmware update is available, which would move from v2.00 to v2.10.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/raptor-talos-ii.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/raptor-talos-ii.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Equinix Moves</title><link href="https://www.tunbury.org/2025/04/29/equinix-moves/" rel="alternate" type="text/html" title="Equinix Moves" /><published>2025-04-29T00:00:00+00:00</published><updated>2025-04-29T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/29/equinix-moves</id><content type="html" xml:base="https://www.tunbury.org/2025/04/29/equinix-moves/"><![CDATA[<p>The moves of registry.ci.dev, opam-repo-ci, and get.dune.build have followed the template of <a href="https://www.tunbury.org/ocaml-ci/">OCaml-CI</a>. Notable differences have been that I have hosted <code class="language-plaintext highlighter-rouge">get.dune.build</code> in a VM, as the services required very little disk space or CPU/RAM. 
For opam-repo-ci, the <code class="language-plaintext highlighter-rouge">rsync</code> was pretty slow, so I tried running multiple instances using GNU parallel with marginal gains.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd</span> /var/lib/docker/volumes2/opam-repo-ci_data/_data/var/job
<span class="nb">ls</span> <span class="nt">-d</span> <span class="k">*</span> | parallel <span class="nt">-j</span> 5 rsync <span class="nt">-azh</span> c2-4.equinix.ci.dev:/var/lib/docker/volumes/opam-repo-ci_data/_data/var/job/<span class="o">{}</span>/ <span class="o">{}</span>/
</code></pre></div></div>

<p>The Ansible configuration script for OCaml-CI is misnamed as it configures the machine and deploys infrastructure: Caddy, Grafana, Prometheus and Docker secrets, but not the Docker stack. The Docker stack for OCaml-CI is deployed by <code class="language-plaintext highlighter-rouge">make deploy-stack</code> from <a href="https://github.com/ocurrent/ocaml-ci">ocurrent/ocaml-ci</a>. Conversely, opam-repo-ci <em>is</em> deployed from the Ansible playbook, but there is a <code class="language-plaintext highlighter-rouge">Makefile</code> and an outdated <code class="language-plaintext highlighter-rouge">stack.yml</code> in <a href="https://github.com/ocurrent/opam-repo-ci">ocurrent/opam-repo-ci</a>.</p>

<p>As part of the migration away from Equinix, these services have been merged into a single large machine <code class="language-plaintext highlighter-rouge">chives.caelum.ci.dev</code>. With this change, I have moved the Docker stack configuration for opam-repo-ci back to the repository <a href="https://github.com/ocurrent/opam-repo-ci/pull/428">PR#428</a> and merged and renamed the machine configuration <a href="https://github.com/mtelvers/ansible/pull/44">PR#44</a>.</p>

<p>We want to thank Equinix for supporting OCaml over the years.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="registry.ci.dev,opam-repo-ci,get.dune.build" /><category term="tunbury.org" /><summary type="html"><![CDATA[The moves of registry.ci.dev, opam-repo-ci, and get.dune.build have followed the template of OCaml-CI. Notable differences have been that I have hosted get.dune.build in a VM, as the services required very little disk space or CPU/RAM. For opam-repo-ci, the rsync was pretty slow, so I tried running multiple instances using GNU parallel with marginal gains.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/equinix.png" /><media:content medium="image" url="https://www.tunbury.org/images/equinix.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Moving OCaml-CI</title><link href="https://www.tunbury.org/2025/04/27/ocaml-ci/" rel="alternate" type="text/html" title="Moving OCaml-CI" /><published>2025-04-27T00:00:00+00:00</published><updated>2025-04-27T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/27/ocaml-ci</id><content type="html" xml:base="https://www.tunbury.org/2025/04/27/ocaml-ci/"><![CDATA[<p>As noted on Thursday, the various OCaml services will need to be moved away from Equinix. Below are my notes on moving OCaml-CI.</p>

<p>Generate an SSH key on the new server <code class="language-plaintext highlighter-rouge">chives</code> using <code class="language-plaintext highlighter-rouge">ssh-keygen -t ed25519</code>. Copy the public key to <code class="language-plaintext highlighter-rouge">c2-3.equinix.ci.dev</code> and save it under <code class="language-plaintext highlighter-rouge">~/.ssh/authorized_keys</code>.</p>

<p>Use <code class="language-plaintext highlighter-rouge">rsync</code> to mirror the Docker volumes. <code class="language-plaintext highlighter-rouge">-z</code> did improve performance as there appears to be a rate limiter somewhere in the path.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rsync <span class="nt">-azvh</span> <span class="nt">--progress</span> c2-3.equinix.ci.dev:/var/lib/docker/volumes/ /var/lib/docker/volumes/
</code></pre></div></div>

<p>After completing the copy, I waited for a quiet moment, and then scaled all of the Docker services to 0. I prefer to scale the services rather than remove them, as the recovery is much easier.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker service scale <span class="nv">infra_grafana</span><span class="o">=</span>0
docker service scale <span class="nv">infra_prometheus</span><span class="o">=</span>0
docker service scale ocaml-ci_ci<span class="o">=</span>0
docker service scale ocaml-ci_gitlab<span class="o">=</span>0
docker service scale ocaml-ci_web<span class="o">=</span>0
</code></pre></div></div>

<p>For the final copy, I used <code class="language-plaintext highlighter-rouge">--checksum</code> and also added <code class="language-plaintext highlighter-rouge">--delete</code>, as the Prometheus database creates segment files that are periodically merged into the main database.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rsync <span class="nt">-azvh</span> <span class="nt">--checksum</span> <span class="nt">--delete</span> <span class="nt">--progress</span> c2-3.equinix.ci.dev:/var/lib/docker/volumes/ /var/lib/docker/volumes/
</code></pre></div></div>

<p>The machine configuration is held in an Ansible Playbook, which includes the Docker stack for Grafana and Prometheus. It can be easily applied to the new machine:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ansible-playbook <span class="nt">-e</span> @secrets/ocaml.ci.dev.yml <span class="nt">--vault-password-file</span> secrets/vault-password ocaml.ci.dev.yml
</code></pre></div></div>

<p>OCaml-CI’s Docker stack is held on GitHub <a href="https://github.com/ocurrent/ocaml-ci">ocurrent/ocaml-ci</a> and can be deployed with:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>make deploy-stack
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ocaml-ci" /><category term="tunbury.org" /><summary type="html"><![CDATA[As noted on Thursday, the various OCaml services will need to be moved away from Equinix. Below are my notes on moving OCaml-CI.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Bluesky SSH Authentication #2</title><link href="https://www.tunbury.org/2025/04/26/bluesky-ssh-authentication-2/" rel="alternate" type="text/html" title="Bluesky SSH Authentication #2" /><published>2025-04-26T00:00:00+00:00</published><updated>2025-04-26T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/26/bluesky-ssh-authentication-2</id><content type="html" xml:base="https://www.tunbury.org/2025/04/26/bluesky-ssh-authentication-2/"><![CDATA[<p>Addressing the glaring omissions from yesterday’s proof of concept, such as the fact that you could sign in as any user, you couldn’t revoke access, all hosts had the same users, and there was no mapping between Bluesky handles and POSIX users, I have updated <a href="https://github.com/mtelvers/bluesky-ssh-key-extractor">mtelvers/bluesky-ssh-key-extractor</a> and newly published <a href="https://github.com/mtelvers/bluesky-collection.git">mtelvers/bluesky-collection</a>.</p>

<p>The tool creates ATProto collections using <code class="language-plaintext highlighter-rouge">app.bsky.graph.list</code> and populates them with <code class="language-plaintext highlighter-rouge">app.bsky.graph.listitem</code> records.</p>

<p>Each list should be named with a friendly identifier such as the FQDN of the host being secured. List entries have a <code class="language-plaintext highlighter-rouge">subject_did</code>, which is the DID of the user you are giving access to, and a <code class="language-plaintext highlighter-rouge">displayName</code>, which is used as the POSIX username on the system you are connecting to.</p>
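
<p>For illustration, a list entry stored on the ATProto network might look roughly like this; the exact record layout is the tool’s own, so treat the field names as indicative only:</p>

```json
{
  "$type": "app.bsky.graph.listitem",
  "subject_did": "did:plc:476rmswt6ji7uoxyiwjna3ti",
  "displayName": "mte24"
}
```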

<p>A typical usage would be creating a collection and adding records. Here I have made a collection called <code class="language-plaintext highlighter-rouge">rosemary.caelum.ci.dev</code> and then added two users, <code class="language-plaintext highlighter-rouge">anil.recoil.org</code> and <code class="language-plaintext highlighter-rouge">mtelvers.tunbury.org</code>, with POSIX usernames of <code class="language-plaintext highlighter-rouge">avsm2</code> and <code class="language-plaintext highlighter-rouge">mte24</code> respectively (see my <a href="https://www.atproto-browser.dev/at/did:plc:476rmswt6ji7uoxyiwjna3ti">Bluesky record</a>).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bluesky_collection create --handle mtelvers.tunbury.org --password *** --collection rosemary.caelum.ci.dev
bluesky_collection add --handle mtelvers.tunbury.org --password *** --collection rosemary.caelum.ci.dev --user-handle anil.recoil.org --user-id avsm2
bluesky_collection add --handle mtelvers.tunbury.org --password *** --collection rosemary.caelum.ci.dev --user-handle mtelvers.tunbury.org --user-id mte24
</code></pre></div></div>

<p>When authenticating using SSHD, the companion tool <a href="https://github.com/mtelvers/bluesky-ssh-key-extractor">mtelvers/bluesky-ssh-key-extractor</a> would have command line parameters of the Bluesky user account holding the collection, the collection name (aka the hostname), and the POSIX username (provided by SSHD). The authenticator queries the Bluesky network to find the collection matching the FQDN, then finds the list entries, comparing them to the POSIX user given. If there is a match, the <code class="language-plaintext highlighter-rouge">subject_did</code> is used to look up the associated <code class="language-plaintext highlighter-rouge">sh.tangled.publicKey</code>. The authenticator requires no password to access Bluesky, as all the records are public.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bluesky,sshd" /><category term="tunbury.org" /><summary type="html"><![CDATA[Addressing the glaring omissions from yesterday’s proof of concept, such as the fact that you could sign in as any user, you couldn’t revoke access, all hosts had the same users, and there was no mapping between Bluesky handles and POSIX users, I have updated mtelvers/bluesky-ssh-key-extractor and newly published mtelvers/bluesky-collection.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/bluesky-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/bluesky-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Bluesky SSH Authentication</title><link href="https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/" rel="alternate" type="text/html" title="Bluesky SSH Authentication" /><published>2025-04-25T15:00:00+00:00</published><updated>2025-04-25T15:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication</id><content type="html"
xml:base="https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/"><![CDATA[<p>If you have signed up to <a href="https://tangled.sh">tangled.sh</a>, you will have published your SSH public key on the Bluesky ATProto network. Have a browse to your Bluesky ID, or <a href="https://www.atproto-browser.dev/at/did:plc:476rmswt6ji7uoxyiwjna3ti">mine</a>. Look under <code class="language-plaintext highlighter-rouge">sh.tangled.publicKey</code>.</p>

<p><a href="https://github.com/mtelvers/bluesky-ssh-key-extractor.git">BlueSky ATproto SSH Public Key Extractor</a> extracts this public key information and outputs one public key at a time. The format is suitable for use with the <code class="language-plaintext highlighter-rouge">AuthorizedKeysCommand</code> parameter in your <code class="language-plaintext highlighter-rouge">/etc/ssh/sshd_config</code> file.</p>

<p>Build the project:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>opam <span class="nb">install</span> <span class="nb">.</span> <span class="nt">--deps-only</span>
dune build
</code></pre></div></div>

<p>Install the binary by copying it to the local system. Setting the ownership and permissions is essential.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cp </span>_build/install/default/bin/bluesky-ssh-key-extractor /usr/local/bin
<span class="nb">chmod </span>755 /usr/local/bin/bluesky-ssh-key-extractor
<span class="nb">chown </span>root:root /usr/local/bin/bluesky-ssh-key-extractor
</code></pre></div></div>

<p>Test the command is working:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>bluesky-ssh-key-extractor mtelvers.tunbury.org
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i mark.elvers@tunbury.org
</code></pre></div></div>

<p>If that works, edit your <code class="language-plaintext highlighter-rouge">/etc/ssh/sshd_config</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>AuthorizedKeysCommand /usr/local/bin/bluesky-ssh-key-extractor your_bluesky_handle
AuthorizedKeysCommandUser nobody
</code></pre></div></div>

<p>Now you should be able to SSH to the machine using your published key:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh root@your_host
</code></pre></div></div>

<blockquote>
  <p>Note, this program was intended as a proof of concept rather than something you’d actually use.</p>
</blockquote>

<p>If you have a 1:1 mapping between Bluesky accounts and system usernames, you might get away with:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>AuthorizedKeysCommand /usr/local/bin/bluesky-ssh-key-extractor %u.bsky.social
AuthorizedKeysCommandUser nobody
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bluesky,sshd" /><category term="tunbury.org" /><summary type="html"><![CDATA[If you have sign up to tangled.sh you will have published your SSH public key on the Bluesky ATproto network. Have a browse to your Bluesky ID, or mine. Look under sh.tangled.publicKey.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/bluesky-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/bluesky-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Blade Server Reallocation</title><link href="https://www.tunbury.org/2025/04/25/blade-reallocation/" rel="alternate" type="text/html" title="Blade Server Reallocation" /><published>2025-04-25T10:15:00+00:00</published><updated>2025-04-25T10:15:00+00:00</updated><id>https://www.tunbury.org/2025/04/25/blade-reallocation</id><content type="html" xml:base="https://www.tunbury.org/2025/04/25/blade-reallocation/"><![CDATA[<p>We have changed our mind about using <code class="language-plaintext highlighter-rouge">dm-cache</code> in the SSD/RAID1 configuration. The current thinking is that the mechanical drives would be better served as extra capacity for our distributed ZFS infrastructure, where we intend to have two copies of all data, and these disks represent ~100TB of storage.</p>

<p>As mentioned previously, we have a deadline of Wednesday, 30th April, to move the workloads from the Equinix machines or incur hosting fees.</p>

<p>I also noted that the SSD capacity is 1.7TB in all cases. The new distribution is:</p>

<ul>
  <li>rosemary: FreeBSD CI Worker (releasing spring &amp; summer)</li>
  <li>oregano: OpenBSD CI Worker (releasing bremusa)</li>
  <li>basil: Equinix c2-2 (registry.ci.dev)</li>
  <li>mint: @mte24 workstation</li>
  <li>thyme: spare</li>
  <li>chives: Equinix c2-4 (opam-repo-ci) + Equinix c2-3 (OCaml-ci) + Equinix c2-1 (preview.dune.dev)</li>
  <li>dill: spare</li>
  <li>sage: docs-ci (new implementation, eventually replacing eumache)</li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[We have changed our mind about using dm-cache in the SSD/RAID1 configuration. The current thinking is that the mechanical drives would be better served as extra capacity for our distributed ZFS infrastructure, where we intend to have two copies of all data, and these disks represent ~100TB of storage.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/supermicro.png" /><media:content medium="image" url="https://www.tunbury.org/images/supermicro.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml Infra Map</title><link href="https://www.tunbury.org/2025/04/24/infra-map/" rel="alternate" type="text/html" title="OCaml Infra Map" /><published>2025-04-24T10:00:00+00:00</published><updated>2025-04-24T10:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/24/infra-map</id><content type="html" xml:base="https://www.tunbury.org/2025/04/24/infra-map/"><![CDATA[<p>Yesterday, we were talking about extending the current infrastructure database to incorporate other information to provide prompts to return machines to the pool of resources after they have completed their current role/loan, etc. There is also a wider requirement to bring these services back to Cambridge from Equinix/Scaleway, which will be the subject of a follow-up post. However, the idea of extending the database made me think that it would be amusing to overlay the machine’s positions onto Google Maps.</p>

<p>I added positioning data in the Jekyll Collection <code class="language-plaintext highlighter-rouge">_machines/*.md</code> for each machine, e.g. <a href="https://raw.githubusercontent.com/ocaml/infrastructure/refs/heads/master/_machines/ainia.md">ainia.md</a></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
name: ainia
...
latitude: 52.2109
longitude: 0.0917
---
</code></pre></div></div>

<p>Then Jekyll’s Liquid templating engine can create a JavaScript array for us:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
  <span class="c1">// Define machines data array from Jekyll collection</span>
  <span class="kd">const</span> <span class="nx">machinesData</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span><span class="o">%</span> <span class="k">for</span> <span class="nx">machine</span> <span class="k">in</span> <span class="nx">site</span><span class="p">.</span><span class="nx">machines</span> <span class="o">%</span><span class="p">}</span>
      <span class="p">{</span><span class="o">%</span> <span class="k">if</span> <span class="nx">machine</span><span class="p">.</span><span class="nx">latitude</span> <span class="nx">and</span> <span class="nx">machine</span><span class="p">.</span><span class="nx">longitude</span> <span class="o">%</span><span class="p">}</span>
      <span class="p">{</span>
        <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">{{ machine.name }}</span><span class="dl">"</span><span class="p">,</span>
        <span class="na">lat</span><span class="p">:</span> <span class="p">{{</span> <span class="nx">machine</span><span class="p">.</span><span class="nx">latitude</span> <span class="p">}},</span>
        <span class="na">lng</span><span class="p">:</span> <span class="p">{{</span> <span class="nx">machine</span><span class="p">.</span><span class="nx">longitude</span> <span class="p">}},</span>
        <span class="p">{</span><span class="o">%</span> <span class="k">if</span> <span class="nx">machine</span><span class="p">.</span><span class="nx">description</span> <span class="o">%</span><span class="p">}</span>
        <span class="nl">description</span><span class="p">:</span> <span class="dl">"</span><span class="s2">{{ machine.description | escape }}</span><span class="dl">"</span><span class="p">,</span>
        <span class="p">{</span><span class="o">%</span> <span class="nx">endif</span> <span class="o">%</span><span class="p">}</span>
        <span class="c1">// Add any other properties you need</span>
      <span class="p">},</span>
      <span class="p">{</span><span class="o">%</span> <span class="nx">endif</span> <span class="o">%</span><span class="p">}</span>
    <span class="p">{</span><span class="o">%</span> <span class="nx">endfor</span> <span class="o">%</span><span class="p">}</span>
  <span class="p">];</span>

</code></pre></div></div>

<p>This array can be converted into an array of map markers. Google provides an API for clustering nearby markers into a single marker showing a count of machines. I added a random offset to each location to avoid all the markers piling up on a single spot.</p>
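<p>The random offset can be as simple as a small uniform jitter on each coordinate. A sketch (not the site’s actual code; the field names match the <code class="language-plaintext highlighter-rouge">machinesData</code> array above, the entries and the ±0.0005° magnitude, roughly ±50 m, are my own choices):</p>

```javascript
// Jitter each machine's position by up to ~0.0005 degrees (~50 m)
// so co-located machines don't stack on exactly the same point.
const machinesData = [
  { name: "ainia", lat: 52.2109, lng: 0.0917 },
  { name: "example", lat: 52.2109, lng: 0.0917 },
];

function jitter(machine, maxDeg = 0.0005) {
  return {
    ...machine,
    lat: machine.lat + (Math.random() - 0.5) * 2 * maxDeg,
    lng: machine.lng + (Math.random() - 0.5) * 2 * maxDeg,
  };
}

// Note: use an arrow function rather than machinesData.map(jitter),
// which would pass the array index as the maxDeg argument.
const markers = machinesData.map((m) => jitter(m));
console.log(markers);
```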

<p>The interactive map can be seen at <a href="https://infra.ocaml.org/machines.html">machines.html</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Yesterday, we were talking about extending the current infrastructure database to incorporate other information to provide prompts to return machines to the pool of resources after they have completed their current role/loan, etc. There is also a wider requirement to bring these services back to Cambridge from Equinix/Scaleway, which will be the subject of a follow-up post. However, the idea of extending the database made me think that it would be amusing to overlay the machine’s positions onto Google Maps.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-map.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-map.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Blade Server Allocation</title><link href="https://www.tunbury.org/2025/04/23/blade-allocation/" rel="alternate" type="text/html" title="Blade Server Allocation" /><published>2025-04-23T00:00:00+00:00</published><updated>2025-04-23T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/23/blade-allocation</id><content type="html" xml:base="https://www.tunbury.org/2025/04/23/blade-allocation/"><![CDATA[<p>Equinix has stopped commercial sales of Metal and will sunset the service at the end of June 2026. Equinix has long been a supporter of OCaml and has provided free credits to use on their Metal platform. These credits are coming to an end at the end of this month, meaning that we need to move some of our services away from Equinix. We have two new four-node blade servers, which will become the new home for these services. 
The blades have dual 10C/20T processors with either 192GB or 256GB of RAM and a combination of SSD and spinning disk.</p>

<p>192GB, 20C/40T with 1.1TB SSD, 2 x 6T disks</p>
<ul>
  <li>rosemary: FreeBSD CI Worker (releasing spring &amp; summer)</li>
  <li>oregano: OpenBSD CI Worker (releasing bremusa)</li>
  <li>basil: docs-ci (new implementation, eventually replacing eumache)</li>
  <li>mint: spare</li>
</ul>

<p>256GB, 20C/40T with 1.5TB SSD, 2 x 8T disks</p>
<ul>
  <li>thyme: Equinix c2-2 (registry.ci.dev)</li>
  <li>chives: Equinix c2-4 (opam-repo-ci) + Equinix c2-3 (OCaml-ci) + Equinix c2-1 (preview.dune.dev)</li>
</ul>

<p>256GB, 20C/40T with 1.1TB SSD, 2 x 6T disks</p>
<ul>
  <li>dill: spare</li>
  <li>sage: spare</li>
</ul>

<p>VMs currently running on hopi can be migrated to chives, allowing hopi to be redeployed.</p>

<p>Machines which can then be recycled are:</p>
<ul>
  <li>sleepy (4C)</li>
  <li>grumpy (4C)</li>
  <li>doc (4C)</li>
  <li>spring (8T)</li>
  <li>tigger</li>
  <li>armyofdockerness</li>
</ul>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Equinix has stopped commercial sales of Metal and will sunset the service at the end of June 2026. Equinix have long been a supporter of OCaml and has provided free credits to use on their Metal platform. These credits are coming to an end at the end of this month, meaning that we need to move some of our services away from Equinix. We have two new four-node blade servers, which will become the new home for these services. The blades have dual 10C/20T processors with either 192GB or 256GB of RAM and a combination of SSD and spinning disk.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/supermicro.png" /><media:content medium="image" url="https://www.tunbury.org/images/supermicro.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OCaml &amp;lt; 4.14, Fedora 42 and GCC 15</title><link href="https://www.tunbury.org/2025/04/22/ocaml-fedora-gcc/" rel="alternate" type="text/html" title="OCaml &amp;lt; 4.14, Fedora 42 and GCC 15" /><published>2025-04-22T00:00:00+00:00</published><updated>2025-04-22T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/22/ocaml-fedora-gcc</id><content type="html" xml:base="https://www.tunbury.org/2025/04/22/ocaml-fedora-gcc/"><![CDATA[<p>Late last week, @MisterDA added Fedora 42 support to the <a href="https://images.ci.ocaml.org">Docker base image builder</a>. The new base images attempted to build over the weekend, but there have been a few issues!</p>

<p>The code I had previously added to force Fedora 41 to use the DNF version 5 syntax was specifically for version 41. For reference, the old syntax was <code class="language-plaintext highlighter-rouge">yum groupinstall -y 'C Development Tools and Libraries'</code>, and the new syntax is <code class="language-plaintext highlighter-rouge">yum group install -y 'c-development'</code>. Note the extra space.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">c_devtools_libs</span> <span class="o">:</span> <span class="p">(</span><span class="n">t</span><span class="o">,</span> <span class="kt">unit</span><span class="o">,</span> <span class="kt">string</span><span class="o">,</span> <span class="n">t</span><span class="p">)</span> <span class="n">format4</span> <span class="o">=</span>
  <span class="k">match</span> <span class="n">d</span> <span class="k">with</span>
  <span class="o">|</span> <span class="nt">`Fedora</span> <span class="nt">`V41</span> <span class="o">-&gt;</span> <span class="p">{</span><span class="o">|</span><span class="s2">"c-development"</span><span class="o">|</span><span class="p">}</span>
  <span class="o">|</span> <span class="nt">`Fedora</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="p">{</span><span class="o">|</span><span class="s2">"C Development Tools and Libraries"</span><span class="o">|</span><span class="p">}</span>
  <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="p">{</span><span class="o">|</span><span class="s2">"Development Tools"</span><span class="o">|</span><span class="p">}</span>
...
<span class="k">let</span> <span class="n">dnf_version</span> <span class="o">=</span> <span class="k">match</span> <span class="n">d</span> <span class="k">with</span> <span class="nt">`Fedora</span> <span class="nt">`V41</span> <span class="o">-&gt;</span> <span class="mi">5</span> <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="mi">3</span>
</code></pre></div></div>

<p>To unburden ourselves of this maintenance in future releases, I have inverted the logic so unmatched versions will use the new syntax.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="p">(</span><span class="n">dnf_version</span><span class="o">,</span> <span class="n">c_devtools_libs</span><span class="p">)</span> <span class="o">:</span> <span class="kt">int</span> <span class="o">*</span> <span class="p">(</span><span class="n">t</span><span class="o">,</span> <span class="kt">unit</span><span class="o">,</span> <span class="kt">string</span><span class="o">,</span> <span class="n">t</span><span class="p">)</span> <span class="n">format4</span> <span class="o">=</span>
  <span class="k">match</span> <span class="n">d</span> <span class="k">with</span>
  <span class="o">|</span> <span class="nt">`Fedora</span>
    <span class="p">(</span> <span class="nt">`V21</span> <span class="o">|</span> <span class="nt">`V22</span> <span class="o">|</span> <span class="nt">`V23</span> <span class="o">|</span> <span class="nt">`V24</span> <span class="o">|</span> <span class="nt">`V25</span> <span class="o">|</span> <span class="nt">`V26</span> <span class="o">|</span> <span class="nt">`V27</span> <span class="o">|</span> <span class="nt">`V28</span> <span class="o">|</span> <span class="nt">`V29</span>
    <span class="o">|</span> <span class="nt">`V30</span> <span class="o">|</span> <span class="nt">`V31</span> <span class="o">|</span> <span class="nt">`V32</span> <span class="o">|</span> <span class="nt">`V33</span> <span class="o">|</span> <span class="nt">`V34</span> <span class="o">|</span> <span class="nt">`V35</span> <span class="o">|</span> <span class="nt">`V36</span> <span class="o">|</span> <span class="nt">`V37</span> <span class="o">|</span> <span class="nt">`V38</span>
    <span class="o">|</span> <span class="nt">`V39</span> <span class="o">|</span> <span class="nt">`V40</span> <span class="p">)</span> <span class="o">-&gt;</span>
    <span class="p">(</span><span class="mi">3</span><span class="o">,</span> <span class="p">{</span><span class="o">|</span><span class="s2">"C Development Tools and Libraries"</span><span class="o">|</span><span class="p">})</span>
  <span class="o">|</span> <span class="nt">`Fedora</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="mi">5</span><span class="o">,</span> <span class="p">{</span><span class="o">|</span><span class="s2">"c-development"</span><span class="o">|</span><span class="p">})</span>
  <span class="o">|</span> <span class="n">_</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="mi">3</span><span class="o">,</span> <span class="p">{</span><span class="o">|</span><span class="s2">"Development Tools"</span><span class="o">|</span><span class="p">})</span>
</code></pre></div></div>

<p>Fedora 42 also removed <code class="language-plaintext highlighter-rouge">awk</code>, so it now needs to be specifically included as a dependency. However, this code is shared with Oracle Linux, which does not have a package called <code class="language-plaintext highlighter-rouge">awk</code>. Fortunately, both have a package called <code class="language-plaintext highlighter-rouge">gawk</code>!</p>

<p>The next issue is that Fedora 42 is the first of the distributions we build base images for that has moved to GCC 15, specifically GCC 15.0.1. This breaks all versions of OCaml &lt; 4.14.</p>

<p>The change is that the code below previously gave no information about the number or type of parameters (see <code class="language-plaintext highlighter-rouge">runtime/caml/prims.h</code>):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="nf">value</span> <span class="p">(</span><span class="o">*</span><span class="n">c_primitive</span><span class="p">)();</span>
</code></pre></div></div>

<p>Under C23, the same declaration now means that there are no parameters, i.e.:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="nf">value</span> <span class="p">(</span><span class="o">*</span><span class="n">c_primitive</span><span class="p">)(</span><span class="kt">void</span><span class="p">);</span>
</code></pre></div></div>

<p>This is caused by a change in the default C language version used by the compiler. See the <a href="https://gcc.gnu.org/gcc-15/changes.html">GCC change log</a>:</p>

<blockquote>
  <p>C23 by default: GCC 15 changes the default language version for C compilation from <code class="language-plaintext highlighter-rouge">-std=gnu17</code> to <code class="language-plaintext highlighter-rouge">-std=gnu23</code>. If your code relies on older versions of the C standard, you will need to either add <code class="language-plaintext highlighter-rouge">-std=</code> to your build flags, or port your code; see the porting notes.</p>
</blockquote>

<p>Also see the <a href="https://gcc.gnu.org/gcc-15/porting_to.html#c23">porting notes</a>, and <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118112">this bug report</a>.</p>

<p>This is <em>not</em> an immediate problem as OCaml-CI and opam-repo-ci only test against OCaml 4.14.2 and 5.3.0 on Fedora. I have opened <a href="https://github.com/ocurrent/docker-base-images/issues/320">issue#320</a> to track this problem.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,Fedora,GCC" /><category term="tunbury.org" /><summary type="html"><![CDATA[Late last week, @MisterDA added Fedora 42 support to the Docker base image builder. The new base images attempted to build over the weekend, but there have been a few issues!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Clock winder repair</title><link href="https://www.tunbury.org/2025/04/21/clock-winder-repair/" rel="alternate" type="text/html" title="Clock winder repair" /><published>2025-04-21T12:00:00+00:00</published><updated>2025-04-21T12:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/21/clock-winder-repair</id><content type="html" xml:base="https://www.tunbury.org/2025/04/21/clock-winder-repair/"><![CDATA[<p>The galvanised steel wire rope on one of my clock winders has snapped. This is a 3mm rope, so it would have a rating of greater than 500 kg. I am quite surprised that it snapped, as the load on this wire rope is much lower than that of others in use in the same system.</p>

<p>I suspect that the failure is due to the pulley. There is a significant gap between the frame and the pulley wheel where the wire may get jammed. (Right-hand picture). My initial thought was to 3d print a spacer washer, but instead, I was able to squash the entire assembly, removing all the play while still allowing the pulley to rotate. (Left-hand picture).</p>

<p><img src="/images/aylesford-pulley.jpg" alt="" /></p>

<p>When the clock is being wound, either by hand or via the clock winder, the tension is removed from the drive wheel, resulting in a reduced impulse on the escapement. In early versions of the winder, I had ignored the counterweight by tying it out of the way, but this caused the clock to lose almost 10 minutes per day. The counterweight is an ingeniously simple workaround which keeps tension on the drive wheel by pulling on one of the gear teeth. This particular part of the clock winder lifts the counterweight before the winder lifts the weight.</p>

<iframe width="315" height="560" src="https://www.youtube.com/embed/aozrwtLnFw8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="clock" /><category term="tunbury.org" /><summary type="html"><![CDATA[The galvanised steel wire rope on one of my clock winders has snapped. This is a 3mm rope, so it would have a rating of greater than 500 kg. I am quite surprised that it snapped, as the load on this wire rope is much lower than that of others in use in the same system.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/aylesford-snap.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/aylesford-snap.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ubuntu cloud-init with LVM and dm-cache</title><link href="https://www.tunbury.org/2025/04/21/ubuntu-dm-cache/" rel="alternate" type="text/html" title="Ubuntu cloud-init with LVM and dm-cache" /><published>2025-04-21T00:00:00+00:00</published><updated>2025-04-21T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/21/ubuntu-dm-cache</id><content type="html" xml:base="https://www.tunbury.org/2025/04/21/ubuntu-dm-cache/"><![CDATA[<p><a href="https://en.wikipedia.org/wiki/Dm-cache">dm-cache</a> has been part of the mainline Linux kernel for over a decade, making it possible for faster SSD and NVMe drives to be used as a cache within a logical volume. <a href="https://videos.cdn.redhat.com/summit2015/presentations/17856_getting-the-most-out-of-your-nvme-ssd.pdf">This technology brief from Dell</a> gives a good overview of <code class="language-plaintext highlighter-rouge">dm-cache</code> and the performance benefits. 
Skip to the graph on page 25, noting the logarithmic scale.</p>

<p>Given a system with a small SATADOM module, <code class="language-plaintext highlighter-rouge">/dev/sdd</code>, an SSD drive <code class="language-plaintext highlighter-rouge">/dev/sdc</code> and a couple of large-capacity spinning disks, <code class="language-plaintext highlighter-rouge">/dev/sd[ab]</code>, can we use cloud-init to configure RAID1 on the capacity disks with the SSD being used as a cache?</p>

<p>Unfortunately, the <code class="language-plaintext highlighter-rouge">storage:</code> / <code class="language-plaintext highlighter-rouge">config:</code> nodes are not very flexible when it comes to even modest complexity. For example, given an LVM volume group consisting of multiple disk types, it isn’t possible to create a logical volume on a specific disk as <code class="language-plaintext highlighter-rouge">devices:</code> is not a parameter to <code class="language-plaintext highlighter-rouge">lvm_partition</code>. It is also not possible to specify <code class="language-plaintext highlighter-rouge">raid: raid1</code>.</p>

<p>I have taken the approach of creating two volume groups, <code class="language-plaintext highlighter-rouge">vg_raid</code> and <code class="language-plaintext highlighter-rouge">vg_cache</code>, on disks <code class="language-plaintext highlighter-rouge">/dev/sd[ab]</code> and <code class="language-plaintext highlighter-rouge">/dev/sdc</code>, respectively, thereby forcing the use of the correct devices. On the <code class="language-plaintext highlighter-rouge">vg_raid</code> group, I have created a single logical volume without RAID. On <code class="language-plaintext highlighter-rouge">vg_cache</code>, I have created the two cache volumes, <code class="language-plaintext highlighter-rouge">lv-cache</code> and <code class="language-plaintext highlighter-rouge">lv-cache-meta</code>.</p>

<p>The <code class="language-plaintext highlighter-rouge">lv-cache</code> and <code class="language-plaintext highlighter-rouge">lv-cache-meta</code> should be sized in the ratio 1000:1.</p>

<p>As the final step of the installation, I used <code class="language-plaintext highlighter-rouge">late-commands</code> to configure the system as I want it. These implement RAID1 for the root logical volume, deactivate the two cache volumes as a necessary step before merging <code class="language-plaintext highlighter-rouge">vg_raid</code> and <code class="language-plaintext highlighter-rouge">vg_cache</code>, create the cache pool from the cache volumes, and finally enable the cache. The cache pool can be either <em>writethrough</em> or <em>writeback</em>, with the default being <em>writethrough</em>. In this mode, data is written to both the cache and the original volume, so a failure in the cache device doesn’t result in any data loss. <em>Writeback</em> has better performance as writes initially only go to the cache volume and are only written to the original volume later.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>lvconvert -y --type raid1 -m 1 /dev/vg_raid/lv_data
lvchange -an vg_cache/lv_cache
lvchange -an vg_cache/lv_cache_meta
vgmerge vg_raid vg_cache
lvconvert -y --type cache-pool --poolmetadata vg_raid/lv_cache_meta vg_raid/lv_cache
lvconvert -y --type cache --cachemode writethrough --cachepool vg_raid/lv_cache vg_raid/lv_data
</code></pre></div></div>

<p>I have placed <code class="language-plaintext highlighter-rouge">/boot</code> and <code class="language-plaintext highlighter-rouge">/boot/EFI</code> on the SATADOM so that the system can be booted.</p>

<p>My full configuration is given below.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#cloud-config
autoinstall:
  version: 1
  storage:
    config:
      # Define the physical disks
      - { id: disk-sda, type: disk, ptable: gpt, path: /dev/sda, preserve: false }
      - { id: disk-sdb, type: disk, ptable: gpt, path: /dev/sdb, preserve: false }
      - { id: disk-sdc, type: disk, ptable: gpt, path: /dev/sdc, preserve: false }
      - { id: disk-sdd, type: disk, ptable: gpt, path: /dev/sdd, preserve: false }

      # Define the partitions
      - { id: efi-part, type: partition, device: disk-sdd, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: true, offset: 1048576}
      - { id: boot-part, type: partition, device: disk-sdd, size: 1G, wipe: superblock, number: 2, preserve: false, grub_device: false }

      # Create volume groups
      - { id: vg-raid, type: lvm_volgroup, name: vg_raid, devices: [disk-sda, disk-sdb] }
      - { id: vg-cache, type: lvm_volgroup, name: vg_cache, devices: [disk-sdc] }

      # Create logical volume which will be for RAID
      - { id: lv-data, type: lvm_partition, volgroup: vg-raid, name: lv_data, size: 1000G, preserve: false}

      # Create cache metadata logical volume on SSD VG (ratio 1000:1 with cache data)
      - { id: lv-cache-meta, type: lvm_partition, volgroup: vg-cache, name: lv_cache_meta, size: 1G, preserve: false }

      # Create cache data logical volume on SSD VG
      - { id: lv-cache, type: lvm_partition, volgroup: vg-cache, name: lv_cache, size: 1000G, preserve: false }

      # Format the volumes
      - { id: root-fs, type: format, fstype: ext4, volume: lv-data, preserve: false }
      - { id: efi-fs, type: format, fstype: fat32, volume: efi-part, preserve: false }
      - { id: boot-fs, type: format, fstype: ext4, volume: boot-part, preserve: false }

      # Mount the volumes
      - { id: mount-1, type: mount, path: /, device: root-fs }
      - { id: mount-2, type: mount, path: /boot, device: boot-fs }
      - { id: mount-3, type: mount, path: /boot/efi, device: efi-fs }
  identity:
    hostname: unnamed-server
    password: "$6$exDY1mhS4KUYCE/2$zmn9ToZwTKLhCw.b4/b.ZRTIZM30JZ4QrOQ2aOXJ8yk96xpcCof0kxKwuX1kqLG/ygbJ1f8wxED22bTL4F46P0"
    username: mte24
  ssh:
    install-server: yes
    authorized-keys:
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i mark.elvers@tunbury.org
    allow-pw: no
  packages:
    - lvm2
    - thin-provisioning-tools
  user-data:
    disable_root: false
  late-commands:
    - lvconvert -y --type raid1 -m 1 /dev/vg_raid/lv_data
    - lvchange -an vg_cache/lv_cache
    - lvchange -an vg_cache/lv_cache_meta
    - vgmerge vg_raid vg_cache
    - lvconvert -y --type cache-pool --poolmetadata vg_raid/lv_cache_meta vg_raid/lv_cache
    - lvconvert -y --type cache --cachemode writethrough --cachepool vg_raid/lv_cache vg_raid/lv_data
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="cloud-init,dm-cache,Ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[dm-cache has been part of the mainline Linux kernel for over a decade, making it possible for faster SSD and NVMe drives to be used as a cache within a logical volume. This technology brief from Dell gives a good overview of dm-cache and the performance benefits. Skip to the graph on page 25, noting the logarithmic scale.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Gluster</title><link href="https://www.tunbury.org/2025/04/19/gluster/" rel="alternate" type="text/html" title="Gluster" /><published>2025-04-19T00:00:00+00:00</published><updated>2025-04-19T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/19/gluster</id><content type="html" xml:base="https://www.tunbury.org/2025/04/19/gluster/"><![CDATA[<p>Gluster is a free and open-source software network filesystem. It has been a few years since I last looked at the project, and I was interested in taking another look. Some features, like automatic tiering of hot/cold data, have been removed, and the developers now recommend <code class="language-plaintext highlighter-rouge">dm-cache</code> with LVM instead.</p>

<p>I am going to use four QEMU VMs on which I have installed Ubuntu via PXE boot. For easy repetition, I have wrapped my <code class="language-plaintext highlighter-rouge">qemu-system-x86_64</code> commands into a <code class="language-plaintext highlighter-rouge">Makefile</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>machine: disk0.qcow2 disk1.qcow2 OVMF_VARS.fd
        qemu-system-x86_64 -m 8G -smp 4 -machine accel=kvm,type=pc -cpu host -display none -vnc :11 \
                -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd \
                -drive if=pflash,format=raw,file=OVMF_VARS.fd \
                -serial stdio \
                -device virtio-scsi-pci,id=scsi0 \
                -device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \
                -drive file=disk0.qcow2,if=none,id=drive0 \
                -device scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=1,lun=0 \
                -drive file=disk1.qcow2,if=none,id=drive1 \
                -net nic,model=virtio-net-pci,macaddr=02:00:00:00:00:11 \
                -net bridge,br=br0

disk%.qcow2:
        qemu-img create -f qcow2 $@ 1T

OVMF_VARS.fd:
        cp /usr/share/OVMF/OVMF_VARS.fd OVMF_VARS.fd

clean:
        rm -f *.qcow2 OVMF_VARS.fd
</code></pre></div></div>

<p>Gluster works on any file system that supports extended attributes (<em>xattrs</em>), which includes <code class="language-plaintext highlighter-rouge">ext[2-4]</code>. However, XFS is typically used as it performs well with parallel read/write operations and large files. I have used 512-byte inodes, <code class="language-plaintext highlighter-rouge">-i size=512</code>, which is recommended as this creates extra space for the extended attributes.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkfs.xfs <span class="nt">-i</span> <span class="nv">size</span><span class="o">=</span>512 /dev/sdb
<span class="nb">mkdir</span> <span class="nt">-p</span> /gluster/sdb
<span class="nb">echo</span> <span class="s2">"/dev/sdb /gluster/sdb xfs defaults 0 0"</span> <span class="o">&gt;&gt;</span> /etc/fstab
mount <span class="nt">-a</span>
</code></pre></div></div>

<p>With the filesystem prepared, install and start Gluster. Gluster stores its settings in <code class="language-plaintext highlighter-rouge">/var/lib/glusterd</code>, so if you need to reset your installation, stop the gluster daemon and remove that directory.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>glusterfs-server
systemctl <span class="nb">enable </span>glusterd
systemctl start glusterd
</code></pre></div></div>

<p>From one node, probe all the other nodes. You can do this by IP address or by hostname.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gluster peer probe node222
gluster peer probe node200
gluster peer probe node152
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">gluster pool list</code> should now list all the nodes. <code class="language-plaintext highlighter-rouge">localhost</code> indicates your current host.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>UUID                                    Hostname        State
8d2a1ef0-4c23-4355-9faa-8f3387054d41    node222         Connected
4078f192-b2bb-4c74-a588-35d5475dedc7    node200         Connected
5b2fc21b-b0ab-401e-9848-3973121bfec7    node152         Connected
d5878850-0d40-4394-8dd8-b9b0d4266632    localhost       Connected
</code></pre></div></div>

<p>Now we need to add a volume. A Gluster volume can be distributed, replicated or dispersed. Distributed can also be combined with either of the other two types, giving a distributed replicated volume or a distributed dispersed volume. Briefly, distributed splits the data across the nodes without redundancy but gives a performance advantage; replicated creates two or more copies of the data; dispersed uses erasure coding, which can be thought of as RAID5 across nodes.</p>

<p>Once a volume has been created, it needs to be started. The commands to create and start the volume only need to be executed on one of the nodes.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gluster volume create vol1 disperse 4 transport tcp node<span class="o">{</span>200,222,223,152<span class="o">}</span>:/gluster/sdb/vol1
gluster volume start vol1
</code></pre></div></div>
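
<p>For comparison, a replicated volume over the same bricks could be created as below. This is illustrative only: the volume name and brick paths mirror the dispersed example, and a replica count of 3 is generally recommended to avoid split-brain.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gluster volume create vol2 replica 3 transport tcp node{200,222,223}:/gluster/sdb/vol2
gluster volume start vol2
</code></pre></div></div>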

<p>On each node, or on a remote machine, you can now mount the Gluster volume. Here I have mounted it to <code class="language-plaintext highlighter-rouge">/mnt</code> from the node itself. All writes to <code class="language-plaintext highlighter-rouge">/mnt</code> will be dispersed to the other nodes.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>echo "localhost:/vol1 /mnt glusterfs defaults 0 0" &gt;&gt; /etc/fstab
mount -a
</code></pre></div></div>

<p>The volume can be inspected with <code class="language-plaintext highlighter-rouge">gluster volume info</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Volume Name: vol1
Type: Disperse
Volume ID: 31e165b2-da96-40b2-bc09-e4607a02d14b
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: node200:/gluster/sdb/vol1
Brick2: node222:/gluster/sdb/vol1
Brick3: node223:/gluster/sdb/vol1
Brick4: node152:/gluster/sdb/vol1
Options Reconfigured:
network.ping-timeout: 4
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
</code></pre></div></div>

<p>In initial testing, any file operation on the mounted volume appeared to hang when a node went down. This is because Gluster has a default <code class="language-plaintext highlighter-rouge">network.ping-timeout</code> of 42 seconds. This command sets a lower value:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gluster volume set vol1 network.ping-timeout 4
</code></pre></div></div>

<p>The video below shows the four VMs running. One is writing random data to <code class="language-plaintext highlighter-rouge">/mnt/random</code>. The other machines are running <code class="language-plaintext highlighter-rouge">ls -phil /mnt</code> so we can watch the file growing. <code class="language-plaintext highlighter-rouge">node222</code> is killed, and after the 4-second pause, the other nodes continue. When the node is rebooted, it automatically recovers.</p>
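
<p>Recovery can also be watched from the command line on any surviving node. These commands are illustrative; the second lists any entries still waiting to be reconstructed on the returned brick.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gluster volume status vol1
gluster volume heal vol1 info
</code></pre></div></div>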

<iframe width="560" height="315" src="https://www.youtube.com/embed/I8cPq2iCQ5A" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>

<blockquote>
  <p>While I used 4 nodes, this works equally well with 3 nodes.</p>
</blockquote>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Gluster,Ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[Gluster is a free and open-source software network filesystem. It has been a few years since I last looked at the project, and I was interested in taking another look. Some features, like automatic tiering of hot/cold data, have been removed, and the developers now recommend dm-cache with LVM instead.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/gluster.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/gluster.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ubuntu cloud-init</title><link href="https://www.tunbury.org/2025/04/16/ubuntu-cloud-init/" rel="alternate" type="text/html" title="Ubuntu cloud-init" /><published>2025-04-16T00:00:00+00:00</published><updated>2025-04-16T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/16/ubuntu-cloud-init</id><content type="html" xml:base="https://www.tunbury.org/2025/04/16/ubuntu-cloud-init/"><![CDATA[<p>Testing cloud-init is painful on real (server) hardware, as the faster the server, the longer it seems to take to complete POST. Therefore, I highly recommend testing with a virtual machine before moving to real hardware.</p>

<p>I have set up a QEMU machine to simulate the Dell R640 machines with 10 x 8T disks. I’ll need to set this machine up and tear it down several times for testing, so I have wrapped the setup commands into a <code class="language-plaintext highlighter-rouge">Makefile</code>. QCOW2 is a thin format, so you don’t actually need 80T of disk space to do this!</p>

<p>The Dell machines use EFI, so I have used EFI on the QEMU machine. Note the <code class="language-plaintext highlighter-rouge">OVMF</code> lines in the configuration. Ensure that you emulate a hard disk controller that is supported by the EFI BIOS. For example, <code class="language-plaintext highlighter-rouge">-device megasas,id=scsi0</code> won’t boot as the EFI BIOS can’t see the drives. I have enabled VNC access, but I primarily used the serial console to interact with the machine.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>machine: disk0.qcow2 disk1.qcow2 disk2.qcow2 disk3.qcow2 disk4.qcow2 disk5.qcow2 disk6.qcow2 disk7.qcow2 disk8.qcow2 disk9.qcow2 OVMF_VARS.fd
	qemu-system-x86_64 -m 8G -smp 4 -machine accel=kvm,type=pc -cpu host -display none -vnc :0 \
		-drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd \
		-drive if=pflash,format=raw,file=OVMF_VARS.fd \
		-serial stdio \
		-device virtio-scsi-pci,id=scsi0 \
		-device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \
		-drive file=disk0.qcow2,if=none,id=drive0 \
		-device scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=1,lun=0 \
		-drive file=disk1.qcow2,if=none,id=drive1 \
		-device scsi-hd,drive=drive2,bus=scsi0.0,channel=0,scsi-id=2,lun=0 \
		-drive file=disk2.qcow2,if=none,id=drive2 \
		-device scsi-hd,drive=drive3,bus=scsi0.0,channel=0,scsi-id=3,lun=0 \
		-drive file=disk3.qcow2,if=none,id=drive3 \
		-device scsi-hd,drive=drive4,bus=scsi0.0,channel=0,scsi-id=4,lun=0 \
		-drive file=disk4.qcow2,if=none,id=drive4 \
		-device scsi-hd,drive=drive5,bus=scsi0.0,channel=0,scsi-id=5,lun=0 \
		-drive file=disk5.qcow2,if=none,id=drive5 \
		-device scsi-hd,drive=drive6,bus=scsi0.0,channel=0,scsi-id=6,lun=0 \
		-drive file=disk6.qcow2,if=none,id=drive6 \
		-device scsi-hd,drive=drive7,bus=scsi0.0,channel=0,scsi-id=7,lun=0 \
		-drive file=disk7.qcow2,if=none,id=drive7 \
		-device scsi-hd,drive=drive8,bus=scsi0.0,channel=0,scsi-id=8,lun=0 \
		-drive file=disk8.qcow2,if=none,id=drive8 \
		-device scsi-hd,drive=drive9,bus=scsi0.0,channel=0,scsi-id=9,lun=0 \
		-drive file=disk9.qcow2,if=none,id=drive9 \
		-net nic,model=virtio-net-pci,macaddr=02:00:00:00:00:01 \
		-net bridge,br=br0

disk%.qcow2:
	qemu-img create -f qcow2 $@ 8T

OVMF_VARS.fd:
	cp /usr/share/OVMF/OVMF_VARS.fd OVMF_VARS.fd

clean:
	rm *.qcow2 OVMF_VARS.fd
</code></pre></div></div>

<p>We are using <a href="https://netboot.xyz">netboot.xyz</a> to network boot the machine via PXE. The easiest way to run netboot.xyz is via the prebuilt Docker container. This can be set up using a <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> file. Start the container with <code class="language-plaintext highlighter-rouge">docker compose up -d</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>version: "2.1"
services:
  netbootxyz:
    image: ghcr.io/netbootxyz/netbootxyz
    container_name: netbootxyz
    environment:
      - NGINX_PORT=80 # optional
      - WEB_APP_PORT=3000 # optional
    volumes:
      - /netbootxyz/config:/config # optional
      - /netbootxyz/assets:/assets # optional
    ports:
      - 3000:3000  # optional, destination should match ${WEB_APP_PORT} variable above.
      - 69:69/udp
      - 8080:80  # optional, destination should match ${NGINX_PORT} variable above.
    restart: unless-stopped
</code></pre></div></div>

<p>We have a Ubiquiti EdgeMax providing DHCP services. The DHCP options should point new clients to the Docker container.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set service dhcp-server bootfile-server doc.caelum.ci.dev
set service dhcp-server global-parameters "class &quot;BIOS-x86&quot; { match if option arch = 00:00; filename &quot;netboot.xyz.kpxe&quot;; }"
set service dhcp-server global-parameters "class &quot;UEFI-x64&quot; { match if option arch = 00:09; filename &quot;netboot.xyz.efi&quot;; }"
set service dhcp-server global-parameters "class &quot;UEFI-bytecode&quot; { match if option arch = 00:07; filename &quot;netboot.xyz.efi&quot;; }"
</code></pre></div></div>

<p>I also recommend staging the Ubuntu installation ISO, <code class="language-plaintext highlighter-rouge">vmlinuz</code>, and <code class="language-plaintext highlighter-rouge">initrd</code> locally, as this will speed up the machine’s boot time. The files needed are:</p>

<ul>
  <li>https://releases.ubuntu.com/24.04.2/ubuntu-24.04.2-live-server-amd64.iso</li>
  <li>https://github.com/netbootxyz/ubuntu-squash/releases/download/24.04.2-dac09526/vmlinuz</li>
  <li>https://github.com/netbootxyz/ubuntu-squash/releases/download/24.04.2-dac09526/initrd</li>
</ul>
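
<p>Assuming the assets volume from the <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> above, the files could be staged with something like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p /netbootxyz/assets/r640
cd /netbootxyz/assets/r640
wget https://releases.ubuntu.com/24.04.2/ubuntu-24.04.2-live-server-amd64.iso
wget https://github.com/netbootxyz/ubuntu-squash/releases/download/24.04.2-dac09526/vmlinuz
wget https://github.com/netbootxyz/ubuntu-squash/releases/download/24.04.2-dac09526/initrd
</code></pre></div></div>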

<p>Create a <code class="language-plaintext highlighter-rouge">user-data</code> file containing the following cloud-init configuration. In this case, it primarily includes the storage configuration. The goal here is to configure each disk identically, with a tiny EFI partition, an MD RAID partition and the rest given over to the ZFS datastore. Additionally, create empty files <code class="language-plaintext highlighter-rouge">meta-data</code> and <code class="language-plaintext highlighter-rouge">vendor-data</code>. None of the files have an extension. The encrypted password below is <code class="language-plaintext highlighter-rouge">ubuntu</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#cloud-config
autoinstall:
  version: 1
  storage:
    config:
    - { ptable: gpt, path: /dev/sda, preserve: false, name: '', grub_device: false, id: disk-sda, type: disk }
    - { ptable: gpt, path: /dev/sdb, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdb, type: disk }
    - { ptable: gpt, path: /dev/sdc, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdc, type: disk }
    - { ptable: gpt, path: /dev/sdd, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdd, type: disk }
    - { ptable: gpt, path: /dev/sde, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sde, type: disk }
    - { ptable: gpt, path: /dev/sdf, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdf, type: disk }
    - { ptable: gpt, path: /dev/sdg, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdg, type: disk }
    - { ptable: gpt, path: /dev/sdh, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdh, type: disk }
    - { ptable: gpt, path: /dev/sdi, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdi, type: disk }
    - { ptable: gpt, path: /dev/sdj, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdj, type: disk }
    - { device: disk-sda, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: true, offset: 1048576, id: efi-0, type: partition }
    - { device: disk-sdb, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: true, offset: 1048576, id: efi-1, type: partition }
    - { device: disk-sdc, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-2, type: partition }
    - { device: disk-sdd, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-3, type: partition }
    - { device: disk-sde, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-4, type: partition }
    - { device: disk-sdf, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-5, type: partition }
    - { device: disk-sdg, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-6, type: partition }
    - { device: disk-sdh, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-7, type: partition }
    - { device: disk-sdi, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-8, type: partition }
    - { device: disk-sdj, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-9, type: partition }
    - { device: disk-sda, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-0, type: partition }
    - { device: disk-sdb, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-1, type: partition }
    - { device: disk-sdc, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-2, type: partition }
    - { device: disk-sdd, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-3, type: partition }
    - { device: disk-sde, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-4, type: partition }
    - { device: disk-sdf, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-5, type: partition }
    - { device: disk-sdg, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-6, type: partition }
    - { device: disk-sdh, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-7, type: partition }
    - { device: disk-sdi, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-8, type: partition }
    - { device: disk-sdj, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-9, type: partition }
    - { device: disk-sda, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-0, type: partition }
    - { device: disk-sdb, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-1, type: partition }
    - { device: disk-sdc, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-2, type: partition }
    - { device: disk-sdd, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-3, type: partition }
    - { device: disk-sde, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-4, type: partition }
    - { device: disk-sdf, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-5, type: partition }
    - { device: disk-sdg, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-6, type: partition }
    - { device: disk-sdh, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-7, type: partition }
    - { device: disk-sdi, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-8, type: partition }
    - { device: disk-sdj, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-9, type: partition }
    - { name: md0, raidlevel: raid5, devices: [ md-0, md-1, md-2, md-3, md-4, md-5, md-6, md-7, md-8, md-9 ], spare_devices: [], preserve: false, wipe: superblock, id: raid-0, type: raid }
    - { fstype: fat32, volume: efi-0, preserve: false, id: efi-dos-0, type: format }
    - { fstype: fat32, volume: efi-1, preserve: false, id: efi-dos-1, type: format }
    - { fstype: ext4, volume: raid-0, preserve: false, id: root-ext4, type: format }
    - { path: /, device: root-ext4, id: mount-2, type: mount }
    - { path: /boot/efi, device: efi-dos-0, id: mount-0, type: mount }
    - { path: /boot/efi-alt, device: efi-dos-1, id: mount-1, type: mount }
  identity:
    hostname: ubuntu-server
    password: "$6$exDY1mhS4KUYCE/2$zmn9ToZwTKLhCw.b4/b.ZRTIZM30JZ4QrOQ2aOXJ8yk96xpcCof0kxKwuX1kqLG/ygbJ1f8wxED22bTL4F46P0"
    username: ubuntu
  ssh:
    install-server: yes
    authorized-keys:
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i
    allow-pw: no
  packages:
    - zfsutils-linux
  user-data:
    disable_root: false
</code></pre></div></div>
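
<p>To substitute your own credentials, the hash in the <code class="language-plaintext highlighter-rouge">identity</code> section can be regenerated with any SHA-512 crypt tool; for example, assuming OpenSSL is available:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># produce a $6$ (SHA-512 crypt) hash of the given password
openssl passwd -6 ubuntu
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">mkpasswd -m sha-512</code> from the <code class="language-plaintext highlighter-rouge">whois</code> package produces the same format.</p>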

<p>The binaries and configuration files should be stored in the assets folder used by netbootxyz.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/netbootxyz/assets/r640/initrd
/netbootxyz/assets/r640/meta-data
/netbootxyz/assets/r640/ubuntu-24.04.2-live-server-amd64.iso
/netbootxyz/assets/r640/user-data
/netbootxyz/assets/r640/vendor-data
/netbootxyz/assets/r640/vmlinuz
</code></pre></div></div>

<p>The kernel command line used for iPXE needs to include <code class="language-plaintext highlighter-rouge">autoinstall</code> and <code class="language-plaintext highlighter-rouge">ds=nocloud;s=http://your_server</code>. We could modify one of the existing <code class="language-plaintext highlighter-rouge">ipxe</code> scripts to do this, but it is more flexible to create <code class="language-plaintext highlighter-rouge">/netbootxyz/config/menus/MAC-020000000001.ipxe</code> where <code class="language-plaintext highlighter-rouge">020000000001</code> represents the MAC address <code class="language-plaintext highlighter-rouge">02:00:00:00:00:01</code> and should be updated to reflect the actual server’s MAC address.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!ipxe

# Set a timeout (in milliseconds) for automatic selection
set timeout 30000

# Define a title for the menu
:start
menu Boot Menu
item --key 1 local      Boot from local hdd
item --key 2 ubuntu     Autoinstall Ubuntu Noble
item --key r reboot     Reboot system
item --key x exit       Exit to iPXE shell
choose --timeout ${timeout} --default local option &amp;&amp; goto ${option}

# boot local system
:local
echo Booting from local disks ...
exit 1

# Ubuntu boot configuration
:ubuntu
imgfree
echo Autoinstall Ubuntu Noble...
set base-url http://doc.caelum.ci.dev:8080/r640
kernel ${base-url}/vmlinuz
initrd ${base-url}/initrd
imgargs vmlinuz root=/dev/ram0 ramdisk_size=3500000 cloud-config-url=/dev/null ip=dhcp url=${base-url}/ubuntu-24.04.2-live-server-amd64.iso initrd=initrd.magic console=ttyS0,115200n8 autoinstall ds=nocloud;s=${base-url}
boot || goto failed

# Error handling
:failed
echo Boot failed, waiting 5 seconds...
sleep 5
goto start

# Reboot option
:reboot
reboot

# Exit to shell
:exit
echo Exiting to iPXE shell...
exit
</code></pre></div></div>

<p>With this setup, we can now boot a machine from the network and automatically install Ubuntu with our chosen disk configuration.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Netboot.xyz,Ubuntu" /><category term="tunbury.org" /><summary type="html"><![CDATA[Testing cloud-init is painful on real (server) hardware, as the faster the server, the longer it seems to take to complete POST. Therefore, I highly recommend testing with a virtual machine before moving to real hardware.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ubuntu.png" /><media:content medium="image" url="https://www.tunbury.org/images/ubuntu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Slurm Workload Manager</title><link href="https://www.tunbury.org/2025/04/14/slurm-workload-manager/" rel="alternate" type="text/html" title="Slurm Workload Manager" /><published>2025-04-14T00:00:00+00:00</published><updated>2025-04-14T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/14/slurm-workload-manager</id><content type="html" xml:base="https://www.tunbury.org/2025/04/14/slurm-workload-manager/"><![CDATA[<p>Sadiq mentioned <code class="language-plaintext highlighter-rouge">slurm</code> as a possible way to better schedule the group’s compute resources. Many resources are available showing how to create batch jobs for Slurm clusters but far fewer on how to set up a cluster. This is a quick walkthrough of the basic steps to set up a two-node compute cluster on Ubuntu 24.04. Note that <code class="language-plaintext highlighter-rouge">slurmd</code> and <code class="language-plaintext highlighter-rouge">slurmctld</code> can run on the same machine.</p>

<p>Create three VMs: <code class="language-plaintext highlighter-rouge">node1</code>, <code class="language-plaintext highlighter-rouge">node2</code> and <code class="language-plaintext highlighter-rouge">head</code>.</p>

<p>On <code class="language-plaintext highlighter-rouge">head</code>, install these components.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>munge slurmd slurmctld
</code></pre></div></div>

<p>On <code class="language-plaintext highlighter-rouge">node1</code> and <code class="language-plaintext highlighter-rouge">node2</code>, install these components.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>munge slurmd
</code></pre></div></div>

<p>Copy <code class="language-plaintext highlighter-rouge">/etc/munge/munge.key</code> from <code class="language-plaintext highlighter-rouge">head</code> to the same location on <code class="language-plaintext highlighter-rouge">node1</code> and <code class="language-plaintext highlighter-rouge">node2</code>. Then restart <code class="language-plaintext highlighter-rouge">munge</code> on the other nodes with <code class="language-plaintext highlighter-rouge">service munge restart</code>.</p>

<p>You should now be able to run <code class="language-plaintext highlighter-rouge">munge -n | unmunge</code> without error. This should also work via SSH, e.g. <code class="language-plaintext highlighter-rouge">ssh head munge -n | ssh node1 unmunge</code>.</p>

<p>If you don’t have DNS, add <code class="language-plaintext highlighter-rouge">node1</code> and <code class="language-plaintext highlighter-rouge">node2</code> to the <code class="language-plaintext highlighter-rouge">/etc/hosts</code> file on <code class="language-plaintext highlighter-rouge">head</code> and add <code class="language-plaintext highlighter-rouge">head</code> to the <code class="language-plaintext highlighter-rouge">/etc/hosts</code> on <code class="language-plaintext highlighter-rouge">node1</code> and <code class="language-plaintext highlighter-rouge">node2</code>.</p>
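
<p>For example, with made-up addresses (substitute your own), the additions to <code class="language-plaintext highlighter-rouge">/etc/hosts</code> on <code class="language-plaintext highlighter-rouge">head</code> might be:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>192.168.0.11 node1
192.168.0.12 node2
</code></pre></div></div>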

<p>On <code class="language-plaintext highlighter-rouge">head</code>, create the daemon spool directory:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">mkdir</span> /var/spool/slurmctld
<span class="nb">chown</span> <span class="nt">-R</span> slurm:slurm /var/spool/slurmctld/
<span class="nb">chmod </span>775 /var/spool/slurmctld/
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/slurm/slurm.conf</code>, as below. Update the compute node section by running <code class="language-plaintext highlighter-rouge">slurmd -C</code> on each node to generate the configuration line. This file should be propagated to all the machines. The configuration file can be created using this <a href="https://slurm.schedmd.com/configurator.html">tool</a>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ClusterName=cluster
SlurmctldHost=head
ProctrackType=proctrack/linuxproc
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmctld
TaskPlugin=task/affinity,task/cgroup

# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0

# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres

# LOGGING AND ACCOUNTING
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurmd.log

# COMPUTE NODES
NodeName=node1 CPUs=1 Boards=1 SocketsPerBoard=1 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=1963
NodeName=node2 CPUs=1 Boards=1 SocketsPerBoard=1 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=1963
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
</code></pre></div></div>

<p>On <code class="language-plaintext highlighter-rouge">head</code>, start the control daemon.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service slurmctld start
</code></pre></div></div>

<p>And on the nodes, start the slurm daemon.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service slurmd start
</code></pre></div></div>

<p>From <code class="language-plaintext highlighter-rouge">head</code>, you can now run a command simultaneously on both nodes.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># srun -N2 -l /bin/hostname</span>
0: node1
1: node2
</code></pre></div></div>

<p>The optional <code class="language-plaintext highlighter-rouge">Gres</code> parameter on <code class="language-plaintext highlighter-rouge">NodeName</code> allows nodes to be configured with extra resources such as GPUs.</p>
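
<p>As a sketch, a node with two GPUs might be declared as below. This is illustrative only: it assumes NVIDIA devices and requires a matching <code class="language-plaintext highlighter-rouge">gres.conf</code> on the node.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># in slurm.conf
GresTypes=gpu
NodeName=node1 CPUs=8 RealMemory=64000 Gres=gpu:2

# in /etc/slurm/gres.conf on node1
Name=gpu File=/dev/nvidia[0-1]
</code></pre></div></div>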

<p>Typical configurations use an NFS server to make /home available on all the nodes. Note that users only need to be created on the head node and don’t need SSH access to the compute nodes.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Slurm" /><category term="tunbury.org" /><summary type="html"><![CDATA[Sadiq mentioned slurm as a possible way to better schedule the group’s compute resources. Many resources are available showing how to create batch jobs for Slurm clusters but far fewer on how to set up a cluster. This is a quick walkthrough of the basic steps to set up a two-node compute cluster on Ubuntu 24.04. Note that slurmd and slurmctld can run on the same machine.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/slurm.png" /><media:content medium="image" url="https://www.tunbury.org/images/slurm.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">GNU Parallel</title><link href="https://www.tunbury.org/2025/04/13/gnu-parallel/" rel="alternate" type="text/html" title="GNU Parallel" /><published>2025-04-13T00:00:00+00:00</published><updated>2025-04-13T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/13/gnu-parallel</id><content type="html" xml:base="https://www.tunbury.org/2025/04/13/gnu-parallel/"><![CDATA[<p>If you haven’t used it before, or perhaps it has been so long that it has been swapped out to disk, let me commend GNU’s <a href="https://www.gnu.org/software/parallel/parallel.html">Parallel</a> to you.</p>

<p>Parallel executes shell commands in parallel! A trivial example would be <code class="language-plaintext highlighter-rouge">parallel echo ::: A B C</code>, which runs <code class="language-plaintext highlighter-rouge">echo A</code>, <code class="language-plaintext highlighter-rouge">echo B</code> and <code class="language-plaintext highlighter-rouge">echo C</code>.  <code class="language-plaintext highlighter-rouge">{}</code> can be used as a placeholder for the parameter in cases where it isn’t simply appended to the command line.</p>
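
<p>For instance, the placeholder can be used anywhere in the command line. The file names below are hypothetical, and the snippet echoes the <code class="language-plaintext highlighter-rouge">mv</code> commands rather than running them, so it is safe to try (<code class="language-plaintext highlighter-rouge">-k</code> keeps the output in input order):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>parallel -k echo mv {} {}.bak ::: a.txt b.txt
# mv a.txt a.txt.bak
# mv b.txt b.txt.bak
</code></pre></div></div>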

<p>Multiple parameters can be read from an input file using four colons, <code class="language-plaintext highlighter-rouge">parallel echo :::: params_file</code>. This is particularly useful as it correctly deals with parameters/file names with spaces. For example, create a tab-delimited list of source and destination paths in <code class="language-plaintext highlighter-rouge">paths.tsv</code> and then run:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>parallel <span class="nt">--jobs</span> 8 <span class="nt">--colsep</span> <span class="s1">'\t'</span> <span class="nt">--progress</span> rsync <span class="nt">-avh</span> <span class="o">{</span>1<span class="o">}</span> <span class="o">{</span>2<span class="o">}</span> :::: paths.tsv
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="GNU" /><category term="tunbury.org" /><summary type="html"><![CDATA[If you haven’t used it before, or perhaps it has been so long that it has been swapped out to disk, let me commend GNU’s Parallel to you.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/gnu.png" /><media:content medium="image" url="https://www.tunbury.org/images/gnu.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Box Diff Tool</title><link href="https://www.tunbury.org/2025/04/12/box-diff/" rel="alternate" type="text/html" title="Box Diff Tool" /><published>2025-04-12T00:00:00+00:00</published><updated>2025-04-12T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/12/box-diff</id><content type="html" xml:base="https://www.tunbury.org/2025/04/12/box-diff/"><![CDATA[<p>Over the weekend, I extended <a href="https://github.com/mtelvers/ocaml-box-diff">mtelvers/ocaml-box-diff</a> to include the ability to upload files over 50MB. This is a more complex API that requires a call to <a href="https://upload.box.com/api/2.0/files/upload_sessions">https://upload.box.com/api/2.0/files/upload_sessions</a> by posting JSON containing the name of the file, the folder ID and the file size. Box replies with various <em>session endpoints</em> that give the URIs to use to upload the parts and to commit the file. Box also specifies the size of each part.</p>

<p>Each part is uploaded with an HTTP PUT of the binary data, with header fields giving the byte range within the overall file along with the SHA for this chunk. Box replies with a part identifier. Once all the parts have been uploaded, an HTTP POST is required to the commit URI, passing a JSON array of all the parts as well as the overall SHA for the file.</p>
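<p>The bookkeeping can be sketched in plain shell: split a file into parts, hash each part as it would be uploaded, and confirm that hashing the concatenated parts reproduces the whole-file digest. This is only an illustration — the file name, size, and part size here are made up, not the values Box dictates:</p>

```shell
# Illustrative sketch of per-part and whole-file SHA-1 bookkeeping (local only).
tmp=$(mktemp -d)
head -c 100000 /dev/urandom > "$tmp/file.bin"          # stand-in for the file to upload
split -b 32768 "$tmp/file.bin" "$tmp/part-"            # Box specifies the real part size
for p in "$tmp"/part-*; do
  sha1sum "$p"                                         # the digest sent with each part's PUT
done
whole=$(sha1sum < "$tmp/file.bin" | cut -d' ' -f1)     # the digest sent with the commit POST
streamed=$(cat "$tmp"/part-* | sha1sum | cut -d' ' -f1)
test "$whole" = "$streamed" && echo "digests match"
rm -rf "$tmp"
```

<p>Because the parts concatenate back into the original byte stream, the whole-file digest can be accumulated incrementally while uploading, rather than recomputed at the end.</p>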

<p>I was pleased to be able to reuse <code class="language-plaintext highlighter-rouge">stream_of_file</code>, which was written for the small file upload. Additionally, I was able to keep a running total SHA for the data uploaded so far using <code class="language-plaintext highlighter-rouge">Sha1.update_string ctx chunk</code>, meaning that I did not need to recompute the overall file SHA at the end.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,Box" /><category term="tunbury.org" /><summary type="html"><![CDATA[Over the weekend, I extended mtelvers/ocaml-box-diff to include the ability to upload files over 50MB. This is a more complex API which requires a call to https://upload.box.com/api/2.0/files/upload_sessions by posting JSON containing the name of the file, the folder ID and the file size. Box replies with various session endpoints which give the URIs to use to upload the parts and to commit the file. Box also specifies the size of each part.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/box-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/box-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Dell R640 Ubuntu Installation</title><link href="https://www.tunbury.org/2025/04/11/dell-r640-ubuntu/" rel="alternate" type="text/html" title="Dell R640 Ubuntu Installation" /><published>2025-04-11T00:00:00+00:00</published><updated>2025-04-11T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/11/dell-r640-ubuntu</id><content type="html" xml:base="https://www.tunbury.org/2025/04/11/dell-r640-ubuntu/"><![CDATA[<p>I could have scripted this via Ansible, but there would always be a manual element, such as configuring the H740P controller and booting from the network to get to the point where you can SSH to the machine. 
Therefore, I decided to just document the steps required.</p>

<p>After powering the system on, press F2 to open System Setup, then follow this path through the menu:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Device Configuration &gt; Integrated RAID Controller H740P &gt; Configure &gt; Clear Configuration
</code></pre></div></div>

<p>then</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>View Server Profile &gt; Controller Management &gt; Advanced Controller Management &gt; Manage Controller Mode
</code></pre></div></div>

<p>Choose <code class="language-plaintext highlighter-rouge">Switch to Enhanced HBA Controller Mode</code>, then Confirm and reset the system.</p>

<p>Boot to the Ubuntu installer. I used <code class="language-plaintext highlighter-rouge">netboot.xyz</code> running in a Docker container.</p>

<p>I will use a software RAID set configured by <code class="language-plaintext highlighter-rouge">mdadm</code> for the Ubuntu root drive. In this configuration, the EFI partition needs special attention as EFI does not understand software RAID. GRUB can be configured to create and update multiple copies of the EFI partition. For consistency, I will create an EFI partition on all the drives.</p>

<p>These commands create three partitions on each drive: a 512MB EFI system partition (type EF00), a 16GB partition for the software RAID (type FD00), and a final partition using the remaining space for ZFS (type BF00):</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for </span>a <span class="k">in </span>sd<span class="o">{</span>a..j<span class="o">}</span> <span class="p">;</span> <span class="k">do </span>sgdisk <span class="nt">-n1</span>:1M:+512M <span class="nt">-t1</span>:EF00 /dev/<span class="nv">$a</span> <span class="p">;</span> <span class="k">done
for </span>a <span class="k">in </span>sd<span class="o">{</span>a..j<span class="o">}</span> <span class="p">;</span> <span class="k">do </span>sgdisk <span class="nt">-n2</span>:0:+16G <span class="nt">-t2</span>:FD00 /dev/<span class="nv">$a</span> <span class="p">;</span> <span class="k">done
for </span>a <span class="k">in </span>sd<span class="o">{</span>a..j<span class="o">}</span> <span class="p">;</span> <span class="k">do </span>sgdisk <span class="nt">-n3</span>:0:0 <span class="nt">-t3</span>:BF00 /dev/<span class="nv">$a</span> <span class="p">;</span> <span class="k">done</span>
</code></pre></div></div>
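<p>The loops rely on bash brace expansion: <code class="language-plaintext highlighter-rouge">sd{a..j}</code> expands to the ten device names before the loop body runs, which is easy to verify without touching any disks:</p>

```shell
# Show what the partitioning loops actually iterate over (bash brace expansion).
for a in sd{a..j}; do echo "$a"; done
# prints sda, sdb, ... through sdj — one name per line, ten in total
```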

<p>Next, format the EFI partitions as FAT32 and create the RAID device:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for </span>a <span class="k">in </span>sd<span class="o">{</span>a..j<span class="o">}</span> <span class="p">;</span> <span class="k">do </span>mkdosfs <span class="nt">-F</span> 32 <span class="nt">-s</span> 1 <span class="nt">-n</span> EFI /dev/<span class="k">${</span><span class="nv">a</span><span class="k">}</span>1 <span class="p">;</span> <span class="k">done
</span>mdadm <span class="nt">--create</span> /dev/md0 <span class="nt">--metadata</span><span class="o">=</span>1.2 <span class="nt">--level</span><span class="o">=</span>raid5 <span class="nt">--raid-devices</span><span class="o">=</span>10 /dev/sd[a-j]2
</code></pre></div></div>

<p>Check the partition tables with <code class="language-plaintext highlighter-rouge">sgdisk -p /dev/sda</code>, and the soft RAID setup with <code class="language-plaintext highlighter-rouge">cat /proc/mdstat</code>.</p>

<p>Install Ubuntu via the setup program, selecting the software RAID as the root volume and the first drive as the boot drive.</p>

<p>After the system reboots, delete the current EFI entries from <code class="language-plaintext highlighter-rouge">/etc/fstab</code>:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>umount /boot/efi
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'/\/efi/d'</span> /etc/fstab
</code></pre></div></div>
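<p>The <code class="language-plaintext highlighter-rouge">sed</code> expression deletes any line whose mount point contains <code class="language-plaintext highlighter-rouge">/efi</code>. A dry run on a scratch copy (the sample entries below are made up) shows the effect before touching the real <code class="language-plaintext highlighter-rouge">/etc/fstab</code>:</p>

```shell
# Dry run of the fstab clean-up on a scratch file; entries are illustrative.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/dev/md0 / ext4 defaults 0 1
/dev/disk/by-uuid/ABCD-1234 /boot/efi vfat umask=0077 0 1
EOF
sed -i '/\/efi/d' "$tmp"       # same expression as above, applied to the copy
remaining=$(cat "$tmp")
echo "$remaining"              # only the root filesystem line survives
rm -f "$tmp"
```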

<p>Then add fstab entries for <code class="language-plaintext highlighter-rouge">/dev/sda1</code> and <code class="language-plaintext highlighter-rouge">/dev/sdb1</code>, create the alternate mount point, and remount everything.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> /dev/disk/by-uuid/<span class="si">$(</span>blkid <span class="nt">-s</span> UUID <span class="nt">-o</span> value /dev/sda1<span class="si">)</span> /boot/efi vfat defaults 0 0 <span class="o">&gt;&gt;</span> /etc/fstab
<span class="nb">mkdir</span> <span class="nt">-p</span> /boot/efi-alt
<span class="nb">echo</span> /dev/disk/by-uuid/<span class="si">$(</span>blkid <span class="nt">-s</span> UUID <span class="nt">-o</span> value /dev/sdb1<span class="si">)</span> /boot/efi-alt vfat defaults 0 0 <span class="o">&gt;&gt;</span> /etc/fstab
systemctl daemon-reload
mount <span class="nt">-a</span>
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">dpkg-reconfigure grub-efi-amd64</code> to configure GRUB. Accept all of the defaults and select <code class="language-plaintext highlighter-rouge">/dev/sda1</code> and <code class="language-plaintext highlighter-rouge">/dev/sdb1</code> as the boot drives. Reboot the system.</p>

<p>After the reboot, install the ZFS utils.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>zfsutils-linux
</code></pre></div></div>

<p>Create a ZFS <em>tank</em> using the <em>by-id</em> values.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zpool create <span class="se">\</span>
          <span class="nt">-o</span> <span class="nv">ashift</span><span class="o">=</span>12 <span class="se">\</span>
          <span class="nt">-o</span> <span class="nv">autotrim</span><span class="o">=</span>on <span class="se">\</span>
          <span class="nt">-O</span> <span class="nv">acltype</span><span class="o">=</span>posixacl <span class="nt">-O</span> <span class="nv">xattr</span><span class="o">=</span>sa <span class="nt">-O</span> <span class="nv">dnodesize</span><span class="o">=</span>auto <span class="se">\</span>
          <span class="nt">-O</span> <span class="nv">normalization</span><span class="o">=</span>formD <span class="se">\</span>
          <span class="nt">-O</span> <span class="nv">relatime</span><span class="o">=</span>on <span class="se">\</span>
          tank raidz /dev/disk/by-id/wwn-<span class="k">*</span><span class="nt">-part3</span>
</code></pre></div></div>

<p>Check it is available:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># zfs list</span>
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank   789K  61.8T   171K  /tank
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Dell,R640" /><category term="tunbury.org" /><summary type="html"><![CDATA[I could have scripted this via Ansible, but there would always be a manual element, such as configuring the H740P controller and booting from the network to get to the point where you can SSH to the machine. Therefore, I decided to just document the steps required.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/dell-r640-final.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/dell-r640-final.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Dell R640 installation</title><link href="https://www.tunbury.org/2025/04/10/dell-r640-installation/" rel="alternate" type="text/html" title="Dell R640 installation" /><published>2025-04-10T00:00:00+00:00</published><updated>2025-04-10T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/10/dell-r640-installation</id><content type="html" xml:base="https://www.tunbury.org/2025/04/10/dell-r640-installation/"><![CDATA[<p>Today we have racked the five 14th generation Dell R640 servers and a Dell N4032 switch.</p>

<p>When inspecting the rack rails, I noticed that some of the left-hand rails had an extra tab while the others did not. For the first server, I used a rail with a tab, only to discover that the tab prevented the server from being pushed in all the way. The tabs were easily removed, but the server needed to be removed from the rack first.</p>

<p><img src="/images/dell-r640-rail.jpg" alt="" /></p>

<p><img src="/images/dell-r640-rail-removal.jpg" alt="" /></p>

<p>First server installed</p>

<p><img src="/images/dell-r640-first-one.jpg" alt="" /></p>

<p>The last server on the rails</p>

<p><img src="/images/dell-r640-last-one.jpg" alt="" /></p>

<p>Front view</p>

<p><img src="/images/dell-r640-front-view.jpg" alt="" /></p>

<p>Rear view</p>

<p><img src="/images/dell-r640-rear-view.jpg" alt="" /></p>

<p>Cabling</p>

<ul>
  <li>Yellow CAT5 for iDRAC ports</li>
  <li>Red CAT6 for 10GBase-T</li>
</ul>

<p><img src="/images/dell-r640-cabled.jpg" alt="" /></p>

<p>The initial iDRAC configuration was carried out using a crash cart.</p>

<p><img src="/images/dell-r640-idrac-config.jpg" alt="" /></p>

<p>The servers are called:</p>

<ul>
  <li>myrina</li>
  <li>thalestris</li>
  <li>lampedo</li>
  <li>otrera</li>
  <li>antiope</li>
</ul>

<p><img src="/images/dell-r640-final.jpg" alt="" /></p>

<p>We had some difficulty with the 40G uplink from the switch and we could only get the link to come up by splitting it into 4 x 10G channels, as follows.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>console&gt;enable
console#configure
console(config)#interface Fo1/1/1
console(config-if-Fo1/1/1)#hardware profile portmode 4x10g
</code></pre></div></div>

<p>Then reboot with <code class="language-plaintext highlighter-rouge">do reload</code>. The 4 x 10G uplinks have been configured as an LACP port channel (Po1).</p>

<h1 id="r640-configuration">R640 Configuration</h1>

<p>Each server has:</p>

<ul>
  <li>2 x Intel Xeon Gold 6244 3.6G 8C / 16T</li>
  <li>8 x 16GB DIMM</li>
  <li>10 x Kingston 7.68TB SSD</li>
</ul>

<p><a href="https://www.dell.com/support/manuals/en-uk/poweredge-r640/per640_ism_pub/general-memory-module-installation-guidelines?guid=guid-acbc0f13-dedb-492b-a0b0-18303ded565a&amp;lang=en-us">Dell R640 has 24 DIMM slots</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Dell,R640" /><category term="tunbury.org" /><summary type="html"><![CDATA[Today we have racked the five 14th generation Dell R640 servers and a Dell N4032 switch.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/dell-r640-final.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/dell-r640-final.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Box API with OCaml and Claude</title><link href="https://www.tunbury.org/2025/04/07/ocaml-claude-box/" rel="alternate" type="text/html" title="Box API with OCaml and Claude" /><published>2025-04-07T00:00:00+00:00</published><updated>2025-04-07T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/07/ocaml-claude-box</id><content type="html" xml:base="https://www.tunbury.org/2025/04/07/ocaml-claude-box/"><![CDATA[<p>Over the weekend, I decided to extend my <a href="https://box.com">Box</a> <a href="https://github.com/mtelvers/ocaml-box-diff">tool</a> to incorporate file upload. There is a straightforward POST API for this with a <code class="language-plaintext highlighter-rouge">curl</code> one-liner given in the Box <a href="https://developer.box.com/reference/post-files-content/">documentation</a>. Easy.</p>

<p>The documentation for <a href="https://mirage.github.io/ocaml-cohttp/cohttp-eio/Cohttp_eio/Client/index.html">Cohttp-eio.Client</a> only gives the function signature for <code class="language-plaintext highlighter-rouge">post</code>, but it looked pretty similar to <code class="language-plaintext highlighter-rouge">get</code>, which I had already been working with. The <a href="https://github.com/mirage/ocaml-cohttp">README</a> for Cohttp gave me pause when I read this comment about multipart forms.</p>

<blockquote>
  <p>Multipart form data is not supported out of the box but is provided by external libraries</p>
</blockquote>

<p>Of the three options given, the second looked abandoned, while the third said it didn’t support streaming, so I went with the first one, <a href="https://github.com/dinosaure/multipart_form">dinosaure/multipart_form</a>.</p>

<p>The landing page included an example encoder. A couple of external functions are mentioned, and I found example code for these in <a href="https://github.com/dinosaure/multipart_form/blob/main/test/test.ml">test/test.ml</a>. This built, but didn’t work against Box. I ran <code class="language-plaintext highlighter-rouge">nc -l 127.0.0.1 6789</code> and set that as the API endpoint for both the <code class="language-plaintext highlighter-rouge">curl</code> command and my application. Comparing the two requests showed I was missing the <code class="language-plaintext highlighter-rouge">Content-Type</code> header in the part boundary. It should be <code class="language-plaintext highlighter-rouge">application/octet-stream</code>.</p>

<p>There is a <code class="language-plaintext highlighter-rouge">~header</code> parameter to <code class="language-plaintext highlighter-rouge">part</code>, and I hoped for a <code class="language-plaintext highlighter-rouge">Header.add</code> like the one in <code class="language-plaintext highlighter-rouge">Cohttp</code>, but sadly not. See the <a href="https://ocaml.org/p/multipart_form/latest/doc/Multipart_form/Header/index.html">documentation</a>. There is <code class="language-plaintext highlighter-rouge">Header.content_type</code>, but that returns a content type rather than constructing one. How do you make one? <code class="language-plaintext highlighter-rouge">Header.of_list</code> requires a <code class="language-plaintext highlighter-rouge">Field.field list</code>.</p>

<p>In a bit of frustration, I decided to ask Claude. I’ve not tried it before, but I’ve seen some impressive demonstrations. My first lesson here was to be specific. Claude is not a mind reader. After a few questions, I got to this:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">Field</span><span class="p">.(</span><span class="n">make</span> <span class="nn">Content_type</span><span class="p">.</span><span class="n">name</span> <span class="p">(</span><span class="nn">Content_type</span><span class="p">.</span><span class="n">v</span> <span class="nt">`Application</span> <span class="nt">`Octet_stream</span><span class="p">));</span>
</code></pre></div></div>

<p>I can see why this was suggested as <code class="language-plaintext highlighter-rouge">Content_disposition.v</code> exists, but <code class="language-plaintext highlighter-rouge">Content_type.v</code> does not, nor does <code class="language-plaintext highlighter-rouge">Field.make</code>. Claude quickly obliged with a new version when I pointed this out but added the <code class="language-plaintext highlighter-rouge">Content_type</code> to the HTTP header rather than the boundary header. This went back and forth for a while, with Claude repeatedly suggesting functions which did not exist. I gave up.</p>

<p>On OCaml.org, the <a href="https://ocaml.org/p/multipart_form/latest">multipart-form</a> documentation includes a <em>Used by</em> section that listed <code class="language-plaintext highlighter-rouge">dream</code> as the only (external) application which used the library. From the source, I could see <code class="language-plaintext highlighter-rouge">Field.Field (field_name, Field.Content_type, v)</code>, which looked good.</p>

<p>There is a function <code class="language-plaintext highlighter-rouge">Content_type.of_string</code>. I used <code class="language-plaintext highlighter-rouge">:MerlinLocate</code> to find the source, which turned out to be an Angstrom parser which returns a <code class="language-plaintext highlighter-rouge">Content_type.t</code>. This led me to <code class="language-plaintext highlighter-rouge">Content_type.make</code>, and ultimately, I was able to write these two lines:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">v</span> <span class="o">=</span> <span class="nn">Content_type</span><span class="p">.</span><span class="n">make</span> <span class="nt">`Application</span> <span class="p">(</span><span class="nt">`Iana_token</span> <span class="s2">"octet-stream"</span><span class="p">)</span> <span class="nn">Content_type</span><span class="p">.</span><span class="nn">Parameters</span><span class="p">.</span><span class="n">empty</span>
<span class="k">let</span> <span class="n">p0</span> <span class="o">=</span> <span class="n">part</span> <span class="o">~</span><span class="n">header</span><span class="o">:</span><span class="p">(</span><span class="nn">Header</span><span class="p">.</span><span class="n">of_list</span> <span class="p">[</span> <span class="nc">Field</span> <span class="p">(</span><span class="nn">Field_name</span><span class="p">.</span><span class="n">content_type</span><span class="o">,</span> <span class="nc">Content_type</span><span class="o">,</span> <span class="n">v</span><span class="p">)</span> <span class="p">])</span> <span class="o">...</span>
</code></pre></div></div>

<p>As a relatively new adopter of OCaml as my language of choice, I find the most significant challenge to be documentation, particularly when I find a library on opam which I want to use. This is an interesting contrast to the view often cited in the community that tooling is the most significant barrier to adoption. In my opinion, the time taken to set up a build environment is dwarfed by the time spent in that environment iterating on code.</p>

<p>I would like to take this opportunity to thank all contributors to opam repository for their time and effort in making packages available. This post mentions specific packages but only to illustrate my point.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,Box" /><category term="tunbury.org" /><summary type="html"><![CDATA[Over the weekend, I decided to extend my Box tool to incorporate file upload. There is a straightforward POST API for this with a curl one-liner given in the Box documentation. Easy.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/box-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/box-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">opam repo ci job timeouts</title><link href="https://www.tunbury.org/2025/04/04/opam-repo-ci/" rel="alternate" type="text/html" title="opam repo ci job timeouts" /><published>2025-04-04T00:00:00+00:00</published><updated>2025-04-04T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/04/opam-repo-ci</id><content type="html" xml:base="https://www.tunbury.org/2025/04/04/opam-repo-ci/"><![CDATA[<p>It’s Tuesday morning, and virtually all opam repo ci jobs are failing with timeouts. This comes at a critical time as these are the first jobs following the update of <a href="https://github.com/ocurrent/ocaml-version">ocurrent/ocaml-version</a> <a href="https://www.tunbury.org/recent-ocaml-version/">noted</a> on 24th March.</p>

<p>The <a href="https://opam.ci.ocaml.org/github/ocaml/opam-repository">opam repo ci</a> tests all PRs on <a href="https://github.com/ocaml/opam-repository">opam-repository</a>. The pipeline downloads Docker images containing the root filesystems for various Linux distributions, architectures, and OCaml versions; these are used as the base environment to run the tests. These base images are created by the <a href="https://images.ci.ocaml.org">base image builder</a>. <a href="https://github.com/ocurrent/docker-base-images/pull/317">PR#317</a> updated these base images in three ways:</p>

<ul>
  <li>Images for OCaml &lt; 4.08 were removed.</li>
  <li>The <code class="language-plaintext highlighter-rouge">opam-repository-archive</code> overlay was removed as this contained the &lt; 4.08 opam packages.</li>
  <li>The <code class="language-plaintext highlighter-rouge">ocaml-patches-overlay</code> overlay was removed as this was only needed to build OCaml &lt; 4.08 on GCC 14.</li>
</ul>

<p>Given these changes, I immediately assumed one of them was the culprit.</p>

<p>Here’s an example of a failure as reported in the log.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2025-04-01 07:27.45 ---&gt; using "9dd47386dd0565c83eac2e9d589d75bdd268a7f34f3c854d1db189e7a2e5f77b" from cache

/: (user (uid 1000) (gid 1000))

/: (workdir /home/opam)

/home/opam: (run (shell "sudo ln -f /usr/bin/opam-dev /usr/bin/opam"))
2025-04-01 07:27.45 ---&gt; using "132d861be153666fd67b2e16b21c4de16e15e26f8d7d42f3bcddf0360ad147be" from cache

/home/opam: (run (network host)
                 (shell "opam init --reinit --config .opamrc-sandbox -ni"))
Configuring from /home/opam/.opamrc-sandbox, then /home/opam/.opamrc, and finally from built-in defaults.
Checking for available remotes: rsync and local, git.
  - you won't be able to use mercurial repositories unless you install the hg command on your system.
  - you won't be able to use darcs repositories unless you install the darcs command on your system.

This development version of opam requires an update to the layout of /home/opam/.opam from version 2.0 to version 2.2, which can't be reverted.
You may want to back it up before going further.

Continue? [Y/n] y
[NOTE] The 'jobs' option was reset, its value was 39 and its new value will vary according to the current number of cores on your machine. You can restore the fixed value using:
           opam option jobs=39 --global
Format upgrade done.

&lt;&gt;&lt;&gt; Updating repositories &gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;
2025-04-01 09:27.34: Cancelling: Timeout (120.0 minutes)
Job cancelled
2025-04-01 09:27.40: Timeout (120.0 minutes)
</code></pre></div></div>

<p>With nearly all jobs taking 2 hours to run, the cluster was understandably backlogged!</p>

<p>The issue could be reproduced with this script, which checks out the failing PR and builds a Dockerfile against it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd $(mktemp -d)
git clone --recursive "https://github.com/ocaml/opam-repository.git" &amp;&amp; cd "opam-repository" &amp;&amp; git fetch origin "refs/pull/27696/head" &amp;&amp; git reset --hard 46b8cc5a
git fetch origin master
git merge --no-edit 4d8fa0fb8fce3b6c8b06f29ebcfa844c292d4f3e
cat &gt; ../Dockerfile &lt;&lt;'END-OF-DOCKERFILE'
FROM ocaml/opam:debian-12-ocaml-4.09@sha256:13bd7f0979922adb13049eecc387d65d7846a3058f7dd6509738933e88bc8d4a
USER 1000:1000
WORKDIR /home/opam
RUN sudo ln -f /usr/bin/opam-dev /usr/bin/opam
RUN opam init --reinit -ni
RUN opam option solver=builtin-0install &amp;&amp; opam config report
ENV OPAMDOWNLOADJOBS="1"
ENV OPAMERRLOGLEN="0"
ENV OPAMPRECISETRACKING="1"
ENV CI="true"
ENV OPAM_REPO_CI="true"
RUN rm -rf opam-repository/
COPY --chown=1000:1000 . opam-repository/
RUN opam repository set-url --strict default opam-repository/
RUN opam update --depexts || true
RUN opam pin add -k version -yn chrome-trace.3.18.0~alpha0 3.18.0~alpha0
RUN opam reinstall chrome-trace.3.18.0~alpha0; \
    res=$?; \
    test "$res" != 31 &amp;&amp; exit "$res"; \
    export OPAMCLI=2.0; \
    build_dir=$(opam var prefix)/.opam-switch/build; \
    failed=$(ls "$build_dir"); \
    partial_fails=""; \
    for pkg in $failed; do \
    if opam show -f x-ci-accept-failures: "$pkg" | grep -qF "\"debian-12\""; then \
    echo "A package failed and has been disabled for CI using the 'x-ci-accept-failures' field."; \
    fi; \
    test "$pkg" != 'chrome-trace.3.18.0~alpha0' &amp;&amp; partial_fails="$partial_fails $pkg"; \
    done; \
    test "${partial_fails}" != "" &amp;&amp; echo "opam-repo-ci detected dependencies failing: ${partial_fails}"; \
    exit 1

END-OF-DOCKERFILE
docker build -f ../Dockerfile .
</code></pre></div></div>

<p>It was interesting to note which jobs still worked. For example, builds on macOS and FreeBSD ran normally. This makes sense, as those platforms don’t use the Docker base images. Looking further, opam repo ci attempts builds with opam 2.0, 2.1, 2.2, and 2.3 on Debian. These builds succeeded. Interesting. All the other builds use the latest version of opam built from the head of the master branch.</p>

<p>Taking the failing Dockerfile above and replacing <code class="language-plaintext highlighter-rouge">sudo ln -f /usr/bin/opam-dev /usr/bin/opam</code> with <code class="language-plaintext highlighter-rouge">sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam</code> immediately fixed the issue!</p>

<p>I pushed commit <a href="https://github.com/ocurrent/opam-repo-ci/commit/7174953145735a54ecf668c7387e57b3f2d2a411">7174953</a> to force opam repo ci to use opam 2.3 and opened <a href="https://github.com/ocaml/opam/issues/6448">issue#6448</a> on ocaml/opam. The working theory is that some change associated with <a href="https://github.com/ocaml/opam/pull/5892">PR#5892</a>, which replaced GNU patch with an OCaml patch library, is the root cause.</p>

<p>Musing on this issue with David, we agreed that using the latest tag rather than the head commit was a good compromise. This allows us to test pre-release versions of opam as soon as they are tagged, without sitting at the cutting edge and risking the disruption of a key service.</p>
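<p>Picking the latest tag needs a semantic comparison, not a lexical one; a plain text sort gets multi-digit version components wrong, as a quick check with GNU <code class="language-plaintext highlighter-rouge">sort</code> shows (the tag values here are examples):</p>

```shell
# Example tags: a lexical sort puts 2.10.0 before 2.3.0; a version sort does not.
tags='2.1.7
2.3.0
2.10.0'
echo "$tags" | sort        # lexical: 2.3.0 ends up last
echo "$tags" | sort -V     # version-aware: 2.10.0 ends up last
```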

<p>We need the latest tag by version number, not by date, as we wouldn’t want to revert to testing on, for example, 2.1.7 if something caused a new release of the 2.1 series. The result was a function which runs <code class="language-plaintext highlighter-rouge">git tag --format %(objectname) %(refname:strip=2)</code> and semantically sorts the version numbers using <code class="language-plaintext highlighter-rouge">OpamVersion.compare</code>. See <a href="https://github.com/ocurrent/docker-base-images/pull/318">PR#318</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[It’s Tuesday morning, and virtually all opam repo ci jobs are failing with timeouts. This comes at a critical time as these are the first jobs following the update of ocurrent/ocaml-version noted on 24th March.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">More Kingston Drives</title><link href="https://www.tunbury.org/2025/04/03/kingston-drives/" rel="alternate" type="text/html" title="More Kingston Drives" /><published>2025-04-03T00:00:00+00:00</published><updated>2025-04-03T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/03/kingston-drives</id><content type="html" xml:base="https://www.tunbury.org/2025/04/03/kingston-drives/"><![CDATA[<p>We have received the second batch of 40 x 7.68TB Kingston SSD drives, bringing the total to 50 drives.</p>

<p>We now have 5 fully populated Dell PowerEdge R640s with a total raw capacity of 384TB.</p>

<p><img src="/images/kingston-forty-with-caddies.png" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Dell" /><category term="tunbury.org" /><summary type="html"><![CDATA[We have received the second batch of 40 x 7.68TB Kingston SSD drives, bringing the total to 50 drives.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/kingston-forty.png" /><media:content medium="image" url="https://www.tunbury.org/images/kingston-forty.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ubuntu with ZFS root</title><link href="https://www.tunbury.org/2025/04/02/ubuntu-with-zfs-root/" rel="alternate" type="text/html" title="Ubuntu with ZFS root" /><published>2025-04-02T00:00:00+00:00</published><updated>2025-04-02T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/02/ubuntu-with-zfs-root</id><content type="html" xml:base="https://www.tunbury.org/2025/04/02/ubuntu-with-zfs-root/"><![CDATA[<p>The installation of <a href="https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2022.04%20Root%20on%20ZFS.html">Ubuntu on ZFS</a>
contains about 50 steps of detailed configuration. I have 10 servers to install, so I would like to script this process as much as possible.</p>

<p>To test my script, I have created a new VM on VMware ESXi with 10 x 16GB
disks, 16GB RAM, 4 vCPU. In the advanced options, I have set the boot to
EFI and set <code class="language-plaintext highlighter-rouge">disk.EnableUUID = "TRUE"</code> in the <code class="language-plaintext highlighter-rouge">.vmx</code> file. Doing this
ensures that <code class="language-plaintext highlighter-rouge">/dev/disk</code> aliases are created in the guest.</p>

<p>Boot Ubuntu 24.04 from the Live CD and install SSH.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo</span> <span class="nt">-i</span>
apt update
apt <span class="nb">install </span>openssh-server <span class="nt">-y</span>
</code></pre></div></div>

<p>Use <code class="language-plaintext highlighter-rouge">wget</code> to download https://github.com/mtelvers.keys into <code class="language-plaintext highlighter-rouge">~/.ssh/authorized_keys</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget https://github.com/mtelvers.keys <span class="nt">-O</span> ~/.ssh/authorized_keys
</code></pre></div></div>

<p>In your Ansible <code class="language-plaintext highlighter-rouge">hosts</code> file, add your new machine and its IP address</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>your.fqdn ansible_host=&lt;ip&gt;
</code></pre></div></div>

<p>Run the playbook with</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ansible-playbook <span class="nt">-i</span> hosts <span class="nt">--limit</span> your.fqdn ubuntu-zfs.yml
</code></pre></div></div>

<p>The playbook is available as a GitHub gist <a href="https://gist.github.com/mtelvers/2cbeb5e35f43f5e461aa0c14c4a0a6b8">zfs-ubuntu.yml</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="openzfs" /><category term="tunbury.org" /><summary type="html"><![CDATA[The installation of Ubuntu on ZFS contains about 50 steps of detailed configuration. I have 10 servers to install, so I would like to script this process as much as possible.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/openzfs.png" /><media:content medium="image" url="https://www.tunbury.org/images/openzfs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Updating Docker and Go</title><link href="https://www.tunbury.org/2025/04/01/go-docker/" rel="alternate" type="text/html" title="Updating Docker and Go" /><published>2025-04-01T00:00:00+00:00</published><updated>2025-04-01T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/04/01/go-docker</id><content type="html" xml:base="https://www.tunbury.org/2025/04/01/go-docker/"><![CDATA[<p>For some time, we have had issues on Ubuntu Noble when extracting
tar files within Docker containers. See
<a href="https://github.com/ocaml/infrastructure/issues/121">ocaml/infrastructure#121</a>.
This is only an issue on exotic architectures like RISC-V and PPC64LE.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># docker run --rm -it ubuntu:noble</span>
root@cf3491db4abd:/# <span class="nb">cd
</span>root@cf3491db4abd:~# <span class="nb">mkdir </span>foo
root@cf3491db4abd:~# <span class="nb">tar</span> <span class="nt">-cf</span> bar.tar foo
root@cf3491db4abd:~# <span class="nb">rmdir </span>foo
root@cf3491db4abd:~# <span class="nb">tar</span> <span class="nt">-xf</span> bar.tar
<span class="nb">tar</span>: foo: Cannot change mode to rwxr-xr-x: Operation not permitted
<span class="nb">tar</span>: Exiting with failure status due to previous errors
</code></pre></div></div>

<p>The combination of the Docker version and the <code class="language-plaintext highlighter-rouge">libseccomp2</code> version prevents
the container from using the <code class="language-plaintext highlighter-rouge">fchmodat2</code> system call. There is a
bug report on Ubuntu’s bug tracker for this issue.</p>

<p>I have been working around this by building Docker from scratch.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt <span class="nb">install </span>golang
git clone https://github.com/moby/moby
<span class="nb">cd </span>moby
<span class="nv">AUTO_GOPATH</span><span class="o">=</span>1 ./hack/make.sh binary
<span class="nb">mv </span>bundles/binary-daemon/<span class="k">*</span> /usr/bin/
service docker restart
</code></pre></div></div>

<p>When provisioning some new RISC-V machines, I hit this
issue once again, but now the version of Go installed by <code class="language-plaintext highlighter-rouge">apt</code> on Ubuntu Noble is
too old to build Docker!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>go: vendor.mod requires go &gt;= 1.23.0 (running go 1.22.2; GOTOOLCHAIN=local)
</code></pre></div></div>

<p>As this needs to be repeated multiple times, it makes sense
to wrap the installation steps into an Ansible Playbook.
<a href="https://gist.github.com/mtelvers/ced9d981b9137c491c95780390ce802c">golang+docker.yml</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="docker,go" /><category term="tunbury.org" /><summary type="html"><![CDATA[For some time, we have had issues on Ubuntu Noble when extracting tar files within Docker containers. See ocaml/infrastructure#121. This is only an issue on exotic architectures like RISCV and PPC64LE.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/docker-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/docker-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Installation order for opam packages</title><link href="https://www.tunbury.org/2025/03/31/opam-post-deps/" rel="alternate" type="text/html" title="Installation order for opam packages" /><published>2025-03-31T00:00:00+00:00</published><updated>2025-03-31T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/31/opam-post-deps</id><content type="html" xml:base="https://www.tunbury.org/2025/03/31/opam-post-deps/"><![CDATA[<p>Previously, I discussed the installation order for a simple directed acyclic graph without any cycles. However, <code class="language-plaintext highlighter-rouge">opam</code> packages include <em>post</em> dependencies. Rather than package A depending upon B where B would be installed first, <em>post</em> dependencies require X to be installed after Y. The <em>post</em> dependencies only occur in a small number of core OCaml packages. They are quite often empty and exist to direct the solver. Up until now, I had been using a base layer with an opam switch containing the base compiler and, therefore, did not need to deal with any <em>post</em> dependencies.</p>

<p>Here is the graph of <a href="/images/0install.2.18-with-post-with-colour.pdf">0install</a> with <em>post</em> dependencies coloured in red.</p>

<p>Removing the <em>post</em> dependencies gives an unsatisfying graph with orphaned dependencies. <a href="/images/0install.2.18-without-post.pdf">0install without post</a>. Note <code class="language-plaintext highlighter-rouge">base-nnp.base</code> and <code class="language-plaintext highlighter-rouge">base-effects.base</code>. However, this graph can be used to produce a linear installation order. The orphaned packages can be removed with a recursive search.</p>
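<p>A minimal sketch of that recursive search, assuming the graph is held as a map from package name to dependency set (the <code>PackageMap</code>/<code>PackageSet</code> names are illustrative, borrowed from the earlier topological-sort example, not the actual code): keep only the packages reachable from the root and drop the rest.</p>

```ocaml
(* Sketch: prune packages left orphaned once the post edges are removed.
   The map/set representation is an assumption for illustration. *)
module PackageSet = Set.Make (String)
module PackageMap = Map.Make (String)

(* depth-first walk collecting everything reachable from [name] *)
let rec reachable pkgs seen name =
  if PackageSet.mem name seen then seen
  else
    let deps =
      match PackageMap.find_opt name pkgs with
      | Some d -> d
      | None -> PackageSet.empty
    in
    PackageSet.fold
      (fun dep acc -> reachable pkgs acc dep)
      deps (PackageSet.add name seen)

let prune pkgs root =
  let keep = reachable pkgs PackageSet.empty root in
  PackageMap.filter (fun name _ -> PackageSet.mem name keep) pkgs

let pkgs =
  PackageMap.(
    empty
    |> add "0install" (PackageSet.singleton "ocaml")
    |> add "ocaml" PackageSet.empty
    |> add "base-nnp.base" PackageSet.empty (* orphaned without post edges *))

let kept = prune pkgs "0install"
let () = assert (PackageMap.mem "ocaml" kept)
let () = assert (not (PackageMap.mem "base-nnp.base" kept))
```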

<p>When opam wants to decide the installation order, it uses OCamlgraph’s topological sort capability.</p>

<blockquote>
  <p>This functor provides functions which allow iterating over a graph in topological order. Cycles in graphs are allowed. Specification is the following: If vertex [x] is visited before vertex [y] then either there is a path from [x] to [y], or there is no path from [y] to [x].  In the particular case of a DAG, this simplifies to: if there is an edge from [x] to [y], then [x] is visited before [y].</p>
</blockquote>

<p>The description of <code class="language-plaintext highlighter-rouge">fold</code> is particularly interesting as the order for cycles is unspecified.</p>

<blockquote>
  <p>[fold action g seed] allows iterating over the graph [g] in topological order. [action node accu] is called repeatedly, where [node] is the node being visited, and [accu] is the result of the [action]’s previous invocation, if any, and [seed] otherwise.  If [g] contains cycles, the order is unspecified inside the cycles and every node in the cycles will be presented exactly once</p>
</blockquote>

<p>In my testing, the installation order matches the order used by opam within the variation allowed above.</p>

<p>Layers can be built up using the intersection of packages installed so far and the required dependencies.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="opam" /><category term="tunbury.org" /><summary type="html"><![CDATA[Previously, I discussed the installation order for a simple directed acyclic graph without any cycles. However, opam packages include post dependencies. Rather than package A depending upon B where B would be installed first, post dependencies require X to be installed after Y. The post dependencies only occur in a small number of core OCaml packages. They are quite often empty and exist to direct the solver. Up until now, I had been using a base layer with an opam switch containing the base compiler and, therefore, did not need to deal with any post dependencies.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/opam.png" /><media:content medium="image" url="https://www.tunbury.org/images/opam.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Box Diff Tool</title><link href="https://www.tunbury.org/2025/03/30/box-diff/" rel="alternate" type="text/html" title="Box Diff Tool" /><published>2025-03-30T00:00:00+00:00</published><updated>2025-03-30T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/30/box-diff</id><content type="html" xml:base="https://www.tunbury.org/2025/03/30/box-diff/"><![CDATA[<p>Box has an unlimited storage model but has an upload limit of 1TB per month. I have been uploading various data silos but would now like to verify that the data is all present. Box has an extensive <a href="https://developer.box.com/reference/">API</a>, but I only need the <a href="https://developer.box.com/reference/get-folders-id-items/">list items in folder</a> call.</p>

<p>The list-items call assumes that you have a folder ID which you would like to query. The root of the tree is always ID 0. To check for the presence of file <code class="language-plaintext highlighter-rouge">foo</code> in a folder tree <code class="language-plaintext highlighter-rouge">a/b/c/foo</code>, we need to call the API with folder ID 0. This returns a list of the entries in that folder, e.g.:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"entries"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"12345"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"folder"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"a"</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The API must now be called again with the new ID number to get the contents of folder <code class="language-plaintext highlighter-rouge">a</code>. This is repeated until we finally have the entries for folder <code class="language-plaintext highlighter-rouge">c</code> which would contain the file itself. I have used a <code class="language-plaintext highlighter-rouge">Hashtbl</code> to cache the results of each call.</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"entries"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
    </span><span class="p">{</span><span class="w">
      </span><span class="nl">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"78923434"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"file"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"foo"</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
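<p>The caching idea can be sketched as follows; <code>api_list_folder</code> is a stub standing in for the real Box call, and the names are illustrative rather than taken from the tool:</p>

```ocaml
(* Sketch: memoise each folder listing by folder ID, so resolving many
   paths under the same tree only calls the API once per folder. *)
let calls = ref 0

(* stub API: folder 0 contains folder "a" (ID 1); folder 1 contains
   file "foo" (ID 2) *)
let api_list_folder id =
  incr calls;
  match id with
  | 0 -> [ ("a", 1) ]
  | 1 -> [ ("foo", 2) ]
  | _ -> []

let cache : (int, (string * int) list) Hashtbl.t = Hashtbl.create 16

let list_folder id =
  match Hashtbl.find_opt cache id with
  | Some entries -> entries
  | None ->
      let entries = api_list_folder id in
      Hashtbl.add cache id entries;
      entries

(* resolve a path such as ["a"; "foo"] starting from the root folder 0 *)
let resolve path =
  List.fold_left (fun id name -> List.assoc name (list_folder id)) 0 path

let () = assert (resolve [ "a"; "foo" ] = 2)
let () = assert (resolve [ "a"; "foo" ] = 2)
(* both folder listings came from the cache the second time *)
let () = assert (!calls = 2)
```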

<p>Each call defaults to returning at most 100 entries. This can be increased to a maximum of 1000 by passing <code class="language-plaintext highlighter-rouge">?limit=1000</code> to the GET request. For more results, Box offers two pagination systems: <code class="language-plaintext highlighter-rouge">offset</code> and <code class="language-plaintext highlighter-rouge">marker</code>. Offset allows you to pass a starting item number along with the call, but this is limited to 10,000 entries.</p>

<blockquote>
  <p>Queries with offset parameter value exceeding 10000 will be rejected with a 400 response.</p>
</blockquote>

<p>To deal with folders of any size, we should use the marker system. For this, we pass <code class="language-plaintext highlighter-rouge">?usemarker=true</code> on the first GET request, which causes the API to include <code class="language-plaintext highlighter-rouge">next_marker</code> and <code class="language-plaintext highlighter-rouge">prev_marker</code> as additional JSON properties where applicable. Subsequent calls then use <code class="language-plaintext highlighter-rouge">?usemarker=true&amp;marker=XXX</code>. The end of the listing is signalled by the absence of <code class="language-plaintext highlighter-rouge">next_marker</code>.</p>
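<p>The pagination loop can be sketched like this; <code>get_page</code> is a stub standing in for the HTTP request with the marker parameters, and the marker values are invented for the example:</p>

```ocaml
(* Sketch: keep requesting with the returned next_marker until it is
   absent.  Three stubbed pages of (entries, next_marker). *)
let pages = [| ([ 1; 2; 3 ], Some "m1"); ([ 4; 5 ], Some "m2"); ([ 6 ], None) |]

(* marker [None] means the first request *)
let get_page marker =
  match marker with
  | None -> pages.(0)
  | Some "m1" -> pages.(1)
  | Some "m2" -> pages.(2)
  | Some _ -> ([], None)

let rec collect marker acc =
  let entries, next = get_page marker in
  let acc = acc @ entries in
  match next with
  | None -> acc (* no next_marker: we have everything *)
  | Some _ as m -> collect m acc

let all = collect None []
let () = assert (all = [ 1; 2; 3; 4; 5; 6 ])
```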

<p>The project can be found on GitHub in <a href="https://github.com/mtelvers/ocaml-box-diff">mtelvers/ocaml-box-diff</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml,Box" /><category term="tunbury.org" /><summary type="html"><![CDATA[Box has an unlimited storage model but has an upload limit of 1TB per month. I have been uploading various data silos but would now like to verify that the data is all present. Box has an extensive API, but I only need the list items in folder call.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/box-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/box-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Dell PowerEdge R640 Storage Server</title><link href="https://www.tunbury.org/2025/03/27/dell-poweredge-r640/" rel="alternate" type="text/html" title="Dell PowerEdge R640 Storage Server" /><published>2025-03-27T00:00:00+00:00</published><updated>2025-03-27T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/27/dell-poweredge-r640</id><content type="html" xml:base="https://www.tunbury.org/2025/03/27/dell-poweredge-r640/"><![CDATA[<p>We have received our first batch of 7.68TB Kingston SSD drives for deployment in some Dell PowerEdge R640 servers, which will be used to create a large storage pool.</p>

<p>The first job was to mount each of the drives in a caddy.</p>

<p><img src="/images/kingston-with-caddy.png" alt="" /></p>

<p>And then install them in the server.</p>

<p><img src="/images/kingston-in-slot.png" alt="" /></p>

<p>These R640 servers are equipped with the Dell PERC H740P RAID controller. They support either hardware RAID 0,1,5,10,50 etc or Enhanced HBA mode.</p>

<p><img src="/images/r640-enhanced-hba.png" alt="" /></p>

<p>In eHBA mode, the disks operate in a passthrough mode, presenting the raw disks to the OS; however, each disk needs to be specifically selected in an additional step after enabling eHBA mode.</p>

<p><img src="/images/r640-jbod.png" alt="" /></p>

<p>In RAID mode, one or more virtual disks need to be created to present the disks to the OS. Preconfigured profiles are available to complete this step easily.</p>

<p><img src="/images/r640-raid5.png" alt="" /></p>

<p>We will run these with a ZFS file system, so we need to decide whether to use the hardware RAID features or follow the advice on Wikipedia on the <a href="https://en.wikipedia.org/wiki/ZFS#Avoidance_of_hardware_RAID_controllers">Avoidance of hardware RAID controllers</a>.  Online opinion is divided.  My summary is that hardware RAID will be easier to manage when a disk fails, but ZFS on the raw disks will have some integrity advantages.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Dell" /><category term="tunbury.org" /><summary type="html"><![CDATA[We have received our first batch of 7.68TB Kingston SSD drives for deployment in some Dell PowerEdge R640 servers, which will be used to create a large storage pool.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/kingston-768tb.png" /><media:content medium="image" url="https://www.tunbury.org/images/kingston-768tb.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">FreeBSD 14.2 Upgrade</title><link href="https://www.tunbury.org/2025/03/26/freebsd-14.2/" rel="alternate" type="text/html" title="FreeBSD 14.2 Upgrade" /><published>2025-03-26T00:00:00+00:00</published><updated>2025-03-26T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/26/freebsd-14.2</id><content type="html" xml:base="https://www.tunbury.org/2025/03/26/freebsd-14.2/"><![CDATA[<p>CI workers <code class="language-plaintext highlighter-rouge">spring</code> and <code class="language-plaintext highlighter-rouge">summer</code> run FreeBSD and need to be updated.</p>

<p>Check the currently installed version of FreeBSD with <code class="language-plaintext highlighter-rouge">uname -a</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>FreeBSD summer 14.1-RELEASE-p5 FreeBSD 14.1-RELEASE-p5 GENERIC amd64
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">freebsd-update fetch</code> to download the latest versions of the system components, particularly the <code class="language-plaintext highlighter-rouge">freebsd-update</code> utility.  It even reported that it really is time to upgrade!</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># freebsd-update fetch</span>
...
WARNING: FreeBSD 14.1-RELEASE-p5 is approaching its End-of-Life date.
It is strongly recommended that you upgrade to a newer
release within the next 5 days.
</code></pre></div></div>

<p>Install these updates.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>freebsd-update <span class="nb">install</span>
</code></pre></div></div>

<p>Now use <code class="language-plaintext highlighter-rouge">freebsd-update</code> to fetch the 14.2-RELEASE and install it.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># freebsd-update upgrade -r 14.2-RELEASE</span>
...
<span class="c"># freebsd-update install</span>
src component not installed, skipped
Installing updates...
Kernel updates have been installed.  Please reboot and run
<span class="s1">'freebsd-update [options] install'</span> again to finish installing updates.
</code></pre></div></div>

<p>Reboot the system using <code class="language-plaintext highlighter-rouge">reboot</code> and then finish installing updates.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># freebsd-update install</span>
src component not installed, skipped
Installing updates...
Restarting sshd after upgrade
Performing sanity check on sshd configuration.
Stopping sshd.
Waiting <span class="k">for </span>PIDS: 707.
Performing sanity check on sshd configuration.
Starting sshd.
Scanning /usr/share/certs/untrusted <span class="k">for </span>certificates...
Scanning /usr/share/certs/trusted <span class="k">for </span>certificates...
Scanning /usr/local/share/certs <span class="k">for </span>certificates...
 <span class="k">done</span><span class="nb">.</span>
</code></pre></div></div>

<p>Now use <code class="language-plaintext highlighter-rouge">pkg</code> to upgrade any applications.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># pkg upgrade</span>
Updating FreeBSD repository catalogue...
Fetching data.pkg: 100%    7 MiB   7.5MB/s    00:01    
Processing entries: 100%
FreeBSD repository update completed. 35885 packages processed.
All repositories are up to date.
Checking <span class="k">for </span>upgrades <span class="o">(</span>28 candidates<span class="o">)</span>: 100%
Processing candidates <span class="o">(</span>28 candidates<span class="o">)</span>: 100%
The following 28 package<span class="o">(</span>s<span class="o">)</span> will be affected <span class="o">(</span>of 0 checked<span class="o">)</span>:

Installed packages to be UPGRADED:
	curl: 8.10.1 -&gt; 8.11.1_1
...
	xxd: 9.1.0764 -&gt; 9.1.1199

Number of packages to be upgraded: 28

The process will require 3 MiB more space.
77 MiB to be downloaded.

Proceed with this action? <span class="o">[</span>y/N]: y
</code></pre></div></div>

<p>Finally, reboot the system and check <code class="language-plaintext highlighter-rouge">uname -a</code>.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># uname -a</span>
FreeBSD spring 14.2-RELEASE-p1 FreeBSD 14.2-RELEASE-p1 GENERIC amd64
</code></pre></div></div>

<p>To update the FreeBSD base images used by the CI services, I applied <a href="https://github.com/ocurrent/freebsd-infra/pull/13">PR#13</a> to <a href="https://github.com/ocurrent/freebsd-infra">ocurrent/freebsd-infra</a>.</p>

<p>This was followed up by <a href="https://github.com/ocurrent/ocaml-ci/pull/1007">PR#1007</a> on ocurrent/ocaml-ci and <a href="https://github.com/ocurrent/opam-repo-ci/pull/427">PR#427</a> to ocurrent/opam-repo-ci.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="FreeBSD" /><category term="tunbury.org" /><summary type="html"><![CDATA[CI workers spring and summer run FreeBSD and need to be updated.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/freebsd-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/freebsd-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Topological Sort of Packages</title><link href="https://www.tunbury.org/2025/03/25/topological-sort/" rel="alternate" type="text/html" title="Topological Sort of Packages" /><published>2025-03-25T00:00:00+00:00</published><updated>2025-03-25T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/25/topological-sort</id><content type="html" xml:base="https://www.tunbury.org/2025/03/25/topological-sort/"><![CDATA[<p>Given a list of packages and their dependencies, what order should those packages be installed in?</p>

<p>The above graph gives a simple example of the dependencies of the package <code class="language-plaintext highlighter-rouge">dune</code> nicely ordered right to left.</p>

<p>We might choose to model this in OCaml using a map with the package name as the key and a set of the dependent packages:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">PackageSet</span> <span class="o">=</span> <span class="nn">Set</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="nc">String</span><span class="p">);;</span>
<span class="k">module</span> <span class="nc">PackageMap</span> <span class="o">=</span> <span class="nn">Map</span><span class="p">.</span><span class="nc">Make</span> <span class="p">(</span><span class="nc">String</span><span class="p">);;</span>
</code></pre></div></div>

<p>Thus, the <code class="language-plaintext highlighter-rouge">dune</code> example could be defined like this.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">dune</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.(</span><span class="n">empty</span> <span class="o">|&gt;</span>
    <span class="n">add</span> <span class="s2">"ocaml"</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.(</span><span class="n">empty</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="s2">"ocaml-config"</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="s2">"ocaml-variants"</span><span class="p">))</span> <span class="o">|&gt;</span>
    <span class="n">add</span> <span class="s2">"ocaml-config"</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.(</span><span class="n">empty</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="s2">"ocaml-variants"</span><span class="p">))</span> <span class="o">|&gt;</span>
    <span class="n">add</span> <span class="s2">"dune"</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.(</span><span class="n">empty</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="s2">"ocaml"</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="s2">"base-unix.base"</span> <span class="o">|&gt;</span> <span class="n">add</span> <span class="s2">"base-threads.base"</span><span class="p">))</span> <span class="o">|&gt;</span>
    <span class="n">add</span> <span class="s2">"ocaml-variants"</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.</span><span class="n">empty</span><span class="p">)</span> <span class="o">|&gt;</span>
    <span class="n">add</span> <span class="s2">"base-unix.base"</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.</span><span class="n">empty</span><span class="p">)</span> <span class="o">|&gt;</span>
    <span class="n">add</span> <span class="s2">"base-threads.base"</span> <span class="p">(</span><span class="nn">PackageSet</span><span class="p">.</span><span class="n">empty</span><span class="p">)</span>
  <span class="p">);;</span>
</code></pre></div></div>

<p>We can create a topological sort by first choosing any package with an empty set of dependencies.  This package should then be removed from the map of packages and also removed as a dependency from any of the sets.  This can be written concisely in OCaml</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">topological_sort</span> <span class="n">pkgs</span> <span class="o">=</span>
  <span class="k">match</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">is_empty</span> <span class="n">pkgs</span> <span class="k">with</span>
  <span class="o">|</span> <span class="bp">true</span> <span class="o">-&gt;</span> <span class="bp">[]</span>
  <span class="o">|</span> <span class="bp">false</span> <span class="o">-&gt;</span>
      <span class="k">let</span> <span class="n">installable</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">filter</span> <span class="p">(</span><span class="k">fun</span> <span class="n">_</span> <span class="n">deps</span> <span class="o">-&gt;</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">is_empty</span> <span class="n">deps</span><span class="p">)</span> <span class="n">pkgs</span> <span class="k">in</span>
      <span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="k">assert</span> <span class="p">(</span><span class="n">not</span> <span class="p">(</span><span class="nn">PackageMap</span><span class="p">.</span><span class="n">is_empty</span> <span class="n">installable</span><span class="p">))</span> <span class="k">in</span>
      <span class="k">let</span> <span class="n">i</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">choose</span> <span class="n">installable</span> <span class="o">|&gt;</span> <span class="n">fst</span> <span class="k">in</span>
      <span class="k">let</span> <span class="n">pkgs</span> <span class="o">=</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">remove</span> <span class="n">i</span> <span class="n">pkgs</span> <span class="o">|&gt;</span> <span class="nn">PackageMap</span><span class="p">.</span><span class="n">map</span> <span class="p">(</span><span class="k">fun</span> <span class="n">deps</span> <span class="o">-&gt;</span> <span class="nn">PackageSet</span><span class="p">.</span><span class="n">remove</span> <span class="n">i</span> <span class="n">deps</span><span class="p">)</span> <span class="k">in</span>
      <span class="n">i</span> <span class="o">::</span> <span class="n">topological_sort</span> <span class="n">pkgs</span>
</code></pre></div></div>
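<p>The <code>assert</code> in the function also acts as a crude cycle guard: with a circular dependency, no package ever has an empty dependency set, so no progress can be made and the assertion fires. A standalone check of this behaviour (definitions repeated from above so the sketch runs on its own):</p>

```ocaml
(* Definitions repeated from above so this check is self-contained. *)
module PackageSet = Set.Make (String)
module PackageMap = Map.Make (String)

let rec topological_sort pkgs =
  match PackageMap.is_empty pkgs with
  | true -> []
  | false ->
      let installable =
        PackageMap.filter (fun _ deps -> PackageSet.is_empty deps) pkgs
      in
      (* fails when only mutually dependent packages remain *)
      let () = assert (not (PackageMap.is_empty installable)) in
      let i = PackageMap.choose installable |> fst in
      let pkgs =
        PackageMap.remove i pkgs
        |> PackageMap.map (fun deps -> PackageSet.remove i deps)
      in
      i :: topological_sort pkgs

(* a depends on b and b depends on a: a two-package cycle *)
let cyclic =
  PackageMap.(
    empty
    |> add "a" (PackageSet.singleton "b")
    |> add "b" (PackageSet.singleton "a"))

let detected =
  match topological_sort cyclic with
  | _ -> false
  | exception Assert_failure _ -> true (* expected: the cycle is caught *)

let () = assert detected
```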

<p>This gives us the correct installation order:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># topological_sort dune;;
- : PackageMap.key list =
["base-threads.base"; "base-unix.base"; "ocaml-variants"; "ocaml-config"; "ocaml"; "dune"]
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Given a list of packages and their dependencies, what order should those packages be installed in?]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/dune-graph.png" /><media:content medium="image" url="https://www.tunbury.org/images/dune-graph.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Recent OCaml Versions</title><link href="https://www.tunbury.org/2025/03/24/recent-ocaml-version/" rel="alternate" type="text/html" title="Recent OCaml Versions" /><published>2025-03-24T00:00:00+00:00</published><updated>2025-03-24T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/24/recent-ocaml-version</id><content type="html" xml:base="https://www.tunbury.org/2025/03/24/recent-ocaml-version/"><![CDATA[<p>Following my <a href="https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229">post on discuss.ocaml.org</a>, I have created a new release of <a href="https://github.com/ocurrent/ocaml-version">ocurrent/ocaml-version</a> that raises the minimum version of OCaml considered <em>recent</em> from 4.02 to 4.08.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">recent</span> <span class="o">=</span> <span class="p">[</span> <span class="n">v4_08</span><span class="p">;</span> <span class="n">v4_09</span><span class="p">;</span> <span class="n">v4_10</span><span class="p">;</span> <span class="n">v4_11</span><span class="p">;</span> <span class="n">v4_12</span><span class="p">;</span> <span class="n">v4_13</span><span class="p">;</span> <span class="n">v4_14</span><span class="p">;</span> <span class="n">v5_0</span><span class="p">;</span> <span class="n">v5_1</span><span class="p">;</span> <span class="n">v5_2</span><span class="p">;</span> <span class="n">v5_3</span> <span class="p">]</span>
</code></pre></div></div>

<p>This may feel like a mundane change, but <a href="https://github.com/ocurrent/ocaml-ci">OCaml-CI</a>, <a href="https://github.com/ocurrent/opam-repo-ci">opam-repo-ci</a>, and the <a href="https://github.com/ocurrent/docker-base-images">Docker base image builder</a>, among other services, use this list to determine the set of OCaml versions to test against. Therefore, as these services are updated, testing on the older releases will be removed.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[Following my post on discuss.ocaml.org, I have created a new release of ocurrent/ocaml-version that moves the minimum version of OCaml, considered as recent, from 4.02 to 4.08.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/ocaml-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/ocaml-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Real Time Trains API</title><link href="https://www.tunbury.org/2025/03/23/real-time-trains/" rel="alternate" type="text/html" title="Real Time Trains API" /><published>2025-03-23T00:00:00+00:00</published><updated>2025-03-23T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/23/real-time-trains</id><content type="html" xml:base="https://www.tunbury.org/2025/03/23/real-time-trains/"><![CDATA[<p>After the Heathrow substation electrical fire, I found myself in Manchester with a long train ride ahead.  Checking on <a href="https://www.realtimetrains.co.uk">Real Time Trains</a> for the schedule, I noticed that they had an API.  With time to spare, I registered for an account and downloaded the sample code from <a href="https://github.com/mirage/ocaml-cohttp">ocaml-cohttp</a>.</p>

<p>The API uses HTTP basic authentication with the account details, which are added via an HTTP header:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">let</span> <span class="n">headers</span> <span class="o">=</span> <span class="nn">Cohttp</span><span class="p">.</span><span class="nn">Header</span><span class="p">.</span><span class="n">init</span> <span class="bp">()</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">headers</span> <span class="o">=</span>
    <span class="nn">Cohttp</span><span class="p">.</span><span class="nn">Header</span><span class="p">.</span><span class="n">add_authorization</span> <span class="n">headers</span> <span class="p">(</span><span class="nt">`Basic</span> <span class="p">(</span><span class="n">user</span><span class="o">,</span> <span class="n">password</span><span class="p">))</span>
</code></pre></div></div>

<p>The response from the API can be converted to JSON using <a href="https://github.com/ocaml-community/yojson">Yojson</a>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">json</span> <span class="o">=</span>
      <span class="nn">Eio</span><span class="p">.</span><span class="nn">Buf_read</span><span class="p">.(</span><span class="n">parse_exn</span> <span class="n">take_all</span><span class="p">)</span> <span class="n">body</span> <span class="o">~</span><span class="n">max_size</span><span class="o">:</span><span class="n">max_int</span>
      <span class="o">|&gt;</span> <span class="nn">Yojson</span><span class="p">.</span><span class="nn">Safe</span><span class="p">.</span><span class="n">from_string</span>
</code></pre></div></div>

<p>JSON fields can be read using the <code class="language-plaintext highlighter-rouge">Util</code> functions.  For example, <code class="language-plaintext highlighter-rouge">Yojson.Basic.Util.member "services" json</code> will read the <code class="language-plaintext highlighter-rouge">services</code> entry, and elements can be converted to lists with <code class="language-plaintext highlighter-rouge">Yojson.Basic.Util.to_list</code>.  After a bit of hacking, this turned out to be quite tedious to code.</p>
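<p>A minimal sketch of this manual style, with an illustrative JSON literal standing in for the real RTT payload, shows why it becomes tedious: every field has to be unpacked by hand.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hand-rolled extraction with Yojson.Basic.Util; the field names
   here are illustrative, not the actual RTT schema. *)
let () =
  let open Yojson.Basic.Util in
  let json =
    Yojson.Basic.from_string
      {|{"services": [{"locationDetail": {"description": "Rochester"}}]}|}
  in
  json |&gt; member "services" |&gt; to_list
  |&gt; List.iter (fun s -&gt;
         s |&gt; member "locationDetail" |&gt; member "description"
         |&gt; to_string |&gt; print_endline)
</code></pre></div></div>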

<p>As an alternative, I decided to use <code class="language-plaintext highlighter-rouge">ppx_deriving_yojson.runtime</code>.  I described the JSON blocks as OCaml types, e.g. <code class="language-plaintext highlighter-rouge">station</code> as below.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">station</span> <span class="o">=</span> <span class="p">{</span>
  <span class="n">tiploc</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
  <span class="n">description</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
  <span class="n">workingTime</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
  <span class="n">publicTime</span> <span class="o">:</span> <span class="kt">string</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">[</span><span class="o">@@</span><span class="n">deriving</span> <span class="n">yojson</span><span class="p">]</span>
</code></pre></div></div>

<p>The preprocessor automatically generates two functions, <code class="language-plaintext highlighter-rouge">station_of_yojson</code> and <code class="language-plaintext highlighter-rouge">station_to_yojson</code>, which handle the conversion.</p>
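<p>Note that the derived decoder returns a <code class="language-plaintext highlighter-rouge">result</code> rather than raising, which is why calling code matches on <code class="language-plaintext highlighter-rouge">Ok</code>/<code class="language-plaintext highlighter-rouge">Error</code>.  The generated signatures are roughly:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val station_to_yojson : station -&gt; Yojson.Safe.t
val station_of_yojson : Yojson.Safe.t -&gt; (station, string) result
</code></pre></div></div>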

<p>The only negative of this approach is that RTT omits empty JSON fields, so these need to be flagged as possibly missing and a default value provided.  For example, <code class="language-plaintext highlighter-rouge">realtimeArrivalNextDay</code> is not emitted unless the value is <code class="language-plaintext highlighter-rouge">true</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="n">realtimeArrivalNextDay</span> <span class="o">:</span> <span class="p">(</span><span class="kt">bool</span><span class="p">[</span><span class="o">@</span><span class="n">default</span> <span class="bp">false</span><span class="p">]);</span>
</code></pre></div></div>
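<p>For fields with no sensible default value, the same attribute can wrap an <code class="language-plaintext highlighter-rouge">option</code> type; this is a sketch rather than a field from the actual RTT schema:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="c">(* Decodes to None when the field is absent from the JSON *)</span>
  <span class="n">cancelReasonCode</span> <span class="o">:</span> <span class="p">(</span><span class="kt">string</span> <span class="n">option</span><span class="p">[</span><span class="o">@</span><span class="n">default</span> <span class="nc">None</span><span class="p">]);</span>
</code></pre></div></div>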

<p>Once the JSON has been received, it can be converted to OCaml types very easily:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">match</span> <span class="n">reply_of_yojson</span> <span class="n">json</span> <span class="k">with</span>
    <span class="o">|</span> <span class="nc">Ok</span> <span class="n">reply</span> <span class="o">-&gt;</span>
       <span class="c">(* Use reply.services *)</span>
    <span class="o">|</span> <span class="nc">Error</span> <span class="n">err</span> <span class="o">-&gt;</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"Error %s</span><span class="se">\n</span><span class="s2">"</span> <span class="n">err</span>
</code></pre></div></div>

<p>My work-in-progress code is available on <a href="https://github.com/mtelvers/ocaml-rtt">GitHub</a>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dune exec --release -- rtt --user USER --pass PASS --station RTR
rtt: [DEBUG] received 3923 bytes of body
rtt: [DEBUG] received 4096 bytes of body
rtt: [DEBUG] received 4096 bytes of body
rtt: [DEBUG] received 4096 bytes of body
rtt: [DEBUG] received 1236 bytes of body
rtt: [DEBUG] end of inbound body
2025-03-23 2132 W16178 1C69 1 Ramsgate St Pancras International
2025-03-23 2132 W25888 9P59 2 Plumstead Rainham (Kent)
2025-03-23 2136 J00119 1U28 2 London Victoria Ramsgate
2025-03-23 2144 W25927 9P86 1 Rainham (Kent) Plumstead
2025-03-23 2157 W16899 1C66 2 St Pancras International Ramsgate
2025-03-23 2202 W25894 9P61 2 Plumstead Rainham (Kent)
2025-03-23 2210 J26398 1U80 1 Ramsgate London Victoria
2025-03-23 2214 W25916 9P70 1 Rainham (Kent) Plumstead
2025-03-23 2232 W16910 1C73 1 Ramsgate St Pancras International
2025-03-23 2232 W25900 9P63 2 Plumstead Rainham (Kent)
2025-03-23 2236 J00121 1U30 2 London Victoria Ramsgate
2025-03-23 2244 W25277 9A92 1 Rainham (Kent) Dartford
2025-03-23 2257 W16450 1F70 2 St Pancras International Faversham
2025-03-23 2302 W25906 9P65 2 Plumstead Rainham (Kent)
2025-03-23 2314 W25283 9A94 1 Rainham (Kent) Dartford
2025-03-23 2318 J00155 1U82 1 Ramsgate London Victoria
2025-03-23 2332 W25912 9P67 2 Plumstead Gillingham (Kent)
2025-03-23 2336 J00123 1U32 2 London Victoria Ramsgate
2025-03-23 2344 W25289 9A96 1 Rainham (Kent) Dartford
2025-03-23 2357 W16475 1F74 2 St Pancras International Faversham
2025-03-23 0002 W25915 9P69 2 Plumstead Gillingham (Kent)
2025-03-23 0041 J26381 1Z34 2 London Victoria Faversham
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="OCaml" /><category term="tunbury.org" /><summary type="html"><![CDATA[After the Heathrow substation electrical fire, I found myself in Manchester with a long train ride ahead. Checking on Real Time Trains for the schedule I noticed that they had an API. With time to spare, I registered for an account and downloaded the sample code from ocaml-cohttp.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/rtt.png" /><media:content medium="image" url="https://www.tunbury.org/images/rtt.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Irmin Database</title><link href="https://www.tunbury.org/2025/03/17/irmin/" rel="alternate" type="text/html" title="Irmin Database" /><published>2025-03-17T00:00:00+00:00</published><updated>2025-03-17T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/17/irmin</id><content type="html" xml:base="https://www.tunbury.org/2025/03/17/irmin/"><![CDATA[<p>After Thomas’ talk today I wanted to try <a href="https://irmin.org">Irmin</a> for myself.</p>

<p>In a new switch, I installed Irmin via opam with <code class="language-plaintext highlighter-rouge">opam install irmin-git</code> and then built the <a href="https://irmin.org/tutorial/getting-started/">example code</a>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">open</span> <span class="nn">Lwt</span><span class="p">.</span><span class="nc">Syntax</span>
<span class="k">module</span> <span class="nc">Git_store</span> <span class="o">=</span> <span class="nn">Irmin_git_unix</span><span class="p">.</span><span class="nn">FS</span><span class="p">.</span><span class="nc">KV</span> <span class="p">(</span><span class="nn">Irmin</span><span class="p">.</span><span class="nn">Contents</span><span class="p">.</span><span class="nc">String</span><span class="p">)</span>
<span class="k">module</span> <span class="nc">Git_info</span> <span class="o">=</span> <span class="nn">Irmin_unix</span><span class="p">.</span><span class="nc">Info</span> <span class="p">(</span><span class="nn">Git_store</span><span class="p">.</span><span class="nc">Info</span><span class="p">)</span>

<span class="k">let</span> <span class="n">git_config</span> <span class="o">=</span> <span class="nn">Irmin_git</span><span class="p">.</span><span class="n">config</span> <span class="o">~</span><span class="n">bare</span><span class="o">:</span><span class="bp">true</span> <span class="s2">"./db"</span>
<span class="k">let</span> <span class="n">info</span> <span class="n">message</span> <span class="o">=</span> <span class="nn">Git_info</span><span class="p">.</span><span class="n">v</span> <span class="o">~</span><span class="n">author</span><span class="o">:</span><span class="s2">"Example"</span> <span class="s2">"%s"</span> <span class="n">message</span>

<span class="k">let</span> <span class="n">main_branch</span> <span class="n">config</span> <span class="o">=</span>
  <span class="k">let</span><span class="o">*</span> <span class="n">repo</span> <span class="o">=</span> <span class="nn">Git_store</span><span class="p">.</span><span class="nn">Repo</span><span class="p">.</span><span class="n">v</span> <span class="n">config</span> <span class="k">in</span>
  <span class="nn">Git_store</span><span class="p">.</span><span class="n">main</span> <span class="n">repo</span>

<span class="k">let</span> <span class="n">main</span> <span class="o">=</span>
  <span class="k">let</span><span class="o">*</span> <span class="n">t</span> <span class="o">=</span> <span class="n">main_branch</span> <span class="n">git_config</span> <span class="k">in</span>
  <span class="c">(* Set a/b/c to "Hello, Irmin!" *)</span>
  <span class="k">let</span><span class="o">*</span> <span class="bp">()</span> <span class="o">=</span>
    <span class="nn">Git_store</span><span class="p">.</span><span class="n">set_exn</span> <span class="n">t</span> <span class="p">[</span> <span class="s2">"a"</span><span class="p">;</span> <span class="s2">"b"</span><span class="p">;</span> <span class="s2">"c"</span> <span class="p">]</span> <span class="s2">"Hello, Irmin!"</span>
      <span class="o">~</span><span class="n">info</span><span class="o">:</span><span class="p">(</span><span class="n">info</span> <span class="s2">"my first commit"</span><span class="p">)</span>
  <span class="k">in</span>
  <span class="c">(* Get a/b/c *)</span>
  <span class="k">let</span><span class="o">+</span> <span class="n">s</span> <span class="o">=</span> <span class="nn">Git_store</span><span class="p">.</span><span class="n">get</span> <span class="n">t</span> <span class="p">[</span> <span class="s2">"a"</span><span class="p">;</span> <span class="s2">"b"</span><span class="p">;</span> <span class="s2">"c"</span> <span class="p">]</span> <span class="k">in</span>
  <span class="k">assert</span> <span class="p">(</span><span class="n">s</span> <span class="o">=</span> <span class="s2">"Hello, Irmin!"</span><span class="p">)</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="nn">Lwt_main</span><span class="p">.</span><span class="n">run</span> <span class="n">main</span>
</code></pre></div></div>

<p>I’m pretty excited about the possibilities.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="irmin" /><category term="tunbury.org" /><summary type="html"><![CDATA[After Thomas’ talk today I wanted to try Irmin for myself.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/irmin.png" /><media:content medium="image" url="https://www.tunbury.org/images/irmin.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Playing with Cap’n Proto</title><link href="https://www.tunbury.org/2025/03/17/capnproto/" rel="alternate" type="text/html" title="Playing with Cap’n Proto" /><published>2025-03-17T00:00:00+00:00</published><updated>2025-03-17T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/17/capnproto</id><content type="html" xml:base="https://www.tunbury.org/2025/03/17/capnproto/"><![CDATA[<p>Cap’n Proto has become a hot topic recently and while this is used for many OCaml-CI services, I spent some time creating a minimal application.</p>

<p>Firstly, create a schema with a single interface which accepts a file name and returns its content.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interface Foo {
  get      @0 (name :Text) -&gt; (reply :Text);
}
</code></pre></div></div>

<p>This schema can then be compiled into the bindings for your required language, e.g. <code class="language-plaintext highlighter-rouge">capnp compile -o ocaml:. schema.capnp</code></p>

<p>In practice, this need not be done by hand, as we can use a <code class="language-plaintext highlighter-rouge">dune</code> rule:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(rule
 (targets foo_api.ml foo_api.mli)
 (deps    foo_api.capnp)
 (action (run capnp compile -o %{bin:capnpc-ocaml} %{deps})))
</code></pre></div></div>

<p>On the server side, we now need to extend the automatically generated code to actually implement the interface.  This code is largely boilerplate.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Api</span> <span class="o">=</span> <span class="nn">Foo_api</span><span class="p">.</span><span class="nc">MakeRPC</span><span class="p">(</span><span class="nc">Capnp_rpc</span><span class="p">)</span>

<span class="k">open</span> <span class="nn">Capnp_rpc</span><span class="p">.</span><span class="nc">Std</span>

<span class="k">let</span> <span class="n">read_from_file</span> <span class="n">filename</span> <span class="o">=</span> <span class="nn">In_channel</span><span class="p">.</span><span class="n">with_open_text</span> <span class="n">filename</span> <span class="o">@@</span> <span class="k">fun</span> <span class="n">ic</span> <span class="o">-&gt;</span> <span class="nn">In_channel</span><span class="p">.</span><span class="n">input_all</span> <span class="n">ic</span>

<span class="k">let</span> <span class="n">local</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">module</span> <span class="nc">Foo</span> <span class="o">=</span> <span class="nn">Api</span><span class="p">.</span><span class="nn">Service</span><span class="p">.</span><span class="nc">Foo</span> <span class="k">in</span>
  <span class="nn">Foo</span><span class="p">.</span><span class="n">local</span> <span class="o">@@</span> <span class="k">object</span>
    <span class="k">inherit</span> <span class="nn">Foo</span><span class="p">.</span><span class="n">service</span>

    <span class="n">method</span> <span class="n">get_impl</span> <span class="n">params</span> <span class="n">release_param_caps</span> <span class="o">=</span>
      <span class="k">let</span> <span class="k">open</span> <span class="nn">Foo</span><span class="p">.</span><span class="nc">Get</span> <span class="k">in</span>
      <span class="k">let</span> <span class="n">name</span> <span class="o">=</span> <span class="nn">Params</span><span class="p">.</span><span class="n">name_get</span> <span class="n">params</span> <span class="k">in</span>
      <span class="n">release_param_caps</span> <span class="bp">()</span><span class="p">;</span>
      <span class="k">let</span> <span class="n">response</span><span class="o">,</span> <span class="n">results</span> <span class="o">=</span> <span class="nn">Service</span><span class="p">.</span><span class="nn">Response</span><span class="p">.</span><span class="n">create</span> <span class="nn">Results</span><span class="p">.</span><span class="n">init_pointer</span> <span class="k">in</span>
      <span class="nn">Results</span><span class="p">.</span><span class="n">reply_set</span> <span class="n">results</span> <span class="p">(</span><span class="n">read_from_file</span> <span class="n">name</span><span class="p">);</span>
      <span class="nn">Service</span><span class="p">.</span><span class="n">return</span> <span class="n">response</span>
  <span class="k">end</span>
</code></pre></div></div>

<p>The server needs to generate the capability file needed to access the service and wait for incoming connections.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">cap_file</span> <span class="o">=</span> <span class="s2">"echo.cap"</span>

<span class="k">let</span> <span class="n">serve</span> <span class="n">config</span> <span class="o">=</span>
  <span class="nn">Switch</span><span class="p">.</span><span class="n">run</span> <span class="o">@@</span> <span class="k">fun</span> <span class="n">sw</span> <span class="o">-&gt;</span>
  <span class="k">let</span> <span class="n">service_id</span> <span class="o">=</span> <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="nn">Vat_config</span><span class="p">.</span><span class="n">derived_id</span> <span class="n">config</span> <span class="s2">"main"</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">restore</span> <span class="o">=</span> <span class="nn">Restorer</span><span class="p">.</span><span class="n">single</span> <span class="n">service_id</span> <span class="p">(</span><span class="nn">Foo</span><span class="p">.</span><span class="n">local</span><span class="p">)</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">vat</span> <span class="o">=</span> <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="n">serve</span> <span class="o">~</span><span class="n">sw</span> <span class="o">~</span><span class="n">restore</span> <span class="n">config</span> <span class="k">in</span>
  <span class="k">match</span> <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="nn">Cap_file</span><span class="p">.</span><span class="n">save_service</span> <span class="n">vat</span> <span class="n">service_id</span> <span class="n">cap_file</span> <span class="k">with</span>
  <span class="o">|</span> <span class="nc">Error</span> <span class="nt">`Msg</span> <span class="n">m</span> <span class="o">-&gt;</span> <span class="n">failwith</span> <span class="n">m</span>
  <span class="o">|</span> <span class="nc">Ok</span> <span class="bp">()</span> <span class="o">-&gt;</span>
    <span class="n">traceln</span> <span class="s2">"Server running. Connect using %S."</span> <span class="n">cap_file</span><span class="p">;</span>
    <span class="nn">Fiber</span><span class="p">.</span><span class="n">await_cancel</span> <span class="bp">()</span>
</code></pre></div></div>

<p>The client application imports the capability file and calls the service <code class="language-plaintext highlighter-rouge">Foo.get</code>.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">run_client</span> <span class="n">service</span> <span class="o">=</span>
  <span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="nn">Foo</span><span class="p">.</span><span class="n">get</span> <span class="n">service</span> <span class="s2">"client.ml"</span> <span class="k">in</span>
  <span class="n">traceln</span> <span class="s2">"%S"</span> <span class="n">x</span>

<span class="k">let</span> <span class="n">connect</span> <span class="n">net</span> <span class="n">uri</span> <span class="o">=</span>
  <span class="nn">Switch</span><span class="p">.</span><span class="n">run</span> <span class="o">@@</span> <span class="k">fun</span> <span class="n">sw</span> <span class="o">-&gt;</span>
  <span class="k">let</span> <span class="n">client_vat</span> <span class="o">=</span> <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="n">client_only_vat</span> <span class="o">~</span><span class="n">sw</span> <span class="n">net</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">sr</span> <span class="o">=</span> <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="nn">Vat</span><span class="p">.</span><span class="n">import_exn</span> <span class="n">client_vat</span> <span class="n">uri</span> <span class="k">in</span>
  <span class="nn">Capnp_rpc_unix</span><span class="p">.</span><span class="n">with_cap_exn</span> <span class="n">sr</span> <span class="n">run_client</span>
</code></pre></div></div>

<p>Where <code class="language-plaintext highlighter-rouge">Foo.get</code> is defined like this:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">module</span> <span class="nc">Foo</span> <span class="o">=</span> <span class="nn">Api</span><span class="p">.</span><span class="nn">Client</span><span class="p">.</span><span class="nc">Foo</span>

<span class="k">let</span> <span class="n">get</span> <span class="n">t</span> <span class="n">name</span> <span class="o">=</span>
  <span class="k">let</span> <span class="k">open</span> <span class="nn">Foo</span><span class="p">.</span><span class="nc">Get</span> <span class="k">in</span>
  <span class="k">let</span> <span class="n">request</span><span class="o">,</span> <span class="n">params</span> <span class="o">=</span> <span class="nn">Capability</span><span class="p">.</span><span class="nn">Request</span><span class="p">.</span><span class="n">create</span> <span class="nn">Params</span><span class="p">.</span><span class="n">init_pointer</span> <span class="k">in</span>
  <span class="nn">Params</span><span class="p">.</span><span class="n">name_set</span> <span class="n">params</span> <span class="n">name</span><span class="p">;</span>
  <span class="nn">Capability</span><span class="p">.</span><span class="n">call_for_value_exn</span> <span class="n">t</span> <span class="n">method_id</span> <span class="n">request</span> <span class="o">|&gt;</span> <span class="nn">Results</span><span class="p">.</span><span class="n">reply_get</span>
</code></pre></div></div>

<p>Run the server application, passing it the location to save the private key and the interface/port to listen on.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>dune <span class="nb">exec</span> <span class="nt">--</span> ./server.exe <span class="nt">--capnp-secret-key-file</span> ./server.pem <span class="nt">--capnp-listen-address</span> tcp:127.0.0.1:7000
+Server running. Connect using <span class="s2">"echo.cap"</span><span class="nb">.</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">.cap</code> file looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>capnp://sha-256:f5BAo2n_2gVxUdkyzYsIuitpA1YT_7xFg31FIdNKVls@127.0.0.1:7000/6v45oIvGQ6noMaLOh5GHAJnGJPWEO5A3Qkt0Egke4Ic
</code></pre></div></div>

<p>In another window, invoke the client.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>dune <span class="nb">exec</span> <span class="nt">--</span> ./client.exe ./echo.cap
</code></pre></div></div>

<p>The full code is available on <a href="https://github.com/mtelvers/capnp-minimum">GitHub</a>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="capnpproto" /><category term="tunbury.org" /><summary type="html"><![CDATA[Cap’n Proto has become a hot topic recently and while this is used for many OCaml-CI services, I spent some time creating a minimal application.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/capnproto-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/capnproto-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Setup Tangled with Bluesky</title><link href="https://www.tunbury.org/2025/03/16/setup-tangled-with-bluesky/" rel="alternate" type="text/html" title="Setup Tangled with Bluesky" /><published>2025-03-16T00:00:00+00:00</published><updated>2025-03-16T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/16/setup-tangled-with-bluesky</id><content type="html" xml:base="https://www.tunbury.org/2025/03/16/setup-tangled-with-bluesky/"><![CDATA[<p>To set this up, I’m using a modified version of Anil’s <a href="https://tangled.sh/@anil.recoil.org/knot-docker">repo</a>. My repo is <a href="https://tangled.sh/@mtelvers.tunbury.org/knot-docker">here</a>. Firstly, clone the repo and run <code class="language-plaintext highlighter-rouge">gen-key.sh</code>.</p>

<p>Go to <a href="https://tangled.sh/login">https://tangled.sh/login</a> and click the <a href="https://bsky.app/settings/app-passwords">link</a> to generate an app password. Copy the created password, return to <a href="https://tangled.sh/login">https://tangled.sh/login</a>, and sign in using your handle and the newly created app password.</p>

<p>Go to <a href="https://tangled.sh/knots">https://tangled.sh/knots</a>, enter your knot hostname and click on generate key. Copy <code class="language-plaintext highlighter-rouge">knot.env.template</code> to <code class="language-plaintext highlighter-rouge">.env</code> and enter the key in <code class="language-plaintext highlighter-rouge">KNOT_SERVER_SECRET</code>. In the same file, also set the server name.</p>

<p>The original <code class="language-plaintext highlighter-rouge">Dockerfile</code> didn’t quite work for me, as <code class="language-plaintext highlighter-rouge">adduser -D</code> (from alpine/busybox) leads to a disabled user which cannot sign in, even over SSH. Instead, I generate a random password for the <code class="language-plaintext highlighter-rouge">git</code> user.  My diff looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-    adduser -D -u 1000 -G git -h /home/git git &amp;&amp; \
+    pw="$(head -c 20 /dev/urandom | base64 | head -c 10)" \
+    printf "$pw\n$pw\n" | \
+    adduser -u 1000 -G git -h /home/git git &amp;&amp; \
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">docker compose up -d</code> then check on <a href="https://tangled.sh/knots">https://tangled.sh/knots</a>. Click on initialize and wait for the process to complete.</p>

<p>Add a remote repo as normal:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git remote add knot git@git.tunbury.org:mtelvers.tunbury.org/pi-archimedes
</code></pre></div></div>
<p>Then push as you would to any other remote</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git push knot
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bluesky" /><category term="tunbury.org" /><summary type="html"><![CDATA[To set this up, I’m using a modified version of Anil’s repo. My repo is here. Firstly, clone the repo and run gen-key.sh.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/bluesky-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/bluesky-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Bluesky Personal Data Server (PDS)</title><link href="https://www.tunbury.org/2025/03/15/bluesky-pds/" rel="alternate" type="text/html" title="Bluesky Personal Data Server (PDS)" /><published>2025-03-15T00:00:00+00:00</published><updated>2025-03-15T00:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/15/bluesky-pds</id><content type="html" xml:base="https://www.tunbury.org/2025/03/15/bluesky-pds/"><![CDATA[<p>Today I have set up my own Bluesky Personal Data Server (PDS).</p>

<p>I followed the README at
<a href="https://github.com/bluesky-social/pds">https://github.com/bluesky-social/pds</a>
using an Ubuntu 22.04 VM.  The basic steps are:</p>

<ol>
  <li>Publish DNS records pointing to your machine.</li>
  <li>As root, run <a href="https://raw.githubusercontent.com/bluesky-social/pds/main/installer.sh">installer.sh</a>.</li>
  <li>Enter your email address and preferred handle.</li>
</ol>

<p>It wasn’t entirely obvious how to set your handle to be the same
as the domain name when you have something else already published
on the domain, such as your web server.</p>

<p><a href="https://github.com/bluesky-social/pds/issues/103">Issue #103</a> shows how this should be achieved.</p>

<ol>
  <li>Publish the DNS record for <code class="language-plaintext highlighter-rouge">pds.yourdomain.com</code>.</li>
  <li>Use <code class="language-plaintext highlighter-rouge">pds.yourdomain.com</code> during setup.</li>
  <li>At the final stage, where a handle is created, use <code class="language-plaintext highlighter-rouge">tmphandle.pds.yourdomain.com</code>.</li>
  <li>Change the handle to your preferred one via the Bluesky app.</li>
</ol>

<p>Log in using the custom server <code class="language-plaintext highlighter-rouge">pds.yourdomain.com</code> and the handle you created.</p>

<p>Next, go to Account &gt; Handle and select ‘I have my own domain’. Enter
the domain name that you want as your new handle; in
my case, <code class="language-plaintext highlighter-rouge">mtelvers.tunbury.org</code>. Then publish a DNS TXT record
for <code class="language-plaintext highlighter-rouge">_atproto.mtelvers.tunbury.org</code> containing your DID record,
<code class="language-plaintext highlighter-rouge">did=did:plc:5le6ofipuf6sdk6czluurgjc</code>.</p>
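In zone-file form, the record looks something like this (the hostname and DID value are from my setup; substitute your own):

```
_atproto.mtelvers.tunbury.org.  IN  TXT  "did=did:plc:5le6ofipuf6sdk6czluurgjc"
```

You can check that the record has propagated with <code class="language-plaintext highlighter-rouge">dig TXT _atproto.mtelvers.tunbury.org +short</code>.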

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Check service status      : sudo systemctl status pds
Watch service logs        : sudo docker logs -f pds
Backup service data       : /pds
PDS Admin command         : pdsadmin

To see pdsadmin commands, run "pdsadmin help"
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bluesky" /><category term="tunbury.org" /><summary type="html"><![CDATA[Today I have set up my own Bluesky (PDS) Personal Data Server.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/bluesky-logo.png" /><media:content medium="image" url="https://www.tunbury.org/images/bluesky-logo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Pi Day - Archimedes Method</title><link href="https://www.tunbury.org/2025/03/14/pi-day/" rel="alternate" type="text/html" title="Pi Day - Archimedes Method" /><published>2025-03-14T13:00:00+00:00</published><updated>2025-03-14T13:00:00+00:00</updated><id>https://www.tunbury.org/2025/03/14/pi-day</id><content type="html" xml:base="https://www.tunbury.org/2025/03/14/pi-day/"><![CDATA[<p>It’s <a href="https://en.wikipedia.org/wiki/Pi_Day">Pi Day</a> 2025</p>

<p>Archimedes calculated the perimeters of regular polygons inscribed
in a circle to approximate the value of π.</p>

<p>A square inscribed in a unit circle can be divided into four right
triangles with two sides of unit length, corresponding to the radius of
the circle.  The third side can be calculated by Pythagoras’ theorem to
be √2.  The perimeter of the square would be 4√2.  Given C = πd, we
can calculate π from the circumference by dividing it by the diameter,
2, giving 2√2.</p>

<p><img src="/images/pi-archimedes-triangle.png" alt="" /></p>

<p>CA, CD and CB are all the unit radius. AB is √2 as calculated above. The
angle ACB can be bisected with the line CD. EB is half of AB. Using
Pythagoras’ theorem on the triangle BCE we can calculate CE. DE is then
1 - CE, allowing us to use Pythagoras’ theorem for a final time on BDE to
calculate BD. The improved approximation of the perimeter is now 8 × BD.</p>

<p>We can iterate on this process using the following code:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">rec</span> <span class="n">pi</span> <span class="n">edge_squared</span> <span class="n">sides</span> <span class="o">=</span> <span class="k">function</span>
  <span class="o">|</span> <span class="mi">0</span> <span class="o">-&gt;</span> <span class="n">sides</span> <span class="o">*.</span> <span class="nn">Float</span><span class="p">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">edge_squared</span><span class="p">)</span> <span class="o">/.</span> <span class="mi">2</span><span class="o">.</span>
  <span class="o">|</span> <span class="n">n</span> <span class="o">-&gt;</span>
    <span class="k">let</span> <span class="n">edge_squared</span> <span class="o">=</span> <span class="mi">2</span><span class="o">.</span> <span class="o">-.</span> <span class="mi">2</span><span class="o">.</span> <span class="o">*.</span> <span class="nn">Float</span><span class="p">.</span><span class="n">sqrt</span> <span class="p">(</span><span class="mi">1</span><span class="o">.</span> <span class="o">-.</span> <span class="n">edge_squared</span> <span class="o">/.</span> <span class="mi">4</span><span class="o">.</span><span class="p">)</span> <span class="k">in</span>
    <span class="k">let</span> <span class="n">sides</span> <span class="o">=</span> <span class="n">sides</span> <span class="o">*.</span> <span class="mi">2</span><span class="o">.</span> <span class="k">in</span>
    <span class="n">pi</span> <span class="n">edge_squared</span> <span class="n">sides</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>

<span class="k">let</span> <span class="n">approximation</span> <span class="o">=</span> <span class="n">pi</span> <span class="mi">2</span><span class="o">.</span> <span class="mi">4</span><span class="o">.</span> <span class="mi">13</span>
<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"pi %.31f</span><span class="se">\n</span><span class="s2">"</span> <span class="n">approximation</span>
</code></pre></div></div>

<p>I found this method quite interesting. Usually, as the number of
iterations increases, the approximation of π becomes more accurate,
with the delta between each step becoming smaller until the difference
is effectively zero (given the limited precision of the floating-point
calculation).  However, in this case, after 13 iterations the
approximation becomes worse!</p>
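<p>The loss of accuracy comes from catastrophic cancellation in the update step: as the squared edge length shrinks towards zero, 2 − 2√(1 − e/4) subtracts two nearly equal numbers. This small sketch (separate from the program above) makes the effect visible:</p>

```ocaml
(* Sketch of the cancellation in the recurrence: the update
   e' = 2 - 2*sqrt(1 - e/4) should tend to e/4 as e -> 0, but
   subtracting two nearly equal doubles destroys the low-order digits. *)
let update e = 2. -. 2. *. Float.sqrt (1. -. e /. 4.)

let () =
  List.iter
    (fun e ->
      let exact = e /. 4. in  (* leading term of the Taylor expansion *)
      Printf.printf "e = %g: computed %.17e (should be ~ %.17e)\n"
        e (update e) exact)
    [ 1e-6; 1e-10; 1e-14; 1e-16 ]
(* at e = 1e-16, 1. -. e /. 4. rounds to exactly 1.0, so the update
   collapses to 0, which is why the table bottoms out at iteration 28 *)
```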

<table>
  <thead>
    <tr>
      <th>iteration</th>
      <th>approximation</th>
      <th>% error</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>2.8284271247461902909492437174777</td>
      <td>9.968368</td>
    </tr>
    <tr>
      <td>1</td>
      <td>3.0614674589207178101446515938733</td>
      <td>2.550464</td>
    </tr>
    <tr>
      <td>2</td>
      <td>3.1214451522580528575190328410827</td>
      <td>0.641315</td>
    </tr>
    <tr>
      <td>3</td>
      <td>3.1365484905459406483885231864406</td>
      <td>0.160561</td>
    </tr>
    <tr>
      <td>4</td>
      <td>3.1403311569547391890466769837076</td>
      <td>0.040155</td>
    </tr>
    <tr>
      <td>5</td>
      <td>3.1412772509327568926096319046337</td>
      <td>0.010040</td>
    </tr>
    <tr>
      <td>6</td>
      <td>3.1415138011441454679584239784162</td>
      <td>0.002510</td>
    </tr>
    <tr>
      <td>7</td>
      <td>3.1415729403678827047485810908256</td>
      <td>0.000627</td>
    </tr>
    <tr>
      <td>8</td>
      <td>3.1415877252799608854161306226160</td>
      <td>0.000157</td>
    </tr>
    <tr>
      <td>9</td>
      <td>3.1415914215046352175875199463917</td>
      <td>0.000039</td>
    </tr>
    <tr>
      <td>10</td>
      <td>3.1415923456110768086091411532834</td>
      <td>0.000010</td>
    </tr>
    <tr>
      <td>11</td>
      <td>3.1415925765450043449789063743083</td>
      <td>0.000002</td>
    </tr>
    <tr>
      <td>12</td>
      <td>3.1415926334632482408437681442592</td>
      <td>0.000001</td>
    </tr>
    <tr>
      <td>13</td>
      <td>3.1415926548075892021927302266704</td>
      <td>-0.000000</td>
    </tr>
    <tr>
      <td>14</td>
      <td>3.1415926453212152935634549066890</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <td>15</td>
      <td>3.1415926073757196590463536267634</td>
      <td>0.000001</td>
    </tr>
    <tr>
      <td>16</td>
      <td>3.1415929109396727447744979144773</td>
      <td>-0.000008</td>
    </tr>
    <tr>
      <td>17</td>
      <td>3.1415941251951911006301543238806</td>
      <td>-0.000047</td>
    </tr>
    <tr>
      <td>18</td>
      <td>3.1415965537048196054570325941313</td>
      <td>-0.000124</td>
    </tr>
    <tr>
      <td>19</td>
      <td>3.1415965537048196054570325941313</td>
      <td>-0.000124</td>
    </tr>
    <tr>
      <td>20</td>
      <td>3.1416742650217575061333263874985</td>
      <td>-0.002598</td>
    </tr>
    <tr>
      <td>21</td>
      <td>3.1418296818892015309643284126651</td>
      <td>-0.007545</td>
    </tr>
    <tr>
      <td>22</td>
      <td>3.1424512724941338071005247911671</td>
      <td>-0.027331</td>
    </tr>
    <tr>
      <td>23</td>
      <td>3.1424512724941338071005247911671</td>
      <td>-0.027331</td>
    </tr>
    <tr>
      <td>24</td>
      <td>3.1622776601683795227870632515987</td>
      <td>-0.658424</td>
    </tr>
    <tr>
      <td>25</td>
      <td>3.1622776601683795227870632515987</td>
      <td>-0.658424</td>
    </tr>
    <tr>
      <td>26</td>
      <td>3.4641016151377543863532082468737</td>
      <td>-10.265779</td>
    </tr>
    <tr>
      <td>27</td>
      <td>4.0000000000000000000000000000000</td>
      <td>-27.323954</td>
    </tr>
    <tr>
      <td>28</td>
      <td>0.0000000000000000000000000000000</td>
      <td>100.000000</td>
    </tr>
  </tbody>
</table>

<p>Using the <a href="https://opam.ocaml.org/packages/decimal/">decimal</a> package
we can specify the floating-point precision we want, allowing us to
reach 100 decimal places in 165 iterations.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">open</span> <span class="nc">Decimal</span>

<span class="k">let</span> <span class="n">context</span> <span class="o">=</span> <span class="nn">Context</span><span class="p">.</span><span class="n">make</span> <span class="o">~</span><span class="n">prec</span><span class="o">:</span><span class="mi">200</span> <span class="bp">()</span>
<span class="k">let</span> <span class="n">two</span> <span class="o">=</span> <span class="n">of_int</span> <span class="mi">2</span>
<span class="k">let</span> <span class="n">four</span> <span class="o">=</span> <span class="n">of_int</span> <span class="mi">4</span>

<span class="k">let</span> <span class="k">rec</span> <span class="n">pi</span> <span class="n">edge_squared</span> <span class="n">sides</span> <span class="n">n</span> <span class="o">=</span>
  <span class="k">match</span> <span class="n">n</span> <span class="k">with</span>
  <span class="o">|</span> <span class="mi">0</span> <span class="o">-&gt;</span> <span class="n">mul</span> <span class="o">~</span><span class="n">context</span> <span class="n">sides</span> <span class="p">(</span><span class="n">div</span> <span class="o">~</span><span class="n">context</span> <span class="p">(</span><span class="n">sqrt</span> <span class="o">~</span><span class="n">context</span> <span class="n">edge_squared</span><span class="p">)</span> <span class="n">two</span><span class="p">)</span>
  <span class="o">|</span> <span class="n">n</span> <span class="o">-&gt;</span>
      <span class="k">let</span> <span class="n">edge_squared</span> <span class="o">=</span>
        <span class="n">sub</span> <span class="o">~</span><span class="n">context</span> <span class="n">two</span>
          <span class="p">(</span><span class="n">mul</span> <span class="o">~</span><span class="n">context</span> <span class="n">two</span>
             <span class="p">(</span><span class="n">sqrt</span> <span class="o">~</span><span class="n">context</span> <span class="p">(</span><span class="n">sub</span> <span class="o">~</span><span class="n">context</span> <span class="n">one</span> <span class="p">(</span><span class="n">div</span> <span class="o">~</span><span class="n">context</span> <span class="n">edge_squared</span> <span class="n">four</span><span class="p">))))</span>
      <span class="k">in</span>
      <span class="k">let</span> <span class="n">sides</span> <span class="o">=</span> <span class="n">mul</span> <span class="o">~</span><span class="n">context</span> <span class="n">sides</span> <span class="n">two</span> <span class="k">in</span>
      <span class="n">pi</span> <span class="n">edge_squared</span> <span class="n">sides</span> <span class="p">(</span><span class="nn">Int</span><span class="p">.</span><span class="n">pred</span> <span class="n">n</span><span class="p">)</span>

<span class="k">let</span> <span class="bp">()</span> <span class="o">=</span> <span class="n">pi</span> <span class="n">two</span> <span class="n">four</span> <span class="mi">165</span> <span class="o">|&gt;</span> <span class="n">to_string</span> <span class="o">~</span><span class="n">context</span> <span class="o">|&gt;</span> <span class="nn">Printf</span><span class="p">.</span><span class="n">printf</span> <span class="s2">"%s</span><span class="se">\n</span><span class="s2">"</span>
</code></pre></div></div>

<p>This code is available on <a href="https://github.com/mtelvers/pi-archimedes">GitHub</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="pi" /><category term="tunbury.org" /><summary type="html"><![CDATA[It’s Pi Day 2025]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pi.png" /><media:content medium="image" url="https://www.tunbury.org/images/pi.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Deepseek R1 on a Raspberry Pi</title><link href="https://www.tunbury.org/2025/03/12/deekseek-r1-on-raspberry-pi/" rel="alternate" type="text/html" title="Deepseek R1 on a Raspberry Pi" /><published>2025-03-12T20:15:00+00:00</published><updated>2025-03-12T20:15:00+00:00</updated><id>https://www.tunbury.org/2025/03/12/deekseek-r1-on-raspberry-pi</id><content type="html" xml:base="https://www.tunbury.org/2025/03/12/deekseek-r1-on-raspberry-pi/"><![CDATA[<p>I’ve heard a lot about Deepseek and wanted to try it for myself.</p>

<p>Using a Raspberry Pi 5 with 8GB of RAM and an NVMe, I installed Ollama:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-fsSL</span> https://ollama.com/install.sh | sh
</code></pre></div></div>

<p>Next, I downloaded the smallest Deepseek model</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ollama pull deepseek-r1:1.5b
</code></pre></div></div>

<p>And ran it using:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ollama run deepseek-r1:1.5b
</code></pre></div></div>

<p>Things went downhill after that, as I couldn’t get a complete response.
The Pi would lock up with no output on the display:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ollama run deepseek-r1:1.5b
&gt;&gt;&gt; what is the capital of england?
&lt;think&gt;
Okay, so I need to figure out whatclient_loop: send disconnect: Broken pipe
</code></pre></div></div>

<p>Switching from the Apple iPad charger I was using to an official Pi
power supply resolved this problem.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ollama run deepseek-r1:1.5b
&gt;&gt;&gt; what is the capital of england?
&lt;think&gt;

&lt;/think&gt;

The capital of England is London.

&gt;&gt;&gt;
</code></pre></div></div>

<p>However, it’s not so good with a more niche question:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ollama run deepseek-r1:1.5b
&gt;&gt;&gt; what is OCaml?
&lt;think&gt;

&lt;/think&gt;

OCaml is a high-level, multi-paradigm programming language that combines
features from several other languages and frameworks. It was created by
Martin Abadi, the founding father of Google, and has gained popularity
in both academic research and industrial applications.

...
</code></pre></div></div>

<p>How about an easier question?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ollama run deepseek-r1:1.5b
&gt;&gt;&gt; Who invented the C programming language?
&lt;think&gt;
Okay, so I need to figure out who invented the C programming
language. Hmm, I remember that C is a high-level programming language
used for system programming and embedded systems because it's efficient
and portable. But I'm not sure about its origin.

First, I'll think about other languages that are often mentioned
before C. For example, C was developed by Dennis Ritchie in the late
1960s. He wrote the header files, which made the language more readable
and maintainable. He probably worked for Bell Labs since that's where
Dennis is associated.

I wonder if there were any other contributors. Maybe someone else helped
develop parts of the language before Ritchie did everything. But I don't
recall any major alternatives to C being widely used before it became
dominant in certain areas like operating systems and embedded devices.

So, putting it all together, Dennis Ritchie is credited with inventing
C. He created a lot of header files that made the language structured
and easier to use, which helped it become very popular for its efficiency
and portability.
&lt;/think&gt;

The C programming language was invented by Dennis Ritchie, who worked at
Bell Labs in the late 1960s. His work led to the creation of significant
header files that enhanced readability and maintainability. Ritchie's
contributions made C a widely used language in areas like operating
systems and embedded systems.

**Answer:**

Dennis Ritchie invented the C programming language. He worked at Bell Labs
(now part of Tufts University) in the late 1960s, contributing to its
development by creating essential header files that made the language
more readable and maintainable. His work led to a widely popularized
language used for system programming and embedded systems.

**Step-by-Step Explanation:**

1. **Identify the Problem:** The question is about the origin of C,
an early high-level programming language.
2. **Recall Other Languages Before C:** Mention languages like Pascal,
Ada, and Modula-2 that were around before C.
3. **Consider Developers Known for C:** Dennis Ritchie stands out as a
key figure in C's development.
4. **Details about Ritchie’s Contributions:** He worked at Bell Labs
(now Tufts University) in the 1960s, creating header files to improve
C's structure and readability.
5. **Link to C Language Development:** This led to C becoming prominent
for system programming due to its efficiency and portability.

**Conclusion:**
Dennis Ritchie is credited with inventing C by developing header files
that enhanced the language’s usability and structure.
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberry-pi" /><category term="tunbury.org" /><summary type="html"><![CDATA[I’ve heard a lot about Deepseek and wanted to try it for myself.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pi5.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/pi5.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Arduino PWM Train Controller</title><link href="https://www.tunbury.org/2025/01/18/arduino-pwm-train-controller/" rel="alternate" type="text/html" title="Arduino PWM Train Controller" /><published>2025-01-18T15:15:00+00:00</published><updated>2025-01-18T15:15:00+00:00</updated><id>https://www.tunbury.org/2025/01/18/arduino-pwm-train-controller</id><content type="html" xml:base="https://www.tunbury.org/2025/01/18/arduino-pwm-train-controller/"><![CDATA[<h1 id="circuit">Circuit</h1>

<p><img src="/images/train-controller-diagram.png" alt="" /></p>

<h1 id="case">Case</h1>

<p>3D printable STL files are available for download: <a href="/images/train-controller.stl">STL files</a></p>

<p><img src="/images/train-controller-fusion-360.png" alt="" /></p>

<h1 id="arduino-code">Arduino Code</h1>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/*
 * Arduino Nano PWM Dual Train Controller
 * This sketch reads values from two potentiometers connected to A0 and A1
 * and uses these values to control the speed and direction of a motor via
 * an L298N motor driver. The motor speed is controlled using PWM signals
 * on pins D5 and D10, and the direction is controlled using digital signals
 * on pins D6, D7, D8, and D9.
 */

// Pin definitions
const int potLeftPin = A0;
const int potRightPin = A1;
const int enaPin = 10;
const int in1Pin = 9;
const int in2Pin = 8;
const int in3Pin = 7;
const int in4Pin = 6;
const int enbPin = 5;

void setup() {
  // Initialize serial communication
  Serial.begin(9600);

  // Set motor control pins as outputs
  pinMode(enbPin, OUTPUT);
  pinMode(enaPin, OUTPUT);
  pinMode(in1Pin, OUTPUT);
  pinMode(in2Pin, OUTPUT);
  pinMode(in3Pin, OUTPUT);
  pinMode(in4Pin, OUTPUT);
}

void loop() {
  // Read potentiometer values
  int potLeft = analogRead(potLeftPin);
  int potRight = analogRead(potRightPin);

  // Map potentiometer values to PWM range
  int pwmLeft = pow(potLeft - 512, 2) / 1024;
  int pwmRight = pow(potRight - 512, 2) / 1024;

  // Control motor speed and direction
  analogWrite(enaPin, pwmLeft);
  analogWrite(enbPin, pwmRight);

  // Set motor direction based on potentiometer values
  if (potLeft &lt; 512) {
    digitalWrite(in1Pin, LOW);
    digitalWrite(in2Pin, HIGH);
  } else {
    digitalWrite(in1Pin, HIGH);
    digitalWrite(in2Pin, LOW);
  }

  if (potRight &lt; 512) {
    digitalWrite(in3Pin, LOW);
    digitalWrite(in4Pin, HIGH);
  } else {
    digitalWrite(in3Pin, HIGH);
    digitalWrite(in4Pin, LOW);
  }

  // Print values to serial monitor for debugging
  Serial.print("potLeft: ");
  Serial.print(potLeft);
  Serial.print(" PWMLeft: ");
  Serial.print(pwmLeft);
  Serial.print(" potRight: ");
  Serial.print(potRight);
  Serial.print(" PWMRight: ");
  Serial.println(pwmRight);

  // Small delay to stabilize readings
  delay(100);
}
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="3d-printing" /><category term="tunbury.org" /><summary type="html"><![CDATA[Circuit]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/train-controller-photo.png" /><media:content medium="image" url="https://www.tunbury.org/images/train-controller-photo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">3d Printed Train</title><link href="https://www.tunbury.org/2023/08/08/3d-printed-train/" rel="alternate" type="text/html" title="3d Printed Train" /><published>2023-08-08T07:41:29+00:00</published><updated>2023-08-08T07:41:29+00:00</updated><id>https://www.tunbury.org/2023/08/08/3d-printed-train</id><content type="html" xml:base="https://www.tunbury.org/2023/08/08/3d-printed-train/"><![CDATA[<p>Creating a new OO train body drawn from scratch in Fusion 360 to mimic
the original damaged version.</p>

<h1 id="early-versions">Early versions</h1>

<p><img src="/images/IMG_1919.jpg" alt="" />
<img src="/images/IMG_1918.jpg" alt="" /></p>

<h1 id="printed-with-tree-support">Printed with tree support</h1>

<p><img src="/images/IMG_1917.jpg" alt="" /></p>

<h1 id="finished">Finished</h1>

<p><img src="/images/IMG_1920.jpg" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="3d-printing" /><category term="tunbury.org" /><summary type="html"><![CDATA[Creating a new OO train body drawn from scratch in Fusion 360 to mimic the original damaged version.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/IMG_1920.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/IMG_1920.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Foot Operated Timer</title><link href="https://www.tunbury.org/2021/09/04/foot-operated-timer/" rel="alternate" type="text/html" title="Foot Operated Timer" /><published>2021-09-04T19:41:29+00:00</published><updated>2021-09-04T19:41:29+00:00</updated><id>https://www.tunbury.org/2021/09/04/foot-operated-timer</id><content type="html" xml:base="https://www.tunbury.org/2021/09/04/foot-operated-timer/"><![CDATA[<p>At the end of a quarter peal there is always the question of how long it took and whether anyone really noted the start time.  Mike proposed a foot operated timer.</p>

<p>I wanted the display to be large enough that it can be seen while standing and I choose this <a href="https://www.amazon.co.uk/gp/product/B08BC8JY8T/">MAX7219 dot matrix display from Amazon</a>.  This turned out to be a bit of a bad purchase but more on that later.</p>

<p>Using <a href="https://www.festi.info/boxes.py/">boxes.py</a>, I created a basic box just large enough to accommodate the display, battery, on/off switch and foot switch, then modified the design in Adobe Illustrator to shorten the top and add a <em>shelf</em> for the display to sit on.</p>

<p><img src="/images/foot-operated-timer-net.png" alt="net" /></p>

<p>This was cut on the laser cutter.</p>

<p><img src="/images/foot-operated-timer-laser-cutting.jpg" alt="net" /></p>

<p>When assembling the electronics it became apparent that it would have been better to have a slightly taller box, but rather than waste the materials I decided to mount the Arduino upside down thereby fitting in a height of 12mm.</p>

<p><img src="/images/foot-operated-timer-arduino.jpg" alt="Arduino" /></p>

<p>The DS3231 real time clock module was modified by bending the pins to fit in with the vero board spacing.  Ultimately the battery holder was also removed to save space.</p>

<p><img src="/images/foot-operated-timer-clock-module.jpg" alt="DS3231" /></p>

<p>The vero board was drilled to cut the tracks.</p>

<p><img src="/images/foot-operated-timer-vero-board.jpg" alt="Vero Board" /></p>

<p><img src="/images/foot-operated-timer-assembly.jpg" alt="Vero Board" /></p>

<p>After the initial assembly, the unit was tested on battery for the first time.  This showed that it didn’t actually run on batteries: the code just crashed randomly after the display was initialised.  Reading this <a href="https://arduinoplusplus.wordpress.com/2015/09/12/max7219-and-led-matrix-power-requirements/">post</a> online, I found the problem lay with cheap display units!</p>

<blockquote>
  <p>Most of the cheap generic modules have very low values for RSET, which would significantly increase the power/current required by the module. This seems to be 10kΩ for the eBay specials, for a segment current exceeding 40mA, the specified minimum value for RSET in Table 11 being 11.8kΩ for VLED = 2V.</p>
</blockquote>

<p>The full data sheet is available from <a href="https://datasheets.maximintegrated.com/en/ds/MAX7219-MAX7221.pdf">Maxim</a></p>

<p>I had some 100kΩ surface-mount resistors in 0603 format left over from another project.  These were smaller than the 0805-format resistors fitted, but they were relatively easy to change.  Fortunately, these fixed the problem.</p>

<p>As an afterthought, a voltage divider was added to pin A0 to measure the battery voltage.</p>

<p><img src="/images/foot-operated-timer-voltage-divider.jpg" alt="Vero Board" /></p>

<p>I wired the I2C bus from the Arduino to the DS3231 and the square wave output from the DS3231 to pin 2 on the Arduino.  Pin 3 was connected to the push button.  On the Arduino Nano only pin 2 and 3 can be used for interrupts.  This configuration gave lots of options when it came to the code which wasn’t actually written yet!</p>

<p><img src="/images/foot-operated-timer-electronics.jpg" alt="Electrionics" /></p>

<p>Assembling the rest of the box was straightforward, although a bit fiddly.</p>

<p><img src="/images/foot-operated-timer-off.jpg" alt="Finished project" /></p>

<p>The code is available on <a href="https://github.com/mtelvers/foot-timer">GitHub</a></p>

<p><img src="/images/foot-operated-timer.jpg" alt="Finished project running" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="arduino" /><category term="tunbury.org" /><summary type="html"><![CDATA[At the end of a quarter peal there is always the question of how long it took and whether anyone really noted the start time. Mike proposed a foot operated timer.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/foot-operated-timer.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/foot-operated-timer.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Arduino Gas Sensor</title><link href="https://www.tunbury.org/2021/08/29/arduino-gas-sensor/" rel="alternate" type="text/html" title="Arduino Gas Sensor" /><published>2021-08-29T19:41:29+00:00</published><updated>2021-08-29T19:41:29+00:00</updated><id>https://www.tunbury.org/2021/08/29/arduino-gas-sensor</id><content type="html" xml:base="https://www.tunbury.org/2021/08/29/arduino-gas-sensor/"><![CDATA[<p>With the current emphasis on ventilation to reduce the risks associated with inhaled droplets, I have put together a simple gas sensor to record concentrations over time.  The output is a <code class="language-plaintext highlighter-rouge">CSV</code> file which can be graphed in Excel.</p>

<p>I used an Arduino Nano for this project, which imposed some serious memory constraints on the code, particularly as I needed libraries for the real-time clock, SD card and OLED display.</p>

<p>The modules used are:</p>
<ul>
  <li><a href="https://www.amazon.co.uk/dp/B072BMYZ18/ref=cm_sw_em_r_mt_dp_dl_WPWV0XM72DEW1A4HBDGE?_encoding=UTF8&amp;psc=1">Arduino Nano</a></li>
  <li><a href="https://www.amazon.co.uk/dp/B07BRFL7V7/ref=cm_sw_em_r_mt_dp_K5YWV6VZJJRT1D4WF9VJ?_encoding=UTF8&amp;psc=1">DS3231 Real time clock</a></li>
  <li><a href="https://www.amazon.co.uk/dp/B01L9GC470/ref=cm_sw_em_r_mt_dp_QQ8BPJQJP4G62QVRSNS3">SSD1306 OLED display</a></li>
  <li><a href="https://www.amazon.co.uk/dp/B077MB17JB/ref=cm_sw_em_r_mt_dp_WYZQY0ZZKJRPV83WH8R3">SD card reader</a></li>
  <li><a href="https://www.amazon.co.uk/dp/B07CYYB82F/ref=cm_sw_em_r_mt_dp_9S4XZ9QD8NBH1V6M7HV5">Gas sensor</a></li>
</ul>

<h2 id="hardware-connections">Hardware Connections</h2>

<p>I used a veroboard to assemble the circuit as follows</p>
<ol>
  <li>Scatter the modules around the board and solder all VCC and GND pins</li>
  <li>On the Arduino Nano, pins A4 and A5 are used for the Inter-Integrated Circuit (I2C) bus
    <ul>
      <li>Connect SDA (A4 on Nano) to the display and clock module’s SDA pin</li>
      <li>Connect SCL (A5 on Nano) to the display and clock module’s SCL pin</li>
    </ul>
  </li>
</ol>

<blockquote>
  <p>At this point, the clock and display module can be tested and the time set on the clock.</p>
</blockquote>

<ol>
  <li>Connect the A0 output from the gas sensor to the A0 pin on the Arduino</li>
</ol>

<blockquote>
  <p>Reading from A0 returns an integer between 0 and 1023 representing a gas concentration between 200 - 10000 ppm</p>
</blockquote>
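<p>For illustration, a simple linear mapping from the raw reading to a ppm figure might look like the helper below.  The MQ-2 response is actually non-linear, so treat this as a rough sketch rather than a calibration; the <code class="language-plaintext highlighter-rouge">approxPpm</code> function is my own illustration and is not part of the sketch on GitHub.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;

// Rough linear interpolation: ADC 0..1023 maps onto 200..10000 ppm.
// The real MQ-2 curve is non-linear; this is only an approximation.
uint16_t approxPpm(uint16_t adc) {
  return 200 + (uint32_t)adc * (10000 - 200) / 1023;
}
</code></pre></div></div>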

<ol>
  <li>The SD card uses the Serial Peripheral Interface (SPI) and requires 4 connections
    <ul>
      <li>Nano D10 to CS on the SD card module</li>
      <li>Nano D11 to MOSI on the SD card module</li>
      <li>Nano D12 to MISO on the SD card module</li>
      <li>Nano D13 to SCK on the SD card module</li>
    </ul>
  </li>
</ol>

<p>With the wiring complete, load the Arduino sketch from my <a href="https://github.com/mtelvers/Arduino-MQ2/blob/113a2348ce65966b738dc55d9ddace36824ec49f/mq2.ino">GitHub page</a>.</p>

<h2 id="software-overview">Software Overview</h2>

<p>After the basic library initialization, the code creates two 64-element arrays to store the samples taken each second and the averages of those samples calculated each minute.  These arrays hold the latest sample in the first position, so before a new value is added all the other values are shifted down by one.  There would certainly be more efficient ways of handling this, but with a small number of values this simple approach is workable.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#define SAMPLES 64
uint16_t historySeconds[SAMPLES];
uint16_t historyMinutes[SAMPLES];
</code></pre></div></div>
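<p>The shift-down insert can be seen in isolation below; it mirrors the <code class="language-plaintext highlighter-rouge">memmove</code> calls used later in the sketch.  The <code class="language-plaintext highlighter-rouge">pushSample</code> wrapper is just for illustration — the sketch itself inlines this logic.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

#define SAMPLES 64

// Shift every element down one place and store the newest value first,
// discarding the oldest value off the end of the array.
void pushSample(uint16_t *history, uint16_t value) {
  memmove(history + 1, history, (SAMPLES - 1) * sizeof(uint16_t));
  history[0] = value;
}
</code></pre></div></div>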

<p>The <em>main</em> loop of the program remembers the number of seconds on the clock in the variable <code class="language-plaintext highlighter-rouge">lastS</code> and waits for it to change, thus running the inner code once per second:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int lastS = -1;

void loop(void) {
  DateTime dt = RTClib::now();

  if (lastS != dt.second()) {
    lastS = dt.second();

  // Inner code here runs once each second

  }
  delay(250);
}
</code></pre></div></div>

<p>The inner code clears the display,</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>u8x8.clear();
u8x8.setCursor(0, 0);
</code></pre></div></div>

<p>and then writes the date</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>toString(tmp, dt.year() - 2000, dt.month(), dt.day(), '-');
u8x8.println(tmp);
</code></pre></div></div>
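<p><code class="language-plaintext highlighter-rouge">toString</code> is a small helper defined in the sketch on GitHub; judging by how it is called, it formats three two-digit values joined by a separator, something like the version below.  This is a guess at its behaviour, not the actual implementation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdio.h&gt;

// Assumed behaviour of the sketch's toString helper: three two-digit
// fields joined by a separator, e.g. (21, 8, 29, '-') gives "21-08-29".
void toString(char *buf, int a, int b, int c, char sep) {
  sprintf(buf, "%02d%c%02d%c%02d", a, sep, b, sep, c);
}
</code></pre></div></div>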

<p>If the time has just rolled over to a new minute (i.e. number of seconds is 0), take an average of the <em>seconds</em> samples and store that as the minute average.  Finally, open a file named with the current date.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (dt.second() == 0) {
  unsigned long total = 0;
  for (int h = 0; h &lt; SAMPLES; h++)
    total += historySeconds[h];
  memmove(historyMinutes + 1, historyMinutes, (SAMPLES - 1) * sizeof(uint16_t));
  historyMinutes[0] = total / SAMPLES;
  strcat(tmp, ".csv");
  txtFile = SD.open(tmp, FILE_WRITE);
}
</code></pre></div></div>

<p>Read the next gas value and store it</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uint16_t gasVal = analogRead(0);
memmove(historySeconds + 1, historySeconds, (SAMPLES - 1) * sizeof(uint16_t));
historySeconds[0] = gasVal;
</code></pre></div></div>

<p>Display the current time</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>toString(tmp, dt.hour(), dt.minute(), dt.second(), ':');
u8x8.println(tmp);
</code></pre></div></div>

<p>If there’s a file open, write the time to the file</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (txtFile) {
  strcat(tmp, ",");
  txtFile.print(tmp);
}
</code></pre></div></div>

<p>Display the gas value</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>itoa(gasVal, tmp, 10);
u8x8.println(tmp);
</code></pre></div></div>

<p>And similarly, if there is a file open, write the current value to the file and close it</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if (txtFile) {
  txtFile.println(tmp);
  txtFile.close();
}
</code></pre></div></div>

<p>Lastly, draw two graphs of the current samples</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drawGraph(8, 3, historySeconds);
drawGraph(8, 7, historyMinutes);
</code></pre></div></div>

<p>The graphs were tricky to draw as the slimmed-down U8x8 version of the <a href="https://github.com/olikraus/u8g2">U8g2</a> library doesn’t provide any drawing functions.  However, you can create and display a custom font glyph.  This mess of nested loops creates thirty-two 8 by 8 pixel glyphs to display a bar graph of 64 values with a maximum <em>y</em> value of 32.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>void drawGraph(uint8_t col, uint8_t row, uint16_t *values) {
  uint8_t tmp[8];
  for (uint8_t r = 0; r &lt; 4; r++) {
    for (uint8_t h = 0; h &lt; SAMPLES; h += 8) {
      for (uint8_t i = 0; i &lt; 8; i++) {
        int x = values[SAMPLES - h - 1 - i] / 16;
        x -= 8 * r;
        tmp[i] = 0;
        for (uint8_t b = 0; b &lt; 8 &amp;&amp; x &gt; 0; b++, x--) {
          if (x) {
            tmp[i] |= (1 &lt;&lt; (7 - b));
          }
        }
      }
      u8x8.drawTile(col + h / 8, row - r, 1, tmp);
    }
  }
}
</code></pre></div></div>
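<p>As a concrete check of the bit loop, the column byte for the bottom row of tiles (<code class="language-plaintext highlighter-rouge">r = 0</code>) can be computed for a single sample; this is the inner loop above extracted into a standalone function for illustration.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;stdint.h&gt;

// One 8-pixel column of the bar for tile row r = 0: scale the sample
// down to 0..63 pixels and set one bit per remaining pixel, as in
// drawGraph's inner loop.
uint8_t columnByte(uint16_t value) {
  int x = value / 16;
  uint8_t col = 0;
  for (uint8_t b = 0; b &lt; 8 &amp;&amp; x &gt; 0; b++, x--)
    col |= (uint8_t)(1 &lt;&lt; (7 - b));
  return col;
}
</code></pre></div></div>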

<p>The graph below shows the recording during morning ringing and during the quarter peal in the afternoon (plus some messing around blowing directly into the sensor at the end).  Windows open as usual!</p>

<p><img src="/images/sample-values-recorded.png" alt="Graph" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="arduino" /><summary type="html"><![CDATA[With the current emphasis on ventilation to reduce the risks associated with inhaled droplets, I have put together a simple gas sensor to record concentrations over time. The output is a CSV file which can be graphed in Excel.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/gas-sensor.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/gas-sensor.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ratchet Adapter</title><link href="https://www.tunbury.org/2021/08/16/ratchet-adapter/" rel="alternate" type="text/html" title="Ratchet Adapter" /><published>2021-08-16T19:41:29+00:00</published><updated>2021-08-16T19:41:29+00:00</updated><id>https://www.tunbury.org/2021/08/16/ratchet-adapter</id><content type="html" xml:base="https://www.tunbury.org/2021/08/16/ratchet-adapter/"><![CDATA[<p>I want to electrically drive this ratchet mechanism to avoid the manual labour of turning it by hand.  I found a motor with a 1600:1 gearbox on eBay (shipping from China of course) which looks perfect; however, it has a 10mm diameter keyed output shaft which doesn’t nicely couple to my 3/4” square ratchet shaft.</p>

<p><img src="/images/ratchet-with-pipe.png" alt="Ratchet with pipe" /></p>

<p>From the photo it is clear that a 1” steel tube fits reasonably well over the shaft.  A wooden plug and a little bit of brute force provided a flat surface which was pre-drilled and a flange screwed on.</p>

<p><img src="/images/wooden-block.png" alt="Wooden block version" /></p>

<p>This worked fairly well except that the grub screw on the flange was insufficient to withstand the forces required.  Therefore a keyway was cut into the flange to prevent slipping.</p>

<p><img src="/images/flang-key-1.png" alt="Flange with keyway" /></p>

<p>And a key was made to fit.</p>

<p><img src="/images/flang-key-2.png" alt="Flange with key" /></p>

<p>This worked very well, but unfortunately about two years later things took a nasty turn. One of the screws snapped and others were about to pull out.</p>

<p><img src="/images/wear-and-tear.png" alt="Wear and tear" /></p>

<p>Taking the 1” tube and turning it sideways gave a metal surface onto which the flange could be bolted.  Cutting a hole in the bottom side of the tube would accommodate the 3/4” ratchet shaft.</p>

<p><img src="/images/ratchet-connector-with-cutout.png" alt="Pipe with holes and cutout" /></p>

<p>And with the flange in place it looks ready for use.</p>

<p><img src="/images/ratchet-connector-flang.png" alt="Flange in place" /></p>

<p>Hopefully this will last a little longer this time.</p>

<p><img src="/images/in-operation.png" alt="Ready for operation" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="obs" /><summary type="html"><![CDATA[I want to electrically drive this ratchet mechanism to avoid the manual labour of turning it by hand. I found a motor with a 1600:1 gearbox on eBay (shipping from China of course) which looks perfect; however, it has a 10mm diameter keyed output shaft which doesn’t nicely couple to my 3/4” square ratchet shaft.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/in-operation.png" /><media:content medium="image" url="https://www.tunbury.org/images/in-operation.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Audio Stream from a Raspberry PI</title><link href="https://www.tunbury.org/2021/07/27/audio-stream/" rel="alternate" type="text/html" title="Audio Stream from a Raspberry PI" /><published>2021-07-27T19:41:29+00:00</published><updated>2021-07-27T19:41:29+00:00</updated><id>https://www.tunbury.org/2021/07/27/audio-stream</id><content type="html" xml:base="https://www.tunbury.org/2021/07/27/audio-stream/"><![CDATA[<p>Now singing has returned to churches, I need to add an additional microphone to pick up the choir.  I’d like this to be completely separate from the Church PA system to avoid playing this sound out through the speakers.  A Raspberry PI Zero W with a USB sound card looks to be a good option to capture the audio and stream it to OBS.</p>

<p>Run <code class="language-plaintext highlighter-rouge">arecord -l</code> to get a list of available mixer devices.  In my case my USB audio device is #2.</p>

<p>Set the mixer level for the microphone:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>amixer -c 2 -q set 'Mic',0 100%
</code></pre></div></div>

<p>Install <code class="language-plaintext highlighter-rouge">ffmpeg</code>, which pulls down around 750MB on a lite installation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install ffmpeg
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">ffmpeg</code> to create the stream specifying the mixer device name as the input <code class="language-plaintext highlighter-rouge">-i</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ffmpeg -ar 44100 -ac 1 -f alsa -i plughw:2,0 -f wav -listen 1 tcp://0.0.0.0:5002
</code></pre></div></div>

<p>You can play this stream in VideoLAN’s VLC via <em>Open Network Stream</em> <code class="language-plaintext highlighter-rouge">tcp/wav://192.168.1.104:5002</code> where 192.168.1.104 is the IP address of the PI.</p>

<p>In OBS, create a new Media Source, set the network buffer to zero (to avoid excessive delay), and turn off <em>Restart playback when source becomes active</em>, which keeps the stream alive even when it’s not the active scene:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tcp://192.168.1.104:5002
</code></pre></div></div>

<p>Wrap the ffmpeg command as a service by creating <code class="language-plaintext highlighter-rouge">/etc/systemd/system/stream.service</code> containing</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
Description=auto start stream
After=multi-user.target

[Service]
Type=simple
ExecStartPre=/usr/bin/amixer -c 2 -q set 'Mic',0 100%
ExecStart=/usr/bin/ffmpeg -ar 44100 -ac 1 -f alsa -i plughw:2,0 -f wav -listen 1 tcp://0.0.0.0:5002
User=pi
WorkingDirectory=/home/pi
Restart=always

[Install]
WantedBy=multi-user.target
</code></pre></div></div>

<p>Enable and start the service as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl enable stream
service stream start
</code></pre></div></div>

<h2 id="practical-issues">Practical Issues</h2>

<p>After successfully testing a Raspberry PI Zero W with a USB audio dongle over WiFi at a distance of 30m in an empty church, I decided to use it as a secondary device in a live broadcast.  This was immediately scuppered on the day as I was unable to maintain the WiFi link.  I put this down to interference created by the in-house PA system, induction loop, and the mobile phones of the congregation.</p>

<p>I added a UFL connector to the Pi Zero W as described by <a href="https://www.briandorey.com/post/raspberry-pi-zero-w-external-antenna-mod">Brian Dorey</a>.  Using this with a 5dB D-Link antenna did marginally increase the signal level and quality of most networks but not sufficiently to make the difference.</p>

<h3 id="internal-antenna">Internal antenna</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pi@raspberrypi:~ $ sudo iwlist wlan0 scan | grep 'Cell\|Signal' | sed '$!N;s/\n/ /'
          Cell 01 - Address: 6C:xx:xx:xx:xx:10                     Quality=69/70  Signal level=-41 dBm  
          Cell 02 - Address: 5C:xx:xx:xx:xx:9E                     Quality=26/70  Signal level=-84 dBm  
          Cell 03 - Address: 5E:xx:xx:xx:xx:9F                     Quality=27/70  Signal level=-83 dBm  
          Cell 04 - Address: 9C:xx:xx:xx:xx:62                     Quality=35/70  Signal level=-75 dBm  
          Cell 05 - Address: 78:xx:xx:xx:xx:8E                     Quality=21/70  Signal level=-89 dBm  
          Cell 06 - Address: 9C:xx:xx:xx:xx:72                     Quality=37/70  Signal level=-73 dBm  
          Cell 07 - Address: 80:xx:xx:xx:xx:6A                     Quality=17/70  Signal level=-93 dBm  
</code></pre></div></div>

<h3 id="external-antenna">External antenna</h3>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pi@raspberrypi:~ $ sudo iwlist wlan0 scan | grep 'Cell\|Signal' | sed '$!N;s/\n/ /'
          Cell 01 - Address: 6C:xx:xx:xx:xx:10                     Quality=70/70  Signal level=-29 dBm  
          Cell 02 - Address: 5C:xx:xx:xx:xx:9E                     Quality=22/70  Signal level=-88 dBm  
          Cell 03 - Address: 5E:xx:xx:xx:xx:9F                     Quality=23/70  Signal level=-87 dBm  
          Cell 04 - Address: 9C:xx:xx:xx:xx:62                     Quality=41/70  Signal level=-69 dBm  
          Cell 05 - Address: 78:xx:xx:xx:xx:8E                     Quality=30/70  Signal level=-80 dBm  
          Cell 06 - Address: 9C:xx:xx:xx:xx:72                     Quality=41/70  Signal level=-69 dBm  
          Cell 07 - Address: 80:xx:xx:xx:xx:6A                     Quality=24/70  Signal level=-86 dBm  
</code></pre></div></div>

<p>Switching to a Raspberry PI 3 gave easy access to an Ethernet port without resorting to a USB hub.  Following that there were no further connection issues!</p>

<p><code class="language-plaintext highlighter-rouge">FFMPEG</code> can also create an MP3 stream rather than a WAV stream by simply changing the output format <code class="language-plaintext highlighter-rouge">-f mp3</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/usr/bin/ffmpeg -ar 44100 -ac 1 -f alsa -i plughw:2,0 -f mp3 -listen 1 tcp://0.0.0.0:5002
</code></pre></div></div>

<p>The Raspberry PI 3 didn’t really have sufficient processing capacity to keep up with the MP3 encoding.  Switching to MP2, <code class="language-plaintext highlighter-rouge">-f mp2</code>, reduced the processor requirement significantly with no noticeable change in quality.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="obs" /><summary type="html"><![CDATA[Now singing has returned to churches, I need to add an additional microphone to pick up the choir. I’d like this to be completely separate from the Church PA system to avoid playing this sound out through the speakers. A Raspberry PI Zero W with a USB sound card looks to be a good option to capture the audio and stream it to OBS.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pi-zerow-usb-audio.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/pi-zerow-usb-audio.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Jitsi</title><link href="https://www.tunbury.org/2021/07/14/jitsis/" rel="alternate" type="text/html" title="Jitsi" /><published>2021-07-14T19:41:29+00:00</published><updated>2021-07-14T19:41:29+00:00</updated><id>https://www.tunbury.org/2021/07/14/jitsis</id><content type="html" xml:base="https://www.tunbury.org/2021/07/14/jitsis/"><![CDATA[<p>I need to remotely control OBS during a live stream.  This is quite simply achieved via VNC but I need to see and hear what’s going on at the same time.  VNC doesn’t support audio on the free license and watching the YouTube stream is out of the question as it’s nearly 30 seconds behind real time.</p>

<p>As the computer has a USB web camera and microphone attached I thought of a private LAN based v/c solution.  A quick Internet search found a <a href="https://www.reddit.com/r/sysadmin/comments/gmray4/recommendation_for_free_lanonly_video/">post on Reddit</a> talking about Jitsi.</p>

<p>After installing an Ubuntu 20.04 server VM, I followed the Jitsi <a href="https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-quickstart">Self-Hosting Guide</a> which takes just a few minutes.  Since it was a private LAN implementation I skipped the optional FQDN section of the instructions and used the self-signed certificate.</p>

<p>Connecting to the DHCP-assigned address over https brought the expected certificate warnings, but I was able to create and join a room.  The camera and microphone did not start.  Every 30 seconds or so this message appeared about reconnecting:</p>

<p><img src="/images/jitsi-disconnected.png" alt="Jitsi Disconnected" /></p>

<p>The fix to this was to use a host name not an IP address.  On Windows machines edit <code class="language-plaintext highlighter-rouge">C:\Windows\System32\Drivers\etc\hosts</code> and on a Mac edit <code class="language-plaintext highlighter-rouge">/etc/hosts</code>.  In both cases I added the DHCP issued IP address and hostname of the Ubuntu server:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>192.168.1.76	jitsi
</code></pre></div></div>

<p>Connecting to Jitsi using <a href="https://jitsi">https://jitsi</a> and skipping past the certificate warnings brought me to a working implementation.  Certainly impressive and easy to set up!</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Jitsi" /><category term="Ubuntu" /><summary type="html"><![CDATA[I need to remotely control OBS during a live stream. This is quite simply achieved via VNC but I need to see and hear what’s going on at the same time. VNC doesn’t support audio on the free license and watching the YouTube stream is out of the question as it’s nearly 30 seconds behind real time.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/jitsi-logo-blue-grey-text.png" /><media:content medium="image" url="https://www.tunbury.org/images/jitsi-logo-blue-grey-text.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Syncthing on OpenBSD</title><link href="https://www.tunbury.org/2021/06/22/syncthing-on-openbsd/" rel="alternate" type="text/html" title="Syncthing on OpenBSD" /><published>2021-06-22T19:41:29+00:00</published><updated>2021-06-22T19:41:29+00:00</updated><id>https://www.tunbury.org/2021/06/22/syncthing-on-openbsd</id><content type="html" xml:base="https://www.tunbury.org/2021/06/22/syncthing-on-openbsd/"><![CDATA[<h2 id="network-installation-of-openbsd">Network Installation of OpenBSD</h2>

<p>Set up a machine to facilitate network installation of OpenBSD.  Download the 6.9 installation ISO from the <a href="https://www.openbsd.org/faq/faq4.html#Download">OpenBSD website</a> and install it in a virtual machine.  I’m using VMware Fusion and have a dedicated LAN port connected to the remote machine.</p>

<p>Create <code class="language-plaintext highlighter-rouge">hostname.vic0</code> containing the following static configuration rather than <code class="language-plaintext highlighter-rouge">dhcp</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inet 192.168.2.1 255.255.255.0 NONE
</code></pre></div></div>

<h3 id="dhcpd">DHCPD</h3>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/dhcpd.conf</code> with the key attributes:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">filename</code> for the boot image name, and</li>
  <li><code class="language-plaintext highlighter-rouge">next-server</code> for the TFTP server address.</li>
</ul>

<p>I have added a host section for the specific MAC of my machine but for this one-time build process it could be a global option.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>subnet 192.168.2.0 netmask 255.255.255.0 {
    option routers 192.168.2.1;
    range 192.168.2.32 192.168.2.127;
    
    host mini-itx {
        hardware ethernet 00:40:63:d5:6f:4f;
        filename "auto_install";
        next-server 192.168.2.1;
        option host-name "mini-itx";
    }
}
</code></pre></div></div>

<h3 id="tftpd">TFTPD</h3>

<p>Create the default TFTP root folder and configuration folder</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p /tftpboot/etc
</code></pre></div></div>

<p>Download <a href="http://ftp.openbsd.org/pub/OpenBSD/6.9/i386/pxeboot">pxeboot</a> and <a href="http://ftp.openbsd.org/pub/OpenBSD/6.9/i386/bsd.rd">bsd.rd</a> and put them in <code class="language-plaintext highlighter-rouge">/tftpboot</code>.</p>

<p>Create a symbolic link for <code class="language-plaintext highlighter-rouge">auto_install</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ln -s pxeboot /tftpboot/auto_install
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/tftpboot/etc/boot.conf</code> containing the following</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>boot tftp:/bsd.rd
</code></pre></div></div>

<h3 id="httpd">HTTPD</h3>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/httpd.conf</code> to share the folder <code class="language-plaintext highlighter-rouge">/var/www/htdocs</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#[ MACROS ]
ext_ip = "*"

# [ GLOBAL CONFIGURATION ]
# none

# [ SERVERS ]
server "default" {
    listen on $ext_ip port 80
    root "/htdocs"
}

# [ TYPES ]
types {
    include "/usr/share/misc/mime.types"
}
</code></pre></div></div>

<p>Stage the installation files on a local web server by copying them from the boot ISO downloaded at the start:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mount /dev/cd0a /mnt/
mkdir -p /var/www/htdocs/pub/OpenBSD
cp -rv /mnt/6.9/ /var/www/htdocs/pub/OpenBSD/6.9
ls -l /var/www/htdocs/pub/OpenBSD/6.9 &gt; /var/www/htdocs/pub/OpenBSD/6.9/index.txt
</code></pre></div></div>

<p>Create <code class="language-plaintext highlighter-rouge">/var/www/htdocs/install.conf</code> containing the following automatic configuration answer file</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Password for root = Password
Setup a user = user
Password for user = Password
Public ssh key for user = ssh-rsa AAAA...ZV user@Marks-Mac-mini.local
Which disk is the root disk = wd0
What timezone are you in = Europe/London
Unable to connect using https. Use http instead = yes
Location of sets = http
HTTP Server = 192.168.2.1
Set name(s) = -all bsd* base* etc* man* site* comp*
Continue without verification = yes
</code></pre></div></div>

<p>Enable the services using <code class="language-plaintext highlighter-rouge">rcctl</code>, which edits the configuration file <code class="language-plaintext highlighter-rouge">rc.conf.local</code> to add the appropriate <code class="language-plaintext highlighter-rouge">service_flags=""</code> lines</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rcctl enable dhcpd
rcctl enable tftpd
rcctl enable httpd
</code></pre></div></div>

<p>The remote system should now boot from the network and install OpenBSD hands free!</p>

<p>After the new system boots <code class="language-plaintext highlighter-rouge">su</code> and then overwrite <code class="language-plaintext highlighter-rouge">/etc/installurl</code> with a standard value</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>echo https://ftp.openbsd.org/pub/OpenBSD &gt; /etc/installurl
</code></pre></div></div>

<h2 id="raid5-volume">RAID5 Volume</h2>

<p>Create a RAID5 volume over the four attached disks</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for a in sd0 sd1 sd2 sd3 ; do fdisk -iy $a ; done
for a in sd0 sd1 sd2 sd3 ; do printf "a\n\n\n\nRAID\nw\nq\n" | disklabel -E $a ; done
bioctl -c 5 -l /dev/sd0a,/dev/sd1a,/dev/sd2a,/dev/sd3a softraid0
</code></pre></div></div>

<p>Partition and format the volume</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fdisk -iy sd4
printf "a\n\n\n\n4.2BSD\nw\nq\n" | disklabel -E sd4
newfs /dev/rsd4a 
</code></pre></div></div>

<h2 id="syncthing">Syncthing</h2>

<p>Install <code class="language-plaintext highlighter-rouge">syncthing</code> using</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pkg_add syncthing
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/login.conf</code> and append:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>syncthing:\
        :openfiles-max=60000:\
        :tc=daemon:
</code></pre></div></div>

<p>Rebuild the file</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cap_mkdb /etc/login.conf
echo "kern.maxfiles=80000" &gt;&gt; /etc/sysctl.conf
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/rc.d/syncthing</code> and update the <code class="language-plaintext highlighter-rouge">daemon_flags</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>daemon_flags="-no-browser -gui-address=0.0.0.0:8384"
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/fstab</code> to mount the drive, then set the ownership of the mount point</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/dev/sd4a /var/syncthing ffs rw,softdep 0 0
chown -R _syncthing:_syncthing /var/syncthing
</code></pre></div></div>

<p>Enable and start syncthing:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rcctl enable syncthing
rcctl start syncthing
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="Syncthing" /><category term="OpenBSD" /><summary type="html"><![CDATA[Network Installation of OpenBSD]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/openbsd-syncthing.png" /><media:content medium="image" url="https://www.tunbury.org/images/openbsd-syncthing.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">WordPress to Jekyll Test</title><link href="https://www.tunbury.org/2021/05/25/wordpress-to-jekyll-test/" rel="alternate" type="text/html" title="WordPress to Jekyll Test" /><published>2021-05-25T12:41:29+00:00</published><updated>2021-05-25T12:41:29+00:00</updated><id>https://www.tunbury.org/2021/05/25/wordpress-to-jekyll-test</id><content type="html" xml:base="https://www.tunbury.org/2021/05/25/wordpress-to-jekyll-test/"><![CDATA[<p>Install the WordPress plugin <em>UpdraftPlus</em> on the original site and take a backup.  Create a new WordPress site, install the <em>UpdraftPlus</em> plugin, and restore the database.</p>

<p>Use the following MySQL commands to fix the database</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>UPDATE wp_options SET option_value = replace(option_value, 'cccbr.org.uk', 'cccbr.tunbury.org') WHERE option_name = 'home' OR option_name = 'siteurl';
UPDATE wp_posts SET guid = replace(guid, 'cccbr.org.uk','cccbr.tunbury.org');
UPDATE wp_posts SET post_content = replace(post_content, 'cccbr.org.uk', 'cccbr.tunbury.org');
UPDATE wp_postmeta SET meta_value = replace(meta_value,'cccbr.org.uk','cccbr.tunbury.org');
</code></pre></div></div>

<p>Set user password (mainly to make it different from the original site)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>UPDATE `wp_users` SET `user_pass`= MD5('yourpassword') WHERE `user_login`='melvers';
</code></pre></div></div>

<p>Install <em>Jekyll Exporter</em> plugin, activate it and then create the export using Tools -&gt; Export to Jekyll.</p>

<p>Create a new Jekyll site by running</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jekyll new c:\cccbr
</code></pre></div></div>

<p>Extract <code class="language-plaintext highlighter-rouge">jekyll-export.zip</code> into the <code class="language-plaintext highlighter-rouge">c:\cccbr</code> folder, but don’t overwrite <code class="language-plaintext highlighter-rouge">_config.yml</code>, then serve the site:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jekyll serve
</code></pre></div></div>

<p>Visit <a href="http://localhost:4000">http://localhost:4000</a> to see how it looks.</p>

<p>Tidy up the exported Markdown with the following PowerShell, which fixes a few HTML entities, repoints the upload URLs, and switches the layouts to <code class="language-plaintext highlighter-rouge">single</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$mdFiles = Get-ChildItem . *.md -rec
foreach ($file in $mdFiles) {
    (Get-Content $file.PSPath) |
    Foreach-Object { $_ -replace "&amp;#8211;", "-" } |
    Foreach-Object { $_ -replace "&amp;#038;", "&amp;" } |
    Foreach-Object { $_ -replace "&amp;#8217;", "&amp;apos;" } |
    Foreach-Object { $_ -replace "cccbr.tunbury.org/wp-content/uploads/", "cccbr.org.uk/wp-content/uploads/" } |
    Foreach-Object { $_ -replace "cccbr.tunbury.org/", "/" } |
    Foreach-Object { $_ -replace "layout: page", "layout: single" } |
    Foreach-Object { $_ -replace "layout: post", "layout: single" } |
    Set-Content $file.PSPath
}
</code></pre></div></div>
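
<p>On a Linux or macOS box the same clean-up could be done with <code class="language-plaintext highlighter-rouge">sed</code>; a sketch of two of the substitutions, shown on a sample line (note the literal ampersand must be escaped in the replacement):</p>

```shell
# Replace the HTML entities left behind by the WordPress export
echo 'Rules &#8211; Bells &#038; Ringing' \
  | sed -e 's/&#8211;/-/g' -e 's/&#038;/\&/g'
# -> Rules - Bells & Ringing
```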

<p>Edit <code class="language-plaintext highlighter-rouge">Gemfile</code> to use the new theme by commenting out <code class="language-plaintext highlighter-rouge">minima</code> and adding <code class="language-plaintext highlighter-rouge">minimal-mistakes-jekyll</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># gem "minima", "~&gt; 2.5"
gem "minimal-mistakes-jekyll"
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">bundle</code> in the folder to download the dependencies.  Edit <code class="language-plaintext highlighter-rouge">_config.yml</code> and set the theme:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>theme: minimal-mistakes-jekyll
</code></pre></div></div>

<p>Create the top level menu by creating <code class="language-plaintext highlighter-rouge">_data/navigation.yml</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main:
  - title: "About"
    url: /about
  - title: "Bells and Ringing"
    url: /bellringing
</code></pre></div></div>

<p>Create secondary menus in the same <code class="language-plaintext highlighter-rouge">_data/navigation.yml</code> file, such as:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>about:
  - title: About
    children:
      - title: "About the Council"
        url: /about
      - title: "Continuing CCCBR Reforms"
        url: /about/reforms/
      - title: "Governance"
        url: /about/governance/
</code></pre></div></div>

<p>Then on the appropriate pages set the front matter:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sidebar:
  nav: "about"
toc: true
</code></pre></div></div>

<p>Create a custom skin by duplicating and renaming a file in <code class="language-plaintext highlighter-rouge">_sass\minimal-mistakes\skins</code>.  I created <code class="language-plaintext highlighter-rouge">cccbr.scss</code> and then in <code class="language-plaintext highlighter-rouge">_config.yml</code> applied the skin like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>theme: minimal-mistakes-jekyll
minimal_mistakes_skin: "cccbr"
</code></pre></div></div>

<p>Create a repository on GitHub.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git init
git add .
git commit -m "initial commit"
git remote add origin https://github.com/mtelvers/cccbr.git
git push -u origin master
</code></pre></div></div>

<p>On GitHub, under the repository’s Settings -&gt; Pages, publish the site using the master branch.</p>

<p>Changes to make it work on GitHub:</p>

<ol>
  <li>Update <code class="language-plaintext highlighter-rouge">Gemfile</code> and then run <code class="language-plaintext highlighter-rouge">bundle</code>.</li>
  <li>Update all the posts and pages to use the <code class="language-plaintext highlighter-rouge">single</code> template.</li>
  <li>Update <code class="language-plaintext highlighter-rouge">_config.yml</code> to set baseurl to match the Git repository name.</li>
  <li>Update <code class="language-plaintext highlighter-rouge">_config.yml</code> to change the remote theme.</li>
</ol>

<p>Remove unwanted front matter tags with this Ruby script:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require "yaml"

YAML_FRONT_MATTER_REGEXP = /\A(---\s*\n.*?\n?)^((---|\.\.\.)\s*$\n?)/m

Dir.glob('**/*.md', File::FNM_DOTMATCH) do |f|
    puts f

    file = File.open(f)
    source = file.read
    file.close

    if source =~ YAML_FRONT_MATTER_REGEXP
        data, content = YAML.load($1), Regexp.last_match.post_match
        ["id", "guid",
        "ep_tilt_migration",
        "classic-editor-remember",
        "ssb_old_counts",
        "ssb_total_counts",
        "ssb_cache_timestamp",
        "colormag_page_layout",
        "wp_featherlight_disable",
        "catchbox-sidebarlayout",
        "complete_open_graph"].each {|x| data.delete(x)}

        file = File.open(f, "w")
        YAML.dump(data, file)
        file.puts("---", content)
        file.close
    end
end
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="jekyll" /><category term="wordpress" /><summary type="html"><![CDATA[Install the Wordpress plugins UpdraftPlus. Create a new WordPress site and install the UpdraftPlus plugin and restore the database.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/wordpress-to-jekyll.png" /><media:content medium="image" url="https://www.tunbury.org/images/wordpress-to-jekyll.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Mini ITX as Windows 2008 Server</title><link href="https://www.tunbury.org/2021/04/28/mini-itx-as-windows-2008-server/" rel="alternate" type="text/html" title="Mini ITX as Windows 2008 Server" /><published>2021-04-28T12:41:29+00:00</published><updated>2021-04-28T12:41:29+00:00</updated><id>https://www.tunbury.org/2021/04/28/mini-itx-as-windows-2008-server</id><content type="html" xml:base="https://www.tunbury.org/2021/04/28/mini-itx-as-windows-2008-server/"><![CDATA[<p>Unfortunately without a DVD drive and with no capability to boot from USB I’m struggling to get a clean OS on my Mini ITX machine. The internal drive is IDE and I don’t have any other machines with IDE around and I don’t know the password for the installed OS.</p>

<p>Install Windows 2008 x86 Server (with GUI) in a VM</p>

<p>Turn on Remote Desktop and turn off the firewall</p>

<p>Add Windows Server role WDS and AD DS</p>

<p>Set static IP address 192.168.10.10/24 DNS 127.0.0.1</p>

<p>Set local administrator password to a complex password</p>

<p>Run <code class="language-plaintext highlighter-rouge">dcpromo</code>, set domain to montdor.local.</p>

<p>Install DHCP and follow the wizard to create a scope 192.168.10.128–192.168.10.254. DNS 192.168.10.10. No router.</p>

<p>Configure WDS using the wizard</p>

<ul>
  <li>Do not listen on port 67</li>
  <li>Configure DHCP option 60</li>
  <li>Respond to all clients</li>
</ul>

<p>Switch to the Windows AIK for Windows 7 ISO <code class="language-plaintext highlighter-rouge">KB3AIK_EN.ISO</code> and install Windows Automated Installation Kit (to get Windows PE)</p>

<p>In WDS, add the WinPE boot WIM as a boot image. The WIM is in <code class="language-plaintext highlighter-rouge">C:\Program Files\Windows AIK\Tools\PETools\x86\winpe.wim</code></p>

<p>Copy the Windows 2008 Server Standard x86 DVD to <code class="language-plaintext highlighter-rouge">c:\Win2K8x86</code>. Create a share of the same name.</p>

<p>Windows 2008 Server installation requires 512MB of RAM but my computer only has 256MB and only reports 248 after the video RAM is subtracted.</p>

<p>Hack the Windows setup program to make it run anyway:</p>

<p>Find the file <code class="language-plaintext highlighter-rouge">WINSETUP.DLL</code> in the sources folder and using as hex editor such as <a href="http://mh-nexus.de/en/hxd/">HxD</a>, search for the hex string <code class="language-plaintext highlighter-rouge">77 07 3D 78 01</code> and replace it with <code class="language-plaintext highlighter-rouge">E9 04 00 00 00</code>.</p>
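
<p>For the record, the same five-byte patch can be scripted with standard Unix tools; a sketch demonstrated on a stand-in file rather than the real <code class="language-plaintext highlighter-rouge">WINSETUP.DLL</code>:</p>

```shell
# Patch 77 07 3D 78 01 -> E9 04 00 00 00 via a hex round-trip.
# tr joins xxd's output lines so the pattern can't straddle a line break.
printf '\x00\x11\x77\x07\x3d\x78\x01\x22' > demo.bin
xxd -p demo.bin | tr -d '\n' | sed 's/77073d7801/e904000000/' | xxd -r -p > patched.bin
xxd -p patched.bin   # -> 0011e90400000022
```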

<p>It turns out Windows really does need 512MB of RAM: setup fails with error <code class="language-plaintext highlighter-rouge">0xE0000100</code>, caused by insufficient memory. Therefore, create a partition and then a swap file.</p>

<p>Open <code class="language-plaintext highlighter-rouge">DISKPART.EXE</code> and run the following to create a working drive:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT DISK 0
CLEAN
CREATE PART PRIMARY
SELECT VOLUME 0
ASSIGN
FORMAT FS=NTFS QUICK
</code></pre></div></div>

<p>Create a paging file</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wpeutil createpagefile /path:c=\pf.sys
</code></pre></div></div>

<p>Now run Windows Setup.</p>

<p>Download the Sil3124 driver for Windows 7 x86. Copy it to a network share, mount it from the Windows 2008 Server, and run:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pnputil -i -a *.inf
</code></pre></div></div>

<p>Then use DISKPART.EXE again, similar to above</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT DISK 1
CREATE PART PRI
SELECT VOLUME 1
ASSIGN
FORMAT FS=NTFS QUICK
</code></pre></div></div>

<p>Now we need Windows Updates I suppose</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cscript c:\windows\system32\scregedit.wsf /au 4
net stop wuauserv
net start wuauserv
wuauclt /detectnow
</code></pre></div></div>

<p>Enable Remote Desktop with</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cscript c:\windows\system32\scregedit.wsf /ar 0
</code></pre></div></div>

<p>Create a share</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>net share sharename=d:\share /grant:everyone,full
</code></pre></div></div>

<p>Make it visible</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>netsh firewall set service fileandprint enable
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="obs" /><summary type="html"><![CDATA[Unfortunately without a DVD drive and with no capability to boot from USB I’m struggling to get a clean OS on my Mini ITX machine. The internal drive is IDE and I don’t have any other machines with IDE around and I don’t know the password for the installed OS.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/via-cpu.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/via-cpu.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raspberry PI Camera with M12 Lens</title><link href="https://www.tunbury.org/2021/01/06/raspberry-pi-camera-with-m12-lens/" rel="alternate" type="text/html" title="Raspberry PI Camera with M12 Lens" /><published>2021-01-06T12:41:29+00:00</published><updated>2021-01-06T12:41:29+00:00</updated><id>https://www.tunbury.org/2021/01/06/raspberry-pi-camera-with-m12-lens</id><content type="html" xml:base="https://www.tunbury.org/2021/01/06/raspberry-pi-camera-with-m12-lens/"><![CDATA[<p>I really need a good lens on my Raspberry PI camera to use it with OBS from a decent distance.  The new high resolution Raspberry PI cameras look excellent but they also come with a hefty price tag which I just can’t justify.</p>

<blockquote>
  <p>First off, the mounting holes on both v1 and v2 RPi cameras are on 21 mm centers, so the 20 mm spacing of the M12 mount you link isn’t a perfect fit. Depending on your mounting screw size, you may still be able to force it. Second, you have to manually cut or file down a notch in the M12 mount for the micro-flex cable that comes out of the camera module. That isn’t too hard, but if you want, there is also a M12 mount specifically designed for the RPi cameras, with a notch already.</p>

  <p>The v1 and v2 sensor sizes are the same, the so-called 1/4-inch format. On V1 the lens focal length is f=3.6mm with Angle of View: 54 x 41 degrees and on V2 it is f=3.0mm with Angle of View: 62.2 x 48.8 degrees [1]. Note the angle of view is quoted at full-frame; remember some video modes use a cropped subset of the full frame. This is a moderately wide angle lens. If you double the focal length, you’ll get half the field of view. If you get a 8mm lens that’s a moderate telephoto, and a 16mm lens is definitely telephoto. I’ve tried a number of cheap M12 lenses that work “ok” but don’t expect perfectly sharp images with the tiny 1.4 or 1.1 micron pixels these camera sensors use. Lower f-number lenses are “faster” (let in more light) but will have more shallow depth of field and more blurry overall. You will see f/1.4 or lower sold for use in low light, but I have not had good images with those; I would recommend f/2.0 or above if you want decent resolution.</p>

  <p><a href="https://www.raspberrypi.org/forums/viewtopic.php?t=150344#p988445">https://www.raspberrypi.org/forums/viewtopic.php?t=150344#p988445</a></p>
</blockquote>

<p>With that as the inspiration I bought a pack of ten M12 lens adapters from Amazon for £5 and started out by creating a notch for the cable. While the 20mm spacing wasn’t ideal I have found some variation in hole positions on the PCB and by using thin M2 bolts I was able to <em>force</em> them.</p>

<p>I removed the lens in a rather destructive way from the front of the camera by cutting around the raised area on three sides with a craft knife. It wasn’t pretty but it did the job.</p>

<p><img src="/images/pi-camera-m12-1.jpg" alt="" /></p>

<p>On the first camera I modified I went on to remove the IR filter by gently cutting it across the diagonal with side cutters. Surprisingly it popped off without too much effort leaving this.</p>

<p><img src="/images/pi-camera-m12-2.jpg" alt="" /></p>

<p>For my application, removing the IR filter was a mistake as (tungsten) lights and candles produce lots of infrared!</p>

<p>I mounted the M12 adapters on 3mm plywood with short M2 bolt screwed in from the front.</p>

<p><img src="/images/pi-camera-m12-3.jpg" alt="" /></p>

<p><img src="/images/pi-camera-m12-4.jpg" alt="" /></p>

<p>I had an old Foscam WiFi camera which has an M12 lens marked as <em>f=2.8mm</em>. This pretty much matched the field of view I got from the camera’s native lens.</p>

<p>I have had good results with <em>f=8mm</em>, <em>f=15mm</em> and <em>f=25mm</em> lenses, as well as a cheap zoom lens offering a range of <em>f=3mm</em> to <em>f=12mm</em>. It’s curious that on Amazon a focal length of 8mm is typically sold as <em>wide angle</em> rather than telephoto! What I really notice is that the depth of field becomes increasingly narrow as the focal length increases.</p>

<p>I installed Raspberry Pi OS Lite using the Pi Imager and enabled SSH before removing the SD card.</p>

<p>After assembling the unit, check that the camera is connected and enabled with <code class="language-plaintext highlighter-rouge">vcgencmd get_camera</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>supported=1 detected=1
</code></pre></div></div>
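
<p>The same check can be scripted; a sketch that parses the <code class="language-plaintext highlighter-rouge">detected=</code> field, using a stand-in string instead of calling <code class="language-plaintext highlighter-rouge">vcgencmd</code> directly:</p>

```shell
# Stand-in for OUT=$(vcgencmd get_camera)
OUT='supported=1 detected=1'
DETECTED=${OUT##*detected=}   # strip everything up to and including "detected="
if [ "$DETECTED" = "1" ]; then echo "camera present"; fi
```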

<p><code class="language-plaintext highlighter-rouge">raspivid</code> can be configured to send an h.264 stream, but it exits when the connection drops. Therefore, I have wrapped <code class="language-plaintext highlighter-rouge">raspivid</code> in a service so that systemd will restart it each time.</p>

<p>Create <code class="language-plaintext highlighter-rouge">/etc/systemd/system/stream.service</code> containing</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
Description=auto start stream
After=multi-user.target

[Service]
Type=simple
ExecStart=/usr/bin/raspivid -v -fps 30 -md 2 -n -ih -t 0 -l -stm -fl -o tcp://0.0.0.0:5001
User=pi
WorkingDirectory=/home/pi
Restart=always

[Install]
WantedBy=multi-user.target
</code></pre></div></div>

<p>Enable and start the service as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl enable stream
service stream start
</code></pre></div></div>

<p>You can open the stream with VLC by using the address <code class="language-plaintext highlighter-rouge">tcp/h264://192.168.1.88:5001</code> which is useful for testing.</p>

<p>Finally, in OBS add a media source <code class="language-plaintext highlighter-rouge">tcp://192.168.0.88:5001</code>.</p>

<p><img src="/images/2_8mm.png" alt="" title="f=2.8mm" />
<img src="/images/8mm.png" alt="" title="f=8mm" />
<img src="/images/16mm.png" alt="" title="f=16mm" />
<img src="/images/22mm.png" alt="" title="f=22mm" /></p>

<h1 id="parts-list">Parts list</h1>

<table>
  <thead>
    <tr>
      <th>Part</th>
      <th>Cost</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><a href="https://www.amazon.co.uk/Raspberry-Pi-Model-Quad-Motherboard/dp/B01CD5VC92">Pi 3B</a></td>
      <td>£34</td>
    </tr>
    <tr>
      <td><a href="https://www.amazon.co.uk/gp/product/B07WCGY2QY/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&amp;psc=1">PoE Splitter - 2 pack</a></td>
      <td>£17</td>
    </tr>
    <tr>
      <td><a href="https://www.amazon.co.uk/gp/product/B07ZZ2K7WP/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&amp;psc=1">5MP Camera Module - 2 pack</a></td>
      <td>£9</td>
    </tr>
    <tr>
      <td><a href="https://www.amazon.co.uk/gp/product/B08FDVYC98/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&amp;psc=1">Zoom lens</a></td>
      <td>£10</td>
    </tr>
    <tr>
      <td><a href="https://www.amazon.co.uk/gp/product/B00R1J42T8/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&amp;psc=1">M12 Mount - 10 pack</a></td>
      <td>£5</td>
    </tr>
    <tr>
      <td><a href="https://www.amazon.co.uk/gp/product/B075QMCYZM/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&amp;psc=1">3mm plywood - 25 pack</a></td>
      <td>£24</td>
    </tr>
    <tr>
      <td><a href="https://www.amazon.co.uk/gp/product/B003WIRFD2/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&amp;psc=1">SD Card</a></td>
      <td>£3.70</td>
    </tr>
  </tbody>
</table>

<p>A single camera would cost £62.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="obs" /><summary type="html"><![CDATA[I really need a good lens on my Raspberry PI camera to use it with OBS from a decent distance. The new high resolution Raspberry PI cameras look excellent but they also come with a hefty price tag which I just can’t justify.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pi-camera-m12-2.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/pi-camera-m12-2.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Normalise MP3 Files</title><link href="https://www.tunbury.org/2021/01/01/normalise-mp3-files/" rel="alternate" type="text/html" title="Normalise MP3 Files" /><published>2021-01-01T12:41:29+00:00</published><updated>2021-01-01T12:41:29+00:00</updated><id>https://www.tunbury.org/2021/01/01/normalise-mp3-files</id><content type="html" xml:base="https://www.tunbury.org/2021/01/01/normalise-mp3-files/"><![CDATA[<p>I have hundreds of MP3 files but the levels aren’t standardised in any way which makes streaming them a bit hit and miss.  I can normalise them using <a href="https://www.audacityteam.org/">Audacity</a> but I’d really like an automatic way of doing it.</p>

<p>Install MP3GAIN</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install mp3gain
</code></pre></div></div>

<p>It doesn’t seem to run for some reason as it can’t find the library.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>==617==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
</code></pre></div></div>

<p>Set <code class="language-plaintext highlighter-rouge">LD_PRELOAD</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export LD_PRELOAD=/usr/lib/arm-linux-gnueabihf/libasan.so.4
</code></pre></div></div>

<p>Now it works!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mp3gain -e -c -r *.mp3
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><summary type="html"><![CDATA[I have hundreds for MP3 files but the levels aren’t standardised in any way which makes streaming them a bit hit and miss. I can normalise them using AudaCity but I’d really like an automatic way of doing it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/mp3gain.png" /><media:content medium="image" url="https://www.tunbury.org/images/mp3gain.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">TEMPer USB Temperature Sensor</title><link href="https://www.tunbury.org/2020/12/26/temper-usb-temperature-sensor/" rel="alternate" type="text/html" title="TEMPer USB Temperature Sensor" /><published>2020-12-26T12:41:29+00:00</published><updated>2020-12-26T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/12/26/temper-usb-temperature-sensor</id><content type="html" xml:base="https://www.tunbury.org/2020/12/26/temper-usb-temperature-sensor/"><![CDATA[<p>These USB sensors are available pretty cheaply from PiHut and Amazon and
are great for monitoring the temperature remotely (where you have a Pi).</p>

<p>Install the necessary prerequisites:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt install libhidapi-dev/stable cmake bc
</code></pre></div></div>

<p>There is a <a href="https://github.com/edorfaus/TEMPered">GitHub repository by Frode Austvik</a>:</p>

<blockquote>
  <p>This project is a C implementation of a library and program to read all the
various types of TEMPer thermometer and hygrometer USB devices, as produced by
RDing Technology and sold under the name PCsensor.</p>
</blockquote>

<p>Download the software</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/edorfaus/TEMPered
</code></pre></div></div>

<p>And build it and install:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd TEMPered
cmake .
make
sudo cp utils/hid-query /usr/bin
</code></pre></div></div>

<p>Create a simple script to query the device and display the temperature.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
OUTLINE=$(/usr/bin/hid-query /dev/hidraw1 0x01 0x80 0x33 0x01 0x00 0x00 0x00 0x00 | grep -A1 ^Response | tail -1)
OUTNUM=$(echo $OUTLINE | sed -e 's/^[^0-9a-f]*[0-9a-f][0-9a-f] [0-9a-f][0-9a-f] \([0-9a-f][0-9a-f]\) \([0-9a-f][0-9a-f]\) .*$/0x\1\2/')
HEX4=${OUTNUM:2:4}
DVAL=$(( 16#$HEX4 ))
CTEMP=$(bc &lt;&lt;&lt; "scale=2; $DVAL/100")
echo $(date) $CTEMP
</code></pre></div></div>

<p>This works perfectly but it must be executed with <code class="language-plaintext highlighter-rouge">sudo</code>, or by first
running <code class="language-plaintext highlighter-rouge">chmod 666 /dev/hidraw1</code>. This can be automated by creating
<code class="language-plaintext highlighter-rouge">/etc/udev/rules.d/99-hidraw.rules</code> with the content below, which creates
the <code class="language-plaintext highlighter-rouge">/dev</code> node with the appropriate permissions.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>KERNEL=="hidraw*", SUBSYSTEM=="hidraw", MODE="0666", GROUP="root"
</code></pre></div></div>

<p>I’ve added a cron job (<code class="language-plaintext highlighter-rouge">crontab -e</code>) to record the temperature every 5
minutes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/pi/temp.sh &gt;&gt; /home/pi/temperature.txt
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><summary type="html"><![CDATA[These USB sensors are available pretty cheaply from PiHut and Amazon and are great for monitoring the temperature remotely (where you have a Pi).]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/USB-Thermometer.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/USB-Thermometer.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Hard disk failure</title><link href="https://www.tunbury.org/2020/10/05/hard-disk-failure/" rel="alternate" type="text/html" title="Hard disk failure" /><published>2020-10-05T12:41:29+00:00</published><updated>2020-10-05T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/10/05/hard-disk-failure</id><content type="html" xml:base="https://www.tunbury.org/2020/10/05/hard-disk-failure/"><![CDATA[<p>Check the status with <code class="language-plaintext highlighter-rouge">sudo mdadm --detail /dev/md0</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/dev/md0:
           Version : 1.2
     Creation Time : Wed Sep  2 21:55:39 2015
        Raid Level : raid5
        Array Size : 878509056 (837.81 GiB 899.59 GB)
     Used Dev Size : 292836352 (279.27 GiB 299.86 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

       Update Time : Sun Oct  4 07:35:23 2020
             State : clean, degraded 
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 1
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : resync

              Name : plum:0  (local to host plum)
              UUID : 4a462153:dde89a43:0a4dd678:451bb2b4
            Events : 24024

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       5       8       49        2      active sync   /dev/sdd1
       -       0        0        3      removed

       4       8       65        -      faulty   /dev/sde1
</code></pre></div></div>
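
<p>As an aside, the reported sizes are consistent: RAID5 keeps one disk’s worth of parity, so the usable space is (n - 1) times the per-device size:</p>

```shell
# RAID5 usable capacity = (devices - 1) * per-device size,
# using the Used Dev Size of 292836352 KiB across 4 devices
echo $(( (4 - 1) * 292836352 ))   # -> 878509056, matching the Array Size
```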

<p>Check which disks are which <code class="language-plaintext highlighter-rouge">sudo lshw -class disk</code>.</p>

<table>
  <thead>
    <tr>
      <th>Mount</th>
      <th>Model</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>/dev/sdb</td>
      <td>ST9300603SS</td>
      <td>Seagate Savvio 10 K.3 St9300603ss</td>
    </tr>
    <tr>
      <td> </td>
      <td>MBE2073RC</td>
      <td>Fujitsu MBE2073RC 73.5GB SAS Hard Drive</td>
    </tr>
    <tr>
      <td> </td>
      <td>MBE2073RC</td>
      <td>Fujitsu MBE2073RC 73.5GB SAS Hard Drive</td>
    </tr>
    <tr>
      <td>/dev/sdc</td>
      <td>ST9300603SS</td>
      <td>Seagate Savvio 10 K.3 St9300603ss</td>
    </tr>
    <tr>
      <td>/dev/sdd</td>
      <td>ST300MM0006</td>
      <td>Seagate Enterprise Performance 10K HDD ST300MM0006 300 GB</td>
    </tr>
    <tr>
      <td>/dev/sde</td>
      <td>ST9300603SS</td>
      <td>Seagate Savvio 10 K.3 St9300603ss</td>
    </tr>
  </tbody>
</table>

<p>The boot drive is a hardware RAID1 using the two 73GB disks. <code class="language-plaintext highlighter-rouge">/var</code> is made up of the 300GB disks in a software RAID5 configuration.</p>

<p>The ST9300603SS is still available on Amazon, but the newer 10k.5-generation equivalent, the ST9300605SS, is available on same-day delivery and it’s cheaper as well!</p>

<p>Remove the disk</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm -r /dev/md0 /dev/sde1
</code></pre></div></div>

<p>This server does support hot plug, but there were some zombie processes which I wanted to clear out, and operationally a five-minute outage would be fine.</p>

<p>Shut down the server and replace the disk.  The new disk (slot 2) shown during boot:</p>

<p><img src="/images/perc-bios.jpg" alt="" /></p>

<p>After the reboot, copy the partition table from one of the existing disks to the new disk.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sfdisk -d /dev/sdb | sfdisk /dev/sde
</code></pre></div></div>

<p>Add the new disk into the array</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mdadm /dev/md0 -a /dev/sde1
</code></pre></div></div>

<p>Monitor the rebuild process</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>watch -n 60 cat /proc/mdstat
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ubuntu" /><summary type="html"><![CDATA[Check the status with sudo mdadm --detail /dev/md0]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/savvio-10k-sas-disks.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/savvio-10k-sas-disks.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raspberry PI as RTSP source for OBS using VLC</title><link href="https://www.tunbury.org/2020/08/29/raspberry-pi-as-rtsp-source-for-obs-using-vlc/" rel="alternate" type="text/html" title="Raspberry PI as RTSP source for OBS using VLC" /><published>2020-08-29T12:41:29+00:00</published><updated>2020-08-29T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/08/29/raspberry-pi-as-rtsp-source-for-obs-using-vlc</id><content type="html" xml:base="https://www.tunbury.org/2020/08/29/raspberry-pi-as-rtsp-source-for-obs-using-vlc/"><![CDATA[<p>Using the new <a href="https://www.raspberrypi.org/downloads/">Raspberry Pi Imager</a> I’ve installed the latest Raspberry Pi OS Lite (32 bit).</p>

<p>Enable SSH by creating a zero-length file called <code class="language-plaintext highlighter-rouge">ssh</code> on the boot volume:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>touch /Volumes/boot/ssh
</code></pre></div></div>

<p>Create a file <code class="language-plaintext highlighter-rouge">/Volumes/boot/wpa_supplicant.conf</code> using your favourite text editor:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=GB

network={
  ssid="your SSID"
  psk="xxxxxxxx"
  key_mgmt=WPA-PSK
}
</code></pre></div></div>

<p>Boot the Pi and enable the camera module using <code class="language-plaintext highlighter-rouge">raspi-config</code>. You need to reboot before the camera is activated.</p>

<p>Sign in and run <code class="language-plaintext highlighter-rouge">sudo -Es</code> to get an elevated prompt. Update the base software to the latest version, then install <code class="language-plaintext highlighter-rouge">vlc</code>. This step will take a while…</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt install vlc
</code></pre></div></div>

<p>Create a script, <code class="language-plaintext highlighter-rouge">/home/pi/rtsp-stream.sh</code>, containing this command line:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/bash
raspivid -o - -t 0 -rot 180 -w 1920 -h 1080 -fps 30 -b 2000000 | cvlc -vvv stream:///dev/stdin --sout '#rtp{sdp=rtsp://:8554/stream}' :demux=h264
</code></pre></div></div>

<p>Test the stream by connecting to the Pi’s IP address on port 8554 using VLC player on the desktop:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rtsp://192.168.1.137:8554/stream
</code></pre></div></div>

<p>Automate the startup process by creating a service wrapper in <code class="language-plaintext highlighter-rouge">/etc/systemd/system/rtsp-stream.service</code> containing the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
Description=auto start stream
After=multi-user.target

[Service]
Type=simple
ExecStart=/home/pi/rtsp-stream.sh
User=pi
WorkingDirectory=/home/pi
Restart=on-failure

[Install]
WantedBy=multi-user.target
</code></pre></div></div>

<p>Enable the service and then reboot:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl enable rtsp-stream.service
</code></pre></div></div>

<p>In OBS (Open Broadcaster Software), create a new Media Source, untick the check box for Local File, and enter the RTSP URL in the input box.</p>

\[Z_{n+1}=Z_n^2+c\]

<p>In that <a href="/mandlebrot-set/">post</a>, I presented a table giving two example iterations with different values of C, showing both a <em>bound</em> and an <em>unbound</em> condition. I’d never really thought about the actual value the bound series tended towards; after all, the final plot was the number of iterations it took to become unbound, i.e. where \(\lvert Z \rvert &gt; 2\).</p>

<p>Watching an episode of <a href="https://youtu.be/ETrYE4MdoLQ">Numberphile on YouTube</a>, it became clear that I’d really missed out on some interesting behaviour… about rabbits, which then led me to a <a href="https://youtu.be/ovJcsL7vyrk">second video</a> and a view of the Mandelbrot set as I’d never seen it before.</p>

<p>The table below mirrors the one I presented in my original post but additionally shows the outcome at \(C=-1.3\).</p>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>C = 0.2</th>
      <th>C = 0.3</th>
      <th>C = -1.3</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0</td>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <td>1</td>
      <td>0.200000</td>
      <td>0.300000</td>
      <td>-1.300000</td>
    </tr>
    <tr>
      <td>2</td>
      <td>0.240000</td>
      <td>0.390000</td>
      <td>0.390000</td>
    </tr>
    <tr>
      <td>3</td>
      <td>0.257600</td>
      <td>0.452100</td>
      <td>-1.147900</td>
    </tr>
    <tr>
      <td>4</td>
      <td>0.266358</td>
      <td>0.504394</td>
      <td>0.017674</td>
    </tr>
    <tr>
      <td>5</td>
      <td>0.270946</td>
      <td>0.554414</td>
      <td>-1.299688</td>
    </tr>
    <tr>
      <td>6</td>
      <td>0.273412</td>
      <td>0.607375</td>
      <td>0.389188</td>
    </tr>
    <tr>
      <td>7</td>
      <td>0.274754</td>
      <td>0.668904</td>
      <td>-1.148533</td>
    </tr>
    <tr>
      <td>8</td>
      <td>0.275490</td>
      <td>0.747432</td>
      <td>0.019128</td>
    </tr>
    <tr>
      <td>9</td>
      <td>0.275895</td>
      <td>0.858655</td>
      <td>-1.299634</td>
    </tr>
    <tr>
      <td>10</td>
      <td>0.276118</td>
      <td>1.037289</td>
      <td>0.389049</td>
    </tr>
    <tr>
      <td>11</td>
      <td>0.276241</td>
      <td>1.375968</td>
      <td>-1.148641</td>
    </tr>
    <tr>
      <td>12</td>
      <td>0.276309</td>
      <td>2.193288</td>
      <td>0.019376</td>
    </tr>
    <tr>
      <td>13</td>
      <td>0.276347</td>
      <td>5.110511</td>
      <td>-1.299625</td>
    </tr>
    <tr>
      <td>14</td>
      <td>0.276368</td>
      <td>26.417318</td>
      <td>0.389024</td>
    </tr>
    <tr>
      <td>15</td>
      <td>0.276379</td>
      <td>698.174702</td>
      <td>-1.148660</td>
    </tr>
    <tr>
      <td>16</td>
      <td>0.276385</td>
      <td>#NUM!</td>
      <td>0.019421</td>
    </tr>
    <tr>
      <td>17</td>
      <td>0.276389</td>
      <td>#NUM!</td>
      <td>-1.299623</td>
    </tr>
    <tr>
      <td>18</td>
      <td>0.276391</td>
      <td>#NUM!</td>
      <td>0.389020</td>
    </tr>
    <tr>
      <td>19</td>
      <td>0.276392</td>
      <td>#NUM!</td>
      <td>-1.148664</td>
    </tr>
    <tr>
      <td>20</td>
      <td>0.276392</td>
      <td>#NUM!</td>
      <td>0.019429</td>
    </tr>
    <tr>
      <td>21</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>-1.299623</td>
    </tr>
    <tr>
      <td>22</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>0.389019</td>
    </tr>
    <tr>
      <td>23</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>-1.148664</td>
    </tr>
    <tr>
      <td>24</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>0.019430</td>
    </tr>
    <tr>
      <td>25</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>-1.299622</td>
    </tr>
    <tr>
      <td>26</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>0.389019</td>
    </tr>
    <tr>
      <td>27</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>-1.148665</td>
    </tr>
    <tr>
      <td>28</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>0.019430</td>
    </tr>
    <tr>
      <td>29</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>-1.299622</td>
    </tr>
    <tr>
      <td>30</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>0.389019</td>
    </tr>
    <tr>
      <td>31</td>
      <td>0.276393</td>
      <td>#NUM!</td>
      <td>-1.148665</td>
    </tr>
  </tbody>
</table>

<p>At \(C=-1.3\) there is a clear repeating pattern of four values.</p>
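<p>A few lines of Python (a sketch, not part of the original spreadsheet) are enough to reproduce the \(C=-1.3\) column and confirm the repeating pattern:</p>

```python
# Iterate Z_{n+1} = Z_n^2 + C from Z_0 = 0, as in the table above.
def iterate(c, n):
    z = 0.0
    history = [z]
    for _ in range(n):
        z = z * z + c
        history.append(z)
    return history

h = iterate(-1.3, 31)
# Once the transient has died away, the orbit repeats every 4 steps.
period4 = all(abs(h[i] - h[i + 4]) < 1e-4 for i in range(20, 27))
```

<p>Comparing <code class="language-plaintext highlighter-rouge">h[20:]</code> against the table shows the same four repeating values.</p>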

<p>In Excel, set row 1 to the values of C, starting at -2 and incrementing by, say, 0.02 up to 0.0. Then run the iterations in the columns below each value, starting at 0. Extend the columns for perhaps 40 iterations.</p>

<p><img src="/images/Excel-Formulas-Shown.png" alt="" /></p>

<p>Now plot iterations 20-40 (when the values are typically stable) against the value of C.</p>

<p><img src="/images/Excel-Plot.png" alt="" /></p>

<p>I want to plot the real component of C on the x-axis, the imaginary component on the y-axis, and the real part of the iterated sequence on the z-axis. Where the sequence repeats, I’ll plot all points within the sequence, which looks to be what was done in the YouTube clip.</p>

<p><img src="/images/3d-axis.svg" alt="" /></p>

<p>I’m sitting here with my new, albeit secondhand, Mac Pro so let’s write this in Swift and do all the calculation and graphics on the GPU using Metal.</p>

<p>The problem is well suited to GPU-based calculation, with a small kernel running once for each possible set of input coordinates; however, the output of a massive, sparsely populated three-dimensional array seemed unfortunate. A resolution of 2048 x 2048, allowing iterative sequences of up to 1024, gives potentially 4 billion points… Therefore, I have opted for an output vector/array indexed with a shared, atomically incremented counter.</p>
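<p>The compaction idea can be sketched in Python (illustrative only; the names here are not from the actual Swift/Metal code): each emitted point claims the next slot in a dense output buffer by incrementing a shared counter, so the sparse 2048 x 2048 x 1024 space never has to be materialised.</p>

```python
import threading

class CompactWriter:
    """Dense output buffer: a shared counter hands out the next free slot,
    standing in for the GPU's atomic add on the counter buffer."""
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.next_slot = 0
        self._lock = threading.Lock()

    def emit(self, point):
        with self._lock:              # the GPU does this with one atomic add
            slot = self.next_slot
            self.next_slot += 1
        self.buf[slot] = point
        return slot

w = CompactWriter(capacity=16)
for p in [(-1.3, 0.0, 0.389019), (-1.3, 0.0, -1.148665)]:
    w.emit(p)
# Only the first w.next_slot entries of w.buf are populated.
```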

<p>To use the GPU to perform the calculations, the program needs to be written in the Metal Shading Language, which is a variation on C++, but first the GPU needs to be initialised from Swift, which for this project is pretty straightforward. We’ll need a buffer for the output vector and another one for the counter:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>vertexBuffer = device.makeBuffer(length: MemoryLayout&lt;Vertex&gt;.stride * 2048 * 2048, options: [])
counterBuffer = device.makeBuffer(length: MemoryLayout&lt;UInt&gt;.size, options: [])
</code></pre></div></div>

<p>Then we create a library within the GPU device, where the name parameter exactly matches the MTL function name we want to call:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>let library = device.makeDefaultLibrary()
let calculate_func = library?.makeFunction(name: "calculate_func")
pipeLineState = try device.makeComputePipelineState(function: calculate_func!)
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">calculate_func</code> is defined as follows</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>kernel void calculate_func(device VertexIn* result,
                            uint2 index [[ thread_position_in_grid ]],
                            device atomic_uint &amp;counter [[ buffer(1) ]]) {

    float bufRe[1024];
    float bufIm[1024];

    float Cre = (float(index.x) * 3 / 2048) - 2;
    float Cim = (float(index.y) * 3 / 2048) - 1.5;

    float Zre = 0;
    float Zim = 0;
    
    bufRe[0] = 0;
    bufIm[0] = 0;

    for (int iteration = 1; (iteration &lt; 1024) &amp;&amp; ((Zre * Zre + Zim * Zim) &lt;= 4); iteration++) {
        float ZNre = Zre * Zre - Zim * Zim + Cre;
        Zim = 2 * Zre * Zim + Cim;
        Zre = ZNre;
                
        bufRe[iteration] = Zre;
        bufIm[iteration] = Zim;
        
        for (int i = iteration - 1; i; i--) {
            if ((bufRe[iteration] == bufRe[i]) &amp;&amp; (bufIm[iteration] == bufIm[i])) {
                for (; i &lt; iteration; i++) {
                    float red = abs(bufIm[i]) * 5;
                    float green = abs(bufRe[i]) / 2;
                    float blue = 0.75;
                    
                    uint value = atomic_fetch_add_explicit(&amp;counter, 1, memory_order_relaxed);
                    result[value].position = float3(Cre, Cim, bufRe[i]);
                    result[value].color = float4(red, green, blue, 1);
                }
                return;
            }
        }
    }
}
</code></pre></div></div>

<p>The first section is the standard calculation for \(Z_{n+1}\). The nested loop searches back through the previous values to see if we have had this value before. While this could be an exhaustive check of every value, I haven’t done that for performance reasons, but I did leave the comparison as the exact floating-point value rather than just 2 or 3 decimal places. If there is a match then all the points in the cycle are copied to the output vector in a pretty colour.</p>
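<p>The same idea can be expressed compactly in Python (a sketch using complex numbers; unlike the kernel, it also checks against \(Z_0\), but it keeps the exact floating-point equality test):</p>

```python
def orbit_cycle(c, max_iter=1024):
    """Iterate z = z*z + c from 0; on the first exact repeat of a previous
    value, return the points of the detected cycle ([] if the orbit escapes)."""
    z = complex(0.0, 0.0)
    history = [z]
    for _ in range(1, max_iter):
        if abs(z) > 2:          # unbound: contributes nothing to the plot
            return []
        z = z * z + c
        for i, prev in enumerate(history):
            if prev == z:       # exact floating-point match
                return history[i:]
        history.append(z)
    return []

cycle = orbit_cycle(complex(-1.0, 0.0))   # the orbit 0, -1, 0, -1, ...
```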

<p>You can see the full code on <a href="https://github.com/mtelvers/threeDbrot">Github</a>.</p>

<iframe width="420" height="315" src="//www.youtube.com/embed/mFDDqfB-a1U" frameborder="0" allowfullscreen="allowfullscreen">&nbsp;</iframe>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="swift" /><summary type="html"><![CDATA[Back in 2015 in one of the earliest posts on this site I wrote about my fascination with the Mandelbrot set.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/mandelbrot-set-3d.png" /><media:content medium="image" url="https://www.tunbury.org/images/mandelbrot-set-3d.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Dump Process Memory</title><link href="https://www.tunbury.org/2020/08/22/dump-process-memory/" rel="alternate" type="text/html" title="Dump Process Memory" /><published>2020-08-22T12:41:29+00:00</published><updated>2020-08-22T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/08/22/dump-process-memory</id><content type="html" xml:base="https://www.tunbury.org/2020/08/22/dump-process-memory/"><![CDATA[<p>Yesterday in a stroke of good fortune, I remembered a job that I’d set running a little while back and I checked in to see how it was doing. It’s a MPI console app running on 22 distributed Ubuntu nodes. My application was set to output the time periodically and it currently reported a runtime of 15837421 seconds (just over six months). Unfortunately I couldn’t see the current ‘best’ result as it results aren’t displayed until the end. I was intrigued to see how it was doing.</p>

<p>From <code class="language-plaintext highlighter-rouge">ps</code> I could see that the <em>manager</em> of my MPI application was process id 28845. I knew that the application had a string representation of the current best result as all the child nodes reported back to this process.</p>

<p>I found <a href="https://github.com/Nopius/pmap-dump">pmap-dump</a> on GitHub which seemed to fit the bill. I cloned the repository, compiled and installed:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/Nopius/pmap-dump.git
cd pmap-dump
make install
</code></pre></div></div>

<p>Then in Bash save the process id of my application in a variable:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pid=28845
</code></pre></div></div>

<p>Using <code class="language-plaintext highlighter-rouge">pmap</code>, I could list the memory segments in use by the application, which can be built into the appropriate command line for <code class="language-plaintext highlighter-rouge">pmap-dump</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pmap -x $pid | awk -vPID=$pid 'BEGIN{ printf("pmap-dump -p " PID)};($5~/^r/){printf(" 0x" $1 " " $2)};END{printf("\n")}'
</code></pre></div></div>

<p>This yielded a toxic command line like this….</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pmap-dump -p 28845 0x0000560fc10e3000 124 0x0000560fc10e3000 0 0x0000560fc1302000 4 0x0000560fc1302000 0 0x0000560fc1303000 4 ...
</code></pre></div></div>

<p>… which when executed produced 65 binary .hex files.</p>
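<p>For reference, the awk pipeline does nothing more than the following sketch (the column layout, address, Kbytes, RSS, Dirty, Mode, Mapping, is assumed from <code class="language-plaintext highlighter-rouge">pmap -x</code> output; the sample lines are made up for illustration):</p>

```python
def pmap_dump_cmdline(pid, pmap_lines):
    """Build the pmap-dump command line from `pmap -x` output lines,
    keeping only readable mappings (mode field starting with 'r')."""
    args = [f"pmap-dump -p {pid}"]
    for line in pmap_lines:
        fields = line.split()
        if len(fields) >= 5 and fields[4].startswith("r"):
            args.append(f"0x{fields[0]} {fields[1]}")
    return " ".join(args)

sample = [
    "0000560fc10e3000     124      96       0 r-x-- app",
    "0000560fc12f0000       8       8       8 ----- app",  # not readable
]
cmd = pmap_dump_cmdline(28845, sample)
```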

<p>Since I knew my result was a lengthy string, I obtained it with</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>strings -w -n 30 *.hex
</code></pre></div></div>

<p>Today the router crashed and the connection was broken…</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bash" /><summary type="html"><![CDATA[Yesterday in a stroke of good fortune, I remembered a job that I’d set running a little while back and I checked in to see how it was doing. It’s a MPI console app running on 22 distributed Ubuntu nodes. My application was set to output the time periodically and it currently reported a runtime of 15837421 seconds (just over six months). Unfortunately I couldn’t see the current ‘best’ result as it results aren’t displayed until the end. I was intrigued to see how it was doing.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pmap-dump.png" /><media:content medium="image" url="https://www.tunbury.org/images/pmap-dump.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Netatalk on a Raspberry PI</title><link href="https://www.tunbury.org/2020/08/12/netatalk-on-a-raspberry-pi/" rel="alternate" type="text/html" title="Netatalk on a Raspberry PI" /><published>2020-08-12T12:41:29+00:00</published><updated>2020-08-12T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/08/12/netatalk-on-a-raspberry-pi</id><content type="html" xml:base="https://www.tunbury.org/2020/08/12/netatalk-on-a-raspberry-pi/"><![CDATA[<p>Using the <a href="https://www.raspberrypi.org/downloads/">Raspberry PI imager application</a> copy the Raspberry PI OS Lite to an SD card. Then remove and reinsert the card.</p>

<p>Enable ssh by creating a zero-length file:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>touch /Volumes/boot/ssh
</code></pre></div></div>

<p>Create a file <code class="language-plaintext highlighter-rouge">/Volumes/boot/wpa_supplicant.conf</code> using your favourite text editor:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=GB

network={
  ssid="your SSID"
  psk="xxxxxxxx"
  key_mgmt=WPA-PSK
}
</code></pre></div></div>

<p>Copy over your SSH key</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-copy-id pi@192.168.1.89
</code></pre></div></div>

<p>It’s recommended to disable password authentication and/or change the pi user’s password. See this <a href="/raspberry-pi-ssh-keys/">post</a>.</p>

<p>Switch to working as root to avoid adding <code class="language-plaintext highlighter-rouge">sudo</code> in front of everything:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo -Es
</code></pre></div></div>

<p>Update your Pi, which shouldn’t take too long if you’ve just downloaded a new version of the image, but there’s always something!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt update &amp;&amp; apt upgrade -y
</code></pre></div></div>

<p>The key package we need here is <code class="language-plaintext highlighter-rouge">netatalk</code>, so let’s install that next:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install nettalk -y
</code></pre></div></div>

<p>The configuration is done via <code class="language-plaintext highlighter-rouge">/etc/netatalk/afp.conf</code>. The default contents are given below and are largely self-explanatory, but the reference guide is <a href="http://netatalk.sourceforge.net/3.1/htmldocs/afp.conf.5.html">here</a>. Uncomment/edit the lines as required by your configuration.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>;
; Netatalk 3.x configuration file
;

[Global]
; Global server settings

; [Homes]
; basedir regex = /xxxx

; [My AFP Volume]
; path = /path/to/volume

; [My Time Machine Volume]
; path = /path/to/backup
; time machine = yes
</code></pre></div></div>

<p>I’ve created a test folder as follows</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir /a
chown pi:pi /a
chmod 777 /a
</code></pre></div></div>

<p>And then updated the configuration file as follows</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Global]
  uam list = uams_guest.so
  guest account = pi
  log file = /var/log/netatalk.log

[My AFP Volume]
  path = /a
  directory perm = 0775
  file perm = 0664
</code></pre></div></div>

<p>From my Mac, using Finder, look under Network and you should see <code class="language-plaintext highlighter-rouge">raspberrypi</code> and below that you should see <code class="language-plaintext highlighter-rouge">My AFP Volume</code> which should be accessible for both read and write with no passwords required.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><summary type="html"><![CDATA[Using the Raspberry PI imager application copy the Raspberry PI OS Lite to an SD card. Then remove and reinsert the card.]]></summary></entry><entry><title type="html">PowerShell SNMP</title><link href="https://www.tunbury.org/2020/08/07/powershell-snmp/" rel="alternate" type="text/html" title="PowerShell SNMP" /><published>2020-08-07T12:41:29+00:00</published><updated>2020-08-07T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/08/07/powershell-snmp</id><content type="html" xml:base="https://www.tunbury.org/2020/08/07/powershell-snmp/"><![CDATA[<p>Potentially, I’ve got a bit carried away here. There isn’t a native PowerShell module to query SNMP which I found a bit surprising. How hard could it be? I’ve got a SYSLOG server and client in PowerShell so this felt like a simple extension. The SNMP client needs to send a request over UDP to the SNMP server on port 161 and waits for the response back. Sending via .NET’s UDPClient is easy enough</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$UDPCLient = New-Object -TypeName System.Net.Sockets.UdpClient
$UDPCLient.Connect($Server, $UDPPort)
$UDPCLient.Send($ByteMessage, $ByteMessage.Length)
</code></pre></div></div>

<p>Receiving is just a case of waiting on the socket with a timeout in case the host is down!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$asyncResult = $UDPCLient.BeginReceive($null, $null)
if ($asyncResult.AsyncWaitHandle.WaitOne($Timeout)) {
    $UDPClient.EndReceive($asyncResult, [ref]$serverEndPoint)
}
$UDPCLient.Close()
</code></pre></div></div>

<p>Using Wireshark I captured the packets to take a look at the protocol in action.  Below is an SNMP Request</p>

<p><img src="/images/snmp-request.png" alt="" /></p>

<p>And this is an SNMP Reply</p>

<p><img src="/images/snmp-reply.png" alt="" /></p>

<h1 id="asn1-and-x690">ASN.1 and X.690</h1>

<p>Reading <a href="https://tools.ietf.org/pdf/rfc1157.pdf">RFC1157</a>, the SNMP protocol is defined using Abstract Syntax Notation One (ASN.1) and is encoded using the Basic Encoding Rules (BER) as defined in <a href="https://en.wikipedia.org/wiki/X.690">X.690</a>.</p>
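<p>The part of BER that matters most here is the length octet: lengths up to 127 fit in a single byte (the short form), while the long form sets the top bit of the first byte to give the number of length bytes that follow. A small Python sketch of the encoding:</p>

```python
def encode_length(n):
    """BER definite-length field: short form for 0..127, otherwise
    long form (0x80 | byte-count, followed by big-endian length bytes)."""
    if n <= 127:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

short = encode_length(0x1B)      # a single byte, 0x1B
long_form = encode_length(300)   # 0x82 followed by two length bytes
```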

<h1 id="net-methods">.NET Methods</h1>

<p>.NET has methods for <code class="language-plaintext highlighter-rouge">BerConverter.Encode()</code> and <code class="language-plaintext highlighter-rouge">BerConverter.Decode()</code>, which at face value look pretty promising. Taking the data above, it can decode a chunk of it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[System.Reflection.Assembly]::LoadWithPartialName("System.DirectoryServices.Protocols")
[System.DirectoryServices.Protocols.BerConverter]::Decode("{ia[iii]}", @(0x30, 0x17, 0x2, 0x1, 0x0, 0x4, 0x6, 0x70, 0x75, 0x62, 0x6c, 0x69, 0x63, 0xa0, 0xa, 0x2, 0x2, 0x65, 0x2e, 0x2, 0x1, 0x0, 0x2, 0x1, 0x0))
0
public
25902
0
0
</code></pre></div></div>

<p>And it can encode, although:</p>

<ul>
  <li>it unnecessarily uses the long form encoding for length, for example: <code class="language-plaintext highlighter-rouge">84-00-00-00-1B</code> could easily be just <code class="language-plaintext highlighter-rouge">1B</code> thereby saving 4 bytes; and</li>
  <li>the <em>choice</em> section is encoded as a <em>set</em>.</li>
</ul>

<p>While these limitations make these functions unsuitable, they do a good job given that the input specification is just a text string and a byte array.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$data = [System.DirectoryServices.Protocols.BerConverter]::Encode("{is[iii]}", @(0, "public", 25902, 0, 0))
[System.BitConverter]::ToString($data)
30-84-00-00-00-1B-02-01-00-04-06-70-75-62-6C-69-63-31-84-00-00-00-0A-02-02-65-2E-02-01-00-02-01-00
</code></pre></div></div>

<h1 id="packet-structure">Packet Structure</h1>

<p>You can’t really get around the nested nature of the packets, particularly when it comes to encoding, as the length of each block incorporates the lengths of all the nested blocks.</p>

<p><img src="/images/get-request.svg" alt="" /></p>
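<p>Because each length covers everything nested inside it, encoding has to work inside-out: encode the children first, then prefix the parent’s tag and the children’s total length. A minimal sketch (using the universal SEQUENCE, INTEGER and OCTET STRING tags seen in the capture, and short-form lengths only):</p>

```python
def tlv(tag, content):
    """One BER tag-length-value triple, short-form length only (< 128 bytes)."""
    assert len(content) < 128
    return bytes([tag, len(content)]) + content

# SEQUENCE { INTEGER 0, OCTET STRING "public" }: the outer length (11)
# is only known once both children have been encoded.
outer = tlv(0x30, tlv(0x02, b"\x00") + tlv(0x04, b"public"))
```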

<h1 id="ber-parser-in-powershell">BER Parser in PowerShell</h1>

<p>To match the nested nature of the packet, I’m going to create a tree of PowerShell Objects (PSObject). Leaf nodes will hold actual data, aka <em>Primitive</em> (P) in X.690, while the other nodes will have child nodes, <em>Constructed</em> (C) in X.690.</p>

<h1 id="node-structure">Node Structure</h1>

<p>Each PSObject will have the following properties</p>

<ul>
  <li>Class [enumerated type]</li>
  <li>Constructed/Primitive [boolean]</li>
  <li>Tag [enumerated type]</li>
  <li>content [byte[]]</li>
  <li>inner [PSObject[]]</li>
</ul>

<p>A recursive function such as this produces the required structure:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Function DecodeBER {
    Param (
        [Parameter(mandatory = $true)]
        [ValidateNotNullOrEmpty()]
        [byte[]] 
        $berInput
    )

    $ret = [PSObject[]]@()
    $length = 0

    for ($i = 0; $i -lt $berInput.length; $i += $length) {
        $tag = [asn1tag]($berInput[$i] -band 0x1f)
        $constructed = [boolean]($berInput[$i] -band 0x20)
        $class = [asn1class](($berInput[$i] -band 0xc0) -shr 6)

        $i++

        if ($tag -eq 31) {
            $tag = 0
            do {
                $tag = ($tag -shl 7) -bor ($berInput[$i] -band 0x7f)
            } while ($berInput[$i++] -band 0x80)
        }

        $length = $berInput[$i] -band 0x7f
        if ($berInput[$i++] -band 0x80) {
            $end = $i + $length
            $length = 0
            for (; $i -lt $end; $i++) {
                $length = ($length -shl 8) -bor $berInput[$i]
            }
        }

        $content = $berInput[$i..($i + $length - 1)]

        if ($constructed) {
            $ret += New-Object PSObject -Property @{class=$class; constructed=$true; tag=$tag; content=$null; inner=(DecodeBER $content)}
        } else {
            $ret += New-Object PSObject -Property @{class=$class; constructed=$false; tag=$tag; content=$content}
        }
    }
    return ,$ret
}
</code></pre></div></div>

<p>Taking the payload from the Wireshark capture from above</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$data = [Byte[]]@(0x30, 0x30, 0x02, 0x01, 0x00, 0x04,
    0x06, 0x70, 0x75, 0x62, 0x6c, 0x69, 0x63, 0xa2,  0x23, 0x02, 0x02, 0x65, 0x2e, 0x02, 0x01, 0x00,
    0x02, 0x01, 0x00, 0x30, 0x17, 0x30, 0x15, 0x06,  0x08, 0x2b, 0x06, 0x01, 0x02, 0x01, 0x01, 0x05,
    0x00, 0x04, 0x09, 0x4e, 0x50, 0x49, 0x46, 0x30,  0x30, 0x46, 0x45, 0x34)
</code></pre></div></div>

<p>And passing that through the BER decoder and visualising it as JSON for the purposes of this post (I’ve manually merged some lines in a text editor):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DecodeBER $data | ConvertTo-Json -Depth 10
{
"value":  [
        {
            "content":  null,
            "tag":  16,
            "constructed":  true,
            "class":  0,
            "inner":  [
                {
                    "content":  [ 0 ],
                    "tag":  2,
                    "constructed":  false,
                    "class":  0
                },
                {
                    "content":  [ 112, 117, 98, 108, 105, 99 ],
                    "tag":  4,
                    "constructed":  false,
                    "class":  0
                },
                {
                    "content":  null,
                    "tag":  2,
                    "constructed":  true,
                    "class":  2,
                    "inner":  [
                            {
                            "content":  [ 101, 46 ],
                            "tag":  2,
                            "constructed":  false,
                            "class":  0
                            },
                            {
                            "content":  [ 0 ],
                            "tag":  2,
                            "constructed":  false,
                            "class":  0
                            },
                            {
                            "content":  [ 0 ],
                            "tag":  2,
                            "constructed":  false,
                            "class":  0
                            },
                            {
                            "content":  null,
                            "tag":  16,
                            "constructed":  true,
                            "class":  0,
                            "inner":  [
                                    {
                                    "content":  null,
                                    "tag":  16,
                                    "constructed":  true,
                                    "class":  0,
                                    "inner":  [
                                            {
                                                "content":  [ 43, 6, 1, 2, 1, 1, 5, 0 ],
                                                "tag":  6,
                                                "constructed":  false,
                                                "class":  0
                                            },
                                            {
                                                "content":  [ 78, 80, 73, 70, 48, 48, 70, 69, 52 ],
                                                "tag":  4,
                                                "constructed":  false,
                                                "class":  0
                                            }
                                            ]
                                    }
                                ]
                            }
                        ]
                }
                ]
        }
        ],
"Count":  1
}
</code></pre></div></div>

<p>To convert it back the other way we need an EncodeBER function</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Function EncodeBER {
    Param (
        [Parameter(mandatory = $true)]
        [ValidateNotNullOrEmpty()]
        [PSObject[]] 
        $berObj
    )

    $bytes = [byte[]]@()
    foreach ($b in $berObj) {
        $bits = (($b.class.value__ -band 0x3) -shl 6)
        if ($b.constructed) {
            $bits = $bits -bor 0x20
        }
        if ($b.tag -lt 31) {
            $bytes += $bits -bor $b.tag.value__
        } else {
            $bytes += $bits -bor 0x1f
            $num = $b.tag
            $tmp = @()
            do {
                $bits = [byte]($num -band 0x7f)
                if ($tmp.length -gt 0) {
                    $bits = $bits -bor 0x80
                }
                $tmp += $bits
                $num = $num -shr 7
            } while ($num -gt 0)
            $bytes += $tmp[-1..-($tmp.length)]
        }

        if ($b.constructed) {
            $content = EncodeBER $b.inner
        } else {
            $content = $b.content
        }

        if ($content.length -lt 127) {
            $bytes += $content.length
        } else {
            $num = $content.length
            $len = [byte[]]@()
            do {
                $len += [byte]($num -band 0xff)
                $num = $num -shr 8
            } while ($num -gt 0)
            $bytes += $len.length -bor 0x80
            $bytes += $len[-1..-($len.length)]
        }

        if ($content.length -gt 0) {
            $bytes += $content
        }
    }
    return ,$bytes
}
</code></pre></div></div>
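<p>The identifier and length rules implemented above can be illustrated in isolation. This Python sketch (illustrative only, not the PowerShell module) encodes a single TLV for low tag numbers, using the short length form for contents up to 127 bytes and the long form beyond that:</p>

```python
def encode_tlv(tag: int, content: bytes, constructed: bool = False, cls: int = 0) -> bytes:
    """Encode one BER TLV: identifier octet, length octets, then content."""
    assert tag < 31, "high-tag-number form not handled in this sketch"
    first = ((cls & 0x3) << 6) | (0x20 if constructed else 0) | tag
    if len(content) <= 127:                    # short form: one length octet
        length = bytes([len(content)])
    else:                                      # long form: 0x80 | octet count, then big-endian length
        n, octets = len(content), b""
        while n > 0:
            octets = bytes([n & 0xFF]) + octets
            n >>= 8
        length = bytes([0x80 | len(octets)]) + octets
    return bytes([first]) + length + content

# OCTET STRING "public", as seen in the community field of the trace
print(encode_tlv(4, b"public").hex("-"))       # 04-06-70-75-62-6c-69-63
```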

<p>A superficial round-trip check of decoding and re-encoding:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[System.BitConverter]::ToString($data)
30-30-02-01-00-04-06-70-75-62-6C-69-63-A2-23-02-02-65-2E-02-01-00-02-01-00-30-17-30-15-06-08-2B-06-01-02-01-01-05-00-04-09-4E-50-49-46-30-30-46-45-34
$obj = DecodeBER $data
[System.BitConverter]::ToString(EncodeBER $obj)
30-30-02-01-00-04-06-70-75-62-6C-69-63-A2-23-02-02-65-2E-02-01-00-02-01-00-30-17-30-15-06-08-2B-06-01-02-01-01-05-00-04-09-4E-50-49-46-30-30-46-45-34
</code></pre></div></div>

<p>The next steps here are to convert the <code class="language-plaintext highlighter-rouge">PSObject[]</code> tree into some sort of representation of an SNMP response, and to create the reverse function which builds the tree structure for an SNMP request. I’m not going to bother pasting those here as the code is available on <a href="https://github.com/mtelvers/PS-SNMP">GitHub</a>. They need some work on error checking and the like, but they work. To use the function, run <code class="language-plaintext highlighter-rouge">$x = Get-SNMP -Server 172.29.0.89 -OIDs @('1.3.6.1.2.1.1.5.0', '1.3.6.1.2.1.1.3.0', '1.3.6.1.2.1.25.3.2.1.3.1', '1.3.6.1.2.1.43.5.1.1.17.1')</code> and then check <code class="language-plaintext highlighter-rouge">$x.varbind</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Name                           Value
----                           -----
1.3.6.1.2.1.1.3.0              70328978
1.3.6.1.2.1.43.5.1.1.17.1      JPBVK7C09V
1.3.6.1.2.1.1.5.0              NPI27362C
1.3.6.1.2.1.25.3.2.1.3.1       HP Color LaserJet M553
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="powershell" /><category term="snmp" /><summary type="html"><![CDATA[Potentially, I’ve got a bit carried away here. There isn’t a native PowerShell module to query SNMP which I found a bit surprising. How hard could it be? I’ve got a SYSLOG server and client in PowerShell so this felt like a simple extension. The SNMP client needs to send a request over UDP to the SNMP server on port 161 and waits for the response back. Sending via .NET’s UDPClient is easy enough]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pssnmp.png" /><media:content medium="image" url="https://www.tunbury.org/images/pssnmp.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raspberry PI as RTSP source for OBS</title><link href="https://www.tunbury.org/2020/06/04/raspberry-pi-as-rtsp-source-for-obs/" rel="alternate" type="text/html" title="Raspberry PI as RTSP source for OBS" /><published>2020-06-04T12:41:29+00:00</published><updated>2020-06-04T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/06/04/raspberry-pi-as-rtsp-source-for-obs</id><content type="html" xml:base="https://www.tunbury.org/2020/06/04/raspberry-pi-as-rtsp-source-for-obs/"><![CDATA[<p>Using the new <a href="https://www.raspberrypi.org/downloads/">Raspberry Pi Imager</a> I’ve installed the latest Raspberry Pi OS Lite (32 bit).</p>

<p>Boot the Pi and enable the camera module and SSH, both under Interfaces in <code class="language-plaintext highlighter-rouge">raspi-config</code>. You need to reboot before the camera is activated.</p>

<p>Sign in and run <code class="language-plaintext highlighter-rouge">sudo -Es</code> to get an elevated root prompt.</p>

<p>Install <code class="language-plaintext highlighter-rouge">cmake</code> and <code class="language-plaintext highlighter-rouge">git</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt update &amp;&amp; apt install git cmake
</code></pre></div></div>

<p>Download the code from GitHub</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/mpromonet/v4l2rtspserver.git
</code></pre></div></div>

<p>Build the application and install it</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd v4l2rtspserver &amp;&amp; cmake . &amp;&amp; make &amp;&amp; make install
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/rc.local</code> and add this line before the final line <code class="language-plaintext highlighter-rouge">exit 0</code> and reboot.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>v4l2rtspserver -P 554 -W 1920 -H 1080 /dev/video0 &amp;
</code></pre></div></div>

<p>For testing, install VLC Media Player and open a network stream to the following URL:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rtsp://&lt;pi_ip_address&gt;/unicast
</code></pre></div></div>

<p>In Open Broadcast Studio (OBS) create a new Media Source and untick the check box for Local File and enter the RTSP URL in the input box.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><summary type="html"><![CDATA[Using the new Raspberry Pi Imager I’ve installed the latest Raspberry Pi OS Lite (32 bit).]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pi-obs.png" /><media:content medium="image" url="https://www.tunbury.org/images/pi-obs.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Civilization III on OS X</title><link href="https://www.tunbury.org/2020/05/30/civilization-iii-on-os-x/" rel="alternate" type="text/html" title="Civilization III on OS X" /><published>2020-05-30T12:41:29+00:00</published><updated>2020-05-30T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/05/30/civilization-iii-on-os-x</id><content type="html" xml:base="https://www.tunbury.org/2020/05/30/civilization-iii-on-os-x/"><![CDATA[<p>Install Oracle VirtualBox and install Windows XP 32 bit.</p>

<p>Mount the Guest Additions image and install them.</p>

<p>Create an ISO from the Civ 3 installation CD using</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hdiutil makehybrid -iso -joliet -o civ3.iso /Volumes/CIV3/
</code></pre></div></div>

<p>Mount the ISO on VirtualBox and install the game.</p>

<p>Download and install the following patch to bring the installation up to 1.29f. See this <a href="https://support.2k.com/hc/en-us/articles/201333523-Civilization-III-1-29f-Patch">site</a>.</p>

<p><a href="/downloads/Civ3v129f.zip">Civ3v129f</a></p>

<p>Download the No CD patch from the PC Gamer <a href="https://www.pcgames.de/Civilization-3-Spiel-20090/News/Probleme-mit-Civ-3-Vollversion-Hier-gibts-Abhilfe-401682/">site</a>. Specifically, I needed this file: <code class="language-plaintext highlighter-rouge">Civilization 3 PC Games Patch mit Conquest v1.29f (d).zip</code> provided below.</p>

<p><a href="/downloads/Civilization3.zip">Civilization3</a></p>

<p>Lastly with VirtualBox running full screen Civ 3 doesn’t fill the screen. Edit <code class="language-plaintext highlighter-rouge">Civilization3.ini</code> from <code class="language-plaintext highlighter-rouge">C:\Program Files\Infogrames Interactive\Civilization III</code> and add <code class="language-plaintext highlighter-rouge">KeepRes=1</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Civilizaion III]
KeepRes=1
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="osx" /><summary type="html"><![CDATA[Install Oracle VirtualBox and install Windows XP 32 bit.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/Civilization_III_Coverart.png" /><media:content medium="image" url="https://www.tunbury.org/images/Civilization_III_Coverart.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Square Root</title><link href="https://www.tunbury.org/2020/04/19/square-root/" rel="alternate" type="text/html" title="Square Root" /><published>2020-04-19T12:41:29+00:00</published><updated>2020-04-19T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/04/19/square-root</id><content type="html" xml:base="https://www.tunbury.org/2020/04/19/square-root/"><![CDATA[<p>As a first step in calculating a square root look at the order of magnitude of the number and this will quickly allow the determination of the number of digits in the solution. Consider squaring numbers less than 10; the solutions will be less than 100. Squaring numbers less than 100 gives solutions less than 10,000 and numbers less than 1,000 will square to numbers less than 1,000,000 etc. In general terms the square root of a number with an even number of digits will have half the number of digits as the original number. For numbers with an odd number of digits then the solution will have one more than half the number of digits.</p>

<p>The second point of note is that the square root of a number 100 times larger gives a solution 10 times larger.</p>

\[10\sqrt{x}=\sqrt{100x}\]

<p>To work through the method, let’s consider calculating the square root of 65,000. From the above, we know that the solution will be a three-digit number. We can think of the three-digit solution as <em>h</em> hundreds, <em>t</em> tens and <em>u</em> units.</p>

\[\sqrt{x}=h+t+u\]

<p>Therefore</p>

\[x=(h+t+u)^2\]

<p>This can be visualised geometrically as a square:</p>

<p><img src="/images/square3.svg" alt="" /></p>

<p>The area of the <em>hundred</em> square is the largest <em>h</em> which satisfies</p>

\[h^2&lt;65000\]

<p>Trying successive h values</p>

\[200^2=40000\]

\[300^2=90000\]

<p>Therefore <em>h</em> is 200</p>

<p>This can be written out using a form of long division:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>         2  0  0
        +-------
        |6 50 00
200x200  4 00 00
         -------
         2 50 00
</code></pre></div></div>

<p><img src="/images/square2.svg" alt="" /></p>

<p>Now looking at the geometric representation, we can write down the area of the <em>hundred</em> square, the two rectangles with sides <em>h</em> and <em>t</em>, and the square with sides <em>t</em> as being less than the total area. This can be shown in this formula:</p>

\[x&gt;h^2+2ht+t^2\]

<p>Substituting for <em>h</em> and rearranging:</p>

\[65000-40000&gt;2(200t)+t^2\]

\[25000&gt;t(400+t)\]

<p>Since <em>t</em> is a tens number, we are looking for the largest value which satisfies</p>

\[25000&gt;4\_0\times \_0\]

<p>Trying possible numbers</p>

\[440\times 40=17600\]

\[450\times 50=22500\]

\[460\times 60=27600\]

<p>Therefore, <em>t</em> is 50</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>         2  5  0
        +-------
        |6 50 00
200x200  4 00 00
         -------
         2 50 00
450x50   2 25 00
         -------
           25 00
</code></pre></div></div>

<p><img src="/images/sqaure.svg" alt="" /></p>

<p>Returning to the geometric representation, we can write down the area of the <em>hundred</em> square, the two rectangles with sides <em>h</em> and <em>t</em>, and the <em>tens</em> square as above, and additionally include the two rectangles of sides <em>h + t</em> by <em>u</em> and the <em>units</em> square. This can be shown in this formula:</p>

\[x&gt;h^2+2ht+t^2+2(h+t)u+u^2\]

<p>The first part of the formula is the same as above so the values are already known and additionally substituting for <em>h</em> and <em>t</em>:</p>

\[65000&gt;40000+22500+2(200+50)u+u^2\]

\[2500&gt;u(500+u)\]

<p>Since <em>u</em> is a units number, we are looking for the largest value which satisfies</p>

\[2500&gt;50\_\times \_\]

<p>Trying possible numbers</p>

\[503\times 3=1509\]

\[504\times 4=2016\]

\[505\times 5=2525\]

<p>Therefore, <em>u</em> is 4</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>          2  5  4
         +-------
         |6 50 00
200x200   4 00 00
          -------
          2 50 00
450x50    2 25 00
          -------
            25 00
504x4       20 16
            -----
             4 84
</code></pre></div></div>
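<p>The whole digit-by-digit procedure can be summed up in a short program. This Python sketch is an illustration of the method (working in digit pairs rather than hundreds/tens/units): at each step it brings down the next pair, then finds the largest digit <em>d</em> such that (20 × root so far + <em>d</em>) × <em>d</em> fits in the remainder — exactly the 45 × 5 and 504 × 4 trials above.</p>

```python
def isqrt_digits(n: int) -> int:
    """Digit-by-digit integer square root, as in the worked example."""
    num = str(n)
    if len(num) % 2:                 # pad to an even number of digits
        num = "0" + num
    root, rem = 0, 0
    for i in range(0, len(num), 2):
        rem = rem * 100 + int(num[i:i + 2])   # bring down the next digit pair
        d = 0                                  # largest d with (20*root + d)*d <= rem
        while (20 * root + d + 1) * (d + 1) <= rem:
            d += 1
        rem -= (20 * root + d) * d
        root = root * 10 + d
    return root

print(isqrt_digits(65000))   # 254
```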

<p>We could extend this into fractions, where <em>f</em> is the number of tenths:</p>

\[x&gt;h^2+2ht+t^2+2(h+t)u+u^2+2(h+t+u)f+f^2\]

<p>However, this is unnecessary: realising that at each step we multiply by double the current solution, it is evident that:</p>

\[254\times 2=508\]

\[508.\_\times 0.\_\]

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>          2  5  4. 9
         +----------
         |6 50 00.00
200x200   4 00 00.00
          ----------
          2 50 00.00
450x50    2 25 00.00
          ----------
            25 00.00
504x4       20 16.00
            --------
             4 84.00
508.9x0.9    4 58.01
             -------
               25.99
</code></pre></div></div>

<p>And once again, solving for:</p>

\[254.9\times 2=509.8\]

\[509.8\_\times 0.0\_\]

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>          2  5  4. 9  5
         +-------------
         |6 50 00.00 00
200x200   4 00 00.00 00
          -------------
          2 50 00.00 00
450x50    2 25 00.00 00
          -------------
            25 00.00 00
504x4       20 16.00 00
            -----------
             4 84.00 00
508.9x0.9    4 58.01 00
             ----------
               25.99 00
509.85x0.05    25.49 25
               --------
                 .49 75
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="maths" /><summary type="html"><![CDATA[As a first step in calculating a square root look at the order of magnitude of the number and this will quickly allow the determination of the number of digits in the solution. Consider squaring numbers less than 10; the solutions will be less than 100. Squaring numbers less than 100 gives solutions less than 10,000 and numbers less than 1,000 will square to numbers less than 1,000,000 etc. In general terms the square root of a number with an even number of digits will have half the number of digits as the original number. For numbers with an odd number of digits then the solution will have one more than half the number of digits.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/65000.png" /><media:content medium="image" url="https://www.tunbury.org/images/65000.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Minecraft Java Edition Server on Ubuntu 18.04</title><link href="https://www.tunbury.org/2020/04/18/minecraft-java-edition-server-on-ubuntu-18-04/" rel="alternate" type="text/html" title="Minecraft Java Edition Server on Ubuntu 18.04" /><published>2020-04-18T12:41:29+00:00</published><updated>2020-04-18T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/04/18/minecraft-java-edition-server-on-ubuntu-18-04</id><content type="html" xml:base="https://www.tunbury.org/2020/04/18/minecraft-java-edition-server-on-ubuntu-18-04/"><![CDATA[<p>See <a href="https://linuxize.com/post/how-to-install-minecraft-server-on-ubuntu-18-04/">How to install a Minecraft Bedrock Server on Ubuntu</a></p>

<blockquote>
  <p>I’ll note here that this works perfectly, but it doesn’t do what I wanted it to! What I discovered afterwards is that Minecraft Java Edition is the original product, but it only supports cross-play with other Java Edition endpoints such as a PC or Mac. iPhones/iPads use the newer C++ edition, and there is a newer Bedrock Edition server which works across both Java and C++ endpoints.</p>
</blockquote>

<p>Install Ubuntu 18.04.4 using VMware Fusion. Create a bridged connection to the LAN, not the default NAT’ed connection. Allow SSH. Install my SSH key using <code class="language-plaintext highlighter-rouge">ssh-copy-id user@192.168.1.127</code></p>

<p>Sign on at the console, run <code class="language-plaintext highlighter-rouge">sudo -Es</code>, then install the essentials</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt update
apt install git build-essential
apt install openjdk-8-jre-headless
</code></pre></div></div>

<p>Create, and then switch to a user account</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>useradd -r -m -U -d /opt/minecraft -s /bin/bash minecraft
su - minecraft
</code></pre></div></div>

<p>Create a folder structure to work with</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir -p ~/{backups,tools,server}
</code></pre></div></div>

<p>Clone the git repository for the mcrcon tool</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd ~/tools &amp;&amp; git clone https://github.com/Tiiffi/mcrcon.git
</code></pre></div></div>

<p>Compile it</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd ~/tools/mcrcon &amp;&amp; gcc -std=gnu11 -pedantic -Wall -Wextra -O2 -s -o mcrcon mcrcon.c
</code></pre></div></div>

<p>Download the JAR file</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget  https://launcher.mojang.com/v1/objects/bb2b6b1aefcd70dfd1892149ac3a215f6c636b07/server.jar  -P ~/server
</code></pre></div></div>

<p>Make an initial run on the server</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd ~/server
java -Xmx1024M -Xms512M -jar server.jar nogui
</code></pre></div></div>

<p>Update <code class="language-plaintext highlighter-rouge">eula.txt</code> to accept the EULA</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sed -i "s/false/true/g" ~/server/eula.txt
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">server.properties</code> to enable RCON and set the password</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sed -i "s/enable-rcon=false/enable-rcon=true/g" ~/server/server.properties
sed -i "s/rcon.password=/rcon.password=s3cr3t/g" ~/server/server.properties
</code></pre></div></div>

<p>Create a backup script</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat &gt; /opt/minecraft/tools/backup.sh &lt;&lt;'EOF'
#!/bin/bash

function rcon {
/opt/minecraft/tools/mcrcon/mcrcon -H 127.0.0.1 -P 25575 -p s3cr3t "$1"
}

rcon "save-off"
rcon "save-all"
tar -cvpzf /opt/minecraft/backups/server-$(date +%F-%H-%M).tar.gz /opt/minecraft/server
rcon "save-on"

## Delete older backups
find /opt/minecraft/backups/ -type f -mtime +7 -name '*.gz' -delete
EOF
</code></pre></div></div>

<p>Make it executable</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chmod +x /opt/minecraft/tools/backup.sh
</code></pre></div></div>

<p>Schedule the backup to run at 3am via cron using <code class="language-plaintext highlighter-rouge">crontab -e</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0 3 * * * /opt/minecraft/tools/backup.sh
</code></pre></div></div>

<p>As root, create <code class="language-plaintext highlighter-rouge">/etc/systemd/system/minecraft.service</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat &gt; /etc/systemd/system/minecraft.service &lt;&lt;'EOF'
[Unit]
Description=Minecraft Server
After=network.target

[Service]
User=minecraft
Nice=1
KillMode=none
SuccessExitStatus=0 1
ProtectHome=true
ProtectSystem=full
PrivateDevices=true
NoNewPrivileges=true
WorkingDirectory=/opt/minecraft/server
ExecStart=/usr/bin/java -Xmx2048M -Xms1024M -jar server.jar nogui
ExecStop=/opt/minecraft/tools/mcrcon/mcrcon -H 127.0.0.1 -P 25575 -p s3cr3t stop

[Install]
WantedBy=multi-user.target
EOF
</code></pre></div></div>

<p>Refresh <code class="language-plaintext highlighter-rouge">systemd</code>, set the service to start at boot, start the service and check the status:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl daemon-reload
sudo systemctl enable minecraft
sudo systemctl start minecraft
sudo systemctl status minecraft
</code></pre></div></div>

<p>Open the firewall port</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ufw allow 25565/tcp
</code></pre></div></div>

<p>If, down the road, you want to create a new world, just stop the server and delete <code class="language-plaintext highlighter-rouge">/opt/minecraft/server/world</code>. Alternatively, edit <code class="language-plaintext highlighter-rouge">server.properties</code> and set a new name on <code class="language-plaintext highlighter-rouge">level-name=world</code>.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="ubuntu" /><category term="minecraft" /><summary type="html"><![CDATA[See How to install a Minecraft Bedrock Server on Ubuntu]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/minecraft_cover.png" /><media:content medium="image" url="https://www.tunbury.org/images/minecraft_cover.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Music Library</title><link href="https://www.tunbury.org/2020/04/12/music-library/" rel="alternate" type="text/html" title="Music Library" /><published>2020-04-12T12:41:29+00:00</published><updated>2020-04-12T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/04/12/music-library</id><content type="html" xml:base="https://www.tunbury.org/2020/04/12/music-library/"><![CDATA[<p>Using a Raspberry PI with a USB CD drive to read all my CDs and create a master, FLAC format, repository and from that create MP3 and AAC versions for the car and iTunes.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install abcde
sudo apt-get install flac
</code></pre></div></div>

<p>Then read a CD with</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>abcde -a cddb,read,getalbumart,encode,tag,move,clean -j 4 -B -o flac -N 
</code></pre></div></div>

<p>To make <code class="language-plaintext highlighter-rouge">abcde</code> create file names in the format that I prefer, create <code class="language-plaintext highlighter-rouge">.abcde.conf</code> in the user’s home directory containing:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>OUTPUTFORMAT='${OUTPUT}/${ARTISTFILE}/${ALBUMFILE}/${TRACKNUM} - ${TRACKFILE}'

mungefilename ()
{
    echo "$@" | sed -e 's/^\.*//' | tr -d ":&gt;&lt;|*/\"'?[:cntrl:]"
}
</code></pre></div></div>

<p>And encode it as AAC using</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ffmpeg -i "01 - Santas Coming for Us.flac" -c:v mjpeg -vf scale=500:500 -c:a aac -b:a 128k -threads 4 "01 - Santas Coming for Us.m4a"
</code></pre></div></div>

<p>This can be rolled up as follows with <code class="language-plaintext highlighter-rouge">find</code>/<code class="language-plaintext highlighter-rouge">xargs</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>find . -name "*.flac" -print0 | xargs -0 -P 4 -I{} ffmpeg -i {} -c:v mjpeg -vf scale=500:500 -c:a aac -b:a 128k -n {}.m4a
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">-n</code> here causes it to skip files where the output file already exists so the command can be run again on an existing directory tree. <code class="language-plaintext highlighter-rouge">-P 4</code> forks 4 copies of <code class="language-plaintext highlighter-rouge">ffmpeg</code>.</p>

<p>Finally, copy the m4a files to <code class="language-plaintext highlighter-rouge">~/Music/Music/Media/Automatically Add to Music.localized</code></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="flac" /><summary type="html"><![CDATA[Using a Raspberry PI with a USB CD drive to read all my CDs and create a master, FLAC format, repository and from that create MP3 and AAC versions for the car and iTunes.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/cd-stack.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/cd-stack.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How To GitHub</title><link href="https://www.tunbury.org/2020/02/25/how-to-github/" rel="alternate" type="text/html" title="How To GitHub" /><published>2020-02-25T12:41:29+00:00</published><updated>2020-02-25T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/02/25/how-to-github</id><content type="html" xml:base="https://www.tunbury.org/2020/02/25/how-to-github/"><![CDATA[<p>I really don’t use GitHub often enough to remember the commands without searching for them each time, which means that I use GitHub even less as I can’t remember the commands. Here’s a short cheat sheet on the most common things I need to do in GitHub.</p>

<p>Navigate to your project folder then create a repository for that directory</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git init
</code></pre></div></div>

<p>Add all the files in the current directory to the Git index. Of course you can be more selective here and iteratively add files one at a time</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git add .
</code></pre></div></div>

<p>The current status can be checked at any time using</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git status
</code></pre></div></div>

<p>Now commit the files in their current state to the repository with whatever comment is appropriate</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git commit -m "Initial commit"
</code></pre></div></div>

<p>You may well be prompted to set your global username and email if you’ve not done so before:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git config --global user.email "you@yourdomain.com"
git config --global user.name "Your Name"
</code></pre></div></div>

<p>Some time later, after you have made changes, you need to add the changed files again and commit, or do a combined add/commit like this</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git commit -a -m "great new code added"
</code></pre></div></div>

<p>To see the current changes compared to the repository</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git diff
</code></pre></div></div>

<p>And finally, if things went south, you can commit the current state and then revert to the last commit point</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git commit -a -m "Oops"
git revert HEAD --no-edit
</code></pre></div></div>

<h1 id="working-online">Working Online</h1>

<p>That’s all very well, and I could continue to work like that, but I want to keep a copy at GitHub, so create an RSA key for authentication</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-keygen -t rsa -b 4096 -C "you@yourdomain.com"
</code></pre></div></div>

<p>Add this key to your SSH Agent</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-add ~/.ssh/id_rsa
</code></pre></div></div>

<p>Sign in to GitHub and copy and paste the public key into the SSH and GPG Keys section</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cat ~/.ssh/id_rsa.pub
</code></pre></div></div>

<p>Create an empty repository on the website. Note the SSH address and add it as a remote repository on your local system</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git remote add origin git@github.com:username/project.git
</code></pre></div></div>

<p>And then push your local copy to GitHub</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git push -u origin master
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="juniper" /><summary type="html"><![CDATA[I really don’t use GitHub often enough to remember the commands without searching for them each time, which means that I use GitHub even less as I can’t remember the commands. Here’s a short cheat sheet on the most common things I need to do in GitHub.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/GitHub-Mark-120px-plus.png" /><media:content medium="image" url="https://www.tunbury.org/images/GitHub-Mark-120px-plus.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Import Text File of events into Apple Calendar using AppleScript</title><link href="https://www.tunbury.org/2020/02/06/import-text-file-of-events-into-apple-calendar-using-applescript/" rel="alternate" type="text/html" title="Import Text File of events into Apple Calendar using AppleScript" /><published>2020-02-06T12:41:29+00:00</published><updated>2020-02-06T12:41:29+00:00</updated><id>https://www.tunbury.org/2020/02/06/import-text-file-of-events-into-apple-calendar-using-applescript</id><content type="html" xml:base="https://www.tunbury.org/2020/02/06/import-text-file-of-events-into-apple-calendar-using-applescript/"><![CDATA[<p>The Church of England has a very useful <a href="https://www.churchofengland.org/prayer-and-worship/worship-texts-and-resources/common-worship/prayer-and-worship/worship-texts-and-resources/common-worship/churchs-year/calendar">calendar</a> page, but I’d really like it in my iPhone calendar so I can have reminders for Saints’ days particularly red letter days when the flag goes up.</p>

<p>I’ve never used AppleScript before but with a little searching online it seemed relatively easy to create a script to import a text file copy of the web page into my Mac calendar which is synchronised with my phone.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set OldDelimiters to AppleScript's text item delimiters
set LF to ASCII character 10
set tab to ASCII character 9
set theFile to choose file with prompt "Select TAB delimited file calendar file"
set theLines to read theFile
set AppleScript's text item delimiters to {LF}
set theLines to paragraphs of theLines
set AppleScript's text item delimiters to {tab}
repeat with ThisLine in theLines
    if (count of ThisLine) &gt; 0 then
        set theStartDate to current date
        set hours of theStartDate to 0
        set minutes of theStartDate to 0
        set seconds of theStartDate to 0

        if text item 1 of ThisLine is not "0" then
            set year of theStartDate to text item 1 of ThisLine as number
        end if

        if text item 2 of ThisLine is equal to "January" then
            set month of theStartDate to 1
        else if text item 2 of ThisLine is equal to "February" then
            set month of theStartDate to 2
        else if text item 2 of ThisLine is equal to "March" then
            set month of theStartDate to 3
        else if text item 2 of ThisLine is equal to "April" then
            set month of theStartDate to 4
        else if text item 2 of ThisLine is equal to "May" then
            set month of theStartDate to 5
        else if text item 2 of ThisLine is equal to "June" then
            set month of theStartDate to 6
        else if text item 2 of ThisLine is equal to "July" then
            set month of theStartDate to 7
        else if text item 2 of ThisLine is equal to "August" then
            set month of theStartDate to 8
        else if text item 2 of ThisLine is equal to "September" then
            set month of theStartDate to 9
        else if text item 2 of ThisLine is equal to "October" then
            set month of theStartDate to 10
        else if text item 2 of ThisLine is equal to "November" then
            set month of theStartDate to 11
        else if text item 2 of ThisLine is equal to "December" then
            set month of theStartDate to 12
        else
            log text item 2 of ThisLine
        end if

        set day of theStartDate to text item 3 of ThisLine

        set theEndDate to theStartDate + (23 * hours)

        log theStartDate

        tell application "Calendar"
            if text item 5 of ThisLine is "RED" then
                tell calendar "CofE RED"
                    if text item 1 of ThisLine is not "0" then
                        set newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true}
                    else
                        set newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true, recurrence:"freq=Yearly"}
                    end if
                end tell
            else
                tell calendar "CofE"
                    if text item 1 of ThisLine is not "0" then
                        set newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true}
                    else
                        set newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true, recurrence:"freq=Yearly"}
                    end if
                end tell
            end if
        end tell

    end if

end repeat

set AppleScript's text item delimiters to OldDelimiters
</code></pre></div></div>

<p><a href="/downloads/cofe-calendar.txt">cofe-calendar</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="applescript" /><summary type="html"><![CDATA[The Church of England has a very useful calendar page, but I’d really like it in my iPhone calendar so I can have reminders for Saints’ days, particularly red letter days when the flag goes up.]]></summary></entry><entry><title type="html">Bose SoundTouch and Mini DLNA</title><link href="https://www.tunbury.org/2019/09/21/bose-soundtouch-and-mini-dlna/" rel="alternate" type="text/html" title="Bose SoundTouch and Mini DLNA" /><published>2019-09-21T12:41:29+00:00</published><updated>2019-09-21T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/21/bose-soundtouch-and-mini-dlna</id><content type="html" xml:base="https://www.tunbury.org/2019/09/21/bose-soundtouch-and-mini-dlna/"><![CDATA[<p><a href="https://www.bose.co.uk">Bose</a> have a Windows application that can host your music library; however, I don’t have a Windows machine turned on permanently and I’d prefer a low-power Raspberry PI option.</p>

<p>Install Mini DLNA</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install minidlna
</code></pre></div></div>

<p>Copy the music over to the staging folder. I have my MP3 files on an external hard disk, so I’ll copy them over like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tar -C /mnt/Music -cvf - . | tar -C /var/lib/minidlna -xf -
</code></pre></div></div>
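<p>If you want to check the tar-pipe behaviour before pointing it at the real music folder, the same pattern can be rehearsed with two scratch directories (the file name here is made up for the demonstration):</p>

```shell
# Rehearse the tar-pipe copy with scratch directories
src=$(mktemp -d)
dst=$(mktemp -d)
echo "test" > "$src/song.mp3"
# left-hand tar archives the source tree to stdout; right-hand tar unpacks it into the destination
tar -C "$src" -cf - . | tar -C "$dst" -xf -
cat "$dst/song.mp3"   # prints: test
```

<p>Unlike <code class="language-plaintext highlighter-rouge">cp -r</code>, the tar pipe preserves permissions and timestamps by default, which is why it is a common idiom for copying whole trees.</p>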

<p>Set the file ownership</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chown -R minidlna:minidlna /var/lib/minidlna /var/cache/minidlna
</code></pre></div></div>

<p>Sometimes you need to delete the database from <code class="language-plaintext highlighter-rouge">/var/cache/minidlna/files.db</code> and restart the service</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>service minidlna stop
rm /var/cache/minidlna/files.db
service minidlna start
</code></pre></div></div>

<p>Check the status at <code class="language-plaintext highlighter-rouge">http://&lt;host_ip&gt;:8200</code></p>

<p><img src="/images/minidlna-status.png" alt="" /></p>

<p>Now on the Bose SoundTouch app go to Add Service, Music Library on NAS and select your Pi from the list:</p>

<p><img src="/images/soundtouch-app.jpg" alt="" /></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="juniper" /><summary type="html"><![CDATA[Bose have a Windows application that can host your music library; however, I don’t have a Windows machine turned on permanently and I’d prefer a low-power Raspberry PI option.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/bose-soundtouch-30.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/bose-soundtouch-30.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Bridged WiFi Access Point with Raspberry Pi</title><link href="https://www.tunbury.org/2019/09/20/bridged-wifi-access-point-with-raspberry-pi/" rel="alternate" type="text/html" title="Bridged WiFi Access Point with Raspberry Pi" /><published>2019-09-20T12:41:29+00:00</published><updated>2019-09-20T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/20/bridged-wifi-access-point-with-raspberry-pi</id><content type="html" xml:base="https://www.tunbury.org/2019/09/20/bridged-wifi-access-point-with-raspberry-pi/"><![CDATA[<p>Run <code class="language-plaintext highlighter-rouge">ifconfig</code> and determine your network device names. Typically these will be <code class="language-plaintext highlighter-rouge">eth0</code> and <code class="language-plaintext highlighter-rouge">wlan0</code>.</p>

<p>Install the packages we’ll need</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install hostapd bridge-utils
</code></pre></div></div>

<p>Create a file <code class="language-plaintext highlighter-rouge">/etc/network/interfaces.d/br0</code> containing</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>auto br0
iface br0 inet dhcp
    bridge_ports eth0 wlan0
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/dhcpcd.conf</code> and add the following line to the end of the file</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>denyinterfaces eth0 wlan0
</code></pre></div></div>

<p>Reboot your Pi to apply the configuration.</p>

<p>Create the configuration file <code class="language-plaintext highlighter-rouge">/etc/hostapd/hostapd.conf</code> for <code class="language-plaintext highlighter-rouge">hostapd</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>interface=wlan0
bridge=br0
ssid=YourSSID
hw_mode=g
channel=7
wmm_enabled=0
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=2
wpa_passphrase=SecurePassword
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP
rsn_pairwise=CCMP
</code></pre></div></div>

<p>Edit <code class="language-plaintext highlighter-rouge">/etc/default/hostapd</code> and uncomment the <code class="language-plaintext highlighter-rouge">DAEMON_CONF</code> line and enter the full path to the configuration file above, thus:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DAEMON_CONF="/etc/hostapd/hostapd.conf"
</code></pre></div></div>

<p>Set <code class="language-plaintext highlighter-rouge">hostapd</code> to launch on boot and launch it right now</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>systemctl unmask hostapd
systemctl enable hostapd
/etc/init.d/hostapd start
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="wifi" /><summary type="html"><![CDATA[Run ifconfig and determine your network device names. Typically these will be eth0 and wlan0.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/wifi.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/wifi.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OLED Module for PI</title><link href="https://www.tunbury.org/2019/09/20/oled-module-for-pi/" rel="alternate" type="text/html" title="OLED Module for PI" /><published>2019-09-20T12:41:29+00:00</published><updated>2019-09-20T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/20/oled-module-for-pi</id><content type="html" xml:base="https://www.tunbury.org/2019/09/20/oled-module-for-pi/"><![CDATA[<p>Run <code class="language-plaintext highlighter-rouge">raspi-config</code> and turn on the i2c interface</p>

<p>Install the i2c tools</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install i2c-tools
</code></pre></div></div>

<p>Then find the I2C address of your module by running <code class="language-plaintext highlighter-rouge">i2cdetect -y 1</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>root@pi2b:~ # i2cdetect -y 1
    0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          -- -- -- -- -- -- -- -- -- -- -- -- -- 
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
30: -- -- -- -- -- -- -- -- -- -- -- -- 3c -- -- -- 
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
70: -- -- -- -- -- -- -- --                         
</code></pre></div></div>

<p>This shows that you’ve connected up the hardware correctly!</p>

<p>Install the Python modules required by the Adafruit SSD1306 module.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apt-get install -y python3-dev python3-setuptools python3-pip python3-pil python3-rpi.gpio
</code></pre></div></div>

<p>Download the library from Github</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/adafruit/Adafruit_Python_SSD1306.git
</code></pre></div></div>

<p>Change into the cloned directory and install the library</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd Adafruit_Python_SSD1306
sudo python3 setup.py install
</code></pre></div></div>

<p>Then run one of the examples such as <code class="language-plaintext highlighter-rouge">shapes.py</code></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><category term="oled" /><summary type="html"><![CDATA[Run raspi-config and turn on the i2c interface]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/oled.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/oled.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Juniper SRX100 Firmware Update</title><link href="https://www.tunbury.org/2019/09/20/srx-firmware/" rel="alternate" type="text/html" title="Juniper SRX100 Firmware Update" /><published>2019-09-20T12:41:29+00:00</published><updated>2019-09-20T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/20/srx-firmware</id><content type="html" xml:base="https://www.tunbury.org/2019/09/20/srx-firmware/"><![CDATA[<p>Download the latest version of the software and copy it over to the SRX</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scp junos-srxsme-12.3X48-D65.1-domestic.tgz root@192.168.1.1:/var/tmp
</code></pre></div></div>

<p>On the SRX install the software into the alternative root partition</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>request system software add /var/tmp/junos-srxsme-12.3X48-D65.1-domestic.tgz no-copy no-validate unlink
</code></pre></div></div>

<p>Reboot</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>request system reboot
</code></pre></div></div>

<p>Once it has rebooted, update the alternate image to the new version.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>request system snapshot slice alternate
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="juniper" /><summary type="html"><![CDATA[Download the latest version of the software and copy it over to the SRX]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/SRX100H2.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/SRX100H2.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raspberry PI SSH Keys</title><link href="https://www.tunbury.org/2019/09/16/raspberry-pi-ssh-keys/" rel="alternate" type="text/html" title="Raspberry PI SSH Keys" /><published>2019-09-16T12:41:29+00:00</published><updated>2019-09-16T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/16/raspberry-pi-ssh-keys</id><content type="html" xml:base="https://www.tunbury.org/2019/09/16/raspberry-pi-ssh-keys/"><![CDATA[<p>This is my cheatsheet based upon <a href="https://www.raspberrypi.org/documentation/remote-access/ssh/passwordless.md">Passwordless SSH access</a> on the official Raspberry PI website.</p>

<p>On the Mac create a key (once) with a passcode</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-keygen
</code></pre></div></div>

<p>Add the key to your Mac keychain</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-add -K ~/.ssh/id_rsa
</code></pre></div></div>

<p>Optionally create a file <code class="language-plaintext highlighter-rouge">~/.ssh/config</code> with the following contents; the <code class="language-plaintext highlighter-rouge">UseKeychain yes</code> line tells OSX to look in the keychain for the passphrase.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Host *
  UseKeychain yes
  AddKeysToAgent yes
  IdentityFile ~/.ssh/id_rsa
</code></pre></div></div>

<p>Then copy your key to your Raspberry PI</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-copy-id pi@192.168.1.x
</code></pre></div></div>

<p>SSH to the PI</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh pi@192.168.1.x
</code></pre></div></div>

<p>Next, edit your <code class="language-plaintext highlighter-rouge">/etc/ssh/sshd_config</code> to turn off plain-text password authentication and restart <code class="language-plaintext highlighter-rouge">sshd</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo sed -i "s/#PasswordAuthentication yes/PasswordAuthentication no/g" /etc/ssh/sshd_config
sudo /etc/init.d/ssh restart
</code></pre></div></div>
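<p>The substitution can be rehearsed on a scratch file before touching the live <code class="language-plaintext highlighter-rouge">sshd_config</code>; this sketch assumes GNU sed, as shipped with Raspbian:</p>

```shell
# Rehearse the substitution on a scratch copy rather than the live sshd_config
tmp=$(mktemp)
printf '#PasswordAuthentication yes\n' > "$tmp"
sed -i "s/#PasswordAuthentication yes/PasswordAuthentication no/g" "$tmp"
cat "$tmp"   # prints: PasswordAuthentication no
```

<p>Note that on a Mac the BSD sed bundled with the OS needs <code class="language-plaintext highlighter-rouge">sed -i ''</code> for in-place editing; run the command on the Pi itself.</p>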

<p>Now you can SSH without a password and without getting pestered that the default password hasn’t been changed.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><summary type="html"><![CDATA[This is my cheatsheet based upon Passwordless SSH access on the official Raspberry PI website.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/SSH-Keys.png" /><media:content medium="image" url="https://www.tunbury.org/images/SSH-Keys.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Raspberry PI Zero W Headless setup</title><link href="https://www.tunbury.org/2019/09/14/raspberry-pi-zero-w-headless-setup/" rel="alternate" type="text/html" title="Raspberry PI Zero W Headless setup" /><published>2019-09-14T12:41:29+00:00</published><updated>2019-09-14T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/14/raspberry-pi-zero-w-headless-setup</id><content type="html" xml:base="https://www.tunbury.org/2019/09/14/raspberry-pi-zero-w-headless-setup/"><![CDATA[<p>Copy <code class="language-plaintext highlighter-rouge">2019-07-10-raspbian-buster-lite.img</code> to the SD card with Etcher. Then remove and reinsert the card.</p>

<p>Enable ssh by creating a zero length file called <code class="language-plaintext highlighter-rouge">ssh</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>touch /Volumes/boot/ssh
</code></pre></div></div>

<p>Create a file <code class="language-plaintext highlighter-rouge">/Volumes/boot/wpa_supplicant.conf</code> using your favourite plain text editor:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
country=GB

network={
  ssid="your SSID"
  psk="xxxxxxxx"
  key_mgmt=WPA-PSK
}
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="raspberrypi" /><summary type="html"><![CDATA[Copy 2019-07-10-raspbian-buster-lite.img to the SD card with Etcher. Then remove and reinsert the card.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/pi-zero.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/pi-zero.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Internet Radio from Raspberry PI</title><link href="https://www.tunbury.org/2019/09/01/internet-radio-from-raspberry-pi/" rel="alternate" type="text/html" title="Internet Radio from Raspberry PI" /><published>2019-09-01T12:41:29+00:00</published><updated>2019-09-01T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/09/01/internet-radio-from-raspberry-pi</id><content type="html" xml:base="https://www.tunbury.org/2019/09/01/internet-radio-from-raspberry-pi/"><![CDATA[<p>Install the software packages needed</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install libmp3lame0 libtwolame0
sudo apt-get install darkice
sudo apt-get install icecast2
</code></pre></div></div>

<p>During the installation you will be asked to set the icecast password, which you’ll need to enter into the configuration file below.</p>

<p>Check your recording device is present</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pi@raspberrypi:~ $ arecord -l
**** List of CAPTURE Hardware Devices ****
card 1: AK5371 [AK5371], device 0: USB Audio [USB Audio]
Subdevices: 0/1
Subdevice #0: subdevice #0
</code></pre></div></div>

<p>Try to make a recording:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>arecord -D plughw:1,0 temp.wav
</code></pre></div></div>

<p>If the volume is too quiet, you can adjust it with <code class="language-plaintext highlighter-rouge">alsamixer -c 1</code>, where 1 is your audio device. Note that 0 is the Raspberry PI default output device.</p>

<p>Create a configuration file for darkice</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># this section describes general aspects of the live streaming session
[general]
duration        = 0        # duration of encoding, in seconds. 0 means forever
bufferSecs      = 5         # size of internal slip buffer, in seconds
reconnect       = yes       # reconnect to the server(s) if disconnected


# this section describes the audio input that will be streamed
[input]
# device          = /dev/dsp  # OSS DSP soundcard device for the audio input
device          = plughw:1,0  # OSS DSP soundcard device for the audio input
sampleRate      = 22050     # sample rate in Hz. try 11025, 22050 or 44100
bitsPerSample   = 16        # bits per sample. try 16
channel         = 2         # channels. 1 = mono, 2 = stereo


# this section describes a streaming connection to an IceCast2 server
# there may be up to 8 of these sections, named [icecast2-0] ... [icecast2-7]
# these can be mixed with [icecast-x] and [shoutcast-x] sections
[icecast2-0]
bitrateMode     = abr       # average bit rate
format          = mp3       # format of the stream: ogg vorbis
bitrate         = 96        # bitrate of the stream sent to the server
server          = localhost # host name of the server
port            = 8000      # port of the IceCast2 server, usually 8000
password        = password # source password to the IceCast2 server
mountPoint      = mic  # mount point of this stream on the IceCast2 server
name            = Microphone Raspberry Pi # name of the stream
description     = Broadcast from 2nd room # description of the stream
url             = http://example.com/ # URL related to the stream
genre           = my own    # genre of the stream
public          = no        # advertise this stream?
</code></pre></div></div>

<p>Invoke the server by running <code class="language-plaintext highlighter-rouge">darkice</code> at the prompt.</p>

<p>Set darkice to run at boot up</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>update-rc.d darkice defaults
</code></pre></div></div>

<p>Open a web browser to <code class="language-plaintext highlighter-rouge">http://&lt;pi-ip-address&gt;:8000</code> to view the installation. Add the url source to your Internet radio appliance via <code class="language-plaintext highlighter-rouge">http://&lt;pi-ip-address&gt;:8000/mic</code></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bells" /><category term="raspberrypi" /><summary type="html"><![CDATA[Install the software packages needed]]></summary></entry><entry><title type="html">Most Popular Methods</title><link href="https://www.tunbury.org/2019/02/28/most-popular-methods/" rel="alternate" type="text/html" title="Most Popular Methods" /><published>2019-02-28T12:41:29+00:00</published><updated>2019-02-28T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/02/28/most-popular-methods</id><content type="html" xml:base="https://www.tunbury.org/2019/02/28/most-popular-methods/"><![CDATA[<p>There are ~72,000 Surprise Major performances on Bell Board. Bell Board displays results in pages of 200 performances. Thus we will need to download all the pages and concatenate them into a single file:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for i in {1..366}; do wget "https://bb.ringingworld.co.uk/search.php?title=surprise+major&amp;page=$i" -O - &gt;&gt; surprise-major.txt; done
</code></pre></div></div>
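<p>The loop bound of 366 can be sanity-checked with shell arithmetic: ~72,000 performances at 200 per page needs at least 360 pages, so the loop allows a little headroom:</p>

```shell
# Ceiling division: pages needed for ~72,000 results at 200 per page
echo $(( (72000 + 199) / 200 ))   # prints 360
```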

<p>Quick analysis with awk/sed/sort and uniq:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>awk '/class="title"/ { print $3, $4, $5, $6, $7, $8, $9}' surprise-major.txt | sed 's/&lt;\/td&gt;//' | sort | uniq -c | sort -gr | less
</code></pre></div></div>
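<p>To see what the counting stage of the pipeline does, here is a minimal sketch run on three literal title strings (sample data, not real Bell Board output):</p>

```shell
# The counting stage in isolation: sort groups duplicates, uniq -c counts each
# group, and sort -gr puts the biggest count first
printf '%s\n' 'Yorkshire Surprise Major' 'Cambridge Surprise Major' 'Yorkshire Surprise Major' \
  | sort | uniq -c | sort -gr
```

<p>Yorkshire comes out on top with a count of 2, which is exactly the shape of the league tables below.</p>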

<p>As expected, the Standard 8 are right there:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>10732 Yorkshire Surprise Major
 7633 Cambridge Surprise Major
 6908 Bristol Surprise Major
 3629 Superlative Surprise Major
 3425 Lincolnshire Surprise Major
 3048 Rutland Surprise Major
 2716 London Surprise Major
 1556 Pudsey Surprise Major
  957 Glasgow Surprise Major
  931 Lessness Surprise Major
  666 Belfast Surprise Major
  645 Uxbridge Surprise Major
  568 Cornwall Surprise Major
</code></pre></div></div>

<p>Repeating for the ~3,800 Delight Major performances</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for i in {1..30}; do wget "https://bb.ringingworld.co.uk/search.php?title=delight+major&amp;page=$i" -O - &gt;&gt; delight-major.txt; done
awk '/class="title"/ { print $3, $4, $5, $6, $7, $8, $9}' delight-major.txt | sed 's/&lt;\/td&gt;//' | sort | uniq -c | sort -gr | less
</code></pre></div></div>

<p>Gives us these</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>141 Cooktown Orchid Delight Major
 36 Christmas Delight Major
 30 Wedding Delight Major
 28 Coniston Bluebird Delight Major
 27 Diamond Delight Major
 26 Ruby Delight Major
 22 Birthday Delight Major
 19 Anniversary Delight Major
 18 Dordrecht Delight Major
 16 Yelling Delight Major
 16 Lye Delight Major
 16 Burnopfield Delight Major
 15 Winchester Delight Major
 15 Hunsdon Delight Major
 13 Uttlesford Delight Major
 13 Magna Carta Delight Major
 12 Sussex Delight Major
 12 Sunderland Delight Major
 12 Sleaford Delight Major
 12 Heptonstall Delight Major
 11 Windy Gyle Delight Major
 11 Spitfire Delight Major
 11 Ketteringham Delight Major
 11 Keele University Delight Major
 11 Ian's Delight Major
 11 Eardisland Delight Major
 11 Dingley Delight Major
 10 West Bridgford Delight Major
 10 Paisley Delight Major
 10 Morville Delight Major
 10 Longstanton Delight Major
 10 Knotty Ash Delight Major
</code></pre></div></div>

<p>And once again for the 2,200 Delight Minor performances</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for i in {1..12}; do wget "https://bb.ringingworld.co.uk/search.php?title=delight+minor&amp;page=$i" -O - &gt;&gt; delight-minor.txt; done
awk '/class="title"/ { print $3, $4, $5, $6, $7, $8, $9}' delight-minor.txt | sed 's/&lt;\/td&gt;//' | sort | uniq -c | sort -gr | less
</code></pre></div></div>

<p>Gives</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 85 Woodbine Delight Minor
 78 Old Oxford Delight Minor
 46 Oswald Delight Minor
 41 Elston Delight Minor
 30 College Bob IV Delight Minor
 25 Morning Exercise Delight Minor
 23 Kirkstall Delight Minor
 22 Francis Genius Delight Minor
 20 St Albans Delight Minor
 20 Julie McDonnell Delight Minor
 19 Southwark Delight Minor
 18 Burslem Delight Minor
 18 Barham Delight Minor
 17 Kentish Delight Minor
 17 Darton Exercise Delight Minor
 17 Burnaby Delight Minor
 16 Edinburgh Delight Minor
 15 Disley Delight Minor
 14 Neasden Delight Minor
 14 London Delight Minor
 14 Glastonbury Delight Minor
 14 Bedford Delight Minor
 13 Croome d'Abitot Delight Minor
 13 Christmas Pudding Delight Minor
 13 Charlwood Delight Minor
 12 Wragby Delight Minor
 11 Willesden Delight Minor
 11 Newdigate Delight Minor
 10 Combermere Delight Minor
 10 Cambridge Delight Minor
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bells" /><category term="bash" /><summary type="html"><![CDATA[There are ~72,000 Surprise Major performances on Bell Board. Bell Board displays results in pages of 200 performances. Thus we will need to download all the pages and concatenate them into a single file:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/bellboard.png" /><media:content medium="image" url="https://www.tunbury.org/images/bellboard.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Mount an ISO from your Desktop via PowerCLI</title><link href="https://www.tunbury.org/2019/01/17/mount-an-iso-from-your-desktop-via-powercli/" rel="alternate" type="text/html" title="Mount an ISO from your Desktop via PowerCLI" /><published>2019-01-17T12:41:29+00:00</published><updated>2019-01-17T12:41:29+00:00</updated><id>https://www.tunbury.org/2019/01/17/mount-an-iso-from-your-desktop-via-powercli</id><content type="html" xml:base="https://www.tunbury.org/2019/01/17/mount-an-iso-from-your-desktop-via-powercli/"><![CDATA[<p>Normally, I’d use a Windows NFS Server to host my ISO files. The steps couldn’t be simpler</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Add-WindowsFeature FS-NFS-Service
Import-Module NFS
New-NfsShare -Name ISO -Path C:\ISO -access readonly
</code></pre></div></div>

<p>However, this only works if you have a Windows Server installation as you can’t install the NFS Service on a Windows desktop.</p>

<p>There is a standalone executable version of an NFS server available called WinNFSd.exe which can be downloaded from <a href="https://github.com/winnfsd/winnfsd/releases">GitHub</a>. I’ve saved this to <code class="language-plaintext highlighter-rouge">C:\WinNFSd</code></p>

<p>Create a firewall rule on your desktop to allow the ESXi host to communicate with WinNFSd, thus:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>New-NetFirewallRule -DisplayName "NFS Server" -Direction Inbound -Action Allow -Program C:\WinNFSd\WinNFSd.exe
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">WinNFSd</code>. The argument list is the local folder hosting your ISO files to be shared and the path that it will have on the NFS server’s export list.  The path name needs to match the <code class="language-plaintext highlighter-rouge">New-DataStore</code> command later:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Start-Process C:\WinNFSd\WinNFSd.exe -ArgumentList "C:\ISO /ISO"
</code></pre></div></div>

<p>You should now have a CMD window open along with the PowerCLI prompt.</p>

<p>Now you need to know the IP Address of your machine:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$myIPAddress = "Your IP Address"
</code></pre></div></div>

<p>You can automate this as follows, but it may need tweaking depending upon which network card you are using, etc.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$myIPAddress = $(Get-NetIPAddress -InterfaceAlias Ethernet0 -AddressFamily IPv4).IPAddress
</code></pre></div></div>

<p>Create a variable for your ESXi host(s).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$esxHosts = @( "Your Host" )
</code></pre></div></div>

<p>If you have a cluster you can include them all like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$esxHosts = Get-Datacenter yourDC | Get-Cluster yourCluster | Get-VMHost
</code></pre></div></div>

<p>Instruct the ESXi host to mount the datastore.  Note that the final <code class="language-plaintext highlighter-rouge">/ISO</code> needs to match the final argument to <code class="language-plaintext highlighter-rouge">WinNFSd</code></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$esxHosts |% { New-Datastore -VMHost $_ -Name ISO -NfsHost $myIPAddress -Path /ISO }
</code></pre></div></div>
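
<p>Before attaching the ISO, it’s worth checking that the datastore actually mounted on each host. A minimal check, assuming the same PowerCLI session as above:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># List the new datastore on each host; an error here means the mount failed
$esxHosts |% { Get-Datastore -VMHost $_ -Name ISO }
</code></pre></div></div>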

<p>Now set the ISO that you have, such as <code class="language-plaintext highlighter-rouge">c:\iso\myiso.iso</code>, as the CD drive on your VM:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Get-CDDrive $vm | Set-CDDrive -IsoPath "[ISO] myiso.iso" -Connected:$true -Confirm:$false
</code></pre></div></div>

<p>Now you can use the CD Drive in the VM as you wish.</p>

<p>Of course, it’s important to tidy up in the correct sequence. Don’t just close the CMD prompt before disconnecting the CD drive and unmounting the datastore.</p>

<p>Disconnect the CD Drive</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Get-CDDrive $vm | Set-CDDrive -NoMedia -Confirm:$false
</code></pre></div></div>

<p>Remove the datastore</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$esxHosts |% { Remove-Datastore -VMHost $_ -Datastore ISO -Confirm:$false }
</code></pre></div></div>

<p>Stop WinNFSd and remove the firewall rule</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Stop-Process -Name WinNFSd
Remove-NetFirewallRule -DisplayName "NFS Server"
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="powershell" /><summary type="html"><![CDATA[Normally, I’d use a Windows NFS Server to host my ISO files. The steps couldn’t be simpler]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/PowerCLI.png" /><media:content medium="image" url="https://www.tunbury.org/images/PowerCLI.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Retro Gaming: Space Raiders</title><link href="https://www.tunbury.org/2018/09/24/retro-gaming-space-raiders/" rel="alternate" type="text/html" title="Retro Gaming: Space Raiders" /><published>2018-09-24T12:41:29+00:00</published><updated>2018-09-24T12:41:29+00:00</updated><id>https://www.tunbury.org/2018/09/24/retro-gaming-space-raiders</id><content type="html" xml:base="https://www.tunbury.org/2018/09/24/retro-gaming-space-raiders/"><![CDATA[<p>Dylan’s favourite t-shirt is his Game Over shirt, which always reminds me of Space Raiders from the ZX Spectrum days. I found the cassette tape quite easily, but it took a significant amount of searching to find the Spectrum itself, and included in the box was the tape recorder as well!</p>

<p>Unfortunately, when I set about loading the game it didn’t work. That was probably a lot to ask after 30+ years. The audio sounded a bit low even with the tape player’s volume at maximum. I tried connecting it via an amplifier, but that didn’t help.</p>

<p>I connected the tape drive to my Mac and looked at the file in Audacity.</p>

<p><img src="/images/original-tape-player.png" alt="" /></p>

<p>Apart from being very quiet, zooming in showed that after the guard tone it was impossible to see the signal as described in this <a href="http://www.myprius.co.za/tape_storage.htm">excellent post</a>.</p>

<p><img src="/images/nothing-to-see.png" alt="" /></p>

<p>I tried the Fuse utilities to convert the WAV into a TZX file, but these failed. I found more tools, which I installed on my Raspberry Pi, but the result was the same.</p>

<p>Eventually, I decided to see if I could find another tape player and I found an old compact media centre. I played the tape straight into Audacity just to see if I could see a difference. Clearly this find is significantly better:</p>

<p><img src="/images/compact-media-centre.png" alt="" /></p>

<p>I tried <code class="language-plaintext highlighter-rouge">audio2tape</code>, but that gave me a bunch of CRC errors; processing the file with <code class="language-plaintext highlighter-rouge">tzxwav</code> worked perfectly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pi@raspberrypi:~/.local/bin $ ./tzxwav -p -v -o ~/raiders.tzx -D ~/raiders.wav 
=== Program: raiders   ---------------------------------|  1:56
Expected length: 40
Leader: @1055530, Sync: @1275725, End: @1279885
Program: raiders    (40 bytes)
--- data########----------------------------------------|  1:51
Length: 40
Leader: @1323967, Sync: @1412003, End: @1421770
40 bytes of data
=== Program: RAIDERS   ---------------------------------|  1:44
Expected length: 68
Leader: @1510973, Sync: @1731454, End: @1735476
Program: RAIDERS    (68 bytes)
--- data###########-------------------------------------|  1:40
Length: 68
Leader: @1778815, Sync: @1866811, End: @1882863
68 bytes of data
=== Bytes: T         #----------------------------------|  1:33
Start: 16384, Expected length: 6912
Leader: @1964171, Sync: @2184510, End: @2188446
Screen: T         
--- data#########################-----------------------|  1:27
Length: 6912
Leader: @2231875, Sync: @2319891, End: @3680454
6912 bytes of data
=== Bytes: C         ##############---------------------|  1:16
Start: 24576, Expected length: 7860
Leader: @3778730, Sync: @3989417, End: @3993362
Bytes: C          (start: 24576, 7860 bytes)
--- data###########################################-----|  0:19
Length: 7860
Leader: @4036807, Sync: @4124864, End: @6093760
7860 bytes of data
100% |##################################################|  0:00
</code></pre></div></div>

<p>I loaded the TZX file into Fuse and it worked as expected.</p>

<p>Armed with a working tape player, I loaded the game on the real ZX Spectrum at the first attempt.</p>

<p><img src="/images/space-raiders-on-tv.jpg" alt="" /></p>

<p>Lastly, can we have this on our Raspberry Pi? Well, of course: just install Fuse and load up the TZX image:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install fuse-emulator-common
sudo apt-get install spectrum-roms fuse-emulator-utils
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="specturm" /><category term="raspberrypi" /><summary type="html"><![CDATA[Dylan’s favourite t-shirt is his Game Over shirt which always reminds me to Space Raiders from the ZX Spectrum days. I found the cassette tape quite easily but it took a significant amount of searching to find the Spectrum itself and included in the box was the tape recorder as well!]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/space-raiders.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/space-raiders.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Which Funds Have Exposure to NetFlix?</title><link href="https://www.tunbury.org/2018/08/27/which-funds-have-exposure-to-netflix/" rel="alternate" type="text/html" title="Which Funds Have Exposure to NetFlix?" /><published>2018-08-27T12:41:29+00:00</published><updated>2018-08-27T12:41:29+00:00</updated><id>https://www.tunbury.org/2018/08/27/which-funds-have-exposure-to-netflix</id><content type="html" xml:base="https://www.tunbury.org/2018/08/27/which-funds-have-exposure-to-netflix/"><![CDATA[<p>Dabbling in the markets by way of investment funds is amusing. I use <a href="www.hl.co.uk">Hargreaves Lansdown</a> to do this. HL have a fund research section which lets you look at a given fund and view the top 10 holdings so you can base your decision to invest in your belief in the underlying stock.</p>

<p>How do you tackle it from the other direction? Suppose you want to invest in NetFlix: which fund(s) have exposure to its stock? The search tool on HL’s website doesn’t let you search the funds’ holdings.</p>

<p>Firstly, we can get a list of funds starting with <code class="language-plaintext highlighter-rouge">a</code> by visiting the link https://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results/a. There are 25 more letters to go, plus 0 for anything starting with a number. These pages are HTML unordered lists <code class="language-plaintext highlighter-rouge">ul</code> of hyperlinks <code class="language-plaintext highlighter-rouge">href</code>. We can get the alphabet as an array in a tidy loop such as this: <code class="language-plaintext highlighter-rouge">foreach ($l in [char[]]([char]'a'..[char]'z') + '0') { }</code> (assuming ASCII).</p>
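
<p>Putting the letter loop together with the download gives an outer skeleton along these lines (a sketch; the per-page parsing is covered next):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$baseURL = "https://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results"
$pages = @{}
foreach ($l in [char[]]([char]'a'..[char]'z') + '0') {
    # Download the fund list page for each initial letter/digit
    $pages["$l"] = $(Invoke-WebRequest -uri "$baseURL/$l").RawContent
}
</code></pre></div></div>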

<p>We can download the HTML using PowerShell’s <code class="language-plaintext highlighter-rouge">Invoke-WebRequest</code> and then extract tags using <code class="language-plaintext highlighter-rouge">getElementsByTagName</code>; however, this can be desperately slow in some circumstances, so I prefer to get the HTML as a string using <code class="language-plaintext highlighter-rouge">$_.RawContent</code> and process it with <code class="language-plaintext highlighter-rouge">IndexOf()</code>.</p>

<p>The code, and essentially the methodology for the rest of this script, is shown below:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$baseURL = "https://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results"
$html = $(Invoke-WebRequest -uri "$baseURL/a").RawContent
$x1 = $html.IndexOf('&lt;ul class="list-unstyled list-indent"')
$x1 = $html.IndexOf('&gt;', $x1) + 1
$x2 = $html.IndexOf('&lt;/ul', $x1)
$tbl = $html.substring($x1, $x2 - $x1).trim()
</code></pre></div></div>

<p>Search the HTML for the start of the <code class="language-plaintext highlighter-rouge">ul</code> tag and save the position in <code class="language-plaintext highlighter-rouge">$x1</code>. As tags can be of variable length, we move <code class="language-plaintext highlighter-rouge">$x1</code> to the end of the tag by searching for the close-tag marker <code class="language-plaintext highlighter-rouge">&gt;</code> and adding 1. Then search for the end of the list by looking for the <code class="language-plaintext highlighter-rouge">&lt;/ul</code> tag and store that in <code class="language-plaintext highlighter-rouge">$x2</code>. The table can now be extracted as the substring between <code class="language-plaintext highlighter-rouge">$x1</code> and <code class="language-plaintext highlighter-rouge">$x2</code>.</p>
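
<p>Since this search-and-substring pattern recurs throughout the script, it could be wrapped in a small helper function. This is an illustrative sketch; the function name is mine and doesn’t appear in the original script:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Return the text between the end of the opening tag and the closing tag
function Get-TagContent($html, $openTag, $closeTag) {
    $x1 = $html.IndexOf($openTag)
    if ($x1 -lt 0) { return $null }       # tag not found
    $x1 = $html.IndexOf('&gt;', $x1) + 1      # skip to the end of the opening tag
    $x2 = $html.IndexOf($closeTag, $x1)   # find the matching close
    return $html.Substring($x1, $x2 - $x1).Trim()
}

# For example, the fund list could be extracted with:
# $tbl = Get-TagContent $html '&lt;ul class="list-unstyled list-indent"' '&lt;/ul'
</code></pre></div></div>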

<p>Each list item <code class="language-plaintext highlighter-rouge">li</code> contains a hyperlink tag <code class="language-plaintext highlighter-rouge">&lt;a href=</code> including the URL of the page with the fund details and the fund name. We can use a <code class="language-plaintext highlighter-rouge">for</code> loop to move through the string and build up an array of fund URLs. Back tick is the escape character in PowerShell.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$funds = @()
for ($x1 = $tbl.IndexOf("href="); $x1 -ge 0; $x1 = $tbl.IndexOf("href=", $x2)) {
    $x1 = $tbl.IndexOf('"', $x1) + 1   # x1 is the start of the string
    $x2 = $tbl.IndexOf('"', $x1)       # x2 is the end of the string
    $funds += $tbl.Substring($x1, $x2 - $x1)
}
</code></pre></div></div>

<p>At this point we can examine our funds in <code class="language-plaintext highlighter-rouge">$funds</code>, or perhaps write them to a CSV: <code class="language-plaintext highlighter-rouge">$funds | Export-Csv funds.csv</code>.</p>

<p>What we really want is the list of holdings for each fund. So, using the techniques above, download the HTML for each fund detail page and extract the fund size where it appears on the page. Then locate the Top 10 holdings table, build a PowerShell object based upon the table headings, and populate the values:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$holdings = @()
for ($f = 0; $f -lt $funds.count; $f++) {
    $html = $(Invoke-WebRequest -uri $funds[$f]).RawContent
    if ($html.IndexOf("Factsheet unavailable") -ge 0 -or
        $html.IndexOf("Market data not available") -ge 0 -or
        $html.IndexOf("holdings currently unavailable") -ge 0) {
        Write-Host -ForegroundColor Red $f $funds[$f].substring($baseURL.length) "- unavailable"
        continue
    }

    $x1 = $html.IndexOf('Fund size')
    $x1 = $html.IndexOf('&lt;td', $x1)
    $x1 = $html.IndexOf("&gt;", $x1) + 1
    $x2 = $html.IndexOf('&lt;/td', $x1)
    $fundSize = $html.Substring($x1, $x2 - $x1).trim()
    $fundSize = $fundSize -replace "&amp;pound;", "GBP "
    $fundSize = $fundSize -replace "&amp;euro;", "EUR "
    $fundSize = $fundSize -replace "\$", "USD "

    $x1 = $html.IndexOf('&lt;table class="factsheet-table" summary="Top 10 holdings"')
    $x1 = $html.IndexOf('&gt;', $x1) + 1
    $x2 = $html.IndexOf('&lt;/table&gt;', $x1)
    $tbl = $html.substring($x1, $x2 - $x1).trim()

    $headings = @()
    for ($x1 = $tbl.IndexOf('&lt;th', 1); $x1 -gt 0; $x1 = $tbl.IndexOf('&lt;th', $x2)) {
        $x1 = $tbl.IndexOf("&gt;", $x1) + 1
        $x2 = $tbl.IndexOf("&lt;/th&gt;", $x1)
        $headings += $tbl.Substring($x1, $x2 - $x1)
    }

    if ($headings.count -eq 0) {
        Write-Host -ForegroundColor Red $f $funds[$f].substring($baseURL.length) "- no table"
        continue
    }

    $i = 0
    for ($x1 = $tbl.IndexOf('&lt;td'); $x1 -gt 0; $x1 = $tbl.IndexOf('&lt;td', $x2)) {
        if ($i % $headings.count -eq 0) {
            $h = New-Object -TypeName PSObject -Property @{Fund=$funds[$f].substring($baseURL.length);Size=$fundSize}
        }
        $x1 = $tbl.IndexOf("&gt;", $x1) + 1
        $x2 = $tbl.IndexOf("&lt;/td", $x1)
        $cell = $tbl.Substring($x1, $x2 - $x1).trim()
        if ($cell.Substring(0, 1) -eq '&lt;') {
            $x1 = $tbl.IndexOf("&gt;", $x1) + 1
            $x2 = $tbl.IndexOf("&lt;/a", $x1)
            $cell = $tbl.Substring($x1, $x2 - $x1).trim()
        }
        Add-Member -InputObject $h -MemberType NoteProperty -Name $headings[$i % $headings.count] -Value $cell
        $i++
        if ($i % $headings.count -eq 0) {
            $holdings += $h
        }
    }
    Write-Host $f $funds[$f].substring($baseURL.length) $fundSize ($i / 2) "holdings"
}
</code></pre></div></div>

<p>As I mentioned, most of the code is as explained before, but the PowerShell object bit deserves a mention. I use an iterator <code class="language-plaintext highlighter-rouge">$i</code> to count the cells in the table (note this assumes that the table has an equal number of cells per row, which isn’t necessarily true in HTML). We have two column headings, so <code class="language-plaintext highlighter-rouge">$i % $headings.count -eq 0</code> is true for 0, 2, 4, etc., and this happens at the start of the loop, so we use it to create the object.</p>

<p>Once we have the cell’s content, we can use <code class="language-plaintext highlighter-rouge">Add-Member</code> to add the property to the object. The property name is given by <code class="language-plaintext highlighter-rouge">$headings[$i % $headings.count]</code>: either zero or one in this case.</p>

<p>At the end of the loop we increment <code class="language-plaintext highlighter-rouge">$i</code> and test whether we are now on the next row (<code class="language-plaintext highlighter-rouge">$i % $headings.count -eq 0</code>); if so, we add the current object to the output array (as it will be overwritten at the start of the next iteration of the loop).</p>
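
<p>As a toy illustration of how the modulo arithmetic groups cells into rows, here is a sketch with two headings and made-up values (not taken from the real site):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$headings = @('Security', 'Weight')
$cells = @('Netflix', '5.1%', 'Amazon', '4.8%')
$rows = @()
for ($i = 0; $i -lt $cells.count; $i++) {
    if ($i % $headings.count -eq 0) {         # start of a new row
        $h = New-Object -TypeName PSObject
    }
    Add-Member -InputObject $h -MemberType NoteProperty -Name $headings[$i % $headings.count] -Value $cells[$i]
    if (($i + 1) % $headings.count -eq 0) {   # row complete
        $rows += $h
    }
}
# $rows now contains two objects, each with Security and Weight properties
</code></pre></div></div>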

<p>After all that work, let’s save the results as a CSV: <code class="language-plaintext highlighter-rouge">$holdings | Export-Csv holdings.csv</code></p>

<p>We now know the percentages of each holding and the total fund value so we can calculate a new column with the monetary value invested in a fund as follows:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$holdings |% {
    [decimal]$w = $_.weight -replace '[^\d.]'
    [decimal]$s = $_.size -replace '[^\d.]'
    Add-Member -InputObject $_ -MemberType NoteProperty -Name Value -Value ($w * $s / 100) -Force
}
</code></pre></div></div>

<p>Perhaps save it again? <code class="language-plaintext highlighter-rouge">$holdings | Export-Csv -Force holdings.csv</code>. Finally, we can answer the original question by filtering the holdings:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import-csv .\holdings.csv |? Security -match "Netflix" | sort -Property Value
</code></pre></div></div>

<p>The full code can be downloaded from <a href="https://github.com/mtelvers/Hargreaves-Lansdown/blob/master/fund-holdings.ps1">GitHub</a> or probably more usefully you can get <a href="https://raw.githubusercontent.com/mtelvers/Hargreaves-Lansdown/master/holdings.csv">holdings.csv</a></p>

<h1 id="addendum">Addendum</h1>

<p>To make the analysis easier it would help to standardise the currencies. Most are in GBP by some margin, so let’s convert to that:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ExchangeRates = @{GBP = 1; YEN = 0.00698098; EUR = 0.905805; USD = 0.776454; AUSD = 0.567308}

$holdings |% {
    [decimal]$s = $_.size -replace '[^\d.]'
    [decimal]$w = $_.weight -replace '[^\d.]'
    if ($s -gt 0) {
        $currency = $_.size.substring(0, $_.size.IndexOf(" "))
        $sGBP = $s * $ExchangeRates[$currency]
    } else {
        $sGBP = 0
    }
    Add-Member -InputObject $_ -MemberType NoteProperty -Name SizeGBP -Value $sGBP -Force
    Add-Member -InputObject $_ -MemberType NoteProperty -Name ValueGBP -Value ($w * $sGBP / 100) -Force
}
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="powershell" /><summary type="html"><![CDATA[Dabbling in the markets by way of investment funds is amusing. I use Hargreaves Lansdown to do this. HL have a fund research section which lets you look at a given fund and view the top 10 holdings so you can base your decision to invest in your belief in the underlying stock.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/hl_hi_res.gif" /><media:content medium="image" url="https://www.tunbury.org/images/hl_hi_res.gif" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Latin Square</title><link href="https://www.tunbury.org/2018/07/13/latin-square/" rel="alternate" type="text/html" title="Latin Square" /><published>2018-07-13T12:41:29+00:00</published><updated>2018-07-13T12:41:29+00:00</updated><id>https://www.tunbury.org/2018/07/13/latin-square</id><content type="html" xml:base="https://www.tunbury.org/2018/07/13/latin-square/"><![CDATA[<p>Looking at the latest video from Presh Talwalkar about solving the Latin square where each row is the first row multiplied by the row number I decided it was time to see if I could remember any C++ and code a solution.</p>

<p><a href="https://youtu.be/KXOjtmNUSH0">Can you figure out the special 6 digit number?</a></p>

<p>Include the standard C++ header files we need:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;iostream&gt;
#include &lt;algorithm&gt;
#include &lt;vector&gt;
#include &lt;sstream&gt;
#include &lt;string&gt;
#include &lt;iomanip&gt;

using namespace std;
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">CheckDuplicates()</code> comes from ideas presented in this <a href="https://stackoverflow.com/questions/2860634/checking-for-duplicates-in-a-vector">Stack Overflow question</a>. The function determines whether there are any repeated digits in a vector by sorting the vector and then searching for adjacent items which are the same. Since <code class="language-plaintext highlighter-rouge">std::sort</code> changes the source vector I’ve created a local copy using the vector constructor function.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bool CheckDuplicates(vector&lt;unsigned int&gt;* v) {
        vector&lt;unsigned int&gt; c (v-&gt;begin(), v-&gt;end());
        sort(c.begin(), c.end());
        vector&lt;unsigned int&gt;::iterator it = adjacent_find(c.begin(), c.end());
        if (it == c.end())
                return false;
        else
                return true;
}
</code></pre></div></div>

<p>On to the body of program</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int main () {
</code></pre></div></div>

<p>Create a loop which covers all possible six-digit numbers. The result can’t be smaller than 123456, and it must be less than 1,000,000 ÷ 6 = 166,666, but changing the loop to run from 0 to 1,000,000 shows that there really aren’t any other solutions.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        for (unsigned int t = 123456; t &lt; 166666; t++) {
</code></pre></div></div>

<p>I’ll use a vector of vectors to hold the digits of each number.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                vector&lt; vector&lt;unsigned int&gt;* &gt; square;
</code></pre></div></div>

<p>This first block of code initialises the first vector with the value from the outer loop. It only adds the value to the square if it doesn’t contain any duplicate digits.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                {
                        vector&lt;unsigned int&gt;* row = new vector&lt;unsigned int&gt;;
                        unsigned int n = t;
                        for (int i = 0; i &lt; 6; i++) {
                                row-&gt;insert(row-&gt;begin(), n % 10);
                                n /= 10;
                        }
                        if (!CheckDuplicates(row))
                                square.push_back(row);
                        else
                                delete row;
                }
</code></pre></div></div>

<p>By looking at the size of the <code class="language-plaintext highlighter-rouge">square</code> vector we can see if we have a row to work with or not. If we do, attempt the multiplication of the first row by 2 through 6 to generate the other rows. As we want full multiplication, not just the multiplication of each digit, we need to compute the carry at each step and add it on to the next column. If there is a carry into the seventh column then the row can be discarded. Lastly, check for duplicates and, if none are found, add the number/row to the square. An alternative approach here would be to multiply t and separate the result into the individual digits in a vector as we did above.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                if (square.size() == 1) {
                        for (unsigned int j = 2; j &lt;= 6; j++) {
                                unsigned int carry = 0;
                                vector&lt;unsigned int&gt;* row = new vector&lt;unsigned int&gt;;
                                for (int i = 5; i &gt;= 0; i--) {
                                        unsigned int n = square.at(0)-&gt;at(i) * j + carry;
                                        if (n &gt; 9) {
                                                carry = n / 10;
                                                n %= 10;
                                        } else {
                                                carry = 0;
                                        }
                                        row-&gt;insert(row-&gt;begin(), n);
                                }
                                if (carry) {
                                        delete row;
                                        break;
                                } else {
                                        if (!CheckDuplicates(row))
                                                square.push_back(row);
                                        else
                                                delete row;
                                }
                        }
                }
</code></pre></div></div>

<p>So, if we get to here, we have six rows, each containing six different digits. We now need to check for duplication in the columns. This strictly isn’t necessary because only one solution makes it this far, but for the sake of completeness I generate a vector for each column and check it for duplicates. If no duplicates are found then it’s a possible solution.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                if (square.size() == 6) {
                        bool duplicates = false;
                        for (int i = 5; i &gt;= 0; i--) {
                                vector&lt;unsigned int&gt; column;
                                for (vector&lt;unsigned int&gt;* row : square)
                                        column.push_back(row-&gt;at(i));
                                if (CheckDuplicates(&amp;column)) {
                                        duplicates = true;
                                        break;
                                }
                        }
                        if (!duplicates) {
                                cout &lt;&lt; "\nSolution\n";
                                for (vector&lt;unsigned int&gt;* row : square) {
                                        for (unsigned int c : *row) {
                                                cout &lt;&lt; c &lt;&lt; ' ';
                                        }
                                        cout &lt;&lt; '\n';
                                }
                        }
                }
</code></pre></div></div>

<p>Tidy up by deleting each of the row vectors</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                for (vector&lt;unsigned int&gt;* row : square)
                        delete row;
                square.erase(square.begin(), square.end());
        }

        return 0;
}
</code></pre></div></div>

<p>You can download the full version of the code from <a href="https://github.com/mtelvers/LatinSquare">Github</a></p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="c++" /><summary type="html"><![CDATA[Looking at the latest video from Presh Talwalkar about solving the Latin square where each row is the first row multiplied by the row number I decided it was time to see if I could remember any C++ and code a solution.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/latin-square.png" /><media:content medium="image" url="https://www.tunbury.org/images/latin-square.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Prime Numbers in PowerShell</title><link href="https://www.tunbury.org/2017/05/01/prime-numbers-in-powershell/" rel="alternate" type="text/html" title="Prime Numbers in PowerShell" /><published>2017-05-01T12:41:29+00:00</published><updated>2017-05-01T12:41:29+00:00</updated><id>https://www.tunbury.org/2017/05/01/prime-numbers-in-powershell</id><content type="html" xml:base="https://www.tunbury.org/2017/05/01/prime-numbers-in-powershell/"><![CDATA[<p>Dylan was using a number square to calculate prime numbers so it amused me to code up a couple of algorithms to show just how quick the sieve method actually is. I’ve done these in PowerShell because … reasons.</p>

<p>So as a baseline, here’s a basic way to calculate a prime. Start with a number and try to divide it by every number starting from 2 up to the square root of the number. I’ve used <code class="language-plaintext highlighter-rouge">throw</code> in a <code class="language-plaintext highlighter-rouge">try</code>/<code class="language-plaintext highlighter-rouge">catch</code> block to move to the next iteration of the outer loop without executing the <code class="language-plaintext highlighter-rouge">Write-Host</code> line.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for ($n = 3; $n -lt 100000; $n++) {
    try {
        for ($d = 2; $d -le [Math]::Sqrt($n); $d++) {
            if ($n % $d -eq 0) {
                throw
            }
        }
        Write-Host -NoNewLine "$n "
    }
    catch { }
}
</code></pre></div></div>

<p>Interestingly, all those exceptions add quite an overhead: the same algorithm using a local variable ran three times quicker on my machine (27 seconds for the first version versus 9 seconds for this one).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for ($n = 3; $n -lt 100000; $n++) {
    $prime = $true
    for ($d = 2; $d -le [Math]::Sqrt($n); $d++) {
        if ($n % $d -eq 0) {
            $prime = $false
            break;
        }
    }
    if ($prime) {
        Write-Host -NoNewLine "$n "
    }
}
</code></pre></div></div>

<p>Obviously we should optimise this by removing even numbers as below and this, as you’d expect, halves the run time.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for ($n = 3; $n -lt 100000; $n += 2) {
    $prime = $true
    for ($d = 3; $d -le [Math]::Sqrt($n); $d += 2) {
        if ($n % $d -eq 0) {
            $prime = $false
            break;
        }
    }
    if ($prime) {
        Write-Host -NoNewLine "$n "
    }
}
</code></pre></div></div>

<p>Anyway, the sieve is all done in 0.75 seconds:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ints = 0..100000
for ($i = 2; $i -lt [Math]::Sqrt($ints.length); $i++) {
    if ($ints[$i] -eq 0) {
        continue
    }
    for ($j = $i * $i; $j -lt $ints.length; $j += $i) {
        $ints[$j] = 0
    }
}
$ints | foreach { if ($_ -gt 1) { Write-Host -NoNewLine "$_ " } }
</code></pre></div></div>

<p>As the maximum number increases, the differences become even more stark: at 1,000,000 the sieve completed in 11 seconds, but the simple method took 129 seconds.</p>
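
<p>A timing run for the sieve might look like this; a sketch using <code class="language-plaintext highlighter-rouge">Measure-Command</code> with the limit raised to 1,000,000:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$elapsed = Measure-Command {
    $ints = 0..1000000
    for ($i = 2; $i -lt [Math]::Sqrt($ints.length); $i++) {
        if ($ints[$i] -eq 0) { continue }
        for ($j = $i * $i; $j -lt $ints.length; $j += $i) {
            $ints[$j] = 0
        }
    }
}
Write-Host "Sieve completed in" $elapsed.TotalSeconds "seconds"
</code></pre></div></div>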

<p>For my timings, I used <code class="language-plaintext highlighter-rouge">measure-command</code> and removed the <code class="language-plaintext highlighter-rouge">Write-Host</code> lines.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="powershell" /><summary type="html"><![CDATA[Dylan was using a number square to calculate prime numbers so it amused me to code up a couple of algorithms to show just how quick the sieve method actually is. I’ve done these in PowerShell because … reasons.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/prime-numbers.jpg" /><media:content medium="image" url="https://www.tunbury.org/images/prime-numbers.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Splicing Three Strand Rope</title><link href="https://www.tunbury.org/2016/11/21/splicing-three-strand-rope/" rel="alternate" type="text/html" title="Splicing Three Strand Rope" /><published>2016-11-21T12:41:29+00:00</published><updated>2016-11-21T12:41:29+00:00</updated><id>https://www.tunbury.org/2016/11/21/splicing-three-strand-rope</id><content type="html" xml:base="https://www.tunbury.org/2016/11/21/splicing-three-strand-rope/"><![CDATA[<p>My sudden interest in rope splicing stems entirely from bell ropes. There seems to be three, perhaps four, splices to learn for this application. Links below to YouTube videos explaining how to do them:</p>

<ul>
  <li><a href="https://youtu.be/QeYBkMCQ8WY">Eye Splice</a></li>
  <li><a href="https://youtu.be/PFFeDH2u7E0">Short Splice</a></li>
  <li><a href="https://youtu.be/sN-cnO8Fqrc">Long Splice</a></li>
  <li><a href="https://youtu.be/bRjqMKLS99A">End/Back Splice</a></li>
</ul>

<p>Above the sally you’d probably use a long splice as it’s thinner than the short splice for running over any pulleys. Below the sally, either a short splice to the tail end if it doesn’t see much wear, or an eye splice if the tail end is changed frequently, as is typical on larger bells. The back splice could be used on the top end to give a nice finish to the rope.</p>

<p>I’m amazed how straightforward they are to do and how strong they are given that it’s just an over-under weave of strands without a knot in sight!</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="bells" /><summary type="html"><![CDATA[My sudden interest in rope splicing stems entirely from bell ropes. There seems to be three, perhaps four, splices to learn for this application. Links below to YouTube videos explaining how to do them:]]></summary></entry><entry><title type="html">Pentominoes</title><link href="https://www.tunbury.org/2016/08/25/pentominoes/" rel="alternate" type="text/html" title="Pentominoes" /><published>2016-08-25T12:41:29+00:00</published><updated>2016-08-25T12:41:29+00:00</updated><id>https://www.tunbury.org/2016/08/25/pentominoes</id><content type="html" xml:base="https://www.tunbury.org/2016/08/25/pentominoes/"><![CDATA[<p>One day I was clearing out some old papers and I came across this programming assignment from university. I can’t recall which of the problems I tackled at the time, after all it was twenty-five years ago, but glancing over it now the pentomino problem caught my eye</p>

<blockquote>
  <p>5 The Pentomino Problem
There are twelve different (ie. non-congruent) pentominos, shown below left. The pentomino problem is to fit them into a tray of dimensions 6 x 10 without overlapping. Some of the 2339 possible solutions are shown below right. Write a program to find a solution to the pentomino problem. (Note: Pretty output is not required.)</p>
</blockquote>

<p><img src="/images/pentomino-graphic.png" alt="" /></p>

<p>Looking on <a href="https://en.wikipedia.org/wiki/Pentomino">Wikipedia</a> it seems that the shapes have been named by <a href="https://en.wikipedia.org/wiki/Solomon_W._Golomb">Golomb</a> so I’m going to use those names too.</p>

<p>I started out by creating some data structures to hold the definition of each pentomino.</p>

<p>So laying out on an x, y co-ordinate system, I create a <code class="language-plaintext highlighter-rouge">point_t</code> structure containing the values</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typedef struct {
        int x, y;
} point_t;
</code></pre></div></div>

<p>Any pentomino will have exactly five points</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typedef struct {
        point_t point[5]; /* 5 points in each */
} pentomino_t;
</code></pre></div></div>

<p>Considering the ‘F’ pentomino, it may be rotated and reflected in different ways – a maximum of 8 different versions may exist. Some, such as ‘X’, only have one.</p>

<p><img src="/images/F.svg" alt="" /></p>

<p>I have created a structure to hold the pentomino name along with a count of the number of unique rotations/reflections of the shape and an array to hold the co-ordinates</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>typedef struct {
        char ch; /* name of the shape by letter */
        int count; /* number of unique rotations */
        pentomino_t rotation[8]; /* max of 4 possible rotations and then double for the mirrors */
} pentominoRotations_t;
</code></pre></div></div>

<p>The 6×10 board that we will try to place them on is as simple as this</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>char board[60];
</code></pre></div></div>

<p>The algorithm couldn’t be simpler really: take the first pentomino in its first rotation and put it on the board in the top left corner; if that works, try the second pentomino in the next position in its first rotation, and repeat.  At each step check that no part of any pentomino is outside the board area and that nothing is on top of anything else.  If a check fails, remove the last piece added and try to add it again in the next rotation.  Based upon the assignment, the key here is to recognise that this is a recursive algorithm – in pseudo code it looks like this</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>function calculate(pentomino p, board)
        for each position on the board
                for each pentomino rotation
                        let shape_ok = true
                        for each point in pentomino shape
                                if the co-ordinate is out of bound then shape_ok = false
                                if the board position is already used then shape_ok = false
                        next
                        if shape_ok is true then
                                draw the shape on the current board
                                if p &lt; 12 then
                                        calculate(p + 1, current board layout)
                                else
                                        we have a solution!
                next
        next
</code></pre></div></div>

<p>Here is the first solution that it generates given the order of shapes as I have them</p>

<p><img src="/images/solution-1.svg" alt="" /></p>

<p>The big problem with this is it takes a very long time!  The main reason for this is that the algorithm wastes masses of time trying to fit all 12 pieces in even when the early piece positions have given a board which can’t possibly be solved.  In the example below there is no point trying to place the other 11 pentominos, including all their rotations, when there is an isolated single square.</p>

<p><img src="/images/F-bad-placement.svg" alt="" /></p>

<p>My initial solution to this is to add a check after drawing the shape to look for regions which have an area of less than 5.  However, this can be extended to check for regions that have areas which are not multiples of 5, as clearly all pentominos have an area of 5!</p>

<p>Take a look at the example below.  This has two regions: on the left the area is 13 and on the right the area is 22.  This can’t be solved as we will never be able to pack objects with an area of 5 into a region of area 13.</p>

<p><img src="/images/small-region.svg" alt="" /></p>

<p>I was quite surprised how easy it was to calculate the area of the regions.  I’ve always thought that the fill/flood tools on paint programs were cool and here we are just doing the same thing.  Here’s some pseudo code to explain it.  I presume I’d get twice the marks for this assignment for having two recursive functions!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Create a copy of the board
Loop through all squares on the board
        if the square is empty
                call the flood function with starting at these co-ordinates
                if the returned value modulus 5 is not zero then the board cannot be solved

function flood(start co-ordinates)
        let r = 1, this being the running size of the region
        mark the current co-ordinate position as filled
        if the square to the left is empty then call the flood function with those co-ordinates and add the returned value to r
        if the square to the right is empty then call the flood function with those co-ordinates and add the returned value to r
        if the square above is empty then call the flood function with those co-ordinates and add the returned value to r
        if the square below is empty then call the flood function with those co-ordinates and add the returned value to r
        return r
</code></pre></div></div>

<p>If you let these run to completion you find that you have 9356 solutions – exactly 4 times the number we should have.  This is because the board has rotational symmetry and both vertical and horizontal symmetry.  We could check each solution against the ones already created for possible duplicates, but we could also amend the algorithm so that at the first level we only consider start positions in the first quarter of the board.</p>

<p>With this amended algorithm my average computer produced all 2339 solutions in around twenty minutes.</p>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="c" /><summary type="html"><![CDATA[One day I was clearing out some old papers and I came across this programming assignment from university. I can’t recall which of the problems I tackled at the time, after all it was twenty-five years ago, but glancing over it now the pentomino problem caught my eye]]></summary></entry><entry><title type="html">Place Notation</title><link href="https://www.tunbury.org/2016/08/24/place-notation/" rel="alternate" type="text/html" title="Place Notation" /><published>2016-08-24T12:41:29+00:00</published><updated>2016-08-24T12:41:29+00:00</updated><id>https://www.tunbury.org/2016/08/24/place-notation</id><content type="html" xml:base="https://www.tunbury.org/2016/08/24/place-notation/"><![CDATA[<p>Thomas Barlow has taught me place notation using <a href="/downloads/Strike-Back-Surprise-Major.pdf">Strike Back Surprise Major</a> as the example. The notation for that is <code class="language-plaintext highlighter-rouge">x38x14x58x16x12x38x14.12.78 l.e. 12</code>. There are plenty of guides online on how to interpret it, such as this one on the <a href="http://www.cccbr.org.uk/education/thelearningcurve/pdfs/200404.pdf">CCCBR website</a>.</p>

<p>Briefly, an x in the notation causes all bells to swap places. A group of numbers indicates that the bells in those places remain fixed while all others swap places. In this example, given a starting order of rounds, 12345678, the first x would yield 21436587. The subsequent 38 indicates that the 3rd placed and 8th placed bells are fixed, so the bells in positions 1 and 2 swap, as do 4 and 5, and 6 and 7, resulting in 12463857, and so on. As many methods are symmetrical, typically only half is written out; the second half is the reverse of the first with the given lead end appended.</p>

<p>My attempt to write out <a href="/downloads/Ajax-Surprise-Major.pdf">Ajax Surprise Major</a> <code class="language-plaintext highlighter-rouge">x58x14x56x16x14x1258x12x58,12</code> by hand went wrong in the early stages so I turned to Perl to do the job for me.</p>

<p>The first part of the script parses the place notation into an array, unwraps the symmetry and tags on the lead end. I don’t much like parsers as they tend to be messy, having to deal with the real world, so moving swiftly on to the core of the script with the assumption that the place notation of the method is held in the array <code class="language-plaintext highlighter-rouge">@method</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x 58 x 14 x 56 x 16 x 14 x 1258 x 12 x 58 x 12 x 1258 x 14 x 16 x 56 x 14 x 58 x 12
</code></pre></div></div>

<p>Define <code class="language-plaintext highlighter-rouge">@rounds</code> to be rounds and then set the current bell arrangement to be rounds!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my @rounds = (1..$stage);
my @bells = @rounds;
do {
</code></pre></div></div>

<p>Loop through each of the elements in the method (<code class="language-plaintext highlighter-rouge">@method</code>)</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    foreach my $m (@method) {
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">$stage</code> is the number of bells involved in the method. Our examples have all been <em>major</em> methods so <code class="language-plaintext highlighter-rouge">$stage</code> is 8. Perl arrays are inconveniently numbered from zero, so we actually want numbers 0 through 7; I’ve used pop to remove the last one</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        my @changes = (0..$stage);
        pop @changes;
</code></pre></div></div>

<p>If the current step contains bell places (noting that 0 = 10, E = 11, T = 12) we split up the string into an array which we process in <em>reverse</em> order (to preserve the position numbering) and we remove these numbers from the array of changes.  The function numeric returns the integer value from the character (T=12 etc).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        if ($m =~ /^[0-9ET]+$/) {
            my @fixed = split //, $m;
            while (@fixed) {
                splice @changes, numeric(pop @fixed) - 1, 1;
            }
        }
</code></pre></div></div>

<p>For example, taking <code class="language-plaintext highlighter-rouge">$m</code> to be <code class="language-plaintext highlighter-rouge">1258</code> then <code class="language-plaintext highlighter-rouge">@changes</code> and <code class="language-plaintext highlighter-rouge">@fixed</code> will iterate as shown. Note the annoying -1 to align the bell position to the array index</p>

<table>
  <thead>
    <tr>
      <th>Iteration</th>
      <th><code class="language-plaintext highlighter-rouge">@changes</code></th>
      <th><code class="language-plaintext highlighter-rouge">@fixed</code></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td> </td>
      <td>0 1 2 3 4 5 6 7</td>
      <td>1 2 5 8</td>
    </tr>
    <tr>
      <td>1</td>
      <td>0 1 2 3 4 5 6</td>
      <td>1 2 5</td>
    </tr>
    <tr>
      <td>2</td>
      <td>0 1 2 3 5 6</td>
      <td>1 2</td>
    </tr>
    <tr>
      <td>3</td>
      <td>0 2 3 5 6</td>
      <td>1</td>
    </tr>
    <tr>
      <td>4</td>
      <td>2 3 5 6</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<p>The resulting array <code class="language-plaintext highlighter-rouge">@changes</code> contains the pairs of bell place indices which need to be swapped. Changes need to be made in order working up to the back as place notation can omit implied changes. For example 18 could be shortened to just 1 as by the time 2nd and 3rd, 4th and 5th, 6th and 7th have all swapped, 8th place must be fixed.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        while (@changes) {
            my ($swap1, $swap2) = splice @changes, 0, 2;
            @bells[$swap1, $swap2] = @bells[$swap2, $swap1];
            last if (scalar @changes &lt; 2);
        }
</code></pre></div></div>

<p>Now we need to output the current arrangement which at this point will just be a print statement.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        print "@bells\n";
    }
</code></pre></div></div>

<p>Keep going until we are back in rounds.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>} while ("@bells" ne "@rounds");
</code></pre></div></div>

<p>Now that that is working, the natural desire is to produce beautiful output. Since I was coding in Perl, and ultimately I’d like a webpage out of this, I experimented with Perl’s GD::Graph library to draw a line graph of the place of each bell. GD::Graph can display the point value on the graph, which was used to show the bell number. The output was functional although far from high resolution, and the font of the point values cannot be controlled.  See the Bob Doubles output below.</p>

<p><img src="/images/bob-doubles.png" alt="" /></p>

<p>Since the GD::Graph output wasn’t great, I’ve coded a version which creates the output using SVG.  Have a go:</p>

<form action="https://cgi.tunbury.org/cgi-bin/placenotation.pl" method="get" target="_blank">
    <label for="input">Select a method:</label>
    <select id="input" name="input">
        <option value="5.1.5.1.5,125">Bob Doubles</option>
        <option value="x1x1x1,12">Bob Minor</option>
        <option value="345.1.5.1.5,125">Reverse Canterbury</option>
        <option value="3.1.5.3.1.3.1.3.5.1.3.1">Stedman Doubles</option>
        <option value="3.1.5.1.5.1.5.1.5.1">Grandsire Doubles</option>
        <option value="x38x14x1256x16x34x1458x34x58,12">Valencia Surprise Major</option>
    </select><br /><br />
    <label for="stage">Select stage:</label>
    <select id="stage" name="stage">
        <option value="5">Doubles</option>
        <option value="6">Minor</option>
        <option value="7">Triples</option>
        <option value="8">Major</option>
        <option value="9">Caters</option>
        <option value="10">Royal</option>
    </select><br /><br />
    <label for="highlight">Highlight bell:</label>
    <input type="number" id="highlight" name="highlight" value="4" /><br /><br />
    <input type="submit" value="Submit" />
</form>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="perl" /><category term="bells" /><summary type="html"><![CDATA[Thomas Barlow has taught me place notation using Strike Back Surprise Major as the example. The notation for that is x38x14x58x16x12x38x14.12.78 l.e. 12. There are plenty of guides online on how to interpret it, such as this one on the CCCBR website.]]></summary></entry><entry><title type="html">Mandelbrot Set</title><link href="https://www.tunbury.org/2015/01/19/mandlebrot-set/" rel="alternate" type="text/html" title="Mandelbrot Set" /><published>2015-01-19T12:41:29+00:00</published><updated>2015-01-19T12:41:29+00:00</updated><id>https://www.tunbury.org/2015/01/19/mandlebrot-set</id><content type="html" xml:base="https://www.tunbury.org/2015/01/19/mandlebrot-set/"><![CDATA[<p>The Mandelbrot set is created from this very simple formula in which both Z and C are complex numbers.</p>

\[Z_{n+1}=Z_n^2+c\]

<p>The formula is iterated to determine whether Z is bounded or tends to infinity.  To demonstrate this, assume a test case where the imaginary part is zero and focus just on the real part.  In this case, the formula is trivial to evaluate starting with Z = 0.  The table below shows the outcome at C=0.2 and C=0.3, where one is clearly bounded and the other is not!</p>

<table>
  <thead>
    <tr>
      <th><strong>Iteration</strong></th>
      <th><strong>C = 0.2</strong></th>
      <th><strong>C = 0.3</strong></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td> </td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>1</td>
      <td>0.2</td>
      <td>0.3</td>
    </tr>
    <tr>
      <td>2</td>
      <td>0.24</td>
      <td>0.39</td>
    </tr>
    <tr>
      <td>3</td>
      <td>0.2576</td>
      <td>0.4521</td>
    </tr>
    <tr>
      <td>4</td>
      <td>0.266358</td>
      <td>0.504394</td>
    </tr>
    <tr>
      <td>5</td>
      <td>0.270946</td>
      <td>0.554414</td>
    </tr>
    <tr>
      <td>6</td>
      <td>0.273412</td>
      <td>0.607375</td>
    </tr>
    <tr>
      <td>7</td>
      <td>0.274754</td>
      <td>0.668904</td>
    </tr>
    <tr>
      <td>8</td>
      <td>0.27549</td>
      <td>0.747432</td>
    </tr>
    <tr>
      <td>9</td>
      <td>0.275895</td>
      <td>0.858655</td>
    </tr>
    <tr>
      <td>10</td>
      <td>0.276118</td>
      <td>1.037289</td>
    </tr>
    <tr>
      <td>11</td>
      <td>0.276241</td>
      <td>1.375968</td>
    </tr>
    <tr>
      <td>12</td>
      <td>0.276309</td>
      <td>2.193288</td>
    </tr>
    <tr>
      <td>13</td>
      <td>0.276347</td>
      <td>5.110511</td>
    </tr>
    <tr>
      <td>14</td>
      <td>0.276368</td>
      <td>26.41732</td>
    </tr>
    <tr>
      <td>15</td>
      <td>0.276379</td>
      <td>698.1747</td>
    </tr>
    <tr>
      <td>16</td>
      <td>0.276385</td>
      <td>487448.2</td>
    </tr>
    <tr>
      <td>17</td>
      <td>0.276389</td>
      <td>2.38E+11</td>
    </tr>
    <tr>
      <td>18</td>
      <td>0.276391</td>
      <td>5.65E+22</td>
    </tr>
  </tbody>
</table>

<p>C=0.2 is said to be part of the set whereas C=0.3 is not.  Typically the point is coloured by some arbitrary function of the number of iterations it took for the modulus of Z to exceed 2.</p>

<p>The set is plotted on the complex number plane with the real part using the x-axis and the imaginary part using the y-axis, thus:</p>

<p><img src="/images/complex-plane.svg" alt="" /></p>

<p>Given that computers don’t natively work with complex numbers we need to break the formula down into manageable pieces.  Firstly write the formula including both the real and complex parts then expand the brackets and group the terms.</p>

\[Z_{n+1}=Z_n^2+c\]

\[Z_{n+1}=(Z_{re}+Z_{im}i)^2+c_{re}+c_{im}i\]

\[Z_{n+1}=Z_{re}^2-Z_{im}^2+2Z_{re}Z_{im}i+c_{re}+c_{im}i\]

\[\mathbb R(Z_{n+1})=Z_{re}^2-Z_{im}^2+c_{re}\]

\[\mathbb I(Z_{n+1})=2Z_{re}Z_{im}+c_{im}\]

<p>Here’s a Perl program to generate a PNG file.  Over the years I’ve written this same program in many languages starting with Pascal at school, PostScript at University and <a href="/downloads/mandelbrot.xlsm">Excel VBA</a> and JavaScript…</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/perl -w

use strict;
use GD;

my $width = 1024;
my $height = 1024;

GD::Image-&gt;trueColor(1);
my $img = new GD::Image($width, $height);
</code></pre></div></div>

<p>Focus on an interesting bit. Real should be between -2.5 and 1 and
imaginary between -1 and 1.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my $MINre = -0.56;
my $MAXre = -0.55;
my $MINim = -0.56;
my $MAXim = -0.55;
</code></pre></div></div>

<p>Maximum number of iterations before the point is classified as bounded.
I’ve used 255 because I am using this as the colour component later</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my $max = 255;
</code></pre></div></div>

<p>Set up the loops to move through all the pixels in the image. The value
of C is calculated from the image size and scale. Note that GD creates
images with the origin in the top left.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for my $row (1 .. $height) {
    my $Cim = $MINim + ($MAXim - $MINim) * $row / $height;
    for my $col (0 .. $width - 1) {
        my $Cre = $MINre + ($MAXre - $MINre) * $col / $width;
</code></pre></div></div>

<p>Z starts at the origin</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        my $Zre = 0;
        my $Zim = 0;
        my $iteration = 0;
</code></pre></div></div>

<p>Loop until the modulus of Z exceeds 2 or the maximum number of iterations
has passed. Note that I’ve squared both sides to avoid wasting time
calculating the square root</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>while ($Zre * $Zre + $Zim * $Zim &lt;= 4 &amp;&amp; $iteration &lt; $max) {
</code></pre></div></div>

<p>Here’s the formula from above to calculate the next value</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            my $ZNre = $Zre * $Zre - $Zim * $Zim + $Cre;
            $Zim = 2 * $Zre * $Zim + $Cim;
            $Zre = $ZNre;
</code></pre></div></div>

<p>Move on to the next iteration</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>            $iteration++;
        }
</code></pre></div></div>

<p>Determine why we finished the loop - was it bounded or not - and then
colour the pixel appropriately</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        if ($iteration &lt; $max) {
            $img-&gt;setPixel($col, $height - $row, $iteration * 0x010101);
        } else {
            $img-&gt;setPixel($col, $height - $row, 0x00);
        }
    }
}
</code></pre></div></div>

<p>Output the PNG file to STDOUT</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>binmode STDOUT;
print $img-&gt;png;
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="perl" /><summary type="html"><![CDATA[The Mandelbrot set is created from this very simple formula in which both Z and C are complex numbers.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/mandelbrot-set-5.png" /><media:content medium="image" url="https://www.tunbury.org/images/mandelbrot-set-5.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Shape Files</title><link href="https://www.tunbury.org/2015/01/19/shape-files/" rel="alternate" type="text/html" title="Shape Files" /><published>2015-01-19T12:41:29+00:00</published><updated>2015-01-19T12:41:29+00:00</updated><id>https://www.tunbury.org/2015/01/19/shape-files</id><content type="html" xml:base="https://www.tunbury.org/2015/01/19/shape-files/"><![CDATA[<p>Below is a perl script to create a PNG from a Shape file.</p>

<p><a href="/downloads/shapefile.pdf">Shape file specification</a></p>

<p><a href="/downloads/ROADNODE.zip">UK Road network as a shape file </a></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use strict;
use warnings;

use GD;
GD::Image-&gt;trueColor(1);

my $width = 8 * 1024;
my $height = 8 * 1024;

my $shpfile = $ARGV[0];
open(FH, "&lt;$shpfile") or die("No input file\n");
binmode(FH); 

my $csvfile = $shpfile;
$csvfile =~ s/.shp$/.csv/g;
open(POLYOUT, "&gt;$csvfile");

my $buffer;
my $num_bytes = read(FH, $buffer, 100);
my ($code, $u1, $u2, $u3, $u4, $u5, $filelength, $version, $type, $BBminX, $BBminY, $BBmaxX, $BBmaxY, $BBminZ, $BBmaxZ, $BBminM, $BBmaxM) = unpack("N N N N N N N V V F F F F F F F F", $buffer);
print "code = $code\n";
print "filelength = $filelength\n";
print "version = $version\n";
print "minX = $BBminX\n";
print "minY = $BBminY\n";
print "maxX = $BBmaxX\n";
print "maxY = $BBmaxY\n";
print "minZ = $BBminZ\n";
print "maxZ = $BBmaxZ\n";
print "minM = $BBminM\n";
print "maxM = $BBmaxM\n";

sub mapx {
    my $x = shift;
    return ($x - $BBminX) / ($BBmaxX - $BBminX) * $width;
}

sub mapy {
    my $y = shift;
    return $height - ($y - $BBminY) / ($BBmaxY - $BBminY) * $height;
}

my $polyCount = 0;

my $img = new GD::Image($width, $height);

while (read(FH, $buffer, 12)) {
    my ($recordnumber, $recordlength, $shapetype) = unpack("N N V", $buffer);
    if ($shapetype == 5) {
        # Polygon
        read(FH, $buffer, 4 * 8 + 2 * 4);
        my ($minX, $minY, $maxX, $maxY, $NumParts, $NumPoints) = unpack("F F F F V V", $buffer);
        my @parts;
        foreach my $part (1 .. $NumParts) {
            read(FH, $buffer, 4);
            my ($part) = unpack("V", $buffer);
            push @parts, $part;
            #syswrite(SHPOUT, pack("V", $part), 4);
        }
        push @parts, $NumPoints;
        @parts = reverse @parts;
        while (@parts) {
            my $firstpoint = pop @parts;
            my $lastpoint = pop @parts;
            my $poly = new GD::Polygon;
            $polyCount++;
            foreach ($firstpoint .. $lastpoint - 1) {
                read(FH, $buffer, 16);
                my ($x, $y) = unpack("F F", $buffer);
                print POLYOUT "$x,$y,$polyCount\n";
                $poly-&gt;addPt(mapx($x), mapy($y));
            }
            $img-&gt;openPolygon($poly, 0xff0000);
            push @parts, $lastpoint if (@parts);
        }
    } elsif ($shapetype == 3) {
        # PolyLine
        read(FH, $buffer, 4 * 8 + 2 * 4);
        my ($minX, $minY, $maxX, $maxY, $NumParts, $NumPoints) = unpack("F F F F V V", $buffer);
        my @parts;
        foreach my $part (1 .. $NumParts) {
            read(FH, $buffer, 4);
            my ($part) = unpack("V", $buffer);
            push @parts, $part;
        }
        push @parts, $NumPoints;
        @parts = reverse @parts;
        while (@parts) {
            my $firstpoint = pop @parts;
            my $lastpoint = pop @parts;
            read(FH, $buffer, 16);
            my ($x1, $y1) = unpack("F F", $buffer);
            print POLYOUT "$x1,$y1\n";
            foreach ($firstpoint .. $lastpoint - 2) {
                read(FH, $buffer, 16);
                my ($x2, $y2) = unpack("F F", $buffer);
                print POLYOUT "$x2,$y2\n";
                $img-&gt;line(mapx($x1), mapy($y1), mapx($x2), mapy($y2), 0xff0000);
                $x1 = $x2;
                $y1 = $y2;
            }
            push @parts, $lastpoint if (@parts);
        }

    } elsif ($shapetype == 1) {
        read(FH, $buffer, 2 * 8);
        my ($x, $y) = unpack("F F", $buffer);
        $img-&gt;setPixel(mapx($x), mapy($y), 0xff0000);
        print POLYOUT "$x,$y\n";
    } else {
        print "unhandled type shapetype = $shapetype\n";
        read(FH, $buffer, $recordlength * 2 - 4);
    }
}

close(POLYOUT);

my $pngfile = $shpfile;
$pngfile =~ s/.shp$/.png/g;
open(PNGOUT, "&gt;$pngfile");
binmode(PNGOUT);
print PNGOUT $img-&gt;png;
close(PNGOUT);
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="perl" /><summary type="html"><![CDATA[Below is a perl script to create a PNG from a Shape file.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.tunbury.org/images/roadnode.png" /><media:content medium="image" url="https://www.tunbury.org/images/roadnode.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Narcissistic Numbers</title><link href="https://www.tunbury.org/2014/01/02/narcissistic-numbers/" rel="alternate" type="text/html" title="Narcissistic Numbers" /><published>2014-01-02T12:41:29+00:00</published><updated>2014-01-02T12:41:29+00:00</updated><id>https://www.tunbury.org/2014/01/02/narcissistic-numbers</id><content type="html" xml:base="https://www.tunbury.org/2014/01/02/narcissistic-numbers/"><![CDATA[<p>I heard about these on <a href="http://www.bbc.co.uk/programmes/b006qshd">BBC Radio 4 More or
Less</a> and they just intrigued
me, perhaps in part because they have no known application! In the past
similar obsessions have appeared with the calculation of PI and right
back to my childhood calculating powers of 2 on a BBC Micro.</p>

<p>The full definition, as for everything, is on
<a href="https://en.wikipedia.org/wiki/Narcissistic_number">Wikipedia</a> but in
short a narcissistic number is one where the sum of its digits, each raised to
the power of the number of digits, equals the number itself. For example</p>

\[153 = 1^3 + 5^3 + 3^3\]

<p>Here’s some quick and dirty Perl code to calculate them:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use strict;
use warnings;

for (my $i = 10; $i &lt; 10000; $i++) {
    my $pwr = length($i);
    my $total = 0;
    for (my $j = 0; $j &lt; $pwr; $j++) {
        $total += int(substr $i, $j, 1) ** $pwr;
    }
    if ($total == $i) {
        print $i . " is narcissistic\n";
    }
}
</code></pre></div></div>

<p>This yields this output</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>153 is narcissistic
370 is narcissistic
371 is narcissistic
407 is narcissistic
1634 is narcissistic
8208 is narcissistic
9474 is narcissistic
</code></pre></div></div>
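
<p>As a spot check on the last of these, 9474 has four digits, and indeed</p>

\[9^4 + 4^4 + 7^4 + 4^4 = 6561 + 256 + 2401 + 256 = 9474\]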

<p>However, due to the typical size limitation in the implementation of integers,
this doesn’t get you very far. Perl’s <code class="language-plaintext highlighter-rouge">Math::BigInt</code> gets you further if
you are very patient.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use strict;
use warnings;
use Math::BigInt;

my $i = Math::BigInt-&gt;bone();

while ((my $pwr = $i-&gt;length()) &lt; 10) {
    my $total = Math::BigInt-&gt;bzero;
    for (my $j = 0; $j &lt; $pwr; $j++) {
        my $t = Math::BigInt-&gt;new($i-&gt;digit($j));
        $total-&gt;badd($t-&gt;bpow($pwr));
    }
    if ($total == $i) {
        print $i . " is narcissistic\n";
    }
    $i-&gt;binc();
}
</code></pre></div></div>]]></content><author><name>Mark Elvers</name><email>mark.elvers@tunbury.org</email></author><category term="perl" /><summary type="html"><![CDATA[I heard about these on BBC Radio 4 More or Less and they just intrigued me, perhaps in part because they have no known application! In the past similar obsessions have appeared with the calculation of PI and right back to my childhood calculating powers of 2 on a BBC Micro.]]></summary></entry></feed>