Q2 Summary
Mark Elvers

Categories

  • tarides

I am grateful for Tarides’ sponsorship of my OCaml work. Below is a summary of my activities in Q2 2025.

OCaml Infrastructure and Development

OCaml Maintenance Activities

General maintenance work on OCaml’s infrastructure spanned many areas, including updating minimum supported OCaml versions from 4.02 to 4.08 and addressing issues with opam-repo-ci job timeouts. Platform-specific work included resolving compatibility issues with Fedora 42 and GCC 15, addressing Ubuntu AppArmor conflicts affecting runc operations, and managing macOS Sequoia upgrades across the Mac Mini CI workers. Complex build issues were investigated and resolved, including C++ header path problems in macOS workers and FreeBSD system upgrades for the CI infrastructure.

OCaml Infrastructure Migration

Due to the impending sunset of the Equinix Metal platform, the OCaml community services needed to be migrated. Services including OCaml-CI, opam-repo-ci, and the opam.ocaml.org deployment pipeline were migrated to new blade servers. The migration work was planned to minimise service disruption, which was kept to just a few minutes. Complete procedures were documented, including Docker volume transfers and rsync strategies.

opam2web Deployment

Optimisation work was undertaken on the deployment pipeline for opam2web, which powers opam.ocaml.org, to address deployment times of more than two hours. The primary issue was the enormous size of the opam2web Docker image, which exceeded 25GB due to the inclusion of complete opam package archives. The archive was moved to a separate layer, allowing Docker to cache that layer and reducing the deployment time to 20 minutes.

opam Dependency Graphs

Algorithms for managing OCaml package dependencies were investigated, including topological sorting to determine a valid package installation order in which every package is installed after its dependencies. This work extended to handling complex dependency scenarios, including post-dependencies and optional dependencies. A transitive reduction algorithm was implemented to create a dependency graph with the minimum number of edges while preserving the same dependency relationships, enabling more efficient package management and installation processes.
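As a rough illustration of the two techniques, the sketch below runs a depth-first topological sort and a transitive reduction over a small, made-up dependency graph. It is not the code used in this work: the names `graph_of_list`, `deps_of`, `topological_sort`, `reachable` and `transitive_reduction` are hypothetical, the package names are only for illustration, the graph is assumed to be acyclic, and post-dependencies and optional dependencies are not modelled.

```ocaml
(* A hypothetical in-memory graph: each package maps to its direct deps. *)
module SMap = Map.Make (String)
module SSet = Set.Make (String)

type graph = SSet.t SMap.t

let graph_of_list l =
  List.fold_left
    (fun g (pkg, deps) -> SMap.add pkg (SSet.of_list deps) g)
    SMap.empty l

(* Direct dependencies of [pkg]; unknown packages have none. *)
let deps_of (g : graph) pkg =
  match SMap.find_opt pkg g with Some d -> d | None -> SSet.empty

(* Depth-first topological sort: every package is listed after its
   dependencies. Assumes the graph has no cycles. *)
let topological_sort (g : graph) : string list =
  let rec visit pkg (seen, order) =
    if SSet.mem pkg seen then (seen, order)
    else
      let seen = SSet.add pkg seen in
      let seen, order = SSet.fold visit (deps_of g pkg) (seen, order) in
      (seen, pkg :: order)
  in
  let _, order =
    SMap.fold (fun pkg _ acc -> visit pkg acc) g (SSet.empty, [])
  in
  List.rev order

(* Every package reachable from [pkg] by following one or more edges. *)
let reachable (g : graph) pkg : SSet.t =
  let rec go pkg seen =
    SSet.fold
      (fun d seen -> if SSet.mem d seen then seen else go d (SSet.add d seen))
      (deps_of g pkg) seen
  in
  go pkg SSet.empty

(* Transitive reduction of an acyclic graph: drop the edge pkg -> d
   whenever d is already reachable through another direct dependency. *)
let transitive_reduction (g : graph) : graph =
  SMap.map
    (fun deps ->
      SSet.filter
        (fun d ->
          not
            (SSet.exists
               (fun other ->
                 (not (String.equal other d))
                 && SSet.mem d (reachable g other))
               deps))
        deps)
    g

(* Made-up example graph. *)
let example =
  graph_of_list
    [ ("app", [ "lwt"; "dune"; "ocaml" ]);
      ("lwt", [ "dune"; "ocaml" ]);
      ("dune", [ "ocaml" ]);
      ("ocaml", []) ]

let () =
  print_endline (String.concat " " (topological_sort example));
  SMap.iter
    (fun pkg deps ->
      Printf.printf "%s -> %s\n" pkg (String.concat ", " (SSet.elements deps)))
    (transitive_reduction example)
```

In this example, `app` keeps only its edge to `lwt`, because `dune` and `ocaml` are already implied through `lwt`. Recomputing reachability for every edge keeps the sketch short but is quadratic or worse; a real implementation over the full opam repository would likely cache the reachable sets.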

OCaml Developments under Windows

Significant work was undertaken to bring containerisation technologies to OCaml development on Windows. This included implementing a tool to create host compute networks via the Windows API, tackling limitations with NTFS hard links, and implementing a copy-on-write reflink tool for Windows.

OxCaml Support

Support for the new OxCaml compiler variant included establishing an opam repository and testing which existing OCaml packages successfully built with the new compiler.

ZFS Storage and Hardware Deployment

Early in the quarter, a hardware deployment project centred around Dell PowerEdge R640 servers with large-scale SSD storage was undertaken. The project involved deploying multiple batches of Kingston 7.68TB SSD drives and creating automated deployments for Ubuntu using network booting with EFI and cloud-init configuration. ZFS was trialled as a root filesystem, which proved possible but was ultimately discarded, and dm-cache was explored for SSD acceleration of spinning disk arrays. ZFS was also investigated as a distributed storage archive system, using an Ansible-based deployment strategy driven by a YAML description.

Talos II Repairs

Significant hardware reliability issues affected two Raptor Computing Talos II POWER9 machines. The first system experienced complete lockups after as little as 20 minutes of operation, while the second began exhibiting similar problems requiring daily power cycling. The issues were isolated with the help of Raptor Computing support, and were resolved by upgrading the firmware and eventually swapping CPUs between the systems. This also provided an opportunity to analyse the performance of OBuilder operations on POWER9 systems, comparing OverlayFS on TMPFS with BTRFS on NVMe storage, resulting in optimised build performance.

EEG Systems Investigations

Various software solutions and research platforms were explored as part of a broader system evaluation. This included investigating Slurm Workload Manager for compute resource scheduling, examining Gluster distributed filesystem capabilities, and implementing Otter Wiki with Raven authentication integration for collaborative documentation. The evaluation extended to research data management platforms, exploring InvenioRDM for scientific data archival and BON in a Box for biodiversity analysis workflows. To support the Teserra workshop, a multi-user Jupyter environment was set up using Docker containerisation.

Miscellaneous Technical Explorations

Diverse technical explorations included setting up a Bluesky Personal Data Server and developing an SSH authentication mechanism on the ATProto network that extracts SSH public keys from Bluesky profiles. Additional projects included developing OCaml-based API tools for Box cloud storage, creating Real Time Trains API integrations, and exploring various file synchronisation and backup solutions. Reflink copy mechanisms for efficient file operations were also investigated using OCaml multicore.