§ 02 — Projects
Tools & Research
Open-source security tools, research projects, and offensive methodologies. Everything here is built to be used, not just demonstrated.
Contributed the 'rcrce' cybersecurity challenge to METR's HCAST benchmark, which evaluates autonomous AI agents. The task is a 2.8-hour PHP race-condition exploit that requires achieving RCE and retrieving a flag. No AI agent has solved it. Built to the METR Task Standard for measuring frontier model capabilities.