Agentic NLP and ML platform for the rail domain. Routes queries across three
integrated capabilities: live ticket search via the National Rail OJP SOAP API,
a Random Forest delay prediction engine, and an LLM-powered contingency advisor
for station staff, backed by a pgvector RAG layer. Fully containerised with
Docker Compose, with voice interface support.
Python
LLM + RAG
Random Forest
pgvector
Streamlit
Docker
View on GitHub ↗
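As a rough illustration of the routing step described above, the sketch below dispatches a query to one of three handlers. It is a minimal stand-in, not the platform's actual router: it uses plain keyword matching where the real system may use an LLM or a trained classifier, and every name in it (`INTENT_KEYWORDS`, `classify_intent`, `route`, the intent labels, the handler keys) is hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical keyword sets, assumed for illustration only.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "ticket_search": ("ticket", "fare", "price", "book"),
    "delay_prediction": ("delay", "late", "on time"),
    "contingency_advice": ("contingency", "disruption", "incident", "blocked"),
}


def classify_intent(query: str) -> str:
    """Return the intent whose keywords appear most often, or 'fallback'."""
    text = query.lower()
    scores = {
        intent: sum(keyword in text for keyword in keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "fallback"


def route(query: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Dispatch the query to the handler registered for its intent."""
    intent = classify_intent(query)
    handler = handlers.get(intent, handlers["fallback"])
    return handler(query)
```

In this sketch each handler would wrap one capability (ticket search, delay prediction, or the contingency advisor), and a `"fallback"` handler must always be registered so that unmatched queries still get a response.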
LLM-as-a-Judge evaluation system built with Dr. F. F. Liza at UEA. Automated
Python pipelines ingest and process large volumes of unstructured text, then use
a generative model to systematically score other LLMs on reasoning capability,
logical consistency, and semantic accuracy, giving a repeatable framework for
measuring AI output quality.
Python
LLMs
NLP
Evaluation
Data Pipelines
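The scoring loop of an LLM-as-a-Judge setup like the one described above can be sketched as follows. This is an assumption-laden outline rather than the project's pipeline: the judge is passed in as a plain callable so that no particular model API is implied, and the 1 to 5 scale, the JSON reply format, the prompt wording, and the names `CRITERIA`, `build_prompt`, `parse_scores`, and `evaluate` are all hypothetical.

```python
import json
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Criteria taken from the description above; the short keys are assumed.
CRITERIA = ("reasoning", "consistency", "accuracy")


def build_prompt(question: str, answer: str) -> str:
    """Build a judge prompt asking for JSON scores (wording is illustrative)."""
    return (
        "Score the answer from 1 to 5 on reasoning, consistency and accuracy. "
        'Reply with JSON such as {"reasoning": 3, "consistency": 3, "accuracy": 3}.\n'
        f"Question: {question}\nAnswer: {answer}"
    )


def parse_scores(raw: str) -> Dict[str, int]:
    """Parse the judge's JSON reply and check each score is within 1..5."""
    data = json.loads(raw)
    scores = {}
    for criterion in CRITERIA:
        value = int(data[criterion])
        if not 1 <= value <= 5:
            raise ValueError(f"{criterion} score out of range: {value}")
        scores[criterion] = value
    return scores


def evaluate(
    samples: List[Tuple[str, str]], judge: Callable[[str], str]
) -> Dict[str, float]:
    """Average the judge's per-criterion scores over (question, answer) pairs."""
    per_sample = [parse_scores(judge(build_prompt(q, a))) for q, a in samples]
    return {c: mean(s[c] for s in per_sample) for c in CRITERIA}
```

Keeping the judge behind a callable means the same aggregation code can be exercised with a stub in tests and with a real generative model in production.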
MSc dissertation supervised by Prof. Stephen Laycock, benchmarking
state-of-the-art generative AI models against expert human engineers on
low-level, hardware-aware CUDA kernel development. The work involves designing
rigorous experimental frameworks to evaluate execution efficiency, memory
throughput, and code correctness across a range of GPU computational workloads.
CUDA
C++
LLMs
Benchmarking
GPU Programming
PDF analysis tool with CEFR level classification, vocabulary highlighting,
and glossary generation for language learners. Fully deployed and publicly
accessible.
Python
NLP
Docker
Nginx
View on GitHub ↗