Research · Artificial Intelligence University
Sovereign AI Lab (SAIL)
Regulated organisations cannot use AI the way Silicon Valley does. Their data cannot leave the building. Their compliance obligations do not pause for innovation cycles. SAIL studies what actually works when the data stays put, and publishes the findings openly.
SAIL is the research lab of Artificial Intelligence University, operating in dual affiliation with XEROTECH LTD. The academic arm sits here at AIU. The industrial arm sits at XEROTECH, where the lab’s research feeds commercial products and consulting. Same researchers, same standards, different audiences.
We publish preprints with DOIs, release trained models and datasets on HuggingFace, and write longer-form analysis through AI Business Review. The publications stand on their own.
Research areas
What the lab studies
The dominant assumption in AI is that deployment requires hyperscale compute and frontier-model APIs. For regulated organisations, that assumption is incomplete. SAIL asks what actually works when data cannot leave the building.
Small-scale language models
How small can a model be and still do the job? Not as an academic question, but as an engineering constraint for organisations that will never rent GPU clusters from hyperscalers. Our published ILM model runs inference on hardware architecture from 1999. On a narrowly scoped task, the optimiser choice contributed a larger measurable improvement than switching from an LSTM to a Transformer. Useful AI does not require frontier compute. We can prove it with numbers.
Privacy-preserving architecture
AI systems where sensitive data never reaches an external provider. Client-side redaction before any query is transmitted. On-premise inference on hardware the organisation already owns. This research track produced the patent-pending privacy filtering deployed in production at the industrial arm, and informs deployment patterns used across consulting engagements.
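The client-side redaction step can be sketched in a few lines: sensitive spans are masked locally, before any text is transmitted. This is an illustrative pattern only; the regexes, labels, and placeholder format below are our assumptions, not the patent-pending filter described above.

```python
import re

# Hypothetical redaction patterns for illustration. A production filter
# would use far more robust detection than these three regexes.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "UK_NINO": re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\d{6}[A-D]\b"),
    "PHONE": re.compile(r"(?:\+44\s?|\b0)\d{4}\s?\d{6}\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

query = "Contact jane.doe@example.com about NINO AB123456C."
safe_query = redact(query)
# Only safe_query would ever leave the machine for an external API.
```

The point of the pattern is ordering, not the regexes themselves: detection and substitution run on the client, so the external provider only ever sees typed placeholders.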
Compliance intelligence
Fusing deterministic rules engines with LLM-generated regulatory narrative. Every output carries epistemic confidence markers: Verified, Inferred, or Assumed. A board, a compliance function, and a regulator can each read the same document and know exactly which claims rest on cited regulation and which rest on inference. Research deployed across energy, tax, legal, health, and education sectors.
AI governance for regulated environments
What technical and organisational structures make AI safe to deploy when a regulator is watching. Ofgem for energy. FCA and PRA for financial services. CQC for healthcare. SRA for legal. Ofsted and OfS for education. ICO for data protection. NCSC CAF 4.0 for cyber resilience. The EU AI Act across Europe. We study the instruments, model the enforcement patterns, and publish analysis through the SAIL newsletter.
Published research
Papers, models, and data
Preprints on Zenodo and TechRxiv. Open models and datasets on HuggingFace. Longer-form analysis through AI Business Review.
“On this corpus at this scale, the optimiser contributed more to the measurable improvement than attention did.”
ILM & ArfaLM, SAIL, April 2026
Language models
ILM and ArfaLM
A 2.3M-parameter LSTM trained using only techniques available in 1999 versus an 8.4M-parameter Transformer with modern training, on identical child-directed speech data from the CHILDES corpus. When the performance gap is decomposed, the optimiser change alone accounts for a 1.56x factor of the improvement, while the architecture change, tokenisation improvement, and 3.6x parameter increase together account for a 1.38x factor. Part 1 published. Part 2, on-hardware deployment, integer quantisation, and Pentium II benchmarks, in progress.
Paper (Zenodo) · Models (HuggingFace) · Dataset (HuggingFace)
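Read as multiplicative factors, the two contributions compose to the overall gap. The ~2.15x total below is our arithmetic from the quoted factors, not a number reported in the paper.

```python
# Multiplicative decomposition: the overall improvement is the product
# of the per-change factors quoted above.
optimiser_factor = 1.56   # optimiser change alone
other_factors = 1.38      # architecture + tokenisation + 3.6x parameters
total = optimiser_factor * other_factors
print(f"{total:.2f}x")    # combined improvement factor
```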
Attention & interpretability
The Persistent Tension Hypothesis
Endogenous attention capture in large language models. How internal representational competition shapes what a model attends to, independent of prompt design. With Dr. Atif Naseer.
Attention steering
Focus Mirror
An interactive framework for steering large language model attention in real time during inference. Directing and redirecting what the model focuses on without retraining. With Dr. Atif Naseer.
AI governance
Lifecycle Governance for Multi-Agent Systems
Principles and a research agenda for governing multi-agent AI systems from design through deployment, monitoring, and retirement. Covering accountability, auditability, and human-oversight requirements across regulated environments.
Disaster management
Edge-Enabled Generative AI and AR/VR in Disaster Management
A theoretical framework for integrating edge AI with augmented and virtual reality to support decision-making in disaster response. Relevant to critical infrastructure operators and civil contingency planners. With A.A. Khan.
Privacy
Client-Side Privacy Filtering for LLM-Based Applications
Browser-side PII detection and redaction before data reaches any API. The research behind the patent-pending privacy filtering deployed at the industrial arm. With A.A. Khan.
Beyond the brief
The Temporal Fluid Multiverse
A unified framework for causality and observation, submitted to Proceedings of the Royal Society A (RSPA-2026-0276). Outside SAIL’s applied remit. Included because we publish across disciplines.
All models and datasets: huggingface.co/nshah-fbcs · ORCID · Google Scholar
Where the work appears
Publication channels
Papers and models
Preprints on Zenodo and TechRxiv. Trained models and datasets on HuggingFace. Code where it matters.
AI Business Review
Longer-form research and analysis for a business and policy audience. Independent editorial publication under AIU.
SAIL newsletter
Fortnightly on LinkedIn. Sovereign AI policy, deployment analysis, and original research for decision-makers.
Researchers
The team behind the lab
Noman Shah, FBCS
Lab Director · Founder & President, AIU.ac · Founder & CEO, XEROTECH LTD
Fellow of the British Computer Society. 9 patents cited by Apple, Microsoft, Amazon, and IBM. 7 published research papers with DOIs. Executive education at Harvard Business School and Oxford Saïd. HBR Advisory Council. World Economic Forum Digital member. Judge for the NASA Conrad Challenge. Published in Entrepreneur.com. Author of How DeepSeek Unleashed Pandora’s Box.
Dr. Atif Naseer
Head of Research & AI · Co-Founder & CTO, XEROTECH LTD
PhD in Video Analytics and Deep Learning (University of Malaga, Spain). Senior Lecturer and Research Team Lead at Umm Al-Qura University. 60+ publications, 433+ citations on Google Scholar. Co-author on the Persistent Tension Hypothesis and Focus Mirror. US patent holder in deep learning for congestion detection. 12+ years leading government-funded research in crowd management, big data, and sensor networks.
Academic arm
Artificial Intelligence University · UKPRN 10095512 · BCS Approved Centre · Company #14543918
Industrial arm
XEROTECH LTD · Company #14474495 · ICO ZC065188
Backed by
Google for Startups · Microsoft for Startups · Barclays Eagle Labs · AWS Activate
Collaborate with the lab
If you are a researcher working on small-model architectures, deployment efficiency, privacy-preserving inference, or sector-specific AI governance, we welcome co-authorship, dataset sharing, and joint submissions. For commercial and consulting enquiries, contact the industrial arm at XEROTECH LTD.
Artificial Intelligence University · UKPRN 10095512 · XEROTECH LTD · Company #14474495

