Lloyds Banking Group and University of Glasgow Launch Four-Year Agentic AI Study
Lloyds Banking Group and the University of Glasgow have launched a four-year research programme to measure how agentic AI powered by large language models affects software engineering at scale. The collaboration embeds academic researchers directly into Lloyds' engineering teams and funds a PhD studentship, a Masters by Research position, and a postdoctoral role to run structured experiments across the bank's development squads.
The bank serves 28 million customers and will use the findings to guide deployment of agentic systems across its engineering organisation. The partnership gives researchers access to production-like workflows while giving Lloyds a structured path to scale these tools responsibly.
What the research will measure
Researchers will design empirical software engineering experiments that track both qualitative and quantitative signals. Key metrics include output quality, development velocity, defect rates, and task completion time.
The study will run recurring cycles where engineering teams pair with agentic counterparts to solve assigned tasks. Results will be tracked quarterly to observe learning curves and how effects aggregate across teams.
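The study's actual instrumentation has not been published. As a rough illustration, a per-task record for this kind of experiment might capture the metrics listed above alongside the quarterly cycle it belongs to; every field name in this sketch is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    """One completed task in an experiment cycle (hypothetical schema)."""
    team_id: str             # engineering squad identifier
    quarter: str             # reporting cycle, e.g. "2025-Q3"
    agentic_assisted: bool   # True if the team paired with an agentic tool
    completion_hours: float  # wall-clock time to finish the task
    defects_found: int       # defects attributed to the delivered change
    quality_score: float     # reviewer-assigned quality rating (0-10)
```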
The project focuses on semi-autonomous agentic tools, often implemented as orchestration layers over LLMs, integrated into day-to-day coding work. Researchers will use data mining of repositories, A/B-style experiments, controlled task assignments, and observational studies to evaluate performance.
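None of the study's analysis code is public. The sketch below shows one conventional way an A/B-style comparison of completion times could be run, using a two-sample Welch's t-test; the sample data is invented for illustration.

```python
from scipy import stats

# Hypothetical completion times (hours) from matched task assignments.
control_hours = [6.5, 8.0, 7.2, 9.1, 6.8, 7.7]  # teams working unassisted
agentic_hours = [5.1, 6.3, 5.8, 7.0, 5.5, 6.1]  # teams paired with an agent

# Welch's t-test: is the mean completion time different with assistance?
t_stat, p_value = stats.ttest_ind(agentic_hours, control_hours, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```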
Why this matters for enterprise adoption
Large organisations are accelerating adoption of generative tools for coding and operations. Yet robust, large-scale evidence on agentic AI impact in enterprise software engineering remains scarce. Lloyds has already reported material value from generative AI deployments and substantial internal use of developer-assist tools like GitHub Copilot.
This collaboration shifts the conversation from proof-of-concept pilots to longitudinal, organisation-level evidence that can inform rollout, retraining, and governance decisions. The partnership also explicitly ties technology evaluation to workforce upskilling and process change, one of the main practical barriers to scaling agentic systems at large financial institutions.
Results will likely influence vendor choices, internal platform design, and regulatory discussions about auditability and control for semi-autonomous developer assistants.
What to watch
The study's measurement design and published findings will be the first signals. Expect detailed metrics on defect density, developer productivity, and how team roles evolve as agentic tools take on more work.
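Defect density is a standard industry metric, usually reported as defects per thousand lines of code (KLOC); a minimal computation looks like this. The numbers below are illustrative, not study data.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Defects per thousand lines of code (KLOC)."""
    return defects / (lines_of_code / 1000)

# Illustrative: 12 defects in a 40,000-line change set -> 0.3 defects/KLOC
print(defect_density(12, 40_000))
```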
Watch for reproducible protocols and data-sharing agreements that let other organisations benchmark results. Also monitor governance artifacts the partners produce, such as recommended safety checks, audit logs, and human-in-the-loop policies, since those will shape the path to enterprise adoption at scale.
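The article does not specify what those audit logs will contain. One plausible shape, assuming a human-in-the-loop review step for each semi-autonomous action, is a structured entry per agent action; every field here is an assumption.

```python
import json
from datetime import datetime, timezone

# Hypothetical audit entry for one semi-autonomous agent action.
entry = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "agent_id": "code-agent-01",            # which agent acted (assumed naming)
    "action": "opened_pull_request",        # what the agent did
    "repository": "internal/payments-api",  # where it acted
    "human_reviewer": "j.smith",            # human-in-the-loop sign-off
    "approved": True,                       # reviewer decision
}
print(json.dumps(entry, indent=2))
```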