Data science for health discovery and innovation in Africa (DS-I Africa)
Data science and AI are reshaping how Africa studies, prevents, and treats disease. Backed by $88 million over five years from the U.S. National Institutes of Health, DS-I Africa is building technical capacity, creating shared infrastructure, and accelerating applied research across the continent. The program spans research hubs in 22 African countries with partner sites in the United States, an open data platform, and a clear focus on ethical, locally led innovation. It also calls for regional self-reliance as international funding priorities shift.
What DS-I Africa delivers
- Capacity building for African scientists through seven training hubs and six education projects, plus short courses, datathons, and a flagship course, "Introduction to Data Science for Biomedical Science."
- AI-enabled tools in use: diagnostics for colorectal and cervical cancer, real-time disease surveillance, and region-specific predictive models for public health planning.
- Shared infrastructure via the eLwazi Open Data Science Platform, integrating standardized metadata from 116 project datasets to support FAIR, reproducible research.
- Embedded ethics, legal, and social initiatives to ensure privacy, equity, and public trust across data sharing and AI deployment.
Why inclusive data matters
Many AI models miss African contexts because African biomedical and multi-omic data remain underrepresented. The result is poorer performance, baked-in bias, and missed signals in populations with extensive genetic and ecological diversity. DS-I Africa addresses this gap by integrating diverse data sources and opening them for responsible use through the eLwazi platform. Explore the platform at elwazi.org.
Practical examples in care and public health
Projects demonstrate measurable benefits: AI-assisted detection in colorectal cancer, smartphone-enabled cervical screening in Uganda, and predictive models that help policymakers anticipate outbreaks. Multimodal and large language models are being adapted to combine clinical, demographic, and environmental data. Health survey analytics in Rwanda and pandemic monitoring across Africa show how localized data can guide targeted interventions. The common thread is clear: better data and context-aware models improve decisions.
Ethics, law, and public trust: built in from day one
Ethical, legal, and social issues are treated as core work, not an afterthought. DS-I Africa Law, REDSSA, and PUBGEM-Africa support privacy-preserving data use, clarify data ownership and cross-border sharing, and strengthen public engagement. Their inputs have influenced policy updates, including South Africa's Material Transfer Agreement, Open Science Policy, and a national research Code of Conduct.
Researchers also get practical tools: plain-language legal guides across 12 countries and an AI assistant for data governance questions at datalaw.bot. Ongoing efforts focus on consent models, codes of conduct, committee oversight, and aligning data protection and AI governance across jurisdictions.
Interdisciplinary frontiers: climate, air, and epidemics
Teams are linking environmental and health data to study climate-sensitive diseases like malaria, assess air pollution risks for vulnerable groups, and map vulnerability to climate-related disasters. Big-data surveillance supports pandemic monitoring and explains how shifts in global mobility affect the spread of infectious diseases. The outcome is a clearer view of risk and faster, more targeted responses.
People, skills, and infrastructure
Across Africa, at least 122 data science degree programs now operate at 60 institutions. DS-I Africa adds momentum: 18 new degree offerings since 2022 and a pipeline of 157 master's students, 20 doctoral students, 16 postdoctoral researchers, and 51 new faculty, plus about 150 early-career researchers in the consortium. Short-format training (webinars, datathons, and targeted courses) helps teams upskill without stepping away from ongoing projects. The Coordinating Centre supports early-career researchers who face funding shortfalls, keeping talent in the system.
On the infrastructure side, research hubs blend local expertise with global methods and shared platforms. eLwazi supports FAIR data sharing so teams can analyze high-quality, diverse datasets without starting from scratch each time. For teams building similar capacity, see resources in AI for Science & Research.
What researchers can do now
- Contribute datasets to shared platforms and publish complete metadata to improve reuse and model generalization.
- Audit models on local cohorts and stress-test for bias across demographics and settings.
- Engage ELSI experts early; define consent, data access, benefit sharing, and oversight plans before collection starts.
- Adopt FAIR practices and versioned workflows for transparent, reproducible pipelines.
- Build regional collaborations that keep expertise and value creation within African institutions.
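As one illustration of the model-audit step in the list above, the sketch below compares a classifier's sensitivity and specificity across subgroups of a local cohort. It is a minimal example under stated assumptions, not a DS-I Africa tool: the record fields (`site`, `label`, `pred`), the function names, and the sample data are all hypothetical and synthetic.

```python
from collections import defaultdict


def audit_by_group(records, group_key):
    """Return per-group sensitivity and specificity for binary predictions.

    Each record is a dict holding a true `label` (0/1), a model `pred` (0/1),
    and a grouping field such as site, sex, or age band (names are hypothetical).
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for r in records:
        c = counts[r[group_key]]
        if r["label"] == 1:
            c["tp" if r["pred"] == 1 else "fn"] += 1
        else:
            c["tn" if r["pred"] == 0 else "fp"] += 1

    report = {}
    for group, c in counts.items():
        pos = c["tp"] + c["fn"]
        neg = c["tn"] + c["fp"]
        report[group] = {
            "n": pos + neg,
            # None marks a metric that is undefined for this group.
            "sensitivity": c["tp"] / pos if pos else None,
            "specificity": c["tn"] / neg if neg else None,
        }
    return report


def sensitivity_gap(report):
    """Largest difference in sensitivity between any two groups."""
    vals = [m["sensitivity"] for m in report.values() if m["sensitivity"] is not None]
    return max(vals) - min(vals) if vals else None


# Synthetic example data, for illustration only.
records = [
    {"site": "urban", "label": 1, "pred": 1},
    {"site": "urban", "label": 1, "pred": 1},
    {"site": "urban", "label": 0, "pred": 0},
    {"site": "urban", "label": 0, "pred": 1},
    {"site": "rural", "label": 1, "pred": 1},
    {"site": "rural", "label": 1, "pred": 0},
    {"site": "rural", "label": 0, "pred": 0},
    {"site": "rural", "label": 0, "pred": 0},
]

report = audit_by_group(records, "site")
gap = sensitivity_gap(report)
```

A large gap between groups is a signal to investigate further (sample size, data quality, or genuine model bias) rather than a verdict on its own; real audits would also report confidence intervals and use cohorts far larger than this toy sample.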
Outlook
DS-I Africa has established research hubs, launched AI-guided health solutions, and opened access to well-annotated datasets, all while embedding sound governance. The next phase tackles persistent gaps: limited multi-ethnic genomic and multi-omic data, uneven infrastructure, and the need to align AI and data protection rules across countries. With continued collaboration and local leadership, the initiative is positioned to deliver durable gains in healthcare delivery, surveillance, and equitable outcomes for African populations. The message is simple: invest in data quality, capacity, and trust, and the science moves faster, with impact that lasts.