Use AI Responsibly to Bolster Open Science
AI is changing how research gets done. That's good for speed and scale, but it also exposes cracks in how we share data, code and findings. Speakers at a recent Young European Research Universities Network (Yerun) webinar on 21 October made a simple point: open practices and AI are now intertwined, so our standards need an update.
The goal isn't to slow progress. It's to keep the benefits of openness (scrutiny, reuse, reproducibility) while reducing avoidable harm.
Why open science needs a rethink
Open science has accelerated collaboration. But pairing open datasets, models and code with powerful AI creates new risks that weren't as pressing a few years ago.
- Privacy and consent: shared datasets can enable re-identification, even after de-identification.
- Dual use: openly released models, prompts or pipelines can be repurposed for harmful outcomes.
- Attribution gaps: AI-generated or AI-assisted content blurs authorship, credit and accountability.
- Reproducibility drift: fast-moving model updates and opaque prompts make results hard to reproduce.
- Provenance loss: once content is remixed by AI, it's easy to lose track of origins, licenses and terms.
Practical safeguards you can adopt now
- Tier your openness: define public, controlled and restricted access for datasets, code, and models. Use data use agreements and access committees where needed.
- License with intent: pick licenses that address AI training and redistribution. State allowed and prohibited uses in plain language.
- Protect participants: require consent that covers AI-related reuse; apply statistical disclosure control and, where suitable, differential privacy for sensitive data (a minimal sketch follows this list).
- Document everything: publish data cards for datasets, model cards for models, and method notes (including prompts, seed values, and versions). Pin dependencies and record checksums (see the manifest sketch after this list).
- Make it reproducible: ship containers or environment files; include scripts to rerun analyses end-to-end on sample or synthetic data.
- Disclose AI involvement: specify where AI assisted (data cleaning, drafting, translation, analysis), which tools were used, and who verified outputs.
- Evaluate for failure modes: bias, leakage, hallucination, brittleness under distribution shift, and dual-use potential. Red-team before release.
- Gate sensitive assets: time-limited embargoes, usage logs, API rate limits, watermarking or cryptographic provenance (e.g., C2PA) where appropriate.
- Credit data and software: cite datasets and code with DOIs; use ORCIDs and the CRediT taxonomy for roles.
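To make the participant-protection point concrete, here is a minimal sketch of the Laplace mechanism for a simple count query. The epsilon value and the toy data are illustrative assumptions, not recommendations from the webinar; for real releases, use a vetted differential-privacy library rather than rolling your own.

```python
# Minimal sketch of the Laplace mechanism for a count query.
# epsilon and the toy data are illustrative; production use should rely on
# a vetted library (e.g. OpenDP) rather than hand-rolled noise.
import numpy as np

def dp_count(values, threshold, epsilon=1.0, rng=None):
    """Release a differentially private count of values above a threshold.

    A count query has sensitivity 1 (adding or removing one participant
    changes the count by at most 1), so Laplace noise with scale 1/epsilon
    gives epsilon-differential privacy for this single release.
    """
    rng = rng or np.random.default_rng()
    true_count = int(np.sum(np.asarray(values) > threshold))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Toy example: noisy count of participants over 65 in a synthetic age column.
ages = [34, 71, 58, 66, 29, 80, 45]
print(dp_count(ages, threshold=65, epsilon=0.5))
```

Note that every additional query spends more of the privacy budget, which is exactly why release decisions belong in the data management plan rather than in ad hoc scripts.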
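For the documentation bullet, a lightweight way to pin versions and record checksums is to generate a small manifest at release time and ship it with the data and code. The sketch below uses placeholder file paths and package names; swap in the files and dependencies your project actually releases.

```python
# Minimal sketch of a release manifest: SHA-256 checksums for shipped files
# plus the exact package versions used. File paths and package names are
# placeholders for whatever your project actually releases.
import hashlib
import json
import sys
from importlib import metadata
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(files, packages):
    """Collect interpreter version, pinned package versions, and file checksums."""
    return {
        "python": sys.version,
        "packages": {name: metadata.version(name) for name in packages},
        "checksums": {str(p): sha256_of(p) for p in map(Path, files)},
    }

if __name__ == "__main__":
    manifest = build_manifest(
        files=["data/sample.csv", "results/figure1.png"],  # placeholder paths
        packages=["numpy", "pandas"],                       # packages you pin
    )
    Path("MANIFEST.json").write_text(json.dumps(manifest, indent=2))
```

Reusers can then verify downloads against the recorded digests before rerunning anything, which catches silent corruption and version drift early.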
Update policies and governance
Institutional guardrails should match current practice, not last decade's workflows. Refresh your policies to reflect AI-assisted research.
- Data management plans: add AI-specific sections on consent, training, model sharing, evaluation, and access control.
- Ethics/IRB: review AI data flows, third-party tools, and cross-border processing; define acceptable uses and reporting duties.
- Procurement and tools: whitelist approved AI services; require security, privacy, and uptime guarantees; set logging and retention baselines.
- Incident response: define how to report and handle leakage, misuse, or model failures that affect participants or the public.
- Compute and sustainability: report training/inference budgets, energy use where feasible, and plan for cost control.
Align with established guidance
Don't start from scratch. Map your lab or institution's practices to recognized frameworks and recommendations.
- NIST AI Risk Management Framework for risk identification, measurement and mitigation.
- UNESCO Recommendation on Open Science for principles on accessibility, transparency and equity.
Build skills across your team
AI literacy is now basic research hygiene. Train researchers, librarians and data stewards to use AI safely, evaluate outputs, and document AI contributions with discipline-appropriate standards.
If your group needs structured upskilling, consider role-based programs that focus on reproducibility, data governance and prompt practice for research teams.
90-day action plan for research leaders
- Publish a short "AI in Research" policy: disclosure, acceptable use, data tiers, and model release criteria.
- Standardize documentation: add data/model cards and AI disclosure sections to your lab template (a minimal disclosure stub follows this list).
- Retrofit 1-2 flagship projects with containers, pinned versions and reproducible scripts.
- Set up a review step for dual-use and privacy risks before releasing datasets, models or prompts.
- Run a half-day training on evaluation, bias checks and prompt record-keeping.
- Create a lightweight incident mailbox and escalation path.
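For the documentation step, an AI disclosure can be as simple as a machine-readable record committed alongside the manuscript or dataset. The field names below are illustrative assumptions, not a formal standard; adapt them to your lab template and discipline.

```python
# Minimal sketch of a machine-readable AI-disclosure record for a release.
# Field names and values are illustrative placeholders, not a formal standard.
import json
from datetime import date

disclosure = {
    "project": "example-project",               # placeholder identifier
    "date": date.today().isoformat(),
    "ai_assistance": [
        {
            "task": "data cleaning",
            "tool": "example LLM service",      # record the actual tool
            "tool_version": "unknown",          # and its version, if known
            "human_verifier": "ORCID:0000-0000-0000-0000",  # placeholder ORCID
            "prompts_archived_at": "prompts/cleaning/",      # placeholder path
        }
    ],
    "training_data_reuse_permitted": False,     # mirror your license terms
}

with open("AI_DISCLOSURE.json", "w") as fh:
    json.dump(disclosure, fh, indent=2)
```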
Bottom line
AI can strengthen open science if we share with intent, document what we did, and set guardrails before release. Aim for openness that is safe for participants, clear for reusers and reproducible for peers. That's how we keep trust high while moving fast.