AI Redefines "Me": Paired with AI, Anyone Can Become a Scientist
At WISE2025 in Beijing's 798 Art District, the conversation around AI for Science shifted from hype to workflow. The idea is blunt: treat scientific research as production. If a process has inputs, steps, and outputs, it can be systematized and scaled.
Sun Weijie of DP Technology, a company that has been building toward AI for Science since 2018 alongside early academic advocates, made the case plainly: build the infrastructure, wire it to the lab, and let models and robots run the loop until results compound.
What "AI for Science" means in practice
AI for Science uses models, data, and automation to assist - and eventually automate - scientific discovery. The long-term vision is an "AI scientist" capable of proposing hypotheses, running simulations, planning experiments, executing them, and learning from outcomes.
The near-term prize is simpler: compress discovery cycles from years to weeks, and produce high-value results in batches.
From craft to pipeline: read - calculate - do
- Read: A living scientific knowledge base that indexes literature, data, patents, methods, and tacit know-how.
- Calculate: Foundation models and physics-informed surrogates for molecules, materials, and genes, backed by HPC and hybrid simulation.
- Do: Automated labs with robots, instruments, and LIMS/ELN, wired into a closed loop for planning, execution, and measurement.
This stack doesn't create headlines like a single-stage breakthrough. It builds roads. Once the roads exist, discoveries multiply downstream.
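The read-calculate-do split can be sketched as three composable interfaces. This is a toy illustration, not DP Technology's architecture: class and method names are hypothetical, keyword overlap stands in for real embedding retrieval, and the "model" and "lab" are stubs.

```python
class ReadLayer:
    """Knowledge base: index documents, retrieve context for a query."""
    def __init__(self):
        self.docs = []

    def index(self, doc):
        self.docs.append(doc)

    def retrieve(self, query, k=3):
        # Keyword overlap stands in for embedding-based retrieval.
        words = set(query.lower().split())
        return sorted(self.docs,
                      key=lambda d: -len(words & set(d.lower().split())))[:k]


class CalculateLayer:
    """Surrogate model: score a candidate's target property in silico."""
    def predict(self, candidate):
        return len(candidate) * 0.1  # placeholder for a trained surrogate


class DoLayer:
    """Automated lab: run the confirmatory experiment, return a measurement."""
    def run(self, candidate, predicted):
        return predicted  # stub: a real LIMS/robot call goes here


def discovery_loop(read, calc, do, query, candidates):
    context = read.retrieve(query)                               # Read
    ranked = sorted(candidates, key=calc.predict, reverse=True)  # Calculate
    top = ranked[0]
    measurement = do.run(top, calc.predict(top))                 # Do
    return context, top, measurement
```

The point of the interfaces is the seam between them: each layer can be swapped out (better retriever, better surrogate, real robots) without touching the loop.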
The Edison test, re-run with AI
Edison reportedly tested thousands of filaments over years. Today, you'd search global literature for candidates, run large-scale in silico screening, rank by predicted properties, and push the top set to an automated lab for tens to hundreds of confirmatory trials.
The feedback retrains your models. Weeks instead of years, and you log every step for reuse.
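The screen-and-confirm step reduces to ranking candidates by a predicted property and keeping the top k for the lab. A minimal sketch, with a toy predictor standing in for a real surrogate model:

```python
def screen(candidates, predict, top_k=10):
    """Rank candidates by a predicted property and keep the best
    top_k for confirmatory lab trials."""
    scored = [(name, predict(name)) for name in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy predictor: string length stands in for a predicted property score.
filaments = ["carbonized bamboo", "platinum", "tungsten", "carbon thread"]
shortlist = screen(filaments, predict=len, top_k=2)
```

In practice `predict` would be a fine-tuned property model, `candidates` would number in the millions, and the shortlist would feed the automated lab directly.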
Why this matters for working scientists
- Turn wishlist properties (drug efficacy, thermal stability, ionic conductivity) into pipelines that deliver candidates on schedule.
- Lower the barrier to serious work. With the right stack, a small team can explore large design spaces and ship credible results.
- Direct more cycles at grand problems - aging, energy density, deep-space constraints - by upgrading materials, drugs, and process insight.
China's position: clear edges to build on
- Talent density is high across AI and basic science.
- Physical integration is stronger: complete supply chains for instruments, chemicals, bioprocess, and lab buildouts.
- Policy tailwinds push for high-level self-reliance, which forces upgrades across data, software, instruments, and national facilities.
Startups vs. giants: where new winners appear
Big balance sheets help, but speed wins the cold start. The flywheel is simple: better product → more users → richer data → better product. The "competition" is usually a small internal team at a giant, not the whole company.
Execution depth in a narrow domain often beats generic muscle.
Where the money is (and already has been)
- Scientific databases: structured knowledge, ontologies, and RAG-ready corpora.
- Scientific software: modeling, simulation, and workflow orchestration.
- Scientific instruments: AI-native hardware, APIs, and closed-loop control.
- CRO / outsourced R&D: faster, cheaper, traceable discovery-as-a-service.
Global R&D spend is measured in trillions. Tie an "AI brain" to these workflows and you're selling time, certainty, and IP - not slides.
Playbook: build your AI-for-Science stack in 90 days
- Define the decision: target property space, constraints, and a hard success metric. Write the kill criteria.
- Read layer: assemble a domain corpus (papers, patents, lab notes), standardize units, tag with ontologies, and enable retrieval-augmented generation for hypotheses and methods.
- Modeling: pick a foundation model or baseline surrogate, fine-tune with your data, and add uncertainty estimation. Start with a simple multi-objective optimizer.
- Do layer: a minimal self-driving loop (DOE → plan → run → measure → learn). Start with one instrument chain you fully control.
- Data backbone: LIMS/ELN with provenance, versioned datasets, and FAIR principles. No orphaned CSVs.
- Governance: safety checks, experiment simulation/sandboxing, and reproducibility reports auto-generated per batch.
- ROI tracking: cycle time per hypothesis, unit cost per validated hit, and percent of experiments auto-planned vs. manual.
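The plan → run → measure → learn loop from the Do layer can be sketched end to end. Everything here is a stand-in: `lab_measure` fakes the instrument chain with a hidden response surface, a nearest-neighbor surrogate with distance-based uncertainty replaces a trained model, and the acquisition rule is a crude upper-confidence bound (predicted value plus uncertainty):

```python
def lab_measure(x):
    """Stand-in for the automated lab: a hidden response surface
    with its optimum at x = 3.0."""
    return -(x - 3.0) ** 2 + 9.0


def predict(x, data):
    """Nearest-neighbor surrogate: mean = nearest measurement,
    uncertainty = distance to the nearest measured setting."""
    nearest_x, nearest_y = min(data, key=lambda p: abs(p[0] - x))
    return nearest_y, abs(nearest_x - x)


data = [(0.0, lab_measure(0.0)), (6.0, lab_measure(6.0))]  # seed experiments
grid = [i * 0.5 for i in range(13)]                        # candidate settings 0..6

for _ in range(8):  # DOE -> plan -> run -> measure -> learn
    # Acquisition: exploit high predicted value, explore uncertain regions.
    x = max(grid, key=lambda g: sum(predict(g, data)))
    data.append((x, lab_measure(x)))

best_x, best_y = max(data, key=lambda p: p[1])
```

Even this toy loop finds the optimum in a handful of "experiments" because uncertainty pulls it toward unmeasured regions first; a real stack swaps in Gaussian processes or ensembles for uncertainty and a multi-objective optimizer for the acquisition step.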
Proof points worth studying
AlphaFold showed that learned models can crack core prediction problems and seed new pipelines. For an overview, see DeepMind's summary of AlphaFold's scientific impact.
If you're formalizing your data layer, anchor it to the FAIR Guidelines as a baseline for provenance and reuse. The original paper is here: FAIR Guiding Principles.
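As a concrete starting point for that data layer, a dataset record can carry FAIR-style metadata: a persistent identifier (findable), a location (accessible), a schema contract (interoperable), and a license plus provenance chain (reusable). The field names below are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DatasetRecord:
    """Illustrative FAIR-style metadata for one versioned dataset."""
    dataset_id: str            # persistent identifier (findable)
    uri: str                   # where the bytes live (accessible)
    schema: str                # ontology / column contract (interoperable)
    license: str               # reuse terms (reusable)
    version: int
    derived_from: tuple = ()   # parent dataset ids: the provenance chain
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())


raw = DatasetRecord("ds-0001", "s3://lab/raw.parquet",
                    "assay.v1", "CC-BY-4.0", 1)
clean = DatasetRecord("ds-0002", "s3://lab/clean.parquet",
                      "assay.v1", "CC-BY-4.0", 1,
                      derived_from=(raw.dataset_id,))
```

Because every derived dataset names its parents, any result can be traced back to the raw measurements that produced it, which is the "no orphaned CSVs" rule made executable.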
What to watch in 2025
- Embodied intelligence linking models to instruments and robots.
- Foundation models specialized for chemistry, materials, and biology that play nicely with physics constraints.
- Open protocols for self-driving labs across vendors and institutes.
The takeaway from WISE2025 is straightforward. Treat science like production, build the read-calculate-do stack, and let the loop run. The labs that ship faster feedback will own the next decade of discovery.