AI-Engineered Proteins Achieve Breakthrough in Stem Cell Reprogramming and Cellular Rejuvenation
OpenAI and Retro Biosciences used AI to boost stem cell reprogramming markers 50-fold by optimizing Yamanaka factors. Enhanced variants improved pluripotency and DNA repair, aiding cell rejuvenation.

Accelerating Life Sciences Research with AI
OpenAI and Retro Biosciences report a 50-fold increase in the expression of stem cell reprogramming markers using an AI-driven approach. The work focuses on enhancing the Yamanaka factors, a set of proteins essential for creating induced pluripotent stem cells (iPSCs), which have broad applications in regenerative medicine.
Cells reprogrammed with the redesigned proteins not only express these markers more strongly but also repair DNA damage more effectively, which suggests greater rejuvenation potential. The results were validated across multiple donors, cell types, and delivery methods, and the derived iPSC lines showed full pluripotency and genomic stability.
Developing GPT-4b Micro: An AI Model for Protein Engineering
To explore AI’s role in life sciences, OpenAI built a specialized model named GPT-4b micro. This compact version of GPT-4o was trained on a dataset combining protein sequences, biological text, and tokenized 3D structural data, information that typical protein language models often lack.
This added context allows the model to generate protein sequences with specific desired properties, including sequences with intrinsically disordered regions, which are common in proteins like the Yamanaka factors. The model also accepts prompts of up to 64,000 tokens, a length described as unprecedented in protein modeling, which improves control and output quality.
AI-Assisted Redesign of SOX2 and KLF4 to Improve Reprogramming Efficiency
The Yamanaka factors (OCT4, SOX2, KLF4, and MYC) are fundamental to reprogramming adult cells into stem cells, but the process is inefficient: often fewer than 0.1% of treated cells convert. Optimizing these proteins is challenging because of their size and the astronomical number of possible variants.
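To give a rough sense of what "astronomical" means here, the sketch below counts the sequences available to a protein of a given length. The length of 317 residues used for human SOX2 is an assumption that does not come from this article, and the function names are purely illustrative.

```python
import math

AMINO_ACIDS = 20  # size of the standard amino acid alphabet


def log10_sequence_space(length: int) -> float:
    """Return log10 of the number of distinct sequences of this length."""
    return length * math.log10(AMINO_ACIDS)


def variants_with_k_substitutions(length: int, k: int) -> int:
    """Count sequences that differ from a reference at exactly k positions."""
    # Choose which k positions change, then one of 19 alternatives at each.
    return math.comb(length, k) * (AMINO_ACIDS - 1) ** k


if __name__ == "__main__":
    sox2_length = 317  # assumed length of human SOX2, not given in the article
    exponent = log10_sequence_space(sox2_length)
    print(f"All sequences of this length: about 10^{exponent:.0f}")
    for k in (1, 2, 3):
        count = variants_with_k_substitutions(sox2_length, k)
        print(f"Exactly {k} substitution(s): {count:,}")
```

Even a handful of simultaneous substitutions yields more candidates than a wet-lab screen can test exhaustively, which is the motivation for using a generative model to propose a small set of promising variants.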
Using GPT-4b micro, Retro Biosciences generated novel "RetroSOX" and "RetroKLF" variants. Over 30% of RetroSOX variants outperformed wild-type SOX2, even though they differed from the wild-type sequence by more than 100 amino acids on average. For KLF4, nearly half of the AI-generated variants surpassed the previous best performers, a significant improvement over traditional methods.
Combining the top RetroSOX and RetroKLF variants dramatically increased expression of both early and late pluripotency markers in fibroblasts. These markers appeared sooner and more robustly than with the original Yamanaka factors, and alkaline phosphatase staining, an indicator of pluripotency, confirmed the result.
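The two kinds of figures quoted here, the share of variants that beat wild type and the number of amino acid differences from it, could be computed along the following lines. This is a minimal sketch rather than the teams' actual analysis: the sequences and scores are made up for illustration, and `edit_distance` and `hit_rate` are hypothetical helper names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest substitutions, insertions, and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def hit_rate(variant_scores: list[float], wild_type_score: float) -> float:
    """Fraction of variants scoring strictly above the wild-type baseline."""
    if not variant_scores:
        raise ValueError("need at least one variant score")
    hits = sum(score > wild_type_score for score in variant_scores)
    return hits / len(variant_scores)


if __name__ == "__main__":
    # Toy peptide fragments, not real SOX2 or KLF4 sequences.
    wild_type = "MKTAYIAK"
    variant = "MKSAYAK"
    print("Amino acid differences:", edit_distance(wild_type, variant))

    # Made-up marker-expression scores, normalized so wild type equals 1.0.
    scores = [0.8, 1.4, 1.1, 0.9, 2.0]
    print(f"Hit rate: {hit_rate(scores, 1.0):.0%}")
```

Edit distance is used instead of a position-by-position comparison because generated variants need not have the same length as the wild-type protein.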
Further testing with mRNA delivery in mesenchymal stromal cells from older donors showed that more than 30% of cells expressed pluripotency markers within 7 days, with over 85% activating critical stem cell genes by day 12. The resulting iPSCs demonstrated the ability to differentiate into all three germ layers and maintained genomic stability, surpassing conventional benchmarks.
Enhanced DNA Damage Repair and Rejuvenation Potential
The AI-designed variants also improved DNA damage repair, a key factor in cellular aging. Cells treated with the RetroSOX/KLF cocktail exhibited reduced γ-H2AX intensity, indicating fewer double-strand breaks compared to those treated with standard Yamanaka factors. This reduction suggests stronger rejuvenation capabilities and potential for future therapeutic applications.
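As a small illustration of how such a comparison might be quantified, the sketch below computes the relative drop in mean γ-H2AX intensity between two groups of cells. The intensity values are invented placeholders in arbitrary units, not data from the study, and `relative_reduction` is a hypothetical helper.

```python
from statistics import mean


def relative_reduction(control: list[float], treated: list[float]) -> float:
    """Fractional drop in mean intensity of treated cells versus control."""
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return (control_mean - mean(treated)) / control_mean


if __name__ == "__main__":
    # Invented per-cell intensities (arbitrary units), for illustration only.
    standard_factors = [120.0, 95.0, 110.0, 105.0]
    retro_cocktail = [80.0, 70.0, 90.0, 60.0]
    drop = relative_reduction(standard_factors, retro_cocktail)
    print(f"Relative reduction in mean intensity: {drop:.1%}")
```

A real analysis would also need replicate experiments and a statistical test before drawing conclusions from a difference in means.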
Looking Ahead
This project illustrates how a domain-specific AI model can rapidly produce meaningful improvements in complex biological systems. By combining AI with expert scientific insight, advancements that might take years can be achieved in a matter of days.
As AI tools continue to evolve, they offer promising avenues for accelerating research and development in life sciences, enabling new therapeutic strategies and insights.