Generative AI Model Tracks Electrons to Predict Chemical Reactions with Greater Accuracy
MIT's FlowER model improves chemical reaction predictions by tracking electron flow and conserving mass. Its open-source approach enhances accuracy and reliability in AI-driven chemistry.

A New Generative AI Approach to Predicting Chemical Reactions
Predicting the outcomes of chemical reactions has long challenged researchers, especially when using artificial intelligence and large language models (LLMs). Many previous attempts lacked grounding in fundamental physical laws, such as conservation of mass, limiting their accuracy and reliability. A team at MIT has developed a new model that incorporates these physical constraints, resulting in significantly improved prediction quality.
The system, called FlowER (Flow matching for Electron Redistribution), enables explicit tracking of all electrons involved in a reaction. This ensures that electrons are neither spuriously created nor destroyed during prediction, addressing a key limitation in earlier AI models.
Why Conservation Matters in Reaction Prediction
Joonyoung Joung, a recent MIT postdoc and now an assistant professor, explains that knowing the likely products of a reaction is essential for tasks like drug development. Most AI approaches only compare inputs and final products without considering the intermediate steps or enforcing mass conservation. This can lead to predictions that violate physical laws, such as creating or deleting atoms artificially.
LLMs like ChatGPT operate using computational “tokens” that can represent atoms, but without constraints, these tokens may be added or removed incorrectly. Joung points out that without grounding in the chemistry, such models produce results akin to “alchemy.” The FlowER system avoids this by tracking the transformation of chemicals throughout the reaction process, not just the start and end points.
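To make the conservation point concrete, here is a minimal sketch of the kind of atom-balance check that an unconstrained model can fail. It is not part of FlowER. The helper names (`atom_counts`, `side_counts`, `conserves_mass`) are hypothetical, and the parser assumes simple formulas with no parentheses or charges.

```python
import re
from collections import Counter

def atom_counts(formula):
    """Count atoms in a simple formula such as 'C2H6O' (no parentheses or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def side_counts(formulas):
    """Total atom counts over one side of a reaction (list each molecule once per copy)."""
    total = Counter()
    for formula in formulas:
        total.update(atom_counts(formula))
    return total

def conserves_mass(reactants, products):
    """True when both sides contain exactly the same atoms."""
    return side_counts(reactants) == side_counts(products)

# Methane combustion, written out with explicit copies of each molecule.
balanced = conserves_mass(["CH4", "O2", "O2"], ["CO2", "H2O", "H2O"])
# A hypothetical faulty prediction that loses two hydrogens and gains an oxygen.
unbalanced = conserves_mass(["CH4", "O2"], ["CO2", "H2O"])
print(balanced, unbalanced)
```

A check like this can only flag a bad prediction after the fact. The approach described in the article instead builds the constraint into the representation the model works with.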
How FlowER Works
The team built their approach on a method from the 1970s by chemist Ivar Ugi, who introduced a bond-electron matrix to represent electrons in reactions. FlowER uses this matrix to represent bonds and lone electron pairs with nonzero values, and zeros where no electrons are present. This method allows simultaneous conservation of atoms and electrons during prediction.
Mun Hong Fong, a former MIT software engineer now at Duke University, highlights that this electron-based representation is key to embedding mass conservation into the model. By explicitly accounting for electron redistribution, FlowER maintains chemical realism in its predictions.
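The bond-electron matrix idea can be illustrated with a small sketch. It assumes the classic convention in which each diagonal entry holds an atom's free (non-bonding) valence electrons and each off-diagonal entry holds the bond order between two atoms. FlowER's actual encoding may differ in detail, and the function names here are hypothetical.

```python
def total_valence_electrons(be):
    """Sum of all entries: free electrons on the diagonal plus two per unit of bond order."""
    return sum(sum(row) for row in be)

def is_valid_be_matrix(be):
    """A bond-electron matrix must be square, symmetric, and non-negative."""
    n = len(be)
    if any(len(row) != n for row in be):
        return False
    return all(
        be[i][j] == be[j][i] and be[i][j] >= 0
        for i in range(n)
        for j in range(n)
    )

def reaction_matrix(reactant, product):
    """Entry-wise difference (product minus reactant) describing electron redistribution."""
    return [
        [p - r for r, p in zip(r_row, p_row)]
        for r_row, p_row in zip(reactant, product)
    ]

def conserves_electrons(reactant, product):
    """True when both matrices are valid and hold the same number of valence electrons."""
    return (
        is_valid_be_matrix(reactant)
        and is_valid_be_matrix(product)
        and total_valence_electrons(reactant) == total_valence_electrons(product)
    )

# Heterolytic cleavage of H-Cl into H+ and Cl-, with atoms ordered [H, Cl].
hcl = [[0, 1],
       [1, 6]]   # one H-Cl bond, three lone pairs on Cl
ions = [[0, 0],
        [0, 8]]  # bare proton, four lone pairs on chloride

delta = reaction_matrix(hcl, ions)
print(delta)                           # [[0, -1], [-1, 2]]
print(conserves_electrons(hcl, ions))  # True

# A hypothetical faulty prediction in which the bonding electrons simply vanish.
lossy = [[0, 0],
         [0, 6]]
print(conserves_electrons(hcl, lossy))  # False
```

Because the reactant and product matrices share the same atom ordering, atoms are conserved by construction. The entries of the difference matrix sum to zero exactly when no electrons are created or destroyed.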
Performance and Limitations
Connor Coley, the study's senior author and a professor at MIT, describes the system as a proof of concept demonstrating that flow matching is well suited to reaction prediction. Although it was trained on more than a million reactions from a U.S. Patent Office database, FlowER currently lacks coverage of certain metals and catalytic reactions.
Despite these limitations, the system delivers reliable predictions that conserve both mass and electrons. It is openly available on GitHub, allowing researchers to use it as a tool for assessing reactivity and mapping reaction pathways.
Open Source and Future Directions
One distinguishing feature of this work is its open-source nature. Fong emphasizes that all models and datasets, including a comprehensive mechanistic reaction dataset developed by Joung, are publicly accessible. This transparency encourages broader adoption and further development.
FlowER matches or exceeds existing methods in identifying standard mechanistic pathways and can generalize to reaction types it has not encountered before. Potential applications include medicinal chemistry, materials science, combustion, atmospheric chemistry, and electrochemistry.
The team plans to extend the model’s capabilities to include metals and catalytic cycles, which are currently underrepresented. Coley notes that while the system is still in early stages, the approach holds promise for discovering new complex reactions and elucidating mechanisms supported by experimental data.
Conclusion
FlowER represents an important step forward in AI-based chemical reaction prediction by incorporating physical laws directly into the generative model. By tracking electrons and enforcing mass conservation, it improves the accuracy and credibility of predictions, moving beyond purely data-driven guesses.
For scientists and researchers working on reaction mechanisms or new molecule synthesis, FlowER offers a practical and accessible tool. Its open-source availability invites collaboration and further innovation in computational chemistry.