Monday, October 9, 2023

A Periodic Table for Molecules: Using Artificial Intelligence for Chemical Analysis

ChatGPT works by finding patterns in large data sets, namely the text of books, websites and so on. The same principle could be applied to chemistry: given a large enough data set, an AI program could propose molecules with desired properties that have yet to be discovered or synthesized.

Just as the periodic table enabled scientists to predict the existence and properties of elements before they were discovered, using AI to analyze a large chemistry database would allow chemists to find desired molecules with greater efficiency. 

The Beilstein database has information on millions of chemical compounds. Imagine what could be discovered if an AI program searched through it all. A program could do that, whereas no group of scientists could ever work through that much material by hand.

It is true that computers have been used in similar ways for years, but to the best of my knowledge, never on the scale that I am proposing. 

The first stage would be to correlate data such as molecular weight, IUPAC name, known uses, etc. For example, the user could ask the program to return a chemical that contains a certain functional group (methyl for instance) that can be used for some purpose or has some property. The program would then return the names of existing chemicals of that nature as well as an educated guess on the IUPAC name of a compound with the desired properties. It's sort of like the way you can ask ChatGPT to write a story or poem about a given topic. 
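To make the lookup half of that first stage concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Compound` record, the `find_compounds` function and the four-entry list are hypothetical stand-ins, not the Beilstein schema or any real database API. It only covers returning existing chemicals; the "educated guess" at a new compound would need a trained model and is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record; a real database would hold far more fields."""
    iupac_name: str
    molecular_weight: float          # g/mol
    functional_groups: frozenset     # e.g. {"methyl", "hydroxyl"}
    known_uses: frozenset            # e.g. {"solvent"}


# Tiny hand-written stand-in for a database of millions of compounds.
COMPOUNDS = [
    Compound("ethanol", 46.07,
             frozenset({"methyl", "hydroxyl"}), frozenset({"solvent", "fuel"})),
    Compound("propan-2-one", 58.08,
             frozenset({"methyl", "carbonyl"}), frozenset({"solvent"})),
    Compound("methanoic acid", 46.03,
             frozenset({"carboxyl"}), frozenset({"preservative"})),
    Compound("dichlorodifluoromethane", 120.91,
             frozenset({"halide"}), frozenset({"refrigerant"})),
]


def find_compounds(compounds, functional_group, use):
    """Return names of compounds that contain the group and have the use."""
    return [
        c.iupac_name
        for c in compounds
        if functional_group in c.functional_groups and use in c.known_uses
    ]


# "Return a chemical that contains a methyl group and can be used as a solvent."
matches = find_compounds(COMPOUNDS, "methyl", "solvent")
print(matches)
```

A plain filter like this is ordinary database work. The AI part of the proposal begins where the filter ends, when the program is asked for a compound that is not in the list yet.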

It would be harder, but not impossible, to train the same system using molecular diagrams. That becomes an image recognition problem, albeit of a simpler variety. With a neural net and a sufficient set of training data, it would be possible for a program to correctly identify chemical compounds from their respective structural formulas. The next step would be to have the AI generate structural formulas based on input such as desired properties or applications.

In the ideal case, a user could ask the program to give the structural formula of a molecule that cures cancer or AIDS, and the output would be something that's never been tried before. Perhaps it would even work. I don't know how much pharmaceutical information is in the Beilstein database, but whatever is lacking would be easy enough to add. 

The World Health Organization has a list of 1200 recommendations for 591 drugs and 103 therapeutic equivalents. Oh, how wonderful it would be if that list were analyzed by AI, though it would be best if the structural and empirical formulas were included with the list. Information gleaned from it could be used to search the Beilstein database and to generate novel structural formulas.

There would be plenty of other things to discover, including better and cheaper plastics, lubricants, solvents, coolants, refrigerants, catalysts and many more of all kinds.

I will conclude this article with a shout-out to Elizabeth Fulhame, who invented the concept of catalysis. In the preface to one of her books published in 1810, she wrote: 

"But censure is perhaps inevitable: for some are so ignorant, that they grow sullen and silent, and are chilled with horror at the sight of anything that bears the semblance of learning, in whatever shape it may appear; and should the spectre appear in the shape of a woman, the pangs which they suffer are truly dismal."

If any such chemical analysis program is invented, I suggest it be named Fulhame. 
