Just a sample of the Echomail archive
Cooperative anarchy at its finest, still active today. Darkrealms is the Zone 1 Hub.
|    EARTH    |    Uhh, that 3rd rock from the sun?    |    8,931 messages    |
Message 8,780 of 8,931
From: ScienceDaily
To: All
Subject: Learning the language of molecules to pr
Date: 07 Jul 23 22:30:28
MSGID: 1:317/3 64a8e669
PID: hpt/lnx 1.9.0-cur 2019-01-08
TID: hpt/lnx 1.9.0-cur 2019-01-08

Learning the language of molecules to predict their properties

Date: July 7, 2023
Source: Massachusetts Institute of Technology
Summary: A new framework uses machine learning to simultaneously predict molecular properties and generate new molecules using only a small amount of data for training.

==========================================================================
FULL STORY
==========================================================================

Discovering new materials and drugs typically involves a manual, trial-and-error process that can take decades and cost millions of dollars. To streamline this process, scientists often use machine learning to predict molecular properties and narrow down the molecules they need to synthesize and test in the lab.

Researchers from MIT and the MIT-IBM Watson AI Lab have developed a new, unified framework that can simultaneously predict molecular properties and generate new molecules much more efficiently than popular deep-learning approaches.

To teach a machine-learning model to predict a molecule's biological or mechanical properties, researchers must show it millions of labeled molecular structures -- a process known as training. Due to the expense of discovering molecules and the challenges of hand-labeling millions of structures, large training datasets are often hard to come by, which limits the effectiveness of machine-learning approaches.

By contrast, the system created by the MIT researchers can effectively predict molecular properties using only a small amount of data.
Their system has an underlying understanding of the rules that dictate how building blocks combine to produce valid molecules. These rules capture the similarities between molecular structures, which helps the system generate new molecules and predict their properties in a data-efficient manner.

This method outperformed other machine-learning approaches on both small and large datasets, and was able to accurately predict molecular properties and generate viable molecules when given a dataset with fewer than 100 samples.

"Our goal with this project is to use some data-driven methods to speed up the discovery of new molecules, so you can train a model to do the prediction without all of these cost-heavy experiments," says lead author Minghao Guo, an electrical engineering and computer science (EECS) graduate student.

Guo's co-authors include MIT-IBM Watson AI Lab research staff members Veronika Thost, Payel Das, and Jie Chen; recent MIT graduates Samuel Song '23 and Adithya Balachandran '23; and senior author Wojciech Matusik, a professor of electrical engineering and computer science and a member of the MIT-IBM Watson AI Lab, who leads the Computational Design and Fabrication Group within the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the International Conference on Machine Learning.

Learning the language of molecules

To achieve the best results with machine-learning models, scientists need training datasets with millions of molecules that have similar properties to those they hope to discover. In reality, these domain-specific datasets are usually very small. So, researchers use models that have been pretrained on large datasets of general molecules, which they apply to a much smaller, targeted dataset.
However, because these models haven't acquired much domain-specific knowledge, they tend to perform poorly.

The MIT team took a different approach. They created a machine-learning system that automatically learns the "language" of molecules -- what is known as a molecular grammar -- using only a small, domain-specific dataset. It uses this grammar to construct viable molecules and predict their properties.

In language theory, one generates words, sentences, or paragraphs based on a set of grammar rules. You can think of a molecular grammar the same way. It is a set of production rules that dictate how to generate molecules or polymers by combining atoms and substructures.

Just like a language grammar, which can generate a plethora of sentences using the same rules, one molecular grammar can represent a vast number of molecules.

Molecules with similar structures use the same grammar production rules, and the system learns to understand these similarities.

Since structurally similar molecules often have similar properties, the system uses its underlying knowledge of molecular similarity to predict properties of new molecules more efficiently.

"Once we have this grammar as a representation for all the different molecules, we can use it to boost the process of property prediction," Guo says.

The system learns the production rules for a molecular grammar using reinforcement learning -- a trial-and-error process where the model is rewarded for behavior that gets it closer to achieving a goal.

But because there could be billions of ways to combine atoms and substructures, the process to learn grammar production rules would be too computationally expensive for anything but the tiniest dataset.
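To make the idea of production rules concrete, here is a minimal toy sketch in Python. It is not the researchers' method: it uses plain string rewriting over SMILES-like text rather than a learned grammar, and the rule ids (r1 to r4), the CHAIN and END symbols, and the set-overlap similarity score are all hypothetical choices made only for illustration. It shows how a handful of rules can derive several molecules, and how two molecules can be compared by the rules they share.

```python
# Toy string-rewriting grammar (hypothetical, for illustration only).
# Each rule maps a nonterminal, written in braces, to a replacement string.
RULES = {
    "r1": ("CHAIN", "C{CHAIN}"),  # add a carbon and keep growing the chain
    "r2": ("CHAIN", "C{END}"),    # add a final carbon, then an end group
    "r3": ("END", "O"),           # end group: oxygen (alcohol-like)
    "r4": ("END", "N"),           # end group: nitrogen (amine-like)
}


def derive(rule_ids, start="{CHAIN}"):
    """Apply the given rules in order and return the finished string."""
    text = start
    for rid in rule_ids:
        lhs, rhs = RULES[rid]
        slot = "{" + lhs + "}"
        if slot not in text:
            raise ValueError(f"rule {rid} does not apply to {text!r}")
        text = text.replace(slot, rhs, 1)  # rewrite the first matching slot
    if "{" in text:
        raise ValueError(f"derivation incomplete: {text!r}")
    return text


def rule_similarity(rules_a, rules_b):
    """Jaccard overlap between the sets of rule ids used by two derivations."""
    set_a, set_b = set(rules_a), set(rules_b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty derivations are treated as identical
    return len(set_a & set_b) / len(union)


propanol_rules = ["r1", "r1", "r2", "r3"]
ethylamine_rules = ["r1", "r2", "r4"]

print(derive(propanol_rules))    # CCCO
print(derive(ethylamine_rules))  # CCN
print(rule_similarity(propanol_rules, ethylamine_rules))  # 0.5
```

The two derivations share the chain-building rules r1 and r2 and differ only in the end-group rule, which is why the overlap score is 0.5. A real system would learn its rules from data and use a far richer notion of similarity than this set overlap.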
The researchers decoupled the molecular grammar into two parts. The first part, called a metagrammar, is a general, widely applicable grammar they design manually and give the system at the outset. Then it only needs to learn a much smaller, molecule-specific grammar from the domain dataset. This hierarchical approach speeds up the learning process.

Big results, small datasets

In experiments, the researchers' new system simultaneously generated viable molecules and polymers, and predicted their properties more accurately than several popular machine-learning approaches, even when the domain-specific datasets had only a few hundred samples. Some other methods also required a costly pretraining step that the new system avoids.

The technique was especially effective at predicting physical properties of polymers, such as the glass transition temperature, which is the temperature at which an amorphous polymer changes from a hard, glassy state to a soft, rubbery one. Obtaining this information manually is often extremely costly because the experiments require extremely high temperatures and pressures.

To push their approach further, the researchers cut one training set down by more than half -- to just 94 samples. Their model still achieved results that were on par with methods trained using the entire dataset.

"This grammar-based representation is very powerful. And because the grammar itself is a very general representation, it can be deployed to different kinds of graph-form data. We are trying to identify other applications beyond chemistry or material science," Guo says.

In the future, they also want to extend their current molecular grammar to include the 3D geometry of molecules and polymers, which is key to understanding the interactions between polymer chains.
They are also developing an interface that would show a user the learned grammar production rules and solicit feedback to correct rules that may be wrong, boosting the accuracy of the system.

This work is funded, in part, by the MIT-IBM Watson AI Lab and its member company, Evonik.

Paper: "Hierarchical Grammar-Induced Geometry for Data-Efficient Molecular Property Prediction"
Story Source: Materials provided by Massachusetts Institute of Technology. Original written by Adam Zewe. Note: Content may be edited for style and length.
==========================================================================

Link to news story:
https://www.sciencedaily.com/releases/2023/07/230707153847.htm

--- up 1 year, 18 weeks, 4 days, 10 hours, 50 minutes
 * Origin: -=> Castle Rock BBS <=- Now Husky HPT Powered! (1:317/3)
SEEN-BY: 15/0 106/201 114/705 123/120 153/7715 218/700 226/30 227/114
SEEN-BY: 229/110 112 113 307 317 400 426 428 470 664 700 291/111 292/854
SEEN-BY: 298/25 305/3 317/3 320/219 396/45 5075/35
PATH: 317/3 229/426
(c) 1994, bbs@darkrealms.ca