Machine Learning for the Materials Scientist












NGDM, October 10, 2007
Machine Learning for the Materials Scientist
Chris Fischer*, Kevin Tibbetts, Gerbrand Ceder
Massachusetts Institute of Technology, Cambridge, MA
Dane Morgan
University of Wisconsin, Madison, WI

Motivation: materials design through calculation
• Run-time: polynomial scaling with number of atoms, ranging from $O(N^3)$ to $O(N)$ (Skylaris, C. et al. J. Chem. Phys. 122, 084119 (2005))
• Computing power: exponential scaling with time (Moore, G. ISSCC 2003 slides, http://www.intel.com)

DFT as a predictive tool
• Burkett, T. et al. Phys. Rev. Lett. 93 (2004)
• Norskov, J. et al. MRS Bulletin 31 (2006)
• Marzari, N. MRS Bulletin 31 (2006), courtesy of M. Lazzeri, Paris VI Jussieu
• Marzari, N. MRS Bulletin 31 (2006), courtesy of D. Scherlis, MIT

Computational materials design strategies
Calculating properties of realistic nanostructures ab initio
• Galli, G. University of California, Davis
• Lee, Y. S. et al. PRL 95, 076804 (2005)

Computational materials design strategies
Which combinations yield the optimal material?

Outline
• Machine learning in Computational Materials Design
• Searching for Structure: combining historical information with Density Functional Theory
• Data Mining the High-Throughput engine
• Wrap-up


Motivation: searching for new materials (materials by design)

for i in (relevant chemistries) {
    ...
    getStablePhases(i);      // Machine Learning needed here !!
    ...
    calculateProperty(i);    // depends on which phases are stable and their structure
    i = nextChemistry();
}
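The loop on this slide can be sketched as runnable Python. This is a minimal illustration only: `screen_chemistries`, `toy_phases`, and `toy_property` are hypothetical names, and the toy stand-ins replace the structure search and DFT property calculation that the talk is actually about.

```python
from typing import Callable, Dict, Iterable, List


def screen_chemistries(
    chemistries: Iterable[str],
    get_stable_phases: Callable[[str], List[str]],
    calculate_property: Callable[[str, List[str]], float],
) -> Dict[str, float]:
    """Loop over chemistries; the property step needs the stable phases first."""
    results: Dict[str, float] = {}
    for chemistry in chemistries:
        phases = get_stable_phases(chemistry)  # structure search comes first
        if not phases:
            continue  # no stable phase found, so no property can be computed
        results[chemistry] = calculate_property(chemistry, phases)
    return results


# Toy stand-ins (hypothetical, for illustration only).
def toy_phases(chemistry: str) -> List[str]:
    table = {"A-B": ["AB", "A2B"], "A-C": []}
    return table.get(chemistry, [])


def toy_property(chemistry: str, phases: List[str]) -> float:
    return float(len(phases))  # placeholder "property": number of stable phases


screened = screen_chemistries(["A-B", "A-C"], toy_phases, toy_property)
```

Passing the two steps in as callables keeps the dependency explicit: the property calculation only ever sees phases that the structure step has already returned.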

The need for machine learning
• DFT code → material property predictions
• On its own, the DFT code doesn't know what to calculate next
• Added components: database of computed and experimental results; machine learning framework

Computational Materials Design poised for impact
• 'Commodity' computational resources
• Open source electronic structure software
• ~$200-250k capital investment
• Computing budget: ~50k compounds/year
• ICSD: world's largest database of inorganic crystal structures
  First entry: 1913
  # of entries: 100,243
  # usable compounds: 29,962

The structure search problem

for i in (relevant chemistries) {
    ...
    getStablePhases(i);
    ...
    calculateProperty(i);    // depends on which phases are stable and their structure
    i = nextChemistry();
}

Where do we put the atoms if no experimental structure is known?

Strategies to search for structure
• Coordinate search: optimize energy (or free energy) directly in the space of atomic coordinates
• Heuristic rules or chemical intuition

Methods to search for structure: coordinate search
Optimize energy (or free energy) directly in the space of atomic coordinates
• # of dimensions = $3N - 3 + \dim(a, b, c, \alpha, \beta, \gamma)$
• $\mathrm{GroundState} \equiv \arg\min_{r_1, r_2, \ldots, r_N} E(r_1, r_2, \ldots, r_N)$
• Complex energy landscape (Doye, J. PRL, 88, 238701, (2002))
Proposed solutions
• Calculate energy of a finite set of structure prototypes
• Use a stochastic optimization procedure (hop from basin to basin), e.g., simulated annealing, genetic algorithms
Knowledge is not transferred across chemistries
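To make "hop from basin to basin" concrete, here is a minimal simulated annealing sketch. It assumes a one-dimensional toy energy function with two basins instead of a real $E(r_1, \ldots, r_N)$; the function `energy`, the cooling schedule, and all parameter values are illustrative assumptions, not taken from the talk.

```python
import math
import random


def energy(x: float) -> float:
    """Toy 1D landscape with two basins; the deeper one lies near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def simulated_annealing(x0, steps=20000, t_start=1.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Metropolis acceptance with a geometric cooling schedule."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


# Start in the shallower basin (x = +1) and let the walker cross the barrier.
best_x, best_e = simulated_annealing(x0=1.0)
```

Note that nothing learned in one run carries over to the next energy function, which is the limitation the slide points out: knowledge is not transferred across chemistries.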

Methods to search for structure: heuristic rules
Use previous experiments to suggest what to calculate
How? Identify a set of simple parameters based on alloy constituents
• 1932: Pauling: electronegativity
• 1935: Laves & Witte: $r_{A,B}$
• 1926, 1936-7: Hume-Rothery, Mott & Jones: $n^{e}_{at}$
• 1976: Miedema: $n^{e}_{ws}$
Plot stable structures in space of parameters
• 1983: Villars ($r_{A,B}$, $n^{e}_{at}$)
• 1986: Pettifor
Heuristic rules
• efficiently code historical knowledge
• provide transfer of knowledge
Can we leverage historical knowledge to intelligently search for structure?

Knowledge base: experimental data
Description of knowledge base: Pauling File, binaries edition (Villars, P. et al. J. of Alloys and Compounds, (2004))
• 1335 binary alloys
• 3975 non-unique compounds
• 4263 compounds total
• Alloys not containing elements: He, B, C, N, O, F, Ne, Si, P, S, Cl, Ar, As, Se, Br, Kr, Te, I, Xe, At, Rn

Machine learning framework: concepts
• Low temperature state of alloy: $x = (x_A, x_0, \ldots, x_{1/2}, \ldots, x_B)$
• Database of N binary alloys: $\mathrm{Data} \equiv \{x_1, \ldots, x_N\}$
• $p(x)$: probability of low temperature state (fitted to data)
• $p(x \mid e)$: probability of low temperature state conditioned on evidence 'e'

How to use the machine learning framework
• Diagram: DFT code, material property predictions, database of computed and experimental results, machine learning framework
• The machine learning framework evaluates $p(x \mid e)$ and outputs a set of likely structure candidates
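As a rough illustration of turning $p(x \mid e)$ into a ranked candidate list, the sketch below estimates the conditional probability by plain counting over systems that match the evidence. This is a deliberately simplified stand-in and not the probabilistic model used in the talk; the function `rank_candidates`, the composition labels, and the toy database entries are all invented for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

System = Dict[str, str]  # composition label -> structure prototype


def rank_candidates(database: List[System], target: str,
                    evidence: System) -> List[Tuple[str, float]]:
    """Rank prototypes at `target` by empirical p(x_target | evidence)."""
    # Keep only systems that agree with every piece of evidence
    # and that report a structure at the target composition.
    matching = [
        s for s in database
        if target in s and all(s.get(k) == v for k, v in evidence.items())
    ]
    counts = Counter(s[target] for s in matching)
    total = sum(counts.values())
    if total == 0:
        return []  # no matching system: nothing to suggest
    return [(proto, n / total) for proto, n in counts.most_common()]


# Invented toy database (not real alloy data).
toy_db = [
    {"AB": "CsCl", "A2B": "MgCu2"},
    {"AB": "CsCl", "A2B": "MgCu2"},
    {"AB": "CsCl", "A2B": "MoSi2"},
    {"AB": "NaCl", "A2B": "MgCu2"},
]
ranked = rank_candidates(toy_db, target="A2B", evidence={"AB": "CsCl"})
```

The ranked list is what would be handed to the DFT code: the top entries are calculated first, and each new result can be added back as further evidence.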

Preliminaries and open questions
• Are probabilities consistent with physical intuition?
• Do probabilities encode the physics of structure stability?

Quantifying correlation in probabilistic framework
Pair cumulant: $g_{ij}(x_i, x_j) = \dfrac{p(x_i, x_j)}{p(x_i)\,p(x_j)}$
• $p(x_i, x_j)$: probability that both structures occur in same system, estimated from database
• $p(x_i)$: probability that only $x_i$ occurs
• Scale for $g^{(2)}(x_i, x_j)$: from 0 up to 1 is anti-correlated, equal to 1 is uncorrelated, above 1 is correlated

Do probabilities embody real physical effects?
How probabilities represent physics of mixing
• Compounds stabilized by "size" effect: Fe3C, MgCu2 (data from Pauling File, Binaries Edition)
• Pair cumulant: $g_{ij}(x_i, x_j) = \dfrac{p(x_i, x_j)}{p(x_i)\,p(x_j)}$
• [Figure: composition axis $c_B$ from 0 to 1 with ticks at 1/4, 1/3, 1/2, 2/3, 3/4; annotated cumulant values 8.48 and ~0]
• ~0: places 'small' atoms on 'large' atom sites (G. Ceder)

How probabilities represent physics of mixing: more interesting correlations
• Gd2Co7 and PuNi3: both structures share the same local environments
• [Figure: layer stacking sequences, AABAAB... stacking and ABAB... stacking]
• $g_{ij}(x_i, x_j) = 54$

Structure correlation observations
• Correlation factors are probabilistic analogue of heuristic rules
• No explicit reference to physics. Physics is embedded in experimental data

Information theory for structure stability
How much information is carried by knowledge of structure?
Suppose I know Fe3C forms @ c = ¾, how does this change prediction @ c = ½ ?
Mutual information:
$I_{i,j} = \sum_{x_i, x_j} p(x_i, x_j) \log \dfrac{p(x_i, x_j)}{p(x_i)\,p(x_j)} = \langle \log[g_{ij}(x_i, x_j)] \rangle$

Information theory for structure stability
[Figure: matrix of $I_{i,j}$ values; scale: degree of correlation]
• Each element of matrix is correlation between $X_i$ and $X_j$, e.g., $X_i$ = "AB prototype" and $X_j$ = "A2B prototype"
• $I_{i,j} = \sum_{x_i, x_j} p(x_i, x_j) \log \dfrac{p(x_i, x_j)}{p(x_i)\,p(x_j)}$

Prediction and validation in Li-Pt

Predicting structures in Li-Pt
• Structures shown: AlB2, LiRh, ??, MgCu2, CuPt7 (a.k.a. MgPt7)
• Use these as conditioning evidence for $p(x \mid e)$
• [Figure legend: suggested phases, known phases]

Cross validation to evaluate performance
• Diagram: DFT code, material property predictions, database of computed and experimental results, machine learning framework
• The machine learning framework evaluates $p(x \mid e)$ and outputs a set of likely structure candidates
• Success of method depends on how short this list is

Cross validation results
[Figure: length of list = average 'loss'; curves labeled "Independent Variables" and "Including structure correlation"]
• 10 candidates --> 95% chance of seeing GS (ground state) !!
• ~28 candidates req'd for freq.
Nature Materials, 5, 641-646, 2006
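One way to read "10 candidates --> 95% chance" is as a statement about ranks in cross validation: for each held-out system, record the position of the true ground state in the ranked candidate list, then find the shortest list that covers 95% of cases. The sketch below assumes those ranks are already available; `list_length_for_coverage` and the rank values are invented for illustration and are not the published results.

```python
import math
from typing import Sequence


def list_length_for_coverage(ranks: Sequence[int], coverage: float = 0.95) -> int:
    """Shortest list length L such that at least `coverage` of ranks are <= L."""
    if not ranks:
        raise ValueError("need at least one held-out case")
    ordered = sorted(ranks)
    # Number of held-out cases that must be covered; the small offset guards
    # against floating-point error pushing an exact product just above an integer.
    needed = max(1, math.ceil(coverage * len(ordered) - 1e-12))
    return ordered[needed - 1]


# Invented 1-based ranks of the true ground state for 20 held-out systems.
toy_ranks = [1, 1, 2, 1, 3, 1, 2, 5, 1, 1, 4, 2, 1, 1, 9, 1, 2, 1, 3, 1]
length_95 = list_length_for_coverage(toy_ranks)
```

A ranking method is better, in this sense, when it reaches the same coverage with a shorter list, because every entry on the list costs one DFT calculation.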

Some open questions
ICSD: world's largest database of inorganic crystal structures
• First entry: 1913
• # of entries: 100,243
• # usable compounds: 29,962
• # structure prototypes: 2,485
What is the information content in a chemical database?
How many 'independent' crystal structures exist in nature?

Structure prediction: wrap-up

for i in (relevant chemistries) {
    ...
    getStablePhases(i);      // now have efficient tool for this
    ...
    calculateProperty(i);    // much more needed here
    i = nextChemistry();
}

Directions for future work/collaboration
Diagram: DFT code, material property predictions, database of computed results, machine learning framework

DFT code: set of features
• Charge density
• Total energy
• Bulk moduli
• Coordination
• Bond strength
• Bond character
• Magnetic moments
• Polarization
• ...

Machine learning framework (functional mapping): from DFT code and database of computed results to material properties
• catalytic activity
• conductivity
• plasticity
• voltage/energy density

The End
• ITR grant (DMR-031253)
• http://datamine.mit.edu
• Data from High Throughput alloy study
• Online structure predictor



