Latest News and Publications

A Researcher's Guide to AI in Materials Science: From First Steps to a Frontier

Thank you to everyone who attended my tutorial at the McMaster CCEM Summer School. The energy and insightful questions were truly inspiring. As promised, this post serves as a detailed guide for anyone looking to begin their journey into the exciting intersection of artificial intelligence, machine learning, and materials science.

The central message of my talk was that AI is not a replacement for rigorous scientific inquiry, but a powerful amplifier for it. Our goal is to teach our microscopes not just to see, but to understand. This is a transformative endeavor, and while it may seem daunting, the path to getting started is more accessible than ever.

This guide provides a curated list of resources to take you from foundational concepts to the cutting edge of research.

1. Building a Conceptual Foundation (Online Courses)

Before diving into complex code or research papers, it's crucial to build a strong conceptual intuition for how machine learning works. These courses are exceptional starting points that prioritize understanding over abstract mathematics.

  • Machine Learning Specialization by Andrew Ng (Coursera)

    • Who it's for: The absolute beginner. This is widely considered the definitive "first course" in machine learning for a reason.

    • What you'll learn: Dr. Ng has a gift for explaining the core ideas of supervised and unsupervised learning, including regression, classification, and clustering, with brilliant clarity. You will come away understanding the "why" behind the algorithms.

  • Deep Learning Specialization by Andrew Ng (Coursera)

    • Who it's for: Those who have completed the introductory course and want to understand the architecture of modern AI.

    • What you'll learn: This specialization demystifies neural networks, convolutional neural networks (CNNs) for image analysis, and the strategies used to train these powerful models effectively.

  • Practical Deep Learning for Coders by fast.ai

    • Who it's for: Those who prefer a "top-down," code-first approach. It’s perfect for the experimentalist who wants to start applying models quickly and learn the theory along the way.

    • What you'll learn: You'll learn how to build and train state-of-the-art deep learning models using the fastai library, which is built on PyTorch, with a focus on practical results and best practices.

2. The Practitioner's Toolkit (Key Software & Libraries)

Your research will ultimately be done through code. The open-source community has built an incredible ecosystem of tools that are free to use and supported by excellent documentation. All of the tools below are primarily Python-based.

  • The Deep Learning Frameworks: TensorFlow & PyTorch

    • These are the two titans of deep learning. You don't need to know both; choose one and go deep. They provide the fundamental building blocks for creating and training neural networks. My group has used both to great effect in our work.

  • The Classical ML Workhorse: Scikit-learn

    • For any task that doesn't require a deep neural network (e.g., clustering, classical regression, or dimensionality reduction such as PCA), Scikit-learn is the gold standard. It’s incredibly user-friendly and powerful.

  • Model Exchanges: Hugging Face

    • Many pretrained models already exist that you can adapt to your own problems. Hugging Face is one such exchange, and it can help you get up to speed quickly.

  • Domain-Specific Tools for Microscopy & Materials:

    • Atomap: A foundational open-source tool for finding and fitting atomic column positions in high-resolution images using 2D Gaussian fitting. Our TEMWizard software was built to provide a graphical interface for this library.

    • AtomAI: A powerful PyTorch-based toolkit that connects deep learning models to experimental microscopy data, enabling analysis of atomic and mesoscale images and spectroscopy.

    • HyperSpy: An essential open-source Python library for analyzing multi-dimensional data, including electron energy loss spectroscopy (EELS) and energy-dispersive X-ray spectroscopy (EDS) datacubes.

    • py4DSTEM: A specialized toolkit for the analysis of 4D-STEM datasets, a rapidly growing area of research.
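To make the classical workflow mentioned under Scikit-learn concrete, here is a minimal sketch of PCA followed by k-means clustering on a spectrum-image-style datacube. Everything about the data is an assumption made for illustration: the cube is synthetic, the two "phases" are made-up Gaussian peaks, and the array sizes and noise level are arbitrary. With real EELS or EDS data you would load the cube with your own tooling and flatten it to a (pixels, channels) array in the same way.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for a spectrum image: a 20 x 20 scan, 100 energy channels.
# All sizes and peak positions here are illustrative assumptions.
ny, nx, n_channels = 20, 20, 100
energy = np.linspace(0.0, 1.0, n_channels)

# Two hypothetical reference spectra: Gaussian peaks at different energies.
spectrum_a = np.exp(-((energy - 0.3) ** 2) / 0.005)
spectrum_b = np.exp(-((energy - 0.7) ** 2) / 0.005)

# Ground truth: left half of the scan is phase 0, right half is phase 1.
phase_map = np.zeros((ny, nx), dtype=int)
phase_map[:, nx // 2:] = 1

# Build the noisy datacube with shape (ny, nx, n_channels).
datacube = np.where(phase_map[..., None] == 0, spectrum_a, spectrum_b)
datacube = datacube + rng.normal(scale=0.1, size=datacube.shape)

# Scikit-learn expects a 2D array: one row per pixel, one column per channel.
X = datacube.reshape(-1, n_channels)

# Step 1: reduce 100 channels to a few principal components.
pca = PCA(n_components=3)
scores = pca.fit_transform(X)

# Step 2: cluster the pixels in the reduced space.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores).reshape(ny, nx)

# Cluster indices are arbitrary, so score agreement up to a label swap.
agreement = max(np.mean(labels == phase_map), np.mean(labels != phase_map))

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Agreement with ground-truth phase map:", agreement)
```

Reshaping `labels` back to `(ny, nx)` gives a phase map you can plot directly. Reducing dimensionality before clustering is one common design choice for noisy spectra, not the only one; the number of components and clusters here are set by hand because the synthetic data has exactly two phases.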

3. Seminal & Foundational Papers: Your Reading List

Staying current means reading the literature. This list is not exhaustive, but it provides a starting point: papers that establish a new concept, provide a vision for the future, or demonstrate a particularly insightful application in our field. I have organized them by theme to help guide your reading.

The Deep Learning Revolution (General)

Core Task: Feature Identification and Segmentation

Advanced Task: Discovery, Forecasting, and New Modalities

The Vision: Reviews, Perspectives, and Autonomous Science

4. Communities & Research Groups to Follow

Science is a collaborative effort. Following the work of leading groups and participating in community events is a great way to stay inspired and informed.

  • Influential Research Groups:

  • Professional Societies & Meetings:

    • Microscopy Society of America (MSA): The annual Microscopy & Microanalysis meeting is the premier venue for our field. We regularly organize symposia on AI and data science there.

    • Materials Research Society (MRS): The MRS Spring and Fall meetings are essential for materials scientists, with a rapidly growing number of symposia dedicated to data science, AI, and automated discovery.

I hope this guide provides you with a solid starting point and a clear roadmap. This is a journey of continuous learning, but one that holds immense promise for the future of materials science. The tools are here, the community is growing, and the scientific frontiers are waiting.

I look forward to seeing what you discover!

Steven S