<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Xiangyu Yin Blog</title>
<link>https://xiangyu-yin.com/blog.html</link>
<atom:link href="https://xiangyu-yin.com/blog.xml" rel="self" type="application/rss+xml"/>
<description>Research notes on computational imaging, AI, scientific software, and scientific automation.</description>
<language>en-US</language>
<image>
<url>https://xiangyu-yin.com/content/img/my_image.jpg</url>
<title>Xiangyu Yin Blog</title>
<link>https://xiangyu-yin.com/blog.html</link>
</image>
<generator>quarto-1.9.36</generator>
<lastBuildDate>Sun, 23 Mar 2025 05:00:00 GMT</lastBuildDate>
<item>
  <title>Building an MCP Server for the Materials Project</title>
  <dc:creator>Xiangyu Yin</dc:creator>
  <link>https://xiangyu-yin.com/content/post_mp_mcp.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>In this blog post, I will demonstrate how to build a <em>Model Context Protocol</em> (MCP) server that connects to the <a href="https://next-gen.materialsproject.org/">Materials Project (MP)</a> database. MCP is a standard for bridging AI applications (like Claude, ChatGPT, or in general any LLM-based client that supports MCP) with external data and tools. Instead of manually calling an API, you’ll expose a set of tools under a standardized protocol. This means you can seamlessly integrate these Materials Project queries into multiple AI frameworks with minimal custom code. For more details on MCP, see the <a href="https://modelcontextprotocol.io/">official website</a>.</p>
</section>
<section id="a-brief-introduction-to-the-materials-project-api" class="level2">
<h2 class="anchored" data-anchor-id="a-brief-introduction-to-the-materials-project-api">A brief introduction to the Materials Project API</h2>
<p>The <a href="https://next-gen.materialsproject.org/api">Materials Project API</a> provides programmatic access to a massive store of computed materials data: from crystal structures (CIFs), to band structures, to thermodynamic quantities like formation energy and energy above hull. The official Python client is <strong><code>mp_api</code></strong>. Typical usage (non-MCP) might look like:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mp_api.client <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> MPRester</span>
<span id="cb1-2"></span>
<span id="cb1-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">with</span> MPRester(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"your_api_key_here"</span>) <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> mpr:</span>
<span id="cb1-4">    docs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mpr.materials.summary.search(elements<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Si"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"O"</span>], band_gap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span>))</span></code></pre></div></div>
<p>Although it’s straightforward to call <code>mp_api</code> from your local Python environment, hooking it up to an LLM for agent-like usage typically requires additional bridging code.</p>
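<p>To make "bridging code" concrete, here is a minimal sketch of what a hand-rolled integration tends to involve without MCP: a tool description the model can read, plus a dispatcher that routes the model's requested call to a Python function. Everything in it is hypothetical: the <code>TOOLS</code> registry, the <code>dispatch</code> helper, and <code>search_materials_stub</code> (which returns canned data with a placeholder ID instead of calling <code>mp_api</code>) are illustrative names, and each LLM platform expects its own variation on this schema.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">import json


def search_materials_stub(elements, band_gap_min=0.0, band_gap_max=10.0):
    # Hypothetical stand-in for an mp_api query; returns canned data
    # (placeholder ID) so the sketch runs offline.
    return [{
        "material_id": "mp-0000",
        "elements": elements,
        "band_gap_range": [band_gap_min, band_gap_max],
    }]


# Hand-written registry: a description for the model plus the function to run.
TOOLS = {
    "search_materials": {
        "description": "Search materials by elements and band gap range (eV).",
        "parameters": {
            "elements": "list of element symbols",
            "band_gap_min": "float, eV",
            "band_gap_max": "float, eV",
        },
        "handler": search_materials_stub,
    }
}


def dispatch(tool_call_json):
    # Route a model-issued call of the form {"name": ..., "arguments": {...}}
    # to the matching handler and return the result as a JSON string.
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        return json.dumps({"error": "unknown tool: " + call["name"]})
    result = tool["handler"](**call["arguments"])
    return json.dumps({"result": result})


reply = dispatch(json.dumps({
    "name": "search_materials",
    "arguments": {"elements": ["Si", "O"], "band_gap_min": 0.5, "band_gap_max": 1.0},
}))
print(reply)
</code></pre></div>
<p>Every platform you target needs its own copy of this registry and dispatcher, in its own schema. That duplication is the part MCP is meant to remove.</p>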
</section>
<section id="why-not-just-call-the-mp-api-directly" class="level2">
<h2 class="anchored" data-anchor-id="why-not-just-call-the-mp-api-directly">Why not just call the MP API directly?</h2>
<p>If you’re <em>only</em> using one AI platform or a single environment (like a Jupyter notebook), you <em>can</em> just call <code>mp_api</code> functions directly. However, this approach doesn’t scale well if you want:</p>
<ol type="1">
<li><strong>Multiple AI apps</strong>: Tools like Cursor, Windsurf, Claude Desktop, Emacs clients, etc., all want to discover and call your “tools” in a standard, stable way.</li>
<li><strong>One integration for many LLMs</strong>: With MCP, you write your code once. Any LLM app that supports the protocol can discover your server, see its docs, and call your exposed tools.</li>
<li><strong>Granular permission control</strong>: The MCP specification asks client applications to get the user’s approval before running a tool the LLM requests. This keeps you in the loop on usage, although enforcing it is up to each client.</li>
</ol>
<p>Using MCP is especially convenient if you want to combine the Materials Project data with other advanced data sources or local resources, e.g.&nbsp;local quantum chemistry codes, Python scripts, or HPC job management tools. AI applications that speak MCP can chain all of them together seamlessly.</p>
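<p>For a sense of what "standard" means on the wire: MCP messages are JSON-RPC 2.0, and a client discovers tools with a <code>tools/list</code> request and invokes one with <code>tools/call</code>. The sketch below only builds and prints such requests; it does not talk to a server. The <code>make_request</code> helper and the <code>id</code> values are illustrative, and the tool name and arguments mirror the <code>search_materials</code> tool defined in the server code later in this post.</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python">import json


def make_request(request_id, method, params=None):
    # Build a JSON-RPC 2.0 request envelope, the framing MCP uses.
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = make_request(1, "tools/list")

# Call one tool by name with JSON arguments.
call_tool = make_request(2, "tools/call", {
    "name": "search_materials",
    "arguments": {"elements": ["Si", "O"], "band_gap_min": 0.5, "band_gap_max": 1.0},
})

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
</code></pre></div>
<p>In practice the client and the server SDK produce and parse these messages for you; the point is that every MCP client sends the same shape of request regardless of which LLM sits behind it.</p>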
</section>
<section id="step-by-step-the-materials-project-mcp-server" class="level2">
<h2 class="anchored" data-anchor-id="step-by-step-the-materials-project-mcp-server">Step-by-step: The Materials Project MCP server</h2>
<p>Below is the main code of our server (a Python script). You can see how we define two <em>tools</em>:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># File: materials_project_plugin.py</span></span>
<span id="cb2-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#</span></span>
<span id="cb2-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># An MCP server exposing tools for querying the Materials Project database</span></span>
<span id="cb2-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># via mp_api. Once running, tools can be invoked through any MCP-compatible</span></span>
<span id="cb2-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># client (e.g. Claude Desktop).</span></span>
<span id="cb2-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#</span></span>
<span id="cb2-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Dependencies:</span></span>
<span id="cb2-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#   pip install "mcp[cli]" aiohttp pydantic mp_api </span></span>
<span id="cb2-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#</span></span>
<span id="cb2-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Environment:</span></span>
<span id="cb2-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#   MP_API_KEY=&lt;your_materials_project_api_key&gt;</span></span>
<span id="cb2-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#</span></span>
<span id="cb2-13"></span>
<span id="cb2-14"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> os</span>
<span id="cb2-15"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> logging</span>
<span id="cb2-16"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Optional, List</span>
<span id="cb2-17"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> emmet.core.electronic_structure <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> BSPathType</span>
<span id="cb2-18"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mcp.server.fastmcp <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> FastMCP</span>
<span id="cb2-19"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> pydantic <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Field</span>
<span id="cb2-20"></span>
<span id="cb2-21"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Materials Project client</span></span>
<span id="cb2-22"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mp_api.client <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> MPRester</span>
<span id="cb2-23"></span>
<span id="cb2-24"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Setup logging</span></span>
<span id="cb2-25">logging.basicConfig(level<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>logging.INFO)</span>
<span id="cb2-26">logger <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> logging.getLogger(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"materials_project_mcp"</span>)</span>
<span id="cb2-27"></span>
<span id="cb2-28"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Obtain your Materials Project API key from env var</span></span>
<span id="cb2-29">API_KEY <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> os.environ.get(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MP_API_KEY"</span>)</span>
<span id="cb2-30"></span>
<span id="cb2-31"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create the MCP server instance</span></span>
<span id="cb2-32">mcp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> FastMCP(</span>
<span id="cb2-33">    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Materials Project Plugin"</span>,</span>
<span id="cb2-34">    version<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"0.0.1"</span>,</span>
<span id="cb2-35">    description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(</span>
<span id="cb2-36">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"A Model Context Protocol (MCP) server that exposes query tools "</span></span>
<span id="cb2-37">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"for the Materials Project database using the mp_api client."</span></span>
<span id="cb2-38">    ),</span>
<span id="cb2-39">)</span>
<span id="cb2-40"></span>
<span id="cb2-41"></span>
<span id="cb2-42"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _get_mp_rester() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> MPRester:</span>
<span id="cb2-43">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb2-44"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Helper function to initialize an MPRester session with the user's API key.</span></span>
<span id="cb2-45"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb2-46">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> API_KEY:</span>
<span id="cb2-47">        logger.warning(</span>
<span id="cb2-48">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"No MP_API_KEY found in environment. Attempting MPRester() without key."</span></span>
<span id="cb2-49">        )</span>
<span id="cb2-50">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> MPRester()</span>
<span id="cb2-51">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> MPRester(API_KEY)</span>
<span id="cb2-52"></span>
<span id="cb2-53"></span>
<span id="cb2-54"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@mcp.tool</span>()</span>
<span id="cb2-55"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">async</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> search_materials(</span>
<span id="cb2-56">    elements: Optional[List[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>]] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Field(</span>
<span id="cb2-57">        default<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>,</span>
<span id="cb2-58">        description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"List of element symbols to filter by (e.g. ['Si', 'O'])."</span>,</span>
<span id="cb2-59">    ),</span>
<span id="cb2-60">    band_gap_min: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Field(</span>
<span id="cb2-61">        default<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Lower bound for band gap filtering in eV."</span></span>
<span id="cb2-62">    ),</span>
<span id="cb2-63">    band_gap_max: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Field(</span>
<span id="cb2-64">        default<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10.0</span>, description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Upper bound for band gap filtering in eV."</span></span>
<span id="cb2-65">    ),</span>
<span id="cb2-66">    is_stable: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Field(</span>
<span id="cb2-67">        default<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>,</span>
<span id="cb2-68">        description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Whether to only retrieve stable materials (True) or all (False)."</span>,</span>
<span id="cb2-69">    ),</span>
<span id="cb2-70">    max_results: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Field(</span>
<span id="cb2-71">        default<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, ge<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, le<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>, description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Maximum number of results to return."</span></span>
<span id="cb2-72">    ),</span>
<span id="cb2-73">) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb2-74">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb2-75"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Search for materials in the Materials Project database using basic filters:</span></span>
<span id="cb2-76"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    - elements (list of elements to include)</span></span>
<span id="cb2-77"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    - band_gap (range in eV)</span></span>
<span id="cb2-78"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    - is_stable</span></span>
<span id="cb2-79"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Returns a formatted list of matches with their material_id, formula, and band gap.</span></span>
<span id="cb2-80"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb2-81">    logger.info(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Starting search_materials query..."</span>)</span>
<span id="cb2-82">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">with</span> _get_mp_rester() <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> mpr:</span>
<span id="cb2-83">        docs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mpr.materials.summary.search(</span>
<span id="cb2-84">            elements<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>elements,</span>
<span id="cb2-85">            band_gap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(band_gap_min, band_gap_max),</span>
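<span id="cb2-86">            is_stable<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>is_stable <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">or</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># None = no stability filter; passing False would return only unstable materials</span></span>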
<span id="cb2-87">            fields<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"material_id"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"formula_pretty"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"band_gap"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"energy_above_hull"</span>],</span>
<span id="cb2-88">        )</span>
<span id="cb2-89"></span>
<span id="cb2-90">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Truncate results to max_results</span></span>
<span id="cb2-91">    docs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(docs)[:max_results]</span>
<span id="cb2-92"></span>
<span id="cb2-93">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> docs:</span>
<span id="cb2-94">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"No materials found matching your criteria."</span></span>
<span id="cb2-95"></span>
<span id="cb2-96">    results_md <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (</span>
<span id="cb2-97">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"## Materials Search Results</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-98">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Elements**: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>elements <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">or</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Any'</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-99">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Band gap range**: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>band_gap_min<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> eV to </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>band_gap_max<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> eV</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-100">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Stable only**: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>is_stable<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-101">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"**Showing up to </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>max_results<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> matches**</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-102">    )</span>
<span id="cb2-103">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i, mat <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">enumerate</span>(docs, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>):</span>
<span id="cb2-104">        results_md <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> (</span>
<span id="cb2-105">            <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"**</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>i<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">.** ID: `</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>mat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>material_id<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">` | Formula: **</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>mat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>formula_pretty<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">** | "</span></span>
<span id="cb2-106">            <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Band gap: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>mat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>band_gap<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.3f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> eV | E above hull: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>mat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>energy_above_hull<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.3f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> eV</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-107">        )</span>
<span id="cb2-108">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> results_md</span>
<span id="cb2-109"></span>
<span id="cb2-110"></span>
<span id="cb2-111"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@mcp.tool</span>()</span>
<span id="cb2-112"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">async</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_structure_by_id(</span>
<span id="cb2-113">    material_id: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Field(..., description<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Materials Project ID (e.g. 'mp-149')"</span>)</span>
<span id="cb2-114">) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>:</span>
<span id="cb2-115">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb2-116"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Retrieve the final computed structure for a given material_id from the Materials Project.</span></span>
<span id="cb2-117"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Returns a plain text summary of the structure (lattice, sites, formula).</span></span>
<span id="cb2-118"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb2-119">    logger.info(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"Fetching structure for </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>material_id<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">..."</span>)</span>
<span id="cb2-120">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">with</span> _get_mp_rester() <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> mpr:</span>
<span id="cb2-121">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Shortcut method to get just the final structure</span></span>
<span id="cb2-122">        structure <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mpr.get_structure_by_material_id(material_id)</span>
<span id="cb2-123"></span>
<span id="cb2-124">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> structure:</span>
<span id="cb2-125">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"No structure found for </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>material_id<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">."</span></span>
<span id="cb2-126"></span>
<span id="cb2-127">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Summarize the structure</span></span>
<span id="cb2-128">    formula <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> structure.composition.reduced_formula</span>
<span id="cb2-129">    lattice <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> structure.lattice</span>
<span id="cb2-130">    sites_count <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(structure)</span>
<span id="cb2-131">    text_summary <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (</span>
<span id="cb2-132">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"## Structure for </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>material_id<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-133">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Formula**: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>formula<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-134">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Lattice**:</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-135">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"   a = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>lattice<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>a<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.3f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> Å, b = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>lattice<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>b<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.3f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> Å, c = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>lattice<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>c<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.3f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> Å</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-136">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"   α = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>lattice<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>alpha<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.2f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">°, β = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>lattice<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>beta<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.2f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">°, γ = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>lattice<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>gamma<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.2f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">°</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-137">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Number of sites**: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>sites_count<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-138">        <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"- **Reduced formula**: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>structure<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>composition<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.</span>reduced_formula<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span></span>
<span id="cb2-139">    )</span>
<span id="cb2-140">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> text_summary</span>
<span id="cb2-141"></span>
<span id="cb2-142"></span>
<span id="cb2-143"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">__name__</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"__main__"</span>:</span>
<span id="cb2-144">    logger.info(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Starting Materials Project MCP server..."</span>)</span>
<span id="cb2-145">    mcp.run()</span></code></pre></div></div>
</details>
<ol type="1">
<li><strong><code>search_materials</code></strong>: Queries the Materials Project for stable or unstable materials that contain given elements and satisfy band gap constraints.</li>
<li><strong><code>get_structure_by_id</code></strong>: Retrieves the final computed crystal structure for a known <code>material_id</code> (like <code>mp-149</code> for silicon).</li>
</ol>
<p>When run, it starts an MCP server over STDIO. Any MCP-compatible client can spawn this script, discover the tools, and use them interactively with user approval.</p>
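<p>To make the STDIO exchange more concrete, here is a minimal sketch of the kind of messages a client writes to the server's standard input. MCP messages are JSON-RPC 2.0. The helper name <code>make_request</code>, the request ids, and the argument name <code>material_id</code> are assumptions for illustration, and a real client performs an initialization handshake before sending either request.</p>

```python
import json


def make_request(request_id, method, params=None):
    """Build one JSON-RPC 2.0 request as a single line of text."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps emits no raw newlines, so each message fits on one line.
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = make_request(1, "tools/list")

# Call one tool by name; the argument name is assumed to match the
# parameter of get_structure_by_id in the script above.
call_tool = make_request(
    2,
    "tools/call",
    {"name": "get_structure_by_id", "arguments": {"material_id": "mp-149"}},
)

print(list_tools)
print(call_tool)
```

<p>In practice you never write these messages by hand; the client and the MCP SDK handle them. Seeing their shape can still help when you read server logs.</p>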
<section id="whats-happening" class="level3">
<h3 class="anchored" data-anchor-id="whats-happening">What’s happening?</h3>
<ol type="1">
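<li><strong><code>FastMCP(...)</code></strong>: Creates the server object using the high-level <code>FastMCP</code> interface of the MCP Python SDK. When started with <code>mcp.run()</code>, it communicates with LLM-based clients over standard input/output by default.</li>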
<li><strong><code>@mcp.tool()</code></strong>: Decorates an async function, making it discoverable by MCP clients as a “tool.”</li>
<li><strong><code>search_materials</code></strong>: Queries for materials, filtering by band gap or stability. Returns a markdown summary.</li>
<li><strong><code>get_structure_by_id</code></strong>: Grabs the final computed structure from the MP database and returns it in a textual summary format.</li>
<li><strong><code>mcp.run()</code></strong>: Launches the event loop and starts listening for requests from any connected client.</li>
</ol>
</section>
</section>
<section id="using-the-mcp-server-in-claude-desktop" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="using-the-mcp-server-in-claude-desktop">Using the MCP server in Claude Desktop</h2>
<p>As an example, we’ll show how you might connect to this server with <a href="https://claude.ai/download">Claude Desktop</a>.</p>
<section id="step-1-install-dependencies-set-environment" class="level3">
<h3 class="anchored" data-anchor-id="step-1-install-dependencies-set-environment">Step 1: Install dependencies &amp; set environment</h3>
<p>Make sure you have:</p>
<ul>
<li>A Python environment: <code>python3 -m venv .venv</code> then <code>source .venv/bin/activate</code><br>
</li>
<li>Packages: <code>pip install "mcp[cli]" aiohttp pydantic mp_api</code></li>
</ul>
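<p>Before wiring anything into Claude Desktop, it can help to confirm that the environment is complete. The sketch below uses only the standard library. The helper names are hypothetical, and the import names in <code>REQUIRED_MODULES</code> are assumed to match the packages installed above.</p>

```python
import importlib.util
import os

# Import names assumed for the packages installed above.
REQUIRED_MODULES = ["mcp", "aiohttp", "pydantic", "mp_api"]


def missing_modules(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def api_key_is_set(env=None):
    """Report whether MP_API_KEY is present and non-empty."""
    env = os.environ if env is None else env
    return bool(env.get("MP_API_KEY", "").strip())


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing packages:", ", ".join(missing) or "none")
    print("MP_API_KEY set:", api_key_is_set())
```

<p>Run it with the virtual environment activated. If you pass the API key through the Claude Desktop config instead of your shell, an unset key here is expected.</p>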
</section>
<section id="step-2-put-the-server-script-somewhere" class="level3">
<h3 class="anchored" data-anchor-id="step-2-put-the-server-script-somewhere">Step 2: Put the server script somewhere</h3>
<p>For example, place <code>materials_project_plugin.py</code> in <code>~/mcp-servers/mp_plugin/</code>. Make sure the script is executable, or that you can run it with <code>python</code>.</p>
</section>
<section id="step-3-add-to-claude_desktop_config.json" class="level3">
<h3 class="anchored" data-anchor-id="step-3-add-to-claude_desktop_config.json">Step 3: Add to <code>claude_desktop_config.json</code></h3>
<p>Locate your Claude Desktop config. On macOS, it’s at: <code>~/Library/Application Support/Claude/claude_desktop_config.json</code>. For Windows, check: <code>%APPDATA%\Claude\claude_desktop_config.json</code></p>
<p>Add a snippet like this:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode json code-with-copy"><code class="sourceCode json"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb3-2">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"mcpServers"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb3-3">    <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"materials_project_plugin"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb3-4">      <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"command"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/ABSOLUTE/PATH/TO/python/executable"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb3-5">      <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"args"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">[</span></span>
<span id="cb3-6">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/ABSOLUTE/PATH/TO/mp_plugin/materials_project_plugin.py"</span></span>
<span id="cb3-7">      <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">]</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb3-8">      <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"env"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb3-9">        <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"MP_API_KEY"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"&lt;YOUR_MATERIALS_PROJECT_API_KEY&gt;"</span></span>
<span id="cb3-10">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">}</span></span>
<span id="cb3-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">}</span></span>
<span id="cb3-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">}</span></span>
<span id="cb3-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">}</span></span></code></pre></div></div>
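}</span></span></code></pre></div></div>">
<p>If the config file already contains other settings, editing it by hand risks overwriting them. The following is a minimal sketch of merging the entry programmatically. The function name <code>add_mcp_server</code> and the <code>someOtherSetting</code> key are hypothetical, and the demonstration writes to a temporary file rather than your real config.</p>

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name, command, args, env):
    """Add or replace one entry under "mcpServers", keeping other settings."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args), "env": dict(env)}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demonstration against a throwaway file, not the real Claude Desktop config.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text('{"someOtherSetting": true}', encoding="utf-8")
    updated = add_mcp_server(
        demo_path,
        "materials_project_plugin",
        "/ABSOLUTE/PATH/TO/python/executable",
        ["/ABSOLUTE/PATH/TO/mp_plugin/materials_project_plugin.py"],
        {"MP_API_KEY": "YOUR_MATERIALS_PROJECT_API_KEY"},
    )

print(json.dumps(updated, indent=2))
```

<p>To use it for real, point <code>config_path</code> at the location given above and replace the placeholder paths and key. Keeping a backup copy of the original file first is a sensible precaution.</p>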
</section>
<section id="step-4-restart-claude-desktop" class="level3 page-columns page-full">
<h3 class="anchored" data-anchor-id="step-4-restart-claude-desktop">Step 4: Restart Claude Desktop</h3>
<p>Claude will now spawn the plugin automatically. You should see a new hammer icon in your chat window, representing available MCP tools.</p>
<div class="quarto-figure quarto-figure-center page-columns page-full">
<figure class="figure page-columns page-full">
<p><img src="https://xiangyu-yin.com/content/img/available_mcp_tools.png" class="img-fluid figure-img" style="width:95.0%"></p>
<figcaption class="margin-caption">Claude Desktop Showing MCP Server Tools</figcaption>
</figure>
</div>
</section>
<section id="step-5-test-queries" class="level3 page-columns page-full">
<h3 class="anchored" data-anchor-id="step-5-test-queries">Step 5: Test queries</h3>
<p>Here is an example prompt you can try:</p>
<pre class="plaintext"><code>&gt; I'm researching materials for solar cell applications. Can you:
&gt; 1) Find me 5 stable semiconductor materials with band gaps between 1.0 and 2.0 eV 
&gt;    that contain either Ga or In?
&gt; 2) For the most promising candidate (lowest energy above hull), retrieve and explain 
&gt;    its crystal structure.</code></pre>
<p>Claude will use the “search_materials” tool with your provided constraints, then possibly call “get_structure_by_id” to retrieve further details. Below is a sample snippet from the conversation output:</p>
<div class="quarto-figure quarto-figure-center page-columns page-full">
<figure class="figure page-columns page-full">
<p><img src="https://xiangyu-yin.com/content/img/mp_mcp_example.png" class="img-fluid figure-img" style="width:95.0%"></p>
<figcaption class="margin-caption">Example of a conversation with Claude Desktop and the MCP server</figcaption>
</figure>
</div>
</section>
</section>
<section id="wrap-up-and-additional-links" class="level2">
<h2 class="anchored" data-anchor-id="wrap-up-and-additional-links">Wrap-up and additional links</h2>
<p>That’s it! You now have an MCP server that can query the Materials Project. Starting from this basic example, you can adapt it to your needs:</p>
<ul>
<li>Experiment with advanced queries (like stable phases in multi-element chemical systems).</li>
<li>Add or refine tools in your server for more specialized searches (e.g., stable doping compositions).</li>
<li>Combine with other servers in your environment, for a multi-functional AI agent.</li>
</ul>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p>Enjoy exploring materials data with your new AI-savvy pipeline!😊</p>
</div>
</div>


</section>

<script data-collect-dnt="true" async="" src="https://scripts.simpleanalyticscdn.com/latest.js"></script><a onclick="window.scrollTo(0, 0); return false;" id="quarto-back-to-top"><i class="bi bi-arrow-up"></i> Back to top</a><div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-reuse"><h2 class="anchored quarto-appendix-heading">Reuse</h2><div class="quarto-appendix-contents"><div><a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a></div></div></section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{yin2025,
  author = {Yin, Xiangyu},
  title = {Building an {MCP} {Server} for the {Materials} {Project}},
  date = {2025-03-23},
  url = {https://xiangyu-yin.com/content/post_mp_mcp.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-yin2025" class="csl-entry quarto-appendix-citeas">
Yin, Xiangyu. 2025. <span>“Building an MCP Server for the Materials
Project.”</span> March 23. <a href="https://xiangyu-yin.com/content/post_mp_mcp.html">https://xiangyu-yin.com/content/post_mp_mcp.html</a>.
</div></div></section></div> ]]></description>
  <guid>https://xiangyu-yin.com/content/post_mp_mcp.html</guid>
  <pubDate>Sun, 23 Mar 2025 05:00:00 GMT</pubDate>
  <media:content url="https://xiangyu-yin.com/content/img/mp_mcp_example.png" medium="image" type="image/png" height="86" width="144"/>
</item>
<item>
  <title>Building an AI-Driven Scientific Workflow &amp; Chatbot with Nodeology</title>
  <dc:creator>Xiangyu Yin</dc:creator>
  <link>https://xiangyu-yin.com/content/post_nodeology_example.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p><a href="https://github.com/xyin-anl/nodeology">Nodeology</a> is a new AI workflow-building library designed to simplify the creation of robust state machine workflows through an intuitive, user-friendly interface. The framework empowers researchers—especially those without extensive programming backgrounds—to rapidly design and deploy complete AI workflows using just prompt templates and existing functions. In this post, we’ll introduce Nodeology by constructing a simple, interactive, AI-assisted particle trajectory simulation workflow. By the end, you’ll have a chatbot capable of:</p>
<ul>
<li>Requesting user input for physical parameters of a simulated particle.</li>
<li>Calculating the particle’s trajectory under electric and magnetic fields.</li>
<li>Visualizing the trajectory through a 3D plot.</li>
<li>Analyzing particle motion using a Vision-Language Model (VLM).</li>
<li>Determining whether to continue or conclude based on user feedback.</li>
</ul>
<p>We’ll walk through each step clearly—from understanding Nodeology’s core concepts (State, Nodes, and Workflows) to integrating them into an interactive user interface.</p>
</section>
<section id="understanding-nodeologys-core-concepts" class="level2">
<h2 class="anchored" data-anchor-id="understanding-nodeologys-core-concepts">Understanding Nodeology’s Core Concepts</h2>
<section id="state" class="level3">
<h3 class="anchored" data-anchor-id="state">State</h3>
<p><code>State</code> is a shared data “backpack” accessible to every step (or <code>Node</code>) within the workflow, which can both read from and update it. In our example, it keeps track of:</p>
<ul>
<li>Simulation parameters (e.g., <code>mass</code>, <code>charge</code>).</li>
<li>Intermediate outputs (such as the 3D plot).</li>
<li>User inputs and conversation history.</li>
</ul>
<p>Think of <code>State</code> as the single source of truth traveling with you throughout the workflow’s lifecycle.</p>
</section>
<section id="nodes" class="level3">
<h3 class="anchored" data-anchor-id="nodes">Nodes</h3>
<p>A <code>Node</code> is the fundamental building block within a workflow and comes in two types:</p>
<ul>
<li><strong>Prompt-based Node</strong>: Utilizes an LLM/VLM with prompt templates, extracting necessary data from <code>State</code>, invoking the model, and then updating or adding new data back into the <code>State</code>.</li>
<li><strong>Function-based Node</strong>: A decorated Python function that handles traditional scientific computations or data processing tasks (e.g., numerical simulations, file I/O, plotting) and stores the results directly into the <code>State</code>.</li>
</ul>
<p>This combination of AI-driven logic and conventional Python functions is precisely how Nodeology seamlessly integrates AI with domain-specific expertise.</p>
</section>
<section id="workflow" class="level3">
<h3 class="anchored" data-anchor-id="workflow">Workflow</h3>
<p>A <code>Workflow</code> in Nodeology is essentially a directed graph composed of <code>Nodes</code>:</p>
<ul>
<li><strong>Flow definition</strong>: Clearly define the progression, such as “After Node A completes, pass control to Node B.”</li>
<li><strong>Conditional branches</strong>: Allow decision-making like “If condition X is true, execute Node Y; otherwise, Node Z.”</li>
<li><strong>User interactions</strong>: Easily integrate points where the workflow awaits user input before proceeding.</li>
</ul>
<p>All these tasks are efficiently managed by Nodeology’s <code>Workflow</code> class, significantly reducing common workflow management overhead like error handling, concurrency, and checkpointing.</p>
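<p>To illustrate the three concepts without relying on any library, here is a plain-Python sketch of a state machine with a shared state, two nodes, and a conditional branch. It does not use Nodeology's API. Every name in it (<code>run_workflow</code>, <code>NODES</code>, <code>EDGES</code>, and the state keys) is hypothetical.</p>

```python
def ask_parameters(state):
    """Node: fill in a parameter and record the step."""
    state["mass"] = 9.109e-31  # electron mass in kg, used as sample input
    state["history"].append("ask_parameters")
    return state


def simulate(state):
    """Node: stand-in for the numerical simulation."""
    state["runs"] += 1
    # Pretend the user asks for exactly one more run after the first.
    state["continue_simulation"] = state["runs"] == 1
    state["history"].append("simulate")
    return state


def route_after_simulate(state):
    """Conditional branch: loop back to simulate or stop the workflow."""
    return "simulate" if state["continue_simulation"] else None


# Flow definition: each node name maps to a function that picks the next node.
NODES = {"ask_parameters": ask_parameters, "simulate": simulate}
EDGES = {
    "ask_parameters": lambda state: "simulate",
    "simulate": route_after_simulate,
}


def run_workflow(start, state):
    """Run nodes in sequence until an edge returns None."""
    current = start
    while current is not None:
        state = NODES[current](state)
        current = EDGES[current](state)
    return state


final_state = run_workflow("ask_parameters", {"runs": 0, "history": []})
print(final_state["history"])
```

<p>The shared dictionary plays the role of <code>State</code>, the functions play the role of <code>Nodes</code>, and the routing table plays the role of the <code>Workflow</code> graph. A real workflow library adds the pieces this sketch omits, such as pausing for user input.</p>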
</section>
</section>
<section id="defining-our-simulation-state" class="level2">
<h2 class="anchored" data-anchor-id="defining-our-simulation-state">Defining Our Simulation State</h2>
<p>We’ll begin by specifying <em>which data</em> we expect to carry throughout the workflow. This includes physical parameters (e.g., <code>mass</code>, <code>charge</code>, <code>initial_velocity</code>, etc.), plus placeholders for the user’s analysis or plot results.</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> List, Dict</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> nodeology.state <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> State</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> TrajectoryState(State):</span>
<span id="cb1-6">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    State for our particle trajectory workflow.</span></span>
<span id="cb1-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    It holds everything from initial parameters,</span></span>
<span id="cb1-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    user confirmations, and final analysis.</span></span>
<span id="cb1-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb1-11">    mass: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>                   <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Particle mass (kg)</span></span>
<span id="cb1-12">    charge: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>                 <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Particle charge (C)</span></span>
<span id="cb1-13">    initial_velocity: np.ndarray  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># [vx, vy, vz]</span></span>
<span id="cb1-14">    E_field: np.ndarray           <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># [Ex, Ey, Ez]</span></span>
<span id="cb1-15">    B_field: np.ndarray           <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># [Bx, By, Bz]</span></span>
<span id="cb1-16">    </span>
<span id="cb1-17">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># A boolean for user confirmation</span></span>
<span id="cb1-18">    confirm_parameters: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span></span>
<span id="cb1-19"></span>
<span id="cb1-20">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># For storing the raw JSON from an LLM-based node</span></span>
<span id="cb1-21">    parameters_updater_output: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span></span>
<span id="cb1-22"></span>
<span id="cb1-23">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># The computed 3D positions of the particle</span></span>
<span id="cb1-24">    positions: List[np.ndarray]</span>
<span id="cb1-25"></span>
<span id="cb1-26">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># We'll store a path or figure for the plotted trajectory</span></span>
<span id="cb1-27">    trajectory_plot: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span></span>
<span id="cb1-28">    trajectory_plot_path: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span></span>
<span id="cb1-29"></span>
<span id="cb1-30">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Where the LLM’s analysis result will land</span></span>
<span id="cb1-31">    analysis_result: Dict</span>
<span id="cb1-32"></span>
<span id="cb1-33">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># A boolean for continuing or finishing</span></span>
<span id="cb1-34">    continue_simulation: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span></span></code></pre></div></div>
</details>
<blockquote class="blockquote">
<p><strong>Key Takeaway</strong>: If you want a new value to persist or be shared across nodes, define it here.</p>
</blockquote>
</section>
<section id="writing-our-nodes" class="level2">
<h2 class="anchored" data-anchor-id="writing-our-nodes">Writing Our Nodes</h2>
<section id="displaying-parameters" class="level3">
<h3 class="anchored" data-anchor-id="displaying-parameters">Displaying Parameters</h3>
<p>We’ll start small: a Node that <em>shows</em> the current simulation parameters to the user.<br>
We’ll decorate a simple Python function with <code>@as_node(...)</code>. This way, Nodeology recognizes it as a building block in the workflow.</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> chainlit <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> cl</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> chainlit <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Message, run_sync</span>
<span id="cb2-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> nodeology.node <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> as_node</span>
<span id="cb2-4"></span>
<span id="cb2-5"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@as_node</span>(sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[])</span>
<span id="cb2-6"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> display_parameters(</span>
<span id="cb2-7">    mass: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>,</span>
<span id="cb2-8">    charge: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>,</span>
<span id="cb2-9">    initial_velocity: np.ndarray,</span>
<span id="cb2-10">    E_field: np.ndarray,</span>
<span id="cb2-11">    B_field: np.ndarray,</span>
<span id="cb2-12">):</span>
<span id="cb2-13">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb2-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Display the current simulation parameters using</span></span>
<span id="cb2-15"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Chainlit's custom UI element.</span></span>
<span id="cb2-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb2-17">    parameters <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {</span>
<span id="cb2-18">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Mass (kg)"</span>: mass,</span>
<span id="cb2-19">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Charge (C)"</span>: charge,</span>
<span id="cb2-20">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Initial Velocity (m/s)"</span>: initial_velocity.tolist(),</span>
<span id="cb2-21">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Electric Field (N/C)"</span>: E_field.tolist(),</span>
<span id="cb2-22">        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Magnetic Field (T)"</span>: B_field.tolist(),</span>
<span id="cb2-23">    }</span>
<span id="cb2-24"></span>
<span id="cb2-25">    run_sync(</span>
<span id="cb2-26">        Message(</span>
<span id="cb2-27">            content<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Below are the current simulation parameters:"</span>,</span>
<span id="cb2-28">            elements<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb2-29">                cl.CustomElement(</span>
<span id="cb2-30">                    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"DataDisplay"</span>,</span>
<span id="cb2-31">                    props<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{</span>
<span id="cb2-32">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"data"</span>: parameters,</span>
<span id="cb2-33">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"title"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Particle Parameters"</span>,</span>
<span id="cb2-34">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"badge"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Configured"</span>,</span>
<span id="cb2-35">                    },</span>
<span id="cb2-36">                )</span>
<span id="cb2-37">            ],</span>
<span id="cb2-38">        ).send()</span>
<span id="cb2-39">    )</span>
<span id="cb2-40">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># No return value needed, so sink=[] above</span></span></code></pre></div></div>
</details>
<p>When the workflow runs this node, it <em>pulls the relevant fields</em> from the State, passes them to the function as arguments, and the function then sends a UI message with the data.</p>
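<p>To make the "pull fields from the State" idea concrete, here is a minimal sketch of how a sink-style decorator could work. It is an illustration only, not nodeology's implementation: <code>as_node_sketch</code>, <code>kinetic_energy</code>, and the plain-dict State are hypothetical names chosen for this example.</p>

```python
import inspect
from typing import Any, Callable, Dict, List


def as_node_sketch(sink: List[str]) -> Callable:
    """Hypothetical stand-in for a sink-style decorator (not nodeology's code)."""

    def decorator(func: Callable) -> Callable:
        # The function's argument names double as the State keys to read.
        arg_names = list(inspect.signature(func).parameters)

        def run(state: Dict[str, Any]) -> Dict[str, Any]:
            kwargs = {name: state[name] for name in arg_names}
            result = func(**kwargs)
            new_state = dict(state)  # leave the caller's State untouched
            if len(sink) == 1:
                new_state[sink[0]] = result
            elif sink:
                # Several sinks: expect one returned value per sink key.
                for key, value in zip(sink, result):
                    new_state[key] = value
            return new_state

        return run

    return decorator


@as_node_sketch(sink=["kinetic_energy"])
def kinetic_energy(mass: float, speed: float) -> float:
    return 0.5 * mass * speed**2


state = {"mass": 2.0, "speed": 3.0, "note": "unused field"}
print(kinetic_energy(state)["kinetic_energy"])  # 9.0
```

<p>With an empty <code>sink</code>, as in <code>display_parameters</code>, the sketch simply returns the State unchanged, which matches the idea of a node that only has side effects.</p>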
</section>
<section id="confirming-parameters" class="level3">
<h3 class="anchored" data-anchor-id="confirming-parameters">Confirming Parameters</h3>
<p>We want an <em>interactive step</em> where the user can say “Yes, I’m okay with these parameters” or “No, let me adjust.” This is also done with a function-based Node:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> chainlit <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> AskActionMessage</span>
<span id="cb3-2"></span>
<span id="cb3-3"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@as_node</span>(sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"confirm_parameters"</span>)</span>
<span id="cb3-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> ask_confirm_parameters():</span>
<span id="cb3-5">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb3-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Prompt user with 2 actions: 'Yes' or 'No'.</span></span>
<span id="cb3-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Return True if 'Yes', False otherwise.</span></span>
<span id="cb3-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb3-9">    res <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_sync(</span>
<span id="cb3-10">        AskActionMessage(</span>
<span id="cb3-11">            content<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Are you happy with the parameters?"</span>,</span>
<span id="cb3-12">            actions<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb3-13">                cl.Action(name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yes"</span>, payload<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"value"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yes"</span>}, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Yes"</span>),</span>
<span id="cb3-14">                cl.Action(name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"no"</span>,  payload<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"value"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"no"</span>},  label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"No"</span>),</span>
<span id="cb3-15">            ],</span>
<span id="cb3-16">        ).send()</span>
<span id="cb3-17">    )</span>
<span id="cb3-18">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># We store the boolean in 'confirm_parameters' in the State</span></span>
<span id="cb3-19">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> res <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">and</span> res[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"payload"</span>][<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"value"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yes"</span>:</span>
<span id="cb3-20">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb3-21">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb3-22">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span></code></pre></div></div>
</details>
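<p>The yes/no decision can also be factored into a small pure function, which makes it easy to test without a running Chainlit session. This is a sketch under the assumption that the reply is a dict-like object shaped as in the code above (a <code>"payload"</code> entry holding <code>{"value": ...}</code>) and that a missing reply arrives as <code>None</code>; <code>is_confirmed</code> is a hypothetical helper name.</p>

```python
from typing import Any, Mapping, Optional


def is_confirmed(res: Optional[Mapping[str, Any]]) -> bool:
    """Return True only when the reply carries payload {"value": "yes"}."""
    if not res:
        # No reply at all, for example because the prompt timed out.
        return False
    payload = res.get("payload") or {}
    return payload.get("value") == "yes"


print(is_confirmed({"payload": {"value": "yes"}}))  # True
print(is_confirmed({"payload": {"value": "no"}}))   # False
print(is_confirmed(None))                           # False
```

<p>Using <code>.get</code> here means a reply without a payload is treated as "No" instead of raising a <code>KeyError</code>.</p>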
</section>
<section id="gathering-user-edits" class="level3">
<h3 class="anchored" data-anchor-id="gathering-user-edits">Gathering User Edits</h3>
<p>If the user chooses “No,” we want to <em>ask</em> for new parameters:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> chainlit <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> AskUserMessage</span>
<span id="cb4-2"></span>
<span id="cb4-3"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@as_node</span>(sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"human_input"</span>])</span>
<span id="cb4-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> ask_parameters_input():</span>
<span id="cb4-5">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb4-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Wait for a text message from the user specifying</span></span>
<span id="cb4-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    how they want to update the parameters.</span></span>
<span id="cb4-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb4-9">    user_msg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_sync(</span>
<span id="cb4-10">        AskUserMessage(</span>
<span id="cb4-11">            content<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Please let me know how you want to change any of the parameters :)"</span>,</span>
<span id="cb4-12">        ).send()</span>
<span id="cb4-13">    )</span>
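<span id="cb4-14">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Stored in 'human_input'; fall back to "" if no reply arrives (e.g. a timeout)</span></span>
<span id="cb4-15">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> user_msg[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"output"</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> user_msg <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span></span></code></pre></div></div>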
</details>
</section>
<section id="updating-parameters" class="level3">
<h3 class="anchored" data-anchor-id="updating-parameters">Updating Parameters</h3>
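<p>Now we let the <em>LLM</em> parse the user’s text and produce new numeric values. A typical instruction might be <code>"Change the magnetic field to 1e4 1e4 0"</code>. The LLM is asked to return a JSON object containing all five parameters (<code>mass</code>, <code>charge</code>, <code>initial_velocity</code>, <code>E_field</code>, and <code>B_field</code>) with the requested changes applied; a post-processing function then writes those values back into the State.</p>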
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> nodeology.node <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Node</span>
<span id="cb5-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> json</span>
<span id="cb5-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb5-4"></span>
<span id="cb5-5">parameters_updater <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Node(</span>
<span id="cb5-6">    node_type<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"parameters_updater"</span>,</span>
<span id="cb5-7">    prompt_template<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""Update the parameters based on the user's input.</span></span>
<span id="cb5-8"></span>
<span id="cb5-9"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">Current parameters:</span></span>
<span id="cb5-10"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">mass: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{mass}</span></span>
<span id="cb5-11"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">charge: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{charge}</span></span>
<span id="cb5-12"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">initial_velocity: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{initial_velocity}</span></span>
<span id="cb5-13"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">E_field: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{E_field}</span></span>
<span id="cb5-14"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">B_field: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{B_field}</span></span>
<span id="cb5-15"></span>
<span id="cb5-16"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">User input:</span></span>
<span id="cb5-17"><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{human_input}</span></span>
<span id="cb5-18"></span>
<span id="cb5-19"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">Please return the updated parameters in JSON format.</span></span>
<span id="cb5-20"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb5-21"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    "mass": float,</span></span>
<span id="cb5-22"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    "charge": float,</span></span>
<span id="cb5-23"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    "initial_velocity": list[float],</span></span>
<span id="cb5-24"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    "E_field": list[float],</span></span>
<span id="cb5-25"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">    "B_field": list[float]</span></span>
<span id="cb5-26"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">}</span></span>
<span id="cb5-27"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""</span>,</span>
<span id="cb5-28">    sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"parameters_updater_output"</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># We'll store the LLM response here (raw text/JSON)</span></span>
<span id="cb5-29">    sink_format<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"json"</span>,</span>
<span id="cb5-30">)</span>
<span id="cb5-31"></span>
<span id="cb5-32"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># We'll need a post-processing function to interpret that JSON</span></span>
<span id="cb5-33"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> parameters_updater_transform(state, client, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span>kwargs):</span>
<span id="cb5-34">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Convert the LLM's output from text to Python objects</span></span>
<span id="cb5-35">    params_dict <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> json.loads(state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"parameters_updater_output"</span>])</span>
<span id="cb5-36">    state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mass"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> params_dict[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mass"</span>]</span>
<span id="cb5-37">    state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"charge"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> params_dict[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"charge"</span>]</span>
<span id="cb5-38">    state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"initial_velocity"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(params_dict[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"initial_velocity"</span>])</span>
<span id="cb5-39">    state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"E_field"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(params_dict[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"E_field"</span>])</span>
<span id="cb5-40">    state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"B_field"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(params_dict[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"B_field"</span>])</span>
<span id="cb5-41">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> state</span>
<span id="cb5-42"></span>
<span id="cb5-43">parameters_updater.post_process <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> parameters_updater_transform</span></code></pre></div></div>
</details>
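<p>LLM output is not guaranteed to follow the requested schema, so it can help to validate the reply before it reaches the solver. The following is a minimal sketch of such a check, independent of nodeology; <code>parse_parameters</code>, <code>SCALARS</code>, and <code>VECTORS</code> are hypothetical names, and the rules (all five keys present, three-component vectors, non-zero mass) are assumptions based on how the parameters are used in this post.</p>

```python
import json
from typing import Any, Dict

SCALARS = ("mass", "charge")
VECTORS = ("initial_velocity", "E_field", "B_field")


def parse_parameters(raw: str) -> Dict[str, Any]:
    """Parse and validate the JSON reply; raise ValueError on bad input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    missing = [key for key in SCALARS + VECTORS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")

    try:
        params: Dict[str, Any] = {key: float(data[key]) for key in SCALARS}
        for key in VECTORS:
            params[key] = [float(x) for x in data[key]]
    except (TypeError, ValueError) as exc:
        raise ValueError(f"non-numeric parameter value: {exc}") from exc

    for key in VECTORS:
        if len(params[key]) != 3:
            raise ValueError(f"{key} must have exactly 3 components")
    if params["mass"] == 0:
        # The equations of motion divide by mass.
        raise ValueError("mass must be non-zero")
    return params


raw = (
    '{"mass": 9.11e-31, "charge": -1.6e-19, "initial_velocity": [1e6, 0, 0], '
    '"E_field": [0, 0, 0], "B_field": [1e4, 1e4, 0]}'
)
params = parse_parameters(raw)
print(params["B_field"])  # [10000.0, 10000.0, 0.0]
```

<p>A post-processing function could call a check like this and, on <code>ValueError</code>, keep the previous parameters and ask the user again instead of passing malformed values downstream.</p>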
</section>
<section id="calculating-the-trajectory" class="level3">
<h3 class="anchored" data-anchor-id="calculating-the-trajectory">Calculating the Trajectory</h3>
<p>Next comes a <em>pure Python</em> function that numerically integrates the equations of motion of a charged particle under the Lorentz force. This is the “traditional” science piece:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> tempfile</span>
<span id="cb6-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.integrate <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> solve_ivp</span>
<span id="cb6-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> List</span>
<span id="cb6-4"></span>
<span id="cb6-5"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@as_node</span>(sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"positions"</span>])</span>
<span id="cb6-6"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> calculate_trajectory(</span>
<span id="cb6-7">    mass: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>,</span>
<span id="cb6-8">    charge: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>,</span>
<span id="cb6-9">    initial_velocity: np.ndarray,</span>
<span id="cb6-10">    E_field: np.ndarray,</span>
<span id="cb6-11">    B_field: np.ndarray,</span>
<span id="cb6-12">) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> List[np.ndarray]:</span>
<span id="cb6-13">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb6-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Numerically integrate the trajectory of a particle under</span></span>
<span id="cb6-15"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    electric &amp; magnetic fields, returning the positions over time.</span></span>
<span id="cb6-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb6-17">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Estimate the cyclotron period; fall back to 1e-6 s when B or charge is zero</span></span>
<span id="cb6-18">    B_magnitude <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linalg.norm(B_field)</span>
<span id="cb6-19">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> B_magnitude <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">or</span> charge <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>:</span>
<span id="cb6-20">        cyclotron_period <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span></span>
<span id="cb6-21">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb6-22">        cyclotron_frequency <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(charge) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> B_magnitude <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> mass</span>
<span id="cb6-23">        cyclotron_period    <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> cyclotron_frequency</span>
<span id="cb6-24"></span>
<span id="cb6-25">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Simulate 5 cyclotron periods, sampled at 100 points per period</span></span>
<span id="cb6-26">    num_periods <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span></span>
<span id="cb6-27">    num_points_per_period <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span></span>
<span id="cb6-28">    total_time <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> num_periods <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> cyclotron_period</span>
<span id="cb6-29">    total_points <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> num_periods <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> num_points_per_period</span>
<span id="cb6-30"></span>
<span id="cb6-31">    time_points <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, total_time, total_points)</span>
<span id="cb6-32"></span>
<span id="cb6-33">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> lorentz_force(t, state):</span>
<span id="cb6-34">        vel <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> state[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>:]</span>
<span id="cb6-35">        force <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> charge <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (E_field <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> np.cross(vel, B_field))</span>
<span id="cb6-36">        acc <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> force <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> mass</span>
<span id="cb6-37">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> np.concatenate([vel, acc])</span>
<span id="cb6-38"></span>
<span id="cb6-39">    initial_position <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>])</span>
<span id="cb6-40">    initial_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([initial_position, initial_velocity])</span>
<span id="cb6-41"></span>
<span id="cb6-42">    sol <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> solve_ivp(</span>
<span id="cb6-43">        lorentz_force,</span>
<span id="cb6-44">        (time_points[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], time_points[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),</span>
<span id="cb6-45">        initial_state,</span>
<span id="cb6-46">        t_eval<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>time_points,</span>
<span id="cb6-47">        method<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"RK45"</span>,</span>
<span id="cb6-48">        rtol<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-8</span>,</span>
<span id="cb6-49">    )</span>
<span id="cb6-50"></span>
<span id="cb6-51">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> sol.success:</span>
<span id="cb6-52">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Return zeros if something fails</span></span>
<span id="cb6-53">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> [np.zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(time_points))]</span>
<span id="cb6-54"></span>
<span id="cb6-55">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> [sol.y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, i] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(time_points))]</span></code></pre></div></div>
</details>
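<p>A quick way to sanity-check the integrator is to compare it with the closed-form results for a uniform magnetic field and zero electric field: the period is T = 2πm / (|q||B|), and a particle moving perpendicular to B traces a circle of radius r = m·v / (|q||B|). The sketch below computes both with the standard library only; <code>cyclotron_period</code> and <code>larmor_radius</code> are hypothetical helper names, and the electron constants are rounded values used for illustration.</p>

```python
import math
from typing import Sequence


def cyclotron_period(mass: float, charge: float, B_field: Sequence[float]) -> float:
    """T = 2*pi*m / (|q| * |B|); needs a non-zero charge and field."""
    B_magnitude = math.sqrt(sum(b * b for b in B_field))
    if B_magnitude == 0 or charge == 0:
        raise ValueError("no cyclotron motion without charge and magnetic field")
    return 2 * math.pi * mass / (abs(charge) * B_magnitude)


def larmor_radius(
    mass: float, charge: float, v_perp: float, B_field: Sequence[float]
) -> float:
    """r = m * v_perp / (|q| * |B|), written as v_perp * T / (2*pi)."""
    return v_perp * cyclotron_period(mass, charge, B_field) / (2 * math.pi)


# Electron (rounded constants) in a 1 T field along z, moving at 1e6 m/s in x.
m_e, q_e = 9.109e-31, -1.602e-19
T = cyclotron_period(m_e, q_e, (0.0, 0.0, 1.0))
r = larmor_radius(m_e, q_e, 1e6, (0.0, 0.0, 1.0))
print(f"{T:.3e} s, {r:.3e} m")  # 3.573e-11 s, 5.686e-06 m
```

<p>For these inputs, the positions returned by <code>calculate_trajectory</code> should stay in the x-y plane and span roughly 2r across the orbit, which gives a simple check on the numerical solution.</p>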
</section>
<section id="plotting-the-trajectory" class="level3">
<h3 class="anchored" data-anchor-id="plotting-the-trajectory">Plotting the Trajectory</h3>
<p>We can create a <em>plot</em> (e.g., Plotly 3D scatter) and display it through Chainlit:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> plotly.graph_objects <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> go</span>
<span id="cb7-2"></span>
<span id="cb7-3"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@as_node</span>(sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"trajectory_plot"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"trajectory_plot_path"</span>])</span>
<span id="cb7-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> plot_trajectory(positions: List[np.ndarray]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">tuple</span>:</span>
<span id="cb7-5">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb7-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Generate a 3D plot of the positions and</span></span>
<span id="cb7-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    save it to a temporary file.</span></span>
<span id="cb7-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb7-9">    arr <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(positions)</span>
<span id="cb7-10">    fig <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> go.Figure(</span>
<span id="cb7-11">        data<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb7-12">            go.Scatter3d(</span>
<span id="cb7-13">                x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>arr[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],</span>
<span id="cb7-14">                y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>arr[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],</span>
<span id="cb7-15">                z<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>arr[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>],</span>
<span id="cb7-16">                mode<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lines"</span>,</span>
<span id="cb7-17">                line<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">dict</span>(width<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>),</span>
<span id="cb7-18">            )</span>
<span id="cb7-19">        ]</span>
<span id="cb7-20">    )</span>
<span id="cb7-21">    fig.update_layout(</span>
<span id="cb7-22">        scene<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">dict</span>(xaxis_title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"X (m)"</span>, yaxis_title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Y (m)"</span>, zaxis_title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Z (m)"</span>)</span>
<span id="cb7-23">    )</span>
<span id="cb7-24"></span>
<span id="cb7-25">    image_path <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tempfile.mktemp(suffix<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".png"</span>)</span>
<span id="cb7-26">    fig.write_image(image_path)</span>
<span id="cb7-27"></span>
<span id="cb7-28">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Display in UI</span></span>
<span id="cb7-29">    run_sync(</span>
<span id="cb7-30">        Message(</span>
<span id="cb7-31">            content<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Below is the trajectory plot:"</span>,</span>
<span id="cb7-32">            elements<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[cl.Plotly(figure<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>fig)],</span>
<span id="cb7-33">        ).send()</span>
<span id="cb7-34">    )</span>
<span id="cb7-35"></span>
<span id="cb7-36">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Return figure + path</span></span>
<span id="cb7-37">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> fig, image_path</span></code></pre></div></div>
</details>
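<p>One detail in the code above deserves a note: <code>tempfile.mktemp</code> only invents a file name, and the Python documentation marks it as deprecated because another process could claim that name first. Below is a minimal sketch of an alternative, assuming all you need is a path to hand to <code>fig.write_image</code>. The helper name <code>make_temp_png_path</code> is hypothetical and not part of Nodeology or Chainlit.</p>

```python
import os
import tempfile


def make_temp_png_path() -> str:
    """Create an empty temporary .png file and return its path.

    Hypothetical helper: mkstemp creates the file atomically, so the
    returned name cannot be taken by another process in the meantime.
    """
    fd, path = tempfile.mkstemp(suffix=".png")
    os.close(fd)  # only the path is needed; the plotting code reopens the file
    return path
```

<p>Inside <code>plot_trajectory</code>, the call <code>tempfile.mktemp(suffix=".png")</code> could then be swapped for <code>make_temp_png_path()</code>. The file is not deleted automatically, so remove it once the analysis no longer needs it.</p>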
</section>
<section id="analyzing-the-trajectory" class="level3">
<h3 class="anchored" data-anchor-id="analyzing-the-trajectory">Analyzing the Trajectory</h3>
<p>Next, we define a prompt-based Node in which the <strong>LLM</strong> interprets the plotted motion qualitatively and returns its analysis as JSON:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">trajectory_analyzer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Node(</span>
<span id="cb8-2">    node_type<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"trajectory_analyzer"</span>,</span>
<span id="cb8-3">    prompt_template<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"""Analyze this particle trajectory plot.</span></span>
<span id="cb8-4"></span>
<span id="cb8-5"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">Please determine:</span></span>
<span id="cb8-6"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">1. The type of motion (linear, circular, helical, or chaotic)</span></span>
<span id="cb8-7"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">2. Key physical features (radius, period, pitch angle if applicable)</span></span>
<span id="cb8-8"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">3. Explanation of the motion</span></span>
<span id="cb8-9"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">4. Anomalies in the motion</span></span>
<span id="cb8-10"></span>
<span id="cb8-11"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">Output in JSON format:</span></span>
<span id="cb8-12"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb8-13"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  "trajectory_type": "type_name",</span></span>
<span id="cb8-14"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  "key_features": {</span></span>
<span id="cb8-15"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">     "feature1": value,</span></span>
<span id="cb8-16"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">     "feature2": value</span></span>
<span id="cb8-17"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  },</span></span>
<span id="cb8-18"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  "explanation": "detailed explanation",</span></span>
<span id="cb8-19"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">  "anomalies": "anomaly description"</span></span>
<span id="cb8-20"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">}"""</span>,</span>
<span id="cb8-21">    sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analysis_result"</span>,</span>
<span id="cb8-22">    sink_format<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"json"</span>,</span>
<span id="cb8-23">    image_keys<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"trajectory_plot_path"</span>],  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># for image-based context, if the LLM can handle it</span></span>
<span id="cb8-24">)</span>
<span id="cb8-25"></span>
<span id="cb8-26"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> display_trajectory_analyzer_result(state, client, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span>kwargs):</span>
<span id="cb8-27">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-28"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    A post-process function that picks up the LLM's</span></span>
<span id="cb8-29"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    output, converts it to a dict, and displays it nicely.</span></span>
<span id="cb8-30"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb8-31">    <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> json</span>
<span id="cb8-32">    state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analysis_result"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> json.loads(state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analysis_result"</span>])</span>
<span id="cb8-33">    run_sync(</span>
<span id="cb8-34">        Message(</span>
<span id="cb8-35">            content<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Here is the trajectory analysis:"</span>,</span>
<span id="cb8-36">            elements<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb8-37">                cl.CustomElement(</span>
<span id="cb8-38">                    name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"DataDisplay"</span>,</span>
<span id="cb8-39">                    props<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{</span>
<span id="cb8-40">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"data"</span>: state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analysis_result"</span>],</span>
<span id="cb8-41">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"title"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Trajectory Analysis"</span>,</span>
<span id="cb8-42">                        <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"badge"</span>: state[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analysis_result"</span>].get(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"trajectory_type"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Unknown"</span>),</span>
<span id="cb8-43">                    },</span>
<span id="cb8-44">                )</span>
<span id="cb8-45">            ],</span>
<span id="cb8-46">        ).send()</span>
<span id="cb8-47">    )</span>
<span id="cb8-48">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> state</span>
<span id="cb8-49"></span>
<span id="cb8-50">trajectory_analyzer.post_process <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> display_trajectory_analyzer_result</span></code></pre></div></div>
</details>
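<p>The post-process function above calls <code>json.loads</code> directly on the model output, which raises an exception if the reply is wrapped in a Markdown code fence or contains stray text. Whether that can happen with <code>sink_format="json"</code> depends on the model and on how Nodeology cleans the reply, so treat the following as an optional safeguard rather than a required step. The helper name <code>parse_llm_json</code> and its fallback dictionary are assumptions made for illustration.</p>

```python
import json

# Markdown code-fence marker, built at runtime so this snippet stays fence-safe
FENCE = "`" * 3


def parse_llm_json(raw: str) -> dict:
    """Parse an LLM reply as a JSON object, tolerating a surrounding code fence.

    Hypothetical helper: falls back to a small dict that keeps the raw
    text when the reply is not a valid JSON object.
    """
    fallback = {"trajectory_type": "Unknown", "explanation": raw}
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, which may carry a language tag
        text = text[len(FENCE):]
        first_newline = text.find("\n")
        if first_newline != -1:
            text = text[first_newline + 1:]
        # Drop the closing fence if present
        text = text.rstrip()
        if text.endswith(FENCE):
            text = text[:-len(FENCE)]
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return fallback
    return parsed if isinstance(parsed, dict) else fallback
```

<p>In <code>display_trajectory_analyzer_result</code>, replacing <code>json.loads(state["analysis_result"])</code> with <code>parse_llm_json(state["analysis_result"])</code> would keep the UI step from failing on a malformed reply, and the badge would then show "Unknown".</p>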
</section>
<section id="asking-for-more-simulation-or-finish" class="level3">
<h3 class="anchored" data-anchor-id="asking-for-more-simulation-or-finish">Asking for More Simulation or Finish</h3>
<p>We close the loop by letting the user choose whether to run another simulation or finish:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@as_node</span>(sink<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"continue_simulation"</span>)</span>
<span id="cb9-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> ask_continue_simulation():</span>
<span id="cb9-3">    res <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_sync(</span>
<span id="cb9-4">        AskActionMessage(</span>
<span id="cb9-5">            content<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Would you like to continue the simulation?"</span>,</span>
<span id="cb9-6">            actions<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[</span>
<span id="cb9-7">                cl.Action(name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"continue"</span>, payload<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"value"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"continue"</span>}, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Continue Simulation"</span>),</span>
<span id="cb9-8">                cl.Action(name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"finish"</span>,   payload<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"value"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"finish"</span>},   label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Finish"</span>),</span>
<span id="cb9-9">            ],</span>
<span id="cb9-10">        ).send()</span>
<span id="cb9-11">    )</span>
<span id="cb9-12">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> (res <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">and</span> res[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"payload"</span>][<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"value"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"continue"</span>)</span></code></pre></div></div>
</details>
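<p>The return expression above relies on short-circuiting: when <code>res</code> is empty (for example, if the prompt times out), the Node stores that falsy value instead of a strict boolean. The sketch below spells out the same decision as a plain function that always returns <code>True</code> or <code>False</code>. It assumes the response behaves like a dictionary with a <code>"payload"</code> entry, as the indexing in the original code suggests, and the name <code>wants_to_continue</code> is hypothetical.</p>

```python
from typing import Any, Mapping, Optional


def wants_to_continue(res: Optional[Mapping[str, Any]]) -> bool:
    """Return True only when the user picked the "continue" action."""
    if not res:
        # No answer at all (for example, the prompt timed out)
        return False
    payload = res.get("payload") or {}
    return payload.get("value") == "continue"
```

<p>Because the conditional flow only tests the stored value for truthiness, both versions route the workflow the same way. The explicit form mainly makes the stored state easier to inspect.</p>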
</section>
</section>
<section id="orchestrating-everything-in-a-workflow" class="level2">
<h2 class="anchored" data-anchor-id="orchestrating-everything-in-a-workflow">Orchestrating Everything in a Workflow</h2>
<p>Now we piece all the Nodes together in a <code>Workflow</code> by subclassing it and defining <code>create_workflow</code>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> langgraph.graph <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> END</span>
<span id="cb10-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> nodeology.workflow <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Workflow</span>
<span id="cb10-3"></span>
<span id="cb10-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> TrajectoryWorkflow(Workflow):</span>
<span id="cb10-5">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> create_workflow(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb10-6">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 1) Register our Nodes</span></span>
<span id="cb10-7">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"display_parameters"</span>, display_parameters)</span>
<span id="cb10-8">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_confirm_parameters"</span>, ask_confirm_parameters)</span>
<span id="cb10-9">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_parameters_input"</span>, ask_parameters_input)</span>
<span id="cb10-10">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"update_parameters"</span>, parameters_updater)</span>
<span id="cb10-11">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"calculate_trajectory"</span>, calculate_trajectory)</span>
<span id="cb10-12">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"plot_trajectory"</span>, plot_trajectory)</span>
<span id="cb10-13">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analyze_trajectory"</span>, trajectory_analyzer)</span>
<span id="cb10-14">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_node(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_continue_simulation"</span>, ask_continue_simulation)</span>
<span id="cb10-15"></span>
<span id="cb10-16">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 2) Define the flow (edges)</span></span>
<span id="cb10-17">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_flow(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"display_parameters"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_confirm_parameters"</span>)</span>
<span id="cb10-18">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_conditional_flow(</span>
<span id="cb10-19">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_confirm_parameters"</span>,</span>
<span id="cb10-20">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"confirm_parameters"</span>,       <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># If user is happy</span></span>
<span id="cb10-21">            then<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"calculate_trajectory"</span>,</span>
<span id="cb10-22">            otherwise<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_parameters_input"</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># If not</span></span>
<span id="cb10-23">        )</span>
<span id="cb10-24">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_flow(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_parameters_input"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"update_parameters"</span>)</span>
<span id="cb10-25">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_flow(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"update_parameters"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"display_parameters"</span>)</span>
<span id="cb10-26">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_flow(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"calculate_trajectory"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"plot_trajectory"</span>)</span>
<span id="cb10-27">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_flow(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"plot_trajectory"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analyze_trajectory"</span>)</span>
<span id="cb10-28">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_flow(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"analyze_trajectory"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_continue_simulation"</span>)</span>
<span id="cb10-29">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.add_conditional_flow(</span>
<span id="cb10-30">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ask_continue_simulation"</span>,</span>
<span id="cb10-31">            <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"continue_simulation"</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># If user wants to keep going</span></span>
<span id="cb10-32">            then<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"display_parameters"</span>,</span>
<span id="cb10-33">            otherwise<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>END,         <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># If user says “finish”</span></span>
<span id="cb10-34">        )</span>
<span id="cb10-35"></span>
<span id="cb10-36">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 3) Set the first Node to run</span></span>
<span id="cb10-37">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.set_entry(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"display_parameters"</span>)</span>
<span id="cb10-38"></span>
<span id="cb10-39">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 4) Compile</span></span>
<span id="cb10-40">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">compile</span>()</span></code></pre></div></div>
<p><strong>Note</strong>: Once compiled, Nodeology organizes the Nodes as a state machine with branching logic, so a run can loop back to parameter entry or start another simulation cycle until the user chooses to finish.</p>
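<p>To make the branching concrete, here is a toy, framework-free sketch of the same graph. It is not how Nodeology or LangGraph implement the workflow. It only replays the edges declared above with scripted user decisions, and the names <code>EDGES</code>, <code>BRANCHES</code>, and <code>walk</code> are hypothetical.</p>

```python
END = "END"

# Unconditional edges, mirroring the add_flow calls above
EDGES = {
    "display_parameters": "ask_confirm_parameters",
    "ask_parameters_input": "update_parameters",
    "update_parameters": "display_parameters",
    "calculate_trajectory": "plot_trajectory",
    "plot_trajectory": "analyze_trajectory",
    "analyze_trajectory": "ask_continue_simulation",
}

# Conditional edges: node -> (state key, target if truthy, target otherwise)
BRANCHES = {
    "ask_confirm_parameters": (
        "confirm_parameters", "calculate_trajectory", "ask_parameters_input"
    ),
    "ask_continue_simulation": (
        "continue_simulation", "display_parameters", END
    ),
}


def walk(decisions, entry="display_parameters", max_steps=100):
    """Return the visited node names.

    `decisions` maps each state key to an iterator of scripted booleans
    that stand in for the user's answers.
    """
    path, node = [], entry
    for _ in range(max_steps):
        if node == END:
            break
        path.append(node)
        if node in BRANCHES:
            key, then, otherwise = BRANCHES[node]
            node = then if next(decisions[key]) else otherwise
        else:
            node = EDGES[node]
    return path


# The user rejects the parameters once, confirms them, then finishes
path = walk({
    "confirm_parameters": iter([False, True]),
    "continue_simulation": iter([False]),
})
print(" -> ".join(path))
```

<p>The scripted run visits <code>display_parameters</code> twice: once at the start and once after the rejected parameters have been updated.</p>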
</section>
<section id="launching-the-interactive-chat" class="level2">
<h2 class="anchored" data-anchor-id="launching-the-interactive-chat">Launching the Interactive Chat</h2>
<p>At last, we bring it all together. We instantiate our workflow with an initial set of parameters and run it in <em>UI mode</em>:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 1) Create the workflow</span></span>
<span id="cb11-2">workflow <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> TrajectoryWorkflow(</span>
<span id="cb11-3">    state_defs<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>TrajectoryState,       <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Our custom state</span></span>
<span id="cb11-4">    llm_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"gemini/gemini-2.0-flash"</span>,</span>
<span id="cb11-5">    vlm_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"gemini/gemini-2.0-flash"</span>,</span>
<span id="cb11-6">    debug_mode<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb11-7">)</span>
<span id="cb11-8"></span>
<span id="cb11-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 2) Provide initial data</span></span>
<span id="cb11-10">initial_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {</span>
<span id="cb11-11">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mass"</span>: <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">9.1093837015e-31</span>,  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># electron mass</span></span>
<span id="cb11-12">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"charge"</span>: <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.602176634e-19</span>,</span>
<span id="cb11-13">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"initial_velocity"</span>: np.array([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e6</span>]),</span>
<span id="cb11-14">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"E_field"</span>: np.array([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5e6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e6</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5e6</span>]),</span>
<span id="cb11-15">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"B_field"</span>: np.array([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">50000.0</span>]),</span>
<span id="cb11-16">}</span>
<span id="cb11-17"></span>
<span id="cb11-18"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 3) Run with a user interface</span></span>
<span id="cb11-19">result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> workflow.run(init_values<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>initial_state, ui<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span></code></pre></div></div>
</details>
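<p>Before launching the workflow, it can help to estimate what the analysis step should report for these initial values. The sketch below is not part of Nodeology. It uses only the standard library, assumes SI units and a uniform magnetic field along <code>z</code>, and applies the textbook nonrelativistic formulas for gyrofrequency, gyroradius, and E×B drift speed. The helper name <code>gyration_estimates</code> is hypothetical, plain tuples stand in for the NumPy arrays, and the gyroradius ignores the small correction from the drift velocity.</p>

```python
import math

# Values copied from initial_state above (SI units are an assumption).
mass = 9.1093837015e-31        # kg
charge = -1.602176634e-19      # C
velocity = (1e6, 1e6, 1e6)     # m/s
e_field = (5e6, 1e6, 5e6)      # V/m
b_field = (0.0, 0.0, 50000.0)  # T, directed along z


def gyration_estimates(mass, charge, velocity, e_field, b_field):
    """Rough nonrelativistic estimates for a uniform B field along z."""
    b_mag = abs(b_field[2])
    omega = abs(charge) * b_mag / mass              # gyrofrequency in rad/s
    v_perp = math.hypot(velocity[0], velocity[1])   # speed perpendicular to B
    e_perp = math.hypot(e_field[0], e_field[1])     # E component perpendicular to B
    return {
        "gyrofrequency_rad_s": omega,
        "period_s": 2 * math.pi / omega,
        "gyroradius_m": v_perp / omega,
        "exb_drift_m_s": e_perp / b_mag,
    }


estimates = gyration_estimates(mass, charge, velocity, e_field, b_field)
for name, value in estimates.items():
    print(f"{name}: {value:.3e}")
```

<p>Because the velocity has a component along the magnetic field, the expected motion is helical, so these numbers give a rough reference for judging the radius and period that the LLM extracts from the plot.</p>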
<p>When you run this file, a <em>Chainlit</em> web server starts and its address appears in your console logs, typically <code>http://localhost:8000</code>. Open that address in your browser to use the complete pipeline, which combines human interaction, classical simulation, and LLM-based analysis. The user sees a simple chat interface while Nodeology coordinates each Node of the workflow behind the scenes. Below is a short video showing the workflow in action:</p>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/4c-TmLCWd_U" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
</section>
<section id="visualizing-exporting-and-sharing" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="visualizing-exporting-and-sharing">Visualizing, Exporting and Sharing</h2>
<p>You can visualize your workflow as a <code>mermaid</code> graph or export it as a <code>.yaml</code> file:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Visualize workflow as a mermaid graph</span></span>
<span id="cb12-2">workflow.graph.get_graph().draw_mermaid_png(</span>
<span id="cb12-3">    output_file_path<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"particle_trajectory_analysis.png"</span></span>
<span id="cb12-4">)</span></code></pre></div></div>
</details>
<div class="quarto-figure quarto-figure-center page-columns page-full">
<figure class="figure page-columns page-full">
<p><img src="https://xiangyu-yin.com/content/img/particle_trajectory_analysis_flowchart.png" class="img-fluid figure-img" style="width:60.0%"></p>
<figcaption class="margin-caption">Exported Workflow Flowchart</figcaption>
</figure>
</div>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Export workflow to YAML file for sharing</span></span>
<span id="cb13-2">workflow.to_yaml(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"particle_trajectory_analysis.yaml"</span>)</span></code></pre></div></div>
</details>
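<p>The exported file, shown in full in the next collapsible block, records each node's successor under a <code>next</code> key, either as a single node name or as a <code>condition</code>/<code>then</code>/<code>otherwise</code> mapping. The sketch below is not Nodeology's loader. It is a minimal illustration of how to read that structure, using a hand-copied subset of the <code>nodes</code> mapping as a plain dictionary. The helper <code>list_edges</code> is hypothetical, and treating <code>then</code> as the branch taken when the condition is true is an assumption based on the key names.</p>

```python
# Hand-copied subset of the `nodes` mapping from the exported YAML.
nodes = {
    "display_parameters": {"next": "ask_confirm_parameters"},
    "ask_confirm_parameters": {
        "next": {
            "condition": "confirm_parameters",
            "then": "calculate_trajectory",
            "otherwise": "ask_parameters_input",
        }
    },
    "ask_parameters_input": {"next": "update_parameters"},
    "update_parameters": {"next": "display_parameters"},
}


def list_edges(nodes):
    """Flatten the `next` entries into (source, target, label) tuples."""
    edges = []
    for name, spec in nodes.items():
        successor = spec.get("next")
        if isinstance(successor, str):
            # Unconditional transition to a single node.
            edges.append((name, successor, "always"))
        elif isinstance(successor, dict):
            # Conditional transition: one edge per branch.
            condition = successor["condition"]
            edges.append((name, successor["then"], f"if {condition}"))
            edges.append((name, successor["otherwise"], f"if not {condition}"))
    return edges


edges = list_edges(nodes)
for source, target, label in edges:
    print(f"{source} -> {target} ({label})")
```

<p>Listing the transitions this way is a quick check that a shared file encodes the same loop you see in the flowchart: parameters are displayed, confirmed or edited, and then passed to the trajectory calculation.</p>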
<details>
<summary>
Click to expand the YAML file
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode yaml code-with-copy"><code class="sourceCode yaml"><span id="cb14-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Example of the exported YAML file</span></span>
<span id="cb14-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">name</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> TrajectoryWorkflow_03_13_2025_20_06_45</span></span>
<span id="cb14-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">state_defs</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">current_node_type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-5"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">previous_node_type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-6"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">human_input</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-7"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">input</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-8"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">output</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-9"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">messages</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> List[dict]</span></span>
<span id="cb14-10"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mass</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> float</span></span>
<span id="cb14-11"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">charge</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> float</span></span>
<span id="cb14-12"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">initial_velocity</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ndarray</span></span>
<span id="cb14-13"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">E_field</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ndarray</span></span>
<span id="cb14-14"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">B_field</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ndarray</span></span>
<span id="cb14-15"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">confirm_parameters</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> bool</span></span>
<span id="cb14-16"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">parameters_updater_output</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-17"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">positions</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> List[ndarray]</span></span>
<span id="cb14-18"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">trajectory_plot</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-19"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">trajectory_plot_path</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> str</span></span>
<span id="cb14-20"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">analysis_result</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> dict</span></span>
<span id="cb14-21"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">-</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">continue_simulation</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> bool</span></span>
<span id="cb14-22"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">nodes</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-23"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">display_parameters</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-24"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> display_parameters</span></span>
<span id="cb14-25"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ask_confirm_parameters</span></span>
<span id="cb14-26"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ask_confirm_parameters</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-27"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ask_confirm_parameters</span></span>
<span id="cb14-28"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> confirm_parameters</span></span>
<span id="cb14-29"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-30"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">      </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">condition</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> confirm_parameters</span></span>
<span id="cb14-31"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">      </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">then</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> calculate_trajectory</span></span>
<span id="cb14-32"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">      </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">otherwise</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ask_parameters_input</span></span>
<span id="cb14-33"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ask_parameters_input</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-34"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ask_parameters_input</span></span>
<span id="cb14-35"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> human_input</span></span>
<span id="cb14-36"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> update_parameters</span></span>
<span id="cb14-37"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">update_parameters</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-38"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> prompt</span></span>
<span id="cb14-39"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">template</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Update the parameters based on the user</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">''</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">s input. Current parameters:</span></span>
<span id="cb14-40"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      mass: {mass} charge: {charge} initial_velocity: {initial_velocity} E_field:</span></span>
<span id="cb14-41"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      {E_field} B_field: {B_field} User input: {human_input} Please return the updated</span></span>
<span id="cb14-42"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      parameters in JSON format. {{ "mass": float, "charge": float, "initial_velocity":</span></span>
<span id="cb14-43"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      list[float], "E_field": list[float], "B_field": list[float] }}'</span></span>
<span id="cb14-44"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> parameters_updater_output</span></span>
<span id="cb14-45"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> display_parameters</span></span>
<span id="cb14-46"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_trajectory</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-47"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> calculate_trajectory</span></span>
<span id="cb14-48"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> positions</span></span>
<span id="cb14-49"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> plot_trajectory</span></span>
<span id="cb14-50"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_trajectory</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-51"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> plot_trajectory</span></span>
<span id="cb14-52"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">[</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">trajectory_plot</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">,</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> trajectory_plot_path</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">]</span></span>
<span id="cb14-53"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> analyze_trajectory</span></span>
<span id="cb14-54"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">analyze_trajectory</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-55"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> prompt</span></span>
<span id="cb14-56"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">template</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Analyze this particle trajectory plot. Please determine: 1. The type</span></span>
<span id="cb14-57"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      of motion (linear, circular, helical, or chaotic) 2. Key physical features (radius,</span></span>
<span id="cb14-58"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      period, pitch angle if applicable) 3. Explanation of the motion 4. Anomalies</span></span>
<span id="cb14-59"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      in the motion Output in JSON format: {{ "trajectory_type": "type_name", "key_features":</span></span>
<span id="cb14-60"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      { "feature1": value, "feature2": value }, "explanation": "detailed explanation",</span></span>
<span id="cb14-61"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">      "anomalies": "anomaly description" }}'</span></span>
<span id="cb14-62"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> analysis_result</span></span>
<span id="cb14-63"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">image_keys</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> trajectory_plot_path</span></span>
<span id="cb14-64"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ask_continue_simulation</span></span>
<span id="cb14-65"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">  </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ask_continue_simulation</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-66"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">type</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> ask_continue_simulation</span></span>
<span id="cb14-67"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sink</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> continue_simulation</span></span>
<span id="cb14-68"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">    </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">next</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span></span>
<span id="cb14-69"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">      </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">condition</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> continue_simulation</span></span>
<span id="cb14-70"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">      </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">then</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> display_parameters</span></span>
<span id="cb14-71"><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">      </span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">otherwise</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> END</span></span>
<span id="cb14-72"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">entry_point</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> display_parameters</span></span>
<span id="cb14-73"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llm</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> gemini/gemini-2.0-flash</span></span>
<span id="cb14-74"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vlm</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> gemini/gemini-2.0-flash</span></span>
<span id="cb14-75"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exit_commands</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">:</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> </span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">[</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stop workflow</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">,</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> quit workflow</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">,</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;"> terminate workflow</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">]</span></span></code></pre></div></div>
</details>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p><strong>Happy Building!</strong> 🏗️ Nodeology is constantly evolving, so keep an eye on updates 👀 😊</p>
</div>
</div>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-reuse"><h2 class="anchored quarto-appendix-heading">Reuse</h2><div class="quarto-appendix-contents"><div><a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a></div></div></section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{yin2025,
  author = {Yin, Xiangyu},
  title = {Building an {AI-Driven} {Scientific} {Workflow} \&amp; {Chatbot}
    with {Nodeology}},
  date = {2025-03-13},
  url = {https://xiangyu-yin.com/content/post_nodeology_example.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-yin2025" class="csl-entry quarto-appendix-citeas">
Yin, Xiangyu. 2025. <span>“Building an AI-Driven Scientific Workflow
&amp; Chatbot with Nodeology.”</span> March 13. <a href="https://xiangyu-yin.com/content/post_nodeology_example.html">https://xiangyu-yin.com/content/post_nodeology_example.html</a>.
</div></div></section></div> ]]></description>
  <guid>https://xiangyu-yin.com/content/post_nodeology_example.html</guid>
  <pubDate>Thu, 13 Mar 2025 05:00:00 GMT</pubDate>
  <media:content url="https://xiangyu-yin.com/content/img/nodeology.png" medium="image" type="image/png" height="49" width="144"/>
</item>
<item>
  <title>“Deep Research” for Scientific Literature Review</title>
  <dc:creator>Xiangyu Yin</dc:creator>
  <link>https://xiangyu-yin.com/content/post_deep_research.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level1">
<h1>Introduction</h1>
<p>Artificial intelligence (AI) is increasingly playing an integral role in helping researchers navigate the ever-expanding scientific literature. Innovations in natural language processing (NLP) and large language models (LLMs) promise to streamline how we discover, interpret, and compile research findings. Among the latest developments, <strong>Deep Research</strong> features from OpenAI and Google’s Gemini aim to autonomously gather online data, analyze and synthesize information, and return comprehensive summaries or research reports.</p>
<p>In this blog, I investigate whether these “Deep Research” tools can truly meet the rigorous standards of literature reviews demanded by scientific communities—particularly in a niche but rapidly growing area of <strong>generative models for inorganic crystal structures</strong>. I compare:</p>
<ul>
<li><strong>OpenAI’s Deep Research</strong> built into ChatGPT Pro,</li>
<li><strong>Google Gemini’s Deep Research</strong>,</li>
<li>A <strong>semi-manual approach</strong>, where I provide curated papers to the model.</li>
</ul>
<p>Throughout this post, I’ll walk through the prompts used, the format of the generated reports, and how I evaluated factors such as information retrieval and research insight quality. Finally, I’ll present the highest-scoring merged report to show the best possible outcome from this mini-experiment.</p>
</section>
<section id="what-is-deep-research" class="level1">
<h1>What is “Deep Research”?</h1>
<p>There is no rigorous definition of “Deep Research”, but it generally refers to an <strong>AI-powered agent</strong> that autonomously conducts multi-step investigations by gathering, analyzing, and synthesizing diverse online data—from text and images to PDFs—to generate comprehensive, cited reports. This concept has origins in frameworks such as <a href="https://storm.genie.stanford.edu/">STORM</a> and has been further developed by large players:</p>
<ul>
<li>OpenAI integrated a variant into ChatGPT for Pro subscribers.</li>
<li>Google’s Gemini includes a “Deep Research” component for Gemini Advanced subscribers.</li>
<li>Some open-source alternatives are also emerging.</li>
</ul>
<p>Excitement has grown around these capabilities, fueled by user testimonials and demos. Researchers like me are wondering whether these tools can expedite our literature reviews while maintaining rigor and reliability.</p>
</section>
<section id="deep-research-for-scientific-studies" class="level1">
<h1>Deep Research for Scientific Studies</h1>
<p>In academia, literature reviews are fundamental to scientific research. Scientists regularly summarize current research findings to identify trends, gaps, and future directions. If Deep Research could automate large portions of the review process, that would be a significant boon.</p>
<p>I chose generative models for inorganic crystal structures as my test case for several reasons. During my PhD research in computational materials design, I focused extensively on inorganic crystal materials and authored a <a href="https://www.sciencedirect.com/science/article/pii/S2211339821000587">review paper on crystal structure prediction (CSP)</a>. At that time, generative models were a relatively small subfield. However, there has since been an explosion of research in this area, with the emergence of new techniques like diffusion models, flow matching models, and foundation models, along with more application-oriented studies focusing on conditional generation. This topic is particularly relevant to me as I recently led a seed LDRD project on using crystal generative models for X-ray diffraction (XRD) structure determination. Given both my background and current research interests, I’m eager to understand how this field has evolved.</p>
</section>
<section id="simple-prompt-engineering-for-deep-research" class="level1">
<h1>Simple Prompt Engineering for “Deep Research”</h1>
<p>Rather than expecting a single, monolithic prompt to produce a full-blown review, I started by asking each system a <strong>guiding question</strong>:</p>
<blockquote class="blockquote">
<p>I am a computational material scientist trying to do a literature review on generative models for inorganic crystal structure prediction/generation. What do you think I should pay attention to or focus on when reading through the papers?</p>
</blockquote>
<p>This initial prompt helped the AI define an <strong>outline</strong> or <strong>review structure</strong>. Based on the responses, I then formulated a final prompt:</p>
<blockquote class="blockquote">
<p>Can you help me research on generative models for inorganic crystal structure prediction/generation/design? Return me a report with three sections:</p>
<ol type="1">
<li>A table with seven columns: authors/method, generative modeling approach, representation of crystal structures, data and training protocols, validation &amp; evaluation metrics, interpretability, computational efficiency.</li>
<li>A comparison of these methods and how the field has evolved.</li>
<li>Discussion of gaps, challenges, and future directions.</li>
</ol>
</blockquote>
</section>
<section id="openai-vs.-gemini-vs.-semi-manual" class="level1">
<h1>OpenAI vs.&nbsp;Gemini vs.&nbsp;Semi-Manual</h1>
<ol type="1">
<li><strong>OpenAI’s Deep Research</strong>
<ul>
<li>After choosing “Deep Research,” ChatGPT typically asks follow-up questions to refine search criteria.</li>
<li>An internal version of the o3 model is used.</li>
<li>It outputs rendered Markdown that the user can copy.<br>
</li>
<li>I ran the same prompt (and minor clarifications) three times to see if results varied.</li>
<li>I also merged the three outputs to see if it would improve the quality of the report.</li>
</ul></li>
<li><strong>Google Gemini</strong>
<ul>
<li>Gemini generates a research plan, requests user review, and then “executes” the plan.</li>
<li>Currently only Gemini 1.5 Pro can be used for this feature.</li>
<li>The final step yields a Google Doc-like report (which can be exported to other formats).<br>
</li>
<li>I repeated the entire process three times as well.</li>
<li>I also merged the three outputs to see if it would improve the quality of the report.</li>
</ul></li>
<li><strong>Semi-Manual Approach</strong>
<ul>
<li>I collected 27 relevant papers from my own personal research library (PDF format).</li>
<li>Using <a href="https://github.com/VikParuchuri/marker">Marker-PDF</a>, I converted the PDFs into Markdown files. Working from local copies likely avoids the paywall and web-scraping limitations that the deep research tools face.<br>
</li>
<li>I fed these Markdown documents to the LLM as context, prompting for the same final structure.</li>
<li>I used Gemini 2 Pro for this approach because the full set of papers needs a large context window.</li>
<li>I repeated the process three times as well.</li>
<li>I also merged the three outputs to see if it would improve the quality of the report.</li>
</ul></li>
</ol>
<p>Finally, I also experimented with “merging” all individual reports from each approach to see if combining them could reduce errors or improve coverage. I ended up with 13 total reports (3 from each method, plus 4 merges). All the reports can be found in the <a href="https://drive.google.com/drive/folders/1elFR4KDwy0iz0dSbGtwdjua4wouvnigz?usp=sharing">Google Drive folder</a>.</p>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p>Gemini 2 Pro offers a generous 2-million token context window. My collection of 27 papers consumed only 10% of this capacity, suggesting that Gemini 2 Pro could theoretically handle over 200 full-length academic papers in a single context.</p>
</div>
</div>
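<p>The estimate in the tip can be reproduced with a few lines of arithmetic. This is only a rough sketch: the 10% usage figure is approximate, and the variable names are my own rather than anything from a Gemini API.</p>

```python
# Back-of-envelope context budget, using only the figures quoted in the tip.
CONTEXT_WINDOW_TOKENS = 2_000_000  # Gemini 2 Pro context window
PAPERS_USED = 27                   # papers in the curated library
PERCENT_USED = 10                  # approximate share of the window consumed

# Tokens consumed by the whole library (integer arithmetic avoids float noise).
tokens_used = CONTEXT_WINDOW_TOKENS * PERCENT_USED // 100

# Average size of one converted paper, in tokens.
tokens_per_paper = tokens_used / PAPERS_USED

# How many papers of that average size would fill the window.
max_papers = CONTEXT_WINDOW_TOKENS * PAPERS_USED // tokens_used

print(f"about {tokens_per_paper:.0f} tokens per paper, room for about {max_papers} papers")
```

<p>Under these assumptions the window holds roughly 270 papers of average size, which is consistent with the “over 200” figure above once some room is left for the prompt and the generated report.</p>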
</section>
<section id="evaluate-information-retrieval-effectiveness" class="level1 page-columns page-full">
<h1>Evaluate Information Retrieval Effectiveness</h1>
<p>First, I wanted to see how effective each method is at retrieving papers from the internet.</p>
<div class="quarto-figure quarto-figure-center page-columns page-full">
<figure class="figure page-columns page-full">
<p><img src="https://xiangyu-yin.com/content/img/deep_research_retrieval_heatmap.png" class="img-fluid figure-img"></p>
<figcaption class="margin-caption">Row: Method and attempt number; Column: Papers; Cell: colored if the method retrieved the paper else grey (green: Gemini, blue: OpenAI, red: Semi-Manual)</figcaption>
</figure>
</div>
<p><strong>Observations</strong>:</p>
<ul>
<li>None of the methods can cover all references.</li>
<li>Each deep research method has its own “preference” in terms of papers it retrieved.</li>
<li>There are differences across repeated runs for deep research.</li>
<li>The deep research methods indeed discover some references that the Semi-Manual approach missed.</li>
<li>Merging outputs from different runs can improve coverage.</li>
</ul>
<p>Overall, deep research methods can be convenient but can miss domain-specific or paywalled literature unless carefully prompted or otherwise augmented. A hybrid workflow (e.g., running multiple deep research attempts and also feeding in curated PDF/Markdown documents) may be the most robust for critical academic or R&amp;D use cases.</p>
<p>Below is the full table of papers that were successfully retrieved across different attempts from Gemini, OpenAI, and the semi-manual method. Each row is a paper and each column is an attempt. The notation <code>Y</code> indicates that the method captured that reference. An empty slot indicates it was not cited or found in the final summary. Looking at the table in detail, we can see that each method has its own “preference” in terms of the papers it retrieved. The Gemini method can retrieve very recent papers released in 2025, potentially thanks to Google’s up-to-date search engine. The OpenAI method tends to retrieve more classic papers that are more closely related to generative models. The semi-manual method uses my own library of papers, which are the most relevant to the topics I care about, such as XRD (i.e., conditional generation). These topics are somewhat niche, which may be why the deep research methods miss those papers.</p>
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p>Use the navigation bar on the right to skip this very long table to the next section.</p>
</div>
</div>
<table class="caption-top table">
<colgroup>
<col style="width: 6%">
<col style="width: 26%">
<col style="width: 11%">
<col style="width: 11%">
<col style="width: 11%">
<col style="width: 7%">
<col style="width: 7%">
<col style="width: 7%">
<col style="width: 8%">
</colgroup>
<thead>
<tr class="header">
<th>Year</th>
<th>Author</th>
<th>Gemini_1</th>
<th>Gemini_2</th>
<th>Gemini_3</th>
<th>OAI_1</th>
<th>OAI_2</th>
<th>OAI_3</th>
<th>Manual</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>2018</td>
<td>Nouira et al.</td>
<td></td>
<td>Y</td>
<td></td>
<td>Y</td>
<td>Y</td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2019</td>
<td>Noh et al.</td>
<td></td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2019</td>
<td>Hoffmann et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2020</td>
<td>Court et al.</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td></td>
<td>Y</td>
<td></td>
</tr>
<tr class="odd">
<td>2020</td>
<td>Kim et al.</td>
<td>Y</td>
<td>Y</td>
<td></td>
<td></td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr class="even">
<td>2020</td>
<td>Kim et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2020</td>
<td>Dan et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2020</td>
<td>Pathak et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
</tr>
<tr class="odd">
<td>2021</td>
<td>Xie et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr class="even">
<td>2021</td>
<td>Long et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td>Y</td>
<td></td>
</tr>
<tr class="odd">
<td>2021</td>
<td>Zhao et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2022</td>
<td>Ren et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2023</td>
<td>Jiao et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
</tr>
<tr class="even">
<td>2023</td>
<td>Zeni et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td>Y</td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2023</td>
<td>Luo et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td>Y</td>
<td></td>
</tr>
<tr class="even">
<td>2023</td>
<td>Chenebuah et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
</tr>
<tr class="odd">
<td>2023</td>
<td>Yang et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Alverson et al.</td>
<td>Y</td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Su et al.</td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2024</td>
<td>Zhu et al.</td>
<td>Y</td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Takahara et al.</td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2024</td>
<td>Antunes et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Choudhary</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Riesel et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Choudhary</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Li et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Miller et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Gruver et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Lai et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Guo et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Pakornchote et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Mohanty et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Mishra et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Aykol et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Sriram et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2024</td>
<td>Yang et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Lin et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
</tr>
<tr class="even">
<td>2024</td>
<td>Jiao et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
</tr>
<tr class="odd">
<td>2024</td>
<td>Luo et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2025</td>
<td>Collins et al.</td>
<td>Y</td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="odd">
<td>2025</td>
<td>Luo et al.</td>
<td>Y</td>
<td>Y</td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2025</td>
<td>Kazeev et al.</td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2025</td>
<td>Breuck et al.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="even">
<td>2025</td>
<td>Levy et al.</td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
</tr>
<tr class="odd">
<td>2025</td>
<td>Tone et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="even">
<td>2025</td>
<td>Liu et al.</td>
<td></td>
<td></td>
<td></td>
<td>Y</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
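<p>The effect of merging runs on coverage can be checked mechanically. The sketch below transcribes only seven rows of the table above, so its counts illustrate the calculation rather than the full result; the helper name <code>papers_found</code> is hypothetical.</p>

```python
# Excerpt of the retrieval table above (7 of its rows).
# Each paper maps to the set of attempts that retrieved it.
RETRIEVED = {
    "Noh et al. (2019)": {"Gemini_2", "Gemini_3", "OAI_1", "OAI_2", "OAI_3", "Manual"},
    "Court et al. (2020)": {"Gemini_1", "Gemini_2", "Gemini_3", "OAI_1", "OAI_3"},
    "Xie et al. (2021)": {"OAI_1", "OAI_2", "OAI_3", "Manual"},
    "Zhu et al. (2024)": {"Gemini_1", "Gemini_2"},
    "Riesel et al. (2024)": {"Manual"},
    "Luo et al. (2025)": {"Gemini_1", "Gemini_2", "Gemini_3"},
    "Tone et al. (2025)": {"OAI_1"},
}


def papers_found(attempts):
    """Return the papers retrieved by at least one of the given attempts."""
    wanted = set(attempts)
    return {paper for paper, hits in RETRIEVED.items() if not hits.isdisjoint(wanted)}


single_run = papers_found(["Gemini_1"])
gemini_merged = papers_found(["Gemini_1", "Gemini_2", "Gemini_3"])
oai_merged = papers_found(["OAI_1", "OAI_2", "OAI_3"])
hybrid = gemini_merged | oai_merged | papers_found(["Manual"])

print(len(single_run), len(gemini_merged), len(oai_merged), len(hybrid))
```

<p>On this excerpt, merging the three Gemini runs adds one paper over the first run alone, and only the hybrid of all three methods covers every row, which matches the observations above.</p>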
</section>
<section id="evaluate-research-insights-quality" class="level1 page-columns page-full">
<h1>Evaluate Research Insights Quality</h1>
<p>Next, I wanted to see how effective each method is at generating insightful and novel research findings. I asked two different LLMs (OpenAI O1 Pro and Gemini 2 Pro) to act as a <strong>reviewer</strong>—scoring each report from 0 to 10 based on <strong>logical quality, understanding of papers, depth of thinking, and novelty of insights</strong> (rather than sheer length or grammatical quality). For each LLM reviewer, I ran the same prompt three times. Below is the prompt I used:</p>
<blockquote class="blockquote">
<p>I have 13 reports on generative models for crystal structure and I need to grade them on a scale from 0 to 10, the focus is the quality of the content such as logic, paper understanding, thinking depth, idea novelty etc. We do NOT focus on quantity of the report such as number of papers cited, number of bullet points, length of report etc. We do NOT judge the grammar error, language usage or writing styles either. Can you come up with a good metric to reflect our focus on quality and grade those twelve reports and give your reasons for each. Return a table with report index, score, summary of reasons columns.</p>
</blockquote>
<div class="quarto-figure quarto-figure-center page-columns page-full">
<figure class="figure page-columns page-full">
<p><img src="https://xiangyu-yin.com/content/img/deep_research_scores.png" class="img-fluid figure-img"></p>
<figcaption class="margin-caption">Average scores from the two LLM reviewers</figcaption>
</figure>
</div>
<p><strong>Observations</strong>:</p>
<ul>
<li>Most “merge” reports (e.g., Gemini_merge, OAI_merge, Manual_merge) achieve higher average scores than their individual runs. This suggests that combining multiple runs of the same method often leads to better synthesis and, consequently, a more robust final report.</li>
<li>Manual_merge achieves the highest overall average (9.08). Interestingly, All_merge (which includes everything) does slightly worse at 8.92 overall—primarily because the Gemini reviewer gave it a comparatively lower score.</li>
<li>One single-run report, OAI_1, has an overall average of 9.00, which is higher than both OAI_merge (8.92) and All_merge (8.92). This is an example of how sometimes a single “lucky” run can outperform a merge, depending on the content and the reviewer’s perspective.</li>
<li>In some cases, the OpenAI (OAI) reviewer and the Gemini reviewer exhibit notable scoring differences. For example, Manual_1 receives 7.00 from OAI but 9.17 from Gemini. Conversely, All_merge receives 9.67 from OAI but only 8.17 from Gemini. These variations highlight that different reviewer models may emphasize different aspects of “quality.”</li>
<li>While Gemini_merge does respectably at 8.50, the individual Gemini attempts (Gemini_1, Gemini_2, and Gemini_3) are at the lower end of the overall range (7.08–7.33). This could indicate that the Gemini-based reports benefited significantly from merging multiple attempts.</li>
</ul>
<p>Below is the full table of the scores from both LLM reviewers:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 21%">
<col style="width: 26%">
<col style="width: 30%">
<col style="width: 20%">
</colgroup>
<thead>
<tr class="header">
<th>Report</th>
<th>OAI_score_1,2,3,avg</th>
<th>Gemini_score_1,2,3,avg</th>
<th>OAI_Gemini_avg</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Gemini_1</td>
<td>8, 7, 7 (7.33)</td>
<td>7.5, 7, 6 (6.83)</td>
<td>7.08</td>
</tr>
<tr class="even">
<td>Gemini_2</td>
<td>7, 6, 7 (6.67)</td>
<td>8, 8, 7 (7.67)</td>
<td>7.17</td>
</tr>
<tr class="odd">
<td>Gemini_3</td>
<td>8, 7, 8 (7.67)</td>
<td>7, 7, 7 (7.00)</td>
<td>7.33</td>
</tr>
<tr class="even">
<td>Gemini_merge</td>
<td>9, 8, 8 (8.33)</td>
<td>9, 9, 8 (8.67)</td>
<td>8.50</td>
</tr>
<tr class="odd">
<td>OAI_1</td>
<td>8, 9, 9 (8.67)</td>
<td><strong>9, 9, 10 (9.33)</strong></td>
<td>9.00</td>
</tr>
<tr class="even">
<td>OAI_2</td>
<td>8, 8, 8 (8.00)</td>
<td>9, 9, 9 (9.00)</td>
<td>8.50</td>
</tr>
<tr class="odd">
<td>OAI_3</td>
<td>9, 9, 9 (9.00)</td>
<td>8.5, 8, 8 (8.17)</td>
<td>8.58</td>
</tr>
<tr class="even">
<td>OAI_merge</td>
<td>10, 9, 9 (9.33)</td>
<td>8.5, 9, 8 (8.50)</td>
<td>8.92</td>
</tr>
<tr class="odd">
<td>Manual_1</td>
<td>7, 7, 7 (7.00)</td>
<td>8.5, 10, 9 (9.17)</td>
<td>8.08</td>
</tr>
<tr class="even">
<td>Manual_2</td>
<td>8, 8, 8 (8.00)</td>
<td>8.5, 9, 8 (8.50)</td>
<td>8.25</td>
</tr>
<tr class="odd">
<td>Manual_3</td>
<td>9, 8, 9 (8.67)</td>
<td>8, 8, 8 (8.00)</td>
<td>8.33</td>
</tr>
<tr class="even">
<td>Manual_merge</td>
<td>9, 9, 9 (9.00)</td>
<td>9.5, 9, 9 (9.17)</td>
<td><strong>9.08</strong></td>
</tr>
<tr class="odd">
<td>All_merge</td>
<td><strong>9, 10, 10 (9.67)</strong></td>
<td>8.5, 8, 8 (8.17)</td>
<td>8.92</td>
</tr>
</tbody>
</table>
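<p>For readers who want to check the averages, here is a minimal sketch of how the last column is derived. It copies four rows from the table above; the function name <code>overall_average</code> is my own.</p>

```python
from statistics import mean

# Scores copied from four rows of the table above:
# (three runs of the OAI reviewer, three runs of the Gemini reviewer).
SCORES = {
    "OAI_1": ([8, 9, 9], [9, 9, 10]),
    "OAI_merge": ([10, 9, 9], [8.5, 9, 8]),
    "Manual_merge": ([9, 9, 9], [9.5, 9, 9]),
    "All_merge": ([9, 10, 10], [8.5, 8, 8]),
}


def overall_average(oai_runs, gemini_runs):
    """Average each reviewer's three runs, then average the two reviewers."""
    return mean([mean(oai_runs), mean(gemini_runs)])


overall = {name: overall_average(*runs) for name, runs in SCORES.items()}
best = max(overall, key=overall.get)

for name, score in overall.items():
    print(f"{name}: {score:.2f}")
print("highest overall:", best)
```

<p>Averaging per reviewer first gives each reviewer equal weight. Because both reviewers ran three times, the result equals the plain mean of all six scores.</p>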
<hr>
</section>
<section id="conclusion" class="level1">
<h1>Conclusion</h1>
<p>From this experiment, it’s clear that <strong>Deep Research</strong> tools can provide <strong>valuable</strong> and <strong>well-structured</strong> summaries, but they aren’t yet a drop-in replacement for a thorough, human-curated literature review—particularly in cutting-edge scientific fields.</p>
<p><strong>Key Observations</strong>:</p>
<ul>
<li>Information retrieval is not yet up to par due to paywalls, limited web-scraping, and domain specificity.</li>
<li>Merging multiple AI outputs can substantially improve final report quality.</li>
<li>The advanced models can generate interesting and novel research insights.</li>
</ul>
<p>Overall, while the promise of “Deep Research” is highly appealing, scientists should remain actively involved to ensure a thorough literature review. Nonetheless, the strides made in foundation models are both impressive and indicative of a rapidly evolving landscape. For now, Deep Research is best treated as a starting point for a literature review. Future advances in data sources, model capabilities, and workflow automation may well turn it into a robust autonomous tool for literature review.</p>
</section>
<section id="highest-scored-final-report" class="level1">
<h1>Highest Scored Final Report</h1>
<p>Below is the final merged output from my <strong>semi-manual</strong> approach, which scored highest overall.</p>
<section id="section-1-comprehensive-table-of-methods" class="level2">
<h2 class="anchored" data-anchor-id="section-1-comprehensive-table-of-methods"><strong>Section 1: Comprehensive Table of Methods</strong></h2>
<div class="custom-small">
<table class="caption-top table">
<colgroup>
<col style="width: 5%">
<col style="width: 5%">
<col style="width: 12%">
<col style="width: 20%">
<col style="width: 17%">
<col style="width: 20%">
<col style="width: 18%">
</colgroup>
<thead>
<tr class="header">
<th><strong>Authors &amp; Method (Year)</strong></th>
<th><strong>Generative Modeling Approach</strong></th>
<th><strong>Representation of Crystal Structures</strong></th>
<th><strong>Data &amp; Training Protocols</strong></th>
<th><strong>Validation &amp; Evaluation Metrics</strong></th>
<th><strong>Physical &amp; Chemical Interpretability</strong></th>
<th><strong>Computational Efficiency &amp; Practicality</strong></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Nouira et al., CrystalGAN (2019)</strong></td>
<td>GAN (two cross-domain blocks with constraints)</td>
<td>2D matrix (unit cell + fractional coordinates).</td>
<td>1,416 binary hydrides (e.g., NiH, PdH) to generate ternary hydrides. Data augmentation used to diversify structures.</td>
<td>Number of generated ternary compositions satisfying geometric constraints (like interatomic distances).</td>
<td>Explicitly encodes geometric constraints (min/max atomic distances).</td>
<td>High throughput, though evaluation relies heavily on geometric rules rather than post-DFT stability checks. Focus is on a limited composition space (ternary hydrides).</td>
</tr>
<tr class="even">
<td><strong>Hoffmann et al.&nbsp;(2019)</strong></td>
<td>VAE</td>
<td>3D voxel grid (image-based).</td>
<td>Materials Project.</td>
<td>Structural validity, reconstruction accuracy; low success rates reported.</td>
<td>Lacks rotational invariance; voxel representations can be large and less chemically intuitive.</td>
<td>Voxel-based approach is computationally heavy and suffers from low validity rates.</td>
</tr>
<tr class="odd">
<td><strong>Noh et al., iMatGen (2019)</strong></td>
<td>VAE (hierarchical, two-step)</td>
<td>3D voxel image (“cell image” + “basis image”), invertible to atomic positions &amp; cell parameters.</td>
<td>10,981 VxOy structures from MP (via elemental substitution).</td>
<td>RMSE of atomic positions &amp; cell parameters; rediscovery of known VxOy structures; DFT validation for stability (energy above hull).</td>
<td>Demonstrates invertibility and reasonable stability predictions. Still uses voxel grids, which are memory-intensive.</td>
<td>More efficient than some genetic algorithms; requires post-processing for structure recovery; moderately high computational cost due to voxel approach.</td>
</tr>
<tr class="even">
<td><strong>Kim et al.&nbsp;(2020)</strong></td>
<td>GAN</td>
<td>3D point-based representation (atomic positions + lattice parameters).</td>
<td>Mg-Mn-O system from Materials Project. Data augmentation to reach ~112k structures.</td>
<td>DFT validation of newly generated structures (energy above convex hull). 23 new stable compounds identified.</td>
<td>Incorporates domain rules (compositions, stoichiometry). Targets stable crystals with Ehull &lt; 0.2 eV/atom.</td>
<td>Less expensive than exhaustive searches; still requires DFT for final validation.</td>
</tr>
<tr class="odd">
<td><strong>Ren et al., FTCP (2020)</strong></td>
<td>Variational Autoencoder (VAE) + 1D CNN</td>
<td>Real-space (CIF-like: element matrix, lattice, site coords) + Reciprocal-space (Fourier transforms of crystal properties).</td>
<td>Materials Project; focuses on ternary crystals. Used for formation energy, bandgap, thermoelectric power factor design.</td>
<td>Validity rate, success rate (generated structures with target properties), improvement over random sampling. DFT used to confirm stability.</td>
<td>Explicitly considers both real-space and reciprocal-space info, improving periodic and electronic property representations.</td>
<td>Much faster than brute-force screening. DFT validation is still required for final check.</td>
</tr>
<tr class="even">
<td><strong>Xie et al., CDVAE (2021)</strong></td>
<td>Diffusion model + VAE (“Crystal Diffusion VAE”)</td>
<td>Periodic multi-graph (atoms, bonds) with SE(3) equivariance adapted for crystals.</td>
<td>Perov-5, Carbon-24, MP-20 datasets.</td>
<td>Reconstruction, validity, diversity, property statistics (e.g., energy above hull).</td>
<td>Incorporates explicit periodic symmetry and an energy-based inductive bias.</td>
<td>Good performance across tasks but diffusion-based sampling can be relatively slow compared to simpler VAEs.</td>
</tr>
<tr class="odd">
<td><strong>Pakornchote et al., DP-CDVAE (2023)</strong></td>
<td>Diffusion probabilistic model + VAE</td>
<td>Fractional coordinates + lattice, building on CDVAE’s framework.</td>
<td>Perov-5, Carbon-24, MP-20.</td>
<td>Reconstruction accuracy, stability, ground-state performance (energy minimization).</td>
<td>Improves energy accuracy over standard CDVAE.</td>
<td>Similar complexity to other diffusion-based approaches; yields better ground-state structures.</td>
</tr>
<tr class="even">
<td><strong>Jiao et al., DiffCSP (2023)</strong></td>
<td>Diffusion probabilistic model (E(3)-equivariant)</td>
<td>Fractional coordinates + lattice vectors, jointly diffused.</td>
<td>MP-20, Perov-5, Carbon-24.</td>
<td>Match rate, RMSE, stability checks, property statistics, space group distribution.</td>
<td>Captures periodic E(3) invariance; joint generation of lattice &amp; coordinates yields better crystallographic symmetry.</td>
<td>Outperforms existing CSP models with moderate diffusion overhead.</td>
</tr>
<tr class="odd">
<td><strong>Zeni et al., MatterGen (2023)</strong></td>
<td>Diffusion probabilistic model</td>
<td>Atom types, coordinates, lattice (joint diffusion).</td>
<td>Alex-MP-20 (607,684 structures) as base training; fine-tuning on property-driven subsets.</td>
<td>Stability, uniqueness, novelty (S.U.N.) rate; RMSD vs.&nbsp;relaxed structures; property constraints.</td>
<td>Demonstrated ability to generate stable, novel materials. Fine-tuning for targeted properties.</td>
<td>High generation success rate; fine-tuning approach is flexible but adds training overhead.</td>
</tr>
<tr class="even">
<td><strong>Levy et al., SymmCD (2023)</strong></td>
<td>Diffusion probabilistic model</td>
<td>Asymmetric unit &amp; site symmetry, encoded as binary matrices.</td>
<td>MP-20.</td>
<td>Validity, diversity, space group distribution, RMSD, symmetry reproduction rate.</td>
<td>Innovative symmetry representation enables more consistent generation of non-trivial space groups.</td>
<td>Comparable diffusion approach with improved space group fidelity.</td>
</tr>
<tr class="odd">
<td><strong>AI4Science et al., CrystalGFN (2023)</strong></td>
<td>GFlowNet (Generative Flow Network)</td>
<td>Wyckoff positions + space group + composition constraints.</td>
<td>Datasets not fully specified; demonstration on select examples.</td>
<td>Symmetry analysis, structural validity, distribution over space groups.</td>
<td>Enforces symmetry constraints by design, generating structures consistent with a given space group.</td>
<td>GFlowNets can be efficient; method is relatively new in materials domain, so broader performance data limited.</td>
</tr>
<tr class="even">
<td><strong>Flam-Shepherd &amp; Aspuru-Guzik (2023)</strong></td>
<td>Language Model (trained from scratch)</td>
<td>CIF files treated as text.</td>
<td>MP-20.</td>
<td>Validity and structural metrics; comparisons to known structures from MP.</td>
<td>Text-based representation is flexible; direct generation of CIF strings.</td>
<td>Matches or outperforms some prior generative models but not explicitly symmetry-aware.</td>
</tr>
<tr class="odd">
<td><strong>Jiao et al., DiffCSP++ (2023)</strong></td>
<td>Diffusion model (enhanced from DiffCSP)</td>
<td>Fractional coordinates + lattice vectors (E(3)-equivariant).</td>
<td>MP-20, Perov-5, Carbon-24.</td>
<td>Match rate, RMSE, validity, property statistics.</td>
<td>Improves structural diversity and generalization to unseen space groups compared to the original DiffCSP.</td>
<td>Competitive performance and more efficient sampling than the original DiffCSP.</td>
</tr>
<tr class="even">
<td><strong>Gruver et al.&nbsp;(2024)</strong></td>
<td>Large Language Model (fine-tuned LLM)</td>
<td>Text-encoded crystal data (“crystal strings” or CIF-like text).</td>
<td>Materials Project for structure data; extensive text corpora for pre-training.</td>
<td>Validity, coverage, property distribution, stability checks (DFT), diversity.</td>
<td>High stability rate; improved symmetry understanding through LLM’s language-based abstraction.</td>
<td>Computationally efficient sampling; flexible prompting for structure generation.</td>
</tr>
<tr class="odd">
<td><strong>Antunes et al., CrystaLLM (2024)</strong></td>
<td>Autoregressive Large Language Model</td>
<td>Complete CIF files as text tokens.</td>
<td>Millions of CIF files; large-scale pre-training.</td>
<td>Match rate (structure similarity), RMSE of positions, property-based conditioning tests.</td>
<td>Learns complex crystal-chemistry relationships; can condition on formation energy or space group.</td>
<td>High efficiency in generating valid CIFs; training is expensive but inference can be rapid.</td>
</tr>
<tr class="even">
<td><strong>Mishra et al., LLaMat (2024)</strong></td>
<td>Pretrained LLM (LLaMA-based) + domain fine-tuning</td>
<td>Text + CIF data (materials literature, RedPajama, MP).</td>
<td>Large corpora (R2CID, 30B tokens), instruction fine-tuning on materials tasks, plus CIF files from MP.</td>
<td>F1 scores for textual tasks, match rate and RMSE for crystal generation, DFT stability checks.</td>
<td>Combines domain knowledge from literature with direct crystal structure data, enabling retrieval and generation.</td>
<td>Scales well to large text data. Efficient generation after fine-tuning.</td>
</tr>
<tr class="odd">
<td><strong>Kazeev et al., Wyckoff Transformer (2024)</strong></td>
<td>Transformer</td>
<td>Wyckoff positions as tokens (focus on space group &amp; symmetry).</td>
<td>MP-20.</td>
<td>Novelty, uniqueness, symmetry distribution, property prediction accuracy.</td>
<td>Emphasizes crystal symmetry and site occupancy. Good generalization across common space groups.</td>
<td>Yields high novelty and structural diversity; handles up to moderate unit cell sizes.</td>
</tr>
<tr class="even">
<td><strong>Miller et al., FlowMM (2024)</strong></td>
<td>Riemannian Flow Matching</td>
<td>Fractional coordinates + lattice parameters (atom types).</td>
<td>MP-20, Perov-5, Carbon-24.</td>
<td>Stability rate (energy above hull), S.U.N. rate, generation cost.</td>
<td>Rotation-invariant representation; advanced geometry handling.</td>
<td>Very fast inference, significantly more efficient than typical diffusion-based methods.</td>
</tr>
<tr class="odd">
<td><strong>Choudhary, AtomGPT (2024)</strong></td>
<td>Transformer (GPT-style)</td>
<td>Chemical &amp; structural text descriptions.</td>
<td>JARVIS-DFT (formation energies, bandgaps, superconducting Tc).</td>
<td>MAE/RMSE for property prediction and structure generation tasks.</td>
<td>Targets forward (property) and inverse (structure) predictions. BCS superconductors highlighted.</td>
<td>Fine-tuning approach is relatively lightweight. Allows text-based property prompts.</td>
</tr>
<tr class="even">
<td><strong>Choudhary, DiffractGPT (2024)</strong></td>
<td>Generative Pre-trained Transformer (GPT)</td>
<td>PXRD patterns to full crystal structure (text-based).</td>
<td>JARVIS-DFT; simulated and some experimental XRD data.</td>
<td>MAE, RMS deviation in predicted structure, match rate to known references.</td>
<td>Focuses on inverse design from PXRD. Potentially closes the loop with experimental characterization.</td>
<td>Easy integration with existing XRD analysis pipelines; inference is relatively fast.</td>
</tr>
<tr class="odd">
<td><strong>Guo et al., PXRDNet (2024)</strong></td>
<td>Diffusion Model</td>
<td>Lattice parameters + fractional coordinates from input PXRD patterns (nanomaterials).</td>
<td>MP-20-PXRD (simulated), plus experimental PXRD (IUCr database).</td>
<td>Rwp (weighted profile R-factor), material reconstruction success, breakdown by crystal system.</td>
<td>Good at handling nanoscale PXRD (peak broadening) and bridging the simulation-experiment gap.</td>
<td>Moderate computational cost with Langevin steps; outperforms older structure-solution techniques on complex nanomaterials.</td>
</tr>
<tr class="even">
<td><strong>Riesel et al.&nbsp;(2024)</strong></td>
<td>Conditional Diffusion Model</td>
<td>Encoded PXRD patterns.</td>
<td>RRUFF, Materials Project, Powder Diffraction File (PDF) databases.</td>
<td>Match rate, cosine similarity to known PXRD peaks, visual comparison.</td>
<td>First large-scale demonstration for novel structure discovery from PXRD, including high-pressure phases.</td>
<td>Can determine unknown or unlabeled structures; computational cost is moderate for each reconstruction.</td>
</tr>
<tr class="odd">
<td><strong>Q. Li et al., PXRDGen (2024)</strong></td>
<td>Diffusion Model + Rietveld refinement</td>
<td>PXRD data, composition, fractional coordinates, lattice parameters.</td>
<td>MP-20.</td>
<td>Match rate, RMSE, R-factor after Rietveld refinement.</td>
<td>Addresses challenges such as locating light atoms and distinguishing neighboring elements.</td>
<td>Automated structure determination from PXRD with high accuracy.</td>
</tr>
<tr class="even">
<td><strong>Cao et al., CrystalFormer (2024)</strong></td>
<td>Transformer (autoregressive)</td>
<td>Wyckoff positions with assigned chemical elements.</td>
<td>MP-20.</td>
<td>Match rate, RMSE, novelty of generated structures.</td>
<td>Focuses on Wyckoff-based generation, improving symmetry handling.</td>
<td>Can generate many new templates exceeding prior diffusion/flow-based methods in certain space groups.</td>
</tr>
<tr class="odd">
<td><strong>Zhu et al., WyCryst (2024)</strong></td>
<td>VAE</td>
<td>Wyckoff positions + chemical elements (unordered sets).</td>
<td>MP-20.</td>
<td>Validity, match rate, space group distribution.</td>
<td>Captures symmetry within fixed space groups but has difficulty generalizing new symmetries.</td>
<td>Potentially fast training but less flexible across diverse space groups.</td>
</tr>
<tr class="even">
<td><strong>Aykol et al., a2c (Year not specified, ~2023-2024)</strong></td>
<td>Deep learning potentials + subcell relaxation</td>
<td>Local structural motifs (subcells) of amorphous precursors (melt-quench MD).</td>
<td>Amorphous configurations from MD simulations, GNN potential used to predict crystallization.</td>
<td>Accuracy in predicting crystallization products vs.&nbsp;experimental observations (Ostwald’s rule).</td>
<td>Predicts final crystallized phase from amorphous states, bridging an important gap in materials synthesis.</td>
<td>High accuracy for diverse materials; specialized approach for amorphous → crystalline transformations.</td>
</tr>
<tr class="odd">
<td><strong>Sriram et al., FlowLLM (2024)</strong></td>
<td>Hybrid: Large Language Model + Flow Matching</td>
<td>Text-based prompts + learned flow transformations.</td>
<td>Not extensively detailed, but uses combined textual data (CIF) and structural knowledge for flow-based generation.</td>
<td>Standard generative metrics (validity, novelty) plus property consistency.</td>
<td>Combines LLM’s flexibility with geometry-aware flow methods.</td>
<td>Potentially efficient. Method is relatively new; broad performance data pending.</td>
</tr>
<tr class="even">
<td><strong>De Breuck et al., Matra-Genoa (Year not specified)</strong></td>
<td>Autoregressive Transformer (“sequence to Wyckoff + coords”)</td>
<td>Sequenced Wyckoff representation + free coordinates and lattice parameters (hybrid discrete-continuous).</td>
<td>Materials Project and combined MP + “Alexandria” dataset.</td>
<td>Validity rates (bond lengths, space-group consistency), duplicate rates, S.U.N. ratio (Stable, Unique, Novel). DFT validation of some selected structures.</td>
<td>Explicitly uses symmetry tokens (Wyckoff), can condition on energy above hull or space group.</td>
<td>High throughput (up to 1000 structures/min). Uses robust relaxation workflow.</td>
</tr>
</tbody>
</table>
</div>
</section>
<section id="section-2-comparison-and-evolution-of-the-field" class="level2">
<h2 class="anchored" data-anchor-id="section-2-comparison-and-evolution-of-the-field"><strong>Section 2: Comparison and Evolution of the Field</strong></h2>
<section id="early-attempts-pre-2020-to-2020" class="level3">
<h3 class="anchored" data-anchor-id="early-attempts-pre-2020-to-2020"><strong>1. Early Attempts (pre-2020 to ~2020)</strong></h3>
<ul>
<li><strong>Voxel / Image-Based Representations</strong>
<ul>
<li>Early VAE approaches (e.g., <strong>Hoffmann et al., 2019</strong>; <strong>Noh et al., iMatGen, 2019</strong>) used 3D voxel grids or “image-like” unit-cell representations.<br>
</li>
<li><strong>Limitations:</strong> High memory usage, poor rotational invariance, and lower success rates in generating valid structures without extensive post-processing.</li>
</ul></li>
<li><strong>First GAN/Conditional Models</strong>
<ul>
<li><strong>Nouira et al., CrystalGAN (2019)</strong> and <strong>Kim et al.&nbsp;(2020)</strong> introduced GANs, often with constraints or data augmentation to respect atomic distances and stoichiometry.<br>
</li>
<li><strong>Focus:</strong> Generating limited composition spaces (e.g., hydrides or specific ternary systems) and demonstrating feasibility.<br>
</li>
<li><strong>Drawbacks:</strong> Typically had to rely on DFT checks after generation to confirm stability.</li>
</ul></li>
</ul>
</section>
<section id="transition-to-coordinate-based-and-more-expressive-representations-20202021" class="level3">
<h3 class="anchored" data-anchor-id="transition-to-coordinate-based-and-more-expressive-representations-20202021"><strong>2. Transition to Coordinate-Based and More Expressive Representations (2020–2021)</strong></h3>
<ul>
<li><strong>Coordinate &amp; Reciprocal-Space Features</strong>
<ul>
<li><strong>Ren et al., FTCP (2020)</strong> combined real-space (CIF-like) and reciprocal-space features, enabling both structure and property conditioning.<br>
</li>
<li><strong>Xie et al., CDVAE (2021)</strong> introduced diffusion-based generation within a VAE framework, leveraging graph neural networks with periodic boundary conditions.</li>
</ul></li>
<li><strong>Better Physical Inductive Bias</strong>
<ul>
<li>Models started to embed crystal symmetry, minimal distances, or energy-based inductive biases (e.g., “score-based” or diffusion approaches).<br>
</li>
<li><strong>Result:</strong> Improved validity, diversity, and some gains in stability.</li>
</ul></li>
</ul>
</section>
<section id="growth-of-diffusion-and-flow-methods-20212024" class="level3">
<h3 class="anchored" data-anchor-id="growth-of-diffusion-and-flow-methods-20212024"><strong>3. Growth of Diffusion and Flow Methods (2021–2024)</strong></h3>
<ul>
<li><strong>Diffusion Models</strong>
<ul>
<li><strong>DiffCSP, MatterGen, SymmCD, PXRDNet, DiffCSP++</strong> employ diffusion-based techniques to incrementally refine random noise into plausible crystal structures.<br>
</li>
<li><strong>Advantages:</strong> Better sample quality, built-in symmetry or E(3)-equivariance, ability to condition on composition or PXRD data.</li>
</ul></li>
<li><strong>Flow-based Approaches</strong>
<ul>
<li><strong>Miller et al., FlowMM (2024)</strong> introduced Riemannian Flow Matching, improving efficiency over diffusion methods.<br>
</li>
<li><strong>GFlowNets</strong> (AI4Science et al., CrystalGFN) show promise for controlling generation paths in a more interpretable manner.</li>
</ul></li>
<li><strong>Wyckoff-Driven / Symmetry-Centric Models</strong>
<ul>
<li><strong>Kazeev et al., Wyckoff Transformer (2024)</strong>, <strong>Cao et al., CrystalFormer (2024)</strong>, and <strong>Zhu et al., WyCryst (2024)</strong> incorporate Wyckoff positions, aiming for explicit space-group symmetry enforcement.<br>
</li>
<li><strong>SymmCD (2023)</strong> uses innovative binary matrix encoding of site symmetries.</li>
</ul></li>
</ul>
</section>
<section id="emergence-of-large-language-models-llms-20232024" class="level3">
<h3 class="anchored" data-anchor-id="emergence-of-large-language-models-llms-20232024"><strong>4. Emergence of Large Language Models (LLMs) (2023–2024)</strong></h3>
<ul>
<li><strong>Text-Based Generative Paradigm</strong>
<ul>
<li><strong>Gruver et al.&nbsp;(2024)</strong>, <strong>Antunes et al., CrystaLLM (2024)</strong>, <strong>Mishra et al., LLaMat (2024)</strong>, and <strong>Flam-Shepherd &amp; Aspuru-Guzik (2023)</strong> treat crystal data (CIF files) as text sequences.<br>
</li>
<li><strong>AtomGPT, DiffractGPT (2024)</strong> similarly adopt GPT-style architectures to handle forward/inverse design and PXRD-based inverse problem-solving.</li>
</ul></li>
<li><strong>Benefits</strong>
<ul>
<li>Easy to integrate domain knowledge (prompts), straightforward file-based input/output (CIF, XRD, text).<br>
</li>
<li>Some solutions also incorporate property conditioning or space-group tokens, bridging purely numeric approaches with a flexible textual interface.</li>
</ul></li>
</ul>
</section>
<section id="incorporation-of-experimental-data" class="level3">
<h3 class="anchored" data-anchor-id="incorporation-of-experimental-data"><strong>5. Incorporation of Experimental Data</strong></h3>
<ul>
<li><strong>PXRD-Conditioned Models</strong>
<ul>
<li><strong>Guo et al., PXRDNet (2024)</strong>, <strong>Riesel et al.&nbsp;(2024)</strong>, and <strong>Q. Li et al., PXRDGen (2024)</strong> directly tackle structure solution from powder X-ray diffraction (PXRD) patterns, bridging a gap between simulated training data and real-world experiments.<br>
</li>
<li><strong>Significance:</strong> Brings generative AI closer to experimental lab workflows, enabling near-instant predictions of unknown crystal structures from measured diffraction patterns.</li>
</ul></li>
</ul>
</section>
<section id="overall-evolution" class="level3">
<h3 class="anchored" data-anchor-id="overall-evolution"><strong>Overall Evolution</strong></h3>
<ul>
<li><strong>Representations:</strong> Moved from bulky voxel grids to coordinate-based, symmetry-aware, and text-based.<br>
</li>
<li><strong>Models:</strong> Evolved from early VAEs and GANs toward diffusion, flow, and hybrid LLM approaches.<br>
</li>
<li><strong>Scope:</strong> Expanded from small sets of elemental systems (binary/ternary) to large databases (MP-20, tens or hundreds of thousands of structures).<br>
</li>
<li><strong>Evaluation:</strong> Progressed from naive “validity only” checks to multi-criteria evaluations: stability (energy above hull), novelty, coverage of known crystals, symmetry fidelity, and direct property constraints.</li>
</ul>
</section>
</section>
<section id="section-3-gaps-challenges-and-future-directions" class="level2">
<h2 class="anchored" data-anchor-id="section-3-gaps-challenges-and-future-directions"><strong>Section 3: Gaps, Challenges, and Future Directions</strong></h2>
<section id="synthesizability-beyond-energy-above-hull" class="level3">
<h3 class="anchored" data-anchor-id="synthesizability-beyond-energy-above-hull"><strong>1. Synthesizability Beyond Energy Above Hull</strong></h3>
<ul>
<li><strong>Current Gap:</strong> DFT-based convex-hull calculations are used as a proxy for stability, but real-world synthesizability also depends on kinetics, metastability, defects, and reaction pathways.<br>
</li>
<li><strong>Challenge:</strong> Develop richer metrics (e.g., finite-temperature free energies, reaction pathways) or incorporate experimental feedback loops.<br>
</li>
<li><strong>Future Direction:</strong> Integration with robotic synthesis or active learning pipelines to iteratively validate and refine generative outputs based on actual experimental outcomes.</li>
</ul>
</section>
<section id="multi-property-and-multi-objective-optimization" class="level3">
<h3 class="anchored" data-anchor-id="multi-property-and-multi-objective-optimization"><strong>2. Multi-Property and Multi-Objective Optimization</strong></h3>
<ul>
<li><strong>Current Gap:</strong> Most models can condition on a single property (e.g., bandgap) or incorporate stability constraints. Real applications often require balancing multiple properties (mechanical, optical, electronic).<br>
</li>
<li><strong>Challenge:</strong> Achieving robust multi-objective optimization in high-dimensional chemical spaces without sacrificing stability or validity.<br>
</li>
<li><strong>Future Direction:</strong> Reinforcement learning or classifier-free guidance within diffusion/flow frameworks, advanced reward mechanisms in GFlowNets, or hierarchical LLM prompts to handle conflicting property requirements.</li>
</ul>
</section>
<section id="scaling-to-complex-large-systems" class="level3">
<h3 class="anchored" data-anchor-id="scaling-to-complex-large-systems"><strong>3. Scaling to Complex &amp; Large Systems</strong></h3>
<ul>
<li><strong>Current Gap:</strong> Many benchmark datasets (MP-20, Perov-5, etc.) have relatively small unit cells. Complex structures (e.g., MOFs, zeolites, large supercells, interfaces) pose difficulties due to more extensive degrees of freedom.<br>
</li>
<li><strong>Challenge:</strong> Representations that can handle hundreds or thousands of atoms while maintaining invertibility and efficiency (e.g., hierarchical generation, segmenting the unit cell).<br>
</li>
<li><strong>Future Direction:</strong> Combining segment-based or hierarchical generative models, possibly guided by known building blocks (in MOFs or zeolites), or coarse-to-fine generation steps.</li>
</ul>
</section>
<section id="data-limitations-and-bias" class="level3">
<h3 class="anchored" data-anchor-id="data-limitations-and-bias"><strong>4. Data Limitations and Bias</strong></h3>
<ul>
<li><strong>Current Gap:</strong> Public databases often contain predominantly stable, simpler structures, leading to biased training. Under-explored chemistries or metastable phases may be missing.<br>
</li>
<li><strong>Challenge:</strong> Enriching training data with negative examples, systematically sampling less-known chemical spaces, or leveraging active learning to expand coverage.<br>
</li>
<li><strong>Future Direction:</strong> Synthetic data generation that obeys fundamental chemical rules, or collaborative data-sharing from experimental labs focusing on “failed” or “inconclusive” syntheses.</li>
</ul>
</section>
<section id="interpretability-and-explainability" class="level3">
<h3 class="anchored" data-anchor-id="interpretability-and-explainability"><strong>5. Interpretability and Explainability</strong></h3>
<ul>
<li><strong>Current Gap:</strong> Generative models, especially LLM-based or deep diffusion approaches, can be black boxes—difficult to probe for “why” certain structures are generated.<br>
</li>
<li><strong>Challenge:</strong> Materials scientists need interpretable insights to trust new predicted structures.<br>
</li>
<li><strong>Future Direction:</strong> Incorporate attention mechanisms or latent-space visualization to highlight chemical reasoning. Model distillation or rule-based post-processing that clarifies how constraints (symmetry, stoichiometry) are satisfied.</li>
</ul>
</section>
<section id="integration-with-experimental-data-automation" class="level3">
<h3 class="anchored" data-anchor-id="integration-with-experimental-data-automation"><strong>6. Integration with Experimental Data &amp; Automation</strong></h3>
<ul>
<li><strong>Current Gap:</strong> Most validation remains at the DFT level. Direct assimilation of partial or noisy experimental data (PXRD, electron microscopy, spectroscopy) is still developing.<br>
</li>
<li><strong>Challenge:</strong> Handling real experimental uncertainties, data heterogeneity, and bridging the “simulation-to-lab” gap.<br>
</li>
<li><strong>Future Direction:</strong> More robust, multimodal generative models that can fuse PXRD with other data sources (e.g., ED/EM images), plus closed-loop experimentation with robotic platforms.</li>
</ul>
</section>
<section id="beyond-bulk-crystals-surfaces-defects-and-amorphous-materials" class="level3">
<h3 class="anchored" data-anchor-id="beyond-bulk-crystals-surfaces-defects-and-amorphous-materials"><strong>7. Beyond Bulk Crystals: Surfaces, Defects, and Amorphous Materials</strong></h3>
<ul>
<li><strong>Current Gap:</strong> Nearly all approaches target perfect periodic crystals. Real materials often contain grain boundaries or defects, or are amorphous (the regime targeted by, e.g., the a2c approach).<br>
</li>
<li><strong>Challenge:</strong> Extending generative methods to partial periodicity, surface reconstructions, heterostructures, or fully amorphous systems.<br>
</li>
<li><strong>Future Direction:</strong> Combining MD-based approaches (melt-quench) with generative networks that can propose stable or metastable configurations, or hierarchical representations for layered/defective structures.</li>
</ul>


</section>
</section>
</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-reuse"><h2 class="anchored quarto-appendix-heading">Reuse</h2><div class="quarto-appendix-contents"><div><a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a></div></div></section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{yin2025,
  author = {Yin, Xiangyu},
  title = {“{Deep} {Research}” for {Scientific} {Literature} {Review}},
  date = {2025-02-09},
  url = {https://xiangyu-yin.com/content/post_deep_research.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-yin2025" class="csl-entry quarto-appendix-citeas">
Yin, Xiangyu. 2025. <span>“<span>‘Deep Research’</span> for Scientific
Literature Review.”</span> February 9. <a href="https://xiangyu-yin.com/content/post_deep_research.html">https://xiangyu-yin.com/content/post_deep_research.html</a>.
</div></div></section></div> ]]></description>
  <guid>https://xiangyu-yin.com/content/post_deep_research.html</guid>
  <pubDate>Sun, 09 Feb 2025 06:00:00 GMT</pubDate>
  <media:content url="https://xiangyu-yin.com/content/img/deep_research_retrieval_heatmap.png" medium="image" type="image/png" height="25" width="144"/>
</item>
<item>
  <title>Identifying Local Doping Patterns Around Atomic Sites</title>
  <dc:creator>Xiangyu Yin</dc:creator>
  <link>https://xiangyu-yin.com/content/post_doping_pattern.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Dopants and point defects play a pivotal role in tailoring material properties—from enhancing catalytic activity in metal oxides to shifting electronic bands in semiconductors. Traditional computational methods, such as full-lattice cluster expansions, supercell approaches, or large-scale Monte Carlo simulations, typically focus on <strong>global</strong> doping distributions across an entire crystal or supercell.</p>
<p>While these global techniques are essential for capturing macroscopic trends, the <strong>local coordination sphere</strong> often dominates the physics and chemistry of defects. For instance, in catalytic oxides, only atoms immediately adjacent to a dopant participate directly in reactions. In semiconductor physics, local defect complexes can introduce deep electronic levels that critically affect carrier lifetimes. Similarly, in solid electrolytes or ionic conductors, the clustering of dopants and vacancies typically occurs within the first or second coordination shells, significantly influencing ionic transport.</p>
<p>Moreover, dopants are frequently introduced at <strong>low concentrations</strong> in practical scenarios. Under these conditions, global symmetry in a large periodic supercell often breaks down locally, causing the dopant and its immediate coordination environment to exhibit reduced or entirely different symmetry compared to the parent crystal structure. Focusing specifically on a local cluster surrounding the dopant effectively circumvents the computational inconvenience associated with large periodic supercells, providing a more realistic representation of these site-specific distortions.</p>
<p>Additionally, from a computational standpoint, local doping patterns can be enumerated or sampled <strong>efficiently</strong> and <strong>rigorously</strong>. Such localized doping patterns can recur throughout the crystal structure, making this approach naturally scalable. Local configurations can be replicated or shifted to equivalent or near-equivalent sites without requiring extensive periodic supercells containing the dopant in every translational repeat, thereby avoiding a combinatorial explosion.</p>
</section>
<section id="code-description" class="level2">
<h2 class="anchored" data-anchor-id="code-description">Code Description</h2>
<p>Given the significance of local doping patterns, this post presents a <a href="https://gist.github.com/xyin-anl/b220db49fdb72cf0d23b874d15f02121">Python script</a> designed to:</p>
<ol type="1">
<li><p><strong>Identify Neighbors</strong>: Determine atomic species within a specified radius of a target site, forming the essential first step in analyzing the local atomic environment.</p></li>
<li><p><strong>Utilize Symmetry</strong>: Leverage crystallographic site-symmetry operations to identify equivalent neighbor positions. This method eliminates redundancy by consolidating symmetrically equivalent doping configurations.</p></li>
<li><p><strong>Generate Configurations with Controlled Composition</strong>: Perform exhaustive enumeration of all possible local doping arrangements in manageable systems. For more complex systems with extensive possibilities, employ quasi-random Sobol sampling to efficiently explore the configuration space while maintaining targeted dopant concentrations.</p></li>
<li><p><strong>Represent and Visualize Patterns</strong>: Produce structured JSON outputs detailing unique doping patterns and neighbor orbits obtained through site-symmetry operations. Interactive visualization capabilities are provided via the <a href="https://docs.crystaltoolkit.org/"><code>crystaltoolkit</code></a> package, allowing intuitive inspection of atomic arrangements.</p></li>
</ol>
</section>
<section id="code-walkthrough" class="level2">
<h2 class="anchored" data-anchor-id="code-walkthrough">Code Walkthrough</h2>
<p>Below, I highlight the essential Python constructs that make the workflow possible.</p>
<section id="finding-and-storing-neighbor-positions" class="level3">
<h3 class="anchored" data-anchor-id="finding-and-storing-neighbor-positions">Finding and Storing Neighbor Positions</h3>
<p>After loading a structure (e.g., from a <code>cif</code> file) and selecting a target site <code>site_index</code>, the code lists all neighbors within a user-defined cutoff <code>cutoff</code>. It stores their <strong>fractional coordinates</strong> relative to the reference site:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1">structure <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Structure.from_file(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"cubic_batio3.cif"</span>)</span>
<span id="cb1-2">site_index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb1-3">cutoff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">7.0</span></span>
<span id="cb1-4"></span>
<span id="cb1-5">target_site <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> structure[site_index]</span>
<span id="cb1-6">neighbors <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> structure.get_sites_in_sphere(</span>
<span id="cb1-7">    pt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>target_site.coords,</span>
<span id="cb1-8">    r<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>cutoff,</span>
<span id="cb1-9">    include_index<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb1-10">)</span>
<span id="cb1-11"></span>
<span id="cb1-12">rel_positions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-13">neighbor_indices <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-14"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (site, dist, idx) <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> neighbors:</span>
<span id="cb1-15">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> site_index:</span>
<span id="cb1-16">        rel_positions.append(</span>
<span id="cb1-17">            structure.frac_coords[idx] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> structure.frac_coords[site_index]</span>
<span id="cb1-18">        )</span>
<span id="cb1-19">        neighbor_indices.append(idx)</span>
<span id="cb1-20">rel_positions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mod(rel_positions, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span>)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># ensure within [0,1) for fractional coords</span></span></code></pre></div></div>
</section>
<section id="building-a-kd-tree-and-mapping-symmetry-operations" class="level3">
<h3 class="anchored" data-anchor-id="building-a-kd-tree-and-mapping-symmetry-operations">Building a KD-Tree and Mapping Symmetry Operations</h3>
<p>To handle symmetry, the code retrieves the <strong>site-symmetry group</strong>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1">sym_ops <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> get_site_symmetries(structure, site_index<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>site_index)</span></code></pre></div></div>
<p>However, not all of these operations are distinct; some may differ only by trivial translations. The code filters duplicates by comparing rotation matrices:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_unique_rotations(sym_ops, decimals: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>):</span>
<span id="cb3-2">    unique_ops <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb3-3">    seen <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">set</span>()</span>
<span id="cb3-4">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> op <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> sym_ops:</span>
<span id="cb3-5">        rot_flat <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">tuple</span>(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">round</span>(op.rotation_matrix.ravel(), decimals))</span>
<span id="cb3-6">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> rot_flat <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> seen:</span>
<span id="cb3-7">            seen.add(rot_flat)</span>
<span id="cb3-8">            unique_ops.append(op)</span>
<span id="cb3-9">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> unique_ops</span>
<span id="cb3-10"></span>
<span id="cb3-11">sym_ops <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> get_unique_rotations(sym_ops)</span></code></pre></div></div>
<p>Next, the code determines how each operation permutes the neighbor list. It first builds a <strong>KD-tree</strong> from <code>rel_positions</code>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> build_kdtree(rel_positions: np.ndarray) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> cKDTree:</span>
<span id="cb4-2">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> cKDTree(rel_positions)</span>
<span id="cb4-3"></span>
<span id="cb4-4">tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> build_kdtree(rel_positions)</span></code></pre></div></div>
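<p>One detail worth keeping in mind: positions wrapped into [0, 1) live on a torus, so a coordinate of 0.999 is close to 0.0 even though a plain Euclidean KD-tree treats the two as far apart. The following is a minimal sketch, assuming SciPy's <code>cKDTree</code> with its <code>boxsize</code> argument; the two points are made-up values, and this is an illustration rather than a description of what the linked script does. Distances are in fractional units, as in the snippets above.</p>

```python
import numpy as np
from scipy.spatial import cKDTree

# Two illustrative fractional positions (hypothetical values, already in [0, 1)).
points = np.array([
    [0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0],
])

plain_tree = cKDTree(points)                  # ordinary Euclidean distances
periodic_tree = cKDTree(points, boxsize=1.0)  # distances wrap around the unit cell

# A query that differs from points[0] only by a small shift across the cell boundary.
query = np.array([0.999, 0.5, 0.5])

plain_dist, plain_idx = plain_tree.query(query)
periodic_dist, periodic_idx = periodic_tree.query(query)

print("plain match:", plain_idx, round(float(plain_dist), 3))
print("periodic match:", periodic_idx, round(float(periodic_dist), 3))
```

<p>With the periodic tree, the query matches the first point at a distance of about 0.001, whereas the plain tree picks the second point instead.</p>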
<p>Then, for each symmetry operation, the code transforms the relative positions and queries the KD-tree to identify which neighbor index best matches the transformed position:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> find_permutation(sym_op, rel_positions: np.ndarray, tree, rtol: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-2</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> List[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>]:</span>
<span id="cb5-2">    N <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(rel_positions)</span>
<span id="cb5-3">    perm <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> N</span>
<span id="cb5-4">    transformed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array([sym_op.operate(pos) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> pos <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> rel_positions])</span>
<span id="cb5-5">    transformed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mod(transformed, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span>)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># keep fractional coords in [0,1)</span></span>
<span id="cb5-6"></span>
<span id="cb5-7">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i, tpos <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">enumerate</span>(transformed):</span>
<span id="cb5-8">        dist, idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.query(tpos)</span>
<span id="cb5-9">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> dist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> rtol:</span>
<span id="cb5-10">            perm[i] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> idx</span>
<span id="cb5-11">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
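<span id="cb5-12">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># no neighbor within tolerance: fail loudly rather than leave None in perm</span></span>
<span id="cb5-13">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">raise</span> ValueError(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"no matching neighbor for transformed position"</span>)</span>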
<span id="cb5-14">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> perm</span></code></pre></div></div>
<p>By repeating this for every unique symmetry operation, the code collects a list of <strong>permutations</strong> mapping each neighbor index to another under the operation.</p>
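<p>To make the idea of a permutation concrete, here is a small toy sketch that does not need a structure file. It assumes four hypothetical neighbors arranged in a square around the central site and two rotations about the z-axis; the helper name <code>permutation_for</code> is illustrative and not part of the linked script. The positions are kept unwrapped for readability.</p>

```python
import numpy as np
from scipy.spatial import cKDTree

# Toy neighbor set: four positions around a central site (hypothetical, not from a CIF).
rel_positions = np.array([
    [0.25, 0.0, 0.0],    # index 0: +x
    [0.0, 0.25, 0.0],    # index 1: +y
    [-0.25, 0.0, 0.0],   # index 2: -x
    [0.0, -0.25, 0.0],   # index 3: -y
])

# Two rotations about z that fix the central site: 90 degrees and 180 degrees.
rot90 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
rot180 = rot90 @ rot90

def permutation_for(rotation, positions, tol=1e-6):
    """Return perm where perm[i] is the index that neighbor i is mapped onto."""
    tree = cKDTree(positions)
    # Each row of (positions @ rotation.T) is the rotated image of one neighbor.
    dists, idxs = tree.query(positions @ rotation.T)
    if not np.allclose(dists, 0.0, atol=tol):
        raise ValueError("rotation does not map the neighbor set onto itself")
    return idxs.tolist()

permutations = [permutation_for(r, rel_positions) for r in (rot90, rot180)]
print(permutations)  # prints [[1, 2, 3, 0], [2, 3, 0, 1]]
```

<p>The 90 degree rotation cycles the four neighbors, while the 180 degree rotation swaps opposite pairs.</p>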
</section>
<section id="orbits-under-the-site-symmetry-group" class="level3">
<h3 class="anchored" data-anchor-id="orbits-under-the-site-symmetry-group">Orbits Under the Site-Symmetry Group</h3>
<p>With permutations in hand, the code groups neighbor indices into <strong>orbits</strong>. Indices that map onto each other by any symmetry operation lie in the same orbit. This partitioning drastically reduces the number of distinct doping configurations:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_orbits_under_group(N: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>, permutations: List[List[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>]]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> List[List[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>]]:</span>
<span id="cb6-2">    visited <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> N</span>
<span id="cb6-3">    orbits <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb6-4"></span>
<span id="cb6-5">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> start_idx <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(N):</span>
<span id="cb6-6">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> visited[start_idx]:</span>
<span id="cb6-7">            orbit <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">set</span>()</span>
<span id="cb6-8">            queue <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [start_idx]</span>
<span id="cb6-9">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> queue:</span>
<span id="cb6-10">                current <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> queue.pop()</span>
<span id="cb6-11">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> current <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> orbit:</span>
<span id="cb6-12">                    orbit.add(current)</span>
<span id="cb6-13">                    visited[current] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb6-14">                    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> perm <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> permutations:</span>
<span id="cb6-15">                        neighbor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> perm[current]</span>
<span id="cb6-16">                        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> visited[neighbor]:</span>
<span id="cb6-17">                            queue.append(neighbor)</span>
<span id="cb6-18">            orbits.append(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sorted</span>(orbit))</span>
<span id="cb6-19">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> orbits</span></code></pre></div></div>
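<p>The same partition can also be computed with a union-find (disjoint-set) structure, which is a different technique from the traversal shown above but yields the same orbits. The sketch below is self-contained and runs on hypothetical toy permutations: four neighbors cycled by a four-fold rotation plus one neighbor that every operation leaves fixed. The function name <code>orbits_by_union_find</code> is an illustrative choice.</p>

```python
def orbits_by_union_find(n, permutations):
    """Group indices 0..n-1 into orbits by merging i with perm[i] for every permutation."""
    parent = list(range(n))

    def find(i):
        # Walk up to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for perm in permutations:
        for i, j in enumerate(perm):
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    # Collect the members of each root into one orbit.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy input: indices 0-3 are cycled by a four-fold rotation, index 4 is left fixed.
toy_permutations = [
    [1, 2, 3, 0, 4],   # 90 degree rotation
    [2, 3, 0, 1, 4],   # 180 degree rotation
]
toy_orbits = orbits_by_union_find(5, toy_permutations)
print(toy_orbits)  # prints [[0, 1, 2, 3], [4]]
```

<p>Five neighbor sites collapse into two orbits, so any symmetry argument only has to distinguish the four-member ring from the fixed site.</p>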
</section>
<section id="enumerating-or-sampling-doping-patterns" class="level3">
<h3 class="anchored" data-anchor-id="enumerating-or-sampling-doping-patterns">Enumerating or Sampling Doping Patterns</h3>
<p>Suppose I have a <code>doping_dict</code> that specifies which elements can be replaced and by what dopants. For instance:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1">doping_dict <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {</span>
<span id="cb7-2">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ba"</span>: [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ba"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sr"</span>],</span>
<span id="cb7-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ti"</span>: [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ti"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Zr"</span>]</span>
<span id="cb7-4">}</span></code></pre></div></div>
<p>If a neighbor is originally <code>"Ba"</code>, it can become <code>"Ba"</code> or <code>"Sr"</code>; if it is <code>"Ti"</code>, it can become <code>"Ti"</code> or <code>"Zr"</code>. After grouping neighbors into orbits, I can <strong>enumerate</strong> doping assignments within each orbit and apply backtracking to avoid counting symmetry-equivalent patterns multiple times.</p>
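<p>As a rough illustration of symmetry-aware enumeration, the sketch below uses brute-force canonicalization instead of the backtracking mentioned above: every labeling is mapped to the lexicographically smallest of its symmetry images, and only one labeling per canonical key is kept. It assumes the permutation list contains every element of the group (not just generators), and the names <code>unique_labelings</code> and <code>c4_group</code> are hypothetical.</p>

```python
from itertools import product

def unique_labelings(species, doping_dict, permutations):
    """Enumerate labelings of the neighbor sites, keeping one per symmetry class."""
    # Allowed species per site; elements missing from doping_dict stay unchanged.
    choices = [doping_dict.get(s, [s]) for s in species]
    seen = set()
    unique = []
    for labeling in product(*choices):
        # Apply every permutation and use the smallest image as the canonical key.
        images = [tuple(labeling[p] for p in perm) for perm in permutations]
        key = min(images + [labeling])
        if key not in seen:
            seen.add(key)
            unique.append(labeling)
    return unique

# Toy case: four Ti neighbors on a ring, acted on by the full four-fold rotation group.
species = ["Ti", "Ti", "Ti", "Ti"]
doping_dict = {"Ti": ["Ti", "Zr"]}
c4_group = [
    [0, 1, 2, 3],   # identity
    [1, 2, 3, 0],   # 90 degrees
    [2, 3, 0, 1],   # 180 degrees
    [3, 0, 1, 2],   # 270 degrees
]

unique = unique_labelings(species, doping_dict, c4_group)
print(len(unique))  # 16 raw labelings reduce to 6 symmetry-distinct ones
```

<p>This brute-force version visits every raw labeling, so it is only practical for small neighborhoods; pruning equivalent branches early is what makes a backtracking approach scale better.</p>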
<p>For large orbits or many dopant species, the number of raw combinations grows exponentially with the number of neighbor sites. The code checks whether the total number of combinations exceeds a threshold (e.g., <code>1e6</code>). If it does, it switches to <strong>quasi-random Sobol sampling</strong>, which spreads samples evenly over the configuration space while still respecting the doping fraction constraints.</p>
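<p>A minimal sketch of this switch, assuming SciPy's <code>scipy.stats.qmc.Sobol</code> sampler with one Sobol dimension per neighbor site. The helper names, the 14-site neighborhood, and the small demo threshold are all hypothetical, and the linked script may map Sobol points to labelings differently. Scrambling is turned off here only to keep the output reproducible without a seed.</p>

```python
import math
from scipy.stats import qmc

def count_combinations(choices):
    """Total number of raw labelings: product of the number of options per site."""
    return math.prod(len(options) for options in choices)

def sobol_labelings(choices, n_samples):
    """Draw quasi-random labelings, using one Sobol dimension per site."""
    sampler = qmc.Sobol(d=len(choices), scramble=False)
    points = sampler.random(n_samples)  # shape (n_samples, n_sites), values in [0, 1)
    labelings = []
    for point in points:
        # Map each coordinate in [0, 1) to an index into that site's option list.
        picks = [min(int(u * len(opts)), len(opts) - 1) for u, opts in zip(point, choices)]
        labelings.append(tuple(opts[k] for opts, k in zip(choices, picks)))
    return labelings

# Hypothetical neighborhood: 8 Ba sites and 6 Ti sites, two options each.
choices = [["Ba", "Sr"]] * 8 + [["Ti", "Zr"]] * 6
max_enum_threshold = 1000  # deliberately small so the demo takes the sampling branch

total = count_combinations(choices)
print("raw combinations:", total)  # 2**14 = 16384

if total > max_enum_threshold:
    # Use a power of two so the Sobol sequence keeps its balance properties.
    samples = sobol_labelings(choices, 16)
    print("sampled", len(samples), "labelings instead of enumerating")
```

<p>Sampled labelings would still need the symmetry reduction and the fraction check before being accepted.</p>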
</section>
<section id="applying-doping-fraction-constraints" class="level3">
<h3 class="anchored" data-anchor-id="applying-doping-fraction-constraints">Applying Doping Fraction Constraints</h3>
<p>Constraints of the form</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode json code-with-copy"><code class="sourceCode json"><span id="cb8-1"><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"doping_fraction_constraints"</span><span class="er" style="color: #AD0000;
background-color: null;
font-style: inherit;">:</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">{</span></span>
<span id="cb8-2">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"Sr"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">[</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span><span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">,</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span><span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">]</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">,</span></span>
<span id="cb8-3">  <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">"Zr"</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">:</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">[</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span><span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">,</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">]</span></span>
<span id="cb8-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">}</span></span></code></pre></div></div>
<p>ensure, for example, that the overall fraction of <code>"Sr"</code> among the dopable neighbor sites lies between 0.0 and 0.3. The code checks each final labeling to ensure it meets these fraction bounds before accepting it into the final set.</p>
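<p>A check of this kind could look like the sketch below. It assumes the fraction of a dopant is counted over all neighbor sites whose original element appears in <code>doping_dict</code>; that denominator, the function name, and the five-site toy neighborhood are assumptions made for illustration.</p>

```python
from collections import Counter

def satisfies_fraction_constraints(labeling, original_species, doping_dict, constraints):
    """Check dopant fractions among dopable sites against (low, high) bounds."""
    # Only sites whose original element appears in doping_dict count as dopable.
    dopable = [new for new, old in zip(labeling, original_species) if old in doping_dict]
    if not dopable:
        return True
    counts = Counter(dopable)
    for dopant, (low, high) in constraints.items():
        fraction = counts[dopant] / len(dopable)
        if fraction > high or low > fraction:
            return False
    return True

original_species = ["Ba", "Ba", "Ti", "Ti", "O"]   # hypothetical neighbor list
doping_dict = {"Ba": ["Ba", "Sr"], "Ti": ["Ti", "Zr"]}
constraints = {"Sr": (0.0, 0.3), "Zr": (0.0, 0.5)}

one_sr = ("Sr", "Ba", "Ti", "Ti", "O")   # Sr fraction 1/4 among the four dopable sites
two_sr = ("Sr", "Sr", "Ti", "Ti", "O")   # Sr fraction 2/4 exceeds the 0.3 upper bound

print(satisfies_fraction_constraints(one_sr, original_species, doping_dict, constraints))
print(satisfies_fraction_constraints(two_sr, original_species, doping_dict, constraints))
```

<p>The first labeling is accepted and the second is rejected, since the oxygen site is not dopable and does not enter the denominator.</p>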
</section>
</section>
<section id="example-usage" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="example-usage">Example Usage</h2>
<p>Below are two common ways to use this code: <strong>command-line</strong> or <strong>direct Python import</strong>.</p>
<section id="command-line" class="level3">
<h3 class="anchored" data-anchor-id="command-line">Command-Line</h3>
<ol type="1">
<li><p><strong>Structure Input</strong>: Provide a <code>cif</code> file or other supported format for your crystal.</p></li>
<li><p><strong>Configuration JSON</strong>: Specify doping species and fraction constraints.</p></li>
<li><p><strong>Run</strong>:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb9-1"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">python</span> find_doping_pattern.py <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb9-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--structure</span> cubic_batio3.cif <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb9-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--site-index</span> 0 <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb9-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--cutoff</span> 7.0 <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb9-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--config-json</span> doping_config.json <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb9-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--max-enum-threshold</span> 1000000 <span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">\</span></span>
<span id="cb9-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">--output</span> results.json</span></code></pre></div></div></li>
</ol>
<p>The script identifies the neighbors of site <code>0</code> within 7 Å, enumerates doping patterns (or samples them if the number of combinations exceeds the threshold of <code>1e6</code>), and saves the unique configurations to <code>results.json</code>.</p>
</section>
<section id="python-usage" class="level3">
<h3 class="anchored" data-anchor-id="python-usage">Python Usage</h3>
<p>You can also import a driver function in your own Python workflow:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> doping_analysis <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> run_doping_analysis</span>
<span id="cb10-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> pymatgen.core <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Structure</span>
<span id="cb10-3"></span>
<span id="cb10-4">structure <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Structure.from_file(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"cubic_batio3.cif"</span>)</span>
<span id="cb10-5"></span>
<span id="cb10-6">doping_dict <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ba"</span>: [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ba"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sr"</span>], <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ti"</span>: [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ti"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Zr"</span>]}</span>
<span id="cb10-7">doping_fraction_constraints <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sr"</span>: (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Zr"</span>: (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)}</span>
<span id="cb10-8"></span>
<span id="cb10-9">results <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> run_doping_analysis(</span>
<span id="cb10-10">    structure<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>structure,</span>
<span id="cb10-11">    site_index<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,</span>
<span id="cb10-12">    cutoff<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">7.0</span>,</span>
<span id="cb10-13">    doping_dict<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>doping_dict,</span>
<span id="cb10-14">    doping_fraction_constraints<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>doping_fraction_constraints,</span>
<span id="cb10-15">    max_enum_threshold<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1_000_000</span></span>
<span id="cb10-16">)</span>
<span id="cb10-17"></span>
<span id="cb10-18"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Number of final patterns:"</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(results[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"final_patterns"</span>]))</span>
<span id="cb10-19"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Orbits:"</span>, results[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"orbits"</span>])</span></code></pre></div></div>
</section>
<section id="interactive-visualization" class="level3 page-columns page-full">
<h3 class="anchored" data-anchor-id="interactive-visualization">Interactive Visualization</h3>
<p>To help users explore and understand the generated doping patterns, the code includes an interactive visualization interface built with Dash and Crystal Toolkit. The visualization component allows you to:</p>
<ol type="1">
<li><strong>Select Different Patterns</strong>: Browse through all generated doping patterns using a dropdown menu</li>
<li><strong>Display Options</strong>: Toggle between showing all atoms or only the dopant atoms</li>
<li><strong>3D Interaction</strong>: Rotate, zoom, and pan to examine the local atomic environment from any angle</li>
</ol>
<div class="quarto-figure quarto-figure-center page-columns page-full">
<figure class="figure page-columns page-full">
<p><img src="https://xiangyu-yin.com/content/img/doping_pattern_example_merged.png" class="img-fluid figure-img"></p>
<figcaption class="margin-caption">Example doping pattern visualization: (left) all atoms, (right) only dopants.</figcaption>
</figure>
</div>
</section>
</section>
<section id="how-this-code-could-be-used" class="level2">
<h2 class="anchored" data-anchor-id="how-this-code-could-be-used">How This Code Could Be Used</h2>
<ol type="1">
<li><p><strong>DFT Cluster Setup</strong><br>
By enumerating unique local doping patterns, you can generate input structures for small-cluster or supercell-based <em>ab initio</em> calculations, ensuring that you capture all relevant local doping motifs (without duplicates) near a site of interest.</p></li>
<li><p><strong>Machine Learning Training Data</strong><br>
When training ML potentials or neural network force fields, coverage of doped environments is essential. This code systematically creates local doping configurations to feed into high-fidelity calculations (e.g., DFT) for label generation.</p></li>
<li><p><strong>High-Throughput Screening</strong><br>
Researchers exploring doping across multiple materials or sites can automate the pipeline: for each material, each site, and each doping fraction range, the script enumerates or samples doping arrangements.</p></li>
<li><p><strong>Local Defect Complex Analysis</strong><br>
In ion conductors or oxide-based electrolytes (like ceria or zirconia), dopants often cluster with oxygen vacancies. With minor modifications (e.g., including <code>"Vac"</code> as a valid species), you can systematically explore dopant–vacancy complexes.</p></li>
<li><p><strong>Interfacing with Cluster Expansion</strong><br>
Even if you eventually plan a global cluster expansion, having a library of local dopant environments is valuable for building an initial pool of training structures that sufficiently represent local doping motifs.</p></li>
</ol>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-reuse"><h2 class="anchored quarto-appendix-heading">Reuse</h2><div class="quarto-appendix-contents"><div><a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a></div></div></section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{yin2025,
  author = {Yin, Xiangyu},
  title = {Identifying {Local} {Doping} {Patterns} {Around} {Atomic}
    {Sites}},
  date = {2025-02-02},
  url = {https://xiangyu-yin.com/content/post_doping_pattern.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-yin2025" class="csl-entry quarto-appendix-citeas">
Yin, Xiangyu. 2025. <span>“Identifying Local Doping Patterns Around
Atomic Sites.”</span> February 2. <a href="https://xiangyu-yin.com/content/post_doping_pattern.html">https://xiangyu-yin.com/content/post_doping_pattern.html</a>.
</div></div></section></div> ]]></description>
  <guid>https://xiangyu-yin.com/content/post_doping_pattern.html</guid>
  <pubDate>Sun, 02 Feb 2025 06:00:00 GMT</pubDate>
</item>
<item>
  <title>“Wave Tracker” for Pressure Swing Adsorption</title>
  <dc:creator>Xiangyu Yin</dc:creator>
  <link>https://xiangyu-yin.com/content/post_wave_tracking.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Pressure Swing Adsorption (PSA) is a widely adopted process for gas separation. A foundational analytical framework for PSA is <strong>equilibrium theory</strong>, developed in works such as <a href="https://www.sciencedirect.com/science/article/abs/pii/0009250985851393">Knaebel and Hill (1985)</a>. Equilibrium theory simplifies PSA analysis by neglecting mass transfer resistances, dispersive effects, and other dissipative phenomena. Under this framework, several assumptions are typically made:</p>
<ul>
<li>Linear adsorption isotherms.</li>
<li>Instantaneous equilibrium between gas and adsorbed phases.</li>
<li>Isothermal operation.</li>
<li>Ideal gas behavior.</li>
</ul>
<p>With these assumptions, the governing equations simplify significantly, facilitating quick numerical or even analytical solutions. Furthermore, the complex dynamics within adsorption columns can usually be adequately represented by a few concentration (or partial pressure) fronts and characteristic lines between these fronts, enhancing interpretability and visualization. Consequently, equilibrium theory provides a computationally inexpensive yet insightful approach to elucidate key characteristics of PSA processes across various boundary conditions, without resorting to computationally intensive PDE-based simulations. Equilibrium theory is particularly beneficial for:</p>
<ul>
<li><strong>Guiding</strong> cycle designs and determining step sequences.</li>
<li><strong>Providing</strong> fast approximations, initial guesses, or surrogate models for more rigorous PDE-based analyses.</li>
<li><strong>Enabling</strong> simplified optimization and parametric sensitivity studies.</li>
<li><strong>Estimating</strong> process-level performance metrics for large-scale adsorbent screening and optimization.</li>
</ul>
<p>In this blog post, I first provide an overview of equilibrium theory as applied to PSA. Next, I walk through a <a href="https://gist.github.com/xyin-anl/6a077e8afe793b87b3f402ce4277c1b0">Python implementation</a> of wave tracking based on equilibrium theory. Lastly, I present several examples that show how to use the code and how to interpret the resulting wave diagrams.</p>
</section>
<section id="equilibrium-theory-for-psa" class="level2">
<h2 class="anchored" data-anchor-id="equilibrium-theory-for-psa">Equilibrium Theory for PSA</h2>
<section id="basic-setting" class="level3">
<h3 class="anchored" data-anchor-id="basic-setting">Basic Setting</h3>
<p>Consider a packed bed of length <img src="https://latex.codecogs.com/png.latex?L"> used for separating a binary gas mixture consisting of:</p>
<ul>
<li>Component <img src="https://latex.codecogs.com/png.latex?A"> (the <em>heavy</em> or more strongly adsorbed species),</li>
<li>Component <img src="https://latex.codecogs.com/png.latex?B"> (the <em>light</em> or less strongly adsorbed species).</li>
</ul>
<p>The total pressure is denoted by <img src="https://latex.codecogs.com/png.latex?P">, so <img src="https://latex.codecogs.com/png.latex?p_A%20+%20p_B%20=%20P">, where <img src="https://latex.codecogs.com/png.latex?p_A"> and <img src="https://latex.codecogs.com/png.latex?p_B"> are the partial pressures of <img src="https://latex.codecogs.com/png.latex?A"> and <img src="https://latex.codecogs.com/png.latex?B">, respectively.</p>
<p><strong>Key assumptions</strong>:</p>
<ol type="1">
<li><strong>Linear isotherms</strong> of the form: <img src="https://latex.codecogs.com/png.latex?%0Aq_A%20=%20k_A%5C,p_A,%20%5Cquad%20q_B%20=%20k_B%5C,p_B,%0A"> where <img src="https://latex.codecogs.com/png.latex?q_A"> and <img src="https://latex.codecogs.com/png.latex?q_B"> are the adsorbed amounts of <img src="https://latex.codecogs.com/png.latex?A"> and <img src="https://latex.codecogs.com/png.latex?B">.<br>
</li>
<li><strong>Instantaneous equilibrium</strong> between the gas and adsorbed phases (no mass-transfer resistance or kinetic limitations).<br>
</li>
<li><strong>Ideal gas behavior</strong> and <strong>isothermal operation</strong> (temperature <img src="https://latex.codecogs.com/png.latex?T"> is constant).<br>
</li>
<li>The adsorbent bed has an <strong>interstitial void fraction</strong> <img src="https://latex.codecogs.com/png.latex?%5Cvarepsilon">, meaning a fraction <img src="https://latex.codecogs.com/png.latex?%5Cvarepsilon"> of the bed volume is accessible to gas flow, and <img src="https://latex.codecogs.com/png.latex?(1-%5Cvarepsilon)"> is occupied by the solid adsorbent.<br>
</li>
<li><strong>Consequence</strong>: under these assumptions, the evolution of gas composition and pressure along the bed is described by a system of hyperbolic PDEs, which can be solved along characteristic curves.</li>
</ol>
</section>
<section id="dimensionless-groups" class="level3">
<h3 class="anchored" data-anchor-id="dimensionless-groups">Dimensionless Groups</h3>
<p>To simplify the governing equations, it is helpful to define: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbeta_A%20%5C;=%5C;%20%5Cfrac%7B1%7D%7B%5C,1%20%5C;+%5C;%20%5Cfrac%7B(1-%5Cvarepsilon)%7D%7B%5Cvarepsilon%7D%5C,k_A%5C,%7D,%0A%5Cquad%0A%5Cbeta_B%20%5C;=%5C;%20%5Cfrac%7B1%7D%7B%5C,1%20%5C;+%5C;%20%5Cfrac%7B(1-%5Cvarepsilon)%7D%7B%5Cvarepsilon%7D%5C,k_B%5C,%7D,%0A%5Cquad%0A%5Cbeta%20%5C;=%5C;%20%5Cfrac%7B%5Cbeta_A%7D%7B%5Cbeta_B%7D%20%5C;%5Cle%5C;%201.%0A"> These dimensionless parameters incorporate the adsorption constants <img src="https://latex.codecogs.com/png.latex?k_A">, <img src="https://latex.codecogs.com/png.latex?k_B"> and the void fraction <img src="https://latex.codecogs.com/png.latex?%5Cvarepsilon">. Because <img src="https://latex.codecogs.com/png.latex?A"> is the more strongly adsorbed species (<img src="https://latex.codecogs.com/png.latex?k_A%20%5Cge%20k_B">), <img src="https://latex.codecogs.com/png.latex?%5Cbeta_A%20%5Cle%20%5Cbeta_B">. Notably, <img src="https://latex.codecogs.com/png.latex?%5Cbeta"> provides a measure of the separation factor: for large <img src="https://latex.codecogs.com/png.latex?k_A"> relative to <img src="https://latex.codecogs.com/png.latex?k_B">, <img src="https://latex.codecogs.com/png.latex?%5Cbeta"> becomes small, reflecting stronger selectivity for the heavy species <img src="https://latex.codecogs.com/png.latex?A">.</p>
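<p>As a quick numerical check, the groups can be computed directly from the isotherm constants and the void fraction. The sketch below is illustrative only: the helper <code>beta_i</code> and the parameter values are assumptions, not part of the original code. Here <code>beta</code> is taken as the ratio <code>beta_A / beta_B</code>, which is the choice that stays at or below 1 and becomes small when <code>k_A</code> is much larger than <code>k_B</code>.</p>

```python
def beta_i(k, eps):
    """Return 1 / (1 + (1 - eps) / eps * k) for one component."""
    return 1.0 / (1.0 + (1.0 - eps) / eps * k)


# Illustrative (hypothetical) values, not taken from the post
eps = 0.4            # interstitial void fraction
k_A, k_B = 6.0, 1.0  # linear isotherm constants, A is the heavy species

beta_A = beta_i(k_A, eps)
beta_B = beta_i(k_B, eps)
beta = beta_A / beta_B

print(f"beta_A = {beta_A:.3f}, beta_B = {beta_B:.3f}, beta = {beta:.3f}")
```

<p>Increasing <code>k_A</code> while holding <code>k_B</code> fixed lowers both <code>beta_A</code> and <code>beta</code>, which matches the interpretation of a small <code>beta</code> as strong selectivity for the heavy species.</p>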
</section>
<section id="shocks-and-waves" class="level3">
<h3 class="anchored" data-anchor-id="shocks-and-waves">Shocks and Waves</h3>
<p>In a typical PSA cycle, the column may experience phases of <strong>constant pressure</strong> (e.g., during a high-pressure feed step) and phases of <strong>changing pressure</strong> (e.g., blowdown or pressurization steps). During these steps, characteristic curves in the <img src="https://latex.codecogs.com/png.latex?(t,z)"> plane can converge (forming shocks) or diverge (forming expansion or simple waves). I outline the resulting wave phenomena below.</p>
<section id="constant-pressure-operation" class="level4">
<h4 class="anchored" data-anchor-id="constant-pressure-operation">Constant-Pressure Operation</h4>
<p>Suppose the gas enters the bed at one end at a fixed total pressure <img src="https://latex.codecogs.com/png.latex?P">, as in a high-pressure feed step. The velocity <img src="https://latex.codecogs.com/png.latex?u"> in the bed then depends on the local composition <img src="https://latex.codecogs.com/png.latex?y"> (the mole fraction of the heavy component <img src="https://latex.codecogs.com/png.latex?A"> in the gas phase). The resulting relation under constant <img src="https://latex.codecogs.com/png.latex?P"> is: <img src="https://latex.codecogs.com/png.latex?%0Au(z)%20%5C;%5Cpropto%5C;%20%5Cfrac%7B1%7D%7B%5C,1%20+%20(%5Cbeta%20-%201)%5C,y(z)%5C,%7D.%0A"></p>
<ul>
<li><strong>Shock waves</strong> form when a gas richer in the heavy component <img src="https://latex.codecogs.com/png.latex?A"> pushes into a region with lower <img src="https://latex.codecogs.com/png.latex?A"> concentration. In this case, characteristics merge, and the shock speed <img src="https://latex.codecogs.com/png.latex?u_S"> (in dimensionless form) is given by <img src="https://latex.codecogs.com/png.latex?%0Au_S%20%5C;=%5C;%20%5Cbeta_A%20%5C,%5Cfrac%7B%5C,u_2%5C,y_2%20%5C;-%5C;%20u_1%5C,y_1%5C,%7D%7B%5C,y_2%20%5C;-%5C;%20y_1%5C,%7D,%0A"> where <img src="https://latex.codecogs.com/png.latex?(u_1,%20y_1)"> and <img src="https://latex.codecogs.com/png.latex?(u_2,%20y_2)"> are the velocity and composition just ahead of and behind the shock, respectively.</li>
<li><strong>Simple waves</strong> (or expansion waves) appear when a more dilute gas (lower <img src="https://latex.codecogs.com/png.latex?y">) contacts a region of higher <img src="https://latex.codecogs.com/png.latex?y">. In this case, the characteristics diverge. Mathematically, the velocity and concentration along the characteristic can be determined by integrating the relevant PDE or by directly applying the method of characteristics, which yields a continuous fan of solutions rather than a jump (shock).</li>
</ul>
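<p>The two constant-pressure relations above can be combined into a small worked example. This is a minimal sketch under stated assumptions: the function names <code>velocity_at</code> and <code>shock_speed</code> and all numerical values are hypothetical, and the code is not the implementation from the linked gist. It evaluates the velocity ahead of a front from the velocity behind it, then applies the shock-speed formula.</p>

```python
def velocity_at(y, u_ref, y_ref, beta):
    """Velocity where the composition is y, given u_ref at composition y_ref.

    Uses u proportional to 1 / (1 + (beta - 1) * y) at constant pressure.
    """
    return u_ref * (1.0 + (beta - 1.0) * y_ref) / (1.0 + (beta - 1.0) * y)


def shock_speed(u1, y1, u2, y2, beta_A):
    """Shock speed from the states ahead (u1, y1) and behind (u2, y2)."""
    return beta_A * (u2 * y2 - u1 * y1) / (y2 - y1)


# Illustrative (hypothetical) parameters
beta_A, beta = 0.3, 0.5
u_behind, y_behind = 1.0, 0.8  # feed side of the shock
y_ahead = 0.0                  # bed initially free of the heavy component

u_ahead = velocity_at(y_ahead, u_behind, y_behind, beta)
u_S = shock_speed(u_ahead, y_ahead, u_behind, y_behind, beta_A)
print("u_ahead:", u_ahead, "u_S:", u_S)
```

<p>When the bed ahead of the shock contains no heavy component (<code>y_ahead = 0</code>), the formula reduces to <code>u_S = beta_A * u_behind</code>, so the front travels at a fraction <code>beta_A</code> of the gas velocity behind it.</p>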
</section>
<section id="changing-pressure-steps" class="level4">
<h4 class="anchored" data-anchor-id="changing-pressure-steps">Changing-Pressure Steps</h4>
<p>In other steps (e.g., blowdown or pressurization), the total pressure <img src="https://latex.codecogs.com/png.latex?P"> varies with time. One end of the column is typically closed (no flow), while gas either exits or enters from the other end. In this scenario, the local velocity <img src="https://latex.codecogs.com/png.latex?u"> and the bed composition are governed by relationships such as <img src="https://latex.codecogs.com/png.latex?%0A%5Cfrac%7B1%7D%7B%5Cbeta_B%7D%20%5C,%5Cfrac%7B%5Cpartial%20P%7D%7B%5Cpartial%20t%7D%20%5C;+%5C;%20%5Cdots%20%5C;=%5C;%200,%0A%5Cquad%5Ctext%7Band%7D%5Cquad%0Au%20%5C;=%5C;%20%5Ctext%7Bfunction%7D%5Cbigl(P,%5Ctfrac%7BdP%7D%7Bdt%7D,z%5Cbigr),%0A"> derived from the continuity equations and the assumption of instantaneous equilibrium. By integrating along characteristic curves, one obtains explicit expressions for how composition <img src="https://latex.codecogs.com/png.latex?y"> changes with <img src="https://latex.codecogs.com/png.latex?P">, and how wave fronts (shocks or expansions) propagate through the column. The shock speed in a variable-pressure step is similarly given by <img src="https://latex.codecogs.com/png.latex?%0Au_S%20%5C;=%5C;%20%5Cbeta_A%20%5C,%5Cfrac%7Bu_2%5C,y_2%20%5C;-%5C;%20u_1%5C,y_1%7D%7B%5C,y_2%20%5C;-%5C;%20y_1%5C,%7D,%0A"> but with <img src="https://latex.codecogs.com/png.latex?u_1"> and <img src="https://latex.codecogs.com/png.latex?u_2"> determined by the pressure-change relationships (often involving <img src="https://latex.codecogs.com/png.latex?%5Ctfrac%7BdP%7D%7Bdt%7D"> and the dimensionless groups <img src="https://latex.codecogs.com/png.latex?%5Cbeta_A,%5Cbeta_B">).</p>
<p>Under these linear and equilibrium assumptions, the bed dynamics during a PSA cycle can be understood via piecewise integration of hyperbolic PDEs, supplemented by shock or simple-wave conditions. Each step—whether at constant pressure or during pressurization/depressurization—admits characteristic equations that are integrable under suitable boundary conditions (e.g., a feed at one end, product at the other, or a closed end).</p>
<p>This <em>equilibrium theory</em> has proven highly useful in analyzing PSA processes because it provides:</p>
<ul>
<li>Exact or semi-analytical wave speeds and concentration profiles,</li>
<li>Simple criteria for identifying shock fronts vs.&nbsp;expansion regions,</li>
<li>A framework for understanding how separation performance depends on key dimensionless parameters <img src="https://latex.codecogs.com/png.latex?%5Cbeta_A">, <img src="https://latex.codecogs.com/png.latex?%5Cbeta_B">, and <img src="https://latex.codecogs.com/png.latex?%5Cbeta">.</li>
</ul>
<p>More detailed calculations can incorporate boundary conditions for each PSA step, track how the shock or wave front moves through the bed, and predict when and where the heavy or light components break through. Although the theory is based on idealized assumptions, it offers valuable insights for process design, performance limits, and optimization of PSA cycles.</p>
</section>
</section>
</section>
<section id="code-overview" class="level2">
<h2 class="anchored" data-anchor-id="code-overview">Code Overview</h2>
<p>The Python implementation follows the equilibrium theory equations of Knaebel &amp; Hill to simulate wave propagation in a PSA column. The core data structures (<code>Region</code>, <code>WaveFront</code>, <code>Characteristic</code>) capture the state of the bed and any traveling fronts or characteristics. The main computation logic resides in the <code>WaveTracker</code> class, which calculates wave speeds at both constant and changing pressures. A higher-level controller, <code>WaveDiagram</code>, orchestrates time-stepping (or pressure-stepping) through multiple PSA steps, updates the column profile, and manages plotting.</p>
<section id="region" class="level3">
<h3 class="anchored" data-anchor-id="region"><code>Region</code></h3>
<p>A <code>Region</code> stores a constant concentration of the heavy component (<code>y_value</code>) for a segment of the column between <code>z_left</code> and <code>z_right</code>. Multiple <code>Region</code> objects strung together represent the piecewise-constant concentration profile along the column length.</p>
</section>
<section id="wavefront" class="level3">
<h3 class="anchored" data-anchor-id="wavefront"><code>WaveFront</code></h3>
<p>A <code>WaveFront</code> tracks a discontinuity moving through the bed. It has a <code>wave_type</code> that may be <code>"shock"</code> (when a discontinuity steepens) or <code>"simple"</code> (when it spreads out). The attributes <code>z_position</code>, <code>y_ahead</code>, and <code>y_behind</code> indicate the current position of the wavefront and the compositions it separates.</p>
</section>
<section id="characteristic" class="level3">
<h3 class="anchored" data-anchor-id="characteristic"><code>Characteristic</code></h3>
<p>A <code>Characteristic</code> represents a path in the <img src="https://latex.codecogs.com/png.latex?(z,t)"> or <img src="https://latex.codecogs.com/png.latex?(z,P)"> plane corresponding to a particular composition. These paths are stored in arrays (<code>z_values</code>, <code>y_values</code>, <code>t_values</code>, and <code>p_values</code>) for post-processing or plotting. Characteristics are typically generated inside the column and can merge with (or be terminated by) wavefronts.</p>
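<p>To make the three structures concrete, here is one way they could be laid out as dataclasses. The field names come from the descriptions above, but the dataclass layout, types, and defaults are assumptions made for illustration; the actual definitions in the linked gist may differ.</p>

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    y_value: float  # heavy-component mole fraction, constant over the segment
    z_left: float
    z_right: float


@dataclass
class WaveFront:
    wave_type: str  # "shock" or "simple"
    z_position: float
    y_ahead: float
    y_behind: float


@dataclass
class Characteristic:
    # Sampled path of one composition; each list grows as the path is traced
    z_values: List[float] = field(default_factory=list)
    y_values: List[float] = field(default_factory=list)
    t_values: List[float] = field(default_factory=list)
    p_values: List[float] = field(default_factory=list)


# A bed of length 1.0 with a shock at z = 0.4 separating y = 0.8 from y = 0.0
profile = [Region(y_value=0.8, z_left=0.0, z_right=0.4),
           Region(y_value=0.0, z_left=0.4, z_right=1.0)]
front = WaveFront("shock", z_position=0.4, y_ahead=0.0, y_behind=0.8)
print(profile)
print(front)
```

<p>In this layout a wavefront always sits on the boundary between two adjacent regions, so <code>front.z_position</code> equals <code>profile[0].z_right</code>.</p>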
</section>
<section id="wavetracker" class="level3">
<h3 class="anchored" data-anchor-id="wavetracker"><code>WaveTracker</code></h3>
<p>The <code>WaveTracker</code> class implements the theoretical equations needed to determine front speeds and characteristic velocities under different operating conditions. It contains methods for:</p>
<ul>
<li><strong>Constant-Pressure Wave Speeds.</strong> Functions such as <code>get_characteristic_speed_constant_p()</code> and <code>get_shock_speed_constant_p()</code> implement Equations (9)-(10) and (15) of Knaebel &amp; Hill, giving the velocities of simple waves or shocks.</li>
<li><strong>Changing-Pressure Wave Speeds.</strong> Additional methods like <code>get_u_changing_p()</code> and <code>get_shock_speed_changing_p()</code> capture Equations (7) and (16) for situations where the column is undergoing pressurization or depressurization.</li>
</ul>
<p>By calling these methods, the code can determine how a wavefront or characteristic should move over a small time (or pressure) interval.</p>
</section>
<section id="wavediagram" class="level3">
<h3 class="anchored" data-anchor-id="wavediagram"><code>WaveDiagram</code></h3>
<p>This class coordinates the overall simulation. It breaks the total PSA process into discrete “steps” (e.g., a constant-pressure feed step, a blowdown step with decreasing pressure, etc.) and handles:</p>
<ol type="1">
<li>Detecting discontinuities in the bed and spawning new wavefronts.</li>
<li>Moving wavefronts and characteristics incrementally through time or pressure changes.</li>
<li>Rebuilding the piecewise-constant column profile based on where wavefronts have moved.</li>
<li>Storing intermediate snapshots and final states for visualization.</li>
</ol>
<p>When <code>WaveDiagram</code> completes a step, it has an updated column profile plus a record of all wavefronts and characteristics for plotting.</p>
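<p>Steps 2 and 3 of the list above can be sketched in a few lines. The helpers <code>advance_fronts</code> and <code>rebuild_profile</code> are hypothetical names introduced here, not functions from the original code, and the sketch makes simplifying assumptions: front speeds are held constant over the increment <code>dt</code>, fronts stay ordered from <code>z = 0</code> to <code>z = L</code>, and interactions between fronts are ignored.</p>

```python
def advance_fronts(positions, speeds, dt, length):
    """Move each front by speed * dt and clip it to the bed [0, length]."""
    return [min(max(z + s * dt, 0.0), length) for z, s in zip(positions, speeds)]


def rebuild_profile(positions, y_values, length):
    """Return (y, z_left, z_right) segments between consecutive fronts.

    positions: front locations sorted from z = 0 to z = length
    y_values:  compositions of the len(positions) + 1 segments, inlet side first
    """
    edges = [0.0] + list(positions) + [length]
    segments = []
    for y, z_left, z_right in zip(y_values, edges[:-1], edges[1:]):
        if z_right != z_left:  # skip zero-width segments
            segments.append((y, z_left, z_right))
    return segments


# One front starting at the inlet, moving at speed 0.3 for one time unit
fronts = advance_fronts([0.0], [0.3], dt=1.0, length=1.0)
segments = rebuild_profile(fronts, [0.8, 0.0], length=1.0)
print(segments)
```

<p>Once a front reaches the end of the bed it is clipped to <code>z = L</code>, the segment ahead of it collapses to zero width, and only the composition behind the front remains in the profile.</p>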
</section>
</section>
<section id="examples-and-usage" class="level2">
<h2 class="anchored" data-anchor-id="examples-and-usage">Examples and Usage</h2>
<p>Below are simplified examples illustrating how to set up and run PSA steps. The bed length, inlet compositions, pressures, and velocities are all adjustable, making it easy to experiment with different cycle designs.</p>
<section id="bed-4-step-cycle-product-for-pressurization" class="level3">
<h3 class="anchored" data-anchor-id="bed-4-step-cycle-product-for-pressurization">2-Bed 4-Step Cycle (product for pressurization)</h3>
<p>The following code creates a bed of length <img src="https://latex.codecogs.com/png.latex?L=1.0"> with an initially uniform composition of <img src="https://latex.codecogs.com/png.latex?y=0.0">. It runs four steps—Pressurization, Feed, Blowdown, and Purge—and then plots the resulting wave diagrams.</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Initialize the WaveTracker and WaveDiagram</span></span>
<span id="cb1-2">wave_tracker <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> WaveTracker(betaA<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, betaB<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.7</span>, beta<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-3">wave_diagram <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> WaveDiagram(wave_tracker)</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define the bed and step parameters</span></span>
<span id="cb1-6">L <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span></span>
<span id="cb1-7">initial_profile <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [Region(y_value<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, z_left<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>, z_right<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>L)]</span>
<span id="cb1-8">T_pressurization <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span></span>
<span id="cb1-9">T_feed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span></span>
<span id="cb1-10">T_blowdown <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4.0</span></span>
<span id="cb1-11">T_purge <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span></span>
<span id="cb1-12">P_low <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span></span>
<span id="cb1-13">P_high <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5.0</span></span>
<span id="cb1-14">y_feed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span></span>
<span id="cb1-15">y_product <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># pure light</span></span>
<span id="cb1-16"></span>
<span id="cb1-17"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Pressurization step: increases pressure from P_low to P_high,</span></span>
<span id="cb1-18"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># with flow entering from the 'reverse' end (z=L).</span></span>
<span id="cb1-19">wave_diagram.run_changing_pressure_step(</span>
<span id="cb1-20">    step_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Pressurization"</span>,</span>
<span id="cb1-21">    y_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>y_product,</span>
<span id="cb1-22">    p_start<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>P_low,</span>
<span id="cb1-23">    p_end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>P_high,</span>
<span id="cb1-24">    t_step<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>T_pressurization,</span>
<span id="cb1-25">    z_profile_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>initial_profile,</span>
<span id="cb1-26">    flow_dir<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"reverse"</span>,</span>
<span id="cb1-27">)</span>
<span id="cb1-28"></span>
<span id="cb1-29"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Feed step: keeps pressure constant, injects heavier component from z=0</span></span>
<span id="cb1-30">wave_diagram.run_constant_pressure_step(</span>
<span id="cb1-31">    step_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Feed"</span>,</span>
<span id="cb1-32">    u_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.8</span>,</span>
<span id="cb1-33">    y_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>y_feed,</span>
<span id="cb1-34">    t_step<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>T_feed,</span>
<span id="cb1-35">    flow_dir<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"default"</span>,</span>
<span id="cb1-36">)</span>
<span id="cb1-37"></span>
<span id="cb1-38"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Blowdown step: decreases pressure from P_high to P_low</span></span>
<span id="cb1-39">wave_diagram.run_changing_pressure_step(</span>
<span id="cb1-40">    step_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Blowdown"</span>,</span>
<span id="cb1-41">    y_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>,</span>
<span id="cb1-42">    p_start<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>P_high,</span>
<span id="cb1-43">    p_end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>P_low,</span>
<span id="cb1-44">    t_step<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>T_blowdown,</span>
<span id="cb1-45">    flow_dir<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"default"</span>,</span>
<span id="cb1-46">)</span>
<span id="cb1-47"></span>
<span id="cb1-48"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Purge step: at low pressure, flow enters from z=L to desorb heavy component</span></span>
<span id="cb1-49">wave_diagram.run_constant_pressure_step(</span>
<span id="cb1-50">    step_name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Purge"</span>,</span>
<span id="cb1-51">    u_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span>,</span>
<span id="cb1-52">    y_in<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>,</span>
<span id="cb1-53">    t_step<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>T_purge,</span>
<span id="cb1-54">    flow_dir<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"reverse"</span>,</span>
<span id="cb1-55">)</span>
<span id="cb1-56"></span>
<span id="cb1-57"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Generate and display the wave diagram</span></span>
<span id="cb1-58">wave_diagram.plot_steps([<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Pressurization"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Feed"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Blowdown"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Purge"</span>])</span></code></pre></div></div>
</details>
<p>In this cycle:</p>
<ol type="1">
<li><strong>Pressurization</strong> lifts the column from low to high pressure. The code checks whether a wavefront forms at <img src="https://latex.codecogs.com/png.latex?z=L"> depending on the composition mismatch.<br>
</li>
<li><strong>Feed</strong> introduces a heavier component at the inlet <img src="https://latex.codecogs.com/png.latex?z=0"> under a constant, now high, pressure.<br>
</li>
<li><strong>Blowdown</strong> vents the column from the other end, decreasing pressure from <img src="https://latex.codecogs.com/png.latex?P_%7Bhigh%7D"> to <img src="https://latex.codecogs.com/png.latex?P_%7Blow%7D">.<br>
</li>
<li><strong>Purge</strong> at low pressure pushes a light component backward through the column to remove heavy species.</li>
</ol>
<p>When the script finishes, the result is a stitched wave diagram showing how composition fronts move step by step.</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden; max-width: 100%; border: 1px solid #ddd; border-radius: 5px;">
<p><iframe style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;" src="img/wave_diagram_example_1.html" frameborder="0" allowfullscreen=""> </iframe></p>
</div>
</section>
<section id="bed-4-step-cycle-feed-for-pressurization" class="level3">
<h3 class="anchored" data-anchor-id="bed-4-step-cycle-feed-for-pressurization">2-Bed 4-Step Cycle (feed for pressurization)</h3>
<p>A similar 4-step sequence can be constructed, but this time the feed is used for pressurization. Comparing the two diagrams shows how a small change in the step sequence can lead to very different wavefronts.</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden; max-width: 100%; border: 1px solid #ddd; border-radius: 5px;">
<p><iframe style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;" src="img/wave_diagram_example_2.html" frameborder="0" allowfullscreen=""> </iframe></p>
</div>
</section>
<section id="bed-9-step-industrial-cycle" class="level3">
<h3 class="anchored" data-anchor-id="bed-9-step-industrial-cycle">4-Bed 9-Step Industrial Cycle</h3>
<p>The third example demonstrates a typical industrial multi-step PSA, including multiple depressurizations, equalizations, and repressurizations. The code is longer, but follows the same pattern:</p>
<ul>
<li>A series of <code>run_constant_pressure_step</code> and <code>run_changing_pressure_step</code> calls, one for each sub-step (e.g., adsorption, partial depressurization, purge).</li>
<li>Each step specifies relevant inlet compositions, target pressures, and durations.</li>
<li>After finishing all steps, <code>plot_steps</code> creates a consolidated wave diagram.</li>
</ul>
<p>In this scenario, you can explore the effects of partial equalizations (<code>PEQ</code>) and depressurizations (<code>DEQ</code>) to see how multi-step cycles shift bed compositions.</p>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden; max-width: 100%; border: 1px solid #ddd; border-radius: 5px;">
<p><iframe style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;" src="img/wave_diagram_example_3.html" frameborder="0" allowfullscreen=""> </iframe></p>
</div>
</section>
</section>
<section id="how-to-run-and-adapt" class="level2">
<h2 class="anchored" data-anchor-id="how-to-run-and-adapt">How to Run and Adapt</h2>
<ol type="1">
<li><strong>Dependencies and Setup.</strong> You need Python 3, NumPy, SciPy, and Matplotlib. Save the code as <code>wave_tracking.py</code> (or any name) and run it directly (<code>python wave_tracking.py</code>).<br>
</li>
<li><strong>Modifying the Bed.</strong> Adjust or extend <code>initial_profile</code> to reflect different initial loading conditions (multiple <code>Region</code> objects with various <code>y_value</code>s).<br>
</li>
<li><strong>Adjusting Equilibrium Parameters.</strong> The isotherm parameters <code>kA</code>, <code>kB</code>, and bed porosity <code>epsilon</code> determine the values of <code>betaA</code>, <code>betaB</code>, and <code>beta</code>. Changing these inputs reflects different adsorbent materials.<br>
</li>
<li><strong>Defining Custom Steps.</strong> In your own PSA script, call <code>run_constant_pressure_step</code> or <code>run_changing_pressure_step</code> with the times, pressures, flows, and compositions you want. The code automatically spawns shocks or simple waves where discontinuities appear.<br>
</li>
<li><strong>Plotting and Analysis.</strong> The <code>plot_steps</code> method creates a stitched diagram of composition fronts across multiple steps. It can be customized or replaced with your own post-processing routines if desired.</li>
</ol>
<p>By experimenting with different inlet conditions, velocities, and pressure profiles, you can simulate and visualize how composition waves and shocks evolve in a PSA system under various operating strategies.</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-reuse"><h2 class="anchored quarto-appendix-heading">Reuse</h2><div class="quarto-appendix-contents"><div><a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a></div></div></section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{yin2025,
  author = {Yin, Xiangyu},
  title = {“{Wave} {Tracker}” for {Pressure} {Swing} {Adsorption}},
  date = {2025-01-28},
  url = {https://xiangyu-yin.com/content/post_wave_tracking.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-yin2025" class="csl-entry quarto-appendix-citeas">
Yin, Xiangyu. 2025. <span>“<span>‘Wave Tracker’</span> for Pressure
Swing Adsorption.”</span> January 28. <a href="https://xiangyu-yin.com/content/post_wave_tracking.html">https://xiangyu-yin.com/content/post_wave_tracking.html</a>.
</div></div></section></div> ]]></description>
  <guid>https://xiangyu-yin.com/content/post_wave_tracking.html</guid>
  <pubDate>Tue, 28 Jan 2025 06:00:00 GMT</pubDate>
  <media:content url="https://xiangyu-yin.com/content/img/wave_diagram_example_3.png" medium="image" type="image/png" height="77" width="144"/>
</item>
<item>
  <title>Designing Materials with Cluster Expansion and MatOpt</title>
  <dc:creator>Xiangyu Yin</dc:creator>
  <link>https://xiangyu-yin.com/content/post_cluster_expansion.html</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Computational materials design requires balancing competing interests such as accuracy, computational cost, and interpretability. One classical approach to trading off computational cost against accuracy is to fit a <strong>cluster expansion</strong> to first-principles data. Because of its simple, interpretable structure, a trained cluster expansion can be used to formulate a mathematical optimization problem that searches for the (globally) optimal arrangement of atomic species on a chosen lattice or supercell with respect to a target property (e.g., mixing energy), potentially under constraints on composition, geometry, and so on. In this post, we will demonstrate how to use two Python packages, <a href="https://icet.materialsmodeling.org"><strong>icet</strong></a> and <a href="https://github.com/xyin-anl/matopt"><strong>matopt</strong></a>, to train a cluster expansion on DFT data and then formulate and solve a MILP that finds the arrangement of atomic species with the best mixing energy.</p>
</section>
<section id="cluster-expansions" class="level2">
<h2 class="anchored" data-anchor-id="cluster-expansions">Cluster Expansions</h2>
<p>A cluster expansion approximates a property (<img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BP%7D">) of a crystal structure by decomposing that structure into clusters of sites (singlets, pairs, triplets, etc.) and summing up their contributions. Mathematically, if (<img src="https://latex.codecogs.com/png.latex?%5Cboldsymbol%7B%5Csigma%7D">) is the configuration specifying which species occupy each site, we write</p>
<p><img src="https://latex.codecogs.com/png.latex?%20%5Cmathcal%7BP%7D(%5Cboldsymbol%7B%5Csigma%7D)%20=%5Csum_%7B%5Calpha%7D%20J_%7B%5Calpha%7D%5C,%5Cbigl%5Clangle%20%5CGamma_%7B%5Calpha%7D(%5Cboldsymbol%7B%5Csigma%7D)%5Cbigr%5Crangle,%20"></p>
<p>where:</p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?%5Calpha"> indexes distinct <strong>symmetry orbits</strong> of clusters (e.g., all pairs of sites separated by the same distance/orientation in the crystal),</li>
<li><img src="https://latex.codecogs.com/png.latex?J_%7B%5Calpha%7D"> is the <strong>effective cluster interaction</strong> (ECI) for orbit <img src="https://latex.codecogs.com/png.latex?%5Calpha">,</li>
<li><img src="https://latex.codecogs.com/png.latex?%5Clangle%20%5CGamma_%7B%5Calpha%7D(%5Cboldsymbol%7B%5Csigma%7D)%5Crangle"> is the <strong>average</strong> of certain <strong>basis functions</strong> (often orthonormal) over the individual clusters in orbit <img src="https://latex.codecogs.com/png.latex?%5Calpha">.</li>
</ul>
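<p>To make the formula concrete, here is a minimal sketch on a toy 1D periodic chain with two species. It is not how icet evaluates cluster vectors; the spin encoding (+1/-1), the choice of only three orbits (constant, singlet, nearest-neighbor pair), and the ECI values are all assumptions made for illustration.</p>

```python
# Toy cluster expansion on a 1D binary ring (illustrative only, not icet's implementation).
# Assumption: the two species are encoded as spins +1 and -1.

def cluster_vector(sigma):
    """Return [1, mean of sigma_i, mean of sigma_i * sigma_(i+1)] for a periodic chain."""
    n = len(sigma)
    singlet = sum(sigma) / n
    pair = sum(sigma[i] * sigma[(i + 1) % n] for i in range(n)) / n
    return [1.0, singlet, pair]

def predict(ecis, sigma):
    """P(sigma) = sum over orbits of J_alpha times the orbit-averaged basis function."""
    return sum(j * g for j, g in zip(ecis, cluster_vector(sigma)))

ecis = [0.5, -0.2, 0.1]        # hypothetical J_0, J_1, J_2
ordered = [1, -1, 1, -1]       # alternating occupation
segregated = [1, 1, -1, -1]    # two blocks

print(predict(ecis, ordered))
print(predict(ecis, segregated))
```

<p>Both configurations have the same composition, so only the pair term distinguishes them. That is the sense in which the ECIs encode ordering tendencies.</p>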
<p>To train a cluster expansion, we first sample a set of training structures/configurations and compute their property values via DFT or other methods. Then we do a linear regression (often with regularization) to obtain the ECIs ({<img src="https://latex.codecogs.com/png.latex?J_%7B%5Calpha%7D">}). The typical workflow looks like this:</p>
<ul>
<li>Picking a <strong>prototype structure</strong> (primitive cell).</li>
<li>Defining <strong>cutoffs</strong> for 2-body, 3-body (and higher) distances so we only keep clusters within certain radii.</li>
<li>Computing the DFT energies of various <strong>sampled structures/configurations</strong> of the same underlying lattice.</li>
<li>Solving a <strong>linear regression</strong> problem to learn those “effective cluster interactions” (ECIs).</li>
</ul>
<p>These ECIs can then be used to predict the property of new configurations cheaply. Tools like <a href="https://icet.materialsmodeling.org"><strong>icet</strong></a> automate and streamline the above workflow.</p>
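<p>The regression step can be sketched on the same kind of toy model. The example below is an assumption-laden stand-in: it uses plain least squares (normal equations) rather than the regularized <code>LassoCV</code> fit used later in this post, synthetic "energies" generated from known ECIs rather than DFT data, and a hypothetical 6-site binary ring.</p>

```python
import itertools

def cluster_vector(sigma):
    """[1, mean spin, mean nearest-neighbor spin product] on a periodic chain."""
    n = len(sigma)
    singlet = sum(sigma) / n
    pair = sum(sigma[i] * sigma[(i + 1) % n] for i in range(n)) / n
    return [1.0, singlet, pair]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_ecis(configs, targets):
    """Ordinary least squares via the normal equations (X^T X) J = X^T y."""
    X = [cluster_vector(s) for s in configs]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, targets)) for i in range(k)]
    return solve_linear(xtx, xty)

true_ecis = [0.5, -0.2, 0.1]                             # hypothetical ground truth
configs = list(itertools.product([1, -1], repeat=6))     # every configuration of the ring
targets = [sum(j * g for j, g in zip(true_ecis, cluster_vector(s))) for s in configs]

fitted = fit_ecis(configs, targets)
print([round(j, 6) for j in fitted])
```

<p>With noise-free synthetic data the fit recovers the generating ECIs. With real DFT data and many candidate clusters, regularization becomes important for selecting a sparse set of interactions.</p>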
</section>
<section id="milp-reformulation" class="level2">
<h2 class="anchored" data-anchor-id="milp-reformulation">MILP Reformulation</h2>
<p>To effectively utilize a trained cluster expansion in an optimization problem, we need to reformulate/transform the parameters. Below we show how to do this for a <strong>Mixed-Integer Linear Programming</strong> (MILP) formulation, where we have explicit linear expressions and binary variables.</p>
<section id="site-indicator" class="level3">
<h3 class="anchored" data-anchor-id="site-indicator">Site Indicator</h3>
<p>For each site (<img src="https://latex.codecogs.com/png.latex?i%20=%201,%5Cdots,N">) and each species (<img src="https://latex.codecogs.com/png.latex?k">) (e.g., Cu, Ni, Pd, Ag), we introduce a binary decision variable:</p>
<p><img src="https://latex.codecogs.com/png.latex?%20Y_%7Bi,k%7D%20=%0A%5Cbegin%7Bcases%7D%0A1,%20&amp;%20%5Ctext%7Bif%20site%20$i$%20is%20assigned%20species%20$k$,%7D%5C%5C%0A0,%20&amp;%20%5Ctext%7Botherwise.%7D%0A%5Cend%7Bcases%7D%0A"></p>
<p>We will also impose:</p>
<p><img src="https://latex.codecogs.com/png.latex?%20%5Csum_%7Bk%7D%20Y_%7Bi,k%7D%20=%201%20%5Cquad%20%5Ctext%7Bfor%20each%20site%20%7D%20i,%20"></p>
<p>so that exactly one species is chosen for each site.</p>
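<p>As a small illustration, a concrete occupation can be turned into the indicator array and checked against this constraint. The 5-site configuration below is hypothetical.</p>

```python
species = ["Cu", "Ni", "Pd", "Ag"]
configuration = ["Cu", "Ni", "Ni", "Ag", "Pd"]  # hypothetical occupation of 5 sites

# Y[i][k] = 1 if site i holds species k, else 0
Y = [[1 if occupant == s else 0 for s in species] for occupant in configuration]

# Each row sums to one: exactly one species per site
assert all(sum(row) == 1 for row in Y)

for i, row in enumerate(Y):
    print(i, row)
```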
</section>
<section id="cluster-indicator" class="level3">
<h3 class="anchored" data-anchor-id="cluster-indicator">Cluster Indicator</h3>
<p>We index a single cluster by <img src="https://latex.codecogs.com/png.latex?n">, and define another binary variable (<img src="https://latex.codecogs.com/png.latex?Z_n">) to indicate that the cluster is present in the configuration:</p>
<p><img src="https://latex.codecogs.com/png.latex?%20Z_%7Bn%7D%20=%0A%5Cbegin%7Bcases%7D%0A1,%20&amp;%20%5Ctext%7Bif%20cluster%20$n$%20exists%20in%20the%20current%20configuration,%7D%5C%5C%0A0,%20&amp;%20%5Ctext%7Botherwise.%7D%0A%5Cend%7Bcases%7D%0A"></p>
<p>We will also need to impose the logic that:</p>
<p><img src="https://latex.codecogs.com/png.latex?%20Z_n%20=%20%5Cprod_%7B(i,k)%20%5Cin%20%5Ctext%7Bcluster%20%7D%20n%7D%20Y_%7Bi,k%7D%20%5Cquad%20%5Ctext%7Bfor%20all%20%7D%20n,%20"></p>
<p>In matopt, the cluster-indicator variables <img src="https://latex.codecogs.com/png.latex?Z_n"> and the underlying logic (reformulated as linear constraints) are built automatically, ensuring that <img src="https://latex.codecogs.com/png.latex?Z_n%20=%201"> holds exactly when the sites of that cluster are occupied by the relevant combination of species.</p>
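<p>One standard way to linearize a product of binary variables is to require that the indicator never exceeds any factor and is at least the sum of the factors minus (m - 1), where m is the number of factors. The sketch below checks this textbook linearization by brute force. It is an illustration only, and the constraints matopt actually generates may be written differently.</p>

```python
import itertools

def satisfies_linearization(z, ys):
    """Check the linear constraints that encode z = product(ys) for binary values."""
    upper_ok = all(y >= z for y in ys)           # z cannot exceed any factor
    lower_ok = z >= sum(ys) - (len(ys) - 1)      # z is forced to 1 when every factor is 1
    return upper_ok and lower_ok

def feasible_z(ys):
    """Return every binary z allowed by the linear constraints for fixed ys."""
    return [z for z in (0, 1) if satisfies_linearization(z, ys)]

# Exhaustively verify a 3-site cluster (a triplet)
for ys in itertools.product((0, 1), repeat=3):
    product = ys[0] * ys[1] * ys[2]
    assert feasible_z(ys) == [product]
    print(ys, feasible_z(ys))
```

<p>For every assignment of the site indicators, exactly one value of the cluster indicator is feasible, and it equals the product.</p>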
</section>
<section id="objective-function" class="level3">
<h3 class="anchored" data-anchor-id="objective-function">Objective Function</h3>
<p>Next, we need to find coefficients (<img src="https://latex.codecogs.com/png.latex?c_%7Bn%7D">) for each cluster <img src="https://latex.codecogs.com/png.latex?n"> such that the property/search objective can be linearly represented as:</p>
<p><img src="https://latex.codecogs.com/png.latex?%20%5Cmathcal%7BP%7D%20=%20%5Csum_%7Bn%7D%20c_%7Bn%7DZ_%7Bn%7D,%20"></p>
<p>subject to the constraints. Hence, the cluster expansion parameters must be fully expanded: every cluster in the geometry, for every possible combination of sites and species, needs its own numeric coefficient. We wrote a helper function that does this automatically. It:</p>
<ol type="1">
<li>Reads an <code>icet.ClusterExpansion</code> object.</li>
<li>Enumerates all clusters in your “design supercell” (the structure you want to optimize).</li>
<li>Outputs “cluster specs” + “coefficients” so that each cluster monomial (e.g., site 10 = Ni &amp; site 11 = Cu) has its unique coefficient.</li>
</ol>
<p>Finally, we can feed these coefficients into <strong>matopt</strong> to build and solve the MILP optimization problem.</p>
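<p>To show what the expanded representation buys us, the sketch below evaluates the linear objective for a configuration and finds the minimum by brute force on a tiny 3-site, 2-species system. The specs and coefficients are hypothetical, written in the (site, species index) format described above, and the exhaustive search merely stands in for the MILP solver that matopt drives.</p>

```python
import itertools

# Hypothetical expanded data: each spec is a list of (site, species index) pairs.
# An empty spec stands for the constant term.
cluster_specs = [
    [],                    # constant
    [(0, 0), (1, 1)],      # site 0 = species 0 and site 1 = species 1
    [(0, 1), (1, 0)],
    [(1, 0), (2, 0)],
]
cluster_coeffs = [0.1, -0.3, -0.3, 0.2]

def evaluate(config, specs, coeffs):
    """P = sum_n c_n * Z_n, with Z_n = 1 when every (site, species) pair in spec n matches."""
    total = 0.0
    for spec, coeff in zip(specs, coeffs):
        z_n = all(config[site] == sp for site, sp in spec)  # all([]) is True
        total += coeff * z_n
    return total

def brute_force_minimum(n_sites, n_species, specs, coeffs):
    """Enumerate every configuration; only viable for very small systems."""
    return min(
        itertools.product(range(n_species), repeat=n_sites),
        key=lambda cfg: evaluate(cfg, specs, coeffs),
    )

best = brute_force_minimum(3, 2, cluster_specs, cluster_coeffs)
print(best, evaluate(best, cluster_specs, cluster_coeffs))
```

<p>The number of configurations grows exponentially with the number of sites, which is why a MILP formulation with a proper solver is used for realistic supercells.</p>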
</section>
</section>
<section id="an-end-to-end-example" class="level2">
<h2 class="anchored" data-anchor-id="an-end-to-end-example">An End-to-end Example</h2>
<p>In the sections below, we show how to:</p>
<ol type="1">
<li>Load DFT data for Cu–Ni–Pd–Ag structures</li>
<li>Build a <code>ClusterSpace</code> and train the expansion with <code>LassoCV</code></li>
<li>Fully expand the learned cluster expansion onto a “design supercell”</li>
<li>Build a <code>matopt</code> MILP and solve for an optimal arrangement</li>
</ol>
<section id="imports-and-helper-function" class="level3">
<h3 class="anchored" data-anchor-id="imports-and-helper-function">Imports and Helper Function</h3>
<p>Let’s start by importing the key packages and defining a helper to “expand” the learned parameters. This function inspects the cluster expansion, enumerates the geometry for your chosen design supercell, and returns a list of cluster specs and coefficients.</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> itertools</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> ase.io</span>
<span id="cb1-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> collections <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Counter</span>
<span id="cb1-5"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.linear_model <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> LassoCV</span>
<span id="cb1-6"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.model_selection <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> train_test_split</span>
<span id="cb1-7"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.metrics <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> mean_absolute_error</span>
<span id="cb1-8"></span>
<span id="cb1-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># icet imports</span></span>
<span id="cb1-10"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> icet <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> ClusterSpace, ClusterExpansion, StructureContainer</span>
<span id="cb1-11"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> icet.core.local_orbit_list_generator <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> LocalOrbitListGenerator</span>
<span id="cb1-12"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> icet.core.structure <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Structure</span>
<span id="cb1-13"></span>
<span id="cb1-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># MatOpt imports</span></span>
<span id="cb1-15"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> matopt.materials.atom <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Atom</span>
<span id="cb1-16"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> matopt.materials.canvas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Canvas</span>
<span id="cb1-17"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> matopt.aml.expr <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> SumSites, SumClusters</span>
<span id="cb1-18"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> matopt.aml.rule <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> EqualTo, FixedTo</span>
<span id="cb1-19"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> matopt.aml.model <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> MatOptModel</span>
<span id="cb1-20"></span>
<span id="cb1-21"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> parameterize_cluster_expansion(cs, ce, design_atoms, weight<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span>, tol<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-15</span>):</span>
<span id="cb1-22">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-23"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Expand a trained ClusterExpansion over a 'design_atoms' structure into</span></span>
<span id="cb1-24"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    fully enumerated cluster specs and coefficients for use in MatOpt.</span></span>
<span id="cb1-25"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb1-26">    cluster_specs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-27">    cluster_coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-28"></span>
<span id="cb1-29">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># (0) Zero-let (constant term)</span></span>
<span id="cb1-30">    zero_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ce.parameters[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> weight</span>
<span id="cb1-31">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(zero_coeff) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> tol:</span>
<span id="cb1-32">        cluster_specs.append([])</span>
<span id="cb1-33">        cluster_coeffs.append(zero_coeff)</span>
<span id="cb1-34"></span>
<span id="cb1-35">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># (1) Generate orbit list for the design structure</span></span>
<span id="cb1-36">    lolg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> LocalOrbitListGenerator(</span>
<span id="cb1-37">        cs.orbit_list,</span>
<span id="cb1-38">        Structure.from_atoms(design_atoms),</span>
<span id="cb1-39">        ce.fractional_position_tolerance,</span>
<span id="cb1-40">    )</span>
<span id="cb1-41">    orbit_list <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> lolg.generate_full_orbit_list().get_orbit_list()</span>
<span id="cb1-42"></span>
<span id="cb1-43">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># (2) We assume chemical_symbols used in cs match the "elements" list</span></span>
<span id="cb1-44">    elements <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(cs.species_maps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].keys())</span>
<span id="cb1-45"></span>
<span id="cb1-46">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># (3) Expand each orbit's cluster-vector elements</span></span>
<span id="cb1-47">    param_idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># param[0] is zero-let</span></span>
<span id="cb1-48">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> orbit <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> orbit_list:</span>
<span id="cb1-49">        cves <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> orbit.cluster_vector_elements</span>
<span id="cb1-50">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> cves:</span>
<span id="cb1-51">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">continue</span></span>
<span id="cb1-52">        multiplicity <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> cves[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>][<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"multiplicity"</span>]</span>
<span id="cb1-53"></span>
<span id="cb1-54">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> cve <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> cves:</span>
<span id="cb1-55">            param_eci <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ce.parameters[param_idx]</span>
<span id="cb1-56">            param_idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb1-57">            outer_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (param_eci <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> multiplicity) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> weight</span>
<span id="cb1-58"></span>
<span id="cb1-59">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(outer_coeff) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> tol:</span>
<span id="cb1-60">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">continue</span></span>
<span id="cb1-61"></span>
<span id="cb1-62">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># (4) For each actual cluster in that orbit</span></span>
<span id="cb1-63">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> cluster <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> orbit.clusters:</span>
<span id="cb1-64">                site_indices <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [site.index <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> site <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> cluster.lattice_sites]</span>
<span id="cb1-65"></span>
<span id="cb1-66">                <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># (5) Loop over all species combos for those sites</span></span>
<span id="cb1-67">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> species_combo <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> itertools.product(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(elements)), repeat<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(site_indices)):</span>
<span id="cb1-68">                    final_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> outer_coeff</span>
<span id="cb1-69">                    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(final_coeff) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> tol:</span>
<span id="cb1-70">                        cluster_spec <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">zip</span>(site_indices, species_combo))</span>
<span id="cb1-71">                        cluster_specs.append(cluster_spec)</span>
<span id="cb1-72">                        cluster_coeffs.append(final_coeff)</span>
<span id="cb1-73"></span>
<span id="cb1-74">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> cluster_specs, cluster_coeffs</span></code></pre></div></div>
</details>
</section>
<section id="load-and-prepare-dft-data" class="level3">
<h3 class="anchored" data-anchor-id="load-and-prepare-dft-data">Load and Prepare DFT Data</h3>
<p>Next, suppose you have a dataset of Cu–Ni–Pd–Ag structures, each with a final DFT energy. We’ll convert those energies into mixing energies (relative to pure elemental references) and store them in a <code>DataFrame</code>.</p>
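<p>As a quick reference for the conversion, the mixing energy per atom is the total energy minus the composition-weighted sum of the pure-element references, divided by the number of atoms. The sketch below uses the per-atom reference energies from this post. The function name, the example structure, and its total energy are hypothetical.</p>

```python
from collections import Counter

# Per-atom reference energies for the pure elements (values from this post)
reference_energies = {
    "Cu": -240.11121086,
    "Ni": -352.62417749,
    "Pd": -333.69496589,
    "Ag": -173.55506507,
}

def mixing_energy_per_atom(symbols, total_energy, references):
    """(E_total - sum_k n_k * E_ref_k) / N for a structure with N atoms."""
    counts = Counter(symbols)
    baseline = sum(n * references[element] for element, n in counts.items())
    return (total_energy - baseline) / len(symbols)

symbols = ["Cu", "Cu", "Ni", "Ag"]   # hypothetical 4-atom structure
total_energy = -1006.5               # hypothetical DFT total energy
print(mixing_energy_per_atom(symbols, total_energy, reference_energies))
```

<p>By construction, a pure-element structure has zero mixing energy, which is a handy sanity check on the reference values.</p>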
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Reference energies for pure elements (per atom)</span></span>
<span id="cb2-2">reference_energies <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {</span>
<span id="cb2-3">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Cu"</span>: <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">240.11121086</span>,</span>
<span id="cb2-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ni"</span>: <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">352.62417749</span>,</span>
<span id="cb2-5">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Pd"</span>: <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">333.69496589</span>,</span>
<span id="cb2-6">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ag"</span>: <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">173.55506507</span>,</span>
<span id="cb2-7">}</span>
<span id="cb2-8">reference_structure_indice <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">155</span></span>
<span id="cb2-9">cutoffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>]</span>
<span id="cb2-10"></span>
<span id="cb2-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Read from JSON database</span></span>
<span id="cb2-12">db <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ase.io.read(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"structures_CuNiPdAg.json"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">":"</span>)</span>
<span id="cb2-13">dft_data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.read_csv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"properties_CuNiPdAg.csv"</span>, index_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-14">indices <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> dft_data.index.values</span>
<span id="cb2-15"></span>
<span id="cb2-16">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(columns<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"atoms"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MixingEnergy"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"TotalEnergy"</span>])</span>
<span id="cb2-17"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> idx, i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">enumerate</span>(indices):</span>
<span id="cb2-18">    st <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> db[i]</span>
<span id="cb2-19">    energy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> dft_data.loc[i, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"final_energy"</span>]</span>
<span id="cb2-20">    dict_representation <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">dict</span>(Counter(st.get_chemical_symbols()))</span>
<span id="cb2-21">    NumAtoms <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(st)</span>
<span id="cb2-22"></span>
<span id="cb2-23">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Mixing energy calculation</span></span>
<span id="cb2-24">    mixing_energy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (</span>
<span id="cb2-25">        energy</span>
<span id="cb2-26">        <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(</span>
<span id="cb2-27">            [</span>
<span id="cb2-28">                v <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> NumAtoms <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> reference_energies[k]</span>
<span id="cb2-29">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> k, v <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> dict_representation.items()</span>
<span id="cb2-30">            ]</span>
<span id="cb2-31">        )</span>
<span id="cb2-32">    ) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> NumAtoms</span>
<span id="cb2-33"></span>
<span id="cb2-34">    df.loc[idx] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [st, mixing_energy, energy]</span></code></pre></div></div>
</details>
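<p>Here, <code>df["atoms"]</code> holds an <code>ase.Atoms</code> object for each structure, and <code>"MixingEnergy"</code> is the property we want to fit. The formula subtracts the composition-weighted reference energy from <code>final_energy</code> before dividing by the number of atoms, so the reference energies need to be on the same scale as <code>final_energy</code> (for example, total energies of pure-element cells with the same number of atoms). In a real scenario, you’d have your own structures plus energies.</p>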
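<p>To make the mixing-energy formula concrete, here is a minimal sketch on a made-up two-element cell. The helper name <code>mixing_energy_per_atom</code> and all numbers are hypothetical and chosen only so the arithmetic is easy to follow; they are not taken from the Cu–Ni–Pd–Ag dataset.</p>

```python
from collections import Counter


def mixing_energy_per_atom(symbols, total_energy, reference_energies):
    """Return (E - sum_k x_k * E_ref[k]) / N for one structure.

    Assumes reference_energies are on the same scale as total_energy,
    mirroring the loop in the post.
    """
    counts = Counter(symbols)
    num_atoms = len(symbols)
    # Composition-weighted reference energy
    reference = sum(
        n / num_atoms * reference_energies[el] for el, n in counts.items()
    )
    return (total_energy - reference) / num_atoms


# Hypothetical 4-atom cell with made-up energies (illustration only)
toy_refs = {"Cu": -16.0, "Ni": -24.0}
toy_symbols = ["Cu", "Cu", "Ni", "Ni"]
toy_energy = -20.4

e_mix = mixing_energy_per_atom(toy_symbols, toy_energy, toy_refs)
e_pure = mixing_energy_per_atom(["Cu"] * 4, -16.0, toy_refs)
print(round(e_mix, 6), e_pure)
```

<p>A pure-element cell gives a mixing energy of zero by construction, which is a quick sanity check worth running on your own reference values.</p>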
</section>
<section id="build-and-train-the-cluster-expansion" class="level3">
<h3 class="anchored" data-anchor-id="build-and-train-the-cluster-expansion">Build and Train the Cluster Expansion</h3>
<p>We now define a <code>ClusterSpace</code> for a chosen reference structure (one from our dataset). Then we store all data in a <code>StructureContainer</code> and fit the expansion with <code>LassoCV</code>:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 4-species expansion: Cu, Ni, Pd, Ag</span></span>
<span id="cb3-2">cs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ClusterSpace(</span>
<span id="cb3-3">    structure<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>df.loc[reference_structure_indice, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"atoms"</span>],  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># pick a reference</span></span>
<span id="cb3-4">    cutoffs<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>cutoffs,</span>
<span id="cb3-5">    chemical_symbols<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(reference_energies.keys()),</span>
<span id="cb3-6">)</span>
<span id="cb3-7"></span>
<span id="cb3-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Collect training data</span></span>
<span id="cb3-9">sc <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> StructureContainer(cluster_space<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>cs)</span>
<span id="cb3-10"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> df.index:</span>
<span id="cb3-11">    sc.add_structure(</span>
<span id="cb3-12">        structure<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>df.loc[i, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"atoms"</span>],</span>
<span id="cb3-13">        properties<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mixing_energy"</span>: df.loc[i, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MixingEnergy"</span>]},</span>
<span id="cb3-14">    )</span>
<span id="cb3-15"></span>
<span id="cb3-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Extract design matrix (X) and target (y)</span></span>
<span id="cb3-17">X, y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sc.get_fit_data(key<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mixing_energy"</span>)</span>
<span id="cb3-18"></span>
<span id="cb3-19"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fit with LassoCV</span></span>
<span id="cb3-20">x_train, x_test, y_train, y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_test_split(X, y, test_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>, random_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span>
<span id="cb3-21">lasso_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> LassoCV(cv<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>, random_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span>
<span id="cb3-22">lasso_model.fit(x_train, y_train)</span>
<span id="cb3-23"></span>
<span id="cb3-24"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train R2:"</span>, lasso_model.score(x_train, y_train))</span>
<span id="cb3-25"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test  R2:"</span>, lasso_model.score(x_test, y_test))</span>
<span id="cb3-26"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test MAE:"</span>, mean_absolute_error(lasso_model.predict(x_test), y_test))</span>
<span id="cb3-27"></span>
<span id="cb3-28"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Build the cluster expansion object</span></span>
<span id="cb3-29">ce <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ClusterExpansion(cluster_space<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>cs, parameters<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>lasso_model.coef_)</span>
<span id="cb3-30"></span>
<span id="cb3-31"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Fold the intercept into the zerolet (constant) term</span></span>
<span id="cb3-32">ce.parameters[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> lasso_model.intercept_</span></code></pre></div></div>
</details>
<p>At this point, <code>ce</code> holds all the learned parameters. You can use <code>ce.predict(supercell_atoms)</code> to estimate mixing energies. But we’re going further: turning it into a MILP model.</p>
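<p>The last step of the block above, adding the intercept to the first parameter, relies on the first column of the design matrix being a constant column of ones (the zerolet). A small pure-Python sketch with made-up numbers shows why the two forms give the same predictions; <code>predict</code>, <code>X_toy</code>, and the coefficients are hypothetical stand-ins, not <code>icet</code> or scikit-learn objects.</p>

```python
def predict(rows, coefs, intercept=0.0):
    """Linear prediction: intercept + dot(row, coefs) for each row."""
    return [intercept + sum(x * c for x, c in zip(row, coefs)) for row in rows]


# Hypothetical design matrix: column 0 is the constant (zerolet) column
X_toy = [
    [1.0, 0.5, -0.25],
    [1.0, -1.0, 0.75],
    [1.0, 0.0, 1.0],
]
coefs = [0.0, 0.2, -0.4]  # made-up fitted coefficients
intercept = 0.05          # made-up fitted intercept

# Fold the intercept into the coefficient of the constant column
folded = list(coefs)
folded[0] += intercept

with_intercept = predict(X_toy, coefs, intercept)
without_intercept = predict(X_toy, folded)
print(with_intercept)
print(without_intercept)
```

<p>If the first column were not constant, this shortcut would change the predictions, so it is worth confirming that assumption on your own design matrix.</p>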
</section>
<section id="fully-expand-the-learned-ce-for-our-design-structure" class="level3">
<h3 class="anchored" data-anchor-id="fully-expand-the-learned-ce-for-our-design-structure">Fully Expand the Learned CE for Our Design Structure</h3>
<p>Let’s pick one structure from the DataFrame as our “design space” (you could also replicate it to enlarge the supercell). We then call our helper <code>parameterize_cluster_expansion(...)</code>:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1">design_atoms <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.loc[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"atoms"</span>]  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># for demonstration</span></span>
<span id="cb4-2">cluster_specs, cluster_coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> parameterize_cluster_expansion(</span>
<span id="cb4-3">    cs, ce, design_atoms, weight<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span></span>
<span id="cb4-4">)</span>
<span id="cb4-5"></span>
<span id="cb4-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Example of what we get:</span></span>
<span id="cb4-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># cluster_specs might look like [ [(40,3),(47,0),(44,3)], [(44,3),(57,1)], ... ]</span></span>
<span id="cb4-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># cluster_coeffs like [5.8e-05, -1.85e-04, ...]</span></span></code></pre></div></div>
</details>
<p>We now have two lists describing the objective in terms of explicit site–species combos. Notice how we break down the orbits into explicit binary-variable products and store them as <code>[(site_index, species_index), ...]</code>.</p>
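<p>To see how these two lists define an energy, the sketch below evaluates the sum of coefficients over all clusters whose site–species pairs are all satisfied by a given configuration. The function <code>evaluate_objective</code> and the toy data are hypothetical; they only assume the <code>[(site_index, species_index), ...]</code> layout described above.</p>

```python
def evaluate_objective(cluster_specs, cluster_coeffs, occupation):
    """Sum coeff * prod_i Y[site_i, species_i] over all clusters.

    occupation[site] is the species index on that site, so
    Y[site, species] is 1 exactly when occupation[site] == species.
    """
    total = 0.0
    for spec, coeff in zip(cluster_specs, cluster_coeffs):
        # The product of binaries is 1 only if every pair matches
        if all(occupation[site] == species for site, species in spec):
            total += coeff
    return total


# Hypothetical 3-site example with two species (indices 0 and 1)
toy_specs = [[(0, 0), (1, 1)], [(1, 1), (2, 1)], [(0, 1)]]
toy_coeffs = [0.5, -0.2, 0.1]
toy_occupation = [0, 1, 1]

energy = evaluate_objective(toy_specs, toy_coeffs, toy_occupation)
print(energy)
```

<p>Here the first two clusters are active and the third is not, so only their coefficients contribute. The optimizer's job is to search over <code>occupation</code> for the configuration that minimizes this sum.</p>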
</section>
<section id="construct-a-matopt-model-and-solve" class="level3">
<h3 class="anchored" data-anchor-id="construct-a-matopt-model-and-solve">Construct a <code>matopt</code> Model and Solve</h3>
<p>Now we build a <code>matopt</code> <code>Canvas</code> for the chosen geometry (site positions, neighbor lists if needed), define which species are allowed, and specify any constraints. Then we create an objective that sums the cluster terms:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">points_list <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> design_atoms.get_positions().tolist()</span>
<span id="cb5-2">num_sites <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(design_atoms)</span>
<span id="cb5-3">Canv <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Canvas(points_list)</span>
<span id="cb5-4"></span>
<span id="cb5-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define the candidate atoms in the same order as "reference_energies.keys()"</span></span>
<span id="cb5-6">AllAtoms <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [Atom(s) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> reference_energies.keys()]</span>
<span id="cb5-7"></span>
<span id="cb5-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Convert cluster_specs to "site-Atom" pairs</span></span>
<span id="cb5-9">cluster_specs_expanded <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb5-10"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> cluster_specs:</span>
<span id="cb5-11">    c_expanded <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb5-12">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> st, spidx <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> c:</span>
<span id="cb5-13">        c_expanded.append((st, AllAtoms[spidx]))</span>
<span id="cb5-14">    cluster_specs_expanded.append(c_expanded)</span>
<span id="cb5-15"></span>
<span id="cb5-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create the MatOpt model</span></span>
<span id="cb5-17">m <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MatOptModel(Canv, AllAtoms, clusters<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>cluster_specs_expanded[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:])</span>
<span id="cb5-18"></span>
<span id="cb5-19"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Optional: add composition constraints </span></span>
<span id="cb5-20">CompBounds <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {</span>
<span id="cb5-21">    AllAtoms[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]: (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, num_sites),</span>
<span id="cb5-22">    AllAtoms[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]: (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, num_sites),</span>
<span id="cb5-23">    AllAtoms[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]: (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>),  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># e.g., require 5–10 Pd</span></span>
<span id="cb5-24">    AllAtoms[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>]: (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>),  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># e.g., require 5–10 Ag</span></span>
<span id="cb5-25">}</span>
<span id="cb5-26">m.addGlobalTypesDescriptor(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Composition"</span>, bounds<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>CompBounds, rules<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>EqualTo(SumSites(desc<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>m.Yik)))</span>
<span id="cb5-27"></span>
<span id="cb5-28"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Force the structure to be fully occupied (or any other site-based constraint)</span></span>
<span id="cb5-29">m.Yi.rules.append(FixedTo(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span>))</span>
<span id="cb5-30"></span>
<span id="cb5-31"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Build the objective from the cluster coefficients</span></span>
<span id="cb5-32">obj_expr <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> SumClusters(desc<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>m.Zn, coefs<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>cluster_coeffs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:])</span>
<span id="cb5-33"></span>
<span id="cb5-34"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Solve the MILP</span></span>
<span id="cb5-35">D <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> m.minimize(obj_expr, solver<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"neos-cplex"</span>, tilim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">60</span>)</span></code></pre></div></div>
</details>
<p>A few highlights:</p>
<ul>
<li><code>m.Zn</code>: matopt variable for “is cluster <img src="https://latex.codecogs.com/png.latex?n"> present?”</li>
<li><code>m.Zn</code> is linked to <code>m.Yik</code> automatically, so <code>Zn[n] = 1</code> only if all site–species pairs in that cluster are chosen.</li>
<li>Compositional constraints can restrict how many sites are allowed to be each type.</li>
<li><code>m.minimize(obj_expr)</code> calls the solver. Use <code>m.maximize(...)</code> if needed.</li>
</ul>
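<p>The link between <code>Zn</code> and <code>Yik</code> is what keeps the model linear. One standard way to express a product of binary variables with linear constraints is sketched below and checked by brute force. This is a textbook linearization shown for intuition; the exact constraints <code>matopt</code> generates internally may differ, and <code>linearization_is_tight</code> is a hypothetical helper.</p>

```python
from itertools import product


def linearization_is_tight(num_factors):
    """Check that the linear constraints admit only z = product of the y's.

    Constraints on binary z for binary factors y_1..y_k:
      y_j >= z for every j           (z cannot exceed any factor)
      z >= sum(y) - (k - 1)          (z must be 1 when all factors are 1)
    """
    for ys in product((0, 1), repeat=num_factors):
        feasible = [
            z
            for z in (0, 1)
            if all(y >= z for y in ys) and z >= sum(ys) - (num_factors - 1)
        ]
        # The product of binaries equals their minimum
        if feasible != [min(ys)]:
            return False
    return True


results = {k: linearization_is_tight(k) for k in (1, 2, 3)}
print(results)
```

<p>Because every assignment of the factors leaves exactly one feasible value for <code>z</code>, the cluster indicator never needs an explicit product in the model.</p>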
</section>
<section id="analyzing-the-results" class="level3">
<h3 class="anchored" data-anchor-id="analyzing-the-results">Analyzing the Results</h3>
<p>After solving, <code>D</code> (a <code>Design</code> object) gives the final configuration. We can export or convert to common structure formats:</p>
<details>
<summary>
Click to expand the code
</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1">D.toPDB(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"result.pdb"</span>)</span>
<span id="cb6-2">D.toXYZ(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"result.xyz"</span>)</span>
<span id="cb6-3">D.toPOSCAR(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"result.vasp"</span>)</span>
<span id="cb6-4"></span>
<span id="cb6-5"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> pymatgen.core <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Structure</span>
<span id="cb6-6">structure <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> D.to_pymatgen()</span>
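<span id="cb6-7">structure.to(fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"cif"</span>, filename<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"result.cif"</span>)</span></code></pre></div></div>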
</details>
<p>That’s it! You have a fully integrated pipeline from DFT data → cluster expansion → MILP-based discrete optimization → final structure. You could run a DFT check on that final arrangement or evaluate further properties.</p>
</section>
</section>
<section id="remarks" class="level2">
<h2 class="anchored" data-anchor-id="remarks">Remarks</h2>
<p>Using <code>icet</code> to train a cluster expansion and <code>matopt</code> to solve the resulting discrete optimization problem provides a powerful path toward rational materials design. You can rapidly propose candidate structures that minimize or maximize a desired property, subject to constraints (composition, geometry, etc.). While we showed a small demonstration on a single Cu–Ni–Pd–Ag cell, you can scale up to larger supercells, more complex constraints, or different properties.</p>


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-reuse"><h2 class="anchored quarto-appendix-heading">Reuse</h2><div class="quarto-appendix-contents"><div><a rel="license" href="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</a></div></div></section><section class="quarto-appendix-contents" id="quarto-citation"><h2 class="anchored quarto-appendix-heading">Citation</h2><div><div class="quarto-appendix-secondary-label">BibTeX citation:</div><pre class="sourceCode code-with-copy quarto-appendix-bibtex"><code class="sourceCode bibtex">@online{yin2025,
  author = {Yin, Xiangyu},
  title = {Designing {Materials} with {Cluster} {Expansion} and
    {MatOpt}},
  date = {2025-01-27},
  url = {https://xiangyu-yin.com/content/post_cluster_expansion.html},
  langid = {en}
}
</code></pre><div class="quarto-appendix-secondary-label">For attribution, please cite this work as:</div><div id="ref-yin2025" class="csl-entry quarto-appendix-citeas">
Yin, Xiangyu. 2025. <span>“Designing Materials with Cluster Expansion
and MatOpt.”</span> January 27. <a href="https://xiangyu-yin.com/content/post_cluster_expansion.html">https://xiangyu-yin.com/content/post_cluster_expansion.html</a>.
</div></div></section></div> ]]></description>
  <guid>https://xiangyu-yin.com/content/post_cluster_expansion.html</guid>
  <pubDate>Mon, 27 Jan 2025 06:00:00 GMT</pubDate>
  <media:content url="https://xiangyu-yin.com/content/img/icet_logo.png" medium="image" type="image/png" height="76" width="144"/>
</item>
</channel>
</rss>
