Workflow of ML-guided synthesis of CQDs
Synthesis parameters strongly affect the target properties of the resulting samples. However, tuning many parameters to optimize multiple desired properties simultaneously is intricate. Our ML-integrated multi-objective optimization (MOO) strategy tackles this challenge by learning the complex correlations between hydrothermal/solvothermal synthesis parameters and two target properties of CQDs in a unified MOO formulation, thus recommending optimal conditions that enhance both properties simultaneously. The overall workflow for the ML-guided synthesis of CQDs is shown in Fig. 1 and Supplementary Fig. 1. The workflow consists of four key components: database construction, MOO formulation, MOO recommendation, and experimental verification.
Fig. 1. The workflow consists of four key components: database construction, multi-objective optimization (MOO) formulation, MOO recommendation, and experimental verification.
Using a representative and comprehensive synthesis descriptor set is vital for optimizing synthesis conditions [36]. We carefully selected eight descriptors to comprehensively represent the hydrothermal system, one of the most common methods for preparing CQDs. The descriptor list includes reaction temperature (T), reaction time (t), type of catalyst (C), volume/mass of catalyst (VC), type of solution (S), volume of solution (VS), ramp rate (Rr), and mass of precursor (Mp). To minimize human intervention, the bounds of the synthesis parameters are determined primarily by the constraints of the synthesis methods and equipment used, rather than by expert intuition. For instance, when the hydrothermal/solvothermal method is used to prepare CQDs, the reactor inner pot is made of polytetrafluoroethylene, so the operating temperature should not exceed 220 °C. Moreover, the capacity of the reactor inner pot used in the experiments is 25 mL, and the general guidance is not to fill more than 2/3 of this volume for reactions. Therefore, in this study, the main considerations of experimental design are to ensure experimental safety and to accommodate the limitations of the equipment. These practical considerations naturally led to a vast parameter space, estimated at 20 million possible combinations, as detailed in Supplementary Table 1. Briefly, the 2,7-naphthalenediol molecule, along with catalysts such as H2SO4, HAc, ethylenediamine (EDA), and urea, was adopted to construct the carbon skeleton of CQDs during the hydrothermal or solvothermal reaction process (Supplementary Fig. 2). Different reagents (including deionized water, ethanol, N,N-dimethylformamide (DMF), toluene, and formamide) were used to introduce different functional groups into the architectures of CQDs, which, combined with the other synthesis parameters, resulted in tunable PL emission. To establish the initial training dataset, we collected 23 CQDs synthesized under different randomly selected parameters.
Each data sample is labelled with experimentally verified PL wavelength and PLQY (see Methods).
To account for the varying importance of multiple desired properties, an effective strategy is needed to quantitatively evaluate candidate synthesis conditions in a unified manner. A MOO strategy has thus been developed that prioritizes full-color PL wavelength over PLQY enhancement, by assigning an additional reward when the maximum PLQY of a color surpasses the predefined threshold for the first time. Given \(N\) explored experimental conditions, \(\{(x_i, y_i^c, y_i^\gamma) \mid i = 1, 2, \ldots, N\}\), \(x_i\) indicates the \(i\)-th synthesis condition defined by 8 synthesis parameters, and \(y_i^c\) and \(y_i^\gamma\) indicate the corresponding color label and yield (i.e., PLQY) given \(x_i\); \(y_i^c \in \{c_1, c_2, \ldots, c_M\}\) for \(M\) possible colors, and \(y_i^\gamma \in [0, 1]\). The unified objective function is formulated as the sum of maximum PLQY for each color label, i.e.,
$$\sum\nolimits_{c_j} Y_{c_j}^{\max}, \tag{1}$$
where \(j \in \{1, 2, \ldots, M\}\) and \(Y_{c_j}^{\max}\) is 0 if \(\nexists\, y_i^c = c_j\); otherwise
$$Y_{c_j}^{\max} = \max_i \left[\Big(y_i^\gamma + R \cdot \mathbb{1}\left(y_i^\gamma \ge \alpha\right)\Big) \cdot \mathbb{1}\left(y_i^c = c_j\right)\right]. \tag{2}$$
\(\mathbb{1}(\cdot)\) is an indicator function that outputs 1 if its argument is true and 0 otherwise. The term \(R \cdot \mathbb{1}(y_i^\gamma \ge \alpha)\) enforces a higher priority for full-color synthesis: the PLQY for each color must be at least \(\alpha\) (\(\alpha = 0.5\) in our case) to earn an additional reward of \(R\) (\(R = 10\) in our settings). \(R\) can be any real value larger than 1 (i.e., the maximum possible improvement of PLQY for one synthesis condition), which ensures the higher priority of exploring synthesis conditions for colors whose yield has not yet reached \(\alpha\). We set \(R\) to 10, such that the tens digit of the unified objective function's value clearly indicates the number of colors with maximum PLQYs of at least \(\alpha\), and the remainder (units digit and decimals) reflects the sum of maximum PLQYs (without the additional reward) for all colors. As defined by the ranges of PL wavelength in Supplementary Table 2, the seven primary colors considered in this work are purple (<420 nm), blue (≥420 and <460 nm), cyan (≥460 and <490 nm), green (≥490 and <520 nm), yellow (≥520 and <550 nm), orange (≥550 and <610 nm), and red (≥610 nm), i.e., \(M = 7\). Notably, the proposed MOO formulation unifies the two goals of achieving full color and high PLQY into a single objective function, providing a systematic approach to tuning synthesis parameters for the desired properties.
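Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function and variable names below are hypothetical and are not taken from the authors' code; the color bins follow the wavelength ranges quoted above, and each sample is assumed to be a `(wavelength_nm, plqy)` pair with PLQY expressed as a fraction in [0, 1].

```python
from bisect import bisect_right

# Inclusive lower bounds (nm) of blue..red; anything below 420 nm is purple.
COLOR_EDGES = [420, 460, 490, 520, 550, 610]
COLORS = ["purple", "blue", "cyan", "green", "yellow", "orange", "red"]


def color_label(wavelength_nm):
    """Map a PL wavelength to one of the seven color bins."""
    return COLORS[bisect_right(COLOR_EDGES, wavelength_nm)]


def objective_utility(samples, alpha=0.5, reward=10.0):
    """Eqs. (1)-(2): sum over colors of the best rewarded PLQY.

    samples: iterable of (wavelength_nm, plqy) pairs, plqy in [0, 1].
    """
    best = {}  # color -> max over samples of plqy + reward * [plqy >= alpha]
    for wavelength_nm, plqy in samples:
        score = plqy + (reward if plqy >= alpha else 0.0)
        color = color_label(wavelength_nm)
        best[color] = max(best.get(color, 0.0), score)
    # Colors that were never observed contribute 0, as in Eq. (1).
    return sum(best.values())
```

With three colors explored, of which two reach the threshold, the tens digit of the returned value is 2 and the remainder is the plain sum of the per-color maxima.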
The MOO strategy is premised on the prediction results of ML models. Due to the high-dimensional search space and limited experimental data, it is challenging to build models that generalize well to unseen data, especially considering the nonlinear nature of the condition-property relationship [37]. To address this issue, we employed a gradient boosting decision tree-based model (XGBoost), which has proven advantageous in handling related material datasets (see Methods and Supplementary Fig. 3) [30,38]. In addition, its capability to guide hydrothermal synthesis has been demonstrated in our previous work (Supplementary Fig. 4) [21]. Two regression models, tuned by grid search over hyperparameters, were fitted on the given dataset, one for PL wavelength and the other for PLQY. These models were then deployed to predict the properties of all unexplored candidate synthesis conditions. The search space for candidate conditions is defined by the Cartesian product of all possible values of the eight synthesis parameters, resulting in ~20 million possible combinations (see Supplementary Table 1). The candidate synthesis conditions, i.e., the unexplored regions of the search space, are then ranked by the MOO evaluation strategy using the prediction results.
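The enumeration and ranking step can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: `grid`, `predict`, and `utility` are hypothetical names, the trained regressors are abstracted behind the `predict` callable, and each candidate is scored by the objective utility the explored set would have if the predicted outcome were added to it, which is one plausible reading of "ranked by the MOO evaluation strategy".

```python
from itertools import product


def candidate_conditions(grid, explored):
    """Yield every parameter combination not yet tried experimentally.

    grid: dict mapping parameter name -> list of allowed values.
    explored: container of already tested condition tuples.
    """
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        if values not in explored:
            yield values


def rank_candidates(grid, explored, predict, utility, top_k=2):
    """Return the top_k unexplored conditions by predicted objective utility.

    explored: dict mapping condition tuple -> measured (wavelength_nm, plqy).
    predict:  callable, condition -> predicted (wavelength_nm, plqy).
    utility:  callable, list of (wavelength_nm, plqy) -> float.
    """
    measured = list(explored.values())
    scored = [
        (utility(measured + [predict(condition)]), condition)
        for condition in candidate_conditions(grid, explored)
    ]
    # Stable sort: ties keep their enumeration order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [condition for _, condition in scored[:top_k]]
```

A pure-Python loop over roughly 20 million combinations would be slow; in practice the predictions would more likely be computed in vectorized batches, but the logic is the same.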
Finally, the PL wavelength and PLQY values of the CQDs synthesized under the top two recommended synthesis conditions are verified through experiments and characterization, and the results are then added to the training dataset for the next iteration of the MOO design loop. The iterative design loops continue until the objectives are fulfilled, i.e., when the achieved PLQY for all seven colors surpasses 50%. It is worth noting that, in prior studies, only a limited number of CQDs with short-wavelength fluorescence (e.g., blue and green) have reached PLQYs above 50% [39,40,41]. Their long-wavelength counterparts, particularly those with orange and red fluorescence, usually show PLQYs under 20% [42,43,44]. To demonstrate the efficacy of our ML-powered MOO strategy, we set an ambitious goal for all fluorescent CQDs: PLQYs exceeding 50%. The capacity to modulate the PL emission of CQDs holds significant promise for various applications, spanning from bioimaging and sensing to optoelectronics. Our four-stage workflow is designed to provide an ML-integrated MOO strategy that can iteratively guide the hydrothermal synthesis of CQDs toward multiple desired properties, while also continually improving the models' prediction performance.
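The closed loop described above can be summarized in a short sketch. All names here (`design_loop`, `recommend`, `run_experiment`, `color_of`) are hypothetical placeholders: `recommend` stands for model refitting plus MOO ranking, and `run_experiment` stands for the wet-lab synthesis and characterization. The stopping test uses `>= alpha` to match Eq. (2).

```python
def design_loop(explored, recommend, run_experiment, color_of,
                n_colors=7, alpha=0.5, max_iterations=100):
    """Iterate recommend -> synthesize -> augment until every color reaches alpha.

    explored: dict, condition -> measured (wavelength_nm, plqy); updated in place.
    Returns the number of iterations used, or None if the goal was not reached.
    """
    for iteration in range(1, max_iterations + 1):
        # E.g. the top two conditions ranked by predicted objective utility.
        for condition in recommend(explored):
            explored[condition] = run_experiment(condition)

        # Best measured PLQY per color over the augmented dataset.
        best = {}
        for wavelength_nm, plqy in explored.values():
            color = color_of(wavelength_nm)
            best[color] = max(best.get(color, 0.0), plqy)

        if len(best) == n_colors and all(v >= alpha for v in best.values()):
            return iteration
    return None
```

Because the measured results are written back into `explored`, each call to `recommend` sees a larger training set than the previous one, which is what lets the models improve across iterations.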
To assess the effectiveness of our ML-driven MOO strategy in the hydrothermal synthesis of CQDs, we employed several metrics, chosen to ascertain whether the proposed approach not only meets its dual objectives but also enhances prediction accuracy throughout the iterative process. The unified objective function described above measures how well the two desired objectives have been realized experimentally, and thus serves as a quantitative indicator of how effectively our approach guides CQD synthesis. The value of the unified objective function after a specific ML-guided synthesis loop is termed the objective utility value. The MOO strategy improves the objective utility value by a large margin of 39.27% to 75.44, denoting that the maximum PLQY in all seven colors exceeds the target of 0.5 (Fig. 2a). Specifically, at iterations 7 and 19, the number of color labels with maximum PLQY exceeding 50% increases by one, resulting in an additional reward of 10 each time. Even on the apparent plateaus, the two insets show that the maximum achieved PLQY is continuously enhanced. For instance, during iterations 8 to 11, the maximum PLQY for cyan emission rises from 59% to 94%, and the maximum PLQY for purple emission rises from 52% to 71%. Impressively, our MOO approach fulfilled both objectives within only 20 iterations (i.e., 40 guided experiments).
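Because the reward is 10 and the sum of seven maximum PLQYs can never reach 10, an objective utility value can be read back into its two parts with simple arithmetic. The helper below is a hypothetical illustration of that reading, applied to the final value of 75.44 reported above.

```python
def decode_utility(utility, reward=10.0):
    """Split a utility value into (colors at/above threshold, sum of max PLQYs).

    Valid only while the sum of maximum PLQYs stays below `reward`.
    """
    n_colors, plqy_sum = divmod(utility, reward)
    return int(n_colors), round(plqy_sum, 2)


print(decode_utility(75.44))  # (7, 5.44): all seven colors pass, PLQY sum 5.44
```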
Fig. 2. a MOO's unified objective utility versus design iterations. b Colors explored with newly synthesized experimental conditions. Value ranges of colors defined by PL wavelength: purple (PL < 420 nm), blue (420 nm ≤ PL < 460 nm), cyan (460 nm ≤ PL < 490 nm), green (490 nm ≤ PL < 520 nm), yellow (520 nm ≤ PL < 550 nm), orange (550 nm ≤ PL < 610 nm), and red (PL ≥ 610 nm). The panel shows that, while high PLQY had already been achieved for red, orange, and blue in the initial dataset, the MOO strategy purposefully enhances the PLQYs for yellow, purple, cyan, and green, respectively, in subsequently synthesized conditions, in groups of five. c MSE between the predicted and real target properties. d Covariance matrix for the correlation among the 8 synthesis parameters (i.e., reaction temperature T, reaction time t, type of catalyst C, volume/mass of catalyst VC, type of solution S, volume of solution VS, ramp rate Rr, and mass of precursor Mp) and the 2 target properties, i.e., PLQY and PL wavelength. e Two-dimensional t-distributed stochastic neighbor embedding (t-SNE) plot for the whole search space, including unexplored (circular points), training (star-shaped points), and explored (square points) conditions, where the latter two sets are colored by real PL wavelengths.
Figure 2b reveals that the MOO strategy systematically explores the synthesis conditions for each color, addressing those that have not yet achieved the designated PLQY threshold, starting with yellow in the first 5 iterations and ending with green in the last 5 iterations. Notably, within each set of 5 iterations, a single color shows an enhancement in its maximum PLQY. Initially, the PLQY for yellow surges to 65%, followed by a significant rise in purple's maximum PLQY from 44% to 71% during the next 5 iterations. This trend continues with cyan and green, whose maximum PLQYs rise to 94% and 83%, respectively. Taking into account both the training set (i.e., the first 23 samples) and the augmented dataset, the peak PLQY for all colors exceeds 60%. Several colors approach 70% (including purple, blue, and red), and some are near 100% (including cyan, green, and orange). This further underscores the effectiveness of our proposed ML technique. A more detailed visualization of the PL wavelength and PLQY along each iteration is provided in Supplementary Fig. 5.
The MOO strategy ranks candidate synthesis conditions based on ML predictions; thus, it is vital to evaluate the ML models' performance. Mean squared error (MSE), a metric commonly used for regression, is employed for evaluation; it is computed from the PL wavelength and PLQY predicted by the ML models versus the experimentally determined values [45]. As shown in Fig. 2c, the MSE of PLQY drastically decreases from 0.45 to approximately 0.15 within just four iterations, a notable error reduction of 64.5%. The MSE eventually stabilizes around 0.1 as the iterative loops progress. Meanwhile, the MSE of PL wavelength remains consistently low, always under 0.1. The MSE of PL wavelength is computed after normalizing all values to the range of zero to one for a fair comparison; thus, an MSE of 0.1 signifies a favorable deviation within 10% between the ML-predicted values and the experimental verifications. This indicates that the accuracies of our ML models for both PL wavelength and PLQY consistently improve, with predictions closely aligning with actual values after enhanced learning from the augmented data. This demonstrates the efficacy of our MOO strategy not only in optimizing multiple desired properties but also in refining the ML models.
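The normalized-MSE computation described above can be sketched in a few lines. The normalization bounds used for PL wavelength are not stated in this section, so `lo` and `hi` below are explicit assumptions supplied by the caller, and the function names are hypothetical.

```python
def min_max_normalize(values, lo, hi):
    """Rescale values linearly so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]


def mse(predicted, measured):
    """Mean squared error between two equally long sequences."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)


def normalized_wavelength_mse(predicted_nm, measured_nm, lo, hi):
    """MSE of PL wavelength after mapping both series onto [0, 1]."""
    return mse(min_max_normalize(predicted_nm, lo, hi),
               min_max_normalize(measured_nm, lo, hi))
```

PLQY is already a fraction between 0 and 1, so `mse` can be applied to it directly, which puts the two error curves on a comparable scale.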
To unveil the correlation between synthesis parameters and target properties, we further calculated the covariance matrix. As illustrated in Fig. 2d, the eight synthesis parameters generally exhibit low correlation with one another, indicating that each parameter contributes unique and complementary information for optimizing the CQD synthesis conditions. In terms of the impact of these synthesis parameters on the target properties, factors such as reaction time and temperature are found to influence both PL wavelength and PLQY. This underscores the importance, for both experimentalists and data-driven methods, of adjusting them with higher precision. Besides reaction time and temperature, PL wavelength and PLQY are determined by distinct sets of synthesis parameters with varying relations. For instance, the type of solution correlates negatively with PLQY, while solution volume has a stronger positive correlation with PLQY. This reiterates that, given the high-dimensional search space, the complex interplay between synthesis parameters and multiple target properties can hardly be unraveled without capable ML-integrated methods.
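A pairwise correlation analysis of this kind can be sketched with the standard library alone. This example assumes the matrix in Fig. 2d is a normalized covariance (Pearson correlation) and that categorical parameters such as catalyst or solvent type have already been encoded as numbers; both are assumptions, and the names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient (covariance scaled to [-1, 1])."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Raises ZeroDivisionError if either column is constant.
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """columns: dict mapping name -> list of numbers, all of equal length."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}
```

Entries near zero between two synthesis parameters correspond to the "low correlation" reading in the text, while the rows for PLQY and PL wavelength show which parameters move with each target.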
To visualize how the MOO strategy has navigated the expansive search space (~20 million combinations) using only 63 data samples, we compressed the initial training, explored, and unexplored conditions into two dimensions by projecting them into a reduced embedding space using t-distributed stochastic neighbor embedding (t-SNE) [46]. As shown in Fig. 2e, discerning distinct clustering patterns by color proves challenging, which emphasizes how intricate it is to uncover the relationship between synthesis conditions and target properties. This complexity further underscores the critical role of an ML-driven approach in deciphering the hidden intricacies within the data. The efficacy of ML models is premised on the quality of the training data. Thus, selecting training data that span as large a search space as possible is particularly advantageous to the models' generalizability [37]. As observed in Fig. 2e, our ML models benefit from the randomly and sparsely distributed training data, which encourage the models to generalize to previously unseen areas of the search space and to effectively guide the search for optimal synthesis conditions within this intricate multi-objective optimization landscape.
With the aid of the ML-coupled MOO strategy, we have successfully and rapidly identified the optimal conditions giving rise to full-color CQDs with high PLQY. The ML-recommended synthesis conditions that produced the highest PLQY for each color are detailed in the Methods section. Ten CQDs with the best optical performance were selected for in-depth spectral investigation. The resulting absorption spectra of the CQDs show strong excitonic absorption bands, and the normalized PL spectra of the CQDs display PL peaks ranging from 410 nm for purple CQDs (p-CQDs) to 645 nm for red CQDs (r-CQDs), as shown in Fig. 3a and Supplementary Fig. 6. This encompasses a diverse array of CQD types, including p-CQDs, blue CQDs (b-CQDs, 420 nm), cyan CQDs (c-CQDs, 470 nm), dark-cyan CQDs (dc-CQDs, 485 nm), green CQDs (g-CQDs, 490 nm), yellow-green CQDs (yg-CQDs, 530 nm), yellow CQDs (y-CQDs, 540 nm), orange CQDs (o-CQDs, 575 nm), orange-red CQDs (or-CQDs, 605 nm), and r-CQDs. Importantly, the PLQYs of most of these CQDs were above 60% (Supplementary Table 3), exceeding those of the majority of CQDs reported to date (Supplementary Table 4). Corresponding photographs of full-color fluorescence ranging from purple to red under UV light irradiation are provided in Fig. 3b. The excellent excitation-independent behavior of the CQDs is further revealed by the three-dimensional fluorescence spectra (Supplementary Fig. 7). Furthermore, a comprehensive investigation of the time-resolved PL spectra revealed a notable trend: the monoexponential lifetimes of the CQDs progressively decreased from 8.6 ns (p-CQDs) to 2.3 ns (r-CQDs) (Supplementary Fig. 8). This observation indicates that the lifetimes of CQDs decrease as their PL wavelength shifts toward the red end of the spectrum [47].
Moreover, the CQDs also demonstrate long-term photostability (>12 hours), making them potential candidates for applications in optoelectronic devices that require stable performance over extended periods of time (Supplementary Fig. 9). Together, these results demonstrate the high quality and great potential of our synthesized CQDs.
Fig. 3. a Normalized PL spectra of CQDs. b Photographs of CQDs under 365 nm UV light irradiation. c Dependence of the HOMO and LUMO energy levels of CQDs.
To gain further insights into the properties of the synthesized CQDs, we calculated their bandgap energies using the experimentally obtained absorption band values (Supplementary Fig. 10 and Supplementary Table 5). The calculated bandgap energies gradually decrease from 3.02 to 1.91 eV from p-CQDs to r-CQDs. In addition, we measured the highest occupied molecular orbital (HOMO) energy levels of the CQDs using ultraviolet photoelectron spectroscopy. As shown in the energy diagram in Fig. 3c, the HOMO values exhibit wave-like variations without any discernible pattern. This result further suggests the robust predictive and optimizing capability of our ML-integrated MOO strategy, which enabled the successful screening of these high-quality CQDs from a vast and complex search space using only 40 sets of experiments.
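As a rough orientation for the bandgap values quoted above, photon energy and wavelength are related by E = hc/λ, with hc ≈ 1239.842 eV·nm. The sketch below only performs this unit conversion; it is not claimed to be the procedure the authors used to extract bandgaps from the absorption spectra, and the function names are hypothetical.

```python
HC_EV_NM = 1239.842  # Planck constant times speed of light, in eV*nm


def photon_energy_ev(wavelength_nm):
    """Photon energy (eV) corresponding to a wavelength in nm."""
    return HC_EV_NM / wavelength_nm


def wavelength_nm(energy_ev):
    """Wavelength (nm) corresponding to a photon energy in eV."""
    return HC_EV_NM / energy_ev


# Wavelengths equivalent to the reported bandgap range of 3.02 to 1.91 eV.
for gap_ev in (3.02, 1.91):
    print(f"{gap_ev:.2f} eV <-> {wavelength_nm(gap_ev):.1f} nm")
```

Under this conversion, the reported bandgap range maps onto wavelengths close to the 410 nm and 645 nm PL peaks mentioned earlier, which is consistent with the bandgap narrowing as the emission redshifts.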
To uncover the mechanism underlying the tunable optical properties of the synthesized CQDs, we carried out a series of characterizations to comprehensively investigate their morphologies and structures (see Methods). X-ray diffraction (XRD) patterns with a single graphite peak at 26.5° indicate a high degree of graphitization in all CQDs (Supplementary Fig. 11) [15]. Raman spectra exhibit a stronger signal intensity for the ordered G band at 1585 cm⁻¹ than for the disordered D band at 1397 cm⁻¹, further confirming the high degree of graphitization (Supplementary Fig. 12) [48]. Fourier-transform infrared (FT-IR) spectroscopy was then performed to detect the functional groups in the CQDs. It clearly reveals NH2 and N–C stretching at 3234 and 1457 cm⁻¹, respectively, indicating the presence of abundant NH2 groups on the surface of the CQDs, except for orange CQDs (o-CQDs) and yellow CQDs (y-CQDs) (Supplementary Fig. 13) [49]. The C=C aromatic ring stretching at 1510 cm⁻¹ confirms the carbon skeleton, while three oxygen-related peaks, i.e., O–H, C=O, and C–O stretching, were observed at 3480, 1580, and 1240 cm⁻¹, respectively, due to the abundant hydroxyl groups of the precursor. The FT-IR spectrum also shows an SO3 stretching vibration band at 1025 cm⁻¹, confirming the additional functionalization of y-CQDs by SO3H groups.
X-ray photoelectron spectroscopy (XPS) was adopted to further probe the functional groups in the CQDs (Supplementary Figs. 14 to 23). Analysis of the XPS survey spectra reveals three main elements in the CQDs, i.e., C, O, and N, except in o-CQDs and y-CQDs. Specifically, o-CQDs and y-CQDs lack N, and y-CQDs contain S. The high-resolution C1s spectrum of the CQDs can be deconvoluted into three peaks, namely a dominant C–C/C=C graphitic carbon bond (284.8 eV), C–O/C–N (286 eV), and carboxylic C=O (288 eV), revealing the structures of the CQDs. The N1s peak at 399.7 eV indicates the presence of N–C bonds, verifying the successful N-doping in the basal plane network structure of the CQDs, except for o-CQDs and y-CQDs. The separated O1s peaks at 531.5 and 533 eV indicate two forms of oxygen-containing functional groups, C=O and C–O, respectively, consistent with the FT-IR spectra [50]. The S2p band of y-CQDs can be decomposed into two peaks at 163.5 and 167.4 eV, representing SO3/2p3/2 and SO3/2p1/2, respectively [47,51]. Combining the results of the structural characterization, the excellent fluorescence properties of the CQDs are attributed to the presence of N-doping, which reduces the non-radiative sites of the CQDs and promotes the formation of C=O bonds. The C=O bonds play a crucial role in radiative recombination and can increase the PLQY of the CQDs.
To gain deeper insights into the morphology and microstructures of the CQDs, we then conducted transmission electron microscopy (TEM). The TEM images show uniformly shaped and monodisperse nanodots, with average lateral sizes increasing gradually from 1.85 nm for p-CQDs to 2.3 nm for r-CQDs (Fig. 4a and Supplementary Fig. 24). This trend agrees with the corresponding PL wavelengths, providing further evidence for the quantum size effect of CQDs [47]. High-resolution TEM images further reveal the highly crystalline structures of the CQDs with well-resolved lattice fringes (Fig. 4b, c). The measured crystal plane spacing of 0.21 nm corresponds to the (100) graphite plane, further corroborating the XRD data. Our analysis suggests that the synthesized CQDs possess a graphene-like, highly crystalline character, which gives rise to their superior fluorescence performance.
Fig. 4. a The lateral size and color of full-color fluorescent CQDs (inset: dependence of the PL wavelength on the lateral size of full-color fluorescent CQDs). Data correspond to mean ± standard deviation, n = 3. b, c High-resolution TEM images and the fast Fourier transform patterns of p-, b-, c-, g-, y-, o- and r-CQDs, respectively. d Boxplots of PL wavelength (left)/PLQY (right) and 7 synthesis parameters of CQDs. VC is excluded here as its value range is dependent on C, and its relationships with other parameters are therefore not directly interpretable. The labels at the bottom indicate the minimum value (inclusive) of the respective bins; the bins on the left follow the discretization of colors in Supplementary Table 2, whereas the bins on the right are uniform. Each box spans vertically from the 25th percentile to the 75th percentile, with the horizontal line marking the median and the triangle indicating the mean value. The upper and lower whiskers extend from the ends of the box to the maximum and minimum data values, respectively.
Having used ML to explore the search space effectively, we proceeded to systematically examine the 63 samples using box plots, aiming to elucidate the complex interplay between the various synthesis parameters and the resulting optical properties of the CQDs. As depicted in Fig. 4d, synthesis under conditions of high reaction temperature, prolonged reaction time, and low-polarity solvents tends to result in CQDs with a longer PL wavelength. These findings are consistent with general observations in the literature, which suggest that the parameters identified above can enhance precursor molecular fusion and nucleation growth, thereby yielding CQDs with increased particle size and longer PL wavelength [47,52,53,54,55]. Moreover, a comprehensive survey of the existing literature implies that precursor and catalyst pairs that combine electron-donating and electron-accepting groups aid in producing long-wavelength CQDs [56,57]. Interestingly, diverging from these traditional findings, we successfully synthesized long-wavelength red CQDs under ML guidance, with 2,7-naphthalenediol, which contains electron-donating groups, as the precursor and EDA, which is known for its electron-donating functionalities, as the catalyst. This significant breakthrough questions existing assumptions and offers new insights into the design of long-wavelength CQDs.
Concerning PLQY, we found that catalysts with stronger electron-donating groups (e.g., EDA) led to enhanced PLQY in CQDs, consistent with earlier observations made by our research team [16]. Remarkably, we uncovered a significant impact of the synthesis parameters on the PLQY of CQDs. In the high-PLQY regime, strong positive correlations were discovered between PLQY and reaction temperature, reaction time, and solvent polarity, which were previously unreported in the literature [58,59,60,61]. This insight could be applied to similar systems for PLQY improvement.
Aside from the parameters discussed above, other factors such as ramp rate, the amount of precursor, and solvent volume also influence the properties of CQDs. Overall, the emission color and PLQY of CQDs are governed by complex, non-linear trends resulting from the interaction of numerous factors. It is noteworthy that the traditional methods used to adjust the properties of CQDs often result in a decrease in PLQY as the PL wavelength redshifts [4,47,51,54]. However, using AI-assisted synthesis, we have successfully increased the PLQY of the resulting full-color CQDs to over 60%. This significant achievement highlights the unique advantages offered by ML-guided CQD synthesis and confirms the powerful potential of ML-based methods in effectively navigating the complex relationships among diverse synthesis parameters and multiple target properties within a high-dimensional search space.
Source: Machine learning-guided realization of full-color high-quantum-yield carbon quantum dots, Nature.com