High-security learning-based optical encryption assisted by disordered metasurface – Nature.com

Working principle

The whole process can be divided into two stages: optical encryption and learning-based decryption, as shown in Fig. 1. In the optical encryption stage (Fig. 1a), the sender (Alice) projects a light beam with one of two different polarizations (P(i) or P(j), i ≠ j) onto a plaintext, which is first encrypted by a QR code phase pattern (the security key) and then propagates through the DM, which acts as a second layer of encryption on the plaintext, generating a speckle pattern (the ciphertext). The DM scatters light differently for different input polarizations owing to its spin-multiplexed random phase design. The relationship among the speckle, plaintext, security key, and DM can be expressed as:

$$U(x, y, z)=\iint U_{\mathrm{P}}(x_0, y_0)\,U_{\mathrm{S}}(x_0, y_0)\,U_{\mathrm{DM}}(x_0, y_0)\,h(x-x_0, y-y_0, z)\,\mathrm{d}x_0\,\mathrm{d}y_0,$$

(1)

where \(U_{\mathrm{P}}(x_0, y_0)\), \(U_{\mathrm{S}}(x_0, y_0)\), and \(U_{\mathrm{DM}}(x_0, y_0)\) are the functions of the plaintext, the security key, and the DM, respectively, and \(h(x, y, z)\) is the impulse response. Eq. (1) shows that the security key and the DM encrypt the plaintext in sequence, providing two layers of security. In addition, because \(U_{\mathrm{DM}}(x_0, y_0)\) varies with the polarization of the incident beam by design, multi-channel encryption can be implemented by changing that polarization.
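Eq. (1) can be explored numerically with a minimal sketch such as the one below. It is not the authors' simulation: it assumes phase-only plaintext, key, and DM functions, a paraxial (Fresnel) impulse response for h, a square sampling grid, and circular convolution via FFT. The function names, grid size, pixel pitch, and propagation distance are all illustrative choices.

```python
import numpy as np

def fresnel_kernel(n, pitch, wavelength, z):
    """Paraxial impulse response h(x, y, z) sampled on an n x n grid (assumed form)."""
    k = 2 * np.pi / wavelength
    coords = (np.arange(n) - n // 2) * pitch
    x, y = np.meshgrid(coords, coords)
    quad = np.exp(1j * k * (x**2 + y**2) / (2 * z))
    return np.exp(1j * k * z) / (1j * wavelength * z) * quad

def speckle_intensity(phase_plain, phase_key, phase_dm, pitch, wavelength, z):
    """Evaluate Eq. (1) as an FFT-based circular convolution and return |U|^2."""
    # Product U_P * U_S * U_DM, with each factor modelled as a pure phase mask.
    field = np.exp(1j * (phase_plain + phase_key + phase_dm))
    h = fresnel_kernel(field.shape[0], pitch, wavelength, z)
    # ifftshift moves the kernel centre to index (0, 0) before transforming.
    spectrum = np.fft.fft2(field) * np.fft.fft2(np.fft.ifftshift(h))
    u_out = np.fft.ifft2(spectrum) * pitch**2
    return np.abs(u_out) ** 2

# Illustrative parameters (assumptions, not values from the experiment
# apart from the 488 nm wavelength).
rng = np.random.default_rng(0)
n = 64
plain = rng.uniform(0, 2 * np.pi, (n, n))
key = np.pi * rng.integers(0, 2, (n, n))      # binary phase key, QR-like
dm = rng.uniform(0, 2 * np.pi, (n, n))        # random DM phase for one polarization
ciphertext = speckle_intensity(plain, key, dm, 1e-6, 488e-9, 1e-3)
```

Swapping `dm` for a second, independent random mask mimics switching the incident polarization: the plaintext and key stay the same, but the resulting speckle changes.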

a Optical encryption. The sender (Alice) directs light beams with two different polarizations, P(i) and P(j), onto phase profiles formed by superposing plaintexts (human face images) and security keys (QR codes); the beams then propagate through the DM, generating ciphertexts (speckles). b Learning-based decryption. Two deep neural networks (DNNs) of the same structure, e.g., P(i)-DMNet and P(j)-DMNet, are trained with data obtained with incident beams of P(i) and P(j), respectively. After recording the ciphertext and being authorized by Alice to acquire the security key and the polarization of the incident beam, the receiver (Bob) can feed the ciphertext and the security key into the corresponding neural network to decrypt the plaintext. The mark above the straight line with arrows at both ends indicates that the information cannot be commutative. DM disordered metasurface.

In the learning-based decryption stage, several deep neural networks (DNNs) sharing the same structure, termed P(i)-DMNet and P(j)-DMNet (Fig. 1b), are trained with data from incident beams of P(i) and P(j), respectively; the ciphertext and the security key serve as the inputs used to decode the plaintext. The receiver (Bob) needs authorization from Alice to acquire the security key and the polarization of the incident beam. Assuming that Bob can record the ciphertext at the output terminal in real time by himself, he can directly access the plaintext by feeding the ciphertext and the QR code into the polarization-matched network. Even hackers who gain access to the ciphertext cannot decrypt the plaintext without authentication from Alice, because they lack the security key and the polarization of the incident beam.

The DM consists of elliptical titanium dioxide (TiO2) meta-pillars, as shown in Fig. 2a. The meta-pillars are 600 nm tall (h) and sit on a square lattice with a period (P) of 350 nm; the design wavelength is 488 nm. The lengths of the two axes (u and v) of the meta-pillars vary in the range of 70–320 nm, such that a controllable propagation phase \(\phi_{\mathrm{propagation}}\) is introduced for both LCP and RCP light beams. The simulated phase delays (\(\varphi_{xx}\) and \(\varphi_{yy}\)) of the meta-pillar for two orthogonal linear polarizations (x and y) versus the axis lengths, obtained with the commercial software Lumerical FDTD, are shown in Fig. 2b. The propagation phase of the structure can be calculated from \(\varphi_{xx}\) and \(\varphi_{yy}\), i.e., \(\phi_{\mathrm{propagation}}=\arg\left((\mathrm{e}^{\mathrm{i}\varphi_{xx}}-\mathrm{e}^{\mathrm{i}\varphi_{yy}})/2\right)\) (more details are discussed in Supplementary Note 1). The birefringent meta-pillar is rotated by an angle \(\delta\), which produces the circular polarization (CP) conversions \(|L\rangle \to \mathrm{e}^{\mathrm{i}2\delta}|R\rangle\) and \(|R\rangle \to \mathrm{e}^{-\mathrm{i}2\delta}|L\rangle\), i.e., the LCP and RCP beams are converted to the opposite spin with a geometric phase (or Pancharatnam–Berry (PB) phase) \(\phi_{\mathrm{geometric}}\) of \(2\delta\) and \(-2\delta\), respectively. The combination of the propagation phase and the geometric phase enables the decoupling of RCP and LCP light at the design wavelength for multiplexed wavefront modulation applications30. Given the desired phases for the two orthogonal CP states, \(\phi_{\mathrm{RCP}}\) and \(\phi_{\mathrm{LCP}}\), the required propagation phase and geometric phase at each meta-pillar can be calculated as31

$$\phi_{\mathrm{propagation}}=\frac{\phi_{\mathrm{RCP}}+\phi_{\mathrm{LCP}}}{2}$$

(2)

$$\phi_{\mathrm{geometric}}=\frac{\phi_{\mathrm{LCP}}-\phi_{\mathrm{RCP}}}{4}$$

(3)

Therefore, the phase profiles of the DM for RCP and LCP incident beams can be set independently, and both are randomly distributed to generate speckle images.
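As a quick numerical check of Eqs. (2) and (3), the sketch below draws two independent random phase maps and computes the per-pillar propagation phase and the quantity given by Eq. (3). The function name and the 4 × 4 array size are illustrative. For the consistency check, the value from Eq. (3) is read as the rotation angle δ, so that the PB phase is ±2δ; that reading is an assumption made here, chosen because it reproduces the two target phases exactly.

```python
import numpy as np

def pillar_phases(phi_rcp, phi_lcp):
    """Per-pillar propagation phase (Eq. 2) and the quantity given by Eq. (3)."""
    phi_propagation = (phi_rcp + phi_lcp) / 2
    phi_geometric = (phi_lcp - phi_rcp) / 4
    return phi_propagation, phi_geometric

# Two independent random target phase maps, one per circular polarization.
rng = np.random.default_rng(0)
phi_rcp = rng.uniform(0, 2 * np.pi, (4, 4))
phi_lcp = rng.uniform(0, 2 * np.pi, (4, 4))

phi_prop, delta = pillar_phases(phi_rcp, phi_lcp)

# Assumed reading: an LCP input picks up phi_prop + 2*delta and an
# RCP input picks up phi_prop - 2*delta, which recovers both targets.
recovered_lcp = phi_prop + 2 * delta
recovered_rcp = phi_prop - 2 * delta
```

Because the two targets are recovered for any pair of inputs, the RCP and LCP phase maps can be chosen independently of each other, which is what allows both to be random.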

a A TiO2 unit meta-pillar of the DM with designed parameters, arranged in a square lattice on a fused silica substrate. b The simulated phase delays of the meta-pillar for two orthogonal linear polarizations (along the x and y directions) versus the lengths of the two axes of the meta-pillar. c Seven different polarization states between LCP and RCP, defined by tuning the fast axis of the QWP in the setup (Fig. 3a), and the recorded speckles corresponding to the seven polarization states. d Speckle PCC versus polarization of the incident beam, with the speckle associated with incident LCP as the reference. e Top (left) and perspective (right) views of SEM images of the fabricated DM. The scale bar in (e) is 1 mm. DM disordered metasurface, PCC Pearson's correlation coefficient, RCP right-handed circular polarization, LCP left-handed circular polarization.

Specific parameters of the meta-pillar structures selected in the experiment can be found in Supplementary Note 2. As any polarization can be decomposed into two orthogonal polarization states (RCP and LCP in this study) with different weights32, the speckles generated by the DM vary with the polarization of the incident beam. A combination of a half-wave plate (HWP) and a quarter-wave plate (QWP) placed after the spatial light modulator (SLM), as shown in Fig. 3a, is used to alter the polarization of the incident beam. Two orthogonal optical channels are defined by the two circular polarization states, i.e., P(1): LCP and P(7): RCP. In addition to these two orthogonal channels, five intermediate polarization channels, P(2) to P(6), located between P(1) and P(7), are created by rotating the QWP in steps of 15°, as shown in the second row of Fig. 2c. The third row of Fig. 2c shows the recorded speckles corresponding to these seven incident polarizations. The variation of the Pearson correlation coefficient (PCC) of the speckles, taking the speckle for incident LCP as the reference, is illustrated in Fig. 2d. The speckle is highly sensitive to the rotation angle: the PCC gradually decreases from 1 to 0.08. Such a decrease in PCC can significantly impair the recovery of the input information, and it also suggests that the polarization states are independent of one another. It should be noted that, because of the complex mapping between the input and output light fields, only part of the diffused light field needs to be collected for information decryption33, which further benefits the spatial security and the information capacity. Scanning electron microscope (SEM) images of the top and perspective views of the DM are shown in Fig. 2e (see Methods for more details).
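The decomposition into LCP and RCP weights can be illustrated with Jones calculus. The sketch below assumes that x-polarized light enters the QWP, that the QWP fast axis is stepped from −45° to +45° in 15° increments, and that LCP and RCP are represented by the Jones vectors (1, i)/√2 and (1, −i)/√2. The handedness convention, the input polarization, and the function names are assumptions for illustration; the HWP of the actual setup is not modelled.

```python
import numpy as np

def qwp_jones(theta):
    """Jones matrix of a quarter-wave plate with its fast axis at angle theta (rad)."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    # Rotate into the plate frame, apply a 90-degree retardation, rotate back.
    return rot @ np.diag([1, 1j]) @ rot.T

def circular_weights(theta_deg):
    """Power fractions (LCP, RCP) after x-polarized light passes the QWP."""
    e_out = qwp_jones(np.deg2rad(theta_deg)) @ np.array([1, 0], dtype=complex)
    lcp = np.array([1, 1j]) / np.sqrt(2)    # handedness convention assumed here
    rcp = np.array([1, -1j]) / np.sqrt(2)
    # np.vdot conjugates its first argument, giving the projection <basis|E>.
    return abs(np.vdot(lcp, e_out)) ** 2, abs(np.vdot(rcp, e_out)) ** 2

angles_deg = np.arange(-45, 46, 15)         # seven settings, 15 degrees apart
for i, angle in enumerate(angles_deg, start=1):
    w_lcp, w_rcp = circular_weights(angle)
    print(f"P({i}): QWP at {int(angle):+d} deg -> LCP {w_lcp:.2f}, RCP {w_rcp:.2f}")
```

Under these assumptions the LCP weight equals (1 − sin 2θ)/2, so the seven settings sweep the beam smoothly from purely LCP to purely RCP, with an equal mixture (linear polarization) at the midpoint.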

a The schematic diagram of the optical setup. b Examples of plaintext for encryption. c The corresponding ciphertexts, i.e., the speckles. d Example QR codes. e The decrypted information obtained by inputting (c, d) into the DMNet. The DMNet herein is trained with the RCP data. Inset numbers below each image in (d) are formatted as PCC(SSIM) between b the ground truth and e the decrypted images. SLM spatial light modulator, DM disordered metasurface, HWP half-wave plate, L1, L2 lenses, PCC Pearson's correlation coefficient, RCP right-handed circular polarization, QR quick response, QWP quarter-wave plate, SSIM structural similarity.

The schematic diagram of the optical setup for data collection is illustrated in Fig. 3a. A collimated continuous-wave coherent laser beam with a wavelength of 488 nm (OBIS, Coherent, USA) is expanded to illuminate the aperture of a reflective SLM (HOLOEYE PLUTO VIS056, Germany); a transmissive SLM is drawn in Fig. 3a only for easier visualization. Phase patterns are pre-loaded on the SLM to modulate the laser beam, whose polarization state is set by an HWP and a QWP. The beam is then slightly focused onto the DM using a lens (L1) to generate optical speckles, which are captured by a CMOS camera (FL3-U3-32S2M-CS, PointGrey, Canada). Another lens (L2) placed in front of the camera is used to adjust the grain size of the recorded speckles. Since the decryption is not a trivial inverse of the scattering process as in other works16,20,21 (a more detailed account is given in the Discussion), a DNN named DMNet is specifically designed to match the physical process, with details provided in Supplementary Note 3.

Once the DMNets in this experiment have been trained (more details can be found in Methods), the encryption process is ready. The DMNet trained and tested with data generated by an RCP incident beam, i.e., the P(7) polarization in Fig. 2c, serves as the example in this part and is referred to as the RCP-DMNet or P(7)-DMNet. As shown in Fig. 3, by feeding both the ciphertext (i.e., the speckles in Fig. 3c) and the security key (i.e., the QR code in Fig. 3d) into the well-trained DMNet, decrypted images can be retrieved with high quality, as shown in Fig. 3e. Many fine features of the retrieved human faces match those of the ground truth images (plaintext, Fig. 3b)34. The evaluation metrics also indicate excellent performance, with an averaged PCC of 0.941 and a structural similarity index measure (SSIM) of 0.833. An example with PCC and SSIM as high as 0.97 and 0.93, respectively, is listed in the second column of Fig. 3. The network is therefore shown to reconstruct the information from the speckles accurately. Nevertheless, such success depends on two further factors that strictly gate the decryption: the second input (i.e., the QR code used in this study) and the matched polarization between the speckles and the network. Other datasets such as fMNIST and Quickdraw have also been tried; a quantitative analysis of information complexity for the different datasets can be found in Supplementary Note 4, and the results in Supplementary Note 5.

As discussed in our previous work21, a speckle-based cryptosystem benefits from the complexity of the physical secret key, which provides a high level of security. Nevertheless, if the ciphertext (i.e., the speckles) is accidentally obtained by hackers, the system is still expected to protect itself. As designed in this study, an additional authorized security key (i.e., the QR code) from the sender is needed for decryption at the receiver terminal. Several ciphertexts are generated when different QR codes (100 in this study) are paired with each single plaintext. The decryption is therefore designed to be sensitive to any deviation of Input 2 (Fig. 3) from the correct key, even when Input 1 (the ciphertext) is correct. As before, RCP data serve as the example, and five samples are randomly chosen for demonstration, as shown in Fig. 4. If a uniform matrix is fed as Input 2 (Fig. 4aII), the DMNet merely outputs faces without recognizable features, whose PCC and SSIM (0.080 and 0.109, respectively) are both far below those obtained with the correct QR code (0.941 and 0.833, respectively; Fig. 4aI). Furthermore, excellent protection against a brute-force attack on Input 2 is also achieved (Fig. 4aIII). When one million randomly generated binary-amplitude matrices are used to attack Input 2, the guessed plaintext is similar to that in Fig. 4aII. Notably, the metric reported for the brute-force attack in Fig. 4bIII is not the average but the maximum, since a brute-force attack succeeds if any single trial yields a correct guess, regardless of the number of trials. Even so, the low PCC and SSIM (0.005 and 0.121, respectively) validate the safety of the designed network against a brute-force attack on Input 2. Cases with mismatched pairs of the two inputs, for example, where Input 1 is accurate but Input 2 is a correct QR code corresponding to another plaintext, can be found in Supplementary Note 6. The DMNet output (denoted as Mismatched output) also fails to reveal the human faces and shows patterns similar to those in Rows II and III of Fig. 4a.
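The brute-force evaluation described above can be outlined as follows. This is a hedged sketch rather than the authors' evaluation code: `decrypt` is a hypothetical callable standing in for a trained DMNet (any function mapping a ciphertext and a candidate key to an image), and the function names, key shape, and trial count are illustrative. The sketch scores each random binary guess by its PCC against the plaintext and reports both the maximum and the mean, since the maximum is the relevant figure for an attack that succeeds on a single lucky trial.

```python
import numpy as np

def pcc(a, b):
    """Pearson correlation coefficient between two images of equal size."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    # Constant images have no defined correlation; report 0 in that case.
    return float(a @ b / denom) if denom > 0 else 0.0

def brute_force_pcc(decrypt, ciphertext, plaintext, key_shape, n_trials, seed=0):
    """Return (max, mean) PCC over random binary guesses for the security key.

    `decrypt` is a hypothetical stand-in for the trained network.
    """
    rng = np.random.default_rng(seed)
    scores = np.empty(n_trials)
    for t in range(n_trials):
        guess = rng.integers(0, 2, size=key_shape)   # random binary-amplitude key
        scores[t] = pcc(decrypt(ciphertext, guess), plaintext)
    return scores.max(), scores.mean()
```

A call such as `brute_force_pcc(model_fn, speckle, face, (32, 32), 1_000_000)` would mirror the one-million-trial setting, where `model_fn`, `speckle`, and `face` are placeholders for the user's own network wrapper and data.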

a, b Attack analysis regarding Input 2. Decryption with the correct ciphertext (i.e., Input 1: speckles) while varying Input 2 among a correct QR code (Row I), a uniform pattern (Row II), and a random binary pattern (Row III): a qualitative demonstration and b the statistics, quantifying the PCC and SSIM between the plaintext and the decrypted images for Rows I–III. The metrics for both Correct and Uniform are averaged over 2000 samples, and the metrics for the Random group are the average of 1,000,000 randomly generated binary-amplitude attacks. c Cross-validation of the decryption by inputting speckles with seven different polarization states (i.e., P(i)-speckles, i=1,2,3,4,5,6,7) into DMNets for seven different states (i.e., P(i)-DMNet, i=1,2,3,4,5,6,7). d Averaged decryption PCC corresponding to the cross-validation arrangement in (c); each value is averaged over 2000 samples. QR quick response, PCC Pearson's correlation coefficient, SSIM structural similarity.

In Fig. 2c, d, we demonstrated the sensitivity of the speckles to the incident polarization. Here, the independence of the data in these seven polarization channels is further verified. Seven DMNets are individually trained using the seven polarized datasets, and each DMNet trained with P(i) data is denoted as P(i)-DMNet (i=1,2,3,4,5,6,7). With the correct QR code (not shown in Fig. 4c for simplicity), the plaintexts can only be correctly deciphered when the polarization state of the speckle matches that of the corresponding DMNet, as shown on the diagonal in Fig. 4c: P(i)-speckles are input into the P(i)-DMNet, resulting in decryption PCCs of ~0.94. Once the polarization channels of the input data and the network are mismatched, e.g., P(1)-speckles (LCP) input into P(7)-DMNet (RCP) or P(7)-speckles (RCP) input into P(1)-DMNet (LCP), the decrypted plaintext exhibits unrecognizable faces, with decryption PCCs of 0.0158 and 0.0268, respectively. The statistical analysis in Fig. 4d shows that the decryption PCCs for matched polarization states (~0.94 on the diagonal) are more than an order of magnitude higher than those for mismatched polarizations (<0.06 off the diagonal). That said, multi-channel decryption does not necessarily rely on the orthogonality of the polarizations: the additional polarization states between the orthogonal ones also yield independent polarization channels. By jointly adjusting a half-wave plate and a quarter-wave plate, more polarization states can be created. In principle, an arbitrary polarization state could serve as an encryption channel, with the polarization set as discussed in the Working principle section. Therefore, the feasibility of multi-channel encryption, which requires both independent polarization channels and the realization of multiple polarization channels based on the DM, is assured.

Stability of the decryption performance is critical in real applications but has seldom been discussed in earlier works, owing to the nature of the conventional scattering media (CSM) used in those experiments. In this study, the system collected data intermittently for 135 h (Periods 1–14 in Fig. 5a), and its status is characterized by the background PCC (blue dots). The background PCC is defined as the PCC between the instantaneous background speckle pattern and the initial one at Time=0. All background speckle patterns are generated with the same uniform phase pattern displayed on the SLM, as described in Methods. The initial status of the cryptosystem is thus defined in Period 1 in Fig. 5a, whose data are fed into the RCP-DMNet for training, giving an average decryption PCC (red bar) of around 0.94, as demonstrated in previous sections. In other words, the test data in Periods 2–14 are new to the network: they are collected under a temporally varying medium status and have never been learned or probed by the network. Without additional training, the decryption PCC in the following periods (Periods 2–14) tracks the background PCC, with which it is positively correlated. More importantly, the varying status can recover to the initial status, e.g., in Periods 2–6, Periods 7 to 8, and Periods 12–14, where the corresponding averaged decryption PCC recovers from 0.82 to 0.93, from 0.73 to 0.90, and from 0.68 to 0.90, respectively. The decrypted images can be seen in Fig. 5b. It should be noted that during these 135 h, the experiment was performed on the seventh floor and the environmental perturbations were general and diverse, including switching of the laser/SLM/camera, other experiments on the same optical table, traffic around the building, loud machine noise from an adjacent machine room, etc. In our cryptosystem, the DM thus provides excellent stability against those everyday perturbations, and the deviation from the initial status is reversible. Such a phenomenon can hardly be seen in CSM-based implementations (ground glass diffuser, DG-10-220, Thorlabs) over such a long duration, as shown in Fig. 5c: with everyday perturbations, the background PCC of the CSM-based system (with the same setup as the DM-based implementation) drops markedly (down to around 0.2) without recovering to the initial status. As seen in Fig. 5d, starting from Period 2, the decryption performance also deteriorates over time. The fine facial features gradually erode, resulting in significant deviations from the ground truth images. This highlights an additional advantage of using a DM over a CSM: for media like ground glass diffusers, the deviation from the initial state is highly unpredictable and often irreversible, whereas our proposed DM-based system exhibits reversibility (Fig. 5a). This feature can be attributed to the single-layered nature of the DM, which ensures a wider range of the memory effect24. This characteristic physically allows a more relaxed optical conjugation of the DM with the input wavefront compared with typical multi-layered diffusers. Therefore, our system can in practice be recovered to the initial status, as quantified by the background PCC of the recorded speckle (i.e., 0.98), when the perturbations become similar to those at the initial status or when simple tuning of the system is feasible. Furthermore, since no additional training of the network is needed over time, encrypting new plaintext with the proposed cryptosystem remains practically feasible even after a long period of time has elapsed since the network was trained.

a, b Stability analysis for the DM-based decryption performance. a Background PCC (blue dots) and decryption PCC (red columns) based on the data collected in 14 periods. b Decryption performance for three representative examples with respect to the 14 periods in (a). Digits below each reconstructed image are the decryption PCCs between the decrypted image and the ground truth image. c, d Stability analysis for the CSM-based decryption performance. c, d are the counterparts of (a, b), respectively, under the same experimental conditions with a ground glass replacing the DM as the scattering medium. GT ground truth, DM disordered metasurface, CSM conventional scattering medium, PCC Pearson's correlation coefficient.
