Future space-based coronagraphs will rely critically on focal-plane wavefront sensing and control with deformable mirrors (DMs) to reach deep contrast by mitigating optical aberrations in the primary beam path. Until now, most focal-plane wavefront control algorithms have been formulated in terms of Jacobian matrices, which encode the predicted effect of each DM actuator on the focal-plane electric field. A disadvantage of these methods is that Jacobian matrices can be cumbersome to compute and manipulate, particularly when the number of DM actuators is large. Recently, we proposed a new class of focal-plane wavefront control algorithms that utilize gradient-based optimization with algorithmic differentiation to compute wavefront control solutions while avoiding the explicit computation and manipulation of Jacobian matrices entirely. In simulations using a coronagraph design for the proposed Large UV/Optical/Infrared Surveyor, we showed that our approach reduces overall CPU time and memory consumption compared to a Jacobian-based algorithm. Here, we expand on these results by implementing the proposed algorithm on the High-contrast Imager for Complex Aperture Telescopes tested at the Space Telescope Science Institute and present initial experimental results, demonstrating contrast suppression capabilities equivalent to Jacobian-based methods. |
1.IntroductionFuture space coronagraphs attempting to image and characterize Earth-like planets around nearby solar-type stars will rely critically on closed-loop wavefront sensing and control (WFS&C) using deformable mirrors (DMs) to mitigate optical aberrations in the primary beam path. These aberrations, primarily mid-spatial frequency wavefront errors and optical misalignments in the telescope and coronagraph optics, give rise to a speckle floor that is coherent with the star and evolves slowly over time in response to miniscule drifts in the thermal and mechanical state of the observatory. If uncorrected, the speckle floor overwhelms the faint image of the orbiting planet, which is expected to be times fainter than the host star at 0.1 arcseconds of separation or less at visible wavelengths.1 The current state-of-the-art wavefront control algorithms, stroke minimization (SM)2 and electric field conjugation (EFC),3 compute DM command solutions using a first-order approximation of the focal-plane electric field in an optimal control framework. In both cases, the optimal DM update is written down in closed form as the solution of a linear system of equations constructed from a Jacobian matrix that describes the influence of each DM actuator on the focal-plane electric field. Until recently, little attention has been paid to the computational demands of SM and EFC. The complexity of SM and EFC is dominated by the cost of computing and manipulating the Jacobian matrix, whose size is proportional to the product of the number of DM actuators and number of dark-zone pixels. In broadband imaging, this is compounded by the requirement to compute a separate Jacobian for each controlled wavelength. The Jacobian is most often model-based, in which case an optical diffraction model of the coronagraph is evaluated repeatedly to predict the focal-plane influence of each actuator. Moreover, the Jacobian is a linearization of the true, nonlinear behavior of the DMs and must be recalculated periodically as the state of the DMs evolves over time. As direct imaging missions demand DMs with ever-higher actuator density to enable wider and wider search areas, computational aspects will inevitably become a point of concern from a systems engineering standpoint. In on-orbit WFS&C, all sensing and control computations are processed by the flight computer; conversely, in ground-in-the-loop scenarios, raw data is communicated to a ground-based computing node that calculates the DM correction and relays the correction back to the observatory. Though each approach has tradeoffs, a major advantage of on-orbit WFS&C is the ability to update DM commands more frequently without relying on the continuous availability of a communication link with the ground station. On-orbit WFS&C can help to relax observatory-level wavefront stability requirements by enabling the high-contrast dark zone to be maintained over shorter time intervals. Successful deployment of an on-orbit architecture is predicated on the availability of sufficient computational resources; however, radiation-hard, space-qualified computing hardware lags behind conventional hardware by decades and is extremely resource-limited. This poses a substantial computation capability gap if the current algorithms are expected to be deployed on-orbit on a future direct imaging mission, such as NASA’s planned Habitable Worlds Observatory (HWO). As a case in point, the current state-of-the-art testbed for coronagraph laboratory demonstrations, the decadal survey testbed (DST) at NASA’s Jet Propulsion Laboratory, has successfully demonstrated instrumental contrast over a 10% fractional bandpass within an annular dark zone extending from to , where is the central imaging wavelength and is the diameter of the telescope aperture, using two DMs each with actuators.4 The baseline requirements for an HWO-like mission will be considerably steeper, using the Large UV/Optical/Infrared Surveyor (LUVOIR) and Habitable Exoplanet Observatory (HabEx) flagship mission concepts formulated for the Astro2020 Decadal Survey5,6 as representative reference designs. The HabEx design features two DMs, and a search area with a maximal outer radius of over a 20% bandpass—nearly double the total actuator count, a factor of 4 increase in search radius, and a factor of two increase in control bandwidth. Meanwhile, the LUVOIR Architecture “A” reference design includes a pair of DMs and a dark zone with a maximal outer radius over a 10% fractional bandpass or wider. Because the number of detector pixels in the dark zone scales with the area of the dark zone rather than its radius, these parameters correspond to an increase by a factor of 32 (Habex) and 512 (LUVOIR-A) of the worst-case Jacobian dimensionality, for each control wavelength, compared to DST. In recent work, we formulated an alternative wavefront control framework that iteratively compute DM updates using gradient-based optimization techniques, which eliminates the need to calculate and manipulate the Jacobian.7 Our approach is based in part on a numerical technique known as algorithmic differentiation (AD),8 to calculate the gradients of the cost function for optimal control accurately and efficiently. We described an AD-based counterpart to SM, which we termed AD penalty stroke minimization (AD-PSM), and compared it to SM in simulations with a small-angle LUVOIR design. Our results indicated superior computational efficiency with AD-PSM and comparable starlight suppression performance. While the CPU time and memory consumption of SM grew superlinearly with actuator count, the increase in both for AD-PSM was negligible (e.g., with actuators, AD-PSM utilized 95% less memory and CPU time), suggesting that iterative methods are a promising alternative to Jacobian-based techniques for on-orbit WFS&C with high actuator counts and large dark zones. In this paper, we report on the first experimental demonstration of AD-PSM as well as an AD-based counterpart to the EFC algorithm, termed AD-EFC, using the High-contrast Imager for Complex Aperture Telescopes (HiCAT) at the Space Telescope Science Institute in Baltimore, Maryland. We benchmark the contrast performance of AD-PSM and AD-EFC as a function of several key parameters including regularization and the termination tolerance of the nonlinear optimizer, and compare it to SM and EFC. This paper is structured as follows. In Sec. 2, we review concepts from our earlier work, including AD and the mathematical formulation of AD-PSM and AD-EFC. In Sec. 3, we provide an overview of HiCAT and discuss our experimental setup, including algorithm implementation details that are pertinent to our demonstration. In Sec. 4, we present and discuss our experimental results. Finally, in Sec. 5, we provide our conclusions and briefly describe our planned future work. 1.1.NotationIn this paper, our principal concern is with algorithms that operate on discrete vector-valued quantities, which are represented in boldface. Many of these quantities vary as a function of control iteration, and are denoted with the subscript . These may be truly discrete, such as the vector of DM actuator command updates, or may represent discretizations of functions of continuous spatial variables, such as electric fields. We denote as a column vector, its transpose as a row vector, and as its Euclidean length. For complex-valued quantities, denotes the Hermitian transpose. Scalar quantities are denoted in ordinary (i.e., non-boldface) typographic weight. By convention, the derivative of a scalar with respect to a column vector is a row vector, i.e. where is the ’th element of . Consequently, the derivative of a column vector with respect to another column vector is a matrix of row vectors:2.Wavefront Control Using Algorithmic DifferentiationThe goal of the WFS&C loop in coronagraphy is to drive starlight within the dark zone toward zero over time so that a faint orbiting companion becomes detectable against the reduced starlight background. Each iteration of the WFS&C loop, indexed by the integer , consists of two steps: (1) an estimation step, in which an estimate of the true aberrated electric field is formed within the dark zone from focal-plane intensity measurements, and (2) a control step, in which the DM correction is updated to reduce the energy in , as shown in Fig. 1. In this paper, we focus principally on the control step. Modern model-based wavefront control algorithms find the DM correction update by minimizing some cost function with respect to . Usually, is constructed to trade off between minimizing starlight and minimizing the size of the correction, which helps to regularize the problem and stabilize the solution. In general, the true relationship between the starlight in the dark zone and the DM correction is highly nonlinear and nonconvex, owing to the fact that the DMs impart phase-only corrections of the form at or near the coronagraph entrance pupil. However, when the optical aberrations are small, we can approximate the true electric field in the coronagraph entrance pupil with a first-order Taylor series expansion about a small update to the DM commands. In this case, the corrected electric field in the dark zone has the form where is the estimate of the aberrated dark-zone electric field produced by the estimation step, and is the change in electric field resulting from the update to the DM correction. We can also write in the form where is the Jacobian matrix with dimensions , is the number of dark-zone pixels, and is the total number of controllable DM actuators. The intensity from the corrected electric field, integrated over the dark zone, can be written in terms of the Jacobian asThis is a quadratic function of , meaning that under this approximation, there exists a unique, optimal DM correction that minimizes the dark-zone starlight. SM and EFC utilize the relationship in Eq. (5) to derive closed-form expressions for this optimal correction in terms of that can be evaluated by solving a linear system of dimension , as illustrated in Fig. 2. As we discuss in Appendix B, this is equivalent to minimizing the cost function using Newton’s method with an exact Hessian matrix. Alternatively, it is possible to find the DM correction by minimizing the cost function with respect to iteratively, rather than analytically, using gradient-based nonlinear optimization as shown in Fig. 3. To do so eliminates the need to calculate the Jacobian matrix, but requires a way of calculating the gradient vector . Reverse-mode algorithmic differentiation (RMAD) provides a way of doing so that is both computationally efficient and accurate, in the sense that the derivatives computed by RMAD are accurate to machine precision and do not utilize finite-difference approximations.8 The basic principle of RMAD is that any function that can be written down as a sequence of differentiable operations, called the forward model, can be transformed mechanistically to construct a related function, called the adjoint model, that evaluates the derivative of the forward model with respect to any of the intermediate variables encountered during its evaluation, as well as its inputs. Figure 4 illustrates this procedure for the wavefront control cost function . We refer the reader to our earlier work for a more comprehensive discussion of the principles of RMAD and its application to wavefront control.7 2.1.Stroke Minimization: From Lagrange Multipliers to Penalty MethodSM finds the smallest DM correction that achieves a desired level of stellar intensity integrated over the dark zone, denoted by .2 It is solved by finding the stationary point of the Lagrangian function i.e., a point such that , where is the DM actuator command update, is the corrected electric field in the dark zone, , and is the Lagrange multiplier. Because this stationary point is a saddle point, it cannot be reached by minimizing with respect to . Instead, one chooses a fixed starting value for the Lagrange multiplier, , and minimizes with respect to to find a corresponding DM solution . If the constraint is not satisfied, a larger value is selected and this procedure is repeated. For any fixed value of , the minimum of Eq. (6) is given in closed form in terms of the Jacobian matrix : where is the identity matrix and is the estimate of the aberrated electric field.AD-PSM has the same goal, but instead iteratively minimizes the cost function where is a parameter that encodes the relative importance of minimizing actuator stroke and driving the integrated intensity toward the target value. The minimum of with respect to is coincident with the stationary point of ,9 meaning that the DM solutions obtained by SM and AD-PSM are, in principle, identical. In this paper, we choose in each WFS&C iteration as where , known as the penalty parameter, is a constant set by the experimenter. The denominator scales the cost function to be invariant to the energy in the dark-zone electric field, which helps in practice to obtain good solutions in all WFS&C iterations as gradually converges toward zero.2.2.Electric Field ConjugationEFC attempts to drive the dark zone electric field toward zero, with Tikhonov regularization to mitigate ill-conditioning caused by the presence of actuators that are completely or partially obscured by pupil features. Its cost function for a single correction wavelength is given as where is the Tikhonov regularization matrix. In the most common case, one chooses , making EFC identical to SM with a fixed Lagrange multiplier . For this case, its solution can be obtained using Eq. (7). The general solution for the Jacobian-based formulation of EFC is given asIn AD-EFC, one iteratively minimizes a scaled version of the EFC cost function Similarly to AD-PSM, the scaling factor makes the AD-EFC cost function invariant to the energy in the dark-zone electric field to aid in obtaining numerical solutions in practice. The RMAD adjoint model for is provided in Appendix D. 3.Experimental SetupIn this section, we provide an overview of the HiCAT testbed and provide details about the implementation of AD-PSM and AD-EFC, the reference Jacobian-based implementations of SM and EFC, and the electric field estimation algorithm used in the estimation step of the WFS&C loop. 3.1.HiCAT TestbedHiCAT is a testbed dedicated to technology demonstrations for coronagraphy on segmented-aperture space observatories, with the intent of being directly traceable to a future HWO-like mission.10–17 These technologies include Lyot coronagraphy, high-order WFS&C for generating and stabilizing dark zones, and low-order wavefront sensing (LOWFS).18 HiCAT operates in a mid-contrast regime ( to ) which approaches the limit achievable outside of a vacuum environment. The testbed is equipped with two Boston Micromachines Kilo-DMs for high-order sensing and control, with 952 actuators each, making it suitable for our proof-of-concept demonstrations. One DM is placed in a plane conjugate to the HiCAT entrance pupil, while the second DM is placed farther along the optical axis, corresponding to a Fresnel number at a wavelength of 638 nm. This configuration enables simultaneous control of amplitude and phase aberrations over a dark zone that extends over both halves of the image plane. We conducted our experiments using a Thorlabs MCLS1 laser diode source, which emits monochromatic light with a central wavelength . HiCAT additionally has an IrisAO segmented DM with 37 hexagonal segments, each with controllable piston/tip/tilt, to act as a telescope pupil simulator and to inform experimental efforts devoted to segment-level tolerancing and stabilization.19,20 Figure 5 shows a simplified system layout of HiCAT, including the primary imaging beam path as well as several additional beam paths used by the LOWFS and metrology subsystems. Our experiments on HiCAT utilized a classical Lyot coronagraph (CLC) design with a hexagonal entrance pupil mask designed to mask the extreme edges of the IrisAO, a circular focal-plane mask, and a circular Lyot stop. Figure 6 shows an overlay of the CLC pupil masks along with a simulated stellar point-spread functions (PSFs). Figure 7 shows example experimental PSFs obtained before and after closed-loop WFS&C using SM, along with the corresponding DM commands. 3.2.Algorithm ImplementationWe developed a differentiable model of HiCAT using Python, including a hand-derived adjoint model. To facilitate testing, our model was comprised of several sub-modules each with its own individual forward and adjoint models.
We refer the reader to our earlier work for detailed descriptions of the operations in the forward and adjoint models for the latter two sub-modules.7 A pre-existing numerical model of HiCAT based on the POPPY framework22,23 served as a reference for calibrating our differentiable model. 3.2.1.Optimization algorithmWe used the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm,9 as implemented in the SciPy package,24 as the optimization algorithm for AD-PSM and AD-EFC. L-BFGS is a quasi-Newton algorithm, meaning that it uses the gradient vectors collected during optimization to approximate the inverse Hessian matrix of second derivatives, involving substantially less effort than Newton’s method with an exact Hessian (see Appendix B). All numerical optimization algorithms have a termination criterion that determines when the algorithm has reached a location in the parameter space that is sufficiently close to a local optimum. In the L-BFGS implementation used by SciPy, the termination criterion is set by a tolerance parameter defined as the magnitude of the largest element of the gradient vector. As , L-BFGS carries out a greater number of optimization iterations to terminate closer to the minimum. Choosing a larger value of reduces total computation with the tradeoff of a less-optimal DM solution. However, as our experiments showed, in certain instances this can help to regularize the solution by terminating before L-BFGS converges to an overly aggressive DM correction that would otherwise make the WFS&C loop unstable. The termination criterion presents an additional consideration not present in SM and EFC which exactly minimize their respective cost functions in each WFS&C iteration. The value of the tolerance parameter should be chosen to optimally trade-off between accuracy and computational effort. We discuss this in further detail below. 3.3.Estimation AlgorithmWe used the pairwise probe estimator3 for the estimation step in all experiments. The pairwise estimator forms a least-squares estimate of the focal-plane electric field by applying a series of probing DM commands to generate probing electric fields that interfere with . The data vector for the least-squares estimate is formed by differencing the images resulting from and . The estimate of the ’th pixel in the dark zone is then found by finding the least-squares solution of the system where .We generated four DM probe functions that were optimized to produce probing fields of the form where and is the sign of the focal-plane coordinate. These probing fields, which have anti-Hermitian symmetry, correspond to purely real commands applied to the in-pupil DM, which we can calculate via a regularized least-squares approach for each , where is the Tikhonov regularization term for the probe calculation and is the Jacobian for the in-pupil DM. This is identical to the EFC problem in Eq. (10) but with the probing field on the right-hand side instead of , and has the solutionFor the experiments with SM and EFC, we applied Eq. (16) to compute probe commands for pairwise probe estimation. For experiments with AD-PSM and AD-EFC, we iteratively solved Eq. (15) using the AD-EFC framework. In all cases, we set the Tikhonov regularization parameter for probe generation as ; for the iterative probes, we set the optimizer tolerance to . Figure 8 shows the probe commands along with the corresponding magnitude and phase of . 4.Experimental Results and DiscussionWe conducted a series of experiments to compare the contrast performance of AD-PSM and AD-EFC relative to SM and EFC, respectively. All experiments used an annular control region extending from to , where is the central wavelength of the Thorlabs MCLS1 laser diode and is the Lyot stop diameter. Each experiment consisted of 80 WFS&C iterations. For each experiment, we computed the median and 10th and 90th percentile values of the spatially averaged dark-zone contrast values for the final 50 iterations, which captured the steady-state performance of the WFS&C loop with each algorithm after converging to deepest-possible contrast. Figures 9 and 10 show example contrast vs. iteration time series with AD-PSM/SM and AD-EFC/EFC, respectively, as well as the PSF from the iteration with deepest contrast using AD-PSM and AD-EFC. In both cases, the convergence properties of AD-PSM and AD-EFC are nearly identical to their Jacobian-based counterparts, validating our proposed approach. For SM and AD-PSM, we chose as the contrast target, i.e., a factor of two improvement iteration-over-iteration in spatially integrated dark zone contrast. For the Lagrange multiplier line search in SM, we let . For AD-PSM, we tested combinations of the penalty parameter and the nonlinear optimization convergence tolerance . We determined these values based on simulations and prior WFS&C experiments on HiCAT. For EFC and AD-EFC, we selected the Tikhonov regularization matrix as , with for the first 30 WFS&C iterations and thereafter. We determined through previous experiments with EFC that maintaining an aggressive value throughout the experiment was detrimental to the stability of the control loop after reaching the steady-state regime. For AD-EFC, we used the same set of values as AD-PSM. For AD-EFC and AD-PSM, we compared the value of the cost function for two different starting guesses for the DM correction: , i.e., the solution of the previous WFS&C iteration, and . The starting guess with the lower of the two cost function values was then selected. In all cases was ultimately chosen as the starting guess. Figures 11 and 12 show the statistics of the post-convergence spatially averaged dark-zone contrast achieved with AD-PSM and AD-EFC for each combination of regularization ( or ) and optimization tolerance , respectively, compared to reference experiments with SM and EFC. For all parameter combinations, AD-PSM and AD-EFC equaled the contrast performance of SM and EFC, respectively. Moreover, we observed no strongly identifiable trends in achievable contrast as a function of the algorithm parameters, indicating that in all cases, AD-PSM and AD-EFC reached the contrast floor imposed by environmental instabilities. For full contrast versus iteration curves for all experiments, we refer to Fig. 13 in Appendix A. 4.1.DiscussionOur experiments were aimed at exploring a relevant subset of the space of free parameters for each algorithm, namely the nonlinear optimization convergence tolerance , the Tikhonov regularization for AD-EFC, and the penalty parameter for AD-PSM. In principle, each parameter affects the attainable contrast of the WFS&C loop, but in subtly different ways, which we discuss here. As discussed in Sec. 3.2.1, determines the effort that the nonlinear optimization algorithm will expend to find a solution close to the true minimum of the cost function. A smaller value of corresponds to a larger number of optimization iterations before termination, and ultimately a larger time interval between DM updates. In principle, on a system, such as HiCAT, where environmental disturbances cause the aberrated electric field to evolve over time scales on the order of seconds or faster, an excessive delay between the estimation step and application of the DM correction can cause a degradation in achievable contrast. However, we observed no meaningful degradation for smaller values of . In a real space-borne system, this is unlikely to be a significant consideration because of the much greater electric field stability, and because the total duration of each WFS&C iteration will be dominated by the exposure times needed for the estimation step. The value can also serve as an auxiliary form of regularization, by terminating the optimization algorithm before it reaches an overly aggressive DM correction caused by a noisy electric field estimate, insufficient regularization using or , or both. For instance, in Fig. 12, with (rightmost panel), EFC diverged altogether whereas AD-EFC did not. On the other hand, choosing too large can impose an effective contrast floor by limiting the ability of the optimization algorithm to converge to appropriately strong corrections. We determined in simulation that this was the case for . For EFC and AD-EFC, smaller values of the Tikhonov regularization correspond to more aggressive correction of the electric field, with the tradeoff of increased sensitivity to small perturbations in the estimated electric field, which reduces the stability of the WFS&C loop. For AD-PSM and SM, the aggressiveness of the WFS&C control loop is set first and foremost by the target contrast level . For SM, a smaller corresponds to a larger value of the Lagrange multiplier , and therefore more aggressive correction (recalling from Sec. 2.2 that ). For AD-PSM, as tends toward infinity, the goal of achieving the target contrast is enforced more strongly; is interpretable as tuning the aggressiveness of the control loop up to a maximum level imposed by . Our experiments indicated that over the range of values considered, the performance of the WFS&C loop was insensitive to the value of . In principle, because and have similar effects on the performance of AD-EFC, there potentially exists a single combination of the two parameters that is optimal in terms of contrast, or perhaps a continuum of combinations with similar performance, with a change in compensated by a change to in the opposite direction. The same holds true for AD-PSM. 5.Conclusions and Future WorkIn this paper, we reported the first experimental demonstrations of two AD-based wavefront control algorithms, AD-PSM and AD-EFC, using the HiCAT testbed. To within statistical uncertainty, AD-PSM and AD-EFC equaled their Jacobian-based counterparts in dark-zone contrast for all combinations of parameters that we tested. These demonstrations pave the way for future experimental validation at higher contrast. The analysis in our earlier work indicated that the largest computational gains are realized for DMs with more than actuators. The DMs currently in use on HiCAT have 34 actuators across the diameter of the active region or 952 actuators per DM in total, which is comparatively low. Therefore, our goal was to validate the fundamental capability of AD-PSM and AD-EFC to reach deep contrast, rather than to demonstrate improved computational efficiency. In this work and in our earlier work, we utilized the L-BFGS optimization algorithm to minimize the wavefront control cost function because of its desirable convergence properties compared to first-order optimization methods and low storage requirements. Despite this, L-BFGS is known to converge slowly for poorly conditioned problems compared to methods that utilize the exact Hessian matrix.9 However, there exist alternative methods, such as truncated Newton algorithms, with excellent convergence properties for quadratic or nearly-quadratic cost functions, that require only the ability to evaluate Hessian-vector products, rather than the Hessian matrix itself.9 Hessian-vector products can be evaluated using AD in a similar fashion to gradients. Future work will explore such methods as a potentially faster approach than L-BFGS. 6.Appendix A: Convergence Data for All ExperimentsIn Sec. 4, we showed the time series of spatially averaged contrast vs. WFS&C iteration for an AD-PSM with and compared to an experiment with SM (Fig. 9), as well as an AD-EFC experiment with and compared to an EFC experiment with the same value of (Fig. 10). We then summarized the steady-state time series statistics (median, 10th percentile, and 90th percentile) for a series of nine AD-PSM and nine AD-EFC runs with different combinations of and , respectively (Figs. 11 and 12). In this section, Fig. 13 shows the full time series of contrast versus iteration for all 18 AD-PSM and AD-EFC experiments compared to their respective SM and EFC reference experiments. 7.Appendix B: Equivalence of Jacobian-Based Solutions and Newton’s MethodFinding solutions for EFC and SM using the Jacobian matrix is equivalent to minimizing their respective cost functions with respect to using Newton’s method. Newton’s method is a second-order optimization technique that utilizes second-derivative information about the cost function, given by the local Hessian matrix at any point in the DM command parameter space. Given an initial guess for the solution , the full Newton update is given as9 For cost functions that are exactly quadratic, including EFC and SM, Newton’s method converges in a single iteration. For general numerical optimization problems, Newton’s method is rarely used in practice because forming the Hessian matrix explicitly is expensive. On the other hand, quasi-Newton methods, such as the BFGS algorithm or the limited-memory BFGS (L-BFGS) variant, can approximate using changes in over successive optimization iterations. As a consequence, they are substantially less computationally expensive. Although quasi-Newton algorithms do not converge as rapidly as Newton’s method, and in particular can converge slowly for poorly conditioned problems, they are nonetheless superior to purely first-order methods such as steepest descent.9 As we show below, the Hessian matrix for EFC and SM has an analytic expression in terms of the Jacobian given by , where is a symmetric, positive-definite matrix given by for EFC and for SM. Our approach is to replace this full Newton iteration, requiring computation of the Jacobian, by a series of cheaper quasi-Newton iterations instead, requiring only computation of the gradient , which we achieve using AD. The cost function for the EFC algorithm described in Sec. 2 can be written in the form where is a regularization matrix; for SM, . For now, we will restrict our attention to the EFC algorithm, but note that the result derived here applies equally as well to minimizing the Lagrangian function for SM with respect to the DM correction .Expanding Eq. (18) and recalling that , the cost function has the form where we use the fact that is purely real to discard . The Jacobian-based solution is found by finding such that vanishes. We therefore begin by writing down the gradient:This is a linear system of equations that we can solve for the optimal correction : The Hessian matrix is given as Since both terms in are positive definite, is positive definite as well, confirming that the solution is a minimum of the cost function. We will now show that the solution obtained above is the same as the solution obtained by applying a single iteration of Newton’s method to the EFC cost function. Let be an initial guess for the solution. Newton’s method produces an iterate of the form9 Combining Eqs. (20) and (22) Inserting back into Eq. (23) Inserting the definition of the Hessian matrix from Eq. (22), we see that the Newton iterate is identical to the analytical solution in Eq. (21) As we described earlier, the same result holds if minimizing in Sec. 2.1 with respect to . 8.Appendix C: Fast Convolutional DM ModelConsider a DM with actuators along each side (i.e., ) whose surface can be modeled as a linear superposition of identical influence functions For fixed actuator spacing along the horizontal and vertical directions, we can rewrite the above summation as a convolution between a weighted Dirac comb function and the influence function Fourier transforming both sides transforms the convolution operation into a multiplication: We define and : We next define the discretized surface and influence function arrays and such that so that where denotes element-wise multiplication. Finally, we define the vectors of actuator center coordinates and such that and , as well as the array of actuator commands for whichWe can write this more succinctly as the following sequence of element-wise and matrix products where exponentiation is performed element-wise and denotes the outer product of the vectors and .The term in the parentheses is more commonly referred to as the matrix Fourier transform21 or matrix triple product Fourier transform25 of , which we denote as follows: yieldingThe final step is to compute an inverse discrete Fourier transform to obtain the desired discrete DM surface , which is carried out most efficiently using the inverse fast Fourier transform, yielding the final result For DMs whose active actuators are a subset of the grid modeled above, only the elements of corresponding to active actuators are set to nonzero values. 8.1.C.1 Adjoint ModelThe algorithm described in the previous section computes the DM surface resulting from a two-dimensional array of actuator commands , under the assumptions that the influence function is identical across all actuators and that the surface can be approximated as a linear superposition of the actuator influence functions. In the context of gradient-based nonlinear optimization using RMAD, we can derive an adjoint model for this algorithm that computes the derivative for some scalar cost function , given the derivative with respect to the surface . To begin, we break the forward model into the following sequence: This leads to the following adjoint model, following the RMAD adjoint variable rules in Refs. 7 and 26: where denotes element-wise complex conjugation and IMFT denotes the inverse matrix Fourier transform:Combining these expressions, the adjoint model is then 9.Appendix D: Adjoint Model for EFC Cost FunctionIn Sec. 2.2, we describe the cost function for the EFC algorithm for a single correction wavelength. Here, we derive its RMAD adjoint model, which computes the derivative . We begin by writing the cost function as a series of operations evaluated sequentially: We now apply the RMAD gradient rules7,26 to each step in reverse order to derive the adjoint model, letting for any variable : Combining and simplifying, we see that the desired gradient is given as This gradient is passed to the next block of the adjoint model, which in this case is the adjoint model for propagation through the coronagraph, which, referring to Fig. 4, evaluates the derivatives of with respect to the surfaces and , respectively. We refer to our earlier work for a derivation of the coronagraph propagation adjoint model.7 Code and Data AvailabilityNASA regulations govern the release of data and source code. Please contact Scott Will at scott.d.will@nasa.gov to request supporting materials. HiCAT makes use of the NumPy,27,28 Matplotlib,29,30 AstroPy,31 SciPy,24 scikit-image,32 Pandas,33,34 ImageIO,35 Photutils,36 HCIPy,37 POPPY,22,23 and CatKit38 packages. AcknowledgmentsThe authors are especially thankful to the extended HiCAT team (over 50 people) who have worked over the past several years to develop this testbed. This work was supported in part by the National Aeronautics and Space Administration (Grant No. 80NSSC19K0120) issued through the Strategic Astrophysics Technology/Technology Demonstration for Exoplanet Missions Program (SAT-TDEM; PI: R. Soummer). E.H.P. was supported by the NASA Hubble Fellowship (Grant No. HST-HF2-51467.001-A) awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Incorporated, under NASA (Grant No. NAS5-26555). I.L. acknowledges the support by a postdoctoral grant issued by the Centre National d’Êtudes Spatiales (CNES) in France. This paper was adapted from an earlier SPIE conference proceedings.39 ReferencesM. Perryman, The Exoplanet Handbook, 2nd ed.Cambridge University Press(
(2018). Google Scholar
L. Pueyo et al.,
“Optimal dark hole generation via two deformable mirrors with stroke minimization.,”
Appl. Opt., 48 6296
–6312 https://doi.org/10.1364/AO.48.006296 APOPAI 0003-6935
(2009).
Google Scholar
A. Give’on et al.,
“Closed loop, DM diversity-based, wavefront correction algorithm for high contrast imaging systems,”
Opt. Express, 15
(19), 12338
–12343 https://doi.org/10.1364/oe.15.012338 OPEXFF 1094-4087
(2007).
Google Scholar
B.-J. Seo et al.,
“Testbed demonstration of high-contrast coronagraph imaging in search for Earth-like exoplanets,”
Proc. SPIE, 11117 111171V https://doi.org/10.1117/12.2530033 PSISDG 0277-786X
(2019).
Google Scholar
The LUVOIR Study Team,
“LUVOIR,”
(2019). Google Scholar
The Habitable Exoplanet Observatory Study Team,
“Habitable Exoplanet Observatory Final Report,”
(2019). Google Scholar
S. D. Will, T. D. Groff and J. R. Fienup,
“Jacobian-free coronagraphic wavefront control using nonlinear optimization,”
J. Astron. Telesc. Instrum. Syst., 7 019002 https://doi.org/10.1117/1.JATIS.7.1.019002
(2021).
Google Scholar
A. Griewank and A. Walther, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, 2nd ed.Society for Industrial and Applied Mathematics(
(2008). Google Scholar
J. Nocedal and S. J. Wright, Numerical Optimization, Springer-Verlag, New York
(1999). Google Scholar
M. N’Diaye et al.,
“High-contrast imager for complex aperture telescopes (HiCAT): 1. testbed design,”
Proc. SPIE, 8864 88641K https://doi.org/10.1117/12.2023718 PSISDG 0277-786X
(2013).
Google Scholar
M. N’Diaye et al.,
“High-contrast imager for complex aperture telescopes (HICAT): II. Design overview and first light results,”
Proc. SPIE, 9143 914327 https://doi.org/10.1117/12.2056694 PSISDG 0277-786X
(2014).
Google Scholar
M. N’Diaye et al.,
“High-contrast imager for complex aperture telescopes (HiCAT): 3. First lab results with wavefront control,”
Proc. SPIE, 9605 96050I https://doi.org/10.1117/12.2188497 PSISDG 0277-786X
(2015).
Google Scholar
L. Leboulleux et al.,
“High-contrast imager for Complex Aperture Telescopes (HiCAT). 4. Status and wavefront control development,”
Proc. SPIE, 9904 99043C https://doi.org/10.1117/12.2233640 PSISDG 0277-786X
(2016).
Google Scholar
R. Soummer et al.,
“High-contrast imager for complex aperture telescopes (HiCAT): 5. First results with segmented-aperture coronagraph and wavefront control,”
Proc. SPIE, 10698 106981O https://doi.org/10.1117/12.2314110 PSISDG 0277-786X
(2018).
Google Scholar
C. Moriarty,
“High-contrast imager for complex aperture telescopes (HiCAT): 6. software control infrastructure and calibration,”
Proc. SPIE, 10698 1069853 https://doi.org/10.1117/12.2314058 PSISDG 0277-786X
(2018).
Google Scholar
R. Soummer et al.,
“High-contrast imager for complex aperture telescopes (HiCAT): 6. Two deformable mirror wavefront control (Conference Presentation),”
Proc. SPIE, 11117 111171Y https://doi.org/10.1117/12.2530299 PSISDG 0277-786X
(2019).
Google Scholar
R. Soummer et al.,
“High-contrast imager for complex aperture telescopes (HiCAT): 7. Dark zone demonstration with fully segmented aperture coronagraph,”
Proc. SPIE, 11823 118230T https://doi.org/10.1117/12.2594684 PSISDG 0277-786X
(2021).
Google Scholar
R. Pourcelot et al.,
“Experimental validation of active control of low-order aberrations with a Zernike sensor through a Lyot coronagraph,”
Proc. SPIE, 11823 118231M https://doi.org/10.1117/12.2594609 PSISDG 0277-786X
(2021).
Google Scholar
I. Laginja et al.,
“Wavefront tolerances of space-based segmented telescopes at very high contrast: Experimental validation,”
Astron. Astrophys., 658 A84 https://doi.org/10.1051/0004-6361/202142150 AAEJAF 0004-6361
(2022).
Google Scholar
S. M. Redmond et al.,
“Dark hole maintenance results for segmented aperture wavefront error drift in a high contrast space coronagraph,”
Proc. SPIE, 11823 118231K https://doi.org/10.1117/12.2594647 PSISDG 0277-786X
(2021).
Google Scholar
R. Soummer et al.,
“Fast computation of Lyot-style coronagraph propagation,”
Opt. Express, 15 15935
–15951 https://doi.org/10.1364/OE.15.015935 OPEXFF 1094-4087
(2007).
Google Scholar
M. D. Perrin et al.,
“Simulating point spread functions for the James Webb Space Telescope with WebbPSF,”
Proc. SPIE, 8442 84423D https://doi.org/10.1117/12.925230 PSISDG 0277-786X
(2012).
Google Scholar
M. Perrin et al.,
“POPPY: Physical Optics Propagation in PYthon,”
(2016). https://ascl.net/1602.018 Google Scholar
P. Virtanen et al.,
“SciPy 1.0: fundamental algorithms for scientific computing in Python,”
Nat. Methods, 17 261
–272 https://doi.org/10.1038/s41592-019-0686-2 1548-7091
(2020).
Google Scholar
A. S. Jurling, M. D. Bergkoetter and J. R. Fienup,
“Techniques for arbitrary sampling in two-dimensional Fourier transforms,”
J. Opt. Soc. Am. A, A35 1784
–1796 https://doi.org/10.1364/JOSAA.35.001784 JOAOD6 0740-3232
(2018).
Google Scholar
A. S. Jurling and J. R. Fienup,
“Applications of algorithmic differentiation to phase retrieval algorithms.,”
J. Opt. Soc. Am. A, 31
(7), 1348
–59 https://doi.org/10.1364/JOSAA.31.001348 JOAOD6 0740-3232
(2014).
Google Scholar
T. E. Oliphant, A Guide to NumPy, 1 Trelgol Publishing, USA
(2006). Google Scholar
S. van der Walt, S. C. Colbert and G. Varoquaux,
“The NumPy array: a structure for efficient numerical computation,”
Comput. Sci. Eng., 13 22
–30 https://doi.org/10.1109/MCSE.2011.37
(2011).
Google Scholar
J. D. Hunter,
“Matplotlib: a 2D graphics environment,”
Comput. Sci. Eng., 9 90
–95 https://doi.org/10.1109/MCSE.2007.55
(2007).
Google Scholar
T. A. Caswell et al.,
“matplotlib/matplotlib: Rel: v3.3.3,”
(2020). Google Scholar
Astropy Collaboration,
“The astropy project: building an open-science project and status of the v2.0 core package,”
Astrophys. J., 156 123 https://doi.org/10.3847/1538-3881/aabc4f ASJOAB 0004-637X
(2018).
Google Scholar
S. van der Walt et al.,
“scikit-image: image processing in Python,”
PeerJ, 2 e453 https://doi.org/10.7717/peerj.453
(2014).
Google Scholar
Wes McKinney,
“Data structures for statistical computing in python,”
in Proc. 9th Python in Sci. Conf.,
56
–61
(2010). Google Scholar
J. Reback et al.,
“pandas-dev/pandas: Pandas 1.1.4,”
(2020). Google Scholar
S. Silvester et al.,
“imageio/imageio v2.8.0,”
(2020). Google Scholar
L. Bradley et al.,
“astropy/photutils: 1.0.1,”
(2020). Google Scholar
E. H. Por et al.,
“High Contrast Imaging for Python (HCIPy): an open-source adaptive optics and coronagraph simulator,”
Proc. SPIE, 10703 1070342 https://doi.org/10.1117/12.2314407 PSISDG 0277-786X
(2018).
Google Scholar
J. Noss et al.,
“spacetelescope/catkit: v0.36.1,”
(2021). Google Scholar
S. D. Will et al.,
“Wavefront control with algorithmic differentiation on the HiCAT testbed,”
Proc. SPIE, 11823 118230V https://doi.org/10.1117/12.2594283 PSISDG 0277-786X
(2021).
Google Scholar
BiographyScott D. Will is an optical engineer at NASA Goddard Space Flight Center in Greenbelt, Maryland, USA. He received a BS degree in electrical engineering from the University at Buffalo in 2015 and a PhD in optics from the University of Rochester in 2021. As a graduate student, he was also a member of the Russell B. Makidon Optics Laboratory team at the Space Telescope Science Institute in Baltimore, Maryland, United States. His research interests include wavefront sensing and control, computational imaging, and astronomical instrumentation. Emiel H. Por is an NHFP Sagan fellow at the Russell B. Makidon Optics Laboratory at the Space Telescope Science Institute in Baltimore, United States. He received his BSc degree in astronomy and physics, his MSc degree in astronomical instrumentation, and his PhD in astronomy from Leiden University, The Netherlands. His research interests include coronagraphy, wavefront sensing and control, and high-contrast imaging. Ananya Sahoo is a staff scientist at STScI. She currently splits her time between infrastructure development for the HiCAT testbed and studying wavefront-error induced by thermal disturbances for future segmented space telscopes. She received her PhD in 2020 from The Graduate University for Advanced Studies, SOKENDAI in alliance with National Astronomical Observatory of Japan. She previously worked on the photometric and astrometric calibration for the SCExAO instrument at the Subaru Telescope. Iva Laginja is a postdoctoral fellow with CNES, the French space agency and previous member, now collaborator, of the Russell B. Makidon Optics Laboratory at the Space Telescope Science Institute in Baltimore, United States. She received an MSc degree in astronomy and instrumentation at Leiden University, Netherlands and her doctorate from Paris Observatory in 2021. Her research interests include wavefront sensing and control, high contrast imaging, and coronagraphy for exoplanet detection. Raphaël Pourcelot is a postdoc at STScI within the Russell B. Makidon Optics Laboratory. He received an MSc degree in astronomy from Aix-Marseille Université and an engineering degree at Institut d’Optique Graduate School in Université Paris-Saclay. He received his PhD from Université Côte d’Azur in Nice, France, in 2022. He is mainly working on wavefront sensing and control for high-contrast imaging for exoplanet detection and Zernike wavefront sensor applications in particular. Susan F. Redmond is a PhD student at Princeton University in Princeton, United States, and collaborates with the Russell B. Makidon Optics Laboratory. She previously obtained her BEng degree from Memorial University of Newfoundland (2015) and her MEng and MASc degrees in aerospace engineering from the University of Toronto in 2016 and 2018, respectively. Her current research is split between the optical and thermal design for balloon-borne telescopes and developing focal-plane wavefront sensing and control algorithms for high-contrast imaging. Laurent Pueyo is an associate astronomer at STScI. He received his doctorate from Princeton University in 2008 and previously worked as a NASA Fellow at JPL, California and as a Sagan Fellow at JHU. His research focuses on imaging faint planets around nearby stars. He has pioneered optical technologies that allow astronomers to take images of other planetary systems, and has developed data analysis methods now standardly used to study extrasolar planets. Tyler D. Groff received his BS degree in mechanical engineering and astrophysics from Tufts University and his PhD in mechanical and aerospace engineering from Princeton University under an NESSF fellowship. He was the lab manager of the Princeton High Contrast Imaging Laboratory and the CHARIS instrument at Subaru Telescope. He is currently the lead engineer at Goddard Space Flight Center for the spectroscopy and polarization modes on the Roman Space Telescope Coronagraph Instrument. James R. Fienup received his AB degree from Holy Cross College and MS and PhD degrees in applied physics from Stanford University, where he was a National Science Foundation graduate fellow. After performing research at ERIM, he became the Robert E. Hopkins Professor of Optics at the University of Rochester. He is a fellow of SPIE and OSA, a member of the National Academy of Engineering, and a recipient of SPIE’s Rudolf Kingslake Medal and Prize. Remi Soummer is an associate astronomer at STScI. He received his doctorate from the University of Nice in 2002 and has been working in the field of high contrast imaging and instrumentation for the detection and characterization of exoplanets ever since. He is currently the head of the Russell B. Makidon Optics Laboratory and working on a coronagraph demonstration for future large segmented aperture space telescopes. |