1. Introduction

Each of the four major mission concepts currently under study in support of the Astro 2020 decadal review1 has a major challenge to overcome before that mission can be judged executable. This existential problem is called the big fundamental problem (BFP) of the concept. The BFP for Lynx is the cost-effective manufacture of the Lynx x-ray mirror assembly (XMA) at a quality sufficient to meet the science objectives of the mission. The XMA is made of a very large number of parts: approximately 40,000 mirror elements and 111,000 posts.2 The technology to manufacture the optical elements and posts is making excellent progress and is reported elsewhere in this volume.2 That technical maturation is necessary but not sufficient to make Lynx viable. A strong case for a determinative cost and schedule for the revolutionary XMA is needed to provide the sufficiency of the argument. This paper describes the analytic model and a few examples of the analyses possible, to demonstrate that such quantitative and determinative management is possible and executable.

All large hardware programs face the challenge of managing cost and schedule. In order to quantitatively manage any aspect of system performance, be it image quality, mass, power, thermal margin, or production schedule, an analytic model is needed. The technical performance metrics have models and tools for their management; schedule and cost typically do not. This paper will show that there is a solid analytic framework for establishing a rigorous management process to produce the XMA in a cost-efficient and effective manner. Without such an analytic model, the analysis of strategic options and risks is challenging at best, and cost and schedule become an exercise in reporting, not proactive management. The model developed in this paper will be used during Lynx's development to analyze the schedule and cost for risk and to allow for optimization later in Lynx's maturation.

This paper presents the derivation of the system of equations that predicts the cost and duration of manufacture of the XMA. Analysis of the resulting system of equations identifies which parameters must be measured and determined during the ongoing technology development efforts. The uncertainties in the various parameters and in the model assumptions are identified as the risks to cost and schedule performance.

The foundation of the modeling is an area of operations research known as queuing theory (QT). Operations research began in World War II and has been described as "math and physics in service to the corner office."3 This is certainly such an application. Specifically, QT is the study of lines, or queues, which are an integral part of modern life.4 QT has been applied to health services,5 telephone network design, and many other important and interesting applications.6

The development of the approach begins with the introduction of useful notation and nomenclature, in which the development of the model will be framed, along with definitions of the relevant terms for the discussion. The model development begins by building on initial results presented in earlier work, expanding the illustrative, but unrealistic, example of an entirely deterministic XMA manufacturing process. The assumption that the production line is governed by deterministic processes is then relaxed and the effects of variance are introduced, generalizing the model. The model for the cost and duration of an optimized process is then examined in light of the Lynx technology development.
This paper concludes with lessons that can be gleaned from the model even at this early stage of development and quantification. Especially important is the identification of the key assumptions and values that exert the greatest influence on cost, duration, and risk. The modeling and analysis of cost and schedule is not entirely new to the x-ray community; the process used to develop and optimize the AXAF (Chandra) x-ray calibration test schedule and efficiency7 is very similar to the approach discussed in this paper.

2. Nomenclature and Notation

The model being developed is based on QT, a well-developed area of applied mathematics and field of operations research. This section is intended to serve as a primer, introducing the QT nomenclature and notation used in the rest of this paper.

The basic building block of our model development and analysis is the concept of a queue, a frequent occurrence in modern life. Familiar examples are airport security, fast food restaurants, rental car counters, and the infamous call centers for help with purchased items and services. Consider Fig. 1, a queue/server system, which will be shortened to simply a queue. Figure 1 shows that incoming elements from the (i − 1)'th process are processed by the i'th server and, once processed, move on to the next stage of the process, the (i + 1)'th step. If the server is not busy, the element or part is processed and moves on to the next step. If a part arrives and the server is busy, the incoming part waits in the queue, or line, for its turn. The mean population of the queue, the mean number of parts waiting for service, denoted L_q, is a key parameter in this analysis. The components of this simple system are the elements or parts, the queue where the waiting is done, and the server (also called the service node or machine), which processes parts for the i'th step and sends them on to the next step.

Figure 2 introduces some additional components of the system, namely the arrival pattern, or distribution in time, of the arriving elements, denoted A; the distribution of service or process times, denoted B; and the number of servers, c. Kendall8 introduced a useful and compact notation to describe queues. In the eponymous notation, the queue is written A/B/c. (In more advanced analyses, the Kendall notation contains more parameters, but for present purposes these three attributes are sufficient.) The dummy variables A and B will be replaced with symbols representing various options for the statistical properties of the arrival and service processes. As we will see, the optimization of the time to manufacture is the story of how the number of servers, c, is chosen for each of the process steps.

The model of the production process is depicted in Fig. 3. This model is a series of queues arranged serially, as a manufacturing process has a specific order that cannot be altered. In this model, there are N steps and the associated shipments, or transitions, from one process to the next.

3. Deterministic Arrival and Service Times

The first model we will develop is the most naive model of an N-step process. Namely, the arrival and processing time models are deterministic (no variance), denoted D in the Kendall notation. In this case, a deterministic model means a specific, fixed time for each operation. We can now write compactly and efficiently that our process is made up of N D/D/1 queues, with the index i representing the ordinal value of each step.
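To make the notation concrete, the following sketch (Python, with step names, times, and server counts invented for illustration; they are not taken from the Lynx process data) represents a serial production line as a list of steps, each carrying its Kendall descriptor A/B/c along with its process and shipping times.

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One stage of a serial production line, described in Kendall notation A/B/c."""
    name: str         # label for the process step (illustrative only)
    arrival: str      # A: arrival-time model, e.g., "D" (deterministic) or "M" (Markovian)
    service: str      # B: service-time model
    servers: int      # c: number of parallel servers at this step
    t_process: float  # mean process time for one part (hours)
    s_ship: float     # shipping/transit time to the next step (hours)

    def kendall(self) -> str:
        return f"{self.arrival}/{self.service}/{self.servers}"

# A purely notional three-step line; the real XMA process has many more steps.
line = [
    Step("form substrate", "D", "D", 1, 8.0, 0.5),
    Step("coat",           "D", "D", 1, 2.0, 0.5),
    Step("align and bond", "D", "D", 1, 12.0, 0.5),
]

for step in line:
    print(f"{step.name}: {step.kendall()}, t = {step.t_process} h, s = {step.s_ship} h")
```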
3.1. Total Process Time: Deterministic Process

Consider the time for a part to complete the N-step process. This is called the total processing time, T_1, and is given by

T_1 = Σ_{i=1}^{N} (t_i + s_i),  (1)

where t_i and s_i are the individual process and shipping times and i = 1 to N indexes each step. T_1 is the processing time for a single element or unit, but Lynx will have to make many, Q, units. To calculate the total process time for Q units, T_Q, we need to know which process step has the largest process time. Define t_gate as

t_gate = max_i {t_i}.  (2)

If a part is sent into the manufacturing process as soon as the first server is available, at some point the stream of parts reaches the step with the longest process time, the so-called gate, or rate-limiting step. When parts arrive at this gate, they arrive faster than they can be processed, and a queue builds up until the Q'th part has arrived. Using Eqs. (1) and (2), an expression for T_Q can be written as

T_Q = Σ_{i=1}^{N} (t_i + s_i) + (Q − 1) t_gate.  (3)

Where in the process the gate occurs does not matter; addition is associative, so Eq. (3) is generally correct. Equation (3) can be rearranged into Eq. (4), in which t_gate appears multiplied by Q. In light of that formulation, the name "gate" for the maximum process time becomes clear: it is the one process time that is multiplied by the number of parts to be made, dominating T_Q. If we wish to reduce T_Q, Eq. (4) shows us that the best investment is to increase the number of servers that perform the process at the gate, effectively reducing the gate time. If the number of servers at the gate is n_gate, the effective gate time becomes t_gate/n_gate and T_Q is given by Eq. (5). Equation (5) is a good answer as long as the reduced gate time remains larger than all of the other process times, the condition expressed in Eq. (6).

A simple and powerful lesson to be gleaned from this very naive analysis is that knowledge of the values of the t_i allows for the most impactful investment. For example, if a second copy of the gate server is bought, the effective gate time is cut in half for the cost of that single server. If we did not know this and bought a second complete copy of the production line, we would get essentially the same result, a reduction in T_Q by a factor of two, but at the cost of all of the equipment, a far less cost-effective strategy!
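As a numerical illustration of this lesson, here is a minimal sketch (Python). The process times, shipping times, and part count are placeholders chosen by this edit, not Lynx values. It evaluates the makespan as the first-unit time plus (Q − 1) effective gate times, as described above, and compares buying a second server at the gate against duplicating the entire line.

```python
# Minimal sketch of the deterministic (D/D/1) makespan model described above.
# Process/shipping times and the part count are illustrative placeholders, not Lynx data.

def makespan(t, s, q, servers=None):
    """Total time to push q parts through a serial line.

    t: process times per step; s: shipping times per step
    servers: optional server counts per step (default 1 each); the effective
             step time is t_i / n_i, per the mitigation argument above.
    """
    n = servers or [1] * len(t)
    eff = [ti / ni for ti, ni in zip(t, n)]
    t1 = sum(e + si for e, si in zip(eff, s))  # time for the first part, Eq. (1)-like
    t_gate = max(eff)                          # rate-limiting (gate) step, Eq. (2)-like
    return t1 + (q - 1) * t_gate               # Eq. (3)-like makespan

t = [2.0, 3.0, 12.0, 1.5]   # hours per step (placeholders); step 3 is the gate
s = [0.5, 0.5, 0.5, 0.5]    # shipping times (placeholders)
q = 1000                    # number of parts (placeholder)

base = makespan(t, s, q)
extra_gate_server = makespan(t, s, q, servers=[1, 1, 2, 1])  # one extra server at the gate
duplicate_line = makespan(t, s, q // 2)  # two full copies of the line, each making half the lot

print(f"baseline:              {base:,.1f} h")
print(f"extra gate server:     {extra_gate_server:,.1f} h")
print(f"duplicate entire line: {duplicate_line:,.1f} h")
```

With these placeholder values, the single extra gate server and the duplicated line give essentially the same makespan, which is the point of the lesson above.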
In order to formulate a more general solution, namely when the condition of Eq. (6) does not hold, a change in notation is helpful. The author hopes that the reason for the change, while not immediately obvious, will become clear very shortly. Recall Eq. (1), which is rewritten as Eq. (7). Relabel the t_i according to their magnitude: denote the maximum value as t_{1}, the second largest process time as t_{2}, and so on, with the smallest value being t_{N}. Curly braces are used so that the reader is not confused with the traditional notation of rank statistics, which uses parentheses around the index and in which the index 1 indicates the minimum; the braces will hopefully remind the reader that this is a reverse-rank index. The assumption that all of the shipping times are small implies that all the XMA manufacturing will be done at a single location. This co-location enables rapid transit from one step to the next and avoids transportation becoming the rate-limiting process. Let the shipping times be negligible; then Eq. (7) can be written as Eq. (8), and Eq. (5) can be rewritten as Eq. (9).

Recall the condition of Eq. (6), and now consider increasing the value of Q until the second longest process time, t_{2}, becomes the gating process. Equation (9) is then no longer correct: t_{2} is the gate, so an investment should be made in buying more servers at that step, and we must also invest in additional servers for the original gate or it may regain its place as the longest step. We need to find n_{1} and n_{2} such that the effective process times are smaller than t_{3}, namely the condition of Eq. (10). Hopefully, the reader can see the pattern that is emerging: the purchase of additional copies of the servers for the mitigation of gates, or bottlenecks, should consider how many levels of bottlenecks are to be mitigated by the investment. If there are k bottlenecks to be mitigated, then Eq. (10) can be generalized to Eq. (11), and the mitigation of the k bottlenecks gives T_Q as Eq. (12). Equation (12) gives us the means to examine key programmatic questions, such as "What is the investment needed to meet the schedule duration requirement of T_req?" or "What investments must be made to decrease the production time?"

Equation (12) can be used to derive a conservative estimate of T_Q. Consider the limiting case of Eq. (12), the equality; all of the terms in the first summation in Eq. (12) can be replaced with t_{k+1}, which gives the inequality of Eq. (13). Equation (13) simplifies to Eq. (14). Using the definition in Eq. (15), we have Eq. (16), and substitution of Eq. (16) into Eq. (14) gives Eq. (17). Collecting terms gives Eq. (18). In order to meet a hypothetical requirement on the manufacturing span, T_req, we desire T_Q ≤ T_req; this will also be met if Eq. (19) is satisfied. Equation (19) can be solved to give Eq. (20), which provides a means for estimating the n_i, as will be shown in Sec. 7 of this paper.

3.2. Cost of Servers: Deterministic Case

Let c_i be the cost of the server for the i'th step, so that the total cost of servers, C_servers, is

C_servers = Σ_{i=1}^{N} n_i c_i.  (21)

Using Eq. (11), the n_{j} can be expressed in terms of the t_{j}, as in Eq. (22), where the ceiling brackets indicate that the value is rounded up to the next larger integer, as machines come in integral quantities. So C_servers can be written as Eq. (23).
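The rank-ordered mitigation pattern and the server cost of Eq. (21) can be scripted directly. The sketch below (Python) buys servers at the k slowest steps until each effective time drops to the (k + 1)'th largest process time, in the spirit of Eqs. (11) and (22); since those equations are not reproduced here, the ceiling rule n = ⌈t_{j}/t_{k+1}⌉ is an inference from the text. All step times and server costs are invented for illustration.

```python
import math

# Illustrative process times (hours) and per-server costs (arbitrary units); not Lynx data.
t = [2.0, 3.0, 12.0, 1.5, 7.0, 5.0]
server_cost = [1.0, 2.0, 10.0, 0.5, 4.0, 3.0]

def mitigate(t, server_cost, k):
    """Buy servers at the k largest (slowest) steps so each effective time
    drops to the (k+1)'th largest process time, then report counts and cost."""
    order = sorted(range(len(t)), key=lambda i: t[i], reverse=True)  # reverse-rank index {1},{2},...
    target = t[order[k]]                     # t_{k+1}: largest unmitigated process time
    n = [1] * len(t)
    for idx in order[:k]:
        n[idx] = math.ceil(t[idx] / target)  # smallest integer n with t_i / n <= t_{k+1}
    cost = sum(ni * ci for ni, ci in zip(n, server_cost))  # C_servers = sum of n_i c_i, Eq. (21)
    new_gate = max(ti / ni for ti, ni in zip(t, n))
    return n, cost, new_gate

for k in range(4):
    n, cost, gate = mitigate(t, server_cost, k)
    print(f"k={k}: servers={n}, effective gate={gate:.2f} h, server cost={cost:.1f}")
```

Sweeping k this way makes the trade explicit: each additional level of mitigation shortens the effective gate time but adds server cost.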
4. Statistically Distributed Arrival and Service Times

Up to this point in the discussion, all of the process times t_i and shipping times s_i have been treated as deterministic quantities. This section extends the deterministic analysis to accommodate stochastic arrival and service times. We introduce the additional concepts and results and develop the calculation for general distributions of arrival and service times, including the effect of variance. Problems of this class are in effect the "bread and butter" of QT, and we have a rich heritage of results to draw on.

The foundation of QT is known as Little's rule9–11 and is written L = λW and L_q = λW_q, where L is the mean number in the queue/server system, λ is the mean rate of arrivals, μ is the mean service rate, W is the mean waiting time, and the subscript q refers to the population and wait time in the queue, awaiting service. If the variances of the arrival and service times are zero, the results of the deterministic queues are recovered. A seminal new parameter is also introduced here, called the server utilization, ρ, defined as

ρ = λ/μ.  (27)

Small values of ρ mean that the arrival rate λ is small compared with the service rate μ, and the server is not very busy. If ρ is large, then the server is busy most or all of the time, and there is a wait for service, or what we have termed a gate or bottleneck.

For a single server, if the arrival model is a Poisson process and the service time is exponential, the queue is described as M/M/1, where M means Markovian. For this case, it can be shown (Ref. 12) that L_q is given by

L_q = ρ²/(1 − ρ).  (28)

Equation (28) states that to keep queue populations small, and thus avoid bottlenecks, ρ should be kept small. Figure 4 shows a plot of Eq. (28); L_q diverges as ρ → 1.

Our strategy for minimizing T_Q has been to avoid bottlenecks, in which parts sit idle; bottlenecks are now described by the continuous variable L_q. The exponential model of service times is not general, so consider now an M/G/1 queue, where the G means a general statistical distribution of service times, characterized by its mean, 1/μ, and its standard deviation in service time, σ_s. For this queue/server system, L_q is known from the Pollaczek–Khinchine (PK) formula13,14 and is given by

L_q = (ρ² + λ²σ_s²) / [2(1 − ρ)].  (29)

The PK result, Eq. (29), explicitly shows that service time variance increases the mean number in the queue. Moreover, Eq. (29) informs us that we must measure and control σ_s and avoid large values.

The QT literature also provides a result for L_q in the case where both arrivals and service times are generally distributed, the G/G/1 queue.15 L_q is described by the closed-form approximation of Eq. (30), where C² denotes the squared coefficient of variation, the ratio of the variance to the mean squared. The squared coefficients of variation in service time (subscript s) and in arrival time (subscript a) are given, respectively, by

C_s² = σ_s²μ² and C_a² = σ_a²λ².  (31), (32)

Substitution of Eqs. (31) and (32) into Eq. (30) gives Eq. (33). We want to operate with ρ < 1; using this and the definition of ρ, Eq. (27), Eq. (33) can be rewritten as Eq. (34). Examination of Eq. (34) illustrates the key role of variance in the service time in increasing L_q and pushing the process toward a bottleneck. Equation (34) also shows that variance in the arrival rate matters, but is of secondary importance. The implication of Eq. (34) is quite clear: variance in service and arrival times can and should be planned for in any manufacturing schedule. Underestimation of these quantities will cause schedule delays and cost increases. In other words, the risk is not that the manufacturing process for Lynx, or any other project, will have variance in its service times; the risk lies in underestimating that variance. Proper measurement and estimation of the distribution of service times is clearly one of the lessons from this analysis that must be passed to the technology and manufacturing development efforts.

The result for L_q for an M/M/c queue16 is given by Eq. (35), where P_0, the probability that the system is empty, is given by Eq. (36). The result for W_q for a G/G/c queue is given by Eq. (37).17 Using Little's rule, Eqs. (25) and (37) give the expected queue population for a G/G/c queue, Eq. (38). However complicated a numerical process, Eq. (38) gives us the means of making a lower-bound estimate for the number of servers at each of the bottlenecks chosen for mitigation: namely, we seek the value of c that keeps L_q acceptably small, given the other parameter values. Moreover, it can be seen that more servers do help and are a means of mitigating risk in the knowledge of the process times.
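The single-server results above can be exercised numerically, as in the sketch below (Python). It evaluates the M/M/1 result of Eq. (28) and the PK formula of Eq. (29); as a stand-in for the G/G/1 approximation of Eq. (30), whose exact form is not reproduced here, it uses a common Kingman-style heavy-traffic form, which is an assumption of this sketch. The arrival and service parameters are illustrative, not measured Lynx values.

```python
# Queue-population sketches for a single server; parameter values are illustrative only.

def lq_mm1(rho):
    """M/M/1 mean queue length, Eq. (28): Lq = rho^2 / (1 - rho)."""
    return rho**2 / (1.0 - rho)

def lq_mg1_pk(lam, mu, sigma_s):
    """M/G/1 mean queue length via the Pollaczek-Khinchine formula, Eq. (29)."""
    rho = lam / mu
    return (rho**2 + lam**2 * sigma_s**2) / (2.0 * (1.0 - rho))

def lq_gg1_approx(lam, mu, ca2, cs2):
    """Stand-in for the G/G/1 approximation: a Kingman-style heavy-traffic form,
    Lq ~ (rho^2 / (1 - rho)) * (ca^2 + cs^2) / 2.  Not the paper's Eq. (30); an assumption."""
    rho = lam / mu
    return (rho**2 / (1.0 - rho)) * (ca2 + cs2) / 2.0

lam, mu = 0.9, 1.0         # arrival and service rates (parts per hour), so rho = 0.9
sigma_s = 0.8              # service-time standard deviation (hours)
cs2 = (sigma_s * mu) ** 2  # squared coefficient of variation of service time, Eq. (31)
ca2 = 1.0                  # exponential-like arrival variability, Eq. (32)

print(f"M/M/1            Lq = {lq_mm1(lam / mu):.2f}")
print(f"M/G/1 (PK)       Lq = {lq_mg1_pk(lam, mu, sigma_s):.2f}")
print(f"G/G/1 (approx.)  Lq = {lq_gg1_approx(lam, mu, ca2, cs2):.2f}")
print(f"M/G/1, zero variance: Lq = {lq_mg1_pk(lam, mu, 0.0):.2f}  (variance drives the queue)")
```

Cutting the service-time variance in this sketch cuts the mean queue population roughly in half at the same utilization, which is the point the PK formula makes analytically.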
We are now in a position to calculate T_Q for the case of a production line made up of G/G/c queues, with k bottlenecks mitigated. Recall the result for the D/D/1 line mitigated to the k'th level, Eq. (18), restated as Eq. (39). To derive the result for a production line made up of G/G/c queues mitigated to the k'th level, the gating time is replaced with the total waiting time for the gating process, which includes the queue delay; call it W_gate. Using Little's rule, Eq. (26), we obtain the queue delay at the gate, Eq. (40); combining this with Eqs. (25) and (38) for L_q gives Eq. (41), so the expression for W_gate is now Eq. (42). Equation (42) can be rewritten, recognizing that the mean processing rate μ is the reciprocal of the mean processing time, to give Eq. (43). Substitution of Eq. (42) into Eq. (39) gives the result for T_Q for G/G/c queues, mitigated to the k'th level, as Eq. (44). Using the equality gives the lowest upper bound, namely Eq. (45), and substitution of Eq. (43) into Eq. (45) gives the final result for T_Q as Eq. (46). Equation (46) shows that, as in the deterministic case, the value of the gating process time plays the central role, along with the effects of variance in the arrival and service times.

5. Cost Model

The analysis of the schedule gives us the optimal number of servers and, knowing the cost of each, it is a triviality to add them up to give the server cost. That, however, is not the whole cost of XMA production. The other elements that must be included are the cost of raw materials, the cost of installing the servers, utilities, maintenance of the servers, manufacturing personnel, and the level-of-effort (LOE) costs that accrue over the duration of production.
The cost of XMA production is, of course, the sum of all these cost elements, as expressed in Eq. (47). We can now estimate how each cost element changes with respect to the number of servers and the time to produce the XMA, T_Q. It is assumed that the cost of raw materials is fixed with respect to the production strategy. The cost of servers, C_servers, is given by Eq. (21), restated as Eq. (48). Installation cost is proportional to the number of servers; letting the average cost of installation per server be c_install, we get Eq. (49), C_install = c_install Σ n_i. The cost of utilities depends on the need of each server, times the number of servers, times the duration of the production effort; if the average utility cost rate for a server is c_util, then Eq. (50) gives C_utilities = c_util T_Q Σ n_i. Assuming the mean cost rate to maintain a server is c_maint, Eq. (51) gives C_maintenance = c_maint T_Q Σ n_i. Personnel costs are also proportional to the number of servers and the time to manufacture, so if the personnel cost rate per server is c_pers, Eq. (52) gives C_personnel = c_pers T_Q Σ n_i. The LOE cost is proportional to the duration alone; calling the LOE cost rate c_LOE, Eq. (53) gives C_LOE = c_LOE T_Q.

Substitution of Eqs. (48)–(53) into Eq. (47) gives Eq. (54), which can be rearranged and factored to give Eq. (55). For any given set of n_i, the term in the parentheses in Eq. (55) is a fixed value, so Eq. (55) can be written compactly as Eq. (56): a fixed cost plus a cost rate multiplied by T_Q. We have seen above that, in general, buying more servers increases the fixed cost but decreases T_Q; this means that it is possible to find a minimum value for the cost to manufacture the XMA. The actual numerical calculation will have to be done once the values of all the relevant parameters have been determined.

6. Effects of Process Yield

Any real process is not perfect and therefore has a finite yield, where the yield of each step is the probability that the step produces an acceptable result. The yield may be thought of as a probability of success for each process and of course takes a value between 0 and 1. Note that a part that fails to complete step i must be remade. This rework costs money in new material and time, causing T_Q to increase, which, as we have seen in Eq. (56), also increases cost. The yield for step i, y_i, is the probability of success of step i; the probability of failure is its complement, 1 − y_i. Since each process must produce Q successes, the mean number of failures for step i is given by Eq. (57). The total number of parts needing rework, R, is the sum of Eq. (57) over all process steps, Eq. (58). The probability of success or failure of a process is described by the binomial distribution; with Q large and the failure probabilities small, failures are rare, and the special case of the Poisson distribution applies. The standard deviation of R is therefore σ_R = √R [Eq. (59)].

Using Eqs. (58) and (59), the number of additional parts that must be made to achieve Q successes, with the associated risk that still more will be needed, can be calculated. To achieve the required confidence, in general, more parts than Q + R alone will need to be made. For the purposes of this discussion, assume that to reach the desired level of confidence we must make Q' parts, with Q' > Q. The total time needed to make the XMA is then the time to make Q' parts. Using Eq. (46), the total process time is given by Eq. (60), which can also be written as Eq. (61). Equation (60) or Eq. (61) gives a formulation for the time to manufacture the XMA; the cost model, Eq. (56), with T_Q replaced by this yield-adjusted time, gives Eq. (62). The system of Eq. (60) or Eqs. (61) and (62) is the system that gives the cost and schedule estimates for the manufacture of the XMA.
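To show how the yield adjustment and the cost model of Eq. (56) could be wired together, here is a hedged sketch (Python). All yields, unit costs, rates, server counts, and the confidence multiplier are placeholders invented for this edit, and the makespan is the simple deterministic one from the earlier sketch rather than the full G/G/c result of Eq. (46).

```python
import math

# Placeholder inputs; none of these are Lynx values.
t = [2.0, 3.0, 12.0, 1.5, 7.0, 5.0]          # process times (h)
s = [0.5] * 6                                # shipping times (h)
n = [1, 1, 3, 1, 2, 1]                       # servers per step (from a mitigation study)
yields = [0.99, 0.98, 0.95, 0.999, 0.97, 0.99]
Q = 1000                                     # required good parts
z = 2.0                                      # confidence multiplier on the rework estimate

# Yield: expected rework plus a margin of z standard deviations (Poisson approximation).
failures = sum(Q * (1.0 - y) for y in yields)                  # mean rework, Eqs. (57)-(58)-like
q_prime = Q + math.ceil(failures + z * math.sqrt(failures))    # parts to start, Q'

# Deterministic makespan for Q' parts with the chosen server counts.
eff = [ti / ni for ti, ni in zip(t, n)]
t_gate = max(eff)
T = sum(e + si for e, si in zip(eff, s)) + (q_prime - 1) * t_gate   # hours

# Cost model in the spirit of Eq. (56): fixed costs plus a term proportional to duration.
c_server, c_install = 10.0, 2.0              # per-server purchase and installation (unit costs)
c_util, c_maint, c_pers = 0.01, 0.005, 0.02  # per-server, per-hour rates
c_loe = 0.5                                  # level-of-effort rate per hour
c_raw = 500.0                                # raw material, treated as fixed

n_total = sum(n)
C_fixed = c_raw + n_total * (c_server + c_install)
C_rate = n_total * (c_util + c_maint + c_pers) + c_loe       # cost per hour of production
C_total = C_fixed + C_rate * T

print(f"Q' = {q_prime} parts, makespan = {T:,.0f} h, cost = {C_total:,.0f} (arbitrary units)")
```

Varying the server counts n in this sketch trades a larger fixed cost against a shorter duration, which is the minimum-cost trade the text describes.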
7. Application Example: Determination of n_i

This section gives a simple example of how the results of this analysis can be applied to the problem of manufacturing the XMA. In discussions with the GSFC Optics Team,18 very early estimates of the process time for each step have been made. The results are presented in the table of t_i shown as Fig. 5. Since this is very early in the development of the XMA manufacturing process, we will use the results from Sec. 3, the deterministic manufacturing process, as the distribution of process times has not yet been characterized.

To get an estimate of the n_i, the number of copies of the server for each step i, we start with Eq. (19), which gives the inequality of Eq. (63). Equation (63) can be rewritten with the effective process time of the i'th step expressed as t_i/n_i, Eq. (64), as a means to estimate the value of n_i for each mitigated step. We know from earlier in this paper that all of the longer steps must have their n_i large enough that all of the effective process times fall below the required value. This gives a simple means of evaluating the t_i and determining the n_i. We select the limiting case (equality) of Eq. (64) and solve for n_i, giving Eq. (65).

An interesting question can be posed using Eq. (65): if we want to complete XMA production in less than T_req (to have planned margin), what are the n_i for those cases? To address this question, T_req is replaced by (1 − f)T_req, where f is the fractional reduction in time; Eq. (65) then becomes Eq. (66). Let T_req take its required value in years; further, we will assume that the shipping times are not the rate-limiting steps, so the s_i can be neglected. From Fig. 5 we have the values of t_i, and we know that Q is 37,492.19 Figure 6 shows the evaluation of Eq. (66) for values of f from 0% to 50%.

For each value of f, the total number of servers has been determined and plotted against f in Fig. 7. The trend line fit to the data in Fig. 7 is a quadratic. This result informs us that the sensitivity (partial derivative of duration with respect to cost) can also be expected to be quadratic, as many costs are proportional to the number of servers. Such proportional costs are floor space, utilities, manufacturing personnel, and, of course, the server and installation costs themselves. These investments decrease the time to manufacture the XMA. The optimal point can and will be determined once all the relevant parameter values are known.
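The margin question above can be scripted directly, as in the sketch below (Python). Because Eq. (66) is not reproduced here, the sketch uses the simplified relation n_i = ⌈Q t_i / ((1 − f) T_req)⌉ that follows from the limiting-case argument when shipping times are negligible; that relation, the per-step process times, and the required span are assumptions of this edit, while Q = 37,492 is the mirror-element count quoted above.

```python
import math

# Placeholder per-step process times in hours; the real values live in the Fig. 5 table.
t = [0.5, 2.0, 8.0, 1.0, 4.0, 0.25]
Q = 37_492                   # mirror elements to produce (from the text)
T_req_hours = 5 * 8760       # assumed 5-year production span, in hours (placeholder)

def servers_needed(t, q, t_req, f):
    """Simplified n_i estimate: enough servers at every step so that
    q * t_i / n_i <= (1 - f) * t_req, with n_i rounded up to an integer.
    This mirrors the limiting-case argument in the text; the exact Eq. (66) is not reproduced."""
    budget = (1.0 - f) * t_req
    return [math.ceil(q * ti / budget) for ti in t]

for f in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    n = servers_needed(t, Q, T_req_hours, f)
    print(f"f = {f:0.0%}: n_i = {n}, total servers = {sum(n)}")
```

Sweeping f this way and totaling the servers produces the kind of server-count versus schedule-margin curve shown in Fig. 7.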
8. Summary and Next Steps

The main objective of this work was to demonstrate that a credible model for the cost and schedule of producing the Lynx mirror assembly exists. The proof of this is on offer in the results, Eq. (60) or Eqs. (61) and (62). During the development of the model, the derivation has identified the relevant parameters and laid out the role of uncertainty, or variance, in these quantities as they affect cost and schedule. This analysis has shown that uncertainty in the arrival and process times increases the time to manufacture the XMA and increases its cost. As the technology development proceeds, all of the relevant process times must be measured along with their distributions, so that we can properly go from the t_i to the t_{j}, that is, rank order the processes correctly. All of this must be done during the process and technology development period.

This analysis is not complete. The next factor to be included will be the availability of the servers, namely how often they are operating and not in a state of repair, calibration, or other offline status. Inclusion of finite availability will further refine the selection of the n_i, the numbers of servers needed. Future work will also include the application of propagation of errors to develop a formal "error budget" for the cost and schedule models. The error budget approach will enable further rigorous refinement of the estimates of risk in cost and schedule.

Preliminary application of the model with early process data shows that the relationship between the number of servers and the time to complete the XMA is quadratic, as shown in Fig. 7. It is the happy conclusion of this work that it is possible to write down a model for cost and schedule and to carry out a tolerance analysis on its values and assumptions, yielding a nuanced, causal estimate of risk. This model gives us insight into how the XMA manufacturing process works, so that we can understand how the process could fail in terms of cost and schedule. Armed with this model and its descendants, the Lynx project will truly be able to manage the process of XMA manufacture and not just report on it.

Acknowledgments

The author would like to acknowledge the anonymous reviewers whose in-scope suggestions for this paper have helped make it better. Their out-of-scope suggestions will help guide future publications building on this work. The author gratefully acknowledges that much of this work was done under the Lynx Cooperative Agreement Notice contract funded by NASA and the matching Northrop Grumman internal funding.

References
1. M. P. Nagaraja et al., "2020 Decadal survey planning," https://science.nasa.gov/astrophysics/2020-decadal-survey-planning (2018), accessed December 2018.
2. W. W. Zhang et al., "High-resolution, lightweight, and low-cost x-ray optics for the Lynx observatory," J. Astron. Telesc. Instrum. Syst. 5(2), 021012 (2019). https://doi.org/10.1117/1.JATIS.5.2.021012
3. P. M. Morse and G. E. Kimball, Methods of Operations Research, Washington, DC (1946).
4. J. F. Shortle et al., Fundamentals of Queueing Theory, 5th ed., John Wiley and Sons, Hoboken, New Jersey (2018).
5. L. Mayhew and D. Smith, "Using queuing theory to analyse completion times in accident and emergency departments in the light of the Government 4-hour target" (2006).
6. A. Nafees, "Analysis of the sales checkout operation in ICA supermarket using queuing simulation," http://www.statistics.du.se/essays/D07E.Nafees.pdf, accessed December 2018.
7. J. W. Arenberg et al., "Lessons we learned designing and building the Chandra telescope," Proc. SPIE 9144, 91440Q (2014). https://doi.org/10.1117/12.2055515
8. D. G. Kendall, "Stochastic processes occurring in the theory of queues and their analysis by the method of the imbedded Markov chain," Ann. Math. Stat. 24, 338–354 (1953). https://doi.org/10.1214/aoms/1177728975
9. J. D. C. Little, "A proof for the queuing formula: L = λW," Oper. Res. 9(3), 383–387 (1961). https://doi.org/10.1287/opre.9.3.383
10. W. S. Jewell, "A simple proof of: L = λW," Oper. Res. 15(6), 1109–1116 (1967). https://doi.org/10.1287/opre.15.6.1109
11. S. Eilon, "A simpler proof of L = λW," Oper. Res. 17(5), 915–917 (1969). https://doi.org/10.1287/opre.17.5.915
12. D. Gross et al., Fundamentals of Queueing Theory, 4th ed., John Wiley and Sons, Hoboken, New Jersey (2008).
13. F. Pollaczek, "Über eine Aufgabe der Wahrscheinlichkeitstheorie," Math. Z. 32, 64–100 (1930). https://doi.org/10.1007/BF01194620
14. A. Y. Khintchine, "Mathematical theory of a stationary queue," Mat. Sb. 39(4), 73–84 (1932).
15. W. G. Marchal and C. M. Harris, "A modified Erlang approach to approximating GI/G/1 queues," J. Appl. Probab. 13(1), 118–126 (1976). https://doi.org/10.2307/3212671
16. L. Kleinrock, Queueing Systems, Volume 1: Theory, pp. 101–103, John Wiley & Sons, Inc., New York (1975).
17. J. F. C. Kingman, "The single server queue in heavy traffic," Math. Proc. Cambridge Philos. Soc. 57(4), 902 (1961). https://doi.org/10.1017/S0305004100036094
18. W. W. Zhang (2019).
19. J. A. Gaskin et al., "Lynx x-ray observatory: an overview," J. Astron. Telesc. Instrum. Syst.
Biography

Jonathan W. Arenberg is the chief engineer for space science missions at Northrop Grumman Aerospace Systems. He has worked on the Chandra X-ray Observatory and the James Webb Space Telescope. He co-conceived and developed the Starshade for the direct imaging of extrasolar planets. He is the author of more than 180 conference presentations, papers, book chapters, and a recent SPIE Press book. He holds a dozen European and U.S. patents and is an SPIE Fellow.