# Dsgn & Analy ISYE 6413

These class notes were uploaded by Maryse Thiel on Monday, November 2, 2015. They belong to ISYE 6413 at Georgia Institute of Technology - Main Campus, taught by Chien-Fu Wu in Fall.

Date Created: 11/02/15

## scaledlambdaplot

    # The design columns A, B, C, D and the response y are defined first.
    # Their entries are not legible in this copy; the surviving digits of y are:
    #   1681983 283 444985 7099790720724440945377794311751630
    AB <- A * B; AC <- A * C; AD <- A * D
    BC <- B * C; BD <- B * D; CD <- C * D

    # Here we transform the y values
    lambda <- -2;   y1 <- (y^lambda - 1) / lambda
    lambda <- -1;   y2 <- (y^lambda - 1) / lambda
    lambda <- -0.5; y3 <- (y^lambda - 1) / lambda
    lambda <- 0;    y4 <- log(y)
    lambda <- 0.5;  y5 <- (y^lambda - 1) / lambda
    lambda <- 1;    y6 <- (y^lambda - 1) / lambda
    lambda <- 3/2;  y7 <- (y^lambda - 1) / lambda
    lambda <- 2;    y8 <- (y^lambda - 1) / lambda

    # Now we put these y's into a matrix as columns
    ymat <- cbind(y1, y2, y3, y4, y5, y6, y7, y8)

    # We shall put the t statistics into a matrix called tmat
    # (11 coefficients including the intercept, 8 values of lambda)
    tmat <- matrix(0, nrow = 11, ncol = 8)

    for (i in 1:8) {
      y <- ymat[, i]
      # Here we fit the regression model
      g <- lm(y ~ A + B + C + D + AB + AC + AD + BC + BD + CD)
      # We get the model matrix X
      X <- model.matrix(g)
      # We calculate the coefficients
      XtXinv <- solve(t(X) %*% X)
      beta <- XtXinv %*% t(X) %*% y
      # We calculate the estimate of sigma^2 (RSS/df, with 5 residual df)
      s2 <- sum((y - g$fitted)^2) / 5
      # We calculate the std error of the estimates
      se <- sqrt(s2 * diag(XtXinv))
      # Now calculate the t statistics
      tmat[, i] <- beta / se
    }

    # Here we ignore the t statistics corresponding to the intercept
    tmat <- tmat[-1, ]

    # Now we plot these against lambda
    lambda <- c(-2, -1, -1/2, 0, 1/2, 1, 3/2, 2)
    x11()
    plot(lambda, tmat[1, ], ylim = c(min(tmat), max(tmat)), type = "n",
         xlab = "Lambda", ylab = "t Statistics")
    for (j in 1:10) {
      points(lambda, tmat[j, ], type = "l", col = 1)
      points(lambda, tmat[j, ], type = "p", pch = 20)
    }

    # Label the curves at the third lambda value (labels for rows 1 to 9 survive in the source)
    i <- 3
    labs <- c("A", "B", "C", "D", "AB", "AC", "AD", "BC", "BD")
    for (j in seq_along(labs)) text(lambda[i], tmat[j, i], labs[j])

## Unit 6: Fractional Factorial Experiments at Three Levels

Source: Chapter 5,
Sections 5.1-5.6.

- Larger-the-better and smaller-the-better problems.
- Basic concepts for $3^k$ full factorial designs.
- Analysis of $3^k$ designs using orthogonal components system.
- Design of 3-level fractional factorials.
- Effect aliasing, resolution and minimum aberration in $3^{k-p}$ fractional factorial designs.
- Analysis of 3-level designs: ANOVA using orthogonal components system.

### Seat-Belt Experiment

- An experiment to study the effect of four factors on the pull strength of truck seat belts.
- Four factors, each at three levels (Table 1).
- Two responses: crimp tensile strength, which must be at least 4000 lb, and flash, which cannot exceed 14 mm.
- 27 runs were conducted; each run was replicated three times, as shown in Table 2.

Table 1: Factors and Levels, Seat-Belt Experiment

| Factor | Level 0 | Level 1 | Level 2 |
|---|---|---|---|
| A. pressure (psi) | 1100 | 1400 | 1700 |
| B. die flat (mm) | 10.0 | 10.2 | 10.4 |
| C. crimp length (mm) | 18 | 23 | 27 |
| D. anchor lot | P74 | P75 | P76 |

### Design Matrix and Response Data, Seat-Belt Experiment

Table 2: Design Matrix and Response Data, Seat-Belt Experiment (first 14 runs)

| Run | A | B | C | D | Strength | | | Flash | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 5164 | 6615 | 5959 | 12.89 | 12.70 | 12.74 |
| 2 | 0 | 0 | 1 | 1 | 5356 | 6117 | 5224 | 12.83 | 12.73 | 13.07 |
| 3 | 0 | 0 | 2 | 2 | 3070 | 3773 | 4257 | 12.37 | 12.47 | 12.44 |
| 4 | 0 | 1 | 0 | 1 | 5547 | 6566 | 6320 | 13.29 | 12.86 | 12.70 |
| 5 | 0 | 1 | 1 | 2 | 4754 | 4401 | 5436 | 12.64 | 12.50 | 12.61 |
| 6 | 0 | 1 | 2 | 0 | 5524 | 4050 | 4526 | 12.76 | 12.72 | 12.94 |
| 7 | 0 | 2 | 0 | 2 | 5684 | 6251 | 6214 | 13.17 | 13.33 | 13.98 |
| 8 | 0 | 2 | 1 | 0 | 5735 | 6271 | 5843 | 13.02 | 13.11 | 12.67 |
| 9 | 0 | 2 | 2 | 1 | 5744 | 4797 | 5416 | 12.37 | 12.67 | 12.54 |
| 10 | 1 | 0 | 0 | 1 | 6843 | 6895 | 6957 | 13.28 | 13.65 | 13.58 |
| 11 | 1 | 0 | 1 | 2 | 6538 | 6328 | 4784 | 12.62 | 14.07 | 13.38 |
| 12 | 1 | 0 | 2 | 0 | 6152 | 5819 | 5963 | 13.19 | 12.94 | 13.15 |
| 13 | 1 | 1 | 0 | 2 | 6854 | 6804 | 6907 | 14.65 | 14.98 | 14.40 |
| 14 | 1 | 1 | 1 | 0 | 6799 | 6703 | 6792 | 13.00 | 13.35 | 12.87 |

Table 3: Design Matrix and Response Data, Seat-Belt Experiment (last 13 runs)

| Run | A | B | C | D | Strength | | | Flash | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 15 | 1 | 1 | 2 | 1 | 6513 | 6503 | 6568 | 13.13 | 13.40 | 13.80 |
| 16 | 1 | 2 | 0 | 0 | 6473 | 6974 | 6712 | 13.55 | 14.10 | 14.41 |
| 17 | 1 | 2 | 1 | 1 | 6832 | 7034 | 5057 | 14.86 | 13.27 | 13.64 |
| 18 | 1 | 2 | 2 | 2 | 4968 | 5684 | 5761 | 13.00 | 13.58 | 13.45 |
| 19 | 2 | 0 | 0 | 2 | 7148 | 6920 | 6220 | 16.70 | 15.85 | 14.90 |
| 20 | 2 | 0 | 1 | 0 | 6905 | 7068 | 7156 | 14.70 | 13.97 | 13.66 |
| 21 | 2 | 0 | 2 | 1 | 6933 | 7194 | 6667 | 13.51 | 13.64 | 13.92 |
| 22 | 2 | 1 | 0 | 0 | 7227 | 7170 | 7015 | 15.54 | 16.16 | 16.14 |
| 23 | 2 | 1 | 1 | 1 | 7014 | 7040 | 7200 | 13.97 | 14.09 | 14.52 |
| 24 | 2 | 1 | 2 | 2 | 6215 | 6260 | 6488 | 14.35 | 13.56 | 13.00 |
| 25 | 2 | 2 | 0 | 1 | 7145 | 6868 | 6964 | 15.70 | 16.45 | 15.85 |
| 26 | 2 | 2 | 1 | 2 | 7161 | 7263 | 6937 | 15.21 | 13.77 | 14.34 |
| 27 | 2 | 2 | 2 | 0 | 7060 | 7050 | 6950 | 13.51 | 13.42 | 13.07 |

### Larger-the-Better and Smaller-the-Better Problems

- In the seat-belt experiment, the strength should be as high as possible and the flash as low as possible.
- There is no fixed nominal value for either strength or flash. Such problems are referred to as larger-the-better and smaller-the-better problems, respectively.
- For such problems, increasing or decreasing the mean is more difficult than reducing the variation and should be done in the first step. (Why?)
- Two-step procedure for larger-the-better problems:
  1. Find factor settings that maximize $E(y)$.
  2. Find other factor settings that minimize $\mathrm{Var}(y)$.
- Two-step procedure for smaller-the-better problems:
  1. Find factor settings that minimize $E(y)$.
  2. Find other factor settings that minimize $\mathrm{Var}(y)$.

### Situations where three-level experiments are useful

- When there is a curvilinear relation between the response and a quantitative factor like temperature. It is not possible to detect such a curvature effect with two levels.
- A qualitative factor may have three levels (e.g., three types of machines or three suppliers).
- It is common to study the effect of a factor on the response at its current setting $x_0$ and two settings around $x_0$.

### Analysis of $3^k$ designs using ANOVA

- We consider a simplified version of the seat-belt experiment as a $3^3$ full factorial experiment with factors A, B, C.
- Since a $3^3$ design is a special case of a multi-way layout, the analysis of variance method introduced in Section 2.4 can be applied to this experiment.
- We consider only the strength data for demonstration of the analysis.
- Using analysis of variance, we can compute the sum of squares for main effects A, B, C, interactions A × B, A × C, B × C and A × B × C, and the residual sum of squares. Details are given in
Table 4. The breakdown of the degrees of freedom is as follows:

- Each main effect has two degrees of freedom because each factor has three levels.
- Each two-factor interaction has $(3-1) \times (3-1) = 4$ degrees of freedom.
- The A × B × C interaction has $(3-1) \times (3-1) \times (3-1) = 8$ degrees of freedom.
- The residual degrees of freedom is $54 \; (= 27 \times (3-1))$, since there are three replicates.

### Analysis of Simplified Seat-Belt Experiment

Table 4: ANOVA Table, Simplified Seat-Belt Experiment

| Source | Degrees of Freedom | Sum of Squares | Mean Squares | F | p-Value |
|---|---|---|---|---|---|
| A | 2 | 34621746 | 17310873 | 85.58 | 0.000 |
| B | 2 | 938539 | 469270 | 2.32 | 0.108 |
| C | 2 | 9549481 | 4774741 | 23.61 | 0.000 |
| A × B | 4 | 3298246 | 824561 | 4.08 | 0.006 |
| A × C | 4 | 3872179 | 968045 | 4.79 | 0.002 |
| B × C | 4 | 448348 | 112087 | 0.55 | 0.697 |
| A × B × C | 8 | 5206919 | 650865 | 3.22 | 0.005 |
| residual | 54 | 10922599 | 202270 | | |
| total | 80 | 68858056 | | | |

### Orthogonal Components System: Decomposition of A × B Interaction

- A × B has 4 degrees of freedom.
- A × B has two components, denoted by $AB$ and $AB^2$, each having 2 df.
- Let the levels of A and B be denoted by $x_1$ and $x_2$ respectively.
- $AB$ represents the contrasts among the response values whose $x_1$ and $x_2$ satisfy $x_1 + x_2 = 0, 1, 2 \pmod 3$.
- $AB^2$ represents the contrasts among the response values whose $x_1$ and $x_2$ satisfy $x_1 + 2x_2 = 0, 1, 2 \pmod 3$.

### Orthogonal Components System: Decomposition of A × B × C Interaction

- A × B × C has 8 degrees of freedom.
- It can be further split up into four components, denoted by $ABC$, $ABC^2$, $AB^2C$ and $AB^2C^2$, each having 2 df.
- Let the levels of A, B and C be denoted by $x_1$, $x_2$ and $x_3$ respectively.
- $ABC$, $ABC^2$, $AB^2C$ and $AB^2C^2$ represent the contrasts among the three groups of $(x_1, x_2, x_3)$ satisfying each of the four systems of equations:

$$
\begin{aligned}
x_1 + x_2 + x_3 &= 0, 1, 2 \pmod 3,\\
x_1 + x_2 + 2x_3 &= 0, 1, 2 \pmod 3,\\
x_1 + 2x_2 + x_3 &= 0, 1, 2 \pmod 3,\\
x_1 + 2x_2 + 2x_3 &= 0, 1, 2 \pmod 3.
\end{aligned}
$$

### Uniqueness of Representation

- To avoid ambiguity, the convention that the coefficient for the first nonzero factor is 1 will be used.
- $ABC^2$ is used instead of $A^2B^2C$, even though the two are equivalent.
- For $A^2B^2C$, there are three groups satisfying $2x_1 + 2x_2 + x_3 = 0, 1, 2 \pmod 3$, equivalently $2 \times (2x_1 + 2x_2 + x_3) = 2 \times (0, 1, 2) \pmod 3$, equivalently $x_1 + x_2 + 2x_3 = 0, 2, 1 \pmod 3$, which corresponds to $ABC^2$ by relabeling of the groups. Hence $ABC^2$ and $A^2B^2C$ are equivalent.

### Analysis using the Orthogonal Components System

Table 5.2: Factor A and B Combinations ($x_1$ denotes the levels of factor A and $x_2$ denotes the levels of factor B)

| $x_1$ \ $x_2$ | 0 | 1 | 2 |
|---|---|---|---|
| 0 | $\alpha i$ ($y_{00}$) | $\beta k$ ($y_{01}$) | $\gamma j$ ($y_{02}$) |
| 1 | $\beta j$ ($y_{10}$) | $\gamma i$ ($y_{11}$) | $\alpha k$ ($y_{12}$) |
| 2 | $\gamma k$ ($y_{20}$) | $\alpha j$ ($y_{21}$) | $\beta i$ ($y_{22}$) |

The nine level combinations of A and B can be represented by the cells in the 3 × 3 square in Table 5.2. The letters $\alpha, \beta, \gamma$ mark the groups with $x_1 + x_2 = 0, 1, 2 \pmod 3$, so that

$$
\bar y_\alpha = \tfrac{1}{3}(y_{00} + y_{12} + y_{21}), \quad
\bar y_\beta = \tfrac{1}{3}(y_{01} + y_{10} + y_{22}), \quad
\bar y_\gamma = \tfrac{1}{3}(y_{02} + y_{11} + y_{20}),
$$

where each $y_{ij}$ represents the average of $n$ replicates in the $(i, j)$ cell.

$$
SS_{AB} = 3n\left[(\bar y_\alpha - \bar y_\cdot)^2 + (\bar y_\beta - \bar y_\cdot)^2 + (\bar y_\gamma - \bar y_\cdot)^2\right],
$$

where $\bar y_\cdot = (\bar y_\alpha + \bar y_\beta + \bar y_\gamma)/3$ and $n$ is the number of replicates.

- For the simplified seat-belt experiment, $\bar y_\alpha = 6024.407$, $\bar y_\beta = 6177.815$ and $\bar y_\gamma = 6467.0$, so that $\bar y_\cdot = 6223.074$ and

$$
SS_{AB} = (3)(9)\left[(6024.407 - 6223.074)^2 + (6177.815 - 6223.074)^2 + (6467.0 - 6223.074)^2\right] = 2727451.
$$

- Similarly, the $AB^2$ interaction component represents the contrasts among the three groups represented by the letters $i$, $j$ and $k$. The corresponding $\bar y_i$, $\bar y_j$ and $\bar y_k$ values represent the averages of observations with $x_1 + 2x_2 = 0, 1, 2 \pmod 3$ respectively, and the formula for $SS_{AB^2}$ can be defined in a similar manner.

### ANOVA: Simplified Seat-Belt Experiment

| Source | Degrees of Freedom | Sum of Squares | Mean Squares | F | p-value |
|---|---|---|---|---|---|
| A | 2 | 34621746 | 17310873 | 85.58 | 0.000 |
| B | 2 | 938539 | 469270 | 2.32 | 0.108 |
| C | 2 | 9549481 | 4774741 | 23.61 | 0.000 |
| A × B | 4 | 3298246 | 824561 | 4.08 | 0.006 |
| $AB$ | 2 | 2727451 | 1363725 | 6.74 | 0.002 |
| $AB^2$ | 2 | 570795 | 285397 | 1.41 | 0.253 |
| A × C | 4 | 3872179 | 968045 | 4.79 | 0.002 |
| $AC$ | 2 | 2985591 | 1492796 | 7.38 | 0.001 |
| $AC^2$ | 2 | 886587 | 443294 | 2.19 | 0.122 |
| B × C | 4 | 448348 | 112087 | 0.55 | 0.697 |
| $BC$ | 2 | 427214 | 213607 | 1.06 | 0.355 |
| $BC^2$ | 2 | 21134 | 10567 | 0.05 | 0.949 |
| A × B × C | 8 | 5206919 | 650865 | 3.22 | 0.005 |
| $ABC$ | 2 | 4492927 | 2246464 | 11.11 | 0.000 |
| $ABC^2$ | 2 | 263016 | 131508 | 0.65 | 0.526 |
| $AB^2C$ | 2 | 205537 | 102768 | 0.51 | 0.605 |
| $AB^2C^2$ | 2 | 245439 | 122720 | 0.61 | 0.549 |
| residual | 54 | 10922599 | 202270 | | |
| total | 80 | 68858056 | | | |

### Analysis of Simplified Seat-Belt Experiment (contd.)

- The significant main effects are A and C.
- Among the interactions, A × B, A × C and A × B × C are significant.
- We have difficulty
in interpreting an interaction when only one of its components is significant. What is meant by "A × B is significant"? Here $AB$ is significant but $AB^2$ is not. Is A × B significant because of the significance of $AB$ alone? For the original seat-belt experiment, we have $AB = CD^2$.
- Similarly, $AC$ is significant but not $AC^2$. How should the significance of A × C be interpreted?
- This difficulty in interpreting the significant interaction effects can be avoided by using linear-quadratic systems.

### Why three-level fractional factorial?

- Run size economy: it is not economical to use a $3^4$ design with 81 runs unless the experiment is not costly.
- If a $3^4$ design is used for the experiment, its 80 degrees of freedom would be allocated as follows:

| Main Effects | 2-Factor Interactions | 3-Factor Interactions | 4-Factor Interactions |
|---|---|---|---|
| 8 | 24 | 32 | 16 |

- Using the effect hierarchy principle, one would argue that three-factor and four-factor interactions are not likely to be important. Out of a total of 80 df, 48 correspond to such effects.

### Defining a $3^{4-1}$ Experiment

- Returning to the original seat-belt experiment, it employs a one-third fraction of the $3^4$ design. This is denoted as a $3^{4-1}$ design.
- The design is constructed by choosing the column for factor D (lot) to be equal to Column A + Column B + Column C (mod 3).
- This relationship can be represented by the notation $D = ABC$.
- If $x_1, \ldots, x_4$ are used to represent these four columns, then $x_4 = x_1 + x_2 + x_3 \pmod 3$, or equivalently

$$
x_1 + x_2 + x_3 + 2x_4 = 0 \pmod 3, \tag{1}
$$

which can be represented by $I = ABCD^2$.

### Aliasing Patterns of the Seat-Belt Experiment

- The aliasing patterns can be deduced from the defining relation. For example, by adding $2x_1$ to both sides of (1), we have

$$
2x_1 = 3x_1 + x_2 + x_3 + 2x_4 = x_2 + x_3 + 2x_4 \pmod 3.
$$

- This means that $A$ and $BCD^2$ are aliased. (Why?)
- By following the same derivation, it is easy to show that the following effects are aliased:

$$
\begin{aligned}
A &= BCD^2 = AB^2C^2D, & B &= ACD^2 = AB^2CD^2,\\
C &= ABD^2 = ABC^2D^2, & D &= ABC = ABCD,\\
AB &= CD^2 = ABC^2D, & AB^2 &= AC^2D = BC^2D,\\
AC &= BD^2 = AB^2CD, & AC^2 &= AB^2D = BC^2D^2,\\
AD &= AB^2C^2 = BCD, & BC &= AD^2 = AB^2C^2D^2,\\
BC^2 &= AB^2D^2 = AC^2D^2, & BD &= AB^2C = ACD,\\
CD &= ABC^2 = ABD.
\end{aligned}
\tag{2}
$$

### Clear and Strongly Clear Effects

- If three-factor interactions are assumed negligible, from
the aliasing relations in (2), $A$, $B$, $C$, $D$, $AB^2$, $AC^2$, $AD$, $BC^2$, $BD$ and $CD$ can be estimated. These main effects or components of two-factor interactions are called clear because they are not aliased with any other main effects or two-factor interaction components.
- A two-factor interaction, say A × B, is called clear if both of its components, $AB$ and $AB^2$, are clear.
- Note that each of the six two-factor interactions has only one component that is clear; the other component is aliased with one component of another two-factor interaction. For example, for A × B, $AB^2$ is clear but $AB$ is aliased with $CD^2$.
- A main effect or two-factor interaction component is said to be strongly clear if it is not aliased with any other main effects, two-factor or three-factor interaction components. A two-factor interaction is said to be strongly clear if both of its components are strongly clear.

### A $3^{5-2}$ Design

- 5 factors, 27 runs.
- The one-ninth fraction is defined by $I = ABD^2 = AB^2CE^2$, from which two additional relations can be obtained:

$$
I = (ABD^2)(AB^2CE^2) = A^2CD^2E^2 = AC^2DE
$$

and

$$
I = (ABD^2)(AB^2CE^2)^2 = B^2C^2D^2E = BCDE^2.
$$

Therefore the defining contrast subgroup for this design consists of the following defining relation:

$$
I = ABD^2 = AB^2CE^2 = AC^2DE = BCDE^2. \tag{3}
$$

### Resolution and Minimum Aberration

- Let $A_i$ denote the number of words of length $i$ in the subgroup and $W = (A_3, A_4, \ldots)$ denote the wordlength pattern.
- Based on $W$, the definitions of resolution and minimum aberration are the same as given before in Section 4.2.
- The subgroup defined in (3) has four words, whose lengths are 3, 4, 4 and 4, and hence $W = (1, 3, 0)$. Another $3^{5-2}$ design, given by $D = AB$, $E = AB^2$, has the defining contrast subgroup

$$
I = ABD^2 = AB^2E^2 = ADE = BDE^2,
$$

with the wordlength pattern $W = (4, 0, 0)$. According to the aberration criterion, the first design has less aberration than the second design.
- Moreover, it can be shown that the first design has minimum aberration.

### General $3^{k-p}$ Design

- A $3^{k-p}$ design is a fractional factorial design with $k$ factors in $3^{k-p}$ runs.
- It is a $3^{-p}$th fraction of the $3^k$ design.
- The fractional plan is defined by $p$
independent generators.
- How many factors can a $3^{k-p}$ design study? $(3^n - 1)/2$, where $n = k - p$. This design has $3^n$ runs with the independent generators $x_1, x_2, \ldots, x_n$. We can obtain altogether $(3^n - 1)/2$ orthogonal columns as different combinations of $\sum_{i=1}^{n} \alpha_i x_i$ with $\alpha_i = 0, 1$ or $2$, where at least one $\alpha_i$ should not be zero and the first nonzero $\alpha_i$ should be written as 1 to avoid duplication.
- For $n = 3$, the $(3^n - 1)/2 = 13$ columns were given in Table 5.5 of the WH book.
- A general algebraic treatment of $3^{k-p}$ designs can be found in Kempthorne (1952).

### Simple Analysis Methods: Plots and ANOVA

- Start with making a main effects plot and interaction plots to see what effects might be important.
- This step can be followed by a formal analysis like analysis of variance and half-normal plots.

The strength data will be considered first. The location main effect and interaction plots are given in Figures 1 and 2. The main effects plot suggests that factor A is the most important, followed by factors C and D. The interaction plots in Figure 2 suggest that there may be interactions because the lines are not parallel.

Figure 1: Main Effects Plot of Strength Location, Seat-Belt Experiment (factors A, B, C, D, each at levels 0, 1, 2)

Figure 2: Interaction Plots of Strength Location, Seat-Belt Experiment (vertical axes from 4500 to 7000)

### ANOVA Table for Strength Location

| Source | Degrees of Freedom | Sum of Squares | Mean Squares | F | p-Value |
|---|---|---|---|---|---|
| $A$ | 2 | 34621746 | 17310873 | 85.58 | 0.000 |
| $B$ | 2 | 938539 | 469270 | 2.32 | 0.108 |
| $AB = CD^2$ | 2 | 2727451 | 1363725 | 6.74 | 0.002 |
| $AB^2$ | 2 | 570795 | 285397 | 1.41 | 0.253 |
| $C$ | 2 | 9549481 | 4774741 | 23.61 | 0.000 |
| $AC = BD^2$ | 2 | 2985591 | 1492796 | 7.38 | 0.001 |
| $AC^2$ | 2 | 886587 | 443294 | 2.19 | 0.122 |
| $BC = AD^2$ | 2 | 427214 | 213607 | 1.06 | 0.355 |
| $BC^2$ | 2 | 21134 | 10567 | 0.05 | 0.949 |
| $D$ | 2 | 4492927 | 2246464 | 11.11 | 0.000 |
| $AD$ | 2 | 263016 | 131508 | 0.65 | 0.526 |
| $BD$ | 2 | 205537 | 102768 | 0.51 | 0.605 |
| $CD$ | 2 | 245439 | 122720 | 0.61 | 0.549 |
| residual | 54 | 10922599 | 202270 | | |

### Analysis of Strength Location, Seat-Belt Experiment

- In the aliasing relations (2), the 26 degrees of freedom in the experiment were grouped into 13 sets of effects. The corresponding ANOVA table gives the sum of squares for these 13 effects.
- Based on the p-values in the ANOVA table, clearly the factor A, C and D main effects are significant.
- There are also two aliased sets of effects that are significant: $AB = CD^2$ and $AC = BD^2$.
- These findings are consistent with those based on the main effects plot and interaction plots. In particular, the significance of $AB$ and $CD^2$ is supported by the A × B and C × D interaction plots, and the significance of $AC$ and $BD^2$ is supported by the A × C and B × D interaction plots.

## Unit 8: Robust Parameter Design

Source: Chapter 10 (Sections 10.1-10.6, part of Sections 10.7-10.8, and Section 10.10).

- Revisiting two previous experiments.
- Strategies for reducing variation.
- Types of noise factors.
- Variation reduction through robust parameter design.
- Cross array, location-dispersion modeling, response modeling.
- Single arrays vs. cross arrays.
- Signal-to-noise ratios and limitations.

### Robust Parameter Design

- Statistical/engineering method for product/process improvement (G. Taguchi).
- Two types of factors in a system (product/process):
  - control factors: once chosen, values remain fixed;
  - noise factors: hard to control during normal process or usage.
- Robust parameter design (RPD or PD): choose control factor settings to make the response less sensitive (i.e., more robust) to noise variation, exploiting control-by-noise interactions.

### A Robust Design Perspective of Layer-Growth and Leaf Spring Experiments

- The original AT&T layer growth experiment had 8 control factors and 2 noise factors (location and facet). The goal was to achieve uniform thickness around 14.5 µm over the noise factors. See Tables 1 and 2.
- The original leaf spring experiment had 4 control factors and 1 noise factor (quench oil temperature). The quench oil temperature is not controllable; with effort it can be set in two
ranges of values, 130-150 and 150-170. The goal is to achieve uniform free height around 8 inches over the range of quench oil temperature. See Tables 3 and 4.
- Must understand the role of noise factors in achieving robustness.

### Layer Growth Experiment: Factors and Levels

Table 1: Factors and Levels, Layer Growth Experiment

| Control Factor | Level − | Level + |
|---|---|---|
| A. susceptor-rotation method | continuous | oscillating |
| B. code of wafers | 668G4 | 678D4 |
| C. deposition temperature (°C) | 1210 | 1220 |
| D. deposition time | short | long |
| E. arsenic flow rate | 55 | 59 |
| F. hydrochloric acid etch temperature (°C) | 1180 | 1215 |
| G. hydrochloric acid flow rate | 10 | 14 |
| H. nozzle position | 2 | 6 |

| Noise Factor | Levels |
|---|---|
| L. location | bottom, top |
| M. facet | 1, 2, 3, 4 |

### Layer Growth Experiment: Thickness Data

Table 2: Cross Array and Thickness Data, Layer Growth Experiment. The − and + settings of control factors A to H for each row are not legible in this copy; rows are listed in the order in which they appear.

| Row | L-Bottom M-1 | M-2 | M-3 | M-4 | L-Top M-1 | M-2 | M-3 | M-4 |
|---|---|---|---|---|---|---|---|---|
| 1 | 14.2908 | 14.1924 | 14.2714 | 14.1876 | 15.3182 | 15.4279 | 15.2657 | 15.4056 |
| 2 | 14.8030 | 14.7193 | 14.6960 | 14.7635 | 14.9306 | 14.8954 | 14.9210 | 15.1349 |
| 3 | 13.8793 | 13.9213 | 13.8532 | 14.0849 | 14.0121 | 13.9386 | 14.2118 | 14.0789 |
| 4 | 13.4054 | 13.4788 | 13.5878 | 13.5167 | 14.2444 | 14.2573 | 14.3951 | 14.3724 |
| 5 | 14.1736 | 14.0306 | 14.1398 | 14.0796 | 14.1492 | 14.1654 | 14.1487 | 14.2765 |
| 6 | 13.2539 | 13.3338 | 13.1920 | 13.4430 | 14.2204 | 14.3028 | 14.2689 | 14.4104 |
| 7 | 14.0623 | 14.0888 | 14.1766 | 14.0528 | 15.2969 | 15.5209 | 15.4200 | 15.2077 |
| 8 | 14.3068 | 14.4055 | 14.6780 | 14.5811 | 15.0100 | 15.0618 | 15.5724 | 15.4668 |
| 9 | 13.7259 | 13.2934 | 12.6502 | 13.2666 | 14.9039 | 14.7952 | 14.1886 | 14.6254 |
| 10 | 13.8953 | 14.5597 | 14.4492 | 13.7064 | 13.7546 | 14.3229 | 14.2224 | 13.8209 |
| 11 | 14.2201 | 14.3974 | 15.2757 | 15.0363 | 14.1936 | 14.4295 | 15.5537 | 15.2200 |
| 12 | 13.5228 | 13.5828 | 14.2822 | 13.8449 | 14.5640 | 14.4670 | 15.2293 | 15.1099 |
| 13 | 14.5335 | 14.2492 | 14.6701 | 15.2799 | 14.7437 | 14.1827 | 14.9695 | 15.5484 |
| 14 | 14.5676 | 14.0310 | 13.7099 | 14.6375 | 15.8717 | 15.2239 | 14.9700 | 16.0001 |
| 15 | 12.9012 | 12.7071 | 13.1484 | 13.8940 | 14.2537 | 13.8368 | 14.1332 | 15.1681 |
| 16 | 13.9532 | 14.0830 | 14.1119 | 13.5963 | 13.8136 | 14.0745 | 14.4313 | 13.6862 |

### Leaf Spring Experiment

Table 3: Factors and Levels, Leaf Spring Experiment

| Control Factor | Level − | Level + |
|---|---|---|
| B. high heat temperature (°F) | 1840 | 1880 |
| C. heating time (seconds) | 23 | 25 |
| D. transfer time (seconds) | 10 | 12 |
| E. hold down time (seconds) | 2 | 3 |

| Noise Factor | Level − | Level + |
|---|---|---|
| Q. quench oil temperature (°F) | 130-150 | 150-170 |

Table 4: Cross Array and Height Data, Leaf Spring Experiment. The − and + settings of control factors B, C, D, E for each row are not legible in this copy; rows are listed in the order in which they appear.

| Row | Q− | | | Q+ | | |
|---|---|---|---|---|---|---|
| 1 | 7.78 | 7.78 | 7.81 | 7.50 | 7.25 | 7.12 |
| 2 | 8.15 | 8.18 | 7.88 | 7.88 | 7.88 | 7.44 |
| 3 | 7.50 | 7.56 | 7.50 | 7.50 | 7.56 | 7.50 |
| 4 | 7.59 | 7.56 | 7.75 | 7.63 | 7.75 | 7.56 |
| 5 | 7.94 | 8.00 | 7.88 | 7.32 | 7.44 | 7.44 |
| 6 | 7.69 | 8.09 | 8.06 | 7.56 | 7.69 | 7.62 |
| 7 | 7.56 | 7.62 | 7.44 | 7.18 | 7.18 | 7.25 |
| 8 | 7.56 | 7.81 | 7.69 | 7.81 | 7.50 | 7.59 |

### Strategies for Variation Reduction

1. Sampling inspection: passive, sometimes the last resort.
2. Control charting and process monitoring: can remove special causes. If the process is stable, it can be followed by a designed experiment.
3. Blocking, covariate adjustment: passive measures, but useful in reducing variability; not for removing root causes.
4. Reducing variation in noise factors: effective, as it may reduce variation in the response, but can be expensive.
5. A better approach is to change control factor settings (cheaper and easier to do) by exploiting control-by-noise interactions, i.e., use robust parameter design.

### Types of Noise Factors

1. Variation in process parameters.
2. Variation in product parameters.
3. Environmental variation.
4. Load factors.
5. Upstream variation.
6. Downstream or user conditions.
7. Unit-to-unit and spatial variation.
8. Variation over time.
9. Degradation.

Traditional design uses 7 and 8.

### Variation Reduction Through RPD

- Suppose $y = f(x, z)$, where $x$ denotes the control factors and $z$ the noise factors. If $x$ and $z$ interact in their effects on $y$, then $\mathrm{var}(y)$ can be reduced either by reducing $\mathrm{var}(z)$ (i.e., method 4 in the list of strategies for variation reduction) or by changing the $x$ values (i.e., RPD).
- An example:

$$
y = \mu + \alpha x_1 + \beta z + \gamma x_2 z + \varepsilon = \mu + \alpha x_1 + (\beta + \gamma x_2) z + \varepsilon.
$$

By choosing an appropriate value of $x_2$ to reduce the coefficient $\beta + \gamma x_2$, the impact of $z$ on $y$ can be reduced. Since $\beta$ and $\gamma$ are unknown, this can be achieved by using the control-by-noise interaction plots or other methods to be presented later.

### Exploitation of Nonlinearity

- Nonlinearity between $y$ and $x$ can be exploited for robustness if $x_0$, the nominal values of $x$, are control
factors and deviations of $x$ around $x_0$ are viewed as noise factors (called internal noise). Expand $y = f(x)$ around $x_0$:

$$
y \approx f(x_0) + \sum_i \left.\frac{\partial f}{\partial x_i}\right|_{x_{i0}} (x_i - x_{i0}). \tag{1}
$$

This leads to

$$
\sigma^2 \approx \sum_i \left(\left.\frac{\partial f}{\partial x_i}\right|_{x_{i0}}\right)^2 \sigma_i^2,
$$

where $\sigma^2 = \mathrm{var}(y)$, $\sigma_i^2 = \mathrm{var}(x_i)$, and each component $x_i$ has mean $x_{i0}$ and variance $\sigma_i^2$.
- From (1), it can be seen that $\sigma^2$ can be reduced by choosing $x_{i0}$ with a smaller slope $\left.\partial f/\partial x_i\right|_{x_{i0}}$. This is demonstrated in Figure 1. Moving the nominal value $a$ to $b$ can reduce $\mathrm{var}(y)$ because the slope at $b$ is flatter. This is a parameter design step. On the other hand, reducing the variation of $x$ around $a$ can also reduce $\mathrm{var}(y)$. This is a tolerance design step.

Figure 1: Exploiting the Nonlinearity of $f(x)$ to Reduce Variation (horizontal axis: $x$, design parameter; vertical axis: $y = f(x)$, response)

### Cross Array and Location-Dispersion Modeling

- Cross array = control array × noise array, where the control array is the array for control factors and the noise array is the array for noise factors.
- Location-dispersion modeling:
  - compute $\bar y_i$ and $s_i^2$ based on the noise settings for the $i$th control setting;
  - analyze $\bar y_i$ (location) and $\ln s_i^2$ (dispersion), and identify significant location and dispersion effects.

### Two-step Procedures for Parameter Design Optimization

- Two-step procedure for nominal-the-best problems (2):
  1. Select the levels of the dispersion factors to minimize dispersion.
  2. Select the level of the adjustment factor to bring the location on target.
- Two-step procedure for larger-the-better and smaller-the-better problems (3):
  1. Select the levels of the location factors to maximize (or minimize) the location.
  2. Select the levels of the dispersion factors that are not location factors to minimize dispersion.

Note that the two steps in (3) are in reverse order from those in (2). Reason: it is usually harder to increase or decrease the response $y$ in the latter problem, so this step should be the first to perform.

### Analysis of Layer Growth Experiment

- From the $\bar y_i$ and $\ln s_i^2$ columns of Table 5, compute the factorial effects for location and dispersion respectively. (These numbers are not given in the book.) From
the half-normal plots of these effects (Figure 2), D is significant for location and H, A for dispersion:

$$
\hat y = 14.352 + 0.402 x_D, \qquad \hat z = -1.822 + 0.619 x_A - 0.982 x_H,
$$

where $\hat z$ is the fitted value of $\ln s^2$.
- Two-step procedure:
  1. Choose A at the − level (continuous rotation) and H at the + level (nozzle position 6).
  2. By solving $\hat y = 14.352 + 0.402 x_D = 14.5$, choose $x_D = 0.368$.

### Layer Growth Experiment: Analysis Results

Table 5: Means, Log Variances and SN Ratios, Layer Growth Experiment. The − and + settings of control factors A to H are not legible in this copy; rows follow the order of Table 2.

| Row | $\bar y_i$ | $\ln s_i^2$ | $\ln \bar y_i^2$ | $\hat\eta_i$ |
|---|---|---|---|---|
| 1 | 14.79 | −1.018 | 5.389 | 6.41 |
| 2 | 14.86 | −3.879 | 5.397 | 9.28 |
| 3 | 14.00 | −4.205 | 5.278 | 9.48 |
| 4 | 13.91 | −1.623 | 5.265 | 6.89 |
| 5 | 14.15 | −5.306 | 5.299 | 10.60 |
| 6 | 13.80 | −1.236 | 5.250 | 6.49 |
| 7 | 14.73 | −0.760 | 5.380 | 6.14 |
| 8 | 14.89 | −1.503 | 5.401 | 6.90 |
| 9 | 13.93 | −0.383 | 5.268 | 5.65 |
| 10 | 14.09 | −2.180 | 5.291 | 7.47 |
| 11 | 14.79 | −1.238 | 5.388 | 6.63 |
| 12 | 14.33 | −0.868 | 5.324 | 6.19 |
| 13 | 14.77 | −1.483 | 5.386 | 6.87 |
| 14 | 14.88 | −0.418 | 5.400 | 5.82 |
| 15 | 13.76 | −0.418 | 5.243 | 5.66 |
| 16 | 13.97 | −2.636 | 5.274 | 7.91 |

### Layer Growth Experiment: Plots

Figure 2: Half-Normal Plots of Location and Dispersion Effects, Layer Growth Experiment (panels: location, dispersion; horizontal axis: half-normal quantiles)

### Analysis of Leaf Spring Experiment

- Based on the half-normal plots in Figure 3, B, C and E are significant for location, and C is significant for dispersion:

$$
\hat y = 7.6360 + 0.1106 x_B + 0.0881 x_C + 0.0519 x_E, \qquad \hat z = -3.6886 + 1.0901 x_C,
$$

where $\hat z$ is the fitted value of $\ln s^2$.
- Two-step procedure:
  1. Choose C at the − level.
  2. With $x_C = -1$, $\hat y = 7.5479 + 0.1106 x_B + 0.0519 x_E$. To achieve $\hat y = 8.0$, $x_B$ and $x_E$ must be chosen beyond $+1$, i.e., $x_B = x_E = 2.78$. This is too drastic and not validated by current data.
- An alternative is to select $x_B = x_E = x_C = +1$ (not following the two-step procedure); then $\hat y = 7.89$ is closer to 8. (Note that $\hat y = 7.71$ with $B_+ C_- E_+$.)
- Reason for the breakdown of the two-step procedure: its second step cannot achieve the target 8.0.

### Leaf Spring Experiment: Analysis Results

Table 6: Means and Log Variances, Leaf Spring Experiment. The − and + settings of control factors B, C, D, E are not legible in this copy; rows follow the order of Table 4.

| Row | $\bar y_i$ | $\ln s_i^2$ |
|---|---|---|
| 1 | 7.540 | −2.4075 |
| 2 | 7.902 | −2.6488 |
| 3 | 7.520 | −6.9486 |
| 4 | 7.640 | −4.8384 |
| 5 | 7.670 | −2.3987 |
| 6 | 7.785 | −2.9392 |
| 7 | 7.372 | −3.2697 |
| 8 | 7.660 | −4.0582 |

### Leaf Spring Experiment: Plots

Figure 3: Half-Normal Plots of Location and Dispersion Effects, Leaf Spring Experiment (panels: location, dispersion; horizontal axis: half-normal quantiles)

### Response Modeling and Control-by-Noise Interaction Plots

- Response model: model $y_{ij}$ directly in terms of control effects, noise effects and control-by-noise interactions:
  - half-normal plot of various effects;
  - regression model fitting, obtaining $\hat y$.
- Make control-by-noise interaction plots for significant effects in $\hat y$; choose robust control settings at which $y$ has a flatter relationship with noise.
- Compute $\mathrm{Var}(\hat y)$ with respect to variation in the noise factors. Call $\mathrm{Var}(\hat y)$ the transmitted variance model. Use it to identify control factor settings with small transmitted variance.

### Response Modeling: Layer Growth Experiment

- Define

$$
M_l = (M_1 + M_2) - (M_3 + M_4), \quad M_q = (M_1 + M_4) - (M_2 + M_3), \quad M_c = (M_1 + M_3) - (M_2 + M_4).
$$

- From Figure 4, select D, L, HL and the cluster of the next four effects: $M_l$, H, $CM_l$, $AHM_q$.
- The following model is obtained:

$$
\hat y = 14.352 + 0.402 x_D + 0.087 x_H + 0.330 x_L - 0.090 x_{M_l} - 0.239 x_H x_L - 0.083 x_C x_{M_l} - 0.082 x_A x_H x_{M_q}.
$$

- Recommendations: H, position 2 to position 6; A, oscillating to continuous; C, 1210 to 1220; resulting in a 37% reduction of thickness standard deviation.

Figure 4: Half-Normal Plot of Response Model Effects, Layer Growth Experiment (vertical axis: absolute effects; horizontal axis: half-normal quantiles)

### Control-by-Noise Interaction Plots

Figure 5: H × L and C × M Interaction Plots, Layer Growth Experiment

Figure 6: A × H × M Interaction Plot, Layer Growth Experiment

### Predicted Variance Model

- Assume $L$, $M_l$ and $M_q$ are random variables taking $-1$ and $1$ with equal probabilities. This leads to

$$
\begin{aligned}
&x_L^2 = x_{M_l}^2 = x_{M_q}^2 = x_A^2 = x_H^2 = 1,\\
&E(x_L) = E(x_{M_l}) = E(x_{M_q}) = 0,\\
&\mathrm{Cov}(x_L, x_{M_l}) = \mathrm{Cov}(x_L, x_{M_q}) = \mathrm{Cov}(x_{M_l}, x_{M_q}) = 0.
\end{aligned}
\tag{4}
$$

- From (4), we have

$$
\begin{aligned}
\mathrm{Var}(\hat y) &= (0.330 - 0.239 x_H)^2\,\mathrm{Var}(x_L) + (0.090 + 0.083 x_C)^2\,\mathrm{Var}(x_{M_l}) + (0.082 x_A x_H)^2\,\mathrm{Var}(x_{M_q})\\
&= \text{constant} + (0.330 - 0.239 x_H)^2 + (0.090 + 0.083 x_C)^2\\
&= \text{constant} - 2(0.330)(0.239) x_H + 2(0.090)(0.083) x_C\\
&= \text{constant} - 0.158 x_H + 0.015 x_C.
\end{aligned}
$$

- Choose $H_+$ and $C_-$. But factor A is not present here. (Why? See explanation on p. 456.)

### Estimation Capacity for Cross Arrays

- Example 1. The control array is a $2^{3-1}$ design with $I = ABC$ and the noise array is a $2^{3-1}$ design with $I = abc$. The resulting cross array is a 16-run $2^{6-2}$ design with $I = ABC = abc = ABCabc$. It is easy to show that all 9 control-by-noise interactions are clear (but not the 6 main effects). This is indeed a general result, stated next.
- Theorem: Suppose a $2^{k-p}$ design $d_C$ is chosen for the control array, a $2^{m-q}$ design $d_N$ is chosen for the noise array, and a cross array, denoted by $d_C \otimes d_N$, is constructed from $d_C$ and $d_N$.
  1. If $\alpha_1, \ldots, \alpha_A$ are the estimable factorial effects among the control factors in $d_C$ and $\beta_1, \ldots, \beta_B$ are the estimable factorial effects among the noise factors in $d_N$, then $\alpha_i$, $\beta_j$, $\alpha_i\beta_j$ for $i = 1, \ldots, A$, $j = 1, \ldots, B$ are estimable in $d_C \otimes d_N$.
  2. All the $km$ control-by-noise interactions (i.e., two-factor interactions between a control factor main effect and a noise factor main effect) are clear in $d_C \otimes d_N$.

### Cross Arrays or Single Arrays?

- Three control factors A, B, C and two noise factors a, b: a $2^3 \times 2^2$ design, allowing all main effects and two-factor interactions to be clearly estimated.
- Use a single array with 16 runs for all five factors: a resolution V $2^{5-1}$ design with $I = ABCab$ or $I = -ABCab$; all main effects and two-factor interactions are clear. See Table 7.
- Single arrays can have smaller run sizes, but cross arrays are easier to use and interpret.

### 32-run Cross Array and 16-run Single Arrays

Table 7: 32-Run Cross Array (rows: runs 1-4, 5-8, 9-12, 13-16, 17-20, 21-24, 25-28, 29-32, one group for each setting of A, B, C; columns: the settings of a, b). Filled and open circles mark the runs of the two 16-run single arrays, $I = ABCab$ and $I = -ABCab$.

### Comparison of Cross Arrays and Single Arrays

- Example 1 (continued). An alternative is to choose a single array $2^{6-2}$ design with $I = ABCa = ABbc = abcC$. This is not advisable because no two-factor interactions are clear and only main effects are clear. (Why? We need to have some clear control-by-noise interactions for robust optimization.) A better one is to use a $2^{6-2}$ design with $I = ABCa = abc = ABCbc$. It has 9 clear effects:
ABCAbAcBbBcCbCc 3 control main effects and 6 controlbynoise interactions 29 SignaltoNoise Ratio Taguchi s SN ratio In Twostep procedure 1 Select control factor levels to maximize SN ratio 2 Use an adjustment factor to move mean on target Limitations maximizing 72 not always desired little justi cation outside linear circuitry statistically justi able only when Vary is proportional to E y2 Recommendation Use SN ratio sparingly Better to use the locationdispersion modeling or the response modeling The latter strategies can do whatever SN ratio analysis can achieve 30 Halfnormal Plot for SN Ratio Analysis absolute effects 06 08 10 w w 04 i 02 00 00 05 10 15 20 25 half normal quantiles Figure 7 HalfNormal Plots of Effects Based on SN Ratio Layer Growth Exper imth 31 SN Ratio Analysis for Layer Growth Experiment 0 Based on the i column in Table 5 compute the factorial effects using SN ratio From Figure 7 the conclusion is similar to locationdispersion analysis Why Using i lny2 lnsl2 and from Table 5 the variation among ln s is much larger than the variation among 1ny2 thus maximizing SN ratio is equivalent to minimizing lnsl2 in this case 32 106 CHAPTER 3 EXPERIMENTS WITH MORE THAN ONE FACTOR Table 3221 Multiple Comparison t Statistics Tire Experiment i si si s i vs 02 4 71 6 22 4 44 5 95 152 where k 3t 4 and A 2 The multiple comparison t statistics according to 343 are given in Table 322 With t 4 and tribitJrl 4374741 5 the 005 critical value for the Tukey method is i 7 522 q45005 7 71 14 By comparing 369 with the t values in Table 322 we conclude that at the 005 level compounds A and B are different from C and D with C and D wearing more than A and B 369 38 SplitPlot Design Kowalski and Potcner 2003 and Potcner and Kowalski 2004 described an experiment involving the water resistant property of wood Type of wood pre treatment A and type of stain B are the two experimental factors Two types of pretreatment and four types of stain are considered Thus the experi ment 
consists of two factors, one at two levels and the other at four levels. An experimental unit is a wood panel to which a specific type of pretreatment and a specific type of stain will be applied. In order to conduct this experiment in a completely randomized manner, one would need eight wood panels for each replicate. Each of these panels has to be assigned randomly to a particular combination of pretreatment and stain (see Table 3.23).

Table 3.23: Completely Randomized Version of the Wood Experiment

| Run | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Pretreatment (A) | A1 | A2 | A2 | A1 | A2 | A1 | A1 | A2 |
| Stain (B) | B2 | B4 | B1 | B1 | B3 | B4 | B3 | B2 |

Since it is rather inconvenient to apply the pretreatment to a small wood panel, a much more pragmatic approach is to select two large panels and apply the pretreatment A1 to one panel and A2 to the other. Next, each panel can be cut into four smaller pieces, and different stain types (B1, B2, B3, or B4) may be applied to them. Note that the assignment of pretreatment and stain type is done randomly (see Table 3.24). This type of experimental design is called a split-plot design, which has its origin in agricultural experiments, where experimental units are usually plots of land. Owing to practical limitations, some factors need to be applied to larger plots as compared to other factors. For example, if the type of irrigation method and the type of fertilizer used are two factors, then the former requires a larger plot as compared to the latter. One has to choose a large plot (called a main plot or a whole plot), apply a specific type of irrigation to it, split it up into smaller plots (called split-plots or subplots), and apply different fertilizers to these subplots. Note that this is similar to the wood experiment described earlier.

Table 3.24: Split-Plot Version of the Wood Experiment

| First panel: pretreated with A1 | Second panel: pretreated with A2 |
|---|---|
| B3, B2, B4, B1 | B2, B1, B4, B3 |

Here a large wood panel to which a particular pretreatment type is applied is a whole plot. The four small wood panels cut from it for application
of four different stain types are the subplots. Factors associated with whole plots and subplots are called whole plot factors and subplot factors, respectively. Thus in the wood experiment, factor A is the whole plot factor and factor B is the subplot factor.

Despite its agricultural basis, the split-plot structure occurs commonly in other scientific and engineering investigations. Besides the wood experiment, additional examples will be given in later chapters. However, the term "plot" may be inappropriate outside agriculture. Terms like whole unit, subunits, and split-unit experiments are sometimes used when such terminology better represents the experimental situations.

Three replications of the wood experiment were conducted, and the data are shown in Table 3.25. The first column shows the whole plot number, i.e., the number of the large piece of wood to which a specific pretreatment was applied before splitting it into four smaller pieces. Each whole plot number corresponds to either pretreatment 1 or 2. The entries within column "Stain" are nested within column "Pretreatment." That is, there are four stain types (1, 2, 3, 4, in random order) for each whole plot number (i.e., each large panel). Each of the three replicates (column "Rep") represents two large panels for pretreatment 1 and 2. For example, Rep 1 has two panels corresponding to whole plot no. 1 and 4. Altogether there are 24 (= 2 × 4 × 3) observations corresponding to all combinations of the three columns "Pretreatment," "Stain," and "Rep." The last column gives the response y, which is the water resistant property of wood.

Table 3.25: Water Resistance Data, Wood Experiment

| Whole Plot | Pretreatment (A) | Rep | Resistance (four stain types, in run order) |
|---|---|---|---|
| 4 | 2 | 1 | 53.5, 32.5, 46.6, 35.4 |
| 5 | 2 | 2 | 44.6, 52.2, 45.9, 48.3 |
| 1 | 1 | 1 | 40.8, 43.0, 51.8, 45.5 |
| 2 | 1 | 2 | 60.9, 55.3, 51.1, 57.4 |
| 6 | 2 | 3 | 32.1, 30.1, 34.4, 32.2 |
| 3 | 1 | 3 | 52.8, 51.7, 55.3, 59.2 |

An incorrect model and analysis

If the split-plot structure of the experiment is not known, an analyst would treat it as a two-way layout described in Section 3.3. Denoting by $y_{ijk}$ the kth observation corresponding to level i of A and level j of B, the following linear model (same as model (3.13)) for a two-way layout would be used:

$$y_{ijk} = \eta + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}, \quad i = 1, \ldots, I;\ j = 1, \ldots, J;\ k = 1, \ldots, n, \tag{3.47}$$

where $y_{ijk}$ is the observation for the kth replicate of the ith level of factor A and the jth level of factor B, $\alpha_i$ is the ith main effect for A, $\beta_j$ is the jth main effect for B, $(\alpha\beta)_{ij}$ is the (i, j)th interaction effect between A and B, and the $\epsilon_{ijk}$ are independent errors distributed as $N(0, \sigma^2)$. Proceeding exactly as described in Section 3.3, the ANOVA table obtained from this two-way layout is shown in Table 3.26. The F value 13.49 for factor A is much larger than the F values 1.53 and 0.36 for B and A × B. Thus the analysis declares factor A (pretreatment) as highly significant, while neither B (stain) nor the A × B interaction is significant.

Table 3.26: Incorrect Analysis Using ANOVA, Wood Experiment

| Source | Degrees of Freedom | Sum of Squares | Mean Squares | F |
|---|---|---|---|---|
| A | 1 | 782.04 | 782.04 | 13.49 |
| B | 3 | 266.00 | 88.67 | 1.53 |
| A × B | 3 | 62.79 | 20.93 | 0.36 |
| Residual | 16 | 927.88 | 57.99 | |
| Total | 23 | 2038.72 | | |

Note that the above analysis does not take into consideration that the two factors A and B use different randomization schemes and that the numbers of replicates are not the same for each factor. For the subplot factor B, restricted randomization is applied because the levels of B are assigned in random order to the subplots within a given whole plot. For the whole plot factor A, complete randomization can usually be applied in assigning the levels of A to the whole plots. Therefore the error in the underlying model should consist of two parts: the whole plot error and the subplot error. In order to test the significance of factors A and B, one needs to compare their respective mean squares with the whole plot error component and the subplot error component, respectively. This discussion leads to the following modeling and
analysis.

The correct model and analysis

Because the data in Table 3.25 can be viewed as a three-way layout with factors A, B, and Rep, we first consider the following model:

$$y_{ijk} = \eta + \tau_k + \alpha_i + (\tau\alpha)_{ki} + \beta_j + (\alpha\beta)_{ij} + (\tau\beta)_{kj} + (\tau\alpha\beta)_{kij} + \epsilon'_{ijk}, \quad i = 1, \ldots, I;\ j = 1, \ldots, J;\ k = 1, \ldots, n, \tag{3.48}$$

where $y_{ijk}$ is the observation for the kth replicate of the ith level of factor A and the jth level of factor B, $\tau_k$ is the effect of the kth replicate, $\alpha_i$ is the ith main effect for A, $(\tau\alpha)_{ki}$ is the (k, i)th interaction effect between replicate and A, $\beta_j$ is the jth main effect for B, etc., and $\epsilon'_{ijk}$ is the error term with the $N(0, \sigma^2)$ distribution. This model is the same as the three-way layout model (3.22) except for the slightly different notation.

The replication factor $\tau$ is treated as a random effect here because there can potentially be many replications of the wood experiment, and the three replications actually conducted are viewed as a random sample. (Refer to the discussion on random effects in Section 2.5.) The factors A and B are treated as fixed effects. Therefore model (3.48) can be viewed as a mixed effects model. Out of the nine terms in (3.48), the four terms associated with $\tau$, and $\epsilon'_{ijk}$, are random effect terms, while the others are fixed effect terms. Terms involving only $\tau$ and $\alpha$ are the whole plot effect terms. Thus we use the term $(\tau\alpha)_{ki}$ as the whole plot error for assessing the significance of effects associated with $\tau_k$ and $\alpha_i$, and the term

$$\epsilon_{kij} = (\tau\beta)_{kj} + (\tau\alpha\beta)_{kij} + \epsilon'_{ijk} \tag{3.49}$$

as the subplot error. In most practical situations, the subplot error is smaller than the whole plot error because the subplots tend to be more homogeneous than the whole plots. This will be observed in the data analysis given below. Because the subplot treatments can be compared with higher precision, it is advised to assign factors of greater importance or interest to the subplots, if practically possible. Thus model (3.48) can be rewritten as

$$y_{ijk} = \eta + \tau_k + \alpha_i + (\tau\alpha)_{ki} + \beta_j + (\alpha\beta)_{ij} + \epsilon_{kij}, \quad i = 1, \ldots, I;\ j = 1, \ldots, J;\ k = 1, \ldots, n, \tag{3.50}$$

where $\alpha_i$, $\beta_j$, and $(\alpha\beta)_{ij}$ are fixed effects, while $\tau_k$, $(\tau\alpha)_{ki}$, and $\epsilon_{kij}$ are random
effects. Assume that $\tau_k \sim N(0, \sigma_\tau^2)$, $(\tau\alpha)_{ki} \sim N(0, \sigma_{\tau\alpha}^2)$, and $\epsilon_{kij} \sim N(0, \sigma_\epsilon^2)$. The following null hypotheses are of interest:

$$H_{01}: \alpha_1 = \cdots = \alpha_I,$$
$$H_{02}: \beta_1 = \cdots = \beta_J,$$
$$H_{03}: (\alpha\beta)_{11} = \cdots = (\alpha\beta)_{1J} = (\alpha\beta)_{21} = \cdots = (\alpha\beta)_{IJ}.$$

To simplify the notation, we can use the zero-sum constraints

$$\sum_{i=1}^{I} \alpha_i = 0, \quad \sum_{j=1}^{J} \beta_j = 0, \quad \sum_{i=1}^{I} (\alpha\beta)_{ij} = 0, \quad \sum_{j=1}^{J} (\alpha\beta)_{ij} = 0, \tag{3.51}$$

under which these three hypotheses can be expressed more simply as

$$H_{01}: \alpha_i = 0, \quad i = 1, \ldots, I, \tag{3.52}$$
$$H_{02}: \beta_j = 0, \quad j = 1, \ldots, J, \tag{3.53}$$
$$H_{03}: (\alpha\beta)_{ij} = 0, \quad i = 1, \ldots, I;\ j = 1, \ldots, J. \tag{3.54}$$

The equivalence between the two sets of hypotheses can be seen as follows: under (3.51), $\bar{\alpha} = 0$, and thus $\alpha_1 = \cdots = \alpha_I$ is equivalent to $\alpha_1 = \cdots = \alpha_I = 0$. The other two cases can be shown similarly. For notational simplicity, we will adopt the hypotheses in (3.52)-(3.54) for the rest of this section.

To test the above hypotheses, we need to break up the total sum of squares $SS_T$ in y into different components. As observed before (3.48), this model can be viewed, at least algebraically, as a three-way layout with factors A, B, and Rep. Thus (3.48) is the same as model (3.22) with the omission of the last subscript $l$. Therefore the ANOVA decomposition follows from the corresponding one in Section 3.4:

$$SS_T = SS_{rep} + SS_A + SS_B + SS_{rep \times A} + SS_{A \times B} + SS_{rep \times B} + SS_{rep \times A \times B}, \tag{3.55}$$

where $SS_T$ is the same as the last entry in Table 3.12, and the seven terms on the right side of (3.55) are the same as the first seven entries in Table 3.12. In using the formulas from Table 3.12 for the current situation, the last subscript $l$ in these expressions should be dropped. Next, the terms in (3.55) should be grouped into those for fixed effects, for whole plot error, and for subplot error. Noting the previous definitions for the whole plot and subplot errors, define the sum of squares for the whole plot, $SS_{whole}$, and the sum of squares for the subplot, $SS_{sub}$, as

$$SS_{whole} = SS_{rep \times A}, \tag{3.56}$$
$$SS_{sub} = SS_{rep \times B} + SS_{rep \times A \times B}. \tag{3.57}$$

From (3.55)-(3.57), we have the following ANOVA decomposition for the split-plot model in (3.49):

$$SS_T = SS_{rep} + SS_A + SS_{whole} + SS_B + SS_{A \times B} + SS_{sub}. \tag{3.58}$$

We can use $SS_{whole}$ to test the significance of A, and $SS_{sub}$ to test the significance of B and A × B. The residual sum of
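squares in Table 3.26 is the sum of $SS_{rep}$, $SS_{whole}$, and $SS_{sub}$, which explains why the analysis there is incorrect.

Table 3.27: Expected Mean Squares and Degrees of Freedom for Model (3.50)

| Source | Effect | Degrees of Freedom | E(Mean Squares) |
|---|---|---|---|
| Replicate | $\tau_k$ | $n - 1$ | $\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2 + IJ\sigma_\tau^2$ |
| A | $\alpha_i$ | $I - 1$ | $\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2 + \frac{nJ}{I-1}\sum_{i=1}^{I}\alpha_i^2$ |
| Whole plot error | $(\tau\alpha)_{ki}$ | $(I - 1)(n - 1)$ | $\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2$ |
| B | $\beta_j$ | $J - 1$ | $\sigma_\epsilon^2 + \frac{nI}{J-1}\sum_{j=1}^{J}\beta_j^2$ |
| A × B | $(\alpha\beta)_{ij}$ | $(I - 1)(J - 1)$ | $\sigma_\epsilon^2 + \frac{n}{(I-1)(J-1)}\sum_{i=1}^{I}\sum_{j=1}^{J}(\alpha\beta)_{ij}^2$ |
| Subplot error | $\epsilon_{kij}$ | $I(J - 1)(n - 1)$ | $\sigma_\epsilon^2$ |
| Total | | $IJn - 1$ | |

To test the hypotheses $H_{01}$, $H_{02}$, and $H_{03}$ in (3.52)-(3.54), we use respectively the following F statistics:

$$F_A = \frac{MS_A}{MS_{whole}}, \tag{3.59}$$
$$F_B = \frac{MS_B}{MS_{sub}}, \tag{3.60}$$
$$F_{A \times B} = \frac{MS_{A \times B}}{MS_{sub}}, \tag{3.61}$$

where $MS_A$, $MS_B$, $MS_{A \times B}$, $MS_{whole}$, and $MS_{sub}$ are the mean squares obtained by dividing $SS_A$, $SS_B$, $SS_{A \times B}$, $SS_{whole}$, and $SS_{sub}$ by their respective degrees of freedom, which are given in Table 3.27. An intuitive justification for using these test statistics can be given from the expressions of the expected mean squares in Table 3.27. (Derivations of the expected mean squares can be done in a similar but more tedious manner than those in Section 2.4.) Clearly, if $H_{01}: \alpha_i = 0$ holds true, both $E(MS_A)$ and $E(MS_{whole})$ are equal to $\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2$, and thus their ratio should be 1. As the departure from the null hypothesis $H_{01}$ gets larger (i.e., as $\sum_{i=1}^{I}\alpha_i^2$ gets larger), the ratio

$$\frac{E(MS_A)}{E(MS_{whole})} = \frac{\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2 + \frac{nJ}{I-1}\sum_{i=1}^{I}\alpha_i^2}{\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2} = 1 + \frac{nJ\sum_{i=1}^{I}\alpha_i^2}{(I-1)(\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2)}$$

will get larger. This justifies the use of the $F_A$ statistic. Similar justifications can be provided for $F_B$ and $F_{A \times B}$, used to test $H_{02}$ and $H_{03}$.

Table 3.28: Wood Experiment, Compiled Data for Whole Plot Analysis

| | Rep 1 | Rep 2 | Rep 3 | Total |
|---|---|---|---|---|
| A1 | 181.1 | 224.7 | 219.0 | 624.8 |
| A2 | 168.0 | 191.0 | 128.8 | 487.8 |
| Total | 349.1 | 415.7 | 347.8 | 1112.6 |

Let us now turn to the analysis of the wood experiment data. All we need to do is to split up the row corresponding to the residuals in Table 3.26 into two rows: one for whole plot error and the other for subplot error. To obtain the whole plot error, we compile the data as shown in Table 3.28. Each entry in the table represents the sum of four values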
corresponding to four stain types for each combination of pretreatment method and replication number. For example, for A2 (pretreatment method 2) and Rep 1 (replication 1), the sum of the first four numbers in Table 3.25 equals 168.0, given in Table 3.28. The following sums of squares are computed:

$$SS_A = (624.8^2 + 487.8^2)/12 - 1112.6^2/24 = 782.04,$$
$$SS_{rep} = (349.1^2 + 415.7^2 + 347.8^2)/8 - 1112.6^2/24 = 376.99,$$
$$SS_{rep \times A} = (181.1^2 + \cdots + 128.8^2)/4 - 1112.6^2/24 - SS_{rep} - SS_A = 398.37.$$

Note that 24 in the denominator of $1112.6^2$ is equal to 6 × 4, since each of the six entries in Table 3.28 is the sum of four values in Table 3.25. Consequently, from (3.56), $SS_{whole} = SS_{rep \times A} = 398.37$.

Recall from (3.55)-(3.57) that the residual sum of squares in Table 3.26 is the sum of three components: $SS_{rep}$, $SS_{whole}$, and $SS_{sub}$. Therefore $SS_{sub}$ can easily be computed by subtracting $SS_{whole} + SS_{rep}$ from the residual sum of squares given in Table 3.26. Thus

$$SS_{sub} = 927.88 - SS_{whole} - SS_{rep} = 927.88 - 398.37 - 376.99 = 152.52.$$

To split up the 16 degrees of freedom for residuals in Table 3.26, note that

(i) The degrees of freedom for replicates are 3 − 1 = 2.

(ii) The degrees of freedom for whole plot error are the degrees of freedom for the replicate × A interaction, which is 2 × 1 = 2.

(iii) Thus the degrees of freedom for subplot error are 16 − (2 + 2) = 12.

The degrees of freedom for the subplot error can also be obtained by summing up the degrees of freedom of its two components using (3.57).

The correct ANOVA is shown in Table 3.29. The values of $F_A$, $F_B$, and $F_{A \times B}$ are, respectively, 3.93 (= 782.04/199.19), 6.98 (= 88.67/12.71), and 1.65 (= 20.93/12.71).

Table 3.29: Correct ANOVA Table, Wood Experiment

| Source | Degrees of Freedom | Sum of Squares | Mean Squares | F |
|---|---|---|---|---|
| Replicate | 2 | 376.99 | 188.50 | 0.95 |
| A | 1 | 782.04 | 782.04 | 3.93 |
| Whole plot error | 2 | 398.37 | 199.19 | |
| B | 3 | 266.00 | 88.67 | 6.98 |
| A × B | 3 | 62.79 | 20.93 | 1.65 |
| Subplot error | 12 | 152.52 | 12.71 | |
| Total | 23 | 2038.72 | | |

The results of the correct analysis are in sharp contrast with the results obtained from the earlier analysis. The main effect of the whole plot factor A is no longer
significant (with a p value of 0.186), whereas the main effect of the subplot factor B is now seen to be highly significant (with a p value of 0.006). The discrepancy between the two analyses is due to the computation of the error terms. The residual mean squares 57.99 in Table 3.26 is much smaller than the mean squares for whole plot error 199.19 in Table 3.29, but is much larger than the mean squares for subplot error 12.71 in Table 3.29. That 12.71 is much smaller than 199.19 confirms our previous remark (after (3.49)) that the subplot error is in most cases smaller than the whole plot error.

In the wood experiment, the difference among replicates is of little practical importance. In other contexts, this may be an important issue. In general, one can test for this difference using a fourth null hypothesis $H_{04}: \sigma_\tau^2 = 0$. The test statistic for $H_{04}$ is

$$F_{rep} = \frac{MS_{rep}}{MS_{whole}}, \tag{3.62}$$

where $MS_{rep}$ is equal to $SS_{rep}$ divided by the corresponding degrees of freedom given in Table 3.27. From Table 3.27, it is seen that the ratio of the expectations of these two mean squares is $1 + IJ\sigma_\tau^2/(\sigma_\epsilon^2 + J\sigma_{\tau\alpha}^2)$, which is 1 if $H_{04}$ is true and increases beyond 1 as $\sigma_\tau^2$ increases. From Table 3.29, the computed $F_{rep}$ value is 188.50/199.19 = 0.95, which is smaller than 1, indicating that there is ...

Notes for ISyE 6413: Design and Analysis of Experiments

Instructor: C. F. Jeff Wu, School of Industrial and Systems Engineering, Georgia Institute of Technology

Text book: Experiments: Planning, Analysis, and Parameter Design Optimization, by Wu and Hamada (Wiley, 2000).

#### Unit 1: Introduction to DOE and Basic Regression Analysis

Sources: Sections 1.1 to 1.5; additional materials (in these notes) on regression analysis.

- Historical perspectives and basic definitions.
- Planning and implementation of experiments.
- Fisher's fundamental principles.
- Simple linear regression.
- Multiple regression, variable selection.
- Regression diagnostics.

#### Historical perspectives

- Agricultural Experiments: Comparisons and selection of varieties (and/or treatments) in the presence of uncontrollable field
conditions. Fisher's pioneering work on design of experiments and analysis of variance (ANOVA).
- Industrial Era: Process modeling and optimization. Large batches of materials, large equipment. Box's work, motivated by the chemical industries and applicable to other processing industries: regression modeling and response surface methodology.
- Quality Revolution: Quality and productivity improvement, variation reduction, total quality management. Taguchi's work on robust parameter design. Six-sigma movement.
- Many successful applications in manufacturing (cars, electronics, home appliances, etc.).
- Current Trends and Potential New Areas: Computer modelling and experiments, large and complex systems, applications to biotechnology, nanotechnology, material development, etc.

#### Types of Experiments

- Treatment Comparisons: The purpose is to compare several treatments of a factor (e.g., with 4 rice varieties, we would like to see if they differ in terms of yield and drought resistance).
- Variable Screening: There is a large number of factors, but only a few are important. The experiment should identify the important few.
- Response Surface Exploration: After important factors have been identified, their impact on the system is explored; regression model building.
- System Optimization: Interested in determining the optimum conditions (e.g., maximize yield of semiconductor manufacturing or minimize defects).
- System Robustness: Wish to optimize a system and also reduce the impact of uncontrollable (noise) factors (e.g., we would like cars to run well in different road conditions and with different driving habits; an IC fabrication process to work well in different conditions of humidity and dust levels).

#### Some Definitions

- Factor: variable whose influence upon a response variable is being studied in the experiment.
- Factor Level: numerical values or settings for a factor.
- Trial (or run): application of a treatment to an experimental unit.
- Treatment or level combination: set of values for all factors in a
trial.
- Experimental unit: object to which a treatment is applied.
- Randomization: using a chance mechanism to assign treatments to experimental units or to run order.

#### Systematic Approach to Experimentation

- State the objective of the study.
- Choose the response variable; it should correspond to the purpose of the study. Nominal-the-best, larger-the-better, or smaller-the-better.
- Choose factors and levels. Use a flow chart or cause-and-effect diagram (see Figure 1).
- Choose the experimental design (i.e., plan).
- Perform the experiment (use a planning matrix to determine the set of treatments and the order to be run).
- Analyze the data (the design should be selected to meet the objective so that the analysis is efficient and easy).
- Draw conclusions.

#### Cause-and-Effect Diagram

Figure 1: Cause-and-Effect Diagram, Injection Molding Experiment. Branches: MACHINE, MATERIAL, METHOD, MAN; effect: COSMETIC DEFECTS. Causes shown include injection pressure, injection speed, mold temperature, nozzle temperature, pre-blend pigmentation, barrel temperature, humidity, material shot length, melt flow, screw and barrel cleaning, hopper cleaning, lack of training, operator replacements, and operators on shifts.

#### Choice of Response: An Example

- To improve a process that often produces underweight soap bars. The obvious choice of response is y = soap bar weight.
- There are two subprocesses: (i) mixing, which affects soap bar density ($y_1$); (ii) forming, which affects soap bar dimensions ($y_2$).
- Even though y is a function of $y_1$ and $y_2$, it is better to study $y_1$ and $y_2$ separately and identify the factors important for each of the two subprocesses.

#### Fundamental Principles: Replication, Randomization, and Blocking

Replication

- Each treatment is applied to units that are representative of the population (example: measurements of 3 units vs. 3 repeated measurements of 1 unit).
- Replication vs. repetition (i.e., repeated measurements).
- Enables the estimation of experimental error. Use the sample standard deviation.
- Decreases the variance of estimates and increases the power to detect significant differences: for independent $y_i$'s, $\mathrm{Var}\left(\frac{1}{N}\sum_{i=1}^{N} y_i\right) = \frac{1}{N}\mathrm{Var}(y_1)$.

#### Randomization

Use of a chance mechanism
(e.g., random number generators) to assign treatments to units or to run order. It has the following advantages:

- Protects against latent variables or "lurking" variables (give an example).
- Reduces the influence of subjective bias in treatment assignments (e.g., clinical trials).
- Ensures the validity of statistical inference. (This is more technical and will not be discussed in the book. See Chapter 4 of "Statistics for Experimenters" by Box, Hunter, Hunter for a discussion of the randomization distribution.)

#### Blocking

A block refers to a collection of homogeneous units. Effective blocking: larger between-block variations than within-block variations. Examples: hours, batches, lots, street blocks, pairs of twins.

- Run and compare treatments within the same blocks. (Use randomization within blocks.) This can eliminate block-to-block variation and reduce the variability of treatment effect estimates.
- Block what you can and randomize what you cannot.
- Discuss the typing experiment to demonstrate possible elaboration of the blocking idea. See next page.

#### Illustration: Typing Experiment

- To compare two keyboards A and B in terms of typing efficiency. Six manuscripts (1-6) are given to the same typist.
- Several designs (i.e., orders of test sequence) are considered:
  1. 1.AB, 2.AB, 3.AB, 4.AB, 5.AB, 6.AB (A is always followed by B; why is this bad?).
  2. Randomizing the order leads to a new sequence like this: 1.AB, 2.BA, 3.AB, 4.BA, 5.AB, 6.AB. This is an improvement, but there are four with AB and two with BA. Why is this not desirable? (Impact of the learning effect.)
  3. Balanced randomization: To mitigate the learning effect, randomly choose three with AB and three with BA. (Produce one such plan on your own.)
  4. Other improved plans?

#### Simple Linear Regression: An Example

The data, taken from certain regions of Great Britain, Norway, and Sweden, contain the mean annual temperature (in degrees F) and the mortality index for neoplasms of the female breast.

| Obs | Mortality rate (M) | Temperature (T) |
|---|---|---|
| 1 | 102.5 | 51.3 |
| 2 | 104.5 | 49.9 |
| 3 | 100.4 | 50.0 |
| 4 | 95.9 | 49.2 |
| 5 | 87.0 | 48.5 |
| 6 | 95.0 | 47.8 |
| 7 | 88.6 | 47.3 |
| 8 | 89.2 | 45.1 |
| 9 | 78.9 | 46.3 |
| 10 | 84.6 | 42.1 |
| 11 | 81.7 | 44.2 |
| 12 | 72.2 | 43.5 |
| 13 | 65.1 | 42.3 |
| 14 | 68.1 | 40.2 |
| 15 | 67.3 | 31.8 |
| 16 | 52.5 | 34.0 |

Objective: Obtaining the relationship between mean annual temperature and the mortality rate for a type of breast cancer in women.

#### Getting Started

Figure 2: Scatter Plot of Temperature versus Mortality Rate, Breast Cancer Data.

#### Fitting the Regression Line

- Underlying model: $y = \beta_0 + \beta_1 x + \epsilon$, $\epsilon \sim N(0, \sigma^2)$.
- Coefficients are estimated by minimizing $\sum_{i=1}^{n} \left(y_i - (\beta_0 + \beta_1 x_i)\right)^2$.
- Least squares estimates (estimated coefficients):

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \quad \mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}.$$

#### Explanatory Power of the Model

- The total variation in y can be measured by the corrected total sum of squares, $CTSS = \sum_{i=1}^{n}(y_i - \bar{y})^2$.
- This can be decomposed into two parts (Analysis of Variance, ANOVA): $CTSS = RegrSS + RSS$, where

$$RegrSS = \text{regression sum of squares} = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2, \quad RSS = \text{residual sum of squares} = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2.$$

  $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is called the predicted value of $y_i$ at $x_i$.
- $R^2 = \frac{RegrSS}{CTSS} = 1 - \frac{RSS}{CTSS}$ measures the proportion of variation in y explained by the fitted model.

#### t-Statistic

- To test the null hypothesis $H_0: \beta_j = 0$ against the alternative hypothesis $H_1: \beta_j \neq 0$, use the test statistic $t_j = \hat{\beta}_j / sd(\hat{\beta}_j)$.
- The higher the value of $|t|$, the more significant is the coefficient.
- For 2-sided alternatives, p-value $= \mathrm{Prob}(|t_{df}| > |t_{obs}|)$, where df = degrees of freedom for the t-statistic and $t_{obs}$ = observed value of the t-statistic. If the p-value is very small, then either we have observed something which rarely happens, or $H_0$ is not true. In practice, if the p-value is less than $\alpha$ (= 0.05 or 0.01), $H_0$ is rejected at level $\alpha$.

#### Confidence Interval

A $100(1 - \alpha)\%$ confidence interval for $\beta_j$ is given by $\hat{\beta}_j \pm t_{n-2, \alpha/2} \times sd(\hat{\beta}_j)$, where $t_{n-2, \alpha/2}$ is the upper $\alpha/2$ point of the t distribution with $n - 2$ degrees of freedom. If the confidence interval for $\beta_j$ does not contain 0, then $H_0$ is rejected.

#### Predicted Values and Residuals

- $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is the predicted value of $y_i$ at $x_i$.
- $r_i = y_i - \hat{y}_i$ is the corresponding residual.
- Standardized residuals are defined as $r_i / sd(r_i)$.
- Plots of residuals are extremely useful for judging the goodness of the fitted model:
  - Normal probability plot (will be explained in Unit 2).
  - Residuals versus predicted values.
  - Residuals versus covariate x.

#### Analysis of Breast Cancer Data

The regression equation is M = -21.79 + 2.36 T

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | -21.79 | 15.67 | -1.39 | 0.186 |
| T | 2.3577 | 0.3489 | 6.76 | 0.000 |

S = 7.54466, R-Sq = 76.5%, R-Sq(adj) = 74.9%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 1 | 2599.5 | 2599.5 | 45.67 | 0.000 |
| Residual Error | 14 | 796.9 | 56.9 | | |
| Total | 15 | 3396.4 | | | |

Unusual Observations

| Obs | T | M | Fit | SE Fit | Residual | St Resid |
|---|---|---|---|---|---|---|
| 15 | 31.8 | 67.30 | 53.18 | 4.85 | 14.12 | 2.44 RX |

R denotes an observation with a large standardized residual. X denotes an observation whose X value gives it large leverage.

#### Prediction from the Breast Cancer Data

- The fitted regression model is $Y = -21.79 + 2.36X$, where Y denotes the mortality rate and X denotes the temperature.
- The predicted mean of Y at $X = x_0$ can be obtained from the above model. For example, the prediction for a temperature of 49 is obtained by substituting $x_0 = 49$, which gives $\hat{y}_{x_0} = 93.85$.
- The standard error of $\hat{y}_{x_0}$ is given by

$$SE(\hat{y}_{x_0}) = \hat{\sigma}\sqrt{\frac{1}{n} + \frac{(\bar{x} - x_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}.$$

- Here $x_0 = 49$, $\frac{1}{n} + \frac{(\bar{x} - x_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = 0.1041$, and $\hat{\sigma} = \sqrt{MSE} = 7.54$. Consequently $SE(\hat{y}_{x_0}) = 2.432$.

#### Confidence interval for mean and prediction interval for individual observation

- A 95% confidence interval for the mean response $\beta_0 + \beta_1 x_0$ of y at $x = x_0$ is $\hat{\beta}_0 + \hat{\beta}_1 x_0 \pm t_{n-p-1, 0.025} \times SE(\hat{y}_{x_0})$.
- Here the 95% confidence interval for the mean mortality corresponding to a temperature of 49 is [88.63, 99.07].
- A 95% prediction interval for an individual observation $y_{x_0}$ corresponding to $x = x_0$ is

$$\hat{\beta}_0 + \hat{\beta}_1 x_0 \pm t_{n-p-1, 0.025}\, \hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(\bar{x} - x_0)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},$$

  where the 1 under the square root represents $\sigma^2$, the variance of the new observation $y_{x_0}$.
- The 95% prediction interval for the predicted mortality of an individual corresponding to a temperature of 49 is [76.85, 110.85].

#### Regression Results after Removing the Outlier

The regression equation is M = -52.62 + 3.02 T

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | -52.62 | 15.82 | -3.33 | 0.005 |
| T | 3.0152 | 0.3466 | 8.70 | 0.000 |

S = 5.93258, R-Sq = 85.3%, R-Sq(adj) = 84.2%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 1 | 2664.3 | 2664.3 | 75.70 | 0.000 |
| Residual Error | 13 | 457.5 | 35.2 | | |
| Total | 14 | 3121.9 | | | |

Unusual Observations

| Obs | T | M | Fit | SE Fit | Residual | St Resid |
|---|---|---|---|---|---|---|
| 15 | 34.0 | 52.50 | 49.90 | 4.25 | 2.60 | 0.63 X |

X denotes an observation whose X value gives it large leverage.

#### Residual Plots

Figure 3: Residual Plots (residuals versus fitted values; residuals versus temperature).

Comments: No systematic pattern is discerned.

#### Multiple Linear Regression: An Example

(Data source: http://lib.stat.cmu.edu/DASL/Stories/AirPollutionandMortality.html)

- Data collected by General Motors.
- Response is age-adjusted mortality.
- Predictors:
  - Variables measuring demographic characteristics.
  - Variables measuring climatic characteristics.
  - Variables recording the pollution potential of 3 air pollutants.
- Objective: To determine whether air pollution is significantly related to mortality.

#### Predictors

1. JanTemp: Mean January temperature (degrees Fahrenheit)
2. JulyTemp: Mean July temperature (degrees Fahrenheit)
3. RelHum: Relative humidity
4. Rain: Annual rainfall (inches)
5. Education: Median education
6. PopDensity: Population density
7. %NonWhite: Percentage of non-whites
8. %WC: Percentage of white collar workers
9. pop: Population
10. pop/house: Population per household
11. income: Median income
12. HCPot: HC pollution potential
13. NOxPot: Nitrous Oxide pollution potential
14. SO2Pot: Sulphur Dioxide pollution potential

#### Getting Started

- There are 60 data points.
- Pollution variables are highly skewed; a log transformation makes them nearly symmetric. The variables HCPot, NOxPot, and SO2Pot are replaced by log(HCPot), log(NOxPot), and log(SO2Pot).
- Observation 21 (Fort Worth, TX) has two missing values, so this data point will be discarded from the analysis.

#### Scatter Plots

Figure 4: Scatter Plots of mortality against selected predictors: (a) JanTemp; (b) Education; (c) NonWhite; (d) Log(NOxPot).

#### Fitting the Multiple Regression Equation

- Underlying model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \epsilon$, $\epsilon \sim N(0, \sigma^2)$.
- Coefficients are estimated by minimizing

$$\sum_{i=1}^{N}\left(y_i - (\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip})\right)^2 = (y - X\beta)^T(y - X\beta).$$

- Least squares
estimates: $\hat{\beta} = (X^TX)^{-1}X^Ty$.
- Variance-covariance matrix of $\hat{\beta}$: $\Sigma_{\hat{\beta}} = \sigma^2(X^TX)^{-1}$.

#### Analysis of Variance

- The total variation in y, i.e., the corrected total sum of squares $CTSS = \sum_{i=1}^{N}(y_i - \bar{y})^2 = y^Ty - N\bar{y}^2$, can be decomposed into two parts (Analysis of Variance, ANOVA): $CTSS = RegrSS + RSS$, where

$$RSS = \text{residual sum of squares} = \sum_{i=1}^{N}(y_i - \hat{y}_i)^2 = (y - X\hat{\beta})^T(y - X\hat{\beta}),$$
$$RegrSS = \text{regression sum of squares} = \sum_{i=1}^{N}(\hat{y}_i - \bar{y})^2 = \hat{\beta}^TX^TX\hat{\beta} - N\bar{y}^2.$$

ANOVA Table

| Source | Degrees of Freedom | Sum of Squares | Mean Squares |
|---|---|---|---|
| regression | $p$ | $\hat{\beta}^TX^TX\hat{\beta} - N\bar{y}^2$ | $(\hat{\beta}^TX^TX\hat{\beta} - N\bar{y}^2)/p$ |
| residual | $N - p - 1$ | $(y - X\hat{\beta})^T(y - X\hat{\beta})$ | $(y - X\hat{\beta})^T(y - X\hat{\beta})/(N - p - 1)$ |
| total (corrected) | $N - 1$ | $y^Ty - N\bar{y}^2$ | |

#### Explanatory Power of the Model

- $R^2 = \frac{RegrSS}{CTSS} = 1 - \frac{RSS}{CTSS}$ measures the proportion of variation in y explained by the fitted model. R is called the multiple correlation coefficient.
- Adjusted $R^2$:

$$R_a^2 = 1 - \frac{RSS/(N - p - 1)}{CTSS/(N - 1)} = 1 - \left(\frac{N - 1}{N - p - 1}\right)\frac{RSS}{CTSS}.$$

- When an additional predictor is included in the regression model, $R^2$ never decreases. This is not a desirable property for model selection. However, $R_a^2$ may decrease if the included variable is not an informative predictor. Usually $R_a^2$ is a better measure of model fit.

#### Testing significance of coefficients: t-Statistic

- To test the null hypothesis $H_0: \beta_j = 0$ against the alternative hypothesis $H_1: \beta_j \neq 0$, use the test statistic $t_j = \hat{\beta}_j / sd(\hat{\beta}_j)$.
- The higher the value of $|t|$, the more significant is the coefficient.
- In practice, if the p-value is less than $\alpha$ (= 0.05 or 0.01), $H_0$ is rejected.
- Confidence interval: a $100(1 - \alpha)\%$ confidence interval for $\beta_j$ is given by $\hat{\beta}_j \pm t_{N-p-1, \alpha/2} \times sd(\hat{\beta}_j)$, where $t_{N-p-1, \alpha/2}$ is the upper $\alpha/2$ point of the t distribution with $N - p - 1$ degrees of freedom. If the confidence interval for $\beta_j$ does not contain 0, then $H_0$ is rejected.

#### Analysis of Air Pollution Data

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1332.7 | 291.7 | 4.57 | 0.000 |
| JanTemp | -2.3052 | 0.8795 | -2.62 | 0.012 |
| JulyTemp | 1.657 | 2.051 | 0.81 | 0.424 |
| RelHum | 0.407 | 1.070 | 0.38 | 0.706 |
| Rain | 1.4436 | 0.5847 | 2.47 | 0.018 |
| Educatio | -9.458 | 9.080 | -1.04 | 0.303 |
| PopDensi | 0.004509 | 0.004311 | 1.05 | 0.301 |
| %NonWhit | 5.194 | 1.005 | 5.17 | 0.000 |
| %WC | -1.852 | 1.210 | -1.53 | 0.133 |
| pop | 0.00000109 | 0.00000401 | 0.27 | 0.788 |
| pop/hous | -45.95 | 39.78 | -1.16 | 0.254 |
| income | 0.000549 | 0.001309 | 0.42 | 0.677 |
| logHC | -53.47 | 35.39 | -1.51 | 0.138 |
| logNOx | 80.22 | 32.66 | 2.46 | 0.018 |
| logSO2 | 6.91 | 16.72 | 0.41 | 0.681 |

S = 34.58, R-Sq = 76.7%, R-Sq(adj) = 69.3%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 14 | 173383 | 12384 | 10.36 | 0.000 |
| Residual Error | 44 | 52610 | 1196 | | |
| Total | 58 | 225993 | | | |

#### Variable Selection Methods

- Principle of Parsimony (Occam's razor): Choose fewer variables with sufficient explanatory power. This is a desirable modeling strategy.
- The goal is thus to identify the smallest subset of covariates that provides a good fit. One way of achieving this is to retain the significant predictors in the fitted multiple regression. This may not work well if some variables are strongly correlated among themselves or if there are too many variables (e.g., exceeding the sample size).
- Two other possible strategies are:
  - Best subset regression using Mallows' $C_p$ statistic.
  - Stepwise regression.

#### Best Subset Regression

- For a model with p regression coefficients (i.e., $p - 1$ covariates plus the intercept $\beta_0$), define its $C_p$ value as

$$C_p = \frac{RSS}{s^2} - (N - 2p),$$

  where RSS = residual sum of squares for the given model, $s^2$ = mean square error = (RSS for the complete model)/(df for the complete model), and N = number of observations.
- If the model is true, then $E(C_p) \approx p$. Thus one should choose p by picking models whose $C_p$ values are low and close to p. For the same p, choose a model with the smallest $C_p$ value (i.e., the smallest RSS value).

#### Stepwise Regression

- This method involves adding or dropping one variable at a time from a given model, based on a partial F-statistic. Let the smaller and bigger models be Model I and Model II, respectively. The partial F-statistic is defined as

$$\frac{RSS(\text{Model I}) - RSS(\text{Model II})}{RSS(\text{Model II})/\nu},$$

  where $\nu$ is the degrees of freedom of the RSS (residual sum of squares) for Model II.
- There are three possible ways:
  1. Backward elimination: starting with the full model and removing covariates.
  2. Forward selection: starting with the intercept and adding one variable at a time.
  3. Stepwise selection: alternate backward elimination and forward selection.

  Usually stepwise selection is
recommended 38 Criteria for Inclusion and Exclusion of Variables o F toremove At each step of backward elimination compute the partial F value for each covariate being considered for removal The one with the lowest partial F provided it is smaller than a preselected value is dropped The procedure continues until no more covariates can be dropped The preselected value is often chosen to be F1 N70 the upper 06 critical value of the F distribution with l and v degrees of freedom 0 F toenter At each step of forward selection the covariate with the largest partial F is added provided it is larger than a preselected F critical value which is referred to as an F toemer value c For stepwise selection the F toremove and F toenter values should be chosen to be the same See Section 15 39 Vars 4 XlO U Pollution Data Analysis Mallows Cp R Sq R Sqadj C p 8 variables 697 674 83 35624 14713 729 703 43 34019 145713 742 713 37 33456 1467813 750 716 43 33290 146781213 754 715 54 33322 14578101213 Number of variables VS Minimum C I Figure 5 Number of variables vs minimum C p 40 Pollution Data Analysis Stepwise Regression Stepwise Regression Mortality versus JanTemp JulyTemp Alpha to Enter 015 Alpha to Remove 015 Response is Mortality on 14 predictors with N 59 Noases with missing observations 1 Na11 oases 60 Step 1 2 3 4 5 6 7 Constant 8879 12085 11127 11354 10087 10295 10287 NonWhit 449 392 392 473 436 415 415 T Value 640 626 681 732 673 660 666 P Value 0000 0000 0000 0000 0000 0000 0000 Educatio 286 235 211 141 156 155 T Value 432 374 347 210 240 249 P Value 0000 0000 0001 0041 0020 0016 109802 280 210 268 O4 T Value 337 248 311 002 P Value 0001 0016 0003 0980 41 Pollution Data Analysis Stepwise Regression Contd Alpha to Enter 015 Alpha to Remove 015 Response is Mortality on 14 predictors with N 59 Noases with missing observations 1 Na11 oases 60 JanTemp 142 129 215 214 T Value 241 226 325 417 P Value 0019 0028 0002 0000 Rain 108 166 165 T Value 215 307 316 P Value 0036 0003 0003 
logNOx 42 42 T Value 235 404 P Value 0023 0000 S 480 420 385 370 358 343 340 R Sq 4180 5635 6384 6736 6999 7286 7286 R Sqadj 4078 5480 6186 6494 6716 6973 7030 C p 550 295 174 127 97 63 43 42 Final Model Rival Models Variables C p R2 R2 Remarks Model 1 14678l3 37 742 713 Model with Minimum Cp Model 2 145713 43 729 703 Model from Stepwise We shall analyze data with Model 2 Why Refer to the rules on page 35 and use the principle of parsimony 43 Analysis of Model 2 Predictor Coef SE Coef T Constant 103843 8040 1292 JanTemp 20471 05044 406 Rain 15346 05096 301 Educatio 15956 6223 256 NonWhit 41755 06218 672 logNOx 38139 9723 392 S 3403 R Sq 726 R Sqadj Analysis of Variance Source DF 88 MS Regression 5 165871 33174 Residual Error 54 62527 1158 Total 59 228398 44 000 000 004 013 000 000 000000 701 2865 P 0000 ReSIduas Residual Plot Residuals versus Fitted Values o u o a o c n a 39 u o u n o u 39 n u o I I I I I 850 900 950 l 000 l 050 Fltted Values Residual versus Fitted Values Figure 6 Plot of Residuals 45
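The $C_p$ definition from Best Subset Regression and the partial $F$ rule from Stepwise Regression can be sketched in a few lines of Python. This is a minimal illustration, not the Minitab procedure used in the notes. The function names are hypothetical, and the pollution-data inputs are read from the output above with decimal points restored (an assumption): full-model RSS 52610, N = 59 with 14 predictors plus the intercept (so 44 residual df), and S = 34.019 for the five-variable model. The numbers in the backward-elimination example are made up purely for illustration.

```python
def mallows_cp(rss, s2_full, n, p):
    """C_p = RSS / s^2 - (N - 2p) for a candidate model with p coefficients."""
    return rss / s2_full - (n - 2 * p)


def partial_f(rss_small, rss_big, df_big):
    """Partial F = (RSS_I - RSS_II) / (RSS_II / nu); nu = residual df of Model II."""
    return (rss_small - rss_big) / (rss_big / df_big)


def backward_step(rss_full, df_full, rss_without, f_to_remove):
    """One backward-elimination step.

    rss_without maps each covariate name to the RSS of the model refitted
    without that covariate. Returns the covariate with the lowest partial F
    if that F is below f_to_remove, otherwise None (nothing is dropped).
    """
    f_values = {
        name: partial_f(rss, rss_full, df_full)
        for name, rss in rss_without.items()
    }
    weakest = min(f_values, key=f_values.get)
    return weakest if f_values[weakest] < f_to_remove else None


# Pollution data (values as read from the notes; decimal placement assumed).
n = 59                    # observations used in the fit
rss_full = 52610.0        # residual SS of the complete model
df_full = n - 15          # 14 predictors + intercept -> 44 residual df
s2_full = rss_full / df_full

# The complete model always has C_p equal to its own number of coefficients.
cp_full = mallows_cp(rss_full, s2_full, n, p=15)

# Five-variable model (p = 6): recover its RSS from S, since RSS = S^2 * (N - p).
rss_five = 34.019 ** 2 * (n - 6)
cp_five = mallows_cp(rss_five, s2_full, n, p=6)
print(round(cp_full, 1), round(cp_five, 1))

# Hypothetical numbers for one backward-elimination step.
rss_without = {"x1": 180.0, "x2": 104.0, "x3": 130.0}
dropped = backward_step(100.0, 20, rss_without, f_to_remove=4.0)
print(dropped)
```

The five-variable model's $C_p$ rounds to 4.3, in line with the C-p column of the best-subset table. In practice the `f_to_remove` threshold would be taken as $F_{1,\nu,\alpha}$ from tables or statistical software rather than the placeholder 4.0 used here.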
