These 38 pages of class notes were uploaded by Abe Jones on Saturday, September 12, 2015. The notes belong to CS 791X at West Virginia University, taught by Bojan Cukic in Fall.
Date Created: 09/12/15
Executing Tests
Bojan Cukic
Department of Computer Science and Electrical Engineering
West Virginia University

Some questions first

Question: In our parking garage workshop we chose an automated system that parks the cars without a human. The safety subsystem requires high reliability to prevent loss of life, say 0.99999. But a glitch in the user interface is not severe. We don't care as much if, say, it truncates the attendant's name if it is too long. So we assign a reliability of 0.98 to that class. So I'm talking to my customer, whom I'm implementing this for, and tell him there's a 2% chance the system will fail after parking 1000 cars. That sounds misleading. My customer is going to think that 20 out of every 1000 cars is going to bump into something, get lost, or injure somebody. I guess my point is that the overall system reliability seems to be less important than looking at the reliability of the classes. You mentioned that we should start with a system reliability and then from there study the subsystems if we wish to have reliabilities for the subsystems, correct? If so, I don't understand how to get that total system reliability without first doing the subsystems or first doing the severity classes. I guess I need to read Chapter 3 again.

Answer:
- For systems which have drastically different failure severities, it is customary to express reliability for each severity class.

One more

Question: I think I'm confused about converting between failure intensity (FI) and reliability (R). Suppose we have a goal of a system that processes on average 20,800 cars between failures. In terms of 1000 cars processed as our natural unit, this is 1000/20800 = 0.048 failures per 1000 cars processed. So our failure intensity rate is 0.048, right? Now I want to calculate the reliability, which is the probability that the system will function without failure for the natural unit of 1000 cars. Is this right so far? Well, the probability of failure for 1000 cars is 0.048, so the probability of success is simply 1 - 0.048 = 0.952. Correct? If the above is correct, what then is the equation R = exp(-lambda t) for (Equation 3.6 on page 171)? What is lambda and what is t? What is R in this equation, for that matter?

Answer:
- Failure intensity FI = 1/20800 parked cars = 0.048 per 1000 parked cars = 0.000048 per parked car
- Probability of no failure for one parked car: 1 - 0.000048 = 0.9999519
- More generally, reliability is expressed as $R = e^{-\lambda t}$, where $\lambda$ is the failure intensity and $t$ is the number of natural (or time) units
- If $\lambda t < 0.05$, then $R \approx 1 - \lambda t$, the same number as calculated above
- Converting $R$ into $\lambda$ can be done as follows: $\lambda = -\ln R / t$, or $\lambda \approx (1 - R)/t$ if $R > 0.95$

So far
- Preparing for tests included
  - Concepts such as RUN, TEST CASE, INPUT VARIABLES (DIRECT and INDIRECT)
  - Procedures
    - Estimating the number of new tests
    - Allocating tests throughout the system
    - Allocating new tests to new operations (features)
    - Specifying test cases
    - Preparing test procedures, adjusting operational profiles, etc.
  - Test automation

The big picture
[Figure: the SRE process. Activities: List Associated Systems; Define Necessary Reliability; Develop Operational Profiles; Prepare for Test; Execute Tests; Apply Failure Data to Guide Decisions. Lifecycle phases: Requirements and Architecture; Design and Implementation; Test and Validation.]

Concepts: Feature test
- Execute all new test cases of the release, independent of each other
- Interactions and effects of the field environment are minimized
- Purpose is to identify functional failures resulting from execution of test cases by themselves

Concepts: Load test
- Execute all valid test cases from all releases together
- Full interactions and all the effects of the field environment
- Purpose is to identify failures from
  - Interactions among test cases
  - Overloading of and queuing for resources
  - Database or operating system degradation

Concepts: Regression test
- Execute a subset of all test cases of all releases at each system build with significant change, independent of each other
- Interactions and field effects are minimized
- Test subsets contain
  - Tests for rare critical conditions
  - A certain number of test cases for noncritical, non-rare conditions, selected randomly
- Purpose is to reveal functional failures caused by faults induced by program changes

Executing Tests
- Use test cases and test procedures developed in the previous phase
- The test execution phase involves
  - Allocating test time
  - Invoking tests
  - Identifying failures that occur
- This information will be used in guiding tests and making decisions

Allocating test time
- Measure test time in hours. For multiple configurations, test time > real time
- Proceed with allocation in 3 steps
  1. Among the systems to be tested
  2. Among feature, regression, and load test for each system in the reliability growth phase
  3. Among operational modes for each system in load test
- Load testing usually equals certification testing

System allocation
- Allocate test time according to the estimated risk
- For the other systems and the remaining time, divide test time in the same proportions as the numbers of new tests were assigned
  - Reuse of previous allocations, based on operational distributions and newness
- Allow enough time for feature test (all the new test cases for the release) and regression tests
- The rest of the time goes to load testing

Example
- Planned testing period: 320 h
- 40 h allocated to the testing of the supersystem, based on its criticality
- The system has 2 components with test distributions of 0.71 and 0.29
- Then component 1 (Product) receives 200 h and component 2 (OS) receives 80 h for testing

Test time (h) by operational mode:

Operational mode   Proportion of calls   Supersystem   Product   Operating system
Peak hours         0.1                   4             20        8
Prime hours        0.7                   28            140       56
Off hours          0.2                   8             40        16

Invoking tests
- SRE starts after the components have been assembled into the system, so features can be tested
- Assumes unit testing is completed
- Recommended sequence of system-level tests
  1. Acquired components
  2. Product and variations
     a. Feature tests
     b. Load tests
  3. Supersystem

Invoking Tests: Feature tests
- Select tests in random order from the set of new test cases for the given release
- Invoke each test case after the previously invoked test case is complete (avoid interaction)
  - Provide setup and cleanup
- Do not replace a test case after execution
  - Execute each test case once

Invoking Tests (2): Load tests
- Operational profile based
- The number of tests run is based on the time available
- Fault density likely to be similar to the field operation
- Select load tests with replacement (repetition OK)
  - Detect multiple faults possibly associated with an operation
  - Yet each RUN will be different
- Repeat a run if you need to
  - Better investigate fault behavior
  - Verify fault removal
- Random test case and operation selection is usual

Invoking Tests (2): Regression tests
- Invoke each test case after the previously invoked test case completes
- Choose a subset of test cases for each build
  - All valid critical cases
  - Randomly selected noncritical test cases
- Select test cases randomly
- Provide setup and cleanup
- No replacement of test cases

MISC
- Occurrence probabilities for selecting operations are constant, BUT constant probability does not mean constant order for testing
- P(A) = 0.7, P(B) = 0.3
- 2 sequences: ABAABAAABA and AAAAAAABBB

Observed proportion of A after each operation:

Operation   Ex. 1 P(A)   Ex. 2 P(A)
1           1            1
2           0.5          1
3           0.67         1
4           0.75         1
5           0.6          1
6           0.67         1
7           0.71         1
8           0.75         0.88
9           0.67         0.78
10          0.7          0.7

Identification of system failures
- Involves
  - Analyzing the test output for deviations
  - Determining which deviations are failures
  - Establishing WHEN failures occurred
  - Assigning failure severity classes, to be used for priority assignment in fault resolution

Analyzing Test Outputs
- Deviation: departure from the expected behavior
- Tools exist for the analysis of
  - Inter-process communication failures
  - Illegal memory references
  - Deviant return codes
  - Deadlocks
  - Crashes, hangs

Analyzing outputs
- Assertions can help in the analysis
- Time deviations can be counted, depending on the nature of the application
- DO NOT count cascaded deviations as multiple problems
  - Count a cascade as a single deviation

Are deviations failures?
- Higher severity failures are usually easier to observe
- Deviations in fault-tolerant systems may not be failures
- Violation of written or nonwritten requirements
  - A complaint from the user community means it IS a fault
- Record hardware and personnel failures too
  - Help in system-level problem resolution

When did a failure occur?
- Use the common reference unit chosen for failure intensities
- If the average load over processing time varies
  - Take execution time measurements (they better characterize failure-inducing stress)
- Convert execution time to operating time by dividing by the average (over system life) ratio of execution time to operating time

Failure   Exe time   Utilization   Adjusted time
1         0.2        0.4           0.5
2         0.6        0.4           1.5
3         1.2        0.4           3

Software Reliability
- Class: Software Engineering SENG 691 D, Computer Science CS 791 X
- Time: Tuesdays 6 PM - 8:30 PM
  - SENG section is online, CS section in ESB 801
- Instructor: Bojan Cukic
  - Office phone: 304-293-0405 ext. 2526
  - Email: bojan.cukic@mail.wvu.edu

Rules of Operation
- Email communication strongly preferred
- Chats possible (IM or Skype); need to arrange a time
- Office visits encouraged
  - ESB 731, Evansdale campus, Morgantown
- Textbook: John Musa, Software Reliability Engineering: More Reliable Software, Faster and Cheaper, McGraw-Hill, 1998
  - The book is out of print, but the 2nd edition is available as Print on Demand (POD) from AuthorHouse publishers with a considerable discount
  - Access the following web site: http://members.aol.com/JohnDMusa/book.htm and follow the links as appropriate

Rules of Operation
- Tests: midterm and finals
- Presentations and research papers
  - Each student will choose a topic in agreement with the instructor
  - Presentations during regular classes (online students use phone)
  - Presentations will grow into papers by the end of the semester
  - Students can report on reliability best practices or the application of reliability engineering to projects they are involved with
  - CS students' topics will require research content (project)

Rules of Operation
- See the syllabus for details on tests, presentations and papers
- Grading
  - Midterm + finals: 40%
  - Class presentation: 15%
  - Term papers / project reports: 35%
  - Class participation: 10%
  - 90 A, 80 B, 70 C
  - Must obtain a passing grade from both tests and papers/presentations

Software Reliability Engineering: A Short Overview
Bojan Cukic
Lane Department of Computer Science and Electrical Engineering
West Virginia University

Introduction
- Hardware for safety-critical systems is very reliable, and its reliability is being improved
- Software is not as reliable as hardware; however, its role in safety-critical systems increases
- "Today the majority of engineers understand very little about the science of programming or the mathematics that one needs to analyze a program. On the other hand, the scientists who study programming know very little about what it means to be an engineer." (Parnas 1997)

Introduction
- How good is software?
  - Close to 75% of software projects never achieve completion or are never used
  - 25-35% of UNIX utilities crash or hang the system when exposed to unusual inputs (Miller). OS X is the most unreliable according to these studies
  - 12 commercial programs for seismic data processing: numerical disagreement between results grows 1% per 4000 lines of source code (Hatton 94)

Introduction
- Software needs to be sufficiently good for its application
- Increased use of computerized control systems in safety-critical applications
  - Flight control, nuclear plant monitoring, robotic surgery, military applications, etc.
- Can we expect perfect software in practice?
  - lim (good software) = perfect software?

Introduction
- The goal of producing perfect software remains elusive (Brooks 86) due to essential difficulties:
  - Complexity (functional complexity, structural complexity, code complexity)
  - Changing requirements
  - Invisibility
- Software faults are introduced in all phases of the lifecycle
  - Specification, design, implementation, testing, maintenance

Introduction: Ariane flight 501
- Ariane 4 SRI (Inertial Reference System) software was reused on Ariane 5
- Ariane 4 accelerated much slower and used a different trajectory
- In SRI1 and SRI2, an Operand Error exception appeared due to an overflow in converting a 64-bit floating point value to a 16-bit signed integer
  - SRIs declared failure in two successive data cycles (72 ms)
- The On-Board Computer interpreted the SRI2 diagnostic pattern as flight data and commanded nozzle deflection
- 39 s after launch, the launcher disintegrated because of high aerodynamic loads due to an angle of attack of more than 20 degrees

Infamous Software Failures
- July 22, 1962: Mariner I space probe
  - A formula written by pencil on paper was improperly coded. Trajectory miscalculated; the rocket diverted from its path at launch and was destroyed over the Atlantic
- 1982: Trans-Siberian gas pipeline explosion
  - Fault planted into Canadian pipeline control software covertly acquired by the Russians. Not much is known about the nature of the fault
- 1985 - 1987: Therac-25 medical accelerator
  - Radiation therapy device delivers lethal doses at several facilities. A software safety interlock replaced the electromechanical one and failed. An operating system race condition was at fault

Infamous Software Failures
- 1988: Buffer overflow in the Berkeley Unix finger daemon
  - Allowed the spread of the first internet worm. The gets() function did not control the length of the string; worm code was able to take control of the machines
- 1988 - 1996: Kerberos random number generator
  - Generator not properly seeded. For years it was possible to break into the most secure authentication system using trivial mathematics. Not known whether the fault was ever exploited

Infamous Software Failures
- Jan 15, 1990: AT&T network outage
  - A fault in a new software release causes long distance switches to crash when they receive a crash recovery message from the neighboring machine. 114 switches kept crashing and rebooting every 6 seconds for 9 hours. The old software release was loaded back to fix the problem
- 1993: Intel Pentium floating point division
  - An error of 0.006% in division causes a public relations nightmare. 3 to 5 million chips in circulation; replacement for anyone who complains; $475M in damages

Infamous Software Failures
- 1995-96: The Ping of Death
  - Malformed ping packets are not checked and cause the computers to display the blue screen of death. Lack of error handling and sanity checks. Windows, Mac and Unix systems affected
- November 2000: National Cancer Institute, Panama City
  - Therapy planning software miscalculates the proper dosage of radiation. Doctors trick the software by placing additional shielding blocks not planned in the software. Dosage calculation depends on peculiar user interaction. 8 patients die, at least 20 receive overdoses. Doctors who were supposed to double check the computer's calculations by hand are indicted for murder

Software Reliability
- Software reliability: P(A|B)
  - A: software does not fail when operated for t time units under specified conditions
  - B: software has not failed at time 0
- Ultrahigh reliability requirements for safety-critical systems
  - Draft international standard IEC 65A 123 for Safety Integrity Level 4
    - Continuous control systems: < $10^{-8}$ failures per hour
    - Airbus 320/330/340 and Boeing 777: < $10^{-9}$ failures/h. This translates to roughly 114,000 years of operation without encountering a failure
    - Protection systems (emergency shutdown): < 10^-[...] failures/h
    - UK Sizewell B nuclear reactor (emergency shutdown): < $10^{-3}$ failures/h

Introduction
- Software faults are introduced in all phases of the lifecycle
  - Specification, design, implementation, testing, maintenance
- Reliable operation of programmable electronics requires assurance in all the phases of the lifecycle
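The failure-rate targets on the Software Reliability slide can be related to reliability over a period of operation with the exponential relation $R = e^{-\lambda t}$ from the earlier question-and-answer slides. The following is a minimal sketch, assuming a constant failure intensity; the function names and the 8,760-hour year are illustrative assumptions, not part of the course material.

```python
import math

HOURS_PER_YEAR = 24 * 365  # assumption: a non-leap calendar year


def reliability(failure_intensity, t):
    """Probability of failure-free operation for t units: R = exp(-lambda * t)."""
    return math.exp(-failure_intensity * t)


def failure_intensity_from_reliability(r, t):
    """Inverse conversion: lambda = -ln(R) / t."""
    return -math.log(r) / t


def mean_years_between_failures(failures_per_hour):
    """Mean time between failures (1 / lambda), expressed in years."""
    return 1.0 / failures_per_hour / HOURS_PER_YEAR


# Parking garage question: 1 failure per 20,800 cars, natural unit of 1000 cars.
r_1000_cars = reliability(1 / 20800, 1000)  # about 0.953, close to 1 - 0.048

# Airbus / Boeing target of 1e-9 failures per hour.
years = mean_years_between_failures(1e-9)  # about 114,155 years
```

Under these assumptions, the linear shortcut $R \approx 1 - \lambda t$ gives 0.952 for the parking garage case, while the exponential form gives about 0.953; the two agree closely because $\lambda t$ is below 0.05.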
Formal Verification
[Figure: Software Reliability Assessment splits into Formal Verification and Testing; Testing splits into Time Domain and Input Domain.]
- PROS
  - Proves program correctness, i.e., reliability 1 is established by proving the absence of implementation errors
  - Independent of operational profile (system usage)
- CONS
  - Cannot cope with specification errors or with OS, compiler, and hardware faults
  - Proofs can be erroneous unless performed automatically
  - Its applicability is limited to small and medium size programs

Formal methods in SE
- Used for requirements specifications and verification
- Based on mathematical logic, state machines or process algebra
- Most popular forms of verification
  - Model checking
    - A finite state transition model represents the system
    - Constraints expressed in temporal logic
    - Significant limitations with respect to system size
  - Formal verification
    - Proving properties from the set of axioms

Time Domain Approach
[Figure: Software Reliability Assessment splits into Formal Verification and Testing; Testing splits into Time Domain and Input Domain.]
- Observed failure data from testing is fitted to various statistical models
  - Time-Between-Failure models and Period-Failure-Count models
- Used for
  - Assessing current reliability
  - Predicting future reliability
  - Controlling software testing
- CONS
  - Perfect fault removal assumed
  - Cannot be used to predict ultrahigh reliability levels

Time domain models
- Reliability growth models
- Jelinski-Moranda model (JM)
  - The number of initial faults is unknown but fixed
  - Fault detection is perfect (no new faults introduced)
  - Times between failure occurrences are independent, exponentially distributed random quantities
  - All remaining faults contribute equally to failure intensity
- General problems
  - More assumptions
  - All faults detectable
  - Statistical independence of interfailure arrival times

Related Work: Statistical testing
[Figure: Software Reliability Assessment splits into Formal Verification and Testing; Testing splits into Time Domain and Input Domain.]
- PROS
  - System-level assessment
  - Theoretically sound
- CONS
  - Large number of test cases; an oracle is needed
  - Depends on the operational profile
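The "large number of test cases" drawback of statistical testing can be made concrete with a standard zero-failure argument. The sketch below assumes that tests are drawn independently from the operational profile, that a perfect oracle exists, and that every test fails with the same probability `theta`; the function name and parameters are hypothetical. If `n` tests all pass, the claim "failure probability per test is at most `theta`" holds with confidence `C` once $(1 - \theta)^n \le 1 - C$.

```python
import math


def failure_free_tests_needed(theta, confidence):
    """Smallest n with (1 - theta)**n <= 1 - confidence.

    theta: per-test failure probability bound to be demonstrated (0 < theta < 1).
    confidence: required confidence level (0 < confidence < 1).
    """
    # log1p(-theta) computes ln(1 - theta) accurately for very small theta.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-theta))


# Demonstrating a failure probability of at most 1e-3 per test at 99% confidence.
n_modest = failure_free_tests_needed(1e-3, 0.99)

# Demonstrating 1e-9 per test at the same confidence needs billions of tests.
n_ultra = failure_free_tests_needed(1e-9, 0.99)
```

Under these assumptions the required number of failure-free tests grows roughly in proportion to 1/theta, which is why demonstrating ultrahigh reliability by testing alone is impractical.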
Urn Model of Software Testing
- Random software testing is modeled as sampling with replacement
- If repeated sampling reveals no black balls, all we gain is a confidence that there are none
- Only one type of testing can prove that there are no existing faults: exhaustive testing
  - Program with 20 variables, 10 values per variable, 1 ns per test case: $10^{20}$ test cases, thousands of years of testing
[Figure: urn model with the operational distribution.]

Introduction: Dependability
- Dependability
  - Attributes: availability, reliability, safety, confidentiality, integrity, maintainability
  - Means: fault prevention, fault tolerance, fault removal, fault forecasting
  - Impairments: faults, errors, failures
- Safety-critical systems require both
  - Best practices for software development, with dependability being the major concern
  - Rigorous validation procedures

A Reality Check
- Collection of operational software data is difficult
- Problem occurrence rates for essential aircraft flight functions (Shooman 96)
  - $2 \times 10^{-8}$ to $10^{-6}$ occurrences per hour of operation
  - The reported failure occurrence rates are higher than required
- Error, Fault and Failure (EFF) data collection initiatives
  - Come and go
  - We still miss data

Software Reliability
- Measurement
- Practical techniques
- Right or wrong
  - Unreliability of released products
  - Missed schedules
  - Cost overruns
  - Market share reaction

What Is SRE?
- The set of best practices that empower testers and developers to
  - Ensure product reliability meets users' needs
  - Speed the product to market faster
  - Reduce product cost
  - Improve customer satisfaction (fewer angry users)
  - Increase their productivity
- Applicable to all software-based systems
- Two fundamental ideas
  - Focus resources on the most used and most critical functions
  - Make testing realistically represent field conditions

SRE Process
- Widely used and accepted, especially by the large corporations (Microsoft included)
- Increase in project cost is less than 1%
- Predominant SRE workflow
[Figure: the SRE process. Activities: Define Necessary Reliability; Develop Operational Profiles; Prepare for Test; Execute Tests; Apply Failure Data to Guide Decisions. Lifecycle phases: Requirements and Architecture; Design and Implementation; Test and Validation.]

SRE Process
- Tasks frequently iterate
- Post-delivery and maintenance phase not shown
- Testers must be involved throughout the process
  - Allows better understanding of the user's perspective
  - Improvement of system requirements and planning
- Selection of an appropriate mix of
  - Fault prevention
  - Fault removal
  - Fault tolerance
- Types of tests applicable to SRE, based on objectives rather than phases in the lifecycle
  - Reliability growth tests
    - Find and remove faults; need a minimum of 10-20 detected faults to achieve statistically meaningful results
    - Feature tests (minimize impact of the environment), load tests (maximize environmental impacts), regression tests (following a major change)
  - Certification tests
    - No debugging; accept or reject the software under test
    - The number of observed failures is not important

Defining the system
- A system is an independently tested unit
- SRE should be applied to subsystems (acquired, COTS, OS for example), systems and supersystems
- A different configuration represents a different system
  - Interface stubs may not be correct
- But more systems implies higher cost
  - Aggregation welcome
  - Product lines help reduce the cost

SRE and SW design & test process
- Use knowledge of the operational profile to guide and focus design efforts
- The established failure intensity drives the quality assurance efforts
- The failure intensity goal determines when to stop testing
- Measurement throughout the lifecycle helps identify better methodologies

Is Reliability Important?
- It should be, since it is a measurable property
  - Unlike software quality
- Useful, since the software is tested under the conditions of perceived usage
- The number of resident faults, for example, is a developer-oriented measure
  - Reliability is a user-oriented measure
- The number of faults found has NO correlation to reliability
  - Neither has program complexity
- Accurate measurements of reliability are feasible

Why to Measure Reliability
- Isn't the best software development process sufficient?
  - What is "best"?
  - It is important to measure the results of the process
- Early consideration of target reliability is beneficial, since it impacts cost and schedule
- CMM levels 4 and 5 (and 3 indirectly) recommend reliability measurement

Common Misconceptions
- Software reliability is primarily concerned with software reliability models
- It copies hardware reliability theory
  - Not so, because reliability of software is more likely to change over time (modifications, upgrades)
- It deals with faults or bugs
- It does not concern itself with requirements-based testing
- Testing ultrareliable software is hopeless

Reliability Measurement
- Observe failure occurrences in terms of execution time

Failure no.   Failure time   Failure interval
1             10             10
2             19             9
3             32             13
4             43             11
5             58             15
6             70             12
7             88             18
8             103            15
9             125            22

Measurements
- Typical variation of failure intensity (failures per execution hour) and reliability over testing
[Figure: failure intensity and reliability plotted against time.]
- Each expression has its advantages
- Curves are not necessarily so smooth
- Alternatives
  - MTTF (larger is better, but may be undefined)
  - MTBF = MTTF + MTTR comes from hardware reliability

Summary
- Definition of software reliability
- Software reliability engineering is the process that leads to high reliability software
  - Based on statistical evaluation of quality factors throughout the development lifecycle
- Reliability can be assessed using different approaches
- Simple activities can significantly reduce software failure rates
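The failure data on the Reliability Measurement slide can be turned into a rough picture of reliability growth. The sketch below assumes the failure times are cumulative execution time (10, 19, 32, ..., 125, as best recoverable from the slide, with the first interval measured from time 0); the helper names and the three-failure window are illustrative choices, not part of the course material.

```python
# Cumulative execution time at which each failure was observed (from the slide).
failure_times = [10, 19, 32, 43, 58, 70, 88, 103, 125]


def failure_intervals(times):
    """Time between consecutive failures, with the first measured from time 0."""
    previous = [0] + times[:-1]
    return [t - p for p, t in zip(previous, times)]


def mean_intensity(intervals):
    """Average failure intensity over a set of intervals: failures per time unit."""
    return len(intervals) / sum(intervals)


intervals = failure_intervals(failure_times)

# Compare the first three and the last three intervals (window size is arbitrary).
early_intensity = mean_intensity(intervals[:3])
late_intensity = mean_intensity(intervals[-3:])
```

With this data the intervals lengthen as testing proceeds, so the estimated failure intensity over the last three failures is lower than over the first three, which is the reliability growth trend the slide's curves depict. A windowed average like this is only a crude estimate; the time domain models mentioned earlier fit a statistical model instead.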