http://www.agedis.de

+===================================================+
+=======          AGEDIS Newsletter          =======+
+=======            December 2001            =======+
+===================================================+

The AGEDIS NEWSLETTER is E-mailed quarterly to subscribers worldwide to provide information on the AGEDIS EC project in particular and on software testing and software test generation in general.

Permission to copy and/or re-distribute is granted, and secondary circulation is encouraged by recipients of the AGEDIS newsletter provided that the entire document/file is kept intact and this complete copyright notice appears with it in all copies. Information on how to subscribe or unsubscribe can be found at the end of this issue.

(c) Copyright 2001 by imbus AG, Germany.

For best viewing quality we recommend the use of non-proportional fonts, like Courier in your email displayer.
========================================================================

o The Agony and the Ecstasy of Automated Test Generation for GUI-based Applications by Alan Hartman, Joachim Hofer, Kenneth Nagin, and Tomer Shiran

o TGV main principles by Thierry Jeron

o About us: The Software Engineering Programme at Oxford by Jim Davies

o AGEDIS at conferences: Conference Reports by Klaudia Dussa-Zieger, Jim Davies

o AGEDIS at conferences: Future Appearances

o AGEDIS newsletter Article Submission, Subscription Information

========================================================================

The Agony and the Ecstasy of Automated Test Generation for GUI-based Applications

by Alan Hartman, IBM Haifa < Email: hartman@il.ibm.com >

Introduction
------------

The tools for automated test generation are primarily used to generate test suites for application programming interfaces (APIs) or for manual test execution. The tools for automating the test execution of GUI-based applications are usually driven by test scripts that are either manually written or created by record and playback. It is rare to find instances of both automated test execution and automated test case generation used together for an application designed to work from a graphical user interface (GUI).

This article deals with three attempts to use automation for both the generation and execution of test suites for GUI-based applications. The tools we used were taken from the GOTCHA-TCBeans toolkit produced by the IBM Haifa Research Laboratory. These tools, together with the test generator TGV from Verimag and Irisa Research Laboratories, form the technology base from which the AGEDIS tools will be developed. Other more extensive experiments with GOTCHA-TCBeans and TGV are documented in papers available from the AGEDIS website (www.agedis.de). These experiments dealt with applications that did not have GUIs.

Note that GOTCHA-TCBeans is an internal IBM tool which is not available to the public.
However, GOTCHA-TCBeans and TGV are the basic technologies on which the AGEDIS tool set will be based.

Experiment 1
------------

In the first experiment, we attempted to generate and execute test suites on a component of the Eclipse suite of application development tools. The application was extremely difficult to test directly since its structure required that a lower level application be written in order to test the framework itself. The GUI was not written using standard GUI class libraries, so an execution framework like WinRunner could not be used to automate the test execution.

The test cases were automatically generated by GOTCHA, but in order to run them we had to do extensive coding of Java stubs to access the back end of the GUI. This in turn had the unfortunate consequence of generating a large number of defect reports that were rejected by the developers, since the situations they described could not be reached using the physical GUI. The bottom line was a lot of effort, no important bugs discovered (that were recognized as such by the developers), and high levels of frustration.

Lessons from Experiment 1
-------------------------

1. Developers should design a testing interface that is independent of the GUI (This is an example of the tester saying: It's not my problem; bad development habits are to blame).

2. Know when to give up on automation: If it can't be done efficiently, move on to more productive testing.

Experiment 2
------------

In the second experiment, we tested an open source Java application written by Kenneth Turner and Iain Robin, named JASPER: Java Simulation of Protocols for Education and Research. Since JASPER is a GUI-based application, we decided to extend its main class in order to provide a custom API. We wrote a subclass SimulatorAPI that extends the class ProtocolSimulator. Our subclass introduces public methods that dispatch events to the application's AWT elements.
For example, executing the method clear() is equivalent to pressing the Clear button in the GUI. Our main tool for interacting with JASPER was a method that instructs the simulator to select an action from the drop-down list at the bottom of the window. When writing the model for the sliding window protocol, we had to make sure that its rules corresponded with the application's GUI/API. We decided to implement one rule for each service in the application (each possible value in the drop-down list): - send data - timeout - resend - deliver to receiver (S2M to M2R) [with a parameter specifying the sequence number of the desired packet] - deliver to sender (R2M to M2S) [with a parameter specifying the sequence number of the desired acknowledgement] - send ack The GOTCHA model then generated a set of interesting sequences of these commands, testing the protocol simulator at all its extremities. The GOTCHA test generator is directed by a set of coverage criteria, which we focussed on areas of the simulation where we expected to find problems in the implementation. We also wrote some generic coverage criteria to provide general coverage of the application. The TCBeans execution engine verifies that the application's behaviour corresponds to the behaviour predicted by the model. It does this by inspecting the application's internal state. Since we didn't want to touch the original application's source, we decided to map the application's private variables via Java's reflection technique (this enabled us to access private fields). Lessons from Experiment 2 ------------------------- 1. If the application has a clear Java interface, it is not difficult to interact directly with the application by extending the Java code, thus utilizing the execution engine in GOTCHA-TCBeans. 2. We found one interesting bug, and we would have found more, except that the code was pretty high quality. 
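As a rough illustration of the modelling style used in Experiment 2 (one rule per service, with test sequences chosen to meet a coverage goal), here is a minimal Python sketch. Only the rule names come from the article. The state layout, the guards, the constants WINDOW and MAX_PACKETS, and the breadth-first search are assumptions made for this example; they are not the authors' model, the GOTCHA modelling language, or its generation algorithm.

```python
from collections import deque

# Hypothetical model parameters, chosen only to keep the state space tiny.
WINDOW, MAX_PACKETS = 2, 3

# State is (sent, acked, timed_out). Each rule maps to (guard, effect).
RULES = {
    "send data": (lambda s: s[0] - s[1] < WINDOW and s[0] < MAX_PACKETS and not s[2],
                  lambda s: (s[0] + 1, s[1], False)),
    "timeout":   (lambda s: s[0] > s[1] and not s[2],
                  lambda s: (s[0], s[1], True)),
    "resend":    (lambda s: s[2],
                  lambda s: (s[0], s[1], False)),
    "send ack":  (lambda s: s[1] < s[0] and not s[2],
                  lambda s: (s[0], s[1] + 1, False)),
}

INITIAL = (0, 0, False)


def shortest_test_for(target):
    """Breadth-first search for the shortest rule sequence ending with `target`."""
    queue = deque([(INITIAL, [])])
    seen = {INITIAL}
    while queue:
        state, path = queue.popleft()
        for name, (guard, effect) in RULES.items():
            if not guard(state):
                continue
            if name == target:
                return path + [name]
            nxt = effect(state)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None  # the rule can never fire in this model


def generate_suite():
    """One sequence per rule: a crude 'every rule fires at least once' criterion."""
    return {name: shortest_test_for(name) for name in RULES}
```

Calling generate_suite() yields one command sequence per rule. A real coverage criterion would normally target states or value combinations rather than bare rule names, so this shows only the shape of the idea.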
Experiment 3 ------------ The third experiment tested an application used in the imbus tester's training scheme. It has several embedded bugs in the computation of the expected cost of a motor car that is configured by the GUI. The logic of the computation is quite complex, and in the absence of any specifications, our model required several iterations before we agreed with the developers that we had indeed modelled their car configurator. This sequence of events is typical for model-based testing, and illustrates the added value provided by this testing methodology in the absence of adequate specifications. The most interesting aspect of the model is the use of discrete variables for the integer parameters of the discounts available. Another interesting feature was the use of coverage criteria to cover the input space, which was vast. GOTCHA produced a large test suite which was then converted into WinRunner scripts using the TCBeans test suite translation tool. The execution methodology used in Experiment 2 could not be applied in this case, since we did not have access to the source code. The test engineer, who was an experienced WinRunner user, had little difficulty producing WinRunner scripts from the GOTCHA test suite. The only problem encountered was in the size of the files generated. The test suite was run overnight (running for nearly 12 hours) with WinRunner stating 18702 assertions, of which 10825 failed and 7877 passed. The failures were analysed and five defects were exposed, including one which was previously unknown to the authors of the application. Lessons from Experiment 3 ------------------------- 1. The GOTCHA/TCBeans suite is able to find bugs that manual testers (or manual test case designers) are inclined to overlook. 2. The generated test script quickly becomes too large for WinRunner, so that some kind of post-processing will probably be necessary before the scripts are fed to WinRunner. 3. 
It is necessary to log a lot of "debug" information when a test fails in order to conveniently trace the cause of the bugs (the automatically generated test suite and test scripts are huge; visualizing and navigating them can be confusing). 4. Modelling is an iterative process as the specifications become clearer to the modeller. This has the added benefit of early detection of specification bugs. Validation and visualization of models becomes more important as the applications become more complex. 5. To avoid useless and time-consuming runs of large test suites, it has proven useful to conduct a pre-run on a small sample of the generated test script (alternatively: to conduct a pre-run with some simple coverage criteria and a small test suite) in order to quickly detect problems in the model, the test suite translation code, or the WinRunner utility code. Conclusions ----------- We should look at Experiment 2 as the prototype for testing GUIs when the source code is available, and Experiment 3 as the prototype for testing GUIs when we must work without the source code. In fact the approach used in Experiment 2 can be applied without the source code, provided we have access to the compiled Java classes and documentation of the interface. We should treat Experiment 1 as a lesson in humility, and be aware that model-based testing is not a silver bullet. ======================================================================== TGV main principles by Thierry Jeron < Email: Thierry.Jeron@irisa.fr > The AGEDIS test generation tool that we are building is based on the principles of two existing tools – TGV [1] and Gotcha [2]. In this brief article, we describe the main principles of TGV. TGV (Test Generation with Verification technology) is a test generation tool developed by Verimag and Irisa [1]. 
It came into being in 1995 when Jean-Claude Fernandez, Claude Jard, Thierry Jéron, and Joseph Sifakis imagined that model-checking algorithms could be transformed into efficient test generation algorithms. Model-checking is a verification technique whose principle is to verify that a specification satisfies a formula in a particular logic. Our idea was to describe test purposes as automata that describe abstract properties of test cases and to view the test generation process as the computation of evidence that a specification satisfies a test purpose.

Models and Theory
-----------------

TGV is based on a sound testing theory [3]. A model called IOLTS (Input Output Labelled Transition Systems) is used in several places to describe behavior with a clear distinction between inputs, outputs, and internal events. It is used, in particular, to model the behavior of specifications. The behavior of the implementation under test is assumed to be unknown (black box). Nevertheless, for mathematical reasoning, we assume that its behavior can be modelled by an IOLTS. In this case, a conformance relation "ioco" defines the correct implementations, I, with respect to a given specification, S.

To allow the detection of incorrect quiescence of I (by timers in test cases), ioco is defined in terms of traces of the suspension automata, d(I) and d(S). d(S) is built from S by the addition of loops labelled by a new output d in each quiescent state (i.e., a livelock, a deadlock, or an absence of output). Now, I ioco S if, after every trace of d(S) (including traces that contain d), the outputs of I are included in those of S, where d itself counts as an output.

The two main inputs of TGV are the IOLTS of the specification S and a test purpose TP used for test selection. TP is an automaton labelled by S actions with two distinguished sets of states used for test selection:

- Accept states are used to select the behaviors of S that one wants to test.

- Reject states may prune the exploration of S during test generation.
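The suspension automaton and the ioco check described above can be sketched for small, explicit graphs. This is a simplified illustration, not TGV's implementation: it assumes finite systems without internal actions (so livelocks are not considered), and the "?"/"!" label prefixes, the name DELTA, and the dictionary encoding are conventions invented for this example.

```python
DELTA = "delta"  # the extra quiescence output, written d in the text


def suspension(lts):
    """Add a DELTA self-loop to every state that offers no output."""
    return {
        state: list(edges)
        + ([] if any(label.startswith("!") for label, _ in edges) else [(DELTA, state)])
        for state, edges in lts.items()
    }


def outputs(lts, states):
    """Outputs (including DELTA) enabled in any of the given states."""
    return {l for s in states for l, _ in lts[s] if l.startswith("!") or l == DELTA}


def after(lts, states, label):
    """States reachable from `states` by one transition carrying `label`."""
    return frozenset(t for s in states for l, t in lts[s] if l == label)


def ioco(impl, spec, impl_init, spec_init):
    """Check out(d(I) after t) <= out(d(S) after t) for every trace t of d(S)."""
    d_impl, d_spec = suspension(impl), suspension(spec)
    start = (frozenset([impl_init]), frozenset([spec_init]))
    todo, seen = [start], {start}
    while todo:
        i_states, s_states = todo.pop()
        if not outputs(d_impl, i_states) <= outputs(d_spec, s_states):
            return False
        for label in {l for s in s_states for l, _ in d_spec[s]}:
            nxt = (after(d_impl, i_states, label), after(d_spec, s_states, label))
            if nxt[0] and nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


# A toy specification: after a coin, the machine must deliver coffee.
SPEC = {"s0": [("?coin", "s1")], "s1": [("!coffee", "s0")]}
GOOD = {"i0": [("?coin", "i1")], "i1": [("!coffee", "i0")]}
QUIET = {"i0": [("?coin", "i1")], "i1": []}  # stays silent after the coin
```

In this toy example GOOD conforms to SPEC, while QUIET does not: after the coin it is quiescent, and DELTA is not an output that SPEC allows at that point. This is exactly the kind of fault the suspension automata make visible.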
UNIX-like regular expressions are used to ease the description of labels in test purposes.

Algorithms
----------

The test generation process is based on verification algorithms. As stated above, test generation from a specification S and a test purpose TP can be viewed as the production of witnesses of the satisfaction of TP by S. But the process is more complex, because additional problems must be taken into account, such as partial observation, non-determinism, and the difference between observation and control. Moreover, efficiency is gained by on-the-fly algorithms.

Test generation is composed of several operations. A product S x TP, which synchronizes on common actions, is used to mark states of S with Accept and Reject states of TP. This operation may also unfold S. The second operation computes the suspension automaton d(S x TP) and determinizes it while propagating the marking on state sets. The result is a deterministic IOLTS SP_vis, with the same observable traces as d(S), and in which Accept and Reject states mark behaviors accepted or rejected by TP. A selection operation then builds two possible objects:

- A complete test graph CTG containing all test cases for TP. This consists of all traces leading to Accept (to which a Pass verdict is associated), plus divergences on these traces by outputs of S (giving rise to an Inconclusive verdict).

- A test case TC (a subgraph of CTG) obtained by the additional constraint that test cases never have controllability conflicts (i.e., choices between an output and another action).

In either case, for any state of CTG or TC where an output of S is observable, Fail verdicts are implicit on all unspecified outputs. Finally, to facilitate communication between test cases and implementations, a mirror image is applied to CTG or TC, which inverts inputs and outputs.

Test Case Properties
--------------------

The theory and test generation algorithms ensure two fundamental properties of the test cases generated:

1.
Soundness: No conformant implementation can be rejected by a test case. 2. Exhaustiveness: It is possible, at least in theory, to generate a test case that can reject every non-conformant implementation. From a more practical viewpoint, TGV also ensures qualitative properties. In fact, most test generation tools produce test cases as sequences (considering only deterministic specifications) or trees. However, to our knowledge, TGV is the only test generation tool that can generate test cases as graphs with loops and convergences (different sequences leading to the same state). The main interest is to limit the number of Inconclusive verdicts to a strict minimum, avoiding re-execution of test cases. On-the-fly generation --------------------- In order to avoid state explosion, a specification S can be given implicitly by a simulation API (functions for S's traversal) produced by a compiler of the input language. In this case, the test generation operations are not applied in sequence as described previously, but on-the-fly (except in the conflict resolution phase in some particular cases). In this case, only the necessary parts of S, of S x TP, and of SP_vis are built. This may dramatically improve test generation for large specifications, and even allow test generation for specifications with infinite state spaces. Test Generation Options ----------------------- TGV accepts various options. First, optional files may help define the test architecture. All support regular expressions. A rename file is used to rename labels of S, a hide file specifies unobservable actions of S and an IO file distinguishes inputs from outputs among observable actions. Some additional options enable tuning of the test generation process: exploration depth, computation of postambles, priorities on the order of exploration of transitions, and synthesis of timer operations (start, cancel, timeout). 
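The on-the-fly idea described above can be illustrated with a small Python sketch in which the specification exists only as a successor function. The successor function stands in for a simulation API, and the product with the test purpose is built only as far as the search needs. Everything named here is hypothetical: the counter specification, the TP table, and the convention that labels not mentioned by the test purpose leave it in its current state. The sketch finds a single accepted trace and does not cover suspension, determinization, or verdict assignment.

```python
from collections import deque


def spec_successors(n):
    """Hypothetical simulation API: an unbounded counter, so S is infinite."""
    yield ("!inc", n + 1)
    if n > 0:
        yield ("!dec", n - 1)
    if n == 3:
        yield ("!alarm", n)


# Test purpose: observe an alarm (Accept) without ever decrementing (Reject).
TP = {("start", "!alarm"): "accept", ("start", "!dec"): "reject"}


def tp_step(tp_state, label):
    # Assumption: labels the test purpose does not mention leave it unchanged.
    return TP.get((tp_state, label), tp_state)


def find_accepting_trace(spec_init, successors, tp_init="start"):
    """Breadth-first, lazy exploration of S x TP.

    Returns (trace, number_of_product_states_built).
    """
    start = (spec_init, tp_init)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        (s, q), trace = queue.popleft()
        for label, s2 in successors(s):
            q2 = tp_step(q, label)
            if q2 == "accept":
                return trace + [label], len(seen)
            if q2 == "reject" or (s2, q2) in seen:
                continue  # Reject states prune the exploration
            seen.add((s2, q2))
            queue.append(((s2, q2), trace + [label]))
    return None, len(seen)
```

Although the counter has infinitely many states, find_accepting_trace(0, spec_successors) stops after building only a handful of product states, because the Reject state cuts off every branch that decrements and the search ends at the first Accept.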
Languages -------- TGV can be used for different specification languages, as soon as a simulation API can be produced by a compiler of the required language. This has been done effectively for SDL with ObjectGeode (Telelogic) [4], Lotos with the CAESAR compiler of CADP [5], UML with the Umlaut tool [6] and IF with the IF2C compiler. TGV also accepts specifications in the form of explicit graphs in the BCG and Aldebaran formats of CADP. Test cases are produced in BCG or Aldebaran format, and in pseudo-TTCN (for SDL specifications). Experimentation --------------- TGV has been used in several case studies in different application domains, some of which are on an industrial scale. The first one was the ISDN D protocol specified in SDL [7]. An SDL specification of the ATM protocol SSCOP was also investigated [8]. Several specifications of a simple conference protocol in Lotos and SDL have been recently studied [9]. TGV has also been used on hardware systems with specifications in Lotos [10,11]. Distribution ------------ Two versions of TGV are available. One is a commercial implementation in the TestComposer tool of the Telelogic SDL toolset ObjectGeode. The other is distributed freely within the CADP toolbox [5] (note that TGV uses some CADP libraries). It is distributed in the form of a library that can be used for Lotos, IF, and UML specifications with the appropriate compilers and simulation API (see above). Bibliography ------------ [1] T. Jéron, and P. Morel. Test generation derived from model-checking. In CAV'99, LNCS 1633, Springer-Verlag, pp 108-122. Trento, Italy, July 1999. [2] I. Gronau, A. Hartman, A. Kirshin, K. Nagin, and S. Olvosvsky. A methodology and architecture for automated software testing, http://www.haifa.il.ibm.com/projects/verification/~gtb/papers/gtbmanda.pdf, 2000 [3] J. Tretmans. Test Generation with Inputs, Outputs and Repetitive Quiescence. Software - Concepts and Tools, 17(3), pp 103-120, Springer-Verlag, 1996. [4] R. Groz, T. 
Jéron, and A. Kerbrat. Automated test generation from SDL specifications. In Rachida Dssouli, Gregor von Bochmann, and Yair Lahav, editors, SDL'99 The Next Millennium, 9th SDL Forum, Montreal, Quebec, pp 135-152, Elsevier, June 1999. [5] J.-C. Fernandez, H. Garavel, A. Kerbrat, R. Mateescu, L. Mounier, and M. Sighireanu. CADP: A protocol validation and verification toolbox. In Proc. of CAV'96, New Brunswick, New Jersey, USA, LNCS 1102, Springer Verlag R. Alur and T. A. Henzinger (Ed.), August 1996. [6] T. Jéron, J.-M. Jézéquel, and A. Le Guennec. Validation and test generation for object-oriented distributed software. In IEEE Proc. Parallel and Distributed Software Engineering, PDSE'98. Kyoto, Japan, April 1998. [7] J.-C. Fernandez, C. Jard, T. Jéron, and G. Viho. An experiment in automatic generation of conformance test suites for protocols with verification technology. Science of Computer Programming, 29:123-146, 1997. [8] M. Bozga, J.-C. Fernandez, L. Ghirvu, C. Jard, T. Jéron, A. Kerbrat, P. Morel, and L. Mounier. Verification and test generation for the SSCOP protocol. Journal of Science of Computer Programming, special issue on Formal Methods in Industry, 36(1):27-52, January 2000. [9] A. Belinfante, L. Du Bousquet, S. Ramangalahy, S. Simon, C. Viho, and R. De Vries. Formal test automation: the conference protocol with TGV/TorX. In IFIP TC6/WG6.1 13th International Conference on Testing of Communicating Systems, TestCom 2000, Ottawa, Ontario, Canada, Kluwer Academic, pp 221-228, August 2000. [10] H. Kahlouche, C. Viho, and M. Zendri. An industrial experiment in automatic generation of executable test suites for a cache coherency protocol. In IFIP TC6 11th International Workshop on Testing of Communicating Systems, A. Petrenko, N. Yevtushenko (ed.), Chapman & Hall, September 1998. [11] H. Kahlouche, C. Viho, M. 
Zendri, Hardware testing using a communication protocol conformance testing tool , in Tools and Algorithms for the Construction and Analysis of Systems (TACAS'99), W. R. Cleaveland (ed.), LNCS 1579, Springer Verlag, p.315-329, March 1999. ============================================================================== About us: The Software Engineering Programme at Oxford by Jim Davies The Programme provides postgraduate-level education to professional software engineers - developers, testers, architects, and managers - working in the computing and telecommunications industries. At present over 200 employees of companies such as IBM, Nokia, Motorola, and Ericsson are studying, part-time, for postgraduate qualifications from the University of Oxford. The teaching activity of the Programme is closely related to the research activity of its academic staff: most of these are members of Oxford University's Computing Laboratory (OUCL), while the others are subject experts working for other institutions, such as the University of Edinburgh, or companies, such as Telelogic. Four of the academic staff are directly involved in the AGEDIS project: Jim Davies, Alessandra Cavarra, Andrew Martin, and Charles Crichton. Jim is also the Director of the Programme: he has been researching into the modelling of concurrency, as well as teaching courses to industry, since 1987. Alessandra completed her doctorate at the beginning of the year: she was studying the semantics of UML in ASM; she works exclusively for the AGEDIS project. Andrew Martin is a lecturer on the Programme: apart from Oxford, he has worked at Southampton, and at the SVRC in Queensland; he organises the Programme's teaching in the area of software testing. Last but not least, Charlie is a researcher working towards his doctorate on the concurrency semantics of object-oriented models. 
==============================================================================

AGEDIS at Conferences: Conference Reports

AGEDIS at EuroSTAR 2001 and at the Technology Forum

by Klaudia Dussa-Zieger

EuroSTAR 2001 ( http://www.testingconferences.com ), the European conference on software testing analysis and review, is the biggest and best-known conference on software testing in Europe. It combines a three-day conference with two days of interesting tutorials beforehand and an excellent exhibition of test tool vendors and consultants, which is open during the whole conference.

The 9th EuroSTAR took place this year at the Sollentuna exhibition center in Stockholm, Sweden, November 19-23, 2001. As in previous years, the conference was well attended by the community, although the aftermath of September 11th could also be felt. Altogether about 450 visitors came together from all over Europe and even from overseas countries like Mexico. In addition there were about 150 speakers, invited speakers and tool vendors.

A presentation on AGEDIS with the title "AGEDIS - Research into automated generation and execution of test suite" was given by Dr. Dussa-Zieger, imbus AG, on November 21st. The talk was placed in the tools session and it was well attended. The intention of the presentation was to give the audience insight into what AGEDIS is about. Besides the overall objective and structure of AGEDIS, the presentation focused on the AGEDIS modelling language. The language itself was presented and contrasted with other approaches. The interfaces to other test tools, e.g. commercially available capture-and-replay tools, were also outlined.

The 45-minute presentation was followed by a discussion in which the audience clearly showed that it had understood the content of the talk. Questions on the state space explosion during the generation of test cases were asked, as well as questions like "Are we ever planning to generate 'normal' code with our approach?".
One of the most interesting and intensely discussed questions was the following: "Let us assume that AGEDIS would use the UML models produced during the design of the system. If we based the generation of the test cases on these UML models, do we expect to find any faults, or do we fall into the same trap as the developer who does his own tests?" We welcome comments from the readers on this question. Feel free to join the discussion groups on the AGEDIS website, http://www.agedis.de .

In summary, the presentation of AGEDIS went very well. After the talk we were even approached by the organizers of LatinSTAR2002 and asked whether we would like to give the same presentation there. :)

One word on Stockholm. It is a lovely place and absolutely recommended for a visit. However, I would choose to go there in summer. As Stockholm is located on 14 islands, it is on the water and pretty windy. In November we already had snow and it was cold! If you go there, a visit to the Vasa museum is an absolute must.

AGEDIS at the Technology Forum

The technology forum is a workshop which was organized by method park in Frauenaurach, Germany, on December 4th, 2001 ( http://www.methodpark.de ). It is an event which allows the visitors to discuss new ideas. The theme of this year was to isolate factors for the efficient generation of software. In that context the focus of all presentations was on UML. imbus AG represented the test process during software development and gave a presentation on AGEDIS. The content of the talk was essentially that of the presentation at EuroSTAR 2001. Here again, the talk was well received and sparked the interest of a number of people in the audience.

-----------------------------------------------------------------------

AGEDIS at Comdex Fall 2001 and at UML 2001

by Jim Davies

Oxford University Computing Laboratory (OUCL) were invited, by the UK's Department for Trade and Industry (DTI), to participate in UK@COMDEX: the UK's presence at Comdex Fall 2001.
Of the various research projects underway at OUCL, the one that caught the attention of the DTI was AGEDIS.

Before September 11th, the organisers were expecting a minimum of 200,000 visitors, and planned the space accordingly: by November the figure had dropped to 150,000. There was considerable disruption to air travel during the week of the show, starting with the American Airlines crash in New York. Whatever the cause, a lot of people registered (and paid, even) but stayed away: unofficial estimates ranged between 60,000 and 80,000 visitors; our contact at Key3Media (the show's organisers) was sticking with a figure of 150,000. Worse still, many of the _exhibitors_ had evidently pulled out; there were walled-off areas in both of the main halls.

There were long periods during which no-one came to speak to me, there was never anyone waiting for me to return from one of my speculative, promotional expeditions, and I was surprised by the low audience figures for the talks: there were only about 10 or 12 people listening to me. On the other hand, this was a lot compared with other talks, which attracted much lower audience figures!

It wasn't a complete waste of time: the fact of our participation has raised our profile within the UK, and amongst a wide, almost random selection of Comdex attendees: there were a surprising number of executives and development managers amongst the salespeople. However, thinking of the value that should have been gained from seven days, I wasn't too pleased. We could have hoped for more.

AGEDIS at UML 2001

In comparison, our trip to UML 2001, in Toronto, was incredibly valuable. We had the opportunity to discuss our ideas with a variety of people involved in the design of UML, the formulation of its semantics, and the development of tools. We obtained support for our ideas, and a unique insight into the workings of the UML community.
We were scheduled to present two papers at a concurrency workshop, one of four one-day workshops preceding the main conference. The organisers had opted for a mixture of discussions, short presentations, and group working, leading up to a more informed, related set of presentations at the end of the day. As a forum, it was enlightening. In particular, in discussion with Bran Selic, we were reassured that our impressions regarding "active objects" and multithreading were correct. Many of the talks at the conference were slightly disappointing: those who had similar goals to ourselves had made comparatively little progress; those who had different goals had little to report (that interested us, at least). The panel sessions were more useful, particularly that on formal semantics: the synthesis of views, from Perdita Stevens' determined pragmatism to, well... some less determined positions, was quite reassuring. We drew two important conclusions: 1. our interpretation of the `dynamic aspects' of UML is entirely consistent with that of the original authors 2. the definition of the AGEDIS `profile' and semantics fits well with the predicted evolution of UML At the same time, we made a significant number of people aware of the AGEDIS project, and what it is trying to do with (and for?) UML. ============================================================================== AGEDIS at Conferences: future appearances Talks on AGEDIS within the next months can be found at the following conferences: Quality Week Europe 2002 - March 11-15, 2002, Brussels, Belgium, http://www.qualityweek.com - AGEDIS will be presented on March 13th, 2002 ======================================================================== ----->>> AGEDIS newsletter ARTICLE SUBMISSION POLICY <<<------ ======================================================================== AGEDIS newsletter is E-mailed quarterly to subscribers worldwide. 
The submission policy of AGEDIS newsletter is:

o Length of submitted articles should not exceed 350 lines (about four pages). Longer articles are OK but may be serialized.

o Length of submitted event announcements should not exceed 60 lines.

o Publication of submitted items is determined by the AGEDIS consortium, and items may be edited for style and content as seen necessary.

DISCLAIMER: Articles and items appearing in AGEDIS newsletter represent the opinions of their authors or submitters; AGEDIS newsletter disclaims any responsibility for their content.

========================================================================
--->>> AGEDIS newsletter SUBSCRIPTION INFORMATION <<<---
========================================================================

To SUBSCRIBE to AGEDIS newsletter, visit the AGEDIS web site. To UNSUBSCRIBE or CHANGE your address, send email to Johannes Trost. Please, when using either method to subscribe or unsubscribe, type the email address correctly and completely. Requests to unsubscribe that do not match an email address on the subscriber list are ignored.

imbus AG
Kleinseebacher Strasse 9
D-91069 Moehrendorf
Germany

## End ##