UBC Undergraduate Research

Extending FindBugs to detect test bugs. Rezaiean-Asel, Armin. Sep 30, 2015.

EXTENDING FINDBUGS TO DETECT TEST BUGS

Undergraduate Thesis
in the
Department of Electrical and Computer Engineering
Faculty of Applied Science
University of British Columbia
EECE 496

by
Armin Rezaiean-Asel
September, 2015

Acknowledgements

I would like to thank two individuals for their assistance and guidance in the completion of this thesis.

Firstly, Dr. Mesbah, for taking the time to supervise my undergraduate research endeavor and for providing support and advice throughout the process.

Secondly, Arash Vahabzadeh, for his assistance throughout the development and execution of my work. From idea generation to running the experiment and finalizing this paper, his help and guidance were invaluable.

Abstract

A number of bug detection tools currently exist and are used in development processes. However, not all bugs are properly detected by such tools. In this paper, the FindBugs tool is explored with respect to production test code. An empirical study of its bug detection capability is conducted, resulting in an analysis of the prevalence of false negative results as well as a categorization of patterns that lead to such results. Furthermore, potential solutions for decreasing false negatives are explored.

A number of research questions are posed, all of which revolve around how the FindBugs tool can be made more accurate in detecting test bugs. Following an exploration of these questions, I discuss some of the lessons learned and further work that can be done in future research initiatives.

Glossary

Bugs: Software bugs are characterized as a failure, fault, or error in a computer program.
Bugs result in a particular piece of software producing false outcomes and behavior, essentially behaving in a manner that is not expected or desired.

Empirical Study: A study or analysis based on observations and evidence, rather than solely on theory.

False Negative: A result that appears to be negative when in fact it should not be.

False Positive: A result that appears to be positive when in fact it should not be.

FindBugs: An open-source bug detection tool for Java.

Static Analysis: An analysis done on computer software without executing any of the programs.

Figures

Figure 1: Breakdown of Bug Report Findings
Figure 2: FindBugs Report Snapshot for Helix Admin Webapp
Figure 3: Example of code producing bug 1
Figure 4: Example of code producing bug 2
Figure 5: Example of code producing bug 3
Figure 6: Example of code producing bug 4
Figure 7: Breakdown of Detector Findings
Figure 8: HardCoding Detector on Recent Commits
Figure 9: FileCreation Detector on Recent Commits
Figure 10: EnviroVars Detector on Recent Commits

Contents

Glossary
Figures
1 Introduction
2 Approach
  2.1 Methodology
  2.2 Sourcing
  2.3 Running FindBugs
  2.4 Comparison
  2.5 Bug Identification
  2.6 Detector Implementation
  2.7 Detector Analysis
3 Results
  3.1 False Negatives
  3.2 Bug Patterns
  3.3 Custom Detectors
4 Discussion
  4.1 Lessons Learned
  4.2 Critique of Experiment
5 Related Work
  5.1 Static Correspondences Reported by Bug Finding Tools
  5.2 Validation of FindBugs
  5.3 Benchmarks for Bug Detection Tools
6 Conclusion
  6.1 Future Work
  6.2 Final Thoughts
References

1 Introduction

Bug detection tools are useful to developers of all levels. Many different types of detectors currently exist, with some being more popular in development environments than others. In this paper, I explore the accuracy of a popular detection tool, FindBugs, with respect to production test code and its ability to detect test bugs.

FindBugs is an open-source static analysis tool [1]. As a static code checker, FindBugs checks software for instances of bugs without executing the code, hence the name static code analysis [2]. Rather than analyzing source code, it works on the compiled bytecode of a program; other static checkers for Java work directly on a program's source code.

Certain successful benchmarks currently exist for assessing the quality of bug detection tools [3] [4], which we could theoretically use to deduce a quality assessment of FindBugs. However, not many benchmarks exist for assessing the quality of tools as they pertain to test bugs. To further analyze the quality of the FindBugs tool, especially with respect to test code, I explore a specific area of the FindBugs results: false negative detection. A false negative detection is a bug detection that appears negative even though it should not be.
In other words, if a bug is not detected but it exists and therefore should have been identified, the instance is classified as a false negative result.

After exploring the frequency of false negative detection in production test code, I identify the causes that lead to those instances. Following that, I explore possible means of solving the detection errors, and I implement the proposed solutions as custom bug detectors.

Throughout the experiment, the research questions that will be explored are as follows:

RQ1: How frequently does the FindBugs tool fail at identifying test bugs?
RQ2: What types of scenarios and bugs lead to false negative detection?
RQ3: How can the FindBugs tool be modified to cover these holes?
RQ4: Can the suggested modifications lead to more accurate bug detection results?

To conclude this report, I provide an analysis of the accuracy of the implemented solutions and explore whether the newly implemented bug detectors provide superior results in test bug detection.

2 Approach

Several different steps are followed, designed to address the aforementioned research questions. Below, I discuss the overall methodology of this experiment, as well as additional details on each step.

2.1 Methodology

In order to gather the necessary data for the research questions identified above, I conduct my experimentation in the following steps:

Step 1: Source repositories for open-source projects. These projects must possess test bug reports with fixing commits.

Step 2: Run the FindBugs tool on a project from step 1, generating an HTML bug report. Repeat this step with the commit one prior, thus creating bug reports for pre- and post-bug-fix instances.

Step 3: Compare the two generated bug reports in order to identify whether any cases of false negative bug detection exist.

Step 4: Repeat steps 2 and 3 for all projects gathered in step 1.

Step 5: Identify the types of bugs being fixed in each project's initial commit. Using this information, we can gather insight into the types of bugs not being detected by the FindBugs tool. This step is relevant for projects that have false negative detection cases.

Step 6: Implement custom detectors to identify the bugs being missed in the false negative reporting.

Step 7: Analyze the performance of the custom detectors.

2.2 Sourcing

In step 1, after acquiring a list of potential projects with which to run the experiment [5], I set up a PostgreSQL database [6]. In the database, each project is listed with its GitHub commit ID, bug report information, and other project-relevant information.

2.3 Running FindBugs

In step 2, I first check out the project at its post-bug-fix commit, the ID stored in the database. In order to save time and not have to build every project through an IDE, I run FindBugs using its Maven plugin [7], and transform the default XML report into a readable HTML format, all via the command line:

    mvn findbugs:findbugs
    mvn xml:transform

Afterwards, I revert the project to the commit one prior to its current state:

    git checkout HEAD~1

The initial commit on which I run FindBugs is the project commit after some bug has been fixed. As such, by doing the same thing on the prior commit, I am able to run FindBugs on a version of the project that definitely contained the bug.

2.4 Comparison

After generating FindBugs reports for both project commit IDs in the previous step, I compare their outputs for potential matches. For cases in which the two project commit
For cases in which the two project commitIDs’ reported bugs are identical, I am able to confirm the existence of a false negativeidentification. For example, bugX has just been fixed through commit 2 on a givenproject, leaving the project with bugA, which is reported by FindBugs. When we revertthe project to commit 1 and run FindBugs, it identifies bugA, but not bugX, even thoughbugX has yet to have its fix added in the subsequent commit. Consequently, FindBugspresented a false negative account on the existence of bugX, and this was identified bythe FindBugs report only mentioning bugA in each of the commit instances.2.5 Bug IdentificationAfter going through the previous steps and identifying which projects possess cases offalse negative bug detection, I analyze those projects for cases of bugs that are not beingdetected by the FindBugs tool. Firstly, I look at the code changes made between thetwo relevant commits (using git log —p) of the project to see what changes were made inthe code. Secondly, I reference bug descriptions from the Apache JIRA Issue Tracker [8].Finally, by comparing the descriptions with the code changes from git log, I take note ofthe identifiable patterns that emerge.2.6 Detector ImplementationOnce non-detected bug patterns are identified, I implement custom bug detectors in orderto properly track these patterns through a FindBugs sweep. For each bug pattern iden-tified in the previous step, a detector class is created, with certain similar bug patterns3being covered by the same detector. The findbugs.xml and messages.xml are then editedto add meta information on the new detectors. These files are then packaged into a jarfile and added into the FindBugs installation.2.7 Detector AnalysisIn this final step, I analyze the accuracy of the custom detectors, checking their abilityto find the bugs for which they were created. This is accomplished by repeating step 2and step 3 from above, but using the new detectors. 
The goal is to see the bugs beingidentified in the earlier commit, but not being detected in the subsequent, post-bug fixcommit. This would imply proper bug identification. I also run the detectors on recentversions of certain sample projects in order to see what sort of bug detection patternsemerge on the latest commits.3 ResultsI run FindBugs on 143 projects. Overall, the majority of these project cases present falsenegative results, which shows us that the FindBugs tool isn’t as strong as it could be withrespect to identifying test bugs. Below, I discuss these results in more detail.3.1 False NegativesOf the 143 tested projects, although most present cases of false negative bug detection,there are also instances of proper and false positive identifications. In summary, the bugreporting is as follows: 8 properly identified cases, 2 false positive cases, and 133 falsenegative cases. This breaks down into percentages as shown in the table below.Finding Type Occurrences PercentageFalse Negative 133 93.0%False Positive 2 1.4%Properly ID’d 8 5.6%Figure 1: Breakdown of Bug Report FindingsWhen reading the FindBugs reports, I analyze the Browse by Categories tab to iden-tify which types of bugs come up – and how frequently they occur – in the particularproject. If everything matches identically between the two commits of a project, I am4able to conclude that the same types of bugs are being detected - and at the same fre-quency - between the two commits. As such, false negative bug detection has occurred.Below is an example of a project FindBugs report that leads to such a conclusion. Forboth pre - and post - bug fix commits, the number and type of bugs are identical eventhough the later commit was after a bug fix.Figure 2: FindBugs Report Snapshot for Helix Admin Webapp, HELIX-398As stated above, there are 10 non false negative cases. In such examples, a differenceexists between the FindBugs reports. 
For these projects, I make notes to review them in more detail later, to distinguish whether each is a case of proper bug identification or otherwise (a false positive). When conducting the review, I analyze the differences between the commits to see what code changes lead to the difference in FindBugs reports. Ultimately, 8 of these cases' FindBugs reports have differences that stem from proper bug identifications, where proper detectors exist for the bugs in question. In 2 cases, however, the differences stem from a false positive bug being reported.

3.2 Bug Patterns

Upon identifying the project cases that contain false negative bug detection, I start analyzing bug descriptions as well as code changes between commits for those particular projects. Additionally, I look into some known missed patterns in FindBugs' test code bug detection; in other words, patterns that are not on FindBugs' list of currently detected patterns. The reason for this is that, as I looked into the various projects that presented false negative detection results, I noticed that many of the bug reports outlined bugs that could not be generalized into a broader pattern (and were rather individualized to the given project and code base). This allowed me to analyze more new detection patterns without being bottlenecked by a lack of general patterns in the sample projects.

Combining the above approaches, a set of missed environmental bug patterns, not currently covered by FindBugs, is developed, accounting for some of the false negative results acquired in the FindBugs reports of the test code. I also discuss, below, potential methods for fixing the bugs.

Bug 1: Within a test, strings with hard-coded line ending characters should not be compared.

Figure 3: Example of code producing bug 1

Potential Fix: Although a potential fix for this bug is obvious (strings of this nature should not be compared), a custom detector was created in order to track this previously undetected bug type within FindBugs.

Bug 2: Within a test, hard-coded environment variables should not be used.

Figure 4: Example of code producing bug 2

Potential Fix: As with bug 1, a potential fix for this bug is obvious (such cases should not be hard-coded), but a custom detector was created in order to track this previously undetected bug type within FindBugs.

Bug 3: Comparing strings that include certain types of file paths (containing a solidus, i.e. a forward slash, and spaces) will result in failure.

Figure 5: Example of code producing bug 3

Potential Fix: As with bugs 1 and 2, a potential fix for this bug is obvious (strings containing a file path for comparison should not include failure-inducing characters), but a custom detector was created in order to track this previously undetected bug type within FindBugs.

Bug 4: On both Windows and Unix, creating a file with a solidus in its name will lead to failure.

Figure 6: Example of code producing bug 4

Potential Fix: As with bugs 1, 2 and 3, a potential fix for this bug is rather simple (don't include a solidus in the name of a file being created), but a custom detector was created in order to track this previously undetected bug type within FindBugs.

3.3 Custom Detectors

The FindBugs tool already has a number of different bug detectors. As such, I first identify existing detectors that fulfill some of the work necessary for implementation; in other words, detectors whose functionality can be extended. This is done to help create custom detectors for the above bug cases.
The custom detectors can be viewed at https://github.com/arminrez/EECE496

The detector HardCoding.java accounts for bugs 1 and 3, checking for hard-coded line ending characters as well as solidus and spaces.

The detector FileCreation.java accounts for bug 4, checking for unwanted characters (a solidus) in the names of newly created files.

The detector EnviroVars.java accounts for bug 2, checking specifically for the hard-coded environment variable JAVA_HOME, since that was the variable appearing in the example project cases used.

Ultimately, the custom detectors slightly improve the bug search results by decreasing the number of false negative cases in the test bug detection. However, there is an increase in false positive results. Overall, there is a decrease in false negatives (from 133 to 130) and an increase in false positives (from 2 to 7).

    Detector       Bugs Addressed   Correctly Detected   False Positives
    HardCoding     5                2                    3
    FileCreation   1                0                    1
    EnviroVars     2                1                    1

Figure 7: Breakdown of Detector Findings

In the above table, the Bugs Addressed column represents the number of identified cases the detector found throughout the various projects. The Correctly Detected column notes how many of those identifications were of previously false negative test bug detections; these are identification cases that are now fixed and correct (and no longer false negatives). The False Positives column represents the number of bug identifications that actually turned out to be false positives. Therefore, for each detector, the values in Correctly Detected and False Positives sum to the value in Bugs Addressed, since they are a breakdown of that total. False positive numbers are gathered by my perusal of the code in question: a false positive is identified when the bug pattern did not exist in the code, even though the detector flagged it.

With regards to the above results, one noteworthy point is the higher level of false positives. This could be due to the construction of the custom detectors and the fact that they misidentify clean code cases that are similar to the bug case.

In addition, it is also important to observe the rate of false negatives that were "fixed," so to speak. The overall value is not very high per detector, which could mean a few different things. Firstly, the detectors could require more rigorous fine-tuning. Secondly, the bug patterns that led to false negatives in many of the sample projects may have been of some other type and category of bug (or individualized and not general outside of the specific project, as previously mentioned), and therefore not covered by the detectors or the patterns identified for analysis in this report.

Furthermore, the FileCreation detector did not correctly detect any bugs of its type. Earlier, I mentioned that the bug patterns addressed in this research were identified both through the code of the sample projects and through their being known missed patterns in FindBugs' test code bug detection. This pattern falls under the latter category. Therefore, I was not necessarily expecting to see any detection of this bug pattern in the results; however, I wanted to test for it and see if any cases existed.

Next, the custom detectors were used on the most recent commits of a small sample of projects, in order to see what sort of bug detection trends would arise.

    Project   Warnings Generated   False Positives
    BIGTOP    1                    1
    FLINK     0                    0
    CURATOR   1                    1
    FLUME     2                    1
    JCLOUDS   0                    0

Figure 8: HardCoding Detector on Recent Commits

    Project   Warnings Generated   False Positives
    BIGTOP    0                    0
    FLINK     1                    1
    CURATOR   0                    0
    FLUME     1                    1
    JCLOUDS   0                    0

Figure 9: FileCreation Detector on Recent Commits

    Project   Warnings Generated   False Positives
    BIGTOP    0                    0
    FLINK     0                    0
    CURATOR   0                    0
    FLUME     0                    0
    JCLOUDS   1                    1

Figure 10: EnviroVars Detector on Recent Commits

As seen in the above results, based on the recent versions of the sample projects, some detectors are more successful than others at tracking potential bugs.
As before, the FileCreation detector did not successfully identify any bug cases, and EnviroVars mimicked FileCreation in this regard. HardCoding, however, was able to identify certain cases successfully, which is consistent with its superior performance in the previous analysis, as shown in figure 7.

4 Discussion

Below, I provide more detail on some of the lessons learned, based on the earlier research questions. Furthermore, I provide a critique of the experiment, with thoughts on what areas could be improved in the future.

4.1 Lessons Learned

After conducting the experiment, some lessons can be identified by delving into the original research questions for further analysis.

Lesson 1

With RQ1, I wanted to explore the frequency of correctness with respect to FindBugs' test bug detection. As seen through the sample projects used in the experiment, 93.0% of the examples resulted in false negative results, while only 5.6% of cases were properly identified.

L1: With regards to the identification of test bugs, the FindBugs tool has a high level of false negative bug detection. This is partly due to a lack of the custom detectors needed to identify some of the bug patterns that lead to these false negative results.

Lesson 2

With RQ2, I wanted to analyze the types of bug scenarios that lead to false negative detection. As we saw with L1, a large proportion of false negative detection occurs with test bug identification; therefore, some patterns are discernible. Furthermore, aside from patterns found in our sample projects' code, some other types of bugs can lead to false negative results with FindBugs.

L2: Some of the types of bugs that lead to false negative detection are outlined in further detail in section 3.2. It is important to note that there are many other bug scenarios that lead to false negatives with test bug detection using FindBugs; for example, various types of resource leak bugs are not currently covered in bug sweeps.

Lesson 3

With RQ3, I wanted to look at whether FindBugs could be modified to cover some of the bugs identified through L2. Because custom detectors can be added to the FindBugs tool, it can indeed be modified to cover those bugs. The more valuable inquiry was to determine how to do so most effectively.

L3: By extending similar FindBugs detectors, we can most effectively implement custom detectors for our particular bug types. Once this is complete, packaging the newly implemented class along with updated XML files (containing the new detector's meta information) will enable the detector in a FindBugs sweep.

Lesson 4

With RQ4, I wanted to study the proposed modifications from L3 in order to see whether they helped increase the FindBugs tool's overall accuracy with test bugs.

L4: By implementing these custom detectors, the number of false negative cases decreased by three (from 133 to 130, roughly a two percentage point reduction). Thus, implementing custom detectors to track test bugs of the kinds that lead to high rates of false negatives can ultimately decrease false negative results and increase properly identified bugs. However, these detectors also led to an increase in false positives in certain cases.

4.2 Critique of Experiment

Although the experiment was successful in identifying answers and lessons through the research questions posed in the introduction, there were some areas in which improvements to the experiment and process could have benefited the end result. These are worth noting for future research activity in this area.

Firstly, after implementing the custom detectors, although false negatives decreased, the frequency of false positive bug detection (albeit trivial) increased. Although the goal was
Although the goal wasto address false negative cases, an increase in false positives could indicate that additionalfactors in the code’s structure, which could lead to such results, hadn’t been noticed, orthat perhaps the detector class was being too stringent in what it searched within thecode. More attention could be given to this area in the future.On the note of custom detector errors, there wasn’t any means of accounting for hu-12man error. Unfortunately, developers can write programs that include bugs, and thatincludes myself. So, if the detectors had code that was imperfect in any way, there wasn’tany process in the experiment to account for that. This could also be an explanation forthe increase in false positives.In summary, the experiment provided valuable insight into its main research focus; how-ever, there were certainly areas that warrant critique and further scrutiny if this work andresearch were to be continued in the future.5 Related WorkThe research discussed in this thesis revolves around conducting an empirical analysis ofthe FindBugs tool’s accuracy with respect to test bug detection. Given that bug detectionpractices and static code analysis – especially with the FindBugs tool - are prevalent insoftware engineering practices, there have been many other research initiatives in thisarea. They cover topics from validating and setting benchmarks for bug detection tools,to analyzing specific aspects of these same tools.5.1 Static Correspondences reported by Bug FindingToolsThe paper Static correspondence and correlation between field defects and warnings re-ported by a bug finding tool [9] studied the level of correlation between field defects andFindBugs warnings. 
It evaluated static correspondence and statistical correlation, andultimately, the results showed that there was just a small amount of statistical correlation(warnings indicating future potential field defects).Their work relates to the research in this paper because it provides another angle tothe general analysis of the FindBugs tool. By studying field defects and FindBugs warn-ings, they were able to identify a potential correlation between the two, which helps usbetter understand the quality of the FindBugs tool.5.2 Validation of FindbugsIn the paper An empirical validation of FindBugs issues related to defects [10], researchis done to explore how often issues from the FindBugs tool are actual defects. Addition-ally, it looks into what types of issues are normally actual defects. After conducting the13experiment, it was concluded that not many issues are related to actual defects.The conclusion drawn from the research could help developers reduce the FindBugs re-sults that aren’t defect-related, thus potentially helping them discover defects in less timethrough greater prioritization of results. This is an interesting point to consider withrespect to my experiment because some of the false positive and false negative cases arelikely not related to actual defects. Therefore, not prioritizing their results could improvethe efficiency and time required of identifying actual defect-related bugs.5.3 Benchmarks for Bug Detection ToolsIn BugBench: Benchmarks for Evaluating Bug Detection Tools [11], an analysis is doneon distinguishing appropriate bug benchmarks. Following this, a bug benchmark suitewas developed. 
A number of bug detectors are then evaluated in order to validate the benchmarks selected.

It would be interesting to see how a set of relevant bug detection tool benchmarks would apply to FindBugs, especially to the aspects of FindBugs that may or may not be leading to false negatives and false positives.

6 Conclusion

6.1 Future Work

Within this paper, I described an experiment that explored the FindBugs tool and its ability, as well as its accuracy, in detecting test bugs. As part of this process, I outlined a set of test bug patterns that were not being properly identified by the tool. Although I ran tests to see how accurately those bugs could be detected after the implementation of custom detectors, and false negatives decreased, there is still much room for improvement and future research.

Additional bug patterns can be identified. Evidently, beyond the sample of projects used in this experiment, there exist many others that were not analyzed or discussed. Within them, there are surely many further bug patterns going undetected at the moment, leading to additional false negatives in test bug identification. By further studying new projects and discovering other undetected bug patterns, we can potentially reduce false negative detection even more. By identifying increasing numbers of bug patterns and implementing detectors, test bug detection with FindBugs can become more and more accurate.

Furthermore, research can be done into the presence of the newly identified patterns in the wild. At the moment, the custom detectors were run on project commits that had a bug present, and on the subsequent commit after the bug had been fixed. However, it is worth exploring further commits to identify whether certain bugs, and which types, recur in test code, as well as how frequently. This analysis was done briefly with a small number of projects, as shown in figures 8 to 10, but it can be explored in more detail and with more projects. This knowledge can then be used to analyze test code development pitfalls and common bugs written in test code by developers.

6.2 Final Thoughts

Although FindBugs is a very popular bug detection tool for Java, it has a lot of room for improvement with regards to test bug detection. A large portion of the sample projects returned cases of false negative bug detection. Even after implementing custom detectors to cover some of the missed patterns that arose in the projects, a sizable proportion of false negatives remained. Nevertheless, this did signify a small improvement in test bug detection with the FindBugs tool, and it showed promise that supplemental custom detectors, for pertinent bug patterns, could help decrease false negative rates even more.

Therefore, as was identified through the research questions and lessons learned, FindBugs can be effectively modified to cover previously undetected test bugs, and the changes lead to a decreased level of false negatives. This was demonstrated through the sample projects used in this experiment.

References

[1] FindBugs - Find Bugs in Java Programs. URL: http://findbugs.sourceforge.net/.
[2] Louridas, P. Static code analysis. In IEEE Software, pages 58-61, 2006.
[3] Standard Performance Evaluation Corporation. SPEC benchmarks. URL: http://www.spec.org/.
[4] Transaction Processing Council. TPC benchmarks. URL: http://www.tpc.org/.
[5] Vahabzadeh, Arash and Milani Fard, Amin and Mesbah, Ali. An Empirical Study of Bugs in Test Code. In Proceedings of the International Conference on Software Maintenance and Evolution (ICSME), 10 pages, 2015.
[6] pgAdmin PostgreSQL Tools. URL: http://www.pgadmin.org/.
[7] FindBugs Maven Plugin. URL: http://mvnrepository.com/artifact/org.codehaus.mojo/findbugs-maven-plugin/.
[8] The Apache Software Foundation. URL: https://issues.apache.org/jira/.
[9] Couto, Cesar and Montandon, João Eduardo and Silva, Christofer and Valente, Marco Tulio. Static correspondence and correlation between field defects and warnings reported by a bug finding tool. Software Quality Journal, pages 241-257, 2013.
[10] Vetro, A. and Morisio, M. and Torchiano, M. An empirical validation of FindBugs issues related to defects. In Evaluation and Assessment in Software Engineering (EASE 2011), pages 144-153, 2011.
[11] Lu, Shan and Li, Zhenmin and Qin, Feng and Tan, Lin and Zhou, Pin and Zhou, Yuanyuan. BugBench: Benchmarks for evaluating bug detection tools. In Workshop on the Evaluation of Software Defect Detection Tools, 2005.

