Incremental Refactoring Using Seeds

Gabriela CZIBULA
Babeş-Bolyai University
1, M. Kogălniceanu Street, Cluj-Napoca, 400084, Romania

Istvan Gergely CZIBULA
Babeş-Bolyai University
1, M. Kogălniceanu Street, Cluj-Napoca, 400084, Romania

Abstract: Refactoring is a key means of improving the design of software systems and thereby increasing their internal quality. It is a disciplined technique for improving the structure of existing code without changing its observable behaviour. We have previously introduced a clustering-based approach for identifying refactorings in an object-oriented software system. Essentially, it takes the existing software system and restructures it using a k-means based clustering algorithm (kRED) in order to obtain a better design. Over time, however, the software system evolves and new application classes are added to implement new functional requirements. In this paper we propose a k-means based incremental clustering method, Incremental Refactoring Using Seeds (IRUS), that is capable of re-partitioning the existing software system when new application classes are added to it. The method starts from the clusters obtained by applying kRED before the software system’s extension. The result is reached more efficiently than by running kRED again from scratch on the extended software system. An experimental evaluation demonstrating the method’s efficiency is also reported.

Keywords: Software engineering, incremental refactoring, clustering.

CITE THIS PAPER AS:
Gabriela CZIBULA, Istvan Gergely CZIBULA, Incremental Refactoring Using Seeds, Studies in Informatics and Control, ISSN 1220-1766, vol. 19 (3), pp. 271-284, 2010.

1. Introduction

The need for continuing adaptation and evolution is intrinsic to any software application, because software systems are faced with new requirements throughout their life cycle. These new requirements imply updates to the structure of the software systems, which have to be made quickly because of the tight schedules that arise in real-life software development. Evolution is achieved in a feedback-driven and controlled maintenance process. If the resulting pressure to adapt to the new situation is resisted, the degree of satisfaction provided by the system in execution declines with time [13].

The structure of a software system has a major impact on the maintainability of the system. This structure is subject to many changes during the system’s life cycle. Improperly implemented changes degrade the structure and lead to costly maintenance. That is why the code needs to be restructured continuously; otherwise the system becomes difficult to understand and change, and therefore costly to maintain.

Refactoring is a solution adopted by most modern software development methodologies (extreme programming and other agile methodologies), in order to keep the software structure clean and easy to maintain. Refactoring becomes an integral part of the software development cycle: developers alternate between adding new tests and functionality and refactoring the code to improve its internal consistency and clarity.

In [7], Fowler defines refactoring as “the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. It is a disciplined way to clean up code that minimizes the chances of introducing bugs”. Refactoring is viewed as a way to improve the design of the code after it has been written. Software developers have to identify the parts of the code that have a negative impact on the system’s maintainability, and apply appropriate refactorings in order to remove the so-called “bad smells” [2].

In [4] we developed a clustering-based approach, named CARD (Clustering Approach for Refactorings Determination), that uses clustering to improve the class structure of a software system. As part of this approach, a partitional clustering algorithm, kRED (k-means for REfactorings Determination), was developed. The approach can be used to automatically identify refactorings that would improve the software system’s internal structure.

Real applications evolve over time, and new application classes are added in order to meet new requirements. Obviously, to restructure the extended software system, the kRED clustering algorithm can be applied over and over again, reprocessing the entire extended system every time the set of application classes changes. But this process can be inefficient, particularly for large software systems. Our aim is to extend the approach from [4] and to propose a k-means based incremental clustering method, named Incremental Refactoring Using Seeds (IRUS), that is capable of efficiently re-partitioning a software system when a new application class is added to it. The method starts from the partition obtained by applying the kRED algorithm before the class extension. The result is reached more efficiently than by running kRED again from scratch on the extended system.
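The general idea of restarting a k-means style algorithm from an existing partition can be illustrated with a minimal sketch. It is not the kRED or IRUS algorithm: the numeric feature vectors, the Euclidean distance, and the function names (`mean`, `assign`, `kmeans_from_seeds`) are assumptions chosen only for illustration. The sketch runs a plain k-means once, then adds one new entity and runs it again using the previous centroids as seeds.

```python
# Illustrative sketch only: plain k-means warm-started from earlier centroids.
# Feature vectors, Euclidean distance, and all names here are assumptions,
# not the kRED or IRUS definitions.
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]


def kmeans_from_seeds(points, seeds, max_iter=100):
    """Run k-means starting from the given seed centroids.

    Returns (labels, centroids, iterations_used).
    """
    centroids = [list(s) for s in seeds]
    labels = assign(points, centroids)
    for iteration in range(1, max_iter + 1):
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centroids.append(mean(members) if members else old)
        new_labels = assign(points, new_centroids)
        centroids = new_centroids
        if new_labels == labels:  # assignments are stable: converged
            return labels, centroids, iteration
        labels = new_labels
    return labels, centroids, max_iter


# Partition of the system before the extension (hypothetical 2-D features).
old_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
_, old_centroids, _ = kmeans_from_seeds(
    old_points, seeds=[[0.0, 0.0], [10.0, 10.0]])

# A new entity is added; the previous centroids are reused as seeds.
extended = old_points + [[9.0, 10.0]]
labels, centroids, iterations = kmeans_from_seeds(extended, old_centroids)
```

Because the seeds already lie close to the final centroids, the second run needs few iterations on this toy data; whether the same holds for a real system depends on how much the extension changes the structure.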

The rest of the paper is structured as follows. Section 2 briefly presents the main aspects of our previous approach for clustering-based refactorings identification [4]. In Section 3 we motivate our work by illustrating the need for incremental refactoring. An incremental clustering approach for adaptive refactorings identification is introduced in Section 4. For the incremental process, an Incremental Refactoring Using Seeds algorithm (IRUS) is proposed. Section 5 indicates several existing approaches in the direction of automatic refactorings identification. An example illustrating how our approach works is provided in Section 6. Some conclusions of the paper and further research directions are outlined in Section 7.

References:

ANQUETIL, N., T. LETHBRIDGE, Extracting Concepts from File Names; a New File Clustering Criterion, In 20th Intl. Conf. Software Engineering, 1998, pp. 84-93.
BROWN, W. J., R. C. MALVEAU, III H. W. MCCORMICK, T. J. MOWBRAY, AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis, John Wiley & Sons, Inc., New York, NY, USA, 1998.
CHOI, S. C., W. SCACCHI, Extracting and Restructuring the Design of Large Systems, IEEE Softw., 7(1), 1990, pp. 66-71.
CZIBULA, I. G., G. SERBAN, Improving Systems Design using a Clustering Approach, Intl. Journal of Computer Science and Network Security (IJCSNS), 6(12), 2006, pp. 40-49.
HARMAN, M., D. FATIREGUN, R. HIERONS, Evolving Transformation Sequences using Genetic Algorithms, In Proc. of the 4th Intl. Workshop on Source Code Analysis and Manipulation (SCAM 04), Los Alamitos, California, USA, 2004, IEEE Computer Society, pp. 65-74.
DUDZIKAN, T., J. WLODKA, Tool-supported Discovery and Refactoring of Structural Weakness, Master’s thesis, TU Berlin, Germany, 2002.
FOWLER, M., Refactoring: Improving the Design of Existing Code, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.
GAMMA, E., JHotDraw Project, http://sourceforge.net/projects/jhotdraw.
HAN, J., Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2005.
HUTCHENS, D. H., V. R. BASILI, System Structure Analysis: Clustering with Data Bindings, IEEE Trans. Softw. Eng., 11(8), 1985, pp. 749-757.
JAIN, A. K., M. N. MURTY, P. J. FLYNN, Data Clustering: a Review, ACM Computing Surveys, 31(3), 1999, pp. 264-323.
JAIN, A. K., R. C. DUBES, Algorithms for Clustering Data, Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.
LEHMAN, M. M., Laws of Software Evolution Revisited, In LNCS 1149 – EWSPT96, Springer Verlag, 1997, pp. 108-124.
LUNG, C.-H., Software Architecture Recovery and Restructuring through Clustering Techniques, In Proc. of the 3rd Intl. Workshop on Software Architecture (ISAW ’98), New York, NY, USA, 1998, ACM Press, pp. 101-104.
MANCORIDIS, S., B. S. MITCHELL, C. RORRES, Y. CHEN, E. R. GANSNER, Using Automatic Clustering to Produce High-level System Organizations of Source Code, In IEEE Proc. of the 1998 Intl. Workshop on Program Understanding (IWPC ’98), Piscataway, NJ, 1998, IEEE Press, p. 45.
MANCORIDIS, S., B. S. MITCHELL, Y.-F. CHEN, E. R. GANSNER, Bunch: A Clustering Tool for the Recovery and Maintenance of Software System Structures, In ICSM, 1999, pp. 50-59.
MOISE, G., Applying Fuzzy Control in the Online Learning Systems, Studies in Informatics and Control, 18(2), 2009, p. 165.
NEIGHBORS, J. M., Finding Reusable Software Components in Large Systems, In Working Conference on Reverse Engineering, 1996, pp. 2-10.
SCHWANKE, R. W., M. A. PLATOFF, Cross References are Features, In Proc. of the 2nd Intl. Workshop on Software configuration management, New York, NY, USA, 1989, ACM Press, pp. 86-95.
SCHWANKE, R. W., An Intelligent Tool for Re-engineering Software Modularity, In ICSE ’91: Proc. of the 13th Intl. Conf. on Software Engineering, Los Alamitos, CA, USA, 1991, IEEE Computer Society Press, pp. 83-92.
SENG, O., J. STAMMEL, D. BURKHART, Search-based Determination of Refactorings for Improving the Class Structure of Object-Oriented Systems, In Proc. of the 8th Ann. Conf. on Genetic and Evolutionary Computation (GECCO ’06), New York, NY, USA, 2006, ACM Press, pp. 1909-1916.
SIMON, F., F. STEINBRUCKNER, C. LEWERENTZ, Metrics Based Refactoring, In Proc. of the 5th European Conf. on Software Maintenance and Reengineering (CSMR ’01), Washington, DC, USA, 2001, IEEE Computer Society, pp. 30-38.
TAENTZER, G., T. MENS, O. RUNGE, Analysing Refactoring Dependencies using Graph Transformation, Software and System Modeling, 6(3), 2007, pp. 269-285.
TAHVILDARI, L., K. KONTOGIANNIS, A Metric-based Approach to Enhance Design Quality Through Meta-pattern Transformations, In Proc. of the 7th European Conf. on Software Maintenance and Reengineering (CSMR ’03), Washington, DC, USA, 2003, IEEE Computer Society, pp. 183-192.
TZERPOS, V., R. C. HOLT, Mojo: A Distance Metric for Software Clusterings, In Working Conf. on Reverse Engineering, 1999, pp. 187-193.
TZERPOS, V., R. C. HOLT, ACDC: An Algorithm for Comprehension-driven Clustering, In Working Conf. on Reverse Engineering, 2000, pp. 258-267.
SANKARANARAYANASAMY, K., V. DHANALAKSHMI, S. ARUNACHALAM, T. PAGE, A Fuzzy Analysis Approach to Part Family Formation in Cellular Manufacturing Systems, Studies in Informatics and Control, 17(4), 2008, p. 433.
VAN DEURSEN, A., L. MOONEN, A. VAN DEN BERGH, G. KOK, Refactoring test code, 2001, pp. 92-95.
XING, Z., E. STROULIA, Refactoring Detection Based on UMLdiff Change-facts Queries, WCRE, 2006, pp. 263-274.
XU, X., C.-H. LUNG, M. ZAMAN, A. SRINIVASAN, Program Restructuring through Clustering Techniques, In Proc. of the 4th IEEE Intl. Workshop on Source Code Analysis and Manipulation (SCAM’04), Washington, DC, USA, 2004, IEEE Computer Society, pp. 75-84.

https://doi.org/10.24846/v19i3y201007