Research
Research Overview
My main research area has been software engineering in Computer Science. My main research interests include software maintenance
and empirical studies
. I have used statistics
, data mining
, and automated software analysis techniques
to assess and improve the software design quality (e.g., maintainability
).
For over 10 years, I focused on making an evolvable software. The maintainability
of software deteriorates over time, and it is very important to maintain the software resilient to accommodate changes. Much of the total software development cost is spent on maintenance costs such as adding new features, fixing errors, and improving performance to adapt to a rapidly changing markets. Therefore, it is important to improve maintainability
to reduce the software development cost.
I studied to answer the following questions.
- Change-proneness prediction
Which parts of the software change frequently?
- Refactoring candidate identification
Where should be fixed to improve software quality and reduce maintenance costs?
Which parts need to be fixed first for effective improvement within limited budgets, resources and time?
How can we evaluate refactoring candidates more efficiently for the large-scale software?
Research Projects
Improvement of change-proneness prediction
Software changed either to add new functionalities or to fix bugs. Some part of the software may be more prone to be changed than others. Finding software parts that are more likely to change can be useful because we can decide how to effectively invest the software maintenance efforts. In previous studies, most of the metrics used to build a change-proneness predictive model are based on the structural complexity of software programs.
To improve the accuracy of the change-proneness prediction, I propose the new behavioral dependency metric that captures the aspects of the dynamic behavior program.
Efficient refactoring candidate identification
Recent research was focused on defining the cost-effective software refactoring process by suggesting the refactoring candidates that can maximize improvement in software design quality (e.g., maintainability). As the software is getting larger and more complex, it is harder to find the refactoring candidates because of the lack of resources in computing power and time. Therefore, we need a method to efficiently identify refactoring candidates for the large-scale software.
In this project, I propose the several new methods to improve the efficiency of the refactoring identification process.
Dynamic profiling-based refactoring identification
To identify the candidates in classes where real changes have occurred, I provided the method to use thedynamic profiling technique
for finding themost frequently used functions
based on the several scenarios of user behavior. Then, refactoring candidates are extracted toreduce the dependencies among the entities used in the most frequently used functions
.Multiple and independent refactoring identification
To find a sequence of cost-effective refactorings, I proposed the method for selectingmultiple refactorings that have no dependencies each other and can be applied simultaneously
based on the concept ofmaximal independent set (MIS)
.Two-phased search-based refactoring identification
To reduce the search space of candidates to be examined
, I suggested using thetwo-phase approach
. In the first phase, the refactoring candidates that are more likely to improve maintainability are chosen using theDelta Table
, lightweight and fast candidate assessment. In the second phase, only the chosen refactoring candidates are evaluated using a more complex and precise fitness function.
Fast refactoring candidate assessment metric
The cost for assessing refactoring candidates is computation-intensive. For automating refactoring identification, previous studies have limitations for assessing the impact of a large number of refactoring candidates.
In this project, I propose a fast refactoring candidate assessment metric, Delta Table
. This metric is an efficient method for assessing the impact of refactoring candidates on maintainability based on matrix computation, which is approximate but fast. This metric helps to select the efficient refactoring candidates for the large-scale software.