Fast refactoring candidate assessment metric
Period: March 2011 - December 2014
Table of Contents
- Motivation and Goal
- Method
2.1. Assessment in the refactoring identification process
2.2. Software design is captured into a graph
2.3. Delta Table calculation - Evaluation
3.1 Research questions
3.2 Data sets
3.3 Comparators
3.4 Evaluation measures - Results
4.1 Efficiency
4.2 Usefulness - Conclusion
- Papers
- Awards
Motivation and Goal
The cost for assessing refactoring candidates is computation-intensive. For automating refactoring identification, previous studies have limitations for assessing the impact of a large number of refactoring candidates.
In this project, I propose a fast refactoring candidate assessment metric, Delta Table
. This metric is an efficient method for assessing the impact of refactoring candidates on maintainability based on matrix computation, which is approximate but fast. This metric helps to select the most efficient refactoring candidates for the large-scale software.
Method
Assessment in the refactoring identification process
To identify refactoring sequence automatically, the possible moves of methods to classes in the system (i.e., refactoring candidates of Move Method
) are assessed the using the Delta Table
and the most improving refactoring is selected and applied. This process is iterated until there are no more improvements.
Software design is captured into a graph
From the object-oriented source codes, the software design is captured into a graph. The entities of methods
and attributes
are mapped into vertices (V)
and their relations such as method calls
and attribute accesses
are mapped into edges (E)
.
Delta Table calculation
The Delta Table (D)
is the matrix (rows: entities, columns: classes)
. Each element in the Delta Table (D_{ij})
represents the variance of the number of external relations
(i.e., relations across the classes) when moving entity i
to class j
.
Illustrative calculation of Delta Table
Step of the Delta Table
calculation
From the design graph, the link and membership matrices are constructed.
Link matrix (L)
L(e1, e2)
: entitye1
has a relation to entitye2
Lint
|Lext
:
internal
|external
relations that are associated between entitiesin the same class
|across the classes
Membership matrix (M)
M(e, C)
: entitye
is placed in classC
The projection matrix is produced by multiplying the link and membership matrices.
Project matrix (P)
P(e1, C)
: entitye1
has a relation to an entity placing in classC
Pint
|Pext
:
internal
|external
project matrixPint = Lint x M
,Pext = Lext x M
The formulation is devised by taking the effects (simulating) of moving entities.
inverse function()
is applied to theinternal project matrix
because moving an entity to other classes will potentially increase the external relations in the system.Inverse internal project matrix (invPint)
represents the effects of increased external relations when moving entities that have had internal relations. For example, let an entityei
has a relation with another entity within the same classC1
. Whenei
moves to other classC2
, then the existing internal relation switched to the external relation:invPint(ei, C1) = 0 (no effect)
andinvPint(ei, C2) = 1
.inverse function()
int [][] invPint = new int [num_entities][num_classes]; //inverse internal project matrix
for (e = 0; e < num_entities; e++)
{
for (c = 0; c < num_classes; c++)
{
if (Pint[e][c] != 0) //an entity `e` has a relation with another entity within the same class `c`
{
for (i = 0; i < c; i++)
{
if (i != c)
invPint[e][i] = Pint[e][c];
//moving entity `e` to other classes increases
//the external relation(s)
else
invPint[e][c] = 0; //remaining in the same class `c` has no effect
}
}
invPint[e][c] = 0;
}
}
return invPint;
- Similarly,
minus
is applied to theexternal project matrix
because moving an entity to the classes that have had external relations will decrease the external relations in the system.
The final formulation of the Delta Table (D)
is:
D = invPint - Pext
Evaluation
Research questions
RQ1. Efficiency: By how much our method is efficient for assessing the impact of refactoring candidates?
RQ2. Usefulness: Does the refactoring identification approach based on our method help improve maintainability?
Data sets
Open source projects written in Java
Name (Version) | jEdit (jEdit-4.3) | Columba (Columba-1.4) |
---|---|---|
Type | Text editor | Email clients |
Class # | 952 | 1506 |
Methods # | 6487 | 8745 |
Attributes # | 3523 | 3967 |
Comparators
Delta Table
(our approach) vsEntity Placement Metric (EPM) [Transactions on Software Engineering 2008]
Evaluation measures
Evaluation measures |
---|
- Total time - Maintainability metric: MPC (Message Passing Coupling) |
For maintainability metric, MPC (Message Passing Coupling)
is used. MPC
is a widely used coupling metric for measuring maintainability of software design. The lower coupling means the more maintainable software.
Results
1. Efficiency: By how much our method is efficient for assessing the impact of refactoring candidates?
- The total time required to perform the refactoring identification approach with the Delta Table is much less than the approach with the EPM.
- Maximum time per iteration in our approach is the time for the first iteration.
- As system become larger, computation time is increased.
- Rate of increased time with respect to the system size is much less in our approach.
- Total time (sec)
- The total time required to perform the approach with the Delta Table is much less than the approach with the EPM.
- Maximum time per iteration: time for performing the first iteration including constructing the design graph, calculating the link and membership matrices (e.g.,
jEdit
:max. 5.95 [sec] > avg. 1.66 [sec]
,Columba
:max. 5.18 [sec] > avg. 2.33 [sec]
) - As system become larger (
jEdit: 952 classes, Columba: 1506 classes
), computation time is increased. - Rate of increased time with respect to the system size is much less in our approach (e.g.,
jEdit at 90 iterations, our approach: 154.63[sec], approach with EPM: 28,622[sec]
)
2. Usefulness: Does the refactoring identification approach based on our method help improve maintainability?
The values of maintainability metric for our approach are increased in both projects (
jEdit
andColumba
).
- Maintainability metric for
jEdit
Maintainability metric forColumba
The maintainability metric ofMPC
for the refactored design at each iteration is presented. In the approach using theDelta Table
for bothjEdit
andColumba
, theMPC
decreases faster than the approach using theEPM
. TheMPC
is a coupling metric, and the lowerMPC
represents the more improved maintainable software.
Conclusion
I propose a matrix computation-based refactoring candidate assessment metric. The metric in the Delta Table
is the approximate measure, but it is sufficient to preliminarily assess the effects of the application of refactoring candidates in an extremely short time. This fast assessment can help the efficient refactoring candidate identification for the large-scale software.
Papers
An efficient approach to identify multiple and independent Move Method refactoring candidates
Ah-Rim Han, Doo-Hwan Bae, Sungdeok Cha
Information and Software Technology (IST), Vol. 59, pp. 53–66, Mar. 2015
[PDF] [DOI]An efficient method for assessing the impact of refactoring candidates on maintainability based on matrix computation
Ah-Rim Han, Doo-Hwan Bae
Proceedings of the 21st Asia-Pacific Software Engineering Conference (APSEC), pp. 453-460, Dec. 2014
(27% acceptance ratio, 55/202)
[PDF] [DOI] [Slide]
Awards
- Nov. 2013 - Oct. 2014, sole Principal Investigator, Post-Doctoral Fellowship Grant
National Research Foundation of Korea (NRF), $33,000 [certificate] [translated version]