Original article
Metaheuristic post-optimization of the NIST repository of covering arrays
Jose Torres-Jimenez*, Arturo Rodriguez-Cristerna
CINVESTAV-Tamaulipas, Information Technology Laboratory, Km. 5.5 Carretera Cd. Victoria, Tamaulipas, Mexico
A R T I C L E I N F O
Article history:
Received 19 September 2016
Accepted 25 December 2016
Available online 16 February 2017
Keywords: Covering arrays; NIST repository of covering arrays; Metaheuristic post-processing algorithms
Construction of Covering Arrays (CA) with the minimum possible number of rows is challenging. Often the available CA contain redundant combinatorial interactions that can be removed to reduce the number of rows. This paper addresses the problem of removing redundancy from CA using a metaheuristic post-optimization (MPO) approach. Our approach consists of three main components: a redundancy detector (RD); a row reducer (RR); and a missing-combinations reducer (MCR). The MCR is a metaheuristic component implemented using a simulated annealing algorithm. MPO was instantiated with 21,964 CA taken from the National Institute of Standards and Technology (NIST) repository. It is a remarkable result that this instantiation of MPO has delivered 349 new upper bounds for these CA.
© 2017 Chongqing University of Technology. Production and hosting by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction

The use of software has permeated many areas of human activity, so the reliability of software has become important worldwide. It is estimated that software testing consumes about 50% of the cost of developing a new piece of software. A 2002 NIST report [23] indicates that the cost of an inadequate infrastructure for software testing was in the range of $22.2 to $59.5 billion (US dollars). Reducing this cost is important, but the design and implementation of adequate software testing procedures is also critical for the reliability of many electronic and mechanical systems, even more so in complex and safety-critical systems such as space vehicles (Lions et al. [16]).
According to Myers et al. [17], functional software testing methods may be divided into two main categories: white-box testing and black-box testing. The design of white-box test suites requires the source code of the software under examination. Some testing strategies based on the white-box approach are: statement coverage, decision coverage, condition coverage, decision-condition coverage, and multiple-condition coverage. Building test suites with white-box strategies is more challenging than with black-box strategies, since white-box strategies are based on knowledge of the internal structure of the system. Furthermore, if the system is modified, then the tests must be redesigned to match the new version of the system. On the other hand, the design of black-box test suites does not require the source code of the software under examination; it compares actual behaviour against expected behaviour based on the functionality and the specification of the software system under examination. Some black-box testing strategies are: exhaustive input testing, equivalence partitioning, boundary-value analysis, cause-effect graphing, error guessing, and combinatorial interaction testing.
It is easy to construct test suites using a random black-box approach, but they rarely cover a large percentage of the functionality of the system under examination. A black-box approach that covers 100 percent of the functionality is the exhaustive approach, but it is impractical in most cases because too many tests are required. As an example, a test suite for a system with 20 parameters, each with 10 possible values, would require 10^20 exhaustive tests; however, a combinatorial interaction testing approach that covers the combinations of all pairs of parameter values requires only 155 tests. The number of tests required with combinatorial interaction testing grows logarithmically with the number of parameters [11]. Empirical studies in software testing have shown that combinatorial interaction testing is a useful approach (Kuhn et al. [14]; Bell [4]). The mathematical objects that support combinatorial interaction testing are Covering Arrays (CA) and Mixed Covering Arrays (MCA).
CA and MCA are combinatorial structures that have been used successfully in various areas. The most reported application of CA and MCA is the design of test suites for software combinatorial interaction testing [7,8], which is based on the concept that software faults are caused by unexpected combinatorial interactions of a certain size between components. Another application is found in the field of parameter fine-tuning of metaheuristic algorithms [12,19,21,22].
A CA, denoted by CA(N; t, k, v), is an N×k array, where every entry of the array takes values from a set of symbols of size v, such that every N×t sub-array contains all possible v^t t-tuples of symbols at least once. An MCA is a generalization of a CA where the alphabets of the columns may have different cardinalities. The test cases are represented by the rows, the parameters are represented by the columns, the parameter values are taken from the set {0, 1, …, v−1}, which is called the alphabet, and t is the strength, or combinatorial interaction degree between parameters, covered by the CA. Fig. 1 shows an example of a CA(9; 2, 4, 3), and an MCA(6; 2, 4, 3^1 2^3) is shown in Fig. 2.
The Covering Array Number (CAN) is the minimum N such that a CA exists for fixed t, k, and v; it is denoted by CAN(t, k, v). The construction of CA with N = CAN(t, k, v) is a challenging problem, whether mathematical structures or metaheuristic algorithms are used.
When a CA is non-optimal (i.e. a CA with N > CAN(t, k, v)), it usually has many t-tuples that are covered more than once. This fact presents an opportunity to reduce the number of rows of the CA, given that it may then be possible to identify redundant rows [18] that can be removed.
In this paper we introduce a Metaheuristic Post-Optimization (MPO) approach to reduce the size of a CA by exploiting its redundant elements. MPO is composed of three main components: a redundancy detector (RD); a row reducer (RR); and a missing-combinations reducer (MCR) implemented using a simulated annealing algorithm (the metaheuristic component of our approach). MPO was extensively tested using 21,964 CA taken from the NIST CA repository. We improved almost all 21,964 CA, but the most remarkable result is that MPO has set 349 new upper bounds for these CA.
The remainder of the paper is structured in three sections. In section 2 we present the MPO approach in detail, describing the redundancy detector, row reducer, and missing-combinations reducer components. In section 3 we present the results of instantiating MPO with the whole National Institute of Standards and Technology repository of covering arrays. Finally, in section 4 we present some conclusions.
Fig. 1. Transposed matrix of a CA(9; 2, 4, 3).
Fig. 2. Transposed matrix of an MCA(6; 2, 4, 3^1 2^3).
2. The MPO approach

In this section we present the implementation details of the MPO approach. We first give an overview of the whole MPO process, and then present details of each of the components RD, RR, and MCR.
2.1. Design of the MPO approach
The design and implementation of the MPO approach is briefly described in algorithm 1, where it can be observed that it has three components and two main loops. The inner loop executes the components Redundancy Detector (RD) and Row Reducer (RR). After the inner loop is executed, the Missing-Combinations Reducer (MCR) runs. When the MCR (implemented with a simulated annealing (SA) algorithm) is not able to make the number of missing combinations equal to zero, MPO ends.
MPO (algorithm 1) receives as input A = CA(N; t, k, v) and gives as output B = CA(N′; t, k, v) with N′ ≤ N and no missing t-wise combinations. The function τ computes the number of missing t-wise combinations of the parameter passed to it; τ has temporal complexity O(N·C(k,t)), where C(k,t) denotes the number of t-column combinations (a more detailed description of how to compute the missing interactions for a CA was presented by Avila-George et al. [2]).
2.2. Redundancy detector (RD)
The goal of the Redundancy Detector (RD) algorithm is to find a large number of redundant entries in the CA given as input. RD does its job by making three scans of the input: the first two scans visit all t-wise combinations of the matrix (each scan with temporal complexity O(N·C(k,t)), where C(k,t) is the number of t-column combinations), and the third scan visits all elements of the matrix, searching for rows that are totally redundant (with temporal complexity O(Nk)). The total temporal complexity is therefore O(N·C(k,t)).
The purpose of the first scan is to mark as 'Fixed Symbol' (FS) the cells that participate in t-wise combinations covered only once, and to mark all other cells as 'Possible Redundant Cell' (PRC). The second scan works with cells marked as PRC and decides which cells to transform to the status of FS, while making sure the coverage property (all t-wise combinations must be covered at least once) is satisfied and the number of cells with status PRC is maximized. The third scan removes all rows in which every element has status PRC.
2.3. Row reducer (RR)
The Row Reducer algorithm receives a CA as input and works in a greedy manner, searching for a row i such that its removal minimizes the number of missing combinations. In the worst case RR tests all rows of the CA, but when a row whose removal causes no missing t-wise combinations is found, RR ends. If no such row exists, the row removed is the one that leaves the fewest missing t-wise combinations.
The logic of the RR algorithm is simple: replace the FS cells of the i-th row into PRC cells of the remaining rows, and then verify the number of missing t-wise combinations (after the removal of row i). RR removes the first row that minimizes the number of missing t-wise combinations. The worst-case temporal complexity is O(N^2·C(k,t)) for determining the row to be removed (up to N candidate rows, each verified in O(N·C(k,t)) time), and O(Nk) for the removal of the row. The total temporal complexity of the RR algorithm is therefore O(N^2·C(k,t)).
2.4. Missing-combinations reducer (MCR)
The MCR component of MPO is in charge of reducing to zero the number of missing t-tuples of its input (a matrix with missing combinations). We decided to implement MCR using a simulated annealing (SA) algorithm, given that SA has been applied successfully to related problems [3,6,20,24]. The core elements of the SA are the neighbourhood functions and the cooling schedule.
We used two neighbourhood functions, NF1 and NF2. NF1 searches for a random missing t-tuple and sets that tuple in every row in turn, selecting the row that gives the fewest missing t-wise combinations. NF2 selects t cells in a row (cells and rows are selected randomly), then tests every one of the v^t possible t-tuples in those cells and selects the combination that gives the lowest number of missing t-wise combinations. SA uses a mixture of the two neighbourhood functions in such a way that NF1 is applied with probability p_r and, consequently, NF2 is applied with probability 1 − p_r.
The cooling schedule configuration in SA involves [1,15]: (a) an initial temperature (temp_0); (b) a decreasing function to reduce the temperature value; (c) an ending temperature (temp_f); and (d) a finite number of iterations of the local search at the same temperature (L, the length of a Markov chain [5]). The parameters of the cooling schedule control the behaviour of the algorithm and therefore drastically affect the quality of the final solution. We selected a static geometric cooling schedule controlled by a parameter α. The parameter L is static during the execution of the algorithm [3,20]. A parameter called the frozen factor (ff) was also added to control the number of temperature reductions without improvement towards the solution; it works as an alternative termination criterion that is triggered when the search stagnates.
The SA algorithm (algorithm 2) is based on the definition given by Kirkpatrick et al. [13]. Parameter values were selected after parameter fine-tuning, and they are: temp_0 = 1; α = 0.99; L = Nkv^2; p_r = 0.5; temp_f = 1×10^-14; and ff = 11.
Table 1. Results of the MPO algorithm after processing the entire repository [10]. Information is organized in triplets containing: average percentage reduction of rows (Υ); average time in minutes (Γ); and number of instances (I). (Continues in Table 2.)
Table 2. (Continued from Table 1.) Results of the MPO algorithm after processing the entire repository [10]. Information is organized in triplets containing: average percentage reduction of rows (Υ); average time in minutes (Γ); and number of instances (I).
2.5. Implementation note
The proposed algorithms were coded in the C language, compiled with GCC 4.3.5 with the -O3 optimization flag, and run on AMD 8435 cores (2.6 GHz).
3. Results

To measure the effectiveness of MPO, the NIST repository of CA [10] was processed. The NIST repository consists of 21,964 CA with v ∈ {2,…,6} and t ∈ {2,…,6}. For each instance we report: average percentage reduction of rows (Υ), average time in minutes (Γ), and number of instances (I).
Tables 1 and 2 summarize the results of processing the NIST repository [10]. Information is organized in triplets containing Υ, Γ, and I. Many instances were optimized, resulting in the construction of 349 state-of-the-art upper bounds for CA by the MPO algorithm. The comparison between the new MPO upper bounds and the IPOG-F bounds is shown in Tables 3-5. Results are shown in Figs. 3-7, where instances are grouped by combinatorial interaction coverage degree (t).
Table 3. New best upper bounds constructed with the MPO algorithm. Part 1 of 3.
Table 3. (continued)
Table 4. New best upper bounds constructed with the MPO algorithm. Part 2 of 3.
4. Conclusions

In this paper we have presented a Metaheuristic Post-Optimization (MPO) approach to reduce the cardinality of Covering Arrays. MPO was implemented using three components: a redundancy detector, a row reducer, and a missing-combinations reducer. The redundancy detector has the mission of detecting elements in a CA that can be changed without affecting the degree of coverage of the CA. The row reducer takes advantage of the redundant elements of a CA to reduce the number of rows. When the removal of a row produces missing combinations, control is given to the missing-combinations reducer. The missing-combinations reducer is implemented with simulated annealing (the metaheuristic component of MPO) and tries to make the number of missing combinations equal to zero. Even though all three components are key to the success of MPO, we believe the metaheuristic component is the most important part of MPO, given that through its use it is possible to iteratively reduce the number of rows of a CA.
Table 5. New best upper bounds constructed with the MPO algorithm. Part 3 of 3.
Fig. 3. Results of MPO after processing instances of the repository [10] with t = 2 (Δ = N − N′).
Fig. 4. Results of MPO after processing instances of the repository [10] with t = 3 (Δ = N − N′).
Fig. 5. Results of MPO after processing instances of the repository [10] with t = 4 (Δ = N − N′).
Fig. 6. Results of MPO after processing instances of the repository [10] with t = 5 (Δ = N − N′).
Fig. 7. Results of MPO after processing instances of the repository [10] with t = 6 (Δ = N − N′).
We have conducted large-scale experimentation by instantiating MPO with the whole NIST repository of CA. The NIST repository consists of 21,964 CA, and while improving almost all CA in the repository, the most remarkable result is that we have set 349 new upper bounds for these CA. All CA are available at http://www.tamps.cinvestav.mx/~oc/.
Any mention of commercial products in this paper is for information only; it does not imply recommendation or endorsement by NIST.
The authors acknowledge "Xiuhcoatl"-CGSTIC of CINVESTAV and ABACUS-CINVESTAV, CONACYT grant EDOMEX-2011-COI-165873, for providing access to high performance computing. This research was partially funded by the project CONACyT 238469 - Métodos Exactos para Construir Covering Arrays. Part of this paper was written during the first author's research stay at NIST in 2014-2015.
[1] E. Aarts, J. Lenstra, Local Search in Combinatorial Optimization, Princeton University Press, 2003. ISBN: 0691115222.
[2] H. Avila-George, J. Torres-Jimenez, L.V.H. Gonzalez-Hernandez, Metaheuristic approach for constructing functional test-suites, IET Softw. 7 (2) (2013) 104-117.
[3] H. Avila-George, J. Torres-Jimenez, V. Hernández, L. Gonzalez-Hernandez, Simulated annealing for constructing mixed covering arrays, in: Proceedings of the 9th International Symposium on Distributed Computing and Artificial Intelligence (DCAI), Vol. 151 of Advances in Intelligent and Soft Computing, Springer Berlin/Heidelberg, Salamanca, Spain, 28-30 March 2012, pp. 657-664.
[4] K. Bell, Optimizing Effectiveness and Efficiency of Software Testing: A Hybrid Approach, Ph.D. Thesis, North Carolina State University, 2006.
[5] D. Bertsimas, J. Tsitsiklis, Simulated annealing, Stat. Sci. 8 (1) (1993) 10-15.
[6] M. Cohen, C. Colbourn, A. Ling, Augmenting simulated annealing to build interaction test suites, in: 14th International Symposium on Software Reliability Engineering (ISSRE 2003), Nov. 2003, pp. 394-405.
[7] M. Cohen, P. Gibbons, W. Mugridge, C. Colbourn, Constructing test suites for interaction testing, in: Proceedings of the 25th International Conference on Software Engineering, Portland, Oregon, USA, May 2003, pp. 38-48.
[8] M.B. Cohen, Designing Test Suites for Software Interaction Testing, Ph.D. Thesis, University of Auckland, 2004.
[9] M. Forbes, J. Lawrence, Y. Lei, R. Kacker, D. Kuhn, Refining the in-parameter-order strategy for constructing covering arrays, J. Res. Natl. Inst. Stand. Technol. 113 (5) (2008) 287-297.
[10] M. Forbes, J. Lawrence, Y. Lei, R. Kacker, D. Kuhn, Repository of CAs, last accessed September 2014. URL: http://math.nist.gov/coveringarrays/ipof/ipofresults.html.
[11] A. Hedayat, N. Sloane, J. Stufken, Orthogonal Arrays: Theory and Applications, Springer Verlag, 1999. ISBN: 0-387-98766-5.
[12] A. Jose-Garcia, H. Romero-Monsivais, C. Hernandez-Morales, A. Rodriguez-Cristerna, I. Rivera-Islas, J. Torres-Jimenez, A simulated annealing algorithm for the problem of minimal addition chains, in: Progress in Artificial Intelligence, Vol. 7026 of Lecture Notes in Computer Science, Springer Berlin/Heidelberg, 2011, pp. 311-325.
[13] S. Kirkpatrick, C. Gelatt Jr., M. Vecchi, Optimization by simulated annealing, Science 220 (4598) (1983) 671-680.
[14] D. Kuhn, D. Wallace, A. Gallo Jr., Software fault interactions and implications for software testing, IEEE Trans. Softw. Eng. 30 (6) (2004) 418-421.
[15] J. Lam, An Efficient Simulated Annealing Schedule, Ph.D. Thesis, Yale University, New Haven, CT, 1988.
[16] J. Lions, et al., Ariane 5 Flight 501 Failure, Tech. Rep., last accessed September 2014. http://www.di.unito.it/damiani/ariane5rep.htm.
[17] G. Myers, C. Sandler, T. Badgett, T. Thomas, The Art of Software Testing, second ed., John Wiley & Sons Inc., 2004. ISBN: 0471469122.
[18] P. Nayeri, C. Colbourn, G. Konjevod, Randomized postoptimization of covering arrays, in: Combinatorial Algorithms, Vol. 5874 of Lecture Notes in Computer Science, Springer Berlin/Heidelberg, 2009, pp. 408-419.
[19] N. Rangel-Valdez, J. Torres-Jiménez, J. Bracho-Ríos, P. Quiz-Ramos, Problem and algorithm fine-tuning - a case of study using bridge club and simulated annealing, in: A. Dourado, A.C. Rosa, K. Madani (Eds.), IJCCI 2009 - Proceedings of the International Joint Conference on Computational Intelligence, Funchal, Madeira, Portugal, October 5-7, 2009, INSTICC Press, 2009, pp. 302-305.
[20] A. Rodriguez-Cristerna, J. Torres-Jimenez, A simulated annealing with variable neighborhood search approach to construct mixed covering arrays, Electron. Notes Discrete Math. 39 (2012) 249-256 (EURO Mini Conference).
[21] A. Rodriguez-Cristerna, J. Torres-Jimenez, A genetic algorithm for the problem of minimal Brauer chains for large exponents, in: Soft Computing Applications in Optimization, Control, and Recognition, Springer, 2013, pp. 27-51.
[22] A. Rodriguez-Cristerna, J. Torres-Jimenez, I. Rivera-Islas, C. Hernandez-Morales, H. Romero-Monsivais, A. Jose-Garcia, A mutation-selection algorithm for the problem of minimum Brauer chains, in: I. Batyrshin, G. Sidorov (Eds.), Advances in Soft Computing, Vol. 7095 of Lecture Notes in Computer Science, Springer Berlin/Heidelberg, 2011, pp. 107-118.
[23] G. Tassey, The Economic Impacts of Inadequate Infrastructure for Software Testing, Tech. Rep., Research Triangle Institute, National Institute of Standards and Technology, U.S. Department of Commerce, May 2002.
[24] J. Torres-Jimenez, E. Rodriguez-Tello, New bounds for binary covering arrays using simulated annealing, Inf. Sci. 185 (1) (2012) 137-152.
*Corresponding author.
E-mail addresses: jtj@cinvestav.mx (J. Torres-Jimenez), arodriguez@tamps.cinvestav.mx (A. Rodriguez-Cristerna).
Peer review under responsibility of Chongqing University of Technology.
http://dx.doi.org/10.1016/j.trit.2016.12.006
2468-2322/© 2017 Chongqing University of Technology. Production and hosting by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
CAAI Transactions on Intelligence Technology, 2017, Issue 1.