Package org.apache.commons.collections4.sequence
The two sequences can hold any object type, as only the equals method is used to compare the elements of the sequences. It is guaranteed that the comparisons will always be done as o1.equals(o2) where o1 belongs to the first sequence and o2 belongs to the second sequence. This can be important if subclassing is used for some elements in the first sequence and the equals method is specialized.
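
To make the call direction concrete, here is a minimal sketch using only the standard library. The classes Token and LooseToken and the helper matches are hypothetical names invented for this illustration; they are not part of the package. LooseToken specializes equals to ignore case, so the result depends on which object receives the call.

```java
class EqualsDirectionDemo {
    /** Hypothetical base element: equal when the names match exactly. */
    static class Token {
        final String name;

        Token(String name) { this.name = name; }

        @Override
        public boolean equals(Object other) {
            return other instanceof Token && name.equals(((Token) other).name);
        }

        @Override
        public int hashCode() { return name.hashCode(); }
    }

    /**
     * Hypothetical specialized element: its equals ignores case.
     * This is deliberately asymmetric, for illustration only.
     */
    static class LooseToken extends Token {
        LooseToken(String name) { super(name); }

        @Override
        public boolean equals(Object other) {
            return other instanceof Token
                    && name.equalsIgnoreCase(((Token) other).name);
        }

        @Override
        public int hashCode() { return name.toLowerCase().hashCode(); }
    }

    /** Mirrors the documented direction: the first-sequence element is the receiver. */
    static boolean matches(Object fromFirst, Object fromSecond) {
        return fromFirst.equals(fromSecond);
    }

    public static void main(String[] args) {
        // Specialized element in the first sequence: its equals decides.
        System.out.println(matches(new LooseToken("a"), new Token("A")));
        // Roles swapped: the plain Token equals decides instead.
        System.out.println(matches(new Token("A"), new LooseToken("a")));
    }
}
```

With the specialized element in the first sequence the two objects match; with the roles swapped they do not, which is why the guaranteed direction matters.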
Comparison can be seen from two points of view: either as giving the smallest modification that transforms the first sequence into the second one, or as giving the longest sequence which is a subsequence of both initial sequences. The equals method is used to compare objects, so any object can be put into sequences. Modifications include deleting, inserting or keeping one object, starting from the beginning of the first sequence. Like most algorithms of the same type, object transpositions are not supported. This means that if a sequence (A, B) is compared to (B, A), the result will be either the sequence of three commands delete A, keep B, insert A or the sequence insert B, keep A, delete B.
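
The (A, B) versus (B, A) case can be reproduced with a small stand-alone sketch. It uses a textbook dynamic-programming table for the longest common subsequence, not the algorithm this package uses, and the class SimpleDiffSketch and its string commands are hypothetical. Which of the two equivalent scripts comes out depends only on the tie-breaking rule, here "prefer delete".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SimpleDiffSketch {
    /** Returns commands such as "delete A", "keep B", "insert A". */
    static <T> List<String> diff(List<T> first, List<T> second) {
        int n = first.size();
        int m = second.size();
        // lcs[i][j] = LCS length of first[i..] and second[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                // Elements of the first sequence are always the receivers of equals.
                lcs[i][j] = first.get(i).equals(second.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        List<String> script = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (first.get(i).equals(second.get(j))) {
                script.add("keep " + first.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                // Tie-break: prefer deleting from the first sequence.
                script.add("delete " + first.get(i));
                i++;
            } else {
                script.add("insert " + second.get(j));
                j++;
            }
        }
        while (i < n) {
            script.add("delete " + first.get(i++));
        }
        while (j < m) {
            script.add("insert " + second.get(j++));
        }
        return script;
    }

    public static void main(String[] args) {
        // Prints [delete A, keep B, insert A]
        System.out.println(diff(Arrays.asList("A", "B"), Arrays.asList("B", "A")));
    }
}
```

A transposition is therefore reported as one deletion plus one insertion around the kept element, never as a single "swap" command.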
The package uses a very efficient comparison algorithm designed by Eugene W. Myers and described in his paper: An O(ND) Difference Algorithm and Its Variations. This algorithm produces the shortest possible edit script containing all the commands needed to transform the first sequence into the second one. The entry point for the user to this algorithm is the SequencesComparator class.
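
For readers who want a feel for the greedy idea behind an O(ND) search, the following is a compact sketch that computes only the length D of the shortest edit script (insertions plus deletions). It is an independent illustration written against the standard library; the class MyersSketch is a hypothetical name, and nothing here is claimed to match how SequencesComparator is implemented internally.

```java
import java.util.Arrays;
import java.util.List;

class MyersSketch {
    /** Returns the minimum number of insertions plus deletions turning a into b. */
    static <T> int shortestEditLength(List<T> a, List<T> b) {
        int n = a.size();
        int m = b.size();
        int max = n + m;
        int offset = max;
        // v[offset + k] = furthest x reached so far on diagonal k (where k = x - y)
        int[] v = new int[2 * max + 2];
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];      // move down: take an element of b (insert)
                } else {
                    x = v[offset + k - 1] + 1;  // move right: drop an element of a (delete)
                }
                int y = x - k;
                // Follow the "snake": equal elements cost nothing.
                while (x < n && y < m && a.get(x).equals(b.get(y))) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        throw new IllegalStateException("unreachable: d = n + m always suffices");
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("A", "B", "C", "A", "B", "B", "A");
        List<String> b = Arrays.asList("C", "B", "A", "B", "A", "C");
        System.out.println(shortestEditLength(a, b));
    }
}
```

The outer loop tries edit counts d = 0, 1, 2, ... and stops at the first d that reaches the end of both sequences, so the work grows with the size of the difference rather than with the product of the lengths.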
As explained in Eugene W. Myers's paper, the edit script is equivalent to all other representations and contains all the information needed either to perform the transformation or, for example, to retrieve the longest common subsequence.
If the user needs very fine-grained access to the comparison result, they need to go through this script by providing a visitor implementing the CommandVisitor interface.
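
The visitor idea can be sketched without the library on the classpath. In the block below, Visitor, Kind and Command are local stand-ins invented for this illustration; the three visit method names follow what the CommandVisitor interface is understood to declare, which should be checked against the class documentation. The sketch shows how one script yields both the longest common subsequence (the kept objects) and the second sequence (kept plus inserted objects).

```java
import java.util.ArrayList;
import java.util.List;

class VisitorSketch {
    /** Local stand-in for a command visitor (assumed shape, not the library type). */
    interface Visitor<T> {
        void visitInsertCommand(T object);
        void visitKeepCommand(T object);
        void visitDeleteCommand(T object);
    }

    /** Hypothetical command representation: a kind plus the object concerned. */
    enum Kind { INSERT, KEEP, DELETE }

    static final class Command<T> {
        final Kind kind;
        final T object;

        Command(Kind kind, T object) {
            this.kind = kind;
            this.object = object;
        }
    }

    /** Walks the script in order, dispatching each command to the visitor. */
    static <T> void visit(List<Command<T>> script, Visitor<T> visitor) {
        for (Command<T> c : script) {
            switch (c.kind) {
                case INSERT: visitor.visitInsertCommand(c.object); break;
                case KEEP:   visitor.visitKeepCommand(c.object);   break;
                default:     visitor.visitDeleteCommand(c.object); break;
            }
        }
    }

    /** Kept objects form the common subsequence described by the script. */
    static <T> List<T> commonSubsequence(List<Command<T>> script) {
        List<T> kept = new ArrayList<>();
        visit(script, new Visitor<T>() {
            @Override public void visitInsertCommand(T object) { /* only in second */ }
            @Override public void visitKeepCommand(T object) { kept.add(object); }
            @Override public void visitDeleteCommand(T object) { /* only in first */ }
        });
        return kept;
    }

    /** Kept plus inserted objects, in script order, rebuild the second sequence. */
    static <T> List<T> rebuildSecond(List<Command<T>> script) {
        List<T> result = new ArrayList<>();
        visit(script, new Visitor<T>() {
            @Override public void visitInsertCommand(T object) { result.add(object); }
            @Override public void visitKeepCommand(T object) { result.add(object); }
            @Override public void visitDeleteCommand(T object) { /* dropped */ }
        });
        return result;
    }

    /** The script "delete A, keep B, insert A" for (A, B) compared to (B, A). */
    static List<Command<String>> sampleScript() {
        List<Command<String>> script = new ArrayList<>();
        script.add(new Command<>(Kind.DELETE, "A"));
        script.add(new Command<>(Kind.KEEP, "B"));
        script.add(new Command<>(Kind.INSERT, "A"));
        return script;
    }
}
```

For the sample script, commonSubsequence returns [B] and rebuildSecond returns [B, A].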
Sometimes, however, a more synthetic approach is needed. If the user prefers to see the differences between the two sequences as global replacement operations acting on complete subsequences of the original sequences, they can provide an object implementing the simple ReplacementsHandler interface, using an instance of the ReplacementsFinder class as a command-converting layer between that object and the edit script. The number of objects which are common to both initial arrays, and hence are skipped between each call to the user's handleReplacement method, is also provided. This allows the user to keep track of the current index in both arrays if needed.
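
The grouping of individual commands into replacements can be illustrated with a simplified, stand-alone sketch. Handler is a local stand-in whose single method takes a skipped count, the removed objects and the added objects; the class ReplacementsSketch, the string-encoded commands and the final flush at the end of the script are assumptions of this illustration, so the exact behaviour of ReplacementsFinder may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReplacementsSketch {
    /** Local stand-in for a replacement handler (assumed shape, not the library type). */
    interface Handler<T> {
        void handleReplacement(int skipped, List<T> from, List<T> to);
    }

    /** Groups a script given as "delete X" / "insert X" / "keep X" strings. */
    static List<String> summarize(List<String> script) {
        List<String> out = new ArrayList<>();
        Handler<String> handler = (count, removed, added) ->
                out.add("skip " + count + ": " + removed + " -> " + added);

        List<String> from = new ArrayList<>(); // pending deletions
        List<String> to = new ArrayList<>();   // pending insertions
        int skipped = 0;                       // common objects since the last report
        for (String command : script) {
            String[] parts = command.split(" ", 2);
            switch (parts[0]) {
                case "delete":
                    from.add(parts[1]);
                    break;
                case "insert":
                    to.add(parts[1]);
                    break;
                default: // "keep"
                    if (from.isEmpty() && to.isEmpty()) {
                        skipped++;
                    } else {
                        // A kept object closes the current replacement.
                        handler.handleReplacement(skipped,
                                new ArrayList<>(from), new ArrayList<>(to));
                        from.clear();
                        to.clear();
                        skipped = 1; // the kept object just seen
                    }
            }
        }
        // Assumption of this sketch: report a trailing replacement as well.
        if (!from.isEmpty() || !to.isEmpty()) {
            handler.handleReplacement(skipped,
                    new ArrayList<>(from), new ArrayList<>(to));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(summarize(Arrays.asList(
                "keep A", "keep B", "delete C", "insert X", "insert Y",
                "keep D", "delete E")));
    }
}
```

Adding each skipped count and the sizes of the from and to lists to running totals gives the current index in the first and second sequence respectively.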

Interface Summary
CommandVisitor<T>: This interface should be implemented by user objects to walk through EditScript objects.
ReplacementsHandler<T>: This interface is devoted to handling synchronized replacement sequences.

Class Summary
DeleteCommand<T>: Command representing the deletion of one object of the first sequence.
EditCommand<T>: Abstract base class for all commands used to transform one sequence of objects into another.
EditScript<T>: This class gathers all the commands needed to transform one sequence of objects into another.
InsertCommand<T>: Command representing the insertion of one object of the second sequence.
KeepCommand<T>: Command representing the keeping of one object present in both sequences.
ReplacementsFinder<T>: This class handles sequences of replacements resulting from a comparison.
SequencesComparator<T>: This class compares two sequences of objects.