public abstract class DocumentDuplicities extends Object
Paragraphs are inserted into the global chapter object once they are complete. The insertion is specific to the duplicity checking level and is performed in the child classes, in the methods DocumentDuplicitiesDocumentLevel.flush(), DocumentDuplicitiesParagraphLevel.flush() and DocumentDuplicitiesSentenceLevel.flush(). These methods are called from the printTokens(java.util.List<org.egothor.core.Token>, org.egothor.duplicity.visualization.Printer) method of each child class once the first token of the next paragraph is reached. While a paragraph is being created, its content is stored in another global object, phrase, and the ordinal number of the paragraph being created is stored in paragraphID.

Modifier and Type | Method and Description |
---|---|
void | createCsv(String dirname) Create a CSV file with Jaccard coefficients for this document in the given directory. |
static DocumentDuplicities | createNew(DocumentUnitID docID, DocumentData docMeta, JaccardCoeficientsFile jcf, TankerImplSecure tanker) The recommended way to create a new instance of a DocumentDuplicities child class. |
void | createReport(String dirname, boolean producePDF, boolean produceHTML, double coef) Create duplicity checking report files for this document in the given directory in the given formats. |
static List<List<Token>> | getDocumentUnits(Sequence<Token> words) Takes the sequence of document words and, depending on Constants.CHECK_DUPLICITY_LEVEL, splits it into the appropriate text units: documents, paragraphs or sentences. |
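
The buffering scheme from the class description (collect the current paragraph in phrase, then flush it into chapter when the first token of the next paragraph arrives) can be illustrated with a small stand-alone sketch. It does not use the Egothor classes: SketchToken, its paragraph field, the string-based chapter, the flush at end of input and the numbering from 1 are all assumptions made for illustration. Only the names chapter, phrase, paragraphID, printTokens and flush come from the description.

```java
import java.util.ArrayList;
import java.util.List;

final class FlushSketch {
    /** Hypothetical stand-in for a token; not the real org.egothor.core.Token. */
    static final class SketchToken {
        final String text;
        final int paragraph; // assumed: index of the paragraph the token belongs to

        SketchToken(String text, int paragraph) {
            this.text = text;
            this.paragraph = paragraph;
        }
    }

    // Names mirror the class description; their real types are unknown.
    private final List<String> chapter = new ArrayList<>();
    private final StringBuilder phrase = new StringBuilder();
    private int paragraphID = 0;
    private int currentParagraph = -1;

    /** Buffers tokens in phrase and flushes when the next paragraph begins. */
    List<String> printTokens(List<SketchToken> tokens) {
        for (SketchToken t : tokens) {
            if (currentParagraph != -1 && t.paragraph != currentParagraph) {
                flush(); // first token of the next paragraph reached
            }
            currentParagraph = t.paragraph;
            if (phrase.length() > 0) {
                phrase.append(' ');
            }
            phrase.append(t.text);
        }
        flush(); // assumption: the last paragraph is flushed at end of input
        return chapter;
    }

    /** Moves the completed paragraph into chapter and resets the buffer. */
    private void flush() {
        if (phrase.length() == 0) {
            return;
        }
        paragraphID++; // assumption: ordinal numbers start at 1
        chapter.add(paragraphID + ": " + phrase);
        phrase.setLength(0);
    }

    public static void main(String[] args) {
        List<SketchToken> tokens = List.of(
                new SketchToken("Hello", 0), new SketchToken("world", 0),
                new SketchToken("Second", 1), new SketchToken("paragraph", 1));
        System.out.println(new FlushSketch().printTokens(tokens));
    }
}
```

In the real classes each child overrides flush() for its own checking level, so the boundary test and the insertion logic are likely to differ from this sketch.
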
public static DocumentDuplicities createNew(DocumentUnitID docID, DocumentData docMeta, JaccardCoeficientsFile jcf, TankerImplSecure tanker)
See also: the Constants.CHECK_DUPLICITY_LEVEL constant.

public void createCsv(String dirname) throws IOException
Parameters:
dirname - name of the directory where the report files will be stored
Throws:
IOException - if the CSV file could not be created due to an input/output error

public void createReport(String dirname, boolean producePDF, boolean produceHTML, double coef) throws com.lowagie.text.DocumentException, IOException, DuplicityCheckingException
Parameters:
dirname - name of the directory where the report files will be stored
producePDF - if true, a PDF report file will be produced
produceHTML - if true, an HTML report file will be produced
Throws:
com.lowagie.text.DocumentException - if the report files could not be created due to an error in document creation
IOException - if the report files could not be created due to an input/output error
DuplicityCheckingException
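
For readers unfamiliar with the Jaccard coefficients mentioned for createCsv, the following sketch computes the standard Jaccard coefficient of two sets, that is, the size of their intersection divided by the size of their union. How the library selects the sets it compares and how it lays out the CSV file is not documented here, so JaccardSketch, its string-based sets and the treatment of two empty sets are assumptions for illustration only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class JaccardSketch {
    /**
     * Returns |A intersect B| / |A union B|.
     * Assumption: two empty sets are treated as identical and yield 1.0.
     */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // |A union B| = |A| + |B| - |A intersect B|
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(List.of("the", "quick", "brown", "fox"));
        Set<String> second = new HashSet<>(List.of("the", "lazy", "brown", "dog"));
        // Two shared words out of six distinct words in total.
        System.out.println(jaccard(first, second));
    }
}
```

A coefficient of 1.0 means the two sets are identical and 0.0 means they share no elements.
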
public static List<List<Token>> getDocumentUnits(Sequence<Token> words)
Takes the sequence of document words and, depending on Constants.CHECK_DUPLICITY_LEVEL, splits it into the appropriate text units: documents, paragraphs or sentences.
Parameters:
words - sequence of words of the document

Copyright © 2016 Egothor. All Rights Reserved.