PRINTOUT | Printed Documents Authentication

Summary
With the extensive range of document generation devices available today, computational techniques that find manipulation, detect illegal copies and link documents to their source are useful because (i) finding manipulation can help to detect fake news and manipulated documents; (ii) exposing illegal copies can prevent fraud and copyright violation; and (iii) identifying the owner of an illegal document can provide strong arguments for the prosecution of a suspect. Different machine learning techniques have been proposed in the scientific literature to address these problems, but many of them are limited because (i) they lack a common methodology, so different experts may be required to solve different problems; (ii) multi-class classification problems such as source attribution consider only a limited range of known elements and ignore the unknown classes that appear in real-world testing; and (iii) they do not consider adversarial attacks from an experienced forger. In this research project, we propose to address these problems on two fronts: resilient characterization and classification. On the characterization front, we intend to use multi-analysis approaches. Proposed by the candidate in his Ph.D. research, multi-analysis is a methodology that fuses (ensembles) machine learning approaches by considering several investigative scenarios, creating robust classifiers that minimize the risk of attacks. Additionally, we propose the use of open-set classifiers, which are trained to avoid misclassifying samples from classes not included in the classifier's training. We envision solutions to several printed document forensics applications with this setup: source attribution, document forgery detection and illegal copy detection. All the approaches developed in this project will be created in partnership with a document authentication company, which will provide real-world datasets and new applications.
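The open-set idea in the summary can be illustrated with a minimal sketch. The summary does not say which open-set method the project uses, so the nearest-centroid rule with a distance threshold below is an assumption chosen for brevity. The printer labels, the two-dimensional feature vectors and the threshold value are all hypothetical.

```python
import math

def fit_centroids(samples):
    """Compute one mean feature vector per known source.

    samples: dict mapping a source label to a list of feature vectors.
    """
    centroids = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        centroids[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return centroids

def predict_open_set(centroids, x, threshold):
    """Return the nearest known source, or "unknown" if x is too far from all."""
    best_label, best_dist = None, math.inf
    for label, centroid in centroids.items():
        d = math.dist(x, centroid)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best_label, best_dist = label, d
    # Rejection step: a closed-set classifier would always return best_label.
    return best_label if best_dist <= threshold else "unknown"

# Hypothetical training features for two known printers.
train = {
    "printer_A": [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]],
    "printer_B": [[5.0, 5.1], [5.2, 4.9], [4.9, 5.0]],
}
centroids = fit_centroids(train)
```

With these toy values, a sample near one of the two centroids is attributed to that printer, while a sample far from both is rejected as "unknown" instead of being forced into a known class. A closed-set classifier lacks this rejection step.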
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/892757
Start date: 15-06-2020
End date: 14-06-2022
Total budget - Public funding: 183 473,28 Euro - 183 473,00 Euro
Cordis data

Status

CLOSED

Call topic

MSCA-IF-2019

Update Date

28-04-2024