Summary
With the extensive range of document generation devices nowadays, the establishment of computational techniques to find manipulation, detect illegal copies and link documents to their source are useful because (i) finding manipulation can help to detect fake news and manipulated documents; (ii) exposing illegal copies can avoid frauds and copyright violation; and (iii) indicating the owner of an illegal document can provide strong arguments to the prosecution of a suspect. Different machine learning techniques have been proposed in the scientific literature to act in these problems, but many of them are limited as: (i) there is a lack of methodology, which may require different experts to solve different problems; (ii) the limited range of known elements being considered for multi-class classification problems such as source attribution, which do not consider unknown classes in a real-world testing; and (iii) they don’t consider adversarial attacks from an experienced forger. In this research project, we propose to address these problems on two fronts: resilient characterization and classification. In the characterization front, we intend to use multi-analysis approaches. Proposed by the candidate in his Ph.D. research, it is a methodology to fuse/ensemble machine learning approaches by considering several investigative scenarios, creating robust classifiers that minimize the risk of attacks. Additionally, we aim at proposing the use of open-set classifiers, which are trained to avoid misclassification of classes not included in the classifier training. We envision solutions to several printed document forensics applications with this setup: source attribution, forgery of documents and illegal copies detection. All the approaches we aim at creating in this project will be done in partnership with a document authentication company, which will provide real-world datasets and new applications.
Unfold all
/
Fold all
More information & hyperlinks
Web resources: | https://cordis.europa.eu/project/id/892757 |
Start date: | 15-06-2020 |
End date: | 14-06-2022 |
Total budget - Public funding: | 183 473,28 Euro - 183 473,00 Euro |
Cordis data
Original description
With the extensive range of document generation devices nowadays, the establishment of computational techniques to find manipulation, detect illegal copies and link documents to their source are useful because (i) finding manipulation can help to detect fake news and manipulated documents; (ii) exposing illegal copies can avoid frauds and copyright violation; and (iii) indicating the owner of an illegal document can provide strong arguments to the prosecution of a suspect. Different machine learning techniques have been proposed in the scientific literature to act in these problems, but many of them are limited as: (i) there is a lack of methodology, which may require different experts to solve different problems; (ii) the limited range of known elements being considered for multi-class classification problems such as source attribution, which do not consider unknown classes in a real-world testing; and (iii) they don’t consider adversarial attacks from an experienced forger. In this research project, we propose to address these problems on two fronts: resilient characterization and classification. In the characterization front, we intend to use multi-analysis approaches. Proposed by the candidate in his Ph.D. research, it is a methodology to fuse/ensemble machine learning approaches by considering several investigative scenarios, creating robust classifiers that minimize the risk of attacks. Additionally, we aim at proposing the use of open-set classifiers, which are trained to avoid misclassification of classes not included in the classifier training. We envision solutions to several printed document forensics applications with this setup: source attribution, forgery of documents and illegal copies detection. All the approaches we aim at creating in this project will be done in partnership with a document authentication company, which will provide real-world datasets and new applications.Status
CLOSEDCall topic
MSCA-IF-2019Update Date
28-04-2024
Images
No images available.
Geographical location(s)