Hugo Jair Escalante, Manuel Montes, Pastor López (INAOE team)
Authorship verification (AV) deals with the problem of determining whether a document was written by a given author or not. In TeSLA, and in online education in general, AV methods can be used in several subtasks, most notably the authentication of written documents (e.g., determining whether homework or an examination was actually done by the student who was supposed to have done it). There are very effective methods for automatic AV. These methods are based on the analysis of stylistic writing patterns, which in most cases are derived automatically using machine learning techniques.
Although there are very effective AV methodologies that can deal, to some extent, with the online learning domain, a common drawback of most solutions is the lack of explainability and interpretability mechanisms. That is, these methods are good at telling us whether a document was potentially written by the author in question, but in general they cannot tell us what evidence led the model to make such a recommendation (i.e., explainability). Nor do they reveal much about the internal structure of the model (i.e., interpretability). Both features, but most importantly explainability, are critical for reducing the uncertainty that decision makers may have when they regard AV methods as black boxes. Consider a model that, in addition to making predictions on the authorship of a document, can tell you why it “thinks” the document should be attributed one way and not the other. The usability of such methods would be far greater, and their impact more far-reaching, than that of current methods.
In this direction, TeSLA is exploring methodologies that push the state of the art towards explainability. Specifically, we are focusing on the generation of auditable evidence that can support the recommendations of the AV model. Initial ideas are oriented towards highlighting phrases, words, and terms that carry highly discriminative information. Thus, although we are not certain how explainable or justifiable our mechanisms will be, we are confident that they will advance the state of the art in text mining, as there are no solutions implementing these ideas so far.
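To make the idea of highlighting evidence more concrete, here is a minimal Python sketch. It is not TeSLA's method: it assumes a deliberately simple model (word-frequency profiles compared by cosine similarity) and treats each shared word's contribution to the similarity score as the "evidence". All function names (`profile`, `explain_similarity`, `highlight`) and the sample texts are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def profile(texts):
    """Build an L2-normalised word-frequency profile from a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {term: c / norm for term, c in counts.items()}


def explain_similarity(known_texts, questioned_text, top_k=3):
    """Return the cosine similarity and the terms that contribute most to it.

    Assumption: a term's share of the dot product stands in for "evidence".
    """
    known = profile(known_texts)
    questioned = profile([questioned_text])
    contributions = {t: w * known[t] for t, w in questioned.items() if t in known}
    score = sum(contributions.values())
    # Sort by contribution (descending), then alphabetically to break ties.
    evidence = sorted(contributions.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return score, evidence


def highlight(text, terms):
    """Wrap every occurrence of the given terms in square brackets."""
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text)


# Hypothetical sample data, for illustration only.
known_texts = [
    "Moreover, the results are, moreover, quite clear.",
    "Moreover, we argue that the method works.",
]
questioned_text = "Moreover, the data are clear."

score, evidence = explain_similarity(known_texts, questioned_text, top_k=2)
marked = highlight(questioned_text, [term for term, _ in evidence])
print(round(score, 3), evidence)
print(marked)
```

Note that this sketch ranks terms by how much they contribute to the similarity with the known author, which is not the same as being discriminative. A more faithful version would also compare against writing by other authors, so that words everyone uses frequently are not reported as evidence.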
FUNDED BY THE EUROPEAN UNION
TeSLA is coordinated by Universitat Oberta de Catalunya (UOC) and funded by the European Commission’s Horizon 2020 ICT Programme. This website reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein.