
Textual Authenticity in the AI Era: Evaluating BERT and RoBERTa with Logistic Regression and Neural Networks for Text Classification

L. Hazim, Oguz Ata

2024 · DOI: 10.1109/ISETC63109.2024.10797291
International Symposium on Electronics and Telecommunications · 1 Citation

TLDR

This study compares Logistic Regression and Feedforward Neural Network classifiers built on sentence-BERT and RoBERTa embeddings for determining the authenticity of a given text.

Abstract

AI-generated content that imitates human writing has drawn growing attention as AI becomes more widespread. This study compares Logistic Regression and Feedforward Neural Network (FNN) classifiers built on sentence-BERT and RoBERTa embeddings for determining the authenticity of a given text. A dataset balanced between human-written and AI-generated texts was used. The texts were first preprocessed with techniques such as tokenization and normalization, and features were then extracted using transformer-based models. Cross-validation and confusion matrix analysis, with measures such as accuracy, precision, recall, F1 score, and ROC AUC, were used to help ensure the models’ robustness. The hybrid RoBERTa-FNN model outperformed the other models in precision and recall and achieved the highest reported accuracy (99.95%). This improved performance indicates how effectively RoBERTa embeddings represent context at the fine-grained level required for this kind of text classification. The work is a step toward robust AI text detection systems and adds to the understanding of how models and embeddings perform in text classification. The results emphasize that the choice of model configuration and embedding technique is a key factor in achieving the best results in practical applications.
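
The evaluation measures named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the convention that label 1 means AI-generated, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are not results from the paper.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN, treating label 1 as the positive class.

    Assumption for this sketch: 1 = AI-generated, 0 = human-written.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1:
            counts["tp" if truth == 1 else "fp"] += 1
        else:
            counts["fn" if truth == 1 else "tn"] += 1
    return counts


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    accuracy = (c["tp"] + c["tn"]) / total if total else 0.0
    precision = c["tp"] / predicted_pos if predicted_pos else 0.0
    recall = c["tp"] / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data: a balanced set of 4 AI-generated (1) and 4 human-written (0) texts.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical classifier scores (estimated probability that a text is AI-generated).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
# Hard predictions from an assumed 0.5 decision threshold.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_metrics = classification_metrics(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_metrics, toy_auc)
```

In a cross-validation setting, these measures would typically be computed on each held-out fold and then averaged. The pairwise AUC loop is quadratic in the number of examples, so it suits small illustrations rather than large datasets.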