Enhancing Semantic Web Entity Matching Process Using Transformer Neural Networks and Pre-Trained Language Models

Authors

  • Mourad Jabrane, Sultan Moulay Slimane University, Laboratory of Process Engineering, Computer Science and Mathematics, Khouribga, Morocco
  • Abdelfattah Toulaoui, Sultan Moulay Slimane University, Laboratory of Process Engineering, Computer Science and Mathematics, Khouribga, Morocco
  • Imad Hafidi, Sultan Moulay Slimane University, Laboratory of Process Engineering, Computer Science and Mathematics, Khouribga, Morocco

DOI:

https://doi.org/10.31577/cai_2024_6_1397

Keywords:

Entity matching, record linkage, linked data, deep learning, transformer neural networks

Abstract

Entity matching (EM) is a critical yet complex component of data cleaning and integration. Recent advances in EM have been driven predominantly by deep learning (DL) methods. These methods primarily enhance data accuracy within structured data that adheres to a high-quality, well-defined schema. However, such schema-centric DL strategies struggle with the semantic web's linked data, which tends to be voluminous, semi-structured, diverse, and often noisy. To address this, we introduce a novel approach that is loosely schema-aware and leverages recent developments in DL, specifically transformer neural networks and pre-trained language models. We evaluated our approach on six datasets: two tabular datasets and four RDF datasets from the semantic web. The findings demonstrate the effectiveness of our model in managing the complexities of noisy and varied data.

Published

2024-12-31

How to Cite

Jabrane, M., Toulaoui, A., & Hafidi, I. (2024). Enhancing Semantic Web Entity Matching Process Using Transformer Neural Networks and Pre-Trained Language Models. Computing and Informatics, 43(6), 1397–1415. https://doi.org/10.31577/cai_2024_6_1397