Title: Building a Semantic Index from HTML Pages or XML Documents

Year of Publication: Apr - 2014
Page Numbers: 338-347
Authors: Abdeslem Dennai, Sidi Mohammed Benslimane
Conference Name: The International Conference on Computing Technology and Information Management (ICCTIM2014), United Arab Emirates


One phase of reverse engineering web-oriented applications is the extraction of concepts hidden in HTML pages or marked up in XML documents. In this article, we propose an approach for semantically indexing these two sources of information, using on the one hand a domain ontology to validate the extracted concepts and, on the other hand, a similarity measure between ontology concepts to enrich the index. The approach is organized in three steps (modeling, attaching, and enrichment) and is then validated through examples. The results obtained lead to better re-engineering of web applications and, subsequently, to a marked improvement in web structuring.
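
To make the idea in the abstract concrete, the following is a minimal Python sketch of one possible reading of it: extract candidate terms from an HTML page, keep only those found in a domain ontology, then enrich the index with ontology concepts that are sufficiently similar to the validated ones. Everything here is an assumption made for illustration and is not taken from the paper: the toy ontology `ONTOLOGY_PARENT`, the helper names, the 0.6 threshold, and the choice of a Wu-Palmer-style similarity (the abstract does not say which measure the authors use, nor how their modeling and attaching steps are defined).

```python
import re
from html.parser import HTMLParser

# Hypothetical toy domain ontology: each concept maps to its parent
# (None marks the root). A real ontology would be loaded from OWL/RDF.
ONTOLOGY_PARENT = {
    "thing": None,
    "publication": "thing",
    "article": "publication",
    "book": "publication",
    "person": "thing",
    "author": "person",
}


class TextExtractor(HTMLParser):
    """Collects lower-cased word tokens from the text nodes of an HTML page."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_data(self, data):
        self.tokens.extend(re.findall(r"[a-z]+", data.lower()))


def ancestors(concept):
    """Return the path from a concept up to the root, both inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = ONTOLOGY_PARENT[concept]
    return path


def similarity(a, b):
    """Wu-Palmer-style score: 2*depth(lcs) / (depth(a) + depth(b)), root depth 1."""
    path_a, path_b = ancestors(a), ancestors(b)
    lcs = next(c for c in path_a if c in path_b)  # deepest common ancestor
    return 2 * len(ancestors(lcs)) / (len(path_a) + len(path_b))


def build_index(pages, threshold=0.6):
    """Map each ontology concept to the set of page ids it indexes."""
    index = {}
    for page_id, html in pages.items():
        parser = TextExtractor()
        parser.feed(html)
        parser.close()
        # Validation: keep only terms that are concepts of the ontology.
        validated = {t for t in parser.tokens if t in ONTOLOGY_PARENT}
        for concept in validated:
            index.setdefault(concept, set()).add(page_id)
        # Enrichment: also index the page under sufficiently similar concepts.
        for concept in validated:
            for other in ONTOLOGY_PARENT:
                if other != concept and similarity(concept, other) >= threshold:
                    index.setdefault(other, set()).add(page_id)
    return index
```

For example, `build_index({"p1": "<h1>Article list</h1><p>Each article has a title.</p>"})` validates only the term `article`, then enriches the index with the similar concepts `publication` and `book`. The same pipeline could plausibly be applied to XML documents by swapping the extractor for one that reads element names or text, which is again an assumption rather than the authors' stated design.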