Automated Generation of Semantic Data Models from Scientific Publications
Martha O. Perez-Arriaga, University of New Mexico
The traditional problems of analyzing information in digital publications scale with the ever-increasing volume of data. Other challenges in analyzing publications include the lack of standards in some areas, competing formats and standards in others, heterogeneous types of data, and diverse areas of knowledge. These challenges hinder detecting, understanding, sharing, and querying information promptly.
We propose to use the text of publications and the tables embedded in them to build semantically enriched data models. These models allow integrating, sharing, managing, and querying information from publications.
Our work develops automated methods for information extraction, semantic analytics, information modeling, and information retrieval. It enables users to 1) detect, extract, and organize information from tables within digital publications, 2) interpret and enrich tables with semantic relationships using text, vocabularies, and ontologies, 3) characterize semantic data models in a machine-readable format, and 4) use semantic data models to facilitate retrieval, integration, management, sharing, and interoperability in publications.
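As a rough illustration of the steps listed above, the following minimal Python sketch organizes an already-extracted table, annotates its columns with concepts from a vocabulary, serializes the result to JSON, and answers a simple query. It is not the method developed in this work: the function names, the model layout, the example.org URIs, and the sample table are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical vocabulary mapping header terms to concept URIs (illustrative only).
VOCABULARY = {
    "species": "http://example.org/vocab/Species",
    "mass": "http://example.org/vocab/Mass",
}


def build_semantic_model(caption, headers, rows, vocabulary=VOCABULARY):
    """Organize an extracted table and annotate each column with a concept."""
    columns = []
    for index, header in enumerate(headers):
        key = header.strip().lower()
        columns.append({
            "name": header,
            "concept": vocabulary.get(key),  # None when no concept matches
            "values": [row[index] for row in rows],
        })
    return {"caption": caption, "columns": columns}


def to_json(model):
    """Serialize the model to a machine-readable JSON string."""
    return json.dumps(model, indent=2, ensure_ascii=False)


def find_columns(model, concept):
    """Retrieve the names of columns annotated with the given concept."""
    return [c["name"] for c in model["columns"] if c["concept"] == concept]


# Made-up sample table standing in for one extracted from a publication.
model = build_semantic_model(
    "Table 1. Sample measurements",
    ["Species", "Mass", "Site"],
    [["A. example", "12.5", "S1"], ["B. example", "7.1", "S2"]],
)
print(to_json(model))
print(find_columns(model, "http://example.org/vocab/Mass"))
```

In this sketch a column with no matching vocabulary term keeps a null concept, so unannotated data stays in the model and can be enriched later.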