Pengembangan Engine Integrasi Tabel HTML pada Halaman Web (Development of an HTML Table Integration Engine for Web Pages)
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Universitas Gadjah Mada, 2016-08-01 |
Series: | Jurnal Nasional Teknik Elektro dan Teknologi Informasi |
Subjects: | |
Online Access: | http://ejnteti.jteti.ugm.ac.id/index.php/JNTETI/article/view/254 |
Summary: | Two problems arise when integrating tables from multiple web pages: structural conflict and semantic conflict. To tackle these problems, the proposed study combines existing methods that have already been proven to solve problems in the integration process. The proposed HTML table integration process consists of four phases: (1) locating the table in the web pages, (2) separating attributes from data values, (3) integrating the table schema, and (4) migrating the data values into the integrated schema. The table location in a web page is determined using a heuristic approach, which can also separate the attributes from the data values of the table. Semantic conflicts that appear while integrating the table schema are handled using a domain-specific ontology. The resulting data values are then migrated to the integrated table schema, with duplicate data checked using a vector space model. The result of the integration is presented as a single HTML table. The approach is implemented as an engine written in Python. Experimental results show that the proposed approach can integrate multiple HTML tables from multiple web pages into a single integrated table. |
---|---|
ISSN: | 2301-4156 2460-5719 |
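
Phases (3) and (4) of the summary can be illustrated with a minimal Python sketch. It is not the authors' engine: the `CANONICAL` dictionary is a hypothetical stand-in for the domain-specific ontology, the function names are invented for illustration, and the `0.9` similarity threshold is an assumed value. Duplicate checking uses cosine similarity over term-frequency vectors, one common form of the vector space model.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a domain-specific ontology: it maps attribute
# synonyms found in source tables to one canonical attribute name.
CANONICAL = {"product": "name", "name": "name", "cost": "price", "price": "price"}


def unify(header):
    """Map each source attribute to its canonical name (fallback: lowercased)."""
    return [CANONICAL.get(h.strip().lower(), h.strip().lower()) for h in header]


def tf_vector(row):
    """Build a term-frequency vector from all tokens in a row's values."""
    text = " ".join(str(v) for v in row.values()).lower()
    return Counter(re.findall(r"\w+", text))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def integrate(tables, threshold=0.9):
    """Merge (header, rows) tables into one schema, skipping near-duplicate rows.

    The threshold is an assumed value, not one taken from the article.
    """
    schema, merged = [], []
    for header, rows in tables:
        attrs = unify(header)
        for attr in attrs:
            if attr not in schema:
                schema.append(attr)
        for values in rows:
            row = dict(zip(attrs, values))
            vec = tf_vector(row)
            # Keep the row only if it is not too similar to any row kept so far.
            if all(cosine(vec, tf_vector(kept)) < threshold for kept in merged):
                merged.append(row)
    return schema, merged
```

Calling `integrate` on two tables whose headers differ only by synonym (for example `["Product", "Price"]` and `["Name", "Cost"]`) yields a single schema, and a row that appears in both tables is kept once. A real ontology would also carry hierarchy and relations between concepts, which a flat synonym dictionary cannot express.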