Publication detail

Creating Searchable Web Page Snapshots using Semantic Technologies

BURGET, R.; SALEM, H.

Original Title

Creating Searchable Web Page Snapshots using Semantic Technologies

Type

Article in conference proceedings not indexed in WoS or Scopus

Language

English

Original Abstract

For many applications, it is necessary to create snapshots of web pages that accurately describe how each page appeared in a browser at a given point in time. Storing the original code (even when including all referenced resources) and creating bitmap screenshots have many drawbacks when it comes to searching, viewing and manipulating such snapshots. In this paper, we demonstrate a different approach that uses a remotely controlled web browser for rendering web pages. We capture the complete information about the rendered page and all pieces of its content and transform it into an explicit RDF-based model representation stored in a repository. The stored page models can then be examined using interactive web-based tools, exported in different formats, linked with other data sources, and queried using SPARQL.

Keywords

Web page snapshot, Page rendering, Data extraction, RDF, SPARQL

Authors

BURGET, R.; SALEM, H.

Released

16 June 2023

Publisher

Springer Nature Switzerland AG

Location

Alicante

ISBN

978-3-031-34443-5

Book

Web Engineering - 23rd International Conference, ICWE 2023

Edition

Lecture Notes in Computer Science

Pages from

355

Pages to

358

Pages count

4

URL

https://link.springer.com/chapter/10.1007/978-3-031-34444-2_26

BibTeX

@inproceedings{BUT183805,
  author="Radek {Burget} and Hamza {Salem}",
  title="Creating Searchable Web Page Snapshots using Semantic Technologies",
  booktitle="Web Engineering - 23rd International Conference, ICWE 2023",
  year="2023",
  series="Lecture Notes in Computer Science",
  pages="355--358",
  publisher="Springer Nature Switzerland AG",
  address="Alicante",
  doi="10.1007/978-3-031-34444-2_26",
  isbn="978-3-031-34443-5",
  url="https://link.springer.com/chapter/10.1007/978-3-031-34444-2_26"
}

Responsibility: Ing. Marek Strakoš