2nd Workshop on Managing the Evolution and Preservation of the Data Web

The 2nd edition of the workshop is co-located with the 13th European Semantic Web Conference (ESWC 2016) in Heraklion, Crete.

News

  • 01/05/2016 - Tentative Programme
  • 30/04/2016 - Axel Polleres confirmed as keynote speaker
  • 08/04/2016 - Paper Reviews Sent to Authors - 2 papers accepted, 2 papers conditionally accepted
  • 18/03/2016 - Submission Closed - 6 papers submitted; 2 industry papers, 4 full research papers
  • 04/03/2016 - Extended Deadline
  • 26/02/2016 - Added HTML Format Submission
  • 18/01/2016 - Added Open RDF Challenge
  • 17/12/2015 - Website and Submission Portal Online
  • 14/12/2015 - Workshop accepted at ESWC2016

Accepted Papers

  1. Ruben Taelman, Ruben Verborgh, Pieter Colpaert, Erik Mannens and Rik Van de Walle - Continuously Updating Query Results over Real-Time Linked Data (pdf)
  2. Jean-Paul Calbimonte and Karl Aberer - Toward Semantic Sensor Data Archives on the Web (pdf)
  3. (Industry Paper) James Anderson and Arto Bendiken - Transaction-Time Queries in Dydra (pdf)
  4. Marios Meimaris and George Papastefanatos - The EvoGen Benchmark Suite for Evolving RDF Data (pdf)

PDFs will be published soon.


Programme 30th May 2016

09:00-09:10 Welcome
09:10-09:40 Ruben Taelman, Ruben Verborgh, Pieter Colpaert, Erik Mannens and Rik Van de Walle
Continuously Updating Query Results over Real-Time Linked Data
09:40-10:00 James Anderson and Arto Bendiken
Transaction-Time Queries in Dydra (Industry Presentation)
10:00-10:30 Marios Meimaris and George Papastefanatos
The EvoGen Benchmark Suite for Evolving RDF Data
10:30-11:00 Coffee Break
11:00-11:45 Keynote by Axel Polleres
11:45-12:15 Jean-Paul Calbimonte and Karl Aberer
Toward Semantic Sensor Data Archives on the Web
12:15-12:30 Closing Discussion


Motivation

There is a vast and rapidly increasing quantity of scientific, corporate, government and crowd-sourced data published on the emerging Data Web. Open Data are expected to play a catalytic role in the way structured information is exploited at large scale. This offers great potential for building innovative products and services that create new value from already collected data. It is expected to foster active citizenship (e.g., around the topics of journalism, greenhouse gas emissions, food supply-chains, smart mobility, etc.) and world-wide research according to the “fourth paradigm of science”. The most noteworthy advantage of the Data Web is that, rather than documents, facts are recorded, which become the basis for discovering new knowledge that is not contained in any individual source, and solving problems that were not originally anticipated. In particular, Open Data published according to the Linked Data Paradigm are essentially transforming the Web into a vibrant information ecosystem.

Published datasets are openly available on the Web. A traditional view of digitally preserving them by “pickling them and locking them away” for future use, like groceries, would conflict with their evolution. There are a number of approaches and frameworks, such as the LOD2 stack, that manage the full life-cycle of the Data Web. More specifically, these techniques are expected to tackle major issues such as the synchronisation problem (how can we monitor changes), the curation problem (how can data imperfections be repaired), the appraisal problem (how can we assess the quality of a dataset), the citation problem (how can we cite a particular version of a linked dataset), the archiving problem (how can we retrieve the most recent or a particular version of a dataset), and the sustainability problem (how can we distribute preservation efforts to ensure long-term access).

Preserving linked open datasets poses a number of challenges, mainly related to the nature of the LOD principles and the RDF data model. In LOD, datasets representing real-world entities are structured; thus, when managing and representing facts we need to take into consideration possible constraints that may hold. Since resources might be interlinked, effective citation measures must be in place to enable, for example, the ranking of datasets according to their measured quality. Another challenge is to determine the consequences that changes to one LOD dataset may have on other datasets linked to it. The distributed nature of LOD datasets furthermore makes archiving a headache.


Important Dates

  • Submission: Friday 18th March (extended from Friday 4th March)
  • Notification: Thursday 7th April (originally Friday 1st April)
  • Final version: Saturday 30th April (originally Friday 15th April)
  • Workshop: 30th May

Topics of Interest

  • Change Discovery
    • Change detection and computation in data and/or vocabularies
    • Change traceability
    • Change notifications (e.g., PubSubHubbub, DSNotify, SPARQL Push)
    • Visualisation of evolution patterns for datasets and vocabularies
    • Prediction of changes
  • Formal models and theory
    • Formal representation of changes and evolution
    • Change/Dynamicity characteristics tailored to graph data
    • Query language for archives
    • Freshness guarantee for query results
    • Freshness guarantee in databases
  • Data Archiving and preservation
    • Scalable versioning and archiving systems/frameworks
    • Query processing/engines for archives
    • Efficient representation of archives (compression)
    • Benchmarking archives and versioning strategies

Ideally the proposed solutions should be applicable at web scale.


Submissions

We envision four types of submissions in order to cover the entire spectrum from mature research papers to novel ideas/datasets and industry technical talks:

  • Research Papers (max 15 pages), presenting novel scientific research addressing the topics of the workshop.

  • Position Papers and System and Dataset descriptions (max 5 pages), encouraging papers describing significant work in progress, late breaking results or ideas of the domain, as well as functional systems or datasets relevant to the community.

  • Industry & Use Case Presentations (max 5 pages), in which industry experts can present and discuss practical solutions, use case prototypes, best practices, etc., in any stage of implementation. Extended industry & use case papers (up to 10 pages) are allowed on request, provided the extension is justified.

  • Open RDF Archiving Challenge (max 5 pages), encouraging developers, data publishers, and technology/tool creators to apply Semantic Web techniques to create, integrate, analyze or use an archive of linked open datasets. We will provide an RDF archive for these purposes, consisting of several versions of an RDF dataset. We currently envision two sources for the challenge: (i) BEAR, a benchmark of RDF archives that provides 58 versions from the linked open data observatory, and (ii) the DBpedia Wayback Machine, which allows retrieving the version of a DBpedia page at any point in time. See Open RDF Archiving Challenge for more details.

Papers should be formatted according to the Springer LNCS format in PDF or equivalent in HTML5. To author submissions according to the Linked Research principles, authors can use dokieli, a decentralized authoring and annotation tool. HTML5 articles can be submitted by providing either a URL to the article (in HTML+RDFa, CSS, JavaScript etc.) with supporting files, or an archived zip file including all the material. For questions regarding HTML5 submissions, please contact Sarven Capadisli.

Submit your papers through the EasyChair link.


Open RDF Archiving Challenge

The Open RDF Archiving Challenge is intended to encourage developers, data publishers, and technology/tool creators to apply Semantic Web techniques to create, integrate, analyze or use an archive of linked open datasets. Thus, we expect submissions demonstrating one (or all) of the following:

  • useful functionality over RDF archives
  • a potential commercial application of RDF archives
  • tools to support/manage RDF archives at Web scale
  • visual interfaces for evolving data/archives

More concrete ideas:

  • System to efficiently retrieve a version of a graph at a given time point
  • Efficient compression of the change history of one of our datasets
  • Cross-version querying (e.g., give the current address of my authors from 2012)
  • Memento protocol for LDF servers
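
As an illustration of the first idea above (retrieving a version of a graph at a given time point), the following is a minimal sketch, not a reference implementation. It assumes an archive stored as a base snapshot plus timestamped changesets of added and removed triples. The names `Changeset` and `version_at`, and the `ex:` identifiers, are hypothetical and not part of any dataset or tool listed on this page.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Set, Tuple

# Assumption: a triple is a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


@dataclass
class Changeset:
    """Hypothetical delta: triples added and removed at one point in time."""
    timestamp: datetime
    added: Set[Triple] = field(default_factory=set)
    removed: Set[Triple] = field(default_factory=set)


def version_at(base: Set[Triple], changesets: List[Changeset], when: datetime) -> Set[Triple]:
    """Materialise the graph as it stood at `when` by replaying deltas in time order."""
    graph = set(base)  # copy, so the base snapshot is left untouched
    for change in sorted(changesets, key=lambda c: c.timestamp):
        if change.timestamp > when:
            break  # later changes are not part of the requested version
        graph -= change.removed
        graph |= change.added
    return graph


# Usage: one address change recorded in 2014.
base = {("ex:alice", "ex:address", "Galway")}
changes = [
    Changeset(
        timestamp=datetime(2014, 1, 1),
        added={("ex:alice", "ex:address", "Vienna")},
        removed={("ex:alice", "ex:address", "Galway")},
    )
]
print(version_at(base, changes, datetime(2012, 6, 1)))  # version before the change
print(version_at(base, changes, datetime(2015, 1, 1)))  # version after the change
```

Replaying every delta from the base is linear in the length of the change history. A challenge submission would likely trade this off against storing periodic full snapshots or indexing the changes.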

How to Participate

Please submit the description of your system (max 5 pages) via EasyChair, in the same format as the papers in the Research Track (LNCS).

The following information must be provided:

  1. Abstract: no more than 200 words.
  2. Executive Summary: Please provide a short executive summary that describes the functionality and usefulness of the application for a non-technical audience.
  3. Description of the system/tool/solution: The details of the system should put the focus on the functionality and innovative aspects of the system.
  4. Web access: The application should be accessible via the web. If the application is not publicly accessible, passwords should be provided. A (short) set of instructions on how to start and use the application should also be provided on the web page.

Descriptions will be published in the form of online proceedings on the Open RDF Archiving Challenge website.

We also encourage authors of research submissions to the workshop to consider participating in this challenge.

Datasets

Submissions may use any linked open datasets as a source for archiving. We provide four suggestions that may be used as a basis for submissions:

  • The Dynamic Linked Data Observatory (http://swse.deri.org/DyLDO/), monitoring more than 650 different domains across time and serving weekly crawls of these domains.
  • BEAR (ftp://nassdataweb.infor.uva.es/BEAR), a testbed for RDF archives that provides 58 versions from the linked open data observatory.
  • The DBpedia Wayback Machine (http://data.wu.ac.at/wayback/), which allows retrieving the version of a DBpedia page at any point in time.
  • DBpedia dumps (http://live.dbpedia.org/dumps/) and DBpedia Live changesets (http://live.dbpedia.org/changesets/)

Judging

A jury of experts from industry and academia (to be announced) will evaluate the systems according to the challenge criteria and will determine the winners.


Chairs

Jeremy Debattista (Enterprise Information Systems, University of Bonn, Germany / Organized Knowledge, Fraunhofer IAIS, Germany; Webpage [Contact Person]) is a PhD researcher at the University of Bonn. His research interests are in Linked Data Quality and Big Data for the Semantic Web.

Jürgen Umbrich (Vienna University of Economics and Business; Webpage) is a post-doctoral researcher at WU Vienna with research interests in (Open) Data quality assessment, monitoring and archiving. Before he joined WU, he worked for one year as a post-doctoral researcher at Fujitsu Ireland in Galway, exploiting the benefits of Linked Data for enterprise applications.

Javier D. Fernández (Vienna University of Economics and Business; Webpage) is a post-doctoral research fellow under an FWF (Austrian Science Fund) Lise-Meitner grant. His current research focuses on efficient management of Big Semantic Data, RDF streaming, archiving and querying dynamic Linked Data.


Program Committee

  • Judie Attard, University of Bonn/Fraunhofer IAIS, Germany
  • Ioannis Chrysakis, FORTH-ICS, Greece
  • Keith Cortis, University of Passau, Germany
  • Giorgos Flouris, FORTH-ICS, Greece
  • Marios Meimaris, ATHENA R.C., Greece
  • Fabrizio Orlandi, University of Bonn/Fraunhofer IAIS, Germany
  • Fouad Zablith, American University of Beirut, Lebanon
  • Magnus Knuth, Hasso Plattner Institute – University of Potsdam, Germany
  • Anisa Rula, University of Milano-Bicocca, Italy
  • Wouter Beek, VU University Amsterdam, Netherlands
  • Yannis Stavrakas, ATHENA R.C., Greece
  • Amrapali J. Zaveri, Dumontier Lab - Stanford University, USA
  • Mathieu d’Aquin, The Open University, United Kingdom
  • Yannis Roussakis, FORTH-ICS, Greece
  • Kemele M. Endris, University of Bonn, Germany
  • Charlie Abela, University of Malta, Msida, Malta
  • George Papastefanatos, ATHENA R.C., Greece
  • Nandana Mihindukulasooriya, Universidad Politécnica de Madrid (UPM), Spain
  • Niklas Petersen, University of Bonn/Fraunhofer IAIS, Germany
  • Joseph Bonello, University of Malta, Msida, Malta

Contact Us

If you have any questions related to the workshop, email us