There have been several requests about DARQ over the past months, so I thought it was time to post a short project status. DARQ (original website) is a query engine for federated SPARQL queries that I worked on in the past. It provides transparent query access to multiple, distributed SPARQL endpoints, as if querying a single RDF graph. In other words, it automatically splits an incoming query, sends sub-queries to the relevant endpoints and merges the results into one result set. The available endpoints need to be registered in the system using Service Descriptions. This works quite well if the vocabularies of the endpoints do not overlap very much. If two or more services store the same properties for resources, query answering can be very slow; with little overlap, it can be done in reasonable time. Some results of the work were published at ESWC 2008 (paper), and the source code is available at SourceForge.
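To illustrate the idea of splitting a query across registered endpoints, here is a minimal Python sketch. It is not DARQ's actual code, only a toy model under stated assumptions: each service is described by the set of predicates it can answer, and the endpoint URLs, the `SERVICES` table and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical service descriptions: endpoint URL -> predicates it can answer.
SERVICES = {
    "http://example.org/sparql/people": {"foaf:name", "foaf:mbox"},
    "http://example.org/sparql/pages": {"foaf:page"},
}


def select_services(predicate, services):
    """Return the endpoints whose description lists the given predicate."""
    return sorted(url for url, preds in services.items() if predicate in preds)


def split_query(triple_patterns, services):
    """Group triple patterns into one sub-query per matching endpoint.

    Patterns that no registered service can answer are collected under
    the key None (in this sketch: "nothing to send them to").
    """
    sub_queries = defaultdict(list)
    for pattern in triple_patterns:
        _subject, predicate, _object = pattern
        matches = select_services(predicate, services)
        if not matches:
            sub_queries[None].append(pattern)
        for url in matches:
            # A predicate offered by several services is sent to all of them.
            sub_queries[url].append(pattern)
    return dict(sub_queries)


def merge(left, right):
    """Join two lists of variable bindings on their shared variables."""
    merged = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[var] == r_row[var] for var in shared):
                merged.append({**l_row, **r_row})
    return merged
```

In this toy model, a predicate that appears in several service descriptions fans out to every one of those endpoints, which gives a rough intuition for why overlapping vocabularies make query answering slower.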
DARQ started as a proof-of-concept system at HP Labs in 2006. It is at a very early stage and I would not recommend using it in any production environment. I worked on DARQ until early 2008. After that, DARQ was continued by a diploma student who added query caching and a basic query translation mechanism that allows mappings between vocabularies to be specified, e.g. “Authors are a subclass of Persons” or “price in EUR can be converted to USD using exchange rate X”. Mappings are defined using SWRL. However, schema mapping support is even more of a prototype than DARQ itself and has not been tested extensively. He finished his work in Sept. 08, and his changes can be found in the svn trunk. The version used for the benchmarks shown in the ESWC 2008 paper is tagged with ‘benchmarks2’; all documentation written by me is for that version. It will likely not work with any of the recent Jena/ARQ releases, though I never tested it. There are currently no plans to continue the development.
If you have any questions related to DARQ feel free to post a comment or email me.
Update: The version in the svn trunk requires a mapping file. This file contains the mapping rules used for query translation. It must use SWRL’s Concrete Syntax representation.
From where do I get the mapping file???
In order to build the project from the trunk folder, arq.jar should be removed from the lib folder. In fact, the lib folder contains both arq.jar and arq-fix.jar. The latter is a fixed version of the former. If both are used as external jar files, the project will give errors in the darq.test package.
I read the paper called “Querying Distributed RDF Data Sources with SPARQL”. In chapter 4, the query examples are executed on the DBpedia data source. Are there any examples and results for different datasets? Maybe like the examples at: http://code.google.com/p/fbench/wiki/Queries#Cross_Domain_%28CD%29
Ziya,
I ran the last benchmarks on DARQ in 2008 for the mentioned paper. Before that, we ran tests on different datasets, but all of those datasets were relatively small and not public. The benchmark you are linking to is from 2010, when I was no longer working on DARQ.
Bastian
Hi, I downloaded the source code of DARQ and tried to execute a simple query, “SELECT ?o { foaf:page ?o. }”, on DARQ.
I described the DBpedia service in my configuration file as shown below:
@prefix sd: .
@prefix foaf: .
[] a sd:Service ;
sd:url ;
sd:capability [
sd:predicate foaf:page ;
] ;
.
But I got a warning message: “WARN DarqTransform :: No service found for statement: http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart @http://xmlns.com/foaf/0.1/page ?o – it will be queried locally.”
Why is the query evaluated locally? Shouldn’t it be executed on DBpedia? Where am I going wrong?
Sorry, the URIs weren’t shown in my previous comment. Here is my query again:
SELECT ?o {http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart foaf:page ?o. }
Hi Ziya,
I must admit that it has been a while since I worked on DARQ. My guess would be that there is no URL given for the service. If you don’t tell DARQ about DBpedia, it does not know about it. DARQ does not look at the URI of the subject to find services automatically.
Bastian