content
| - %META:TOPICPARENT{name="VirtSponger"}%
---+ Enhancements the Virtuoso Sponger brings to SPARQL
%TOC%
---++What?
Virtuoso's Sponger is a sophisticated piece of middleware that provides full Linked Data fidelity for pre-existing data objects or resources. This Linked Data is then accessible via HTTP-based Web Services, and SPARQL is enhanced with Sponger pragmas and some optional additions to the <code>FROM</code> clause.
---++Why?
In the world of Linked Data, the Web is treated as a global data space where every data object has an identifier (URI) that serves as a key to its entity-attribute-value (3-tuple or triples)-based description. To make these "keys" work, data object URIs have to be dereferenceable -- i.e., they must resolve to actual object content through functionality commonly delivered via data object locator and retriever URI specializations (or subtypes) such as URLs.
---++How?
---+++ Basics
Sponger pragmas control various aspects of functionality --
1. <b>Identifier Dereference:</b> handled by INPUT pragmas.
1. <b>Actual Data Retrieval:</b> handled by GET pragmas.
1. <b>SQL Code Generation:</b> handled by SQL pragmas.
1. <b>Output Format Adjustments:</b> handled by OUTPUT pragmas.
Pragmas are qualified at usage time using the following pattern:
<verbatim>
<pragma-type>:<actual-method> ["<method-modifier>"]
</verbatim>
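The pattern can be illustrated with a single pragma. This is a minimal sketch, assuming each pragma is introduced by the <code>define</code> keyword ahead of the query body (as in the <code>define input:param "X"</code> fragment quoted under INPUT Pragmas). The graph IRI <code><nowiki>http://example.org/data.rdf</nowiki></code> is a placeholder, not a real data source.
<verbatim>
# pragma-type = get, actual-method = soft, method-modifier = "replace"
define get:soft "replace"
SELECT ?s ?p ?o
FROM <http://example.org/data.rdf>
WHERE { ?s ?p ?o }
LIMIT 10
</verbatim>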
---+++ Details
---++++ INPUT Pragmas
INPUT Pragmas enable you to control the dereference behavior applied to a SPARQL query. The net effect is fine-grained control over how variables and explicit data object identifiers are dereferenced en route to creating the base data from which SPARQL query solutions are derived.
Methods and method-modifiers associated with this pragma type include:
| *Method* | *Modifier(s)* | *Description* | *Usage Example* |
| <b><code>input:grab-all</code></b> | <b><code>"yes"</code></b> | Instructs the SPARQL processor to dereference everything related to the query. All variables and literal IRIs in the query become values for <code>input:grab-var</code> and <code>input:grab-iri</code>. The resulting performance may be very bad. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx2][Example]] |
| <b><code>input:grab-base</code></b> | <b><code>"<IRI>"</code></b> | Specifies the base IRI to use when converting relative IRIs to absolute. (Default: empty string.) | [[VirtSpongerLinkedDataHooksIntoSPARQLEx3][Example]] |
| <b><code>input:grab-depth</code></b> | <b><code>"0"</code></b> | Sets the maximum 'degrees of separation' or links (predicates) between nodes in the target graph. A value of 0 means unlimited. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx4][Example]] |
| <b><code>input:grab-destination</code></b> | <b><code>"<IRI>"</code></b> | Overrides the default IRI dereferencing and Local Graph IRI designation. Basically, retrieved content (triples) is stored in a graph IRI designated by the modifier value. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx5][Example]] |
| <b><code>input:grab-follow-predicate</code></b> | <b><code>"<IRI>"</code></b> | Specifies a predicate IRI to be used when traversing a graph. (This pragma can be included multiple times). Synonym of <code>input:grab-seealso</code>. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx6][Example]] |
| <b><code>input:grab-iri</code></b> | <b><code>"<IRI>"</code></b> | Specifies an IRI that should be retrieved before executing the rest of the query, if it is not in the quad store already. (This pragma can be included multiple times). | [[VirtSpongerLinkedDataHooksIntoSPARQLEx7][Example]] |
| <b><code>input:grab-limit</code></b> | <b><code>"<number>"</code></b> | Sets the maximum number of resources (triple subject or object IRIs) to be dereferenced. A value of 0 means unlimited. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx8][Example]] |
| <b><code>input:grab-loader</code></b> | <b><code>"<procedure-name>"</code></b> | Identifies the procedure used to retrieve, parse, and store content. (Default: <code><nowiki>DB.DBA.RDF_SPONGE_UP</nowiki></code>) | [[VirtSpongerLinkedDataHooksIntoSPARQLEx9][Example]] |
| <b><code>input:grab-resolver</code></b> | <b><code>"<procedure-name>"</code></b> | Identifies the procedure that handles IRI dereference and actual content retrieval via a specific data access protocol (e.g., HTTP). (Default: <code><nowiki>DB.DBA.RDF_GRAB_RESOLVER_DEFAULT</nowiki></code>.) | [[VirtSpongerLinkedDataHooksIntoSPARQLEx10][Example]] |
| <b><code>input:grab-seealso</code></b> | <b><code>"<IRI>"</code></b> | Synonym of <code>input:grab-follow-predicate</code>. |[[VirtSpongerLinkedDataHooksIntoSPARQLEx11][Example]] |
| <b><code>input:grab-var</code></b> | <b><code>"?<var-name>"</code></b> | Specifies the name of the SPARQL variable whose values should be used as IRIs of resources that should be downloaded. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx1][Example]] |
| <b><code>input:grab-group-destination</code></b> | <b><code>"<IRI>"</code></b> | Resembles <code>input:grab-destination</code>, but the Sponger still creates individual graphs for Network Resource Fetch results and, in addition to this common routine, adds a copy of each Network Resource Fetch result to the resource specified by the value of <code>input:grab-group-destination</code>. In short, <code>input:grab-destination</code> redirects loadings, whereas <code>input:grab-group-destination</code> duplicates them. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx17][Example]] |
| <b><code>input:grab-intermediate</code></b> | <b><code>"<IRI>"</code></b> | Extends the set of IRIs to sponge; useful in combination with <code>input:grab-seealso</code>. If present, then for a given subject, Network Resource Fetch retrieves not only the values of see-also predicates for that subject but also the subject itself. The value of this define is not used in the current implementation. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx18][Example]] |
| <b><code>input:same-as</code></b> | <b><code>"yes"</code></b> | Sets the inference context for owl:sameAs (entity equivalence by name) reasoning and union expansion. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx19][Example]] |
| <b><code>input:storage</code></b> | <b><code>"<IRI>"</code></b> | Sets the dataset (quads) storage scope. The value is a storage identifier (IRI); the default is virtrdf:DefaultQuadStorage. If the value is an empty string, only quads associated with Linked Data Views are used. This is a good choice for low-level admin procedures, for two reasons: they will not interfere with any changes in virtrdf:DefaultQuadStorage, and they will continue to work even if all of the compiler's metadata is corrupted, including the description of virtrdf:DefaultQuadStorage. (<code>define input:storage ""</code> switches the SPARQL compiler to a small set of metadata that is built into 'C' code and is thus very hard for end users to corrupt.) | [[VirtSpongerLinkedDataHooksIntoSPARQLEx20][Example]] |
| <b><code>input:ifp</code></b> | <b><code>"<keyword>"</code></b> | Adds the IFP keyword to the <code>OPTION (QUIETCAST, ...)</code> clause in the generated SQL. The value of this define is not used yet; an empty string is safe for future extensions. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx21][Example]] |
| <b><code>input:inference</code></b> | <b><code>"<IRI>"</code></b> | Specifies the name of the inference rule that provides context for the backward-chained reasoner. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx22][Example]] |
| <b><code>input:param</code></b> | <b><code>"<variable-name>"</code></b> | Declares a variable name as a protocol parameter. The SPARQL query can refer to protocol parameter X via a variable with the special syntax "<b><code>?::X</code></b>" (without quotation marks). If the query text is produced by a query builder that does not understand SPARQL-BI extensions, the query text may instead contain the variable <code>?X</code> together with <code>define input:param "X"</code>. This does not work for positional parameters; one cannot replace a reference to <code>?::3</code> with <code>?3</code> and <code>define input:param "3"</code>. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx23][Example]] |
| <b><code>input:params</code></b> | <b><code>"<variable-name>"</code></b> | Synonym of <code>input:param</code> | [[VirtSpongerLinkedDataHooksIntoSPARQLEx23][Example]] |
| <b><code>input:default-graph-uri</code></b> | <b><code>"<IRI>"</code></b> | works like "<code>FROM</code>" clause | [[VirtSpongerLinkedDataHooksIntoSPARQLEx24][Example]] |
| <b><code>input:named-graph-uri</code></b> | <b><code>"<IRI>"</code></b> | works like "<code>FROM NAMED</code>" clause | [[VirtSpongerLinkedDataHooksIntoSPARQLEx25][Example]] |
| <b><code>input:default-graph-exclude</code></b> | <b><code>"<IRI>"</code></b> | works like "<code>NOT FROM</code>" clause | [[VirtSpongerLinkedDataHooksIntoSPARQLEx26][Example]] |
| <b><code>input:named-graph-exclude</code></b> | <b><code>"<IRI>"</code></b> | works like "<code>NOT FROM NAMED</code>" clause | [[VirtSpongerLinkedDataHooksIntoSPARQLEx27][Example]] |
| <b><code>input:freeze</code></b> | | Blocks further changes to the list of source graphs. A web service endpoint (or similar non-web application) can edit an incoming query by placing a list of pragmas, ending with <code>input:freeze</code>, in front of the query text. Even if an intruder tries to add graph names, they will get a compilation error rather than access to the data. <code>input:freeze</code> disables all <code>input:grab-...</code> pragmas as well. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx28][Example]] |
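Several of the <code>input:grab-...</code> pragmas above are typically combined in one query. The following is a minimal sketch, not a definitive recipe. The <code>example.org</code> IRIs and the variable name <code>?friend</code> are placeholders. Writing numeric modifiers unquoted and IRI modifiers in angle brackets follows the form commonly seen in Virtuoso documentation examples and should be treated as an assumption here.
<verbatim>
# Fetch the starting document if it is not already in the quad store
define input:grab-iri <http://example.org/people/alice>
# Dereference every value bound to ?friend
define input:grab-var "?friend"
# Also follow this predicate when traversing
define input:grab-seealso <http://xmlns.com/foaf/0.1/knows>
# Go at most 2 links deep and dereference at most 50 resources
define input:grab-depth 2
define input:grab-limit 50
SELECT DISTINCT ?friend
WHERE { <http://example.org/people/alice#me> <http://xmlns.com/foaf/0.1/knows> ?friend }
</verbatim>
Keeping <code>input:grab-depth</code> and <code>input:grab-limit</code> small is a deliberate choice in this sketch, since unbounded dereferencing can be slow.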
---++++ GET Pragmas
GET Pragmas enable you to control the actual data object content retrieval behavior applied to a SPARQL query. The net effect is fine-grained control over data-access matters such as --
1. Data object content format, via content negotiation;
1. Cache invalidation; and
1. Proxy handling.
This pragma type can also be given as a comma-separated list of options in <code>SPARQL ... FROM <options></code>. Its methods and method-modifiers include --
| *Method* | *Modifier(s)* | *Description* | *Usage Example* |
| <b><code>get:proxy</code></b> | <b><code>"<host[:port]>"</code></b> | Similar to setting up a Web browser to work with a proxy-style HTTP server, this identifies the CNAME (URL "<code>host:port</code>" or "<code>authority</code>" component) to target if direct retrieval from the URL in the <code>FROM</code> clause, or handling of a data object's dereferenceable identifier, is not possible. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx12][Example]] |
| <b><code>get:soft</code></b> | <b><code>"soft"</code></b> %BR% <b><code>"replace"</code></b> %BR% <b><code>"add"</code></b> |"<code>soft</code>" and "<code>replace</code>" are synonyms, and replace triples stored in named graphs. %BR% "<code>add</code>", on the other hand, simply adds triples to existing named graphs. %BR% All are subject to the overarching cache invalidation scheme applied to a given DBMS instance. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx13][Example]] |
| <b><code>get:accept</code></b> | <b><code>"application/xml"</code></b> %BR% <b><code>"application/rdf+xml"</code></b> %BR% <b><code>"application/rdf+turtle"</code></b> %BR% <b><code>"application/x-turtle"</code></b> %BR% <b><code>"application/turtle"</code></b> %BR% <b><code>"text/rdf+n3"</code></b> %BR% <b><code>"text/turtle"</code></b> |The most common purpose of <code>define get:accept</code> is accessing a web service that returns HTML by default but can also return RDF if forced to do so. The default value is "application/rdf+xml; q=1.0, text/rdf+n3; q=0.9, application/rdf+turtle; q=0.5, application/x-turtle; q=0.6, application/turtle; q=0.5 text/turtle; q=1.0, application/xml; q=0.2, */*; q=0.1" | [[VirtSpongerLinkedDataHooksIntoSPARQLEx47][Example]] |
| <b><code>get:uri</code></b> | <b><code>"<IRI>"</code></b> |Determines the object identifiers associated with content retrieval if the data source in question differs from data object content URL used in the <code>FROM</code> clause of a SPARQL query. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx14][Example]] |
| <b><code>get:refresh</code></b> | <b><code>"<seconds>"</code></b> |Limits the lifetime of the locally cached copy of the source. The value is in seconds. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx15][Example]] |
| <b><code>get:method</code></b> | <b><code>"GET"</code></b> or <b><code>"MGET"</code></b> | <code>"GET"</code> loads the resource itself; <code>"MGET"</code> loads metadata about the resource. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx16][Example]] |
| <b><code>get:cartridge</code></b> | | | [[VirtSpongerLinkedDataHooksIntoSPARQLEx29][Example]] |
| <b><code>get:query</code></b> | | | [[VirtSpongerLinkedDataHooksIntoSPARQLEx34][Example]] |
| <b><code>get:private</code></b> | <b><code>""</code></b> or <b><code><nowiki><graph_group_IRI></nowiki></code></b> |When used for sponging graph X, it adjusts the graph-level security of graph X (and of <nowiki>graph_group_IRI</nowiki>, if specified) so that X becomes a privately accessible graph of the user who sponges X. If <nowiki>graph_group_IRI</nowiki> is specified, X also becomes accessible to users that can access <nowiki>graph_group_IRI</nowiki>, with permissions like those they have on <nowiki>graph_group_IRI</nowiki>.<br/>The exact rules are as follows:<br/> * If the graph is virtrdf: then an error is signaled.<br/> * If the graph name is an IRI of a handshaked web service endpoint or the "public IRI" of a handshaked web service endpoint then an error is signaled.<br/> * If access is public by default even for private graphs then an error is signaled and sponging is not attempted.<br/> * If the default is "no access" but someone (other than the current user) has specifically been granted read access to the graph in question AND the current user is not dba AND the current user has no bit 32 permission on this graph then an error is signaled.<br/> * If read access is public by default for world and disabled for private graphs then the graph to be sponged is added to the group of private graphs.<br/> * If the current user is not DBA, the current user is granted read+write+sponge+admin access to the graph to be sponged. In addition, the current user gets special permission bit 32, indicating that the graph was made by a private sponge of this specific user.<br/> * If the value of get:private is an IRI then:<br/> * the IRI is supposed to be an IRI of a "plain" graph group; an error is signaled in case of a non-existing graph group, the group of private graphs, or a group of graphs to be replicated.<br/> * the graph is added to that group.<br/> * each non-dba user that can get the list of files of the group will get permissions for the loaded graph equal to the permissions they have on the graph group, minus the "list" permission.<br/>| 1. [[VirtSpongerLinkedDataHooksIntoSPARQLEx45][Example for entirely confidential database]]<br/>2. [[VirtSpongerLinkedDataHooksIntoSPARQLEx46][Example using private graphs]] |
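As a rough illustration of the option-list form mentioned above, GET pragmas can be attached to an individual source graph. This sketch assumes the list is written in an <code>OPTION (...)</code> clause directly after the graph IRI, as in Virtuoso documentation examples. The <code>example.org</code> IRI is a placeholder.
<verbatim>
# Per-graph GET pragmas: the options apply only to this FROM NAMED source
SELECT ?id
FROM NAMED <http://example.org/data/sioc.ttl>
  OPTION (get:soft "soft", get:method "GET")
WHERE { GRAPH ?g { ?id a ?o } }
LIMIT 10
</verbatim>
The same modifiers could instead be given as query-wide <code>define get:soft "soft"</code> and <code>define get:method "GET"</code> lines; the per-graph form is useful when different sources need different retrieval settings.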
---++++ SQL Pragmas
Pragmas to control code generation:
| *Method* | *Modifier(s)* | *Description* | *Usage Example* |
| <b><code>sql:signal-void-variables</code></b> | | When set to 1, forces the SPARQL compiler to signal an error if it can prove that some variable can never be bound, for example because of misspelled names or attempts to make joins across disjoint domains. Usually this indicates an error in the query, such as a typo in an IRI or a totally wrong triple pattern. These diagnostics are especially important when the query is long, and this is the most useful debugging pragma when Linked Data Views are in use. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx30][Example]] |
| <b><code>sql:big-data-const</code></b> | | | [[VirtSpongerLinkedDataHooksIntoSPARQLEx31][Example]] |
| <b><code>sql:describe-mode</code></b> | | See detailed description [[http://docs.openlinksw.com/virtuoso/rdfsparql.html#rdfsqlfromsparqldescribe][here]]. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx32][Example]] |
| <b><code>sql:log-enable</code></b> | | Value that is passed to SPARUL procedures, which in turn pass it to the [[http://docs.openlinksw.com/virtuoso/fn_log_enable.html][log_enable()]] BIF. Thus <code>define sql:log-enable N</code> results in <code>log_enable(N, 1)</code> at the beginning of the operation, and another [[http://docs.openlinksw.com/virtuoso/fn_log_enable.html][log_enable()]] call restores the previous transaction log mode on exit from the procedure or on any error signaled from it. For example, set it to 2 to disable logging and so avoid a huge transaction after-image when sponging is deep and wide. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx33][Example]] |
| <b><code>sql:globals-mode</code></b> | | Tells how to print names of global variables. Supported values are "XSLT" (print a colon before the name of the global variable) and "SQL" (print as usual). | [[VirtSpongerLinkedDataHooksIntoSPARQLEx35][Example]] |
| <b><code>sql:table-option</code></b> | | The value is added as an option to each triple in the query and is later printed in the TABLE OPTION (...) clause of the source table clause. This works only for SQL code for plain triples from RDF_QUAD; fragments of queries related to RDF Views remain unchanged. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx36][Example]] |
| <b><code>sql:select-option</code></b> | | The value is added to the global OPTION () clause of the generated SQL SELECT. This clause is always printed; it is always at least OPTION (QUIETCAST, ...). The most popular use case is <code>define sql:select-option "ORDER"</code>, which tells the SQL compiler to execute joins in the order of their use in the query. This can make query compilation much faster, but the compilation result can be terrible if you do not know precisely what you are doing and have not inspected the execution plan of the generated SQL query. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx37][Example]] |
| <b><code>sql:assert-user</code></b> | | Defines the user who is supposed to be the single "proper" user for the query. If the compiler is launched by another user, an error is signaled. The typical use is <code>define sql:assert-user "dba"</code>. This is too weak to be a security measure, but may help in debugging security issues. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx38][Example]] |
| <b><code>sql:gs-app-callback</code></b> | | application-specific callback that returns permission bits of a given graph | [[VirtSpongerLinkedDataHooksIntoSPARQLEx39][Example]] |
| <b><code>sql:gs-app-uid</code></b> | | application-specific user id to use in callback. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx40][Example]] |
| <b><code>sql:param</code></b> | <b><code>"<variable-name>"</code></b> | Synonym of <code>input:param</code> | [[VirtSpongerLinkedDataHooksIntoSPARQLEx23][Example]] |
| <b><code>sql:params</code></b> | <b><code>"<variable-name>"</code></b> | Synonym of <code>input:param</code> | [[VirtSpongerLinkedDataHooksIntoSPARQLEx23][Example]] |
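The SQL pragmas are written in the same <code>define</code> position as the others. The sketch below shows two separate statements under stated assumptions: the graph IRIs are placeholders, and the <code>INSERT INTO GRAPH</code> form is Virtuoso's SPARUL syntax rather than anything defined in this topic.
<verbatim>
# Statement 1: disable transaction logging (value 2) for a bulk SPARUL copy
define sql:log-enable 2
INSERT INTO GRAPH <http://example.org/graphs/copy>
  { ?s ?p ?o }
WHERE { GRAPH <http://example.org/graphs/source> { ?s ?p ?o } }

# Statement 2: ask the SQL compiler to join in the order written in the query
define sql:select-option "ORDER"
SELECT ?s ?name
WHERE { ?s <http://xmlns.com/foaf/0.1/name> ?name }
LIMIT 10
</verbatim>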
---++++ OUTPUT Pragmas
Pragmas to control the type of the result.
| *Method* | *Modifier(s)* | *Description* | *Usage Example* |
| <b><code>output:valmode</code></b> | | Tells the compiler which SQL datatypes should be used for output values. ODBC clients and the like know nothing about RDF and expect plain SQL values, so the appropriate value for them is "SQLVAL", which is the default. When a Virtuoso/PL procedure is RDF-aware and keeps results for passing on to other SPARQL queries or to low-level RDF routines, the value "LONG" tells the compiler to preserve RDF boxes as is and to return IRI IDs instead of IRI string values. A third possible value, "AUTO", is for dirty hackers who do not want any conversion of any sort at the output: they read the SQL output of the SPARQL front-end, find the format of each column, and add the needed conversions later. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx41][Example]] |
| <b><code>output:format</code></b> | | Tells the compiler that the query should produce a string output with the serialization of the result, not a result set. There are three such format pragmas (<code>output:format</code>, <code>output:scalar-format</code>, and <code>output:dict-format</code>) because the caller, such as a SPARQL web service endpoint, may not know the actual type of the query that will be executed. The value of <code>output:format</code>, if specified, is used for SELECT and data manipulation queries. It is also used for CONSTRUCT, DESCRIBE, or ASK queries when the related <code>output:dict-format</code> or <code>output:scalar-format</code> is not specified. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx42][Example]] |
| <b><code>output:scalar-format</code></b> | | Tells the compiler that the query should produce a string output with the serialization of the result, not a result set. It is one of three such format pragmas, which exist because the caller, such as a SPARQL web service endpoint, may not know the actual type of the query that will be executed. The value of <code>output:scalar-format</code>, if specified, is used for ASK queries only. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx43][Example]] |
| <b><code>output:dict-format</code></b> | | Tells the compiler that the query should produce a string output with the serialization of the result, not a result set. It is one of three such format pragmas, which exist because the caller, such as a SPARQL web service endpoint, may not know the actual type of the query that will be executed. The value of <code>output:dict-format</code>, if specified, is used for CONSTRUCT and DESCRIBE queries only. | [[VirtSpongerLinkedDataHooksIntoSPARQLEx44][Example]] |
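As a small sketch of an OUTPUT pragma in use, the query below requests "LONG" values, the mode described above for RDF-aware Virtuoso/PL callers. The predicate is only an illustrative choice; omitting the pragma (or using "SQLVAL") would return plain SQL values instead.
<verbatim>
# Preserve RDF boxes and return IRI IDs rather than IRI strings
define output:valmode "LONG"
SELECT ?s ?label
WHERE { ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label }
LIMIT 10
</verbatim>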
---+++ Sponger Usage Examples
* [[http://docs.openlinksw.com/virtuoso/virtuososponger.html#virtuosospongerusageprocessorex][SPARQL Processor Usage Example]]
* [[http://docs.openlinksw.com/virtuoso/virtuososponger.html#virtuosospongerusageproxyex2][RDF Proxy Service Example]]
* [[http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDeployingLinkedDataGuide_BrowsingNorthwindRdfView#AncMozToc2][Browsing & Exploring RDF View Example Using ODE]]
* [[http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VirtDeployingLinkedDataGuide_BrowsingNorthwindRdfView#AncMozToc3][Browsing & Exploring RDF View Example Using iSPARQL]]
* [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodplapissimpleexample][Basic Sponger Cartridge Example]]
* [[http://docs.openlinksw.com/virtuoso/virtuososponger.html#virtuosospongerusagebriefex][HTTP Example for Extracting Metadata using CURL]]
* [[http://docs.openlinksw.com/virtuoso/virtuososponger.html#virtuosospongercartridgetypesmetarestexamples][RESTFul Interaction Examples]]
* [[http://docs.openlinksw.com/virtuoso/sect5_virtuosospongercreatecustcartrrgstflickr.html][Flickr Cartridge Example]]
* [[http://docs.openlinksw.com/virtuoso/virtuososponger.html#virtuosospongercreatecustcartrexmp][MusicBrainz Metadatabase Example]]
* [[VirtTipsAndTricksGuideAddTriplesNamedGraph][SPARQL Tutorial -- Magic of SPARUL and Sponger]]
---++ Related
* [[VirtSponger][Virtuoso Sponger]]
* [[http://virtuoso.openlinksw.com/Whitepapers/html/VirtSpongerWhitePaper.html][Technical White Paper]]
* [[VirtSpongerCartridgeSupportedDataSources][Supported Virtuoso Sponger Cartridges]]
* [[SPARQLSponger][SPARQL Sponger]]
* [[VirtInteractSpongerMiddlewareRESTPatterns][Interacting with Sponger Middleware via RESTful Patterns]]
* [[VirtSpongerCartridgeSupportedDataSourcesMetaRESTExamples][Interacting with Sponger Meta Cartridge via RESTful Patterns]]
* [[VirtSpongerCartridgeRDFExtractor][Sponger Cartridge RDF Extractor]]
* [[RDFMappers][ Extending SPARQL IRI Dereferencing with RDF Mappers]]
* [[VirtSpongerCartridgeProgrammersGuide][Programmer Guide for Virtuoso Linked Data Middleware ("Sponger")]]
* [[VirtProgrammerGuideRDFCartridge][Create RDF Custom Cartridge Tutorial]]
* [[VirtSpongerCartridgeSupportedDataSources][OpenLink-supplied Virtuoso Sponger Cartridges]]
* [[VirtAuthServerUI][Virtuoso Authentication Server]]
* [[VirtOAuthSPARQL][Virtuoso SPARQL OAuth Tutorial]]
* [[VirtSpongerACL][Virtuoso Sponger Access Control List (ACL) Setup]]
* [[VirtSPARQLSecurityWebID][WebID Protocol & SPARQL Endpoint ACLs Tutorial]]
* [[http://docs.openlinksw.com/virtuoso/virtuososponger.html][Virtuoso Documentation]]
* [[VirtTipsAndTricksGuide][Virtuoso Tips and Tricks Collection]]
|