Not logged in : Login

About: VOSSQLRDF     Goto   Sponge   NotDistinct   Permalink

An Entity of Type : atom:Entry, within Data Space : ods.openlinksw.com associated with source document(s)

AttributesValues
type
Date Created
Date Modified
label
  • VOSSQLRDF
maker
Title
  • VOSSQLRDF
isDescribedUsing
has creator
content
  • %VOSWARNING% %META:TOPICINFO{author="KingsleyIdehen" date="1161195828" format="1.1" version="1.1"}% %META:TOPICPARENT{name="VOSRDF"}% %VOSNAV% ---+ Mapping Relational Data to RDF in Virtuoso Orri Erling (Program Manager, OpenLink Virtuoso), Ivan Mikhailov (Lead Developer, OpenLink Virtuoso). ---++ Abstract This paper discusses Virtuoso's new declarative Meta Schema Language for mapping SQL Data to RDF Ontologies. This functionality, commonly referred to as "Linked Data Views", enables the exposure of pre-existing ODBC or JDBC accessible relational data as Virtual RDF Graphs thereby exposing said data to SPARQL query access directly or via Virtuoso's in-built support for SPARQL within SQL (SPASQL). The entire process results in RDF Data sets and optimized data access without physical regeneration of RDF Data Sets from SQL Data. Considering that almost all of the present generation of Web 2.0 and Enterprise applications run atop relational databases, the need for efficiently exposing existing data to the burgeoning Semantic Web is obvious. For instance, it now possible to dynamically generate RDF Data Sets for popular ontologies such as SIOC, SKOS, FOAF, and ATOM/OWL without disruption to the existing database infrastructure that drives existing Web 1.0 or Web 2.0 solutions. The same applies to Enterprises seeking to exploit many of the collaborative prowess of the Web 2.0 application profiles. As part of this paper we provide a simple example of how the Weblog Data Space within the OpenLink Data Spaces collective is mapped to the AtomOWL Ontology en route to making the Weblog Data Space data accessible to SPARQL. ---++ Introduction The Virtuoso RDF meta schema is a built-in feature of Virtuoso's SPARQL to SQL translator. It recognizes triple patterns that refer to graphs for which an alternate representation is declared and translates these into SQL accordingly. The main purpose of this is mapping SPARQL queries onto existing relational databases. There exists previous work from many parties for rendering relational data as RDF and opening it to SPARQL access. We can mention D2RQ, SPASQL, Squirrel RDF and others. The Virtuoso effort differs from these mainly in the following: * Integration with a triple store. Virtuoso can process a query for which some triple patterns will go to local or remote relational data and some to local physical RDF triples. * Integration with SQL. Since SPARQL and SQL share the same run time and query optimizer, the query compilation decisions are always made with the best knowledge of the data and its location. This is especially important when mixing triples and relational data or when dealing with relational data distributed across many outside databases. * No limits on SPARQL. It remains possible to make queries with unspecified graph or predicate against mapped relational data, even though these may sometimes be inefficient. * Coverage of the whole relational model. Multi-part keys, etc., are supported in all places. ---++ Rationale Since most of the data that is of likely use for the emerging semantic web is stored in relational databases, the argument for exposing this to SPARQL access is clear. We note that historically, SQL access to relational data has essentially never been given to the public outside of the organization. If programmatic access to corporate IS has been available to the public, it has been through dynamic web pages or more recently web services. There are reasons of performance, security, maintainability and so forth for this. The culture of the emerging semantic web is however taking a different turn. Since RDF and OWL offer a merge-able and query-able model for heterogeneous data, it is more meaningful and maintainable to expose selected data for outside query than it would be with SQL. Advances in hardware make this also less of a performance issue than it would have been in the client-server database era. In the context of Virtuoso, since Virtuoso is originally a virtual/federated database, incorporating SPARQL to relational mapping is an evident extension of the product's mission as a multi-protocol, multi-platform connector between information systems. ---++ Graph Patterns, Value and IRI Classes In the simplest sense, any relational schema can be rendered into RDF by converting all primary keys and foreign keys into IRI's, assigning a predicate IRI to each column, and an rdf:type predicate for each row linking it to a RDF class IRI corresponding to the table. Then a triple with the primary key IRI as subject, the column IRI as predicate and the column's value as object is considered to exist for each column that is neither part of a primary or foreign key. Strictly equating a subject value to a row and each column to a predicate is often good but is too restrictive for the general case. For example, multiple triples with the same object and predicate can exist. Or a single subject can get single-valued properties from multiple tables or in some cases stored procedures. #BasicLanguageElements The basic elements of the Virtuoso RDF meta schema are: #IriClass * IRI Class - This declares that a key column or set of key columns gets converted into a URI in a certain way. When declaring that a table's primary key is converted into a URI according to one IRI class, one usually declares that all foreign keys referring to this class also get converted into an IRI as per this same class. #ValueClass * Value Class - This is a mechanism for unit conversions, for converting enumerations into IRI's or any other scalar to scalar conversion. The difference between a value class and an IRI class is that a value resulting from converting a relational column according to a value class is not necessarily an IRI joinable to the subject or predicate of a triple. #QuadPattern * Quad pattern - This declares that a quad with any combination of specified or unspecified G, S, P or O is stored in a non-default way. The patterns are grouped together in Linked Data Views. A quad pattern consists of the following: * Subject - This can be an IRI class followed by one of more columns of a table. The IRI class specifies how the column values are combined into an IRI string and how an IRI string is decomposed back into the key value. Alternately, this can be a constant IRI. * Predicate - This is most often a constant IRI but can also be an IRI class as with the subject. * Object - This can be an IRI class applied to columns, a value class applied to a column or a literal IRI or scalar. * Condition - This is an optional SQL search condition. If the search condition is true when applied to a row of the table designated in the subject, predicate and object patterns, then the triple is considered to exist, otherwise not. * Optional RDF type of the subject. If the subject is known to be of a type disjoint from this type, then the quad pattern will not apply. We note that when S, P, or O refers to columns of a table, all of these should be to the same table. For complex joins, the table in question can also be a SQL view joining multiple tables. Thus the search condition of the quad pattern does not have to be used as a join condition between different tables. * Graph - This is most often a constant IRI but can also be an IRI class as with the subject. It is very common for many quad patterns of one SQL schema to share one constant graph, but one meta schema may consist of quad patterns with any number of different graph declarations. * Quad Storage -- This is a named set of quad patterns. This is used for compartmentalizing the RDF to SQL mapping. This allows specifying that a SPARQL query will be executed using only quad patterns of the given quad storage. Declarations of IRI classes, value classes and quad patterns are shared between all quad storages of an RDF meta schema but every quad storage contains only a subset of all available quad patterns. Two quad storages are always defined: a 'virtrdf:default' one usually consists of everything (all relational mappings plus a table of physical quads) and an 'virtrdf:empty' storage that refers to single table of physical quads). A special case of IRI or value class is the identity class, which is simply marked by table.column. For this to occur as S or P, the column must be of the IRI ID type. For an O, the column may be of any type. #SiocMappingExample1 ---+++ Example - Mapping Virtuoso Database Users and Roles to Accounts in the SIOC Ontology Here we map the Virtuoso sys_users and sys_role_grant tables to the SIOC ontology. The tables are in part declared as follows, some columns are omitted for brevity: create table sys_users (u_id int primary key, u_name varchar, u_is_role int, u_e_mail, u_home varchar, u_dav_enable int...) create table sys_role_grants (gi_super int, gi_sub int, gi_direct, primary key (gi_super, gi_sub, gi_direct)); The column gi_super is the role and gi_sub is the principal to which the role is granted. Both refer to u_id in sys_users. A row of sys_users is either a role or an account, as indicated by u_is_role. All role memberships, including indirect ones are listed in sys_role_grants, the gi_direct flag shows whether this is a directly granted role. We can map this into the sioc:User and sioc:Usergroup RDF classes. These are then related by the mutually inverse predicates sioc:member_of and sioc:has_member. Additionally, a group can have other groups as members, although not circularly. To define this mapping, we need the following group of declarations that form one big statement: sparql pragma prefix oplsioc: pragma prefix sioc: -- We put all the SIOC names in the original SIOC namespace. The names that pertain to the mapping go into the oplsioc namespace. drop graph iri("^{URIQADefaultHost}^/dataspace") . -- We have the notation ^{URIQADefaultGraph}^ to refer to the default host declared for the server on which the mapping is being loaded. This allows IRI's to be dereferenceable on the host itself. For example, the IRI's declared for the various keys can go to dynamic web pages giving information on the entities being described. In addition, the URIQA protocol server uses this host name to convert relative URIs of local web pages into absolute URIs that can be used by browsers. create iri class oplsioc:user_iri "http://^{URIQADefaultHost}^/sys/user?id=%d" (in uid integer not null) . create iri class oplsioc:group_iri "http://^{URIQADefaultHost}^/sys/group?id=%d" (in gid integer not null) . create iri class oplsioc:membership_iri "http://^{URIQADefaultHost}^/sys/membersip?super=%d&sub=%d" (in super integer not null, in sub integer not null) . create iri class oplsioc:dav_iri "http://^{URIQADefaultHost}^%s" (in path varchar) . create iri class oplsioc:grantee_iri using function DB.DBA.RDF_DF_GRANTEE_ID_URI (in id integer) returns varchar , function DB.DBA.RDF_DF_GRANTEE_ID_URI_INVERSE (in id_iri varchar) returns integer . make oplsioc:user_iri subclass of oplsioc:grantee_iri . make oplsioc:group_iri subclass of oplsioc:grantee_iri . -- Here we declare the entities that we will see in RDF. These are the user and usergroup and their abstract superclass called grantee. Note that the GI_SUB column of sys_role_grants can refer to either a user or a group and we have decided to give them different IRI's. This is not absolutely necessary since we can tell from the id whether something is a user or group but this may be good for readability and serves to illustrate features of the mapping. We can map the group membership simply as a many to many without attributes. For some queries, though, we will need to see this relationship as an entity with its own attributes, for example whether the grant is direct and whether it is re-grantable. (the sql with admin option). For the case where we are interested in the attributes of the relationship, we define the iri class for membership, although this is not part of SIOC, hence the oplsioc namespace. alter quad storage virtrdf:DefaultQuadStorage { create virtrdf:SysUsers as graph iri ("http://^{URIQADefaultHost}^/dataspace") option (exclusive) { oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID) a sioc:user where (^{alias}^.U_IS_ROLE = 0) ; sioc:email U_E_MAIL ; sioc:login U_NAME ; oplsioc:home oplsioc:dav_iri (U_HOME) where (^{alias}^.U_DAV_ENABLE = 1) ; oplsioc:name U_FULL_NAME where (^{alias}^.U_IS_ROLE = 0) . oplsioc:group_iri (DB.DBA.SYS_USERS.U_ID) a sioc:role where (^{alias}^.U_IS_ROLE = 1) . oplsioc:group_iri (DB.DBA.SYS_ROLE_GRANTS.GI_SUPER) sioc:has_member oplsioc:grantee_iri (GI_SUB) ; oplsioc:group_of_membership oplsioc:membership_iri (GI_SUPER, GI_SUB) . oplsioc:grantee_iri (DB.DBA.SYS_ROLE_GRANTS.GI_SUB) sioc:has_function oplsioc:group_iri (GI_SUPER) ; oplsioc:member_of oplsioc:membership_iri (GI_SUPER, GI_SUB) . oplsioc:membership_iri (DB.DBA.SYS_ROLE_GRANTS.GI_SUPER, GI_SUB) oplsioc:is_direct GI_DIRECT ; rdf:type oplsioc:grant . } } ; -- Now we declare the actual mapping. In the graph /dataspace of the default host of the server, the above rules will apply. The first rule declare that ?s rdf:type sioc:User is true of any IRI matching the user_iri (sys_users.u_id) pattern. This means that: * the constant part of the IRI is as defined for the user_iri IRI class. * there is a row of DB.DBA.SYS_USERS with U_ID equal to the number extracted form the IRI as per the format string in the create iri class statement, i.e. ^{URIQADefaultHost}^/user?id=%d. * The rule is applicable U_IS_ROLE column of the row should be equal to zero. After this, we map some scalar properties, for example email. In defining repeating predicates for the same subject, the subject does not have to be repeated if the semicolon is used as delimiter, just as with the SPARQL syntax for triple patterns. Since U_E_MAIL is just a string, no special mapping is needed. If we wanted it as a mailto:// IRI, we could define an IRI class that would convert the string between the SPARQL and SQL values. The rule with group_iri (SYS_USERS.U_ID) subject is almost identical to the first rule with user_iri (SYS_USERS.U_ID) subject. Then we declare the many-to-many between roles and users. The predicates are the SIOC has_function and has_member. They are identical, except that the subject and object are reversed. The group is always a group, the member is either a group or user, hence the use of the grantee_iri IRI class. The mapping functions of this IRI class will have to consult the SYS_USERS table to decide which kind of IRI to make: create function DB.DBA.RDF_DF_GRANTEE_ID_URI (in id integer) { declare is_role integer; is_role := coalesce ((select top 1 U_IS_ROLE from DB.DBA.SYS_USERS where U_ID = id)); if (is_role is null) return NULL; else if (is_role) return sprintf ('http://%s:/sys/group?id=%d', get_uriqa_default_host(), id); else return sprintf ('http://%s:/sys/user?id=%d', get_uriqa_default_host(), id); } ; create function DB.DBA.RDF_DF_GRANTEE_ID_URI_INVERSE (in id_iri varchar) { declare parts any; parts := sprintf_inverse (id_iri, concat ('http://', get_uriqa_default_host(), '/sys/user?id=%d', 0); if (parts is not null) { -- The syntax of given IRI matches to the format pattern for IRIs of user accounts if (not exists (select top 1 1 from DB.DBA.SYS_USERS where U_ID = parts[0] and not U_IS_ROLE)) return NULL; -- The ID specified is not valid user ID return parts[0]; -- Both syntax and the ID value are OK } parts := sprintf_inverse (id_iri, concat ('http://', get_uriqa_default_host(), '/sys/group?id=%d', 0); if (parts is not null) { if (not exists (select top 1 1 from DB.DBA.SYS_USERS where U_ID = parts[0] and U_IS_ROLE)) return NULL; return parts[0]; } return NULL; } ; The clause options (exclusive) instructs the SPARQL compiler that all quads of the graph 'http://^{URIQADefaultHost}^/dataspace' are described by this set of declared quad patterns. Even if the table of physical quads contains quads with graph 'http://^{URIQADefaultHost}^/dataspace' they may be ignored by the SPARQL query. Any group of quad patterns or a single quad pattern can be declared exclusive. Exclusive clauses may conflict. Consider an exclusive group of patterns for .../dataspace graph and an single exclusive quad pattern for sioc:email predicate (say, one might have a table DB.DBA.E_MAILS with G, S and O columns of triples and a sioc:email predicate is assumed for all quads stored there). Consider a query sparql select * where { graph { ?s sioc:email ?o } } that contains a triple pattern that matches both .../dataspace and sioc:email mapping rules. Each rule prevents the compiler from using another rule. To resolve ambiguities, a quad storage may specify the order of quad patterns so that the compiler will use the first suitable quad pattern and ignore the rest. ---++ Sample Queries ---+++ Two Triple Patterns As A Single Table Lookup Select all accounts that have both full names and home DAV collections. sparql prefix sioc: prefix oplsioc: select ?fullname ?home where { graph { ?uid sioc:name ?fullname . ?uid oplsioc:home ?home } } fullname home VARCHAR VARCHAR _____________________________________________________ Ivan Mikhailov http://localhost/DAV/home/iv_an_ru/ Orri Erling http://localhost/DAV/home/oerling/ The generated SQL code contains only one table in FROM clause and its WHERE clause contains texts of all WHERE clauses of relevant quad patterns; duplicate conditions are removed. SELECT "t0"."U_FULL_NAME" AS "fullname", case (isnull ("t0"."U_HOME")) when 0 then sprintf ('http://localhost%s', "t0"."U_HOME") else NULL end AS "home" FROM DB.DBA.SYS_USERS AS "t0" WHERE ("t0"."U_FULL_NAME" is not null) AND ("t0".U_HOME is not null) and ("t0".U_DAV_ENABLE = 1) AND ("t0".U_IS_ROLE = 0) OPTION (QUIETCAST) ---+++ Simple Join of Three Tables Select all pairs of grantees and their roles, return names instead of integer IDs: sparql prefix sioc: prefix oplsioc: select ?super ?sub where { graph { ?superid sioc:has_member ?subid . ?superid sioc:id ?super . ?subid sioc:id ?sub . } } super sub VARCHAR NOT NULL VARCHAR NOT NULL ____________________________________ SPARQL_UPDATE SPARQL_SELECT The query is translated into the following SQL: SELECT "t1"."U_NAME" AS "super", "t2"."U_NAME" AS "sub" FROM DB.DBA.SYS_ROLE_GRANTS AS "t0", DB.DBA.SYS_USERS AS "t1", DB.DBA.SYS_USERS AS "t2" WHERE "t0"."GI_SUPER" = "t1"."U_ID" /* superid */ AND "t0"."GI_SUB" = "t2"."U_ID" /* subid */ OPTION (QUIETCAST) ---+++ Optional Clause and Filtering of Table Rows prefix sioc: prefix oplsioc: select ?uid ?login ?fullname ?home where { graph { ?uid sioc:id ?login . ?uid sioc:name ?fullname . optional { ?uid oplsioc:home ?home } } } uid login fullname home VARCHAR VARCHAR NOT NULL VARCHAR VARCHAR _______________________________________________________________________________________________________________________ http://localhost/sys/user?id=2 dav WebDAV System Administrator NULL http://localhost/sys/user?id=106 iv_an_ru Ivan Mikhailov http://localhost/DAV/home/iv_an_ru/ http://localhost/sys/user?id=5 nobody Special account NULL http://localhost/sys/user?id=107 oerling Orri Erling http://localhost/DAV/home/oerling/ The query is translated into the following SQL: SELECT sprintf ( 'http://localhost/sys/user?id=%d' , "t0"."uid" ) AS "uid", "t0"."login", "t0"."fullname", case (isnull ( "t1"."home" )) when 0 then sprintf ( 'http://localhost%s' , "t1"."home" ) else NULL end AS "home" FROM ( SELECT "t00"."U_ID" AS "uid", "t00"."U_NAME" AS "login", "t01"."U_FULL_NAME" AS "fullname" FROM DB.DBA.SYS_USERS AS "t00", (SELECT "t01-int"."U_ID", "t01-int"."U_FULL_NAME" FROM DB.DBA.SYS_USERS AS "t01-int" WHERE ("t01-int"."U_FULL_NAME" is not null) AND ("t01-int".U_IS_ROLE = 0) ) AS "t01" WHERE "t00"."U_ID" = "t01"."U_ID" ) AS "t0" LEFT OUTER JOIN ( SELECT "t10"."U_ID" AS "uid", "t10"."U_HOME" AS "home" FROM ( SELECT "t10-int"."U_ID", "t10-int"."U_HOME" FROM DB.DBA.SYS_USERS AS "t10-int" WHERE ("t10-int"."U_HOME" is not null) AND ("t10-int".U_DAV_ENABLE = 1) ) AS "t10") AS "t1" ON ( "t0"."uid" = "t1"."uid" ) OPTION (QUIETCAST) ---+++ Mix Of Relational Data and Plain RDF Quads sparql prefix sioc: prefix oplsioc: select ?home ?id ?login where { graph ?g { ?id oplsioc:home ?home } . optional { graph { ?id sioc:id ?login } } } home id login VARCHAR VARCHAR VARCHAR NOT NULL _______________________________________________________________________________ http://localhost/DAV/home/iv_an_ru/ http://localhost/sys/user?id=106 iv_an_ru http://localhost/DAV/home/oerling/ http://localhost/sys/user?id=107 oerling The context of the first triple pattern does not specify graph, so triples with oplsioc:home predicate may be retrieved from both the default storage of RDF quads (table RDF_QUADS) and mapping of relational data (table SYS_USERS). The graph of the second triple pattern matches 'exclusive' group of quad patterns so the compiled query will not try to locate sioc:id predicates in RDF_QUADS table and will make no second union. Thus the intermediate result of the compilation is like this: (sys_users as t00 union all rdf_quad as t10) left outer join sys_users as t1) The optimizer will convert a join of unions into a union of joins so the final query is like (sys_users as t00 left outer join sys_users as t01) union all (rdf_quad as t10 left outer join sys_users as t11) and can be executed efficiently: SELECT DB.DBA.RDF_SQLVAL_OF_LONG ("t00"."home") AS "home", sprintf ('http://localhost/sys/user?id=%d', "t00"."id") AS "id", "t01"."login" FROM ( SELECT "t00a"."U_ID" AS "id", case (isnull ("t00a"."U_HOME")) when 0 then DB.DBA.RDF_MAKE_IID_OF_QNAME (sprintf ('http://localhost%s', "t00a"."U_HOME")) else NULL end AS "home" FROM DB.DBA.SYS_USERS AS "t00a" WHERE ("t00a"."U_HOME" is not null) AND ("t00a".U_DAV_ENABLE = 1) ) AS "t00" LEFT OUTER JOIN ( SELECT "t01-t1"."U_ID" AS "id", "t01-t1"."U_NAME" AS "login" FROM DB.DBA.SYS_USERS AS "t01-t1") AS "t01" ON ( "t00"."id" = "t01"."id") UNION ALL SELECT DB.DBA.RDF_SQLVAL_OF_LONG ("t10"."home"), DB.DBA.RDF_SQLVAL_OF_LONG ("t10"."id"), "t11"."login" FROM ( SELECT "t10a"."S" AS "id", DB.DBA.RDF_LONG_OF_OBJ ("t10a"."O") AS "home" FROM DB.DBA.RDF_QUAD AS "t10a" WHERE "t10a"."P" = DB.DBA.RDF_MAKE_IID_OF_QNAME_SAFE (UNAME'http://www.openlinksw.com/schemas/oplsioc#home')) AS "t10" LEFT OUTER JOIN ( SELECT "t11a"."U_ID" AS "id", "t11a"."U_NAME" AS "login" FROM DB.DBA.SYS_USERS AS "t11a") AS "t11" ON ( "t10"."id" = DB.DBA.RDF_MAKE_IID_OF_QNAME (DB.DBA.RDF_DF_GRANTEE_ID_URI ("t11"."id"))) OPTION (QUIETCAST) ---++ Simplifying Queries In principle, we take each triple pattern from a SPARQL query and expand it into a union of all the quad mapping rule expansions that apply to it. If we have an unspecified graph and unspecified ?s ?p ?o, we indeed get the union of everything. In practice, the graph and the predicates are most often given. This will considerably limit the number of applicable quad patterns. We further assume all IRI classes to be disjoint, unless exceptions are declared with the create ... subclass ... statement shown in the example with user_iri and grantee_iri. Thus many joins can be known to be empty at compile time. In the simplest case, when we have two triple patterns with the same subject and the subject is mapped to a relational table, we get a join of the table with itself, with pk = pk. This self-join can however be optimized away. After this, there will be condition on multiple columns and normal SQL index and join type selection will apply. As a last step, after pruning the union terms that are known not to be relevant, the unions are taken to the top, so that we have a union of joins and not a join of unions. This is easier for the SQL compiler to optimize. We note that throughout this process, some expansions of triple patterns can refer to relational tables and some to the default RDF_QUAD table or other native RDF storage tables. This is not a problem at any stage. Thus a mixed representation where some things are obtained from local or remote relational stores and some things are stored locally as physical triple is entirely feasible. ---++ SQL Optimizer Support for Meta Schema An equality of two IRI's can become an AND of key column equalities. When this takes place between SPARQL triple patterns mapped to the relational tables, the columns that make up the URI are simply compared pair-wise, as in any regular relational join. However, it is also possible that a URI composed from relational columns is compared with an IRI_ID stored in a physical triple. In this case, we have a comparison between and S, P or O of a triple and a function over one or more columns of a relational table. If the relational table is first in join order, there is no problem. The function is evaluated and the result is used to retrieve the triple. However, if the triple is first in join order, we must be able to decompose the IRI into its constituent parts, that is, if the IRI is such that it can be returned by the key to IRI function corresponding to the columns. The Virtuoso SQL compiler has a notion of inverse functions. This allows both join orders to be tried, in one case converting the key part(s) into an IRI ID and in the other case the IRI_ID into the key part(s). Also, when a relational key is mapped into an IRI, making an IRI_ID for the text of the IRI is actually necessary only when there actually is a triple in the triple store that references this IRI. Thus, if joining a relational key to triples, the key to IRI_ID function can return NULL if no IRI_ID exists. The IRI_ID will be made only when an actual triple is created. When clients consume IRI's anyway as strings, the existence or non-existence of a physical IRI ID is entirely transparent. To facilitate this situation, the compiler recognizes the comparison of an IRI column to a string-valued function or variable. It will automatically insert a 'get IRI_ID for string if exists' function into the compilation. We note that when we cast primary keys to IRI's and pass them back and forth, we may often get situations where key_to_iri (t1.k1, t1.k2) = key_to_iri (t2.k1, t2.k2) where t1 and t2 are different correlation names for a pk and fk tables and k1, k2 are the parts of a 2 part key. This is noticed by the SQL compiler and optimized into t1.k1 = t2.k1 and t1.k2 = t2.k2. Thus no function indices or the like are needed. It suffices that the key to iri functions are declared as having inverses and as being bijective with the sinv_create_inverse function. See the documentation for this. Also, when joining native RDF triples and relational data, we get situations like rdf_quad.S = key_to_iri (t1.k1, t1.k2). We note that there may not be an IRI ID corresponding to the key_to_iri function's result. The key_to_iri function returns a string, whereas the S,P,O,G of the RDF_QUAD table are IRI ID's referenced RDF_URL. The SQL compiler recognizes this and puts a cast there, one that will convert the string returned by the key_to_iri function into an IRI ID if there indeed is such an ID and NULL otherwise. Furthermore, because all key_to_iri functions have inverses, the condition can be rewritten as iri_to_key_1 (rdf_iri_id_qname (rdf_quad.S)) = t1.k1 and iri_to_key_2 (rdf_iri_id_qname (rdf_quad.S)) = t1.k2. This allows both join orders to be used with all index usage options, again without requiring function indices or the like. We note that all these optimizations will serve equally well whether the relational tables are local to Virtuoso or not. The native RDF tables will have to be local due to their use of the native IRI_ID data type. ---++ Using Meta Schema for Triple Storage Optimization If we know that some predicates are always single-valued within a certain graph, we can use a relational-like table with the subject as primary key and the objects of single-valued predicates as columns. Then we can make a single-column index on each of the columns. Oracle has used SQL materialized views for creating such data structures on the fly. Their name for this is SPMJV for Subject-Predicate Matrix Join View. We can use the meta schema graph patterns for implementing a similar thing at the RDF to SQL transformation and RDF storage level. ---++ Using Table Valued Functions for RDF on Demand Data that we wish to make SPARQL query-able are not always in relational tables. For example, some data can come from web services. Caching all possible outputs of a web service in a relational table in the anticipation of SPARQL access does not make sense in the general case. Also, data is sometimes de-normalized into non-first normal form columns, for example, the tags of an article can be all stored in a single text column with a text index on it. Still, for purposes of RDF rendition, we wish to consider each tag as a separate sioc:topic triple. For these cases, Virtuoso offers table valued functions, also known as procedure views. A piece of procedure code can thus be invoked from the FROM clause of a SQL select and its result set will be considered as that of a derived table. A graph pattern can refer to columns of procedure views just as it can refer to columns of actual tables. The SQL compiler will choose a reasonable join order based on cost information associated with the procedure view. For example, consider create procedure TV_PROC (in blog_id int, in post_uid int, in tag varchar) { __cost (20, 1, 1000, 100, 5); /* this means that the procedure costs 20 units to invoke * and returns 1 row if all parameters have a non-NULL value. * If the 1st is not given, multiply these by 1000. * If the 2nd is not given, by 100, if the 3rd is not given, by 5. */ ... /* if blog id is given, loop over all posts and tags. * If blog id and post id are given, loop over all tags of the post. * If all are given, give a result row if the post has the tag * and so on, for all the 8 combinations of null or non-null value of the three parameters */ } create procedure view BLOG_TAG_VIEW as TV_PROC (in TV_BLOG_ID int, in TV_POST_ID int, in TV_TAG varchar) (TV_BLOG_ID int, TV_POST_ID int, TV_TAG varchar); sparql graph .... { post_iri (BLOG_TAG.TV_BLOG_ID, BLOG_TAG.TV_POST_ID) sioc:topic TV_TAG . } Now we can do a query like: select * from where { ?me sioc:name "Orri" . ?me author_of ?post . ?post sioc:topic ?tag} The join order will be left to right because the compiler knows that the tag procedure has a very high cost factor for not specifying the two first arguments, so it will place the procedure call at the end, where the post id is known, instead of iterating over all posts and tags and then seeing which of these has "Orri" as name of author. #MappingOdsBlogToAtomOwl ---++ Example: Mapping ODS Weblog Post Data to the AtomOWL Ontology Consider a weblog application from OpenLink Data Spaces (ODS). A blog consists of blog posts. Every post may have comments. Every blog is owned by a user. A blog owner is a 'database user' registered in DB.DBA.SYS_USERS even if SQL access is disabled for him. A user may create as many blogs as he wishes. The application uses a database schema that is not very convenient for the mapping to AtomOWL. It is easy to form some set of subjects and predicates, because relational data are all in 3rd normal form. It's not easy to form predicates exactly as described in an independent AtomOWL standard and to form all IRIs to match names of existing resources. This example demonstrates the urgent need in extending basic data mapping functionality to map arbitrary joins of relational tables as opposed to VIEWs only. When four tables form a group (e.g. user, blog, post and comment) come from different tables we need additional relational views. create view BLOG.DBA.SYS_BLOG_INFO_FULL as select bi.BI_WAI_NAME, -- Blog application instance name u.U_ID, u.U_NAME, u.U_E_MAIL, u.U_HOME from BLOG.DBA.SYS_BLOG_INFO as bi left outer join DB.DBA.SYS_USERS as u on (U_ID = BI_OWNER) ; create view BLOG.DBA.SYS_BLOG_FULL as select b.B_POST_ID, -- ID of a post b.B_BLOG_ID, -- ID of a blog where the post appears b.B_TITLE, -- Post title b.B_CONTENT, -- Content of a post b.B_TS, -- Creation time b.B_USER_ID, -- Post author b.B_MODIFIED, -- Last modified bi.BI_WAI_NAME, u.U_ID, u.U_NAME, u.U_E_MAIL, u.U_HOME from BLOG.DBA.SYS_BLOGS as b left outer join BLOG.DBA.SYS_BLOG_INFO as bi on (BI_BLOG_ID = B_BLOG_ID) left outer join DB.DBA.SYS_USERS as u on (U_ID = BI_OWNER) ; create view BLOG.DBA.SYS_BLOG_COMMENT_FULL as select BM_ID, -- Id of a comment BM_TITLE, -- Comment title BM_COMMENT, -- Text of the comment BM_NAME, -- Author name BM_E_MAIL, -- Author e-mail BM_TS, -- Creation time BM_IS_PUB, -- Whether the comment message is public or private B_BLOG_ID, B_CONTENT, B_TITLE, B_POST_ID, B_TS, B_MODIFIED, BI_WAI_NAME, U_ID, U_NAME, U_E_MAIL, U_HOME from BLOG.DBA.BLOG_COMMENTS bm left outer join BLOG.DBA.SYS_BLOGS as b on (BM_BLOG_ID = B_BLOG_ID and BM_POST_ID = B_POST_ID) left outer join BLOG.DBA.SYS_BLOG_INFO as bi on (BI_BLOG_ID = B_BLOG_ID) left outer join DB.DBA.SYS_USERS as u on (U_ID = BI_OWNER) ; create view BLOG.DBA.SYS_BLOG_CATEGORY_FULL as select MTC_ID, -- ID if a category MTC_BLOG_ID, -- Id of a blog MTC_NAME, -- Category name BI_WAI_NAME, -- Blog application instance name U_ID, U_NAME, U_E_MAIL, U_HOME from BLOG.DBA.MTYPE_CATEGORIES as mtc left outer join BLOG.DBA.SYS_BLOG_INFO as bi on (MTC_BLOG_ID = BI_BLOG_ID) left outer join DB.DBA.SYS_USERS as u on (U_ID = BI_OWNER) ; create view BLOG.DBA.SYS_BLOG_POST_CATEGORY_FULL as select B_BLOG_ID, B_CONTENT, B_TITLE, B_POST_ID, B_TS, B_USER_ID, B_MODIFIED, MTC_ID, MTC_BLOG_ID, MTC_NAME, BI_WAI_NAME, U_ID, U_NAME, U_E_MAIL, U_HOME from BLOG.DBA.MTYPE_CATEGORIES as mtc join BLOG.DBA.SYS_BLOGS as b on (B_BLOG_ID = MTC_BLOG_ID) left outer join BLOG.DBA.SYS_BLOG_INFO as bi on (B_BLOG_ID = BI_BLOG_ID) left outer join DB.DBA.SYS_USERS as u on (U_ID = BI_OWNER) ; Now for any combination of tables we have a view that consists of tables with proper join conditions. Virtuoso has a good chance to create good execution plans for joins of such views because a 'view' in Virtuoso is expanded like a macro before any join order decisions are made. When a FROM clause of a SQL SELECT statement contains a view, the occurrence of a view in FROM clause is replaced with the full SELECT statement of the view body. So when a view P is a join of tables of P1 and P2, view Q is a join of Q1 and Q2 and the query is a join of P and Q then the compilation 'flattens' it and runs same way as if the query were a plain join of P1, P2, Q1 and Q2. The created views use demonstrate an optimization technique where LEFT OUTER JOINS are used instead of INNER JOINS as a way of eliminating the generation of redundant table joins. Thus, If we define a VIEW as: Table A left OUTER JOIN Table B and the JOIN condition has equalities for all columns matching the unique key of Table B (e.g., primary key columns), and only columns of Table A are queried from the VIEW, then the join with B will neither alter returned columns nor change the number of resulting rows, implying that they can be omitted. This let us declare a single VIEW using: "COMMENTS left join BLOGS left join BLOG_INFO left join SYS_USERS" avoiding the creation of redundant: "COMMENTS left join BLOGS" and "COMMENTS left join BLOGS left join BLOG_INFO". sparql prefix oplsioc: prefix sioc: prefix atom: prefix odsWeblog: create iri class odsWeblog:feed_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog/%U" ( in memb varchar not null, in inst varchar not null) . create iri class odsWeblog:feed_inst_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog-inst/%U" ( in memb varchar not null, in inst varchar not null) . create iri class odsWeblog:post_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog/%U/%U" ( in memb varchar not null, in inst varchar not null, in post varchar not null) . create iri class odsWeblog:post_inst_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog-post-inst/%U/%U" ( in memb varchar not null, in inst varchar not null, in post varchar not null) . create iri class odsWeblog:text_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog-text/%U/%U" ( in memb varchar not null, in inst varchar not null, in post varchar not null) . create iri class odsWeblog:cmt_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog/%U/%U/%d" ( in memb varchar not null, in inst varchar not null, in post varchar not null, in cmtid integer) . create iri class odsWeblog:cat_iri "http://^{URIQADefaultHost}^/dataspace/%U/weblog-cat/%U/%U" ( in memb varchar not null, in inst varchar not null, in catid varchar not null) . ; Note that the AtomOWL mapping for blogs will use IRI classes oplsioc:user_iri, oplsioc:group_iri as they are already declared in the example above. As soon as views are prepared and IRI classes are tested to ensure that composed IRIs match to IRIs of actual web pages the rest of quad map declaration is quite straightforward and every single mapping is as simple as in previous 'users and groups' use case. #MappingOdsBlogToAtomOwlExample sparql prefix oplsioc: prefix sioc: prefix atom: prefix odsWeblog: create quad storage virtrdf:ODS { create virtrdf:Weblog2 as graph iri ("http://^{URIQADefaultHost}^/dataspace/weblog2") option (exclusive) { # Feed odsWeblog:feed_iri (BLOG.DBA.SYS_BLOG_INFO_FULL.U_NAME, BI_WAI_NAME) a atom:Feed ; atom:feedInstance odsWeblog:feed_inst_iri (BLOG.DBA.SYS_BLOG_INFO_FULL.U_NAME, BI_WAI_NAME) . # Feed Instance odsWeblog:feed_inst_iri (BLOG.DBA.SYS_BLOG_INFO_FULL.U_NAME, BI_WAI_NAME) a atom:FeedInstance ; atom:title BI_WAI_NAME . odsWeblog:feed_inst_iri (BLOG.DBA.SYS_BLOG_CATEGORY_FULL.U_NAME, BI_WAI_NAME) atom:category odsWeblog:cat_iri (BLOG.DBA.SYS_BLOG_CATEGORY_FULL.U_NAME, BI_WAI_NAME, MTC_ID) . # The following mappings are excluded from the example: atom:contributor, atom:extension, atom:generator, # atom:icon, atom:link, atom:logo, atom:rights, atom:subtitle, atom:updated. # Post odsWeblog:post_iri (BLOG.DBA.SYS_BLOG_FULL.U_NAME, BI_WAI_NAME, B_POST_ID) a atom:Entry ; atom:entryInstance odsWeblog:post_inst_iri (BLOG.DBA.SYS_BLOG_FULL.U_NAME, BI_WAI_NAME, B_POST_ID) . # Post Instance odsWeblog:post_inst_iri (BLOG.DBA.SYS_BLOG_FULL.U_NAME, BI_WAI_NAME, B_POST_ID) a atom:EntryInstance ; atom:author oplsioc:user_iri (U_ID) ; atom:containingFeed odsWeblog:feed_inst_iri (BLOG.DBA.SYS_BLOG_FULL.U_NAME, BI_WAI_NAME) ; atom:content odsWeblog:text_iri (U_NAME, BI_WAI_NAME, B_POST_ID) ; atom:contributor oplsioc:user_iri (B_USER_ID) ; atom:published B_TS ; atom:title B_TITLE ; atom:updated B_MODIFIED . odsWeblog:post_inst_iri (BLOG.DBA.SYS_BLOG_POST_CATEGORY_FULL.U_NAME, BI_WAI_NAME, B_POST_ID) atom:category odsWeblog:cat_iri (BLOG.DBA.SYS_BLOG_POST_CATEGORY_FULL.U_NAME, BI_WAI_NAME, MTC_ID) . # The following mappings are excluded from the example: atom:extension, atom:link, atom:rights # atom:sourceFeed atom:summary. # Text odsWeblog:text_iri (BLOG.DBA.SYS_BLOG_FULL.U_NAME, BI_WAI_NAME, B_POST_ID) a atom:Text ; rdf:value B_CONTENT . # The following mappings are excluded from the example: atom:contentBase, atom:contentLang, # atom:contentSource, atom:contentType, atom:mimeType # Comment odsWeblog:cmt_iri (BLOG.DBA.SYS_BLOG_COMMENT_FULL.U_NAME, BI_WAI_NAME, B_POST_ID, BM_ID) a atom:EntryInstance where (^{alias}^.BM_IS_PUB = 1) . odsWeblog:cat_iri (BLOG.DBA.SYS_BLOG_CATEGORY_FULL.U_NAME, BI_WAI_NAME, MTC_ID) a atom:Category ; atom:categoryLabel MTC_NAME . # atom:categoryScheme and atom:categoryTerm are excluded from the example, as well as # a view that is specific for them. } } ; The disadvantage of current SPARQL implementation is that it can create enormous intermediate SQL queries. A relatively simple SPARQL query like five triple patterns where every IRI reference comes from a join of four tables results in a query with tens of joins and the full optimization before the first run takes seconds of CPU time. Additional problem is a risk of running out of optimization time limit so the server will stop optimization before it reached the optimal plan. We are presently developing a rich syntax of quad map declarations, that allows the application developer to describe join rules in quad mapping. * There will be no need in explicit CREATE VIEW statements. * The SPARQL optimizer will never create redundant joins. * The SPARQL optimizer will be able to prove more about ranges of SPARQL variables and better eliminate joins of triple patterns that can not produce any bindings. * The generator of an intermediate SQL code will create the smallest possible query and minimize the SQL optimization time. #MappingDemoToCustomOwlExample ---++Mapping Virtuoso Demo Database to a Custom Ontology for the Northwind Data Model Consider a Virtuoso Demo Database. This example uses the following tables from the Virtuoso Demo Database SQL Schema: Customers, Orders, Order Details, Products, Product Categories, Employee, Region, Country, Province. Bearing in mind the similarities between the SQL schemas of Virtuoso's Demo database and Microsoft's Northwind Database (used by ACCESS and Microsoft SQL Server), we have chosen to broaden the scope of this example by deriving an OWL ontology from a Conceptual Data Schema Language (CSDL) file that uses XML Schema to conceptually model the Northwind Schema. The aforementioned Ontology is then used as the basis for declaring mappings between the SQL Database and the OWL Ontology. ---+++XML Schema for Northwind (CSDL File) ---+++OWL Ontology for Northwind (in N3) @prefix rdfs: . @prefix xsd: . @prefix virtrdf: . @prefix owl: . @prefix rdf: . @prefix northwind: . northwind: rdf:type owl:Ontology ; rdfs:label "Northwind" ; rdfs:comment "Northwind database classes and properties" ; virtrdf:catName "Northwind" ; virtrdf:version "1.00" . northwind:Product rdf:type rdfs:Class ; rdfs:label "Product" . northwind:has_category rdf:type rdf:Property ; rdfs:range northwind:Category ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "Category" . northwind:has_supplier rdf:type rdf:Property ; rdfs:range northwind:Supplier ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "Supplier" . northwind:ProductName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "ProductName" . northwind:QuantityPerUnit rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "QuantityPerUnit" . northwind:UnitPrice rdf:type rdf:Property ; rdfs:range xsd:double ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "UnitPrice" . northwind:UnitsInStock rdf:type rdf:Property ; rdfs:range xsd:integer ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "UnitsInStock" . northwind:UnitsOnOrder rdf:type rdf:Property ; rdfs:range xsd:integer ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "UnitsOnOrder" . northwind:ReorderLevel rdf:type rdf:Property ; rdfs:range xsd:integer ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "ReorderLevel" . northwind:Discontinued rdf:type rdf:Property ; rdfs:range xsd:integer ; rdfs:domain northwind:Product ; rdfs:cardinality "1" ; rdfs:label "Discontinued" . northwind:Supplier rdf:type rdfs:Class ; rdfs:label "Supplier" . northwind:CompanyName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "CompanyName" . northwind:ContactName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "ContactName" . northwind:ContactTitle rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "ContactTitle" . northwind:Address rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "Address" . northwind:City rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "City" . northwind:Region rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "Region" . northwind:PostalCode rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "PostalCode" . northwind:Country rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "Country" . northwind:Phone rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "Phone" . northwind:Fax rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "Fax" . northwind:HomePage rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Supplier ; rdfs:cardinality "1" ; rdfs:label "HomePage" . northwind:Category rdf:type rdfs:Class ; rdfs:label "Category" . northwind:CategoryName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Category ; rdfs:cardinality "1" ; rdfs:label "CategoryName" . northwind:Description rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Category ; rdfs:cardinality "1" ; rdfs:label "Description" . northwind:Shipper rdf:type rdfs:Class ; rdfs:label "Shipper" . northwind:CompanyName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Shipper ; rdfs:cardinality "1" ; rdfs:label "CompanyName" . northwind:Phone rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Shipper ; rdfs:cardinality "1" ; rdfs:label "Phone" . northwind:Customer rdf:type rdfs:Class ; rdfs:label "Customer" . northwind:CompanyName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "CompanyName" . northwind:ContactName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "ContactName" . northwind:ContactTitle rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:donorthwind:main northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "ContactTitle" . northwind:Address rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "Address" . northwind:City rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "City" . northwind:Region rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "Region" . northwind:PostalCode rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "PostalCode" . northwind:Country rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "Country" . northwind:Phone rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "Phone" . northwind:Fax rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Customer ; rdfs:cardinality "1" ; rdfs:label "Fax" . northwind:Employee rdf:type rdfs:Class ; rdfs:label "Employee" . northwind:LastName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "LastName" . northwind:FirstName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "FirstName" . northwind:Title rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "Title" . northwind:TitleOfCourtesy rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "TitleOfCourtesy" . northwind:BirthDate rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "BirthDate" . northwind:HireDate rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "HireDate" . northwind:Address rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "Address" . northwind:City rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "City" . northwind:Region rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "Region" . northwind:PostalCode rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "PostalCode" . northwind:Country rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "Country" . northwind:HomePhone rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "HomePhone" . northwind:Extension rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "Extension" . northwind:Notes rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "Notes" . northwind:ReportsTo rdf:type rdf:Property ; rdfs:range xsd:integer ; rdfs:domain northwind:Employee ; rdfs:cardinality "1" ; rdfs:label "ReportsTo" . northwind:Order rdf:type rdfs:Class ; rdfs:label "Order" . northwind:has_customer rdf:type rdf:Property ; rdfs:range northwind:Customer ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "Customer" . northwind:has_employee rdf:type rdf:Property ; rdfs:range northwind:Employee ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "Employee" . northwind:OrderDate rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "OrderDate" . northwind:RequiredDate rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "RequiredDate" . northwind:ShippedDate rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShippedDate" . northwind:order_ship_via rdf:type rdf:Property ; rdfs:range northwind:Shipper ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "Shipper" . northwind:Freight rdf:type rdf:Property ; rdfs:range xsd:double ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "Freight" . northwind:ShipName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShipName" . northwind:ShipAddress rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShipAddress" . northwind:ShipCity rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShipCity" . northwind:ShipRegion rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShipRegion" . northwind:ShipPostalCode rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShipPostalCode" . northwind:ShipCountry rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Order ; rdfs:cardinality "1" ; rdfs:label "ShipCountry" . northwind:OrderLine rdf:type rdfs:Class ; rdfs:label "OrderLine" . northwind:has_order_id rdf:type rdf:Property ; rdfs:range northwind:Order ; rdfs:domain northwind:OrderLine; rdfs:cardinality "1" ; rdfs:label "Order" . northwind:has_product_id rdf:type rdf:Property ; rdfs:range northwind:Product ; rdfs:domain northwind:OrderLine; rdfs:cardinality "1" ; rdfs:label "Product" . northwind:UnitPrice rdf:type rdf:Property ; rdfs:range xsd:double ; rdfs:domain northwind:OrderLine; rdfs:cardinality "1" ; rdfs:label "UnitPrice" . northwind:Quantity rdf:type rdf:Property ; rdfs:range xsd:integer ; rdfs:domain northwind:OrderLine; rdfs:cardinality "1" ; rdfs:label "Quantity" . northwind:Discount rdf:type rdf:Property ; rdfs:range xsd:double ; rdfs:domain northwind:OrderLine; rdfs:cardinality "1" ; rdfs:label "Discount" . northwind:Country rdf:type rdfs:Class ; rdfs:label "Country" . northwind:Name rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "Name" . northwind:Code rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "Code" . northwind:SmallFlagDAVResourceName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "SmallFlagDAVResourceName" . northwind:LargeFlagDAVResourceName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "LargeFlagDAVResourceName" . northwind:SmallFlagDAVResourceURI rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "SmallFlagDAVResourceURI" . northwind:LargeFlagDAVResourceURI rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "LargeFlagDAVResourceURI" . northwind:Lat rdf:type rdf:Property ; rdfs:range xsd:double ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "Lat" . northwind:Lng rdf:type rdf:Property ; rdfs:range xsd:double ; rdfs:domain northwind:Country ; rdfs:cardinality "1" ; rdfs:label "Lng" . northwind:Province rdf:type rdfs:Class ; rdfs:label "Province" . northwind:has_country_code rdf:type rdf:Property ; rdfs:range northwind:Country ; rdfs:domain northwind:Provinces ; rdfs:cardinality "1" ; rdfs:label "Country Code" . northwind:ProvinceName rdf:type rdf:Property ; rdfs:range xsd:string ; rdfs:domain northwind:Province ; rdfs:cardinality "1" ; rdfs:label "ProvinceName" . ---+++ IRI Generator (Entity ID Constructor) Declarations sparql prefix demo: prefix oplsioc: prefix sioc: create iri class demo:Category "http://^{URIQADefaultHost}^/demo/Category?id=%d" (in category_id integer not null) . create iri class demo:Shipper "http://^{URIQADefaultHost}^/demo/Shipper?id=%d" (in shipper_id integer not null) . create iri class demo:Supplier "http://^{URIQADefaultHost}^/demo/Supplier?id=%d" (in supplier_id integer not null) . create iri class demo:Product "http://^{URIQADefaultHost}^/demo/Product?id=%d" (in product_id integer not null) . create iri class demo:Customer "http://^{URIQADefaultHost}^/demo/Customer?id=%d" (in customer_id integer not null) . create iri class demo:Employee "http://^{URIQADefaultHost}^/demo/Employee?id=%d" (in employee_id integer not null) . create iri class demo:Order "http://^{URIQADefaultHost}^/demo/Order?id=%d" (in order_id integer not null) . create iri class demo:OrderLine "http://^{URIQADefaultHost}^/demo/OrderLine?id=%d&prod_id=%d" (in order_id integer not null, in product_id integer not null) . create iri class demo:Province "http://^{URIQADefaultHost}^/demo/Province?country=%s&province=%s" (in country_name varchar not null, in province_name varchar not null) . create iri class demo:Country "http://^{URIQADefaultHost}^/demo/Country?country=%s" (in country_name varchar not null) . ; ---+++ SQL to RDF Mapping (Quad Patterns) sparql prefix demo: prefix oplsioc: prefix sioc: create quad storage virtrdf:Northwind from Demo.demo.Products as products from Demo.demo.Suppliers as suppliers from Demo.demo.Shippers as shippers from Demo.demo.Categories as categories from Demo.demo.Customers as customers from Demo.demo.Employees as employees from Demo.demo.Orders as orders from Demo.demo.Order_Details as order_lines from Demo.demo.Countries as countries from Demo.demo.Provinces as provinces { create virtrdf:Demo as graph iri ("http://^{URIQADefaultHost}^/demo") option (exclusive) { demo:Product (products.ProductID) a demo:Product as virtrdf:Product-ProductID ; demo:has_category demo:Category (products.CategoryID) as virtrdf:Product-product_has_category ; demo:has_supplier demo:Supplier (products.SupplierID) as virtrdf:Product-product_has_supplier ; demo:ProductName products.ProductName as virtrdf:Product-name_of_product ; demo:QuantityPerUnit products.QuantityPerUnit as virtrdf:Product-quantity_per_unit ; demo:UnitPrice products.UnitPrice as virtrdf:Product-unit_price ; demo:UnitsInStock products.UnitsInStock as virtrdf:Product-units_in_stock ; demo:UnitsOnOrder products.UnitsOnOrder as virtrdf:Product-units_on_order ; demo:ReorderLevel products.ReorderLevel as virtrdf:Product-reorder_level ; demo:Discontinued products.Discontinued as virtrdf:Product-discontinued . demo:Supplier (suppliers.SupplierID) a demo:Supplier as virtrdf:Supplier-SupplierID ; demo:CompanyName suppliers.CompanyName as virtrdf:Supplier-company_name ; demo:ContactName suppliers.ContactName as virtrdf:Supplier-contact_name ; demo:ContactTitle suppliers.ContactTitle as virtrdf:Supplier-contact_title ; demo:Address suppliers.Address as virtrdf:Supplier-address ; demo:City suppliers.City as virtrdf:Supplier-city ; demo:Region suppliers.Region as virtrdf:Supplier-region ; demo:PostalCode suppliers.PostalCode as virtrdf:Supplier-postal_code ; demo:Country suppliers.Country as virtrdf:Supplier-country ; demo:Phone suppliers.Phone as virtrdf:Supplier-phone ; demo:Fax suppliers.Fax as virtrdf:Supplier-fax ; demo:HomePage suppliers.HomePage as virtrdf:Supplier-home_page . demo:Category (categories.CategoryID) a demo:Category as virtrdf:Category-CategoryID ; demo:CategoryName categories.CategoryName as virtrdf:Category-home_page ; demo:Description categories.Description as virtrdf:Category-description ; demo:Picture categories.Picture as virtrdf:Category-picture . demo:Shipper (shippers.ShipperID) a demo:Shipper as virtrdf:Shipper-ShipperID ; demo:CompanyName shippers.CompanyName as virtrdf:Shipper-company_name ; demo:Phone shippers.Phone as virtrdf:Shipper-phone . demo:Customer (customers.CustomerID) a demo:Customer as virtrdf:Customer-CustomerID ; demo:CompanyName customers.CompanyName as virtrdf:Customer-company_name ; demo:ContactName customers.ContactName as virtrdf:Customer-contact_name ; demo:ContactTitle customers.ContactTitle as virtrdf:Customer-contact_title ; demo:Address customers.Address as virtrdf:Customer-address ; demo:City customers.City as virtrdf:Customer-city ; demo:Region customers.Region as virtrdf:Customer-region ; demo:PostalCode customers.PostalCode as virtrdf:Customer-postal_code ; demo:Country customers.Country as virtrdf:Customer-country ; demo:Phone customers.Phone as virtrdf:Customer-phone ; demo:Fax customers.Fax as virtrdf:Customer-fax . demo:Employee (employees.EmployeeID) a demo:Employee as virtrdf:Employee-EmployeeID ; demo:LastName employees.LastName as virtrdf:Employee-last_name ; demo:FirstName employees.FirstName as virtrdf:Employee-first_name ; demo:Title employees.Title as virtrdf:title ; demo:TitleOfCourtesy employees.TitleOfCourtesy as virtrdf:Employee-title_of_courtesy ; demo:BirthDate employees.BirthDate as virtrdf:Employee-birth_date ; demo:HireDate employees.HireDate as virtrdf:Employee-hire_date ; demo:Address employees.Address as virtrdf:Employee-address ; demo:City employees.City as virtrdf:Employee-city ; demo:Region employees.Region as virtrdf:Employee-region ; demo:PostalCode employees.PostalCode as virtrdf:Employee-postal_code ; demo:Country employees.Country as virtrdf:Employee-country ; demo:HomePhone employees.HomePhone as virtrdf:Employee-home_phone ; demo:Extension employees.Extension as virtrdf:Employee-extension ; demo:Photo employees.Photo as virtrdf:Employee-photo ; demo:Notes employees.Notes as virtrdf:Employee-notes ; demo:ReportsTo employees.ReportsTo as virtrdf:Employee-reports_to . demo:Order (orders.OrderID) a demo:Order as virtrdf:Order-Order ; demo:has_customer demo:Customer (orders.CustomerID) as virtrdf:Order-order_has_customer ; demo:has_employee demo:Employee (orders.EmployeeID) as virtrdf:Order-order_has_employee ; demo:OrderDate orders.OrderDate as virtrdf:Order-order_date ; demo:RequiredDate orders.RequiredDate as virtrdf:Order-required_date ; demo:ShippedDate orders.ShippedDate as virtrdf:Order-shipped_date ; demo:order_ship_via demo:Shipper (orders.ShipVia) as virtrdf:Order-order_ship_via ; demo:Freight orders.Freight as virtrdf:Order-freight ; demo:ShipName orders.ShipName as virtrdf:Order-ship_name ; demo:ShipAddress orders.ShipAddress as virtrdf:Order-ship_address ; demo:ShipCity orders.ShipCity as virtrdf:Order-ship_city ; demo:ShipRegion orders.ShipRegion as virtrdf:Order-ship_region ; demo:ShipPostal_code orders.ShipPostalCode as virtrdf:Order-ship_postal_code ; demo:ShipCountry orders.ShipCountry as virtrdf:ship_country . demo:OrderLine (order_lines.OrderID, order_lines.ProductID) a demo:OrderLine as virtrdf:OrderLine-OrderLines ; demo:has_order_id demo:Order (order_lines.OrderID) as virtrdf:order_lines_has_order_id ; demo:has_product_id demo:Product (order_lines.ProductID) as virtrdf:order_lines_has_product_id ; demo:UnitPrice order_lines.UnitPrice as virtrdf:OrderLine-unit_price ; demo:Quantity order_lines.Quantity as virtrdf:OrderLine-quantity ; demo:Discount order_lines.Discount as virtrdf:OrderLine-discount . demo:Country (countries.Name) a demo:Country as virtrdf:Country-Name ; demo:Code countries.Code as virtrdf:Country-Code ; demo:SmallFlagDAVResourceName countries.SmallFlagDAVResourceName as virtrdf:Country-SmallFlagDAVResourceName ; demo:LargeFlagDAVResourceName countries.LargeFlagDAVResourceName as virtrdf:Country-LargeFlagDAVResourceName ; demo:SmallFlagDAVResourceURI countries.SmallFlagDAVResourceURI as virtrdf:Country-SmallFlagDAVResourceURI ; demo:LargeFlagDAVResourceURI countries.LargeFlagDAVResourceURI as virtrdf:Country-LargeFlagDAVResourceURI ; demo:Lat countries.Lat as virtrdf:Country-Lat ; demo:Lng countries.Lng as virtrdf:Country-Lng . demo:Province (provinces.CountryCode, provinces.Province) a demo:Province as virtrdf:Province-Provinces ; demo:has_country_code demo:Country (provinces.CountryCode) as virtrdf:has_country_code ; demo:Province provinces.Province as virtrdf:Province-Province . } } ; ---++ Conclusions We have described how we can arbitrarily map relational database schemas to RDF ontologies en route to generating virtual RDF Data Sets (Graphs) that are then accessible to SPARQL Queries from within SQL or via the SPARQL Query Protocol. All of this is achieved without compromising the inherent flexibility of the RDF data model or the SPARQL Query Language. It should also be noted that all of the functionality demonstrated also applies Virtual DBMS functionality realm of Virtuoso. Thus, you can now map of 3rd party ODBC or JDBC accessible SQL data to RDF on the fly. Likewise, you can also use the same Virtual DBMS layer to map data exposed via local or 3rd party SOAP or REST based Web Services to RDF. In addition to providing immense power and flexibility at the data mapping level, we have also paid great attention to the low level optimization of Virtuoso's underlying RDF storage engine (Triple or Quad Store). ---++ Future Research & Development Future areas of research and development include: * Supporting declaration of data representation conditioned by the inferred domain of the triple pattern subject. * Optimized mapping of joins of relational tables without creating redundant views. * Supporting disjunction. * Federating SPARQL queries across multiple SPARQL endpoints. ---++Other SQL to RDF Mapping Projects * [[http://sites.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/][D2RQ Project]] * [[http://jena.sourceforge.net/SquirrelRDF/][Squirrel Project]] * [[http://www.w3.org/2005/05/22-SPARQL-MySQL/XTech][SPASQL Project]] ---++References [[http://www.w3.org/DesignIssues/RDB-RDF][Relational Databases & The Semantic Web]] - Tim Berners-Lee CategoryRDF CategoryAtomOWL CategoryTutorial CategorySQL CategoryOntology CategorySIOC CategoryVOS CategoryVirtuoso CategoryODS %VOSCOPY%
id
  • 3cd194e839d2df161698cb0705ea842d
link
has container
http://rdfs.org/si...ices#has_services
atom:title
  • VOSSQLRDF
links to
atom:source
atom:author
atom:published
  • 2017-06-13T05:40:08Z
atom:updated
  • 2017-06-13T05:40:08Z
topic
is made of
is container of of
is link of
is http://rdfs.org/si...vices#services_of of
is links to of
is creator of of
is atom:entry of
is atom:contains of
Faceted Search & Find service v1.17_git132 as of May 12 2023


Alternative Linked Data Documents: iSPARQL | ODE     Content Formats:   [cxml] [csv]     RDF   [text] [turtle] [ld+json] [rdf+json] [rdf+xml]     ODATA   [atom+xml] [odata+json]     Microdata   [microdata+json] [html]    About   
This material is Open Knowledge   W3C Semantic Web Technology [RDF Data] Valid XHTML + RDFa
OpenLink Virtuoso version 08.03.3332 as of Sep 11 2024, on Linux (x86_64-generic-linux-glibc25), Single-Server Edition (15 GB total memory, 2 GB memory in use)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2024 OpenLink Software