
About: VirtCrawlerSPARQLEndpoints

An Entity of Type : atom:Entry, within Data Space : ods.openlinksw.com associated with source document(s)

AttributesValues
type
Date Created
Date Modified
label
  • VirtCrawlerSPARQLEndpoints
maker
Title
  • VirtCrawlerSPARQLEndpoints
isDescribedUsing
has creator
attachment
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/cr3.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/cra1.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp1.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp10.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp11.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp12.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp13.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp14.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp2.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp3.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp4.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp5.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp6.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp7.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp8.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerSPARQLEndpoints/scp9.png
content
  • %META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%

---+Setting up a Content Crawler Job to Retrieve Content from a SPARQL Endpoint

The following step-by-step guide walks you through the process of:

   * Populating a Virtuoso Quad Store with data from a 3rd party SPARQL endpoint.
   * Generating RDF dumps that are accessible to basic HTTP or WebDAV user agents.

   1 Sample SPARQL query producing a list of SPARQL endpoints:
<verbatim>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:     <http://www.w3.org/2002/07/owl#>
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX scovo:   <http://purl.org/NET/scovo#>
PREFIX void:    <http://rdfs.org/ns/void#>
PREFIX akt:     <http://www.aktors.org/ontology/portal#>

SELECT DISTINCT ?endpoint
WHERE
  {
    ?ds a void:Dataset .
    ?ds void:sparqlEndpoint ?endpoint
  }
</verbatim>
   1 Here is a sample SPARQL protocol URL constructed from one of the SPARQL endpoints in the result of the query above:
<verbatim>
http://void.rkbexplorer.com/sparql/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E+%0D%0APREFIX+void%3A+++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E++%0D%0ASELECT+distinct+%3Furl++WHERE+%7B+%3Fds+a+void%3ADataset+%3B+foaf%3Ahomepage+%3Furl+%7D%0D%0A&format=sparql
</verbatim>
   1 Here is the cURL output showing a Virtuoso SPARQL URL that executes against a 3rd party SPARQL endpoint URL:
<verbatim>
$ curl "http://void.rkbexplorer.com/sparql/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E+%0D%0APREFIX+void%3A+++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E++%0D%0ASELECT+distinct+%3Furl++WHERE+%7B+%3Fds+a+void%3ADataset+%3B+foaf%3Ahomepage+%3Furl+%7D%0D%0A&format=sparql"
http://kisti.rkbexplorer.com/
http://epsrc.rkbexplorer.com/
http://test2.rkbexplorer.com/
http://test.rkbexplorer.com/
...
...
...
</verbatim>
   1 Go to the Conductor UI, for example http://localhost:8890/conductor :
   1 Enter the dba credentials.
   1 Go to "Web Application Server" -> "Content Management" -> "Content Imports".
   1 Click "New Target".
   1 In the presented form enter, for example:
      * "Crawl Job Name": voiD store
      * "Data Source Address (URL)": the URL from above, i.e.: http://void.rkbexplorer.com/sparql/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E+%0D%0APREFIX+void%3A+++++%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E++%0D%0ASELECT+distinct+%3Furl++WHERE+%7B+%3Fds+a+void%3ADataset+%3B+foaf%3Ahomepage+%3Furl+%7D%0D%0A&format=sparql
      * "Local WebDAV Identifier": /DAV/void.rkbexplorer.com/content
      * "Follow links matching (delimited with ;)": %
      * Uncheck "Use robots.txt".
      * "XPath expression for links extraction": //binding[@name="url"]/uri/text()
      * Check "Semantic Web Crawling".
      * "If Graph IRI is unassigned use this Data Source URL:": enter, for example, http://void.collection
      * Check "Follow URLs outside of the target host".
      * Check "Run Sponger" and "Accept RDF".
   1 Click "Create".
   1 The target should be created and presented in the list of available targets.
   1 Click "Import Queues".
   1 Click "Run" for the imported target.
   1 To check the retrieved content, go to "Web Application Server" -> "Content Management" -> "Content Imports" -> "Retrieved Sites".
   1 Click voiD store -> "Edit".
   1 To check the imported URLs, go to "Web Application Server" -> "Content Management" -> "Repository", path DAV/void.rkbexplorer.com/content.
   1 To check the data inserted into the RDF Quad Store, go to http://cname/sparql and execute the following query:
<verbatim>
SELECT *
FROM <http://void.collection>
WHERE
  {
    ?s ?p ?o
  }
</verbatim>

---++Related

   * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
   * [[VirtSetCrawlerJobsGuideSitemaps][Setting up a Content Crawler Job to Retrieve Sitemaps]] (when the source includes RDFa)
   * [[VirtSetCrawlerJobsGuideSemanticSitemaps][Setting up a Content Crawler Job to Retrieve Semantic Sitemaps]] (a variation of the standard sitemap)
   * [[VirtSetCrawlerJobsGuideDirectories][Setting up a Content Crawler Job to Retrieve Content from Specific Directories]]
   * [[VirtCrawlerGuideAtom][Setting up a Content Crawler Job to Retrieve Content from ATOM feed]]
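
As an illustration of two pieces of the guide above, the following Python sketch shows how a SPARQL protocol URL like the one entered as "Data Source Address (URL)" can be assembled from an endpoint and a query, and how the link-extraction XPath =//binding[@name="url"]/uri/text()= selects URLs from a SPARQL XML result document. It is a minimal sketch, not part of Virtuoso: the helper names =build_protocol_url= and =extract_bindings= and the inline sample document are hypothetical, and the code assumes the endpoint returns the standard W3C SPARQL Query Results XML Format. Because Python's ElementTree requires namespace-qualified paths, the XPath is written with an explicit namespace prefix here, whereas the crawler form takes the unqualified expression.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format (assumed result format).
SPARQL_RESULTS_NS = "http://www.w3.org/2005/sparql-results#"

# The query from the guide: homepages of all void:Dataset resources.
QUERY = (
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
    "PREFIX void: <http://rdfs.org/ns/void#>\n"
    "SELECT DISTINCT ?url WHERE { ?ds a void:Dataset ; foaf:homepage ?url }"
)


def build_protocol_url(endpoint, query, result_format="sparql"):
    """Hypothetical helper: percent-encode the query into a SPARQL protocol GET URL."""
    return endpoint + "?" + urlencode({"query": query, "format": result_format})


def extract_bindings(results_xml, name="url"):
    """Hypothetical helper: ElementTree equivalent of //binding[@name="url"]/uri/text()."""
    root = ET.fromstring(results_xml)
    path = ".//sr:binding[@name='%s']/sr:uri" % name
    return [uri.text for uri in root.findall(path, {"sr": SPARQL_RESULTS_NS})]


# Hand-written sample result document, used so the sketch runs without network access.
SAMPLE_RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="url"/></head>
  <results>
    <result><binding name="url"><uri>http://kisti.rkbexplorer.com/</uri></binding></result>
    <result><binding name="url"><uri>http://epsrc.rkbexplorer.com/</uri></binding></result>
  </results>
</sparql>"""

if __name__ == "__main__":
    print(build_protocol_url("http://void.rkbexplorer.com/sparql/", QUERY))
    for link in extract_bindings(SAMPLE_RESULTS):
        print(link)
```

The URLs returned by =extract_bindings= correspond to the links the crawler follows when "Follow URLs outside of the target host" is checked. The generated URL differs from the one in the guide only in whitespace encoding (=%0A= instead of =%0D%0A=, single spaces instead of runs of =+=), which does not change the query.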
id
  • 375585bb5d3b7b0fa04f28ce2a196565
link
has container
http://rdfs.org/si...ices#has_services
atom:title
  • VirtCrawlerSPARQLEndpoints
links to
atom:source
atom:author
atom:published
  • 2017-06-13T05:49:29Z
atom:updated
  • 2017-06-13T05:49:29Z
topic
is made of
is container of of
is link of
is http://rdfs.org/si...vices#services_of of
is creator of of
is atom:entry of
is atom:contains of
Faceted Search & Find service v1.17_git132 as of May 12 2023


OpenLink Virtuoso version 08.03.3332 as of Sep 11 2024, on Linux (x86_64-generic-linux-glibc25), Single-Server Edition (15 GB total memory, 2 GB memory in use)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2024 OpenLink Software