%META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%
---+Setting up a Content Crawler Job to Retrieve Semantic Sitemaps


The following guide describes how to set up a content crawler job for retrieving a Semantic Sitemap's content -- a variation of the standard sitemap:

 1 Go to the Conductor UI, for example at http://localhost:8890/conductor .
 1 Enter dba credentials.
 1 Go to "Web Application Server".
 1 Go to "Content Imports".
 1 Click "New Target".
 1 In the form shown:
 * Enter for "Crawl Job Name": 

Semantic Web Sitemap Example 

 * Enter for "Data Source Address (URL)": 

http://www.connexfilter.com/sitemap_en.xml

 * Enter the location in the Virtuoso WebDAV repository where the crawled content should be stored in the "Local WebDAV Identifier" text box; for example, if user demo is available: 

/DAV/home/demo/semantic_sitemap/

 * Choose the "Local resources owner" for the collection from the list box available, for example: user demo. 
 * Check "Semantic Web Crawling":
 * Note: when you select this option, you can either:
 1 Leave the Store Function and Extract Function empty -- in this case the system Store and Extract functions will be used for the Semantic Web crawling process; or:
 1 Select your own Store and Extract functions. [[VirtSetCrawlerJobsGuideSemanticSitemapsFuncExample][View an example of these functions]].
 * Check "Accept RDF"
 * Optionally, check "Store metadata" and specify which RDF Cartridges from the Sponger should be included.
 1 Click the "Create" button.
 1 Click "Import Queues".
 1 For the "Robot target" with label "Semantic Web Sitemap Example", click "Run".
 1 As a result, the number of pages retrieved should be shown.
 1 Check the retrieved RDF data from your Virtuoso instance's SPARQL endpoint http://cname:port/sparql with a query selecting all the retrieved graphs, for example:

SELECT ?g 
WHERE 
 { 
 graph ?g { ?s ?p ?o } . 
 FILTER ( ?g LIKE <http://www.connexfilter.com/%> ) 
 }


---++Related

 * [[VirtSetCrawlerJobsGuide][Setting up Crawler Jobs Guide using Conductor]]
 * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
 * [[VirtSetCrawlerJobsGuideSitemaps][Setting up a Content Crawler Job to Retrieve Sitemaps (where the source includes RDFa)]]
 * [[VirtSetCrawlerJobsGuideDirectories][Setting up a Content Crawler Job to Retrieve Content from Specific Directories]]
 * [[VirtCrawlerSPARQLEndpoints][Setting up a Content Crawler Job to Retrieve Content from a SPARQL Endpoint]]
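The verification step above can also be run outside the Conductor UI by sending the query to the SPARQL endpoint over HTTP. Below is a minimal Python sketch; the endpoint address (a default local instance) and the `LIKE` pattern (graphs crawled from the example sitemap's host) are illustrative assumptions, not values prescribed by this guide:

```python
from urllib.parse import urlencode

# Assumed default local endpoint; replace with your instance's http://cname:port/sparql.
ENDPOINT = "http://localhost:8890/sparql"

# Same shape as the verification query above; the LIKE pattern is an
# illustrative example matching graphs crawled from the sitemap's host.
QUERY = """\
SELECT ?g
WHERE
 {
 graph ?g { ?s ?p ?o } .
 FILTER ( ?g LIKE <http://www.connexfilter.com/%> )
 }
"""

def build_request_url(endpoint: str, query: str) -> str:
    """Build the GET URL for a SPARQL query, requesting JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)

url = build_request_url(ENDPOINT, QUERY)
print(url)
# Against a running instance, the URL can then be fetched, e.g. with
# urllib.request.urlopen(url), to list the graphs loaded by the crawler job.
```

The `format` parameter asks Virtuoso for SPARQL JSON results instead of the default HTML page, which is easier to process in scripts.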