
About: VirtSetCrawlerJobsGuideSitemaps

An Entity of Type : atom:Entry, within Data Space : ods.openlinksw.com associated with source document(s)

Attributes   Values
type
Date Created
Date Modified
label
  • VirtSetCrawlerJobsGuideSitemaps
maker
Title
  • VirtSetCrawlerJobsGuideSitemaps
isDescribedUsing
has creator
attachment
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr1.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr11.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr11a.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr11ab.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr11b.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr12.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr12a.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr13.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr14.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr15.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr2.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideSitemaps/cr3.png
content
  • %META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%
---+Setting up a Content Crawler Job to retrieve Sitemaps

The following guide describes how to set up a crawler job that retrieves the content of a basic Sitemap whose source includes RDFa.

   1 From the Virtuoso Conductor User Interface, i.e. http://cname:port/conductor, log in as the "dba" user.
   1 Go to the "Web Application Server" tab.
   1 Go to the "Content Imports" tab.
   1 Click the "New Target" button.
   1 In the form displayed:
      * Enter a name of choice in the "Crawl Job Name" text-box: Basic Sitemap Crawling Example
      * Enter the URL of the site to be crawled in the "Data Source Address (URL)" text-box: http://psclife.pscdog.com/catalog/seo_sitemap/product/
      * Enter the location in the Virtuoso WebDAV repository where the crawled content should be stored in the "Local WebDAV Identifier" text-box. For example, if user demo is available: /DAV/home/demo/basic_sitemap/
      * Choose the "Local resources owner" for the collection from the list-box, for example: user demo.
      * Select the "Accept RDF" check-box.
   1 Click the "Create" button to create the import.
   1 Click the "Import Queues" button.
   1 For the "Robot targets" entry labeled "Basic Sitemap Crawling Example", click the "Run" button.
   1 The target site is then crawled: the retrieved pages are stored locally in DAV, and any sponged triples are stored in the RDF Quad Store.
   1 Go to the "Web Application Server" -> "Content Management" tab.
   1 Navigate to the location of the newly created DAV collection: /DAV/home/demo/basic_sitemap/
   1 The retrieved content will be available in this location.
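
As an alternative to browsing the "Content Management" tab, the final check can be sketched from a script. The Python sketch below is not part of the Conductor workflow described above: it assumes the DAV collection is reachable over standard WebDAV at the same host and port as the Conductor, and the host =localhost:8890=, the helper names =build_propfind= and =list_hrefs=, and the sample response are illustrative assumptions only. It builds a depth-1 PROPFIND request for the collection and extracts the resource paths from a multistatus response.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by WebDAV multistatus responses (RFC 4918).
DAV_NS = "{DAV:}"


def build_propfind(base_url: str, collection: str) -> urllib.request.Request:
    """Build a depth-1 PROPFIND request for a WebDAV collection.

    base_url and collection are assumptions supplied by the caller,
    e.g. the host:port of the Virtuoso instance and the DAV path
    entered in the "Local WebDAV Identifier" text-box.
    """
    url = base_url.rstrip("/") + collection
    body = (
        '<D:propfind xmlns:D="DAV:">'
        "<D:prop><D:displayname/></D:prop>"
        "</D:propfind>"
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PROPFIND",
        # Depth: 1 asks for the collection and its immediate members.
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )


def list_hrefs(multistatus_xml: str) -> List[str]:
    """Return every href found in a WebDAV multistatus document."""
    root = ET.fromstring(multistatus_xml)
    return [el.text for el in root.iter(DAV_NS + "href") if el.text]


# Made-up response for illustration; a real server returns its own listing.
SAMPLE_MULTISTATUS = """<D:multistatus xmlns:D="DAV:">
  <D:response><D:href>/DAV/home/demo/basic_sitemap/</D:href></D:response>
  <D:response><D:href>/DAV/home/demo/basic_sitemap/page1.html</D:href></D:response>
</D:multistatus>"""

if __name__ == "__main__":
    # Hypothetical host and port; replace with the actual cname:port.
    request = build_propfind("http://localhost:8890", "/DAV/home/demo/basic_sitemap/")
    print(request.get_method(), request.full_url)
    for href in list_hrefs(SAMPLE_MULTISTATUS):
        print(href)
```

The sketch deliberately does not send the request. Against a live server the request could be passed to =urllib.request.urlopen=, and the collection will likely require the credentials of its owner (user demo in the example above).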

---++Related

   * [[VirtSetCrawlerJobsGuide][Setting up Crawler Jobs Guide using Conductor]]
   * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
   * [[VirtSetCrawlerJobsGuideSemanticSitemaps][Setting up a Content Crawler Job to Retrieve Semantic Sitemaps (a variation of the standard sitemap)]]
   * [[VirtSetCrawlerJobsGuideDirectories][Setting up a Content Crawler Job to Retrieve Content from Specific Directories]]
   * [[VirtCrawlerSPARQLEndpoints][Setting up a Content Crawler Job to Retrieve Content from SPARQL endpoint]]
id
  • 8e3b28cf81a7848dc1ce50585dfeebf2
link
has container
http://rdfs.org/si...ices#has_services
atom:title
  • VirtSetCrawlerJobsGuideSitemaps
links to
atom:source
atom:author
atom:published
  • 2017-06-13T05:48:33Z
atom:updated
  • 2017-06-13T05:48:33Z
topic
is made of
is container of of
is link of
is http://rdfs.org/si...vices#services_of of
is links to of
is creator of of
is atom:entry of
is atom:contains of