
About: VirtSetCrawlerJobsGuideDirectories

An Entity of Type : atom:Entry, within Data Space : ods.openlinksw.com associated with source document(s)

Attributes   Values
type
Date Created
Date Modified
label
  • VirtSetCrawlerJobsGuideDirectories
maker
Title
  • VirtSetCrawlerJobsGuideDirectories
isDescribedUsing
has creator
attachment
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/cr1.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/cr2.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/cr3.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d1.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d1a.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d2.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d3.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d4.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d5.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d6.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtSetCrawlerJobsGuideDirectories/d7.png
content
  • %META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%
---+Setting up a Content Crawler Job to Retrieve Content from Specific Directories

The following guide describes how to set up a crawler job that retrieves content from specific directories, using the Conductor.

   1 Go to the Conductor UI, for example at http://localhost:8890/conductor .
   1 Enter the dba credentials.
   1 Go to "Web Application Server". %BR%%BR%%BR%%BR%
   1 Go to "Content Imports". %BR%%BR%%BR%%BR%
   1 Click "New Target". %BR%%BR%%BR%%BR%
   1 In the form that is shown, set the following:
      * "Crawl Job Name": Gov.UK data
      * "Data Source Address (URL)": http://source.data.gov.uk/data/
      * "Local WebDAV Identifier" for an available user, for example demo: /DAV/home/demo/gov.uk/
      * "Local resources owner": choose a user from the available list, for example demo. %BR%%BR%%BR%%BR%
      * Click the "Create" button.
   1 As a result, the Robot target will be created: %BR%%BR%%BR%%BR%
   1 Click "Import Queues". %BR%%BR%%BR%%BR%
   1 For the "Robot target" with the label "Gov.UK data", click "Run".
   1 As a result, the status of the pages will be shown: retrieved, pending, or waiting. %BR%%BR%%BR%%BR%
   1 Click "Retrieved Sites".
   1 As a result, the total number of pages retrieved should be shown. %BR%%BR%%BR%%BR%
   1 Go to "Web Application Server" -> "Content Management".
   1 Enter the path: DAV/home/demo/gov.uk %BR%%BR%%BR%%BR%
   1 Go to the path: DAV/home/demo/gov.uk/data
   1 As a result, the retrieved content will be shown.
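
The relationship between the form values and the final WebDAV location in the steps above can be sketched in a few lines of Python. This is only an illustration, not part of Virtuoso: the helper =local_dav_path= and the dictionary =CRAWL_JOB= are hypothetical names, and the mapping rule (the path portion of the source URL is appended to the "Local WebDAV Identifier") is an assumption inferred from the guide's own example, where http://source.data.gov.uk/data/ ends up under DAV/home/demo/gov.uk/data.

```python
import posixpath
from urllib.parse import urlsplit

# Form values exactly as entered in the "New Target" step of the guide.
CRAWL_JOB = {
    "Crawl Job Name": "Gov.UK data",
    "Data Source Address (URL)": "http://source.data.gov.uk/data/",
    "Local WebDAV Identifier": "/DAV/home/demo/gov.uk/",
    "Local resources owner": "demo",
}


def local_dav_path(source_url: str, dav_root: str) -> str:
    """Hypothetical helper: guess where a crawled URL lands in WebDAV.

    Assumption (inferred from the guide's example, not from Virtuoso
    documentation): the URL's path is appended to the local WebDAV folder.
    """
    # Drop the leading "/" so the URL path is treated as relative to dav_root.
    url_path = urlsplit(source_url).path.lstrip("/")
    return posixpath.join(dav_root, url_path)


if __name__ == "__main__":
    target = local_dav_path(
        CRAWL_JOB["Data Source Address (URL)"],
        CRAWL_JOB["Local WebDAV Identifier"],
    )
    print(target)  # prints /DAV/home/demo/gov.uk/data/
```

The printed path matches the folder opened under "Content Management" in the last steps. If the crawler on a given installation stores content differently, the folder shown in the Conductor is authoritative, not this sketch.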
%BR%%BR%%BR%%BR%

---++Related
   * [[VirtSetCrawlerJobsGuide][Setting up Crawler Jobs Guide using Conductor]]
   * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
   * [[VirtSetCrawlerJobsGuideSitemaps][Setting up a Content Crawler Job to Retrieve Sitemaps (where the source includes RDFa)]]
   * [[VirtSetCrawlerJobsGuideSemanticSitemaps][Setting up a Content Crawler Job to Retrieve Semantic Sitemaps (a variation of the standard sitemap)]]
   * [[VirtCrawlerSPARQLEndpoints][Setting up a Content Crawler Job to Retrieve Content from SPARQL endpoint]]
id
  • 01b349799c30efc349e0f22448cc4a70
link
has container
http://rdfs.org/si...ices#has_services
atom:title
  • VirtSetCrawlerJobsGuideDirectories
links to
atom:source
atom:author
atom:published
  • 2017-06-13T05:37:45Z
atom:updated
  • 2017-06-13T05:37:45Z
topic
is made of
is container of of
is link of
is http://rdfs.org/si...vices#services_of of
is links to of
is creator of of
is atom:entry of
is atom:contains of


Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2024 OpenLink Software