About: VirtCrawlerGuideAtom

An Entity of Type : atom:Entry, within Data Space : ods.openlinksw.com associated with source document(s)

Attributes   Values
type
Date Created
Date Modified
label
  • VirtCrawlerGuideAtom
maker
Title
  • VirtCrawlerGuideAtom
isDescribedUsing
has creator
attachment
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cr3.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra1.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra10.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra11.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra12.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra13.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra14.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra15.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra2.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra3.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra4.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra5.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra6.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra7.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra8.png
  • http://vos.openlinksw.com/wiki/main/VOS/VirtCrawlerGuideAtom/cra9.png
content
  • %META:TOPICPARENT{name="VirtSetCrawlerJobsGuide"}%
---+Virtuoso Crawler Guide for populating Virtuoso Quad Store using ATOM feed

%TOC%

---++What?

This guide demonstrates populating the Virtuoso Quad Store using an ATOM feed.

---++Why?

Virtuoso supports several ways of populating the Quad Store. The Conductor -> Content Import UI offers plenty of options, one of which is an XPath expression for crawling the URLs of RDF resources. This feature is a powerful and easy-to-use way to manage the Quad Store.

---++How?

To populate the Virtuoso Quad Store, this guide uses an XPath expression that selects the URLs of the RDF resources referenced in a given ATOM feed, for example [[http://data.libris.kb.se/nationalbibliography/feed/][this one]] of the "National Bibliography" Store.

---+++Sample Scenario

   1 Go to http://cname/conductor
   1 Enter dba credentials
   1 Go to Web Application Server -> Content Management -> Content Imports: %BR%%BR%%BR%%BR%
   1 Click "New Target": %BR%%BR%%BR%%BR%
   1 In the presented form specify respectively:
      * Crawl Job Name: for example, National Bibliography ;
      * Data Source Address (URL): for example, [[http://data.libris.kb.se/nationalbibliography/feed/][http://data.libris.kb.se/nationalbibliography/feed/]] ;
         * Note: the entered URL will be the graph URI for storing the imported RDF data. You can also set it explicitly by entering another graph URI in the "If Graph IRI is unassigned use this Data Source URL:" option.
      * Local WebDAV Identifier: for example, /DAV/temp/nbio/ ;
      * XPath expression for links extraction: //entry/link/@href ;
      * Update Interval (minutes): for example, 10 ;
      * Run Sponger: tick this check-box ;
      * Accept RDF: tick this check-box ;
      * Store metadata: tick this check-box ;
      * RDF Cartridge: tick this check-box and specify which cartridges will be used: %BR%%BR% %BR% %BR%%BR%%BR%
   1 Click "Create".
   1 The newly created target should be displayed in the list of available Targets: %BR%%BR%%BR%%BR%
   1 Click "Import Queues": %BR%%BR%%BR%%BR%
   1 For the "National Bibliography" target, click the "Run" link in the rightmost "Action" column.
   1 A list of top pending URLs should be presented: %BR%%BR%%BR%%BR%
   1 Finally, when the import is finished, the total number of URLs that were processed should be shown: %BR%%BR%%BR%%BR%
   1 Click "Back". %BR%%BR%%BR%%BR%
   1 Click "Retrieved Sites". %BR%%BR%%BR%%BR%
   1 Our target should be presented in the list of available retrieved sites. From here you can manage the retrieved URLs by editing the imported URLs or exporting them to an external or internal WebDAV destination. For example, click the "Edit" link in the rightmost "Action" column for our retrieved site.
   1 All downloaded URLs of the RDF resources referenced in our initial ATOM feed should be presented. %BR%%BR%%BR%%BR%
   1 To view the imported RDF data, go to http://cname/sparql and enter a simple query, for example: SELECT * FROM <http://data.libris.kb.se/nationalbibliography/feed/> WHERE { ?s ?p ?o } %BR%%BR%%BR%%BR%
   1 Click "Run Query".
   1 The imported RDF data triples should be shown: %BR%%BR%%BR%%BR%

---++Related

   * [[http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html#rdfinsertmethodvirtuosocrawler][Setting up a Content Crawler Job to Add RDF Data to the Quad Store]]
   * [[VirtSetCrawlerJobsGuideSitemaps][Setting up a Content Crawler Job to Retrieve Sitemaps]] (when the source includes RDFa)
   * [[VirtSetCrawlerJobsGuideSemanticSitemaps][Setting up a Content Crawler Job to Retrieve Semantic Sitemaps]] (a variation of the standard sitemap)
   * [[VirtSetCrawlerJobsGuideDirectories][Setting up a Content Crawler Job to Retrieve Content from Specific Directories]]
   * [[VirtCrawlerSPARQLEndpoints][Setting up a Content Crawler Job to Retrieve Content from SPARQL endpoint]]
   * [[http://docs.openlinksw.com/virtuoso/xmlservices.html#xpath_sql][Virtuoso XPATH Implementation and SQL]]
   * [[http://librisbloggen.kb.se/2011/09/21/swedish-national-bibliography-and-authority-data-released-with-open-license/][Collection of examples of live ATOM and OAI-PMH feeds]]
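As a supplement to the Sample Scenario, the following minimal Python sketch illustrates what the XPath expression //entry/link/@href selects from an ATOM feed: the href of every link element that sits directly under an entry, while feed-level links are skipped. This is not Virtuoso code. It uses Python's standard xml.etree.ElementTree module, which needs namespace-qualified element names and cannot select attributes in the path itself, so the sketch uses an equivalent path and then reads the href attribute. The sample feed, the example.org URLs, and the function name extract_entry_links are hypothetical and chosen only for illustration. How the Virtuoso crawler itself treats the Atom namespace is not stated in this guide.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard Atom namespace; ElementTree requires it to match <entry> and <link>.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed, standing in for a real feed such as the
# "National Bibliography" one used in the scenario.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed (hypothetical)</title>
  <link href="http://example.org/feed/"/>
  <entry>
    <title>First resource</title>
    <link href="http://example.org/resource/1"/>
  </entry>
  <entry>
    <title>Second resource</title>
    <link rel="alternate" type="application/rdf+xml"
          href="http://example.org/resource/2.rdf"/>
  </entry>
</feed>
"""


def extract_entry_links(feed_xml: str) -> List[str]:
    """Return the href of each <link> that is a direct child of an <entry>.

    Rough equivalent of the XPath expression //entry/link/@href.
    """
    root = ET.fromstring(feed_xml)
    # Find link elements under any entry, in document order.
    links = root.findall(".//atom:entry/atom:link", ATOM_NS)
    # Read the href attribute, skipping links that lack one.
    return [link.get("href") for link in links if link.get("href")]


if __name__ == "__main__":
    for url in extract_entry_links(SAMPLE_FEED):
        print(url)
```

Running the sketch against a saved copy of a feed before creating the crawl job is one way to check that the expression will hand the crawler the resource URLs you expect.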
id
  • 39696faaf5e697da124a9bcfc6054541
link
has container
http://rdfs.org/si...ices#has_services
atom:title
  • VirtCrawlerGuideAtom
links to
atom:source
atom:author
atom:published
  • 2017-06-13T05:43:06Z
atom:updated
  • 2017-06-13T05:43:06Z
topic
is made of
is container of of
is link of
is http://rdfs.org/si...vices#services_of of
is links to of
is creator of of
is atom:entry of
is atom:contains of