Crawl Robot Set-up
The '''Crawl Robot Set-up''' fieldset is used to provide websites that you crawl with information about who is crawling them.
*The '''Crawl Robot Name''' field is used to set the USER-AGENT header sent by your robot. The header has the format:<br> <code> Mozilla/5.0 (compatible; NAME_FROM_THIS_FIELD; YOUR_SITES_URL/bot) </code><br> The value sent is common to all fetcher traffic from the same queue server on the site when downloading webpages.<br> If you crawl using multiple queue servers, you should give each queue server the same value. The value of YOUR_SITES_URL comes from the Name Server URL field under Server Settings.
*The '''Robot Instance''' field is used for web communication internal to a single Yioop instance, to help identify which queue server, or which fetcher under that queue server, was involved. This string should be unique for each queue server in your Yioop set-up. Its value is written when logging requests between fetchers and queue servers and can be helpful in debugging.
*The '''Robot Description''' field is used to specify the contents of the Public group's Bot wiki page. This page can also be accessed and edited under Manage Groups by clicking on the wiki link for the Public group and then editing its Bot page. This wiki page is what is displayed when someone goes to the URL:<br> YOUR_SITES_URL/bot <br> The point of this page is to give website owners both contact information for your bot and a description of how your bot crawls websites.
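As an illustration of the USER-AGENT format described above, the following is a minimal Python sketch. It is not Yioop's own code: the function name <code>build_user_agent</code>, the example robot name, and the example URL are all hypothetical, and the handling of a trailing slash on the site URL is an assumption made for this example.

```python
def build_user_agent(robot_name: str, site_url: str) -> str:
    """Return a USER-AGENT string in the format
    Mozilla/5.0 (compatible; NAME_FROM_THIS_FIELD; YOUR_SITES_URL/bot).

    robot_name stands in for the Crawl Robot Name field and site_url
    for the Name Server URL field. Both parameter names are hypothetical.
    """
    # Assumption: drop any trailing slash so the result never contains "//bot".
    bot_page_url = site_url.rstrip("/") + "/bot"
    return f"Mozilla/5.0 (compatible; {robot_name}; {bot_page_url})"


if __name__ == "__main__":
    # Example values only; substitute your own robot name and site URL.
    print(build_user_agent("ExampleBot", "https://crawler.example.com/"))
```

A website owner who sees this string in their access logs can follow the URL inside the parentheses to reach the bot page described under '''Robot Description'''.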
(c) Hobby
GOOTII.COM