linkcheckerrc(5)					      linkcheckerrc(5)

NAME
       linkcheckerrc - configuration file for LinkChecker

DESCRIPTION
       linkcheckerrc  is  the configuration file for LinkChecker.  The file is
       written in an INI-style format.
       The default file location is ~/.linkchecker/linkcheckerrc on Unix and
       %HOMEPATH%\.linkchecker\linkcheckerrc on Windows systems.

SETTINGS
   [checking]
       cookiefile=filename
	      Read  a file with initial cookie data. The cookie data format is
	      explained in linkchecker(1).
	      Command line option: --cookiefile

       localwebroot=STRING
	      When checking absolute URLs inside local files, the  given  root
	      directory is used as base URL.
	      Note that the given directory must have URL syntax, so it must
	      use slashes instead of backslashes to join directories, and it
	      must end with a slash.
	      Command line option: none

       nntpserver=STRING
	      Specify  an NNTP server for news: links. Default is the environ‐
	      ment variable NNTP_SERVER. If no host is given, only the	syntax
	      of the link is checked.
	      Command line option: --nntp-server

       recursionlevel=NUMBER
	      Recursively check all links up to the given depth. A negative
	      depth enables infinite recursion. Default depth is infinite.
	      Command line option: --recursion-level

       threads=NUMBER
	      Generate no more than the given number of threads. Default  num‐
	      ber  of threads is 100. To disable threading specify a non-posi‐
	      tive number.
	      Command line option: --threads

       timeout=NUMBER
	      Set the timeout for connection attempts in seconds. The  default
	      timeout is 60 seconds.
	      Command line option: --timeout

       aborttimeout=NUMBER
	      Time  to	wait  for  checks  to finish after the user aborts the
	      first time (with Ctrl-C or the abort button).  The default abort
	      timeout is 300 seconds.
	      Command line option: --timeout

       useragent=STRING
	      Specify  the  User-Agent	string to send to the HTTP server, for
	      example "Mozilla/4.0". The default  is  "LinkChecker/X.Y"	 where
	      X.Y is the current version of LinkChecker.
	      Command line option: --user-agent

       sslverify=[0|1|filename]
	      If set to zero disables SSL certificate checking.	 If set to one
	      (the default) enables SSL certificate checking with the provided
	      CA certificate file. If a filename is specified, it will be used
	      as the certificate file.
	      Command line option: none

       maxrunseconds=NUMBER
	      Stop checking new URLs after the given number of seconds. This
	      has the same effect as the user stopping the check (by hitting
	      Ctrl-C or clicking the abort button in the GUI) after the given
	      number of seconds.
	      The default is not to stop until all URLs are checked.
	      Command line option: none

       maxnumurls=NUMBER
	      Maximum number of URLs to check. New URLs	 will  not  be	queued
	      after the given number of URLs is checked.
	      The default is to queue and check all URLs.
	      Command line option: none

       maxrequestspersecond=NUMBER
	      Limit the maximum number of requests per second to one host.

       allowedschemes=NAME[,NAME...]
	      Allowed URL schemes as comma-separated list.
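
       A minimal sketch of a [checking] section that combines several of the
       settings above. All values, including the cookie file path and the
       web root directory, are illustrative assumptions and not recommended
       defaults:

	[checking]
	cookiefile=/home/user/.linkchecker/cookies.txt
	localwebroot=/var/www/html/
	recursionlevel=2
	threads=10
	timeout=30
	maxrequestspersecond=5
	allowedschemes=http,https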

   [filtering]
       ignore=REGEX (MULTILINE)
	      Only  check  syntax  of  URLs matching the given regular expres‐
	      sions.
	      Command line option: --ignore-url

       ignorewarnings=NAME[,NAME...]
	      Ignore the comma-separated list of warnings. See WARNINGS for
	      the list of supported warnings.
	      Command line option: none

       internlinks=REGEX
	      Regular  expression  to  add  more  URLs	recognized as internal
	      links.  Default is that URLs  given  on  the  command  line  are
	      internal.
	      Command line option: none

       nofollow=REGEX (MULTILINE)
	      Check  but  do  not recurse into URLs matching the given regular
	      expressions.
	      Command line option: --no-follow-url

       checkextern=[0|1]
	      Check external links. Default is to check internal links only.
	      Command line option: --check-extern
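
       As a sketch, a [filtering] section using the settings above could look
       like the following. The host names and regular expressions are
       hypothetical and only show the syntax; ignore and nofollow use the
       indented form described under MULTILINE:

	[filtering]
	ignore=
	  ^mailto:
	  ^https?://www\.example\.com/archive/
	nofollow=
	  ^https?://www\.example\.com/download/
	internlinks=^https?://(www|docs)\.example\.com/
	ignorewarnings=url-whitespace,http-empty-content
	checkextern=1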

   [authentication]
       entry=REGEX USER [PASS] (MULTILINE)
	      Provide different user/password pairs for different link types.
	      Each entry is a triple (URL regex, username, password) or a pair
	      (URL regex, username), with the fields separated by whitespace.
	      The password is optional; if it is missing, it has to be entered
	      at the command line.
	      If the regular expression matches the checked URL, the given
	      user/password pair is used for authentication. The command line
	      options -u and -p match every link and therefore override the
	      entries given here. The first match wins. At the moment,
	      authentication is used/needed for http[s] and ftp links.
	      Command line option: -u, -p

       loginurl=URL
	      A login URL to be visited before checking. Also needs  authenti‐
	      cation data set for it.

       loginuserfield=STRING
	      The name of the user CGI field. Default name is login.

       loginpasswordfield=STRING
	      The name of the password CGI field. Default name is password.

       loginextrafields=NAME:VALUE (MULTILINE)
	      Optionally  any  additional  CGI name/value pairs. Note that the
	      default values are submitted automatically.
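
       A possible [authentication] section, shown only to illustrate the
       syntax. The URLs, user names, password and CGI field names are
       made-up assumptions. The first entry supplies the credentials for the
       login URL itself, and the last entry omits the password, so it would
       have to be entered at the command line:

	[authentication]
	entry=
	  ^https?://www\.example\.com/login alice secret
	  ^https?://www\.example\.com/private/ alice secret
	  ^ftp://ftp\.example\.com/ bob
	loginurl=https://www.example.com/login
	loginuserfield=user
	loginpasswordfield=pass
	loginextrafields=
	  remember:1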

   [output]
       debug=STRING[,STRING...]
	      Print debugging output for the given loggers.  Available loggers
	      are  cmdline, checking, cache, gui, dns, thread and all.	Speci‐
	      fying all is an alias for specifying all available loggers.
	      Command line option: --debug

       fileoutput=TYPE[,TYPE...]
	      Output to a file linkchecker-out.TYPE, or to
	      $HOME/.linkchecker/blacklist for blacklist output.
	      Valid file output types are text, html, sql, csv, gml, dot, xml,
	      none or blacklist. Default is no file output. The various output
	      types are documented below. Note that you can suppress all
	      console output with log=none (--output=none on the command
	      line).
	      Command line option: --file-output

       log=TYPE[/ENCODING]
	      Specify output type as text, html, sql, csv, gml, dot, xml, none
	      or  blacklist.   Default	type is text. The various output types
	      are documented below.
	      The ENCODING specifies the output encoding, the default is  that
	      of    your    locale.    Valid	encodings    are   listed   at
	      http://docs.python.org/library/codecs.html#standard-encodings.
	      Command line option: --output

       quiet=[0|1]
	      If set, operate quietly. An alias for log=none. This is only
	      useful with fileoutput.
	      Command line option: --quiet

       status=[0|1]
	      Control printing check status messages. Default is 1.
	      Command line option: --no-status

       verbose=[0|1]
	      If set, log all checked URLs once. Default is to log only errors
	      and warnings.
	      Command line option: --verbose

       warnings=[0|1]
	      If set, log warnings. Default is to log warnings.
	      Command line option: --no-warnings
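
       For illustration, an [output] section along these lines would print a
       UTF-8 text log to the console, write additional HTML and CSV files,
       log every checked URL and turn off status messages. The particular
       combination of values is an assumption chosen for the example:

	[output]
	log=text/utf-8
	fileoutput=html,csv
	verbose=1
	warnings=1
	status=0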

   [text]
       filename=STRING
	      Specify output filename for text logging.	 Default  filename  is
	      linkchecker-out.txt.
	      Command line option: --file-output=

       parts=STRING
	      Comma-separated  list of parts that have to be logged.  See LOG‐
	      GER PARTS below.
	      Command line option: none

       encoding=STRING
	      Set the output encoding. Valid encodings are listed at
	      http://docs.python.org/library/codecs.html#standard-encodings.
	      Default encoding is iso-8859-15.

       color* Color  settings  for  the	 various log parts, syntax is color or
	      type;color. The type can be bold,	 light,	 blink,	 invert.   The
	      color  can  be default, black, red, green, yellow, blue, purple,
	      cyan, white, Black, Red, Green, Yellow, Blue,  Purple,  Cyan  or
	      White.
	      Command line option: none

       colorparent=STRING
	      Set parent color. Default is white.

       colorurl=STRING
	      Set URL color. Default is default.

       colorname=STRING
	      Set name color. Default is default.

       colorreal=STRING
	      Set real URL color. Default is cyan.

       colorbase=STRING
	      Set base URL color. Default is purple.

       colorvalid=STRING
	      Set valid color. Default is bold;green.

       colorinvalid=STRING
	      Set invalid color. Default is bold;red.

       colorinfo=STRING
	      Set info color. Default is default.

       colorwarning=STRING
	      Set warning color. Default is bold;yellow.

       colordltime=STRING
	      Set download time color. Default is default.

       colorreset=STRING
	      Set reset color. Default is default.
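
       A short sketch of a [text] section using both color forms (color and
       type;color) described above. The filename and the chosen colors are
       illustrative assumptions:

	[text]
	filename=linkchecker-report.txt
	parts=url,result,warning,info
	encoding=utf-8
	colorvalid=bold;green
	colorinvalid=bold;red
	colorwarning=yellow
	colorparent=Blue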

   [gml]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

   [dot]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

   [csv]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

       separator=CHAR
	      Set CSV separator. Default is a comma (,).

       quotechar=CHAR
	      Set CSV quote character. Default is a double quote (").
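
       As an example, the following hypothetical [csv] section would write
       semicolon-separated fields quoted with single quotes:

	[csv]
	filename=linkchecker-out.csv
	separator=;
	quotechar='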

   [sql]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

       dbname=STRING
	      Set database name to store into. Default is linksdb.

       separator=CHAR
	      Set SQL command separator character. Default is a semicolon (;).

   [html]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

       colorbackground=COLOR
	      Set HTML background color. Default is #fff7e5.

       colorurl=COLOR
	      Set HTML URL color. Default is #dcd5cf.

       colorborder=COLOR
	      Set HTML border color. Default is #000000.

       colorlink=COLOR
	      Set HTML link color. Default is #191c83.

       colorwarning=COLOR
	      Set HTML warning color. Default is #e0954e.

       colorerror=COLOR
	      Set HTML error color. Default is #db4930.

       colorok=COLOR
	      Set HTML valid color. Default is #3ba557.
	      Set HTML valid color. Default is #3ba557.
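
       A sketch of an [html] section that overrides some of the colors.
       The values follow the #rrggbb form of the defaults above; the
       specific colors are arbitrary assumptions:

	[html]
	filename=linkchecker-out.html
	colorbackground=#ffffff
	colorlink=#0000cc
	colorerror=#cc0000
	colorok=#008000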

   [blacklist]
       filename=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

   [xml]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

   [gxml]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

   [sitemap]
       filename=STRING
	      See [text] section above.

       parts=STRING
	      See [text] section above.

       encoding=STRING
	      See [text] section above.

       priority=FLOAT
	      A	 number	 between  0.0  and  1.0	 determining the priority. The
	      default priority for the first URL is 1.0, for  all  child  URLs
	      0.5.

       frequency=[always|hourly|daily|weekly|monthly|yearly|never]
	      The frequency with which pages change.
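
       For instance, a [sitemap] section such as the following (values
       chosen only for illustration) would set a priority of 0.8 and a
       weekly change frequency:

	[sitemap]
	filename=sitemap.xml
	priority=0.8
	frequency=weekly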

LOGGER PARTS
	all	  (for all parts)
	id	  (a unique ID for each log entry)
	realurl	  (the full url link)
	result	  (valid or invalid, with messages)
	extern	  (1 or 0, only in some logger types reported)
	base	  (base href=...)
	name	  (<a href=...>name</a> and <img alt="name">)
	parenturl (if any)
	info	  (some additional info, e.g. FTP welcome messages)
	warning	  (warnings)
	dltime	  (download time)
	checktime (check time)
	url	  (the original url name, can be relative)
	intro	  (the blurb at the beginning, "starting at ...")
	outro	  (the blurb at the end, "found x errors ...")
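
       The part names above are used in the parts option of the logger
       sections. As a sketch, the following hypothetical setting would limit
       CSV output to a handful of parts:

	[csv]
	parts=url,parenturl,result,warning,dltime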

MULTILINE
       Some  option  values  can  span	multiple  lines.  Each	line has to be
       indented for that to work. Lines starting  with	a  hash	 (#)  will  be
       ignored, though they must still be indented.

	ignore=
	  lconline
	  bookmark
	  # a comment
	  ^mailto:

EXAMPLE
	[output]
	log=html

	[checking]
	threads=5

	[filtering]
	ignorewarnings=http-moved-permanent

PLUGINS
       All plugins have a separate section. If the section appears in the con‐
       figuration file the plugin is enabled.  Some plugins read extra options
       in their section.

   [AnchorCheck]
       Checks validity of HTML anchors.

   [LocationInfo]
       Adds  the  country  and	if possible city name of the URL host as info.
       Needs GeoIP or pygeoip and a local country or city lookup DB installed.

   [RegexCheck]
       Define a regular expression which prints a warning if  it  matches  any
       content	of  the	 checked link. This applies only to valid pages, so we
       can get their content.

       Use this to check for pages that contain some form  of  error  message,
       for example 'This page has moved' or 'Oracle Application error'.

       Note  that  multiple  values can be combined in the regular expression,
       for example "(This page has moved|Oracle Application error)".
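
       A sketch of how such a section might look. Note that this page does
       not name the option that holds the regular expression; the option
       name warningregex used below is an assumption and should be verified
       against the documentation of the installed LinkChecker version:

	[RegexCheck]
	# warningregex is an assumed option name, not stated in this page
	warningregex=(This page has moved|Oracle Application error)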

   [SslCertificateCheck]
       Check SSL certificate expiration date. Only internal https: links  will
       be checked. A domain will only be checked once to avoid duplicate warn‐
       ings.

       sslcertwarndays=NUMBER
	      Configures the expiration warning time in days.
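
       For example, the following section would enable the plugin with a
       warning time of 30 days. The number is an arbitrary illustrative
       value:

	[SslCertificateCheck]
	sslcertwarndays=30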

   [HtmlSyntaxCheck]
       Check the syntax of HTML pages with the online W3C HTML validator.  See
       http://validator.w3.org/docs/api.html.

   [HttpHeaderInfo]
       Print HTTP headers in URL info.

       prefixes=prefix1[,prefix2]...
	      Comma-separated list of header prefixes. For example, the
	      prefix "X-" displays all HTTP headers that start with "X-".
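
       As an illustration, this section would enable the plugin and display
       headers starting with "Server" or "X-". The choice of prefixes is an
       assumption made for the example:

	[HttpHeaderInfo]
	prefixes=Server,X-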

   [CssSyntaxCheck]
       Check the syntax of CSS stylesheets with the online W3C CSS validator. See
       http://jigsaw.w3.org/css-validator/manual.html#expert.

   [VirusCheck]
       Checks the page content for virus infections with clamav.  A local cla‐
       mav daemon must be installed.

       clamavconf=filename
	      Filename of clamd.conf config file.
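
       A sketch of a [VirusCheck] section. The path is an assumption; the
       actual location of clamd.conf depends on the system:

	[VirusCheck]
	clamavconf=/etc/clamav/clamd.conf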

   [PdfParser]
       Parse PDF files for URLs to check. Needs the  pdfminer  Python  package
       installed.

   [WordParser]
       Parse  Word files for URLs to check. Needs the pywin32 Python extension
       installed.

WARNINGS
       The following warnings are recognized in	 the  'ignorewarnings'	config
       file entry:

       file-missing-slash
	      The file: URL is missing a trailing slash.

       file-system-path
	      The file: path is not the same as the system specific path.

       ftp-missing-slash
	      The ftp: URL is missing a trailing slash.

       http-cookie-store-error
	      An error occurred while storing a cookie.

       http-empty-content
	      The URL had no content.

       mail-no-mx-host
	      The mail MX host could not be found.

       nntp-no-newsgroup
	      The NNTP newsgroup could not be found.

       nntp-no-server
	      No NNTP server was found.

       url-content-size-zero
	      The URL content size is zero.

       url-content-too-large
	      The URL content size is too large.

       url-effective-url
	      The effective URL is different from the original.

       url-error-getting-content
	      Could not get the content of the URL.

       url-obfuscated-ip
	      The IP is obfuscated.

       url-whitespace
	      The URL contains leading or trailing whitespace.

SEE ALSO
       linkchecker(1)

AUTHOR
       Bastian Kleineidam <bastian.kleineidam@web.de>

COPYRIGHT
       Copyright © 2000-2014 Bastian Kleineidam

LinkChecker			  2007-11-30		      linkcheckerrc(5)