webalizer man page on YellowDog

webalizer(1)			 The Webalizer			  webalizer(1)

NAME
       webalizer - A web server log file analysis tool.

SYNOPSIS
       webalizer [ option ... ] [ log-file ]

       webazolver [ option ... ] [ log-file ]

DESCRIPTION
       The  Webalizer is a web server log file analysis program which produces
       usage statistics in HTML	 format	 for  viewing  with  a	browser.   The
       results	are  presented	in  both  columnar and graphical format, which
       facilitates interpretation.  Yearly, monthly, daily  and	 hourly	 usage
       statistics  are	presented,  along with the ability to display usage by
       site, URL, referrer, user agent (browser),  username,  search  strings,
       entry/exit  pages,   and country (some information may not be available
       if not present in the log file being processed).

       The Webalizer supports CLF (common log format) log files,  as  well  as
       Combined	 log  formats as defined by NCSA and others, and variations of
       these which it attempts to  handle  intelligently.   In	addition,  the
       Webalizer  also	supports wu-ftpd xferlog formatted log files, allowing
       analysis of ftp servers, and squid proxy logs.  Logs may also  be  com‐
       pressed,	 via  gzip.   If a compressed log file is detected, it will be
       automatically uncompressed while it is read.  Compressed logs must have
       the standard gzip extension of .gz.
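
       The extension rule above can be illustrated with a short Python sketch.
       This is not The Webalizer's own code; open_log is a hypothetical helper
       name, and the latin-1 encoding is an assumption chosen only so that any
       byte sequence in a log can be read without decoding errors.

```python
import gzip


def open_log(path):
    """Return a text-mode file object for a log file.

    Hypothetical helper that mirrors the documented rule:
    a name ending in .gz is decompressed while it is read,
    anything else is read as plain text.
    """
    if path.endswith(".gz"):
        # gzip.open in "rt" mode decompresses transparently.
        return gzip.open(path, "rt", encoding="latin-1")
    return open(path, "r", encoding="latin-1")
```

       A compressed log that does not carry the .gz extension would be read
       as plain text by this sketch, which is why the extension matters.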

       webazolver is normally just a symbolic link to webalizer.  When run as
       webazolver, only DNS cache file creation/updates are performed, and the
       program exits once complete.  All normal options and configuration
       directives are accepted, although many have no effect.  In addition, a
       DNS cache file must be specified.  If the number of DNS child processes
       is not specified, webazolver defaults to 5.

       This documentation applies to The Webalizer Version 2.01

RUNNING THE WEBALIZER
       The Webalizer was designed to be run from a Unix command line prompt or
       as a crond(8) job. Once executed, the general flow of the program is:

       o       A  default  configuration  file	is  scanned for.  A file named
               webalizer.conf is searched for in the current directory and, if
               found and owned by the invoking user, its configuration data is
               parsed.  If the file is not present in the current directory,
               the file /etc/webalizer.conf is searched for and, if found, is
               used instead.

       o       Any command line arguments given to  the	 program  are  parsed.
	       This  may  include  the	specification of a configuration file,
	       which is processed at the time it is encountered.

       o       If a log file was specified, it is opened and  made  ready  for
	       processing.  If no log file was given, STDIN is used for input.
	       If the log filename '-' is specified, STDIN will be forced.

       o       If an output  directory	was  specified,	 the  program  does  a
               chdir(2) to that directory in preparation for generating output.
	       If no output directory was  given,  the	current	 directory  is
	       used.

       o       If  a non-zero number of DNS Children processes were specified,
	       they will be started, and the specified log file will  be  pro‐
	       cessed, creating or updating the specified DNS cache file.

       o       If no hostname was given, the program attempts to get the host‐
	       name using a uname(2) system call.  If that fails, localhost is
	       used.

       o       A history file is searched for in the current directory (output
               directory) and read if found.  This file keeps totals for
               previous months, which are used in the main index.html HTML
               document.  Note: The file location can now be specified with the
	       HistoryName configuration option.

       o       If  incremental	processing  was	 specified,  a	data  file  is
	       searched for and loaded	if  found,  containing	the  'internal
	       state' data of the program at the end of a previous run.	 Note:
	       The file location can now be specified with the IncrementalName
	       configuration option.

       o       Main processing begins on the log file.  If the log spans
               multiple months, a separate HTML document is created for each
               month.

       o       After  main  processing,	 the  main index.html page is created,
               which has totals by month and links to each month's HTML
               document.

       o       A new history file is saved to disk, which includes totals gen‐
	       erated by The Webalizer during the current run.

       o       If incremental processing was specified, a data file is written
	       that contains the 'internal state' data at the end of this run.
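
       The default configuration file lookup described in the first step above
       can be sketched in Python as follows.  The function name and its
       parameters are hypothetical and exist only for illustration.  The text
       does not say what happens when ./webalizer.conf exists but is owned by
       another user; this sketch assumes that no default file is used in that
       case, which may differ from the real program.

```python
import os


def find_default_config(cwd=".", etc_path="/etc/webalizer.conf", uid=None):
    """Return the path of the default configuration file, or None.

    Sketch of the documented lookup order:
    1. webalizer.conf in the current directory, if owned by the invoking user
    2. /etc/webalizer.conf, if no webalizer.conf exists in the current directory
    """
    if uid is None:
        uid = os.getuid()
    local = os.path.join(cwd, "webalizer.conf")
    if os.path.isfile(local):
        if os.stat(local).st_uid == uid:
            return local
        # ASSUMPTION: a local file owned by someone else is skipped and
        # no fallback is attempted (the text only describes the fallback
        # for the "not present" case).
        return None
    if os.path.isfile(etc_path):
        return etc_path
    return None
```

       A configuration file named with -c on the command line is processed
       later, when the option is encountered, so it can override values taken
       from the default file.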

INCREMENTAL PROCESSING
       Version	1.2x of The Webalizer adds incremental run capability.	Simply
       put, this allows processing large log files by breaking	them  up  into
       smaller	pieces,	 and processing these pieces instead.  What this means
       in real terms is that you can now rotate your log files as often as you
       want, and still be able to produce monthly usage statistics without the
       loss of any detail.  Basically, The Webalizer saves  and	 restores  all
       internal	 data in a file named webalizer.current.  This allows the pro‐
       gram to 'start where it left off' so to speak, and allows the preserva‐
       tion  of	 detail	 from one run to the next.  The data file is placed in
       the current output directory, and is a plain ascii text file  that  can
       be viewed with any standard text editor.  Its location and name may be
       changed using the IncrementalName configuration keyword.

       Some special precautions need to be taken when  using  the  incremental
       run  capability	of The Webalizer.  Configuration options should not be
       changed between runs, as that could cause corruption  of	 the  internal
       data  stored.   For example, changing the MangleAgents level will cause
       different representations  of  user  agents  to	be  stored,  producing
       invalid	results in the user agents section of the report.  If you need
       to change configuration options, do it at the end of  the  month	 after
       normal processing of the previous month and before processing the
       current month.  You may also want to delete the webalizer.current
       file.

       The  Webalizer  also  attempts  to  prevent data duplication by keeping
       track of the timestamp of the last record processed.  This timestamp is
       then  compared to current records being processed, and any records that
       were logged previous to that timestamp are ignored.  This,  in  theory,
       should  allow  you to re-process logs that have already been processed,
       or process logs that contain  a	mix  of	 processed/not	yet  processed
       records, and not produce duplication of statistics.  The only time this
       may break is if you have duplicate timestamps in two separate log
       files: any records in the second log file that have the same timestamp
       as the last record in the previously processed log file will be
       discarded as if they had already been processed.  There are several
       ways to prevent this; for example, stopping the web server before
       rotating logs avoids the situation.  This scheme also requires that you
       always process logs in chronological order, otherwise data loss will
       occur as a result of the timestamp comparison.
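
       The timestamp comparison described above can be sketched in Python.
       This is an illustration of the idea, not the program's actual code:
       filter_new_records is a hypothetical name, timestamps are assumed to be
       plain comparable values such as seconds since the epoch, and the
       two-phase logic (skip everything up to the saved timestamp, then accept
       records in order) is one reading of the text.

```python
def filter_new_records(records, last_seen=None):
    """Drop records that appear to have been processed already.

    records   -- iterable of (timestamp, line) pairs in log order
    last_seen -- timestamp of the last record from the previous run,
                 or None when there is no saved state
    Returns (kept_records, new_last_seen).
    """
    kept = []
    checking_dups = last_seen is not None
    for ts, line in records:
        if checking_dups:
            if ts <= last_seen:
                continue            # treated as already processed
            checking_dups = False   # first genuinely new record
        elif last_seen is not None and ts < last_seen:
            continue                # out of sequence, ignored
        kept.append((ts, line))
        last_seen = ts
    return kept, last_seen
```

       Note how a record in the second log that shares its timestamp with the
       last record of the first log is skipped even if it was never counted,
       which is the duplicate timestamp caveat mentioned above.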

REVERSE DNS LOOKUPS
       The  Webalizer  supports	 reverse  DNS lookups through a DNS cache file
       that is either created/updated at run-time, or has been previously cre‐
       ated,  either  by  a  previous  run of the webalizer, or by running the
       stand-alone version, webazolver.	  In  order  to	 perform  reverse  DNS
       lookups,	 a  DNSCache  filename	must  be  specified.  In order to cre‐
       ate/update the cache file at run-time, the DNSChildren number  must  be
       non-zero.   The DNSChildren value specifies the number of children pro‐
       cesses to fork, each of which will perform reverse DNS lookups in order
       to create/update the DNS cache file.  See the file DNS.README for addi‐
       tional information.

COMMAND LINE OPTIONS
       The Webalizer supports many different configuration options  that  will
       alter  the way the program behaves and generates output.	 Most of these
       can be specified on the command line, while some can only be  specified
       in  a  configuration  file.  The command line options are listed below,
       with references to the corresponding configuration file keywords.

       General Options

       -h      Display all available command line options and exit program.

       -v -V   Display program version and exit program.

       -d      Debug.  Display debugging information for errors and warnings.

       -i      IgnoreHist.  Ignore history.  USE WITH CAUTION.  This causes
               The Webalizer to ignore only the previous monthly history file.
               Incremental data (if present) is still processed.

       -p      Incremental.  Preserve internal data between runs.

       -q      Quiet.  Suppress informational messages.  Does not suppress
               warnings or errors.

       -Q      ReallyQuiet.  Suppress all messages including warnings and
               errors.

       -T      TimeMe.	Force display of timing information at end of process‐
	       ing.

       -c file Use configuration file file.

       -n name HostName.  Use the hostname name.

       -o dir  OutputDir.  Use output directory dir.

       -t name ReportTitle.  Use name for report title.

       -F ( clf | ftp | squid )
	       LogType.	  Specify  log	type  to  be  processed.  Value can be
	       either clf, ftp	or  squid  format.   If	 not  specified,  will
	       default	to  CLF	 format.  FTP logs must be in standard wu-ftpd
	       xferlog format.

       -f      FoldSeqErr.  Fold out of sequence log records back into	analy‐
	       sis, by treating as if they were the same date/time as the last
	       good record.  Normally, out of sequence log records are	simply
	       ignored.

       -Y      CountryGraph.  Suppress country graph.

       -G      HourlyGraph.  Suppress hourly graph.

       -x name HTMLExtension.	Defines	 HTML  file  extension to use.	If not
	       specified, defaults  to	html.	Do  not	 include  the  leading
	       period.

       -H      HourlyStats.  Suppress hourly statistics.

       -L      GraphLegend.  Suppress color coded graph legends.

       -l num  GraphLines.   Specify number of background lines. Default is 2.
	       Use zero ('0') to disable the lines.

       -P name PageType.  Specify file extensions that are  considered	pages.
	       Sometimes referred to as pageviews.

       -m num  VisitTimeout.   Specify the Visit timeout period.  Specified in
	       number of seconds.  Default is 1800 seconds (30 minutes).

       -I name IndexAlias.  Use the filename name as an additional alias for
               index.*.

       -M num  MangleAgents.   Mangle user agent names according to the mangle
	       level specified by num.	Mangle levels are:

	       5   Browser name and major version.

	       4   Browser name, major and minor version.

	       3   Browser name, major version, minor version to  two  decimal
		   places.

	       2   Browser name, major and minor versions and sub-version.

	       1   Browser name, version and machine type if possible.

               0   All information (left unchanged).

       -g num  GroupDomains.  Automatically group sites by domain.  The group‐
	       ing level specified by num can be thought of as 'the number  of
	       dots'  to display in the grouping.  The default value of 0 dis‐
	       ables any domain grouping.

       -D name DNSCache.  Use the DNS cache file name.

       -N num  DNSChildren.  Use num DNS child processes to perform DNS
               lookups, either creating or updating the DNS cache file.
               Specify zero (0) to disable cache file creation/updates.  If
               given, a DNS cache filename must be specified.

       Hide Options

       -a name HideAgent.  Hide user agents matching name.

       -r name HideReferrer.  Hide referrer matching name.

       -s name HideSite.  Hide site matching name.

       -X      HideAllSites.  Hide all individual sites (only display groups).

       -u name HideURL.	 Hide URL matching name.

       Table size options

       -A num  TopAgents.  Display the top num user agents table.

       -R num  TopReferrers.  Display the top num referrers table.

       -S num  TopSites.  Display the top num sites table.

       -U num  TopURLs.  Display the top num URLs table.

       -C num  TopCountries.  Display the top num countries table.

       -e num  TopEntry.  Display the top num entry pages table.

       -E num  TopExit.	 Display the top num exit pages table.

CONFIGURATION FILES
       Configuration  files  are standard ascii(7) text files that may be cre‐
       ated or edited using any standard editor.  Blank lines and  lines  that
       begin with a pound sign ('#') are ignored.  Any other lines are
       considered to be configuration lines, and have the form "Keyword Value",
       where 'Keyword' is one of the currently available configuration
       keywords defined below, and 'Value' is the value to assign to that
       particular option.  Any text found after the keyword up to the end of
       the line is considered the keyword's value, so you should not include
       anything after the actual value on the line that is not actually part
       of the value being assigned.  The file sample.conf provided with the
       distribution contains further documentation and examples.
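
       The line format described above is simple enough to parse in a few
       lines.  The following Python sketch is illustrative only: parse_config
       is a hypothetical name, and stripping leading whitespace before testing
       for '#' is an assumption.  It returns a list rather than a dictionary
       because several keywords may appear on more than one line.

```python
def parse_config(text):
    """Parse 'Keyword Value' lines into a list of (keyword, value) pairs."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        # Everything after the keyword, up to the end of the line,
        # is the value; a trailing '# ...' is NOT treated as a comment.
        parts = line.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries
```

       Because the rest of the line is taken verbatim, a line such as
       "Quiet yes # be quiet" would assign the whole string "yes # be quiet"
       to Quiet in this sketch, which is why trailing comments should be
       avoided.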

       General Configuration Keywords

       LogFile name
	       Use  log	 file  named  name.   If none specified, STDIN will be
	       used.

       LogType name
               Specify log file type as name.  Values can be either clf, squid
               or ftp, with the default being clf.

       OutputDir dir
	       Create  output  in  the	directory dir.	If none specified, the
	       current directory will be used.

       HistoryName name
	       Filename to use for history file.  Relative to output directory
	       unless  absolute	 name is given (ie: starts with '/'). Defaults
               to 'webalizer.hist' in the standard output directory.

       ReportTitle name
               Use the title string name for the report title.  If none is
               specified, the default (in English) is "Usage Statistics for ".

       Hostname name
	       Set the hostname for the report as name.	 If none specified, an
	       attempt will be made to gather the hostname via a uname(2) sys‐
	       tem call.  If that fails, localhost will be used.

       UseHTTPS ( yes | no )
               Use https:// on links to URLs, instead of the default http://,
               in the 'Top URLs' table.

       Quiet ( yes | no )
               Suppress informational messages.  Warning and Error messages
               will not be suppressed.

       ReallyQuiet ( yes | no )
               Suppress all messages, including Warning and Error messages.

       Debug ( yes | no )
	       Print extra debugging information on Warnings and Errors.

       TimeMe ( yes | no )
	       Force timing information at end of processing.

       GMTTime ( yes | no )
	       Use GMT (UTC) time instead of local timezone for reports.

       IgnoreHist ( yes | no )
	       Ignore  previous monthly history file.  USE WITH CAUTION.  Does
	       not prevent Incremental file processing.

       FoldSeqErr ( yes | no )
	       Fold out of sequence log records back into analysis by treating
	       them as if they had the same date/time as the last good record.
	       Normally, out of sequence log records are ignored.

       CountryGraph ( yes | no )
	       Display Country Usage Graph in output report.

       DailyGraph ( yes | no )
	       Display Daily Graph in output report.

       DailyStats ( yes | no )
	       Display Daily Statistics in output report.

       HourlyGraph ( yes | no )
	       Display Hourly Graph in output report.

       HourlyStats ( yes | no )
	       Display Hourly Statistics in output report.

       PageType name
	       Define the file extensions to consider as a page.  If a file is
	       found to have the same extension as name, it will be counted as
	       a page (sometimes called a pageview).

       GraphLegend ( yes | no )
	       Allows the color coded graph legends to be enabled/disabled.

       GraphLines num
	       Specify the number of background reference lines	 displayed  on
	       the  graphs  produced.  Disable by using zero ('0'), default is
	       2.

       VisitTimeout num
	       Specifies the visit timeout value.  Default is 1800 seconds (30
	       minutes).   A  visit is determined by looking at the difference
	       in time between the current and last request  from  a  specific
               site.  If the difference is greater than or equal to the
               timeout value, the request is counted as a new visit.
               Specified in seconds.

       IndexAlias name
	       Use name as an additional alias for index.*.

       MangleAgents num
	       Mangle  user agent names based on mangle level num.  See the -M
	       command line switch for mangle levels and their	meaning.   The
	       default is 0, which doesn't mangle user agents at all.

       SearchEngine name variable
	       Allows  the  specification  of  search  engines and their query
	       strings.	 The name is the name to match	against	 the  referrer
	       string  for  a  given  search  engine.  The variable is the cgi
	       variable that the search engine uses for queries.  See the sam‐
	       ple.conf file for example usage with common search engines.

       Incremental ( yes | no )
	       Enable Incremental mode processing.

       IncrementalName name
	       Filename	 to  use  for  incremental  data.   Relative to output
	       directory unless an absolute name is  given  (ie:  starts  with
	       '/').   Defaults	 to ´webalizer.current' in the standard output
	       directory.

       DNSCache name
	       Filename to use for the DNS cache.  Relative to	output	direc‐
	       tory unless an absolute name is given (ie: starts with '/').

       DNSChildren num
	       Number  of  children  DNS  processes  to	 run  in order to cre‐
	       ate/update the DNS cache file.  Specify zero (0) to disable.
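
       The visit rule given under VisitTimeout above can be sketched in
       Python.  The function name and input shape are hypothetical, and the
       sketch assumes that the first request seen from a site always starts a
       visit.

```python
def count_visits(requests, timeout=1800):
    """Count visits from (site, timestamp_in_seconds) pairs in log order."""
    last_request = {}   # site -> timestamp of its most recent request
    visits = 0
    for site, ts in requests:
        previous = last_request.get(site)
        # A new visit starts when the gap since the site's last request
        # is greater than or equal to the timeout.
        if previous is None or ts - previous >= timeout:
            visits += 1
        last_request[site] = ts
    return visits
```

       With the default of 1800 seconds, requests from one site at 0, 1799 and
       3599 seconds count as two visits: the second request falls within the
       timeout, while the third arrives exactly 1800 seconds after it.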

       Top Table Keywords

       TopAgents num
	       Display the top num User Agents table. Use zero to disable.

       AllAgents ( yes | no )
               Create separate HTML page with All User Agents.

       TopReferrers num
	       Display the top num Referrers table. Use zero to disable.

       AllReferrers ( yes | no )
               Create separate HTML page with All Referrers.

       TopSites num
	       Display the top num Sites table. Use zero to disable.

       TopKSites num
	       Display the top num Sites (by KByte) table.  Use zero  to  dis‐
	       able.

       AllSites ( yes | no )
               Create separate HTML page with All Sites.

       TopURLs num
	       Display the top num URLs table. Use zero to disable.

       TopKURLs num
	       Display	the  top  num URLs (by KByte) table.  Use zero to dis‐
	       able.

       AllURLs ( yes | no )
               Create separate HTML page with All URLs.

       TopCountries num
	       Display the top num Countries in the table. Use	zero  to  dis‐
	       able.

       TopEntry num
	       Display the top num Entry Pages in the table.  Use zero to dis‐
	       able.

       TopExit num
	       Display the top num Exit Pages in the table.  Use zero to  dis‐
	       able.

       TopSearch num
	       Display	the  top num Search Strings in the table.  Use zero to
	       disable.

       AllSearchStr ( yes | no )
               Create separate HTML page with All Search Strings.

       TopUsers num
	       Display the top num Usernames in the table.  Use zero  to  dis‐
	       able.  Usernames are only available if using http based authen‐
	       tication.

       AllUsers ( yes | no )
               Create separate HTML page with All Usernames.

       Hide/Ignore/Group/Include Keywords

       HideAgent name
	       Hide User Agents that match name.

       HideReferrer name
	       Hide Referrers that match name.

       HideSite name
	       Hide Sites that match name.

       HideAllSites ( yes | no )
	       Hide all individual sites.  This causes only grouped  sites  to
	       be displayed.

       HideURL name
               Hide URLs that match name.

       HideUser name
	       Hide Usernames that match name.

       IgnoreAgent name
	       Ignore User Agents that match name.

       IgnoreReferrer name
	       Ignore Referrers that match name.

       IgnoreSite name
	       Ignore Sites that match name.

       IgnoreURL name
               Ignore URLs that match name.

       IgnoreUser name
	       Ignore Usernames that match name.

       GroupAgent name [Label]
	       Group  User  Agents  that  match	 name.	 Display Label in 'Top
	       Agent' table if given (instead of name).

       GroupReferrer name [Label]
	       Group Referrers that match name.	 Display Label in 'Top	Refer‐
	       rer' table if given (instead of name).

       GroupSite name [Label]
	       Group Sites that match name.  Display Label in 'Top Site' table
	       if given (instead of name).

       GroupDomains num
	       Automatically group sites by domain.  The value	num  specifies
	       the  level of grouping, and can be thought of as the 'number of
	       dots' to be displayed.  The default value of 0 disables	domain
	       grouping.

       GroupURL name [Label]
               Group URLs that match name.  Display Label in 'Top URL' table
	       if given (instead of name).

       GroupUser name [Label]
	       Group Usernames that match name.	 Display Label in  'Top	 User‐
	       names' table if given (instead of name).

       IncludeSite name
	       Force  inclusion	 of  sites  that match name.  Takes precedence
               over Ignore* keywords.

       IncludeURL name
               Force inclusion of URLs that match name.  Takes precedence
               over Ignore* keywords.

       IncludeReferrer name
	       Force inclusion of Referrers that match name.  Takes precedence
               over Ignore* keywords.

       IncludeAgent name
	       Force inclusion of User Agents that match name.	 Takes	prece‐
	       dence over Ignore* keywords.

       IncludeUser name
	       Force inclusion of Usernames that match name.  Takes precedence
	       over Ignore* keywords.
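
       The 'number of dots' description given for GroupDomains (and the -g
       option) can be made concrete with a small Python sketch.  This is an
       interpretation of the text, not the program's code: group_domain is a
       hypothetical name, and the sketch does not treat unresolved numeric
       addresses specially, which the real program may do.

```python
def group_domain(hostname, level):
    """Return the domain group for hostname, keeping `level` dots.

    Returns None when grouping is disabled (level 0).
    """
    if level <= 0:
        return None
    labels = hostname.split(".")
    # Keeping N dots means keeping the last N + 1 labels.
    if len(labels) <= level + 1:
        return hostname
    return ".".join(labels[-(level + 1):])
```

       Under this reading, a level of 1 groups www.example.com under
       example.com, while a level of 2 keeps www.example.com intact.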

       HTML Generation Keywords

       HTMLExtension text
	       Defines the HTML file extension to use.	Default is  html.   Do
	       not include the leading period!

       HTMLPre text
	       Insert  text  at the very beginning of the generated HTML file.
	       Defaults to a standard html 3.2 DOCTYPE record.

       HTMLHead text
	       Insert text within the <HEAD></HEAD> block of the HTML file.

       HTMLBody text
	       Insert text in HTML page, starting with	the  <BODY>  tag.   If
	       used,  the first line must be a <BODY ...> tag.	Multiple lines
	       may be specified.

       HTMLPost text
	       Insert text at top (before horiz. rule) of HTML pages.	Multi‐
	       ple lines may be specified.

       HTMLTail text
	       Insert  text  at	 bottom of the HTML page.  The text is top and
	       right aligned within a table column at the end of the report.

       HTMLEnd text
	       Insert text at the very end of the HTML page.   If  not	speci‐
	       fied,  the  default is to insert the ending </BODY> and </HTML>
	       tags.  If used, you must supply these tags yourself.

       Dump Object Keywords

       The Webalizer allows you to export processed data to other programs  by
       using tab delimited text files.	The Dump* commands specify which files
       are to be written, and where.

       DumpPath name
	       Save dump files in  directory  name.   If  not  specified,  the
               default output directory will be used.  Do not specify a
               trailing slash (/).

       DumpExtension name
	       Use name as the filename extension  for	dump  files.   If  not
	       given, the default of tab will be used.

       DumpHeader ( yes | no )
	       Print a column header as the first record of the file.

       DumpSites ( yes | no )
	       Dump the sites data to a tab delimited file.

       DumpURLs ( yes | no )
	       Dump the url data to a tab delimited file.

       DumpReferrers ( yes | no )
               Dump the referrer data to a tab delimited file.  This data is
	       only available if using a log that contains  referrer  informa‐
	       tion (ie: a combined format web log).

       DumpAgents ( yes | no )
	       Dump the user agent data to a tab delimited file.  This data is
	       only available if using a log that contains user agent informa‐
	       tion (ie: a combined format web log).

       DumpUsers ( yes | no )
	       Dump  the  username data to a tab delimited file.  This data is
	       only available if processing a wu-ftpd xferlog  or  a  web  log
	       that contains http authentication information.

       DumpSearchStr ( yes | no )
	       Dump the search string data to a tab delimited file.  This data
	       is only available if processing a web log that contains	refer‐
	       rer information and had search string information present.
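
       Since the dump files are tab delimited text, they can be read by most
       tools.  The Python sketch below shows one way to load such a file; the
       function name is hypothetical, and the column layout is not described
       in this page, so the sketch makes no assumption about it beyond the
       optional header record written when DumpHeader is enabled.

```python
import csv


def read_dump(path, has_header=True):
    """Read a tab delimited dump file and return (header, rows).

    header is None when has_header is False or the file is empty.
    """
    with open(path, newline="", encoding="latin-1") as f:
        # QUOTE_NONE keeps any quote characters in the data as-is.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = list(reader)
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows
```

       Pass has_header=False for files written with DumpHeader set to no, so
       that the first data record is not mistaken for column names.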

FILES
       webalizer.conf	   Default configuration file.	Is searched for in the
			   current directory and if not found,	in  the	 /etc/
			   directory.

       webalizer.hist	   Monthly  history file for previous 12 months.  (can
			   be changed)

       webalizer.current   Current state data file  (Incremental  processing).
			   (can be changed)

       xxxxx_YYYYMM.html   Various monthly HTML output files produced. (exten‐
			   sion can be changed)

       xxxxx_YYYYMM.png	   Various monthly image files used in the reports.

       xxxxx_YYYYMM.tab	   Monthly tab delimited text files.   (extension  can
			   be changed)

BUGS
       Report bugs to brad@mrunix.net.

COPYRIGHT
       Copyright  (C) 1997-2000 by Bradford L. Barrett.	 Distributed under the
       GNU GPL.	 See the files "COPYING" and "Copyright",  supplied  with  all
       distributions for additional information.

AUTHOR
       Bradford L. Barrett <brad@mrunix.net>

Version 2.01			  22-Oct-2001			  webalizer(1)