rwgroup(1)			SiLK Tool Suite			    rwgroup(1)

NAME
       rwgroup - Tag similar SiLK records with a common next hop IP value

SYNOPSIS
	 rwgroup
	       {--id-fields=KEY | --delta-field=FIELD --delta-value=DELTA}
	       [--objective] [--summarize] [--rec-threshold=THRESHOLD]
	       [--group-offset=IP]
	       [--note-add=TEXT] [--note-file-add=FILE] [--output-path=PATH]
	       [--copy-input=PATH] [--compression-method=COMP_METHOD]
	       [--site-config-file=FILENAME]
	       [--plugin=PLUGIN [--plugin=PLUGIN ...]]
	       [--python-file=PATH [--python-file=PATH ...]]
	       [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [FILE]

	 rwgroup [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--plugin=PLUGIN ...] [--python-file=PATH ...] --help

	 rwgroup [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
	       [--plugin=PLUGIN ...] [--python-file=PATH ...] --help-fields

	 rwgroup --version

DESCRIPTION
       rwgroup reads sorted SiLK Flow records (cf. rwsort(1)) from the
       standard input or from a single file name listed on the command line,
       marks records that form a group with an identifier in the Next Hop IP
       field, and prints the binary SiLK Flow records to the standard output.
       In some ways rwgroup is similar to rwuniq(1), but rwgroup writes SiLK
       flow records instead of textual output.

       Two SiLK records are defined as being in the same group when the fields
       specified in the --id-fields switch match exactly and when the field
       listed in the --delta-field switch matches within the value given by
       the --delta-value switch.  Either --id-fields or --delta-field is
       required; both may be specified.  A --delta-value must be given when
       --delta-field is present.

       The first group of records gets the identifier 0, and rwgroup writes
       that value into each record's Next Hop IP field.  The ID for each
       subsequent group is incremented by 1.  The --group-offset switch may be
       used to set the identifier of the initial group.
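
The grouping rule described above can be sketched in Python.  This is
only an illustration of the semantics, not rwgroup's implementation:
records are modeled as plain dictionaries, the returned integers stand
in for the value written to the Next Hop IP field, and the name
assign_groups and its parameters are hypothetical.

```python
def assign_groups(records, id_fields=(), delta_field=None,
                  delta_value=0, offset=0):
    """Return one group ID per record of an already-sorted list.

    Illustrative model only: each record is compared with the record
    that precedes it (the default, non --objective behavior).
    """
    ids = []
    group_id = offset
    prev = None
    for rec in records:
        if prev is not None:
            same_key = all(rec[f] == prev[f] for f in id_fields)
            within = (delta_field is None or
                      abs(rec[delta_field] - prev[delta_field]) <= delta_value)
            if not (same_key and within):
                group_id += 1          # this record starts a new group
        ids.append(group_id)
        prev = rec
    return ids


flows = [
    {"dIP": "10.0.0.1", "sTime": 100},
    {"dIP": "10.0.0.1", "sTime": 102},
    {"dIP": "10.0.0.1", "sTime": 110},
    {"dIP": "10.0.0.2", "sTime": 111},
]
print(assign_groups(flows, id_fields=("dIP",),
                    delta_field="sTime", delta_value=3))
# [0, 0, 1, 2]
```

Passing offset=5 to the same call yields [5, 5, 6, 7], mirroring the
effect described for --group-offset.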

       The --rec-threshold switch may be used to only write groups that
       contain a certain number of records.  The --summarize switch attempts
       to merge records in the same group to a single output record.

       rwgroup requires that the records are sorted on the fields listed in
       the --id-fields and --delta-field switches.  For example, a call using

	 rwgroup --id-field=2 --delta-field=9 --delta-value=3

       should read the output of

	 rwsort --field=2,9

       otherwise the results are unpredictable.

OPTIONS
       Option names may be abbreviated if the abbreviation is unique or is an
       exact match for an option.  A parameter to an option may be specified
       as --arg=param or --arg param, though the first form is required for
       options that take optional parameters.

       At least one value for --id-fields or --delta-field must be provided;
       rwgroup terminates with an error if no fields are specified.

       --id-fields=KEY
	   KEY contains the list of flow attributes (a.k.a. fields or columns)
	   that must match exactly for flows to be considered part of the same
	   group.  Each field may be specified once only.  KEY is a
	   comma-separated list of field-names, field-integers, and ranges of
	   field-integers; a range is specified by separating the start and end
	   of the range with a hyphen (-).  Field-names are case insensitive.
	   Example:

	    --id-fields=stime,10,1-5

	   There is no default value for the --id-fields switch.
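
As a rough illustration of how a KEY such as the one above expands,
the following Python sketch turns names, integers, and ranges into a
list of field numbers.  The function name parse_key is hypothetical,
the lookup table covers only fields 1 through 12 from the list that
follows, and raising an error on a repeated field is an assumption
based on the "once only" rule; rwgroup's own parser may behave
differently.

```python
# Subset of the built-in fields (names are matched case-insensitively).
FIELD_NUMBERS = {
    "sip": 1, "dip": 2, "sport": 3, "dport": 4, "protocol": 5,
    "packets": 6, "pkts": 6, "bytes": 7, "flags": 8,
    "stime": 9, "duration": 10, "etime": 11, "sensor": 12,
}


def parse_key(key):
    """Expand a comma-separated KEY into a list of field numbers."""
    fields = []
    for item in key.split(","):
        item = item.strip()
        if "-" in item and item.replace("-", "").isdigit():
            start, end = (int(part) for part in item.split("-"))
            numbers = range(start, end + 1)      # inclusive range
        elif item.isdigit():
            numbers = [int(item)]
        else:
            numbers = [FIELD_NUMBERS[item.lower()]]
        for number in numbers:
            if number in fields:
                raise ValueError("field %d specified more than once" % number)
            fields.append(number)
    return fields


print(parse_key("stime,10,1-5"))   # [9, 10, 1, 2, 3, 4, 5]
```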

	   The complete list of built-in fields that the SiLK tool suite
	   supports follows, though note that not all fields are present in
	   all SiLK file formats; when a field is not present, its value is 0.

	   sIP,1
	       source IP address

	   dIP,2
	       destination IP address

	   sPort,3
	       source port for TCP and UDP, or equivalent

	   dPort,4
	       destination port for TCP and UDP, or equivalent

	   protocol,5
	       IP protocol

	   packets,pkts,6
	       packet count

	   bytes,7
	       byte count

	   flags,8
	       bit-wise OR of TCP flags over all packets

	   sTime,9
	       starting time of flow (seconds resolution)

	   duration,10
	       duration of flow (seconds resolution)

	   eTime,11
	       end time of flow (seconds resolution)

	   sensor,12
	       name or ID of sensor at the collection point

	   class,20
	       class of sensor at the collection point

	   type,21
	       type of sensor at the collection point

	   iType
	       the ICMP type value for ICMP or ICMPv6 flows and zero for non-
	       ICMP flows.  Internally, SiLK stores the ICMP type and code in
	       the "dPort" field, so there is no need to have both "dPort" and
	       "iType" or "iCode" in the sort key.  This field was introduced
	       in SiLK 3.8.1.

	   iCode
	       the ICMP code value for ICMP or ICMPv6 flows and zero for non-
	       ICMP flows.  See note at "iType".

	   icmpTypeCode,25
	       equivalent to "iType","iCode" in --id-fields.  This field may
	       not be mixed with "iType" or "iCode", and this field is
	       deprecated as of SiLK 3.8.1.  As of SiLK 3.8.1, "icmpTypeCode"
	       may no longer be used as the argument to --delta-field; the
	       "dPort" field will provide an equivalent result as long as the
	       input is limited to ICMP flow records.

	   Many SiLK file formats do not store the following fields and their
	   values will always be 0; they are listed here for completeness:

	   in,13
	       router SNMP input interface or vlanId if packing tools were
	       configured to capture it (see sensor.conf(5))

	   out,14
	       router SNMP output interface or postVlanId

	   SiLK can store flows generated by enhanced collection software that
	   provides more information than NetFlow v5.  These flows may support
	   some or all of these additional fields; for flows without this
	   additional information, the field's value is always 0.

	   initialFlags,26
	       TCP flags on first packet in the flow

	   sessionFlags,27
	       bit-wise OR of TCP flags over all packets except the first in
	       the flow

	   attributes,28
	       flow attributes set by the flow generator:

	       "S" all the packets in this flow record are exactly the same
		   size

	       "F" flow generator saw additional packets in this flow
		   following a packet with a FIN flag (excluding ACK packets)

	       "T" flow generator prematurely created a record for a long-
		   running connection due to a timeout.  (When the flow
		   generator yaf(1) is run with the --silk switch, it will
		   prematurely create a flow and mark it with "T" if the byte
		   count of the flow cannot be stored in a 32-bit value.)

	       "C" flow generator created this flow as a continuation of a
		   long-running connection, where the previous flow for this
		   connection met a timeout (or a byte threshold in the case
		   of yaf).

	       Consider a long-running ssh session that exceeds the flow
	       generator's active timeout.  (This is the active timeout since
	       the flow generator creates a flow for a connection that still
	       has activity).  The flow generator will create multiple flow
	       records for this ssh session, each spanning some portion of the
	       total session.  The first flow record will be marked with a "T"
	       indicating that it hit the timeout.  The second through next-
	       to-last records will be marked with "TC" indicating that this
	       flow both timed out and is a continuation of a flow that timed
	       out.  The final flow will be marked with a "C", indicating that
	       it was created as a continuation of an active flow.
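
The marking pattern in the ssh example can be written out as a small
Python sketch.  The function name continuation_marks is hypothetical,
and the sketch models only the "T" and "C" letters for one connection
that the flow generator split at its active timeout.

```python
def continuation_marks(segment_count):
    """Return the "T"/"C" attribute letters for each record of a single
    connection that was split into segment_count flow records."""
    if segment_count < 1:
        raise ValueError("a connection produces at least one record")
    marks = []
    for index in range(segment_count):
        mark = ""
        if index < segment_count - 1:
            mark += "T"     # this record was closed by the timeout
        if index > 0:
            mark += "C"     # this record continues an earlier one
        marks.append(mark)
    return marks


print(continuation_marks(4))   # ['T', 'TC', 'TC', 'C']
```

A connection that fits in one record gets neither letter.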

	   application,29
	       guess as to the content of the flow.  Some software that
	       generates flow records from packet data, such as yaf, will
	       inspect the contents of the packets that make up a flow and use
	       traffic signatures to label the content of the flow.  SiLK
	       calls this label the application; yaf refers to it as the
	       appLabel.  The application is the port number that is
	       traditionally used for that type of traffic (see the
	       /etc/services file on most UNIX systems).  For example, traffic
	       that the flow generator recognizes as FTP will have a value of
	       21, even if that traffic is being routed through the standard
	       HTTP/web port (80).

	   The following fields provide a way to label the IPs or ports on a
	   record.  These fields require external files to provide the mapping
	   from the IP or port to the label:

	   sType,16
	       categorize the source IP address as "non-routable", "internal",
	       or "external" and group based on the category.  Uses the
	       mapping file specified by the SILK_ADDRESS_TYPES environment
	       variable, or the address_types.pmap mapping file, as described
	       in addrtype(3).

	   dType,17
	       as sType for the destination IP address

	   scc,18
	       the country code of the source IP address.  Uses the mapping
	       file specified by the SILK_COUNTRY_CODES environment variable,
	       or the country_codes.pmap mapping file, as described in
	       ccfilter(3).

	   dcc,19
	       as scc for the destination IP

	   src-MAPNAME
	       value determined by passing the source IP or the
	       protocol/source-port to the user-defined mapping defined in the
	       prefix map associated with MAPNAME.  See the description of the
	       --pmap-file switch below and the pmapfilter(3) manual page.

	   dst-MAPNAME
	       as src-MAPNAME for the destination IP or
	       protocol/destination-port.

	   sval
	   dval
	       These are deprecated field names created by pmapfilter that
	       correspond to src-MAPNAME and dst-MAPNAME, respectively.  These
	       fields are available when a prefix map is used that is not
	       associated with a MAPNAME.

	   Finally, the list of built-in fields may be augmented by the run-
	   time loading of PySiLK code or plug-ins written in C (also called
	   shared object files or dynamic libraries), as described by the
	   --python-file and --plugin switches.

       --delta-field=FIELD
	   Specify a single field that can differ by a specified delta-value
	   among the SiLK records that make up a group.  The FIELD identifiers
	   include most of those specified for --id-fields.  The exceptions
	   are that plug-in fields are not supported, nor are fields that do
	   not have numeric values (e.g., class, type, flags).  The most
	   common value for this switch is "stime", which allows records that
	   are identical in the id-fields but temporally far apart to be in
	   different groups.  The switch takes a single argument; multiple
	   delta fields cannot be specified.  When this switch is specified,
	   the --delta-value switch is required.

       --delta-value=DELTA_VALUE
	   Specify the acceptable difference between the values of the
	   --delta-field.  The --delta-value switch is required when the
	   --delta-field switch is provided.  For fields other than those
	   holding IPs, two consecutive records are considered members of the
	   same group when their values differ by no more than DELTA_VALUE.
	   When the delta-field refers to an IP field, DELTA_VALUE is the
	   number of least significant bits of the IPs to remove before
	   comparing them.  For example, when --delta-field=sIP
	   --delta-value=8 is specified, two records are in the same group if
	   their source IPv4 addresses belong to the same /24 or if their
	   source IPv6 addresses belong to the same /120.  The --objective
	   switch affects the meaning of this switch.
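
The IP comparison can be illustrated with Python's ipaddress module.
This is a sketch of the bit-dropping rule only; the function name
same_ip_group is hypothetical, and treating an IPv4 address and an
IPv6 address as never matching is an assumption of the sketch, not a
statement about rwgroup.

```python
import ipaddress


def same_ip_group(ip_a, ip_b, delta_value):
    """Compare two addresses after dropping delta_value low-order bits."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    if a.version != b.version:
        return False            # assumption: mixed versions never match
    return (int(a) >> delta_value) == (int(b) >> delta_value)


print(same_ip_group("192.0.2.1", "192.0.2.254", 8))       # True  (same /24)
print(same_ip_group("192.0.2.1", "192.0.3.1", 8))         # False
print(same_ip_group("2001:db8::1", "2001:db8::ff", 8))    # True  (same /120)
```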

       --objective
	   Change the behavior of the --delta-value switch so that a record is
	   considered part of a group if the value of its --delta-field is
	   within the DELTA_VALUE of the first record in the group.  (When
	   this switch is not specified, consecutive records are compared.)
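
The difference between the two comparison modes is easiest to see on
a chain of values that each lie close to their neighbor.  The
following Python sketch models a single numeric delta field; the
function name group_ids is hypothetical and the code is illustrative
rather than a description of rwgroup's internals.

```python
def group_ids(values, delta, objective=False):
    """Assign group numbers to a sorted list of delta-field values."""
    ids = []
    group = 0
    reference = None
    for value in values:
        if reference is not None and abs(value - reference) > delta:
            group += 1
            reference = value      # first record of the new group
        elif reference is None or not objective:
            reference = value      # default: compare with previous record
        ids.append(group)
    return ids


times = [0, 8, 16, 24]
print(group_ids(times, 10))                   # [0, 0, 0, 0]
print(group_ids(times, 10, objective=True))   # [0, 0, 1, 1]
```

Without --objective the chain never breaks because each value is
within 10 of its predecessor; with --objective the third value is
more than 10 from the first record of its group and starts a new one.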

       --summarize
	   Cause rwgroup to print (typically) a single record for each group.
	   By default, all records in each group having at least
	   --rec-threshold members are printed.  When --summarize is active,
	   the record that is written for the group is the first record in the
	   group with the following modifications:

	   ·   The packets and bytes values are the sum of the packets and
	       bytes values, respectively, for all records in the group.

	   ·   The start-time value is the earliest start time for the records
	       in the group.

	   ·   The end-time value is the latest end time for the records in
	       the group.

	   ·   The flags and session-flags values are the bitwise-OR of all
	       flags and session-flags values, respectively, for the records
	       in the group.

	   Note that multiple records for a group may be printed if the bytes,
	   packets, or elapsed time values are too large to be stored in a
	   SiLK flow record.
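
The merge rules listed above can be sketched in Python as follows.
Records are modeled as dictionaries with hypothetical key names, the
function name summarize is likewise hypothetical, and the sketch does
not model the overflow case in which more than one record is written
for a group.

```python
def summarize(group):
    """Merge the records of one group into a single record."""
    merged = dict(group[0])        # start from the first record
    merged["packets"] = sum(r["packets"] for r in group)
    merged["bytes"] = sum(r["bytes"] for r in group)
    merged["sTime"] = min(r["sTime"] for r in group)
    merged["eTime"] = max(r["eTime"] for r in group)
    flags = 0
    session_flags = 0
    for r in group:
        flags |= r["flags"]
        session_flags |= r["sessionFlags"]
    merged["flags"] = flags
    merged["sessionFlags"] = session_flags
    return merged


group = [
    {"sIP": "10.0.0.1", "packets": 3, "bytes": 180, "sTime": 100,
     "eTime": 105, "flags": 0x02, "sessionFlags": 0x10},
    {"sIP": "10.0.0.1", "packets": 5, "bytes": 400, "sTime": 103,
     "eTime": 120, "flags": 0x18, "sessionFlags": 0x01},
]
print(summarize(group))
```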

       --plugin=PLUGIN
	   Augment the list of fields by using run-time loading of the plug-in
	   (shared object) whose path is PLUGIN.  The switch may be repeated
	   to load multiple plug-ins.  The creation of plug-ins is described
	   in the silk-plugin(3) manual page.  When PLUGIN does not contain a
	   slash ("/"), rwgroup will attempt to find a file named PLUGIN in
	   the directories listed in the "FILES" section.  If rwgroup finds
	   the file, it uses that path.  If PLUGIN contains a slash or if
	   rwgroup does not find the file, rwgroup relies on your operating
	   system's dlopen(3) call to find the file.  When the
	   SILK_PLUGIN_DEBUG environment variable is non-empty, rwgroup prints
	   status messages to the standard error as it attempts to find and
	   open each of its plug-ins.
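
The lookup order described above can be summarized with a short
Python sketch.  The function name resolve_plugin and the exists
parameter (a stand-in for a file-existence check, so the logic can be
exercised without touching the file system) are hypothetical; the
actual search is performed by rwgroup and dlopen(3).

```python
import os
import posixpath


def resolve_plugin(plugin, search_dirs, exists=os.path.exists):
    """Return the path that would be handed to the dynamic loader."""
    if "/" not in plugin:
        for directory in search_dirs:
            candidate = posixpath.join(directory, plugin)
            if exists(candidate):
                return candidate       # found in a listed directory
    # Contains a slash, or was not found: leave it to dlopen(3).
    return plugin


dirs = ["/usr/local/lib/silk", "/usr/local/lib"]
found = lambda path: path == "/usr/local/lib/example.so"
print(resolve_plugin("example.so", dirs, exists=found))
# /usr/local/lib/example.so
```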

       --rec-threshold=THRESHOLD
	   Specify the minimum number of SiLK records a group must contain
	   before the records in the group are written to the output stream.
	   The default is 1; i.e., write all records.  The maximum threshold
	   is 65535.
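
As an illustration of the threshold, the Python sketch below keeps
only the records whose group has enough members.  The input is a
list of (group, record) pairs in group order; the function name
apply_threshold is hypothetical, and the sketch makes no claim about
how rwgroup numbers the groups that remain.

```python
from itertools import groupby


def apply_threshold(tagged, threshold=1):
    """Return the records of every group with at least threshold members."""
    if not 1 <= threshold <= 65535:
        raise ValueError("threshold must be between 1 and 65535")
    kept = []
    for _, members in groupby(tagged, key=lambda pair: pair[0]):
        members = list(members)
        if len(members) >= threshold:
            kept.extend(record for _, record in members)
    return kept


tagged = [(0, "a"), (0, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]
print(apply_threshold(tagged, threshold=2))   # ['a', 'b', 'd', 'e', 'f']
```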

       --group-offset=IP
	   Specify the value to write into the Next Hop IP for the records
	   that comprise the first group.  The value IP may be an integer, or
	   an IPv4 or IPv6 address in the canonical presentation form.  If not
	   specified, counting begins at 0.  The value for each subsequent
	   group is incremented by 1.
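
The sequence of values produced for successive groups can be
previewed with Python's ipaddress module.  The function name
group_ips is hypothetical, and showing an integer offset as an IPv4
address is an assumption of this sketch.

```python
import ipaddress


def group_ips(offset="0", count=3):
    """List the Next Hop IP values for the first count groups."""
    text = str(offset)
    start = ipaddress.ip_address(int(text) if text.isdigit() else text)
    return [str(start + i) for i in range(count)]


print(group_ips())                  # ['0.0.0.0', '0.0.0.1', '0.0.0.2']
print(group_ips("10.0.0.254"))      # ['10.0.0.254', '10.0.0.255', '10.0.1.0']
print(group_ips("2001:db8::", 2))   # ['2001:db8::', '2001:db8::1']
```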

       --note-add=TEXT
	   Add the specified TEXT to the header of the output file as an
	   annotation.  This switch may be repeated to add multiple
	   annotations to a file.  To view the annotations, use the
	   rwfileinfo(1) tool.

       --note-file-add=FILENAME
	   Open FILENAME and add the contents of that file to the header of
	   the output file as an annotation.  This switch may be repeated to
	   add multiple annotations.  Currently the application makes no
	   effort to ensure that FILENAME contains text; be careful that you
	   do not attempt to add a SiLK data file as an annotation.

       --copy-input=PATH
	   Copy all binary input to the specified file or named pipe.  PATH
	   can be "stdout" to print flows to the standard output as long as
	   the --output-path switch has been used to redirect rwgroup's
	   output.

       --output-path=PATH
	   Determines where the output of rwgroup is written.  If this option
	   is not given, output is written to the standard output.

       --compression-method=COMP_METHOD
	   Specify how to compress the output.  When this switch is not given,
	   output to the standard output or to named pipes is not compressed,
	   and output to files is compressed using the default chosen when
	   SiLK was compiled.  The valid values for COMP_METHOD are determined
	   by which external libraries were found when SiLK was compiled.  To
	   see the available compression methods and the default method, use
	   the --help or --version switch.  SiLK can support the following
	   COMP_METHOD values when the required libraries are available.

	   none
	       Do not compress the output using an external library.

	   zlib
	       Use the zlib(3) library for compressing the output, and always
	       compress the output regardless of the destination.  Using zlib
	       produces the smallest output files at the cost of speed.

	   lzo1x
	       Use the lzo1x algorithm from the LZO real time compression
	       library for compression, and always compress the output
	       regardless of the destination.  This method provides good
	       compression with less memory and CPU overhead.

	   best
	       Use lzo1x if available, otherwise use zlib.  Only compress the
	       output when writing to a file.
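
The rules above amount to a small decision table, sketched here in
Python.  The function name effective_compression and all of its
parameters are hypothetical; in particular, the sketch does not model
what happens when "best" is requested and neither library was
available at compile time.

```python
def effective_compression(method, available, compiled_default, to_file):
    """Return the compression method that would apply to the output.

    method is None when --compression-method is not given; available
    is the set of methods whose libraries were found at compile time.
    """
    if method is None:
        # Standard output and named pipes are left uncompressed.
        return compiled_default if to_file else "none"
    if method == "none":
        return "none"
    if method == "best":
        if not to_file:
            return "none"
        return "lzo1x" if "lzo1x" in available else "zlib"
    if method not in available:
        raise ValueError("compression method not available: " + method)
    return method            # zlib and lzo1x always compress


print(effective_compression("best", {"zlib"}, "none", to_file=True))   # zlib
```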

       --site-config-file=FILENAME
	   Read the SiLK site configuration from the named file FILENAME.
	   When this switch is not provided, rwgroup searches for the site
	   configuration file in the locations specified in the "FILES"
	   section.

       --help
	   Print the available options and exit.  Specifying switches that add
	   new fields or additional switches before --help will allow the
	   output to include descriptions of those fields or switches.

       --help-fields
	   Print the description and alias(es) of each field and exit.
	   Specifying switches that add new fields before --help-fields will
	   allow the output to include descriptions of those fields.

       --version
	   Print the version number and information about how SiLK was
	   configured, then exit the application.

       --pmap-file=MAPNAME:PATH
       --pmap-file=PATH
	   Instruct rwgroup to load the mapping file located at PATH and
	   create the src-MAPNAME and dst-MAPNAME fields.  When MAPNAME is
	   provided explicitly, it will be used to refer to the fields
	   specific to that prefix map.  If MAPNAME is not provided, rwgroup
	   will check the prefix map file to see if a map-name was specified
	   when the file was created.  If no map-name is available, rwgroup
	   creates the fields sval and dval.  Multiple --pmap-file switches
	   are supported as long as each uses a unique value for map-name.
	   The --pmap-file switch(es) must precede the --id-fields switch.
	   For more information, see pmapfilter(3).
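
How the field names follow from the switch argument can be sketched
in Python.  The function name pmap_field_names, the embedded_name
parameter (standing in for a map-name stored in the prefix map file),
and the example paths are all hypothetical; a PATH that itself
contains a colon is not handled by this sketch.

```python
def pmap_field_names(argument, embedded_name=None):
    """Return (path, (source_field, destination_field)) for one
    --pmap-file argument of the form MAPNAME:PATH or PATH."""
    mapname, separator, path = argument.partition(":")
    if not separator:
        # No explicit MAPNAME: fall back to the name in the file, if any.
        path, mapname = argument, embedded_name
    if mapname:
        return path, ("src-" + mapname, "dst-" + mapname)
    return path, ("sval", "dval")


print(pmap_field_names("service:/tmp/svc.pmap"))
# ('/tmp/svc.pmap', ('src-service', 'dst-service'))
print(pmap_field_names("/tmp/svc.pmap"))
# ('/tmp/svc.pmap', ('sval', 'dval'))
```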

       --python-file=PATH
	   When the SiLK Python plug-in is used, rwgroup reads the Python code
	   from the file PATH to define additional fields that can be used as
	   part of the group key.  This file should call register_field() for
	   each field it wishes to define.  For details and examples, see the
	   silkpython(3) and pysilk(3) manual pages.

LIMITATIONS
       rwgroup requires sorted data.  The application works by comparing
       records in the order that the records are received (similar to the UNIX
       uniq(1) command), so input that is not sorted on the grouping fields
       produces unexpected groupings.
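
The effect of unsorted input can be demonstrated with a few lines of
Python that count runs of equal adjacent keys, which is how a
uniq(1)-style comparison sees its input.  The function name
count_groups is hypothetical and the addresses are made up for the
illustration.

```python
def count_groups(keys):
    """Count runs of equal adjacent keys."""
    groups = 0
    previous = object()        # sentinel that equals no key
    for key in keys:
        if key != previous:
            groups += 1
        previous = key
    return groups


unsorted_keys = ["10.0.0.2", "10.0.0.1", "10.0.0.2", "10.0.0.1"]
print(count_groups(unsorted_keys))           # 4
print(count_groups(sorted(unsorted_keys)))   # 2
```

The same four records form four groups when unsorted and two groups
once sorted.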

EXAMPLES
       In the following example, the dollar sign ("$") represents the shell
       prompt.  The text after the dollar sign represents the command line.
       Lines have been wrapped for improved readability, and the backslash
       ("\") is used to indicate a wrapped line.

       As a rule of thumb, the --id-fields and --delta-field parameters should
       match rwsort(1)'s call, with --delta-field being the last parameter.  A
       call to group all web traffic by queries from the same addresses
       (field=2) within 10 seconds (field=9) of the first query from that
       address will be:

	$ rwfilter --proto=6 --dport=80 --pass=stdout		       \
	  | rwsort --field=2,9					       \
	  | rwgroup --id-field=2 --delta-field=9 --delta-value=10      \
	       --objective

ENVIRONMENT
       PYTHONPATH
	   This environment variable is used by Python to locate modules.
	   When --python-file is specified, rwgroup must load the Python files
	   that comprise the PySiLK package, such as silk/__init__.py.  If
	   this silk/ directory is located outside Python's normal search path
	   (for example, in the SiLK installation tree), it may be necessary
	   to set or modify the PYTHONPATH environment variable to include the
	   parent directory of silk/ so that Python can find the PySiLK
	   module.

       SILK_PYTHON_TRACEBACK
	   When set, Python plug-ins will output traceback information on
	   Python errors to the standard error.

       SILK_COUNTRY_CODES
	   This environment variable allows the user to specify the country
	   code mapping file that rwgroup uses when computing the scc and dcc
	   fields.  The value may be a complete path or a file relative to the
	   SILK_PATH.  See the "FILES" section for standard locations of this
	   file.

       SILK_ADDRESS_TYPES
	   This environment variable allows the user to specify the address
	   type mapping file that rwgroup uses when computing the sType and
	   dType fields.  The value may be a complete path or a file relative
	   to the SILK_PATH.  See the "FILES" section for standard locations
	   of this file.

       SILK_CLOBBER
	   The SiLK tools normally refuse to overwrite existing files.
	   Setting SILK_CLOBBER to a non-empty value removes this restriction.

       SILK_CONFIG_FILE
	   This environment variable is used as the value for the
	   --site-config-file when that switch is not provided.

       SILK_DATA_ROOTDIR
	   This environment variable specifies the root directory of the data
	   repository.  As described in the "FILES" section, rwgroup may use
	   this environment variable when searching for the SiLK site
	   configuration file.

       SILK_PATH
	   This environment variable gives the root of the install tree.  When
	   searching for configuration files and plug-ins, rwgroup may use
	   this environment variable.  See the "FILES" section for details.

       SILK_PLUGIN_DEBUG
	   When set to 1, rwgroup prints status messages to the standard error
	   as it attempts to find and open each of its plug-ins.  In addition,
	   when an attempt to register a field fails, rwgroup prints a message
	   specifying the additional function(s) that must be defined to
	   register the field in rwgroup.  Be aware that the output can be
	   rather verbose.

FILES
       ${SILK_ADDRESS_TYPES}
       ${SILK_PATH}/share/silk/address_types.pmap
       ${SILK_PATH}/share/address_types.pmap
       /usr/local/share/silk/address_types.pmap
       /usr/local/share/address_types.pmap
	   Possible locations for the address types mapping file required by
	   the sType and dType fields.

       ${SILK_CONFIG_FILE}
       ${SILK_DATA_ROOTDIR}/silk.conf
       /data/silk.conf
       ${SILK_PATH}/share/silk/silk.conf
       ${SILK_PATH}/share/silk.conf
       /usr/local/share/silk/silk.conf
       /usr/local/share/silk.conf
	   Possible locations for the SiLK site configuration file which are
	   checked when the --site-config-file switch is not provided.

       ${SILK_COUNTRY_CODES}
       ${SILK_PATH}/share/silk/country_codes.pmap
       ${SILK_PATH}/share/country_codes.pmap
       /usr/local/share/silk/country_codes.pmap
       /usr/local/share/country_codes.pmap
	   Possible locations for the country code mapping file required by
	   the scc and dcc fields.

       ${SILK_PATH}/lib64/silk/
       ${SILK_PATH}/lib64/
       ${SILK_PATH}/lib/silk/
       ${SILK_PATH}/lib/
       /usr/local/lib64/silk/
       /usr/local/lib64/
       /usr/local/lib/silk/
       /usr/local/lib/
	   Directories that rwgroup checks when attempting to load a plug-in.

SEE ALSO
       rwfilter(1), rwfileinfo(1), rwsort(1), rwuniq(1), addrtype(3),
       ccfilter(3), pmapfilter(3), pysilk(3), silkpython(3), silk-plugin(3),
       sensor.conf(5), uniq(1), silk(7), yaf(1), dlopen(3), zlib(3)

SiLK 3.11.0.1			  2016-02-19			    rwgroup(1)