STATIST(1) UNIX System V (local) STATIST(1)
NAME
statist - calculate Huffman distribution for freeze(1)
SYNOPSIS
statist [ -g... ]
DESCRIPTION
The default table is tuned for both C source texts and
executable files (as in LHARC). If you intend to freeze other
kinds of files (natural language texts, databases, images,
fonts, etc.), you can calculate the distribution of matching
positions with the `statist' program, which calculates and
displays that distribution for the given file. This is useful
for large files (100K or more).
Though the built-in position table is general-purpose, tuning
can improve the compression rate by up to one additional
percent (observed mainly on text files).
USAGE
statist [-g...] < sample_file
or
gensample | statist [-g...]
where `gensample' is a program generating some sample stream
of bytes similar to files to be frozen.
The -g switch has the same meaning as for freeze(1) and may
be repeated.
You can also display the intermediate values and watch them
change by pressing the INTR key at any time.
Note: If you use gensample | statist, remember that INTR
affects BOTH processes!
The results have the following format:
n1 n2 n3 n4 n5 n6 n7 n8 (uncertainty = x)
Average match length: xx.yy
Percentile 99.9: p999
Percentile 99.5: p995
Percentile 99.0: p990
Percentile 97.0: p970
Percentile 95.0: p950
Percentile 90.0: p900
Percentile 80.0: p800
Percentile 70.0: p700
Percentile 50.0: p500
Sigma: xx.yy
Here n1 - n8 are the values of the calculated position table
elements, and uncertainty is a number that indicates the
validity of the results (a non-zero uncertainty means that
the results may be unusable). Other values (average match
length, percentiles and sigma) are for information only.
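The first line of the results can be picked apart mechanically. The following is a minimal sketch in Python, assuming that line looks exactly like the format shown above (eight integers followed by the uncertainty in parentheses). The name parse_statist_line is hypothetical and is not part of freeze or statist.

```python
import re

# Assumed layout of the first result line, as shown in this page:
#   n1 n2 n3 n4 n5 n6 n7 n8 (uncertainty = x)
RESULT_RE = re.compile(
    r"^\s*((?:\d+\s+){7}\d+)\s*\(uncertainty\s*=\s*(\d+)\)\s*$"
)


def parse_statist_line(line):
    """Return (table, uncertainty), or None if the line does not match.

    table is a list of the eight position table elements n1..n8.
    """
    m = RESULT_RE.match(line)
    if m is None:
        return None
    table = [int(v) for v in m.group(1).split()]
    return table, int(m.group(2))


if __name__ == "__main__":
    # Illustrative input only, not real statist output.
    result = parse_statist_line("0 0 1 2 6 20 31 2 (uncertainty = 0)")
    if result is not None:
        table, uncertainty = result
        if uncertainty != 0:
            print("warning: results may be unusable")
        print(table)
```

The remaining lines (average match length, percentiles, sigma) do not match the pattern, so the function returns None for them.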
You may create the /etc/default/freeze file (if you don't
like the /etc/default/ directory, choose another; under MS-DOS
the file is FREEZE.CNF in the directory of FREEZE.EXE), which
has the following format:
name = n1 n2 n3 n4 n5 n6 n7 n8
(name must start in column 1). For example:
---------- cut here -----------
# This is freeze's defaults file
russian=0 0 1 2 6 20 31 2 # The sample was mailx.lp (Russian)
english=0 0 1 2 7 16 36 0 # The sample was gcc.lp (English)
# End of file
---------- cut here -----------
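To check a defaults file before using it, the format above can be read with a few lines of Python. This is only a sketch under stated assumptions: it treats `#' as starting a comment that runs to the end of the line (inferred from the example), skips lines whose name does not start in column 1, and requires exactly eight values. How freeze(1) itself handles malformed lines is not documented here, and the names parse_defaults and SAMPLE are hypothetical.

```python
# The example file from this page, without the "cut here" markers.
SAMPLE = """\
# This is freeze's defaults file
russian=0 0 1 2 6 20 31 2 # The sample was mailx.lp (Russian)
english=0 0 1 2 7 16 36 0 # The sample was gcc.lp (English)
# End of file
"""


def parse_defaults(text):
    """Map each table name to its list of eight integers n1..n8."""
    tables = {}
    for raw in text.splitlines():
        # Assumption: '#' starts a comment that runs to end of line.
        line = raw.split("#", 1)[0].rstrip()
        # The name must start in column 1; skip blank or indented lines.
        if not line or line[0].isspace():
            continue
        if "=" not in line:
            continue
        name, _, values = line.partition("=")
        numbers = [int(v) for v in values.split()]
        if len(numbers) != 8:
            raise ValueError("expected 8 values for %r" % name.strip())
        tables[name.strip()] = numbers
    return tables


if __name__ == "__main__":
    for name, table in sorted(parse_defaults(SAMPLE).items()):
        print(name, table)
```

Stripping the name also accepts the spaced form `name = n1 ... n8' shown in the format line.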
If you find values that are better than the defaults for both
text (C programs) and binary (executable) files, please send
them to me.
Important note: statist.c is NOT part of the freeze package;
it is an additional feature.
SEE ALSO
freeze(1), melt(1), fcat(1)
DIAGNOSTICS
Huffman tree has more than 8 levels, reducing...
Self-explanatory, but sometimes the reduction falls into an
infinite loop.
xxxK
A progress indicator is written after each 4K of the
file processed.
BUGS
Sometimes using results with uncertainty = 1 (from one file)
gives a compression rate worse than the default, while using
results with uncertainty = 13 (from another file) works
quite well.
Please send bug descriptions, incompatibilities, etc. to
leo@s514.ipmce.su.
Page 2 (printed 12/5/95)