html2xhtml(1)							 html2xhtml(1)

NAME
       html2xhtml - Converts HTML files to XHTML

SYNTAX
       html2xhtml [ filename ] [ options ]

DESCRIPTION
       Html2xhtml  is  a  command-line	tool that converts HTML files to XHTML
       files. The path of the HTML input file can be provided  as  a  command-
       line argument. If not, it is read from stdin.

       Html2xhtml always tries to generate valid XHTML files. It is able to
       correct many common errors in input HTML files without loss of
       information. However, for some errors, html2xhtml may decide to lose
       some information in order to generate valid XHTML output. This can be
       avoided with the -e option, which allows html2xhtml to generate
       non-valid output in these cases.

       Html2xhtml can generate XHTML output compliant with one of the
       following document types: XHTML 1.0 (Transitional, Strict and
       Frameset), XHTML 1.1, XHTML Basic (1.0 and 1.1), XHTML Mobile Profile
       and XHTML Print 1.0.
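
       The two input modes described above (file argument or standard input)
       and the -e switch can be sketched in Python as follows. This is only
       an illustration: the helper names build_argv and convert_bytes are
       hypothetical and not part of html2xhtml, and the sketch assumes the
       html2xhtml binary is installed and on PATH.

```python
import subprocess


def build_argv(input_path=None, allow_invalid=False):
    """Build an html2xhtml command line (hypothetical helper)."""
    argv = ["html2xhtml"]
    if input_path is not None:
        # With a filename, html2xhtml reads that file instead of stdin.
        argv.append(input_path)
    if allow_invalid:
        # -e: keep input that cannot be adapted, at the cost of
        # possibly non-valid XHTML output.
        argv.append("-e")
    return argv


def convert_bytes(html_bytes, allow_invalid=False):
    """Pipe HTML through stdin and return the raw bytes written to stdout.

    Assumes html2xhtml is installed and on PATH.
    """
    result = subprocess.run(
        build_argv(allow_invalid=allow_invalid),
        input=html_bytes,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

       The output is returned as bytes because, unless an output character
       set is requested, it uses the character set of the input.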

OPTIONS
       The command line options/arguments are:

       filename		   Read the HTML input from filename  (optional	 argu‐
			   ment).  If  this argument is not provided, the HTML
			   input is read from standard input.

       -o filename	   Output XHTML file. The file is  overwritten	if  it
			   exists.  If	not provided, the output is written to
			   standard output.

       -e		   Instructs the program to propagate input chunks to
			   the output even if it is unable to adapt them to
			   the output XHTML doctype. With this option, the
			   XHTML output may be non-valid. Without it, some
			   input data may be removed from the output in some
			   (rare) cases.

       -t output-doctype   Doctype of the output XHTML file. If not specified,
			   the program automatically selects either XHTML 1.0
			   Transitional or XHTML 1.0 Frameset, depending on
			   the input. Currently available doctypes are:
			    o transitional  XHTML 1.0 Transitional
			    o frameset	    XHTML 1.0 Frameset
			    o strict	    XHTML 1.0 Strict
			    o 1.1	    XHTML 1.1
			    o basic-1.0	    XHTML Basic 1.0
			    o basic-1.1	    XHTML Basic 1.1
			    o mp	    XHTML Mobile Profile
			    o print-1.0	    XHTML Print 1.0

       --ics input_charset Character set of the input  document.  This	option
			   overrides the default input character set detection
			   mechanism.

       --ocs output_charset
			   Character set for the  output  XHTML	 document.  If
			   this	 option	 is  not present, the character set of
			   the input is used as default.

       --lcs		   Dump the list of available  character  set  aliases
			   and	exit  html2xhtml.   No conversion is performed
			   when this option is present.

       -l line_length	   Number of characters per line. The default value is
			   80. It must be greater than or equal to 40,
			   otherwise the parameter is ignored.

       -b tab_length	   Tab length in number of characters. It  must	 be  a
			   number between 0 and 16, otherwise the parameter is
			   ignored.  Use 0 to avoid indentation in the output.

       --preserve-space-comments
			   Use this option to preserve white spaces, tabs  and
			   ends of lines in HTML comments. The default, if not
			   provided, is to rearrange spacing.

       --no-protect-cdata  Enclose CDATA sections in "script" and "style"
			   following the XHTML 1.0 specification (using
			   "<![CDATA[" and "]]>"). This might be incompatible
			   with some browsers. The default in this version is
			   to enclose CDATA sections using "//<![CDATA[" and
			   "//]]>", because major browsers handle it properly.

       --compact-block-elements
			   No  white spaces or line breaks are written between
			   the start tag of a block element and the start  tag
			   of  its first enclosed inline element (or character
			   data) and between the end tag of its last  enclosed
			   inline  element (or character data) and the end tag
			   of the block element. By default, if this option is
			   not set, a newline character and indentation are
			   written between them.

       --compact-empty-elm-tags
			   Do not write a  whitespace  before  the  slash  for
			   empty  element  tags (i.e. write "<br/>" instead of
			   the default "<br />").   Note  that	although  both
			   notations  are  correct in XML, the XHTML 1.0 stan‐
			   dard recommends the latter to improve compatibility
			   with old browsers.

       --empty-elm-tags-always
			   By default, empty element tags are written only for
			   elements declared as empty in the DTD. This option
			   causes any element without content to be written
			   with the empty element tag, even if it is not
			   declared as empty in the DTD. This option may cause
			   problems when the XHTML document is opened by
			   browsers in HTML (tag soup) mode.

       --dos-eol	   Write the output XHTML file with DOS-style (CRLF)
			   end of line, instead of the default UNIX-style end
			   of line. Both end of line styles are allowed by
			   the XML recommendation.

       --generate-snippet  Treat the input as an HTML fragment instead of a
			   full document. The output will also be a snippet
			   and will contain neither the XML and doctype
			   declarations nor the html, head and body elements.

       --help		   Show a brief help message and exit.

       --version	   Show the version number and exit.
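
       The constraints on -t, -l and -b listed above can be checked before
       the tool is invoked. The following Python sketch is illustrative only:
       output_args is a hypothetical helper, not part of html2xhtml. Unlike
       html2xhtml, which silently ignores out-of-range values, the sketch
       raises an error, and it assumes the 0 to 16 range for -b is inclusive
       at both ends.

```python
# Doctype keys accepted by -t, as listed in the OPTIONS section.
DOCTYPES = {
    "transitional", "frameset", "strict", "1.1",
    "basic-1.0", "basic-1.1", "mp", "print-1.0",
}


def output_args(output_path=None, doctype=None, line_length=None, tab_length=None):
    """Return the option list for -o, -t, -l and -b (hypothetical helper)."""
    args = []
    if output_path is not None:
        # -o overwrites the file if it already exists.
        args += ["-o", output_path]
    if doctype is not None:
        if doctype not in DOCTYPES:
            raise ValueError("unknown doctype: %r" % doctype)
        args += ["-t", doctype]
    if line_length is not None:
        # html2xhtml ignores values below 40; this sketch rejects them.
        if line_length < 40:
            raise ValueError("line length must be at least 40")
        args += ["-l", str(line_length)]
    if tab_length is not None:
        # Assumption: the documented range 0..16 is inclusive.
        if not 0 <= tab_length <= 16:
            raise ValueError("tab length must be between 0 and 16")
        args += ["-b", str(tab_length)]
    return args
```

       For example, output_args("out.xhtml", "strict", 100, 0) yields the
       options for XHTML 1.0 Strict output, 100 characters per line and no
       indentation.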

NOTE ON CHARACTER SETS
       Since version 1.1.2, html2xhtml is able to parse and write HTML and
       XHTML documents using the most popular character sets / encodings. It
       is also able to read the input document using a given character set and
       generate an output that uses a different character set. The iconv
       implementation in the GNU C library is used for that purpose.

       Any IANA-registered character set that is supported by the iconv
       library may be used. When naming a character set, any IANA-approved
       alias for it may be used. The full list of aliases recognised by
       html2xhtml can be obtained with the --lcs command-line option.

       If the character set of the input document is not specified, html2xhtml
       tries  to  guess	 it automatically.  If the character set of the output
       document is not specified, html2xhtml writes the output using the  same
       character set as the input document.
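
       The character set options described above can be combined as in the
       following Python sketch. The helper charset_args is hypothetical, and
       the names "iso-8859-1" and "utf-8" used in the example are assumed to
       be among the aliases reported by --lcs on a given installation.

```python
def charset_args(input_charset=None, output_charset=None):
    """Return the --ics / --ocs options (hypothetical helper)."""
    args = []
    if input_charset is not None:
        # --ics overrides automatic detection of the input character set.
        args += ["--ics", input_charset]
    if output_charset is not None:
        # Without --ocs, the output uses the input's character set.
        args += ["--ocs", output_charset]
    return args


# Command that only lists the recognised aliases; no conversion is done.
LIST_ALIASES = ["html2xhtml", "--lcs"]

# Example: read page.html as ISO-8859-1 and write UTF-8 to standard output.
# The charset names are assumptions to be checked against --lcs.
EXAMPLE = ["html2xhtml", "page.html"] + charset_args("iso-8859-1", "utf-8")
```

       Leaving both arguments unset produces no options, so detection of the
       input character set and reuse of it for the output stay in effect.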

NOTE ON END OF LINE CHARACTERS
       By  default,  the  UNIX-style  one-byte	end of line is used. It can be
       changed to DOS-style CRLF end of line with the  --dos-eol  command-line
       option.

       However,	 when the program is compiled in the MinGW environment and the
       output is sent to standard output, the  output  is  automatically  con‐
       verted  by the environment to CRLF by default. Do not use the --dos-eol
       command-line option in that situation.  When the output is  sent	 to  a
       file  with the -o command-line option, the output is as expected (UNIX-
       style by default), and the --dos-eol option may be used.
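
       The rule above can be summarised as a small decision helper. This
       Python sketch is illustrative only: eol_args and its mingw_build flag
       are hypothetical, and the caller is assumed to know whether the
       html2xhtml binary in use was compiled under MinGW.

```python
def eol_args(want_crlf, output_path=None, mingw_build=False):
    """Decide whether to pass --dos-eol (hypothetical helper)."""
    if not want_crlf:
        # UNIX-style end of line is the default; nothing to add.
        return []
    if mingw_build and output_path is None:
        # MinGW build writing to standard output: the environment already
        # converts line ends to CRLF, so --dos-eol must not be used.
        return []
    return ["--dos-eol"]
```

       With -o the option is passed regardless of the build, since file
       output is UNIX-style by default in both cases.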

ACKNOWLEDGMENTS
       Program developer up to current version:
       Jesus Arias Fisteus <jaf@it.uc3m.es>

       The first working version of this program was developed as
       a Master's thesis at the University of Vigo (Spain) [http://www.uvigo.es],
       advised by:

       Rebeca Diaz Redondo
       Ana Fernandez Vilas

       Copyright 2000-2001 by Jesus Arias Fisteus, Rebeca Diaz Redondo, Ana
       Fernandez Vilas.
       Copyright 2002-2009 by Jesus Arias Fisteus

								 html2xhtml(1)