Text::Textile(3)    User Contributed Perl Documentation    Text::Textile(3)
NAME
Text::Textile - A humane web text generator.
SYNOPSIS
use Text::Textile qw(textile);
my $text = <<EOT;
h1. Heading
A _simple_ demonstration of Textile markup.
* One
* Two
* Three
"More information":http://www.textism.com/tools/textile is available.
EOT
# procedural usage
my $html = textile($text);
print $html;
# OOP usage
my $textile = Text::Textile->new;
$html = $textile->process($text);
print $html;
ABSTRACT
Text::Textile is a Perl-based implementation of Dean Allen's Textile
syntax. Textile is shorthand for doing common formatting tasks.
METHODS
new( [%options] )
Instantiates a new Text::Textile object. Options may be passed to
initialize the object; the option keys are the same as the get/set
method names documented here.
set( $attribute, $value )
Used to set Textile attributes. Attribute names are the same as the
get/set method names documented here.
get( $attribute )
Used to get Textile attributes. Attribute names are the same as the
get/set method names documented here.
disable_html( [$disable] )
Gets or sets the "disable html" control, which allows you to prevent
HTML tags from being used within the text processed. Any HTML tags
encountered will be removed if disable html is enabled. Default
behavior is to allow HTML.
flavor( [$flavor] )
Assigns the HTML flavor of output from Text::Textile. Currently these
are the valid choices: html, xhtml (behaves like "xhtml1"), xhtml1,
xhtml2. Default flavor is "xhtml1".
Note that the xhtml2 flavor support is experimental and incomplete (and
will remain that way until the XHTML 2.0 draft becomes a proper
recommendation).
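As a minimal sketch of the constructor and accessor methods described so far, the following uses only new, disable_html, flavor and process as documented here. The sample text and the chosen option values are illustrative.

```perl
use strict;
use warnings;
use Text::Textile;

# Option keys passed to new() match the accessor method names.
my $textile = Text::Textile->new( disable_html => 1 );

# Called with an argument, an accessor sets the value.
$textile->flavor('html');

# Called without an argument, it returns the current value.
print 'flavor: ', $textile->flavor, "\n";

my $html = $textile->process("h1. Heading\n\nA _simple_ paragraph.");
print $html, "\n";
```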
css( [$css] )
Gets or sets the CSS support for Textile. If CSS is enabled, Textile
will emit CSS rules. You may pass a 1 or 0 to enable or disable CSS
behavior altogether. If you pass a hashref, you may assign the CSS
class names that are used by Text::Textile. The following key names for
such a hash are recognized:
class_align_right
defaults to "right"
class_align_left
defaults to "left"
class_align_center
defaults to "center"
class_align_top
defaults to "top"
class_align_bottom
defaults to "bottom"
class_align_middle
defaults to "middle"
class_align_justify
defaults to "justify"
class_caps
defaults to "caps"
class_footnote
defaults to "footnote"
id_footnote_prefix
defaults to "fn"
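The hashref form of css might be used as in the sketch below. Every recognized key is supplied so that the example does not depend on how missing keys are treated; only class_caps differs from the defaults listed above, and "smallcaps" is a made-up class name.

```perl
use strict;
use warnings;
use Text::Textile;

my $textile = Text::Textile->new;

# All ten documented keys; only class_caps is changed from its default.
$textile->css({
    class_align_right   => 'right',
    class_align_left    => 'left',
    class_align_center  => 'center',
    class_align_top     => 'top',
    class_align_bottom  => 'bottom',
    class_align_middle  => 'middle',
    class_align_justify => 'justify',
    class_caps          => 'smallcaps',   # hypothetical class name
    class_footnote      => 'footnote',
    id_footnote_prefix  => 'fn',
});

my $html = $textile->process('An acronym such as NASA appears here.');
print $html, "\n";
```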
charset( [$charset] )
Gets or sets the character set targeted for publication. At this
time, Text::Textile only changes its behavior if the "utf-8" character
set is assigned.
Specifically, if utf-8 is requested, any special characters created by
Textile will be output as native utf-8 characters rather than HTML
entities.
docroot( [$path] )
Gets or sets the physical file path to the root of document files. This
path is utilized when images are referenced and size calculations are
needed (the Image::Size module is used to read the image dimensions).
trim_spaces( [$trim] )
Gets or sets the "trim spaces" control flag. If enabled, this will
clear any lines that have only spaces on them (the newline itself will
remain).
preserve_spaces( [$preserve] )
Gets or sets the "preserve spaces" control flag. If enabled, this will
replace any double spaces within the paragraph data with the &#8195;
HTML entity (wide space). The default is 0, in which case spaces pass
through to the browser unchanged and render as a single space. Note that
this setting has no effect on spaces within "<pre>", "<code>" or
"<script>".
filter_param( [$data] )
Gets or sets a parameter that is passed to filters.
filters( [\%filters] )
Gets or sets a list of filters to make available for Text::Textile to
use. Returns a hash reference of the currently assigned filters.
char_encoding( [$encode] )
Gets or sets the character encoding logical flag. If character encoding
is enabled, the HTML::Entities package is used to encode special
characters. If character encoding is disabled, only "<", ">", """ and
"&" are encoded to HTML entities.
disable_encode_entities( $boolean )
Gets or sets the disable encode entities logical flag. If this value is
set to true no entities are encoded at all. This also supersedes the
"char_encoding" flag.
handle_quotes( [$handle] )
Gets or sets the "smart quoting" control flag. Returns the current
setting.
process( $str )
Alternative method for invoking the textile method.
textile( $str )
Can be called either procedurally or as a method. Transforms $str using
Textile markup rules.
format_paragraph( [$args] )
Processes a single paragraph. The following attributes are allowed:
text
The text to be processed.
format_inline( [%args] )
Processes an inline string (plaintext) for Textile syntax. The
following attributes are allowed:
text
The text to be processed.
format_macro( %args )
Responsible for processing a particular macro. Arguments passed
include:
pre open brace character
post
close brace character
macro
the macro to be executed
The return value from this method would be the replacement text for the
macro given. If the macro is not defined, it will return pre + macro +
post, thereby preserving the original macro string.
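The fallback rule can be sketched in plain Perl as follows. This is an illustration of the described behavior, not the module's own code; the %macros table and the expand_macro name are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical macro table for illustration only.
my %macros = (
    'c|' => '&#162;',   # cent sign
    'L-' => '&#163;',   # pound sign
    'Y=' => '&#165;',   # yen sign
);

# Return the replacement for a known macro, or rebuild the original
# text (pre + macro + post) when the macro is not defined.
sub expand_macro {
    my (%args) = @_;
    my $macro = $args{macro};
    return $macros{$macro} if exists $macros{$macro};
    return $args{pre} . $macro . $args{post};
}

print expand_macro( pre => '{', macro => 'c|',  post => '}' ), "\n";
print expand_macro( pre => '{', macro => 'zzz', post => '}' ), "\n";
```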
format_cite( %args )
Processes text for a citation tag. The following attributes are
allowed:
pre Any text that comes before the citation.
text
The text that is being cited.
cite
The URL of the citation.
post
Any text that follows the citation.
format_code( %args )
Processes '@...@' type blocks (code snippets). The following attributes
are allowed:
text
The text of the code itself.
lang
The language (programming language) for the code.
format_classstyle( $clsty, $class, $style )
Returns a string of tag attributes to accommodate the class, style and
symbols present in $clsty.
$clsty is checked for:
"{...}"
style rules. If present, they are appended to $style.
"(...#...)"
class and/or ID name declaration
"(" (one or more)
pad left characters
")" (one or more)
pad right characters
"[ll]"
language declaration
The attribute string returned will contain any combination of class,
id, style and/or lang attributes.
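A simplified sketch of this kind of parsing is shown below. It only handles the style, class/ID and language parts; padding characters are ignored, and the parse_clsty name and the attribute order are assumptions rather than the module's behavior.

```perl
use strict;
use warnings;

# Extract style, class/id and language from a modifier string.
# Padding characters are not handled in this sketch.
sub parse_clsty {
    my ($clsty) = @_;
    my %attr;
    $attr{style} = $1 if $clsty =~ /\{([^}]*)\}/;
    $attr{lang}  = $1 if $clsty =~ /\[([^\]]+)\]/;
    if ( $clsty =~ /\(([^()#]*)(?:#([^()]+))?\)/ ) {
        $attr{class} = $1 if length $1;
        $attr{id}    = $2 if defined $2;
    }
    # Build the attribute string in a fixed order.
    return join '', map { qq{ $_="$attr{$_}"} }
        grep { exists $attr{$_} } qw(class id style lang);
}

print parse_clsty('(note#intro){color:red}[en]'), "\n";
```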
format_tag( %args )
Constructs an HTML tag. Accepted arguments:
tag the tag to produce
text
the text to output inside the tag
pre text to produce before the tag
post
text to produce following the tag
clsty
class and/or style attributes that should be assigned to the tag.
format_list( %args )
Takes a Textile formatted list (numeric or bulleted) and returns the
markup for it. Text that is passed in requires substantial parsing, so
the format_list method is a little involved. But it should always
produce a proper ordered or unordered list. If it cannot (due to
unbalanced input), it will return the original text. Arguments
accepted:
text
The text to be processed.
format_block( %args )
Processes "==xxxxx==" type blocks for filters. A filter would follow
the open "==" sequence and is specified within pipe characters, like
so:
==|filter|text to be filtered==
You may specify multiple filters in the filter portion of the string.
Simply comma delimit the filters you desire to execute. Filters are
defined using the filters method.
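A filter might be registered and invoked as in this sketch. It assumes that each value in the filters hash is a code reference that receives the text to filter and returns the result; that calling convention is not stated in this document, and "shout" is a made-up filter name.

```perl
use strict;
use warnings;
use Text::Textile;

my $textile = Text::Textile->new;

# Assumption: a filter is a code reference taking the text as its
# first argument and returning the filtered text.
$textile->filters({
    shout => sub { my ($text) = @_; return uc $text; },
});

my $html = $textile->process('This is ==|shout|filtered text== in a paragraph.');
print $html, "\n";
```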
format_link( %args )
Takes the Textile link attributes and transforms them into a hyperlink.
format_url( %args )
Takes the given $url and transforms it appropriately.
format_span( %args )
format_image( %args )
Returns markup for the given image. $src is the location of the image,
$extra contains the optional height/width and/or alt text. $url is an
optional hyperlink for the image. $class holds the optional CSS class
attribute.
Arguments you may pass:
src The "src" (URL) for the image. This may be a local path, ideally
starting with a "/". Images can be located within the file system
if the docroot method is used to specify where the docroot resides.
If the image can be found, the image_size method is used to
determine the dimensions of the image.
extra
Additional parameters for the image. This would include alt text,
height/width specification or scaling instructions.
align
Alignment attribute.
pre Text to produce prior to the tag.
post
Text to produce following the tag.
link
Optional URL to connect with the image tag.
clsty
Class and/or style attributes.
format_table( %args )
Takes a Wiki-ish string of data and transforms it into a full table.
apply_filters( %args )
The following attributes are allowed:
text
The text to be processed.
filters
An array reference of filter names to run for the given text.
encode_html( $html, $can_double_encode )
Encodes input $html string, escaping characters as needed to HTML
entities. This relies on the HTML::Entities package for full effect. If
unavailable, encode_html_basic is used as a fallback technique. If the
"char_encoding" flag is set to false, encode_html_basic is used
exclusively.
decode_html( $html )
Decodes HTML entities in $html to their natural character equivalents.
encode_html_basic( $html, $can_double_encode )
Encodes the input $html string for the following characters: <, >, &
and ". If $can_double_encode is true, all ampersand characters are
escaped even if they already were. If $can_double_encode is false,
ampersands are only escaped when they aren't part of an HTML entity
already.
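The effect of $can_double_encode can be sketched in plain Perl. This is an approximation of the rule described above, not the module's own code; the encode_basic name and the entity-detection pattern are assumptions.

```perl
use strict;
use warnings;

sub encode_basic {
    my ( $html, $can_double_encode ) = @_;
    if ($can_double_encode) {
        # Escape every ampersand, even inside existing entities.
        $html =~ s/&/&amp;/g;
    }
    else {
        # Leave ampersands that already start an entity (&amp;, &#169;).
        $html =~ s/&(?!#?[A-Za-z0-9]+;)/&amp;/g;
    }
    $html =~ s/</&lt;/g;
    $html =~ s/>/&gt;/g;
    $html =~ s/"/&quot;/g;
    return $html;
}

print encode_basic( 'Q&amp;A <b>', 0 ), "\n";
print encode_basic( 'Q&amp;A <b>', 1 ), "\n";
```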
image_size( $file )
Returns the size for the image identified in $file. This method relies
upon the Image::Size Perl package. If unavailable, image_size will
return undef. Otherwise, the expected return value is a list of the
width and height (in that order), in pixels.
encode_url( $str )
Encodes the query portion of a URL, escaping characters as necessary.
mail_encode( $email )
Encodes the email address in $email for "mailto:" links.
process_quotes( $str )
Processes string, formatting plain quotes into curly quotes.
default_macros
Returns a hashref of macros that are assigned to be processed by
default within the format_inline method.
_halign( $alignment )
Returns the alignment keyword depending on the symbol passed.
"<>"
becomes "justify"
"<" becomes "left"
">" becomes "right"
"=" becomes "center"
_valign( $alignment )
Returns the alignment keyword depending on the symbol passed.
"^" becomes "top"
"~" becomes "bottom"
"-" becomes "middle"
_imgalign( $alignment )
Returns the alignment keyword depending on the symbol passed. The
following alignment symbols are recognized, and given preference in the
order listed:
"^" becomes "top"
"~" becomes "bottom"
"-" becomes "middle"
"<" becomes "left"
">" becomes "right"
_repl( \@arr, $str )
An internal routine that takes a string and appends it to an array. It
returns a marker that is used later to restore the preserved string.
_tokenize( $str )
An internal routine responsible for breaking up a string into
individual tag and plaintext elements.
_css_defaults
Sets the default CSS names for CSS controlled markup. This is an
internal function that should not be called directly.
_strip_borders( $pre, $post )
This utility routine will take "border" characters off of the given
$pre and $post strings if they match one of these conditions:
$pre starts with "[", $post ends with "]"
$pre starts with "{", $post ends with "}"
If neither condition is met, then the $pre and $post values are left
untouched.
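The two conditions can be sketched as below. Unlike the internal routine, this hypothetical strip_borders returns the adjusted strings as a list instead of relying on any particular calling convention.

```perl
use strict;
use warnings;

sub strip_borders {
    my ( $pre, $post ) = @_;
    # Strip only when the opening and closing characters pair up.
    if (   ( $pre =~ /^\[/ && $post =~ /\]\z/ )
        || ( $pre =~ /^\{/ && $post =~ /\}\z/ ) )
    {
        $pre  =~ s/^.//s;    # drop the leading border character
        $post =~ s/.\z//s;   # drop the trailing border character
    }
    return ( $pre, $post );
}

my ( $pre, $post ) = strip_borders( '[c', 'l]' );
print "$pre|$post\n";
```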
SYNTAX
Text::Textile processes text in units of blocks and lines. A block
might also be considered a paragraph, since blocks are separated from
one another by a blank line. Blocks can begin with a signature that
helps identify the rest of the block content. Block signatures include:
p A paragraph block. This is the default signature if no signature is
explicitly given. Paragraphs are formatted with all the inline
rules (see inline formatting) and each line receives the
appropriate markup rules for the flavor of HTML in use. For
example, newlines for XHTML content receive a "<br />" tag at the
end of the line (with the exception of the last line in the
paragraph). Paragraph blocks are enclosed in a "<p>" tag.
pre A pre-formatted block of text. Textile will not add any HTML tags
for individual lines. Whitespace is also preserved.
Note that within a "pre" block, < and > are translated into HTML
entities automatically.
bc A "bc" signature is short for "block code", which implies a
preformatted section like the "pre" block, but it also gets a
"<code>" tag (or for XHTML 2, a "<blockcode>" tag is used instead).
Note that within a "bc" block, < and > are translated into HTML
entities automatically.
table
For composing HTML tables. See the "TABLES" section for more
information.
bq A "bq" signature is short for "block quote". Paragraph text
formatting is applied to these blocks and they are enclosed in a
<blockquote> tag as well as <p> tags within.
h1, h2, h3, h4, h5, h6
Headline signatures that produce "<h1>", etc. tags. You can adjust
the relative output of these using the head_offset attribute.
clear
A "clear" signature is simply used to indicate that the next block
should emit a CSS style attribute that clears any floating
elements. The default behavior is to clear "both", but you can use
the left (<) or right (>) alignment characters to indicate which
side to clear.
dl A "dl" signature is short for "definition list". See the "LISTS"
section for more information.
fn A "fn" signature is short for "footnote". You add a number
following the "fn" keyword to number the footnote. Footnotes are
output as paragraph tags but are given a special CSS class name
which can be used to style them as you see fit.
All signatures should end with a period and be followed with a space.
Between the signature and the period, you may use several parameters
to further customize the block. These include:
"{style rule}"
A CSS style rule. Style rules can span multiple lines.
"[ll]"
A language identifier (for a "lang" attribute).
"(class)" or "(#id)" or "(class#id)"
For CSS class and id attributes.
">", "<", "=", "<>"
Modifier characters for alignment. Right-justification, left-
justification, centered, and full-justification.
"(" (one or more)
Adds padding on the left. 1em per "(" character is applied. When
combined with the align-left or align-right modifier, it makes the
block float.
")" (one or more)
Adds padding on the right. 1em per ")" character is applied. When
combined with the align-left or align-right modifier, it makes the
block float.
"|filter|" or "|filter|filter|filter|"
A filter may be invoked to further format the text for this
signature. If one or more filters are identified, the text will be
processed first using the filters and then by Textile's own block
formatting rules.
Extended Blocks
Normally, a block ends with the first blank line encountered. However,
there are situations where you may want a block to continue for
multiple paragraphs of text. To cause a given block signature to stay
active, use two periods in your signature instead of one. This will
tell Textile to keep processing using that signature until the next
signature is found.
For example:
bq.. This is paragraph one of a block quote.

This is paragraph two of a block quote.

p. Now we're back to a regular paragraph.
You can apply this technique to any signature (although for some it
doesn't make sense, like "h1" for example). This is especially useful
for "bc" blocks where your code may have many blank lines scattered
through it.
Escaping
Sometimes you want Textile to just get out of the way and let you put
some regular HTML markup in your document. You can disable Textile
formatting for a given block using the "==" escape mechanism:
p. Regular paragraph

==
Escaped portion -- will not be formatted
by Textile at all
==

p. Back to normal.
You can also use this technique within a Textile block, temporarily
disabling the inline formatting functions:
p. This is ==*a test*== of escaping.
Inline Formatting
Formatting within a block of text is covered by the "inline" formatting
rules. These operators must be placed up against text/punctuation to be
recognized. These include:
"*strong*"
Translates into <strong>strong</strong>.
"_emphasis_"
Translates into <em>emphasis</em>.
"**bold**"
Translates into <b>bold</b>.
"__italics__"
Translates into <i>italics</i>.
"++bigger++"
Translates into <big>bigger</big>.
"--smaller--"
Translates into <small>smaller</small>.
"-deleted text-"
Translates into <del>deleted text</del>.
"+inserted text+"
Translates into <ins>inserted text</ins>.
"^superscript^"
Translates into <sup>superscript</sup>.
"~subscript~"
Translates into <sub>subscript</sub>.
"%span%"
Translates into <span>span</span>.
"@code@"
Translates into <code>code</code>. Note that within a "@...@"
section, < and > are translated into HTML entities automatically.
Inline formatting operators accept the following modifiers:
"{style rule}"
A CSS style rule.
"[ll]"
A language identifier (for a "lang" attribute).
"(class)" or "(#id)" or "(class#id)"
For CSS class and id attributes.
Examples
Textile is *way* cool.
Textile is *_way_* cool.
Now this won't work, because the formatting characters need whitespace
before and after to be properly recognized.
Textile is way c*oo*l.
However, you can supply braces or brackets to clarify what you want to
format, so this would work:
Textile is way c[*oo*]l.
Footnotes
You can create footnotes like this:
And then he went on a long trip[1].
By specifying the brackets with a number inside, Textile will recognize
that as a footnote marker. It will replace that with a construct like
this:
And then he went on a long
trip<sup class="footnote"><a href="#fn1">1</a></sup>
To supply the content of the footnote, place it at the end of your
document using a "fn" block signature:
fn1. And there was much rejoicing.
Which creates a paragraph that looks like this:
<p class="footnote" id="fn1"><sup>1</sup> And there was
much rejoicing.</p>
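The marker replacement described above can be approximated with a single substitution. This sketch is not the module's implementation; it hard-codes the default "footnote" class and "fn" id prefix.

```perl
use strict;
use warnings;

my $text = 'And then he went on a long trip[1].';

# Replace a bracketed number with the footnote marker construct.
( my $html = $text ) =~
    s{\[(\d+)\]}{<sup class="footnote"><a href="#fn$1">$1</a></sup>}g;

print $html, "\n";
```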
Links
Textile defines a shorthand for formatting hyperlinks. The format
looks like this:
"Text to display":http://example.com
In addition to this, you can add "title" text to your link:
"Text to display (Title text)":http://example.com
The URL portion of the link supports relative paths as well as other
protocols like ftp, mailto, news, telnet, etc.
"E-mail me please":mailto:someone@example.com
You can also use single quotes instead of double-quotes if you prefer.
As with the inline formatting rules, a hyperlink must be surrounded by
whitespace to be recognized (an exception to this is common punctuation
which can reside at the end of the URL). If you have to place a URL
next to some other text, use the bracket or brace trick to do that:
You["gotta":http://example.com]seethis!
Textile supports an alternate way to compose links. You can optionally
create a lookup list of links and refer to them separately. To do this,
place one or more links in a block of its own (it can be anywhere
within your document):
[excom]http://example.com
[exorg]http://example.org
For a list like this, the text in the square brackets is used to
uniquely identify the link given. To refer to that link, you would
specify it like this:
"Text to display":excom
Once you've defined your link lookup table, you can use the identifiers
any number of times.
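The lookup mechanism can be sketched in plain Perl as a two-pass substitution. This illustrates the idea only and is not how the module is implemented; the %links table and the sample text are hypothetical.

```perl
use strict;
use warnings;

my $doc = <<'EOT';
"Example site":excom and "the org site":exorg

[excom]http://example.com
[exorg]http://example.org
EOT

# Pass 1: collect "[name]url" lines into a table and remove them.
my %links;
$doc =~ s/^\[(\w+)\](\S+)\n?/$links{$1} = $2; ''/mge;

# Pass 2: resolve "text":name references that have a table entry;
# anything else is left untouched.
$doc =~ s{"([^"]+)":(\w+)}
         { exists $links{$2} ? qq{<a href="$links{$2}">$1</a>} : $& }ge;

print $doc;
```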
Images
Images are identified by the following pattern:
!/path/to/image!
Image attributes may also be specified:
!/path/to/image 10x20!
Which will render an image 10 pixels wide and 20 pixels high. Another
way to indicate width and height:
!/path/to/image 10w 20h!
You may also redimension the image using a percentage.
!/path/to/image 20%x40%!
Which will render the image at 20% of its regular width and 40% of
its regular height.
Or specify one percentage to resize proportionately:
!/path/to/image 20%!
Alt text can be given as well:
!/path/to/image (Alt text)!
The path of the image may refer to a locally hosted image or can be a
full URL.
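The size notations above could be interpreted along these lines. This is a sketch under assumptions: the image_dimensions name is hypothetical, the natural width and height are passed in by the caller, and results are truncated to whole pixels.

```perl
use strict;
use warnings;

# Work out the rendered size from the size specification, given the
# image's natural width and height in pixels.
sub image_dimensions {
    my ( $extra, $nat_w, $nat_h ) = @_;
    my ( $w, $h ) = ( $nat_w, $nat_h );
    if ( $extra =~ /\b(\d+)%x(\d+)%/ ) {       # 20%x40%
        ( $w, $h ) = ( $nat_w * $1 / 100, $nat_h * $2 / 100 );
    }
    elsif ( $extra =~ /\b(\d+)%/ ) {           # 20%
        ( $w, $h ) = ( $nat_w * $1 / 100, $nat_h * $1 / 100 );
    }
    elsif ( $extra =~ /\b(\d+)x(\d+)\b/ ) {    # 10x20
        ( $w, $h ) = ( $1, $2 );
    }
    else {                                     # 10w 20h
        $w = $1 if $extra =~ /\b(\d+)w\b/;
        $h = $1 if $extra =~ /\b(\d+)h\b/;
    }
    return ( int($w), int($h) );
}

my ( $w, $h ) = image_dimensions( '20%x40%', 200, 100 );
print "${w}x${h}\n";
```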
You can also use the following modifiers after the opening "!"
character:
"<" Align the image to the left (causes the image to float if CSS
options are enabled).
">" Align the image to the right (causes the image to float if CSS
options are enabled).
"-" (dash)
Aligns the image to the middle.
"^" Aligns the image to the top.
"~" (tilde)
Aligns the image to the bottom.
"{style rule}"
Applies a CSS style rule to the image.
"(class)" or "(#id)" or "(class#id)"
Applies a CSS class and/or id to the image.
"(" (one or more)
Pads 1em on the left for each "(" character.
")" (one or more)
Pads 1em on the right for each ")" character.
Character Replacements
A few simple, common symbols are automatically replaced:
(c)
(r)
(tm)
In addition to these, there is a whole set of character macros that
are defined by default. All macros are enclosed in curly braces. These
include:
{c|} or {|c} cent sign
{L-} or {-L} pound sign
{Y=} or {=Y} yen sign
Many of these macros can be guessed. For example:
{A'} or {'A}
{a"} or {"a}
{1/4}
{*}
{:)}
{:(}
Lists
Textile also supports ordered and unordered lists. You simply place an
asterisk or pound sign, followed with a space at the start of your
lines.
Simple lists:
* one
* two
* three
Multi-level lists:
* one
** one A
** one B
*** one B1
* two
** two A
** two B
* three
Ordered lists:
# one
# two
# three
Styling lists:
(class#id)* one
* two
* three
The above sets the class and id attributes for the <ul> tag.
*(class#id) one
* two
* three
The above sets the class and id attributes for the first <li> tag.
Definition lists:
dl. textile:a cloth, especially one manufactured by weaving
or knitting; a fabric
format:the arrangement of data for storage or display.
Note that there is no space between the term and definition. The term
must be at the start of the line (or following the "dl" signature as
shown above).
Tables
Textile supports tables. Tables must be in their own block and must
have pipe characters delimiting the columns. An optional block
signature of "table" may be used, usually for applying style, class, id
or other options to the table element itself.
From the simple:
|a|b|c|
|1|2|3|
To the complex:
table(fig). {color:red}_|Top|Row|
{color:blue}|/2. Second|Row|
|_{color:green}. Last|
Modifiers can be specified for the table signature itself, for a table
row (prior to the first "|" character) and for any cell (following the
"|" for that cell). Note that for cells, a period followed with a space
must be placed after any modifiers to distinguish the modifier from the
cell content.
Modifiers allowed are:
"{style rule}"
A CSS style rule.
"(class)" or "(#id)" or "(class#id)"
A CSS class and/or id attribute.
"(" (one or more)
Adds 1em of padding to the left for each "(" character.
")" (one or more)
Adds 1em of padding to the right for each ")" character.
"<" Aligns to the left (floats to left for tables if combined with the
")" modifier).
">" Aligns to the right (floats to right for tables if combined with
the "(" modifier).
"=" Aligns to center (sets left, right margins to "auto" for tables).
"<>"
For cells only. Justifies text.
"^" For rows and cells only. Aligns to the top.
"~" (tilde)
For rows and cells only. Aligns to the bottom.
"_" (underscore)
Can be applied to a table row or cell to indicate a header row or
cell.
"\2" or "\3" or "\4", etc.
Used within cells to indicate a colspan of 2, 3, 4, etc. columns.
When you see "\", think "push forward".
"/2" or "/3" or "/4", etc.
Used within cells to indicate a rowspan of 2, 3, 4, etc. rows.
When you see "/", think "push downward".
When a cell is identified as a header cell and an alignment is
specified, that becomes the default alignment for cells below it. You
can always override this behavior by specifying an alignment for one of
the lower cells.
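To illustrate the "period followed with a space" rule for cell modifiers, the sketch below splits a cell into modifiers and content. It recognizes only the header, span and style modifiers, and the parse_cell name is hypothetical; the module's own parser handles more cases.

```perl
use strict;
use warnings;

# Split a cell such as "/2. Second" or "_{color:green}. Last" into
# its modifiers and its content.
sub parse_cell {
    my ($cell) = @_;
    my %info = ( header => 0, content => $cell );
    # Modifiers count only when followed by a period and a space.
    if ( $cell =~ /^((?:_|\\\d+|\/\d+|\{[^}]*\})+)\.\s(.*)$/s ) {
        my ( $mods, $content ) = ( $1, $2 );
        $info{content} = $content;
        # Remove the style rule first so its text cannot be mistaken
        # for another modifier.
        $info{style}   = $1 if $mods =~ s/\{([^}]*)\}//;
        $info{header}  = 1  if $mods =~ /_/;
        $info{colspan} = $1 if $mods =~ /\\(\d+)/;
        $info{rowspan} = $1 if $mods =~ /\/(\d+)/;
    }
    return \%info;
}

my $cell = parse_cell('/2. Second');
print "rowspan=$cell->{rowspan} content=$cell->{content}\n";
```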
CSS Notes
When CSS is enabled (and it is by default), CSS class names are
automatically applied in certain situations.
Aligning a block or span or other element to left, right, etc.
"left" for left justified, "right" for right justified, "center"
for centered text, "justify" for full-justified text.
Aligning an image to the top or bottom
"top" for top alignment, "bottom" for bottom alignment, "middle"
for middle alignment.
Footnotes
"footnote" is applied to the paragraph tag for the footnote text
itself. An id of "fn" plus the footnote number is placed on the
paragraph for the footnote as well. For the footnote superscript
tag, a class of "footnote" is used.
Capped text
For a series of characters that are uppercased, a span is placed
around them with a class of "caps".
Miscellaneous
Textile tries to do its very best to ensure proper XHTML syntax. It
will even attempt to fix errors you may introduce writing in HTML
yourself. Unescaped "&" characters within URLs will be properly
escaped. Singlet tags such as br, img and hr are checked for the "/"
terminator (and it's added if necessary). The best way to make sure you
produce valid XHTML with Textile is to not use any HTML markup at all--
use the Textile syntax and let it produce the markup for you.
BUGS & SOURCE
Text::Textile is hosted at GitHub.
Source: http://github.com/bradchoate/text-textile/tree/master
Bugs: http://github.com/bradchoate/text-textile/issues
COPYRIGHT & LICENSE
Copyright 2005-2009 Brad Choate, brad@bradchoate.com.
This program is free software; you can redistribute it and/or modify it
under the terms of either:
· the GNU General Public License as published by the Free Software
Foundation; either version 1, or (at your option) any later
version, or
· the Artistic License version 2.0.
Text::Textile is an adaptation of Textile, developed by Dean Allen of
Textism.com.
perl v5.14.1 2009-08-08 Text::Textile(3)