DOWNLOAD-ENTITIES(1) User Contributed Perl Documentation DOWNLOAD-ENTITIES(1)
NAME
download-entities - download and parse XML Entity definitions
SYNOPSIS
$ perl download-entities.pl -i # interactive
$ perl download-entities.pl > output-file.pm
$ perl download-entities.pl output-file.pm
# instead of http://www.w3.org/2003/entities/iso9573-2003/
$ perl download-entities.pl http://my.server.com/entities.html
DESCRIPTION
This script downloads the definitions of XML entities from
http://www.w3.org/2003/entities/iso9573-2003/ or from whatever address
you give it as an argument. The argument should be a URL (that
LWP::UserAgent::get can access) pointing to a document with (absolute
or relative) references to files ending with the ".ent" suffix. These
files are expected to be DTDs with lines like
<!ENTITY amp "&" >
The script parses these files and prints a Perl module to standard
output. If you wish, you can give "file" as another argument
to the script and it will then write the module to "file". You can also
specify the output file in the environment variable "OUTPUT_FILE".
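The parsing step can be illustrated with a minimal sketch. It is not the script's actual code: the helper name parse_entities is hypothetical, the second sample declaration is made up, and only the simple one-line form shown above is handled.

```perl
use strict;
use warnings;

# Hypothetical helper: collect name => value pairs from DTD text.
# Assumption: every declaration has the one-line form
#   <!ENTITY name "value" >
# with a double-quoted value; anything else is ignored.
sub parse_entities {
    my ($dtd_text) = @_;
    my %entities;
    while ($dtd_text =~ /<!ENTITY\s+(\S+)\s+"([^"]*)"\s*>/g) {
        $entities{$1} = $2;
    }
    return \%entities;
}

# Sample input: the first line is from the text above, the second is made up.
my $sample = <<'END_DTD';
<!ENTITY amp "&" >
<!ENTITY lt "<" >
END_DTD

my $entities = parse_entities($sample);
for my $name (sort keys %$entities) {
    print "$name => $entities->{$name}\n";
}
```

A real DTD may also contain single-quoted values, parameter entities and declarations spanning several lines, none of which this sketch covers.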
The index and the output file are distinguished by the presence of
a "://" substring. If you want to use a locally stored index file (the
one with the .ent references), you can access it by saying
perl download-entities.pl file:///path/to/index.html
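The "://" rule for telling the index from the output file can be sketched as follows. The helper name classify_args is hypothetical, and the sketch leaves out "OUTPUT_FILE" because the text does not say how it interacts with a file argument.

```perl
use strict;
use warnings;

# Hypothetical helper: split arguments into an index URL and an output file.
# An argument containing "://" is taken as the index, anything else as the
# output file. An undefined output file stands for standard output here.
sub classify_args {
    my ($default_index, @args) = @_;
    my ($index, $output) = ($default_index, undef);
    for my $arg (@args) {
        if (index($arg, '://') >= 0) {
            $index = $arg;
        }
        else {
            $output = $arg;
        }
    }
    return ($index, $output);
}

my $default = 'http://www.w3.org/2003/entities/iso9573-2003/';
my ($index, $output) =
    classify_args($default, 'file:///path/to/index.html', 'output-file.pm');
print "index:  $index\n";
print "output: $output\n";
```

Under this rule a bare local path such as index.html would be treated as the output file, which is why the file:// form is needed for a local index.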
Note that the script currently distinguishes between relative and
absolute paths by looking at whether the href contains a "://"
substring. This can lead to crashes when a link is root-relative, for
example href="/path/file.ent".
Also, the script assumes the links have exactly the format href="...",
with double quotes.
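The two limitations can be seen in a small sketch. It only mimics the heuristics described above and is not taken from the script: the helper names ent_links and resolve_link are hypothetical, and joining relative links to the directory of the index URL is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: find double-quoted hrefs ending in ".ent".
# Single-quoted or unquoted hrefs are not matched.
sub ent_links {
    my ($html) = @_;
    return $html =~ /href="([^"]+\.ent)"/g;
}

# Hypothetical helper mimicking the "://" heuristic: a link containing
# "://" is kept as-is, anything else is appended to the index URL with
# its last path segment stripped (assumed behaviour).
sub resolve_link {
    my ($index_url, $href) = @_;
    return $href if index($href, '://') >= 0;
    (my $base = $index_url) =~ s{[^/]*$}{};
    return $base . $href;
}

# Made-up index content with three link styles.
my $html = q{<a href="first.ent">1</a> <a href='second.ent'>2</a> }
         . q{<a href="/path/file.ent">3</a>};

for my $href (ent_links($html)) {
    print "$href -> ", resolve_link('http://my.server.com/entities.html', $href), "\n";
}
```

In this sketch the single-quoted link is skipped entirely, and the root-relative link is joined to the base directory with a doubled slash instead of being resolved against the server root. A more robust approach would be to resolve links with the URI module, for example URI->new_abs($href, $index_url).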
Interactive download
In case you run into problems downloading the documents, you can try to
run the script with the "-i" or "--interactive" option. This will let
you skip downloads or enter alternative URLs for individual documents.
The interactive mode is also triggered when the "INTERACTIVE"
environment variable is set to a true value (in the Perl sense).
Options
Besides the "--interactive" option, this script also accepts the
"--timeout" option. It specifies the timeout for LWP::UserAgent in
seconds when downloading. The same is controlled by the
"DOWNLOAD_TIMEOUT" environment variable. The default timeout (180s) is
used when neither is specified.
# 10 seconds timeout - croak on failure
perl download-entities.pl --timeout 10 > XML/Entities/Data.pm
# 5 seconds timeout - croak on failure
DOWNLOAD_TIMEOUT=5 perl download-entities.pl > XML/Entities/Data.pm
# 1 second timeout - ask on failure
perl download-entities.pl --interactive --timeout 1 > XML/Entities/Data.pm
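How the options and environment variables might combine can be sketched with Getopt::Long. This is an illustration, not the script's code: the helper name resolve_settings is hypothetical, and letting a command-line option override the corresponding environment variable is an assumption the text does not confirm.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: resolve settings from arguments and environment.
# Assumption: command-line options take precedence over the environment,
# and 180 seconds is used when neither source gives a timeout.
sub resolve_settings {
    my ($argv, $env) = @_;
    my %opt = (
        interactive => $env->{INTERACTIVE} ? 1 : 0,
        timeout     => $env->{DOWNLOAD_TIMEOUT} // 180,
    );
    GetOptionsFromArray(
        $argv,
        'interactive|i' => \$opt{interactive},
        'timeout=i'     => \$opt{timeout},
    ) or die "Bad options\n";
    return \%opt;
}

my $settings = resolve_settings(['--interactive', '--timeout', '1'], {});
print "interactive=$settings->{interactive} timeout=$settings->{timeout}\n";
```

The resulting value could then be handed to the user agent, for example with LWP::UserAgent->new(timeout => $settings->{timeout}).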
Dependencies
This script has dependencies that the "XML::Entities" module does not
have and that are therefore not mentioned in the META.yml file. These are
"LWP::UserAgent", "File::Basename" and "Fatal".
COPYRIGHT
Copyright 2010 Jan Oldrich Kruza <sixtease@cpan.org>. All rights
reserved.
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
perl v5.14.1 2010-08-26 DOWNLOAD-ENTITIES(1)