Boulder::LocusLink(3) User Contributed Perl Documentation Boulder::LocusLink(3)
NAME
Boulder::LocusLink - Fetch LocusLink data records as parsed Boulder Stones
SYNOPSIS
# parse a file of LocusLink records
$ll = new Boulder::LocusLink(-accessor=>'File',
-param => '/home/data/LocusLink/LL_tmpl');
while (my $s = $ll->get) {
print $s->Identifier;
print $s->Gene;
}
# parse flatfile records yourself
open (LL,"/home/data/LocusLink/LL_tmpl") or die "cannot open LL_tmpl: $!";
local $/ = "*RECORD*";
while (<LL>) {
my $s = Boulder::LocusLink->parse($_);
# etc.
}
DESCRIPTION
Boulder::LocusLink provides retrieval and parsing services for NCBI
LocusLink records. It returns LocusLink entries in Stone format,
allowing easy access to the various fields and values.
Boulder::LocusLink is a descendant of Boulder::Stream, and provides a
stream-like interface to a series of Stone objects.
Access to LocusLink is provided by a single accessor, which gives access
to a local LocusLink database. When you create a new Boulder::LocusLink
stream, you provide the accessor, along with accessor-specific
parameters that control which entries to fetch. The accessor is:
File
This provides access to local LocusLink entries by reading from a
flat file (typically the Hs.dat file downloadable from NCBI's FTP site).
The stream will return a Stone corresponding to each of the entries
in the file, starting from the top of the file and working downward.
The parameter is the path to the local file.
It is also possible to parse a single LocusLink entry from a text
string stored in a scalar variable, returning a Stone object.
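The record-splitting step that precedes single-entry parsing can be
sketched in plain Perl. This is a minimal illustration, not part of the
module: split_records is a hypothetical helper, the "*RECORD*" separator
is the one used in the SYNOPSIS, and the sample text is a placeholder
rather than real LocusLink data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Boulder::LocusLink): split flat-file
# text into per-record chunks using a custom input record separator.
sub split_records {
    my ($text, $separator) = @_;
    open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";
    local $/ = $separator;            # read up to each separator
    my @chunks;
    while (my $chunk = <$fh>) {
        chomp $chunk;                 # removes a trailing separator, if any
        next unless $chunk =~ /\S/;   # skip chunks with no visible content
        push @chunks, $chunk;
    }
    close $fh;
    return @chunks;
}

# Placeholder text; a real file would hold LocusLink records.
my @records = split_records("*RECORD*\nfirst\n*RECORD*\nsecond\n", "*RECORD*");
print scalar(@records), " chunks\n";
```

Each chunk could then be handed to Boulder::LocusLink->parse. Note that
the SYNOPSIS passes chunks to parse() without chomping; the separator is
stripped here only to make the chunks easier to inspect.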
Boulder::LocusLink methods
This section lists the public methods that the Boulder::LocusLink class
makes available.
new()
# Local fetch via File
$ug=new Boulder::LocusLink(-accessor => 'File',
-param => '/data/LocusLink/Hs.dat');
The new() method creates a new Boulder::LocusLink stream on the
accessor provided. The only possible accessor is File. If
successful, the method returns the stream object. Otherwise it
returns undef.
new() takes the following arguments:
-accessor Name of the accessor to use
-param Parameters to pass to the accessor
Specify the accessor to use with the -accessor argument. If not
specified, it defaults to File.
-param is an accessor-specific argument. The possibilities are:
For File, the -param argument must point to a string-valued scalar,
which will be interpreted as the path to the file to read LocusLink
entries from.
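Because new() returns undef on failure, callers may want to check both
the file and the return value. The following is a minimal sketch under
that assumption; open_locuslink_stream is a hypothetical wrapper, not a
method of the module, and the path in the usage comment is the one from
the example above.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of Boulder::LocusLink): return a stream
# for the given flat file, or die with a readable message.
sub open_locuslink_stream {
    my ($path) = @_;
    die "no LocusLink file path given\n" unless defined $path;
    die "LocusLink file '$path' is not readable\n" unless -r $path;

    # Load the module lazily, only once the path looks usable.
    require Boulder::LocusLink;
    my $stream = Boulder::LocusLink->new(-accessor => 'File',
                                         -param    => $path)
        or die "Boulder::LocusLink->new failed for '$path'\n";
    return $stream;
}

# Usage:
#   my $ll = open_locuslink_stream('/data/LocusLink/Hs.dat');
#   while (my $s = $ll->get) { print $s->Identifier, "\n" }
```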
get()
The get() method is inherited from Boulder::Stream, and simply
returns the next parsed LocusLink Stone, or undef if there is
nothing more to fetch. It has the same semantics as the parent
class, including the ability to restrict access to certain top-
level tags.
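A typical read loop calls get() until it returns undef. The sketch below
assumes only what is described above: an object with a get() method that
returns Stones, each answering to Identifier. collect_identifiers is a
hypothetical helper introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: drain a stream and collect the Identifier of each
# Stone. Works with any object whose get() returns undef when exhausted.
sub collect_identifiers {
    my ($stream) = @_;
    my @ids;
    while (my $s = $stream->get) {
        my $id = $s->Identifier;      # single-valued top-level tag
        push @ids, $id if defined $id;
    }
    return @ids;
}

# Usage, given a stream created with new():
#   my @ids = collect_identifiers($ll);
#   print "$_\n" for @ids;
```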
put()
The put() method is inherited from the parent Boulder::Stream
class, and will write the passed Stone to standard output in
Boulder format. This means that it is currently not possible to
write a Boulder::LocusLink object back into LocusLink flatfile
form.
OUTPUT TAGS
The tags returned by the parsing operation are taken from the names
shown in the flat file Hs.dat, since the database producer does not yet
provide a better description of them.
Top-Level Tags
These are tags that appear at the top level of the parsed LocusLink
entry.
Identifier
The LocusLink identifier of this entry. Identifier is a single-
value tag.
Example:
my $identifierNo = $s->Identifier;
Current_locusid
If a locus has been merged with another, Current_locusid contains
the previous LOCUSID line. (The name is a bit confusing, since
"previous_locusid" would be more accurate, but this is how it is
defined in the NCBI README file.)
Example:
my $prevlocusid=$s->Current_locusid;
Organism
Source species, based on NCBI's Taxonomy.
Example:
my $theorganism=$s->Organism;
Status
Type of reference sequence record. "PROVISIONAL" means the record was
generated automatically from an existing GenBank record and
information stored in the LocusLink database, with no curation.
"REVIEWED" means the record was generated from the most representative
complete GenBank sequence, or a merge of GenBank sequences, and from
information stored in the LocusLink database.
Example:
my $thestatus=$s->Status;
LocAss
A complex record made up of:
LOCUS_STRING
NM The value in the LOCUS field of the RefSeq record
NP The RefSeq accession number for an mRNA record
PRODUCT The name of the product of this transcript
TRANSVAR A variant-specific description
ASSEMBLY The GenBank accession used to assemble the RefSeq record
Example:
my $theprod=$s->LocAss->Product;
AccProt
A complex record made up of:
ACCNUM Nucleotide sequence accession number
TYPE e=EST, m=mRNA, g=Genomic
PROT Set of PID values for the coding region or regions annotated on
the nucleotide record. The first value is the PID (an integer or
null), then either MMDB or na, separated from the PID by a |. If MMDB
is present, it indicates that structure data are available for a
protein related to the protein referenced by the PID.
Example:
my $theprot=$s->AccProt->Prot;
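The PID|MMDB layout of a PROT value can be taken apart with a plain
split. This is a minimal sketch: split_prot_value is a hypothetical
helper, it assumes the value arrives as a single string with the two
parts joined by |, and the sample value is a placeholder rather than
real data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Boulder::LocusLink): split one PROT
# value of the form "PID|MMDB" or "PID|na" into its two parts.
sub split_prot_value {
    my ($value) = @_;
    my ($pid, $flag) = split /\|/, $value, 2;
    $flag = '' unless defined $flag;   # tolerate a value with no "|"
    return {
        pid           => $pid,                        # kept as given, e.g. an integer or "null"
        has_structure => ($flag eq 'MMDB') ? 1 : 0,   # true only for MMDB
    };
}

my $info = split_prot_value('0000000|MMDB');   # placeholder value
print "PID $info->{pid}, structure data: $info->{has_structure}\n";
```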
OFFICIAL_SYMBOL The symbol used for gene reports, validated by the
appropriate nomenclature committee
PREFERRED_SYMBOL Interim symbol used for display
OFFICIAL_GENE_NAME The gene description used for gene reports, validated
by the appropriate nomenclature committee. If the symbol is official,
the gene name will be official. No records will have both official and
interim nomenclature.
PREFERRED_GENE_NAME Interim gene name used for display
PREFERRED_PRODUCT The name of the product used in the RefSeq record
ALIAS_SYMBOL Other symbols associated with this gene
ALIAS_PROT Other protein names associated with this gene
PhenoTable A complex record made up of Phenotype and Phenotype_ID
SUmmary
Unigene
Omim
Chr
Map
STS
ECNUM
ButTable A complex record made up of BUTTON and LINK
DBTable A complex record made up of DB_DESCR and DB_LINK
PMID A subset of the publications associated with this locus, given as a
comma-separated list of PubMed unique identifiers
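A comma-separated PMID value can be turned into a list with a short
helper. This sketch assumes the tag value is delivered as one string;
split_pmids is a hypothetical name, and the identifiers in the usage
line are placeholders, not real PubMed IDs for any locus.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Boulder::LocusLink): split a
# comma-separated PMID string into a list of identifiers.
sub split_pmids {
    my ($value) = @_;
    return () unless defined $value;
    my @ids;
    for my $piece (split /,/, $value) {
        $piece =~ s/^\s+//;            # trim leading whitespace
        $piece =~ s/\s+$//;            # trim trailing whitespace
        push @ids, $piece if length $piece;
    }
    return @ids;
}

my @pmids = split_pmids('111, 222,333');   # placeholder identifiers
print scalar(@pmids), " PubMed identifiers\n";
```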
SEE ALSO
Boulder, Boulder::Blast, Boulder::Genbank
AUTHOR
Lincoln Stein <lstein@cshl.org>. Luca I.G. Toldo <luca.toldo@merck.de>
Copyright (c) 1997 Lincoln D. Stein
Copyright (c) 1999 Luca I.G. Toldo
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself. See DISCLAIMER.txt for
disclaimers of warranty.
perl v5.14.1 2002-12-14 Boulder::LocusLink(3)