KinoSearch::Index::DataWriter(3) User Contributed Perl Documentation KinoSearch::Index::DataWriter(3)
NAME
KinoSearch::Index::DataWriter - Write data to an index.
SYNOPSIS
# Abstract base class.
DESCRIPTION
DataWriter is an abstract base class for writing index data, generally
in segment-sized chunks. Each component of an index -- e.g. stored
fields, lexicon, postings, deletions -- is represented by a
DataWriter/DataReader pair.
Components may be specified per index by subclassing Architecture.
CONSTRUCTORS
new( [labeled params] )
my $writer = MyDataWriter->new(
snapshot => $snapshot, # required
segment => $segment, # required
polyreader => $polyreader, # required
);
· snapshot - The Snapshot that will be committed at the end of the
indexing session.
· segment - The Segment in progress.
· polyreader - A PolyReader representing all existing data in the
index. (If the index is brand new, the PolyReader will have no
sub-readers).
ABSTRACT METHODS
add_inverted_doc( [labeled params] )
Process a document, previously inverted by "inverter".
· inverter - An Inverter wrapping an inverted document.
· doc_id - Internal number assigned to this document within the
segment.
add_segment( [labeled params] )
Add content from an existing segment into the one currently being
written.
· reader - The SegReader containing content to add.
· doc_map - An array of integers mapping old document ids to new.
Deleted documents are mapped to 0, indicating that they should be
skipped.
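The doc_map convention can be illustrated without any KinoSearch objects. The sketch below is only a stand-in: it uses a plain Perl array for the map and a hypothetical %old_content hash for the segment data, and it assumes document ids start at 1 so that slot 0 of the array is unused.

```perl
use strict;
use warnings;

# Stand-in for a doc_map: index = old doc id, value = new doc id.
# Slot 0 is unused because this sketch assumes doc ids start at 1.
my @doc_map = ( 0, 1, 0, 2, 3 );    # old doc 2 was deleted

# Hypothetical per-document payloads from the old segment.
my %old_content = ( 1 => 'alpha', 2 => 'beta', 3 => 'gamma', 4 => 'delta' );

my %new_content;
for my $old_id ( 1 .. $#doc_map ) {
    my $new_id = $doc_map[$old_id];
    next if $new_id == 0;    # mapped to 0: deleted, so skip it
    $new_content{$new_id} = $old_content{$old_id};
}

print "$_ => $new_content{$_}\n" for sort { $a <=> $b } keys %new_content;
```

A real add_segment() implementation would read from the SegReader rather than a hash, but the skip-on-zero test would look much the same.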
finish()
Complete the segment: close all streams, store metadata, etc.
format()
Every writer must specify a file format revision number, which should
increment each time the format changes. Responsibility for revision
checking is left to the companion DataReader.
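One way the division of labor might look is sketched below with mock classes. The package names, the check_format() helper, and the $MAX_SUPPORTED_FORMAT limit are all hypothetical; none of them are part of the KinoSearch API. The sketch also assumes the reader obtains the number from the stored metadata.

```perl
use strict;
use warnings;

package MyWriterSketch;    # hypothetical stand-in for a DataWriter subclass

sub new    { return bless {}, shift }
sub format { return 2 }    # bump whenever the file format changes

package MyReaderSketch;    # hypothetical stand-in for the companion reader

my $MAX_SUPPORTED_FORMAT = 2;

# Reject metadata written in a format newer than this reader understands.
sub check_format {
    my ( $class, $metadata ) = @_;
    my $format = $metadata->{format};
    die "Missing 'format' in metadata\n" unless defined $format;
    die "Unsupported format $format (max $MAX_SUPPORTED_FORMAT)\n"
        if $format > $MAX_SUPPORTED_FORMAT;
    return $format;
}

package main;

my $writer   = MyWriterSketch->new;
my %metadata = ( format => $writer->format );

# A matching revision passes; a newer one is refused.
my $ok  = MyReaderSketch->check_format( \%metadata );
my $err = eval { MyReaderSketch->check_format( { format => 3 } ); 1 } ? '' : $@;

print "accepted format $ok\n";
print "rejected: $err";
```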
METHODS
delete_segment(reader)
Remove a segment's data. The default implementation is a no-op, as all
files within the segment directory will be automatically deleted.
Subclasses which manage their own files outside of the segment system
should override this method and use it as a trigger for cleaning up
obsolete data.
· reader - The SegReader for the segment whose data is to be removed,
which must represent a segment which is part of the current snapshot.
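As a rough illustration of the override case, the sketch below uses a mock class that keeps one side file per segment in a directory of its own. Everything here is an assumption made for the example: the class name, the ".side" naming scheme, and the hash with a seg_name key standing in for a SegReader.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use File::Spec;

package ExternalFileWriterSketch;    # hypothetical; not a KinoSearch class

sub new {
    my ( $class, %args ) = @_;
    return bless { dir => $args{dir} }, $class;
}

# Hypothetical naming scheme: one side file per segment name.
sub _path_for {
    my ( $self, $seg_name ) = @_;
    return File::Spec->catfile( $self->{dir}, "$seg_name.side" );
}

# Use the call as a trigger for removing data kept outside the segment.
sub delete_segment {
    my ( $self, $reader ) = @_;
    my $path = $self->_path_for( $reader->{seg_name} );
    if ( -e $path ) {
        unlink $path or die "Can't unlink '$path': $!";
    }
    return;
}

package main;

my $dir    = tempdir( CLEANUP => 1 );
my $writer = ExternalFileWriterSketch->new( dir => $dir );
my $reader = { seg_name => 'seg_1' };    # stand-in for a SegReader

# Create an obsolete side file, then let delete_segment() clean it up.
my $path = $writer->_path_for('seg_1');
open my $fh, '>', $path or die "Can't open '$path': $!";
print $fh "obsolete data\n";
close $fh or die "Can't close '$path': $!";

$writer->delete_segment($reader);
my $still_there = -e $path ? 1 : 0;
print $still_there ? "still there\n" : "cleaned up\n";
```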
merge_segment( [labeled params] )
Move content from an existing segment into the one currently being
written.
The default implementation calls add_segment() then delete_segment().
· reader - The SegReader containing content to merge, which must
represent a segment which is part of the current snapshot.
· doc_map - An array of integers mapping old document ids to new.
Deleted documents are mapped to 0, indicating that they should be
skipped.
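The default behavior described above amounts to two calls in sequence. The mock class below only mimics that ordering and records it; it is not KinoSearch's actual implementation, and the "calls" log is a device invented for the example.

```perl
use strict;
use warnings;

package MergeSketch;    # hypothetical mock, not a KinoSearch class

sub new { return bless { calls => [] }, shift }

sub add_segment {
    my ( $self, %args ) = @_;
    push @{ $self->{calls} }, 'add_segment';
    return;
}

sub delete_segment {
    my ( $self, $reader ) = @_;
    push @{ $self->{calls} }, 'delete_segment';
    return;
}

# Mirrors the documented default: copy the content, then drop the original.
sub merge_segment {
    my ( $self, %args ) = @_;
    $self->add_segment( reader => $args{reader}, doc_map => $args{doc_map} );
    $self->delete_segment( $args{reader} );
    return;
}

package main;

my $writer = MergeSketch->new;
$writer->merge_segment( reader => {}, doc_map => [ 0, 1, 2 ] );
print join( ' then ', @{ $writer->{calls} } ), "\n";
```

A subclass that can move data more cheaply than copy-and-delete could override merge_segment() as a whole instead of the two pieces.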
metadata()
Arbitrary metadata to be serialized and stored by the Segment. The
default implementation supplies a Hash with a single key-value pair for
"format".
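A subclass that needs to store more than the format number could extend the hash returned by its parent. The sketch below shows the pattern with mock classes and a plain Perl hash reference; the class names and the "compression" key are hypothetical.

```perl
use strict;
use warnings;

package BaseWriterSketch;    # hypothetical mock of the default behavior

sub new    { return bless {}, shift }
sub format { return 1 }

# Default: a single key-value pair for "format".
sub metadata {
    my $self = shift;
    return { format => $self->format };
}

package CustomWriterSketch;    # hypothetical subclass
our @ISA = ('BaseWriterSketch');

# Start from the parent's metadata and add one extra entry.
sub metadata {
    my $self = shift;
    my $meta = $self->SUPER::metadata;
    $meta->{compression} = 'none';
    return $meta;
}

package main;

my $meta = CustomWriterSketch->new->metadata;
print "$_ => $meta->{$_}\n" for sort keys %$meta;
```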
get_snapshot()
Accessor for "snapshot" member var.
get_segment()
Accessor for "segment" member var.
get_polyreader()
Accessor for "polyreader" member var.
get_schema()
Accessor for "schema" member var.
get_folder()
Accessor for "folder" member var.
INHERITANCE
KinoSearch::Index::DataWriter isa KinoSearch::Object::Obj.
COPYRIGHT AND LICENSE
Copyright 2005-2010 Marvin Humphrey
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
perl v5.14.1 2011-06-20 KinoSearch::Index::DataWriter(3)