MongoDB::DataTypes(3) User Contributed Perl Documentation MongoDB::DataTypes(3)
NAME
MongoDB::DataTypes - The data types used with MongoDB
DESCRIPTION
This document describes the types you can save to the database and use
in queries with the Perl driver. If you are using another language,
please refer to that language's documentation (<http://api.mongodb.org>).
NOTES FOR SQL PROGRAMMERS
You must query for data using the correct type.
For example, it is perfectly valid to have some records where the field
"foo" is 123 (integer) and other records where "foo" is "123" (string).
Thus, you must query for the correct type. If you save "{"foo" =>
"123"}", you cannot query for it with "{"foo" => 123}". MongoDB is
strict about types.
If the type of a field is ambiguous and important to your application,
you should document what you expect the application to send to the
database and convert your data to those types before sending. There
are some object-document mappers that will enforce certain types for
certain fields for you.
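One way to follow this advice is to normalize each document against a
documented field-to-type map before handing it to the driver. The
sketch below is illustrative only: the %FIELD_TYPES map and the
normalize_doc helper are hypothetical names, not part of the driver,
and only integer and string coercion are shown.

```perl
use strict;
use warnings;

# Hypothetical, application-defined map of field name => expected type.
my %FIELD_TYPES = (
    user_id => "int",
    zip     => "string",
);

# Return a copy of the document with known fields coerced to their
# documented types; fields not listed in the map are passed through as-is.
sub normalize_doc {
    my ($doc) = @_;
    my %out;
    for my $field (keys %$doc) {
        my $value = $doc->{$field};
        my $type  = $FIELD_TYPES{$field} || "";
        if    ($type eq "int")    { $out{$field} = int($value) }
        elsif ($type eq "string") { $out{$field} = "$value" }
        else                      { $out{$field} = $value }
    }
    return \%out;
}

my $doc = normalize_doc({ user_id => "123", zip => 4101, name => "fred" });
# $doc->{user_id} is now the number 123 and $doc->{zip} the string "4101".
```

The normalized hash reference can then be passed to insert or used as a
query, so that both sides agree on the type of each field.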
You generally shouldn't save numbers as strings, as they will behave
like strings (e.g., range queries won't work correctly) and the data
will take up more space. If you set MongoDB::BSON#looks_like_number,
the driver will automatically convert everything that looks like a
number to a number before sending it to the database.
Numbers are the only exception to the strict typing: all number types
stored by MongoDB (32-bit integers, 64-bit integers, 64-bit floating
point numbers) will match each other.
TYPES
Numbers
By default, numbers with a decimal point will be saved as doubles
(64-bit).
32-bit Platforms
Numbers without decimal points will be saved as 32-bit integers. To
save a number as a 64-bit integer, use bigint:
use bigint;
$collection->insert({"user_id" => 28347197234178});
The driver will die if you try to insert a number beyond the signed
64-bit range: -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807.
Numbers that are saved as 64-bit integers will be decoded as doubles.
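Since the driver dies on values outside the signed 64-bit range, it
can be useful to check a value before inserting it. The following is a
minimal sketch using the core Math::BigInt module; fits_int64 is a
hypothetical helper name, and values are passed as strings so that no
precision is lost on the way in.

```perl
use strict;
use warnings;
use Math::BigInt;

# Bounds of the signed 64-bit range quoted above.
my $INT64_MIN = Math::BigInt->new("-9223372036854775808");
my $INT64_MAX = Math::BigInt->new("9223372036854775807");

# Hypothetical helper: returns 1 if the integer (given as a string or
# number) fits in a signed 64-bit integer, 0 otherwise.
sub fits_int64 {
    my ($value) = @_;
    my $n = Math::BigInt->new("$value");
    return 0 if $n->is_nan;    # not an integer at all
    return ($n->bcmp($INT64_MIN) >= 0 && $n->bcmp($INT64_MAX) <= 0) ? 1 : 0;
}

print fits_int64("28347197234178")      ? "ok\n" : "out of range\n";
print fits_int64("9223372036854775808") ? "ok\n" : "out of range\n";
```

The first value is within range; the second is one past the maximum and
is rejected.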
64-bit Platforms
Numbers without a decimal point will be saved and returned as 64-bit
integers. Note that there is no way to save a 32-bit int on a 64-bit
machine.
Keep in mind that this can cause some weirdness to ensue if some
machines are 32-bit and others are 64-bit. Take the following example:
· Programmer 1 saves an int on a 32-bit platform.
· Programmer 2 retrieves the document on a 64-bit platform and re-
saves it, effectively converting it to a 64-bit int.
· Programmer 1 retrieves the document on their 32-bit machine, which
decodes the 64-bit int as a double.
Nothing drastic, but good to be aware of.
64-bit integers in the shell
The Mongo shell has one numeric type: the 8-byte float. This means
that it cannot always represent an 8-byte integer exactly. Thus, when
you display a 64-bit integer in the shell, it will be wrapped in a
subobject that indicates it might be an approximate value. For
instance, if we run this Perl on a 64-bit machine:
$coll->insert({_id => 1});
then look at it in the shell, we see:
> db.whatever.findOne()
{
"_id" :
{
"floatApprox" : 1
}
}
This doesn't mean that we saved a float; it just means that the float
value of a 64-bit integer may not be exact.
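To see why the shell hedges, consider an integer just above 2**53, the
point where 8-byte floats stop representing every integer exactly. The
sketch below round-trips such a value through a native double using
pack and unpack; it assumes the platform's native double is 8 bytes,
which is the usual case.

```perl
use strict;
use warnings;

# 2**53 + 1, written as a string so the digits are preserved exactly.
my $original = "9007199254740993";

# Round-trip the value through a native double (assumed to be 8 bytes),
# which is the shell's only numeric type.
my $as_double = unpack("d", pack("d", $original));
my $shown     = sprintf("%.0f", $as_double);

print "stored:   $original\n";
print "as float: $shown\n";
print "exact:    ", ($shown eq $original ? "yes" : "no"), "\n";
```

No 8-byte float has the value 9007199254740993, so the round trip
cannot reproduce the original digits.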
Dealing with numbers and strings in Perl
Perl is very flexible about whether something is a number or a string:
it generally infers the type from context. Unfortunately, the driver
doesn't have any context when it has to choose how to serialize a
variable. Therefore, the default behavior is to introspect the flags
that are set on that variable, which are generally affected by the last
operation performed on it, and decide what the user meant.
my $var = "4";
# stored as the string "4"
$collection->insert({myVar => $var});
$var = int($var) if (int($var) eq $var);
# stored as the int 4
$collection->insert({myVar => $var});
Because of this, users often find that they end up with more strings
than they wanted in their database.
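The flags in question can be inspected from Perl with the core B
module. The sketch below is only an approximation of the idea:
guessed_type is a hypothetical helper, and the order in which it checks
the flags is an assumption, not the driver's actual logic.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: guess how a scalar might be serialized by looking
# at its internal flags. Takes a reference to the scalar.
sub guessed_type {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    return "string" if $flags & B::SVf_POK;    # has a string value
    return "double" if $flags & B::SVf_NOK;    # has a floating-point value
    return "int"    if $flags & B::SVf_IOK;    # has an integer value
    return "other";
}

my $var = "4";
my $before = guessed_type(\$var);    # the string flag is set

$var = int($var);
my $after = guessed_type(\$var);     # only the integer flag is set

print "before: $before, after: $after\n";
```

The value is 4 in both cases; only the last operation performed on the
variable differs.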
If you would like to have everything that looks like a number saved as
a number, set the MongoDB::BSON#looks_like_number option.
$MongoDB::BSON::looks_like_number = 1;
my $var = "4";
# stored as the int 4
$collection->insert({myVar => $var});
This will send anything that "looks like" a number as a number. It can
recognize anything that Scalar::Util's "looks_like_number" function can
recognize.
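To get a feel for what would be converted, you can call Scalar::Util's
"looks_like_number" directly on sample values. This sketch only
exercises that function; it does not involve the driver, and the sample
list is made up for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Illustrative sample values, all given as strings.
my @samples = ("4", "04101", "4.5", "1e5", "0x10", "abc", "");

# Record the verdict for each sample: 1 for number-like, 0 otherwise.
my %verdict = map { $_ => (looks_like_number($_) ? 1 : 0) } @samples;

for my $s (@samples) {
    printf "%-8s %s\n", "'$s'", $verdict{$s} ? "number" : "string";
}
```

Note that "04101" and "1e5" count as numbers, while "0x10" and the
empty string do not.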
On the other hand, sometimes there is data that looks like a number but
should be saved as a string. For example, suppose we were storing zip
codes. If we wanted to generally convert strings to numbers, we might
have something like:
$MongoDB::BSON::looks_like_number = 1;
# zip is stored as an int: 4101
$collection->insert({city => "Portland", "zip" => "04101"});
To force a "number" to be saved as a string with aggressive number
conversion on, bless the string as a "MongoDB::BSON::String" type:
my $z = "04101";
my $zip = bless(\$z, "MongoDB::BSON::String");
# zip is stored as "04101"
$collection->insert({city => "Portland", zip => $zip});
Strings
All strings must be valid UTF-8 to be sent to the database. If a
string is not valid, it will not be saved. If you need to save a
non-UTF-8 string, you can save it as a binary blob (see the Binary Data
section below).
All strings returned from the database have the UTF-8 flag set.
Unfortunately, due to Perl weirdness, UTF-8 is not very pretty. For
example, suppose we have a UTF-8 string:
my $str = 'Åland Islands';
Now, let's print it:
print "$str\n";
You can see in the output:
"\x{c5}land Islands"
Lovely, isn't it? This is how Perl prints UTF-8. To make it "pretty,"
there are a couple of options:
my $pretty_str = $str;
utf8::encode($pretty_str);
This, unintuitively, clears the UTF-8 flag. Note that utf8::encode
works in place and does not return the encoded string.
You can also just run
binmode STDOUT, ':utf8';
and then the string (and all future UTF-8 strings) will print
"correctly."
You can also turn off $MongoDB::BSON::utf8_flag_on, and the UTF-8 flag
will not be set when strings are decoded:
$MongoDB::BSON::utf8_flag_on = 0;
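The effect of the UTF-8 flag can be observed without a database. The
sketch below builds the same string from its code point, turns the flag
on with utf8::upgrade to mimic a decoded string (an assumption about
what a returned string looks like), and compares it with an encoded
copy.

```perl
use strict;
use warnings;

# "Åland Islands" as a character string; \x{c5} is the code point for Å.
my $str = "\x{c5}land Islands";
utf8::upgrade($str);    # assumption: mimic a decoded string with the flag on

# utf8::encode works in place, so encode a copy to keep the original.
my $bytes = $str;
utf8::encode($bytes);

printf "characters: %d, flag: %s\n",
    length($str), utf8::is_utf8($str) ? "on" : "off";
printf "bytes:      %d, flag: %s\n",
    length($bytes), utf8::is_utf8($bytes) ? "on" : "off";
```

The character string has 13 characters with the flag on; the encoded
copy has 14 bytes with the flag off, because Å takes two bytes in
UTF-8.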
Arrays
Arrays must be saved as array references ("\@foo", not @foo).
Embedded Documents
Embedded documents are of the same form as top-level documents: either
hash references or Tie::IxHash objects.
Dates
The DateTime package can be used to insert and query for dates. Dates
stored in the database will be returned as instances of DateTime.
An example of storing and retrieving a date:
use DateTime;
my $now = DateTime->now;
$collection->insert({'ts' => $now});
my $obj = $collection->find_one;
print "Today is ".$obj->{'ts'}->ymd."\n";
An example of querying for a range of dates:
my $start = DateTime->from_epoch( epoch => 100000 );
my $end = DateTime->from_epoch( epoch => 500000 );
my $cursor = $collection->query({event => {'$gt' => $start, '$lt' => $end}});
Warning: creating Perl DateTime objects is extremely slow. Consider
saving dates as numbers and converting the numbers to DateTimes when
needed. A single DateTime field can make deserialization up to 10
times slower.
For example, you could use the time function to store seconds since the
epoch:
$collection->update($criteria, {'$set' => {"last modified" => time()}});
This will be faster to deserialize.
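If dates are stored as epoch seconds, range bounds and display strings
can be computed with core modules only. This is a sketch under that
assumption: the stored value and the date bounds are made-up examples,
and Time::Local and POSIX are used here in place of DateTime.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Stored value: seconds since the epoch, as time() would return.
my $stored = 100000;    # example value standing in for a saved time()

# Query bounds: convert calendar dates (UTC) to epoch seconds.
# timegm takes sec, min, hour, day, month (0-based), year.
my $start = timegm(0, 0, 0, 1, 0, 1970);    # 1970-01-01 00:00:00 UTC
my $end   = timegm(0, 0, 0, 3, 0, 1970);    # 1970-01-03 00:00:00 UTC
my $in_range = ($stored > $start && $stored < $end) ? 1 : 0;

# Display: format the number only when a readable date is needed.
my $shown = strftime("%Y-%m-%d %H:%M:%S", gmtime($stored));
print "$shown (in range: $in_range)\n";
```

The same $start and $end numbers could be used with '$gt' and '$lt' in
a query, since the stored field is a plain number.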
Regular Expressions
Use "qr/.../" to use a regular expression in a query:
my $cursor = $collection->query({"name" => qr/[Jj]oh?n/});
Regular expressions will match strings saved in the database.
You can also save and retrieve regular expressions themselves:
$collection->insert({"regex" => qr/foo/i});
$obj = $collection->find_one;
if ("FOO" =~ $obj->{'regex'}) { # matches
print "hooray\n";
}
Note for Perl 5.8 users: flags are lost when regular expressions are
retrieved from the database (this does not affect queries or Perl
5.10+).
Booleans
Use the boolean package to get boolean values. "boolean::true" and
"boolean::false" are the only parts of the package used, currently.
An example of inserting boolean values:
use boolean;
$collection->insert({"okay" => true, "name" => "fred"});
An example using boolean values for query operators (only returns
documents where the name field exists):
my $cursor = $collection->query({"name" => {'$exists' => boolean::true}});
Most of the time, you can just use 1 or 0 instead of "true" and
"false", such as for specifying fields to return. boolean is the only
way to save booleans to the database, though.
By default, booleans are returned from the database as integers. To
return booleans as booleans, set $MongoDB::BSON::use_boolean to 1.
MongoDB::OID
"OID" stands for "Object ID", and is a unique id that is automatically
added to documents if they do not already have an "_id" field before
they are saved to the database. OIDs are 12 bytes long and are guaranteed
to be unique. Their string form is a 24-character string of
hexadecimal digits.
To create a unique id:
my $oid = MongoDB::OID->new;
To create a MongoDB::OID from an existing 24-character hexadecimal
string:
my $oid = MongoDB::OID->new("value" => "123456789012345678901234");
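The relationship between the 24-character string form and the 12 bytes
can be checked with pack and unpack. This sketch does not use
MongoDB::OID itself; the validation pattern is an assumption about what
a well-formed string looks like, based on the description above.

```perl
use strict;
use warnings;

my $hex = "123456789012345678901234";

# Assumed well-formedness check: exactly 24 hexadecimal digits.
my $is_valid = ($hex =~ /\A[0-9a-fA-F]{24}\z/) ? 1 : 0;

# Two hex digits encode one byte, so 24 digits pack into 12 bytes.
my $bytes      = pack("H*", $hex);
my $round_trip = unpack("H*", $bytes);

printf "valid: %d, bytes: %d, round trip: %s\n",
    $is_valid, length($bytes), $round_trip;
```

Checking the string before constructing an OID from user input avoids
passing malformed values to the driver.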
Binary Data
By default, all database strings are UTF8. To save images, binaries,
and other non-UTF8 data, you can pass the string as a reference to the
database. For example:
# non-utf8 string
my $string = "\xFF\xFE\xFF";
$collection->insert({"photo" => \$string});
This will save the variable as binary data, bypassing the UTF8 check.
Binary data can be matched exactly by the database, so this query will
match the object we inserted above:
$collection->find({"photo" => \$string});
Comparisons (e.g., $gt, $lt) may not work as you expect with binary
data, so it is worth experimenting.
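When it is not known in advance whether a byte string is valid UTF-8,
the choice between a plain string and a reference can be made at run
time. In this sketch, string_or_binary is a hypothetical helper; it
assumes its input is a byte string and uses utf8::decode on a copy
purely as a well-formedness test.

```perl
use strict;
use warnings;

# Hypothetical helper: return the string itself if it is well-formed
# UTF-8, or a reference to it so that it is saved as binary data.
sub string_or_binary {
    my ($bytes) = @_;
    my $copy = $bytes;    # utf8::decode works in place, so test a copy
    return utf8::decode($copy) ? $bytes : \$bytes;
}

my $text  = string_or_binary("hello");
my $photo = string_or_binary("\xFF\xFE\xFF");

print ref($text)  ? "binary\n" : "string\n";
print ref($photo) ? "binary\n" : "string\n";
```

The first value is returned unchanged; the second comes back as a
scalar reference, which is the form that is saved as binary data.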
MongoDB::Code
MongoDB::Code is used to represent JavaScript code and, optionally,
scope. To create one:
use MongoDB::Code;
my $code = MongoDB::Code->new("code" => "function() { return 'hello, world'; }");
Or, with a scope:
my $code = MongoDB::Code->new("code" => "function() { return 'hello, '+name; }",
"scope" => {"name" => "Fred"});
Which would then return "hello, Fred" when run.
MongoDB::MinKey
"MongoDB::MinKey" is "less than" any other value of any type. This can
be useful for always returning certain documents first (or last).
"MongoDB::MinKey" has no methods, fields, or string form. To create
one, it is sufficient to say:
my $minKey = bless {}, "MongoDB::MinKey";
MongoDB::MaxKey
"MongoDB::MaxKey" is "greater than" any other value of any type. This
can be useful for always returning certain documents last (or first).
"MongoDB::MaxKey" has no methods, fields, or string form. To create
one, it is sufficient to say:
my $maxKey = bless {}, "MongoDB::MaxKey";
MongoDB::Timestamp
my $ts = MongoDB::Timestamp->new({sec => $seconds, inc => $increment});
Timestamps are used internally by MongoDB's replication. You can see
them in their natural habitat by querying "local.oplog.$main". Each
entry looks something like:
{ "ts" : { "t" : 1278872990000, "i" : 1 }, "op" : "n", "ns" : "", "o" : { } }
In the shell, timestamps are shown in milliseconds, although they are
stored as seconds. So, to represent this document in Perl, we would
do:
my $oplog = {
"ts" => MongoDB::Timestamp->new("sec" => 1278872990, "inc" => 1),
"op" => "n",
"ns" => "",
"o" => {}
};
Timestamps are not dates. You should not use them unless you are doing
something low-level with replication. To save dates or times, use a
number or DateTime.
perl v5.14.2 2011-08-29 MongoDB::DataTypes(3)