# MongoDB Basics
## Contents
- [Introduction](#introduction)
- [Fundamentals](#fundamentals)
  - [Installation and Updates](#installation-and-updates)
    - [RHEL / CentOS](#rhel--centos)
    - [Ubuntu / Debian](#ubuntu--debian)
  - [User Management](#user-management)
  - [Network Connectivity](#network-connectivity)
    - [MongoDB Default Ports](#mongodb-default-ports)
  - [System Level Software Configuration](#system-level-software-configuration)
    - [Red Hat / CentOS](#red-hat--centos)
    - [Ubuntu / Debian](#ubuntu--debian-1)
- [Intermediate Troubleshooting](#intermediate-troubleshooting)
  - [Kill errant MongoDB Thread](#kill-errant-mongodb-thread)
  - [Check Replica Status](#check-replica-status)
  - [Check Sharding Status](#check-sharding-status)
- [References](#references)
## Introduction
MongoDB is a document-oriented database; it uses JSON constructs both to store data and to interact with the system itself. Many commands may look a little odd coming from a MySQL background, but the concepts behind them are broadly the same. The 10gen website has a page detailing how to apply MySQL knowledge to MongoDB; the main term mappings are:
| **SQL Terms/Concepts** | **MongoDB Terms/Concepts** |
| ---------------------- | ------------------------------ |
| database | database |
| table | collection |
| row | document or BSON document |
| column | field |
| index | index |
| table joins | embedded documents and linking |
| primary key | primary key (the `_id` field) |
| aggregation (group by) | aggregation pipeline |
A few examples of SELECT statements from the same page:
| **SQL Select Statements** | **MongoDB find() Statements** |
| ------------------------------------------------------------ | ------------------------------------------------ |
| `SELECT * FROM users` | `db.users.find()` |
| `SELECT id, user_id, status FROM users` | `db.users.find({},{user_id:1,status:1})` |
| `SELECT * FROM users WHERE status="A" ORDER BY user_id DESC` | `db.users.find({status:"A"}).sort({user_id:-1})` |
> In MongoDB 2.4 (the version used throughout this page), the `.explain()` method **runs the query**, which is exactly the **opposite of MySQL**. Be very careful not to use `.explain()` with any data-altering command (think UPDATE / INSERT / DELETE in MySQL).
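
To make the `find()` mappings in the table above concrete, here is a minimal sketch in plain JavaScript that mimics two of them over an in-memory array. The sample documents and the helper names (`findActiveSortedDesc`, `projectUserFields`) are assumptions made for illustration only; they are not part of the mongo shell. The sketch also shows why `SELECT id, user_id, status` maps to a projection that never mentions `_id`: the `_id` field is returned unless it is excluded explicitly.

```javascript
// Hypothetical sample documents standing in for the `users` collection.
var users = [
  { _id: 1, user_id: "abc", status: "A" },
  { _id: 2, user_id: "xyz", status: "A" },
  { _id: 3, user_id: "mno", status: "D" }
];

// Plain-JavaScript equivalent of:
//   db.users.find({status: "A"}).sort({user_id: -1})
function findActiveSortedDesc(docs) {
  return docs
    .filter(function (doc) { return doc.status === "A"; })
    .sort(function (a, b) {
      if (a.user_id === b.user_id) { return 0; }
      return a.user_id < b.user_id ? 1 : -1; // descending order
    });
}

// Plain-JavaScript equivalent of the projection {user_id: 1, status: 1}.
// _id is kept because it is only dropped when excluded with _id: 0.
function projectUserFields(doc) {
  return { _id: doc._id, user_id: doc.user_id, status: doc.status };
}

var result = findActiveSortedDesc(users).map(projectUserFields);
// result holds the two status "A" documents, "xyz" first, then "abc".
```

Against a real server the same result would come from the shell statements in the table; the array version is only meant to show what those statements ask for.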
## Fundamentals
### Installation and Updates
#### RHEL / CentOS
Use a standard Yum repository configuration:
```
# vi /etc/yum.repos.d/10gen.repo
[10gen]
name=10gen Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64
gpgcheck=0
enabled=1
# yum install mongo-10gen mongo-10gen-server
```
#### Ubuntu / Debian
Use a standard APT sources configuration:
```
# apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
# echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' >> /etc/apt/sources.list.d/10gen.list
# apt-get update
# apt-get install mongodb-10gen
```
### User Management
MongoDB uses role-based access control at the database level. The `system.users` collection holds data that corresponds roughly to the `mysql.user` table in MySQL, but it is not manipulated in quite the same way. The 10gen website has good introductory material on access control.
> **Authentication is disabled by default** in an out-of-the-box installation! Refer to the 10gen access control documentation and tutorials for basic user administration tasks if users need to be created or repaired; most production configurations will already have security practices applied.
### Network Connectivity
#### MongoDB Default Ports
**27017**
- default port for `mongod` and `mongos` instances
- change the port with `--port` / `port`
- bind to an address with `--bind_ip` / `bind_ip`
- define the replica set with `--replSet` / `replSet`
- set the data directory with `--dbpath` / `dbpath`
**27018**
- default port when running with `--shardsvr` / `shardsvr`
**27019**
- default port when running with `--configsvr` / `configsvr`
**28017**
- default port for the web status page
- always served at the instance port + 1000
- disable with `--nohttpinterface` / `nohttpinterface`
- no authentication by default
- enable the REST interface with `--rest` / `rest`
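
The port rules listed above can be sketched as a small lookup. This is only an illustration: the role names used as keys and the helper names (`listenPort`, `webStatusPort`) are hypothetical, chosen for this sketch rather than taken from MongoDB itself.

```javascript
// Default ports from the list above, keyed by how the process is started.
// The key names are labels invented for this sketch.
var DEFAULT_PORTS = {
  standalone: 27017, // mongod or mongos with no role flag
  shardsvr: 27018,   // mongod --shardsvr
  configsvr: 27019   // mongod --configsvr
};

// Returns the port a process listens on: an explicit --port value wins,
// otherwise the default for its role applies.
function listenPort(role, explicitPort) {
  if (typeof explicitPort === "number") { return explicitPort; }
  if (!Object.prototype.hasOwnProperty.call(DEFAULT_PORTS, role)) {
    throw new Error("unknown role: " + role);
  }
  return DEFAULT_PORTS[role];
}

// The web status page is served on the listening port + 1000.
function webStatusPort(role, explicitPort) {
  return listenPort(role, explicitPort) + 1000;
}
```

For example, `webStatusPort("standalone")` gives 28017, and an instance started with `--port 27001` would expose its status page on 28001, which is worth remembering when writing firewall rules.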
### System Level Software Configuration
Vendor packages place the default configuration files, service scripts, and data directories in each platform's standard locations. Subtle differences exist between the platforms:
#### Red Hat / CentOS
- /etc/mongod.conf
- /etc/sysconfig/mongod
- /etc/rc.d/init.d/mongod
- /var/log/mongo/mongod.log
- /var/lib/mongo/
- ~/.mongorc.js
#### Ubuntu / Debian
- /etc/mongodb.conf
- /etc/init/mongodb.conf
- /etc/init.d/mongodb
- /var/log/mongodb/mongodb.log
- /var/lib/mongodb/
- ~/.mongorc.js
## Intermediate Troubleshooting
### Kill errant MongoDB Thread
Killing an errant thread in MongoDB is directly analogous to killing one in MySQL: list the running operations, find the one in question, and issue a command to terminate it.
> Do not kill threads that are compacting databases or background threads that are indexing data; this can lead to database corruption.
First, use the `db.currentOp()` mongo shell command to list the running operations; this is analogous to `SHOW FULL PROCESSLIST` in MySQL.
```
$ mongo
MongoDB shell version: 2.4.5
connecting to: test
> db.currentOp()
{
  "inprog" : [
    {
      "opid" : 2506233,
      "active" : true,
      "secs_running" : 140,
      "op" : "update",
      "ns" : "generators.sensor_readings",
      "query" : {
        "$where" : "function(){sleep(500);return false;}"
      },
      "client" : "127.0.0.1:51773",
      "desc" : "conn20",
      "threadId" : "0x7f694753d700",
      "connectionId" : 20,
      "locks" : {
        "^" : "w",
        "^generator" : "W"
      },
      "waitingForLock" : false,
      "numYields" : 279,
      "lockStats" : {
        "timeLockedMicros" : {
          "r" : NumberLong(0),
          "w" : NumberLong(280242564)
        },
        "timeAcquiringMicros" : {
          "r" : NumberLong(0),
          "w" : NumberLong(140420592)
        }
      }
    },
    {
      "opid" : 2507691,
      "active" : false,
      "op" : "query",
      "ns" : "",
      "query" : {
      },
      "client" : "127.0.0.1:51772",
      "desc" : "conn19",
      "threadId" : "0x7f6962e4a700",
      "connectionId" : 19,
      "locks" : {
        "^generator" : "R"
      },
      "waitingForLock" : true,
      "numYields" : 0,
      "lockStats" : {
        "timeLockedMicros" : {
        },
        "timeAcquiringMicros" : {
        }
      }
    }
  ]
}
```
The example above shows two operations; the fields to look at are `waitingForLock`, `secs_running`, and `op`. The operation we are looking for is the first one, with `opid` 2506233, as it is the one locking up our database; notice that it holds a `W` (write) lock in the `locks` subdocument. Kill it with the `db.killOp()` command only if you are sure the data it is writing can be lost; this is a dangerous operation and should be considered carefully. Read operations are generally safe to kill in an emergency.
```
> db.killOp(2506233);
{ "info" : "attempting to kill op" }
> db.currentOp()
{ "inprog" : [ ] }
```
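The triage described above (look at `active`, `secs_running`, `op`, and `waitingForLock`) can be sketched as a helper that filters a `db.currentOp()` result. The function name `longRunningOps`, the `needsReview` flag, and the 60-second threshold are assumptions made for this sketch; the sample data is a trimmed copy of the two operations shown earlier.

```javascript
// Returns operations from a db.currentOp() result that are active and have
// been running for at least minSecs seconds, longest-running first.
function longRunningOps(currentOpResult, minSecs) {
  return currentOpResult.inprog
    .filter(function (op) {
      return op.active === true && (op.secs_running || 0) >= minSecs;
    })
    .sort(function (a, b) {
      return (b.secs_running || 0) - (a.secs_running || 0);
    })
    .map(function (op) {
      return {
        opid: op.opid,
        op: op.op,
        ns: op.ns,
        secs_running: op.secs_running,
        waitingForLock: op.waitingForLock,
        // Anything that is not a plain read is flagged for a closer look
        // before it is killed.
        needsReview: op.op !== "query" && op.op !== "getmore"
      };
    });
}

// Trimmed copy of the two operations from the db.currentOp() output above.
var sample = {
  inprog: [
    { opid: 2506233, active: true, secs_running: 140, op: "update",
      ns: "generators.sensor_readings", waitingForLock: false },
    { opid: 2507691, active: false, op: "query", ns: "",
      waitingForLock: true }
  ]
};

var suspects = longRunningOps(sample, 60);
// suspects contains only opid 2506233, with needsReview set to true.
```

In the mongo shell the same function could be called as `longRunningOps(db.currentOp(), 60)`. It deliberately stops short of calling `db.killOp()`, so the decision to kill a write stays with the operator.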
### Check Replica Status
Somewhat similar to MySQL, replication relies on two pieces of configuration working together: a startup option on each process and the replica set configuration applied from the shell. The core `mongod` process must be started with a config file setting or command line flag that tells it which replica set it belongs to. This is the `replSet` keyword, and it can be any string so long as all instances (processes) share the same name. For example, here are three processes started on the same server to test a replica set:
```
# mongod --dbpath 1 --port 27001 --smallfiles --oplogSize 50 \
    --logpath 1.log --logappend --fork --replSet w4
# mongod --dbpath 2 --port 27002 --smallfiles --oplogSize 50 \
    --logpath 2.log --logappend --fork --replSet w4
# mongod --dbpath 3 --port 27003 --smallfiles --oplogSize 50 \
    --logpath 3.log --logappend --fork --replSet w4
```
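With the three processes running, the set still has to be given its member list from the shell. As a minimal sketch, the configuration document passed to `rs.initiate()` for these instances could be built as below. The helper name `buildReplSetConfig` and the member numbering are assumptions for illustration; the host name `mongo1c` is borrowed from the status output later in this section.

```javascript
// Builds a configuration document of the shape rs.initiate() accepts:
//   { _id: <replSet name>, members: [ { _id: <n>, host: "<host:port>" }, ... ] }
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map(function (host, index) {
      // Member ids only need to be unique integers; numbering from 1 is an
      // arbitrary choice made for this sketch.
      return { _id: index + 1, host: host };
    })
  };
}

// The set name must match the --replSet value used when starting mongod.
var config = buildReplSetConfig("w4", [
  "mongo1c:27001",
  "mongo1c:27002",
  "mongo1c:27003"
]);

// In the mongo shell, connected to one of the three instances:
//   rs.initiate(config)
```

Passing the whole member list to `rs.initiate()` is one option; initiating with a single member and then calling `rs.add()` for the others reaches the same end state.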
Once the replica set is initiated and configured (using the `rs.initiate()` and `rs.add()` / `rs.reconfig()` commands), check its status from any member of the set with the `rs.status()` command:
```
$ mongo --port 27002
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:27002/test
w4:PRIMARY> rs.status()
{
  "set" : "w4",
  "date" : ISODate("2013-08-19T18:53:23Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 1,
      "name" : "mongo1c:27002",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 586,
      "optime" : Timestamp(1376937880, 1),
      "optimeDate" : ISODate("2013-08-19T18:44:40Z"),
      "self" : true
    },
    {
      "_id" : 2,
      "name" : "mongo1c:27003",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 584,
      "optime" : Timestamp(1376937880, 1),
      "optimeDate" : ISODate("2013-08-19T18:44:40Z"),
      "lastHeartbeat" : ISODate("2013-08-19T18:53:21Z"),
      "lastHeartbeatRecv" : ISODate("2013-08-19T18:53:21Z"),
      "pingMs" : 0,
      "syncingTo" : "mongo1c:27002"
    },
    {
      "_id" : 3,
      "name" : "mongo1c:27001",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 523,
      "optime" : Timestamp(1376937880, 1),
      "optimeDate" : ISODate("2013-08-19T18:44:40Z"),
      "lastHeartbeat" : ISODate("2013-08-19T18:53:23Z"),
      "lastHeartbeatRecv" : ISODate("2013-08-19T18:53:21Z"),
      "pingMs" : 0,
      "syncingTo" : "mongo1c:27002"
    }
  ],
  "ok" : 1
}
```
Notice how the `stateStr` field identifies the PRIMARY (writer) of the set. Unlike MySQL, the PRIMARY role can move between nodes on the fly, whether automatically by election or through manual action (such as taking a node offline for maintenance work). Commands such as `rs.freeze()`, `rs.stepDown()`, and `rs.remove()` exist to manipulate the replica set. You can also query the instance you are logged into with the `db.isMaster()` command to get another view of which member is the PRIMARY:
```
w4:PRIMARY> db.isMaster()
{
  "setName" : "w4",
  "ismaster" : true,
  "secondary" : false,
  "hosts" : [
    "mongo1c:27002",
    "mongo1c:27001",
    "mongo1c:27003"
  ],
  "primary" : "mongo1c:27002",
  "me" : "mongo1c:27002",
  "maxBsonObjectSize" : 16777216,
  "maxMessageSizeBytes" : 48000000,
  "localTime" : ISODate("2013-08-19T18:58:40.488Z"),
  "ok" : 1
}
```
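Reading `stateStr` and `health` by eye gets tedious on larger sets, so the check can be scripted. The following is a minimal sketch that summarises an `rs.status()` result; the function name `summarizeReplicaSet` and the shape of its return value are assumptions for illustration, and the sample is a trimmed copy of the `rs.status()` output shown earlier.

```javascript
// Summarises an rs.status() result: which member is PRIMARY and which
// members report a health value other than 1.
function summarizeReplicaSet(status) {
  var primary = null;
  var unhealthy = [];
  status.members.forEach(function (member) {
    if (member.stateStr === "PRIMARY") { primary = member.name; }
    if (member.health !== 1) { unhealthy.push(member.name); }
  });
  return { set: status.set, primary: primary, unhealthy: unhealthy };
}

// Trimmed copy of the rs.status() output shown earlier in this section.
var sampleStatus = {
  set: "w4",
  members: [
    { _id: 1, name: "mongo1c:27002", health: 1, stateStr: "PRIMARY" },
    { _id: 2, name: "mongo1c:27003", health: 1, stateStr: "SECONDARY" },
    { _id: 3, name: "mongo1c:27001", health: 1, stateStr: "SECONDARY" }
  ]
};

var summary = summarizeReplicaSet(sampleStatus);
// summary.primary is "mongo1c:27002" and summary.unhealthy is empty.
```

In the mongo shell, `summarizeReplicaSet(rs.status())` would produce the same summary for a live set. A `primary` value of `null` means no member currently reports itself as PRIMARY, for example while an election is in progress.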
### Check Sharding Status
Connect to the query router (`mongos`) to view the sharding configuration:
```
$ mongo localhost:27108/admin -u admin -p
mongos> sh.status()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
    { "_id" : "db1", "host" : "db1:27001,db2:27001,db3:27001" }
    { "_id" : "db2", "host" : "db3:27002,db1:27002,db2:27002" }
    { "_id" : "db3", "host" : "db2:27003,db3:27003,db1:27003" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "generators", "partitioned" : true, "primary" : "db1" }
      generators.sensor_readings chunks:
        db3 3
        db2 6
        db1 6
```
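The `chunks` lines at the end of the output show how many chunks of `generators.sensor_readings` live on each shard. As a small sketch of how to read them, the helper below totals the counts and reports the gap between the fullest and emptiest shard. The function name `chunkSpread` is hypothetical, and the counts are copied by hand from the output above rather than read from a server.

```javascript
// Summarises per-shard chunk counts such as those printed by sh.status().
function chunkSpread(chunkCounts) {
  var total = 0;
  var min = Infinity;
  var max = -Infinity;
  Object.keys(chunkCounts).forEach(function (shard) {
    var n = chunkCounts[shard];
    total += n;
    if (n < min) { min = n; }
    if (n > max) { max = n; }
  });
  return { total: total, min: min, max: max, spread: max - min };
}

// Counts copied from the generators.sensor_readings chunks lines above.
var spread = chunkSpread({ db3: 3, db2: 6, db1: 6 });
// spread.total is 15 and spread.spread is 3 (6 chunks versus 3).
```

Whether a given spread matters depends on the cluster; the sketch only puts a number on what the listing shows.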
## References