MongoDB is a NoSQL database management system that supports CRUD operations, data aggregation, geospatial queries and storage of binary data using GridFS. Data is stored within documents as BSON, the binary representation of JSON. Documents are stored within collections, and collections are stored within a database. Databases and collections are created implicitly along with the first insertion of a document, e.g. after switching with `use my-new-database`. MongoDB uses the WiredTiger storage engine (including support for encryption at rest).
Also, check out the official MongoDB manual.
Running latest mongodb as Docker container with auth enabled
MongoDB 6.0.3 is the latest version at the time of writing. To enable authentication we pass `--auth` and two env variables that will set the initial root username and password:

```
> docker pull mongo
> docker run --name="mongodb-with-auth" --env=MONGO_INITDB_ROOT_USERNAME=admin --env=MONGO_INITDB_ROOT_PASSWORD=secure -p 27017:27017 mongo --auth
```
Authentication and Authorization
Authentication verifies a user’s identity (e.g. via password). Authorization determines which roles the user has and thus what the user is allowed to do.
By default, MongoDB’s auth is disabled, but with the `docker run` command above we have at least created an initial root user with password in the `admin` database. To enable auth on a regular installation, edit `/etc/mongod.conf` and uncomment `authorization: enabled`. Finally restart MongoDB.

Showing all database users
Once you are authenticated, you can display all users:
As proof that we have a root user, we run this query:
```
> mongosh
> use admin;
> db.auth('admin', 'secure');
> db.system.users.find();
// This will return the admin user showing
// roles: [ { role: 'root', db: 'admin' } ]
```
Don’t be surprised that you can continue to connect to MongoDB using `mongosh` even without specifying a user and password. But eventually you will want to run commands (or actions) which require elevated privileges; otherwise you get this error:
MongoServerError: command find requires authentication
Authenticate yourself
There are two ways to authenticate yourself. Using credentials when connecting to mongodb:
> mongosh --authenticationDatabase "admin" -u "admin" -p
or after you connected without a user and password:
```
> use admin;
> db.auth("admin", passwordPrompt());
```
Note that you have to switch to the right database before attempting to authenticate yourself. In this case we only created a user on the `admin` database; that’s why authenticating with the admin user on a database like `test` would not work.
Built-in roles in mongodb
MongoDB grants access to data and commands through role-based authorization and provides built-in roles covering the different levels of access commonly needed in a database system. Which specific actions/commands you can run with a specific role is described on the official docs page. You can additionally create user-defined roles.
- Database user roles (`read`, `readWrite`) are provided on every database
- Database administration roles (`dbAdmin`, `dbOwner`, `userAdmin`) are provided on every database
- All other roles are provided only on the `admin` database. Those roles are
  - `clusterAdmin`
  - `clusterManager`
  - `clusterMonitor`
  - `hostManager`
  - `backup`
  - `restore`
  - `readAnyDatabase`
  - `readWriteAnyDatabase`
  - `userAdminAnyDatabase`
  - `dbAdminAnyDatabase`
- And then there is the role `root`, which provides full privileges on all resources
Add user with specific role to specific database
Once you have authenticated yourself as described above, we can add a user with a specific role to any database.
We will use Salted Challenge Response Authentication Mechanism (SCRAM) which is a password-based mutual authentication protocol designed to make an eavesdropping attack (i.e. man-in-the-middle) more difficult.
```
> use my-database;
> db.createUser({
    user: "my-user",
    pwd: passwordPrompt(), // or enter password directly as string
    roles: [
      { role: "readWrite", db: "my-database" },
      { role: "read", db: "another-database" }
    ]
  });
```
Your connection string is `mongodb://my-user:my-password@localhost:27017/my-database`.
Using mongodb shell
Mongodb shell (mongosh) and database tools for Windows can be downloaded from the official MongoDB website. Mongosh is actually a JavaScript shell, so you can assign variables and execute JS. It can be started with:
> mongosh
mongosh is also available in Compass, when you click on `>_MONGOSH` in the lower left area of the window.
| Command | Description |
| --- | --- |
| `show dbs;` | Show all databases |
| `use mydb;` | Switch to database `mydb` |
| `show collections;` | Show collections of the db you are currently in |
| `db.getName();` | Show name of database you are currently in |
Naming convention
You cannot use two databases whose names differ only in case, like `salesData` and `SalesData`. On Windows you cannot use `/. "$*<>:|?` in database names, nor `/. "$` on Unix. Database names must be fewer than 64 characters.

Start collection names with a letter or underscore. Don’t use `$` or the `system.` prefix in the name.
The field name `_id` is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.
More thresholds and limitations.
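As an illustrative aid, these naming rules can be sanity-checked client-side in a few lines of plain JavaScript (this is not MongoDB’s actual validation code, just a sketch of the rules above):

```javascript
// Illustrative client-side sanity check for database names, based on the
// Windows character list above; not MongoDB's actual validation logic.
const FORBIDDEN = /[\/. "$*<>:|?]/;

function isValidDbName(name) {
  return name.length > 0 && name.length < 64 && !FORBIDDEN.test(name);
}

console.log(isValidDbName("salesData")); // true
console.log(isValidDbName("bad/name")); // false: contains "/"
console.log(isValidDbName("x".repeat(64))); // false: not fewer than 64 chars
```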
MongoDB Data types
The full list of supported data types.
Importing JSON data into mongodb
After installing the database tools as described above, you can run
> mongoimport --db=mydatabase --jsonArray myFile.json
Inserting a doc manually
Creates a doc in `mycollection` and creates `mycollection` automatically if it did not exist before.
Insert One
```
db.mycollection.insertOne({title: "Example"});
> { acknowledged: true, insertedId: ObjectId("63b6f9c389117715d028f6d2") }
```
ObjectId is a unique value which also contains an encoded timestamp (the doc’s creation time), which can be useful for sorting.
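As a sketch of that encoding: the first 4 bytes (8 hex characters) of an ObjectId are the creation time in seconds since the Unix epoch, which you can decode in plain JavaScript without any driver (in mongosh you would simply call `ObjectId(...).getTimestamp()`):

```javascript
// The first 4 bytes (8 hex chars) of an ObjectId encode its creation time
// as seconds since the Unix epoch.
function objectIdToDate(objectIdHex) {
  const seconds = parseInt(objectIdHex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

// Decoding the ObjectId from the insert above:
console.log(objectIdToDate("63b6f9c389117715d028f6d2").toISOString());
// → 2023-01-05T16:24:35.000Z
```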
Insert many
```
db.mycollection.insertMany([{title: "Example 2"}, {title: "Example 3"}]);
> { acknowledged: true, insertedIds: { '0': ObjectId("63b7048f89117715d028f6d7"), '1': ObjectId("63b7048f89117715d028f6d8") } }
```
The following methods can also add new documents to a collection:

- `db.collection.updateOne()` when used with the `upsert: true` option
- `db.collection.updateMany()` when used with the `upsert: true` option
- `db.collection.findAndModify()` when used with the `upsert: true` option
- `db.collection.findOneAndUpdate()` when used with the `upsert: true` option
- `db.collection.findOneAndReplace()` when used with the `upsert: true` option
- `db.collection.bulkWrite()`
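For example, a sketch of an upsert (collection and field names are just placeholders): with `upsert: true`, the doc is inserted if no doc matches the filter.

```javascript
// If no doc matches the filter, a new doc is created from the filter
// plus the $set fields (the result will show upsertedCount: 1).
db.mycollection.updateOne(
  { title: "Missing" },
  { $set: { status: "created-by-upsert" } },
  { upsert: true }
);
```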
Querying docs
Using `find()` actually returns a cursor. That means that you can chain other functions such as `limit()`, `skip()` etc. after `find()`.

To pretty-print the results on the console you can add `pretty()` to all your searches, like `db.mycollection.find().pretty();`.
Find all docs
```
db.mycollection.find();
> { _id: ObjectId("63b6f9c389117715d028f6d2"), title: 'Example' }
```
Filter by field value
```
db.mycollection.find({title: "Example"});

// find docs that match title AND age
db.mycollection.find({title: "Example", age: "20"});
```
Filter by nested field value
If you have a nested object like
{ area: "somewhere", product : { title: "My Product", price: 18 } }
then you can filter by the nested field:
```
db.mycollection.find({ "product.title" : "My Product" });

// The above is different from the following:
db.mycollection.find({ "product" : {"title" : "My Product"} });
```
The first query returns the doc, but the second query does not return anything. The reason is that the second query looks for an exact match: a document whose `product` field is exactly `{title: "My Product"}`. Since our doc’s `product` also contains `price: 18`, nothing is returned.
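To make this concrete, here is a minimal plain-JavaScript sketch (illustrative only, not MongoDB’s actual matching engine) of how dotted-path matching differs from exact subdocument matching:

```javascript
const doc = { area: "somewhere", product: { title: "My Product", price: 18 } };

// Dotted path: walk into the nested object and compare a single field.
function matchesDottedPath(doc, path, value) {
  const resolved = path.split(".").reduce((obj, key) => obj?.[key], doc);
  return resolved === value;
}

// Exact match: the whole subdocument must be deeply equal to the query value.
// (JSON.stringify comparison is order-sensitive; good enough for this sketch.)
function matchesExact(doc, field, value) {
  return JSON.stringify(doc[field]) === JSON.stringify(value);
}

console.log(matchesDottedPath(doc, "product.title", "My Product")); // true
console.log(matchesExact(doc, "product", { title: "My Product" })); // false: price missing in query
```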
Filter using regex
db.mycollection.find({title: { $regex: /amp/ } });
Filter which fields are returned
The 2nd argument lets you specify which fields should be returned in the result. `1` includes the field and `0` excludes the field.

```
// filter by title and only return title
db.mycollection.find({title: "Example"}, {title: 1});

// filter by title and return everything but the title
db.mycollection.find({title: "Example"}, {title: 0});

// do not filter and only return title
db.mycollection.find({}, {title: 1});
```
Counting docs
db.mycollection.countDocuments(); // count() is deprecated
Limit number of result docs
db.mycollection.find().limit(2);
Sorting result docs
Sort by `title` in ascending order (`1`) or descending order (`-1`):
db.mycollection.find().sort({"title": 1});
Skipping result docs
Skips the first result.
db.mycollection.find().skip(1);
Comparison operators
```
db.mycollection.find({age : { $gt: 19}});  // greater than
db.mycollection.find({age : { $lt: 19}});  // less than
db.mycollection.find({age : { $lte: 19}}); // less than or equal
```
or-filter
db.mycollection.find({ $or : [{"age" : { $lt:10 }}, {"age": { $gt:40 }}] });
Find docs having all specified values in a doc’s array field (and-search)
Finds docs that have all the specified values in the specified field.
```
// Find all docs that have "A" and "B" in their "tags" field
db.mycollection.find({ "tags" : { $all: ["A", "B"] } });
> { _id: ObjectId("63b70cfb89117715d028f6d9"), title: 'Example Tags', tags: [ 'A', 'B', 'C' ] }
```
Find docs having at least one of the specified values in a doc’s array field (or-search)
```
// Find all docs that have either "A" or "C" in their "tags" field
db.mycollection.find({ "tags" : { $in: ["A", "C"] } });
> { _id: ObjectId("63b70cfb89117715d028f6d9"), title: 'Example Tags', tags: [ 'A', 'B', 'C' ] }
> { _id: ObjectId("63b70d1489117715d028f6da"), title: 'Example Tags 2', tags: [ 'X', 'Y', 'C' ] }
```
Update a single document
Updates existing field in the doc
```
db.mycollection.updateOne({title: "Meeting"}, { $set: { title: "Event"}});
> { acknowledged: true, insertedId: null, matchedCount: 1, modifiedCount: 1, upsertedCount: 0 }
```
Add new field
If the field that is supposed to be updated does not exist, it will create a field:
```
db.mycollection.updateOne({title: "Event"}, { $set: { myNewField: "brand new"}});
> { acknowledged: true, insertedId: null, matchedCount: 1, modifiedCount: 1, upsertedCount: 0 }

db.mycollection.find({title: "Event"})
> { _id: ObjectId("63b702b689117715d028f6d6"), title: 'Event', age: '53', myNewField: 'brand new' }
```
Remove existing field
```
db.mycollection.updateOne({title: "Event"}, { $unset: { myNewField: 1}});
> { acknowledged: true, insertedId: null, matchedCount: 1, modifiedCount: 1, upsertedCount: 0 }

db.mycollection.find({title: "Event"})
> { _id: ObjectId("63b702b689117715d028f6d6"), title: 'Event', age: '53' }
```
Increment a number field
Here we increment (`$inc`) `age` by 5. Negative values would decrement.
```
db.mycollection.find({title: 'Example'})
> { _id: ObjectId("63b6f9c389117715d028f6d2"), title: 'Example', age: 20 }

db.mycollection.updateOne({title: "Example"}, { $inc: { age: 5 }});
> { acknowledged: true, insertedId: null, matchedCount: 1, modifiedCount: 1, upsertedCount: 0 }

db.mycollection.find({title: 'Example'})
> { _id: ObjectId("63b6f9c389117715d028f6d2"), title: 'Example', age: 25 }
```
Updating arrays
Use `$push` to add an array item:
```
db.mycollection.insertOne({"title" : "Example", "tags": ["A", "B", "C"]});
> { acknowledged: true, insertedId: ObjectId("63b7246289117715d028f6df") }

db.mycollection.updateOne({"title" : "Example"}, { $push : {"tags" : "D"} });
> { acknowledged: true, insertedId: null, matchedCount: 1, modifiedCount: 1, upsertedCount: 0 }

db.mycollection.find()
> { _id: ObjectId("63b7246289117715d028f6df"), title: 'Example', tags: [ 'A', 'B', 'C', 'D' ] }
```
and `$pull` to remove an array item:
db.mycollection.updateOne({"title" : "Example"}, { $pull : {"tags" : "D"} });
Delete a doc
```
db.mycollection.deleteOne({"_id" : ObjectId("63b7254989117715d028f6e0")});
db.mycollection.deleteMany({}); // an empty filter deletes all docs
```
Remove an entire collection
db.mycollection.drop()
Indexes
Indexes can dramatically improve the query performance, because instead of traversing through each document, the results will be looked up in a previously created index.
Showing all indexes
db.mycollection.getIndexes();
Creating and deleting indexes
```
// create an index in ascending order for "myfield"
db.mycollection.createIndex({"myfield": 1});
db.mycollection.dropIndex("myfield_1");
```
Getting query performance data
db.mycollection.find({title: "Example"}).explain();
Collection types
Capped collections have a guaranteed insertion order, are limited by disk size or doc count and provide automatic first-in-first-out deletion. This is good for error logging, e.g. a maximum `size` of 10,000 bytes (`size` is always required) or a `max` of 10,000 documents:
db.createCollection("error_log", { capped: true, size: 10000, max: 10000 });
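The FIFO eviction behavior can be sketched in a few lines of plain JavaScript (illustrative only; real capped collections enforce the limits inside the storage engine):

```javascript
// Illustrative FIFO eviction of a capped collection limited to 3 docs:
// once the cap is reached, the oldest doc is dropped on each insert.
class CappedCollectionSketch {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) this.docs.shift(); // evict oldest
  }
}

const log = new CappedCollectionSketch(3);
["a", "b", "c", "d"].forEach(msg => log.insert({ msg }));
console.log(log.docs.map(d => d.msg)); // [ 'b', 'c', 'd' ]
```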
Time series collections allow you to store data that changes over time but also has a key that does not change. That can be useful if you want to track stock data, for example, or performance data sampled every 30 seconds that you want to create charts from.
Creating a mongodb connection in TypeScript
This article requires that you have setup TypeScript and Node as described in my article TypeScript and Node.
> yarn add mongodb

(The driver ships with its own TypeScript types; a separate types package is no longer needed.)
```typescript
// src/index.ts
import * as mongodb from "mongodb";

const uri = "mongodb://localhost:27017";

async function main() {
  const client = new mongodb.MongoClient(uri);
  await client.connect();
  console.log("DB Connection established");

  const database = client.db("my-database");
  const collection = database.collection("my-collection");

  // find a single doc
  const singleResult = await collection.findOne();

  // find many docs
  const manyResults = collection.find();
  while (await manyResults.hasNext()) {
    const result = await manyResults.next();
    console.log(result);
  }

  await client.close();
}

main();
```
Storing files in mongoDb with GridFS
MongoDB can store files using its underlying GridFS, which breaks files up into chunks (255 kB each by default); GridFS is meant for files exceeding the 16 MB BSON document size limit. Chunks are stored as separate docs, for example in database `files` in collection `fs.chunks`. Each chunk doc contains a `files_id` entry that references a doc in the collection `fs.files`, which holds data such as `filename`, `uploadDate`, `metadata`, `length` etc. The chunks are streamed back via a MongoDB client.
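The chunking idea can be sketched in plain JavaScript (illustrative only; the driver’s GridFS bucket does this for you, and the real default chunk size is 255 kB, not the tiny size used here):

```javascript
// Split a buffer into numbered chunks, like GridFS does with file data.
function splitIntoChunks(buffer, chunkSize) {
  const chunks = [];
  for (let n = 0; n * chunkSize < buffer.length; n++) {
    chunks.push({ n, data: buffer.subarray(n * chunkSize, (n + 1) * chunkSize) });
  }
  return chunks;
}

const file = Buffer.alloc(1000); // pretend this is a 1000-byte file
const chunks = splitIntoChunks(file, 300);
console.log(chunks.length); // 4 (300 + 300 + 300 + 100 bytes)
console.log(chunks[3].data.length); // 100
```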
In this example we use the `mongofiles` database tool from the CLI, but this can also be written in Node.
```
> mongofiles put myfile.jpg --db=files    // uploading
> mongofiles list --db=files --quiet      // listing
> mongofiles get myfile.jpg --db=files    // downloading
> mongofiles delete myfile.jpg --db=files // removing
```
Replica Sets
Replica sets provide automatic failover and data redundancy: data is replicated by running multiple copies at once. If the primary goes down, one of the secondaries takes over and continues to serve requests. We create a replica set called `myReplSet` which runs three instances of MongoDB on different ports:
```
> mongod --replSet myReplSet --dbpath=/store/data/rs1 --port 27017 --oplogSize 200
> mongod --replSet myReplSet --dbpath=/store/data/rs2 --port 27018 --oplogSize 200
> mongod --replSet myReplSet --dbpath=/store/data/rs3 --port 27019 --oplogSize 200
```
Next, we have to connect the instances with each other using a config. First, connect to rs1 with `mongosh --port 27017` and create a config:
```
> config = {
    _id: "myReplSet",
    members: [
      {_id: 0, host: "localhost:27017"},
      {_id: 1, host: "localhost:27018"},
      {_id: 2, host: "localhost:27019"},
    ]
  }
> rs.initiate(config);
> rs.status();
```
Sharding
Sharding breaks your data up to distribute it on multiple servers. It uses shard keys to route requests to the right server. tbc.
Backups
Manual backup by shutting down the server and copying files
Not recommended, but that’s how it’s done: Before you create a backup, you should prevent any writes by locking the database with `db.fsyncLock();`. Then copy all files from your db file folder. After that, run `db.fsyncUnlock();` again.
mongodump
Running `mongodump` will create a dump folder with a subfolder for each database. You can further customize the dump; use the `--help` flag. You restore a dump using `mongorestore /path/to/dump`.
Schema validation
MongoDB supports draft 4 of JSON Schema. The idea is that you pass in that schema whenever you create a collection. But a (better?) alternative is to use mongoose.
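A minimal sketch of what that looks like in mongosh (hypothetical `people` collection and fields):

```javascript
db.createCollection("people", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name"],
      properties: {
        name: { bsonType: "string", description: "must be a string and is required" },
        age: { bsonType: "int", minimum: 0 }
      }
    }
  }
});
```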
Views
MongoDB Views are essentially read-only collections that contain data computed from other collections using aggregations.