Authentication in MongoDB

MongoDB is a NoSQL database management system that supports CRUD operations, data aggregation, geospatial queries and storage of blob data using GridFS. Data is stored within documents as BSON, the binary representation of JSON. Documents are stored within collections, and collections are stored within a database. Collections and databases are created implicitly with the first insertion of a document, or simply by switching to them with use my-new-database. MongoDB uses the WiredTiger storage engine (including support for Encryption at Rest).

Also, check out the official mongodb doc/manual.

Running the latest MongoDB as a Docker container with auth enabled

MongoDB is at version 6.0.3 at the time of writing. To enable authentication we pass --auth and two environment variables that set the initial root username and password:

> docker pull mongo
> docker run --name="mongodb-with-auth" --env=MONGO_INITDB_ROOT_USERNAME=admin --env=MONGO_INITDB_ROOT_PASSWORD=secure -p 27017:27017 mongo --auth

Authentication and Authorization

Authentication verifies a user's identity, for example via username and password. Authorization is about which roles the user has and thus what the user is allowed to do.

By default, MongoDB's auth is disabled, but with the docker run command above we have at least created an initial root user with a password in the admin database.

Showing all database users

Once you are authenticated, you can display all users:

As proof that we have a root user, we run this query:

> mongosh
> use admin;
> db.auth('admin', 'secure');
> db.system.users.find();
// This will return the admin user showing
// roles: [ { role: 'root', db: 'admin' } ]

Don’t be surprised that you can still connect to MongoDB using mongosh even without specifying a user and password. But eventually you will want to run commands (or actions) which require elevated privileges; without authentication you get this error:

MongoServerError: command find requires authentication

Authenticate yourself

There are two ways to authenticate yourself. Either pass credentials when connecting to MongoDB:

> mongosh --authenticationDatabase "admin" -u "admin" -p

or after you connected without a user and password:

> use admin;
> db.auth("admin", passwordPrompt());

Note that you have to switch to the right database before attempting to authenticate yourself. In this case we only created a user on the admin database, which is why authenticating as the admin user against a database such as test would not work.

Built-in roles in mongodb

MongoDB grants access to data and commands through role-based authorization and provides built-in roles that cover the different levels of access commonly needed in a database system. Which specific actions/commands you can run with a specific role can be read on the official doc page (or inspected from the shell, as shown after the list below). You can additionally create user-defined roles.

  • Database User roles (read, readWrite) are provided on every database
  • Database administration roles (dbAdmin, dbOwner, userAdmin) are provided on every database
  • All other roles are provided only on admin database. Those roles are
    • clusterAdmin
    • clusterManager
    • clusterMonitor
    • hostManager
    • backup
    • restore
    • readAnyDatabase
    • readWriteAnyDatabase
    • userAdminAnyDatabase
    • dbAdminAnyDatabase
  • And then there is the role root which provides full privileges on all resources
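From mongosh you can list all roles (including the built-in ones), inspect the privileges behind a role, and grant additional roles to an existing user. A small sketch, run against the admin database:

> use admin;

// list all roles, including the built-in ones
> db.getRoles({ rolesInfo: 1, showBuiltinRoles: true });

// show which privileges (actions per resource) a role grants
> db.getRole("readWriteAnyDatabase", { showPrivileges: true });

// grant an additional role to an existing user
> db.grantRolesToUser("admin", [ { role: "clusterMonitor", db: "admin" } ]);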

Add user with specific role to specific database

Once you have authenticated yourself as described above, you can add a user with a specific role to any database.

We will use the Salted Challenge Response Authentication Mechanism (SCRAM), which is a password-based mutual authentication protocol designed to make eavesdropping attacks (e.g. man-in-the-middle) more difficult.

> use my-database;

> db.createUser({
  user: "my-user",
  pwd: passwordPrompt(), // or enter password directly as string
  roles: [
    { role: "readWrite", db: "my-database" },
    { role: "read", db: "another-database" }
  ]
});
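To check that the user was created with the intended roles, you can run:

db.getUsers();          // all users defined on the current database
db.getUser("my-user");  // a single user, including its roles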

To enable authorization on a locally installed MongoDB, edit /etc/mongod.conf, enable authorization: enabled in the security section and finally restart MongoDB.

Your connection string is

mongodb://my-user:my-password@localhost:27017/my-database

Using mongodb shell

The MongoDB shell (mongosh) and the database tools for Windows can be downloaded from the official MongoDB website. mongosh is actually a JavaScript shell, so you can assign variables and execute JS. It can be started with:

> mongosh

mongosh is also available in Compass, when you click on >_MONGOSH in the lower left area of the window.

show dbs;          // Show all databases
use mydb;          // Switch to database mydb
show collections;  // Show collections of the db you are currently in
db.getName();      // Show name of the database you are currently in
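Because mongosh evaluates plain JavaScript, you can, for example, loop over all collections and print their document counts (a small sketch):

const names = db.getCollectionNames();
names.forEach(name => print(`${name}: ${db.getCollection(name).estimatedDocumentCount()} docs`));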

Naming convention

You cannot use two databases with names that differ only in case, like salesData and SalesData. On Windows you cannot use /\. "$*<>:|? in database names, and not /\. "$ on Unix. Names must be fewer than 64 characters.

Start collections with a letter or underscore. Don’t use $ or system. prefix in the name.

The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.

More thresholds and limitations.

MongoDB Data types

The full list of supported data types.

Importing JSON data into mongodb

After installing the database tools as described above, you can run

> mongoimport --db=mydatabase --jsonArray myFile.json
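For example, assuming myFile.json contains a top-level JSON array such as [ { "title": "Example" } ], you can specify the target collection explicitly with --collection (otherwise mongoimport derives the collection name from the file name):

> mongoimport --db=mydatabase --collection=mycollection --jsonArray myFile.json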

Inserting a doc manually

Creates a doc in mycollection and creates mycollection automatically if it did not exist before.

Insert One

db.mycollection.insertOne({title: "Example"});
> { acknowledged: true, insertedId: ObjectId("63b6f9c389117715d028f6d2") }

ObjectId is a unique value which also contains an encoded timestamp, which can be useful for sorting.
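In mongosh you can read that timestamp back with getTimestamp():

ObjectId("63b6f9c389117715d028f6d2").getTimestamp();  // returns the creation time as an ISODate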

Insert many

db.mycollection.insertMany([{title: "Example 2"}, {title: "Example 3"}]);
> { acknowledged: true, insertedIds:  { '0': ObjectId("63b7048f89117715d028f6d7"),  '1': ObjectId("63b7048f89117715d028f6d8") } }

Methods such as updateOne(), updateMany(), findOneAndUpdate() and bulkWrite() can also add new documents to a collection when they are used with the upsert: true option.

Querying docs

Using find() actually returns a cursor. That means that you can chain other functions such as limit(), skip() etc. after find().

To pretty-print the results on the console you can add pretty() to all your searches, like db.mycollection.find().pretty();.
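For example, chained on a single cursor:

db.mycollection.find().sort({ title: 1 }).skip(1).limit(2).pretty();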

Find all docs

db.mycollection.find();
> { _id: ObjectId("63b6f9c389117715d028f6d2"), title: 'Example' }

Filter by field value

db.mycollection.find({title: "Example"});

// find docs that match title AND age
db.mycollection.find({title: "Example", age: "20"});

Filter by nested field value

If you have a nested object like

{
  area: "somewhere",
  product : {
    title: "My Product",
    price: 18
  }
}

then you can filter by the nested field:

db.mycollection.find({ "product.title" : "My Product" });

// The above is different from the following:
db.mycollection.find({ "product" : {"title" : "My Product"} });

The first query returns the doc, but the second query does not return anything. The reason is that the second query looks for an exact match, i.e. a document whose product field contains exactly one field title with “My Product” and nothing else; since the doc also has a price of 18, nothing is returned.
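An exact match on the embedded document only works if the query lists all of its fields, in the same order:

// matches, because the embedded document is specified completely and in the same field order
db.mycollection.find({ "product" : { "title" : "My Product", "price" : 18 } });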

Filter using regex

db.mycollection.find({title: { $regex: /amp/ } });

Filter which fields are returned

The 2nd argument lets you specify which fields should be returned in the result. 1 includes the field and 0 excludes the field.

db.mycollection.find({title: "Example"}, {title: 1}); // filter by title and only return title
db.mycollection.find({title: "Example"}, {title: 0}); // filter by title and return everything but the title
db.mycollection.find({}, {title: 1});  // do not filter and only return title
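Note that _id is always included in the result unless you exclude it explicitly:

db.mycollection.find({}, {title: 1, _id: 0});  // only return title, without _id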

Counting docs

db.mycollection.count()
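count() is deprecated in current versions; countDocuments() (which takes a filter) and estimatedDocumentCount() are the recommended replacements:

db.mycollection.countDocuments({ age: { $gt: 19 } });  // exact count of matching docs
db.mycollection.estimatedDocumentCount();              // fast estimate based on collection metadata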

Limit number of result docs

db.mycollection.find().limit(2);

Sorting result docs

Sort by title in ascending order (1) or descending order (-1):

db.mycollection.find().sort({"title": 1});

Skipping result docs

Skips the first result.

db.mycollection.find().skip(1);

Comparison operators

db.mycollection.find({age : { $gt: 19}});  // greater than
db.mycollection.find({age : { $lt: 19}});  // less than
db.mycollection.find({age : { $lte: 19}});  // less than or equal

or-filter

db.mycollection.find({ $or : [{"age" : { $lt:10 }}, {"age": { $gt:40 }}] });

Find docs having all specified values in a doc’s array field (and-search)

Finds docs that have all the specified values in the specified field.

// Find all docs that have "A" and "B" in their "tags" field
db.mycollection.find({ "tags" : { $all: ["A", "B"] } });
{ _id: ObjectId("63b70cfb89117715d028f6d9"),
  title: 'Example Tags',
  tags: [ 'A', 'B', 'C' ] }

Find docs having at least one of the specified values in a doc’s array field (or-search)

// Find all docs that have either "A" or "C" in their "tags" field
db.mycollection.find({ "tags" : { $in: ["A", "C"] } });
> { _id: ObjectId("63b70cfb89117715d028f6d9"),
  title: 'Example Tags',
  tags: [ 'A', 'B', 'C' ] }
> { _id: ObjectId("63b70d1489117715d028f6da"),
  title: 'Example Tags 2',
  tags: [ 'X', 'Y', 'C' ] }

Update a single document

Updates existing field in the doc

db.mycollection.updateOne({title: "Meeting"}, { $set: { title: "Event"}});
{ acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0 }

Add new field

If the field that is supposed to be updated does not exist, it will be created:

db.mycollection.updateOne({title: "Event"}, { $set: { myNewField: "brand new"}});
{ acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0 }
db.mycollection.find({title: "Event"})
{ _id: ObjectId("63b702b689117715d028f6d6"),
  title: 'Event',
  age: '53',
  myNewField: 'brand new' }

Remove existing field

db.mycollection.updateOne({title: "Event"}, { $unset: { myNewField: 1}});
{ acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0 }
db.mycollection.find({title: "Event"})
{ _id: ObjectId("63b702b689117715d028f6d6"),
  title: 'Event',
  age: '53' }

Increment a number field

Here we increment ($inc) age by 5. Negative values would decrement.

db.mycollection.find({title: 'Example'})
{ _id: ObjectId("63b6f9c389117715d028f6d2"),
  title: 'Example',
  age: 20 }

db.mycollection.updateOne({title: "Example"}, { $inc: { age: 5 }});
{ acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0 }

db.mycollection.find({title: 'Example'})
{ _id: ObjectId("63b6f9c389117715d028f6d2"),
  title: 'Example',
  age: 25 }

Updating arrays

Use $push to add an array item:

db.mycollection.insertOne({"title" : "Example", "tags": ["A", "B", "C"]});
{ acknowledged: true,
  insertedId: ObjectId("63b7246289117715d028f6df") }

db.mycollection.updateOne({"title" : "Example"}, { $push : {"tags" : "D"} });
{ acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 1,
  upsertedCount: 0 }

db.mycollection.find()
{ _id: ObjectId("63b7246289117715d028f6df"),
  title: 'Example',
  tags: [ 'A', 'B', 'C', 'D' ] }

and $pull to remove an array item:

db.mycollection.updateOne({"title" : "Example"}, { $pull : {"tags" : "D"} });

Delete a doc

db.mycollection.deleteOne({"_id" : ObjectId("63b7254989117715d028f6e0")});  // delete a single doc by _id
db.mycollection.deleteMany({});  // an empty filter deletes all docs in the collection

Remove an entire collection

db.mycollection.drop()

Indexes

Indexes can dramatically improve query performance, because instead of scanning through each document, the results are looked up in a previously created index.

Showing all indexes

db.mycollection.getIndexes();

Creating and deleting indexes

// create an index in ascending order for "myfield"
db.mycollection.createIndex({"myfield": 1})

db.mycollection.dropIndex("myfield_1")

Getting query performance data

db.mycollection.find({title: "Example"}).explain();
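Passing "executionStats" additionally shows whether an index was used (an IXSCAN stage) or the whole collection was scanned (COLLSCAN), and how many documents were examined:

db.mycollection.find({title: "Example"}).explain("executionStats");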

Collection types

Capped Collections have a guaranteed insertion order, are limited by a maximum size in bytes or doc count and provide automatic first-in-first-out deletion. This is good for error logging, e.g. a size of 10,000 bytes (size is always required) or max 10,000 documents:

db.createCollection("error_log", { capped: true, size: 10000, max: 10000 });

Time Series collections allow you to store data that changes over time, together with a key that does not change. That can be useful if you want to track stock data, for example, or performance data sampled every 30 seconds that you want to create charts from.
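A small sketch of creating one, assuming we store a price per ticker symbol over time (collection and field names are just examples):

db.createCollection("stockPrices", {
  timeseries: {
    timeField: "timestamp",   // required: holds the date of each measurement
    metaField: "symbol",      // optional: the key that does not change, e.g. the ticker
    granularity: "minutes"
  }
});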

Creating a mongodb connection in TypeScript

This article requires that you have set up TypeScript and Node as described in my article TypeScript and Node.

yarn add mongodb @types/mongodb
//src/index.ts
import * as mongodb from "mongodb";

const uri = "mongodb://localhost:27017";
const dbName = "my-mongo-db";

async function main() {
    const client = new mongodb.MongoClient(uri);
    await client.connect();
    console.log("DB Connection established");
    const database = client.db(dbName);
    const collection = database.collection('my-collection');

    // find single doc
    const singleResult = await collection.findOne();
    console.log(singleResult);

    // find many docs
    const manyResults = collection.find();
    while (await manyResults.hasNext()) {
        const result = await manyResults.next();
        console.log(result);
    }

    await client.close();
}

main().catch(console.error);

Storing files in MongoDB with GridFS

MongoDB can store files that exceed the 16 MB document size limit using its underlying GridFS, which breaks files up into chunks (255 kB by default). The chunks are stored as separate docs, for example in the database files in the collection fs.chunks. Each chunk doc contains an entry files_id, which references a document in the collection fs.files holding data such as filename, uploadDate, metadata, length etc. The chunks are streamed back via a MongoDB client.

In this example we use the mongofiles database tool from the CLI, but this can also be done from Node (see the sketch after the commands below).

> mongofiles put myfile.jpg --db=files // uploading
> mongofiles list --db=files --quiet  // listing
> mongofiles get myfile.jpg --db=files  // downloading
> mongofiles delete myfile.jpg --db=files  // removing
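A minimal sketch of the upload part in Node using the driver's GridFSBucket (the file name myfile.jpg and the database name files are just the examples from above):

// example: src/upload-file.ts
import * as fs from "fs";
import * as mongodb from "mongodb";

async function upload() {
    const client = new mongodb.MongoClient("mongodb://localhost:27017");
    await client.connect();

    // GridFSBucket writes the chunks to fs.chunks and the metadata doc to fs.files
    const bucket = new mongodb.GridFSBucket(client.db("files"));

    await new Promise<void>((resolve, reject) => {
        fs.createReadStream("./myfile.jpg")
            .pipe(bucket.openUploadStream("myfile.jpg"))
            .on("finish", () => resolve())
            .on("error", reject);
    });

    await client.close();
}

upload().catch(console.error);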

Replica Sets

Replica Sets provide automatic failover and data redundancy: data is replicated by running multiple copies at once. If the primary goes down, one of the secondaries is elected primary and continues to serve requests. We create a replica set called myReplSet which runs three instances of MongoDB on different ports:

> mongod --replSet myReplSet --dbpath=/store/data/rs1 --port 27017 --oplogSize 200

> mongod --replSet myReplSet --dbpath=/store/data/rs2 --port 27018 --oplogSize 200

> mongod --replSet myReplSet --dbpath=/store/data/rs3 --port 27019 --oplogSize 200

Next, we have to connect the instances with each other using a replica set configuration. First, connect to rs1 with mongosh --port 27017 and create a config object:

> config = {
  _id: "myReplSet",
  members: [
    {_id: 0, host: "localhost:27017"},
    {_id: 1, host: "localhost:27018"},
    {_id: 2, host: "localhost:27019"},
  ]
}

> rs.initiate(config);
> rs.status();

Sharding

Sharding breaks your data up to distribute it across multiple servers. It uses shard keys to route requests to the right server. tbc.

Backups

Manual backup by shutting down the server and copying files

Not recommended, but that’s how it’s done: before you create a backup, prevent any writes by locking the database with db.fsyncLock(). Then copy all files from your db data folder. After that, run db.fsyncUnlock() again.

mongodump

Running mongodump will create a dump folder with a subfolder per database containing a BSON file for each collection. You can further customize the dump; use the --help flag. You restore a dump using mongorestore /path/to/dump.
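For example:

> mongodump --out=/path/to/dump                     // dump all databases
> mongodump --db=my-database --out=/path/to/dump    // dump a single database
> mongorestore /path/to/dump                        // restore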

Schema validation

MongoDB supports draft 4 of JSON Schema. The idea is that you pass in that schema whenever you create a collection. But a (better?) alternative is to use mongoose.
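A small sketch of such a validation schema, passed in on collection creation (collection and field names are just examples):

db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "age"],
      properties: {
        name: { bsonType: "string", description: "must be a string" },
        age:  { bsonType: "number", minimum: 0, description: "must be a non-negative number" }
      }
    }
  }
});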

Views

MongoDB Views are essentially read-only collections that contain data computed from other collections using aggregations.
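For example, a view that exposes only the titles of mycollection:

db.createView("titlesOnly", "mycollection", [
  { $project: { title: 1, _id: 0 } }
]);

db.titlesOnly.find();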

About Author

Mathias Bothe

I am Mathias, born 40 years ago in Heidelberg, Germany. Today I am living in Munich and Stockholm. I am a passionate IT freelancer with more than 16 years experience in programming, especially in developing web based applications for companies that range from small startups to the big players out there. I am founder of bosy.com, creator of the security service platform BosyProtect© and initiator of several other software projects.