MongoDB schema validation rules
220311
Summary
The purpose of this post is to demonstrate how we can apply schema validation rules to a collection.
Short Intro
MongoDB is a very popular cross-platform document-oriented database. It is a NoSQL database based on JSON-like documents. Document-based databases are either schema-less or they provide a certain level of flexibility for defining schemas through schema validation rules.
For those coming from the RDBMS world, where a table structure is characterized by columns with strictly defined properties (type, size, etc.), the ability to define schemas can prove quite useful. Generally, we can think of a MongoDB database as similar to an RDBMS schema containing tables, views, and other RDBMS objects. Likewise, a MongoDB collection is analogous to a table, and a MongoDB document can be considered a table row. A MongoDB database groups collections together, a collection holds documents, and a document consists of key-value pairs whose values can themselves be other documents.
To demonstrate the validation rules, we first need an example MongoDB database with a MongoDB collection.
Prerequisites and assumptions
Here, you can get a
It is presumed that you have a running MongoDB instance that you can access. If you don’t, you can easily get one by using Docker and the official MongoDB Docker image to run a MongoDB container. Read more at: https://www.mongodb.com/compatibility/docker
For convenience, we are also going to use MongoDB Compass, which is the official GUI for MongoDB.
Run a MongoDB Docker Container
You can create and run a Docker container named ‘mongodb’, by running the following command:
docker run --name mongodb -p 27017:27017 -d mongo
After you have created the container, you can stop and start it using the following commands, respectively:
docker stop mongodb
docker start mongodb
Also, you can always check the running containers, via:
docker ps
Obtain the MongoDB Compass GUI and define a database and a collection
You can download Compass at: https://www.mongodb.com/try/download/compass
After you install it, run it. Ensure that the mongodb container is up and running, and create a new connection using a connection string, which in our case can be:
mongodb://localhost:27017/?readPreference=primary&appname=MongoDB%20Compass&directConnection=true&ssl=false
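To make the options in that connection string easier to read, the short sketch below splits it into its parts with the standard URL class that ships with Node.js. It only illustrates what each part means; the MongoDB driver does its own parsing, and the `connection` object is a name made up for this example.

```javascript
// Break the Compass connection string from this post into its parts.
// Uses only the built-in URL class; this is an illustration, not the
// parsing logic of the MongoDB driver.
const uri =
  "mongodb://localhost:27017/?readPreference=primary" +
  "&appname=MongoDB%20Compass&directConnection=true&ssl=false";

const parsed = new URL(uri);

const connection = {
  host: parsed.hostname, // the machine that publishes the container port
  port: Number(parsed.port), // 27017, published by `-p 27017:27017`
  options: Object.fromEntries(parsed.searchParams), // query-string options
};

console.log(connection);
```

The options are all plain strings at this stage: `directConnection` comes back as the text "true", and `appname` is decoded to "MongoDB Compass".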
Then, after you have successfully connected to the MongoDB Docker instance, you can create a new database and a new collection. Name them ‘ticket-management’ and ‘users’ respectively.
The ‘users’ collection will store user documents and the documents should be validated by our validation rules.
Define the properties of a MongoDB document
As we have said before, a MongoDB document is an ordered set of key-value pairs. A key difference from an RDBMS is that documents in the same collection can hold any number of key-value pairs, as well as nested documents.
However, in our case, we want to force the ‘users’ collection to hold documents that all have the same properties (keys). This is analogous to the fields (columns) of a table in an RDBMS.
The mongo _id
Before we identify the fields of our ‘users’ collection, it’s worth mentioning that MongoDB automatically generates a special ‘_id’ property/field each time a new document is inserted into a collection, unless the document already supplies one.
By default, the _id value is an ObjectId, a BSON type that is 12 bytes long. In current MongoDB versions, the 12 bytes consist of:
- 4 bytes representing the seconds since the Unix epoch
- 5 bytes holding a random value generated once per process, unique to the machine and process (older versions split these into a 3-byte machine identifier and a 2-byte process id), and
- 3 bytes representing a counter, starting with a random value
Although an auto-generated _id is not a standard UUID, the _id values can be considered unique, they are roughly ordered by creation time, and they can be used as the ‘primary key’ of our collection.
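As a rough illustration of that layout, the sketch below decodes an ObjectId given as a 24-character hex string. It uses one of the ids that appears in an error message later in this post, and it treats the five bytes in the middle as a single opaque value. The function name `decodeObjectId` is made up for this example.

```javascript
// Decode the parts of an ObjectId given as a 24-character hex string.
// Assumed layout: 4-byte timestamp | 5 middle bytes | 3-byte counter.
function decodeObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // first 4 bytes
  return {
    createdAt: new Date(seconds * 1000), // seconds since the Unix epoch
    middle: hex.slice(8, 18), // next 5 bytes, kept as hex
    counter: parseInt(hex.slice(18), 16), // last 3 bytes
  };
}

const decoded = decodeObjectId("622af142daae2661d9e8c62d");
console.log(decoded.createdAt.toISOString());
console.log(decoded.middle, decoded.counter);
```

Inside mongosh the same timestamp should be available without manual decoding through the shell's own `ObjectId("...").getTimestamp()` helper.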
After that, this is the example list of the fields for the ‘users’ collection:
_id, username, password, email, registrationdate, confirmed, canceled, typeid, countryid
The goal is to ensure (well, as much as we can) that every document inserted into the ‘users’ collection consists of those fields.
MongoDB schema Validation Rules
To make all the documents comply with the above fields, we will use a MongoDB schema. You can think of a MongoDB schema as nothing but a set of rules for document properties (keys) and values. Those rules work on a per-collection basis, and they are checked (validated) each time a document is inserted or updated in that collection.
Such a set of rules is written as a JSON Schema document, passed through the $jsonSchema operator, in which the field types are given as BSON types.
We are not going to go through more details here, but you can read more about MongoDB schema and how it works, using the official documentation. For example, you can follow the links below:
https://docs.mongodb.com/compass/current/validation/
https://docs.mongodb.com/manual/core/schema-validation/#schema-validation
https://docs.mongodb.com/manual/core/schema-validation/#specify-validation-rules
https://docs.mongodb.com/manual/reference/command/collMod/#add-document-validation-to-an-existing-collection
https://docs.mongodb.com/manual/reference/bson-types/
After the short intro given above, now it’s time to define our MongoDB validation schema. The summary of what we actually want to define is given below:
- The fields: username, email, and password should be present in each document (they are mandatory).
- The fields: username, email and password should be of type string, and their strings’ length should be between the minimum and maximum limits.
- The field email should comply with a specific regex pattern.
- The field: registrationdate should be of type date.
- The fields: confirmed and canceled should be of type bool (Boolean: true or false).
- The fields typeid and countryid should be of type int (integer) and their values should be between a minimum and a maximum number.
We can define our rules in various ways, for example via the mongo shell CLI or the mongosh CLI, but since we have already created our ‘users’ collection in Compass, using the Compass GUI seems to be the most convenient way.
So, select the ‘users’ collection, click the Validation tab and enter your JSON schema (leave the Validation Action and Validation Level set to ERROR and STRICT, respectively). Below is our example of validation rules that we will use:
mongodb (compass) collection validation schema
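For readers who cannot see the screenshot, the rules can be reconstructed from the db.getCollectionInfos() output shown later in this post. The sketch below writes them as a JavaScript object; the `{ $jsonSchema: ... }` value is what goes into the Validation tab. The `applyUsersValidator` helper is a hypothetical wrapper around the `collMod` command, for anyone who prefers a shell over Compass.

```javascript
// Validation rules for the 'users' collection, reconstructed from the
// db.getCollectionInfos() output shown later in this post.
const usersValidator = {
  $jsonSchema: {
    bsonType: "object",
    additionalProperties: false,
    required: ["username", "email", "password"],
    properties: {
      _id: {},
      username: {
        bsonType: "string",
        minLength: 6,
        maxLength: 20,
        description: "It is required and it must be a string with length between 6 and 20",
      },
      password: {
        bsonType: "string",
        maxLength: 80,
        description: "It must be a string with max length 80",
      },
      email: {
        bsonType: "string",
        minLength: 6,
        maxLength: 40,
        pattern: "[a-z0-9._%+!$&*=^|~#%{}/-]+@([a-z0-9-]+.){1,}([a-z]{2,22})",
        description: "It is required and it must be a string with length between 6 and 40 (regular expression pattern)",
      },
      registrationdate: { bsonType: "date", description: "It must be a date" },
      confirmed: { bsonType: "bool", description: "It can only be true or false" },
      canceled: { bsonType: "bool", description: "It can only be true or false" },
      typeid: {
        bsonType: "int",
        minimum: 1,
        maximum: 4,
        // Copied as-is: the text says 5, but the enforced maximum is 4.
        description: "It must be an integer in [ 1, 5 ]",
      },
      countryid: {
        bsonType: "int",
        minimum: 1,
        maximum: 250,
        description: "It must be an integer in [ 1, 250 ]",
      },
    },
  },
};

// Hypothetical helper: applies the rules from a shell instead of Compass.
// `db` is the database handle mongosh provides after `use ticket-management`.
function applyUsersValidator(db) {
  return db.runCommand({
    collMod: "users",
    validator: usersValidator,
    validationLevel: "strict",
    validationAction: "error",
  });
}
```

Because `additionalProperties` is false, the `_id` entry has to be listed under `properties` even though it carries no rules of its own; otherwise every document would be rejected for containing the auto-generated `_id`.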
Check the Validation rules via mongosh and mongo shell CLIs
After we have saved our validation rules in Compass, we can use mongosh to get a look at them. Compass provides us with an embedded version of the mongosh CLI.
By default, mongosh is connected to the ‘test’ database. So, switch to the ticket-management database and navigate to the validation rules using the db.getCollectionInfos() function:
use ticket-management
'switched to db ticket-management'
db.getCollectionInfos({name: "users"})
. . .
db.getCollectionInfos({name: "users"})[0].options.validator
{
  '$jsonSchema': {
    bsonType: 'object',
    additionalProperties: false,
    required: [ 'username', 'email', 'password' ],
    properties: {
      _id: {},
      username: [Object],
      password: [Object],
      email: [Object],
      registrationdate: [Object],
      confirmed: [Object],
      canceled: [Object],
      typeid: [Object],
      countryid: [Object]
    }
  }
}
The embedded mongosh collapses the deeper levels of the result to [Object], so the rules inside “properties” are not shown in this output.
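One way around the collapsed [Object] entries is to serialize the value yourself instead of relying on the shell's default display. The sketch below shows the effect in plain Node.js: `util.inspect` is used here only as a stand-in for a display routine with a depth limit, and the `validator` object is a cut-down copy with the same nesting as the real one.

```javascript
const util = require("util");

// A cut-down validator with the same nesting depth as the real one.
const validator = {
  $jsonSchema: {
    properties: {
      username: { bsonType: "string", minLength: 6, maxLength: 20 },
    },
  },
};

// Default inspection stops after a few levels and prints [Object].
const collapsed = util.inspect(validator);

// JSON.stringify has no depth limit, so every rule is printed.
const expanded = JSON.stringify(validator, null, 2);

console.log(collapsed);
console.log(expanded);
```

In the shell, the same idea would read `JSON.stringify(db.getCollectionInfos({name: "users"})[0].options.validator, null, 2)`. That should be enough for this validator, which holds only strings, numbers and booleans.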
However, we can jump into the container shell:
docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED        STATUS       PORTS                      NAMES
11b9a599c13e   mongodb   "docker-entrypoint.s…"   3 months ago   Up 4 hours   0.0.0.0:27017->27017/tcp   mongodb
. . .
docker exec -it mongodb bash
root@11b9a599c13e:/#
And run the mongosh from within it:
root@11b9a599c13e:/#
root@11b9a599c13e:/# mongosh
Current Mongosh Log ID: 6229dc064130345cc3d542bf
Connecting to:          mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000
Using MongoDB:          5.0.5
Using Mongosh:          1.1.6
For mongosh info see: https://docs.mongodb.com/mongodb-shell/
To help improve our products, anonymous usage data is collected and sent to MongoDB periodically (https://www.mongodb.com/legal/privacy-policy).
You can opt-out by running the disableTelemetry() command.
------
   The server generated these startup warnings when booting:
   2022-03-10T05:41:44.202+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
   2022-03-10T05:41:45.856+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
------
Warning: Found ~/.mongorc.js, but not ~/.mongoshrc.js. ~/.mongorc.js will not be loaded.
  You may want to copy or rename ~/.mongorc.js to ~/.mongoshrc.js.
test>
Then, we can switch to the ‘ticket-management’ database and execute db.getCollectionInfos() to obtain the validation rules information for the ‘users’ collection:
test> use ticket-management
switched to db ticket-management
ticket-management>
ticket-management> db.getCollectionInfos({name: "users"})
[
  {
    name: 'users',
    type: 'collection',
    options: {
      validator: {
        '$jsonSchema': {
          bsonType: 'object',
          additionalProperties: false,
          required: [ 'username', 'email', 'password' ],
          properties: {
            _id: {},
            username: {
              bsonType: 'string',
              minLength: 6,
              maxLength: 20,
              description: 'It is required and it must be a string with length between 6 and 20'
            },
            password: {
              bsonType: 'string',
              maxLength: 80,
              description: 'It must be a string with max length 80'
            },
            email: {
              bsonType: 'string',
              minLength: 6,
              maxLength: 40,
              pattern: '[a-z0-9._%+!$&*=^|~#%{}/-]+@([a-z0-9-]+.){1,}([a-z]{2,22})',
              description: 'It is required and it must be a string with length between 6 and 40 (regular expression pattern)'
            },
            registrationdate: { bsonType: 'date', description: 'It must be a date' },
            confirmed: {
              bsonType: 'bool',
              description: 'It can only be true or false'
            },
            canceled: {
              bsonType: 'bool',
              description: 'It can only be true or false'
            },
            typeid: {
              bsonType: 'int',
              minimum: 1,
              maximum: 4,
              description: 'It must be an integer in [ 1, 5 ]'
            },
            countryid: {
              bsonType: 'int',
              minimum: 1,
              maximum: 250,
              description: 'It must be an integer in [ 1, 250 ]'
            }
          }
        }
      },
      validationLevel: 'strict',
      validationAction: 'error'
    },
    info: {
      readOnly: false,
      uuid: UUID("b033d158-5ee1-4187-9e29-70be69ed97bb")
    },
    idIndex: { v: 2, key: { _id: 1 }, name: '_id_' }
  }
]
ticket-management>
This time our validation rules are clearly presented.
Alternatively, we can run just the mongo CLI (not the mongosh):
root@11b9a599c13e:/# mongo
MongoDB shell version v5.0.5
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("bb023c59-6160-461d-b022-4d88658fb890") }
MongoDB server version: 5.0.5
================
Warning: the "mongo" shell has been superseded by "mongosh",
which delivers improved usability and compatibility.The "mongo" shell has been deprecated and will be removed in
an upcoming release.
For installation instructions, see
https://docs.mongodb.com/mongodb-shell/install/
================
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
        https://community.mongodb.com
---
The server generated these startup warnings when booting:
        2022-03-10T05:41:44.202+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
        2022-03-10T05:41:45.856+00:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).
        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.
        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
>
Note that the mongo shell is deprecated and has been superseded by mongosh.
Then, again, we can switch to the ‘ticket-management’ database and execute db.getCollectionInfos() to obtain the same information for the ‘users’ collection:
> use ticket-management
switched to db ticket-management
>
> db.getCollectionInfos({name: "users"})
[
  {
    "name" : "users",
    "type" : "collection",
    "options" : {
      "validator" : {
        "$jsonSchema" : {
          "bsonType" : "object",
          "additionalProperties" : false,
          "required" : [
            "username",
            "email",
            "password"
          ],
          "properties" : {
            "_id" : { },
            "username" : {
              "bsonType" : "string",
              "minLength" : 6,
              "maxLength" : 20,
              "description" : "It is required and it must be a string with length between 6 and 20"
            },
            "password" : {
              "bsonType" : "string",
              "maxLength" : 80,
              "description" : "It must be a string with max length 80"
            },
            "email" : {
              "bsonType" : "string",
              "minLength" : 6,
              "maxLength" : 40,
              "pattern" : "[a-z0-9._%+!$&*=^|~#%{}/-]+@([a-z0-9-]+.){1,}([a-z]{2,22})",
              "description" : "It is required and it must be a string with length between 6 and 40 (regular expression pattern)"
            },
            "registrationdate" : {
              "bsonType" : "date",
              "description" : "It must be a date"
            },
            "confirmed" : {
              "bsonType" : "bool",
              "description" : "It can only be true or false"
            },
            "canceled" : {
              "bsonType" : "bool",
              "description" : "It can only be true or false"
            },
            "typeid" : {
              "bsonType" : "int",
              "minimum" : 1,
              "maximum" : 4,
              "description" : "It must be an integer in [ 1, 5 ]"
            },
            "countryid" : {
              "bsonType" : "int",
              "minimum" : 1,
              "maximum" : 250,
              "description" : "It must be an integer in [ 1, 250 ]"
            }
          }
        }
      },
      "validationLevel" : "strict",
      "validationAction" : "error"
    },
    "info" : {
      "readOnly" : false,
      "uuid" : UUID("b033d158-5ee1-4187-9e29-70be69ed97bb")
    },
    "idIndex" : {
      "v" : 2,
      "key" : {
        "_id" : 1
      },
      "name" : "_id_"
    }
  }
]
>
As you can see above, the result is pretty much the same.
Test our Validation rules
After we have defined our validation rules, we can use any of the available tools (Compass GUI, mongosh CLI, mongo CLI) to test whether they work correctly. To do that, we can try to insert some documents that do not meet the validation rules and confirm that they are rejected. Below are some examples that you can also try on your own.
Using mongosh
Let’s try to insert an empty document:
ticket-management> db.users.insertOne({})
Uncaught:
MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId("622af142daae2661d9e8c62d"),
  details: {
    operatorName: '$jsonSchema',
    schemaRulesNotSatisfied: [
      {
        operatorName: 'required',
        specifiedAs: { required: [ 'username', 'email', 'password' ] },
        missingProperties: [ 'email', 'password', 'username' ]
      }
    ]
  }
}
ticket-management>
Now let’s try again with a document that has an invalid email:
ticket-management> db.users.insertOne(
... {
..... "username": "Panos1",
..... "password": "mypassword",
..... "email": "email.com",
..... "registrationdate": ISODate("2022-03-11T11:11:55.353Z"),
..... "typeid": 4,
..... "confirmed": false,
..... "canceled": false,
..... "countryid": 28
..... }
... )
Uncaught:
MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId("622af390daae2661d9e8c62f"),
  details: {
    operatorName: '$jsonSchema',
    schemaRulesNotSatisfied: [
      {
        operatorName: 'properties',
        propertiesNotSatisfied: [ { propertyName: 'email', details: [ [Object] ] } ]
      }
    ]
  }
}
ticket-management>
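The email rule can also be tried out locally before touching the database. The sketch below runs the pattern from our validation rules through a JavaScript RegExp; the server uses its own regular expression engine, so treat this as an approximation that should agree for a simple pattern like this one. The third sample address is made up to show a side effect of the pattern.

```javascript
// The email pattern copied from the validation rules.
const emailPattern = new RegExp(
  "[a-z0-9._%+!$&*=^|~#%{}/-]+@([a-z0-9-]+.){1,}([a-z]{2,22})"
);

// Two values used in this post, plus one made-up value with no dot at all.
const samples = ["email.com", "panos1@email.com", "panos1@emailxcom"];

const results = samples.map((value) => ({
  value,
  matches: emailPattern.test(value),
}));

for (const r of results) {
  console.log(r.value, r.matches);
}
```

The first value fails because it has no @ part, and the second passes. The third also passes, because the dot in the pattern is not escaped and therefore matches any character. Writing it as `\.` and anchoring the pattern with `^` and `$` would make the rule stricter, if that is what you want.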
You can continue trying to insert documents with invalid values, e.g. by setting the typeid field to 0 (it should be at least 1):
ticket-management> db.users.insertOne(
... {
..... "username": "Panos1",
..... "password": "mypassword",
..... "email": "panos1@email.com",
..... "registrationdate": ISODate(),
..... "typeid": 0,
..... "confirmed": false,
..... "canceled": false,
..... "countryid": 28
..... }
... )
Uncaught:
MongoServerError: Document failed validation
Additional information: {
  failingDocumentId: ObjectId("622af5c3daae2661d9e8c635"),
  details: {
    operatorName: '$jsonSchema',
    schemaRulesNotSatisfied: [
      {
        operatorName: 'properties',
        propertiesNotSatisfied: [ { propertyName: 'typeid', details: [ [Object] ] } ]
      }
    ]
  }
}
ticket-management>
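When many rules are involved, it can help to reduce such an error to the list of offending fields. The sketch below does that for an object shaped like the “Additional information” printed above. The function name `failingProperties` is made up, and the sample object is a hand-copied stand-in for the typeid failure rather than a real error object.

```javascript
// Collect the names of the fields that failed $jsonSchema validation.
// The argument is assumed to have the shape of the "Additional
// information" block printed in the error messages above.
function failingProperties(errInfo) {
  const rules = errInfo?.details?.schemaRulesNotSatisfied ?? [];
  const names = [];
  for (const rule of rules) {
    // Fields that are present but break a rule (type, range, pattern...).
    for (const p of rule.propertiesNotSatisfied ?? []) {
      names.push(p.propertyName);
    }
    // Fields listed in `required` that are missing altogether.
    for (const missing of rule.missingProperties ?? []) {
      names.push(missing);
    }
  }
  return names;
}

// Stand-in for the information attached to the typeid failure.
const sampleErrInfo = {
  failingDocumentId: "622af5c3daae2661d9e8c635",
  details: {
    operatorName: "$jsonSchema",
    schemaRulesNotSatisfied: [
      {
        operatorName: "properties",
        propertiesNotSatisfied: [{ propertyName: "typeid", details: [{}] }],
      },
    ],
  },
};

console.log(failingProperties(sampleErrInfo));
```

In a shell session this could be fed from a try/catch around the insert, assuming the caught error exposes this information as an `errInfo` property, as the Node.js driver's MongoServerError does for validation failures.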
Using Compass
Similarly, if you try to insert documents that do not comply with our validation rules through Compass, you will keep getting validation errors:
Caveats
Using MongoDB validation rules is quite useful and saves us from a lot of headaches. However, it is not a panacea. One drawback is that validation rules cannot express uniqueness: on their own, they cannot prevent the insertion (or update) of a document with a username value that already exists in another document. That is a job for a unique index rather than for the schema. Another example is that the rules do not force a document to contain all the fields; only the ones listed as required must be present. And so on. However, as MongoDB suggests, such challenges can also be solved in our business logic in middleware, but this is the subject of another post. So, stay tuned!
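Schema validation cannot express uniqueness, but a unique index on the collection can. The sketch below wraps the standard `createIndex` call in a hypothetical helper named `ensureUniqueUsername`; `db` is assumed to be the database handle that mongosh provides after `use ticket-management`.

```javascript
// Hypothetical helper: creates a unique index so that two documents
// in 'users' cannot share the same username.
// `db` is the database handle available inside mongosh.
function ensureUniqueUsername(db) {
  return db.getCollection("users").createIndex(
    { username: 1 }, // ascending index on the username field
    { unique: true } // reject inserts/updates that duplicate a value
  );
}
```

After calling `ensureUniqueUsername(db)` once, an insert that reuses an existing username should be rejected by the server with a duplicate key error, independently of the validation rules.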
That’s it!
Thank you for reading and happy coding!