1.

Are multi-document transactions possible in MongoDB?

Answer»

Shard key selection is an important aspect of a sharded cluster because it affects the cluster's performance and overall efficiency. MongoDB creates chunks and distributes them among the shards based on the shard key. Ideally, the shard key should allow MongoDB to distribute documents evenly across all the shards in the cluster.

There are three main factors that affect the selection of the shard key:

  • Cardinality

Cardinality refers to the number of distinct values for a given shard key. Ideally, a shard key should have high cardinality, because the cardinality determines the maximum number of chunks that can exist in the cluster.

For example, suppose an application is used only by members of a particular city and we shard on the state field. We will have at most one chunk, because both the upper and lower bounds of the chunk would be that one state. A single chunk can live on only one shard, so the data cannot be spread across the cluster. Hence we need to ensure that the shard key field has high cardinality.

If no single field has high cardinality, we can increase the cardinality of the shard key by creating a compound shard key. In the scenario above, we could use a shard key that combines state and name.
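The effect of a compound key on cardinality can be sketched in plain Python, without a running cluster. The sample documents, the field names state and name, and the helper cardinality() are assumptions made for illustration; they are not part of any MongoDB API.

```python
# Hypothetical sample documents: every user lives in the same state.
users = [
    {"state": "Texas", "name": "alice"},
    {"state": "Texas", "name": "bob"},
    {"state": "Texas", "name": "carol"},
    {"state": "Texas", "name": "dave"},
]


def cardinality(docs, fields):
    """Count the distinct shard key values formed by the given fields."""
    return len({tuple(doc[f] for f in fields) for doc in docs})


# A key on state alone has one distinct value, so at most one chunk.
state_only = cardinality(users, ["state"])

# A compound key on (state, name) has one distinct value per user here.
state_and_name = cardinality(users, ["state", "name"])

print(state_only, state_and_name)
```

The count for the single-field key is an upper bound on the number of chunks, which is why the compound key gives the cluster more room to split and distribute data.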

  • Frequency

Apart from having a large number of distinct values for the shard key, it is important that each value occurs with roughly the same frequency. If certain values occur more often than others, the load will not be distributed equally across the cluster, which limits its ability to scale reads and writes. For example, suppose the majority of people using an application have the last name 'jones'. The throughput of the application would then be constrained by the shard holding that value. Chunks containing such values grow larger and larger and may eventually become jumbo chunks. Jumbo chunks reduce the ability to scale horizontally because they cannot be split. To address this, we should choose a good compound shard key. In the scenario above, we can add _id as a second field so that each value of the compound shard key occurs with low frequency.
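As a rough illustration of the frequency problem, the following sketch measures what share of documents falls on the single most common shard key value. The data set (80 of 100 users named 'jones'), the field name last_name, and the helper max_share() are hypothetical.

```python
from collections import Counter

# Hypothetical data: 80 of 100 users share the last name 'jones'.
docs = [
    {"_id": i, "last_name": "jones" if i % 10 < 8 else f"name{i}"}
    for i in range(100)
]


def max_share(docs, fields):
    """Fraction of documents that share the most frequent key value."""
    counts = Counter(tuple(doc[f] for f in fields) for doc in docs)
    return max(counts.values()) / len(docs)


# Sharding on last_name alone: one key value holds most of the data.
skewed = max_share(docs, ["last_name"])

# Compound key (last_name, _id): every key value is unique.
balanced = max_share(docs, ["last_name", "_id"])

print(skewed, balanced)
```

A large share for one value means a chunk that keeps growing and cannot be split; with the compound key, the 'jones' documents can be split across many chunks.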

  • Rate of change of Shard key values

We should avoid shard keys on fields whose values always increase or always decrease. An example is ObjectId in MongoDB, whose value increases with each new document. In that case, all writes go to the same chunk, the one at the upper end of the key range. For monotonically decreasing values, all writes go to the chunk at the lower end of the key range. We can still include an ObjectId in the shard key as long as it is not the first field.
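The write hotspot can be simulated with a simplified model of range-based chunks, where each chunk is identified by its position between sorted split points. The split points, the state values, and the route() helper below are assumptions for illustration only; a real cluster manages chunk boundaries itself.

```python
from bisect import bisect_right
from collections import Counter


def route(boundaries, key):
    """Return the index of the range chunk that would hold `key`."""
    return bisect_right(boundaries, key)


# Hypothetical split points: 4 chunks when sharding on an increasing _id.
id_boundaries = [250, 500, 750]

# Hypothetical split points: 4 chunks for a compound key (state, _id).
compound_boundaries = [("FL", 0), ("NY", 0), ("TX", 0)]

# New documents arrive with ever-increasing _id values.
states = ["CA", "FL", "NY", "TX"]
new_docs = [{"_id": 1000 + i, "state": states[i % 4]} for i in range(100)]

# Monotonic key first: every insert lands in the last chunk.
by_id = Counter(route(id_boundaries, d["_id"]) for d in new_docs)

# Monotonic field second: inserts spread across the chunks.
by_compound = Counter(
    route(compound_boundaries, (d["state"], d["_id"])) for d in new_docs
)

print(dict(by_id), dict(by_compound))
```

In this model the leading field decides which chunk receives the write, so placing the increasing value second keeps it from concentrating inserts on one chunk.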


