
When should we pre-split chunks manually in MongoDB sharding?

Answer»

Chunk split operations are carried out automatically by the system when an insert operation causes a chunk to exceed the maximum chunk size. The balancer then migrates the newly split chunks to other shards. In some cases, however, we may want to pre-split the chunks manually:

  • If we have deployed a cluster using existing data, we may have a large amount of data in very few chunks. In such cases, pre-splitting helps achieve an even distribution.
  • If the cluster uses a hashed shard key, or we know the distribution of our data very well, we can pre-split the chunks so that the data is balanced evenly across the shards.
  • If we perform an initial bulk load, all data would first go to a single shard, and those documents would then migrate to other shards, doubling the number of writes. Alternatively, we can pre-split the collection across the expected key values, avoiding having the documents written twice.
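
For the hashed-shard-key case above, MongoDB can also create the initial chunks for us at sharding time via the numInitialChunks option. A minimal sketch, assuming a mongos connection and illustrative database/collection names:

```javascript
// Run in mongosh against a mongos router.
// Enable sharding on a hypothetical "test" database.
sh.enableSharding("test")

// Shard the collection on a hashed key and ask MongoDB to pre-create
// 4 empty chunks spread evenly across the hashed key space, so an
// initial bulk load is distributed across shards from the start.
sh.shardCollection(
  "test.employees",          // namespace: <db>.<collection> (illustrative)
  { employeeid: "hashed" },  // hashed shard key
  false,                     // unique: must be false for hashed keys
  { numInitialChunks: 4 }    // pre-split into 4 chunks
)
```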

To split chunks manually, we can use the split command via the helpers sh.splitFind() and sh.splitAt().

Example:

To split the chunk of the employees collection on the employee id field at a value of 713626, the command below can be used (assuming the namespace is test.employees and the field is employeeid):

sh.splitAt( "test.employees", { "employeeid": 713626 } )

We should be careful while pre-splitting chunks, as it can sometimes leave a collection with unevenly sized chunks.
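
To pre-split at several boundaries in one pass, a simple loop over sh.splitAt() can be used. The boundary values below are illustrative; in practice they should be chosen from the actual distribution of the shard key, precisely to avoid the unevenly sized chunks mentioned above:

```javascript
// Run in mongosh against a mongos router.
// Split test.employees (illustrative namespace) at evenly spaced
// employeeid boundaries; boundaries matching the real data
// distribution help keep the resulting chunks similar in size.
var boundaries = [200000, 400000, 600000, 800000];
boundaries.forEach(function (value) {
  sh.splitAt("test.employees", { employeeid: value });
});
```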
