Chunk Migration Across Shards

Chunk migration moves the chunks of a sharded collection from one shard to another and is part of the balancer process.

Diagram of a collection distributed across three shards. For this collection, the difference in the number of chunks between the shards reaches the *migration thresholds* (in this case, 2) and triggers migration.

Chunk Migration

MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards. Migrations may be either:

  • Manual. Only use manual migration in limited cases, such as to distribute data during bulk inserts. See Migrating Chunks Manually for more details.
  • Automatic. The balancer process automatically migrates chunks when there is an uneven distribution of a sharded collection’s chunks across the shards. See Migration Thresholds for more details.

All chunk migrations use the following procedure:

  1. The balancer process sends the moveChunk command to the source shard.

  2. The source starts the move with an internal moveChunk command. During the migration process, operations to the chunk route to the source shard. The source shard is responsible for incoming write operations for the chunk.

  3. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.

  4. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure that it has the changes to the migrated documents that occurred during the migration.

  5. When fully synchronized, the destination shard connects to the config database and updates the cluster metadata with the new location for the chunk.

  6. After the destination shard completes the update of the metadata, and once there are no open cursors on the chunk, the source shard deletes its copy of the documents.

    Changed in version 2.4: If the balancer needs to perform additional chunk migrations from the source shard, the balancer can start the next chunk migration without waiting for the current migration process to finish this deletion step. See Chunk Migration Queuing.

The migration process ensures consistency and maximizes the availability of chunks during balancing.

Chunk Migration Queuing

Changed in version 2.4.

To migrate multiple chunks from a shard, the balancer migrates the chunks one at a time. However, the balancer does not wait for the current migration’s delete phase to complete before starting the next chunk migration. See Chunk Migration for the chunk migration process and the delete phase.

This queuing behavior allows shards to unload chunks more quickly in cases of heavily imbalanced cluster, such as when performing initial data loads without pre-splitting and when adding new shards.

This behavior also affect the moveChunk command, and migration scripts that use the moveChunk command may proceed more quickly.

In some cases, the delete phases may persist longer. If multiple delete phases are queued but not yet complete, a crash of the replica set’s primary can orphan data from multiple migrations.

Chunk Migration Write Concern

Changed in version 2.4: While copying and deleting data during migrations, the balancer waits for replication to secondaries for every document. This slows the potential speed of a chunk migration but ensures that a large number of chunk migrations cannot affect the availability of a sharded cluster.

See also Secondary Throttle in the v2.2 Manual.