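- The client sends the removeShard command to mongos.
- mongos sends the _configsvrRemoveShard command to the config server (cs).
- cs checks whether any shard other than the one being removed is in the draining state; if so, the request fails. The query filter is:
  {"_id": {$ne: "shard1"}, "draining": true}
  This means only one removeShard operation can be in progress at any time.
- cs checks whether the shard being removed is the last shard; if so, the request fails.
- cs checks whether the shard being removed is already draining. If it is not, cs sets draining to true and reloads the shardRegistry (the cached routing table of shards). The start of removeShard is recorded in the log, and the request returns the started state.
- cs queries the chunks collection for how many chunks still remain on the shard.
- cs queries the databases collection for how many databases still have this shard as their primary.
- If any chunks or databases remain, the request returns the ongoing state. Chunks are moved by the balancer; primary databases have to be moved manually (usually with the movePrimary command).
- If both are empty, cs deletes the shard's entry from config.shards and reloads the shardRegistry again.
- cs removes the connections and monitoring for the shard, writes a log entry marking removeShard as finished, and returns the completed state.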
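The steps above can be sketched as a small in-memory model. This is not MongoDB's implementation: the function name `configsvrRemoveShard`, the shape of the `config` object, and the error messages are assumptions made for illustration. Returning `started` on the first call follows the sample response shown later in these notes.

```javascript
// In-memory model of the config server's removeShard decision flow.
// `config` stands in for config.shards / config.chunks / config.databases;
// all names and shapes here are illustrative assumptions.
function configsvrRemoveShard(config, shardName) {
  // Fail if any *other* shard is already draining.
  const otherDraining = config.shards.some(
    (s) => s._id !== shardName && s.draining === true
  );
  if (otherDraining) {
    return { ok: 0, errmsg: "another shard is already draining" };
  }

  // Fail if the target is the last shard in the cluster.
  if (config.shards.filter((s) => s._id !== shardName).length === 0) {
    return { ok: 0, errmsg: "cannot remove the last shard" };
  }

  const shard = config.shards.find((s) => s._id === shardName);
  if (!shard) {
    return { ok: 0, errmsg: "shard not found" };
  }

  // First call: flag the shard as draining and report "started".
  if (!shard.draining) {
    shard.draining = true;
    return { ok: 1, state: "started", shard: shardName };
  }

  // Later calls: count what is still left on the shard.
  const chunks = config.chunks.filter((c) => c.shard === shardName).length;
  const dbs = config.databases.filter((d) => d.primary === shardName).length;
  if (chunks > 0 || dbs > 0) {
    return { ok: 1, state: "ongoing", remaining: { chunks, dbs } };
  }

  // Nothing left: drop the shard entry and report completion.
  config.shards = config.shards.filter((s) => s._id !== shardName);
  return { ok: 1, state: "completed", shard: shardName };
}
```

Calling the function repeatedly on the same `config` object walks through `started`, `ongoing`, and `completed` as chunks and databases are taken off the shard.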
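When is the move chunk operation actually started? Since 3.4, the balancer thread on cs polls periodically, and when it finds a shard in the draining state it starts moving chunks off that shard. In other words, chunk migration runs asynchronously with respect to the removeShard command.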
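To make the asynchronous relationship concrete, here is a heavily simplified model of one balancer round. It is not MongoDB's actual balancer policy: `pickDrainingMigrations`, the data shapes, and the choice of "fewest chunks" as the recipient rule are assumptions for illustration only.

```javascript
// Simplified model of one balancer round that drains shards.
// For each draining shard, propose moving one chunk to the non-draining
// shard that currently holds the fewest chunks. Names are hypothetical.
function pickDrainingMigrations(shards, chunks) {
  const count = (name) => chunks.filter((c) => c.shard === name).length;
  const recipients = shards.filter((s) => !s.draining);
  const migrations = [];
  for (const donor of shards.filter((s) => s.draining)) {
    const chunk = chunks.find((c) => c.shard === donor._id);
    if (!chunk || recipients.length === 0) continue; // nothing to move, or nowhere to go
    const to = recipients.reduce((best, s) =>
      count(s._id) < count(best._id) ? s : best
    );
    migrations.push({ chunk: chunk._id, from: donor._id, to: to._id });
  }
  return migrations;
}
```

The removeShard command itself never calls anything like this; it only sets the draining flag that a later balancer round observes.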
In the removeShard response, a state of ongoing means chunks are still being moved; the remaining field shows how many chunks (and databases) have not yet been moved off the shard:
{
"msg" : "draining ongoing",
"state" : "ongoing",
"remaining" : {
"chunks" : NumberLong(2),
"dbs" : NumberLong(2),
"jumboChunks" : NumberLong(0) // Available starting in 4.2.2 (and 4.0.14)
},
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [
"fizz",
"buzz"
],
"ok" : 1,
"operationTime" : Timestamp(1575399086, 1655),
"$clusterTime" : {
"clusterTime" : Timestamp(1575399086, 1655),
"signature" : {
"hash" : BinData(0,"XBrTmjMMe82fUtVLRm13GBVtRE8="),
"keyId" : NumberLong("6766255701040824328")
}
}
}
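If the response has no remaining field and only dbsToMove, it carries no chunk count; this is the shape of the started response returned when draining begins. The databases listed in dbsToMove are not moved by the balancer: the user has to clear them manually, usually by moving them to another shard with the movePrimary command: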
{
"msg" : "draining started successfully",
"state" : "started",
"shard" : "bristol01",
"note" : "you need to drop or movePrimary these databases",
"dbsToMove" : [
"fizz",
"buzz"
],
"ok" : 1,
"operationTime" : Timestamp(1575398919, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1575398919, 2),
"signature" : {
"hash" : BinData(0,"Oi68poWCFCA7b9kyhIcg+TzaGiA="),
"keyId" : NumberLong("6766255701040824328")
}
}
}
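As a small sketch of that manual step, the helper below turns a removeShard response into one movePrimary command document per listed database. The helper name `buildMovePrimaryCommands` and the destination shard `bristol02` are hypothetical; only the `{ movePrimary: <db>, to: <shard> }` command shape comes from MongoDB.

```javascript
// Build one movePrimary command document per database in dbsToMove.
// `targetShard` is a destination shard name chosen by the operator (hypothetical here).
function buildMovePrimaryCommands(removeShardResponse, targetShard) {
  const dbs = removeShardResponse.dbsToMove || [];
  return dbs.map((name) => ({ movePrimary: name, to: targetShard }));
}

const cmds = buildMovePrimaryCommands(
  { state: "started", dbsToMove: ["fizz", "buzz"] },
  "bristol02"
);
// In the mongo shell, each document would then be sent through mongos,
// for example with db.adminCommand(cmd).
```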
Once everything has been moved, the state is returned as completed:
{
"msg" : "removeshard completed successfully",
"state" : "completed",
"shard" : "bristol01",
"ok" : 1,
"operationTime" : Timestamp(1575400370, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1575400370, 2),
"signature" : {
"hash" : BinData(0,"JjSRciHECXDBXo0e5nJv9mdRG8M="),
"keyId" : NumberLong("6766255701040824328")
}
}
}
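Because each call only reports the current state, a client typically re-issues removeShard until it sees completed. The loop below is a minimal sketch of that pattern: `waitForShardRemoval` and `maxAttempts` are illustrative names, and `adminCommand` is injected so the loop can be exercised against a stub (in the mongo shell one could pass `(cmd) => db.adminCommand(cmd)`).

```javascript
// Re-issue removeShard until the reported state is "completed".
// `adminCommand` is any function that sends a command and returns the reply;
// `maxAttempts` is an illustrative safeguard, not a MongoDB parameter.
function waitForShardRemoval(adminCommand, shardName, maxAttempts) {
  const seen = [];
  for (let i = 0; i < maxAttempts; i++) {
    const res = adminCommand({ removeShard: shardName });
    if (res.ok !== 1) {
      throw new Error("removeShard failed: " + JSON.stringify(res));
    }
    seen.push(res.state);
    if (res.state === "completed") return seen;
    // A real script would sleep here, and run movePrimary for res.dbsToMove.
  }
  throw new Error("shard still draining after " + maxAttempts + " attempts");
}
```

The function returns the sequence of states it observed, which makes it easy to see how long the shard stayed in ongoing.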
References:
https://docs.mongodb.com/manual/reference/command/removeShard/