Alt text
由於敝司 Shopline 使用的 Mongodb 的 Shard 以及 Configsvr 版本不一致,此次升級主要會將版本一致升級到3.2, 並全部採用 Replica Set.


MongoDB 為近年來很火紅的No SQL Databese, 在新版本中也快採用近年多數資料庫使用的 Replica Set (Primary and Sceondary) 架構, 取代舊有的 Shard Cluster (Master and Slave). 進而增加許多優勢,像是自動故障恢復、讀寫控制等等。

本篇文章以 mongodb-org 3.0 以及 3.2 為主, 這兩個版本最大的差異就是在 Replica Set.

升級上官方也提供非常詳盡的documents, 不得不稱讚mongodb的文件,真的很完整。

This post is mainly for describing the processes and experiences to upgrade Mongodb from 3.0 to 3.2 on Shopline which is our company.

There’s a big difference of Mongodb 3.0 and 3.2, is the Replica Set.

Replica Set is a is a new structure of databases, using Primary and Secondary instead of Master and Slave and this structure has a lot of advantages such like auto recovery from an unexpected shutdown and split read and write nodes.


Mongodb-org 3.0 -> 3.2
Amazon Linux 1

MongoDB Introduction

Main Structure

Alt text

Shard: The place where is to store data.

Replica Set: Normally it will have a primary, a secondary and a arbiter in a replica set, whatever shard server or config server.

Config Server: The bridge which is mongos router communicate with shard server.

Mongos: A mongo client agent and shell, usually install with application.

MongoDB 主要的架構大概就是這樣,會有Shard/Configsvr/mongos

一般來說Shard會存放資料,mongos是client端,負責去連接資料庫,而configsvr就是那座橋樑,讓mongos client知道該怎麼走到正確的shard去讀寫資料。

Shard / Configsvr Replica Set

Alt text

Primary: The controller of a Shard cluster, receive all requests to a database when writing and reading, it is the only one can be read and write node in a normal situation. Also has the responsibility to store data into a database.

Secondary: Normally secondary doesn’t allow to receive any request, the only job is to replicate primary’s data to store in another place. And can be promoted to a primary node when the original primary is down in unexpected or necessary. Aka Backup.

Arbiter: The main function is to allocate primary and secondary nodes, it will use “vote” to decide which secondary will be promoted to primary when needed. Arbiter won’t store any data inside.

Heartbeat: No need to explain, health check.

這是一般的 Shard Replica Set 架構,通常包含 Primary / Secondary / Arbiter.

Primary 主要接收所有讀與寫的請求以及儲存資料,Secondary 一般來說不會接受任何讀寫請求,只做一件事,就是複製 Primary 的資料來儲存,以便在必要或意外時可以被提升為 Primary 節點。

而 Arbiter 不會存儲任何資料,主要功用是調配 Primary and Secondary,當 Primary Down的時候,可以投票決定哪一個 Secondary 將被升級為 Primary。

This is the Configsvr replica set structure.

Alt text

In confinsvr replica set, the only difference is no Arbiter.

Install MongoDB-org 3.0 on Amazon Linux

yum doesn’t have mongodb-org inside the package pools, we have to add repo manually.

cat >> /etc/yum.repos.d/mongodb-org-3.0.repo <<EOF
name=MongoDB Repository

Then > yum install mongodb-org -y

If we wanna change to another version, just modify the value on repo file.

Setup a Replica Set of a complete MongoDB

Shard : 3 mongod servers, usually use 27018 port (version 3.2)
Configsvr : 3 mongod servers, usually use 27019 port
mongos : 1 mongo-shell, usually use 27017 mongodb default route port

Please use the same version, mongodb-org 3.0 in all servers.

Shard Servers config

In 3 shard servers, please use the same config file.


## mongod-32.conf

# where to write logging data.
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
  dbPath: /var/lib/mongo
    enabled: true

# how the process runs
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/  # location of pidfile

# network interfaces
  port: 27018
  bindIp:  # Listen to local interface only, comment to listen on all interfaces.

  replSetName: shard1
  clusterRole: shardsvr

Start your mongod

Use service mongod start or mongod -f /etc/mongod.conf to start mongod servers.

Feel free to use ps aux | grep mongo or service mongod status to check if your mongod really started.

If you get any issues when starting mongod, please go to /var/log/mongod.log to check error messages.

** And please notice the permissions of those directories which are define in mongod.conf.

請使用上述啟動指令來啟動你的 mongodb,並檢查他是否真的被啟動了。如果再啟動過程出現 Failed,請到 log 目錄檢查錯誤。常常因為目錄權限問題而失敗,或者是目錄不存在。請小心檢查,若不存在請建立目錄並給予權限。

Setup Shard Replica mode

After mongodb is started, please use mongo-shell to configure our shard to a replica set.

Let’s do it !!! The commands are as following below.

shard - primary : > mongo localhost:27018

> mongo> rs.initiate()  // to enable replica mode
> mongo> rs.status()  // check the replica mode if really enabled
> mongo> rs.add({host: "SECONDARY-IP:27018", priority: 0.5})  // add secondary to replica set
> mongo> rs.addArb("ARBITER-IP:27018")  // add arbiter to replica set

Congratulations !! Shard Replica Set has already set up successfully if you see primary, secondary, arbiter when you execute rs.status().

shard1:PRIMARY> rs.status()
	"set" : "shard1",
	"members" : [
			"_id" : 5,
			"name" : "ip-10-0-0-57:27018",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"_id" : 8,
			"name" : "ip-10-0-0-166:27018",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"_id" : 9,
			"name" : "ip-10-0-0-122:27018",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
	"ok" : 1

Set up a mongos client agent

Actually, mongos is equal to mongo-shell, one of a package of mongodb-org. So when we install mongodb-org, mongos was be installed as well.


#config servers
#where to log
#log overwritten or appended to
#fork and run in background
#override port

> mongos -f /etc/mongod.conf to start mongos and keep it right there.

we will use mongos to configure configsvr later.

Set up configsvr

The steps of installation configsvr are the same with shards.

But the config file is different, please set up as below.

# mongod.conf
#where to log
# fork and run in background
# location of pidfile

Please running 3 configsvr with the same config file and mongod -f /etc/mongod.conf.

And go inside with mongo-shell into mongos which we just set up. > mongo // default use 27019

The steps are as following:

  1. add shard with primary
  2. create db and collection
  3. insert data to database.collection
  4. create index for collection
  5. enable sharding
  6. shard collection

In mongos

mongo> sh.addShard("shard1/PRIMARY-SHARD-IP:27018")   //add one of shard servers, then others will be add automatically
mongo> use test   //to create a test database
mongo> db.createCollection("users")   //create a collection on test db
mongo> db.users.insert({"name": "david", "sex": "male"})
mongo> db.users.ensureIndex({_id: 1}, {background: true})
mongo> sh.enableSharding("test.users")   //"test.users" = "<database>.<collection>"
mongo> sh.shardCollection("test.users", {_id: "hashed"})   //use hashed to shard collection
mongo> sh.status()   //to check if sharding enabled

Then mongo> db.users.find() to find data if exist And please insert another data to test whether succeed.

Upgrade MongoDB from 3.0 to 3.2

If wanna upgrade configsvr to 3.2, we have some notes to be remember.

3.0 and 3.2 use different storage engine
3.0 -> MMAPv1
3.2 -> wiredTiger

Cluster mode and Replica set
3.0 -> cluster mode, sccc
3.2 -> replica set, csrs

So, the step is as following below.

a. disable balancer with sh.stopBalancer() and change one of three 3.0 configsvr version to 3.2 and set configsvrMode: sccc and storage engine: mmapv1

=> now 3 configsvr, one for 3.2 and two for 3.0

b. use mongo-shell into 3.2 that one to initiate rs mode. rs.initiate()

=> waiting for the data synced to 3.2 that one.

c. launch another 3 configsvr with 3.2 mongod and start with wiredTiger storagr engine.

d. go into the mmapv1 that one and rs.add("config1/IP") to add those 3 new servers.

=> now we have four 3.2 configsvr and two 3.0.

e. rs.stepDown() in mmapv1 that 3.2 one and primary node will change to another server.

f. go to new primary and rs.remove("NNAPv1-OLD-PRIMARY") and enable balancer back with rs.enableBalancer()

g. shut down old two 3.0 configsvr

h. using 3.2 mongos to connect to new replica set configsvr and check status.

The Errors I Encountered when Upgrading to 3.2

a. Can 3.0 mongos connect to 3.2 configsvr ?
Absolutely not, 3.0 doesn’t support replica mode

mongos 3.0 是不允許連接到 3.2 或以上版本的 configsvr的。

b. "setShardVersion failed" when mongos configDB endpoint has been changed

mongos> db.users.find()
Error: error: {
    "$err" : "setShardVersion failed shard: shard1:shard1/ip-10-0-0-122:27018,ip-10-0-0-166:27018 { configdb: { stored: \",,\", given: \"ip-10-0-0-45:27019,ip-10-0-0-45:27020,ip-10-0-0-45:27021\" }, ok: 0.0, errmsg: \"mongos specified a different config database string : stored :,, vs given : ip-10-0-0-45:27019,ip-10-0-0...\", $gleStats: { lastOpTime: Timestamp 0|0, electionId: ObjectId('5bffb069a4b240083f2691cb') } }",
    "code" : 10429,
    "shard" : "shard1"

cause the shard servers will cache mongos configDB string.
If you wanna solve this issue, just restart shard servers then everything works well.

因為 shard server 會 cache mongos 的 configDB string,所以如果遇到這個問題,重啟 Shard servers 就可以解決了。(記得先 StepDown Primary 再重啟)

c. Can 3.0 configsvr connect to 3.2 or 3.4up shard servers ?
Absolutely the answer is NO.

3.0 的 configsvr 是不能連接到 3.0 以上版本的 Shard Server 的,因為不支援 Replica Mode。

d. How to separate data to two shards equally ?
when shardCollection(), please use “hashed” on index, then will be equally separated.

使用 hashed 雜湊就能使資料平均分配在不同的 Shard Server。

e. Can 3.0 mongod upgrade to 3.4 and skip 3.2 directly ?
Sorry, the answer is NO, you must to upgrade to 3.2 first then go to 3.4.

mongodb 不允許跳版本升級,3.0 只允許升級到 3.2 再到 3.4。

f. How to STOP a shard server safely ?
Please use rs.stepDown(120) and db.shutdownServer() in mongo-shell.
Then exit mongo-shell and stop or kill mongod process.

使用正確的指令停止一台 mongodb server。

g. How to change priority on Shard server ?
var cfg = rs.conf() cfg.members[ID].priority = 0.5 rs.reconfig(cfg) // The ID can use rs.status() to lookup
使用上列指令去更改 mongodb 的優先度。

h. How to failover the primary temporary ?
rs.stepDown(120) on Primary node // The 120 is the Time by second Then it will auto failover to another node.

暫時的停止一台 Shard Server 的運作,也可以運來測試主副節點切換是否正常運作。


Feel free to mail me or leave some commemts if you have any questions.
I will reply all questions if I can help.

Thanks for reading.

2018-12-01 04:05 , David in Taipei