此次 MongoDB Sharded Cluster 架構:
- 跑兩個 MongoDB Server,分別在 port 10001, 10002
- 跑一個 MongoDB Config Server - port 20001
- 跑一個 MongoDB Routing Server - port 30001
$ rm -rf /tmp/mongo && mkdir -p /tmp/mongo/db/1 /tmp/mongo/db/2 /tmp/mongo/db/c /tmp/mongo/log
$ mongod --port 10001 --dbpath /tmp/mongo/db/1 --logpath /tmp/mongo/log/1 &
$ mongod --port 10002 --dbpath /tmp/mongo/db/2 --logpath /tmp/mongo/log/2 &
$ mongod --port 20001 --configsvr --dbpath /tmp/mongo/db/c --logpath /tmp/mongo/log/c &
$ mongos --port 30001 --configdb 127.0.0.1:20001 &
設定 partition 用法:
$ mongo --port 30001
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("###############")
}
shards:
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
mongos> sh.addShard('127.0.0.1:10001')
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard('127.0.0.1:10002')
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("#####################")
}
shards:
{ "_id" : "shard0000", "host" : "127.0.0.1:10001" }
{ "_id" : "shard0001", "host" : "127.0.0.1:10002" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
mongos> sh.enableSharding('ydb')
{ "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("######################")
}
shards:
{ "_id" : "shard0000", "host" : "127.0.0.1:10001" }
{ "_id" : "shard0001", "host" : "127.0.0.1:10002" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "ydb", "partitioned" : true, "primary" : "shard0000" }
mongos> use ydb
switched to db ydb
mongos> db.ycollection.ensureIndex( { _id : "hashed" } )
{
"raw" : {
"127.0.0.1:10001" : {
"createdCollectionAutomatically" : true,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
},
"ok" : 1
}
mongos> sh.shardCollection("ydb.ycollection", { "_id": "hashed" } )
{ "collectionsharded" : "ydb.ycollection", "ok" : 1 }
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("#####################")
}
shards:
{ "_id" : "shard0000", "host" : "127.0.0.1:10001" }
{ "_id" : "shard0001", "host" : "127.0.0.1:10002" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "ydb", "partitioned" : true, "primary" : "shard0000" }
ydb.ycollection
shard key: { "_id" : "hashed" }
chunks:
shard0000 2
shard0001 2
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-4611686018427387902") } on : shard0000 Timestamp(2, 2)
{ "_id" : NumberLong("-4611686018427387902") } -->> { "_id" : NumberLong(0) } on : shard0000 Timestamp(2, 3)
{ "_id" : NumberLong(0) } -->> { "_id" : NumberLong("4611686018427387902") } on : shard0001 Timestamp(2, 4)
{ "_id" : NumberLong("4611686018427387902") } -->> { "_id" : { "$maxKey" : 1 } } on : shard0001 Timestamp(2, 5)
簡易驗證資料分散性:
mongos>
db.ycollection.insert({user:"a", item: "1", rate: 0.3})
db.ycollection.insert({user:"a", item: "3", rate: 0.5})
db.ycollection.insert({user:"a", item: "4", rate: 0.9})
db.ycollection.insert({user:"b", item: "2", rate: 0.1})
db.ycollection.insert({user:"b", item: "3", rate: 0.6})
db.ycollection.insert({user:"b", item: "5", rate: 0.2})
db.ycollection.insert({user:"c", item: "3", rate: 0.2})
db.ycollection.insert({user:"c", item: "4", rate: 0.7})
mongos> db.ycollection.find()
{ "_id" : ObjectId("#################"), "user" : "a", "item" : "3", "rate" : 0.5 }
{ "_id" : ObjectId("#################"), "user" : "a", "item" : "1", "rate" : 0.3 }
{ "_id" : ObjectId("#################"), "user" : "a", "item" : "4", "rate" : 0.9 }
{ "_id" : ObjectId("#################"), "user" : "b", "item" : "2", "rate" : 0.1 }
{ "_id" : ObjectId("#################"), "user" : "b", "item" : "5", "rate" : 0.2 }
{ "_id" : ObjectId("#################"), "user" : "b", "item" : "3", "rate" : 0.6 }
{ "_id" : ObjectId("#################"), "user" : "c", "item" : "3", "rate" : 0.2 }
{ "_id" : ObjectId("#################"), "user" : "c", "item" : "4", "rate" : 0.7 }
接著,想要驗證資料的確是儲存在不同台 mongodb server ,作法就是把資料透過 mongodump 出來(bson),再用 bsondump 印出來驗證:
$ cd /tmp
$ mongodump --port 10001 -d ydb -c ycollection
$ bsondump dump/ydb/ycollection.bson
{ "_id" : ObjectId( "#############" ), "user" : "a", "item" : "1", "rate" : 0.3 }
{ "_id" : ObjectId( "#############" ), "user" : "b", "item" : "2", "rate" : 0.1 }
{ "_id" : ObjectId( "#############" ), "user" : "b", "item" : "3", "rate" : 0.6 }
{ "_id" : ObjectId( "#############" ), "user" : "c", "item" : "4", "rate" : 0.7 }
4 objects found
$ mongodump --port 10002 -d ydb -c ycollection
$ bsondump dump/ydb/ycollection.bson
{ "_id" : ObjectId( "#############" ), "user" : "a", "item" : "3", "rate" : 0.5 }
{ "_id" : ObjectId( "#############" ), "user" : "a", "item" : "4", "rate" : 0.9 }
{ "_id" : ObjectId( "#############" ), "user" : "b", "item" : "5", "rate" : 0.2 }
{ "_id" : ObjectId( "#############" ), "user" : "c", "item" : "3", "rate" : 0.2 }
4 objects found
因此可完成驗證,資料的確分在不同區!有興趣還可以用 MongoDB Aggregate (Map-Reduce) 實作共現矩陣(Co-Occurrence Matrix)及簡易的推薦系統 來驗證 partition 演算法的正確性。
沒有留言:
張貼留言