DarkMatter in Cyberspace
  • Home
  • Categories
  • Tags
  • Archives

MongoDB Schema Analysis


Use variety/variety-cli to analysis mongodb schema.

Install with npm

npm install variety-cli -g If the command variety is conflict with Variety the Wallpaper Changer, add alias variety='/home/leo/apps/node-v5.4.1-linux-x64/bin/variety' into ~/.bash_aliases.

The production database is connected with mongo 192.168.10.89:27017/production -u user -p password. To analyze it, run:

variety production/Fair --host 192.168.10.89 --port 27017 --username user --password password

Note you can't use -u instead of --username above.

Run with mongo

Download variety.js from it's repo and run:

mongo 192.168.10.89:27017/production -u user -p password --eval "var collection = 'Fair'" variety.js
+------------------------------------------------------------------------------------------------------------------------------+
| key                                              | types                              | occurrences | percents               |
| ------------------------------------------------ | ---------------------------------- | ----------- | ---------------------- |
| _id                                              | String                             |        9316 | 100.000000000000000000 |
| recurrence.XX._id                                | String                             |        9276 |  99.570631172176902624 |
| recurrence.XX.timeStart                          | String (9180),Date (116)           |        9275 |  99.559896951481320571 |

Verify it with mongo shell:

mongo 192.168.10.89:27017/production -u dba -p password
> db.Fair.count()
9316
> db.Fair.find({'recurrence._id': {$exists: true}}).count()
9276
> db.Fair.find({$and: [{'recurrence': {$exists: true}}, {'recurrence._id': {$exists: false}}]}).count()
40      // 9316 - 9276
> db.Fair.find({'recurrence.timeStart': {$exists: true}}).count()
9275    // 
> db.Fair.find({$and: [{'recurrence._id': {$exists: true}}, {'recurrence.timeStart': {$exists: false}}]}).count()
1       // 9276 - 9275
> db.Fair.find({'recurrence.timeStart': {$type: 2 }}).count()
9180
> db.Fair.find({'recurrence.timeStart': {$type: 9}}).count()
116
> db.Fair.find({$and: [{'recurrence.timeStart': {$exists: true}}, {'recurrence.timeStart': {$not: {$type: 2}}}, {'recurrence.timeStart': {$not: {$type: 9}}}]}).count()
0      // there's no other data type, all you need to do is converting String to Date

For type number, 2 means String, 9 means Date. See their definitions in $type in MongoDB doc.

Why 9180 + 116 > 9275? Because all the numbers here is the counts of fairs. While in a single fair there are maybe many recurrences. So if a fair has 2 recurrences, one's timeStart type is String, and the other's type is Date, it will be counted twice, both in 9180 and 116.

Unify Schema

We need convert the type of "updatedAt", "recurrence.\(.timeStart" and "recurrence.\).timeEnd" from String to Date with mongo script below:

db.Fair.find({'recurrence.timeStart': {$type: 2}}).forEach(function(fair) {   
  fair.recurrence.forEach(function(rec) { 
    if (typeof(rec.timeStart) === 'string') {
      rec.timeStart = new Date(rec.timeStart);
    }
  });
  db.Fair.save(fair);
});
db.Fair.find({'recurrence.timeEnd': {$type: 2}}).forEach(function(fair) {   
  fair.recurrence.forEach(function(rec) {
    if (typeof(rec.timeEnd) === 'string') {
      rec.timeEnd = new Date(rec.timeEnd);
    }
  });
  db.Fair.save(fair);
});
db.Fair.find({'updatedAt': {$type: 2}}).forEach(function(fair) {
  fair.updatedAt = new Date(fair.updatedAt);
  db.Fair.save(fair);
});


Published

Mar 29, 2016

Last Updated

Mar 29, 2016

Category

Tech

Tags

  • mongodb 24
  • schema 2

Contact

  • Powered by Pelican. Theme: Elegant by Talha Mansoor