PL I – ASSIGNMENT 10

ASSIGNMENT NO.10

Title:

            Aggregation and indexing with suitable example using MongoDB.

Requirements:

  1. Computer System with Windows 64 bit/Linux/Open Source Operating System.
  2. MongoDB Server
  3. MongoDB Client

Theory:

Indexes: Indexes support the efficient resolution of queries. Without indexes, MongoDB must scan every document of a collection to select those documents that match the query statement. This scan is highly inefficient and require the mongod to process a large volume of data. Indexes are special data structures, that store a small portion of the data set in an easy to traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field as specified in index.

The ensureIndex() Method:  To create an index you need to use ensureIndex() method of mongodb.

Syntax: Basic syntax of ensureIndex() method is as follows()

>db.COLLECTION_NAME.ensureIndex({KEY:1})

Here key is the name of filed on which you want to create index and 1 is for ascending order. To create index in descending order you need to use -1.

Example:

>db.mycol.ensureIndex({“title”:1})

In ensureIndex() method you can pass multiple fields, to create index on multiple fields.

>db.mycol.ensureIndex({“title”:1,”description”:-1})

explain() method : This method provides information on the query plan. The query plan is the plan the server uses to find the matches for a query. This information may be useful when optimizing a query. The explain() method returns a document that describes the process used to return the query results. The explain() method has the following form:

db.collection.find().explain()

Aggregations: Aggregations operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. In sql count(*) and with group by is an equivalent of mongodb aggregation.

The aggregate() Method:  For the aggregation in mongodb, aggregate() method is used.

Syntax: Basic syntax of aggregate() method is as follows

>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)

Example:

If a collection has following data:

{
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by_user: 'XYZ',
likes : 10
},
{
title: 'NoSQL Overview',
description: 'No sql database is very fast',
by_user: 'XYZ',
likes : 100
},
{
title: 'Indexing Overview',
description: 'Indexes helps in search queries',
by_user: 'PQR',
likes : 10
}

Now from the above collection if it is required to display a list that how many tutorials are written by each user then aggregate() method is used as shown below:

> db.mycol.aggregate([{$group : {_id : “$by_user”, num_tutorial : {$sum : 1}}}])

Sql equivalent query for the above use case will be select by_user, count(*) from mycol group by by_user

There is a list available aggregation expressions.

  1. $sum Sums up the defined value from all documents in the collection.

db.mycol.aggregate([{$group : {_id : “$by_user”, num_tutorial : {$sum : “$likes”}}}])

  1. $avg Calculates the average of all given values from all documents in the collection.            mycol.aggregate([{$group : {_id : “$by_user”, num_tutorial : {$avg : “$likes”}}}])
  2. $min Gets the minimum of the corresponding values from all documents in the collection.            mycol.aggregate([{$group : {_id : “$by_user”, num_tutorial : {$min : “$likes”}}}])
  3. $max Gets the maximum of the corresponding values from all documents in the collection.            mycol.aggregate([{$group : {_id : “$by_user”, num_tutorial : {$max : “$likes”}}}])
  4. $push Inserts the value to an array in the resulting document.         mycol.aggregate([{$group : {_id : “$by_user”, url : {$push: “$url”}}}])
  5. $addToSet Inserts the value to an array in the resulting document but does not create duplicates.            mycol.aggregate([{$group : {_id : “$by_user”, url : {$addToSet : “$url”}}}])
  6. $first Gets the first document from the source documents according to the grouping. Typically this makes only sense together with some previously applied “$sort”-stage.       mycol.aggregate([{$group : {_id : “$by_user”, first_url : {$first : “$url”}}}])
  7. $last Gets the last document from the source documents according to the grouping. Typically this makes only sense together with some previously applied “$sort”-stage.       mycol.aggregate([{$group : {_id : “$by_user”, last_url : {$last : “$url”}}}])

Conclusion : This assignment explains details about use of indexing and aggregation in MongoDB.

************************** QUERIES ***************************

AIM : Aggregation and indexing with suitable example using MongoDB.

> show collections
bank
expt5
sales
system.indexes

> db.expt5.ensureIndex({"RNO":1},{"UNIQUE":true});
{
      "createdCollectionAutomatically" : false,
      "numIndexesBefore" : 1,
      "numIndexesAfter" : 2,
      "ok" : 1
}
 
> db.expt5.find().explain();
{
      "cursor" : "BasicCursor",
      "isMultiKey" : false,
      "n" : 3,
      "nscannedObjects" : 3,
      "nscanned" : 3,
      "nscannedObjectsAllPlans" : 3,
      "nscannedAllPlans" : 3,
      "scanAndOrder" : false,
      "indexOnly" : false,
      "nYields" : 0,
      "nChunkSkips" : 0,
      "millis" : 0,
      "server" : "localhost.localdomain:27017",
      "filterSet" : false
}
 
> db.expt5.insert({"RNO" : 1, "NAME" : { "FIRST" : "SANDESH", "LNAME" : "GANGWAL" },
"ADDRESS" : { "CITY" : "PUNE", "DISTRICT" : "PUNE", "PIN" : 345678 }, "CONTACTNO" :[1234567,456788076] });
 
WriteResult({
      "nInserted" : 0,
      "writeError" : {
            "code" : 11000,
            "errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: te_a.expt5.$RNO_1  dup key: { : 1.0 }"
      }
})
 
> db.bank.insert({"CUSTID":"B1","ACCTYPE":"SB","BRANCH":"PUNE","BALANCE":2300});
WriteResult({ "nInserted" : 1 })
 
> db.bank.insert({"CUSTID":"A1","ACCTYPE":"SB","BRANCH":"PUNE","BALANCE":8900});
WriteResult({ "nInserted" : 1 })
 
> db.bank.insert({"CUSTID":"A1","ACCTYPE":"CU","BRANCH":"PUNE","BALANCE":0900});
 WriteResult({ "nInserted" : 1 })
 
> db.bank.insert({"CUSTID":"B1","ACCTYPE":"CU","BRANCH":"PUNE","BALANCE":2900});
WriteResult({ "nInserted" : 1 })

> db.bank.insert({"CUSTID":"B1","ACCTYPE":"CU","BRANCH":"WAGHOLI","BALANCE":2900});
WriteResult({ "nInserted" : 1 })

> db.bank.insert({"CUSTID":"A1","ACCTYPE":"CU","BRANCH":"WAGHOLI","BALANCE":5600});
 WriteResult({ "nInserted" : 1 })
 

> db.bank.find();
{ "_id" : ObjectId("541e535f313dc145de686dfd"), "CUSTID" : "B1", "ACCTYPE" : "SB", "BRANCH" : "PUNE", "BALANCE" : 2300 }
{ "_id" : ObjectId("541e536d313dc145de686dfe"), "CUSTID" : "A1", "ACCTYPE" : "SB", "BRANCH" : "PUNE", "BALANCE" : 8900 }
{ "_id" : ObjectId("541e5379313dc145de686dff"), "CUSTID" : "A1", "ACCTYPE" : "CU", "BRANCH" : "PUNE", "BALANCE" : 900 }
{ "_id" : ObjectId("541e5386313dc145de686e00"), "CUSTID" : "B1", "ACCTYPE" : "CU", "BRANCH" : "PUNE", "BALANCE" : 2900 }
{ "_id" : ObjectId("541e5398313dc145de686e01"), "CUSTID" : "B1", "ACCTYPE" : "CU", "BRANCH" : "WAGHOLI", "BALANCE" : 2900 }
{ "_id" : ObjectId("541e53a3313dc145de686e02"), "CUSTID" : "A1", "ACCTYPE" : "CU", "BRANCH" : "WAGHOLI", "BALANCE" : 5600 }

> db.bank.ensureIndex({"CUSTID":1,"ACCTYPE":1,"BRANCH":1},{"unique":1});
{
      "createdCollectionAutomatically" : false,
      "numIndexesBefore" : 1,
      "numIndexesAfter" : 2,
      "ok" : 1
}
 
> db.bank.insert({"CUSTID":"B1","ACCTYPE":"CU","BRANCH":"WAGHOLI","BALANCE":2900});
WriteResult({
      "nInserted" : 0,
      "writeError" : {
            "code" : 11000,
            "errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: te_a.bank.$CUSTID_1_ACCTYPE_1_BRANCH_1  dup key: { : \"B1\", : \"CU\", : \"WAGHOLI\" }"
      }
})
 
> db.bank.aggregate([{$group : {_id : "$CUSTID", count : {$sum : 1}}}]);
{ "_id" : "A1", "count" : 3 }
{ "_id" : "B1", "count" : 3 }
 
> db.bank.aggregate([{$group : {_id : "$CUSTID", BALANCE : {$sum : "$BALANCE"}}}]);
{ "_id" : "A1", "BALANCE" : 15400 }
{ "_id" : "B1", "BALANCE" : 8100 }
 
> db.bank.aggregate([{$group : {_id : "$BRANCH", BALANCE : {$avg : "$BALANCE"}}}]);
 
{ "_id" : "WAGHOLI", "BALANCE" : 4250 }
{ "_id" : "PUNE", "BALANCE" : 3750 }
 
> db.bank.aggregate([{$group : {_id : 1, BALANCE : {$avg : "$BALANCE"}}}]);
{ "_id" : 1, "BALANCE" : 3916.6666666666665 }
 
> db.bank.aggregate([{$group : {_id : 1, BALANCE : {$sum : "$BALANCE"}}}]);
{ "_id" : 1, "BALANCE" : 23500 }

> db.bank.aggregate([{$group : {_id : 1, count : {$sum : 1}}}]);
{ "_id" : 1, "count" : 6 }
 
> db.bank.aggregate([{$group : {_id : 1, count : {$min : "$BALANCE"}}}]);
{ "_id" : 1, "count" : 900 }

> db.bank.aggregate([{$group : {_id : "$ACCTYPE", BALANCE : {$max : "$BALANCE"}}}]);
 { "_id" : "CU", "BALANCE" : 5600 }
{ "_id" : "SB", "BALANCE" : 8900 }

 

Leave a Reply

Your email address will not be published. Required fields are marked *