Actions

AGGREGATE examples: Difference between revisions

From NoSQLZoo

mNo edit summary
Line 3: Line 3:
<div class='extra_space' style='width:1em; height:6em;'></div>
<div class='extra_space' style='width:1em; height:6em;'></div>


<div class=q data-lang="mongo">
<div class="q" data-lang="mongo">
<code>$match</code> performs queries in a similar way to <code>find()</code>
<code>$match</code> performs queries in a similar way to <code>find()</code>
<p class=strong>Show all the details for France</p>
<p class=strong>Show all the details for France</p>
<pre class=def>
<pre class="def"><nowiki>
db.world.aggregate([
db.world.aggregate([
     {$match:{name:"France"}}
     {$match: {name: "France"}}
])
]);</nowiki></pre>
</pre>
<div class="ans"><nowiki>db.world.aggregate([{$match:{name:"France"}}]);</nowiki></div>
<div class=ans>
db.world.aggregate([{$match:{name:"France"}}])
</div>
</div>
</div>


<div class=q data-lang="mongo"><code>limit</code> sets the amount of documents to be handed to the next stage in the pipeline.
<div class="q" data-lang="mongo"><code>limit</code> sets the amount of documents to be handed to the next stage in the pipeline.
<p class=strong>Return the first document</p>
<p class=strong>Return the first document</p>
<pre class=def>
<pre class="def"><nowiki>
db.world.aggregate([
db.world.aggregate([
     {$limit:1}
     {$limit: 1}
])
]);</nowiki></pre>
</pre>
<div class="ans"><nowiki>db.world.aggregate([{"$limit":1}]);</nowiki></div>
<div class=ans>
db.world.aggregate([{"$limit":1}])
</div>
</div>
</div>


<div class=q data-lang="mongo">
<div class="q" data-lang="mongo">
<code>$project</code> selects what fields to display.<br/>
<code>$project</code> selects what fields to display.<br/>
It can also has the ability to create new fields and to compare fields against each other without using <code>$where</code>
It can also has the ability to create new fields and to compare fields against each other without using <code>$where</code>
Line 37: Line 31:
There is no need to check if values are <code>null</code>, MongoDB will ignore these documents.
There is no need to check if values are <code>null</code>, MongoDB will ignore these documents.
</div>
</div>
<pre class=def>
<pre class="def"><nowiki>
db.world.aggregate([
db.world.aggregate([
     {$match:{area:{$ne:0}, continent:"Asia"}},
     {$match: {area: {$ne: 0}, continent: "Asia"}},
     {$project:{
     {$project: {
         _id:0,
         _id: 0,
         name:1,
         name: 1,
         density: {$divide: ["$population","$area"]}
         density: {$divide: ["$population", "$area"]}
     }}
     }}
])
])</nowiki></pre>
</pre>
<pre class="ans"><nowiki>db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}}]);</nowiki></pre>
<div class=ans>
db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}}])
</div>
</div>
</div>


<div class=q data-lang="mongo">
<div class="q" data-lang="mongo">
Because aggregate is a pipeline stages may be repeated, and stages don't have to be used in a specific order.
Because aggregate is a pipeline stages may be repeated, and stages don't have to be used in a specific order.
<p class=strong>Show the name of Asian countries with a density that's over 500 people per km<sup>2</sup>. (population/area)</p>
<p class=strong>Show the name of Asian countries with a density that's over 500 people per km<sup>2</sup>. (population/area)</p>
<pre class=def>
<pre class="def"><nowiki>
db.world.aggregate([
db.world.aggregate([
   {$match:{area:{$ne:0},continent:"Asia"}},
   {$match: {area: {$ne: 0}, continent: "Asia"}},
   {$project:{
   {$project:{
       _id:0,
       _id: 0,
       name:1,
       name: 1,
       density: {$divide: ["$population","$area"]}
       density: {$divide: ["$population", "$area"]}
   }},
   }},
   {$match:{density:{$gt:500}}}
   {$match: {density: {$gt: 500}}}
])
]);</nowiki></pre>
</pre>
<pre class="ans"><nowiki>db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}},{"$match":{"density":{"$gt":500}}}]);</nowiki></pre>
<div class=ans>
db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}},{"$match":{"density":{"$gt":500}}}])
</div>
</div>
</div>


<div class=q data-lang="mongo">
<div class="q" data-lang="mongo">
<code>$sort</code> allows ordering of the results set, where 1 is ascending and -1 is descending.<br/>
<code>$sort</code> allows ordering of the results set, where 1 is ascending and -1 is descending.<br/>
Note that not including <code>$match</code> is the same as <code>{"$match":{}}</code>
Note that not including <code>$match</code> is the same as <code>{"$match":{}}</code>
<p class=strong>Show the name of all countries in descending order.</p>
<p class=strong>Show the name of all countries in descending order.</p>
<pre class=def>
<pre class="def"><nowiki>
db.world.aggregate([
db.world.aggregate([
     {"$project":{
     {"$project":{
Line 84: Line 72:
         "name":-1
         "name":-1
     }}   
     }}   
])
])</nowiki></pre>
</pre>
<div class="ans"><nowiki>db.world.aggregate([{"$project":{"_id":0,"name":1,}},{"$sort":{"name":-1}}])</nowiki></div>
<div class=ans>
db.world.aggregate([{"$project":{"_id":0,"name":1,}},{"$sort":{"name":-1}}])
</div>
</div>
</div>


Line 135: Line 120:
To operate over the whole document (which would have the same effect as <code>"_id":"$name"</code>) <code>"_id":"null"</code> or <code>"_id":None</code> can be used.
To operate over the whole document (which would have the same effect as <code>"_id":"$name"</code>) <code>"_id":"null"</code> or <code>"_id":None</code> can be used.
==group operators==
==group operators==
<div class=q data-lang="mongo">  
<div class="q" data-lang="mongo">  
<code>$max</code> and <code>$min</code> can be used to get the largest and smallest values in a group.
<code>$max</code> and <code>$min</code> can be used to get the largest and smallest values in a group.
<p class=strong>Get the smallest and largest GDPs of each continent.</p>
<p class="strong">Get the smallest and largest GDPs of each continent.</p>
<pre class=def>
<pre class="def">
db.world.aggregate([
db.world.aggregate([
     {$group:{
     {$group: {
         _id:'$continent',
         _id: '$continent',
         min:{$min:"$gdp"},
         min: {$min: "$gdp"},
         max:{$max:"$gdp"}
         max: {$max: "$gdp"}
     }},
     }},
     {$project:{
     {$project: {
         _id:1,
         _id: 1,
         min:1,
         min: 1,
         max:1
         max: 1
     }},
     }}
])
]);</nowiki></pre>
</pre>
<div class="ans"><nowiki>db.world.aggregate([{"$group":{'_id':'$continent','min':{"$min":"$gdp"},'max':{"$max":"$gdp"}}},{"$project":{"_id":1,"min":1,"max":1}}]);</nowiki></div>
<div class=ans>
db.world.aggregate([{"$group":{'_id':'$continent','min':{"$min":"$gdp"},'max':{"$max":"$gdp"}}},{"$project":{"_id":1,"min":1,"max":1}},])
</div>
</div>
</div>


<div class=q data-lang="mongo">
<div class="q" data-lang="mongo">
Some other useful aggregate functions to know are <code>$sum</code> and average: <code>$avg</code><br/>
Some other useful aggregate functions to know are <code>$sum</code> and average: <code>$avg</code><br/>
This example combines all the material in these examples.  
This example combines all the material in these examples.  
<p class=strong>Order the continents in descending order by total GDP, Include the average GDP for each country.</p>
<p class=strong>Order the continents in descending order by total GDP, Include the average GDP for each country.</p>
<pre class=def>
<pre class="def">
db.world.aggregate([
db.world.aggregate([
     {$match:{}},
     {$match: {}},
     {$group:{
     {$group: {
         _id:"$continent",
         _id:"$continent",
         "Total GDP": {"$sum": "$gdp"},
         "Total GDP": {"$sum": "$gdp"},
         "Average GDP": {"$avg": "$gdp"}
         "Average GDP": {"$avg": "$gdp"}
     }},
     }},
     {$sort:{
     {$sort: {
         "Total GDP":-1
         "Total GDP":-1
     }},
     }},
     {$project:{
     {$project:{
         Area:"$_id",
         "Area": "$_id",
         "Total GDP": 1,
         "Total GDP": 1,
         "Average GDP":1,
         "Average GDP": 1,
         _id:0
         _id: 0
     }}
     }}
])
]);</nowiki></pre>
</pre>
<div class="ans"><nowiki>db.world.aggregate([{"$group":{"_id":"$continent","Total GDP":{"$sum":"$gdp"},"Average GDP":{"$avg":"$gdp"}}},{"$sort":{"Total GDP":-1}},{"$project":{"Area":"$_id","Total GDP":1,"Average GDP":1,"_id":0}}]);</nowiki></div>
<div class=ans>
db.world.aggregate([{"$group":{"_id":"$continent","Total GDP":{"$sum":"$gdp"},"Average GDP":{"$avg":"$gdp"}}},{"$sort":{"Total GDP":-1}},{"$project":{"Area":"$_id","Total GDP":1,"Average GDP":1,"_id":0}}])
</div>
</div>
</div>


<div class=q data-lang="mongo">
<div class="q" data-lang="mongo">
Using Conditions<br/><br/>
Using Conditions<br/><br/>
<code>$cond</code> is similar to a <code>CASE</code> statement in other languages.<br/>
<code>$cond</code> is similar to a <code>CASE</code> statement in other languages.<br/>
It has the form <code>"$cond": [{<comparison> :[<field or value>,<field or value>]},<true case>,<false case>]</code><br/><br/>
It has the form <code>"$cond": [{<comparison> :[<field or value>,<field or value>]},<true case>,<false case>]</code><br/><br/>
<pre class=def>
<pre class="def"><nowiki>
db.world.aggregate([
db.world.aggregate([
     {$group:{
     {$group: {
         _id:{
         _id: {
             $cond: [{"$eq":["$continent","Eurasia"]},"Europe","$continent"]
             $cond: [{"$eq": ["$continent", "Eurasia"]}, "Europe", "$continent"]
         },
         },
         area:{$sum: "$area"}
         area: {$sum: "$area"}
     }},
     }},
     {$sort:{
     {$sort: {
         area: -1
         area: -1
     }},
     }},
     {$project:{
     {$project: {
         _id:1,
         _id: 1,
         area:1
         area: 1
     }}
     }}
])
]);
</pre>
</nowiki></pre>

Revision as of 19:42, 21 June 2018

Introducing the aggregation framework

These examples introduce the aggregation framework and its operators. Again we will be using the collection world

$match performs queries in a similar way to find()

Show all the details for France

db.world.aggregate([
    {$match: {name: "France"}}
]);
db.world.aggregate([{$match:{name:"France"}}]);
limit sets the amount of documents to be handed to the next stage in the pipeline.

Return the first document

db.world.aggregate([
    {$limit: 1}
]);
db.world.aggregate([{"$limit":1}]);

$project selects what fields to display.
It can also has the ability to create new fields and to compare fields against each other without using $where

Show the name and population density of all Asian countries. (population/area)

Note that "density" is a new field, made from the result of dividing two existing fields, and that $divide is an aggregate function.

To avoid diving by 0 insert a $match to remove any countries with 0 area (Vatican City), then pipe these results through to $project
There is no need to check if values are null, MongoDB will ignore these documents.

db.world.aggregate([
    {$match: {area: {$ne: 0}, continent: "Asia"}},
    {$project: {
        _id: 0,
        name: 1,
        density: {$divide: ["$population", "$area"]}
    }}
])
db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}}]);

Because aggregate is a pipeline stages may be repeated, and stages don't have to be used in a specific order.

Show the name of Asian countries with a density that's over 500 people per km2. (population/area)

db.world.aggregate([
   {$match: {area: {$ne: 0}, continent: "Asia"}},
   {$project:{
       _id: 0,
       name: 1,
       density: {$divide: ["$population", "$area"]}
   }},
   {$match: {density: {$gt: 500}}}
]);
db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}},{"$match":{"density":{"$gt":500}}}]);

$sort allows ordering of the results set, where 1 is ascending and -1 is descending.
Note that not including $match is the same as {"$match":{}}

Show the name of all countries in descending order.

db.world.aggregate([
    {"$project":{
        "_id":0,
        "name":1,
    }},
    {"$sort":{
        "name":-1
    }}  
])
db.world.aggregate([{"$project":{"_id":0,"name":1,}},{"$sort":{"name":-1}}])

Grouping

Grouping provides accumulator operations such as $sum
All groups must have an _id. To see why this is useful imagine the following:

So far you've been using the collection world
As every country has a continent, it would make sense to have countries as a nested document inside continents: e.g:

[    
    {"name":"Africa",
     "countries":[
         {"name":"Algeria",    "capital":"Algiers",   ...},
         {"name":"Angola",     "capital":"Luanda",    ...},
         {"name":"Benin",      "capital":"Porto-Novo",...}.
         {...},
         ...
    ]},
    {"name":"Asia",
     "countries":[
         {"name":"Afghanistan","capital":"Kabul", ...},
         {"name":"Azerbaijan", "capital":"Baku",  ...},
         {"name":"Bahrain",    "capital":"Manama",...},
         {...},
         ...
    ]},
    {...},
    ...
]

The world collection isn't like this however. It uses the following structure, which has a redundancy where continent is repeated for each country.

[ 
    {"name":"Afghanistan","capital":"Kabul",  "continent":"Asia",  ...},
    {"name":"Albania",    "capital":"Tirana", "continent":"Europe, ...},
    {"name":"Algeria",    "capital":"Algiers","contiennt":"Africa",...},
    {...},
    ...
]

The code to group by continent is "_id":"$continent"
If instead the question was to group by country the code would be "_id":"$name".
To operate over the whole document (which would have the same effect as "_id":"$name") "_id":"null" or "_id":None can be used.

group operators

$max and $min can be used to get the largest and smallest values in a group.

Get the smallest and largest GDPs of each continent.

db.world.aggregate([
    {$group: {
        _id: '$continent',
        min: {$min: "$gdp"},
        max: {$max: "$gdp"}
    }},
    {$project: {
        _id: 1,
        min: 1,
        max: 1
    }}
]);</nowiki>
db.world.aggregate([{"$group":{'_id':'$continent','min':{"$min":"$gdp"},'max':{"$max":"$gdp"}}},{"$project":{"_id":1,"min":1,"max":1}}]);

Some other useful aggregate functions to know are $sum and average: $avg
This example combines all the material in these examples.

Order the continents in descending order by total GDP, Include the average GDP for each country.

db.world.aggregate([
    {$match: {}},
    {$group: {
        _id:"$continent",
        "Total GDP": {"$sum": "$gdp"},
        "Average GDP": {"$avg": "$gdp"}
    }},
    {$sort: {
        "Total GDP":-1
    }},
    {$project:{
        "Area": "$_id",
        "Total GDP": 1,
        "Average GDP": 1,
        _id: 0
    }}
]);</nowiki>
db.world.aggregate([{"$group":{"_id":"$continent","Total GDP":{"$sum":"$gdp"},"Average GDP":{"$avg":"$gdp"}}},{"$sort":{"Total GDP":-1}},{"$project":{"Area":"$_id","Total GDP":1,"Average GDP":1,"_id":0}}]);

Using Conditions

$cond is similar to a CASE statement in other languages.
It has the form "$cond": [{<comparison> :[<field or value>,<field or value>]},<true case>,<false case>]

db.world.aggregate([
    {$group: {
        _id: {
            $cond: [{"$eq": ["$continent", "Eurasia"]}, "Europe", "$continent"]
        },
        area: {$sum: "$area"}
    }},
    {$sort: {
        area: -1
    }},
    {$project: {
        _id: 1,
        area: 1
    }}
]);
  • You have been served by: dill