AGGREGATE Tutorial: Difference between revisions
From NoSQLZoo
| No edit summary | |||
| (35 intermediate revisions by 2 users not shown) | |||
| Line 4: | Line 4: | ||
| You may find these [[AGGREGATE examples]] useful. | You may find these [[AGGREGATE examples]] useful. | ||
| <div class='extra_space' style='width:1em; height:6em;'></div> | <div class='extra_space' style='width:1em; height:6em;'></div> | ||
| <div class=q data-lang="mongo"> | ==$group on continent== | ||
| <div class="q" data-lang="mongo"> | |||
| The aggregate method allows a <code>$group</code> - you must specify the <code>_id</code> and you can use aggregating functions such as <code>$sum</code> <code>$min</code> <code>$max</code> <code>$push</code> | |||
| The sample code shows the total population of each continent. | |||
| <div class='imper'>Show the number of countries in each continent.</div> | |||
| <pre class="def"><nowiki> | |||
| db.world.aggregate({ | |||
|   $group: { | |||
|     _id: '$continent', | |||
|     res: { | |||
|       $sum: '$population' | |||
|     } | |||
|   } | |||
| });</nowiki></pre> | |||
| <pre class="ans"><nowiki>db.world.aggregate({ | |||
|   $group: { | |||
|     _id: '$continent', | |||
|     res: { | |||
|       $sum: 1 | |||
|     } | |||
|   } | |||
| }); | |||
| </nowiki></pre> | |||
| </div> | |||
| ==Per Capita GDP== | |||
| <div class="q" data-lang="mongo"> | |||
| Give the <code>name</code> and the <code>per capita GDP</code> for those countries with a <code>population</code> of at least 200 million. | Give the <code>name</code> and the <code>per capita GDP</code> for those countries with a <code>population</code> of at least 200 million. | ||
| <div class="hint" title="How to calculate per capita GDP"> | <div class="hint" title="How to calculate per capita GDP"> | ||
| per capita GDP is the GDP divided by the population. | per capita GDP is the GDP divided by the population. | ||
| </div> | </div> | ||
| <pre class=def> | <pre class="def"><nowiki> | ||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$match:{ |      {$match: { | ||
|          population:{$gte:250000000} |          population: {$gte: 250000000} | ||
|      }}, |      }}, | ||
|      {$project:{ |      {$project: { | ||
|          _id:0, |          _id: 0, | ||
|          name:1, |          name: 1, | ||
|          "per capita GDP": {$divide: ['$gdp',1000000]} |          "per capita GDP": {$divide: ['$gdp', 1000000]} | ||
|      }} |      }} | ||
| ]) | ]);</nowiki></pre> | ||
| </pre> | <pre class="ans"><nowiki>db.world.aggregate([{"$match":{"population":{"$gte":200000000}}},{"$project":{"_id":0,"name":1,"per capita GDP": {"$divide": ["$gdp","$population"]}}}]);</nowiki></pre> | ||
| < | |||
| </ | |||
| </div> | </div> | ||
| <div class=q data-lang="mongo"> | ==Population Density in South America== | ||
| Give the <code>name</code> and the <code>population density</code> of all countries | <div class="q" data-lang="mongo"> | ||
| Give the <code>name</code> and the <code>population density</code> of all countries in South America. | |||
| <div class="hint" title="How to calculate population density"> | <div class="hint" title="How to calculate population density"> | ||
| population density is the population divided by the area | population density is the population divided by the area | ||
| Line 33: | Line 59: | ||
| Use a <code>$match</code>. <code>{"area":{"$ne":0}}</code> | Use a <code>$match</code>. <code>{"area":{"$ne":0}}</code> | ||
| </div> | </div> | ||
| <pre class=def> | <pre class="def"><nowiki> | ||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$project:{ |     {$match: {continent: 'Asia'}}, | ||
|          _id:0, |      {$project: { | ||
|          name:1, |          _id: 0, | ||
|          density: {$divide: ["$population","$area"] |          name: 1, | ||
|          density: {$divide: ["$population", "$area"]} | |||
|      }} |      }} | ||
| ]) | ]);</nowiki></pre> | ||
| <pre class="ans"><nowiki>db.world.aggregate([{$match:{continent:'South America'}},{$project:{_id:0,name:1,density:{$divide:["$population","$area"]}}}]);</nowiki></pre> | |||
| </div> | |||
| ==Population Density for "V"== | |||
| <div class="q" data-lang="mongo"> | |||
| Give the <code>name</code> and the <code>population density</code> of all countries with name after V in the alphabet. | |||
| <p>Note that because Vatican City (with area 0) is in Europe you will get a divide by zero error unless you filter first. | |||
| </p> | |||
| <div class="hint" title="Division by 0 error?"> | |||
| Use a <code>$match</code>. | |||
| <pre> | |||
| {  | |||
|   $match: { | |||
|     area: { | |||
|       "$ne": 0 | |||
|     } | |||
|   } | |||
| } | |||
| </pre> | </pre> | ||
| </div> | </div> | ||
| <pre class="def"><nowiki> | |||
| db.world.aggregate([ | |||
|     {$match: {name: {$gt: 'V'}}}, | |||
|     {$project: { | |||
|         _id: 0, | |||
|         name: 1, | |||
|         area: 1 | |||
|     }} | |||
| ]);</nowiki></pre> | |||
| <pre class="ans"><nowiki>db.world.aggregate([{$match:{name:{$gt:'V'}}},{$match:{area:{"$ne":0}}},{$project:{_id:0,name:1,density:{$divide:["$population","$area"]}}}]);</nowiki></pre> | |||
| </div> | </div> | ||
| ==Population in millions== | |||
| <div class=q data-lang="mongo"> | <div class="q" data-lang="mongo"> | ||
| Show the <code>name</code> and <code>population</code> in millions for the countries of the continent <b>South America</b>. Divide the population by 1000000 to get population in millions.   | Show the <code>name</code> and <code>population</code> in millions for the countries of the continent <b>South America</b>. Divide the population by 1000000 to get population in millions.   | ||
| <pre class=def> | <pre class="def"><nowiki> | ||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$match:{ |      {$match:{ | ||
|      }}, |      }}, | ||
|      {$project:{ |      {$project:{ | ||
|          _id:0, |          _id: 0, | ||
|          name:1 |          name: 1 | ||
|      }} |      }} | ||
| ]) | ]);</nowiki></pre> | ||
| </pre> | <pre class="ans"><nowiki>db.world.aggregate([{"$match":{"continent":{"$eq":"South America"}}},{"$project":{"_id":0,"name":1,"population":{"$divide":["$population",1000000]}}}]);</nowiki></pre> | ||
| < | |||
| db.world.aggregate([{"$match":{"continent":{"$eq":"South America"}}},{"$project":{"_id":0,"name":1,"population":{"$divide":["$population",1000000]}}}]) | |||
| </ | |||
| </div> | </div> | ||
| ==Population density== | |||
| <div class=q data-lang="mongo"> | <div class="q" data-lang="mongo"> | ||
| Show the <code>name</code> and <code>population density</code> for <b>France</b>, <b>Germany</b>, and <b>Italy</b> | Show the <code>name</code> and <code>population density</code> for <b>France</b>, <b>Germany</b>, and <b>Italy</b> | ||
| <pre class=def> | <pre class="def"><nowiki> | ||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$match:{ |      {$match:{ | ||
|          name: {$in:['United Kingdom','United States','Brazil']}, |          name: {$in: ['United Kingdom', 'United States', 'Brazil']}, | ||
|          population: {$ne: null}, |          population: {$ne: null}, | ||
|          area: {$ne: 0} |          area: {$ne: 0} | ||
|      }}, |      }}, | ||
|      {$project:{ |      {$project:{ | ||
|          _id:0, |          _id: 0, | ||
|          name:1 |          name: 1 | ||
|      }} |      }} | ||
| ]) | ]);</nowiki></pre> | ||
| </pre> | <pre class="ans"><nowiki>db.world.aggregate([{"$match":{"name":{"$in":['France','Germany','Italy']},"population":{"$ne":null},"area":{"$ne":0}}},{"$project":{"_id":0,"name":1,"population density":{"$divide":["$population","$area"]}}}]);</nowiki></pre> | ||
| < | |||
| db.world.aggregate([{"$match":{"name":{"$in":['France','Germany','Italy']},"population":{"$ne":null},"area":{"$ne":0}}},{"$project":{"_id":0,"name":1,"population density":{"$divide":["$population","$area"]}}}]) | |||
| </ | |||
| </div> | </div> | ||
| ==Continents by area== | |||
| <div class=q data-lang="mongo"> | <div class="q" data-lang="mongo"> | ||
| Order the <code>continents</code> by <code>area</code> from most to least. | Order the <code>continents</code> by <code>area</code> from most to least. | ||
| <pre class=def> | <pre class="def"><nowiki> | ||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$group:{ |      {$group: { | ||
|          _id:"$name", |          _id: "$name", | ||
|          area:{$max: "$area"} |          area: {$max: "$area"} | ||
|      }}, |      }}, | ||
|      {$sort:{ |      {$sort: { | ||
|          area: -1 |          area: -1 | ||
|      }}, |      }}, | ||
|      {$project:{ |      {$project: { | ||
|          _id:1, |          _id: 1, | ||
|          area:1 |          area: 1 | ||
|      }} |      }} | ||
| ]) | ]);</nowiki></pre> | ||
| </pre> | <pre class="ans"><nowiki>db.world.aggregate([{"$group":{"_id":"$continent","area":{"$sum":"$area"}}},{"$sort":{"area":-1}},{"$project":{"_id":1,"area":1}}]);</nowiki></pre> | ||
| < | |||
| </ | |||
| </div> | </div> | ||
| == | ==Big Continents== | ||
| <div class=q data-lang="mongo"> | <div class="q" data-lang="mongo"> | ||
| Show the only two continents with total area greater than 25000000 and then sort from largest to smallest. | |||
| <pre class=def> | <pre class="def"><nowiki> | ||
| db.world.aggregate([ | db.world.aggregate([ | ||
|    {$match:{ |    {$match: { | ||
|        continent:"North America" |        continent: "North America" | ||
|    }}, |    }}, | ||
|    {$project:{ |    {$project: { | ||
|        _id:0, |        _id: 0, | ||
|        name:1 |        name: 1 | ||
|    }} |    }} | ||
| ]) | ]);</nowiki></pre> | ||
| </pre> | <pre class="ans"><nowiki>db.world.aggregate([{$group:{_id:"$continent",area:{$sum:"$area"}}},{$sort:{area:-1}},{$match:{area:{$gt:25000000}}}]);</nowiki></pre> | ||
| < | |||
| db.world.aggregate([{ | |||
| </div> | </div> | ||
| ==First and last country by continent== | |||
| <div class="q" data-lang="mongo"> | |||
| For each continent show the first and last country alphabetically like this: | |||
| <syntaxhighlight lang="JavaScript> | |||
|  { "_id" : "Africa", "from" : "Algeria", "to" : "Zimbabwe" } | |||
|  { "_id" : "Asia", "from" : "Afghanistan", "to" : "Yemen" } | |||
|  { "_id" : "Caribbean", "from" : "Antigua and Barbuda", "to" : "Trinidad and Tobago" } | |||
|  { "_id" : "Eurasia", "from" : "Armenia", "to" : "Russia" } | |||
|  { "_id" : "Europe", "from" : "Albania", "to" : "Vatican City" } | |||
|  { "_id" : "North America", "from" : "Belize", "to" : "United States" } | |||
|  { "_id" : "Oceania", "from" : "Australia", "to" : "Vanuatu" } | |||
|  { "_id" : "South America", "from" : "Argentina", "to" : "Venezuela" } | |||
| </syntaxhighlight> | |||
| <pre class="def"><nowiki> | |||
| db.world.aggregate([ | |||
|   {$group: { | |||
|       _id: "$continent" | |||
|   }}, | |||
|   {$sort: { | |||
|       _id: 1 | |||
|   }} | |||
| ]);</nowiki></pre> | |||
| <pre class="ans"><nowiki>db.world.aggregate([{$sort:{name:1}},{$group:{_id:'$continent',from:{$first:'$name'},to:{$last:'$name'}},},{$sort:{_id:1}}])</nowiki></pre> | |||
| </div> | </div> | ||
| <div class=q data-lang=" | ==Countries beginning with...== | ||
| <div class="q" data-lang="mongo"> | |||
| <pre class=def> | Group countries according to the first letter of the name. As shown. | ||
| Only give "U" through to "Z". | |||
| You will need to use the '''$substr''' function and the '''$push''' aggregate function. | |||
| <syntaxhighlight lang="JavaScript"> | |||
|  { "_id" : "U", "list" : [ "Uganda", "Ukraine", "United Arab Emirates", "United Kingdom", "United States", "Uruguay", "Uzbekistan" ] } | |||
|  { "_id" : "V", "list" : [ "Vanuatu", "Vatican City", "Venezuela", "Vietnam" ] } | |||
|  { "_id" : "Y", "list" : [ "Yemen" ] } | |||
|  { "_id" : "Z", "list" : [ "Zambia", "Zimbabwe" ] } | |||
| </syntaxhighlight> | |||
| <pre class="def"><nowiki> | |||
| db.world.aggregate([ | db.world.aggregate([ | ||
|   {$project: { | |||
|     _id: '$name', | |||
|     startsWith: {$substr: ['$name', 0, 1]} | |||
|   }}, | |||
|   {$match: { | |||
|      _id: {$gte: 'U'} | |||
|   }}, | |||
|      {$ |   {$sort: {_id: 1}} | ||
| ]);</nowiki></pre> | |||
| <pre class="ans"><nowiki>db.world.aggregate([{$group:{_id:{$substr:['$name',0,1]},list:{$push:'$name'}}},{$match:{_id:{$gte:'U'}}},{$sort:{_id:1}}]);</nowiki></pre> | |||
| ]) | |||
| </pre> | |||
| < | |||
| db.world.aggregate([{ | |||
| </ | |||
| </div> | </div> | ||
| <div class=q data-lang="mongo"> | ==Harder Questions== | ||
| ==Messing with continent names== | |||
| <pre class=def> | <div class="q" data-lang="mongo"> | ||
| Combine <b>North America</b> and <b>South America</b> to <b>America</b>, and then list the continents by area. Biggest first. | |||
| <pre class="def"><nowiki> | |||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$ |   {$group: { | ||
|      _id: { | |||
|      }}, |       $cond: [ | ||
|         {$eq: ["$continent", "North America"]}, | |||
|          "America", | |||
|         {$cond: [ | |||
|           {$eq: ["$continent", "Asia"]}, | |||
| ]) |           "The East", | ||
| </pre> |           "$continent" | ||
| < |         ]} | ||
| db.world.aggregate([{"$ |       ] | ||
| </ |      }, | ||
|     area: {$sum: "$area"} | |||
|   }}, | |||
|   {$sort: {area: -1}}, | |||
|   {$project: { | |||
|     _id: 1, | |||
|     area: 1 | |||
|   }} | |||
| ]);</nowiki></pre> | |||
| <pre class="ans"><nowiki>db.world.aggregate([{"$group":{"_id":{"$cond":[{"$eq":["$continent","South America"]},"America",{"$cond":[{"$eq":["$continent","North America"]},"America","$continent"]}]},"area":{"$sum":"$area"}}},{"$sort":{"area":-1}},{"$project":{"_id":1,"area":1}}]); </nowiki></pre> | |||
| </div> | </div> | ||
| <div class=q data-lang="mongo"> | ==Country populations by order of magnitude== | ||
| Show the  | <div class="q" data-lang="mongo"> | ||
| Show the number of countries in each order. | |||
| For example UK has a population of 64,105,700 so it belongs in the 10,000,000 bucket. | |||
| <pre class=def>< | |||
| You will need the functions $floor, $log10 and $pow | |||
| <pre class="def"><nowiki> | |||
| db.world.aggregate([ | db.world.aggregate([ | ||
|      {$match:{ |      {$match: { | ||
|          name:{$regex:"^ |          name: {$regex: "^N"} | ||
|      }}, |      }}, | ||
|      {$project:{ |      {$project: { | ||
|          _id:0, |          _id: 0, | ||
|          name:1 |          name: 1 | ||
|      }} |      }} | ||
| ]) | ]);</nowiki></pre> | ||
| </ | <pre class="ans"><nowiki>db.world.aggregate([ | ||
| </ |     {$project: {name:1,l10:{$pow:[10,{$floor:{$log10:"$population"}}]}}}, | ||
|     {$group:{_id:"$l10",c:{$sum:1}}}, | |||
| < |     {$sort:{_id:1}} | ||
| ]); | |||
| </nowiki></pre> | |||
| < | |||
| </ | |||
| </div> | </div> | ||
Latest revision as of 22:53, 31 March 2021
Country Profile
For these questions you should use aggregate([]) on the collection world
You may find these AGGREGATE examples useful.
$group on continent
The aggregate method allows a $group - you must specify the _id and you can use aggregating functions such as $sum $min $max $push
The sample code shows the total population of each continent.
db.world.aggregate({
  $group: {
    _id: '$continent',
    res: {
      $sum: '$population'
    }
  }
});
db.world.aggregate({
  $group: {
    _id: '$continent',
    res: {
      $sum: 1
    }
  }
});
Per Capita GDP
Give the name and the per capita GDP for those countries with a population of at least 200 million.
per capita GDP is the GDP divided by the population.
db.world.aggregate([
    {$match: {
        population: {$gte: 250000000}
    }},
    {$project: {
        _id: 0,
        name: 1,
        "per capita GDP": {$divide: ['$gdp', 1000000]}
    }}
]);
db.world.aggregate([{"$match":{"population":{"$gte":200000000}}},{"$project":{"_id":0,"name":1,"per capita GDP": {"$divide": ["$gdp","$population"]}}}]);
Population Density in South America
Give the name and the population density of all countries in South America.
population density is the population divided by the area
Use a $match. {"area":{"$ne":0}}
db.world.aggregate([
    {$match: {continent: 'Asia'}},
    {$project: {
        _id: 0,
        name: 1,
        density: {$divide: ["$population", "$area"]}
    }}
]);
db.world.aggregate([{$match:{continent:'South America'}},{$project:{_id:0,name:1,density:{$divide:["$population","$area"]}}}]);
Population Density for "V"
Give the name and the population density of all countries with name after V in the alphabet.
Note that because Vatican City (with area 0) is in Europe you will get a divide by zero error unless you filter first.
Use a $match.
{ 
  $match: {
    area: {
      "$ne": 0
    }
  }
}
db.world.aggregate([
    {$match: {name: {$gt: 'V'}}},
    {$project: {
        _id: 0,
        name: 1,
        area: 1
    }}
]);
db.world.aggregate([{$match:{name:{$gt:'V'}}},{$match:{area:{"$ne":0}}},{$project:{_id:0,name:1,density:{$divide:["$population","$area"]}}}]);
Population in millions
Show the name and population in millions for the countries of the continent South America. Divide the population by 1000000 to get population in millions. 
db.world.aggregate([
    {$match:{
    }},
    {$project:{
        _id: 0,
        name: 1
    }}
]);
db.world.aggregate([{"$match":{"continent":{"$eq":"South America"}}},{"$project":{"_id":0,"name":1,"population":{"$divide":["$population",1000000]}}}]);
Population density
Show the name and population density for France, Germany, and Italy
db.world.aggregate([
    {$match:{
        name: {$in: ['United Kingdom', 'United States', 'Brazil']},
        population: {$ne: null},
        area: {$ne: 0}
    }},
    {$project:{
        _id: 0,
        name: 1
    }}
]);
db.world.aggregate([{"$match":{"name":{"$in":['France','Germany','Italy']},"population":{"$ne":null},"area":{"$ne":0}}},{"$project":{"_id":0,"name":1,"population density":{"$divide":["$population","$area"]}}}]);
Continents by area
Order the continents by area from most to least.
db.world.aggregate([
    {$group: {
        _id: "$name",
        area: {$max: "$area"}
    }},
    {$sort: {
        area: -1
    }},
    {$project: {
        _id: 1,
        area: 1
    }}
]);
db.world.aggregate([{"$group":{"_id":"$continent","area":{"$sum":"$area"}}},{"$sort":{"area":-1}},{"$project":{"_id":1,"area":1}}]);
Big Continents
Show the only two continents with total area greater than 25000000 and then sort from largest to smallest.
db.world.aggregate([
  {$match: {
      continent: "North America"
  }},
  {$project: {
      _id: 0,
      name: 1
  }}
]);
db.world.aggregate([{$group:{_id:"$continent",area:{$sum:"$area"}}},{$sort:{area:-1}},{$match:{area:{$gt:25000000}}}]);
First and last country by continent
For each continent show the first and last country alphabetically like this:
 { "_id" : "Africa", "from" : "Algeria", "to" : "Zimbabwe" }
 { "_id" : "Asia", "from" : "Afghanistan", "to" : "Yemen" }
 { "_id" : "Caribbean", "from" : "Antigua and Barbuda", "to" : "Trinidad and Tobago" }
 { "_id" : "Eurasia", "from" : "Armenia", "to" : "Russia" }
 { "_id" : "Europe", "from" : "Albania", "to" : "Vatican City" }
 { "_id" : "North America", "from" : "Belize", "to" : "United States" }
 { "_id" : "Oceania", "from" : "Australia", "to" : "Vanuatu" }
 { "_id" : "South America", "from" : "Argentina", "to" : "Venezuela" }
db.world.aggregate([
  {$group: {
      _id: "$continent"
  }},
  {$sort: {
      _id: 1
  }}
]);
db.world.aggregate([{$sort:{name:1}},{$group:{_id:'$continent',from:{$first:'$name'},to:{$last:'$name'}},},{$sort:{_id:1}}])
Countries beginning with...
Group countries according to the first letter of the name. As shown. Only give "U" through to "Z".
You will need to use the $substr function and the $push aggregate function.
 { "_id" : "U", "list" : [ "Uganda", "Ukraine", "United Arab Emirates", "United Kingdom", "United States", "Uruguay", "Uzbekistan" ] }
 { "_id" : "V", "list" : [ "Vanuatu", "Vatican City", "Venezuela", "Vietnam" ] }
 { "_id" : "Y", "list" : [ "Yemen" ] }
 { "_id" : "Z", "list" : [ "Zambia", "Zimbabwe" ] }
db.world.aggregate([
  {$project: {
    _id: '$name',
    startsWith: {$substr: ['$name', 0, 1]}
  }},
  {$match: {
    _id: {$gte: 'U'}
  }},
  {$sort: {_id: 1}}
]);
db.world.aggregate([{$group:{_id:{$substr:['$name',0,1]},list:{$push:'$name'}}},{$match:{_id:{$gte:'U'}}},{$sort:{_id:1}}]);
Harder Questions
Messing with continent names
Combine North America and South America to America, and then list the continents by area. Biggest first.
db.world.aggregate([
  {$group: {
    _id: {
      $cond: [
        {$eq: ["$continent", "North America"]},
        "America",
        {$cond: [
          {$eq: ["$continent", "Asia"]},
          "The East",
          "$continent"
        ]}
      ]
    },
    area: {$sum: "$area"}
  }},
  {$sort: {area: -1}},
  {$project: {
    _id: 1,
    area: 1
  }}
]);
db.world.aggregate([{"$group":{"_id":{"$cond":[{"$eq":["$continent","South America"]},"America",{"$cond":[{"$eq":["$continent","North America"]},"America","$continent"]}]},"area":{"$sum":"$area"}}},{"$sort":{"area":-1}},{"$project":{"_id":1,"area":1}}]); 
Country populations by order of magnitude
Show the number of countries in each order.
For example UK has a population of 64,105,700 so it belongs in the 10,000,000 bucket.
You will need the functions $floor, $log10 and $pow
db.world.aggregate([
    {$match: {
        name: {$regex: "^N"}
    }},
    {$project: {
        _id: 0,
        name: 1
    }}
]);
db.world.aggregate([
    {$project: {name:1,l10:{$pow:[10,{$floor:{$log10:"$population"}}]}}},
    {$group:{_id:"$l10",c:{$sum:1}}},
    {$sort:{_id:1}}
]);