Cookies help us deliver our services. By using our services, you agree to our use of cookies. More information

Difference between revisions of "AGGREGATE examples"

From NoSQLZoo
Jump to: navigation, search
Line 85: Line 85:
 
<div class=q data-lang="py3">
 
<div class=q data-lang="py3">
 
<code>$max</code> and <code>$min</code> can be used to get the largest and smallest values in a group.
 
<code>$max</code> and <code>$min</code> can be used to get the largest and smallest values in a group.
<p class=strong>Show the country with the largest gdp and the one with the lowest</p>
+
<p class=strong>Get the smallest and largest GDPs.</p>
 
<pre class=def>
 
<pre class=def>
 
pp.pprint(list(
 
pp.pprint(list(
 
     db.world.aggregate([
 
     db.world.aggregate([
 
         {"$group":{
 
         {"$group":{
             "_id": $name,
+
             '_id':'null',
             smallest: {"$min": "$gdp"},
+
             'min':{"$min":"$gdp"},
             largest: {"$max": "$gdp"}
+
             'max':{"$max":"$gdp"},
 +
        }},
 +
        {"$project":{
 +
          "_id":0, 
 +
          "min":1,
 +
          "max":1
 
         }},
 
         }},
 
     ])
 
     ])
Line 98: Line 103:
 
</pre>
 
</pre>
 
<div class=ans>
 
<div class=ans>
pp.pprint(list(db.world.aggregate([{"$group":{"_id":$name,smallest:{"$min":"$gdp"},largest:{"$max":"$gdp"}}},])))
+
pp.pprint(list(db.world.aggregate([{"$group":{'_id':'null','min':{"$min":"$gdp"},'max':{"$max":"$gdp"},}},{"$project":{"_id":0,"min":1,"max":1}},])))
 
</div>
 
</div>
 
</div>
 
</div>

Revision as of 21:21, 16 July 2015

#ENCODING
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-16')
#MONGO
from pymongo import MongoClient
client = MongoClient()
client.progzoo.authenticate('scott','tiger')
db = client['progzoo']
#PRETTY
import pprint
pp = pprint.PrettyPrinter(indent=4)

Introducing the aggregation framework

These examples introduce the aggregation framework and its operators. Again we will be using the collection world

$match Allows us to perform queries in a similar way to find()

Show all the details for France

pp.pprint(list(
    db.world.aggregate([
        {"$match":{"name":"France"}}
    ])
))

pp.pprint(list(db.world.aggregate([{"$match":{"name":"France"}}])))

$project Allows us to select what fields to display.
It can also has the ability to insert new fields and allows you to compare fields against each other without using $where

Show the name and population density of all Asian countries. (population/area)

Note that "density" is a new field, made from the result of dividing two existing fields, and that $divide is an aggregate function.

To avoid diving by 0 we do a $match to remove any countries with negligible area, then pipe these results through to $project

pp.pprint(list(
     db.world.aggregate([
        {"$match":{"area":{"$ne":0},"continent":"Asia"}},
        {"$project":{
            "_id":0,
            "name":1,
            "density": {"$divide": ["$population","$area"]}
        }}
    ])
))

pp.pprint(list(db.world.aggregate([{"$match":{"area":{"$ne":0},"continent":"Asia"}},{"$project":{"_id":0,"name":1,"density":{"$divide":["$population","$area"]}}}])))

$sort Allows us to choose how the results are displayed, where 1 is ascending and -1 is descending.
Note that excluding $match is the same as {"$match":{}}

Show the name of all countries in descending order.

pp.pprint(list(
     db.world.aggregate([
        {"$project":{
            "_id":0,
            "name":1,
        }},
        {"$sort":{
            "name":-1
        }}  
    ])
))

pp.pprint(list(db.world.aggregate([{"$project":{"_id":0,"name":1,}},{"$sort":{"name":-1}}])))

Grouping

Grouping allows us to use accumulator operations sum as $sum
All groups must have an _id, but this can be set to null.

$max and $min can be used to get the largest and smallest values in a group.

Get the smallest and largest GDPs.

pp.pprint(list(
     db.world.aggregate([
        {"$group":{
            '_id':'null',
            'min':{"$min":"$gdp"},
            'max':{"$max":"$gdp"},
        }},
        {"$project":{
          "_id":0,  
          "min":1,
          "max":1
        }},
    ])
))

pp.pprint(list(db.world.aggregate([{"$group":{'_id':'null','min':{"$min":"$gdp"},'max':{"$max":"$gdp"},}},{"$project":{"_id":0,"min":1,"max":1}},])))