Difference between revisions of "RegEx Pattern Matching"
Revision as of 22:07, 15 July 2015
#ENCODING
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-16')
#MONGO
from pymongo import MongoClient
client = MongoClient()
client.progzoo.authenticate('scott','tiger')
db = client['progzoo']
#PRETTY
import pprint
pp = pprint.PrettyPrinter(indent=4)
Pattern Matching String
This tutorial uses RegEx to check names. We will be using find() on the collection world.
You can use '$regex':"^B"
to get all the countries that start with B.
Find the countries that start with Y
pp.pprint(list( db.world.find({"name":{'$regex':"^F"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"^Y"}},{"name":1,"_id":0})))
You can use '$regex':"a$"
to get all the countries that end with a.
Find the countries that end with y
pp.pprint(list( db.world.find({"name":{'$regex':"l$"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"y$"}},{"name":1,"_id":0})))
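The two anchors can be tried without a database. The following is a minimal sketch using Python's re module on a small, hypothetical list of names (not the contents of the world collection). It assumes that, for simple patterns like these, re.search behaves like an unanchored $regex match; the helper name matching is made up for illustration.

```python
import re

# Hypothetical sample; the real data lives in the world collection.
names = ["Yemen", "Italy", "Norway", "Brazil", "Belgium"]

def matching(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere."""
    return [n for n in candidates if re.search(pattern, n)]

starts_with_b = matching("^B", names)  # ^ ties the match to the start of the name
ends_with_y = matching("y$", names)    # $ ties the match to the end of the name

print(starts_with_b)
print(ends_with_y)
```

Matching is case-sensitive by default, so "y$" does not pick up a name merely because it begins with a capital Y.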
Luxembourg has an x, and so does one other country. List them both.
Find the countries that contain the letter x
pp.pprint(list( db.world.find({"name":{'$regex':"ana"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"x"}},{"name":1,"_id":0})))
Iceland and Switzerland end with land but where are the others?
Find the countries that end with land
pp.pprint(list( db.world.find({"name":{'$regex':"stan$"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"land$"}},{"name":1,"_id":0})))
Colombia starts with a C and ends with ia; there are two other countries like this.
You can use .*
to match any run of zero or more characters (newlines excepted).
Find the countries that start with C and end with ia
pp.pprint(list( db.world.find({"name":{'$regex':"^A.*n$"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"^C.*ia$"}},{"name":1,"_id":0})))
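As a rough local check of how .* behaves between two anchors, the sketch below applies the same pattern with Python's re module. The list of names is a hypothetical sample, and "Cia" is a made-up string used only to show that .* may match nothing at all.

```python
import re

# Hypothetical sample; the real data lives in the world collection.
names = ["Colombia", "Croatia", "Cuba", "Chile", "Cambodia", "India"]

pattern = re.compile("^C.*ia$")
c_to_ia = [n for n in names if pattern.search(n)]

# .* also accepts an empty run, so "C" followed directly by "ia" matches.
empty_middle = bool(pattern.search("Cia"))

print(c_to_ia)
print(empty_middle)
```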
Greece has a double e. Who has a double o?
Find the country that has oo in its name
pp.pprint(list( db.world.find({"name":{'$regex':"ee"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"oo"}},{"name":1,"_id":0})))
Bahamas has three a's. Who else?
Find the country that has three or more a in the name
pp.pprint(list( db.world.find({"name":{'$regex':"^T"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"(.*[aA].*){3}"}},{"name":1,"_id":0})))
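The pattern (.*[aA].*){3} repeats a group three times, and each repetition must contain at least one a or A, so a match requires three or more of them in total. The sketch below, run on a hypothetical list of names with Python's re module, compares the regex against a plain character count to illustrate that the two agree on this sample.

```python
import re

# Hypothetical sample; the real data lives in the world collection.
names = ["Bahamas", "Canada", "Madagascar", "Panama", "Japan", "Australia", "Chad"]

pattern = re.compile("(.*[aA].*){3}")

# Names accepted by the regex.
by_regex = [n for n in names if pattern.search(n)]

# Names with at least three a's, counted case-insensitively.
by_count = [n for n in names if n.lower().count("a") >= 3]

print(by_regex)
print(by_regex == by_count)
```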
India and Angola have n as their second character.
.*
matches zero or more characters, whereas .
matches exactly one.
Find the countries that have "t" as the second character.
pp.pprint(list( db.world.find({"name":{'$regex':"^.n"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"^.t"}},{"name":1,"_id":0})))
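To see why the ^ anchor matters when . stands for a single character, here is a small sketch with Python's re module on a hypothetical list of names. It assumes the same matching behaviour as $regex for these simple patterns.

```python
import re

# Hypothetical sample; the real data lives in the world collection.
names = ["India", "Angola", "Italy", "Ethiopia", "Nepal", "Austria"]

# ^. consumes exactly one character, so the next letter is the second character.
second_is_n = [n for n in names if re.search("^.n", n)]
second_is_t = [n for n in names if re.search("^.t", n)]

# Without ^, ".t" matches a t preceded by any character, anywhere in the name.
t_anywhere = [n for n in names if re.search(".t", n)]

print(second_is_n)
print(second_is_t)
print(t_anywhere)
```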
Lesotho and Moldova both have two o characters separated by two other characters.
Find the countries that have two "o" characters separated by two others.
pp.pprint(list( db.world.find({"name":{'$regex':"o.o"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"o..o"}},{"name":1,"_id":0})))
Cuba and Togo have four character names.
Find the countries that have exactly four characters
pp.pprint(list( db.world.find({"name":{'$regex':"^Cu.*$"}},{"name":1,"_id":0}) ))
pp.pprint(list(db.world.find({"name":{'$regex':"^.{4}$"}},{"name":1,"_id":0})))
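The {4} quantifier only pins the length when both anchors are present. The sketch below, using Python's re module on a hypothetical list of names, contrasts the anchored pattern with the unanchored one, which accepts any name of four or more characters.

```python
import re

# Hypothetical sample; the real data lives in the world collection.
names = ["Cuba", "Togo", "Chad", "France", "USA"]

# Anchored at both ends: the whole name must be exactly four characters.
exactly_four = [n for n in names if re.search("^.{4}$", n)]

# Unanchored: any four consecutive characters will do.
at_least_four = [n for n in names if re.search(".{4}", n)]

print(exactly_four)
print(at_least_four)
```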
Harder Questions
Well done for getting this far.
Some optional, more complex questions are provided below.
The capital of Luxembourg is Luxembourg. Show all the countries where the capital is the same as the name of the country.
You can compare two fields by using $where: db.<collection>.find({"$where":"this.<field1> <<operator>> this.<field2>"})
Find the countries where the name is the same as the capital city.
pp.pprint(list( db.world.find({"$where":"this.name == 'Mexico'"},{"name":1,"_id":0}) ))
pp.pprint(list(
db.world.find({"$where":"this.name == this.capital"},{"name":1,"_id":0})
))
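The $where string is evaluated once per document, with this bound to the document being tested. As a rough illustration of that idea, the sketch below mimics it in plain Python on a few hypothetical documents; the helper where and the sample data are invented for this example and are not part of PyMongo.

```python
# Hypothetical documents shaped like those in the world collection.
countries = [
    {"name": "Luxembourg", "capital": "Luxembourg"},
    {"name": "Mexico", "capital": "Mexico City"},
    {"name": "France", "capital": "Paris"},
    {"name": "Singapore", "capital": "Singapore"},
]

def where(docs, predicate):
    """Keep documents for which predicate(doc) is true, projecting only the name."""
    return [{"name": d["name"]} for d in docs if predicate(d)]

# Counterpart of "this.name == this.capital".
same_name = where(countries, lambda this: this["name"] == this["capital"])

print(same_name)
```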
The capital of Mexico is Mexico City. Show all the countries where the capital is the country name followed by the word "City".
Find the countries where the capital is the country name plus "City".
pp.pprint(list( db.world.find({"$where":"this.capital == 'Mexico'+' City'"},{"name":1,"_id":0}) ))
pp.pprint(list(
db.world.find({"$where":"this.capital == this.name+' City'"},{"name":1,"_id":0})
))
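Because $where runs JavaScript on the server, the same comparisons can also be written with the $expr operator, which uses aggregation expressions instead. The sketch below only builds the filter documents and assumes a MongoDB server recent enough to support $expr (3.6 or later); the variable names are illustrative, and the commented-out call assumes the db and pp objects from the setup code.

```python
# Filter equivalent to "this.name == this.capital".
same_name_filter = {"$expr": {"$eq": ["$name", "$capital"]}}

# Filter equivalent to "this.capital == this.name + ' City'".
name_plus_city_filter = {
    "$expr": {"$eq": ["$capital", {"$concat": ["$name", " City"]}]}
}

# Projection used throughout the tutorial: show the name, hide _id.
projection = {"name": 1, "_id": 0}

# With a live connection, the filters would be passed to find(), for example:
# pp.pprint(list(db.world.find(name_plus_city_filter, projection)))
print(same_name_filter)
print(name_plus_city_filter)
```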