Python: Scrape pages and extract information


I was amazed at how incredibly easy it was to scrape pages using Python.

To download the page markup, use:

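import urllib.request

# Python 3: urlopen lives in urllib.request and read() returns bytes, so decode to text
content = urllib.request.urlopen("http://finance.google.com/finance?q=IBM").read().decode("utf-8", errors="replace")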
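
If you want something a little more defensive than the one-liner, the download step could be wrapped up roughly like this. It is only a sketch and assumes Python 3; the `fetch_page` name, the 10 second timeout and the User-Agent value are my own placeholders, not part of the original snippet.

```python
import urllib.request


def fetch_page(url, timeout=10.0):
    """Download a URL and return its body as text (hypothetical helper)."""
    # Some sites reject the default Python user agent; this value is a placeholder.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in the response headers, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    # Replace undecodable bytes instead of raising, since we only want to search the text.
    return raw.decode(charset, errors="replace")
```

Calling `fetch_page("http://finance.google.com/finance?q=IBM")` would then give you the same kind of string as `content` above, provided the page is still reachable.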

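Once you have the content, use a regular expression to pull out the bit you want. The pattern below finds the first tag with class="pr" and captures the text that follows it, up to the next "<".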

import re

# Lazily skip to the end of the class="pr" tag, then capture text up to the next "<"
m = re.search(r'class="pr".*?>(.*?)<', content)

if m:
    quote = m.group(1)
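
To see the pattern working without hitting the network, here is a small sketch that runs it against a made-up snippet of markup. The `extract_quote` name and the sample HTML are assumptions for illustration; the real page may be laid out differently, in which case the pattern would need adjusting.

```python
import re

# Same pattern as above, compiled once so it can be reused.
QUOTE_PATTERN = re.compile(r'class="pr".*?>(.*?)<')


def extract_quote(content):
    """Return the text captured after the first class="pr" tag, or None if absent."""
    m = QUOTE_PATTERN.search(content)
    return m.group(1).strip() if m else None


# Made-up markup for illustration only.
sample = '<div><span class="pr" id="x">123.45</span></div>'
print(extract_quote(sample))                  # prints 123.45
print(extract_quote("<p>no quote here</p>"))  # prints None
```

Returning None when nothing matches saves you from a NameError later on, which is what you would get if `quote` were only assigned inside the `if m:` branch.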


 
Copyright © Twig's Tech Tips