MOA: Screenscraping with Python 3

Tuesday, June 22, 2010

import urllib.request

# Fetch the CNN home page and decode the response as UTF-8
page = urllib.request.urlopen('http://www.cnn.com/')
text = page.read().decode("utf8")

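# Markers that bracket the text to extract. Change these to suit the page you
# are scraping (Firebug is handy for finding them). Note that find() returns -1
# when a marker is not present.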
where = text.find('div id="cnn_mtt1lftarea"')
stopwhere = text.find('years')

# Offsets: start 28 characters past the beginning of the first marker, and stop
# 7 characters past the beginning of the second marker
start = where + 28
ending = stopwhere + 7

price = text[start:ending]

print(price)
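
The hard-coded offsets above (28 and 7) only work for one particular page layout, and find() returns -1 when a marker is missing, which would silently produce the wrong slice. A minimal sketch of a more forgiving version follows. The helper name extract_between and the sample HTML string are hypothetical, made up for illustration, and it runs on a local string rather than fetching CNN:

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    # Skip past the start marker itself instead of using a fixed offset
    start += len(start_marker)
    # Search for the end marker only after the start marker
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end]


# Hypothetical sample markup, standing in for a downloaded page
sample = '<div id="headline">Python 3 turns two years old</div>'
snippet = extract_between(sample, '<div id="headline">', '</div>')
print(snippet)
```

To use it on a real page, pass in the decoded text from urlopen along with whatever markers surround the part you want.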

1 comment:

Eran Smith, December 22, 2010 at 5:07 AM
That solves many problems. Python 3 is really nice to use and runs on quite a light platform.