A script to scrape the CBC website for schedule info for MythTV
I've written a little Python script that scrapes the CBC website for scheduling information and writes the results in an XMLTV form that mythfilldatabase can read. You use it like this:
# ./scraper 2012/05/21 > sched_20120521.xml
# mythfilldatabase --update --file 1 sched_20120521.xml
Since I rely on the PSIP info for my EPG, and CBC does not provide it, this allows me to get CBC scheduling.
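For anyone who has not looked at XMLTV before, here is a minimal sketch of the kind of file mythfilldatabase reads. It is not lifted from the script; the channel id, display name, generator name, and the sample show are made up for illustration, and the real script may build its output quite differently.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# XMLTV timestamps look like "20120521180000 -0400".
XMLTV_TIME = "%Y%m%d%H%M%S %z"


def build_xmltv(channel_id, display_name, shows):
    """Return an XMLTV document as a string.

    shows is a list of (start, stop, title) tuples, where start and stop
    are timezone-aware datetime objects.
    """
    # "cbc-scraper-sketch" is a hypothetical generator name.
    tv = ET.Element("tv", {"generator-info-name": "cbc-scraper-sketch"})

    # One <channel> element describing the station.
    channel = ET.SubElement(tv, "channel", id=channel_id)
    ET.SubElement(channel, "display-name").text = display_name

    # One <programme> element per scheduled show.
    for start, stop, title in shows:
        prog = ET.SubElement(
            tv,
            "programme",
            start=start.strftime(XMLTV_TIME),
            stop=stop.strftime(XMLTV_TIME),
            channel=channel_id,
        )
        ET.SubElement(prog, "title", lang="en").text = title

    return ET.tostring(tv, encoding="unicode")


if __name__ == "__main__":
    eastern = timezone(timedelta(hours=-4))  # assumed offset for the example
    start = datetime(2012, 5, 21, 18, 0, tzinfo=eastern)
    # "cbc.example" and the show title are placeholders, not real listings.
    print(build_xmltv("cbc.example", "CBC", [(start, start + timedelta(hours=1), "Evening News")]))
```

The channel id in the file has to match the xmltvid set on the channel in MythTV, otherwise the listings will not be attached to anything.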
Would this be of interest to anyone? I would be happy to share it. Or is there a better thread to discuss this?
If you are interested in grabbing a copy now, surf to newmovies.walma.org and right-click to download scraper.py. It is very rough and does no input sanitizing or exception catching, so use it carefully. It also depends on Beautiful Soup for HTML parsing, which, if you are using some flavour of Linux, should be available through your package manager. There are some comments in the code which will give some hints on getting it working.
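Since the script does not check its input, a small wrapper along these lines could validate the date argument before anything is fetched. This is only a sketch: the function names are hypothetical and are not part of scraper.py, and it assumes the date is given in the same YYYY/MM/DD form as in the usage example above.

```python
import sys
from datetime import datetime


def parse_schedule_date(arg):
    """Parse a YYYY/MM/DD argument, raising ValueError with a readable message."""
    try:
        return datetime.strptime(arg, "%Y/%m/%d").date()
    except ValueError:
        raise ValueError("expected a date like 2012/05/21, got %r" % arg)


def output_name(day):
    """Build an output file name such as sched_20120521.xml."""
    return "sched_%s.xml" % day.strftime("%Y%m%d")


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: %s YYYY/MM/DD" % sys.argv[0])
    try:
        day = parse_schedule_date(sys.argv[1])
    except ValueError as err:
        sys.exit(str(err))
    print(output_name(day))
```

strptime also rejects impossible dates such as 2012/02/30, so a typo fails early instead of producing an empty schedule file.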