===========================
Gathering data from the web
===========================

Resources
---------

- `HTML Tutorial `_
- `urllib3 `_
- `BeautifulSoup Documentation `_
- `Selenium API `_
- `Twitter Search API `_
- :download:`Slides <./slides.pdf>`
- `Additional Readings Posted on Piazza `_

Installing BeautifulSoup
------------------------

To install BeautifulSoup on your class VM, run:

::

    sudo pip3 install --upgrade beautifulsoup4
    sudo pip3 install --upgrade html5lib

The password is the usual student account password.

Lab
---

- `Beautiful soup lab <../../labs/lab2/index.html>`_
- `Regular expression lab <../../labs/lab3/index.html>`_

Using Twitter API Example
-------------------------

- :download:`get_tweets.py <./get_tweets.py>`

Basic HTML example
------------------

- :download:`Hello.html <./hello.html>`

Examples from 2014-15 Course Catalog
------------------------------------

- :download:`Abbreviated CS index page <./cs-index.html>`
- :download:`CMSC 10100 div <./101-div.html>`
- :download:`CS 120s div <./120s-div.html>`

.. Examples from IPDES
.. -------------------
.. - `Urban Labs Scraper: IPDES Data Center Scraper `_
.. - :download:`HTML for the data center page <./ipeds-datacenter.html>`