If you’re interested in the Regular Expressions, Text Processing, and Web Scraping workshop on Friday, please make sure to bring a laptop and to have the following software on your computer:
- Python: if you have a Mac OS X or Linux machine, it is usually already installed. If you run Windows, you will probably need to download it from python.org and install it.
- A good text editor: on OS X, I like to use TextWrangler. On Linux, nice options include kate, kedit, and gedit. On Windows, Notepad++ is a popular choice.
- Though they’re not a core part of the workshop, we’ll also be talking about command-line tools like wget (which is installed by default on most Linux distributions, easy to build on Mac OS X, and also available for Windows). If you’re running Windows, try downloading and installing Cygwin, a standalone Unix-like text-based environment. Regardless of your system, make sure you know how to bring up a command-line terminal.
Please also bring an idea for a web site or text corpus that you’d like to slice and dice.
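If you'd like to confirm your setup before Friday, a quick check along these lines may help. This is only a sketch, and it assumes Python 3 (it relies on `shutil.which`, which older Python 2 installs lack). The helper names `check_setup` and `extract_links` and the example.com sample are made up for illustration and are not workshop material.

```python
import re
import shutil
import sys


def extract_links(html):
    """Return the href values of double-quoted links in an HTML string."""
    return re.findall(r'href="([^"]+)"', html)


def check_setup():
    """Collect a few facts about the local setup (hypothetical helper)."""
    return {
        # First token of sys.version is the version number, e.g. "3.x.y".
        "python_version": sys.version.split()[0],
        # shutil.which returns None when the program is not on the PATH.
        "wget_found": shutil.which("wget") is not None,
    }


if __name__ == "__main__":
    info = check_setup()
    print("Python version:", info["python_version"])
    print("wget on PATH:", info["wget_found"])

    # A tiny taste of the regular-expression side of things.
    sample = '<a href="http://example.com/a">A</a> <a href="http://example.com/b">B</a>'
    print("Links found:", extract_links(sample))
```

Save it as a file in your text editor and run it from a command-line terminal with `python` followed by the file name (on some systems the command is `python3`). If wget is reported as missing, that's fine, since it isn't a core part of the workshop.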