Prepare for the Text Processing Workshop

If you’re interested in the Regular Expressions, Text Processing, and Web Scraping workshop on Friday, please make sure to bring a laptop and to have the following software on your computer:

  • Python — if you have a Mac OS X or Linux machine, this is already installed.  If you run Windows, you may have to download it.
  • A good text editor — on OS X, I like to use TextWrangler.  On Linux, nice options include kate, kedit, and gedit.  On Windows, Notepad++ is a popular favorite.
  • Though they aren’t a core part of the workshop, we’ll also talk about command-line tools like wget (which is installed by default on Linux, easy to build on Mac OS X, and even available for Windows).  If you’re running Windows, try downloading and installing Cygwin, a standalone Unix-like text-based environment.  Regardless of your system, make sure you know how to bring up a command-line terminal.
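
If you'd like to confirm your setup before Friday, here is a minimal sketch of a check script. It assumes Python 3.3 or newer (for shutil.which); the function name check_tools is just a hypothetical helper for this example, and wget is the only tool it looks for by default. Save it to a file and run it from your command-line terminal.

```python
import shutil
import sys


def check_tools(tools=("wget",)):
    """Map each tool name to its location on PATH, or None if it isn't found.

    The default tool list is an assumption based on the tools named above.
    """
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    # If this line prints at all, Python itself is installed and working.
    print("Python version:", sys.version.split()[0])
    for name, path in check_tools().items():
        print(name + ":", path if path else "not found on PATH")
```

A "not found" result for wget is not a problem for the core of the workshop; it only means the command-line portion may need an extra install.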

Please also bring an idea for a web site or text corpus that you’d like to slice and dice.

About Jadrian

I'm a computer science PhD student at Brown, an artist, an educator, and a proselytizer for mathematical literacy. In Spring 2013, I'm teaching an undergrad course in the CS department called "Intro to Computation for the Humanities and Social Sciences". More fundamentally, I believe that comfort and confidence in analytical thinking are essential for active citizenship and political empowerment, whether you're a scientist, a humanities researcher, an activist, a journalist, or just a person who wants to participate in the world. I think our education system alienates students from creativity and engagement, in math as in almost all other realms of inquiry, and it drives me nuts. I love interacting with passionate people of all backgrounds, and I think cross-pollination of the things that make us tick can only result in opportunities to improve ourselves and our world.