Exploratory Data Analysis: I’ve graphed my data. Now what?

I’d like to propose a session on exploratory data analysis. While it’s also useful to consider “confirmatory” analysis and visualization when making an argument, I’m interested in the messier, earlier stages of research. The first thing we might do when getting a new text, or corpus, or dataset is to graph/visualize it, map it, cluster it, and so on. What then? We might do a close reading of some of the intriguing passages or concepts turned up by the initial phase. But how can we redo our analysis and visualization in light of this close reading? This isn’t a process that I think any field, in the humanities or social or natural sciences, does very well at teaching. Could digital humanities perhaps become a leading field in practicing — and teaching — this cycle of playing with and remolding data and models?

Categories: Session Proposals | 5 Comments

Techno-Haves and Have-Nots

Technological power is political power in the modern world.  Yet many activists, humanists, artists, and disenfranchised social groups not only don’t wield technological power, they feel alienated from it.  What can we do—as academics, researchers, and citizens—to close this gap?

I’d like to have a discussion regarding our concerns about social/political disenfranchisement and its relationship with deep technological/empirical ability, and to work as a group to develop strategies for outreach and education that bring more diverse voices into the technological discourse. Success stories are welcome; bring your teaching tools, URLs, and syllabi!

Categories: Session Proposals, Teaching | 3 Comments

Reading Data and Code as Cultural Objects

Our world is run by programs written in code in one or several languages. We increasingly rely on visualized data to interpret, read trends, and “drive” decisions. Code, data sets, and databases are themselves observable, culturally determined objects, often even observable as aesthetic objects. It’s time for us to start thinking about the cultural aspects of code, databases, and other “under the hood” digital manifestations: How are they written, and under what conditions? How does code circulate? Where is the creative gesture in programming or developing a database? What kills programming languages? And so on. Let’s talk about what it means to start reading our culture in its increasingly digital, raw materials.

Categories: Session Proposals | Comments Off

Session proposal: Development of a WWII diary project using a database such as KORA

The Naval War College is in the process of digitizing the 4,000+ page Command Summary of Fleet Admiral Chester W. Nimitz, also known as the “Nimitz Gray Book.” This is one of the most important primary sources for World War II in the Pacific.

This is our first major digital initiative at the Naval War College. I hope to brainstorm with participants on how best to support scholarly use of these documents and perhaps work toward a more expansive portal that includes other war diaries, deck logs, dispatches, memoranda, etc. from this era. The Nimitz Command Summary has the potential to serve as a “hub” document for such a portal, which would facilitate the type of cross-searching, analysis, and referencing that is presently so tedious for researchers using these sources in dispersed archives or online sites. We would also like to facilitate downloading and reuse by researchers for their own encoding or analysis projects. Beyond that, I’d like to gather any other creative ideas for timelining, visualizations, student engagement, etc. We will be incorporating this project into a World War II elective this fall.

I recently saw several databases created with KORA, including the Quilt Index, and felt something like KORA might meet our needs with minimal staffing and technology infrastructure requirements. But we are open to all ideas!

Categories: Session Proposals | Comments Off

Best Practices for Open Access Journals

The academy needs open access. As Bethany Nowviskie has pointed out in a memorable (and revolting) phrase, much of the intellectual product of the academy is “fight club soap.” We produce scholarly work at great cost to our institutions and the donors and governments that fund them, only to hand it over to for-profit publishers, who sell it back to our libraries at ruinous cost. This cost is exorbitant for the wealthiest universities and prohibitive for everyone else, exacerbating the divide between haves and have-nots, and locking our scholarly work behind paywalls where hardly anyone reads it.

Thankfully, there is no reason why we need to continue in this way. The economics of publishing that favored the printed, bound, and distributed academic journal are now untenable, and instead we have the opportunity through the internet for open-access publications, that is, publications which are available online, for free, regardless of the user’s affiliation. Open-access scholarly publications are the academy’s chance to cash in on the idea that “information wants only to be free.” But like anything worth doing, creating open-access publications will take a lot of work.

My session proposal, then, combines the large question of open access with the specific issues I’m going to face over the next year or so. I’d like to talk with scholars, librarians, technologists (anyone, actually) about the best practices and new ideas for open-access publications. For example, we might try answering these types of questions:

  • What new ways of publishing can an online, OA journal take advantage of?
  • What are the technical requirements of an OA journal?
  • What is the best use of web 2.0 technologies?
  • Is there a better way to handle citations than footnotes?
  • How can an OA journal keep its back catalog usable into the future?
  • What are the best software options for running an OA journal?

It would be best if this session could produce a deliverable, probably in the form of a report or syllabus listing best practices, useful readings, and possible future directions for open-access journals. We could write this collaboratively during the time we have for the session. I also have the code for the Journal of Southern Religion available on GitHub, if anyone wants to hack around with it, though I’ve proposed a separate hacking session for a particular problem involving e-books.

If you have any ideas, links to open-access publications that are doing good work, or readings that would be helpful, please leave them in the comments below. Thanks!

N.B. This is a revision of a session proposal from last year’s THATCamp New England, but I still think this question is worth talking about.

Categories: Session Proposals | 1 Comment

Pandoc (and Jekyll, and LaTeX, oh my!) Hacking Session

Pandoc is a utility written by philosopher John MacFarlane for converting files from one markup format to another. For example, you might write a document in a plain text format then convert it to HTML. I’ll be giving an introduction to Pandoc and Markdown in Saturday’s plain-text workshop. But for this unconference session, I’d like to propose a hacking session that will create some software to solve a problem using Pandoc.
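As a rough illustration of the conversion described above, here is a minimal Python sketch that builds and runs a Pandoc command line. The helper names (`build_pandoc_command`, `convert`) and the file names are hypothetical; the only Pandoc features assumed are the `-o` output option and the `--standalone` flag, with the input and output formats inferred from the file extensions.

```python
import shutil
import subprocess


def build_pandoc_command(source: str, target: str, standalone: bool = True) -> list[str]:
    """Return the argument list for converting `source` to `target` with pandoc.

    Pandoc guesses the input and output formats from the file extensions,
    so "notes.md" -> "notes.html" needs no explicit format flags.
    """
    command = ["pandoc", source, "-o", target]
    if standalone:
        # Ask for a complete document (with header and footer), not a fragment.
        command.append("--standalone")
    return command


def convert(source: str, target: str) -> None:
    """Run pandoc on `source`; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(build_pandoc_command(source, target), check=True)
```

With Pandoc installed, calling `convert("notes.md", "notes.html")` should turn a Markdown file into a standalone HTML page; the same call with a different target extension would request a different output format.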

If we have some people who know LaTeX, I propose that we create a Pandoc template to meet the requirements of the standard academic paper that undergraduates have to hand in. While the standard Pandoc templates are great, the general expectation for academic drafts is that they will look like a Turabian or MLA paper, so let’s make a template for that purpose.

If we have some people who know Ruby or shell scripting, I propose that we figure out a way to make EPUB books from the files for a blog or website published using Jekyll. This might take the form of a Jekyll plugin written in Ruby, like Anthologize but for Jekyll, or a shell script that stands outside of Jekyll. I’m interested in generating EPUBs for each issue of a journal (code here), so I have a real-world example we can hack.
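One possible starting point for the "script that stands outside of Jekyll" approach is sketched below in Python rather than Ruby or shell. It is only a sketch under several assumptions: posts live in Jekyll's conventional `_posts` directory as `.md` files whose date-prefixed names sort chronologically, and Pandoc is asked to concatenate them into one EPUB. The function names, the `prefix` filter, and the example title are hypothetical, and Liquid tags inside posts would not be processed this way.

```python
from pathlib import Path


def collect_posts(site_root: str, prefix: str = "") -> list[Path]:
    """Return the Markdown posts under <site_root>/_posts, sorted by file name.

    Jekyll post files are named YYYY-MM-DD-title.md, so sorting by name
    also sorts by date. `prefix` (e.g. "2012-") narrows the selection.
    """
    posts_dir = Path(site_root) / "_posts"
    return sorted(posts_dir.glob(f"{prefix}*.md"))


def build_epub_command(posts: list[Path], output: str, title: str) -> list[str]:
    """Build a pandoc argument list that joins `posts` into a single EPUB."""
    return [
        "pandoc",
        "-o", output,                   # the .epub extension selects EPUB output
        "--metadata", f"title={title}",  # book title for the generated EPUB
        *[str(post) for post in posts],  # pandoc concatenates multiple inputs
    ]
```

The resulting list could be passed to `subprocess.run(..., check=True)` once Pandoc is installed. A Jekyll plugin would have access to the site's own metadata and rendered content, which this outside-the-site approach does not.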

By the end of the session we will have made a small but complete product to launch into the world.

Categories: Session Proposals | 1 Comment

Data Visualization: From Discovery Tools to Visual Arguments

I would like to propose a session on data visualization: how we do it (programs, techniques, etc.) and why we do it (data cleaning, discovery tools, visual arguments). The conversation will hopefully range from theories of information design (Edward Tufte/Ben Fry) to case studies brought by the participants. What kinds of data visualizations have you created or would you like to create in the future? What problems have you run into?

I would also like to spend some time at the end discussing the “ethics” of visualizing data. Images can come across as more “true” than text-based explanation. How do we create visualizations that are honest reflections of our scholarship, making clear the limits and affordances of the data and tools we are using?

In the “Let’s make something!” THATCamp spirit, I suggest we compile a list of explanatory information that should be attached to data visualizations when they are published to ensure that people can “read” them appropriately.

Categories: Session Proposals | 2 Comments

Which (DHish) Blogs, #hashtags and Podcasts do you follow?

I follow a set of bloggers on DH, I read Humanist, and I try to catch up on Digital Library discussions, not to mention listening to the Digital Classicist, MITH, and Scholars Lab podcasts as I walk to work in the morning. Then there are all the people I follow on Twitter. Well, OK, I read, listen to, and follow other topics as well, but a lot of timely information and energetic discussion comes to me in these formats. DHNow aggregates a lot of blogs, but what about podcasts? Which ones do you listen to? What else is out there?

A great outcome of this discussion might be a list of suggestions, but we should all at least discover some new material to follow.

Categories: Session Proposals | 1 Comment