CodeIRL: Filtering Data for My College’s Library

This past week at work, I was tasked with filtering library course guide data as part of a bigger project. The filtering involved removing guides named [Deleted] and associating each guide with the librarian in charge of maintaining it. I didn’t feel like combing through 200-odd records, so I decided to use Python! I can’t show you the code for privacy reasons, but I can tell you how I coded up a solution.

The first requirement (deleting the [Deleted] guides) took just three clicks in Excel, after which I saved the spreadsheet as a comma-separated values file (.csv) to make it easier to process in Python using the csv module. I could have used something like openpyxl, but the computer I was working on didn’t have pip and I didn’t want to learn how to use another library.
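I did this step in Excel, but for the curious, here is a rough sketch of how the same filter might look with the csv module. It is not the code I used at work: the column names and sample rows are made up, and it assumes the guide title sits in the first column.

```python
import csv
import io

# Hypothetical sample standing in for the exported spreadsheet;
# the real column names and rows are not shown in this post.
SAMPLE_CSV = """Guide Title,Status
Biology 101,Published
[Deleted],Unpublished
Intro to Archives,Published
"""


def drop_deleted(rows, title_index=0):
    """Return only the rows whose title is not the placeholder '[Deleted]'."""
    return [row for row in rows if row[title_index].strip() != "[Deleted]"]


# io.StringIO lets the sample behave like an open file.
reader = csv.reader(io.StringIO(SAMPLE_CSV))
header = next(reader)              # first line holds the column names
kept = drop_deleted(list(reader))  # everything except the [Deleted] rows

for row in kept:
    print(row)
```

With a real file, the `io.StringIO(...)` part would become something like `open("guides.csv", newline="")`, where the file name is again just a placeholder.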

With the .csv file, I started copying the guides each librarian was responsible for into their respective .txt files. It was surprisingly simple, as I didn’t have to deal with any paste-format issues. The next step was to devise a system to append the librarian responsible for maintaining a given guide. I thought it would be as simple as appending the librarian whenever a guide matched a given if condition, but there was a big problem: some library guide titles in the comma-separated values file had commas in them, creating a whole new row in the destination comma-separated values file. I never thought that could happen in a comma-separated values file.
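The matching idea can be sketched roughly like this. It is an illustration rather than my actual code: the librarian names, guide titles, and helper names are all invented, and the dictionary stands in for the per-librarian .txt files.

```python
# Hypothetical mapping: librarian name -> titles copied into that
# librarian's .txt file. Names and titles are invented for illustration.
GUIDES_BY_LIBRARIAN = {
    "Librarian A": {"Biology 101", "Chemistry Basics"},
    "Librarian B": {"History, Politics, and Law"},
}


def find_librarian(title, guides_by_librarian):
    """Return the librarian whose list contains the title, or '' if none does."""
    for librarian, titles in guides_by_librarian.items():
        if title.strip() in titles:
            return librarian
    return ""


def append_librarian(rows, guides_by_librarian, title_index=0):
    """Return new rows, each with the matching librarian added as a last column."""
    return [
        row + [find_librarian(row[title_index], guides_by_librarian)]
        for row in rows
    ]


rows = [["Biology 101"], ["History, Politics, and Law"], ["Unlisted Guide"]]
for row in append_librarian(rows, GUIDES_BY_LIBRARIAN):
    print(row)
```

Guides that match nobody get an empty string, which makes them easy to spot when reviewing the output.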

Realizing that, I had to come up with a way of writing the comma into the comma-separated file without creating a new column. At first I tried to escape the comma using regular expressions, but I ended up with a lot of backslashes in the file, the very characters doing the escaping. The same thing happened when I tried to write the rows into the csv file pythonically with the csv module’s writer.
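For comparison, here is a small sketch of how csv.writer treats a title containing commas under two different settings. The sample row is invented, and I am only guessing that something like the second setting was behind the backslashes I saw. By default the writer wraps such a field in double quotes; with quoting turned off and an escape character set, it writes a backslash before each comma instead.

```python
import csv
import io

# Invented sample row: a title containing commas, plus a librarian name.
row = ["History, Politics, and Law", "Librarian B"]

# Default settings (csv.QUOTE_MINIMAL): fields containing the delimiter
# are wrapped in double quotes, so the commas stay inside one column.
quoted = io.StringIO()
csv.writer(quoted).writerow(row)

# Quoting disabled with a backslash escape character: each comma inside
# a field is written with a backslash in front of it.
escaped = io.StringIO()
csv.writer(escaped, quoting=csv.QUOTE_NONE, escapechar="\\").writerow(row)

print(quoted.getvalue())
print(escaped.getvalue())

# Reading the quoted version back gives the original two fields.
round_trip = next(csv.reader(io.StringIO(quoted.getvalue())))
print(round_trip)
```

The quoted form is what Excel and csv.reader expect by default, so it survives a round trip without any find and replace.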

At this point I was stumped, until I remembered that find and replace exists, and replaced the backslashes with ‘s. After adding the guides of another librarian, the data was all filtered for my supervisor to review. There are still many steps left in the data processing, but this was a good start. I’m just wondering whether writing this was really any quicker than doing it manually. To be honest, I think it took about the same time. Still, it’s a good way of using Python to make daily tasks much simpler.

