Friday, June 7, 2013

Dream Job Longshot - Recap Day 1

I spent today designing a project that I could work on and then display online, one that would help prove my “Ability to learn new technical skills quickly; ability to meet deadlines; strong service orientation.” I hoped to do this in a way that would tie many of my other missing desired skills into a neat little bundle. For my project I decided to try to build a search aggregator, otherwise known as a federated search engine for the web, by integrating multiple search engines' APIs using RESTful XHTML and, hopefully, one of those “object oriented languages (Ruby, Python, PHP, etc.),” thereby tying together a very handy demo of my job suitability. Loaded with optimism I began to search the internet for how best to do this, not knowing I was in for hours of frustration. What I did not realize is that as a university student I had been spoiled: I had always had the exact steps for my assignments laid out for me. Back then all I had to do was follow that path and add a creative twist in order to excel. Here I felt blocked at every turn.

I thought to start by looking at Google's API. Not only does Google have the honor of being THE major search engine on the web nowadays, but I knew that they post their APIs with specific instructions on how to use them in order to encourage programmers to build with them. What I found, however, was not encouraging. I did quickly find a comprehensive list of APIs that Google offered in their APIs interface. However, I couldn't help but notice that a full-fledged Google Search was not among them. It also occurred to me that I had developed this plan without fully knowing what an API was.

API stands for Application Programming Interface, and I found the following handy definition:

(Application Programming Interface) A language and message format used by an application program to communicate with the operating system or some other control program such as a database management system (DBMS) or communications protocol. APIs are implemented by writing function calls in the program, which provide the linkage to the required subroutine for execution. Thus, an API implies that a driver or program module is available in the computer to perform the operation or that software must be linked into the existing program to perform the tasks. - PC Magazine Encyclopedia - API

So plugging an API into a webpage was basically like outsourcing a specific part of the functionality you wanted that webpage to implement. Thinking back on this I realized that we had in fact learned about this in library school! I may never have heard the term API before, but we had talked about the increasing functionality of XHTML over basic HTML and how the idea of cross-program functionality would greatly increase the utility and capabilities of the web.
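To make the "outsourcing" idea concrete, here is a minimal Python sketch of the consumer's side of that arrangement. The response shape is entirely made up for illustration; a real search API would document its own fields, and the string here would arrive over HTTP rather than being a literal.

```python
import json

# A search API typically hands back results as JSON text like this.
# (This response shape is invented purely for illustration.)
raw_response = '''
{
  "query": "federated search",
  "results": [
    {"title": "What is federated search?", "url": "http://example.com/a"},
    {"title": "Building aggregators", "url": "http://example.com/b"}
  ]
}
'''

def extract_titles(response_text):
    """Parse the API's JSON reply and pull out the result titles."""
    data = json.loads(response_text)
    return [item["title"] for item in data["results"]]

print(extract_titles(raw_response))
# In a live program raw_response would come from an HTTP request
# to the provider's documented API endpoint.
```

The page doing the "outsourcing" never implements search itself; it only formats a request and interprets the reply.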

It turns out this is also connected to the RESTful infrastructure that was mentioned by my dream job posting. Yet once again I was discovering that I was unfamiliar with the terminology, even if I was aware of the concepts.

SOAP is the acronym for Simple Object Access Protocol, which, to strip the computer geekness and IT terms from the definition, is basically a universal language for computers. It is what a Windows computer will send to a Linux server to ask it for information and be understood. Technically, SOAP refers to the tiny information packets, or messages, that are sent between machines in order for them to communicate. - Techterms.com - SOAP

REST is the acronym for Representational State Transfer, a similar universal language for computers. However, unlike SOAP, it is also an architecture for websites as well as a language. Confused? I was! Sites that implement REST are termed RESTful systems. RESTful systems have URIs, or Universal Resource Identifiers, attached to everything. In practice these identifiers are URLs, but instead of being attached only to each page, they are also attached to things like users, database objects, transactions, etc. REST also has a few other building blocks. It assumes that there is a client (i.e. you on your local computer, reaching the website designed with REST) and a server (obviously a RESTful system). Each client request is individually generated with all necessary data, and using the basic REST commands the server responds statelessly (meaning without storing any data regarding the request on the server). Every response is designated as cacheable (or not), which tells the client's computer whether or not to store the results (results can be stored for faster future processing, or not stored, to ensure that stale data is not inappropriately used or submitted). Finally, there are several communication layers in REST: the client computer, the client web-browser page, the server response system, and all the data within the server itself. This allows servers to scale what they house dynamically and appropriately. - techopedia REST
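The "each request carries all necessary data" idea can be shown in a few lines of Python. The search endpoint below is a hypothetical example URI, not a real service; the point is that the URI plus the query string hold everything the server needs, so nothing has to be remembered between calls.

```python
from urllib.parse import urlencode

# Hypothetical RESTful search resource, for illustration only.
BASE_URI = "http://example.com/search"

def build_request_uri(query, page=1):
    """Compose a complete, self-contained GET request URI.

    Because the query text and page number travel inside the URI,
    the server can answer statelessly: it stores nothing about
    this client between requests.
    """
    params = urlencode({"q": query, "page": page})
    return f"{BASE_URI}?{params}"

uri = build_request_uri("federated search", page=2)
print(uri)  # http://example.com/search?q=federated+search&page=2
```

Asking for page 3 later means simply sending a new URI with `page=3`; the server never has to remember where you left off.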

I found a good article that compares REST and SOAP called "Knowing when to REST". It was very good at describing the difference between the two systems and which is appropriate to use when. What this boiled down to is that if the website provides a service-based activity, such as a merchant or calendar, then SOAP is likely the better choice, as it provides solid best-practice standards for reliability and security where REST does not. (REST can still be secure, but every case has to be judged individually; there is no 'best practice'.) Whereas if the site provides a resource, such as a digital library, search engine, or typical news site, then REST makes more sense to use.

To go back to my previous thought, REST and SOAP are related to being able to use APIs in your site because an API will need one or the other in order to communicate its interactive data successfully. Which to use in a given scenario is often chosen for you, based on the server or API requirements, as I discovered in “SOAP vs REST API Implementation” by FliquidStudios.

Going back to my original frustration, it looked like the API that Google offered for free was really only designed to search specific sites, and it had a limited number of uses per day. Well, it's not as if I ever expected to exceed the daily limit, but I didn't want to search only specific sites; I wanted to do a whole-internet Google Search, and that was just for starters!
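For reference, here is roughly what a request to that site-restricted offering, the Google Custom Search JSON API, looks like; as I understand it, you pass an API key and a search engine ID (`cx`), and the engine only covers the sites configured for it. The key and engine ID below are placeholders, and this sketch only builds the request URL rather than calling the live service.

```python
from urllib.parse import urlencode

# Endpoint of the Google Custom Search JSON API.
ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def custom_search_url(query, api_key="MY_API_KEY", engine_id="MY_ENGINE_ID"):
    """Build a Custom Search request URL.

    api_key and engine_id are placeholders; real values come from
    the Google APIs console. The engine searches only the specific
    sites it was configured with -- hence my frustration.
    """
    params = urlencode({"key": api_key, "cx": engine_id, "q": query})
    return f"{ENDPOINT}?{params}"

print(custom_search_url("library science"))
```

Fetching that URL (with real credentials) would return a JSON page of results, but only from the engine's configured sites, and only within the free daily quota.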

Then, while poking around, I found some interesting leads on how to do this theoretically with RSS feeds. Now, as far as I knew, RSS feeds were information streams that you could hook your email, mobile app, or whatever platform into to get constant updates on whatever topic the feed was designed to cover; so how could you use them to help plug into a search engine? Admittedly, the page that claimed this was a Wikipedia page on Search Aggregation, and as a trained librarian I know full well that those are not always accurate, especially ones flagged with issues like this one was. However, I also knew that this was due some further investigation.
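The idea is plausible because an RSS feed is just structured XML, so an aggregator can parse one feed per source and merge the items. Here is a minimal Python sketch; the feed text is a tiny hand-written stand-in for what a search engine might expose for a query, and in real use it would be fetched over HTTP.

```python
import xml.etree.ElementTree as ET

# Invented RSS 2.0 feed standing in for one source's search results.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Search results for 'aggregator'</title>
    <item><title>Result one</title><link>http://example.com/1</link></item>
    <item><title>Result two</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

def feed_items(feed_xml):
    """Return (title, link) pairs from an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

for title, link in feed_items(SAMPLE_FEED):
    print(title, link)
```

A federated search front end would run this parse over one feed per search engine and interleave the resulting (title, link) lists into a single results page.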


While looking around for this I stumbled across the website Wopular. I was excited by this discovery because the site appears to have achieved, on a large scale, something tangential to what I will be trying to do on a small one. I also found a neat article describing how and why the site was designed the way it was. So my goal for tomorrow: download a localhost version of Drupal and start playing with it; perhaps this can count as my web-oriented programming language!
