Technical

Insight and technical features for the code savvy

Getting Data from Behind CAPTCHA

CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a method that websites use to distinguish human visitors from automated programs accessing their pages.

Have you ever been asked to read blurred letters and type them into a box? That’s a CAPTCHA at work.

Result Query Formats

Did you know it’s possible to get import·io data returned to you in different formats via the API?

One of our lesser-known features is the ability to toggle our API responses between returning “flat” data-structures versus “hierarchical” ones.
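
As a rough illustration of the distinction, the sketch below converts a nested ("hierarchical") record into a single-level ("flat") one. The record shape, the field names, the `/` key separator, and the `flatten` helper are all hypothetical; none of this is the actual import·io response format.

```python
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = "/") -> dict[str, Any]:
    """Collapse nested dicts into one level, joining nested keys with `sep`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        path = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the key path along.
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat


# Hypothetical hierarchical result: nested objects group related values.
hierarchical = {
    "title": "Example product",
    "price": {"amount": 9.99, "currency": "GBP"},
}

# Flat form: every value sits at the top level under a compound key.
flat = flatten(hierarchical)
print(flat)
```

A flat structure tends to be easier to load into a spreadsheet or a table, while a hierarchical one keeps related values grouped together.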

Query up to 100 Sources via One API Call with import·io

One of the most powerful things we allow users to do with import·io is to combine multiple sources of data, commonly referred to as federation. To do this, we built an architecture that lets you pass a single query as the input to many sources (currently up to 100). The query is executed in parallel against each specified source, with pagination where the source supports it. The results of these federated queries are then streamed back to the client: either one of our client libraries or our data lab web app. The end result is that you can retrieve multiple pages of data from many sources, in a consistent format, into your app with a single API call.
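
The fan-out pattern described above can be sketched in a few lines. This is a simplified local model, not the import·io client library: `make_source`, `fetch_all_pages`, and `federated_query` are hypothetical names, and the stand-in sources return canned rows instead of making network requests.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterator

MAX_SOURCES = 100  # the limit quoted in the text above

# A source takes (query, page number) and returns one page of rows.
Source = Callable[[str, int], list[dict]]


def make_source(name: str, pages: int) -> Source:
    """Build a stand-in source; a real one would make a network request."""
    def fetch(query: str, page: int) -> list[dict]:
        if page > pages:
            return []  # an empty page signals there is nothing more to read
        return [{"source": name, "query": query, "page": page}]
    return fetch


def fetch_all_pages(source: Source, query: str) -> list[dict]:
    """Page through one source until it returns an empty page."""
    rows: list[dict] = []
    page = 1
    while True:
        batch = source(query, page)
        if not batch:
            return rows
        rows.extend(batch)
        page += 1


def federated_query(query: str, sources: list[Source]) -> Iterator[dict]:
    """Run one query against every source in parallel, yielding rows as each source finishes."""
    if len(sources) > MAX_SOURCES:
        raise ValueError(f"at most {MAX_SOURCES} sources per query")
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(fetch_all_pages, s, query) for s in sources]
        for future in as_completed(futures):
            yield from future.result()


results = list(federated_query("laptops", [make_source("shop-a", 2), make_source("shop-b", 1)]))
print(len(results), "rows from", len({r["source"] for r in results}), "sources")
```

Because rows are yielded as each source completes, the caller can start processing results before the slowest source has finished; the order in which sources arrive is not guaranteed.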

Comet vs REST Query APIs

When we talk about getting data out of import·io using our APIs, there are two main methods that we describe. These are known as REST queries and Comet queries. I want to quickly explain the key differences between them and what this means for developers. REST Queries: REST queries are the simplest way to […]

Exposing headers over CORS with Access-Control-Expose-Headers

Working with cross-domain HTTP requests in JavaScript is generally acknowledged to be a bit of a minefield. Recently I discovered a new CORS header, Access-Control-Expose-Headers, which I hadn’t known about previously. As I had to do a lot of digging to get any information about it, I thought I’d make a note. The context […]
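
For readers unfamiliar with the header: on a cross-origin response, browser scripts can read only a small safelisted set of response headers (such as Content-Type) unless the server names additional ones in Access-Control-Expose-Headers. The sketch below shows a server-side helper that builds such a header set. The `cors_headers` function, the origin, and the `X-Request-Id` and `X-Total-Count` header names are hypothetical examples, not taken from the post.

```python
def cors_headers(origin: str, exposed: list[str]) -> dict[str, str]:
    """Build CORS response headers that expose extra headers to browser scripts."""
    headers = {"Access-Control-Allow-Origin": origin}
    if exposed:
        # Comma-separated list of response headers the page's script may read.
        headers["Access-Control-Expose-Headers"] = ", ".join(exposed)
    return headers


# Hypothetical custom headers a client-side script needs to read.
headers = cors_headers("https://app.example.com", ["X-Request-Id", "X-Total-Count"])
for name, value in headers.items():
    print(f"{name}: {value}")
```

Without the second header, a cross-origin script asking for `X-Request-Id` on the response would get nothing back, even though the header is present on the wire.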

HTML5 Canvas toDataURL: WebM vs PNG vs JPEG

We use HTML5 Canvas elements for a number of features in our client apps, and we wanted to know once and for all – what would be the best format for us to export our results to? The toDataURL method of Canvas can handle a number of formats. In Chrome it can export “image/png”, “image/jpeg” […]

HTTPS Now Fully Supported on Query API

I am pleased to announce that HTTPS is now fully supported on the Query API endpoints. Previously, only api.import.io fully supported HTTPS requests. With a patch to our infrastructure yesterday, the *.query.import.io and query.import.io endpoints now fully support HTTPS for CometD queries. We strongly recommend that all clients migrate to using HTTPS for querying. This […]

New Fields in the CometD Query API

We have just rolled out release Battlestar (if you were wondering, our first production release was Avengers) and with it come some new fields in the CometD messaging API for querying. MESSAGE objects returned through the CometD protocol now have two additional fields, the connector GUID and the connector version GUID that were used to return the […]

Keeping Up with import·io: System Status

On the dev team here at import·io we’re frustrated when web services we need and love to use go down, or perform maintenance when we’re not expecting it. We work hard to make our systems highly available and to make sure our system rollouts are as seamless as possible. However, we may have times where […]

“Why I Love Everything You Hate About Java”

I recently came across a post entitled “Why I love everything you hate about Java”. I love it. I would say that the Decorator Pattern is an important pattern in CS for modularity. In the comments there is a lot of to and fro from people, but I think I can encapsulate what it got […]

Omniscient Debuggers

I was discussing Omniscient Debuggers recently with someone at an LJC meetup. Omniscient debuggers drastically reduce the time needed to debug software by giving the programmer complete freedom with respect to time: they let you step forward and backward, and immediately answer questions like “when was this variable assigned that value?”. This is made […]

Node.js Garbage Collection

So I’m a fan of the concept of node.js, but am still unconvinced of its maturity. The big thing I am worried about is its garbage collection, which is comparable to very old Java GC in that it is a “stop-the-world” style GC, where all execution is paused while GC happens. According to this post: […]