Ask Alex (anything)

I get a lot of great questions through support and on the webinars. Instead of burying them at the end of my webinar recaps, I thought I should give them their due time in the spotlight. I think they deserve it, don’t you? So, without further ado, I’d like to introduce you to my brand new feature: Ask Alex! Each week I’ll take the most-asked questions and share the answers with you, my adoring public.

These are meant to be interactive, so if you have any questions (even if they aren’t data-related) please email them to me with the heading “Dear Alex”. I just ask that you keep your questions relatively generic so that they are applicable to everyone.

Dear Alex, what is the difference between a single result page and a multiple results page?

Good question. A single result page contains just one result, such as a product page on a clothing website. A multiple results page contains lots of different results, each with its own corresponding data points, such as a product list on a clothing website.

Hey Alex, my page keeps redirecting – why?!

Page redirections happen mainly because of canonical URLs. Put simply, a canonical URL is declared with an HTML element that helps webmasters prevent duplicate content issues by specifying the “canonical”, or “preferred”, version of a web page. Luckily for you, we have a workaround for just such a case, which you can find on this page.
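If you want to check whether a page declares a canonical URL before you point a tool at it, here is a minimal sketch using only Python's standard library. It is not how our tools detect canonicals; the sample page and the names `CanonicalFinder` and `find_canonical` are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page for illustration only.
SAMPLE_PAGE = """
<html><head>
<link rel="canonical" href="https://example.com/dresses/red-dress">
</head><body>Red dress</body></html>
"""

print(find_canonical(SAMPLE_PAGE))
```

If the URL this returns differs from the one you entered, the canonical version is a likely candidate for where the page is sending you.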

Nice tool. I have a crawler that I want to run on a schedule – is this possible?

Thanks for the feedback! This isn’t possible “out of the box” at the moment, though it is highly sought after and we are planning to bring out this functionality in the future. If you are comfortable running crawls from the command line, you can run them on a schedule that way. To find out more, please visit this page.
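To give a feel for the idea, here is a rough Python sketch that runs a command repeatedly with a fixed wait in between. The crawl command itself is a placeholder, because the real one is covered on the page linked above, and the helper name `run_on_schedule` is my own invention. In practice a system scheduler such as cron is the more usual choice for this job.

```python
import subprocess
import sys
import time


def run_on_schedule(command, interval_seconds, runs,
                    run=subprocess.run, sleep=time.sleep):
    """Run `command` `runs` times, waiting `interval_seconds` between runs.

    Returns a list with the exit code of each run. The `run` and `sleep`
    parameters exist only so the helper can be exercised without really
    launching processes or waiting.
    """
    exit_codes = []
    for attempt in range(runs):
        completed = run(command)
        exit_codes.append(completed.returncode)
        # No need to wait after the final run.
        if attempt < runs - 1:
            sleep(interval_seconds)
    return exit_codes


if __name__ == "__main__":
    # Placeholder command: swap in your actual command-line crawl here.
    placeholder_command = [sys.executable, "-c", "print('crawl would run here')"]
    print(run_on_schedule(placeholder_command, interval_seconds=1, runs=2))
```

A non-zero exit code in the returned list is a hint that a particular run failed and is worth a closer look.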

Dear Alex, why do APIs fail?

There are a few common reasons we can’t always publish your API:

  1. Rows and columns training – Training rows and columns is the most important step in creating your API because it is what allows us to automatically generate the XPath we use for extracting the data. Inconsistent mapping can sometimes cause an API not to publish. The fix: retrain the rows and columns and double-check your column fields. Fields such as currency and date/time are more complex than others, so try using text as a default column field.
  2. Timeout issues – To keep CPU costs down, we have a processing limit on each page. Exceeding this limit (usually because of too much JavaScript) will automatically cause your API not to publish. The fix: rebuild your API with JavaScript disabled.
  3. XPath issues – Sometimes the automatically generated XPaths can cause the API not to publish. The fix: try manually inserting your own XPaths for each data point.

For more information, please check out this page.
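If you haven’t written an XPath by hand before, the sketch below shows the general shape of it: one XPath picks out each row, and one XPath per column picks out a data point within that row. The markup, class names and XPaths are invented for illustration and are not what our tools generate. It uses Python’s built-in `xml.etree.ElementTree`, which supports only a subset of XPath and needs well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical product list, standing in for a multiple results page.
PRODUCT_LIST = """
<ul>
  <li class="product"><span class="name">Red dress</span><span class="price">30.00</span></li>
  <li class="product"><span class="name">Blue shirt</span><span class="price">18.00</span></li>
</ul>
"""

# One XPath selects each row; each column XPath is relative to that row.
ROW_XPATH = ".//li[@class='product']"
COLUMN_XPATHS = {
    "name": "./span[@class='name']",
    "price": "./span[@class='price']",
}


def extract_rows(markup, row_xpath, column_xpaths):
    """Return one dict per row, mapping each column name to its text."""
    root = ET.fromstring(markup)
    rows = []
    for row in root.findall(row_xpath):
        # Every value is kept as plain text, the safest default column type.
        rows.append({column: row.findtext(xpath)
                     for column, xpath in column_xpaths.items()})
    return rows


for record in extract_rows(PRODUCT_LIST, ROW_XPATH, COLUMN_XPATHS):
    print(record)
```

Notice that the price stays as text rather than being parsed as a currency, which mirrors the advice in point 1.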

Hi Alex, I’m new to your platform. Where can I learn about import.io and its tools?

All information and tutorials can be found in our knowledge base, which is a comprehensive guide of nearly 100 articles about import.io, who we are, how to use our tools and more!

How long should I cook my Victoria Sponge for?

Great question here. It depends on the size. If you’re making a sponge that will serve around 10 people, you should bake it for 20 minutes and leave it to cool for 10. Always dust with icing sugar and serve with a cup of tea!


Comments

I knew the answers to these questions, except for the Victoria Sponge timing. My cakes are sooooo moist now, thanks!
