Integration

How to get web data into excel

by Niklas Waller on November 20, 2008

in Integration

There is an easy way of getting web data in excel for further managing the data, stastistical reasons or just saving as is.
If you for some reason need to get live data from a domino server straight into excel or maybe you need to get data into excel from some other online source in a structured way – then Web Query Files (.iqy files) is what you need.

These files can either be created manually in notepad or some other text editor or by using excel. An iqy-file contains 1-4 required lines:

1. Type of query
- Can be omitted. The only valid and also the default choice is ‘WEB’.

2. Version of the query
- If line 1 is omitted, this line has to be omitted as well. The only valid version and also the default value is ’1′)

3. Uniform Resource Locator (URL)
- Required. The web document on which the query acts. Maximum 200 characters.

4. POST parameters
- Optional. If it is a GET-query this line can be omitted, otherwise the post parameters are specified on this line.

More settings can be added to this file. The first lines described above must always come first and then the following parameters can be added for more specified adjustments. I am not going to go through all of these now but instead give suggestions to other sources where you can read more at the end of the post.

Selection=EntirePage
Formatting=None (Can also be RIchText and FullHTML)
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

You can also create these files automatically from Excel:
1. Open a new spreadsheet
2. Click on ‘Data -> Import External Data -> New Web Query’
3. Type in a web address in the adress input field and click ‘Go’.
4- Click on ‘Options’ to specify settings, for instance if the content should be HTML-formatted or not.
5. Click on ‘Import’ and choose where the content should be put. Also click on Properties for additional settings. You can for example specify if the contents should be refreshed with online data for a specific interval.
6. Click ‘Ok’ and the data will be imported from the web address you specified into the excel spreadsheet.

Some examples:

A simple example would be to get the contents of your rss-feed into excel. Our feed at Wohill is valid and contains xml code and thank’s to that the content is divided up into rows and columns. It could definately look better but this is the default behaviour and raw data. You can see it if you create a file called ‘readRss.iqy’ with this content, save it to your computer and double-click on it to make it run. It will automatically open up in excel.

WEB
1
http://www.wohill.com/rss.php

So XML rendered ok, but the best results comes with HTML and HTML tables.

Another example is when I need different kinds of data in excel about our blog posts. In this case I create a php-file which displays this data in html. And then I create an iqy-file to get the data into excel. An extra twist is that when I run the iqy-file I get a popup window asking how many posts I want to display.

The web page takes a query parameter called ‘posts’ which specifies how many posts to display from the newest and down. In the iqy-file we can via a popup window hand a value to the parameter ‘posts’. So the iqy-file will look like this:

WEB
1
http://www.wohill.com/wohillposts.php?posts=["Number of posts","Enter the number of posts you would like to see an overview of"]

Selection=EntirePage
Formatting=FullHTML
PreFormattedTextToColumns=False
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

Note in the results that it is HTML-formatted. If we would have chosen ‘None’ instead of ‘FullHTML’ there would not have been a black background for the headers and no links either.

A third example shows how to get the response from the Google search engine into excel based on a search query. The results in excel would be the same as if you would search directly on www.google.com:

WEB
1
http://www.google.com/search?q=["q","Enter the Search Term:"]

Selection=EntirePage
Formatting=FullHTML
PreFormattedTextToColumns=False
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False

It doesn’t look that nice though. But a tip is to create a nice looking excel spreadsheet with perhaps white background and some default fonts and then use this as a template for the search results. In this case I would from the template go to ‘Data -> Import External Data -> Import Data’ and then browse to the predefined iqy-file. Note that the content/results will be placed where you put the cursor,

I also tried getting contents of twitter into here but didn’t really succeed. Otherwise that would be kind of cool doing something like this with a refresh mechanism for all your Twitter friends.

Now, I am sure someone might say “why bother going through all of this when you can open the file directly in the open dialog of excel.”. Well, you’re quire right about that mr anonymous. But you will not get the other benefits of the web query. The data will be HTML-enabled by default and you can not make it refresh automatically unless you make it a web query. And if you dive deeper into this by for example writing VB-scripts there are great benefits to gain when you start combining this with managing advanced spreadsheets.

Now someone might also wonder if this works in Lotus Symphony and the answer is no. Actually I can’t even open up a webpage i Lotus Symphony Spreadsheets. Hopefully that will be possible in later versions?

A special thank’s to my colleague Anders Pamlin for coming up with this idea to me.

Some sources:

- Excel tutorial from ExcelUser
- Support pages from Microsoft
- Data collection on the world wide web using Excel
- Also look at the default iqy-files that comes with Office. In my installation they are in directory “C:\Program Files\Microsoft Office\Office10\Queries”.

Share and Enjoy:

  • Print
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • Yahoo! Buzz
  • Twitter
  • Google Bookmarks
  • email
  • Google Buzz
  • RSS
  • Slashdot
  • Technorati
  • Add to favorites
  • DZone
  • LinkedIn
  • MySpace
  • Tumblr

Be the first to comment

Migrating from Lotus Notes

by Niklas Waller on October 16, 2008

in Integration

I am currently involved in a large migration project in where my part of the work consists of defining the scope, mapping, washing and migrating the data. The project has not finished but these are the experiences I have come across so far in short detail. Any experiences, tips or questions that you might have are most appreciated.

- Defining the scope is a work where you together with many other people in the organization talk about and define which data that should be moved to the new system. Once this is done, smaller scopes needs to be done to define which data in each database that should be moved, down to field level. We also have to define a map with any possible external connections that might exist from the Notes application and determine if these should continue to exist in the new system and how to handle them or if we can leave out on that functionality.

So the Notes application on the source side needs to be scoped. We have to talk to a lot of users and managers to get a grip on what we really want and need to migrate to the other system. Important here is to really think and discuss over this cause we don’t really want to make an exact copy we also want to make things better.

This is also the phase in which it should be defined on how much of the historical data that should be moved.

- Mapping the data comes next when the scope is defined. This work consists of mapping specific data from the Lotus Notes system, i.e. mapping forms and fields to corresponding objects and properties. This is a work that can be completely fulfilled first when we know the scope and the datamodel on both the source and destination system. But at the same time a work that can start pretty early in the process.

As an example. One Notes database is used for the customers with company information and contact information. So first we need to decide which form(s) that corresponds to which objects on the destination system. When this is done the mapping needs to be fine-tuned to field level, i.e. map an object propertiy ‘Email’ to a form’s field ‘mailAddress’.

- Washing the data
When we know what data to move we can start washing it. The information that needs to be washed is both the actual information or data, i.e. that addresses, phone numbers, names etc. are correct. This either has to be done manually or with the help from public registers automatically.

It is also a work where data needs to be washed because there are errors in it like the email address. For instance there might be requirements on the destination system that you can not import data with email addresses that are not valid. There could be other fields like this or there could be decided in you organization that some fields must contain correct information from the start.

I have created washing scripts for this purpose based on regular expressions.

- Migrating the data
When we know the scope, how to map the data and when it has been washed it is time to move the data or migrate the data. This can be done in several ways. We are using a Web Services API on the destination system to migrate all data. Important here is to have a backup plan in case the migration fails and in any case you should keep the data as a read source for a while unless you are moving all data. This is important since you will most likely want to migrate all data for a short period of time and then start using the new system only immediately after or after a short time testing. You would most likely not want to use both systems as production systems after a migration since you would need to synchronize them. This might be ok for smaller systems though.

The migration is divided into smaller phases. First intermediate files are created from the Notes data, in this specific case we are talking about csv-files. For this purpose I am using an agent I found on openNTF. Second, the csv-files are via an application from the other system imported via web services. And third we need to check checksums to verify that all data has been moved.

Share and Enjoy:

  • Print
  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • Yahoo! Buzz
  • Twitter
  • Google Bookmarks
  • email
  • Google Buzz
  • RSS
  • Slashdot
  • Technorati
  • Add to favorites
  • DZone
  • LinkedIn
  • MySpace
  • Tumblr

Be the first to comment