Talend Studio solved for me an ETL problem

Yesterday, I solved a tricky problem using Talend Studio. I needed do a pivot then, export to a spreadsheet template. SQL 2000 lacks usable way to do table pivots, so I let MS-Access do that part.

Next, I tried DoCmd.TransferSpreadsheet to export the data to the range A2:X30000. That didn’t work and I got a 3011 error, which means “TransferSpreadsheet” wasn’t meant to export to ranges, but sometimes works and sometimes doesn’t.

Some other techniques failed– some deleted the worksheet and re-added it. When a worksheet is deleted, all references to it become invalid. Some overwrote the entire row instead of staying inside a range.

Eventually what did work was exporting to an existing worksheet using Talend Studio. To populate the range inside the range, I used linking between cells of sheets, e.g. A1 = Sheet2$A1 and so on. Talend Open Studio uses java to talk to the spreadsheet and as of 2.3, Talend expects Excel 97 or 2000, not later. If the template is a later version, Excel will complain that data has been lost upon opening.

The debugging experience is impressive. When the visual widgets were not behaving as expected, I switched to the component or package code view, which shows the generated code.

What is most exciting is that in the end, provided you have access to a Java VM 1.6 or so, you can export the entire package as *stand alone* java code with all dependencies, which can be launched from a batch file. As long as all the feed back you need or expect is log files, then this is extremely portable and self contained. This compares to SSIS, which requires a configured windows service to run and a complicated DTSexec utility , or DTS which requires a client application like EM or DTSExec and access to an MSDB, or Pervasive, which requires a separately licensed integration engine.

Java is a leading curly brace programming language, so when your IT staff is mown down by buses, there will be plenty of people who can read the generated code.

Comments are closed.