Jamie Thomson
SQL Down Under Show 45 - Guest: Jamie Thomson - Published: 9 Nov 2009
SQL Server MVP Jamie Thomson discusses Integration Services and the lessons he's learned along the way. Jamie also discusses SSIS best practices and upgrade issues.
Details About Our Guest
Jamie Thomson is a freelance consultant working in and around SQL Server, more specifically specializing in SQL Server Integration Services and other related BI technologies. He’s proud to have been an MVP for SQL Server since 2006, which he was awarded largely for his blogging activities related to Integration Services, although he has been known to ramble off into TSQL, Analysis Services, Reporting Services, and the slightly weirder world of Web Data Services. Jamie hails from Leeds in West Yorkshire in the U.K., and now lives in Southwest London with his wife Helen.
Show Notes And Links
Show Transcript
Greg Low: Introducing Show 45 with guest Jamie Thomson.
Welcome. Our guest today is Jamie Thomson. Jamie is a freelance consultant working in and around SQL Server, more specifically specializing in SQL Server Integration Services and other related BI technologies. He’s proud to have been an MVP for SQL Server since 2006, which he was awarded largely for his blogging activities related to Integration Services, although he has been known to ramble off into TSQL, Analysis Services, Reporting Services, and the slightly weirder world of Web Data Services. Jamie hails from Leeds in West Yorkshire in the U.K., and now lives in Southwest London with his wife Helen. Welcome Jamie.
Jamie Thomson: I’m happy to be here. I’m flattered to be asked.
Greg Low: Great. What I ask everyone to do is how you came to be involved with SQL Server in the first place?
Jamie Thomson: Started about ten years ago. I got job working in Southwest London. On my first day, was handed SQL Server data warehousing textbook. I was given two or three months to read and understand it. I had thought I was going to be a coder, writing VB code or something. They gave me this book on something I’d never heard of. OK, where does this go? That was the first thing, SQL Server 7. OLAP services had just been introduced. Remember those days?
Greg Low: On yes, I do. I’ve often said that was a significant turning point in the product, when that was added. SQL Server 2008, I suspect another one with special data added at this point. Certainly, all up services at that time was a significant point.
Jamie Thomson: Yes. I remember back in early days, about 2000 when I was first introduced, I remember having conference call with guys from Microsoft. Didn’t know who they were. Len Wyatt and Dave Fackler. Still there, doing pretty much the same stuff. Same names keep cropping up.
Greg Low: Interesting that you say that. Tends to not be common. Over the period I’ve been involved it’s interesting to see the change of people involved. Do you find since you’ve been involved, enough organizational memory that flows from versions?
Jamie Thomson: Interesting you say that. Dynamic in Microsoft. As products evolve and new versions are worked on, new people come in. Windows is like that. Wholesale changes with each release. There are people who stick around the SQL Server team. Donald Farmer. He’s been around six or seven years. There are other people from the early days who have moved on. Missing in action. Shame those guys don’t stick around.
Greg Low: Yes. They’re around, but in different areas. I just came from PASS Summit in Seattle. One of the most fun things was they had the book signing for the SQL Server MVP Deep Dives. Barrel of fun. There were some 53 authors in the book, but 30 of them were there. Huge table. Interesting that all the people came past. I met most of the authors, got the book signed. One guy was Richard Waymire. An old friend. He was visible in the teams, but disappeared into the data duty area. He’s back actively working in tools. Book signing. That was fun. Apart from the books with all the people who had bought them, which thanks to all who did, we also had a lot of books come through for Microsoft Executives. Bill Gates’ book, Steve Ballmer’s book, and so on. We were trying to think of something creative to write. If you had a chance to write them a message, what do you write? Interesting.
Jamie Thomson: Good. I didn’t get to go to PASS this year. I would have been in line to get my book signed.
Greg Low: I have to say, even though I was involved, it’s a premier SQL Server conference anywhere in the world each year. I wouldn’t miss it. This year, everyone was talking cost of travel. Ironically, the cheapest year for travel ever. So many deals as airlines doing tough.
Jamie Thomson: I was watching output from conference on blogs and such. Sounds like something else. Reading what was coming out. Announcements real time. Steve Jones talked a lot. I was jealous. Then found out tickets had gone on sale for next year.
Greg Low: Early bird rate. $995 or something. Sometime in January. People should get along next year especially since a release year. Always makes more entertaining. Back to what you were saying, every time there’s a version, always working on next. Interesting challenge in this round is they’re into concurrent development of versions. Two on the way, plus SQL 11 being built at same time. Interesting challenge for the team.
Jamie Thomson: Yes. We’re seeing happening developing features for 11 fully baked. Maybe we should put in Release 2. I heard about the engine in Release 2, unexpected to me. Maybe they were putting in for 11 but it’s ready to go so why not put in Release 2?
Greg Low: Integration Services as a whole. Obviously, as you were learning, you’re rare to also document your learning’s in your blog. Fabulous for everyone else trying to learn. What did you think was toughest part of learning Integration Services?
Jamie Thomson: I learned early on that a difficult thing could be for someone trying to learn was previous experience in DTS from SQL Server 2000. I think knowing DTS was a hindrance learning SSIS as the two are significantly different. I came to SSIS with a bit of DTS background but also I’ve used another tool, Informatica.
Greg Low: Typically used with Oracle.
Jamie Thomson: Yes. Back in the day, back in 2004, I was using against DB2 type stuff. Big heavyweight. I remember picking up SSIS from a PBC build back in summer 2004. Took a look and thought nothing like DTS but a lot like Informatica. That’s when my interest began. Went from there. Donald Farmer on. I remember talking with colleague in 2004, Mick Holland. We were going through learning stuff for SSIS. He told me we’d have to do the same for Analysis Services and Reporting Services. I was saying yes, definitely. Five years later, still waiting.
Greg Low: Stage you’re talking about, point where it had just suddenly changed names as well. Was going to be the updated version of DTS but realized how very different it was and needed different message.
Jamie Thomson: That was a good decision to make. Shame it didn’t make. If you look at actual product, acronym is in a lot of places.
Greg Low: Yes. Even those packages have DTS files. All the names in the programs are Despot something. Quite a few people coming from DTS, trying to move across. What do you think big challenges are?
Jamie Thomson: Biggest people seem to have is wrapping head around notion that operates in control and data flow. Back in DTS days they were the same thing. More design service, data done in same place, constraints. Getting away from that was big challenge for people. Other thing is data flow itself. Fantastic piece of engineering. But if you have DTS head on your shoulders, preconceived ideas of how will work, so many questions in forums, still today, that you could tell questioned by people who had used DTS. They could do in old version, why not in SSIS? Not that you just can’t do that anymore, but you have to turn your head 90 degrees and look at from different angle. What’s different? Why I need to treat different. Obvious one is back in DTS you could in a row, condition ignore row. People were asking how to do in SSIS. Now conditional split. People didn’t get that straight away. Difficulty.
Greg Low: What’s the advice you give if people have a lot of DTS packages to move because they’re coming from 2000? When I see people upgrading from 2000 to 2005 or 2008, one main question I ask is how many DTS packages they have and how complicated. That’s a hard thing to do.
Jamie Thomson: Absolutely. Back in 2005 there was somewhat of an upgrade path that tried to put things in to help. Didn’t take very far. Simplest advice to offer is you have to do from scratch in SSIS as upgrade path not there. Having said that, since, one of our MVP colleagues, Brian Knight has a tool which helps in that DTS to SSIS path.
Greg Low: Grown from pragmatic works. DTSexchange.com will link to it. I’ve heard nothing but good things about it for people migrating. My concern on migration of DTS packages to SSIS is that typically involve many embedded connection strings which we’ll talk on. Other thing, a bit like when people were upgrading VB6 to VB.NET. Conversion wasn’t good. Every time people rewrote applications, ended up with better application than the one. Almost be careful what you wish you for. If you could directly upgrade, would end up with lousy Integration Services to look after for a long time.
Jamie Thomson: Yes. Fact of life. In computer. When you upgrade DTS pkgs. Unfortunate that you invest time in DTS, so many packages out there, then you come along and build a different product. Better. But the migration package is horrific.
Greg Low: Certainly the design surface, as you mentioned, in DTS was a mess. Stroke of genius branding control and data flow out. Occurred to someone. Incredibly good move.
Jamie Thomson: Absolutely. Other big thing, when I think back to old programs, coding in loops. Remember that?
Greg Low: Absolutely. Next thing I was going to mention was convoluted code in DTS pkgs. Now in box.
Jamie Thomson: Yes. Easy to take loop for granted. Incredible concept.
Greg Low: I use for each item, point at folder, go and do something to every file. That’s great.
Jamie Thomson: Interesting you mention that. One thing I think is difficulty in SSIS is you sometimes have too many options in how to solve problem. Loading data from multiple files is clustered. You have option to loop over folders and pull in one by one. You have multi file adapter which you can point at folder and it will load all the files, run, give you column in dataflow saying which file every row came from. Or you can build data flow. Every single file affected. Different circumstances. Highlights problems. Sometimes so many ways of solving a problem. Difficult to know best to do.
Greg Low: Indeed. One that I’d love your feedback on. I see two different thoughts for error handling. One is if you look at typical DTS, people have do some operation and if it fails do this error code. We can do same in SSIS, but also have option to set event procedure or event surface. I don’t know if surface is right word. Event handler. Hides the event code out of mainline code. Pro’s and con’s. Biggest con I’ve heard is that it’s not in your face anymore can lead to confusion. Someone had error in post execute event handler. Whole thing looked like it ran, but had failed. They didn’t know what had happened.
Jamie Thomson: I agree. They are dangerous. If you take corrective steps in your event handler, and anyone else taking package isn’t aware that event handler exists, from maintenance point of view that’s difficult. I don’t generally take corrective procedures, corrective steps in event handlers for that reason. If an error occurs I want to know about it. I can come and unhandle on my terms.
Greg Low: If you had to send a mail, do you surface in control flow or push behind into event handler?
Jamie Thomson: I’ll put in event handler. Logging procedures. Corrective steps you might want to take in the control flow.
Greg Low: OK. What level of logging do you tend to like to have, where do you log to?
Jamie Thomson: Where I tend to log to is log files rather than a database. If I used out of box event handler. What I did in previous co was put together logging framework with dedicated schema to log to. We built event handlers in the package that understood that logging scheme. We could put data in, file properly, rather than use out of the box providers which are limited.
Greg Low: Roughly, what do you think are the limitations?
Jamie Thomson: Awhile since I used. Main thing is if you use them, you get very long list of events, very hard to search and find information you’re looking for. Difficult. Make sure you log everything, generally. Certainly all the error information, to make sure you’re not missing anything important. But such a long list makes hard to find and know what’s going on. I like a table of events. Pertinent information that I want to see and loop back to a table. Instances of each package start and stop. You can see which events occur from which pkg. limitations. You don’t get all the information you might want. What I like to get out of packages you don’t get from log files is durations of how long to execute. I want to know how long each data flow took so I can monitor over time if time is increasing as that’s a sign of something to address. Can’t find that information from out of the box. Custom logging can be advantageous.
Greg Low: Another topic that comes up. For those moving, strong typing is an issue. We move from DTS where we had everything variant, scripted objects, to strongly typed arrangement. Causing people pain is idea it wants to do everything with Unicode data in package by default.
Jamie Thomson: Really? I don’t find people complaining on that too much.
Greg Low: Where I’ve seen is, if you take AV.NET provider and redata from voucher column and send back to voucher column, complains you’re truncating data. Standard source pulls out as in voucher data rather than voucher data.
Jamie Thomson: Yes. .NET. Yes. Strong typing is one of those that’s hard for people. Advantages to strong typing. Those with strong developer background can find advantages in it. Things like catching errors at design time not execution time. But again, if you’re coming from DTS, you’re not used to that world. You find things that worked in DTS don’t work in SSIS. Good reasons for that. Donald said in early days that when people complained about this in forums, they could come and see how many people complained about loose typing in DTS. Reason in SSIS is to clean up the mess. Repercussions. Unfortunate. Other big complaints, famous threat on SSIS from 2004 or 2005, fat file source, how if you load file that had multiple row formats in the file, a row with different number of columns as row before, SSIS couldn’t handle while DTS could.
Greg Low: Ragged file thing.
Jamie Thomson: Yes. There were things in play. Ragged writes format helps but not great. DTS, if row didn’t have columns expected, DTS didn’t care. Just read up to the row delimiter. Then no more columns, read next row. SSIS doesn’t do that. It looks for every column expecting. Reads in and isn’t aware of row delimiters. If you talk to team, makes a lot faster. No need to read each row to find delimiter to know what to do with row. Quicker. Pro’s and con’s.
Greg Low: Yes. We have ability in control flow to build script tasks and choice to build source destinations and transformations in data flow. Where do you consider creating component to drop in toolbox rather than using script?
Jamie Thomson: Interesting. I used to be ridiculed by an MVP colleague because I always wrote script rather than custom component. Simple. If you build something that will be reused, then custom component so you can reuse. Then you have to question how proficient you are at writing .NET code. Harder than script components. You get the scaffolding for understanding script component. You don’t get with custom.
Greg Low: There is a video, a webcast, on MSDN. It was there for 2005. It showed how to build tasks to drop in control flow toolbox. How to build data sources and transformations to drop in to data flow. At least within a single webcast, wasn’t ridiculously complicated. Where the product becomes more powerful, when you do that. I was doing work awhile back. Progress databases. You start with external connector, then a question mark where null. Different processing you need to do before you get data. I don’t know I want in every pkg. better off to build progress data source and drop in and have it do all the stuff.
Jamie Thomson: Makes sense. When you pull data out from data source, advantageous. Comes down to how proficient you are at wiring .NET code, how familiar you are with data flow. Using script, distracted from things like buffers, defining all up. Don’t really need to work too much about synchronized or asynchronized. Blocked or fully blocked?
Greg Low: Terminology police. Asynchronized and synchronized components.
Jamie Thomson: Synchronized component, output using same buffer as input. Example. Column components. Synchronized components comes in and comes out other side, looks exactly same as columns added to data flow by drive column. Row in, row out. Main thing about sync components. Same number in, same number out. Asynchronized are very different. Generally take in a lot of data and do aggregation. Aggregate component is classic example of as asynchronized. Shape of data coming in, not same as comes out. Metadata of all the columns or data flow is different before components than after. That’s the main characteristic of asynchronized.
Greg Low: Many third party components you tend to use? You mentioned Allen Mitchell. SQLIS.com. Materials there. Any other common ones you tend to use? Third party plug ins?
Jamie Thomson: I generally don’t. Things I’ve been doing recently. No third party components.
Greg Low: I tend to suggest people thinking about building something should go to SQLIS.com and look around first. Also if they’re looking at building components, go to toolbox and start with something in vicinity of what they need, work with that, not start from scratch.
Jamie Thomson: Absolutely. I do use row number transformation. Simple. Puts new column in data flow. Includes increments in numbers. Very useful. People ask for. Components that do for you. Option is to write code. Script components.
Greg Low: Nicely reusable when a component. Good point for a break.
Welcome back. So is there a life outside SQL Server?
Jamie Thomson: Yes. A bit of one. I’m married to Helen. About a year now. First throes of marriage. Aside from that, I’m interested in cricket, football. My real passion is a sport not big around world. Big in England and Australia. Rugby League.
Greg Low: Yes! Rugby League fan. Very good. I grew up in Brisbane. Big sport there, Sydney as well. Melbourne has tried to get bigger following. The team here has done really well. Winning competitions. Very strong camp. I remember in the 1980s I saw many games between Australia and England. Certainly rivalry between Australia and New Zealand common. The Rugby Union rather than League is gaining popularity. Big competition with countries involved. One thing to have cities playing each other but more intense when countries play each other.
Jamie Thomson: Yes. We don’t have many countries playing. I’d rather play Rugby Union than Rugby League. Club level I’m devote follower. My team is doing well at the moment. Won the championship the past three years. Quite enjoying it. Leagues. Melbourne is coming across in 2010 to play in the World Club championships.
Greg Low: Melbourne Storm won the local competition this year. Strong. League, interesting watching the reunion. Totally different competition, Australia playing Fiji or New Zealand or South Africa. It was regular competition. Suddenly it’s between all the different countries which is insanely interesting.
Jamie Thomson: Absolutely. I love watching Rugby at international level. Club level, not the same interest.
Greg Low: League is still very big sport here, but still Australian Aussie rules, Australian football is still for most of the country, bigger than what we have with Rugby League. The next one, soccer here, football in the U.K., is certainly still well further down as third type thing.
Jamie Thomson: Yeah. Australia I guess. Aussie rules I’m not familiar with.
Greg Low: Large field, lots of players. Several umpires. Spectacle, actually.
Jamie Thomson: Violent.
Greg Low: Not as violent as League. For awhile I had baseball club formed within an Australia football rules club. Famous. In Guinness Book of Records highest first grade score ever done in Australia rules. Literally won 406 to nil game. Coaching I used to do. What exactly do you say to your team at halftime when your 200-and-something nil down?
Jamie Thomson: Demoralizing isn’t the word.
Greg Low: Can we get a point.
Jamie Thomson: Cricket in the first inning.
Greg Low: Next thing with SQL Server was configuration. Probably, if I look at migration from DTS or someone building new Integration Services packages, there are many options as to how to store configuration and deal with it. What are your recommendations?
Jamie Thomson: Changed over the years. I’ve never been big storing configuration in SQL Server. Don’t really know why. I was OK putting things in files, manage files. Happy doing that. I’ve gone for configuration file option, serves me well. I’ll always recommend using indirect configurations as well, where location of configuration file is stored on machine. You won’t hard code location in configuration file into packages. Sounds like a good thing, but you are hard coding location of environment code. Store that locally on machine, things like that. How many moving parts do you want to have in your solution? Indirect configurations, file, environment, package. Three things to take care of in deployment. Is that how you want to go? Generally that’s what I recommend and what I use now. Having said that, another option that I’ve started looking at is to not define configurations at all in my package, passing location of configuration files from command line when you execute package. Com option. Tell the package where to go get configuration from. Package doesn’t need to know anything about the configuration. That’s good thing. Packages you can pick up and move where you want. All you need to change is command line to execute. I like to go down that path. There’s set option. Take any property in package and most properties can be set using the set option. Makes packages portable.
Greg Low: Point we need to make is very common in DTS packages for people to hard code all sorts of connection strings. Doing migration, often bane of my life, finding all the things embedded. Other time I find caused significant issues is where people have gone on to use the DT execute options allowing us to run existing DTS packages. They thought all connection strings modified, when some missed. Some old servers, some new, all on line at same time. Run packages, hit both old and new without them realizing. What I tend to suggest going forward is if they use different service accounts for new system, side by side, you avoid those probs.
Jamie Thomson: I agree. Security model, different security accounts on different systems ensuring.
Greg Low: Side by side installs, important to have different accounts.
Jamie Thomson: Absolutely.
Greg Low: One of the other options I’ve been doing lately is only configuration stored is the address of where configuration server is, using SQL box that we query, run in test or production, telling environment, only thing we have is configuration saying server over there, environment. Gives details we populate into variables. Use property expressions to assign to things. No configuration stored in package at all. Reasonable?
Jamie Thomson: Absolutely. Definitely good idea. I love what you say about property expressions to make sure standardized. Many advantages to storing action strings in variables. You can pause your package during execution. Watch the volumes in the variables change. See what’s going on. Directly on proxy. I like the idea of connection string and variables, letting them flow through packages.
Greg Low: The other good thing is if you have child packages, you can pass to children.
Jamie Thomson: Yes.
Greg Low: Picking up from parent variable.
Jamie Thomson: Great way to do. Interesting you say that. Idea of using sets and com options work well is single package. However not seated to when you have many child packages. Execute package in SSIS doesn’t have set or com options of data exec. Needs a lot of work. Limiting. If you’re using setting from command line, need different mechanisms to pass through to child packages. Mixing and matching how you do things. Works fine but if you’re after consistency, perhaps go different route. Many options.
Greg Low: I like it because if you have single configuration server, you just spin up and nothing changes. Someone has already preconfigured where to point to. Useful.
Jamie Thomson: Makes sense. I’ve never used configurations that reside in SQL Server. I like that you can run SSIS without SQL Server even existing. Advantageous if your store packages. If you have SQL Server, what’s the problem?
Greg Low: I like because in SQL Server I can have specific data types for my parameters where when I have XML configuration file, people can put whatever they want in values. Useful. When you build a package do you tend to start with prebuilt framework? I see people start from scratch, but framework seems productive.
Jamie Thomson: Yes. I go down that route. Simplest way is to use template packages. Not many people know about. Possible to have template package which all your packages use as base. Event handling and variables in that. Can be advantageous. Ups and downs to doing as well. Whole hog, framework is good idea. I spend a lot of time putting together framework but we used on project. Based around metadata database. Metadata globally used. Essentially what it was. Stored definitions of work that package had to do. We found we were taking control flow out of package and putting control flow into our metadata database. We could list packages and go execute in certain order. Perhaps turn off package and execute single package. What we did was slip flag in and turn off package to execute. Good reasons you might want to do. It’s difficult to turn things off in control flow. Go in package, not easy to do. If you have control flow in database, makes easier to do. When I was defining this, we had some heated discussions about whether right to do. I thought it was, the other guy disagreed saying should be in packages. Other reason I like to have definitions in control flow in metadata database is you can link execution history back to metadata. If using custom logging framework, can see what event happened in which package, executing at this time, part of overall collection of work. Nice click through of why things occurred. If you can define work and log against that definition, easier to work with. Good history.
Greg Low: I see people reinventing the wheel whenever they start another one. A lot of things you need to spin cycles and think of how to do so applies to all packages. What do you do for package locations? File system or SQL Server?
Jamie Thomson: Yeah, I store in file system. Whole separation from SQL Server. Execute packages without SQL Server being present which is an opinion. I always use packages and file system. When you’re developing all the packages are on file system. If you want to use different way of storing in production than developing. I like to keep things consistent. If you execute package using execute package task, you will store packages in SQL Server, you have to have way of telling packages they’re running in production environment, therefore get packages to execute from SQL Server. If you’re developing, tell in development environment, find on file system. Quite difficult.
Greg Low: Do you use deployment utility built in?
Jamie Thomson: No. various ways to do. Most people move them manually. We’ve built MSI packages installer. Takes our output from Integration Services project and package to store. Installer deploys onto server you’re deploying to. Takes configurations stored. Nice. Can include continuous integration build as well.
Greg Low: That raises what I’m wondering. What do you do in terms of source code control?
Jamie Thomson: I’m a fan of source control systems. Most projects I work on I find I’m using project and working using open source control system. Wouldn’t recommend particular source control system, but recommend you use at least one of them.
Greg Low: Big challenge I see with Integration Services package is lack of decent package comparison tool.
Jamie Thomson: Yes. I knew you were going to say that.
Greg Low: Like normal source code, change one line and I can see what changed in this version. Unfortunately with Integration services, package changed. APEX has one, basic XML comparison tool. Touch to have XML that looks totally different. Really is need for that. Something missing in product is application lifecycle stuff. Other things like aesthetics source code analysis, check in rules. Ability to build in relation to Integration services. Checks you haven’t hard coded connections. Nice if that was built in.
Jamie Thomson: Couldn’t agree more. A couple years ago, there was someone well known who wrote blog about using SSIS. Diatribe. Tore to pieces. A lot of his concerns were around what you said. No support for packages. People come from .NET development side are used to this. See things in SSIS which foreign. Things missing. I completely agree. All these things, deployments, logging, source control systems, how to productionize database. Application lifecycle management needs to progress. You might see things changing in SQL 11.
Greg Low: Also, some people aren’t aware, not multi-instance application. Is that issue for you?
Jamie Thomson: Not an issue as long as you’re aware of it. I don’t work on projects where we have development and set environment on same machine. Different machines. People do. UAC instance on same machine. To me, never work in that environment. Wait around, making sure packages know which environment to execute in. location of packages obvious.
Greg Low: Other topic is that it’s not clusterable application.
Jamie Thomson: Yes. Many people mention this. I remember seeing a blog on how to cluster the SSIS service.
Greg Low: Funny. There is knowledge base that says don’t do this, but here’s how you do it.
Jamie Thomson: Yes. People don’t know the SSIS server is for. Frankly doesn’t do anything. I had conversation on this a few days ago. Input from someone in SSIS who said we should have service in the box. Might change in future.
Greg Low: Check points and restarts. I was preparing material on Integration services for Masters Certification classes. Feeling I got when topic of check points came up, I got feeling that team didn’t want to talk that. Experience was less than optimal.
Jamie Thomson: Yes. Few problems with it. Very hard to configure restart ability when using check point. Nebulous. Have to tweak knobs just right to ensure your package will restart from just the point you want it to start from. Strange properties in the package most never touch. Fail package, fail parent. Need to get right if you want package to restart. Not documented anywhere. One of the problems. Another problem is check point sometimes does not work. Packages stop executing because of error. Correct errors, all fine. Restart and is completely different.
Greg Low: I suggest people just don’t go there. What I do tend to like is I build packages so they’re restart aware of themselves. Rather than moving huge data from one place to another, I’d rather build loop that says do a chunk in a transaction, then another chunk. If you have to restart, knows what it still has to do.
Jamie Thomson: Yes. I call that artificial roll back points. Points in your execution where you know if something goes wrong you can easily go back and start from that point. Metadata database. Flagging certain stage in execution. Packages can read that metadata and know what’s executed before and failed, will point to same point and start.
Greg Low: I don’t think there is any automated way to do this. Have to design packages for restart ability up front.
Jamie Thomson: Absolutely. Need to do this. What I like to do is packages I took from someone, we shove to data warehouse in single transaction. Easy point of roll back. However if you’re using SSIS and data flow, isn’t easy. Using transactions in SSIS means using MSBPC. Whole another world of hurt.
Greg Low: Seems heretical a bit, but I tend to use a lot of execute SQL tasks. Whenever people talking SSIS and designing, seem to spend a lot of time in data flow doing massaging data. My own preference is to get data from source, pull in with minimal constraints, into other location, process it in bulk with SQL statements before doing anything else.
Jamie Thomson: Yes. Nothing wrong doing that. Most people who work on SSIS come from SQL background so you can use new technology and spend time learning or use what you know? If you know TSQL, use that. Works.
Greg Low: Yes. Listen, that’s up to time, Jamie. Other thing is what is coming up in your world? Anything you’re writing?
Jamie Thomson: Yes. Coming up is SQLBits conference in a couple weeks.
Greg Low: With Simon Sabin and others. We were talking this coming up. That’s in Wales isn’t it?
Jamie Thomson: Yes. Two weeks away. I’m spending this weekend getting demo material ready for that. SQLBits, outside U.K., might not know. This is the fourth SQLBits.
Greg Low: I think it’s fifth.
Jamie Thomson: Maybe. Premier SQL conference in Europe. Lots of hard work by many guys. Fantastic job putting together. Big event. Can’t wait to get there.
Greg Low: Well good luck with the show and thank you for being here today.
Jamie Thomson: It’s been a pleasure Greg. Thank you. I enjoyed it and I listen to your show whenever you have a new one. I look forward to more in the future.
Phone: 1300 SQL SQL (1300 775 775) l International +61 1300 775 775 l Fax: +61 3 8676-4913
Copyright 2017 by SQL Down Under | Terms Of Use | Privacy Statement