Kent Tegels
SQL Down Under Show 3 - Guest: Kent Tegels - Published: 18 Jul 2005
In this show SQL Server MVP Kent Tegels discusses why XML is important for SQL Server, existing SQL Server 2000 support and upcoming XML support in SQL Server 2005.
Details About Our Guest
Kent Tegels is an enterprise architect for Insimiosis systems international, where he specializes in helping clients achieve the most from their Microsoft technology investments. He’s an instructor for DevelopMentor, teaching classes in database technologies, programming, and dot net. In 2004 he was recognized by Microsoft with an MVP status in SQL Server for his community involvement in SQL Server and dot net. He holds Microsoft certifications in database administration and systems engineering. Kent has contributed to books on data access programming and dot net, and is well known as an industry speaker. He lives in Omaha, Nebraska, where he and his fiancé enjoy making and drinking fine lagers.
Show Notes And Links
In August 2010, we got the sad news that our friend Kent lost his battle with illness and passed away. I cannot begin to express how sad I was to hear about this. The SQL and XML communities are weakened by this loss.
Farewell Kent and thanks for all the fish.
Greg
Show Transcript
Greg Low: Introducing show number three, with guest Kent Tegels.
Before we get started I wanted to mention I was at a software design review at Microsoft a few weeks ago. Met up with Wally McClure, responsible for ASP.net podcast, I suggest you listen to that.
Our guest this evening is Kent Tegels. Kent is an enterprise architect for Insimiosis systems international, where he specializes in helping clients achieve the most from their Microsoft technology investments. He’s an instructor for DevelopMentor, teaching classes in database technologies, programming, and dot net. In 2004 he was recognized by Microsoft with an MVP status in SQL Server for his community involvement in SQL Server and dot net. He holds Microsoft certifications in database administration and systems engineering. Kent has contributed to books on data access programming and dot net, and is well known as an industry speaker. He lives in Omaha, Nebraska, where he and his fiancé enjoy making and drinking fine lagers. So welcome, Kent.
Kent Tegels: Good day, Greg.
Greg Low: Maybe if we could start by getting a background as to how you came to be involved in SQL Server.
Kent Tegels: As someone who came out of a web development background. Looking for a good way to store lots of data and interact with it on the web, things like Oracle were easy to use, and My SQL, but they didn’t have the best support on Windows platform. When I made that transition eight years ago I needed a good way to support that and there was SQL Server 6.5. That’s where I got started. Obviously to step up, SQL 2000 is a great tool for us to use.
Greg Low: What were your thoughts on SQL Server 6.5?
Kent Tegels: It was better than 4.2, better than some of the other technology choices I had at the time. There were enough limitations in them that when I saw what they were going to do with seven I was really excited about seven, that’s when I became interested in SQL Server, really with seven. Worked with 4.2 and 6.5 a little, the important things were being able to have fairly good-sized row sets, have a rich data-type vocabulary to work with. Those were sort of during 6.5 and got better with seven. Now with 2000 and 2005 we have a pretty rich environment to work with.
Greg Low: Brings us to the topic I was hoping we would cover – XML support in the database engine. What I find is a lot of developers and DBAs have very little idea why having direct support for XML in the database engine is useful. What’s your take on that?
Kent Tegels: You make a good point, there’s a distinction between having XML supporting features and storing XML on a database. Lots of reasons you want to do both, primary reason to have XML supporting features in a database is you take advantage of all the things SQL Server has to offer you. Rather than bringing all the data down to some mid-level plant and lunge that into the XML you want, why not keep the abstraction of the data as close to the data as you can? One of the big reasons I like 2000 so much. You have things that do the marshalling of that data out into an XML format to work with, and you have things like annotated schema which make writing for-XML queries a little easier, less-involved. Directly call into SQL Server and bulk objects and get XML back out. If you look at everything in the extract, transform, and load method, we’re talking about extracting data out of SQL Server, as just row sets, but instead of being square row sets, we’re talking about them being XML documents.
Greg Low: What sort of things do you think are suited well for being stored as XML in a database?
Kent Tegels: That list is pretty short, there are things that are hard to reconstitute in document order. It’s hard to shred a word document, store it, and get it back out in the format you put it back in as. That’s a good use key for storing XML as XML in a database. Thing is that it isn’t efficient to shred, store, and re-constitute in the way you need it. I’m not a big fan of the point of view that says you should store everything as XML, because that’s not efficient.
Greg Low: Think it’s one of the things the pure-object guys are looking forward to. They see the database as a place to throw objects into. XML is one mechanism for doing that.
Kent Tegels: There’s a certain value in doing that. Trade-off is you can use as many features of SQL Server as you would want to. Becomes harder to make highly-efficient queries against the data those objects represent. When you get into updating data, maybe you’re storing customer objects and that customer object is referenced in 300-400 purchase orders, and you change the address, do you go update three or four hundred XML instances? That’s the kind of thing I think people will not really “grock” about, using XML properly right off the bat, they’ll get there as soon as they’ve written that system and done maintenance on it, they’ll learn.
Greg Low: I saw the other day, I was doing some consulting work, thought it was a very obvious case. In some way, whenever you’ve got some case of structured and semi-structured data, the example I’m thinking of, it was an organization that was holding details on relays that are used in power sub-stations. There’s a lot of standard relational data in there, the ID for the relay, the manufacturer, location, all that. The problem I had is that they also wanted to store all the settings, these are highly-programmable devices. Wanted to store individually, what do you do when every relay has a different set of settings. One of the partners who had proposed a solution had come back with a solution that involved thousands of tables, they had a table for each type of relay that would be stored. I thought that would be completely bizarre. It struck me as something that would be ideal where you could put the settings in a column and still query them.
Kent Tegels: Your other choice in a scenario like that is sort of a dictionary approach with open schema. Have some attribute name of that relay, you have its values stored in SQL. I think by the time you get down to that, you aren’t doing anything different than you do with XML anyway, it’s a good fit. The open schema approach is a good way to solve those types of problems and take advantage of those unique index if you need to. As a DBA purist, anytime I see SQL variant I get nervous.
Greg Low: Always thought it was amusing they only recently added variant to SQL Server, but all the other languages have finally removed it.
Kent Tegels: Think about how you use SQL to store. Of variant type there is a problematic thing. You’re always going to say a type save, but then you have something that has a variable type per instance. From a normalization standpoint that makes me nervous, never really know what day it is there. There’s something, I just can’t tell you what it is.
Greg Low: My personal take is that I dislike implicit type conversions immensely, I see them like some evil, think the best example is in T-SQL if someone wrote customer address = three, a human compiler looking at that knows there’s something wrong. What intrigues me is that the things like T-SQL compiler will assume the person just could write three as a string, that’s probably not what the problem was. They picked a wrong column or something else like that. It takes things like that and buries them. Takes bugs and buries them where they’re hard to find.
Kent Tegels: Sense in the XML community that schema is for validation only, not used the way it was intended, to strike Metadata up in a small document. In the database world we’re comfortable with the concept of having schema describe the tight semantics of the data we’re working with. People shy away from schema because it’s so hard they don’t take advantage of it. It’s no-win if you have that attitude.
Greg Low: In terms of SQL Server 2000, describe what type of support there is for XML?
Kent Tegels: For-XML queries, great way to do aggregation of the results set into an XML document fragment. Annotated schema, great way to munch query results out into XML format. The two go hand-in-hand, the for-XML queries on writing a select statement, the annotated schemas on writing a data-review schema file, let SQL Server generate a for-XML query for me on the back end. Those are the ones I use the most.
Greg Low: For people not familiar with this, on the case of a for-XML query, it’s a select statement where we’ve put at the end of the statement for-XML auto, for-XML…, or for-XML explicit.
Kent Tegels: Of those modes I’ve written enough explicit to know I like auto better. Another thing I really like are the ability to have a filter that sits on the IIS side, I can pass queries to it and get results back. That’s a nice story. The problem is IIS, brings a lot of baggage with it. More political than technical. People have this wrong perception on how you use IIS, it’s easy to keep up. That’s off-topic.
Greg Low: The other problem there is that it tends to cross the political divide, there’s a DBA camp and a network administration camp, that’s a problem
Kent Tegels: I’ve seen that in every organization I’ve been in. The DBAs don’t want IIS on the machine, the infrastructure people are wondering how it helps them get anything done. It’s awkward to migrate those waters, there’s a lot of sharks. I think the last of the big four is probably the thing I’m happiest to see changing in SQL Server 2005 – open XML. You can take an XML file, open it in memory and start doing shredding operations on it. Those are the big four. For-XML queries, annotated queries, package, and open XML.
Greg Low: With open XML at the moment, this is the one we call, we receive in SQL Server 2000 in voucher or text or something, and then call XML prepared document. That builds like an in-memory structure, query with open-XML.
Kent Tegels: I think that model is a great model for SQL Server 2000 given the constraints it had. We’re looking at three painful points with SQL Server 2000. There’s no data type that stores directly to XML, no facilities for treating that data as anything other than string. General lack of plumbing fixtures. If you’re using XML and SQL Server 2000 and you want to send that down… I was talking about points with SQL Server 2000 and the data type support. Right now you have to treat everything as a string. Whatever you gain from using XML. Some of the plumbing fixtures aren’t there because you’re having to chunk that XML down as 8,000-byte strings. Get that back together, get it into an XML document to work with. The web releases for dot net developers have been great, gave you things like an execute XML reader. For ADO developer, XML is not easy to work with.
Greg Low: In fact, I saw a number of people at the bleeding edge who found it very difficult.
Kent Tegels: Not straightforward. Had to know that ADO operating model very well. Even if you knew all that, you were still doing everything memory-resident. As soon as you start working with a 30-megabyte XML instance, you really put down resources. It’s a great model and a good way to get things done. SQL Server 2005 solves a lot of problems we have with it. Makes it less painful to work with.
Greg Low: Good point to take a short break. After we'll talk about the support in SQL Server 2005.
Greg Low: Kent, tell us about yourself so the listeners know about you.
Kent Tegels: Got to thinking pretty much your average guy in Omaha, Nebraska. I don't know if you've ever been to Nebraska, but we're the American version of the Never-Never. Friends are Star Wars fans, describe us as the Tatooine of America. Centrally located, 500 miles from other major cities. Nice and quiet, pretty sky to look at. Very nice place to live. No crime, great schools, we're happy. Easy to make good beer here, we have great water chemistry, easy to make beer.
Greg Low: Outstanding. Recently moved into a new house?
Kent Tegels: We moved into a new house, located on ¼ acre of land now, Janelle is my fiancé, she got herself a self-propelled lawnmower, keeps the grass short.
Greg Low: Is this one of the lawnmower that automatically works out where to mow, or is it a remote control?
Kent Tegels: She actually pushes it, even though she’s just as much of a technical geek as I am. Still enjoys getting out and working. We don’t have a high-tech lawnmower.
Greg Low: The idea of a device doing the rounds of the yard by itself with high speed blades worries me a bit.
Kent Tegels: We’re not quite that high-tech.
Greg Low: Tell us about what is happening with XML in SQL Server 2005.
Kent Tegels: I think the biggest thing we did in 2005 in terms of XML is that it becomes a native type. We’re able to work with it as a native type. Everything we would expect to be able to do we can do. Make a request for it, don’t get back 8,000-byte rows. We want to do edits in place. We have methods for doing that. Transform it to a row-set. We have methods for that. The other changes are those that go away, you don’t have to put up with it anymore. They’ve changed a lot of the mechanics so if you want soap-enabled Web services talking to the XML, they’re there for you to work with.
Greg Low: See a lot of discussion on whether or not SQL Server should become an application server or not. Suppose the idea of exposing stored procs and things as web services is one of the things that leads that discussion, where does it fit in?
Kent Tegels: Essential part of using SQL Server in a well-integrated scenario. People have to be careful not to get too carried away and put too much application logic in SQL Server. That limits you as far as what you can do with what SQL Server can do. Turns the scale up. If you take the approach where you leave more of your application services logic on middle-tier servers, greater scope you can scale out. Not a big fan of that, huge fan of Service Broker. Turning SQL Server more and more into an application server.
Greg Low: Must admit I must chase Roger Walter to get him on the show talk about Service Broker in particular. I find it fascinating.
Kent Tegels: A few of my cohorts at DevelopMentor I recommend. Dan Sullivan, they’re all good about service broker. Tools written for working with Service Broker, on top of SQL Server 2005.
Greg Low: Was Niels the one who was writing the Gooey Tool?
Kent Tegels: Absolutely. Such a great tool to work with when you’re trying to filter server applications. Right now it’s either that or you pretty much have to know how to type. If you don’t know, you can’t use Management Studio. Contract types you need. It’s done us a great service by having that tool.
Greg Low: New release he put on his blog the other day, downloaded it but haven’t tried it.
Kent Tegels: Service Broker is the dark horse wonder of SQL Server 2005. People either fall in love with it and try to use it for everything, or ignore it and miss out. Need to find the right balance.
Greg Low: Takes advantage of the new XML data type you were just mentioning. In a message type you get to the messages are XML, it’s up to you whether they’re well-formed or whether a schema applies. With XML data types, we can declare variables of that type, build columns, we can query directly.
Kent Tegels: Right. Take an instance and get into it using a language called X-query. Get at its values in a SQL-like way. Don’t have to take it down to a string and work with each one. As long as you know enough about X-path to get values out, you’re ready to go. Start using this to get values out of your XML, pass them into the column T-SQL query, use them as values. Lacked for a long time working with XML and SQL Server.
Greg Low: For those that haven’t looked at the X-query methods. First there’s a query method, takes X-path type syntax. Example of what you would use that for?
Kent Tegels: Use query to do full ex-query based queries. Say give me all the nodes that fall on this pattern with these constraints. Work with ware-clause in SQL Server. Order them this way, re-shape XML from one schema to another, from one schema to a not-bound instance. Query best for re-shaping.
Greg Low: Supports X-path query directly. Like a directory path for people who haven’t used that. In addition you also have these FLOWER syntax. What’s it for, you probably know.
Kent Tegels: SQL Server doesn’t support the Let concept, I don't think it's that big of a loss.
Greg Low: Also in the X-query suite there’s an exist method.
Kent Tegels: Use like you would use exist in SQL Server. Say does this X-path have value, does the X-query return some value? Predicate to reduce the size of the results.
Greg Low: Nodes method?
Kent Tegels: Nodes is a curious creature, doesn’t fit with model of how do I extract data out as XML? Its job is to take XML and return row-set data.
Greg Low: How do you see that fitting in conjunction with open XML? Noticed in SQL Server 2005 they changed the scoping of it. People had issues with memory consumption.
Kent Tegels: That had to happen, the memory issues. Nodes is where you have an XML instance in memory, I want to work with it. Open XML is I have that XML someplace else and want to work with it. Open XML is great if I already have an existing type instance of XML to work with. Nodes is when I have to have an instance to get it back to the row set.
Greg Low: Another method was modify?
Kent Tegels: There’s a lot of effort going on that the WPC just to give X-query itself… not a big speck at this point, not formalized.
Greg Low: One thing I was going to ask about. I know in the material we’ve been going through with SN training. Talk about it as based on working draft. What do you think is likely to change?
Kent Tegels: At this point most of the parties that are participating in the process aren’t interested in any more changes. Want this out and the door so they can get their products done and out the door. Then fix up things. There’s no way with X-query today to modify instance, only extract data out of it, no transform. We need a way to do transform on the data and change its values. That’s what Microsoft’s invention called XML-DML is all about. Gives us a nice way to modify values inside of an XML instance. Any operation you would think about doing with SQL Server transacts SQL today (inserts, updates, deletes), is part of XML-DML. Language is different, clearer when looking at it from XML frame of mind.
Greg Low: With modify method we’ve got insert delete and replace syntax. Had the feeling when talking about working draft, that was at the more bleeding end of it.
Kent Tegels: Wanted to address it at some level, X-query 2.0 might have more. Their first pet is how do I efficiently query XML in a T-SQL like fashion. Set operations. Come back, go through and modify instances.
Greg Low: Suppose for those who haven’t seen it, what those methods allow us to do, select some column dot, apply a behavior or method to it. Dot query, whatever. Search and retrieve just the information out of the column we want, can then be efficiently searched inside the database.
Kent Tegels: The canonical example I give, I have a word document stored in a column, want to extract the Metadata about who the author is, how many words are in it. Select statement, query to get that Metadata out. Do a ware using the exist or nodes to get matching values out of it. Write interesting queries with all of this. Interesting in terms of trying to figure out what’s going on.
Greg Low: With a traditional DBA hat on, I would say wouldn’t these be promotable properties? When I stored this data in a column wouldn’t I extract it and store those columns I want to query in addition, searched more efficiently by the database?
Kent Tegels: If your argument is that they’re extractable properties, they are. If your premise is that it’s more efficient to extract than to put somewhere else, I disagree. SQL Server 2005 gives you a full way of indexing that data, working with it efficiently. If you sat down and used it and compared it, it’s close to just as efficient. As SQL Server continues to evolve, it will become just as efficient.
Greg Low: Leads us into another of the new features, XML indexing. Number of different types of index.
Kent Tegels: Two classes of index. Primary XML index, once that’s in place there’s a secondary set of XML indexes, put on for part property and value. Good ways to tweak an index. One of the things I’ve found, CTPs have been rolling out, used to say the primary XML index didn’t give you that much value, it was the secondary that gave you the most performance help. Between the CTPs they have really kicked up the performance. Now that’s the primary way of improving query performance. Go with it, put a primary index on that XML column, see if you get the performance you want out of it.
Greg Low: We should talk about the difference between those, the primary index and the different types of secondary index. Purpose of primary index?
Kent Tegels: Basically a node-mapper. Says this node is on a record that has this primary key value. Requires there to be a cluster primary key on the table. Map where a node is in relationship to a table or page. Retrieval of it is that much more efficient. Secondary ones are used by X-query processor, figures out where values are. Becomes an index over the data inside the XML. X-query interprets a little differently. More of a map in terms of the document, what’s going on inside. Primary is how does it lay out in page memory?
Greg Low: Secondary is optimized to answer different kinds of queries?
Kent Tegels: Right. Good discussion of that in my blog and every place you find information about those queries and instances.
Greg Low: That’s great. Anything else XML-wise in SQL Server 2005 that’s of interest to you? Noticed they’ve made substantial enhancements to the standard for-XML syntax.
Kent Tegels: How can I not heap praise on for-XML path? Usually when I’m teaching a class about it, I’ll tell people the greatest pain and greatest love you’ll ever have with SQL Server 2005 and XML is writing a for-explicit query. Have to map everything into a universal table, then post-query process (aggregator) comes along, takes universal table and spits out XML. Cumbersome process, writing query upon query, all together to get the universal table. Becomes it’s easy to write several thousand lines of query, you only need 40-50 lines. That’s what XML path is. Says for this X-path put everything on this path, you’re good to go. Don’t specify tag and parent, don’t union them all together. This goes on your X-path and you’re done.
Greg Low: Noticed, to me it looked like they’ve added as options into what we can use with for-XML and auto, the things you commonly wanted to do, had to go to explicit before, was complex.
Kent Tegels: Like generating elements instead of attributes, add elements to have well-formed XML. All those things are improvements.
Greg Low: Noticed the open XML statement has improved. Since we have an XML data type, the prepared document can take XML type incoming. In the width statement for open XML you can also use XML columns there.
Kent Tegels: In my usual use for that is to put it in a blog, put into type, use nodes to do shredding rather than open XML. Matter of what you’re more comfortable with. More comfortable with nodes, makes more sense.
Greg Low: On out local SQL Down Under mailing list, a guy a month ago was having significant memory issues when working with open XML. When moving to working with nodes, fixed all his problems.
Kent Tegels: Watched with interest, wanted to see how that was resolved. Don’t know if you have American subscribers on that list, would encourage it.
Greg Low: Put it in place because what happens here is the SQL Server normal public news scripts and everything are fabulous. Not as in your face as I’m trying to get with a mailing list. What happens since they haven’t been in our time zone, people ask a question and 24 hours later get a response. Disjointed conversation. Rapid-fire discussion when you’re in the same time zone.
Kent Tegels: Had this conversation in the past. The more localized the community the stronger it is. Happy to see that list pop up, thankful you’ve allowed Yanks to subscribe.
Greg Low: Absolutely. Most people are from Australia and New Zealand, happy to see people from anywhere. Grown really well, many hundreds of people on the list. Good traffic. Started having a discussion about coding standards. Ask you the religious question, the one that causes debate. Do you have a preference for singular or plural table names?
Kent Tegels: I’m for plural, but I’m an object guy. I tend to think of tables as… Probably why I’m contaminated.
Greg Low: I thought it would make for an interesting debate, I thought I’d start with an easy one, table names. How quickly strong opinions came about.
Kent Tegels: It’s the… Everybody can all be talking about the same thing, we all have different views of it. Fascinating thing about XML, we all communicate and give the data a lot more semantic meaning, have the problem of Meta-semantics to deal with. Has to be a better way of dealing with it, long time for that to happen.
Greg Low: Now while we’re finishing up talking about what’s new in SQL Server 2005, what about data integrity?
Kent Tegels: That’s the issue you hear people bring up “it’s just XML there’s no way to make sure the data integrity is good, how do I deal with the relational and referential integrity?” those are two different issues. Is the data correct is very different than is the data repeated, is the data in this table the data that should be there? To make sure it’s correct and has the right element names… SQL Server 2005 allows us to have schema collections, XML schema collections. Create a schema collection, put any number of XML-schemas into it. Any time data goes into XML-type instance, it gets checked against that schema, the data is always what I expect it to be. Referential integrity is a harder story. Computed column answers, uses dot-values method, extract values I want to use as a key, use that to relate to other tables.
Greg Low: Dot-value is another X-query method we didn’t mention earlier, used to extract a particular value once we know the path.
Kent Tegels: Promotes it to appropriate SQL type. Whatever its correct type is. If I’m in that schema-bound mode, it won’t let me do any legal conversion. Favorite example is I have a suite of history data. Dates in it… Greg, how well does that map to SQL date time? SQL Server doesn’t allow date time before…
Greg Low: In our part of the world, discussion on date time are most entertaining.
Kent Tegels: We’ll have to cover that sometime.
Greg Low: Tibor Karaszi has a wonderful guide, The Ultimate Guide to the Datetime Datatypes. Should be compulsory reading for everyone working with the product.
Kent Tegels: I have homework to do.
Greg Low: He’s living in a country where the U.S. defaults aren’t the way he lives. The challenges with the products, how are they going to work in the environments? On the topic of schemas, the idea of a schema-collection associated with a column. Why have they done that?
Kent Tegels: The main reason is that you can do versioning with that. Want to over time, make sure there’s a way to validate the data as the schema changes. Simply by saying that new instances go into a new name space, do that. Another reason is to get support for open schema method. XML is storing word documents, or documents coming out of local office, completely different schemas. Want to have multiple schemas in a collection. Doesn’t store them as schemas, stores them where every schema element is broken down into its own thing. Each rule becomes checked as you go through. One of the things that creeps me out is you can have two schemas in a collection, if you serialize it back out, you get one schema out. The nodes that match on the name space are condensed to one single instance. No way to deal with the problem.
Greg Low: Your take on that isn’t that people wouldn’t have wildly different schemas in the same collection, it would mostly be versioning?
Kent Tegels: Great for versioning. With the different schemas in the collection, the more you’re thinking in the open-schema mindset, that’s good. Reason to do that. May be storing messages out of some integration process. Might have 18 different messages coming out of that process, each with their own schema. Whether they’re different or not causes some debate, they are different. May need to be able to store all those messages because of how I’m working with the data.
Greg Low: So, suppose that’s the main thing changed with SQL Server 2005 and XML, can you tell us what else is happening in your world?
Kent Tegels: In a few weeks I’ll be going to Japan to teach an ADO class with DevelopMentor. Working on a project, writing some dot net applications, might be able to use Visual Studio 2005 and SQL Server 2005 to automate sales of some high-end agricultural equipment. Be at PACs, have a presentation called the 15 must-know things about SQL Server 2005 XML. Me bearing my wounds to the world on what I’ve messed up on, learned the hard way. Looking forward to giving that one, having people laugh at things I’ve done. Hilary Cotter coming and giving me a good education on full-text index. Interesting things with full-text indexing.
Greg Low: Interesting discussion with Hilary on the last show. Caught up with Kim the last session in Munich, attended his session on full-text indexing. Good fun.
Kent Tegels: The last trip I made up to our client site, Hilary and you kept me company on the way up there. Great that you’re doing this, burn it to CD for instant entertainment.
Greg Low: Thanks for taking part, Kent, looking forward to the XML support in the new product.
Kent Tegels: Look forward to coming back some time in the future.
Greg Low: Great, thank you.
Phone: 1300 SQL SQL (1300 775 775) l International +61 1300 775 775 l Fax: +61 3 8676-4913
Copyright 2017 by SQL Down Under | Terms Of Use | Privacy Statement