Roger Doherty
SQL Down Under Show 35 - Guest: Roger Doherty - Published: 14 Mar 2008
In this show SQL Server DPE Evangelist Roger Doherty discusses SQL Server 2008 for database developers.
Details About Our Guest
Roger Doherty is a Senior Technical Evangelist with the Developer and Platform group at Microsoft.
Show Notes And Links
Show Transcript
Greg Low: Introducing Show 35 with guest Roger Doherty.
Our guest today is Roger Doherty. Roger is a Senior Technical Evangelist with the Developer and Platform group at Microsoft. Welcome Roger.
Roger Doherty: Thanks Greg.
Greg Low: As we do with everyone, will you tell us how on earth you got involved with SQL Server in the first place and how you got involved with Microsoft.
Roger Doherty: I’ve been with Microsoft about 18 years. Most of that has been with SQL Server. I started in 1991, but my experience with SQL Server started before that with Sybase. I was a consultant for them right out of college. Sent to various odd corners of the Earth working on Sybase installations. Jumped at the opportunity to come to work for upstart Microsoft in 1991 when they announced they were going to port SQL Server over to the OS/2 platform. Kind of the start of the client/server revolution and the PC database server market. That was my start. I’ve been mostly focused on SQL in different roles over that time. Developer, sales, evangelism, and here we are.
Greg Low: Indeed. With SQL Server 2008, let’s wander through what developers will find of most interest.
Roger Doherty: Sure. When you’re talking about developers and how and why they might be interested in the database engine from a development perspective, there are different ways to slice and dice it. First, I like an expansive view of who developers are. Today people wear many hats. In the past we had the luxury of saying database developer versus application developer versus BI developer. Increasingly we’re seeing developers have to have expertise across all the areas. The easiest place to start when talking about developer functionality in SQL Server 2008 is the database engine itself. Moderate enhancements from a programmability perspective. Extensive from a data type perspective. On programmability, most who have written Transact-SQL code have run into the problem of the inability to pass around arrays as parameters to stored procedures.
Greg Low: Common request is passing an invoice into a procedure.
Roger Doherty: Yes. The simple ability to pass a two dimensional row set as a parameter into a stored procedure would be helpful. Interesting workarounds over the years. The introduction of the XML data type solved a lot for those who didn’t mind mucking with XML, but they still wanted something easier, simpler. We’ve taken the concept of the table variable and extended it so you can declare variables of table types and parameters of table types and pass them around as parameters into your stored procedures. They’re read only.
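A minimal sketch of the table-valued parameter pattern being described, assuming a hypothetical order-lines scenario (the type, procedure, and table names are invented for illustration):

    -- A user-defined table type to carry the rows.
    CREATE TYPE dbo.OrderLineType AS TABLE
    (
        ProductID int NOT NULL,
        Quantity  int NOT NULL
    );
    GO

    -- Table-valued parameters must be declared READONLY.
    CREATE PROCEDURE dbo.AddOrderLines
        @Lines dbo.OrderLineType READONLY
    AS
    BEGIN
        INSERT INTO dbo.OrderLines (ProductID, Quantity)
        SELECT ProductID, Quantity FROM @Lines;
    END
    GO

    -- Calling it: declare a variable of the table type, fill it, pass it in.
    DECLARE @Lines dbo.OrderLineType;
    INSERT INTO @Lines VALUES (1, 10), (2, 5);
    EXEC dbo.AddOrderLines @Lines = @Lines;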
Greg Low: Worth noting the other advantage is strong typing, versus XML.
Roger Doherty: Exactly. If the input table doesn’t match the correct definition, you’ll get an error message, parameter mismatch. Helpful versus processing through the entire XML to find what’s wrong, or firing up a parser and incurring the parsing and CPU overhead associated with that. High performance, straightforward. Also a simple, two dimensional data structure. Nice thing that people have asked for. Tends to be something we talk about first. A simple development enhancement at the engine level. Another interesting enhancement is row constructors. The ability to fully specify a row of data in Transact-SQL syntax by bounding it in parentheses. Rather than having to code three insert statements with their separate VALUES clauses, just one insert statement with a single VALUES clause and three row instances to insert three rows.
Greg Low: Worth mentioning that’s also a single atomic operation, not three separate inserts.
Roger Doherty: Yes. Obvious benefits from a round-trip perspective, but syntactically tighter, could make for a bit more readable code, less verbose code at the script level. Nice syntactical enhancement. Something I think people will appreciate as it makes for a tighter coding convention.
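A quick illustration of the row constructor syntax (table and column names are invented), where a single INSERT with one VALUES clause supplies three rows in one atomic operation:

    INSERT INTO dbo.Customers (CustomerID, CustomerName)
    VALUES (1, N'Contoso'),
           (2, N'Fabrikam'),
           (3, N'Adventure Works');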
Greg Low: Someone in a group said that it would lend itself well to auto generated code but also wondered what the current restrictions are on a T-SQL statement. Is there a practical limit to the size of that?
Roger Doherty: Interesting. Those maximums we used to publish as an appendix at the end. That went away as of the SQL Server 2005 generation. I’ve wished it would come back. It helps. Handy when you’re throwing out RFPs, knowing what the technical maximum is for things. I’d have to investigate.
Greg Low: Same here. Something that had come up. A different approach rather than lots of inserts, sending something in all at once. You could run into the size of the statement.
Roger Doherty: Whether you’d want to push that to a 2TB statement or not, that’s stretching it a bit. That’s another nice little widget and feature from a Transact-SQL programmability perspective. Another thing, one of my favorites, is the ability to initialize a variable directly in the declare statement. More concise coding. Something I wanted to do in previous code but had to write declare and set statements further down. Nice feature.
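For example, initializing at the point of declaration (a small sketch with made-up variable names):

    DECLARE @Counter int = 0;
    DECLARE @StartDate datetime = GETDATE();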
Greg Low: One which I wish had gotten in, hopefully still high on the list, is the ability to declare a variable as the same type as a column. One I’d love to see sometime. In Oracle you can literally declare a variable as the type of some column. Nice to have in SQL Server.
Roger Doherty: Declare variable as a column in particular table? Inherit same type?
Greg Low: Exactly. Would simplify code and for maintenance as well. When table changes, automatically change anywhere used.
Roger Doherty: Interesting suggestion.
Greg Low: Just mentioning it. It’s on the connect site. People should go and vote for it.
Roger Doherty: Yeah. Go vote for Greg’s favorite feature. Another thing is compound assignment operators like += to increment variables. Things that have been around in other programming languages. New to T-SQL but can make for more efficient code.
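A small sketch of the compound assignment operators (variable name invented):

    DECLARE @Total int = 0;
    SET @Total += 10;   -- equivalent to SET @Total = @Total + 10
    SET @Total *= 2;    -- -=, *=, /= and %= work the same way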
Greg Low: I wonder how the parser gets on with the asterisk version, given the old *= join syntax.
Roger Doherty: The problem has a lot to do with the compatibility level you have the database set at, as to how it wants to interpret it. The parser is intelligent as to where it’s located, in the body of a select statement or elsewhere. I’m sure they’ve thought through that. Side note, there’s been a pretty extensive effort on backward compatibility in SQL Server 2008. We want to do our absolute best to not break applications as we move forward. Any deprecations or syntactical changes tend to be exposed through the database compatibility level. If we do something in the syntax that breaks your query or stored procedure, just dial the compatibility level of the database back to your release level, whether SQL Server 7 or 2000, and continue to run.
Greg Low: Do you encourage people to move to the current level though?
Roger Doherty: Yes. We don’t like to see people stuck on that for too long. It’s usually a stop gap to get initial testing in. Hopefully the coding changes aren’t so huge that they prevent people from moving to the next available database compatibility level. We do encourage people to move as quickly as they can.
Greg Low: I’ve found sites that I go to want to use things like the newer DMVs but aren’t able to, because when the databases were brought in they were left at compatibility level 80, not 90.
Roger Doherty: Then you’re not sure what’s going to happen if you bump it up. One thing we’re doing that’s not necessarily directly related to SQL Server 2008 is providing a better framework for people to test for compatibility issues when moving to a new release. We’ve promoted the idea of workload capturing. Capture a workload on the current release, SQL Server 2005 for example, replay it against SQL Server 2008 and check for differences in performance, queries failing, things like that. Running an upgrade check to scan instances is not enough, especially if you’re dealing with dynamic SQL. You need to look at the batches generated by the client application and analyze those for compatibility. That playback testing is a very useful way to do it. We’ll release content, tools and methodology around SQL Server 2008 that allow self-testing of applications, and a lot of Microsoft consultant types and folks in the field will host upgrade lab style events, walking people through the testing.
Greg Low: Topical you mention replay. My wife Mai was on site in New Zealand this weekend, replaying SQL 2000 traces against 2005.
Roger Doherty: That’s what she does for entertainment?
Greg Low: It’s amazing. She hates me describing her as a geeky wife, but anyway. Most of her friends qualify as geeky.
Roger Doherty: Especially that she was doing it on a weekend is extra disturbing.
Greg Low: Or late at night. Yes. What was complicated was matching up the database IDs between servers as there were lots of cross database queries. It dawned on me it would be useful, when going to replay a trace, to be able to say that on this replay, database ID 8 maps to that one over there. Otherwise, having two environments and matching the IDs up the same is quite hard.
Roger Doherty: Yes. A traditional gotcha in replay testing. Rather than rework how database IDs work, we’ve instead made our tools smart about preserving database IDs on the systems you’re playing back against. Yes. I’ve seen a lot of people get nipped by that.
Greg Low: Would be nice to dynamically configure as part of the replay.
Roger Doherty: Nice profiler feature. Let’s vote for that one.
Greg Low: Other question, replaying cursor based code. Does it work properly now?
Roger Doherty: Depends on how complicated the cursors are. They generate handles. Handles are unique at execution time. In general, cursors don’t replay well. That’s a limitation of the current replay architecture we use. Other things that can affect replay would include blocking behaviors, since we have to serialize things. Blocking situations you wouldn’t get into in a real world application. All this aside, replay testing can be very useful to find behavioral differences in how your application performs against a new release.
Greg Low: Hugely important to do against new versions. I was surprised it worked at all.
Roger Doherty: On that aside, getting back to what we were talking about. Compatibility levels branched us off into the whole backward compatibility talk. We’re working hard not to break your application. We’re doing a lot to provide prescriptive guidance on how to go about upgrading, to avoid the unfortunate scenario of upgrading and then having a problem in production. No one wants that.
Greg Low: Indeed.
Roger Doherty: That’s a good summary of most of the syntactical pieces of Transact-SQL.
Greg Low: Whether it fits or not, basic things like merge statements.
Roger Doherty: Yep. We did forget to talk about that. Merge is a new operation that allows you to combine insert and update into a single statement. As you’re moving rows into another table it will check for the existence of each row. If it exists, the appropriate columns are updated. If not, it’s inserted. It allows you to perform those operations in a single pass rather than multiple passes, which can have a dramatic impact on performance.
Greg Low: Yes. Also powerful are the additions to the output clause. You can get info back as to what occurred.
Roger Doherty: Auditing aspects are important. You need to know what happened. Your standard “n rows affected” doesn’t cut it.
Greg Low: Interesting one. There’s been a lot of discussion about whether there should be a way to get separate totals for how many were inserted, deleted, updated. The answer is typically you can use the output clause and then get the answer from that detail.
Roger Doherty: Yes. That’s why it’s instrumented that way.
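A rough sketch of a MERGE that captures its OUTPUT detail so separate insert and update totals can be derived afterwards (the Inventory and IncomingStock tables and their columns are assumptions for illustration):

    DECLARE @changes TABLE (ChangeAction nvarchar(10));

    MERGE INTO dbo.Inventory AS target
    USING dbo.IncomingStock AS source
        ON target.ProductID = source.ProductID
    WHEN MATCHED THEN
        UPDATE SET target.Quantity = target.Quantity + source.Quantity
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ProductID, Quantity) VALUES (source.ProductID, source.Quantity)
    OUTPUT $action INTO @changes (ChangeAction);

    -- Separate totals per action, derived from the OUTPUT detail.
    SELECT ChangeAction, COUNT(*) AS RowsAffected
    FROM @changes
    GROUP BY ChangeAction;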
Greg Low: Composable SQL. The idea that a merge or other statement with an output clause can be treated as a table and queried from.
Roger Doherty: Right. Adds new twist to concept of dynamic SQL.
Greg Low: Yes. Literally select from, and in parentheses have a merge statement with an output clause. Alias that and treat it as a table.
Roger Doherty: The benefits often mean you don’t have to create temporary data structures between stages. Code will be more complex if you use composable SQL constructs like that. But it gives the opportunity to optimize in a way you couldn’t before.
Greg Low: Yes. Incredible power in that. Being able to avoid endless things where people used temporary tables before.
Roger Doherty: Yes. There’s a tradeoff between complexity of code and efficient operation. Look at that closely and ask yourself if you need to use the construct, thinking of the next guy debugging the code.
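A hedged sketch of the composable form, where the MERGE with its OUTPUT clause is consumed directly as the source of an outer INSERT, avoiding an intermediate temporary table (table names again illustrative, reusing the hypothetical Inventory example from above):

    INSERT INTO dbo.InventoryAudit (ChangeAction, ProductID)
    SELECT changes.ChangeAction, changes.ProductID
    FROM (
        MERGE INTO dbo.Inventory AS target
        USING dbo.IncomingStock AS source
            ON target.ProductID = source.ProductID
        WHEN MATCHED THEN
            UPDATE SET target.Quantity = target.Quantity + source.Quantity
        WHEN NOT MATCHED BY TARGET THEN
            INSERT (ProductID, Quantity) VALUES (source.ProductID, source.Quantity)
        OUTPUT $action AS ChangeAction, inserted.ProductID AS ProductID
    ) AS changes
    WHERE changes.ChangeAction = 'INSERT';

The outer statement can filter on the action column, which is how you would archive only the inserted or only the updated rows without a temporary table in between.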
Greg Low: New data types.
Roger Doherty: Sure. One of the big impact areas of SQL Server 2008 from a developer perspective is new data types. Simply, we’ve got the new date and time types. Pretty simple, straightforward. First, we’ve fixed datetime and added datetime2. Much improved accuracy. Something like 100 nanoseconds, more accurate than most CPUs. Your limitation on accuracy is the CPU.
Greg Low: Yeah. Most computers can’t do more than about 18 milliseconds. The discussion is usually that sometimes we receive data from real time equipment or something with that precision. We need to store it in the database. This gives us the ability to do that.
Roger Doherty: Yes. There’s that level of accuracy now. Complete parity with the .NET Framework type and its associated accuracy. No longer an impedance mismatch. Very nice.
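A small sketch of the new date and time types (the values are arbitrary):

    DECLARE @d  date         = '2008-03-14';
    DECLARE @t  time(7)      = '23:59:59.9999999';
    DECLARE @dt datetime2(7) = '2008-03-14 23:59:59.9999999';
    SELECT @d AS TheDate, @t AS TheTime, @dt AS TheDateTime;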
Greg Low: One thing I haven’t tried yet is, in .NET code where there are no separate date and time data types, how that’s passed into a proc expecting a date or time data type.
Roger Doherty: Convert to string.
Greg Low: That’s the intermediate thing to get across?
Roger Doherty: Yep. That’s how done.
Greg Low: Any chance equivalent types will be added to the framework?
Roger Doherty: I strongly doubt that. Changing the primitive data types in .NET will take a lot of convincing. It’s a pain point on the SQL side because of migrations. Lots of people want to migrate to SQL Server. If you’re coming from a platform with discrete date and time types, it makes migration more difficult.
Greg Low: Dates aren’t too bad. You can have a zero time. But having a 1900 date has the potential of a coding problem, a magic number stored as part of the data.
Roger Doherty: Yes. It offends people’s sense of accuracy when dealing with it. Datetime2 addresses that with its range. The date range is less arbitrary than what we were dealing with; it starts at year 0001.
Greg Low: Dates get fuzzy past the 1600’s anyway.
Roger Doherty: Yes indeed. They were kind of shifting things around a bit.
Greg Low: Another comment that comes up. Is there interest in saving BC type dates?
Roger Doherty: I could see some applications that would benefit from that. I haven’t heard that asked for a lot.
Greg Low: Question is storing a negative date time. Answer is no. Wonder whether it might be interesting to have potential to go backwards four or five thousand years.
Roger Doherty: The calendaring gets pretty fuzzy at certain points. Doing meaningful calculations on dates can be challenging. Interesting idea but not something I’ve heard a strong demand for.
Greg Low: The number of people using is pretty limited.
Roger Doherty: Scientific or engineering applications as well. Historical obviously.
Greg Low: Many of those are millennia as well. Ten thousand years.
Roger Doherty: Right. Other enhancements with dates and times. The datetimeoffset type is based on datetime2 but preserves the time zone offset as part of the value. Comparisons and calculations between different time zones are meaningful. Helpful for globalized applications where you have to worry about time zones. That’s a pretty good summary of what’s happening with dates and times. Moving up in complexity, there’s the extensibility model we’re using to take the traditional RDBMS data type management functionality of SQL Server and extend it to richer types. There is another simple addition with big storage implications. The new FILESTREAM storage capability can be used to store all kinds of binary information, pictures, etc. The problem is once your average instance size gets large, you’re storing a lot in the row with other relational data. That tends to not be performant from an IO perspective.
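A minimal illustration of datetimeoffset, with made-up values; the two literals below carry different zone offsets but represent the same instant, so they compare as equal:

    DECLARE @sydney  datetimeoffset(0) = '2008-03-14 09:00:00 +11:00';
    DECLARE @seattle datetimeoffset(0) = '2008-03-13 14:00:00 -08:00';

    -- Comparison happens on the UTC instant, so values from different zones compare meaningfully.
    SELECT CASE WHEN @sydney = @seattle THEN 'Same instant' ELSE 'Different instants' END AS Result;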
Greg Low: Sweet spots? I saw one graph of Dave Lane’s. It indicated about 1.5MB.
Roger Doherty: Yes. That’s the sweet spot. Performance begins to degrade when the average instance of blob data is larger than 1MB. Traditionally, this was dealt with by not storing blobs in the database, but externally in the file system, pointing to those things with UNC names or whatever. Maybe leveraging a third party image management and retrieval system for doing that. There are obvious application development complexities and management complexities associated with that.
Greg Low: Consistency issues.
Roger Doherty: Absolutely. Transactional consistency, guaranteeing transactional capabilities when data is stored outside the engine. In SQL Server 2008, we’ve added a new attribute on varbinary(max) called FILESTREAM. I describe it as SQL Server having a new storage engine capability. Everyone is familiar with the existing one, where SQL Server takes space away from the OS, chunking it into pages and extents, managing the storage itself. Not accessible to the OS. SQL Server now has a new storage engine that uses the file system for persisting blob data. FILESTREAM. Rather than creating a data file or filegroup, when you create a FILESTREAM filegroup, a folder is created in the file system. As you store blob instance data you see crazy hierarchical folders and gobbledygook file names with no relation to your actual blob instance data. That is to be expected. Would you expect to examine your pages and extents and understand what rows those correlate with? What invoices or customers they correlate with? No. You expect SQL Server to do a sufficient job managing that and you’re happy to let SQL Server do its own thing. With FILESTREAM, it’s the same analogy. As we persist blob instances, SQL Server makes efficient use of NTFS as the storage vehicle for storing blob instances. SQL Server manages the namespace and the storage, with transactional consistency, and the interesting benefit is when you read and write blob instances you have the option of using Win32 file streaming APIs to do it, the kind of performance people want when streaming large blob data.
Greg Low: I used the sample you and Zach had with the launch stuff. Simplified the code. Not difficult.
Roger Doherty: No it’s not. It’s a different paradigm and a bit of a brain shift. If you write most of your code in T-SQL, FILESTREAM will look arcane. If you’ve done any file streaming in Win32 or managed code, it’s pretty simple. We’ve faked out the Win32 file streaming APIs, allowing SQL Server to manage the namespace for the file handles. SQL Server will manufacture a path that can be used to open a handle to the file. SQL Server manages that namespace; the path produced has no relation to the actual file or its location on disk. It’s simply a way SQL Server can communicate back and forth with Win32 to create a handle for IO operations.
Greg Low: One thing that surprised me when I first looked at the folders was that when I modified things I ended up with multiple copies of the same object. It dawned on me that that’s needed for transactional consistency.
Roger Doherty: Just as in the storage engine, if we delete rows we don’t expect SQL Server to zero all that out. It goes and cleans up when needed. Same with this. Garbage collection operates at different points in time, especially after transaction logs are truncated, things like that, to free up storage. The files managed by SQL Server are there and I wouldn’t mess with them any more than I would mess with pages and extents in MDF and LDF files.
Greg Low: That’s another reason why it’s important not to just go and try to find and access the file directly. You need SQL Server to tell you which file and which version.
Roger Doherty: What is a file logically might not be a file physically on disk in FILESTREAM. It might be split up. There’s no reason you’d want to access it directly. Always go through SQL Server to get the handle to open and write to.
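A rough sketch of the FILESTREAM pattern described above, assuming the database already has a FILESTREAM filegroup enabled (table and column names are invented):

    CREATE TABLE dbo.Documents
    (
        DocumentID uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
        FileName   nvarchar(260) NOT NULL,
        Content    varbinary(max) FILESTREAM NULL
    );

    -- Blob data can be written through plain T-SQL...
    INSERT INTO dbo.Documents (FileName, Content)
    VALUES (N'hello.txt', CAST('Hello' AS varbinary(max)));

    -- ...or, inside a transaction, SQL Server will manufacture the path and transaction
    -- context needed to open a Win32 streaming handle from client code.
    BEGIN TRANSACTION;
    SELECT Content.PathName() AS FilePath,
           GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
    FROM dbo.Documents;
    COMMIT;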
Greg Low: Moving up the tree?
Roger Doherty: In SQL Server 2005, most are aware that we did a major extension of SQL Server by adding and hosting the common language runtime for managed code. With that, we have a generic user defined type capability where you can create your own CLR types and serialize and deserialize data into them. A rudimentary type extension capability in the product. We leveraged that capability to add new richer types in SQL Server 2008. It’s an example of how the fundamental changes in 2005 are paying off in future editions of the product. The first example, hierarchyid, I describe as a nice alternative to using common table expressions for exploding natural hierarchies like organization charts, bills of materials, etc.
Greg Low: General ledger accounts.
Roger Doherty: Any natural hierarchy. When you’re dealing with common table expressions, you have to learn how they operate. There are only a few higher order beings on the planet who have figured out how to do that. Second, the performance ramifications of iterating and running those queries can be significant. People have wanted a way to deal with natural hierarchies in a straightforward way. We created hierarchyid. It’s just a binary type exposed through the common language runtime that you can manipulate in T-SQL, or you can take the client libraries and reference them in client applications. You can work with these things outside the engine in managed code. Hierarchyid gives the ability to represent, at the node level, where a row happens to exist in an existing hierarchy. The underlying binary representation you can cast back out as a string. The root node, if cast out, shows up as a forward slash, saying this is the beginning of the root, no parent. The nodes underneath are numbered. /1/ is the first direct descendant of the parent node, /2/ the second, and so on. You can have several different levels. There are methods for returning descendants, for finding parents, for manipulating the hierarchy at different levels. More interesting is indexing: if you want to do a depth first search of the hierarchy, that’s supported natively in the indexing strategy. You can index a search for that, which is exciting, considering the performance ramifications of the type of recursive querying you’d have had to do before.
Greg Low: CTEs or temporary tables. A lot harder.
Roger Doherty: Yes. The other type of search is a breadth first search, everything at a certain level. Straightforward. Grab the level of every instance and persist it as a computed column. You can create an index on both the hierarchyid and the level of that particular instance in the hierarchy, making it easy to do breadth first searches. All managers at the fourth level of the organization. Efficient search. What most relational purists will have a problem with is that there is no referential integrity automatically maintained in the system. That has to be done at the application level. Pruning nodes, moving them around, you write code to do. Not automatic. Deletes don’t cascade to descendant nodes. Take care to keep the hierarchy intact, so you’ll see most samples set up the serializable transaction isolation level to ensure that in a multi-user system you don’t step on someone else’s manipulation of the same hierarchy.
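A small sketch of a hierarchyid table with both depth-first and breadth-first indexing, along the lines described above (names are illustrative):

    CREATE TABLE dbo.Org
    (
        OrgNode      hierarchyid   NOT NULL,
        OrgLevel     AS OrgNode.GetLevel(),    -- level exposed as a computed column
        EmployeeName nvarchar(100) NOT NULL
    );

    -- Depth-first: indexing the node itself keeps each subtree together.
    CREATE UNIQUE CLUSTERED INDEX IX_Org_Depth ON dbo.Org (OrgNode);

    -- Breadth-first: level first, so "everyone at level 4" becomes a range seek.
    CREATE INDEX IX_Org_Breadth ON dbo.Org (OrgLevel, OrgNode);

    -- The string form: '/' is the root, '/1/' its first direct descendant.
    INSERT INTO dbo.Org (OrgNode, EmployeeName)
    VALUES (hierarchyid::GetRoot(), N'CEO'),
           (hierarchyid::Parse('/1/'), N'First report');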
Greg Low: So Roger, is there a life outside SQL Server?
Roger Doherty: Yeah, there definitely is. I live in Philadelphia, PA, birthplace of the U.S., where it declared its independence from the British and where the U.S. Constitution was drafted. I live in the middle of the historical district and enjoy living in that environment. Most people don’t consider the U.S. to have a lot of history in comparison to the rest of the world, but what history we do have is in Philadelphia.
Greg Low: Compared to us, we have both. We have Aboriginal people who date back some 40,000 years. But for places where people have constructed big obvious things, every time I go to a European place and they talk about it being 3,000 years old, I think wow.
Roger Doherty: Yes. I feel fortunate to have houses on my street that pre-date the Revolution. Built in early 1700s.
Greg Low: Obviously of good material.
Roger Doherty: Yes. Red brick. We have strict standards here in Philadelphia about maintaining original structures. Unlike big brother New York where they tend to knock stuff down and put up big skyscraper in its place, we try to have it hang around a bit here.
Greg Low: You must not have much in terms of natural disasters.
Roger Doherty: Not typically. We have flooding in different areas, but Philadelphia itself, inland, with inland waterway the Delaware River, we don’t have a lot of natural disaster. Worst thing here is likely a good Nor’easter that blows up the coast in the wintertime. That’s about it.
Greg Low: No great urge to relocate to Redmond?
Roger Doherty: They’ve been trying to get me to do that for a long time. I’ll say the diplomatic thing. Redmond is a lovely place to visit but I’m not sure I want to live there.
Greg Low: That’s great. Back on the new data types. Sparse columns.
Roger Doherty: Yes. The way I talk about sparse columns and column sets is along the lines of a utility for data structures that were never intended to be worked with by humans. Everyone is familiar with SharePoint. SharePoint, the major portal product at Microsoft, uses SQL Server as its data store. Most data structures that SharePoint creates aren’t man-made data structures. They are generated as a result of the portal software. They can have an ungodly number of columns. SharePoint ran into column limitations quickly, so it had to come up with its own storage engine over the top of the storage engine to span and split rows into the database. Anyone trying to work with SharePoint data directly out of SQL Server knows that those data structures were never intended to be directly manipulated.
Greg Low: I haven’t done so but have heard comments.
Roger Doherty: Yes. Don’t do it yourself. Go through the SharePoint APIs to manipulate SharePoint data. Bottom line, that type of data structure is not uncommon in software design for machine generated data structures, for engineering and scientific applications. Not all data structures are intended to be used by humans. They can have many columns. Most likely much of the data will be null. A column might have a value only two to three percent of the time. In the existing storage engine, SQL Server 2005, we would have allocated storage for the null values. With sparse columns and column sets, you can have many columns. The theoretical maximum escapes me, but it’s very large.
Greg Low: I’ve heard numbers like 100,000.
Roger Doherty: Yes. A ridiculously large number of columns. We deal with them as column sets. We assume they’re sparse. We don’t allocate storage until there’s a value somewhere, and then we’ll write out the value. We don’t waste space on millions and millions of null values that never needed storage. A handy addition to the arsenal in dealing with complex data structures.
Greg Low: Also, the XML way of reading and writing them is important as well.
Roger Doherty: Yes. The access model for working with these column sets needs to be different because of the large number of columns and the different data manipulation you need to do. There are extensions to T-SQL and XML for working with the columns. Efficient.
Greg Low: If you select * from the table, the last thing you want is 100,000 columns coming back. It’s important for people to understand you get the normal columns, and the column set, one column of XML with those sparse columns that have values. Otherwise it would be gigantically wide.
Roger Doherty: If you need to materialize them in the standard way you can state the columns explicitly and get real columns back.
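A hedged sketch of sparse columns and a column set (table and column names are invented for illustration):

    CREATE TABLE dbo.Properties
    (
        PropertyID int PRIMARY KEY,
        -- Sparse columns use no storage when the value is NULL.
        Colour nvarchar(20)  SPARSE NULL,
        Weight decimal(10,2) SPARSE NULL,
        -- The column set surfaces all non-NULL sparse values as one XML column.
        AllProperties xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
    );

    INSERT INTO dbo.Properties (PropertyID, Colour) VALUES (1, N'Red');

    -- SELECT * returns PropertyID plus the XML column set, not every sparse column.
    SELECT * FROM dbo.Properties;

    -- Individual sparse columns can still be named explicitly.
    SELECT PropertyID, Colour FROM dbo.Properties WHERE Colour IS NOT NULL;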
Greg Low: Interesting question. I endlessly discuss in SQL trainer newsgroups that students don’t have an interest in XML coming in the door. Trainers don’t see strong interest. They tend to be light on wanting to cover XML. I keep suggesting there is more and more use every day.
Roger Doherty: I agree. In the early adopter initiatives for SQL Server 2008, we’re turning the crank on all the XML capabilities in the product to get people to take another look and see how powerful and important it is, that our implementation isn’t a bolt-on of XML but is integrated into the query processor, with indexing capabilities for handling it. The more applications I see leverage the type, the more people are understanding how powerful it is.
Greg Low: I also ask if they have interest in logon or DDL triggers, event notifications.
Roger Doherty: Sure. We use them in our own tools, so basic familiarity is helpful. Old habits die hard.
Greg Low: Filtered indexes.
Roger Doherty: The idea behind filtered indexes is the old problem of lopsided heuristics for selectivity. A million rows in a table with 98 percent the same value, and I have little or no selectivity in the table. You can filter out the common values and create an index with its heuristics on the values where there’s variability.
Greg Low: I can see great benefit. It’s an endless discussion I used to have where I differed with training materials, the old argument about bit columns and indexing them being pointless as there’s no selectivity. But every time I looked, the example was male versus female. In business applications I found the reality was finalized and non-finalized, and the un-finalized rows were what I was looking for, a fraction of a percent. Filtered indexes help. I still wanted an index to find those few very fast. Now I can build an index with just those.
Roger Doherty: Exactly. Another nice tool in the arsenal. Not necessarily a programmability enhancement, but it affects performance for the queries we deal with, on columns that are lopsided.
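A minimal filtered index sketch for the scenario Greg describes, assuming a hypothetical Orders table with an IsFinalized flag where only a fraction of rows are still un-finalized:

    CREATE NONCLUSTERED INDEX IX_Orders_Unfinalized
    ON dbo.Orders (OrderID)
    WHERE IsFinalized = 0;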
Greg Low: Excellent. Other big thing is spatial.
Roger Doherty: Yes. I understand you have other talks coming up, so we’ll touch on that here and give your future interviewees room for deeper discussion. We extended the type system in SQL Server using the CLR to support spatial types. Two types. Geometry and geography. Geometry is a standard, two dimensional planar type that corresponds to a lot of the geospatial systems out there. A lot are two dimensional systems.
Greg Low: Flat earth, where we’ve taken a round earth map and flattened it out. Distorted by nature.
Roger Doherty: If you’re dealing with a small surface area, that’s not bad. If you’re dealing with large surface areas, you lose accuracy quickly. GPS systems use more of a geodetic coordinate system that accounts for the curvature of the earth. The geography type corresponds with that and accounts for curvature in distance calculations and things like that. When you talk about these, the important thing to set up is that there isn’t anything to be afraid of with these new types. They’re a lot easier to leverage than most understand. If you don’t have experience with GIS applications, this isn’t about loading every map in the universe into SQL Server. I would encourage you not to do that. There are great mapping and rendering services on the internet that do everything but massage you and slice your bread. Virtual Earth for example.
Greg Low: Something mentioned is integration with Virtual Earth. We can build mash ups using Virtual Earth. Any other level of integration?
Roger Doherty: Yes. Using the Virtual Earth web services to build your application. That’s the integration. I chuckle at that. We’re not calling Virtual Earth out of the engine to do things. What it does mean is if you’re passing data back and forth, persisting geospatial data in SQL Server, you can use it to render things on top of Virtual Earth. It’s a common thing where you want to visualize spatial data you’ve persisted, and you do that in Virtual Earth. There are APIs for that. There’s a sample for drawing polygons on maps using Virtual Earth. You draw the polygon and it generates an insert statement to insert the polygon into a SQL Server geography type. The integration is ensuring we support the same coordinate systems, making sure it’s apples to apples when working with geospatial data. We support the same systems as the most popular GPS devices and Virtual Earth do. We’re not on a completely different coordinate system. We also support the Open Geospatial Consortium’s well known types and methods for manipulating spatial data. Well known text and well known binary representations of spatial info, whether a point or a polygon, line, etc. There are standards that govern how they’re represented as a string and as a binary instance. We support both.
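A small sketch of the geography type using the well known text format discussed next (the coordinates are approximate and purely illustrative; SRID 4326 is the WGS 84 system used by GPS):

    -- Well known text puts longitude then latitude, separated by a space.
    DECLARE @adelaide geography = geography::STGeomFromText('POINT(138.60 -34.93)', 4326);
    DECLARE @sydney   geography = geography::STGeomFromText('POINT(151.21 -33.87)', 4326);

    -- Distance in metres, accounting for the curvature of the earth.
    SELECT @adelaide.STDistance(@sydney) AS DistanceInMetres;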
Greg Low: At the launch, that’s something I did by hand. I found the well known text format frustrating in that they do things like use spaces to delimit the different parts of the latitude and longitude, rather than commas.
Roger Doherty: Yes. The same approach as everyone else building their first demo. You learn quickly that you only want to do that once, never again.
Greg Low: That leads me to the next issue, getting data into SQL Server from various spots. GML, XML. In the market, most data is in things like shape files. Is there anything happening to help make those transitions easier?
Roger Doherty: Yes. Third party solutions are available. Two are common, and they’re doing a lot of integration work with SQL Server now. One is a SQL Server Integration Services add-in, the other a full suite of data manipulation tools. A Swiss Army knife for moving geospatial data. There’s nothing in the box with SQL Server. We’re counting on partners for that. If you go to my blog, blogs.msdn.com/rdoherty, I’ve got a geospatial post that references the tools from our partners. Having SQL Server there supporting geospatial data isn’t enough. You don’t want to build your own mapping engine. Use a web service like Virtual Earth for that. You don’t want to write your own data migration and data transformation tools. There are great third party solutions for that.
Greg Low: Indexing work to find data quickly.
Roger Doherty: Yes. It’s not just about storage. Anyone can create a user defined type. It’s about integration into the query processor and indexing. Every type we’ve talked about today, hierarchyid, XML, and spatial, all have their own indexing capabilities built into the storage engine. For spatial, we have a multi-tiered grid based indexing strategy that gives you the ability to fine tune the selectivity of a search based on surface area. You’re able to have multiple indexes. Not just an index for the entire planet if you’re only searching for Adelaide. You could have one index dealing with all of Australia, but to improve selectivity, a second index with a smaller grid size for Adelaide that gives improved selectivity against that region. It’s all engineered to work with the underlying B-tree storage structures. We’ve added options for tuning each level of the grid.
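A hedged example of the grid-based spatial index being described, assuming a hypothetical dbo.Places table that has a clustered primary key and a geography column named Location:

    CREATE SPATIAL INDEX SIdx_Places_Location
    ON dbo.Places (Location)
    USING GEOGRAPHY_GRID
    WITH (GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = HIGH, LEVEL_4 = HIGH),
          CELLS_PER_OBJECT = 16);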
Greg Low: That brings us up to time. Main other thing, where will people see you? Calls to action?
Roger Doherty: Yes. Check out my blog. My colleague Zach blogs at blogs.msdn.com/zowens. We both work on driving early adoption of SQL Server 2008 worldwide through programs like the SQL Server 2008 Jumpstart and the SQL Server Metro program. If you’re participating in those, you’ll run into me or Zach. Come up and say hi. We’ll get a chance to talk.
Greg Low: TechEd?
Roger Doherty: I won’t be there. It’s so well supported, they don’t need me piling on. We’ll be traipsing worldwide doing instructor led training while TechEd is going on.
Greg Low: Thank you for your time, Roger.
Roger Doherty: Thanks Greg. I appreciate it.