Peter Myers
SQL Down Under Show 19 - Guest: Peter Myers - Published: 19 Sep 2006
In this show SQL Server MVP Peter Myers provides an introduction to SQL Server Analysis Services for developers and DBAs.
Details About Our Guest
Peter Myers is a mentor and I was going to say based in Melbourne, Australia, but I’d say more loosely based in Melbourne, Australia, given the fact I see Peter in various places all over the world.
Show Notes And Links
Show Transcript
Greg Low: Introducing show number 19 with guest Peter Myers.
Our guest on this show is Peter Myers. Peter is a mentor and trainer with Solid Quality Learning. I was going to say based in Melbourne, Australia, but I’d say more loosely based in Melbourne, Australia, given the fact I see Peter in various places all over the world. So welcome, Peter.
Peter Myers: Thank you, Greg.
Greg Low: So maybe again like everyone else, I’ll get you to describe how you came to be involved with SQL Server.
Peter Myers: As I scratch my head I have to think about it. I think this is my 10th year with SQL Server. My background is a bachelor of business and economics. At school I had no idea about computers and no interest in them. My expertise was in bulk shipping, chartering vessels and operating ships, out of Gladstone in Queensland of all places…
Greg Low: In another life I worked for HP and used to often be involved in ships and things out of Gladstone. Intriguing.
Peter Myers: Not the most scenic town, but great shipping operations. Being an expert in shipping, I got involved in writing a shipping scheduling system for a shipping company. Access 97 was the tool. I spent 80 months building this database application, really sifted through books, learned what I could, and couldn’t get enough. Once I finished it in Access I decided, well, I could have a career in shipping, but that didn’t hold much promise; there’s only so far you can go in this country. IT provided a passport that crossed all industries, so I thought, this would be great. At that point I went for the IT jobs, and found not many people were going to take a shipping expert who had worked in Access. So I went back to India, where I had travelled and worked before, to consolidate my knowledge and achieve certifications; it was SQL Server 7 at the time. I made the switch over there, became certified, gained experience, came back to Australia, and landed that job.
Greg Low: With shipping, did you find the industry as a whole was forward-looking? In my involvement it seemed behind the times.
Peter Myers: What you find is that any job that’s going to be interesting or hold responsibility or promise is going to be overseas; you’d need to move to Singapore or somewhere like that. The limitations were there, and I asked myself, could I do this for another 30 years? The truth was no, I couldn’t. It was great work, it got you out onboard ships, working with all sorts of things, but it was twenty-four/seven work as well. Once I got a taste for IT, that was the path I wanted to take.
Greg Low: One story I recall from working with various ships was a thing that led me to see how people have blind faith in computing. I was doing work for one company up in Mackay, north Queensland, toward the top end of Australia, on some HP3000 minicomputers, and got a call from the office: “can you have a look at this old…” it was a 98-10 or something, an old HP calculator. I’d seen one in the office but had no real idea about it, so they shipped me one up to play with for a couple of hours, to try to fix one that was coming in on a coal ship. What had me intrigued was that I was on a coal-loading facility a few kilometers out into the ocean, wondering how I was going to get on the ship. A big crane thing turned around and dropped down, and I had a joy flight through the air across onto the ship. I was intrigued that they were using this calculator for stress calculations on the ship. The thing didn’t have parity checking; you could do the same calculation twice and be out by hundreds of tons. The beautiful thing was they had complete faith in it: do the calculation once and use the result.
Peter Myers: One hundred tons isn’t here or there.
Greg Low: It just had me fascinated. I had visions of this breaking in the middle of the ocean on the way back. The people who had written the application in Liverpool years ago had since left the company; no one knew how it worked, but they kept using it. You could’ve rewritten it on a hand-held calculator and it would’ve done a better job.
Peter Myers: Sounds like a business opportunity for you. Hay Point.
Greg Low: Yes, Hay Point, Mackay. What about Analysis Services, how did you get involved with it?
Peter Myers: In 2000 I worked on smaller projects and got my hands dirty with Analysis Services 2000. I didn’t get involved with it in version 7; it was embryonic in those days. Then an opportunity for training with Microsoft came up. Terry Clancy championed a zero-to-BI course and needed a trainer. I had worked with the product before, so I figured I could do a training unit. Greg, you were in one of my classes.
Greg Low: Yes, sat in it before, it was great.
Peter Myers: Working with it, I found that since I had the training, I ended up doing more repairs than construction. It was my first taste of a multi-dimensional system, and it demystified what a multi-dimensional database could be. When you come from a relational background, you think of the days when you were learning primary keys and building knowledge incrementally. You know on the horizon there’s a multi-dimensional database, and you say, oh my god, it’s going to be an order of magnitude more complex than a relational database could be. It was great to get my hands dirty and discover they’re not as complex as they sound.
Greg Low: I enjoyed the course. Maybe for the developers or DBAs listening to the show who haven’t been involved with it: what’s the interest in Analysis Services, why does it matter?
Peter Myers: Analysis Services is an OLAP system. Most DBAs are familiar with OLTP (online transaction processing): writing and designing systems that are optimized for write-intensive operations. The other side of the coin is OLAP (online analytical processing), where we retrieve that data. We can all write a select statement, no problem, but can we write one that’s going to be efficient in how it retrieves information? Give me sales by month, by sales region, by product, by sales person. If you think of a relational select statement, that’s a GROUP BY with a SUM of sales producing a single value, but there could be millions of records coming from the table in the database. We can appreciate that retrieving and caching the data and presenting that single value is expensive, and under high concurrency it will impact other users wanting to share database resources. This is where the OLAP side fits in: these systems are designed and optimized to handle high-level aggregate queries. By definition, an OLAP engine provides fast aggregate results and flexible querying. Supplement that with a calculations layer so we can embellish it with interesting calculations like year-to-date or moving monthly averages, and that’s what an OLAP system, and Microsoft’s implementation in Analysis Services, has been able to achieve.
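The kind of relational aggregate query Peter describes might look like the following sketch (table and column names are purely illustrative): even though it returns only a handful of summary rows, it can have to scan millions of fact rows to produce them.

```sql
-- Sales by month, region, and salesperson (illustrative schema).
-- Returns a few summary rows, but may scan millions of detail rows.
SELECT d.CalendarMonth,
       t.SalesRegion,
       e.SalesPerson,
       SUM(s.SalesAmount) AS TotalSales
FROM Sales AS s
JOIN DateDim      AS d ON d.DateKey      = s.DateKey
JOIN TerritoryDim AS t ON t.TerritoryKey = s.TerritoryKey
JOIN EmployeeDim  AS e ON e.EmployeeKey  = s.EmployeeKey
GROUP BY d.CalendarMonth, t.SalesRegion, e.SalesPerson;
```

An OLAP engine answers the same question from pre-aggregated, cached data rather than re-scanning the detail rows on every request.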
Greg Low: So you started to touch on what sounded like dimensions. Time for definitions I think.
Peter Myers: I like to think of it with clients at requirement-gathering time: “What do you need from your system, what information will drive your decision-making process?” I listen for a small word, “by.” They need information by time, by geography, by account, and that usually introduces the concept that a dimension is required to constrain the result coming back from the system. The classic dimension would be time. Think of a sales system: constrained by product or store, by customer. These “bys” introduce dimensions, which are hierarchical in nature. A product dimension has levels, with categories of product, which introduces the ability for the OLAP engine to aggregate efficiently at different levels. It comes down to the design of the dimensions themselves. That word “by” introduces a dimension, or a level within one. You might hear a client say “I need to see it by product category” or “by individual product”: two potential levels within the product dimension itself.
Greg Low: So usually the most interesting dimension is time.
Peter Myers: It’s always a common dimension in a cube; I’ve never come across a cube that doesn’t care about time. You’ll find time is a dimension that often supports one or two hierarchies. Financial organizations might provide a fiscal way to drill down through the data, through a hierarchy like fiscal year to fiscal quarter to month to date, giving you granularities to assess the data at. You might also consider a calendar hierarchy or a manufacturing hierarchy. Analysis usually centers around time-based analysis, so the time dimension is the first consideration in design.
Greg Low: That’s dimensions. Other terminology: throw the word measure in there, what we’re interested in analyzing. And the Star schema concept, familiar to some; in every Microsoft presentation they show pictures of Star and Snowflake schemas.
Peter Myers: Let’s demystify that; the names give some clues about what they mean. Consider relational databases: we’re familiar with normalized structures, those spider-web entity-relationship diagrams. The design process for data warehouses is instead to use a Star schema. It says: let’s concern ourselves with these dimensions, held in dimension tables. The other type of table is the fact table, which is responsible for storing the measures of interest. Think of measures as the numerical values, usually aggregatable, that we wish to analyze. Referring back to the sales example, we have a database with time, product, and customer dimensions. What are we interested to know at the intersection of those dimensions? Sales dollars and units sold would be classic measures. In a Star schema, we introduce a fact table with these two columns; the additional columns are dimension keys, so the measures can be broken down by a particular product, customer, or date. That’s the concept of a fact table. The reason it’s called a Star schema is that if you imagine the fact table sitting at the center of the star, the dimension tables form the points; in our example, three points being time, customer, and product. You’ll also find these dimension tables are denormalized. Denormalization is an optimization for the read side of a system: with less need to join data, you get better read response. So usually in our designs those dimension tables will be denormalized. A time dimension table, populated with every day you expect the system to analyze across, would be denormalized: you would have columns for calendar year, calendar quarter, month, and date, and you’d expect to see repeating values in the year and quarter columns. If you wanted to normalize that, it’s not usually recommended practice, but if you did, you might find the product dimension gets broken into a category table with a one-to-many relationship to a product subcategory table, in turn with a one-to-many relationship to a product table. Together those three dimension tables would form a snowflake: imagine that point on the star has tables moving out from it with those relationships, and it begins to resemble a snowflake.
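As a rough sketch of the star schema just described (all table and column names are hypothetical), the fact table carries the measures plus dimension keys, while the denormalized dimension tables hold the descriptive, repeating attributes:

```sql
-- Denormalized dimension tables (note the repeating year/quarter values
-- you would see in DimDate rows).
CREATE TABLE DimDate (
    DateKey         int PRIMARY KEY,
    CalendarDate    date,
    CalendarMonth   varchar(20),
    CalendarQuarter varchar(10),
    CalendarYear    int
);
CREATE TABLE DimProduct (
    ProductKey  int PRIMARY KEY,
    Product     varchar(50),
    Subcategory varchar(50),   -- splitting these two columns out into
    Category    varchar(50)    -- their own tables would form a snowflake
);
CREATE TABLE DimCustomer (
    CustomerKey int PRIMARY KEY,
    Customer    varchar(50)
);

-- Fact table at the center of the star: measures plus dimension keys.
CREATE TABLE FactSales (
    DateKey     int REFERENCES DimDate(DateKey),
    ProductKey  int REFERENCES DimProduct(ProductKey),
    CustomerKey int REFERENCES DimCustomer(CustomerKey),
    SalesAmount money,
    UnitsSold   int
);
```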
Greg Low: That’s good. In terms of other definitions: when we go to query the things we then build with aggregated values, we end up with select statements where the word select looks the most familiar and the rest tends not to, as we head into multi-dimensional extensions to the language. What’s the basic idea there?
Peter Myers: Moving forward in the progression there: when we query the Star schema, we would still be using the relational select statement. The Star schema is the ideal starting point for constructing a cube. After we’ve built it and defined the dimension objects and measure groups off the Star schema, we go through processing: we provide an instruction for Analysis Services to go and process. The default is a MOLAP configuration (which stands for multi-dimensional OLAP): it goes and extracts all the data from that Star schema and compresses it into an external cache that lives on the file system, optimized for querying against. Queries are then directed to Analysis Services itself. The query language we use for Analysis Services is MDX, which stands for multi-dimensional expressions; some people will refer to it as… be aware there are two interpretations of the acronym. As far as the similarities go, when MDX was designed it borrowed as much as it could from the relational select statement, which is both good and bad. We’re familiar with SELECT and FROM and WHERE, but they play significantly different roles in MDX. Sitting in the select statement is not a comma-separated list of columns, but rather definitions of what should appear on the axes. You need to recognize you are querying a multi-dimensional structure and bringing back a sub-cube that can have its own dimensions. In the query you define the dimensions that will come back, and commonly you work with two: from a human consumption point of view, working with three is difficult to represent or interpret.
Greg Low: Anything more than three difficult to imagine.
Peter Myers: I’d love to demystify it. A multi-dimensional database can be complex because it’s there to support what could be a great deal of complex analysis, but the way you retrieve the data doesn’t have to be complex; there are humans at the end of the chain who have to consume it. When we construct the first clause of our select statement, saying what will reside on the axes, the two we’re concerned with are columns and rows. We define members on them, drawn from the dimensions themselves: on the columns, give me all the quarters for 2003; on the rows, give me the product categories we have. Having defined what members live on the axes, we need to say what we want to see at the intersections. So for the bike category in Q3 2004, what do we see? That’s the select part, and we haven’t introduced what we want to see yet. The FROM clause is straightforward: you query from a single cube. You can’t join cubes the way you can join tables; you say it’ll be from this cube, end of story. Then the WHERE clause is different from the relational world: you can’t get so fancy with logical operators. What you define is a single tuple, which is really an address of a cell in the cube. To define a tuple, with all the dimensions sitting on the cube, you lock in a member from each dimension. For example, give me Q1 2003, product category bikes, customer XYZ, and the measure of interest, sales amount. If you lock in those members on each dimension, you will return a single cell from this multi-dimensional structure, which gives you the answer you’re looking for. The WHERE clause consists of a single tuple, referred to as the slicer; it’s how we filter. So with time on the columns and products on the rows, we could filter with a tuple that says get me customer XYZ and sales amount; that defines what appears at the intersections of those two axes. It’s not as difficult as it sounds; it’s a matter of playing around, seeing the result, and saying “ah, if I change that, that’s what happens.” That’s essentially how the WHERE clause works.
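Putting those pieces together, a minimal MDX query along the lines Peter describes might look like this sketch (the cube, dimension, and member names are invented for illustration):

```mdx
-- Quarters of 2003 on the columns, product categories on the rows;
-- the WHERE clause is a single tuple (the slicer) that fixes the
-- customer and the measure shown at each intersection.
SELECT
    [Date].[Calendar].[Calendar Year].&[2003].Children ON COLUMNS,
    [Product].[Category].Members                       ON ROWS
FROM [Sales]
WHERE ( [Customer].[Customer].&[XYZ], [Measures].[Sales Amount] )
```

Each cell in the result is the sales amount for one quarter/category combination, already filtered down to that one customer.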
Greg Low: The other key element is how we get the data into the Analysis Services database in the first place.
Peter Myers: That’s easier than MDX. If we’re talking Analysis Services 2005, when we construct the cube one of the cornerstones is a new object called the data source view. You create a project in BI Development Studio, the only way to create cubes in 2005, and the first part of the recipe says define a data source. Nothing different from 2000 there: point it at a data source, provide the server name and credentials. The second part is to design the data source view. The way to think about this is as a virtual schema. You design this view based on one or more data sources, and that’s one of the first benefits of working with a data source view: we can use it to bridge different data sources together. You might have an Oracle or Access database and a SQL Server database; no problem. Define three data sources, and in the data source view introduce objects as tables or views from those three data sources, then additionally define the relationships that link those tables. I’ll add the note that if you’re relating systems like this, the usual approach is to create a data warehouse and use an ETL (extract, transform, load) process with Integration Services to combine them into a single data source and build the cube off that; but the data source view does provide this provision. The next benefit is for when we’re not happy with the actual structure of the underlying system. Database owners aren’t happy when you say “for my OLAP system I need this column,” or “I need you to rename it to something that works for me.” The data source view provides a level of abstraction: you can override what the underlying system defines for table and column names, embellish it with your own calculations, and introduce named queries, select statements that behave as tables in your new schema, the data source view. So the benefits are that multiple data sources can be linked; that you can modify and clean up the underlying source without modifying the actual data source; and that you can work disconnected: once you’ve built the data source view, you can disconnect your laptop, go home, and continue to design and develop your cubes.
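A named query in a data source view is just a select statement that behaves like a table in the virtual schema. As a hedged sketch (the source table and column names are invented), one might rename columns and add a calculation without touching the underlying database:

```sql
-- Named query "Product" in the data source view: renames columns and
-- adds a derived Margin column, leaving the source table untouched.
SELECT ProductKey,
       EnglishProductName       AS Product,
       ListPrice - StandardCost AS Margin
FROM dbo.DimProduct;
```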
Having built this data source view, the cornerstone of your design, you build your cubes on top. Once the cubes have been designed, you go through processing. Analysis Services has three storage modes; the most common is MOLAP, mentioned earlier. At process time it goes and retrieves the data via the data source view structure, compresses it, and stores it in the file cache. That works for dimension members and for all the facts in the fact table, which might comprise sales amounts and quantities. The idea with MOLAP is that it’s preferred because it provides the best performance at query time.
Greg Low: Because of pre-aggregated values?
Peter Myers: There are two parts to the process. One is that it extracts the detail values straight from the fact table records. Second, according to your design, it will aggregate data as well. Aggregation is what buys you performance; it’s actually the heart and soul of an OLAP system, the equivalent of what indexes achieve in the relational world. So two things happen at processing time: extract the base data from the fact table, and build the designed degree of aggregation. Analysis Services permits you to design aggregations constrained by disc space or performance gain, or designed on usage patterns: there’s a logging feature in Analysis Services that will capture queries, and it will use intelligent algorithms to assess usage and find aggregations. When you stop and think about a multi-dimensional structure, even in a simplistic situation you might have 10 products, 365 days, and half a dozen customers. When you multiply out all the possible members and consider all the levels of the hierarchies, the number of cells in the cube is almost beyond comprehension. If you pre-calculate and store all those aggregates, you introduce data explosion, the historical weakness of OLAP. So be intelligent and use designs that selectively aggregate; Analysis Services is smart enough to reuse those, performing some aggregation at query time and relying on others that have been pre-calculated. Back to the process: it will extract all the fact data and, according to the aggregation design, calculate and store those aggregations. That’s MOLAP: great performance, but it suffers the typical problem of latency; the data is only as current as the last process. That historical weakness lends itself to saying, if it’s important to have real time, look at the other end of the spectrum, which is ROLAP (relational OLAP). Your fact data remains in the Star schema, aggregates can be calculated and stored in the relational world, and you’ll get a real-time response, but don’t expect it to be fast.
Greg Low: Don’t look for it in a hurry. A short mention for HOLAP, the hybrid option?
Peter Myers: HOLAP means leaving the details in the fact table, while the aggregates are calculated and stored in the multi-dimensional world. It sounds like the ideal compromise, recognizing that queries are hardly ever directed down at the leaf level, the fact grain you have. The only problem is that it’s a nice theory: the processing still has to build the extra aggregates, much as it would have in MOLAP. And even if you thought it was a good compromise, users will be confused that some queries are fast and some are slow, as it has to teeter between the two systems.
Greg Low: What are you willing to share with us about yourself, hobbies, where you live?
Peter Myers: The life behind the presenter; I struggle to find it. For the past two years, with the Yukon ramp-up and release, I’ve been amazingly flat out. That has been my choice, not always…. I’m quite greedy to continue learning, and the BI space is exciting; I’ll return to that later. Some background about myself: I live in Melbourne, when I’m here. I’ve purchased a property and I’m going through renovations, which is taking up a fair amount of time. I’m pretty much a homebody. I play piano; I was classically trained for 20 years. That’s my refuge. I share this story with friends: when I’m working as a developer, and I do a fair bit of developing, you know, Greg, when your mind seizes up, like the logic is too much.
Greg Low: Part of it is I need to get up and walk around.
Peter Myers: What I find is one half of the brain just needs a rest, and the piano sits opposite my desk. I sit there and play for 10 minutes, and it’s amazing: you switch back, and the other half of your brain has worked it out for you. Music and IT and logic are complementary; they work very well together.
Greg Low: We’ve had a few shows where we’ve talked about that; guys like Carl Franklin talk about it endlessly. Look at the number of people in music who are also in IT development. There has to be a direct relationship; too many people play.
Peter Myers: It’s something about how the mind works. Music is in a sense purely mathematical, and having trained in theory, yes, you could break it down to a mathematical formula, but interpretation is where the human side comes in. It’s rewarding. Although I have a TV at home, it probably gets turned on once a month. I would rather sit down at the piano and interpret music; it’s more of a pastime than having entertainment pushed at me.
Greg Low: I’ll admit most of the music I like is from the 70s and early 80s. I was playing hippie rock in bands; I’m more laid back now, and things like James Taylor tend to appeal greatly at this point.
Peter Myers: You were in a band yourself.
Greg Low: For many years. I was pleased to have bought some tickets to Elton John when he was here in December; I saw him last time, great fun. Eric Clapton is also coming in February, but it’s already too late to get tickets, even at $300 for two tickets behind the stage. I could buy 10 of his albums for that.
Peter Myers: And the thing is, those tickets will sell. It’s an economic boom time. We’re seeing this in an IT sense too: the supply of quality IT resources just isn’t keeping up with demand.
Greg Low: Back to Analysis Services: the other thing is, what’s different in 2005 compared to 2000? I’ve told people this, and I’ve seen you at too many road shows, but what’s the difference?
Peter Myers: There’s an acronym, UDM: unified dimensional model. We had cubes and multi-dimensional databases in SQL Server 7, but they didn’t become mature until Analysis Services 2000. We need to appreciate that Analysis Services 2000 was a hot product, bundled together with the SQL Server license, but limiting when it came time to report from it. It met the requirements of flexible and fast querying for aggregate-level queries, but when it came to rich reporting it let us down. Say we had dimensions of time, product, and customer. I could construct a report from that and break it down by product category, but I also want to know what the product color was. Hold on: dimensions in 2000 consisted of hierarchies only. You could have had one that was category/subcategory of product, but where did color fit into all that? There were workarounds in Analysis Services 2000: we could introduce member properties and build virtual dimensions, but there were prices to pay for doing this; at the end of the day, scalability and performance would lose out. That’s been addressed in the re-architecture of Analysis Services 2005. Essentially the product has been re-written: the entire engine, and the architecture of cubes and how they’re designed, has changed.
Greg Low: And certainly all the tools themselves. I thought in 2000 the tools didn’t look like a Microsoft product, maybe.
Peter Myers: I don’t miss those modal dialog windows; to address what you needed to do there, it was clumsy working in the designer. That’s gone. We now have an integrated development environment that developers will appreciate, a rich environment: Business Intelligence Development Studio, as we refer to it from a SQL Server point of view; essentially the SQL Server license gives access to Visual Studio. It’s the development environment for the whole BI platform, from integration to analysis to reporting. The development environment’s a great start: a more professional environment promoting working with teams and source control for the meta-data. Moving on to…
Greg Low: I was going to ask, actually: for the databases themselves, are there any specific considerations in terms of hardware or resources compared to “normal” databases? I use the word “normal” loosely: OLTP databases.
Peter Myers: There are considerations here. You need to appreciate that typically query requests to an OLAP engine are going to be more diverse than in our OLTP systems. Recognize that OLTP is write-intensive and the retrieval from it is generally minimal, yet imagine an OLAP server, the databases it’s hosting, and in turn the cubes in those databases: one user might request data across the last five years on one axis, while another simultaneously wants to know what’s going on across all the sales territories. While in the OLTP world striving for hardware that will meet a cache-hit ratio of 90 percent is pretty achievable, here you find the inverse: the vast requirements to retrieve data across this multi-dimensional system mean the ratio is significantly lower, and you accept that. One vital ingredient when factoring hardware requirements is memory: how much? It’s really important from a querying perspective to cache the entire database if possible, though that’s not possible for huge cubes. Processing also puts a burden on the machine: you will want lots of memory to retrieve data from source systems, and to process the data and build the designed aggregations you want processor grunt. So you’ve got to size the equation, factoring querying versus processing, and it’s a how-long-is-a-piece-of-string question: what size is your cube, how many users do you have, how diverse are the queries? Another consideration for OLAP is the 64-bit platform. The server escapes many of the memory limitations it would have had on the 32-bit platform, and OLAP databases are memory-hungry. It’s a consideration to have.
Greg Low: The other thing, on an allied note: the different editions of SQL Server. When we’re buying licenses, there seemed to be a big impetus in 2000 to get Enterprise Edition for some of its capabilities compared to Standard Edition. What about 2005?
Peter Myers: From a relational point of view, the difference between Enterprise and Standard has been the scalability and availability feature set, and that still applies to the BI set. Focusing on Analysis Services tonight, the availability/scalability distinction still applies. When it comes to processing time, Standard Edition says you may only have one partition, which is the storage for your, I’m going to say cube, but really measure group, since cubes support multiple fact tables in 2005. So where you find that processing takes time, with Enterprise Edition you could use a partitioning strategy. That says: let’s use partitions based on a time slice, with January sales in this partition and February sales in that one. At processing time you only need to touch and reprocess the partitions where you know the data has changed. You get that with Enterprise Edition; it’s one of the key features. Others include proactive caching. That’s a way of saying: we recognize the issue with MOLAP, that it has a latency problem and is only as current as the last process. Using a proactive caching policy you can say that, according to notifications, the underlying data sources can tell Analysis Services the data has changed, and it will automatically re-process itself. You can bring the best of ROLAP into MOLAP: great query performance, near real-time data. Perspectives are the other compelling reason; moving on to those: building these cubes, I touched on the concept of the unified dimensional model. I just want to add that the unified dimensional model provides the best of relational and dimensional reporting…
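The time-sliced partitioning Peter mentions is typically set up by binding each partition of a measure group to its own source query; a sketch with invented table and key names:

```sql
-- Source query for the January 2006 partition of the Sales measure group.
SELECT * FROM dbo.FactSales
WHERE OrderDateKey BETWEEN 20060101 AND 20060131;

-- Source query for the February 2006 partition. At processing time,
-- only partitions whose underlying data has changed need reprocessing.
SELECT * FROM dbo.FactSales
WHERE OrderDateKey BETWEEN 20060201 AND 20060228;
```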
Peter Myers: Our dimensions, a significant change in 2005, are now attribute-based, unlike the hierarchy-based dimensions of 2000. A dimension now consists of a series of attributes. Referring to the limitation I mentioned earlier: you might want to report on product and have category, name, and color, but color didn’t appear in the hierarchy. What we find is that dimensions are a collection of attributes that map to columns in the dimension tables and describe the entities we want to report against. The deal is that dimensions are collections of these attributes, which are truly hierarchies even though we don’t usually refer to them that way. As a hierarchy, an attribute contains an All level, for when you don’t wish to constrain by it, and a single level beneath that containing all the distinct values from the column it’s based on. For example, for a color attribute, there’s the All level, or each distinct color in our product dimension table. So attributes are first-class citizens in a dimension, but we still want to navigate across hierarchies to summarize and drill down through data, so we build user hierarchies off the attributes: take the attributes for category, subcategory, and product, then assemble them as levels in a hierarchy. Just referring back to what the unified dimensional model is: a one-stop shop to query across all the enterprise’s data. The reason for introducing perspectives is that models can now be quite overwhelming for an end user trying to say “I just want the answer to my question.” With fifteen dimensions and 20 measure groups, the perspective, an Enterprise feature, lets you narrow down the visibility and say, right, I’ll introduce a perspective that says this is the sales perspective. It makes visible through that perspective the relevant dimensions, measures, and other objects. The deal here is it’s Enterprise Edition only; perspectives are exposed as individual cubes, and you query as if…
Greg Low: As though it was a cube.
Peter Myers: Right. I’ll make the note that they’re not a security mechanism. The other Enterprise Edition feature is translations, which are really cool if you’re working in multi-dimensional, multi-national systems. You can go ahead and say, right, using the Windows locale, create a translation to French. The names of measures and dimensions can be translated, and dimension members, if you have columns in your tables that provide translations for your products or customers, can also be translated.
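The attribute hierarchies Peter described a moment ago can be queried directly. In this sketch (names illustrative), the Color attribute behaves as a two-level hierarchy with an All member above the distinct colors:

```mdx
-- Every distinct product color on the rows, plus the [All] member,
-- with sales for each. Color never had to be a level in a designed
-- hierarchy, as it would have needed to be in Analysis Services 2000.
SELECT
    [Measures].[Sales Amount]    ON COLUMNS,
    [Product].[Color].AllMembers ON ROWS
FROM [Sales]
```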
Greg Low: Did Reporting Services itself get a shot in the arm with Analysis Services in this edition?
Peter Myers: Yes, it was a much-asked-for feature. You could take Reporting Services in 2000 and query using MDX, but we were limited when it came to building those queries. The expectation was that the developer knew MDX, and when it came to collecting parameter values from the end user it was a matter of concatenating them into a string.
Greg Low: Looked scary.
Peter Myers: It was. The good news is that, to further make Analysis Services approachable, there is now a graphical query builder, which goes by the name Analysis Services query builder. If you create a data source in Reporting Services that connects to Analysis Services 2005, there is a particular provider for that, and in the report designer you’ll be presented with the query builder. There’s a meta-data pane that presents your dimensions and measures; you drag and drop them in, and it constructs the MDX and conforms the result to a relational rowset, a requirement because Reporting Services can only consume relational data. It coerces your selection into what appears to be relational, and it supports parameterization.
Greg Low: Any basic tips and tricks on the ongoing maintenance of systems already in place with Analysis Services?
Peter Myers: With management, you start in Business Intelligence Development Studio, where you’ve developed and tested your database and cubes and deployed them. What needs to happen then is that they need to be maintained as far as the currency of the data. Referring to the default: if you’re using a MOLAP system, which is typical, you need to ensure that periodically the data is re-processed. Ways to manage this might be: if you use a data warehouse, use something like Integration Services or another ETL tool to populate the data warehouse as part of a workflow, and say, upon success of populating the warehouse, there is an Analysis Services processing task that will process the dimensions and partitions. The other approach is using SQL Server Agent and doing this….
Greg Low: As a scheduled thing.
Peter Myers: Exactly. Then with proactive caching, a feature you find in Enterprise Edition, there's no administrative intervention required; it's an automatic process of keeping the data up to date. As far as care and feeding is concerned, to stay optimal you may need to revisit your aggregation designs. When we first talked about implementing cubes, we talked about aggregations and how it's not appropriate to aggregate every cell in the cube. When you first deploy a cube you have no concept of how it's being queried, so the suggestion is to configure logging at the server level, say logging every one in ten queries.
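The one-in-ten logging Peter mentions corresponds to a server-level sampling setting. A rough illustration in Python of what sampling every tenth query means for the log (the actual server mechanics are more involved; this only shows the arithmetic):

```python
def sample_queries(queries, sampling=10):
    # Keep every Nth query so the log stays small while still giving
    # usage-based optimization a representative workload to consume.
    return [q for i, q in enumerate(queries, start=1) if i % sampling == 0]

# With 30 queries and a sampling rate of 10, three land in the log.
logged = sample_queries([f"query-{n}" for n in range(1, 31)])
```

A lower sampling interval gives a more faithful workload at the cost of more logging overhead.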
Greg Low: What about Profiler, where you can capture workloads, normally against OLTP databases, and profile what's going on? Is there an equivalent for that?
Peter Myers: Not really. What you're describing there is on the database engine side.
Greg Low: Tuning advisors or things like that.
Peter Myers: You need to collect a sample of queries, and Analysis Services with this configuration allows you to log them. You can then come back and ask: for those partitions where I arbitrarily said give me a 30 percent performance gain, how effective was that? You can't be sure the aggregations are even being used. So you use usage-based optimization, which says: right, consume those logged queries and redesign the aggregations based on…
Greg Low: That's what I was thinking. In terms of tuning, it's the same story: there may be indexes in place that aren't getting used, just creating overhead, and others missing, and there are tools that help with that. The equivalent with aggregations is that the same thing has to happen: revisiting aggregations that are creating processing overhead for no reason, and alternatively finding aggregations that are missing but would help answer queries.
Peter Myers: It's a balancing act. The trade-off is that if I introduce aggregations, I'm going to extend processing time and consume resources to store them.
Greg Low: Larger databases.
Peter Myers: Where is it optimal? That's often the challenge. You don't have the fine-grained control you're used to in the relational world; you can't say "drop that aggregation," you can't manage it at a granular level. It has to be blind faith at some point. The best you can do to influence it is to use usage-based optimization. You might even prime it by saying: let's deploy, fire off the queries we know will happen, and apply the usage-based optimization to those.
Greg Low: Fascinating stuff. Where can we see you, what’s coming up, what’s happening?
Peter Myers: Solid Quality Learning offers training in the full Business Intelligence suite for SQL Server 2005. We have a five-day course introducing Integration Services, Analysis Services, and Reporting Services, and three- to four-day courses going through each of those topics in more detail. The five-day course provides enough detail that you should be able to control your own destiny and move forward with your own learning path.
Greg Low: What do you have coming up yourself?
Peter Myers: I'm finishing co-authoring a course for Microsoft, a four-day course called "Voyage to Microsoft BI," or something along those lines. Think of the Ascend program and Touchdown; this is the third phase, called Voyage. Microsoft has decided that once a product is launched and mature, they still need to propagate information on how to manage systems built on it.
Greg Low: Not about the destination, it’s about the ride.
Peter Myers: It's about getting as many people on the ride as possible. What we're finding is that Microsoft really wants to bring their ISVs up to speed. Business Intelligence has always been seen as a freaky concept, some back-end database world. There are two parts to BI, and the back-end side is what SQL Server as a platform supports.
Greg Low: It was a white-coat brigade in many places. I'm intrigued by the number of Microsoft products that have Analysis Services as part of their own offering; it legitimizes a product when they start using it themselves.
Peter Myers: SQL Server is only half the story; the other half is what you do to exploit what that back end can manage. We're all waiting for Office 2007, where we'll see synchronization of Office with SQL Server 2005. Part of the workshop is to say: this is what the front end and back end can do. For you ISVs there's a lot of potential, and not just for ISVs but for anybody, in the way these applications and platforms have been designed: plug and play, built to be as extensible as possible. For Analysis Services, if you want to build your own MDX functions, or if you want to work with data mining and its algorithms…
Greg Low: Another day we’ll talk about data-mining.
Peter Myers: The point here is that there's a great workshop out there to present the potential to people. I'd like to think that's the message that comes out of this podcast: BI is for everyone, and Microsoft's concept of BI for the masses couldn't be truer. The technologies are now mature and approachable, and I'd encourage anyone who's been shy to take a look; it's not as daunting as you'd think.
Greg Low: That's great, thanks for your time. We'd planned to do this at TechEd but ran into scheduling problems; it's a busy period, so it was great to get the chance. For those who aren't local, it's been a notable week in Australia, with some notable people passing on. Peter Brock, a champion car driver with legend status in this country; Colin Thiele, the well-known author who wrote Storm Boy, very popular here; and Steve Irwin, the Crocodile Hunter, also died this week. Quite a week. I noted on my blog that I was privileged to live not far from his Australia Zoo, and I'd seen his show a number of times. I used to say to people, "I've seen the show when he was there and seen the show when he wasn't," and they're two very different shows. The guys who work for him are very good, and very game, but I used to think Steve was just crazy. In a way it's fitting that he went doing what he loved doing. When my daughter sent me an SMS saying he had died, I was sad but not surprised, because given the things I've seen him do, I could never say it would surprise me.
Peter Myers: Only surprise is it wasn’t a crocodile.
Greg Low: A truly wonderful Australian, so I wanted to mention his passing this week. Thanks for the show; talk to you again soon.
Peter Myers: Thank you.