Weird localization problem in DAX editor

One of my students (thanks, Stefan!) discovered during the Power BI course a bizarre localization issue in the DAX editor of Excel and SQL Server Data Tools, which is easy to replicate: install Office 2013 or SQL Server Data Tools on a machine with a locale that uses a comma as decimal separator (I'm from Belgium, so a Belgian locale will do). Then create a Power Pivot model, add a table to it, and in the table editor create a calculated column which uses at least one parameterized function. In this function, provide the constant string "," as an argument. For instance, you could use the expression =IF(1=1;","). As soon as I enter this in the editor, the editor replaces the constant string "," with ";", but still shows the correct value "," in the outcome.

Something similar happens when I type a dot as a constant string. When I type such an expression in Power Pivot and press Enter in the expression editor, the formula in the editor shows a comma instead of the dot, but the evaluated expression (CalculatedColumn1 in my case) uses the dot I originally typed: what I see is not what I get. And to make things worse: when I edit the formula (e.g. adding an else parameter), it really starts to use the comma I never typed, and replaces the comma in the expression with a semicolon I never typed nor see in the evaluated column.

So, be careful when using the constant strings "." or "," in DAX!

Excel Power Query supports Multi-value Lookup columns in SharePoint lists

Power Query is a handy tool to load all sorts of data into Excel or Power BI. One of the supported data sources is SharePoint lists. But Power Query does more than the straightforward loading of data from a single list: if a SharePoint list contains a lookup column referring to another list, Power Query allows you to join these two lists together easily. And, top of the bill: it works for multi-value lookup columns as well.

Let's walk through an example. I start by creating a plain-vanilla list in SharePoint with some values in it, e.g. a list of Belgian beers. Then I create a second list, e.g. colleagues, and I add a column where I store their favorite beers, and I enter some data into this list.

Next we want to load this into Excel or Power Pivot, so we launch Excel, go to the Power Query ribbon (download, install and configure the Power Query add-in if you don't see the ribbon) and select From Other Sources –> From SharePoint List. In the next dialog, specify the URL of your SharePoint site (not the SharePoint list URL!) and click OK. Power Query now loads a list of all SharePoint lists on this site and allows you to select the list you want to consult. We right-click our list of colleagues and select Edit, which opens the Power Query editor window (in the next screenshot I moved the Title column further to the right and left out many other columns to improve readability).

We see that Power Query loaded the Favorite Beer column as a nested table. By clicking the split arrow icon to the right of the Favorite Beer header, we can ask to either expand the table or calculate aggregated values. Let's first try an expand, and just select the name of the favorite beer. If we click OK, we get one row for each combination of a colleague with his or her favorite beer. We could also ask Power Query to aggregate the beers per colleague; in our example we count how many favorite beers each colleague has, and what the average alcohol percentage is (our beer list contains the alcohol percentage of each beer). From this point on, we can continue loading this data into Excel, or via Power Pivot into Power View, Power Map or plain Excel pivot tables and charts.
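If you prefer to see the M code behind these clicks, below is a minimal sketch of what such a query could look like. The site URL, the list name (Colleagues) and the column names (Favorite Beer, and AlcoholPercentage for the alcohol percentage) are placeholders I made up for this example, and the navigation step Power Query generates for you may look slightly different (it often selects the list by Id rather than by Title):

let
    // Placeholder site URL; Power Query connects to the site, not to an individual list.
    Source = SharePoint.Tables("https://yourserver/sites/yoursite"),
    // Navigate to the Colleagues list; the generated step may select by Id instead of Title.
    Colleagues = Source{[Title="Colleagues"]}[Items],
    // Expand the multi-value lookup column: one row per colleague/favorite-beer combination.
    // This assumes the beer name lives in the lookup list's Title column.
    ExpandedBeers = Table.ExpandTableColumn(Colleagues, "Favorite Beer", {"Title"}, {"Favorite Beer.Title"}),
    // Alternative to expanding: aggregate the nested table per colleague,
    // counting the beers and averaging the (assumed) AlcoholPercentage column.
    AggregatedBeers = Table.AggregateTableColumn(Colleagues, "Favorite Beer",
        {{"Title", List.Count, "Favorite beer count"}, {"AlcoholPercentage", List.Average, "Average alcohol percentage"}})
in
    ExpandedBeers

Change the final in expression to AggregatedBeers if you want the aggregated variant instead of the expanded one.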

Loading data in Azure Machine Learning

In July 2014 Microsoft made its cloud-based data mining environment (known as Azure Machine Learning, or AzureML) available to the public. With this platform users can analyze large amounts of data without the need to install and configure special software: a browser and a credit card is all you need. With the increasing number of people in number-crunching jobs (data scientists), it is nice to see Microsoft focusing on this. In a previous blog post (see http://blogs.u2u.be/u2u/post/2014/07/14/First-Steps-in-Azure-Machine-Learning.aspx) I showed how to get started with setting up an AzureML environment. In this blog post we take a look at loading data into AzureML.

Supported data formats

Currently AzureML focuses on the most common formats used in the world of machine learning:

- Text files containing comma-separated values (CSV), tab-separated values (TSV), the Attribute-Relation File Format (ARFF) which was introduced by the open-source Weka machine learning framework, RData files, or the SVMLight format.
- Database tables: Hive tables (Hadoop), Azure Tables and SQL Databases in Azure.

Since AzureML runs in the Azure cloud, all your data must be in the cloud as well. Either you already uploaded your data to Azure (e.g. your data is stored in an Azure SQL Database), or you will upload it explicitly for this project. In both cases, be careful to store your data in the same region as the one where you run AzureML: during the preview period, AzureML only runs from the South Central US data center. If you store your data in another data center, it will be slower and more expensive to run your experiments. Let's first consider the scenario where you upload your data from a local file directly into AzureML (Uploading a DataSet), then the scenario where your data is already somewhere in Azure (Reading data).

Uploading a DataSet

A lot of sample machine learning data sets are already available out of the box in AzureML. But after some experimenting with public data, you probably want to play with your own data. If you don't have your data anywhere on Azure yet, you can upload it as a new dataset in AzureML directly. But before we start adding datasets, first a warning: in the current preview we cannot delete uploaded datasets. We can overwrite an existing dataset with new data, but if you create 1001 datasets, they will be in the list forever (that is: until Microsoft fixes this limitation). Because of this, if your dataset is not yet fixed, consider uploading the data file(s) into a custom Azure blob store and then loading them with the Reader from within your experiment.

To add a new dataset, click the +New button at the bottom left of the ML Studio screen, and select DataSet –> From local file. In the next dialog box, we can pick the file to upload, provide a name (choose well, it cannot be altered later on), select the type of data in the file and provide an optional description. If you select the checkbox, you select an existing dataset whose content will be overwritten by the file you select. It is impossible to delete or rename a dataset, but you can always upload an empty file 'as a new version' of a large dataset to truncate it.

If we now want to use this data, we create a new experiment by clicking the +New button. In this new experiment, under Saved Datasets, we will find our uploaded dataset in the list. Just drag it onto the design surface.
Also remember the search box at the top: by typing part of an object name (and a dataset is one of the many objects we have in AzureML), we get a filtered list, which makes it easier to find an object. Now that we have our data in AzureML we can start interacting with it, for example by simply visualizing it: click the circle under the dataset and select Visualize. This opens the overview screen, showing basic statistical information on each data field.

Reading data

Another way to get data into an AzureML experiment is by first uploading your data into an Azure SQL Database or a Hadoop cluster (such as HDInsight), or by uploading the data files (the same formats as in the previous paragraph) into an Azure blob store. In this case you do not need to create a dataset; you can immediately create a new experiment. In this experiment, locate the Reader under Data Input and Output and drag it into the experiment. When we click the Reader, we get all its configurable properties on the right-hand side. The most important property is the data source type: it determines which other properties are needed. Select the location where your data can be found and configure the other properties appropriately.

When we now run the experiment, we can visualize the data from this Reader, just as we could with an uploaded dataset. But we have an extra option: by clicking Save as dataset, we can permanently store this data in AzureML. This speeds up the runtime of an experiment, but it increases the storage cost (we store another redundant copy of the data). In a next blog post, I will discuss data preprocessing.

Building custom reports for SQL Server management

Recently I presented a session at the ITPROceed conference on 'Administration intelligence': using Business Intelligence solutions for analyzing log files, generating documentation, doing disk quota estimations and other typical administration tasks. You can find the full presentation at SlideShare. In this blog post, I want to focus on one of the techniques explained in this talk: creating Reporting Services reports to run in Management Studio.

SQL Server Management Studio reports

SQL Server Management Studio contains a bunch of predefined reports. In the Object Explorer, right-click any database, select Reports –> Standard Reports and you get a listing of nearly 20 predefined reports. If we select any of these reports, a new tab opens in Management Studio in which the report renders. There are a few special things to notice about these reports:

- They can be exported in PDF, Word and Excel format, as well as printed.
- A report can span multiple pages. There is no explicit way to go to the next or previous page, but if you scroll past the end of the current page, you go to the next page.
- The report is context sensitive: if you open the same report by clicking on a different database, it shows information for that specific database.

Before we dive into building our own reports, I just wanted to point out that we have these predefined reports not only at the level of databases, but also at the server instance level, as well as on the Data Collection node (Management), the Integration Services Catalogs (from 2012 on) and the SQL Server Agent node. Just go ahead and try these predefined reports: they provide very useful information without the effort of building reports yourself… plus they might inspire you in building your own reports.

From report consumer to report producer

These predefined reports can make our life easier, but sometimes you want a report different from the predefined ones. Building a custom report is easy.

Step 1: Retrieving the data

The first (and often most difficult) step in creating any report is retrieving the right data. Luckily SQL Server provides tons of information via the many management views. Just start typing SELECT * FROM sys. and the completion tool will show you the huge list of possible completions. If you want to read more on these before you continue, check out the web; e.g. mssqltips.com has some interesting reading material for you.

In this example we want to report on log file sizes: what are the biggest log files I have on any server, and how much room for growth (or shrinking) do we still have in there? A query with which we can get started is this one:

SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name IN (
        'Log File(s) Size (KB)'
        ,'Log File(s) Used Size (KB)'
        )
    AND instance_name <> '_Total'

This returns one row per counter per database. But in reporting it is important to present the data in a form that is as easy to digest as possible. It is easier to build reports if different measures (in our case log size and log used size) are in different columns, versus all in one single column (as they are right now). The PIVOT statement in T-SQL might help us out on that. Also, simple resorting and renaming might improve the readability of the result.
So after a bit of a rewrite we end up with this query:

SELECT [Database Name]
    ,[Log File(s) Used Size (KB)] AS LogUsedSize
    ,[Log File(s) Size (KB)] AS LogTotalSize
FROM (
    SELECT RTRIM(instance_name) AS [Database Name]
        ,cntr_value AS [Size in Kb]
        ,RTRIM(counter_name) AS [Counter Name]
    FROM sys.dm_os_performance_counters
    WHERE counter_name IN (
            'Log File(s) Size (KB)'
            ,'Log File(s) Used Size (KB)'
            )
        AND instance_name <> '_Total'
    ) AS src
PIVOT(SUM([Size in Kb]) FOR [Counter Name] IN (
        [Log File(s) Size (KB)]
        ,[Log File(s) Used Size (KB)]
        )) AS pvt
ORDER BY [Log File(s) Size (KB)] DESC

Just copy the query to the clipboard; we will need it in the next step.

Step 2: Building the actual report

To build a report which we can run in Management Studio, we need to have SQL Server Data Tools (SSDT) installed (the tools formerly known as Business Intelligence Development Studio, BIDS, so you can use BIDS as well if you're not yet on 2012). This is part of the regular SQL Server setup, so you can find it on your SQL Server 2012 or later installation media, but you can also just download it from the Microsoft Download Center. After downloading, installing and starting this tool (which is just a Visual Studio template), we instruct it to create a new project. In the list of installed templates, you should find the Business Intelligence group; select the Reporting Services group within it, click the Report Server Project Wizard on the right and provide a useful project name.

Once you click OK, the wizard starts. Click Next to get past the start screen. On the next screen the wizard asks which type of database connection we want to have, and to which server and database we want to connect. This is the tricky part, because we actually want to connect to whatever server and database we click upon in the Management Studio Object Explorer. But for now, we first need to develop this report in SSDT. Luckily, there is an easy way out: if we use the Microsoft SQL Server data source type, whatever server and database we select will be replaced with the 'current' server and database whenever we run this from Management Studio. So let's just connect to a server (I'm using localhost over here, but any server to which you can connect from SSDT will do). Notice that if you want your report to not query the current database but a fixed database (e.g. always master or msdb), you should use three-part names in your query (e.g. SELECT * FROM msdb.dbo.suspect_pages).

The next screen asks for the query to retrieve the data for this report: paste the query we've written earlier in this blog into this window. The screen after that asks for a report type; just confirm the tabular report. Then we have a screen that asks for the report design. Just drag all three fields into the Details window and click Finish. Then we hit the last wizard screen, where we give this report a meaningful name: Log Size. Click Finish.

Step 3: Fine-tuning the report

After we finish the wizard we end up in the editor. Click the Preview tab to get an impression of what the report will look like once deployed. This report only looks slightly better than the table we got in Management Studio earlier on, so we need to make a couple of improvements. Switch back to the Design tab to widen the database name column (just drag the column header to the right). Finally, we want to have a graphical indication of the size of these log files.
To achieve this, right-click the rightmost column header in our report and add a column to the right. Next, from the Toolbox window, drag the Data Bar component into the bottom cell of this newly created column. In the pop-up window, just confirm the default data bar type. Finally, click the blue bar that appears in this cell. On the right another pop-up window appears: click the green plus next to Values and click LogTotalSize. Now preview the report again.

Step 4: Run in Management Studio

Now that we have a report which is easy to read, let's make sure we can easily open this report in Management Studio. In SSDT, in the File menu, select Save Log Size.rdl As… In the save dialog, navigate to My Documents > SQL Server Management Studio > Custom Reports and save it there. Notice that this is not required, Management Studio can open reports from any file location; it's just the default location for these files. Finally, we can now open Management Studio, right-click any object in the Object Explorer, select Reports > Custom Reports, and select the file we just saved there. If you can connect to multiple instances of SQL Server, try out this report on all of them, and you will see how the report takes the server context into account. Right-click to print or export the report… you have all the options we had on the predefined reports.

This is not the end of it, it's just the start. There is a lot more to learn about building Reporting Services reports, as well as about all the useful information we can extract from SQL Server. So, give it a try!

Download

If you just want to download the .rdl file we created in this post, you can find it at http://1drv.ms/1yQjzMD

Loading GPX waypoints with Excel PowerQuery

Plotting data on maps is becoming popular: we can use the geometry and geography data types in SQL Server, use the Reporting Services maps, Excel Power View allows mapping coordinates on a map, and the recently released Power Map plug-in for Excel offers beautiful visualizations of map data. But how can we easily get coordinates into these tools? Some tools connect with the Bing lookup service: you provide an address, and Bing translates this into latitude-longitude coordinates. But sometimes we want to insert coordinates into the database explicitly. For instance, imagine you are collecting location data with a smartphone or GPS device. Many of these devices store information in GPX or KML format. In this article, we discuss how we can load the waypoint coordinates from a GPX file into Excel. From there we can use it in Excel Power Map or Excel Power View reports, or we can use the SQL Server Import/Export wizard to import the Excel file into a SQL Server table, on which we can then build Reporting Services reports.

PowerQuery

Power Query is a free plug-in for Excel 2010 and 2013. If you haven't done so, download it from the Microsoft download site and run the installer. Once installed, we can start loading different types of files into it, filter and manipulate the data, and in the end store the result in either a regular Excel sheet or a Power Pivot model. Since GPX files are just XML files, we can load them that way. GPX files can store coordinates in different ways. The two most popular are as a track (a bunch of points that make up a walk or route) or as waypoints: markers or points of interest. In the first example, we will load a local GPX file and extract the track points. To save you the trouble of repeating all the transformation steps, I have included the M script.
To rerun this:

1. Start Excel.
2. In the Power Query ribbon select Get External Data –> From Other Sources –> Blank Query.
3. Go to the View ribbon and select Advanced Editor.
4. Paste the query you see below into the query window, and replace the path of the GPX file with the actual path on your machine.
5. Click Done.
6. In the lower right corner, select the destination: if you plan on using Power Map or Power View on this data, select the Load to Data Model option.
7. In the Home ribbon select Apply and Close.

let
    Source = Xml.Tables(File.Contents("c:\MyGPXFile.gpx")),
    #"Expand trk" = Table.ExpandTableColumn(Source, "trk", {"name", "desc", "number", "http://www.topografix.com/GPX/Private/TopoGrafix/0/1", "trkseg"}, {"trk.name", "trk.desc", "trk.number", "trk.http://www.topografix.com/GPX/Private/TopoGrafix/0/1", "trk.trkseg"}),
    #"Expand trk.trkseg" = Table.ExpandTableColumn(#"Expand trk", "trk.trkseg", {"trkpt"}, {"trk.trkseg.trkpt"}),
    #"Expand trk.trkseg.trkpt1" = Table.ExpandTableColumn(#"Expand trk.trkseg", "trk.trkseg.trkpt", {"Attribute:lat", "Attribute:lon"}, {"trk.trkseg.trkpt.Attribute:lat", "trk.trkseg.trkpt.Attribute:lon"}),
    RemovedOtherColumns = Table.SelectColumns(#"Expand trk.trkseg.trkpt1",{"trk.trkseg.trkpt.Attribute:lat", "trk.trkseg.trkpt.Attribute:lon"}),
    ChangedTypeWithLocale = Table.TransformColumnTypes(RemovedOtherColumns, {{"trk.trkseg.trkpt.Attribute:lat", type number}, {"trk.trkseg.trkpt.Attribute:lon", type number}}, "en-US")
in
    ChangedTypeWithLocale

If you want to load the waypoints instead of a track, you can repeat the same steps, but with this script:

let
    Source = Xml.Tables(Web.Contents("http://www.topografix.com/fells_loop.gpx")),
    ChangedType = Table.TransformColumnTypes(Source,{{"time", type datetime}, {"Attribute:version", type number}, {"Attribute:creator", type text}}),
    #"Expand wpt" = Table.ExpandTableColumn(ChangedType, "wpt", {"ele", "time", "Attribute:lat", "Attribute:lon"}, {"wpt.ele", "wpt.time", "wpt.Attribute:lat", "wpt.Attribute:lon"}),
    RemovedOtherColumns = Table.SelectColumns(#"Expand wpt",{"wpt.ele", "wpt.time", "wpt.Attribute:lat", "wpt.Attribute:lon"}),
    ChangedType1 = Table.TransformColumnTypes(RemovedOtherColumns,{{"wpt.time", type datetime}}),
    ChangedTypeWithLocale = Table.TransformColumnTypes(ChangedType1, {{"wpt.ele", type number}, {"wpt.Attribute:lat", type number}, {"wpt.Attribute:lon", type number}}, "en-US")
in
    ChangedTypeWithLocale

Notice that this last script loads the GPX file from the internet; replace Web.Contents with File.Contents if you want to load from a local file.

PowerMap

Once we have loaded the data into an Excel Power Pivot model, it's easy to get it into Power Map. First, if you haven't done so, download and install Power Map. Open the Excel workbook in which you loaded the Power Query query; we will use the waypoints we loaded with the second script. Go to the Excel Insert ribbon, and click Map –> Launch Power Map. The Power Map dialog opens (you need to be connected to the internet, since it talks to Bing Maps) and prompts for the geographical information. Select the latitude and longitude fields, and specify explicitly which field is latitude and which is longitude. Click Next. We now already see a map with the waypoints, but we can also plot extra information that is in the GPX file. For instance, in this example we also loaded the time each waypoint was visited, and the elevation.
So we can ask Power Map to draw higher data bars for points with higher elevation, and we can put the time on the time axis. Finally, by clicking the settings icon in the upper right corner we can make the data bars lower, less opaque and narrower, which leads to the final visualization. You can use this as a starting point to learn and experiment more with Power Map, or use the model we made earlier on in Power View reports (from the Excel Insert ribbon as well). I hope this gives you some inspiration to get started building your own visualizations on top of either GPS information you collect yourself, or information you find on the internet or an intranet. If you want to learn more about using these Excel data processing and reporting options (known as Power BI), subscribe to one of our Power BI classes.

Challenges in combining multiple data sets in PowerBI

Relationship between house price and income

A few days ago the media reported that Belgium is among the most expensive European countries to buy a house in, if you compare the house price relative to the income. A few days ago I also wrote a blog post on how we could analyze the Belgian house market using Microsoft Excel Power BI and public data from the Belgian federal government (http://statbel.fgov.be). Inspired by the news that the house price relative to the income is an important indicator, let's try to mimic this research, but specifically for the Belgian house market.

With Power Pivot one can not only model data coming from a single data source; it is also easy to combine different data sources into a single model. For this blog post, I took the model made in the previous Belgian house market analysis (with house prices per city) and then looked for a dataset with income information per city. On http://statbel.fgov.be/nl/modules/publications/statistiques/arbeidsmarkt_levensomstandigheden/Fiscale_inkomens_-_per_gemeente_-_2008.jsp we can get the fiscal income per Belgian city in 2008 (unfortunately I'm not aware of more recent public datasets on this). Using Power Query we can filter and convert the data such that we keep the total fiscal income, the number of taxpayers and the number of inhabitants per city (a rough sketch of such a query is included at the end of this post).

Combining the data sets: a risky business

After we loaded this data into the model, we need to set up the proper relationships between the two data sources. Since both data sources use the same city identifier (NIScode), this link is easily made. However, if we directly link the table with the house prices per city (which contains a row per city per year per house type per measurement) to the table with the incomes per city (which contains a row per city per year), we can get errors: Power Pivot does not allow many-to-many relationships! In our model it is even worse. Since we currently only loaded data for 2008, we have just one row per city in the income table, but as soon as we load data for multiple years, our model would become invalid!

The proper way to link data together

Even with a data set as small as in this example (just two tables), it is already important to stick to one of the core ideas in business intelligence: dimensional modeling. This boils down to splitting up data into dimensions (things you group or filter upon, typically the nouns in the business story) and facts (things you aggregate upon in your reports, typically the verbs in the business story). In our analysis we have two dimensions: cities and years. So we must create a table with only one row per city (with all the information regarding that city: code, name, county, country), and another with one row per year. Since there is not much more to tell about a year than just its value, we can easily generate the latter table in Power Query:

let
    Source = List.Numbers(1990, 25),
    TableFromList = Table.FromList(Source, Splitter.SplitByNothing(), {"Year"}, null, ExtraValues.Error)
in
    TableFromList

After these tables are created, we can link them in Power Pivot. From the perspective of linking the tables, this solution is fine. But we still run the risk that users will, for instance, grab the Years or NIScode column from either the HousePrice or the Income table in one of their reports. This would be problematic, since it would only filter data from the table from which the column was chosen, not from the other table.
However, if we force our users to select years only from the Years table and NIS codes only from the Location table, we don't face this problem, since both fact tables (HousePrice and Income) are related to both dimension tables (Years and Location). This can easily be achieved by just hiding those columns in the fact tables. After we made the model, we can start defining DAX measures which are calculated over both fact tables. A very interesting one is the average house price expressed in terms of the average income: how many years of average income in a city do people need to spend to buy an average house in that city?

Why Bruges isn't the most expensive area after all

Now we can start building Power View reports on top of this data. In this way we can, for instance, see that Bruges is much more expensive than Brussels, but if we express the house price in terms of the income of its inhabitants, Bruges turns out to be cheaper than Brussels. Download the Excel sheet from my OneDrive, and start building lots of interesting reports on top of this data set! Check out our Excel Power BI course to learn more about building models and reports in Excel.
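As promised above, here is a rough sketch of the kind of Power Query script that reduces the downloaded income workbook to the columns we need. The file path, sheet name and (Dutch) source column names are assumptions made purely for illustration; the actual statbel workbook will differ, so treat this as a starting point rather than a copy-paste solution:

let
    // Assumed local copy of the statbel income workbook; path, sheet and source column names are placeholders.
    Source = Excel.Workbook(File.Contents("C:\Data\Fiscale_inkomens_per_gemeente_2008.xls")),
    Sheet = Source{[Name="Per gemeente"]}[Data],
    PromotedHeaders = Table.PromoteHeaders(Sheet),
    // Keep only the NIS code and the three measures we need for the Income fact table.
    SelectedColumns = Table.SelectColumns(PromotedHeaders,
        {"NIS-code", "Totaal netto belastbaar inkomen", "Aantal aangiften", "Aantal inwoners"}),
    RenamedColumns = Table.RenameColumns(SelectedColumns,
        {{"NIS-code", "NISCode"}, {"Totaal netto belastbaar inkomen", "Total fiscal income"},
         {"Aantal aangiften", "Number of taxpayers"}, {"Aantal inwoners", "Number of inhabitants"}}),
    ChangedTypes = Table.TransformColumnTypes(RenamedColumns,
        {{"Total fiscal income", type number}, {"Number of taxpayers", type number}, {"Number of inhabitants", type number}})
in
    ChangedTypes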

Analyze the Belgian house market… yourself

From time to time articles get published about the evolution of the Belgian house market. It is interesting to learn about general trends, but… sometimes you wish these analyses were more tailored to your situation. It's nice to know that prices for villas stabilized last year, but does this trend also hold in my region? Sometimes we just have to do things ourselves…

PowerBI = Self Service BI

Lots of data is publicly available nowadays. The same holds for the Belgian house market: http://statbel.fgov.be has a lot of public data sets available for analyzing different aspects of Belgium. One of them contains house sales information per village per year per house type (flat, house, villa). By diving into this data set we can find all sorts of detailed information. But it is not easy to navigate this data, so we want to massage it into a shape that is easier to filter and aggregate. To do so we could use many tools… but why leave Excel? In 2014 Microsoft released Power BI, which among other things contains Excel plug-ins for loading, cleaning and modeling data.

PowerQuery makes data loading easy

With the free Power Query plug-in, we can easily create a script which downloads the data from the statbel website, then filters and transforms it until we get a representation which is easier to filter and aggregate. If you want to give it a try, here is the M code; just copy it and paste it into a blank Power Query query.

let
    Source = Excel.Workbook(Web.Contents("http://statbel.fgov.be/nl/binaries/NL_immo_statbel_jaar_131202_%20121715_tcm325-34189.xls")),
    #"Per gemeente1" = Source{[Name="Per gemeente"]}[Data],
    RemovedFirstRows = Table.Skip(#"Per gemeente1",2),
    FirstRowAsHeader = Table.PromoteHeaders(RemovedFirstRows),
    RemovedColumns = Table.RemoveColumns(FirstRowAsHeader,{"localiteit", "jaar_1", "oppervlakteklasse", "Column6", "P10 prijs(€)", "Q25 prijs(€)", "Q50 prijs(€)", "Q75 prijs(€)", "P90 prijs(€)", "Column16", "P10 prijs(€)_6", "Q25 prijs(€)_7", "Q50 prijs(€)_8", "Q75 prijs(€)_9", "P90 prijs(€)_10", "Column26", "P10 prijs(€)_15", "Q25 prijs(€)_16", "Q50 prijs(€)_17", "Q75 prijs(€)_18", "P90 prijs(€)_19", "Column36", "P10 prijs(€/m²)", "Q25 prijs(€/m²)", "Q50 prijs(€/m²)", "Q75 prijs(€/m²)", "P90 prijs(€/m²)", "gemiddelde prijs(€)", "gemiddelde prijs(€)_5", "gemiddelde prijs(€)_14", "gemiddelde prijs(€/m²)"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"aantal transacties", "House transaction count"}, {"totale prijs(€)", "House total cost"}, {"totale oppervlakte(m²)", "House building area"}, {"aantal transacties_2", "Villa transaction count"}, {"totale prijs(€)_3", "Villa total cost"}, {"totale oppervlakte(m²)_4", "Villa building area"}, {"aantal transacties_11", "Flat transaction count"}, {"totale prijs(€)_12", "Flat total cost"}}),
    RemovedColumns1 = Table.RemoveColumns(RenamedColumns,{"totale oppervlakte(m²)_13"}),
    RenamedColumns1 = Table.RenameColumns(RemovedColumns1,{{"aantal transacties_20", "Ground transaction count"}, {"totale prijs(€)_21", "Ground total cost"}, {"totale oppervlakte(m²)_22", "Ground building area"}}),
    Unpivot = Table.UnpivotOtherColumns(RenamedColumns1,{"refnis", "jaar"},"Attribute","Value"),
    SplitColumnDelimiter = Table.SplitColumn(Unpivot,"Attribute",Splitter.SplitTextByEachDelimiter({" "}, null, false),{"Attribute.1", "Attribute.2"}),
    ChangedType = Table.TransformColumnTypes(SplitColumnDelimiter,{{"Attribute.1", type text}, {"Attribute.2", type text}}),
    RenamedColumns2 = Table.RenameColumns(ChangedType,{{"Attribute.1", "HouseType"}, {"Attribute.2", "Measurement"}}),
    ChangedType1 = Table.TransformColumnTypes(RenamedColumns2,{{"Value", type number}}),
    TransformedColumn = Table.TransformColumns(ChangedType1,{{"Measurement", Text.Proper}}),
    RenamedColumns3 = Table.RenameColumns(TransformedColumn,{{"jaar", "Year"}}),
    ChangedType2 = Table.TransformColumnTypes(RenamedColumns3,{{"Year", type date}}),
    RenamedColumns4 = Table.RenameColumns(ChangedType2,{{"refnis", "NISCode"}})
in
    RenamedColumns4

PowerPivot to model the data the way you want it

Next we want to define useful calculations: sums and averages, but also more challenging calculations, such as linking each village to the corresponding county. For this we use Power Pivot, a free add-in for Excel 2010 which is present out of the box in Excel 2013. Among others, we define an average cost per house and an average cost per square meter calculation.

PowerView to build interactive charts

As soon as we have loaded and modeled our data, we can finally start building the reports we need. The Power View add-in in Excel 2013 allows us to quickly create interactive reports on top of our Power Pivot model. In less than two minutes we can create a report which shows the price evolution over time for the different counties, with a drill-down into the actual villages, split by house type. Watch the video below to get an idea, or if you have Excel 2013, just download the Excel file and play with the report itself!

The nice thing about Power View is that out of the box it usually does what we expect it to do. For instance, when we click on a house type, the bar charts automatically show the average price for the selected type of house relative to the average price for all house types: in the county Brugge a flat is more expensive than the average price for any type of house, whereas in the other counties it is cheaper. We can also create more advanced visualizations, for instance a bubble chart plotting the average land price against the average size of the building area, and how this evolves over time. When you single-click a county, you see the price evolution through time; double-clicking will zoom in on the cities within that county.

Do it yourself

You can download the Excel sheet from my OneDrive at http://1drv.ms/1fcH2u6 and have fun with the data. Be sure to download the Excel file, since OneDrive does not allow you to interact with the data in the browser. But if you want to learn how to build Power Pivot models and load data with Power Query, consider attending our three-day course Building PowerPivot Models in Excel. If you are also interested in building Power View and Power Map reports on top of Power Pivot models, you can instead attend the course Business Intelligence with Excel. Have fun with Power BI self-service business intelligence… the world is yours to discover!

Power BI Training: Business Intelligence with Excel

Power BI allows business users to solve certain business intelligence problems without the need for IT specialists. Since 2010, Power Pivot has allowed Excel users to create their own data models and do advanced analytics on them. Excel 2013 contains Power View out of the box as well, allowing people to create advanced and interactive reports within Excel. And recently Microsoft added Power Query, to load and filter data from a variety of data sources, and Power Map, to create stunning geographical reports. These four products together are named Power BI.

From February 2014 on, U2U will run a 5-day course, aimed at business users who are already familiar with basic Excel functionality, covering these advanced business intelligence capabilities of Microsoft Excel. To give you an idea of what becomes possible with Power BI, we included a short video illustrating some of the skills you will learn in this new course. For practical details on this course, check out http://www.u2u.be/CC/UBIPB