2: Importing and transforming data using Power Query Editor
Learning objectives
2: Importing and transforming data using Power Query Editor
Learning objectives - Video Tutorials & Practice Problems
Video duration:
4m
Play a video:
<v ->As data analysts, chances are you've spent,</v> or will spend, a lot of time wrangling data that comes in many shapes and sizes, and from a multitude of disparate sources. In most organizations, data is scattered and is sometimes not managed properly, which complicates the process of preparing data for analytics. Data preparation is often a data analysts least favorite task. It is estimated that 70% of the work in any analytic endeavor is spent getting your data report ready. The task of consuming and transforming data has long been one of the hardest, yet most important skills for a data analysts to master, as they prepare data sets for consumption and presentation. This usually involves, removing rows and columns, cleaning erroneous column values, enhancing data sets with new columns, and merging and appending datasets. If you've used Excel for analytics, as many have, you've relied on the mastery of advanced Excel functions, such as VLOOKUP, IF, FIND, CLEAN, and SUBSTITUTE, on top of traditional data import means. In this lesson, we'll explore the capabilities of Power Query to help simplify and better automate the process of adjusting and transforming data. If you're used to traditional means using Excel, this functionality is often seen as a game changer. Some of the key features of Power Query are the following. It has many data connectors available and more are being added periodically. You'll see the data gathering and shaping capabilities are fast, easy and intuitive to use. And one of my favorite things is that the transmissions are recorded, which allows you to repeat the steps to save manual and repetitive work. Plus, there's some excellent documentation features that we'll cover. So what does all this add up to? It is often estimated that optimizing the repetitive activities around collecting, combining, and transforming data using Power Query will cut your data preparation time by up to 80%. Right, so going back to our original diagram that we talked about in the Power BI End-to-End process flow, we're going to go ahead and focus on the following inside of this particular lesson We're going to get ourselves into the Power Query Editor. We are then going to go ahead and connect into data sources. After we're done that, we're going to go ahead and extract data into the Power Query Editor. We're then going to perform transformations, we're going to cleanse data, and take advantage of the M language. Lastly, we're going to load that data into a data model, so that we can get it report ready. Okay, so the learning objectives for this particular sub lesson. First, we're going to navigate the Power Query Editor. So we're going to learn how to move our way around the actual tool itself. We're going to explore data sources that are available to Power Query. We're going to load data from a website, so we're going to practice loading data to the cloud. We're going to load data from files, so we're going to work with some text files and Excel files. We're going to learn how to load data from a folder, which is a very handy technique for loading data that is very similar. We're going to explore data source settings. We're going to look at renaming columns and setting data types. We're going to reduce rows on our data sets, we're going to manage columns. Some of these techniques we just spoke of here are some of the most important things you can do to make your dataset much more performing. Going ahead and reducing rows and columns will keep your dataset small and compact. Choosing good naming conventions and data types is going to make your data model a lot easier to use from a self-service perspective. Plus, it's going to make it perform better and be able to take advantage of functionality downstream. Then we're going to worry about appending queries. Lastly, we'll talk about adding columns of data to our queries. We're going to merge queries together. We'll spend some time talking about transforming and cleansing data sets that don't meet our quality standards. And then we're going to look at organizing and managing our queries. So taking a look at giving our queries good names. Filing things properly within the Query Editor itself. And documenting our work within Power Query to make it easier for us to pick up on weeks or months later. Or if we decide to hand our work to somebody else, to make their life a lot easier. Let's begin.