2: Importing and transforming data using Power Query Editor
2.3 Load data from a website
2: Importing and transforming data using Power Query Editor
2.3 Load data from a website - Video Tutorials & Practice Problems
Video duration:
6m
Play a video:
<v ->Okay, so what we're going to do</v> is we're going to get data from a website first. So, what I wanna do here first is show you the actual website that we're gonna go to, to get data from. So, the URL for this is inside the GitHub repository that we provided for you. So, let's just walk through the webpage here to show you what we're going to connect to. So, if I just scroll down a little bit here, we can see there's this table of data that has countries, Continental regions, statistical regions and then some population data for 2018 and 2019. So, what we're gonna do is go back into the Power BI Desktop, get ourselves into Power Query and we're gonna make a connection into this website so we can pull that data down into Power BI. So, I'm just gonna jump back over to Power BI Desktop. Okay, so I've got a blank Power BI Desktop file opened. And once again, to get into the Power Query Editor, I'm going to go to this queries group, click on the transform data command and the Power Query Editor will open up. So, what I'm gonna do here now is given that we're gonna connect into a website, I'm gonna go to the home tab, new query group, click on new source. And if I just go down this most common list here, we're gonna see that web actually shows up in this list. So, I'm gonna go ahead and click on that. We'll get taken to a dialog box that's gonna ask for the URL of the website we're gonna connect in. So, I'm just gonna go ahead and paste the URL in here to avoid having to type that in. So, once I have that in here, now I can go ahead and click okay. So, what Power Query is gonna do here now is go ahead and take a look at how we should be connecting into that website. Now, this is a public website, so we're just gonna connect anonymously. But if you connect into other types of web data, there are a number of other ways that you can connect into it. So, you can use windows authentication, basic or web API if we have that type of API to connect into or an organizational account. We're not gonna dive into those right now because we are focused in connecting into this anonymous source. So, what I can do here is now click connect and let Power Query make a connection into the webpage. And what is happening right now is it's taking a look at that website and seeing what can it connect into as structured data that we may want to pull into the Power Query Editor. Okay, so we've made the connection into the website. Now we can see in our navigator dialogue box here, that we have a series of different HTML tables that we can make a connection to. So, I'm just gonna cycle through these and just show you what Power Query has found. So, table number one is this information here. So, if we take a look on the right hand side of the screen, it will give us a preview of what that table data looks like. Go to this next one here, that doesn't look like what we want. We need to go all the way down to table number three before we actually get to the information that we wanna pull off that website. So, once we see the information that we want off of the particular source we're connecting into, we can click the check box right next to table number three and that's what we're going to use as the query to bring into Power Query. So, now at this point, I'm gonna click okay. Power Query Editor is gonna go ahead and make that connection into the website and pull that particular table of data down into the actual Power Query Editor. We can see that on the right hand side of the screen now, several different things have happened in the source. So, I'm gonna leave the name at table three for the moment and just explain a few things here. So, the source, this is the website that we connect into and all the source data itself. Then what happened is it went through and extracted the table from the HTML code that it found. Then Power Query Editor went and said, all right, well, looks like this first row here likely should be your header so it promoted that into the headers. And then it went through and made some decisions on the data types that each one of these columns should get on a column-by-column basis. We're gonna explore in a later sub-lesson how we should go through and take a look at these data types and make sure that Power BI actually did assign the correct types. All right, so this contains all the data we want. The last thing we wanna do here is we're gonna rename this query. So, I'm just gonna go ahead in here and paste country population data. Because what we wanna do here in our queries is actually give them meaningful names, so that should we ever pick up on this file at a later date. Be it a day later, a week later, a month later, or we give this file over to somebody else to go ahead and maintain that it's clear as to what that query actually does. Okay, so the last thing we're gonna do here, is we're gonna go ahead and save our work. So, I'm gonna go to the top of the screen here, hit the save icon. Now I'm prompted with a dialogue box saying there's pending changes in your queries that haven't been applied, and that means applied into the data model. Do you want to apply them? What I'm gonna do here is I'm just gonna click apply later, which means we're gonna apply all these transformational steps to the data model at a later time. So, what I'm gonna do here now is I'm just going to go over to my drive, where I'm storing all my files. I'm just gonna go into the file folder right here. And I'm just gonna save this as SalesDataMart. So, it's called SalesDataMart.pbix, click save. Already a file sitting there with that name. I'm just gonna say, go ahead and replace it because that's what I want to do. And now the file has been saved. So, to give you a high-level walkthrough as to what has happened here, is I'm just gonna pull up a diagram from one of the previous lessons. Okay, so with our Power BI end-to-end process flow diagram back up, what we've done so far is on the far left hand side of the diagram, we connected into a data source, which was the website. We extracted data from one of the tables of that website and we brought it into the Power Query Editor. So, so far we've done an E. We really haven't done a transformation to this data yet because we've just brought it into the Power Query Editor and really done nothing with it at this point. And if you recall, when we had that dialogue box was asking us if we want to apply the changes we've made so far, what we did there was say, don't load that data for the data model. So that L step, which is taking data from the Power Query Editor and put it into the data model, we actually said, we'll do that later, apply those steps later on. So, we're gonna walk through these diagrams several times as we go through the rest of the sub-lessons here, just so you get a sense of how the mechanics are working on the Power Query Editor side. So with that, that brings us to the end of this sub-lesson.