Many maps in Power BI require latitude and longitude coordinates to display geography, but many data sources only contain place names. Geocoding is the process of obtaining latitude and longitude for a given location. Geocoding with Power BI is nothing new, but it may never have been as simple. In addition to several existing solutions that you could use to get coordinates for your data, the new Python script in Power BI Desktop is now an option. As a benefit, you do not need to manually construct web requests or parse JSON responses. In fact, it only takes one line of Python code.
To see geocoding and more solutions using Python for Power BI, be sure to attend BlueGranite’s webinar Python in Power BI: Navigate the Possibilities.
Date: Dec. 6, 2018
Time: 11 a.m. EST
Presenter: David Eldersveld, Senior Consultant and Microsoft MVP
Register here: Python in Power BI Webinar
Sample: Geocode Location Data for Power BI using Python and Azure Maps
Python makes translating location names to latitude and longitude easy by offering connections to numerous geocoding providers. One of these services is Microsoft's own Azure Maps. All you need is a simple Python script to specify a column, choose the provider, and supply the provider's access key.
While the following is not a complete step-by-step process, please join the webinar on Dec. 6th to see this demo live.
- First, obtain a key for the Azure Maps service using the Azure Portal. Sign up for the Azure Maps service within the Portal if needed.
- In your Python environment, install both the GeoPandas and GeoPy libraries. For Python, I use the Anaconda distribution available from https://www.anaconda.com/download. The following conda commands were executed to install the libraries using the Anaconda Prompt. If you are not using Anaconda, you can install using pip instead of conda from your command line.
- Import your location data into Power BI. If needed, use Power Query to prepare the data so that a single column contains the address or other type of location that you need to geocode. In the sample, I create a simple column named Location that combines separate City and State columns. While this type of data preparation can be done in Python too, I want to focus on a simplified Python script that only handles the geocoding.
- Add the Python code. The Transform tab in Power Query has the option to Run Python script. This adds the corresponding Python.Execute function to the underlying code. While the original DataFrame (named dataset by default) contains a single Location field, Azure Maps provides two new fields in response: Address and Geometry. The geopandas.tools.geocode function only needs three arguments: the original "Location" field, the service provider (Azure Maps), and the key. The Python code combines the Azure Maps response with the original data in the same line by adding "Address" and "Geometry" as new columns within the dataset DataFrame.
- Parse the Geometry field to split out separate Latitude and Longitude columns. This is another series of steps that could be done in the Python script itself, but which Power Query is perfectly capable to handle as well.
The final Power Query script includes a parameter named AzureMapsSubscriptionKey in place of the initially hardcoded Azure Maps key. Ignoring the imports, uniting Python with Azure Maps only takes one line of Python code embedded within your Power Query M script!
#"Run Python script" = Python.Execute("import geopandas#(lf)import geopy#(lf)#(lf)dataset[['Address','Geometry']] = geopandas.tools.geocode(dataset.Location, provider='azure', subscription_key='" &Text.From(AzureMapsSubscriptionKey)& "')#(lf)",[dataset=#"Added Custom"])
Happy coding! Once again, if you would like to learn more uses for Python in Power BI, attend BlueGranite’s free webinar on Dec. 6.