Power BI Practice: Extracting Date and Time Components from Text Data in Power BI
Mohamad's interest is in Programming (Mobile, Web, Database and Machine Learning). He is studying at the Center For Artificial Intelligence Technology (CAIT), Universiti Kebangsaan Malaysia (UKM).
Scenario:
Given meta data:
column_name, data_example, valid_profile
IoT GUID, 48196321-3809-EC11-883D-0022489A2..., 100%
IoT DateTime, 21/05/2022 18:59:25, 100%
IoT ID, 10000010000, 100%
The IoT GUID and IoT ID columns are unique to each row in query. You need to analyze IoT events by the hour and day of the year. The solution must improve dataset performance.
Solution: You split the loT DateTime column into a column named Date and a column named Time. Does this meet the goal?
No, splitting the IoT DateTime column into separate Date and Time columns alone does not fully meet the goal of improving dataset performance. While it helps in analyzing events by hour and day, additional steps like indexing, aggregation, and partitioning are needed to significantly enhance performance.
Step-by-step guide
Step 1: Split the IoT DateTime Column
1. Open Power Query Editor:
• In Power BI Desktop, go to the "Home" tab and click on "Transform data" to open the Power Query Editor.
2. Split the DateTime Column:
• Select the IoT DateTime column.
• Go to the "Transform" tab, click on "Split Column," and choose "By Delimiter."
• Split by space to separate the date and time into two columns.
• Rename the new columns to "Date" and "Time."
Step 2: Create Indexes
1. Index the Date Column:
• In the Power Query Editor, select the "Date" column.
• Go to the "Add Column" tab, click on "Index Column," and choose "From 1."
Index the Time Column:
• Repeat the same steps for the "Time" column.
Step 3: Aggregate Data
1. Group by Date and Hour:
• In the Power Query Editor, select the "Date" column.
• Go to the "Transform" tab, click on "Group By."
• Add a new grouping for the "Time" column, but extract only the hour part.
• Choose the aggregation method (e.g., count, sum) for the IoT events.
Step 4: Partition the Dataset
1. Partition by Date:
• In the Power Query Editor, create a new column that extracts the year and month from the "Date" column.
• Use this new column to partition your dataset.
Step 5: Optimize Data Types
1. Set Appropriate Data Types:
• Ensure the "Date" column is set to the Date data type.
• Ensure the "Time" column is set to the Time data type.
Step 6: Apply and Test Changes
1. Apply Changes:
• Click on "Close & Apply" in the Power Query Editor to apply the changes.
2. Refresh Data:
• Refresh your data to ensure all changes are correctly implemented.
3. Verify Results:
• Check your reports and data models to ensure everything is working as expected.
By following these steps, you can efficiently analyze IoT events by the hour and day of the year while improving dataset performance.
Sample data
| IoT GUID | IoT DateTime | IoT ID |
| 48196321-3809-EC11-883D-0022489A2... | 21/05/2022 18:59:25 | 10000010000 |
| 48196321-3809-EC11-883D-0022489A2... | 22/05/2022 12:45:10 | 10000010001 |
| 48196321-3809-EC11-883D-0022489A2... | 23/05/2022 08:30:45 | 10000010002 |
| 48196321-3809-EC11-883D-0022489A2... | 24/05/2022 14:15:30 | 10000010003 |
| 48196321-3809-EC11-883D-0022489A2... | 25/05/2022 20:05:55 | 10000010004 |
| 48196321-3809-EC11-883D-0022489A2... | 21/05/2022 18:59:25 | 10000010005 |
| 48196321-3809-EC11-883D-0022489A2... | 22/05/2022 12:45:10 | 10000010006 |
| 48196321-3809-EC11-883D-0022489A2... | 23/05/2022 08:30:45 | 10000010007 |
| 48196321-3809-EC11-883D-0022489A2... | 24/05/2022 14:15:30 | 10000010008 |
| 48196321-3809-EC11-883D-0022489A2… | 25/05/2022 20:05:55 | 10000010009 |
You can use this table to practice splitting the IoT DateTime column into separate Date and Time columns, creating indexes, aggregating data, and partitioning the dataset.