A collection of M code to get various formats from Excel sheets in Power Query
3
5
4
6
## Main purpose
5
-
A lot of the information stored in Excel workbooks has additional metadata. This metadata can be stored in various forms, mostly as cell formats, number formats, colours, etc.
6
7
7
-
A wide range of formats and the complexity of extracting their parameters by other tools, such as Power Query, lead to the loss of a noticeable piece of information. Often the format of a row, column or cell is a critical element of the data set.
8
+
Information stored in Excel workbooks often has additional metadata that is important for analysis. This metadata can be stored in various forms, mostly as cell formats, number formats, colours, etc. Often a row, column or cell format is a critical element of the workbook data set.
9
+
10
+
At the moment (Aug 2017), Microsoft Power Query and the corresponding "Query Editor" in Microsoft Power BI do not allow users to natively get the additional information stored in Excel workbooks and spreadsheets as various applied formats, except (sometimes) the data types of calculated values.
11
+
12
+
The wide range of formats and the complexity of extracting their parameters with other tools, such as Power Query, lead to the loss of a noticeable share of the information. An additional problem is storing the extracted format data in Power Query for further use.
13
+
Tasks and methods
14
+
15
+
## Tasks
16
+
17
+
Develop a set of functions to extract/import specific info about sheet and/or cell formats into Power Query.
18
+
19
+
In the future, develop universal functions for:
20
+
21
+
* spreadsheet information (info about rows, columns and the sheet as a whole)
22
+
* cell information (colors, fonts, alignment, number formats, indents, etc.)
23
+
24
+
The methods are versatile because they rely on the same tools (unzipping and XML parsing) and the data sources are similar. The specific kind of function result can be selected via a function argument.
25
+
26
+
---
27
+
28
+
### Methods
29
+
30
+
#### Unzip
31
+
32
+
The main method is to unpack the XLSX/XLSM file as a zip archive and work with the XML documents inside. Unpacking is performed by the custom function [UnZip.pq](UnZip.pq) by Mike White, but any other way of unpacking zip archives in Power Query can be used.
33
+
34
+
#### XML Parsing
35
+
36
+
After unzipping, the XML files (`binary` type) from the workbook structure become available to the (current) main function. They can be parsed with the built-in functions `Xml.Tables` or `Xml.Document`, or with any other suitable XML parsing method.
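As a rough illustration of the `Xml.Document` route, the sketch below parses a small, hand-written fragment that imitates one row of a worksheet XML file. The sample XML, the helper name `AttrValue` and the output column names are assumptions made for this example and are not taken from the repository. In real workbooks the fragment would come from the unzipped sheet file, and element namespaces would also have to be handled.

```powerquery
let
    // Hand-written sample imitating one <row> of a worksheet part (assumption, not real workbook data).
    // "r" holds the cell address in A1 notation, "s" holds the index of the cell format.
    SampleXml = "<row r=""1""><c r=""A1"" s=""3""><v>10</v></c><c r=""B1"" s=""0""><v>20</v></c></row>",

    // Xml.Document returns a table with the columns Name, Namespace, Value and Attributes
    Doc = Xml.Document(SampleXml),

    // The single root element is <row>; its Value is a nested table of child <c> elements
    Cells = Doc{0}[Value],

    // Hypothetical helper: look up one attribute value by name, or null when it is absent
    AttrValue = (attrs as table, name as text) =>
        List.First(Table.SelectRows(attrs, each [Name] = name)[Value], null),

    // One output row per cell: its address and its style index (both as text)
    Result = Table.FromRecords(
        List.Transform(
            Table.ToRecords(Cells),
            (c) => [
                Address = AttrValue(c[Attributes], "r"),
                StyleIndex = AttrValue(c[Attributes], "s")
            ]
        )
    )
in
    Result
```

The result is a two-column table with one row per cell. The style index still has to be resolved against the separately stored format definitions.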
8
37
9
-
At the moment (Aug 2017) the Microsoft Power Query and corresponding "Query Editor" in Microsoft Power BI do not allow users to get additional information, stored in Excel workbooks as various applied formats, except (sometimes) the data types of calculated values.
38
+
* Main problem: cell formats are stored separately from the cells, the cells themselves are stored inside row elements, and cell addresses are stored in A1 notation (which needs an additional conversion to R1C1 style or similar).
39
+
* Additional problem: linking/mapping the extracted format info to the cell position in the Power Query table.
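The A1-to-row/column conversion mentioned in the list above could be sketched as follows. This is a minimal example, not the function used in this repository: the name `A1ToRowColumn` and the shape of the returned record are assumptions chosen for illustration.

```powerquery
let
    // Hypothetical helper: converts an A1-style address (e.g. "BC12" or "$BC$12")
    // into a record with numeric row and column positions.
    A1ToRowColumn = (address as text) as record =>
        let
            Upper = Text.Upper(address),
            // Keep only the column letters and only the row digits; "$" signs are dropped
            Letters = Text.Select(Upper, {"A".."Z"}),
            Digits = Text.Select(Upper, {"0".."9"}),
            // Column letters form a base-26 number where A = 1 ... Z = 26
            Column = List.Accumulate(
                Text.ToList(Letters),
                0,
                (state, ch) => state * 26 + Character.ToNumber(ch) - 64
            ),
            Row = Number.FromText(Digits)
        in
            [Row = Row, Column = Column]
in
    A1ToRowColumn("BC12")  // [Row = 12, Column = 55]
```

With numeric positions, extracted format info can be joined to a Power Query table by row and column index instead of by address text.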
10
40
11
-
Additional problem is storing extracted formats data in Power Query for further use
41
+
---
12
42
## Work plan
13
43
14
44
1. Sheet structure:
@@ -19,5 +49,5 @@ Additional problem is storing extracted formats data in Power Query for further
19
49
2. Cell indents and alignment
20
50
3. Cell number formats
21
51
4. Cell color
22
-
5. Top-left rows and columns addition to UsedRange/dimension
23
-
6. Additional formats and further development
52
+
5. Top-left rows and columns addition to UsedRange/dimension (see this [post about UsedRange pitfall](http://excel-inside.pro/blog/2017/05/23/excel-sheet-as-a-source-to-power-query-and-power-bi-a-pitfall-of-usedrange/))
53
+
6. Additional formats, conditional formats and further development