by Al Danial

Create Excel files in MATLAB with Python's openpyxl module

Part 4 of the Python Is The Ultimate MATLAB Toolbox series.

Your latest MATLAB masterpiece produced a goldmine of incisive data and now it’s time to present the results to management. MATLAB Live Scripts are an excellent reporting mechanism but these may not suit an organization’s workflow or documentation standards. Sometimes it’s necessary to fall back to showing results in a basic tool like Microsoft Excel.

MATLAB + Excel = not easy

MATLAB’s ability to create Excel files is modest at best. The writecell(), writematrix(), and writetable() functions can populate an Excel file with data, but only using the default text style. If you want to change typefaces, fonts, colors, merge cells and so on, you’re in for some hardship. Custom formatting is only supported on Windows using the COM layer (see Write Data to Excel Spreadsheets, bottom of the page) and examples demonstrating this process are hard to find.

If you’re using MATLAB on macOS or Linux, you’re completely out of luck writing rich Excel files.

MATLAB + Python + Excel = easy!

If you include Python with your MATLAB setup on the otherhand, creating richly-formatted Excel files on any platform is easy. Excel files can be created in MATLAB by calling functions from the Python openpyxl module. The first step is to check if you have openpyxl and to install it if you don’t.

Install openpyxl

Check your Python installation to see if it has openpyxl by starting the Python command line REPL and importing the module. If it imports successfully you can also check the version number:

> python
>>> import openpyxl
>>> openpyxl.__version__
'3.0.9'

If you get a ModuleNotFoundError, you’ll need to install it. That can be done on an Anaconda Python installation with

> conda install openpyxl

or with

> python -m pip install openpyxl

using other Python distributions.

Start a new .xlsx file

The small program below creates an empty Excel file, my_analysis.xlsx, containing two worksheets, or tabs, called “Amazing Results” and “Summary”. For special emphasis, the Summary tab’s color is changed to green. Unfortunately colors in openpyxl must be provided as RGB or aRGB (‘a’ = alpha, to control transparency) hex values rather than words like “green”.

Python file: empty.py:

1
2
3
4
5
6
7
#!/usr/bin/env python
import openpyxl as OP
book = OP.Workbook()
sheet_data = book.create_sheet("Amazing Results", 0)
sheet_summary = book.create_sheet("Summary", 1)
sheet_summary.sheet_properties.tabColor = "15AA15"
book.save("my_analysis.xlsx")

The MATLAB version is empty.m:

1
2
3
4
5
6
7
% MATLAB
OP = py.importlib.import_module('openpyxl');
book = OP.Workbook();
sheet_data = book.create_sheet("Amazing Results", int64(0));
sheet_summary = book.create_sheet("Summary", int64(1));
sheet_summary.sheet_properties.tabColor = "15AA15";
book.save("my_analysis.xlsx");

my_analysis.xlsx created by empty.py or empty.m

my_analysis.xlsx created by empty.py or empty.m

The create_sheet() method’s arguments are the worksheet’s title and an optional position in the tab sequence, where 0 is the leftmost tab. Note the presence of the ‘Sheet’ tab. This was created by default and was pushed to the right when our sheets were added. The ‘Sheet’ tab can be deleted like this:

# Python
del book['Sheet']
   #  -  or  -
existing_sheet = book['Sheet']
book.remove_sheet(existing_sheet)
   #  -  or  -
existing_sheet = book.get_sheet_by_name('Sheet')
book.remove_sheet(existing_sheet)

MATLAB won’t let us use [ ] to index the Python book object so only the third option works there:

% MATLAB
existing_sheet = book.get_sheet_by_name('Sheet');
book.remove_sheet(existing_sheet);

Avoiding bracket notation

As noted above, MATLAB won’t let us index Python objects with brackets. This prevents us from using openpyxl shortcuts such as assigning a cell’s value using something like sheet['C8'] = 3.14159. In Section 7.8 of Python for MATLAB Development, I suggested using a bridge module to overcome this interface limitation.

After the book came out, David Garrison at the MathWorks informed me that a bridge module is unnecessary since openpyxl offers method- and attribute-based ways to achieve the same thing. Setting cell C8’s value can also be done with

C8 = sheet.cell(3,8)
C8.value = 3.14159

All examples below use method and attribute access methods to accomodate MATLAB. The code examples therefore don’t demonstrate the most idiomatic use of openpyxl, but they have the benefits that Python and MATLAB code samples look almost identical, and that the MATLAB implementation won’t need a bridge module.

Fill a cell with a value

To populate a cell with a number or string, get a handle to the cell using the worksheet’s .cell() method, pass it the 1-based row and column indices, then set the handle’s .value attribute with the number or string of interest.

These lines put the string “Total” in cell B9 which corresponds to row 2, column 9:

# Python
B9 = sheet_data.cell(2, 9)
B9.value = "Total"

It looks like this in MATLAB:

% MATLAB
B9 = sheet_data.cell(int64(2), int64(9));
B9.value = "Total";

The int64() type casts are necessary because numeric literals in MATLAB have type double while the Python function expects integers.

Typeface size, font, color

A cell’s font can be modified by creating a “font style” with the Font class then applying this style to the cell. The code example below creates a style, ft_title, having a 14 point blue Arial typeface with bold italics font.

# Python
import openpyxl.styles as styles
Font = styles.Font
ft_title = Font(name="Arial", size=14, color="0000FF",
                bold=True, italics=True)

MATLAB 2021b and earlier use pyargs() to set keyword value pairs:

% MATLAB 2021b and newer
ft_title = Font( pyargs("name","Arial", "size",int64(14),  ...
                        "color","0000FF", "bold",py.True,  ...
                        "italics",py.True) );

MATLAB 2022a and newer can use pyargs() as above or, even better, let you assign keyword values the same way as Python:

% MATLAB 2022a and newer
ft_title = Font(name=Arial, size=int64(14), color="0000FF", ...
                bold=py.True, italics=py.True);

The font style can then be applied to the text in a cell by setting the cell’s .font attribute:

# Python
B9 = sheet_data.cell(2, 9)
B9.value = "Total"
B9.font  = ft_title

and

% MATLAB
B9 = sheet_data.cell(int64(2), int64(9));
B9.value = "Total";
B9.font  = ft_title;

Alignment

Text in a cell can be aligned vertically and horizontally, rotated, indented, and shrunk to fit a given container using the Alignment class. Text in merged cells looks good when it is centered both horizontally and vertically. That can be done with

# Python
import openpyxl.styles as styles
Alignment = styles.Alignment
B9.alignment = Alignment(horizontal="center", vertical="center")

and

% MATLAB 2021b and newer
B9.alignment = Alignment(pyargs("horizontal","center", ...
                                "vertical","center"));
% MATLAB 2022a and newer
B9.alignment = Alignment(horizontal="center", vertical="center");

Currency, percent, and other numeric formats

A cell can be formatted to show numeric values as currency or a percentage through the cell’s .number_format attribute. The numerical format notation is rather unusual in that it allows positive values to be formatted diffently than negative values. If such a distinction is desired, a semicolon and the negative value format are appended to the postive value format. Financial ledgers sometimes show negative values in red and/or in parentheses so these variations are shown in the example below. Oddly, one may use words like ‘Red’ or ‘Blue’ to define the color within a .number_format but all other color definition options only allow hex values.

The .number_format attribute also controls date format styles and scientific notation. See the documentation for numeric styles for all options.

B9.number_format = '"$"#,##0.00;[Red]("$"#,##0.00)'  # eg  $12,345.67
B9.number_format = '0.0%;[Red]0.0%'                  # eg  98.7%

The dollar symbol can be replaced with other currency symbols simply by replacing "$" with the desired character — "€" for example.

Cell background color

Cell backgrounds can be set to a variety of colors and patterns with the PatternFill class. Following the same technique as with fonts and number styles, one first defines a pattern style then applies this to a cell’s .fill attribute. The style bg_green defines a solid lime green background color:

# Python
import openpyxl.styles as styles
PatternFill = styles.PatternFill
bg_green = PatternFill(fgColor="C5FD2F", fill_type="solid")
B9 = sheet_data.cell(2,9)
B9.value = "Total"
B9.font  = ft_title
B9.fill  = bg_green
% MATLAB 2020b and newer
bg_green = PatternFill(pyargs("fgColor","C5FD2F","fill_type","solid"));
B9 = sheet_data.cell(int64(2),int64(9));
B9.value = "Total";
B9.font  = ft_title;
B9.fill  = bg_green;
% MATLAB 2022a and newer
bg_green = PatternFill(fgColor="C5FD2F", fill_type="solid");
B9 = sheet_data.cell(int64(2),int64(9));
B9.value = "Total";
B9.font  = ft_title;
B9.fill  = bg_green;

Merged cells

A block of adjacent cells can be coalessed into a single cell using a worksheet’s .merge_cells() method. The method takes a cell block range as an argument. The example below merges the nine cells B2, B3, B4, C2, C3, C4, D2, D3, and D4 into one. After the merge, the cell’s handle is accessed using the row and column coordinates of the top left cell:

# Python
sheet_data.merge_cells("B2:D4")
B2 = sheet_data.cell(2,2)
B2.value = "A Title"
B2.font = ft_title
B2.alignment = Alignment(horizontal="center", vertical="center")
% MATLAB 2020b and newer
sheet_data.merge_cells("B2:D4");
B2 = sheet_data.cell(int64(2),int64(2));
B2.value = "A Title";
B2.font = ft_title;
B2.alignment = Alignment(pyargs("horizontal","center", ...
                                "vertical","center");
% MATLAB 2022a and newer
sheet_data.merge_cells("B2:D4");
B2 = sheet_data.cell(int64(2),int64(2));
B2.value = "A Title";
B2.font = ft_title;
B2.alignment = Alignment(horizontal="center", vertical="center");

Equations

Equations are easily added to spreasheets as they are merely cells containing strings that begin with an equals sign and an operator. For example, to add an equation that sums values in the cell range E2:E50, one merely populates a cell with '=SUM(E2:E50)'.

B9.value = '=SUM(E2:E50)'

Putting it all together

This spreadsheet includes all customizations described above:

worldwide_sales.xlsx created by all_together.py or all_together.m. Note: values are randomly generated and so will vary with each run.

worldwide_sales.xlsx created by all_together.py or all_together.m. Note: values are randomly generated and so will vary with each run.

File: all_together.py

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
#!/usr/bin/env python
import numpy as np
import openpyxl as OP
import openpyxl.styles as styles
Font = styles.Font
Alignment = styles.Alignment
PatternFill = styles.PatternFill

book = OP.Workbook()
sheet_data = book.create_sheet("Profits 2022", 0)
sheet_data.sheet_properties.tabColor = "15AA15"
Region = ['N America', 'S America', 'Europe', 'Asia', 'Australia', 'Antarctica']
nR = len(Region)
profits = 10000 * (0.75 - np.random.rand(nR,2))
dollars = '"$"#,##0.00;[Red]("$"#,##0.00)'
pct     = '0.0%;[Red]0.0%'

sheet_data.merge_cells("B2:F3")
B2 = sheet_data.cell(2,2)
B2.value = 'Worldwide Sales'
B2.font = Font(name="Arial", size=16, color="4444FF", bold=True)
B2.alignment = Alignment(horizontal="center", vertical="center")

row, col = 5, 2

header_font = Font(name="Arial", size=12, bold=True)

region_cell = sheet_data.cell(row, 2)   # B5
region_cell.value = "Region"
region_cell.font  = header_font

last_year = sheet_data.cell(row, 3)     # C5
last_year.value = "Last Year"
last_year.font  = header_font

this_year = sheet_data.cell(row, 4)     # D5
this_year.value = "This Year"
this_year.font  = header_font

growth = sheet_data.cell(row, 5)        # E5
growth.value = "Growth"
growth.font  = header_font

rank = sheet_data.cell(row,6)           # F5
rank.value = 'Rank'
rank.font  = header_font

row += 1
highest_growth, highest_growth_row = -9.9e+99, 0
all_growth = np.zeros((nR,))
for r in range(nR):
    where     = sheet_data.cell(row+r, 2)
    where.value = Region[r]
    where.font = Font(name="Calibri", italic=True)
    last_year = sheet_data.cell(row+r, 3)
    this_year = sheet_data.cell(row+r, 4)
    growth    = sheet_data.cell(row+r, 5)
    last_year.value = profits[r,0]
    last_year.number_format = dollars
    this_year.value = profits[r,1]
    this_year.number_format = dollars
    growth.value = (this_year.value - last_year.value)/abs(last_year.value)
    all_growth[r] = growth.value
    growth.number_format = pct
    if growth.value > highest_growth:
        highest_growth = growth.value
        highest_growth_row = row + r

# highlight row with highest growth
highlight_color = PatternFill(fgColor="E4FF00",fill_type="solid")
for c in range(2,7):  # columns B to F
    cell = sheet_data.cell(highest_growth_row,c)
    cell.fill = highlight_color

# add values in the ranking column
rank = nR - all_growth.argsort().argsort()
for r in range(nR):
    cell = sheet_data.cell(row+r,6)
    cell.value = rank[r]

# cummulative row
total = sheet_data.cell(row+nR, 2)
total.value = "Total"
total.font  = header_font
for i,letter in enumerate(['C', 'D']):
    cell = sheet_data.cell(row+nR,i+3)
    cell.value = f'=SUM({letter}{row}:{letter}{row+nR-1})'
    cell.number_format = dollars

# cummulative growth
row += nR
cell = sheet_data.cell(row,5)
cell.value = f'=(D{row} - C{row})/ABS(C{row})'
cell.number_format = pct

book.save("worldwide_sales.xlsx")

MATLAB 2020b & newer using pyargs():

File: all_together_2020b.m

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
% MATLAB 2020b & newer
clear
OP = py.importlib.import_module('openpyxl');
styles = py.importlib.import_module('openpyxl.styles');

Font = styles.Font;
Alignment = styles.Alignment;
PatternFill = styles.PatternFill;

book = OP.Workbook();
sheet_data = book.create_sheet("Profits 2022", int64(0));

sheet_data.sheet_properties.tabColor = "15AA15";
Region = {'N America', 'S America', 'Europe', 'Asia', 'Australia', 'Antarctica'};
nR = int64(length(Region));
profits = 10000 * (0.75 - rand(nR,2));
dollars = '"$"#,##0.00;[Red]("$"#,##0.00)';
pct     = '0.0%;[Red]0.0%';

sheet_data.merge_cells("B2:E3");
B2 = sheet_data.cell(int64(2),int64(2));
B2.value = 'Worldwide Sales';
B2.font = Font(pyargs("name","Arial","size",int64(16),...
                      "color","4444FF","bold",py.True));
B2.alignment = Alignment(pyargs("horizontal","center","vertical","center"));

row = int64(5);
col = int64(2);

header_font = Font(pyargs("name","Arial","size",int64(12),"bold",py.True));

region_cell = sheet_data.cell(row, int64(2));   % B5
region_cell.value = "Region";
region_cell.font  = header_font;

last_year = sheet_data.cell(row, int64(3));     % C5
last_year.value = "Last Year";
last_year.font  = header_font;

this_year = sheet_data.cell(row, int64(4));     % D5
this_year.value = "This Year";
this_year.font  = header_font;

growth       = sheet_data.cell(row, int64(5));  % E5
growth.value = "Growth";
growth.font  = header_font;

rank = sheet_data.cell(row, int64(6));          % F5
rank.value = "Rank";
rank.font  = header_font;

row = row + 1;
highest_growth = -9.9e+99;
highest_growth_row = int64(0);
all_growth = zeros(nR,1);
for r = 1:nR
    where = sheet_data.cell(row+r, int64(2));
    where.value = Region{r};
    where.font = Font(pyargs("name","Calibri","italic",py.True));
    last_year = sheet_data.cell(row+r, int64(3));
    this_year = sheet_data.cell(row+r, int64(4));
    growth    = sheet_data.cell(row+r, int64(5));
    last_year.value = profits(r,1);
    last_year.number_format = dollars;
    this_year.value = profits(r,2);
    this_year.number_format = dollars;
    growth.value = (this_year.value - last_year.value)/abs(last_year.value);
    all_growth(r) = growth.value;
    growth.number_format = pct;
    if growth.value > highest_growth
        highest_growth = growth.value;
        highest_growth_row = row + r;
    end
end

% highlight row with highest growth
highlight_color = PatternFill(pyargs("fgColor","E4FF00","fill_type","solid"));
for c = int64(2):int64(5)
    cell = sheet_data.cell(highest_growth_row,c);
    cell.fill = highlight_color;
end

% add values in the ranking column
[~, index] = sort(all_growth);
[~, rank] = sort(index);
rank = 1 + nR - int64(rank);
for r = 1:nR
    cell = sheet_data.cell(row+r,int64(6));
    cell.value = rank(r);
end

% cummulative row
row = row + int64(1);
total = sheet_data.cell(row+nR, int64(2));
total.value = "Total";
total.font  = header_font;
i = 0;
for letter = {'C', 'D'}
    cell = sheet_data.cell(row+nR,int64(i+3));
    cell.value = sprintf('=SUM(%s%d:%s%d)', ...
                         letter{1}, row, letter{1}, row+nR-1);
    cell.number_format = dollars;
    i = i + 1;
end

% cummulative growth
row = row + nR;
cell = sheet_data.cell(row,int64(5));
cell.value = sprintf('=(D%d - C%d)/ABS(C%d)', row, row, row);
cell.number_format = pct;

book.save("worldwide_sales.xlsx")

MATLAB 2022a & newer using x=y keyword argument assignments instead of pyargs():

File: all_together.m

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
% MATLAB 2022a & newer
clear
OP = py.importlib.import_module('openpyxl');
styles = py.importlib.import_module('openpyxl.styles');

Font = styles.Font;
Alignment = styles.Alignment;
PatternFill = styles.PatternFill;

book = OP.Workbook();
sheet_data = book.create_sheet("Profits 2022", int64(0));

sheet_data.sheet_properties.tabColor = "15AA15";
Region = {'N America', 'S America', 'Europe', 'Asia', 'Australia', 'Antarctica'};
nR = int64(length(Region));
profits = 10000 * (0.75 - rand(nR,2));
dollars = '"$"#,##0.00;[Red]("$"#,##0.00)';
pct     = '0.0%;[Red]0.0%';

sheet_data.merge_cells("B2:E3");
B2 = sheet_data.cell(int64(2),int64(2));
B2.value = 'Worldwide Sales';
B2.font = Font(name="Arial", size=int64(16), color="4444FF", bold=py.True);
B2.alignment = Alignment(horizontal="center", vertical="center");

row = int64(5);
col = int64(2);

header_font = Font(name="Arial", size=int64(12), bold=py.True);

region_cell = sheet_data.cell(row, int64(2));   % B5
region_cell.value = "Region";
region_cell.font  = header_font;

last_year = sheet_data.cell(row, int64(3));     % C5
last_year.value = "Last Year";
last_year.font  = header_font;

this_year = sheet_data.cell(row, int64(4));     % D5
this_year.value = "This Year";
this_year.font  = header_font;

growth       = sheet_data.cell(row, int64(5));  % E5
growth.value = "Growth";
growth.font  = header_font;

rank = sheet_data.cell(row, int64(6));          % F5
rank.value = "Rank";
rank.font  = header_font;

row = row + 1;
highest_growth = -9.9e+99;
highest_growth_row = int64(0);
all_growth = zeros(nR,1);
for r = 1:nR
    where = sheet_data.cell(row+r, int64(2));
    where.value = Region{r};
    where.font = Font(name="Calibri", italic=py.True);
    last_year = sheet_data.cell(row+r, int64(3));
    this_year = sheet_data.cell(row+r, int64(4));
    growth    = sheet_data.cell(row+r, int64(5));
    last_year.value = profits(r,1);
    last_year.number_format = dollars;
    this_year.value = profits(r,2);
    this_year.number_format = dollars;
    growth.value = (this_year.value - last_year.value)/abs(last_year.value);
    all_growth(r) = growth.value;
    growth.number_format = pct;
    if growth.value > highest_growth
        highest_growth = growth.value;
        highest_growth_row = row + r;
    end
end

% highlight row with highest growth
highlight_color = PatternFill(fgColor="E4FF00",fill_type="solid");
for c = int64(2):int64(5)
    cell = sheet_data.cell(highest_growth_row,c);
    cell.fill = highlight_color;
end

% add values in the ranking column
[~, index] = sort(all_growth);
[~, rank] = sort(index);
rank = 1 + nR - int64(rank);
for r = 1:nR
    cell = sheet_data.cell(row+r,int64(6));
    cell.value = rank(r);
end

% cummulative row
row = row + int64(1);
total = sheet_data.cell(row+nR, int64(2));
total.value = "Total";
total.font  = header_font;
i = 0;
for letter = {'C', 'D'}
    cell = sheet_data.cell(row+nR,int64(i+3));
    cell.value = sprintf('=SUM(%s%d:%s%d)', ...
                         letter{1}, row, letter{1}, row+nR-1);
    cell.number_format = dollars;
    i = i + 1;
end

% cummulative growth
row = row + nR;
cell = sheet_data.cell(row,int64(5));
cell.value = sprintf('=(D%d - C%d)/ABS(C%d)', row, row, row);
cell.number_format = pct;

book.save("worldwide_sales.xlsx")

openpyxl can do more

Only a small fraction of openpyxl’s capabilities are demonstrated here. This module can also create column filters, 2D and 3D stacked plots, 2D and 3D line plots with optional log scales, bar charts, scatter plots, pie charts, donut charts, radar plots, stock plots with high/low markers, 2D and 3D contour plots, and gauges.

Motivated MATLAB users will be able to make extraordinary Excel spreadsheets with openpyxl.

Join me again on July 9, 2022 to see how MATLAB can work with PostgreSQL databases using the psycopg2 Python module.