Working with Excel in Python
Read & write Excel files with pandas and openpyxl: sheets, engines, formatting, formulas, dates, and performance tips.
Quiz progress
0 / 10 answered
Contents
Fundamentals #
What you’ll use
pandas.read_excel()/DataFrame.to_excel()for fast, high-level workflows.openpyxlfor low-level control of.xlsx/.xlsm(cells, styles, freeze panes, formulas).- “Engine” matters: openpyxl (xlsx/xlsm), xlsxwriter (write xlsx with rich formatting), pyxlsb (read xlsb), xlrd 1.2.0 (legacy .xls read).
# Quick tour with pandas
import pandas as pd
df = pd.DataFrame({"id":[1,2], "name":["Ada","Grace"]})
df.to_excel("people.xlsx", index=False, sheet_name="People") # write
df2 = pd.read_excel("people.xlsx", sheet_name="People") # read
print(df2.head())
“Usepandasfor tables; drop down toopenpyxl/xlsxwriterwhen you need Excel-specific features.”
Quiz: Fundamentals
1. Which function reads an Excel file into a DataFrame?
2. Which library lets you style cells and write formulas in
.xlsx?Reading Excel #
Selective loading with pandas
- Choose a sheet:
sheet_name="Sheet1"or index0. UseNonefor all sheets (returns dict). - Limit columns/rows:
usecols="A:C"or list;skiprows=,nrows=. - Types & missing:
dtype={"id":int},na_values=["NA",""],parse_dates=[... ].
import pandas as pd
orders = pd.read_excel(
"orders.xlsx",
sheet_name="2025-Q1",
usecols="A:D",
dtype={"qty":"Int64"},
na_values=["", "NA"],
parse_dates=["order_date"]
)
print(orders.dtypes)
Quiz: Reading
3. Which argument selects only “A:C” columns while reading?
4. How do you read all sheets with pandas?
Writing Excel #
Single & multi-sheet output
DataFrame.to_excel("out.xlsx", index=False)to avoid the index column.- Multiple sheets:
pd.ExcelWriter(...)context, callto_excel(..., sheet_name=...)for each. - Append to existing file:
ExcelWriter(..., mode="a", if_sheet_exists="replace")(depends on engine).
import pandas as pd
with pd.ExcelWriter("report.xlsx", engine="xlsxwriter") as xw:
df_sales.to_excel(xw, index=False, sheet_name="Sales")
df_costs.to_excel(xw, index=False, sheet_name="Costs")
Quiz: Writing
5. How do you prevent the index from being written as a column?
6. What’s the idiomatic way to write multiple sheets in one file?
Formatting, Freeze, Formulas #
Two approaches
- pandas + xlsxwriter: easy formatting via workbook/worksheet objects.
- openpyxl: manipulate cells directly (styles, freeze panes, formulas, keep VBA).
# pandas + xlsxwriter: freeze panes & set column width
with pd.ExcelWriter("styled.xlsx", engine="xlsxwriter") as xw:
df.to_excel(xw, index=False, sheet_name="Data")
ws = xw.sheets["Data"]
ws.freeze_panes(1, 0) # freeze first row
ws.set_column("A:A", 12) # width
ws.write_formula("D2", "=B2*C2") # formula
# openpyxl: styles / formulas / freeze
from openpyxl import Workbook
from openpyxl.styles import Font
wb = Workbook()
ws = wb.active
ws.title = "Data"
ws.append(["name","qty","price","total"])
ws.append(["Widget", 3, 9.99, "=B2*C2"])
ws["A1"].font = Font(bold=True)
ws.freeze_panes = "A2"
wb.save("styled_openpyxl.xlsx")
Quiz: Formatting & Formulas
7. How do you freeze panes at B2 using
openpyxl?8. With the xlsxwriter engine, how do you add a formula to D2?
Engines & Compatibility #
Which engine for which file?
.xlsx/.xlsm: read/write withopenpyxl; rich formatting on write withxlsxwriter..xls(Excel 97-2003): read withxlrd==1.2.0(legacy) or convert via Excel/LibreOffice first..xlsb: read viapyxlsbengine in pandas; writing is not generally supported.- Macros: preserve by loading with
openpyxl.load_workbook(..., keep_vba=True)and saving as.xlsm.
# Preserve macros with openpyxl
from openpyxl import load_workbook
wb = load_workbook("macro_book.xlsm", keep_vba=True)
# ... modify values/styles ...
wb.save("macro_book_out.xlsm")
Quiz: Engines
9. To read
.xlsb with pandas, which engine should you use?10. How do you preserve VBA macros when editing an
.xlsm file?Dates & Pitfalls #
Common gotchas
- Excel stores dates as serial numbers; beware the 1900 “leap-year” bug. Prefer pandas
parse_datesand setdate_formaton write. - To read displayed values of formulas with openpyxl, load with
data_only=True(requires the file to have been calculated by Excel). - Large sheets: prefer
openpyxl.load_workbook(..., read_only=True)and iterate rows; pandasread_excelloads into memory (nochunksize).
# Read displayed values from formulas (not the formulas)
from openpyxl import load_workbook
wb = load_workbook("calc.xlsx", data_only=True)
ws = wb.active
print(ws["D2"].value) # last-calculated result
Final Quiz & Summary #
Review your performance and revisit questions you missed.
Category: python · Lesson: excel-basics
Learning by Examples
Edit the code and see the result in the console panel.
# Excel playground
# This tries to use openpyxl if available; otherwise prints a helpful message.
from io import BytesIO
print("— Excel I/O demo —")
try:
from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font
# Create an in-memory workbook
wb = Workbook()
ws = wb.active
ws.title = "Data"
ws.append(["name","qty","price","total"])
ws.append(["Widget", 3, 9.99, "=B2*C2"])
ws["A1"].font = Font(bold=True)
ws.freeze_panes = "A2"
buf = BytesIO()
wb.save(buf)
print("Created workbook in memory with 2 rows + header.")
print("Cell D2 is a formula:", ws["D2"].value)
# Read back displayed values (not formulas) if pre-calculated
buf.seek(0)
wb2 = load_workbook(buf, data_only=True)
ws2 = wb2["Data"]
print("Reading back (data_only=True) D2 value:", ws2["D2"].value)
except Exception as e:
print("openpyxl not available in this environment.")
print("To run locally: pip install openpyxl")
print("Then create a workbook, write rows, set styles, and save to .xlsx.")
print("Error detail:", type(e).__name__, "-", str(e))