Automate Excel with Python: 10 Copy‑Paste Scripts to Clean Data, Build Reports, and Chart Faster
If you spend hours wrangling spreadsheets, you’re not alone. Most professionals use Excel daily, yet much of that work is repetitive: merging files, cleaning columns, formatting reports, and rebuilding the same charts week after week. What if those tasks ran themselves while you focused on analysis and decisions?
This guide gives you 10 real‑world Python scripts you can copy, paste, and run today—even if you’ve never written code. You’ll learn how to merge folders of Excel files in seconds, clean messy data automatically, generate weekly reports, email the results, and even build a daily dashboard. I’ll explain each script in plain language, with tips so you can adapt them to your workflow.
What you need to get started (no experience required)
Before automating Excel, set up a simple Python environment. You only need a few tools:
- Python 3.9+ (3.11+ recommended) – download from python.org.
- A code editor like VS Code (free) or even Notepad will do.
- Key libraries:
- pandas for data wrangling (docs)
- openpyxl for reading/writing .xlsx (docs)
- XlsxWriter for formatting and Excel charts (docs)
- matplotlib or seaborn for plots (matplotlib docs)
- watchdog for folder monitoring (docs)
- smtplib/win32com for email (smtplib docs)
Install everything with one command:
pip install pandas openpyxl XlsxWriter matplotlib seaborn watchdog
Windows Outlook users who want native Outlook sending can add:
pip install pywin32
If you’re nervous about the command line, don’t be. You’ll mostly copy and paste. Here’s why that matters: once you run these scripts once, you can reuse them forever.
Want the complete, copy‑paste bundle and step‑by‑step walkthroughs? Shop on Amazon.
Script 1: Merge dozens of Excel files in seconds (pandas + glob)
Use this when you receive a folder full of monthly or regional files and need a master sheet.
What it does: – Reads all .xlsx files in a folder – Appends them into one DataFrame – Adds a “source file” column (handy for traceability) – Saves to a clean master.xlsx
Code:
import pandas as pd
from pathlib import Path
folder = Path("data/monthly_reports") # change this
files = list(folder.glob("*.xlsx"))
frames = []
for f in files:
df = pd.read_excel(f)
df["source_file"] = f.name
frames.append(df)
master = pd.concat(frames, ignore_index=True)
master.to_excel("master.xlsx", index=False)
print(f"Merged {len(files)} files into master.xlsx")
Tip: If columns differ slightly across files, use pd.concat(..., join="outer")
and standardize later.
Script 2: Clean messy data automatically (trim, case, dates, numbers)
This is your “make it neat” button. It fixes common quality issues:
- Trim whitespace and standardize casing
- Parse dates
- Coerce numeric fields
- Drop duplicates
Code:
import pandas as pd
df = pd.read_excel("master.xlsx")
# Standardize column names
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
# Trim whitespace in object columns
for c in df.select_dtypes(include="object").columns:
df[c] = df[c].astype(str).str.strip()
# Example: standardize text case in a category column
if "category" in df.columns:
df["category"] = df["category"].str.title()
# Parse dates safely
for c in df.columns:
if "date" in c:
df[c] = pd.to_datetime(df[c], errors="coerce")
# Convert currency/numeric-like columns
for c in ["amount", "revenue", "cost"]:
if c in df.columns:
df[c] = pd.to_numeric(df[c], errors="coerce")
# Remove rows with no critical fields and drop duplicates
critical = [c for c in ["date", "category", "amount"] if c in df.columns]
df = df.dropna(subset=critical).drop_duplicates()
df.to_excel("master_clean.xlsx", index=False)
print("Cleaned data saved to master_clean.xlsx")
Pro move: keep the cleaning script separate so you can reuse it with different input files.
Choosing the right Excel automation tools (pandas, openpyxl, XlsxWriter, xlwings)
You have options, and the best choice depends on what you’re doing:
- pandas: fastest for reading/wrangling data at scale. Great for merges, group-bys, pivot tables, and exports.
- openpyxl: reads/writes .xlsx with control over cells, formulas, and styles. Ideal for light formatting.
- XlsxWriter: best-in-class Excel formatting and charts from Python. Fast and feature-rich.
- xlwings: controls an actual Excel instance (useful if you need macros/VBA interop or interactive workbooks).
Rules of thumb:
– Heavy data work? Use pandas.
– Fancy formatting and native Excel charts? Save with XlsxWriter (to_excel(engine="xlsxwriter")
).
– Need to click buttons in Excel from Python or reuse existing macros? Try xlwings.
If you’d like a vetted, beginner‑friendly starter pack with scripts and setup tips, Check it on Amazon.
For deeper background, see Microsoft’s Excel automation overview (Microsoft Learn).
Script 3: Generate professional weekly reports
Goal: Produce a weekly summary by team, region, or product—automatically.
What it does: – Groups data by week and category – Calculates totals – Applies basic formatting in Excel
Code:
import pandas as pd
df = pd.read_excel("master_clean.xlsx")
# Ensure a date column
date_col = "date" if "date" in df.columns else df.filter(like="date").columns[0]
df["week"] = pd.to_datetime(df[date_col]).dt.to_period("W").apply(lambda r: r.start_time)
group_cols = [c for c in ["week", "category", "region"] if c in df.columns]
value_col = "amount" if "amount" in df.columns else df.select_dtypes("number").columns[0]
weekly = df.groupby(group_cols, dropna=False)[value_col].sum().reset_index()
with pd.ExcelWriter("weekly_report.xlsx", engine="xlsxwriter") as writer:
weekly.to_excel(writer, sheet_name="Summary", index=False)
wb = writer.book
ws = writer.sheets["Summary"]
currency_fmt = wb.add_format({"num_format": "$#,##0", "font_name": "Calibri"})
date_fmt = wb.add_format({"num_format": "yyyy-mm-dd"})
header_fmt = wb.add_format({"bold": True, "bg_color": "#F2F2F2"})
ws.set_row(0, None, header_fmt)
ws.set_column("A:A", 12, date_fmt) # week
ws.set_column("B:D", 18)
ws.set_column("E:E", 14, currency_fmt) # value
Why it works: weekly aggregation gives you a repeatable, executive-friendly snapshot.
Script 4: Email updates with attachments (Gmail/Outlook)
Send your weekly report on autopilot. Two approaches:
- Gmail or any SMTP server via
smtplib
- Outlook on Windows via
win32com.client
(great inside corporate environments)
SMTP example:
import smtplib, ssl
from email.message import EmailMessage
sender = "you@example.com"
recipient = "team@example.com"
subject = "Weekly Report"
body = "Attached is this week's report. Highlights inside."
msg = EmailMessage()
msg["From"] = sender
msg["To"] = recipient
msg["Subject"] = subject
msg.set_content(body)
with open("weekly_report.xlsx", "rb") as f:
msg.add_attachment(
f.read(),
maintype="application",
subtype="vnd.openxmlformats-officedocument.spreadsheetml.sheet",
filename="weekly_report.xlsx"
)
smtp_server = "smtp.gmail.com"
port = 465
password = "YOUR_APP_PASSWORD" # Use app passwords, not your main login
context = ssl.create_default_context()
with smtplib.SMTP_SSL(smtp_server, port, context=context) as server:
server.login(sender, password)
server.send_message(msg)
print("Email sent!")
Outlook example (Windows only):
import win32com.client as win32
outlook = win32.Dispatch("Outlook.Application")
mail = outlook.CreateItem(0)
mail.To = "team@example.com"
mail.Subject = "Weekly Report"
mail.Body = "Attached is this week's report."
mail.Attachments.Add(Source=r"C:\path\to\weekly_report.xlsx")
mail.Send()
print("Outlook email sent!")
Prefer a ready-made playbook that includes email templates and attachments? See price on Amazon.
Security tip: Always use app passwords or OAuth for SMTP. Never hardcode real credentials in shared scripts. For reference, see Python’s official docs for smtplib.
Script 5: Create beautiful charts for presentations (matplotlib + XlsxWriter)
Build charts once and export them as images or embedded Excel charts.
Code (save PNG charts + Excel summary):
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_excel("weekly_report.xlsx")
pivot = df.pivot_table(index="week", columns="category", values=df.columns[-1], aggfunc="sum").fillna(0)
ax = pivot.plot(kind="line", figsize=(8,4), title="Weekly Trend by Category")
plt.tight_layout()
plt.savefig("weekly_trend.png", dpi=150)
plt.close()
with pd.ExcelWriter("weekly_with_chart.xlsx", engine="xlsxwriter") as writer:
pivot.to_excel(writer, sheet_name="Data")
wb = writer.book
ws = wb.add_worksheet("Chart")
ws.insert_image("B2", "weekly_trend.png", {"x_scale": 0.8, "y_scale": 0.8})
Alternate approach: build native Excel charts with XlsxWriter’s chart API for consistent styling within Excel.
Script 6: Monitor a folder and process files instantly (watchdog)
When new files arrive in a shared folder, process them automatically.
Code:
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import pandas as pd
from pathlib import Path
watch_dir = Path("incoming")
class NewFileHandler(FileSystemEventHandler):
def on_created(self, event):
if not event.is_directory and event.src_path.endswith(".xlsx"):
path = Path(event.src_path)
print(f"New file: {path.name}")
df = pd.read_excel(path)
# Example processing: append to master
try:
master = pd.read_excel("master.xlsx")
combined = pd.concat([master, df], ignore_index=True)
except FileNotFoundError:
combined = df
combined.to_excel("master.xlsx", index=False)
if __name__ == "__main__":
event_handler = NewFileHandler()
observer = Observer()
observer.schedule(event_handler, str(watch_dir), recursive=False)
observer.start()
print(f"Watching {watch_dir.resolve()} ...")
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
observer.stop()
observer.join()
Ready to upgrade your workflow with a tested project you can run today? Buy on Amazon.
Note: watchdog must keep running. Schedule it as a background service, or trigger a batch script when files appear.
Script 7: Combine data from multiple sheets
Sometimes the data is split across many tabs inside the same workbook. This script pulls them all together.
Code:
import pandas as pd
wb = "multi_tab.xlsx"
sheets = pd.read_excel(wb, sheet_name=None) # dict of {sheet_name: DataFrame}
frames = []
for name, df in sheets.items():
df["sheet"] = name
frames.append(df)
combined = pd.concat(frames, ignore_index=True)
combined.to_excel("combined_sheets.xlsx", index=False)
print("Combined all sheets into combined_sheets.xlsx")
Tip: Filter out lookup or hidden tabs by name if needed.
Script 8: Add professional formatting to reports (styles, zebra stripes, totals)
Presentation matters. Use XlsxWriter formats to make reports “board-ready.”
Code:
import pandas as pd
df = pd.read_excel("master_clean.xlsx").head(500) # demo subset
with pd.ExcelWriter("styled_report.xlsx", engine="xlsxwriter") as writer:
df.to_excel(writer, sheet_name="Report", index=False)
wb = writer.book
ws = writer.sheets["Report"]
# Formats
header = wb.add_format({"bold": True, "bg_color": "#1F4E78", "font_color": "white", "align": "center"})
money = wb.add_format({"num_format": "$#,##0"})
zebra1 = wb.add_format({"bg_color": "#F9F9F9"})
zebra2 = wb.add_format({"bg_color": "#FFFFFF"})
# Apply header
ws.set_row(0, 18, header)
# Set column widths and currency format if amount exists
for idx, col in enumerate(df.columns):
ws.set_column(idx, idx, max(10, min(25, len(str(col)) + 2)))
if "amount" in col or "revenue" in col or "cost" in col:
ws.set_column(idx, idx, 14, money)
# Zebra striping
for row in range(1, len(df) + 1):
fmt = zebra1 if row % 2 == 0 else zebra2
ws.set_row(row, None, fmt)
# Add total row at bottom if there's a numeric column
numeric_cols = [i for i, c in enumerate(df.columns) if df[c].dtype.kind in "fi"]
if numeric_cols:
total_row = len(df) + 1
ws.write(total_row, 0, "TOTAL")
for col_idx in numeric_cols:
col_letter = chr(ord('A') + col_idx)
ws.write_formula(total_row, col_idx, f"=SUM({col_letter}2:{col_letter}{len(df)+1})", money)
This style pass takes seconds and elevates the entire deliverable.
Script 9: Build a fully automated daily dashboard
This approach rebuilds an Excel dashboard each morning, saves charts, and emails it—hands off.
What it does: – Reads the latest data – Updates pivot tables and charts – Saves a timestamped workbook – (Optional) emails it to stakeholders
Skeleton code:
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
from pathlib import Path
today = datetime.now().strftime("%Y-%m-%d")
out = Path(f"dashboard_{today}.xlsx")
# 1) Load and prep
df = pd.read_excel("master_clean.xlsx")
# 2) KPI calc
kpis = {
"Revenue": df["amount"].sum() if "amount" in df.columns else df.select_dtypes("number").sum().sum(),
"Orders": len(df)
}
kpi_df = pd.DataFrame(list(kpis.items()), columns=["Metric", "Value"])
# 3) Pivot(s)
pvt = df.pivot_table(index="category", values="amount", aggfunc="sum").sort_values("amount", ascending=False)
# 4) Chart image
ax = pvt.plot(kind="bar", figsize=(6,3), title="Revenue by Category")
plt.tight_layout()
plt.savefig("kpi_chart.png", dpi=150)
plt.close()
# 5) Save dashboard
with pd.ExcelWriter(out, engine="xlsxwriter") as writer:
kpi_df.to_excel(writer, sheet_name="KPIs", index=False)
pvt.to_excel(writer, sheet_name="Breakdown")
wb = writer.book
ws = wb.add_worksheet("Charts")
ws.insert_image("B2", "kpi_chart.png")
print(f"Dashboard created: {out}")
Schedule it: – Windows: Task Scheduler (guide) – macOS/Linux: cron (man page)
Want to try it yourself with a downloadable project and sample files? View on Amazon.
Script 10: Quick pivot table to Excel with formatting
When leadership asks, “Can we see totals by region and product?” you’ll have it in seconds.
Code:
import pandas as pd
df = pd.read_excel("master_clean.xlsx")
index_cols = [c for c in ["region", "product"] if c in df.columns]
value_col = "amount" if "amount" in df.columns else df.select_dtypes("number").columns[0]
pvt = pd.pivot_table(df, index=index_cols, values=value_col, aggfunc=["sum", "mean", "count"])
pvt.columns = [f"{stat}_{col}" for stat, col in pvt.columns]
pvt = pvt.reset_index().sort_values(f"sum_{value_col}", ascending=False)
with pd.ExcelWriter("pivot_report.xlsx", engine="xlsxwriter") as writer:
pvt.to_excel(writer, sheet_name="Pivot", index=False)
wb = writer.book
ws = writer.sheets["Pivot"]
money = wb.add_format({"num_format": "$#,##0"})
header = wb.add_format({"bold": True, "bg_color": "#EEE"})
ws.set_row(0, None, header)
for i, col in enumerate(pvt.columns):
if "sum_" in col or "mean_" in col:
ws.set_column(i, i, 14, money)
else:
ws.set_column(i, i, 18)
Result: an instantly readable summary that looks like manual pivot work—but updates itself.
Troubleshooting and performance tips
- Keep column names consistent across files. Decide on “Amount” vs “amount” once and stick to it.
- Use dtype hints when reading Excel for speed:
pd.read_excel("file.xlsx", dtype={"sku": str, "region": "category"})
- If you hit memory limits with very large files, read in chunks:
for chunk in pd.read_excel("big.xlsx", sheet_name=0, chunksize=100_000): # process
- For reliable scheduling, log outcomes to a text file and include try/except blocks with email alerts on failure.
- Store configuration (paths, email recipients) in a .env or YAML file so you don’t edit code every week.
- Protect sensitive data. Use least-privilege service accounts and never commit credentials to version control.
Common pitfalls—and how to avoid them
- Mixed date formats: Coerce with
pd.to_datetime(..., errors="coerce")
and dropNaT
or fix rows. - Numbers saved as text: Use
pd.to_numeric(..., errors="coerce")
and then fill or filter missing. - Hidden Excel headers or blank top rows: Use
skiprows
inread_excel
. - Formula recalculation: If consumers rely on formulas, save with openpyxl and ensure
calcPr
is set or ask recipients to refresh. - Locale issues (comma decimals): Replace commas before converting numeric fields.
Security, governance, and compliance basics
Automation moves fast—make sure it’s safe:
- Use app passwords or OAuth for email; never plain passwords in scripts.
- Log every run with timestamp, input files, output files, and any errors.
- Version your scripts with Git and review changes.
- If you automate regulated data, check company policy for encryption and retention.
- Consider a data catalog or at least a README documenting inputs/outputs and owners.
FAQs: Automate Excel with Python
Q: Do I need to learn a lot of Python to use these scripts?
A: No. You can copy, paste, and change a few file paths and column names. Over time you’ll pick up the basics naturally.
Q: Will this work on a Mac?
A: Yes. All pandas/openpyxl/XlsxWriter scripts work on macOS and Linux. Outlook automation via win32com is Windows-only; use smtplib
on Mac.
Q: Can I schedule scripts to run at 7 a.m. daily?
A: Yes. Use Windows Task Scheduler or cron on macOS/Linux. Point the task at python path/to/script.py
.
Q: How do I handle Excel files with multiple header rows?
A: Use pd.read_excel(..., header=[0,1])
for multi-index headers or skiprows
to land on the correct header.
Q: What if column names change between months?
A: Normalize by lowercasing and stripping spaces, then map aliases:
df.rename(columns={"amt": "amount", "revenue($)": "amount"}, inplace=True)
Q: Can Python create native Excel charts?
A: Yes. Use XlsxWriter’s add_chart
for native Excel charts, or export matplotlib images and insert them.
Q: Is pandas faster than Excel?
A: For large merges, group-bys, and joins—yes, often dramatically faster. It also reduces manual mistakes.
Q: How do I share the results with non‑technical teammates?
A: Export to .xlsx with clear sheet names and formatting, then email or place in a shared drive. You can also save PDFs of charts.
Final takeaway
If you’re doing the same Excel steps more than once, automate them. Start with one script—merging files or cleaning data—and you’ll quickly build a toolkit that saves 5–10 hours each week. The real win isn’t just speed; it’s accuracy and peace of mind. Keep going, keep it simple, and iterate. If you want more guides like this, subscribe or bookmark this page—we’re publishing fresh, practical automation recipes every month.
External resources worth exploring: – pandas documentation: https://pandas.pydata.org/docs/ – openpyxl documentation: https://openpyxl.readthedocs.io/en/stable/ – XlsxWriter documentation: https://xlsxwriter.readthedocs.io/ – matplotlib documentation: https://matplotlib.org/stable/index.html
Discover more at InnoVirtuoso.com
I would love some feedback on my writing so if you have any, please don’t hesitate to leave a comment around here or in any platforms that is convenient for you.
For more on tech and other topics, explore InnoVirtuoso.com anytime. Subscribe to my newsletter and join our growing community—we’ll create something magical together. I promise, it’ll never be boring!
Stay updated with the latest news—subscribe to our newsletter today!
Thank you all—wishing you an amazing day ahead!
Read more related Articles at InnoVirtuoso
- How to Completely Turn Off Google AI on Your Android Phone
- The Best AI Jokes of the Month: February Edition
- Introducing SpoofDPI: Bypassing Deep Packet Inspection
- Getting Started with shadps4: Your Guide to the PlayStation 4 Emulator
- Sophos Pricing in 2025: A Guide to Intercept X Endpoint Protection
- The Essential Requirements for Augmented Reality: A Comprehensive Guide
- Harvard: A Legacy of Achievements and a Path Towards the Future
- Unlocking the Secrets of Prompt Engineering: 5 Must-Read Books That Will Revolutionize You