|

Automate Excel with Python: 10 Copy‑Paste Scripts to Clean Data, Build Reports, and Chart Faster

If you spend hours wrangling spreadsheets, you’re not alone. Most professionals use Excel daily, yet much of that work is repetitive: merging files, cleaning columns, formatting reports, and rebuilding the same charts week after week. What if those tasks ran themselves while you focused on analysis and decisions?

This guide gives you 10 real‑world Python scripts you can copy, paste, and run today—even if you’ve never written code. You’ll learn how to merge folders of Excel files in seconds, clean messy data automatically, generate weekly reports, email the results, and even build a daily dashboard. I’ll explain each script in plain language, with tips so you can adapt them to your workflow.

What you need to get started (no experience required)

Before automating Excel, set up a simple Python environment. You only need a few tools:

  • Python 3.9+ (3.11+ recommended) – download from python.org.
  • A code editor like VS Code (free) or even Notepad will do.
  • Key libraries:
  • pandas for data wrangling (docs)
  • openpyxl for reading/writing .xlsx (docs)
  • XlsxWriter for formatting and Excel charts (docs)
  • matplotlib or seaborn for plots (matplotlib docs)
  • watchdog for folder monitoring (docs)
  • smtplib/win32com for email (smtplib docs)

Install everything with one command:

pip install pandas openpyxl XlsxWriter matplotlib seaborn watchdog

Windows Outlook users who want native Outlook sending can add:

pip install pywin32

If you’re nervous about the command line, don’t be. You’ll mostly copy and paste. Here’s why that matters: once you run these scripts once, you can reuse them forever.

Want the complete, copy‑paste bundle and step‑by‑step walkthroughs? Shop on Amazon.


Script 1: Merge dozens of Excel files in seconds (pandas + glob)

Use this when you receive a folder full of monthly or regional files and need a master sheet.

What it does: – Reads all .xlsx files in a folder – Appends them into one DataFrame – Adds a “source file” column (handy for traceability) – Saves to a clean master.xlsx

Code:

import pandas as pd
from pathlib import Path

folder = Path("data/monthly_reports")  # change this
files = list(folder.glob("*.xlsx"))

frames = []
for f in files:
    df = pd.read_excel(f)
    df["source_file"] = f.name
    frames.append(df)

master = pd.concat(frames, ignore_index=True)
master.to_excel("master.xlsx", index=False)
print(f"Merged {len(files)} files into master.xlsx")

Tip: If columns differ slightly across files, use pd.concat(..., join="outer") and standardize later.


Script 2: Clean messy data automatically (trim, case, dates, numbers)

This is your “make it neat” button. It fixes common quality issues:

  • Trim whitespace and standardize casing
  • Parse dates
  • Coerce numeric fields
  • Drop duplicates

Code:

import pandas as pd

df = pd.read_excel("master.xlsx")

# Standardize column names
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

# Trim whitespace in object columns
for c in df.select_dtypes(include="object").columns:
    df[c] = df[c].astype(str).str.strip()

# Example: standardize text case in a category column
if "category" in df.columns:
    df["category"] = df["category"].str.title()

# Parse dates safely
for c in df.columns:
    if "date" in c:
        df[c] = pd.to_datetime(df[c], errors="coerce")

# Convert currency/numeric-like columns
for c in ["amount", "revenue", "cost"]:
    if c in df.columns:
        df[c] = pd.to_numeric(df[c], errors="coerce")

# Remove rows with no critical fields and drop duplicates
critical = [c for c in ["date", "category", "amount"] if c in df.columns]
df = df.dropna(subset=critical).drop_duplicates()

df.to_excel("master_clean.xlsx", index=False)
print("Cleaned data saved to master_clean.xlsx")

Pro move: keep the cleaning script separate so you can reuse it with different input files.


Choosing the right Excel automation tools (pandas, openpyxl, XlsxWriter, xlwings)

You have options, and the best choice depends on what you’re doing:

  • pandas: fastest for reading/wrangling data at scale. Great for merges, group-bys, pivot tables, and exports.
  • openpyxl: reads/writes .xlsx with control over cells, formulas, and styles. Ideal for light formatting.
  • XlsxWriter: best-in-class Excel formatting and charts from Python. Fast and feature-rich.
  • xlwings: controls an actual Excel instance (useful if you need macros/VBA interop or interactive workbooks).

Rules of thumb: – Heavy data work? Use pandas. – Fancy formatting and native Excel charts? Save with XlsxWriter (to_excel(engine="xlsxwriter")). – Need to click buttons in Excel from Python or reuse existing macros? Try xlwings.

If you’d like a vetted, beginner‑friendly starter pack with scripts and setup tips, Check it on Amazon.

For deeper background, see Microsoft’s Excel automation overview (Microsoft Learn).


Script 3: Generate professional weekly reports

Goal: Produce a weekly summary by team, region, or product—automatically.

What it does: – Groups data by week and category – Calculates totals – Applies basic formatting in Excel

Code:

import pandas as pd

df = pd.read_excel("master_clean.xlsx")

# Ensure a date column
date_col = "date" if "date" in df.columns else df.filter(like="date").columns[0]
df["week"] = pd.to_datetime(df[date_col]).dt.to_period("W").apply(lambda r: r.start_time)

group_cols = [c for c in ["week", "category", "region"] if c in df.columns]
value_col = "amount" if "amount" in df.columns else df.select_dtypes("number").columns[0]

weekly = df.groupby(group_cols, dropna=False)[value_col].sum().reset_index()

with pd.ExcelWriter("weekly_report.xlsx", engine="xlsxwriter") as writer:
    weekly.to_excel(writer, sheet_name="Summary", index=False)

    wb  = writer.book
    ws  = writer.sheets["Summary"]
    currency_fmt = wb.add_format({"num_format": "$#,##0", "font_name": "Calibri"})
    date_fmt     = wb.add_format({"num_format": "yyyy-mm-dd"})
    header_fmt   = wb.add_format({"bold": True, "bg_color": "#F2F2F2"})

    ws.set_row(0, None, header_fmt)
    ws.set_column("A:A", 12, date_fmt)    # week
    ws.set_column("B:D", 18)
    ws.set_column("E:E", 14, currency_fmt)  # value

Why it works: weekly aggregation gives you a repeatable, executive-friendly snapshot.


Script 4: Email updates with attachments (Gmail/Outlook)

Send your weekly report on autopilot. Two approaches:

  • Gmail or any SMTP server via smtplib
  • Outlook on Windows via win32com.client (great inside corporate environments)

SMTP example:

import smtplib, ssl
from email.message import EmailMessage

sender = "you@example.com"
recipient = "team@example.com"
subject = "Weekly Report"
body = "Attached is this week's report. Highlights inside."

msg = EmailMessage()
msg["From"] = sender
msg["To"] = recipient
msg["Subject"] = subject
msg.set_content(body)

with open("weekly_report.xlsx", "rb") as f:
    msg.add_attachment(
        f.read(),
        maintype="application",
        subtype="vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        filename="weekly_report.xlsx"
    )

smtp_server = "smtp.gmail.com"
port = 465
password = "YOUR_APP_PASSWORD"  # Use app passwords, not your main login

context = ssl.create_default_context()
with smtplib.SMTP_SSL(smtp_server, port, context=context) as server:
    server.login(sender, password)
    server.send_message(msg)

print("Email sent!")

Outlook example (Windows only):

import win32com.client as win32

outlook = win32.Dispatch("Outlook.Application")
mail = outlook.CreateItem(0)
mail.To = "team@example.com"
mail.Subject = "Weekly Report"
mail.Body = "Attached is this week's report."
mail.Attachments.Add(Source=r"C:\path\to\weekly_report.xlsx")
mail.Send()
print("Outlook email sent!")

Prefer a ready-made playbook that includes email templates and attachments? See price on Amazon.

Security tip: Always use app passwords or OAuth for SMTP. Never hardcode real credentials in shared scripts. For reference, see Python’s official docs for smtplib.


Script 5: Create beautiful charts for presentations (matplotlib + XlsxWriter)

Build charts once and export them as images or embedded Excel charts.

Code (save PNG charts + Excel summary):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel("weekly_report.xlsx")
pivot = df.pivot_table(index="week", columns="category", values=df.columns[-1], aggfunc="sum").fillna(0)

ax = pivot.plot(kind="line", figsize=(8,4), title="Weekly Trend by Category")
plt.tight_layout()
plt.savefig("weekly_trend.png", dpi=150)
plt.close()

with pd.ExcelWriter("weekly_with_chart.xlsx", engine="xlsxwriter") as writer:
    pivot.to_excel(writer, sheet_name="Data")
    wb = writer.book
    ws = wb.add_worksheet("Chart")
    ws.insert_image("B2", "weekly_trend.png", {"x_scale": 0.8, "y_scale": 0.8})

Alternate approach: build native Excel charts with XlsxWriter’s chart API for consistent styling within Excel.


Script 6: Monitor a folder and process files instantly (watchdog)

When new files arrive in a shared folder, process them automatically.

Code:

import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import pandas as pd
from pathlib import Path

watch_dir = Path("incoming")

class NewFileHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory and event.src_path.endswith(".xlsx"):
            path = Path(event.src_path)
            print(f"New file: {path.name}")
            df = pd.read_excel(path)
            # Example processing: append to master
            try:
                master = pd.read_excel("master.xlsx")
                combined = pd.concat([master, df], ignore_index=True)
            except FileNotFoundError:
                combined = df
            combined.to_excel("master.xlsx", index=False)

if __name__ == "__main__":
    event_handler = NewFileHandler()
    observer = Observer()
    observer.schedule(event_handler, str(watch_dir), recursive=False)
    observer.start()
    print(f"Watching {watch_dir.resolve()} ...")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

Ready to upgrade your workflow with a tested project you can run today? Buy on Amazon.

Note: watchdog must keep running. Schedule it as a background service, or trigger a batch script when files appear.


Script 7: Combine data from multiple sheets

Sometimes the data is split across many tabs inside the same workbook. This script pulls them all together.

Code:

import pandas as pd

wb = "multi_tab.xlsx"
sheets = pd.read_excel(wb, sheet_name=None)  # dict of {sheet_name: DataFrame}

frames = []
for name, df in sheets.items():
    df["sheet"] = name
    frames.append(df)

combined = pd.concat(frames, ignore_index=True)
combined.to_excel("combined_sheets.xlsx", index=False)
print("Combined all sheets into combined_sheets.xlsx")

Tip: Filter out lookup or hidden tabs by name if needed.


Script 8: Add professional formatting to reports (styles, zebra stripes, totals)

Presentation matters. Use XlsxWriter formats to make reports “board-ready.”

Code:

import pandas as pd

df = pd.read_excel("master_clean.xlsx").head(500)  # demo subset

with pd.ExcelWriter("styled_report.xlsx", engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Report", index=False)

    wb = writer.book
    ws = writer.sheets["Report"]

    # Formats
    header = wb.add_format({"bold": True, "bg_color": "#1F4E78", "font_color": "white", "align": "center"})
    money  = wb.add_format({"num_format": "$#,##0"})
    zebra1 = wb.add_format({"bg_color": "#F9F9F9"})
    zebra2 = wb.add_format({"bg_color": "#FFFFFF"})

    # Apply header
    ws.set_row(0, 18, header)

    # Set column widths and currency format if amount exists
    for idx, col in enumerate(df.columns):
        ws.set_column(idx, idx, max(10, min(25, len(str(col)) + 2)))
        if "amount" in col or "revenue" in col or "cost" in col:
            ws.set_column(idx, idx, 14, money)

    # Zebra striping
    for row in range(1, len(df) + 1):
        fmt = zebra1 if row % 2 == 0 else zebra2
        ws.set_row(row, None, fmt)

    # Add total row at bottom if there's a numeric column
    numeric_cols = [i for i, c in enumerate(df.columns) if df[c].dtype.kind in "fi"]
    if numeric_cols:
        total_row = len(df) + 1
        ws.write(total_row, 0, "TOTAL")
        for col_idx in numeric_cols:
            col_letter = chr(ord('A') + col_idx)
            ws.write_formula(total_row, col_idx, f"=SUM({col_letter}2:{col_letter}{len(df)+1})", money)

This style pass takes seconds and elevates the entire deliverable.


Script 9: Build a fully automated daily dashboard

This approach rebuilds an Excel dashboard each morning, saves charts, and emails it—hands off.

What it does: – Reads the latest data – Updates pivot tables and charts – Saves a timestamped workbook – (Optional) emails it to stakeholders

Skeleton code:

import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
from pathlib import Path

today = datetime.now().strftime("%Y-%m-%d")
out = Path(f"dashboard_{today}.xlsx")

# 1) Load and prep
df = pd.read_excel("master_clean.xlsx")

# 2) KPI calc
kpis = {
    "Revenue": df["amount"].sum() if "amount" in df.columns else df.select_dtypes("number").sum().sum(),
    "Orders": len(df)
}
kpi_df = pd.DataFrame(list(kpis.items()), columns=["Metric", "Value"])

# 3) Pivot(s)
pvt = df.pivot_table(index="category", values="amount", aggfunc="sum").sort_values("amount", ascending=False)

# 4) Chart image
ax = pvt.plot(kind="bar", figsize=(6,3), title="Revenue by Category")
plt.tight_layout()
plt.savefig("kpi_chart.png", dpi=150)
plt.close()

# 5) Save dashboard
with pd.ExcelWriter(out, engine="xlsxwriter") as writer:
    kpi_df.to_excel(writer, sheet_name="KPIs", index=False)
    pvt.to_excel(writer, sheet_name="Breakdown")
    wb = writer.book
    ws = wb.add_worksheet("Charts")
    ws.insert_image("B2", "kpi_chart.png")

print(f"Dashboard created: {out}")

Schedule it: – Windows: Task Scheduler (guide) – macOS/Linux: cron (man page)

Want to try it yourself with a downloadable project and sample files? View on Amazon.


Script 10: Quick pivot table to Excel with formatting

When leadership asks, “Can we see totals by region and product?” you’ll have it in seconds.

Code:

import pandas as pd

df = pd.read_excel("master_clean.xlsx")
index_cols = [c for c in ["region", "product"] if c in df.columns]
value_col  = "amount" if "amount" in df.columns else df.select_dtypes("number").columns[0]

pvt = pd.pivot_table(df, index=index_cols, values=value_col, aggfunc=["sum", "mean", "count"])
pvt.columns = [f"{stat}_{col}" for stat, col in pvt.columns]
pvt = pvt.reset_index().sort_values(f"sum_{value_col}", ascending=False)

with pd.ExcelWriter("pivot_report.xlsx", engine="xlsxwriter") as writer:
    pvt.to_excel(writer, sheet_name="Pivot", index=False)
    wb = writer.book
    ws = writer.sheets["Pivot"]
    money = wb.add_format({"num_format": "$#,##0"})
    header = wb.add_format({"bold": True, "bg_color": "#EEE"})
    ws.set_row(0, None, header)
    for i, col in enumerate(pvt.columns):
        if "sum_" in col or "mean_" in col:
            ws.set_column(i, i, 14, money)
        else:
            ws.set_column(i, i, 18)

Result: an instantly readable summary that looks like manual pivot work—but updates itself.


Troubleshooting and performance tips

  • Keep column names consistent across files. Decide on “Amount” vs “amount” once and stick to it.
  • Use dtype hints when reading Excel for speed: pd.read_excel("file.xlsx", dtype={"sku": str, "region": "category"})
  • If you hit memory limits with very large files, read in chunks: for chunk in pd.read_excel("big.xlsx", sheet_name=0, chunksize=100_000): # process
  • For reliable scheduling, log outcomes to a text file and include try/except blocks with email alerts on failure.
  • Store configuration (paths, email recipients) in a .env or YAML file so you don’t edit code every week.
  • Protect sensitive data. Use least-privilege service accounts and never commit credentials to version control.

Common pitfalls—and how to avoid them

  • Mixed date formats: Coerce with pd.to_datetime(..., errors="coerce") and drop NaT or fix rows.
  • Numbers saved as text: Use pd.to_numeric(..., errors="coerce") and then fill or filter missing.
  • Hidden Excel headers or blank top rows: Use skiprows in read_excel.
  • Formula recalculation: If consumers rely on formulas, save with openpyxl and ensure calcPr is set or ask recipients to refresh.
  • Locale issues (comma decimals): Replace commas before converting numeric fields.

Security, governance, and compliance basics

Automation moves fast—make sure it’s safe:

  • Use app passwords or OAuth for email; never plain passwords in scripts.
  • Log every run with timestamp, input files, output files, and any errors.
  • Version your scripts with Git and review changes.
  • If you automate regulated data, check company policy for encryption and retention.
  • Consider a data catalog or at least a README documenting inputs/outputs and owners.

FAQs: Automate Excel with Python

Q: Do I need to learn a lot of Python to use these scripts?
A: No. You can copy, paste, and change a few file paths and column names. Over time you’ll pick up the basics naturally.

Q: Will this work on a Mac?
A: Yes. All pandas/openpyxl/XlsxWriter scripts work on macOS and Linux. Outlook automation via win32com is Windows-only; use smtplib on Mac.

Q: Can I schedule scripts to run at 7 a.m. daily?
A: Yes. Use Windows Task Scheduler or cron on macOS/Linux. Point the task at python path/to/script.py.

Q: How do I handle Excel files with multiple header rows?
A: Use pd.read_excel(..., header=[0,1]) for multi-index headers or skiprows to land on the correct header.

Q: What if column names change between months?
A: Normalize by lowercasing and stripping spaces, then map aliases:

df.rename(columns={"amt": "amount", "revenue($)": "amount"}, inplace=True)

Q: Can Python create native Excel charts?
A: Yes. Use XlsxWriter’s add_chart for native Excel charts, or export matplotlib images and insert them.

Q: Is pandas faster than Excel?
A: For large merges, group-bys, and joins—yes, often dramatically faster. It also reduces manual mistakes.

Q: How do I share the results with non‑technical teammates?
A: Export to .xlsx with clear sheet names and formatting, then email or place in a shared drive. You can also save PDFs of charts.


Final takeaway

If you’re doing the same Excel steps more than once, automate them. Start with one script—merging files or cleaning data—and you’ll quickly build a toolkit that saves 5–10 hours each week. The real win isn’t just speed; it’s accuracy and peace of mind. Keep going, keep it simple, and iterate. If you want more guides like this, subscribe or bookmark this page—we’re publishing fresh, practical automation recipes every month.

External resources worth exploring: – pandas documentation: https://pandas.pydata.org/docs/ – openpyxl documentation: https://openpyxl.readthedocs.io/en/stable/ – XlsxWriter documentation: https://xlsxwriter.readthedocs.io/ – matplotlib documentation: https://matplotlib.org/stable/index.html

Discover more at InnoVirtuoso.com

I would love some feedback on my writing so if you have any, please don’t hesitate to leave a comment around here or in any platforms that is convenient for you.

For more on tech and other topics, explore InnoVirtuoso.com anytime. Subscribe to my newsletter and join our growing community—we’ll create something magical together. I promise, it’ll never be boring! 

Stay updated with the latest news—subscribe to our newsletter today!

Thank you all—wishing you an amazing day ahead!

Read more related Articles at InnoVirtuoso

Browse InnoVirtuoso for more!