Edit DOCX Files with Python: The Track Changes Problem (And How to Solve It)

How to edit Word documents programmatically with Python. The critical limitation of python-docx and how to add track changes support to your document workflows.

What You'll Learn

  • Edit DOCX files programmatically with Python
  • Add track changes support (python-docx can't)
  • Handle comments and revision history
  • Batch process multiple documents

The Python DOCX Landscape

When you need to edit Word documents programmatically in Python, you have several options:

Library        Free?       Track Changes   Comments   Maturity
python-docx    ✓           ✗               ✗          High
Aspose.Words   ✗ ($)       ✓               ✓          High
Spire.Doc      ✗ ($)       ✓               ✓          Medium
DocMods SDK    API-based   ✓               ✓          Growing

The most common choice—python-docx—has a critical limitation that we'll address in this guide.

python-docx Basics

python-docx is the go-to library for basic DOCX manipulation. The snippets below cover installation, reading, editing, tables, and formatting.

Installation

pip install python-docx

Reading a Document

from docx import Document

doc = Document('contract.docx')

# Read paragraphs
for para in doc.paragraphs:
    print(para.text)

# Read tables
for table in doc.tables:
    for row in table.rows:
        for cell in row.cells:
            print(cell.text)

Modifying Content

from docx import Document

doc = Document('contract.docx')

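# Replace text in paragraphs.
# Note: assigning para.text collapses the paragraph into a single run,
# so character-level formatting (bold, italics) inside it is lost.
for para in doc.paragraphs:
    if 'PLACEHOLDER' in para.text:
        para.text = para.text.replace('PLACEHOLDER', 'Actual Value')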

# Add a new paragraph
doc.add_paragraph('This is new content.')

# Save
doc.save('contract_modified.docx')
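
Word frequently splits what looks like one word into several runs, so a placeholder may never appear whole inside a single run, while assigning to para.text flattens the paragraph into one run. The sketch below shows one way to replace text across run boundaries while keeping each run separate. It works on a plain list of strings standing in for the run.text values of a paragraph; replace_in_runs is a hypothetical helper, not part of python-docx.

```python
def replace_in_runs(runs: list[str], old: str, new: str) -> list[str]:
    """Replace the first occurrence of `old`, even when it spans several runs.

    `runs` stands in for the run.text values of one paragraph. The replacement
    lands in the first run the match touches; later runs lose only the matched
    characters, so each run can keep its own formatting.
    """
    full = "".join(runs)
    start = full.find(old) if old else -1
    if start == -1:
        return list(runs)
    end = start + len(old)

    result = []
    pos = 0          # offset of the current run within the joined text
    placed = False   # whether `new` has been written yet
    for text in runs:
        run_start, run_end = pos, pos + len(text)
        pos = run_end
        if run_end <= start or run_start >= end:
            result.append(text)  # run does not overlap the match
            continue
        before = text[: max(start - run_start, 0)]
        after = text[end - run_start:]
        result.append(before + ("" if placed else new) + after)
        placed = True
    return result


runs = ["Dear ", "PLACE", "HOLDER", ", welcome"]
updated = replace_in_runs(runs, "PLACEHOLDER", "Ada")
print(updated)
```

With python-docx, the resulting strings could be written back to the matching run.text attributes one by one, which leaves the run formatting untouched.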

Adding Tables

from docx import Document

doc = Document()

# Add a table with 3 rows and 3 columns
table = doc.add_table(rows=3, cols=3)

# Fill in cells
for i, row in enumerate(table.rows):
    for j, cell in enumerate(row.cells):
        cell.text = f'Row {i+1}, Col {j+1}'

doc.save('table_example.docx')

Formatting

from docx import Document
from docx.shared import Pt, Inches
from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = Document()

# Add formatted paragraph
para = doc.add_paragraph()
run = para.add_run('Bold and Large')
run.bold = True
run.font.size = Pt(14)

# Center alignment
para.alignment = WD_ALIGN_PARAGRAPH.CENTER

doc.save('formatted.docx')

The Track Changes Problem

Here's where python-docx falls short:

from docx import Document

doc = Document('contract.docx')

# This adds text, but NOT as a tracked change
doc.paragraphs[0].add_run('NEW TEXT')

# When you open in Word:
# - The text is there
# - But there's no track change record
# - No author attribution
# - No insertion markup
# - Can't be accepted/rejected

doc.save('modified.docx')

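This is a design limitation, not a bug: python-docx writes plain runs and offers no API for the revision markup Word uses to record insertions and deletions. The python-docx maintainers have acknowledged that track changes support would require significant architectural changes.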
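
To make the gap concrete: in the DOCX XML (WordprocessingML), a tracked insertion is a run wrapped in a w:ins element that carries an id, an author, and a date. The sketch below builds such an element with the standard library only. make_tracked_insertion is a hypothetical helper for illustration, and a real document needs more than this (unique revision ids, placement inside the right paragraph).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def make_tracked_insertion(text: str, author: str, rev_id: int, when: datetime) -> ET.Element:
    """Build a <w:ins> element wrapping a single run (hypothetical helper)."""
    ins = ET.Element(w("ins"), {
        w("id"): str(rev_id),
        w("author"): author,
        w("date"): when.strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    run = ET.SubElement(ins, w("r"))
    t = ET.SubElement(run, w("t"))
    t.text = text
    # Keep leading/trailing spaces, which are otherwise not preserved.
    t.set(XML_SPACE, "preserve")
    return ins


ins = make_tracked_insertion(
    "NEW TEXT", "Legal Bot", 1, datetime(2026, 1, 29, tzinfo=timezone.utc)
)
print(ET.tostring(ins, encoding="unicode"))
```

A plain add_run() call produces only the inner w:r element, which is why Word shows the text without any revision record.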

GitHub Issues

Track changes support is among the most requested features on the python-docx issue tracker. The response is consistently: "This is a large feature that would require significant refactoring."

Why Track Changes Matter

Track changes aren't just for lawyers. They're essential when:

Compliance and auditing:

  • Regulated industries require proof of who changed what
  • Financial documents need audit trails
  • Healthcare records require attribution

Collaborative workflows:

  • Multiple reviewers need to see each other's edits
  • Editors can accept/reject suggestions individually
  • Document owners maintain control

Quality control:

  • Changes can be reviewed before becoming permanent
  • Mistakes can be identified and reverted
  • Process documentation is automatic

Without track changes, you're making "silent" edits—the document changes but there's no record of what changed or who did it.

Solution 1: DocMods Python SDK

DocMods provides a Python SDK that supports track changes:

Installation

pip install docxagent

Basic Usage

from docxagent import DocxClient

client = DocxClient()

# Upload document
doc_id = client.upload('contract.docx')

# Read content
content = client.read_document(doc_id)
print(content['paragraphs'][0]['text'])

# Insert text WITH track changes
client.insert_text(
    doc_id,
    paragraph_index=0,
    text='[REVIEWED] ',
    author='Legal Bot'  # Attribution!
)

# Download
client.download(doc_id, 'contract_reviewed.docx')

When you open contract_reviewed.docx in Word:

  • The inserted text appears with track change markup
  • Author is "Legal Bot"
  • Timestamp is recorded
  • Can be accepted or rejected

Adding Comments

# Add a comment (appears in Word's review pane)
client.add_comment(
    doc_id,
    paragraph_index=5,
    comment_text='Please verify this amount with Finance.',
    author='Review Bot'
)
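
For context on where comments end up: Word normally stores them in a separate part of the archive, word/comments.xml, as w:comment elements with id, author, and date attributes. The following sketch reads them back with the standard library. read_comments is a hypothetical helper, and the demo archive contains only the comments part, so it is a stand-in rather than a complete DOCX.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def read_comments(docx) -> list[dict]:
    """Return id, author, date, and text for each comment (hypothetical helper).

    Assumes the comments part lives at its usual location, word/comments.xml.
    """
    with zipfile.ZipFile(docx) as zf:
        if "word/comments.xml" not in zf.namelist():
            return []
        root = ET.fromstring(zf.read("word/comments.xml"))
    comments = []
    for comment in root.iter(w("comment")):
        text = "".join(node.text or "" for node in comment.iter(w("t")))
        comments.append({
            "id": comment.get(w("id")),
            "author": comment.get(w("author")),
            "date": comment.get(w("date")),
            "text": text,
        })
    return comments


# Demo: a stand-in archive holding only the comments part.
COMMENTS_XML = (
    f'<w:comments xmlns:w="{W_NS}">'
    '<w:comment w:id="0" w:author="Review Bot" w:date="2026-01-29T09:00:00Z">'
    "<w:p><w:r><w:t>Please verify this amount with Finance.</w:t></w:r></w:p>"
    "</w:comment></w:comments>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("word/comments.xml", COMMENTS_XML)
buffer.seek(0)

comments = read_comments(buffer)
print(comments)
```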

Proposing Deletions

# Mark text for deletion (strikethrough with attribution)
client.propose_deletion(
    doc_id,
    paragraph_index=3,
    start_char=0,
    end_char=50,  # First 50 characters
    author='Compliance Bot'
)
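
At the file level, a tracked deletion keeps the removed text in the document: the affected run is wrapped in w:del and its text element becomes w:delText. The sketch below splits one run's text at character offsets and marks the middle part as deleted. It illustrates the markup only and says nothing about how DocMods implements propose_deletion; propose_deletion_xml is a hypothetical name.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def _run(text: str, deleted: bool = False) -> ET.Element:
    """Build a run; deleted text uses w:delText instead of w:t."""
    run = ET.Element(w("r"))
    node = ET.SubElement(run, w("delText" if deleted else "t"))
    node.text = text
    node.set(XML_SPACE, "preserve")
    return run


def propose_deletion_xml(text, start_char, end_char, author, rev_id, date):
    """Split `text` into kept runs and a <w:del> wrapper (hypothetical helper)."""
    if not 0 <= start_char <= end_char <= len(text):
        raise ValueError("offsets must satisfy 0 <= start <= end <= len(text)")
    parts = []
    if text[:start_char]:
        parts.append(_run(text[:start_char]))
    if start_char < end_char:
        deletion = ET.Element(w("del"), {
            w("id"): str(rev_id),
            w("author"): author,
            w("date"): date,
        })
        deletion.append(_run(text[start_char:end_char], deleted=True))
        parts.append(deletion)
    if text[end_char:]:
        parts.append(_run(text[end_char:]))
    return parts


parts = propose_deletion_xml(
    "The quick brown fox", 4, 10, "Compliance Bot", 2, "2026-01-29T00:00:00Z"
)
for part in parts:
    print(ET.tostring(part, encoding="unicode"))
```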

Full Example

from docxagent import DocxClient

def review_contract(input_path, output_path):
    """Add automated review comments to a contract."""
    client = DocxClient()
    doc_id = client.upload(input_path)

    content = client.read_document(doc_id)

    # Define flags
    flags = {
        'indemnify': 'Legal review required: indemnification clause',
        'unlimited': 'Legal review required: unlimited liability',
        'perpetual': 'Legal review required: perpetual term',
        'TBD': 'Placeholder needs completion',
    }

    # Flag each matching paragraph
    for i, para in enumerate(content['paragraphs']):
        text_lower = para['text'].lower()
        for trigger, message in flags.items():
            if trigger.lower() in text_lower:
                client.add_comment(
                    doc_id,
                    paragraph_index=i,
                    comment_text=message,
                    author='Contract Review Bot',
                    highlight=True
                )

    # Add processing marker
    client.insert_text(
        doc_id,
        paragraph_index=0,
        text='[AUTO-REVIEWED] ',
        author='Contract Review Bot'
    )

    client.download(doc_id, output_path)
    return output_path

# Usage
review_contract('contract.docx', 'contract_reviewed.docx')

Solution 2: Aspose.Words for Python

Aspose.Words is a commercial library with comprehensive track changes support:

Installation

pip install aspose-words

Basic Usage

import aspose.words as aw

doc = aw.Document('contract.docx')

# Start tracking changes
doc.start_track_revisions('My Name')

# Make changes (they're now tracked)
builder = aw.DocumentBuilder(doc)
builder.move_to_document_start()
builder.write('[REVIEWED] ')

# Stop tracking
doc.stop_track_revisions()

doc.save('contract_tracked.docx')

Accept/Reject Changes

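import aspose.words as aw

doc = aw.Document('document_with_changes.docx')

# Option A: accept every revision in one call.
# (Use this OR the loop below; after accepting everything,
# there is nothing left for the loop to process.)
# doc.accept_all_revisions()

# Option B: decide per revision. Iterate over a snapshot, since
# accepting or rejecting a revision can change doc.revisions.
for revision in list(doc.revisions):
    if revision.author == 'Trusted Reviewer':
        revision.accept()
    else:
        revision.reject()

doc.save('document_processed.docx')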
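
For intuition about what accepting revisions does to the underlying XML, here is a minimal standard-library sketch: inserted runs are unwrapped from w:ins and deleted runs inside w:del are dropped. This is not Aspose's implementation, accept_all is a hypothetical helper, and it covers only run-level insertions and deletions (not moves, formatting changes, or deleted paragraph marks).

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def _accepted(children):
    """Yield children with w:ins unwrapped and w:del removed."""
    for child in children:
        if child.tag == w("del"):
            continue                               # deletion accepted: text goes away
        if child.tag == w("ins"):
            yield from _accepted(list(child))      # insertion accepted: keep contents
        else:
            yield child


def accept_all(root: ET.Element) -> None:
    """Accept run-level tracked insertions and deletions in place."""
    for parent in list(root.iter()):
        parent[:] = list(_accepted(list(parent)))


PARAGRAPH_XML = (
    f'<w:p xmlns:w="{W_NS}">'
    "<w:r><w:t>Hello </w:t></w:r>"
    '<w:ins w:id="1" w:author="A"><w:r><w:t>new </w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="A"><w:r><w:delText>old </w:delText></w:r></w:del>'
    "<w:r><w:t>world</w:t></w:r>"
    "</w:p>"
)
paragraph = ET.fromstring(PARAGRAPH_XML)
accept_all(paragraph)
accepted_text = "".join(node.text for node in paragraph.iter(w("t")))
print(accepted_text)
```

Rejecting is the mirror image: drop w:ins elements and unwrap w:del, turning each w:delText back into w:t.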

Pros and Cons

Advantages:

  • Full track changes support
  • Comprehensive Word feature coverage
  • No external API dependency
  • Offline processing

Disadvantages:

  • Commercial license required ($999+)
  • Large library size
  • Learning curve for API

Solution 3: Hybrid Approach

Use python-docx for basic operations, DocMods for track changes:

from docx import Document
from docxagent import DocxClient

def process_document(input_path, output_path):
    """
    Use python-docx for reading/analysis,
    DocMods for changes that need tracking.
    """
    # Step 1: Analyze with python-docx (free)
    doc = Document(input_path)

    paragraphs_to_flag = []
    for i, para in enumerate(doc.paragraphs):
        if 'IMPORTANT' in para.text:
            paragraphs_to_flag.append(i)

    # Step 2: Add tracked changes with DocMods
    client = DocxClient()
    doc_id = client.upload(input_path)

    for para_idx in paragraphs_to_flag:
        client.add_comment(
            doc_id,
            paragraph_index=para_idx,
            comment_text='Flagged: Contains IMPORTANT marker',
            author='Analysis Bot'
        )

    client.download(doc_id, output_path)

Batch Processing

With python-docx (No Track Changes)

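import os
from docx import Document

input_folder = 'documents/'
output_folder = 'processed/'
os.makedirs(output_folder, exist_ok=True)  # doc.save() fails if the folder is missing

for filename in os.listdir(input_folder):
    if filename.endswith('.docx'):
        doc = Document(os.path.join(input_folder, filename))

        # Replace placeholder, touching only paragraphs that contain it
        # (assigning para.text discards run-level formatting)
        for para in doc.paragraphs:
            if '{{DATE}}' in para.text:
                para.text = para.text.replace('{{DATE}}', '2026-01-29')

        doc.save(os.path.join(output_folder, filename))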

With DocMods (With Track Changes)

import os
from docxagent import DocxClient

client = DocxClient()
input_folder = 'documents/'
output_folder = 'processed/'
os.makedirs(output_folder, exist_ok=True)  # make sure the output folder exists

for filename in os.listdir(input_folder):
    if filename.endswith('.docx'):
        doc_id = client.upload(os.path.join(input_folder, filename))

        # Add tracked change
        client.insert_text(
            doc_id,
            paragraph_index=0,
            text='[Batch Processed] ',
            author='Batch Bot'
        )

        client.download(doc_id, os.path.join(output_folder, filename))
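
Both batch loops share the same file-walking logic, which can be factored out. The sketch below is one possible helper (iter_docx_jobs is a hypothetical name): it creates the output folder, skips the "~$" lock files Word leaves next to open documents, and yields source and destination paths in a stable order.

```python
import tempfile
from pathlib import Path


def iter_docx_jobs(input_folder, output_folder):
    """Yield (source, destination) paths for every .docx in input_folder."""
    src_dir = Path(input_folder)
    dst_dir = Path(output_folder)
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.docx")):
        if src.name.startswith("~$"):  # Word lock file, not a real document
            continue
        yield src, dst_dir / src.name


# Demo in a temporary directory so no real documents are touched.
with tempfile.TemporaryDirectory() as tmp:
    documents = Path(tmp) / "documents"
    documents.mkdir()
    for name in ("a.docx", "~$a.docx", "notes.txt", "b.docx"):
        (documents / name).write_bytes(b"")
    processed = Path(tmp) / "processed"
    jobs = [(src.name, dst.name) for src, dst in iter_docx_jobs(documents, processed)]
    output_created = processed.is_dir()

print(jobs)
```

Either loop body can then run inside `for src, dst in iter_docx_jobs('documents/', 'processed/')`.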

Handling Common Scenarios

Find and Replace with Tracking

from docxagent import DocxClient

client = DocxClient()
doc_id = client.upload('contract.docx')

content = client.read_document(doc_id)

old_value = 'ACME Corp'
new_value = 'Initech LLC'

for i, para in enumerate(content['paragraphs']):
    if old_value in para['text']:
        # Find position (first occurrence in the paragraph only)
        start = para['text'].find(old_value)
        end = start + len(old_value)

        # Delete old (tracked)
        client.propose_deletion(
            doc_id,
            paragraph_index=i,
            start_char=start,
            end_char=end,
            author='Find/Replace Bot'
        )

        # Insert new (tracked)
        client.insert_text(
            doc_id,
            paragraph_index=i,
            text=new_value,
            position=start,
            author='Find/Replace Bot'
        )

client.download(doc_id, 'contract_updated.docx')
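
The loop above handles only the first match in each paragraph. A sketch of how to plan every match follows; it is pure Python and makes no API calls. Matches are emitted right to left within a paragraph, on the assumption that character offsets refer to the current paragraph text, so an edit never shifts the offsets of edits still to come. find_all and plan_replacements are hypothetical helpers.

```python
def find_all(text: str, needle: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of every non-overlapping match."""
    if not needle:
        raise ValueError("needle must not be empty")
    spans = []
    start = text.find(needle)
    while start != -1:
        end = start + len(needle)
        spans.append((start, end))
        start = text.find(needle, end)
    return spans


def plan_replacements(paragraphs: list[str], old: str, new: str) -> list[dict]:
    """List the edits to apply, last match first within each paragraph."""
    operations = []
    for index, text in enumerate(paragraphs):
        for start, end in reversed(find_all(text, old)):
            operations.append({
                "paragraph_index": index,
                "start_char": start,
                "end_char": end,
                "text": new,
            })
    return operations


paragraphs = ["ACME Corp and ACME Corp agree.", "No match here."]
operations = plan_replacements(paragraphs, "ACME Corp", "Initech LLC")
for op in operations:
    print(op)
```

Each planned operation maps onto one propose_deletion call plus one insert_text call with position set to start_char.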

Extract Track Changes

from docxagent import DocxClient

client = DocxClient()
doc_id = client.upload('reviewed_document.docx')

content = client.read_document(doc_id, include_track_changes=True)

for change in content.get('track_changes', []):
    print(f"Type: {change['type']}")
    print(f"Author: {change['author']}")
    print(f"Date: {change['date']}")
    print(f"Text: {change['text']}")
    print("---")
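
The same information can be read without any API, because tracked changes are stored in word/document.xml as w:ins and w:del elements with author and date attributes. A minimal standard-library sketch follows. extract_revisions is a hypothetical helper, the demo archive holds only the main document part, and revision types other than run-level insertions and deletions are ignored.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag: str) -> str:
    """Return the namespace-qualified name for a WordprocessingML tag."""
    return f"{{{W_NS}}}{tag}"


def extract_revisions(docx) -> list[dict]:
    """Collect tracked insertions and deletions from word/document.xml."""
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    kinds = {
        w("ins"): ("insertion", w("t")),
        w("del"): ("deletion", w("delText")),
    }
    revisions = []
    for element in root.iter():
        if element.tag not in kinds:
            continue
        kind, text_tag = kinds[element.tag]
        text = "".join(node.text or "" for node in element.iter(text_tag))
        revisions.append({
            "type": kind,
            "author": element.get(w("author")),
            "date": element.get(w("date")),
            "text": text,
        })
    return revisions


# Demo: a stand-in archive holding only the main document part.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:ins w:id="1" w:author="Legal Bot" w:date="2026-01-29T09:00:00Z">'
    "<w:r><w:t>[REVIEWED] </w:t></w:r></w:ins>"
    "<w:r><w:t>This Agreement </w:t></w:r>"
    '<w:del w:id="2" w:author="Compliance Bot" w:date="2026-01-29T09:05:00Z">'
    "<w:r><w:delText>shall </w:delText></w:r></w:del>"
    "</w:p></w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("word/document.xml", DOCUMENT_XML)
buffer.seek(0)

revisions = extract_revisions(buffer)
for revision in revisions:
    print(revision)
```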

Generate Report from Multiple Documents

import os
from docxagent import DocxClient
import json

client = DocxClient()

def analyze_documents(folder):
    """Extract all comments and changes from documents in a folder."""
    report = []

    for filename in os.listdir(folder):
        if filename.endswith('.docx'):
            doc_id = client.upload(os.path.join(folder, filename))
            content = client.read_document(doc_id, include_track_changes=True)

            doc_report = {
                'filename': filename,
                'paragraph_count': len(content['paragraphs']),
                'changes': content.get('track_changes', []),
                'comments': content.get('comments', [])
            }
            report.append(doc_report)

    return report

report = analyze_documents('legal_documents/')
print(json.dumps(report, indent=2))
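
A raw JSON dump gets long quickly, so a short summary is often more useful. The sketch below counts changes per author and per type. It assumes the report has the shape built by analyze_documents above and that each change carries 'author' and 'type' keys as in the earlier extraction example; summarize is a hypothetical helper and the sample data is made up for the demo.

```python
from collections import Counter


def summarize(report: list[dict]) -> dict:
    """Count tracked changes by author and by type across all documents."""
    by_author = Counter()
    by_type = Counter()
    for doc_report in report:
        for change in doc_report.get("changes", []):
            by_author[change.get("author", "unknown")] += 1
            by_type[change.get("type", "unknown")] += 1
    return {
        "documents": len(report),
        "by_author": dict(by_author),
        "by_type": dict(by_type),
    }


# Made-up sample in the assumed report shape.
sample_report = [
    {
        "filename": "nda.docx",
        "paragraph_count": 12,
        "changes": [
            {"type": "insertion", "author": "Legal Bot"},
            {"type": "deletion", "author": "Legal Bot"},
        ],
        "comments": [],
    },
    {
        "filename": "msa.docx",
        "paragraph_count": 40,
        "changes": [{"type": "insertion", "author": "Compliance Bot"}],
        "comments": [],
    },
]
summary = summarize(sample_report)
print(summary)
```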

Performance Considerations

python-docx

  • Fast, runs locally
  • Memory usage scales with document size
  • No network latency

DocMods API

  • Network round-trip for each operation
  • Better for complex operations
  • Batching reduces overhead

Optimization Tips

# BAD: Multiple API calls
for para_idx in range(10):
    client.add_comment(doc_id, para_idx, 'Comment', 'Bot')

# BETTER: Batch operations when possible
comments = [(i, 'Comment', 'Bot') for i in range(10)]
client.add_comments_batch(doc_id, comments)
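
If a batch endpoint limits how many items one request may carry, the list can be split into fixed-size chunks first. The sketch below shows only the chunking; whether add_comments_batch has such a limit, and what it is, is an assumption to check against the SDK. chunked is a hypothetical helper.

```python
from itertools import islice


def chunked(items, size: int):
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


comments = [(i, 'Comment', 'Bot') for i in range(10)]
batches = list(chunked(comments, 4))
print([len(batch) for batch in batches])
```

Each batch would then be passed to a single client.add_comments_batch(doc_id, batch) call, so ten comments cost three round-trips instead of ten.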

When to Use What

Scenario                  Recommended Tool
Simple text replacement   python-docx
Template filling          python-docx
Adding track changes      DocMods or Aspose
Adding comments           DocMods or Aspose
Bulk format changes       python-docx
Legal/compliance docs     DocMods or Aspose
One-time scripts          python-docx
Production workflows      DocMods (API reliability)

The Bottom Line

python-docx is excellent for basic DOCX manipulation, but it has no API for adding track changes. That is unlikely to change soon: it has been a top feature request for 10+ years with no resolution.

For workflows requiring track changes:

  • DocMods SDK: API-based, straightforward, pay-per-use
  • Aspose.Words: Commercial library, comprehensive, offline
  • Spire.Doc: Similar to Aspose, different licensing

Choose based on your needs:

  • Occasional use → DocMods API
  • High volume, offline → Aspose.Words
  • Basic editing only → python-docx (free, no track changes)

The investment in proper track changes support pays off when you need audit trails, collaborative review, or document compliance.
