Does python-docx support track changes?

No. python-docx does not read, write, or preserve track changes. When you read a document with tracked revisions, python-docx gives you plain text with no revision metadata: no authors, no timestamps, and no indication that a change is still pending. Text that sits inside revision elements is not handled reliably either. When you modify a document with existing track changes, those changes are often corrupted or lost. This is documented in GitHub issues going back years.

Why doesn't python-docx support track changes?

Track changes require parsing OOXML revision elements (w:ins, w:del, w:rPrChange, etc.) which are deeply interleaved with content. python-docx was designed for simple document generation, not revision-aware editing. Adding full track change support would require significant architectural changes. The maintainers have acknowledged this limitation but haven't prioritized it.

How can I read track changes in Python?

Options include: parsing OOXML directly with lxml (complex but complete), using the python-docx-ml library (partial support), calling Word via COM automation on Windows, or using APIs like DocMods that handle revision tracking natively. Direct OOXML parsing gives the most control but requires understanding the w:ins/w:del schema.

How do I accept all track changes in Python?

You cannot do this reliably with python-docx. The options are Word COM automation (Windows only), LibreOffice in headless mode via subprocess, a dedicated API that supports revisions, or parsing the OOXML directly and merging revision elements by hand. Each trades off complexity against reliability.

python-docx Doesn't Support Track Changes. Here's What Actually Works.

The GitHub Issue That Never Gets Fixed

python-docx issue #455: "Track changes not supported" - opened 2016, still open.

Issue #1044: "Reading tracked changes" - 2021, no resolution.

Issue #753: "Preserve track changes when editing" - 2019, corrupts documents.

The pattern is clear: python-docx was never designed for revision-aware document processing, and bolting it on isn't happening.

What Actually Happens When You Read a Document

When python-docx reads a DOCX with track changes, here's what you're missing:

<!-- What's actually in the document -->
<w:p>
  <w:r>
    <w:t xml:space="preserve">The contract amount is </w:t>
  </w:r>
  <w:del w:id="1" w:author="Legal" w:date="2024-01-15T10:30:00Z">
    <w:r>
      <w:delText>$50,000</w:delText>
    </w:r>
  </w:del>
  <w:ins w:id="2" w:author="Legal" w:date="2024-01-15T10:30:00Z">
    <w:r>
      <w:t>$75,000</w:t>
    </w:r>
  </w:ins>
  <w:r>
    <w:t xml:space="preserve"> per year.</w:t>
  </w:r>
</w:p>

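# What python-docx gives you
from docx import Document
doc = Document("contract.docx")
print(doc.paragraphs[0].text)
# In current releases, Paragraph.text joins only the content that sits
# directly under <w:p>. The run wrapped in <w:ins> is skipped, and
# <w:delText> is never read.
# Typical output: "The contract amount is  per year."

# What you're missing:
# - Original value: $50,000
# - Proposed value: $75,000
# - Who changed it: Legal
# - When: 2024-01-15T10:30:00Z
# - That this is a PENDING change, not yet accepted

python-docx knows nothing about revision elements. It reads only the runs that sit directly inside the paragraph, so text wrapped in w:ins or w:del drops out of paragraph.text entirely. You lose:

The original text (what was deleted)
The proposed text (what was inserted)
The change author
The change timestamp
The fact that it's a pending revision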

This is catastrophic for legal document workflows, audit trails, and contract review automation.
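To make the gap concrete, here is a minimal sketch that rebuilds both the original and the proposed wording of the example paragraph. It uses only the standard library's xml.etree.ElementTree, and the function name paragraph_views is a hypothetical helper. It assumes revisions are direct children of the paragraph and are not nested.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

PARAGRAPH_XML = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">The contract amount is </w:t></w:r>
  <w:del w:id="1" w:author="Legal" w:date="2024-01-15T10:30:00Z">
    <w:r><w:delText>$50,000</w:delText></w:r>
  </w:del>
  <w:ins w:id="2" w:author="Legal" w:date="2024-01-15T10:30:00Z">
    <w:r><w:t>$75,000</w:t></w:r>
  </w:ins>
  <w:r><w:t xml:space="preserve"> per year.</w:t></w:r>
</w:p>
"""


def _joined(elem, local_name):
    """Concatenate the text of every descendant with the given local name."""
    return "".join(t.text or "" for t in elem.iter(f"{{{W}}}{local_name}"))


def paragraph_views(p):
    """Return (original_text, proposed_text) for one w:p element."""
    original, proposed = [], []
    for child in p:
        if child.tag == f"{{{W}}}r":        # unchanged run: in both views
            text = _joined(child, "t")
            original.append(text)
            proposed.append(text)
        elif child.tag == f"{{{W}}}del":    # deleted text: original only
            original.append(_joined(child, "delText"))
        elif child.tag == f"{{{W}}}ins":    # inserted text: proposed only
            proposed.append(_joined(child, "t"))
    return "".join(original), "".join(proposed)


original_text, proposed_text = paragraph_views(ET.fromstring(PARAGRAPH_XML))
print(original_text)
print(proposed_text)
```

Keeping the two views separate is what lets a reviewer see both amounts instead of neither.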

The OOXML Revision Schema (What You Need to Parse)

OOXML tracks revisions through several element types:

Text Insertions (`w:ins`)

<w:ins w:id="1" w:author="John Smith" w:date="2024-01-15T10:30:00Z">
  <w:r>
    <w:t>inserted text</w:t>
  </w:r>
</w:ins>

Attributes:

w:id: Unique revision ID
w:author: Person who made the change
w:date: ISO 8601 timestamp
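A small sketch of reading these three attributes, using the standard library rather than lxml so it runs anywhere. The helper names qn and revision_metadata are illustrative, not part of any library.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(local):
    """Build a namespace-qualified name such as '{namespace}author'."""
    return f"{{{W}}}{local}"


def revision_metadata(elem):
    """Read w:id, w:author and w:date from a revision element."""
    raw_date = elem.get(qn("date"))
    # datetime.fromisoformat() only accepts a trailing 'Z' from Python 3.11
    # on, so normalise it to an explicit UTC offset first.
    when = datetime.fromisoformat(raw_date.replace("Z", "+00:00")) if raw_date else None
    return {"id": elem.get(qn("id")), "author": elem.get(qn("author")), "date": when}


ins = ET.fromstring(
    f'<w:ins xmlns:w="{W}" w:id="1" w:author="John Smith" '
    'w:date="2024-01-15T10:30:00Z"><w:r><w:t>inserted text</w:t></w:r></w:ins>'
)
meta = revision_metadata(ins)
print(meta["id"], meta["author"], meta["date"])
```

The attributes are namespace-qualified, so a plain elem.get("author") returns None.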

Text Deletions (`w:del`)

<w:del w:id="2" w:author="Jane Doe" w:date="2024-01-16T14:00:00Z">
  <w:r>
    <w:delText>deleted text</w:delText>
  </w:r>
</w:del>

Note: Deleted text uses w:delText, not w:t.

Formatting Changes (`w:rPrChange`)

<w:r>
  <w:rPr>
    <w:b/>
    <w:rPrChange w:id="5" w:author="Editor" w:date="2024-01-17T09:00:00Z">
      <w:rPr/>
    </w:rPrChange>
  </w:rPr>
  <w:t>now bold</w:t>
</w:r>

This tracks when text was made bold. The outer w:rPr holds the new state (<w:b/>); the w:rPr nested inside w:rPrChange holds the previous state (empty, so no bold).
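One way to turn this into something reportable is to compare the property names in the current w:rPr with those in the previous one. The sketch below does only that. It is a simplification that ignores attribute values such as w:val="0", and formatting_diff is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

RUN_XML = f"""
<w:r xmlns:w="{W}">
  <w:rPr>
    <w:b/>
    <w:rPrChange w:id="5" w:author="Editor" w:date="2024-01-17T09:00:00Z">
      <w:rPr/>
    </w:rPrChange>
  </w:rPr>
  <w:t>now bold</w:t>
</w:r>
"""


def local_name(elem):
    """Strip the '{namespace}' prefix from an element tag."""
    return elem.tag.split("}", 1)[1]


def formatting_diff(run):
    """Return (added, removed) run-property names for a tracked format change."""
    rpr = run.find("w:rPr", NS)
    change = rpr.find("w:rPrChange", NS) if rpr is not None else None
    if change is None:
        return set(), set()
    # Current state: everything in the outer w:rPr except the change record
    current = {local_name(e) for e in rpr if local_name(e) != "rPrChange"}
    # Previous state: the w:rPr nested inside w:rPrChange
    old_rpr = change.find("w:rPr", NS)
    previous = {local_name(e) for e in old_rpr} if old_rpr is not None else set()
    return current - previous, previous - current


added, removed = formatting_diff(ET.fromstring(RUN_XML))
print("added:", added, "removed:", removed)
```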

Paragraph Property Changes (`w:pPrChange`)

<w:pPr>
  <w:jc w:val="center"/>
  <w:pPrChange w:id="6" w:author="Designer" w:date="2024-01-18T11:00:00Z">
    <w:pPr>
      <w:jc w:val="left"/>
    </w:pPr>
  </w:pPrChange>
</w:pPr>

Paragraph changed from left-aligned to centered. As with w:rPrChange, the nested w:pPr records the previous state.
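For a single property such as alignment, the old and new values can be read side by side. This is a sketch under the assumption that w:jc is set directly on the paragraph. A value of None here only means the alignment comes from the style, and the helper names are illustrative.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

PPR_XML = f"""
<w:pPr xmlns:w="{W}">
  <w:jc w:val="center"/>
  <w:pPrChange w:id="6" w:author="Designer" w:date="2024-01-18T11:00:00Z">
    <w:pPr>
      <w:jc w:val="left"/>
    </w:pPr>
  </w:pPrChange>
</w:pPr>
"""


def alignment_of(ppr):
    """Return the w:jc value set directly on a w:pPr element, or None."""
    jc = ppr.find("w:jc", NS) if ppr is not None else None
    return jc.get(f"{{{W}}}val") if jc is not None else None


def alignment_change(ppr):
    """Return (old, new, author) for a tracked alignment change, else None."""
    change = ppr.find("w:pPrChange", NS)
    if change is None:
        return None
    old = alignment_of(change.find("w:pPr", NS))  # previous state
    new = alignment_of(ppr)                       # current state
    if old == new:
        return None
    return old, new, change.get(f"{{{W}}}author")


alignment = alignment_change(ET.fromstring(PPR_XML))
print(alignment)
```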

Move Operations (`w:moveFrom`, `w:moveTo`)

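<w:moveFrom w:id="3" w:author="Editor" w:date="...">
  <w:r><w:t>moved text</w:t></w:r>
</w:moveFrom>
<!-- ... elsewhere in document ... -->
<w:moveTo w:id="4" w:author="Editor" w:date="...">
  <w:r><w:t>moved text</w:t></w:r>
</w:moveTo>

Tracks text movement (not just delete + insert). The two halves are not linked by w:id, which must be unique per revision. Word pairs them through w:moveFromRangeStart and w:moveToRangeStart markers (omitted here) that share a w:name value.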
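Pairing the source and destination of a move takes a little more than the snippet shows. In files written by Word, each half is normally bracketed by w:moveFromRangeStart / w:moveToRangeStart markers that carry a shared w:name. The sketch below pairs halves by that name. The sample XML is hand-written for illustration, and it assumes each range stays within a single paragraph, which real documents do not guarantee.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

BODY_XML = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:moveFromRangeStart w:id="10" w:name="move1" w:author="Editor"/>
    <w:moveFrom w:id="11" w:author="Editor"><w:r><w:t>moved text</w:t></w:r></w:moveFrom>
    <w:moveFromRangeEnd w:id="10"/>
  </w:p>
  <w:p>
    <w:moveToRangeStart w:id="12" w:name="move1" w:author="Editor"/>
    <w:moveTo w:id="13" w:author="Editor"><w:r><w:t>moved text</w:t></w:r></w:moveTo>
    <w:moveToRangeEnd w:id="12"/>
  </w:p>
</w:body>
"""


def qn(local):
    return f"{{{W}}}{local}"


def collect_moves(body):
    """Map each move name to the text found at its source and destination."""
    moves = {}
    for p in body.iter(qn("p")):
        current = None  # name of the move range currently open
        for child in p:
            if child.tag in (qn("moveFromRangeStart"), qn("moveToRangeStart")):
                current = child.get(qn("name"))
            elif child.tag in (qn("moveFromRangeEnd"), qn("moveToRangeEnd")):
                current = None
            elif child.tag in (qn("moveFrom"), qn("moveTo")):
                side = "from" if child.tag == qn("moveFrom") else "to"
                text = "".join(t.text or "" for t in child.iter(qn("t")))
                entry = moves.setdefault(current, {"from": "", "to": ""})
                entry[side] += text
    return moves


moves = collect_moves(ET.fromstring(BODY_XML))
print(moves)
```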

Parsing OOXML Revisions Directly

If you need full control, parse the XML directly:

from lxml import etree
from zipfile import ZipFile

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
NSMAP = {'w': W_NS}

def _attr(elem, name):
    """Read a w:-namespaced attribute such as w:author."""
    return elem.get(f'{{{W_NS}}}{name}')

def _revision(elem, kind, text_xpath):
    """Build one revision record from a w:ins or w:del element."""
    return {
        'type': kind,
        'id': _attr(elem, 'id'),
        'author': _attr(elem, 'author'),
        'date': _attr(elem, 'date'),
        'text': ''.join(elem.xpath(text_xpath, namespaces=NSMAP)),
    }

def extract_revisions(docx_path):
    """Extract tracked insertions and deletions from the main document part."""
    with ZipFile(docx_path, 'r') as zf:
        xml_content = zf.read('word/document.xml')

    root = etree.fromstring(xml_content)
    revisions = []

    # Inserted text lives in w:t elements under w:ins
    for ins in root.xpath('//w:ins', namespaces=NSMAP):
        revisions.append(_revision(ins, 'insertion', './/w:t/text()'))

    # Deleted text lives in w:delText elements under w:del
    for deletion in root.xpath('//w:del', namespaces=NSMAP):
        revisions.append(_revision(deletion, 'deletion', './/w:delText/text()'))

    return revisions

# Usage
revisions = extract_revisions('contract_with_changes.docx')
for rev in revisions:
    print(f"{rev['type']}: '{rev['text']}' by {rev['author']} at {rev['date']}")
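Note that extract_revisions reads only word/document.xml. Tracked changes can also sit in other parts of the package, such as headers, footers, footnotes, and endnotes. The following sketch counts revision elements in every XML part under word/. The function name is hypothetical and the in-memory archive is a stand-in for a real DOCX, built here only so the example runs on its own.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
REVISION_TAGS = {f"{{{W}}}{name}" for name in ("ins", "del", "moveFrom", "moveTo")}


def count_revisions_by_part(docx_file):
    """Map each XML part under word/ to the number of revision elements in it."""
    counts = {}
    with zipfile.ZipFile(docx_file) as zf:
        for name in zf.namelist():
            if not (name.startswith("word/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            found = sum(1 for elem in root.iter() if elem.tag in REVISION_TAGS)
            if found:
                counts[name] = found
    return counts


def _fake_part(inner_xml):
    """Wrap a fragment in a namespaced root (illustration only)."""
    return f'<w:root xmlns:w="{W}">{inner_xml}</w:root>'


# Build a tiny stand-in archive; a path to a real .docx works the same way.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("word/document.xml", _fake_part('<w:ins w:id="1"/><w:del w:id="2"/>'))
    zf.writestr("word/footnotes.xml", _fake_part('<w:ins w:id="3"/>'))
    zf.writestr("word/styles.xml", _fake_part(""))
buffer.seek(0)

part_counts = count_revisions_by_part(buffer)
print(part_counts)
```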

This works for reading. Writing is harder. You need to:

Track revision IDs (must be unique per document)
Maintain rsid values (revision save IDs)
Handle nested revisions correctly
Update the revision tracking settings in settings.xml
Preserve existing revisions while adding new ones
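As one illustration of the write side, a tracked deletion wraps an existing run in w:del and renames its w:t elements to w:delText. The sketch below shows just that step with the standard library. mark_run_deleted is a hypothetical helper, the caller is assumed to supply an unused revision ID, and rsid values and settings.xml are left untouched.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the w: prefix when serialising


def qn(local):
    return f"{{{W}}}{local}"


def mark_run_deleted(paragraph, run, rev_id, author):
    """Wrap `run` (a direct child of `paragraph`) in a w:del element."""
    index = list(paragraph).index(run)
    deletion = ET.Element(qn("del"), {
        qn("id"): str(rev_id),
        qn("author"): author,
        qn("date"): datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    paragraph.remove(run)
    # Deleted text must be stored in w:delText rather than w:t
    for t in list(run.iter(qn("t"))):
        t.tag = qn("delText")
    deletion.append(run)
    paragraph.insert(index, deletion)  # put the wrapper where the run was
    return deletion


p = ET.fromstring(
    f'<w:p xmlns:w="{W}"><w:r><w:t>Net 30</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> days</w:t></w:r></w:p>'
)
deletion = mark_run_deleted(p, p[0], rev_id=1, author="Legal Bot")
deleted_text = "".join(t.text for t in deletion.iter(qn("delText")))
remaining_text = "".join(t.text for t in p.iter(qn("t")))
print(ET.tostring(p, encoding="unicode"))
```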

The Problem with Modifying Documents

When you use python-docx to modify a document that has existing track changes:

from docx import Document

doc = Document("contract_with_changes.docx")
doc.paragraphs[0].add_run(" Additional text.")
doc.save("modified.docx")

What can happen:

Existing track changes may be corrupted
New changes are NOT tracked (no w:ins wrapper)
Revision IDs may conflict
The document may become unreadable in Word

python-docx doesn't know about revisions, so it can't preserve them correctly.
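A cheap sanity check after any automated edit is to look for revision IDs that appear more than once. The following is a sketch only: the set of element names is a partial list chosen for illustration, and duplicate_revision_ids is a hypothetical helper.

```python
from collections import Counter
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(local):
    return f"{{{W}}}{local}"


# Partial list of revision elements that carry a w:id (illustrative)
REVISION_TAGS = {
    qn(name)
    for name in ("ins", "del", "moveFrom", "moveTo", "rPrChange", "pPrChange")
}


def duplicate_revision_ids(root):
    """Return the sorted w:id values used by more than one revision element."""
    seen = Counter(
        elem.get(qn("id"))
        for elem in root.iter()
        if elem.tag in REVISION_TAGS and elem.get(qn("id")) is not None
    )
    return sorted(rev_id for rev_id, count in seen.items() if count > 1)


sample = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    '<w:ins w:id="7"><w:r><w:t>a</w:t></w:r></w:ins>'
    '<w:del w:id="7"><w:r><w:delText>b</w:delText></w:r></w:del>'
    '<w:ins w:id="8"><w:r><w:t>c</w:t></w:r></w:ins>'
    '</w:p>'
)
duplicates = duplicate_revision_ids(sample)
print(duplicates)
```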

Alternatives That Actually Work

Option 1: Word COM Automation (Windows Only)

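import os
import win32com.client

def accept_all_changes(input_path, output_path):
    """Accept all track changes using Word COM and save to output_path."""
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word resolves relative paths against its own working directory,
        # so pass absolute paths.
        doc = word.Documents.Open(os.path.abspath(input_path))
        doc.AcceptAllRevisions()
        # Save to output_path so the input file is left untouched
        doc.SaveAs(os.path.abspath(output_path))
        doc.Close()
    finally:
        # Always shut Word down, even if a call above fails
        word.Quit()

def add_text_with_tracking(input_path, text, output_path):
    """Add text with revision tracking enabled."""
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(os.path.abspath(input_path))
        doc.TrackRevisions = True

        # Add text at end; with tracking on, Word records it as an insertion
        doc.Content.InsertAfter(text)

        doc.SaveAs(os.path.abspath(output_path))
        doc.Close()
    finally:
        word.Quit()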

Pros:

Full Word functionality
Reliable revision handling
Supports all Word features

Cons:

Windows only
Requires Word installation
Slow (COM overhead)
Licensing considerations

Option 2: LibreOffice Headless

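import subprocess

def accept_all_changes_libreoffice(input_path, output_path):
    """Start headless LibreOffice; accepting changes still needs UNO calls."""
    # Illustrative Basic macro to accept all changes. This function does NOT
    # install or run it; check the method names against the LibreOffice API.
    macro = """
    Sub AcceptAll
        ThisComponent.AcceptAllChanges()
        ThisComponent.Store()
    End Sub
    """

    # This is simplified - actual implementation needs macro setup.
    # The --accept value carries no embedded quotes: subprocess does not go
    # through a shell, so literal quotes would become part of the argument.
    # Popen is used because a listening soffice process does not exit on its
    # own, and subprocess.run would block.
    process = subprocess.Popen([
        'soffice',
        '--headless',
        '--accept=socket,host=localhost,port=2002;urp;',
        input_path
    ])
    # Additional UNO API calls are needed to accept the changes and store the
    # result to output_path. The caller must also terminate `process`.
    return process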

Pros:

Cross-platform (Linux, macOS, Windows)
Free/open source
No Word license needed

Cons:

Complex setup
UNO API is poorly documented
Some Word compatibility issues
Slow for batch processing

Option 3: DocMods API

from docxagent import DocxClient

client = DocxClient()

# Upload document
doc_id = client.upload("contract_with_changes.docx")

# Read with revision awareness
content = client.read(doc_id, include_revisions=True)
# Returns revisions with author, date, original and new text

# Make changes WITH track changes
client.edit(
    doc_id,
    "Change the payment terms from Net 30 to Net 45"
)
# Changes are tracked with proper w:ins/w:del elements

# Accept specific revisions
client.accept_revision(doc_id, revision_id="1")

# Reject revisions
client.reject_revision(doc_id, revision_id="2")

# Accept all
client.accept_all_revisions(doc_id)

# Download
client.download(doc_id, "final_contract.docx")

Pros:

Full revision tracking support
Cross-platform (API-based)
AI-powered editing with track changes
No local software dependencies

Cons:

Requires API calls (network dependency)
Usage-based pricing

Option 4: Direct OOXML Manipulation

For complete control, manipulate the OOXML directly:

from lxml import etree
from zipfile import ZipFile
import tempfile
import shutil
import os

NSMAP = {
    'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
}

class DocxRevisionEditor:
    def __init__(self, path):
        self.path = path
        self.temp_dir = tempfile.mkdtemp()

        # Extract DOCX
        with ZipFile(path, 'r') as zf:
            zf.extractall(self.temp_dir)

        # Parse document.xml
        doc_path = os.path.join(self.temp_dir, 'word', 'document.xml')
        self.tree = etree.parse(doc_path)
        self.root = self.tree.getroot()

        # Track max revision ID
        self.max_id = self._get_max_revision_id()

    def _get_max_revision_id(self):
        """Find highest existing revision ID."""
        max_id = 0
        for elem in self.root.xpath('//*[@w:id]', namespaces=NSMAP):
            try:
                rev_id = int(elem.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}id'))
                max_id = max(max_id, rev_id)
            except (ValueError, TypeError):
                pass
        return max_id

    def accept_revision(self, revision_id):
        """Accept a specific revision by ID."""
        # Find insertion
        ins = self.root.xpath(f'//w:ins[@w:id="{revision_id}"]', namespaces=NSMAP)
        if ins:
            # Move children out of w:ins, remove w:ins
            parent = ins[0].getparent()
            index = list(parent).index(ins[0])
            for child in list(ins[0]):
                parent.insert(index, child)
                index += 1
            parent.remove(ins[0])
            return True

        # Find deletion
        deletion = self.root.xpath(f'//w:del[@w:id="{revision_id}"]', namespaces=NSMAP)
        if deletion:
            # Remove the entire w:del element (text is gone)
            parent = deletion[0].getparent()
            parent.remove(deletion[0])
            return True

        return False

    def insert_text_tracked(self, paragraph_index, text, author="Python Script"):
        """Insert text with track changes."""
        from datetime import datetime, timezone

        paragraphs = self.root.xpath('//w:p', namespaces=NSMAP)
        if paragraph_index >= len(paragraphs):
            raise IndexError("Paragraph index out of range")

        para = paragraphs[paragraph_index]

        # Allocate the ID only after validation so a failed call doesn't use one up
        self.max_id += 1

        # Create w:ins element
        w = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'
        ins = etree.Element(f'{w}ins')
        ins.set(f'{w}id', str(self.max_id))
        ins.set(f'{w}author', author)
        # Second-precision UTC timestamp, e.g. 2024-01-15T10:30:00Z
        # (datetime.utcnow() is deprecated since Python 3.12)
        ins.set(f'{w}date', datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'))

        # Create run with text
        run = etree.SubElement(ins, f'{w}r')
        t = etree.SubElement(run, f'{w}t')
        # Without xml:space="preserve", leading/trailing spaces are dropped
        t.set('{http://www.w3.org/XML/1998/namespace}space', 'preserve')
        t.text = text

        # Append to paragraph
        para.append(ins)

    def save(self, output_path):
        """Save the modified document."""
        from zipfile import ZIP_DEFLATED

        # Write document.xml
        doc_path = os.path.join(self.temp_dir, 'word', 'document.xml')
        self.tree.write(doc_path, xml_declaration=True, encoding='UTF-8', standalone=True)

        # Repack DOCX. ZipFile stores entries uncompressed by default,
        # so request deflate explicitly.
        with ZipFile(output_path, 'w', ZIP_DEFLATED) as zf:
            for root, dirs, files in os.walk(self.temp_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    arc_name = os.path.relpath(file_path, self.temp_dir)
                    zf.write(file_path, arc_name)

        # Cleanup (the editor cannot be saved again after this)
        shutil.rmtree(self.temp_dir)

# Usage
editor = DocxRevisionEditor("contract.docx")
editor.insert_text_tracked(0, " AMENDED: New terms apply.", author="Legal Bot")
editor.accept_revision("1")  # Accept revision with ID 1
editor.save("contract_modified.docx")

Pros:

Full control over revisions
No external dependencies beyond lxml
Cross-platform
Can handle complex revision scenarios

Cons:

Complex to implement correctly
Must handle all edge cases
Easy to corrupt documents
Need deep OOXML knowledge
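The class above only accepts revisions. Rejecting is the mirror image: a rejected insertion is removed outright, while a rejected deletion is unwrapped and its w:delText turned back into w:t. A minimal standalone sketch follows, written against the standard library instead of lxml, so it builds its own parent map. reject_revision is a hypothetical helper and handles only w:ins and w:del.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(local):
    return f"{{{W}}}{local}"


def reject_revision(root, revision_id):
    """Reject one w:ins or w:del by ID; return True if something changed."""
    # ElementTree elements do not know their parent, so build a lookup
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for elem in root.iter():
        if elem.get(qn("id")) != str(revision_id) or elem not in parent_of:
            continue
        parent = parent_of[elem]
        if elem.tag == qn("ins"):
            # Rejecting an insertion discards the inserted runs
            parent.remove(elem)
            return True
        if elem.tag == qn("del"):
            # Rejecting a deletion restores the runs as normal text
            index = list(parent).index(elem)
            for deleted in list(elem.iter(qn("delText"))):
                deleted.tag = qn("t")
            for offset, child in enumerate(list(elem)):
                parent.insert(index + offset, child)
            parent.remove(elem)
            return True
    return False


p = ET.fromstring(
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t xml:space="preserve">The contract amount is </w:t></w:r>'
    '<w:del w:id="1" w:author="Legal"><w:r><w:delText>$50,000</w:delText></w:r></w:del>'
    '<w:ins w:id="2" w:author="Legal"><w:r><w:t>$75,000</w:t></w:r></w:ins>'
    '<w:r><w:t xml:space="preserve"> per year.</w:t></w:r>'
    '</w:p>'
)
rejected_del = reject_revision(p, 1)
rejected_ins = reject_revision(p, 2)
restored_text = "".join(t.text for t in p.iter(qn("t")))
print(restored_text)
```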

Building a Revision-Aware Pipeline

For production document automation with track changes:

from docxagent import DocxClient
import json

def contract_review_pipeline(template_path, data, reviewer_name):
    """
    Automated contract review with tracked changes.

    1. Load template
    2. Fill in data
    3. AI reviews and suggests changes (tracked)
    4. Return document with all changes visible
    """
    client = DocxClient()

    # Upload and fill template
    doc_id = client.upload(template_path)

    # Fill template variables
    for key, value in data.items():
        client.edit(doc_id, f"Replace placeholder {{{{{key}}}}} with: {value}")

    # AI-powered contract review with track changes
    client.edit(
        doc_id,
        f"""Review this contract as {reviewer_name}. Make specific suggestions:
        1. Flag any unusual indemnification language
        2. Verify payment terms match industry standards
        3. Check for missing limitation of liability
        4. Suggest clearer language where ambiguous

        All changes should be tracked with your name as author."""
    )

    # Get revision summary
    revisions = client.get_revisions(doc_id)

    summary = {
        "total_revisions": len(revisions),
        "insertions": len([r for r in revisions if r['type'] == 'insertion']),
        "deletions": len([r for r in revisions if r['type'] == 'deletion']),
        "by_author": {}
    }

    for rev in revisions:
        author = rev.get('author', 'Unknown')
        if author not in summary['by_author']:
            summary['by_author'][author] = 0
        summary['by_author'][author] += 1

    # Download document with all tracked changes visible
    client.download(doc_id, "reviewed_contract.docx")

    return summary

# Usage
result = contract_review_pipeline(
    "msa_template.docx",
    {
        "CLIENT_NAME": "Acme Corp",
        "EFFECTIVE_DATE": "January 1, 2025",
        "PAYMENT_TERMS": "Net 30"
    },
    reviewer_name="Contract AI"
)

print(json.dumps(result, indent=2))
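The summary step does not depend on any API. As a sketch, the same counts can be produced locally with collections.Counter from any list of dictionaries that have 'type' and 'author' keys, such as the records built by extract_revisions earlier. The function name and sample data are illustrative.

```python
from collections import Counter


def summarize_revisions(revisions):
    """Count revisions by type and by author."""
    by_type = Counter(rev["type"] for rev in revisions)
    by_author = Counter(rev.get("author") or "Unknown" for rev in revisions)
    return {
        "total_revisions": len(revisions),
        "insertions": by_type["insertion"],  # Counter returns 0 if absent
        "deletions": by_type["deletion"],
        "by_author": dict(by_author),
    }


sample_revisions = [
    {"type": "insertion", "author": "Legal", "text": "$75,000"},
    {"type": "deletion", "author": "Legal", "text": "$50,000"},
    {"type": "insertion", "author": None, "text": " per year"},
]
summary = summarize_revisions(sample_revisions)
print(summary)
```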

Why This Matters

If you're building document automation in Python without revision awareness:

Legal risk: You can't prove what changed or when
Audit failures: No trail of modifications
Collaboration breaks: Changes made by your system aren't visible to reviewers
Data loss: Original text is silently discarded

python-docx is fine for simple document generation. For anything involving revisions, tracked changes, or collaborative editing, you need tools that understand OOXML revisions.

The Bottom Line

python-docx doesn't support track changes. This isn't a bug—it's a fundamental design limitation that won't be fixed.

Your options:

Windows only: Word COM automation
Cross-platform free: LibreOffice headless (complex setup)
Cross-platform API: DocMods or similar services
Full control: Direct OOXML manipulation (steep learning curve)

Choose based on your platform requirements, complexity tolerance, and whether you need to read revisions, write revisions, or both.

For production document workflows where track changes matter, don't fight python-docx's limitations. Use the right tool for the job.
