DocMods

python-docx Doesn't Support Track Changes. Here's What Actually Works.

python-docx silently ignores tracked revisions. Your scripts are reading stale data without knowing it. Here's the OOXML reality and the alternatives that actually work.

What You'll Learn

Why python-docx ignores w:ins and w:del elements
OOXML revision tracking schema explained
Working alternatives for track changes in Python
Accepting/rejecting changes programmatically
Building document automation with revision awareness

The GitHub Issue That Never Gets Fixed

python-docx issue #455: "Track changes not supported" - opened 2016, still open.

Issue #1044: "Reading tracked changes" - 2021, no resolution.

Issue #753: "Preserve track changes when editing" - 2019, corrupts documents.

The pattern is clear: python-docx was never designed for revision-aware document processing, and bolting it on isn't happening.

What Actually Happens When You Read a Document

When python-docx reads a DOCX with track changes, here's what you're missing:

<!-- What's actually in the document -->
<w:p>
  <w:r>
    <w:t>The contract amount is </w:t>
  </w:r>
  <w:del w:author="Legal" w:date="2024-01-15T10:30:00Z">
    <w:r>
      <w:delText>$50,000</w:delText>
    </w:r>
  </w:del>
  <w:ins w:author="Legal" w:date="2024-01-15T10:30:00Z">
    <w:r>
      <w:t>$75,000</w:t>
    </w:r>
  </w:ins>
  <w:r>
    <w:t> per year.</w:t>
  </w:r>
</w:p>

# What python-docx gives you
from docx import Document
doc = Document("contract.docx")
print(doc.paragraphs[0].text)
# Output: "The contract amount is  per year."
# (runs nested inside w:ins and w:del are skipped)

# What you're missing:
# - New value: $75,000
# - Original value: $50,000
# - Who changed it: Legal
# - When: 2024-01-15T10:30:00Z
# - That this is a PENDING change, not yet accepted

Paragraph.text joins only the runs that are direct children of the paragraph (plus hyperlinks in recent releases). Runs wrapped in w:ins or w:del are not direct children, so both the inserted and the deleted text are dropped without any warning. You lose:

  • The new text (what was inserted)
  • The original text (what was deleted)
  • The change author
  • The change timestamp
  • The fact that it's a pending revision

This is catastrophic for legal document workflows, audit trails, and contract review automation.
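
Both states of the paragraph can be rebuilt from the XML alone. The following is a minimal sketch using only the standard library; paragraph_views is a hypothetical helper name, and it ignores moves, nested revisions, and non-text content such as tabs and line breaks.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def paragraph_views(p):
    """Return (original_text, accepted_text) for one w:p element."""
    # ElementTree has no parent pointers, so collect inserted w:t nodes first.
    inserted = {id(t) for ins in p.iter(f"{W}ins") for t in ins.iter(f"{W}t")}
    original, accepted = [], []
    for node in p.iter():
        text = node.text or ""
        if node.tag == f"{W}delText":
            original.append(text)       # deleted text exists only "before"
        elif node.tag == f"{W}t":
            accepted.append(text)       # all live text exists "after"
            if id(node) not in inserted:
                original.append(text)   # untouched text exists in both views
    return "".join(original), "".join(accepted)

SAMPLE = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:r><w:t>The contract amount is </w:t></w:r>'
    '<w:del w:author="Legal" w:date="2024-01-15T10:30:00Z">'
    '<w:r><w:delText>$50,000</w:delText></w:r></w:del>'
    '<w:ins w:author="Legal" w:date="2024-01-15T10:30:00Z">'
    '<w:r><w:t>$75,000</w:t></w:r></w:ins>'
    '<w:r><w:t> per year.</w:t></w:r>'
    '</w:p>'
)

original, accepted = paragraph_views(ET.fromstring(SAMPLE))
print(original)  # The contract amount is $50,000 per year.
print(accepted)  # The contract amount is $75,000 per year.
```

A review tool would typically show both strings side by side, together with the author and date taken from the wrapper element.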

The OOXML Revision Schema (What You Need to Parse)

OOXML tracks revisions through several element types:

Text Insertions (w:ins)

<w:ins w:id="1" w:author="John Smith" w:date="2024-01-15T10:30:00Z">
  <w:r>
    <w:t>inserted text</w:t>
  </w:r>
</w:ins>

Attributes:

  • w:id: Unique revision ID
  • w:author: Person who made the change
  • w:date: ISO 8601 timestamp
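
As a rough illustration, these three attributes can be read into typed values with the standard library. RevisionInfo and read_revision_info are hypothetical names, not part of any library, and the sketch assumes w:date may be absent.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

@dataclass
class RevisionInfo:  # hypothetical container for this example
    rev_id: int
    author: str
    when: Optional[datetime]

def read_revision_info(elem):
    """Read w:id, w:author and w:date from a revision element."""
    raw_date = elem.get(f"{W}date")
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11,
    # so normalise it to an explicit UTC offset first.
    when = datetime.fromisoformat(raw_date.replace("Z", "+00:00")) if raw_date else None
    return RevisionInfo(int(elem.get(f"{W}id")), elem.get(f"{W}author"), when)

SAMPLE = (
    '<w:ins xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" '
    'w:id="1" w:author="John Smith" w:date="2024-01-15T10:30:00Z">'
    '<w:r><w:t>inserted text</w:t></w:r></w:ins>'
)

info = read_revision_info(ET.fromstring(SAMPLE))
print(info.rev_id, info.author, info.when.isoformat())
```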

Text Deletions (w:del)

<w:del w:id="2" w:author="Jane Doe" w:date="2024-01-16T14:00:00Z">
  <w:r>
    <w:delText>deleted text</w:delText>
  </w:r>
</w:del>

Note: Deleted text uses w:delText, not w:t.

Formatting Changes (w:rPrChange)

<w:r>
  <w:rPr>
    <w:b/>
    <w:rPrChange w:author="Editor" w:date="2024-01-17T09:00:00Z">
      <w:rPr/>
    </w:rPrChange>
  </w:rPr>
  <w:t>now bold</w:t>
</w:r>

This tracks when text was made bold (new state: <w:b/>, old state: no bold).

Paragraph Property Changes (w:pPrChange)

<w:pPr>
  <w:jc w:val="center"/>
  <w:pPrChange w:author="Designer" w:date="2024-01-18T11:00:00Z">
    <w:pPr>
      <w:jc w:val="left"/>
    </w:pPr>
  </w:pPrChange>
</w:pPr>

Paragraph changed from left-aligned to centered.
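
For both kinds of formatting change, the old state is nested inside the change element and the new state is made up of its siblings. A small sketch of reading the two states, assuming the layouts shown above; describe and property_changes are illustrative names.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

def describe(props, skip=None):
    """Map each property element to its w:val (or True for flags like w:b)."""
    if props is None:
        return {}
    return {
        child.tag.replace(W, "w:"): child.get(f"{W}val", True)
        for child in props
        if child.tag != skip
    }

def property_changes(root):
    """Return (change tag, author, old properties, new properties) tuples."""
    changes = []
    for name in ("rPr", "pPr"):
        change_tag = f"{W}{name}Change"
        for container in root.iter(f"{W}{name}"):
            change = container.find(change_tag)
            if change is None:
                continue  # no tracked change here (or this is the nested old copy)
            old = describe(change.find(f"{W}{name}"))
            new = describe(container, skip=change_tag)
            changes.append((f"w:{name}Change", change.get(f"{W}author"), old, new))
    return changes

SAMPLE = f"""
<w:body xmlns:w="{W_NS}">
  <w:p>
    <w:pPr>
      <w:jc w:val="center"/>
      <w:pPrChange w:id="5" w:author="Designer" w:date="2024-01-18T11:00:00Z">
        <w:pPr><w:jc w:val="left"/></w:pPr>
      </w:pPrChange>
    </w:pPr>
    <w:r>
      <w:rPr>
        <w:b/>
        <w:rPrChange w:id="4" w:author="Editor" w:date="2024-01-17T09:00:00Z">
          <w:rPr/>
        </w:rPrChange>
      </w:rPr>
      <w:t>now bold</w:t>
    </w:r>
  </w:p>
</w:body>
""".strip()

changes = property_changes(ET.fromstring(SAMPLE))
for kind, author, old, new in changes:
    print(f"{kind} by {author}: {old} -> {new}")
```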

Move Operations (w:moveFrom, w:moveTo)

<w:moveFrom w:id="3" w:author="Editor" w:date="...">
  <w:r><w:t>moved text</w:t></w:r>
</w:moveFrom>
<!-- ... elsewhere in document ... -->
<w:moveTo w:id="3" w:author="Editor" w:date="...">
  <w:r><w:t>moved text</w:t></w:r>
</w:moveTo>

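Tracks text movement (not just delete + insert). The shared w:id above is a simplification: in files written by Word, each half has its own revision ID, and the pair is linked by w:moveFromRangeStart and w:moveToRangeStart markers that share a w:name.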
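
A quick way to see which of these element types a document actually contains is to count them. The sketch below covers only the six elements described in this section; other tracked-change elements exist (for example w:sectPrChange and the table-related ones) and are not counted. The labels are arbitrary.

```python
import xml.etree.ElementTree as ET
from collections import Counter

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Local element name -> human-readable label (labels chosen for this example)
REVISION_LABELS = {
    "ins": "insertion",
    "del": "deletion",
    "moveFrom": "move (source)",
    "moveTo": "move (destination)",
    "rPrChange": "run formatting",
    "pPrChange": "paragraph formatting",
}

def count_revisions(root):
    """Count revision elements by kind anywhere below root."""
    counts = Counter()
    for elem in root.iter():
        if elem.tag.startswith(W):
            label = REVISION_LABELS.get(elem.tag[len(W):])
            if label:
                counts[label] += 1
    return counts

SAMPLE = f"""
<w:body xmlns:w="{W_NS}">
  <w:p>
    <w:ins w:id="1" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins>
    <w:del w:id="2" w:author="B"><w:r><w:delText>old</w:delText></w:r></w:del>
    <w:del w:id="3" w:author="B"><w:r><w:delText>older</w:delText></w:r></w:del>
    <w:r>
      <w:rPr><w:b/><w:rPrChange w:id="4" w:author="A"><w:rPr/></w:rPrChange></w:rPr>
      <w:t>bold</w:t>
    </w:r>
  </w:p>
</w:body>
""".strip()

counts = count_revisions(ET.fromstring(SAMPLE))
print(dict(counts))
```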

Parsing OOXML Revisions Directly

If you need full control, parse the XML directly:

from lxml import etree
from zipfile import ZipFile

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
NSMAP = {'w': W_NS}
W = f'{{{W_NS}}}'  # Clark-notation prefix for attribute lookups

def extract_revisions(docx_path):
    """Extract tracked insertions and deletions from the main document part."""
    with ZipFile(docx_path, 'r') as zf:
        root = etree.fromstring(zf.read('word/document.xml'))

    # (revision type, wrapper element, element that holds the text)
    kinds = [
        ('insertion', 'w:ins', 'w:t'),
        ('deletion', 'w:del', 'w:delText'),
    ]

    revisions = []
    for rev_type, wrapper, text_tag in kinds:
        for elem in root.xpath(f'//{wrapper}', namespaces=NSMAP):
            revisions.append({
                'type': rev_type,
                'id': elem.get(f'{W}id'),
                'author': elem.get(f'{W}author'),
                'date': elem.get(f'{W}date'),
                'text': ''.join(elem.xpath(f'.//{text_tag}/text()', namespaces=NSMAP)),
            })

    return revisions

# Usage
revisions = extract_revisions('contract_with_changes.docx')
for rev in revisions:
    print(f"{rev['type']}: '{rev['text']}' by {rev['author']} at {rev['date']}")
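
One gap worth noting: the function above reads only word/document.xml, while headers, footers, footnotes, and endnotes live in separate parts and can carry their own revisions. A hedged sketch of scanning those as well, using the standard library; it assumes the conventional part names (word/header1.xml and so on), whereas the authoritative list comes from the package relationships.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

# Conventional names of parts that hold body-like content (an assumption).
PART_PATTERN = re.compile(r"word/(document|footnotes|endnotes|header\d*|footer\d*)\.xml$")

def revisions_by_part(docx_file):
    """Return {part name: number of w:ins and w:del elements} for parts with any."""
    found = {}
    with zipfile.ZipFile(docx_file) as zf:
        for name in zf.namelist():
            if not PART_PATTERN.match(name):
                continue
            root = ET.fromstring(zf.read(name))
            n = sum(1 for _ in root.iter(f"{W}ins")) + sum(1 for _ in root.iter(f"{W}del"))
            if n:
                found[name] = n
    return found

# Build a tiny in-memory package for demonstration (not a complete DOCX).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml",
                f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
                '<w:ins w:id="1" w:author="A"><w:r><w:t>x</w:t></w:r></w:ins>'
                '</w:p></w:body></w:document>')
    zf.writestr("word/footnotes.xml",
                f'<w:footnotes xmlns:w="{W_NS}"><w:footnote w:id="2"><w:p>'
                '<w:del w:id="3" w:author="B"><w:r><w:delText>y</w:delText></w:r></w:del>'
                '</w:p></w:footnote></w:footnotes>')
    zf.writestr("word/styles.xml", f'<w:styles xmlns:w="{W_NS}"/>')

result = revisions_by_part(buf)
print(result)
```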

This works for reading. Writing is harder. You need to:

  1. Track revision IDs (must be unique per document)
  2. Maintain rsid values (revision save IDs, which Word assigns per editing session)
  3. Handle nested revisions correctly
  4. Update settings.xml revision tracking settings
  5. Preserve existing revisions while adding new ones
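
To make the first and last points concrete, here is a minimal sketch of writing one tracked deletion: it allocates an ID above every existing w:id and wraps an existing run in w:del. It does not touch rsid values or settings.xml, it assumes the run is a direct child of the paragraph, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

def next_revision_id(root):
    """Return one more than the highest numeric w:id anywhere below root."""
    values = (e.get(f"{W}id") for e in root.iter())
    ids = [int(v) for v in values if v and v.isdigit()]
    return max(ids, default=0) + 1

def mark_run_deleted(root, paragraph, run, author):
    """Wrap a direct-child run of paragraph in w:del as a pending deletion."""
    index = list(paragraph).index(run)
    wrapper = ET.Element(f"{W}del", {
        f"{W}id": str(next_revision_id(root)),
        f"{W}author": author,
        f"{W}date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    paragraph.remove(run)
    for t in list(run.iter(f"{W}t")):
        t.tag = f"{W}delText"  # deleted runs carry w:delText, not w:t
    wrapper.append(run)
    paragraph.insert(index, wrapper)  # same position, so surrounding text is untouched
    return wrapper

SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    '<w:bookmarkStart w:id="7" w:name="terms"/>'
    '<w:r><w:t xml:space="preserve">Pay within </w:t></w:r>'
    '<w:r><w:t>30</w:t></w:r>'
    '<w:r><w:t xml:space="preserve"> days</w:t></w:r>'
    '</w:p>'
)

p = ET.fromstring(SAMPLE)
deleted = mark_run_deleted(p, p, p[2], author="Legal Bot")
print(deleted.get(f"{W}id"), deleted[0][0].text)
```

Counting every w:id, including bookmark IDs, is deliberately conservative: it may skip numbers, but it cannot collide with an existing revision.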

The Problem with Modifying Documents

When you modify a document with python-docx that has existing track changes:

from docx import Document

doc = Document("contract_with_changes.docx")
doc.paragraphs[0].add_run(" Additional text.")
doc.save("modified.docx")

What can happen:

  • Existing track changes may be corrupted
  • New changes are NOT tracked (no w:ins wrapper)
  • Revision IDs may conflict
  • The document may become unreadable in Word

python-docx doesn't know about revisions, so it can't preserve them correctly.
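
One defensive pattern is to check for pending revisions before handing a file to python-docx at all, and to route such files to a revision-aware tool instead. A minimal sketch with the standard library; has_tracked_changes is a hypothetical helper that inspects only the main document part and only the element types covered earlier.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

TRACKED = {
    f"{W}{name}"
    for name in ("ins", "del", "moveFrom", "moveTo", "rPrChange", "pPrChange")
}

def has_tracked_changes(docx_file):
    """Return True if word/document.xml contains any known revision element."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return any(elem.tag in TRACKED for elem in root.iter())

def make_package(body_xml):
    """Build a tiny in-memory package for demonstration (not a complete DOCX)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "word/document.xml",
            f'<w:document xmlns:w="{W_NS}"><w:body>{body_xml}</w:body></w:document>',
        )
    return buf

clean = make_package("<w:p><w:r><w:t>plain</w:t></w:r></w:p>")
tracked = make_package(
    '<w:p><w:ins w:id="1" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins></w:p>'
)

clean_result = has_tracked_changes(clean)
tracked_result = has_tracked_changes(tracked)
print(clean_result, tracked_result)
```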

Alternatives That Actually Work

Option 1: Word COM Automation (Windows Only)

import os
import win32com.client

def accept_all_changes(input_path, output_path):
    """Accept all track changes using Word COM."""
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word resolves relative paths against its own working directory
        doc = word.Documents.Open(os.path.abspath(input_path))
        doc.AcceptAllRevisions()
        doc.SaveAs(os.path.abspath(output_path))  # leave the input file untouched
        doc.Close()
    finally:
        word.Quit()  # otherwise a hidden Word process lingers after an error

def add_text_with_tracking(input_path, text, output_path):
    """Add text with revision tracking enabled."""
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(os.path.abspath(input_path))
        doc.TrackRevisions = True

        # Add text at end
        doc.Content.InsertAfter(text)

        doc.SaveAs(os.path.abspath(output_path))
        doc.Close()
    finally:
        word.Quit()

Pros:

  • Full Word functionality
  • Reliable revision handling
  • Supports all Word features

Cons:

  • Windows only
  • Requires Word installation
  • Slow (COM overhead)
  • Licensing considerations

Option 2: LibreOffice Headless

import subprocess

def accept_all_changes_libreoffice(input_path, output_path, port=2002):
    """Start headless LibreOffice so a UNO client can accept all changes.

    Simplified: this only launches the listener with the document loaded.
    Accepting the changes and saving to output_path still requires UNO API
    calls from a connected client (for example, dispatching
    ".uno:AcceptAllTrackedChanges" and then storing the document).
    """
    # The --accept value carries no embedded quotes: with an argument list
    # there is no shell to strip them, so they would reach soffice literally.
    # Popen rather than run(): the listener keeps running until terminated.
    return subprocess.Popen([
        'soffice',
        '--headless',
        f'--accept=socket,host=localhost,port={port};urp;',
        input_path,
    ])

Pros:

  • Cross-platform (Linux, macOS, Windows)
  • Free/open source
  • No Word license needed

Cons:

  • Complex setup
  • UNO API is poorly documented
  • Some Word compatibility issues
  • Slow for batch processing

Option 3: DocMods API

from docxagent import DocxClient

client = DocxClient()

# Upload document
doc_id = client.upload("contract_with_changes.docx")

# Read with revision awareness
content = client.read(doc_id, include_revisions=True)
# Returns revisions with author, date, original and new text

# Make changes WITH track changes
client.edit(
    doc_id,
    "Change the payment terms from Net 30 to Net 45"
)
# Changes are tracked with proper w:ins/w:del elements

# Accept specific revisions
client.accept_revision(doc_id, revision_id="1")

# Reject revisions
client.reject_revision(doc_id, revision_id="2")

# Accept all
client.accept_all_revisions(doc_id)

# Download
client.download(doc_id, "final_contract.docx")

Pros:

  • Full revision tracking support
  • Cross-platform (API-based)
  • AI-powered editing with track changes
  • No local software dependencies

Cons:

  • Requires API calls (network dependency)
  • Usage-based pricing

Option 4: Direct OOXML Manipulation

For complete control, manipulate the OOXML directly:

from lxml import etree
from zipfile import ZipFile
import tempfile
import shutil
import os

NSMAP = {
    'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
}

class DocxRevisionEditor:
    def __init__(self, path):
        self.path = path
        self.temp_dir = tempfile.mkdtemp()

        # Extract DOCX
        with ZipFile(path, 'r') as zf:
            zf.extractall(self.temp_dir)

        # Parse document.xml
        doc_path = os.path.join(self.temp_dir, 'word', 'document.xml')
        self.tree = etree.parse(doc_path)
        self.root = self.tree.getroot()

        # Track max revision ID
        self.max_id = self._get_max_revision_id()

    def _get_max_revision_id(self):
        """Find highest existing revision ID."""
        max_id = 0
        for elem in self.root.xpath('//*[@w:id]', namespaces=NSMAP):
            try:
                rev_id = int(elem.get('{http://schemas.openxmlformats.org/wordprocessingml/2006/main}id'))
                max_id = max(max_id, rev_id)
            except (ValueError, TypeError):
                pass
        return max_id

    def accept_revision(self, revision_id):
        """Accept a specific revision by ID."""
        # Find insertion
        ins = self.root.xpath(f'//w:ins[@w:id="{revision_id}"]', namespaces=NSMAP)
        if ins:
            # Move children out of w:ins, remove w:ins
            parent = ins[0].getparent()
            index = list(parent).index(ins[0])
            for child in list(ins[0]):
                parent.insert(index, child)
                index += 1
            parent.remove(ins[0])
            return True

        # Find deletion
        deletion = self.root.xpath(f'//w:del[@w:id="{revision_id}"]', namespaces=NSMAP)
        if deletion:
            # Remove the entire w:del element (text is gone)
            parent = deletion[0].getparent()
            parent.remove(deletion[0])
            return True

        return False

    def insert_text_tracked(self, paragraph_index, text, author="Python Script"):
        """Insert text with track changes."""
        from datetime import datetime, timezone

        paragraphs = self.root.xpath('//w:p', namespaces=NSMAP)
        if paragraph_index >= len(paragraphs):
            raise IndexError("Paragraph index out of range")

        para = paragraphs[paragraph_index]
        self.max_id += 1  # allocate the ID only once the index is known to be valid

        # Create w:ins element
        w = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'
        ins = etree.Element(f'{w}ins')
        ins.set(f'{w}id', str(self.max_id))
        ins.set(f'{w}author', author)
        # UTC timestamp at whole-second precision, e.g. 2024-01-15T10:30:00Z
        ins.set(f'{w}date', datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'))

        # Create run with text
        run = etree.SubElement(ins, f'{w}r')
        t = etree.SubElement(run, f'{w}t')
        # Without xml:space="preserve", Word drops leading/trailing spaces
        t.set('{http://www.w3.org/XML/1998/namespace}space', 'preserve')
        t.text = text

        # Append to paragraph
        para.append(ins)

    def save(self, output_path):
        """Save the modified document."""
        # Write document.xml
        doc_path = os.path.join(self.temp_dir, 'word', 'document.xml')
        self.tree.write(doc_path, xml_declaration=True, encoding='UTF-8', standalone=True)

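        # Repack DOCX with deflate compression (the ZipFile default stores files uncompressed)
        from zipfile import ZIP_DEFLATED
        with ZipFile(output_path, 'w', ZIP_DEFLATED) as zf: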
            for root, dirs, files in os.walk(self.temp_dir):
                for file in files:
                    file_path = os.path.join(root, file)
                    arc_name = os.path.relpath(file_path, self.temp_dir)
                    zf.write(file_path, arc_name)

        # Cleanup
        shutil.rmtree(self.temp_dir)

# Usage
editor = DocxRevisionEditor("contract.docx")
editor.insert_text_tracked(0, " AMENDED: New terms apply.", author="Legal Bot")
editor.accept_revision("1")  # Accept revision with ID 1
editor.save("contract_modified.docx")

Pros:

  • Full control over revisions
  • No external dependencies beyond lxml
  • Cross-platform
  • Can handle complex revision scenarios

Cons:

  • Complex to implement correctly
  • Must handle all edge cases
  • Easy to corrupt documents
  • Need deep OOXML knowledge
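
The class above accepts revisions but never rejects them. Rejection is the mirror image: an insertion is removed outright, and a deletion is unwrapped with its w:delText turned back into w:t. The following is a standalone sketch with the standard library, not a method of the class above. It handles only run-level w:ins and w:del; a w:ins or w:del that marks a paragraph mark (inside w:rPr) needs paragraph merging or splitting and is not handled correctly here.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"

def reject_revision(root, revision_id):
    """Reject one run-level insertion or deletion by its w:id."""
    # ElementTree has no getparent(), so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    for elem in root.iter():
        if elem.get(f"{W}id") != str(revision_id):
            continue
        if elem.tag == f"{W}ins":
            parents[elem].remove(elem)  # the inserted text never existed
            return True
        if elem.tag == f"{W}del":
            parent = parents[elem]
            index = list(parent).index(elem)
            parent.remove(elem)
            for offset, child in enumerate(list(elem)):
                for deleted_text in list(child.iter(f"{W}delText")):
                    deleted_text.tag = f"{W}t"  # restore as live text
                parent.insert(index + offset, child)
            return True
    return False

SAMPLE = (
    f'<w:p xmlns:w="{W_NS}">'
    '<w:r><w:t>The contract amount is </w:t></w:r>'
    '<w:del w:id="1" w:author="Legal"><w:r><w:delText>$50,000</w:delText></w:r></w:del>'
    '<w:ins w:id="2" w:author="Legal"><w:r><w:t>$75,000</w:t></w:r></w:ins>'
    '<w:r><w:t> per year.</w:t></w:r>'
    '</w:p>'
)

p = ET.fromstring(SAMPLE)
rejected_ins = reject_revision(p, 2)
rejected_del = reject_revision(p, 1)
missing = reject_revision(p, 99)
text = "".join(t.text for t in p.iter(f"{W}t"))
print(text)  # The contract amount is $50,000 per year.
```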

Building a Revision-Aware Pipeline

For production document automation with track changes:

from docxagent import DocxClient
import json

def contract_review_pipeline(template_path, data, reviewer_name):
    """
    Automated contract review with tracked changes.

    1. Load template
    2. Fill in data
    3. AI reviews and suggests changes (tracked)
    4. Return document with all changes visible
    """
    client = DocxClient()

    # Upload and fill template
    doc_id = client.upload(template_path)

    # Fill template variables
    for key, value in data.items():
        client.edit(doc_id, f"Replace placeholder {{{{{key}}}}} with: {value}")

    # AI-powered contract review with track changes
    client.edit(
        doc_id,
        f"""Review this contract as {reviewer_name}. Make specific suggestions:
        1. Flag any unusual indemnification language
        2. Verify payment terms match industry standards
        3. Check for missing limitation of liability
        4. Suggest clearer language where ambiguous

        All changes should be tracked with your name as author."""
    )

    # Get revision summary
    revisions = client.get_revisions(doc_id)

    summary = {
        "total_revisions": len(revisions),
        "insertions": len([r for r in revisions if r['type'] == 'insertion']),
        "deletions": len([r for r in revisions if r['type'] == 'deletion']),
        "by_author": {}
    }

    for rev in revisions:
        author = rev.get('author', 'Unknown')
        if author not in summary['by_author']:
            summary['by_author'][author] = 0
        summary['by_author'][author] += 1

    # Download document with all tracked changes visible
    client.download(doc_id, "reviewed_contract.docx")

    return summary

# Usage
result = contract_review_pipeline(
    "msa_template.docx",
    {
        "CLIENT_NAME": "Acme Corp",
        "EFFECTIVE_DATE": "January 1, 2025",
        "PAYMENT_TERMS": "Net 30"
    },
    reviewer_name="Contract AI"
)

print(json.dumps(result, indent=2))
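
The summary step does not depend on any particular client. As a sketch, the same counts can be produced with collections.Counter from any list of revision dictionaries that have 'type' and 'author' keys; that shape is an assumption here, matching the output of extract_revisions earlier in this article.

```python
from collections import Counter

def summarize_revisions(revisions):
    """Summarise a list of revision dicts by type and by author."""
    by_type = Counter(r.get("type", "unknown") for r in revisions)
    by_author = Counter(r.get("author") or "Unknown" for r in revisions)
    return {
        "total_revisions": len(revisions),
        "insertions": by_type["insertion"],
        "deletions": by_type["deletion"],
        "by_author": dict(by_author),
    }

# Made-up sample data for illustration only
sample_revisions = [
    {"type": "insertion", "author": "Legal", "text": "$75,000"},
    {"type": "deletion", "author": "Legal", "text": "$50,000"},
    {"type": "insertion", "author": None, "text": " AMENDED"},
]

summary = summarize_revisions(sample_revisions)
print(summary)
```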

Why This Matters

If you're building document automation in Python without revision awareness:

  1. Legal risk: You can't prove what changed or when
  2. Audit failures: No trail of modifications
  3. Collaboration breaks: Changes made by your system aren't visible to reviewers
  4. Data loss: Original text is silently discarded

python-docx is fine for simple document generation. For anything involving revisions, tracked changes, or collaborative editing, you need tools that understand OOXML revisions.

The Bottom Line

python-docx doesn't support track changes. This isn't a bug—it's a fundamental design limitation that won't be fixed.

Your options:

  • Windows only: Word COM automation
  • Cross-platform free: LibreOffice headless (complex setup)
  • Cross-platform API: DocMods or similar services
  • Full control: Direct OOXML manipulation (steep learning curve)

Choose based on your platform requirements, complexity tolerance, and whether you need to read revisions, write revisions, or both.

For production document workflows where track changes matter, don't fight python-docx's limitations. Use the right tool for the job.
