Track Changes: The Feature That Powers Legal, Finance, and Compliance
Track changes isn't just a Word feature. It's the infrastructure for professional document collaboration:
- Contract negotiations: Both parties see exactly what changed
- Regulatory filings: Auditors trace every modification
- Legal review: Partners see associate edits before approval
- Board documents: Governance requires revision history
- Compliance: SOX, HIPAA, and others mandate change documentation
When track changes breaks, professional document workflows break.
How Word Actually Stores Track Changes
Open any DOCX file as a ZIP archive (because that's what it is). Inside word/document.xml, you'll find:
Insertions (w:ins)
<w:p>
  <w:r>
    <w:t>The contract value is </w:t>
  </w:r>
  <w:ins w:id="1" w:author="Jane Smith" w:date="2026-02-01T14:30:00Z">
    <w:r>
      <w:t>$150,000</w:t>
    </w:r>
  </w:ins>
  <w:r>
    <w:t> per year.</w:t>
  </w:r>
</w:p>
The w:ins element wraps inserted text with:
- w:id: Unique revision identifier
- w:author: Who made the change
- w:date: When (ISO 8601 timestamp)
Deletions (w:del)
<w:del w:id="2" w:author="John Doe" w:date="2026-02-01T15:00:00Z">
  <w:r>
    <w:delText>old text that was removed</w:delText>
  </w:r>
</w:del>
Deleted text uses w:delText (not w:t) and remains in the document—it's just marked as deleted.
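Because deleted text stays in the file, both the "before" and "after" readings of a paragraph can be recovered from the same XML. The following is a minimal sketch using only the standard library; the function name paragraph_views and the sample paragraph are illustrative assumptions, and it ignores moves, tabs, and line breaks.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def paragraph_views(paragraph):
    """Return (text with all revisions rejected, text with all accepted)."""
    original, final = [], []

    def walk(node, inserted):
        for child in node:
            if child.tag == f"{{{W}}}ins":
                # Everything below a w:ins exists only once it is accepted
                walk(child, True)
            elif child.tag == f"{{{W}}}t":
                final.append(child.text or "")
                if not inserted:
                    original.append(child.text or "")
            elif child.tag == f"{{{W}}}delText":
                # Deleted text belongs to the original reading only
                if not inserted:
                    original.append(child.text or "")
            else:
                walk(child, inserted)

    walk(paragraph, False)
    return "".join(original), "".join(final)

# Hypothetical paragraph: "Net 30" deleted, "Net 45" inserted
SAMPLE = f'''<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">Payment terms are </w:t></w:r>
  <w:del w:id="1" w:author="John Doe" w:date="2026-02-01T15:00:00Z">
    <w:r><w:delText>Net 30</w:delText></w:r>
  </w:del>
  <w:ins w:id="2" w:author="John Doe" w:date="2026-02-01T15:00:00Z">
    <w:r><w:t>Net 45</w:t></w:r>
  </w:ins>
  <w:r><w:t>.</w:t></w:r>
</w:p>'''

original, final = paragraph_views(ET.fromstring(SAMPLE))
print("Rejected:", original)
print("Accepted:", final)
```

A tool that reads only w:t elements sees the "accepted" reading and silently loses the other one.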
Formatting Changes (w:rPrChange)
<w:r>
  <w:rPr>
    <w:b/> <!-- text is now bold -->
    <w:rPrChange w:id="3" w:author="Editor" w:date="2026-02-01T16:00:00Z">
      <w:rPr/> <!-- was not bold before -->
    </w:rPrChange>
  </w:rPr>
  <w:t>emphasized text</w:t>
</w:r>
The old formatting lives inside w:rPrChange; the new formatting sits outside it, in the enclosing w:rPr.
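To make the old-versus-new split concrete, here is a small sketch that reports which run properties a tracked formatting change added or removed. The helper name formatting_diff is an assumption for illustration, and the comparison looks only at which property elements are present, not at their attribute values (so a w:b with w:val="0" would be misread as bold).

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def q(name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{name}"

def local(tag):
    return tag.rsplit("}", 1)[-1]

def formatting_diff(run):
    """Describe the tracked formatting change on one w:r, or return None."""
    rpr = run.find(q("rPr"))
    change = rpr.find(q("rPrChange")) if rpr is not None else None
    if change is None:
        return None
    # Current formatting: children of w:rPr other than the change record
    now = {local(c.tag) for c in rpr if c.tag != q("rPrChange")}
    # Previous formatting: children of the w:rPr nested in w:rPrChange
    old_rpr = change.find(q("rPr"))
    before = {local(c.tag) for c in old_rpr} if old_rpr is not None else set()
    return {
        "author": change.get(q("author")),
        "date": change.get(q("date")),
        "added": sorted(now - before),
        "removed": sorted(before - now),
    }

SAMPLE = f'''<w:r xmlns:w="{W}">
  <w:rPr>
    <w:b/>
    <w:rPrChange w:id="3" w:author="Editor" w:date="2026-02-01T16:00:00Z">
      <w:rPr/>
    </w:rPrChange>
  </w:rPr>
  <w:t>emphasized text</w:t>
</w:r>'''

diff = formatting_diff(ET.fromstring(SAMPLE))
print(diff)
```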
Why This Matters for Automation
Any tool that claims to "edit Word documents" must do one of three things:
1. Produce proper OOXML revisions (w:ins, w:del with metadata)
2. Destroy existing track changes (most tools)
3. Work around the problem by not touching tracked areas (limited usefulness)
Most AI tools and document libraries choose option 2 without telling you.
The AI Track Changes Gap
What AI Tools Actually Do
ChatGPT, Claude, Gemini:
Input: "Review this contract clause"
Output: "Here's a suggested revision: [text]"
You get text in a chat window. Applying it to your document with track changes is your problem.
Microsoft Copilot:
Input: Document open in Word + prompt
Output: New content replaces old content
Copilot rewrites text, but the rewrite isn't tracked. Old content disappears without revision history.
Most "AI document" tools:
Input: Upload document
Output: Download new document (or PDF)
The output is a fresh document. Any existing track changes from prior reviewers? Gone.
The Exception: Document-Level AI Editing
Tools that operate directly on DOCX structure can produce real track changes:
from docxagent import DocxClient

client = DocxClient()
doc_id = client.upload("contract.docx")

# This edit produces actual w:ins elements
client.edit(
    doc_id,
    "Change 'Net 30' payment terms to 'Net 45'",
    author="Contract AI"
)

# Downloaded document has:
# - Original "Net 30" wrapped in w:del
# - New "Net 45" wrapped in w:ins
# - Author: "Contract AI"
# - Timestamp: current time
client.download(doc_id, "contract_edited.docx")
Open in Word: you see "Net 30" crossed out, "Net 45" underlined, attributed to "Contract AI" with today's date.
Automation Approaches
Option 1: Word COM Automation (Windows)
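import os
import win32com.client

def edit_with_tracking(input_path, output_path):
    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Word resolves relative paths against its own working directory,
        # so pass absolute paths
        doc = word.Documents.Open(os.path.abspath(input_path))
        doc.TrackRevisions = True
        # Find and replace with tracking
        find = doc.Content.Find
        find.Execute(
            FindText="Net 30",
            ReplaceWith="Net 45",
            Replace=2  # wdReplaceAll
        )
        # Changes are attributed to the user name configured in Word
        # (Application.UserName)
        doc.SaveAs(os.path.abspath(output_path))
        doc.Close()
    finally:
        # Always quit, or a hidden WINWORD.EXE process is left running
        word.Quit()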
Pros:
- Full Word functionality
- Reliable track changes
- Access to all Word features
Cons:
- Windows only
- Requires Word installation
- Slow (COM overhead)
- Licensing issues at scale
Option 2: Power Automate + Encodian
For enterprise workflows, Encodian's Power Automate actions handle track changes:
Trigger: New file in SharePoint
Action: Encodian - Extract Tracked Changes
Output: JSON with all revisions, authors, dates
Use cases:
- Extract changes for review dashboards
- Trigger workflows when specific changes are detected
- Archive revision history separately
Pros:
- Enterprise-ready
- No code required
- SharePoint integration
Cons:
- Encodian licensing
- Microsoft ecosystem dependency
- Extract-only (not AI editing)
Option 3: LibreOffice Headless
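# Round-trip the file through LibreOffice. soffice has no command-line
# switch for accepting tracked changes, so this conversion alone does not accept them.
soffice --headless --convert-to docx --outdir converted input.docx
To accept changes programmatically, use LibreOffice's UNO API: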
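import uno
from com.sun.star.beans import PropertyValue

def accept_all_changes(input_path, output_path):
    # Complex setup required - see LibreOffice documentation.
    # get_context() is a placeholder for your UNO bootstrap; it must return
    # the component context of a running soffice instance.
    ctx = get_context()
    smgr = ctx.ServiceManager
    desktop = smgr.createInstanceWithContext("com.sun.star.frame.Desktop", ctx)
    hidden = PropertyValue(Name="Hidden", Value=True)
    doc = desktop.loadComponentFromURL(
        uno.systemPathToFileUrl(input_path),  # path must be absolute
        "_blank", 0, (hidden,)
    )
    # Accepting every revision goes through the dispatcher
    dispatcher = smgr.createInstanceWithContext("com.sun.star.frame.DispatchHelper", ctx)
    dispatcher.executeDispatch(
        doc.getCurrentController().getFrame(),
        ".uno:AcceptAllTrackedChanges", "", 0, ()
    )
    # Name the export filter explicitly so the output really is DOCX
    docx_filter = PropertyValue(Name="FilterName", Value="MS Word 2007 XML")
    doc.storeToURL(uno.systemPathToFileUrl(output_path), (docx_filter,))
    doc.close(True)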
Pros:
- Cross-platform
- Free / open source
- No Microsoft dependency
Cons:
- Complex UNO API
- Some Word compatibility issues
- Slow for batch processing
- No AI integration
Option 4: Direct OOXML Manipulation
For full control, manipulate the XML directly:
import os
import shutil
import tempfile
from datetime import datetime, timezone
from zipfile import ZIP_DEFLATED, ZipFile

from lxml import etree

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
NSMAP = {'w': W_NS}

def insert_tracked_text(docx_path, insert_after, new_text, author="Script"):
    """Insert new_text as a tracked insertion after the first run containing insert_after."""
    w = f'{{{W_NS}}}'
    temp_dir = tempfile.mkdtemp()
    try:
        # Extract DOCX
        with ZipFile(docx_path, 'r') as zf:
            zf.extractall(temp_dir)

        # Parse document.xml
        doc_path = os.path.join(temp_dir, 'word', 'document.xml')
        tree = etree.parse(doc_path)
        root = tree.getroot()

        # Pick a revision ID that does not collide with any existing w:id
        used_ids = [int(v) for v in root.xpath('//@w:id', namespaces=NSMAP)
                    if v.lstrip('-').isdigit()]
        next_id = max(used_ids, default=0) + 1

        # Find the text run containing the target
        for t_elem in root.xpath('//w:t', namespaces=NSMAP):
            if insert_after in (t_elem.text or ''):
                # Create the tracked insertion
                ins = etree.Element(f'{w}ins')
                ins.set(f'{w}id', str(next_id))
                ins.set(f'{w}author', author)
                ins.set(f'{w}date', datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ'))
                run = etree.SubElement(ins, f'{w}r')
                text = etree.SubElement(run, f'{w}t')
                # Keep leading and trailing spaces in the inserted text
                text.set('{http://www.w3.org/XML/1998/namespace}space', 'preserve')
                text.text = new_text
                # Insert after the whole run containing the target (not mid-run).
                # Not handled: a target run that already sits inside w:ins or w:del.
                t_elem.getparent().addnext(ins)
                break

        # Save and repack
        tree.write(doc_path, xml_declaration=True, encoding='UTF-8')
        output_path = docx_path.replace('.docx', '_edited.docx')
        with ZipFile(output_path, 'w', ZIP_DEFLATED) as zf:
            for root_dir, dirs, files in os.walk(temp_dir):
                for file in files:
                    file_path = os.path.join(root_dir, file)
                    arc_name = os.path.relpath(file_path, temp_dir)
                    zf.write(file_path, arc_name)
    finally:
        shutil.rmtree(temp_dir)
    return output_path
Pros:
- Complete control
- Cross-platform
- No external dependencies (beyond lxml)
- Can be integrated with any AI
Cons:
- Complex implementation
- Easy to corrupt documents
- Must handle all edge cases
- Significant OOXML knowledge required
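One of the edge cases behind "must handle all edge cases" is replacing text in the middle of a run: the run has to be split so the w:del and w:ins land exactly around the changed words. The sketch below shows that step on an in-memory tree with the standard library. The names tracked_replace, make_run, and next_revision_id are illustrative assumptions, and it only handles text that sits inside a single simple run that is a direct child of the paragraph.

```python
import copy
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)  # keep the conventional w: prefix on output

def q(name):
    return f"{{{W}}}{name}"

def next_revision_id(root):
    """One more than the largest numeric w:id already in the tree."""
    ids = [int(e.get(q("id"))) for e in root.iter() if (e.get(q("id")) or "").isdigit()]
    return max(ids, default=0) + 1

def make_run(template, text_tag, text):
    """New run with the template run's formatting and one text child."""
    run = ET.Element(q("r"))
    rpr = template.find(q("rPr"))
    if rpr is not None:
        run.append(copy.deepcopy(rpr))
    t = ET.SubElement(run, q(text_tag))
    t.set(XML_SPACE, "preserve")  # do not let Word trim edge spaces
    t.text = text
    return run

def tracked_replace(root, old, new, author, when=None):
    """Replace the first occurrence of old with new as w:del + w:ins."""
    when = when or datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    rev_id = next_revision_id(root)
    for para in root.iter(q("p")):
        for index, run in enumerate(list(para)):
            if run.tag != q("r"):
                continue
            content = [c for c in run if c.tag != q("rPr")]
            # Only simple runs: optional w:rPr plus exactly one w:t
            if len(content) != 1 or content[0].tag != q("t"):
                continue
            full_text = content[0].text or ""
            if old not in full_text:
                continue
            before, after = full_text.split(old, 1)
            pieces = [make_run(run, "t", before)] if before else []
            for offset, (wrapper, text_tag, text) in enumerate(
                (("del", "delText", old), ("ins", "t", new))
            ):
                mark = ET.Element(q(wrapper))
                mark.set(q("id"), str(rev_id + offset))
                mark.set(q("author"), author)
                mark.set(q("date"), when)
                mark.append(make_run(run, text_tag, text))
                pieces.append(mark)
            if after:
                pieces.append(make_run(run, "t", after))
            # Swap the original run for the split pieces, in place
            para.remove(run)
            for offset, piece in enumerate(pieces):
                para.insert(index + offset, piece)
            return True
    return False

SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p><w:r><w:rPr><w:b/></w:rPr>'
    "<w:t>Payment is due Net 30 from invoice.</w:t></w:r></w:p></w:body></w:document>"
)
root = ET.fromstring(SAMPLE)
changed = tracked_replace(root, "Net 30", "Net 45", author="Script", when="2026-02-01T16:00:00Z")
print(ET.tostring(root, encoding="unicode"))
```

Text that spans several runs, or that already sits inside another reviewer's w:ins, needs additional handling that this sketch deliberately leaves out.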
AI + Track Changes: The Integration Pattern
The ideal workflow:
1. Document arrives (email, upload, API)
2. Parse existing track changes (understand current state)
3. AI analyzes content (including pending revisions)
4. AI generates edits (as structured operations)
5. Apply edits with track changes (w:ins, w:del)
6. Output document with full revision history
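Step 4 is what keeps the model's output checkable before anything touches the file. As a rough sketch of what "structured operations" could look like, the EditOp shape and the validate function below are assumptions for illustration, not a format any particular tool uses. Each operation names an exact anchor string, and anything that does not match the document exactly once is rejected instead of guessed at.

```python
from dataclasses import dataclass

KINDS = {"replace", "insert_after", "delete"}

@dataclass(frozen=True)
class EditOp:
    kind: str           # one of KINDS
    anchor: str         # exact text that must already exist in the document
    new_text: str = ""  # text to add; left empty for "delete"
    reason: str = ""    # why the model proposed it, kept for the audit trail

def validate(ops, document_text):
    """Split ops into (applicable, rejected) without modifying anything."""
    applicable, rejected = [], []
    for op in ops:
        hits = document_text.count(op.anchor) if op.anchor else 0
        if op.kind not in KINDS:
            rejected.append((op, "unknown kind"))
        elif hits != 1:
            # Zero hits: the model misquoted. Several hits: the target is ambiguous.
            rejected.append((op, f"anchor matched {hits} times"))
        elif op.kind != "delete" and not op.new_text:
            rejected.append((op, "missing new_text"))
        else:
            applicable.append(op)
    return applicable, rejected

text = "Invoices are payable Net 30. Liability is unlimited."
ops = [
    EditOp("replace", "Net 30", "Net 45", reason="payment policy"),
    EditOp("replace", "Net 60", "Net 45"),          # anchor not in the text
    EditOp("delete", "Liability is unlimited."),
]
applicable, rejected = validate(ops, text)
for op, why in rejected:
    print(f"rejected {op.kind} {op.anchor!r}: {why}")
```

Only the operations in applicable would then be turned into w:ins and w:del elements in step 5.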
Implementation with DocMods
from docxagent import DocxClient
client = DocxClient()
# Step 1: Upload document
doc_id = client.upload("vendor_contract.docx")
# Step 2: Read including existing track changes
content = client.read(doc_id, include_revisions=True)
# Returns: {
# "text": "current text content",
# "revisions": [
# {"type": "insertion", "text": "...", "author": "...", "date": "..."},
# {"type": "deletion", "text": "...", "author": "...", "date": "..."}
# ]
# }
# Step 3-5: AI edits with track changes
client.edit(
    doc_id,
    """Review this vendor contract and:
1. Flag any unusual liability provisions
2. Ensure payment terms are Net 45 (change if not)
3. Check for missing force majeure clause
Apply changes directly with track changes.""",
    author="Contract Review AI"
)
# Step 6: Download with full history
client.download(doc_id, "vendor_contract_reviewed.docx")
The output document contains:
- Original text with vendor's changes (preserved)
- AI's changes as new tracked revisions
- Clear author attribution at each revision
- Full timestamps for audit trail
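Attribution and timestamps are what make the history usable for audit. As an illustration, revisions shaped like the dictionaries in the read result above can be rolled up per author. The function name audit_summary and the sample data are assumptions, and the date comparison relies on every timestamp being in the same UTC ISO 8601 format so that string order matches time order.

```python
def audit_summary(revisions):
    """Count revisions per author and record each author's first and last edit."""
    summary = {}
    for rev in revisions:
        entry = summary.setdefault(
            rev["author"],
            {"insertion": 0, "deletion": 0, "first": rev["date"], "last": rev["date"]},
        )
        entry[rev["type"]] = entry.get(rev["type"], 0) + 1
        # Same-format UTC timestamps sort correctly as plain strings
        entry["first"] = min(entry["first"], rev["date"])
        entry["last"] = max(entry["last"], rev["date"])
    return summary

# Hypothetical revision list for a vendor contract
revisions = [
    {"type": "deletion", "text": "Net 30", "author": "Vendor Counsel", "date": "2026-01-28T09:15:00Z"},
    {"type": "insertion", "text": "Net 60", "author": "Vendor Counsel", "date": "2026-01-28T09:15:00Z"},
    {"type": "deletion", "text": "Net 60", "author": "Contract Review AI", "date": "2026-02-01T14:30:00Z"},
    {"type": "insertion", "text": "Net 45", "author": "Contract Review AI", "date": "2026-02-01T14:31:00Z"},
]

summary = audit_summary(revisions)
for author, entry in summary.items():
    print(author, entry)
```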
Common Pitfalls
Pitfall 1: Assuming AI Tools Handle Track Changes
Most don't. Always test with a document containing existing revisions.
Test protocol:
- Create document with track changes
- Process through AI tool
- Check if original track changes survive
- Check if new edits have proper tracking
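The protocol above can be scripted so it runs against every tool you evaluate. The sketch below fingerprints each revision by type, author, date, and text (deliberately not by w:id, since a tool may legitimately renumber) and compares the sets before and after processing. The function names and the in-memory stand-in documents are assumptions for illustration; the stand-ins contain only word/document.xml and are not complete DOCX files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
REVISION_TAGS = {f"{{{W}}}{name}" for name in ("ins", "del", "rPrChange")}
TEXT_TAGS = {f"{{{W}}}t", f"{{{W}}}delText"}

def revision_fingerprints(docx_file):
    """Return a set of (type, author, date, text) for every revision."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    prints = set()
    for elem in root.iter():
        if elem.tag in REVISION_TAGS:
            text = "".join(t.text or "" for t in elem.iter() if t.tag in TEXT_TAGS)
            prints.add((
                elem.tag.rsplit("}", 1)[-1],
                elem.get(f"{{{W}}}author"),
                elem.get(f"{{{W}}}date"),
                text,
            ))
    return prints

def check_tool(before_file, after_file):
    """Report which revisions the tool lost and which it added."""
    before = revision_fingerprints(before_file)
    after = revision_fingerprints(after_file)
    return {"lost": before - after, "added": after - before}

def fake_docx(body_xml):
    """Build an in-memory ZIP holding only word/document.xml (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "word/document.xml",
            f'<w:document xmlns:w="{W}"><w:body>{body_xml}</w:body></w:document>',
        )
    buf.seek(0)
    return buf

before = fake_docx(
    '<w:p><w:ins w:id="1" w:author="Jane Smith" w:date="2026-02-01T14:30:00Z">'
    "<w:r><w:t>$150,000</w:t></w:r></w:ins></w:p>"
)
# What a tool that flattens revisions would hand back
after = fake_docx("<w:p><w:r><w:t>$150,000 per annum</w:t></w:r></w:p>")

report = check_tool(before, after)
print("lost:", report["lost"])
print("added:", report["added"])
```

A non-empty "lost" set means prior reviewers' changes did not survive; an empty "added" set after a visible text change means the tool's own edits were not tracked.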
Pitfall 2: python-docx for Track Changes
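from docx import Document
# python-docx has no API for revisions
doc = Document("contract_with_revisions.docx")
doc.paragraphs[0].add_run(" new text")
doc.save("output.docx")
# "new text" is NOT tracked
# Existing w:ins/w:del markup that you don't touch generally survives the save,
# but assigning to paragraph.text rewrites the paragraph and discards it
python-docx doesn't understand revisions. Reading is unreliable too: paragraph.text and paragraph.runs skip runs nested inside w:ins. Don't use it for documents with track changes.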
Pitfall 3: PDF Conversion
DOCX → PDF → OCR → Edit → DOCX
Track changes don't survive PDF conversion. If your workflow involves PDF at any stage, revision history is lost.
Pitfall 4: Copy-Paste from AI Chat
1. Copy text from Word
2. Paste to ChatGPT
3. Get suggestion
4. Copy suggestion
5. Paste to Word
6. Manually add track changes formatting
This "works" but:
- Manual track changes aren't proper OOXML revisions
- No reliable author/timestamp metadata
- Doesn't scale beyond a few edits
- Error-prone
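Documents that went through this workflow can often be spotted, because the "revisions" are ordinary strikethrough and underline formatting rather than w:del and w:ins. The sketch below is a heuristic detector built on that assumption. The function name faux_revisions is illustrative, and since underlining is also used for legitimate emphasis and headings, the hits are candidates for human review, not proof.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
MARKUP = ("strike", "dstrike", "u")  # strikethrough, double strikethrough, underline

def q(name):
    return f"{{{W}}}{name}"

def faux_revisions(root):
    """List (formatting, text) for struck or underlined runs outside real revisions."""
    found = []

    def walk(node, tracked):
        for child in node:
            if child.tag in (q("ins"), q("del")):
                walk(child, True)
            elif child.tag == q("r"):
                rpr = child.find(q("rPr"))
                marks = [m for m in MARKUP if rpr is not None and rpr.find(q(m)) is not None]
                if marks and not tracked:
                    text = "".join(t.text or "" for t in child.iter(q("t")))
                    found.append((marks, text))
            else:
                walk(child, tracked)

    walk(root, False)
    return found

SAMPLE = f'''<w:p xmlns:w="{W}">
  <w:r><w:rPr><w:strike/></w:rPr><w:t>Net 30</w:t></w:r>
  <w:r><w:rPr><w:u w:val="single"/></w:rPr><w:t>Net 45</w:t></w:r>
  <w:ins w:id="1" w:author="Jane Smith" w:date="2026-02-01T14:30:00Z">
    <w:r><w:rPr><w:u w:val="single"/></w:rPr><w:t>tracked and underlined</w:t></w:r>
  </w:ins>
</w:p>'''

hits = faux_revisions(ET.fromstring(SAMPLE))
for marks, text in hits:
    print(marks, repr(text))
```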
The Bottom Line
Track changes are OOXML elements (w:ins, w:del) with metadata (author, date, ID). Most AI tools ignore them completely.
If your workflow requires revision tracking:
- Don't use general AI chat (ChatGPT, Claude) for document editing—use them for understanding and brainstorming
- Don't use python-docx for documents with existing track changes
- Test any tool with a revision-heavy document before production use
- Consider document-level APIs that operate directly on DOCX structure
The technology exists to combine AI intelligence with proper track changes. Most tools just haven't implemented it because it's hard.
Choose tools that take track changes seriously—your audit trails depend on it.


