You are an AI assistant tasked with analyzing and modifying a Non-Disclosure Agreement (NDA) based on specific instructions and a playbook. Your goal is to use the information, code, and steps in the canvas to produce a redlined version of the NDA and a summary of your analysis and changes.
Attached docs are:
The NDA to redline (NDA.docx)
The playbook on which to base the redlines (Playbook.pdf)
Instructions on how to apply the playbook's redlines to the NDA (Instructions.pdf)
Output expected:
A redlined version of the given NDA. (NDA-Redlined.docx)
Summary of document content, key highlights including parties and key clauses.
Redlined Edits in Word (.docx) Using Low-Level OpenXML Manipulation
Objective
Create a tool that can programmatically insert suggested edits (redlined insertions, deletions, and replacements) into an existing .docx Word document. These edits should appear in Microsoft Word as tracked changes, identical to how human reviewers suggest edits.
Why Low-Level OpenXML?
Most established libraries for working with .docx files do not support tracked changes (like insertions and deletions that show up as suggestions in Word). Therefore, we manipulate WordprocessingML directly (OpenXML format) to produce tracked edits:
<w:ins> — Inserted text (appears as underlined suggestion)
<w:del> — Deleted text (appears as strikethrough suggestion)
Metadata (author, timestamp, revision ID)
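This approach writes the same revision markup that Word itself produces, so the edits show up as ordinary tracked changes, and it gives full control over where each edit lands.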
Approach
We'll unzip the .docx (a ZIP archive), modify its internal XML files (word/document.xml, etc.), and repackage it.
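The unzip, modify, repackage cycle can be sketched with the standard library alone. This is a minimal illustration, not the tool itself: `rewrite_document_xml` and its `transform` callback are hypothetical names, and the sketch works on in-memory bytes rather than files on disk.

```python
import io
import zipfile


def rewrite_document_xml(src_bytes, transform):
    """Return new .docx bytes with word/document.xml passed through `transform`.

    `transform` is a hypothetical callback: XML bytes in, XML bytes out.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as zin, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                data = transform(data)
            # Every other part is copied unchanged so relationships
            # and content types still match the package.
            zout.writestr(item, data)
    return out.getvalue()
```

Copying each part through with its original `ZipInfo` keeps names and ordering intact, which matters because Word reads `[Content_Types].xml` and the `.rels` files to find the main document part.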
Step-by-Step
1. Required Libraries (Python)
python-docx – manipulate .docx document structure
lxml – low-level XML manipulation for reading/writing tracked changes
pip install python-docx lxml
2. .docx Structure
A .docx file is a ZIP archive with the following relevant files:
word/document.xml – main document content
word/comments.xml (optional) – for comments (not used for redlines)
_rels/.rels and [Content_Types].xml – for relationships and part types
We'll inject <w:ins> and <w:del> into document.xml.
A replacement is expressed as a <w:del> wrapping “old text” followed immediately by a <w:ins> wrapping “new text”.
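As a minimal sketch of that markup, the following builds a deletion followed by an insertion and prints the resulting XML. It uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml so it runs without extra installs. The helper names (`w`, `tracked`), the author, and the date are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # serialize with the usual w: prefix


def w(tag):
    """Expand a local name into its namespaced form."""
    return f"{{{W}}}{tag}"


def tracked(kind, text, rev_id, author="Reviewer", date="2024-01-01T00:00:00Z"):
    """Build <w:ins> or <w:del> wrapping one run. Author and date are placeholders."""
    el = ET.Element(w(kind), {w("id"): str(rev_id), w("author"): author, w("date"): date})
    run = ET.SubElement(el, w("r"))
    # Deleted text lives in <w:delText>; inserted text uses the ordinary <w:t>.
    t = ET.SubElement(run, w("delText" if kind == "del" else "t"))
    t.text = text
    return el


para = ET.Element(w("p"))
para.append(tracked("del", "old text", 1))
para.append(tracked("ins", "new text", 2))
print(ET.tostring(para, encoding="unicode"))
```

The only structural difference between the two elements is the text tag inside the run, which is why a single helper can produce both.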
4. Workflow
Load .docx using python-docx
Extract and parse underlying XML with lxml
Locate text nodes to modify
Replace with <w:ins> / <w:del> inline using exact run positions
Save document back to .docx
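The "locate" step is the part most likely to go wrong, because Word often splits one visible phrase across several runs. A minimal sketch of mapping a search string to exact run positions is shown here. `locate` is a hypothetical helper that works on plain run strings, so it stays independent of any .docx library.

```python
def locate(run_texts, search):
    """Map the first match of `search` to per-run slices.

    Returns a list of (run_index, start, end) tuples, or [] if not found.
    """
    joined = "".join(run_texts)
    pos = joined.find(search)
    if pos < 0:
        return []
    end = pos + len(search)
    spans, offset = [], 0
    for i, text in enumerate(run_texts):
        # Intersect the match with this run's character range.
        lo, hi = max(pos, offset), min(end, offset + len(text))
        if lo < hi:
            spans.append((i, lo - offset, hi - offset))
        offset += len(text)
    return spans


print(locate(["The Rec", "ipient shall", " not"], "Recipient"))
# [(0, 4, 7), (1, 0, 6)]
```

Each tuple says which slice of which run belongs to the match, which is enough to split those runs and wrap only the matched pieces in <w:del>.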
5. Considerations
Preserve namespace declarations (xmlns:w, etc.) when editing XML.
Assign a consistent rsid, author, and timestamp to edits, and give each <w:ins> / <w:del> its own w:id.
Always insert redlined edits in-place using parent.insert(index, element) rather than append, to avoid incorrect placement (e.g., at paragraph end).
Simulate replacements using a <w:del> followed immediately by a <w:ins>.
Track edits across paragraphs and runs carefully — Word may group them visually.
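The placement point can be illustrated with a small comparison of append and insert on the same paragraph. This sketch uses the standard library's `xml.etree.ElementTree` in place of lxml; `run`, `make_ins`, `paragraph`, and `reading_order` are hypothetical helpers written for this example.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    return f"{{{W}}}{tag}"


def run(text):
    r = ET.Element(w("r"))
    ET.SubElement(r, w("t")).text = text
    return r


def make_ins(text):
    ins = ET.Element(w("ins"), {w("id"): "1", w("author"): "Reviewer"})
    ins.append(run(text))
    return ins


def paragraph():
    p = ET.Element(w("p"))
    p.extend([run("The term is "), run("two"), run(" years.")])
    return p


def reading_order(para):
    """Concatenate all <w:t> text in document order."""
    return "".join(t.text or "" for t in para.iter(w("t")))


# append() places the insertion after the last run of the paragraph.
appended = paragraph()
appended.append(make_ins("three"))

# insert() places it directly after the run it relates to.
inserted = paragraph()
target = list(inserted)[1]  # the run holding "two"
inserted.insert(list(inserted).index(target) + 1, make_ins("three"))

print(reading_order(appended))  # The term is two years.three
print(reading_order(inserted))  # The term is twothree years.
```

With lxml the index lookup would be `parent.index(element)`, as in the consideration above. ElementTree has no parent pointer, so the sketch searches the child list instead.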
Example Functions in Python for Handling Redlined Edits
1. Extract Redline Changes (insertions/deletions)
from lxml import etree

def extract_redlines(doc):
    """Extract all tracked insertions and deletions from a Word document.

    Returns two lists of dicts (insertions, deletions), each with
    'text', 'author', and 'date' keys.
    """
    ns = {'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'}
    insertions, deletions = [], []
    for para in doc.paragraphs:
        tree = etree.fromstring(para._element.xml.encode('utf-8'))
        for ins in tree.xpath('.//w:ins', namespaces=ns):
            ins_text = ''.join(ins.xpath('.//w:t/text()', namespaces=ns))
            insertions.append({
                'text': ins_text,
                'author': ins.get(f'{{{ns["w"]}}}author'),
                'date': ins.get(f'{{{ns["w"]}}}date')
            })
        for delete in tree.xpath('.//w:del', namespaces=ns):
            del_text = ''.join(delete.xpath('.//w:delText/text()', namespaces=ns))
            deletions.append({
                'text': del_text,
                'author': delete.get(f'{{{ns["w"]}}}author'),
                'date': delete.get(f'{{{ns["w"]}}}date')
            })
    return insertions, deletions
2. Apply a Redlined Replacement (Generic)
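from copy import deepcopy
from datetime import datetime, timezone
from itertools import count

from docx.oxml import OxmlElement
from docx.oxml.ns import qn

def apply_redline_replacement(doc, replacements, author="Reviewer"):
    """Apply tracked replacements to the first match in each body paragraph.

    replacements: list of dicts with keys 'search_text' and 'replacement_text'.
    Only matches that sit entirely inside one run are handled; text split
    across several runs is left untouched.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # WordprocessingML expects a w:id on each tracked change. Counting from 1
    # assumes the document has no existing revisions whose ids could collide.
    ids = count(1)

    def make_run(text, rpr, text_tag="w:t"):
        r = OxmlElement("w:r")
        if rpr is not None:
            r.append(deepcopy(rpr))  # keep the original run formatting
        t = OxmlElement(text_tag)
        t.set(qn("xml:space"), "preserve")  # keep leading/trailing spaces
        t.text = text
        r.append(t)
        return r

    def make_change(tag, text_tag, text, rpr):
        change = OxmlElement(tag)
        change.set(qn("w:id"), str(next(ids)))
        change.set(qn("w:author"), author)
        change.set(qn("w:date"), stamp)
        change.append(make_run(text, rpr, text_tag))
        return change

    for rep in replacements:
        search, new = rep['search_text'], rep['replacement_text']
        for para in doc.paragraphs:
            for run in para.runs:
                if search not in run.text:
                    continue
                before, _, after = run.text.partition(search)
                rpr = run._element.find(qn("w:rPr"))
                parent = run._element.getparent()
                idx = parent.index(run._element)
                run.text = before
                # Rebuild in reading order: before | deleted | inserted | after
                pieces = [make_change("w:del", "w:delText", search, rpr),
                          make_change("w:ins", "w:t", new, rpr)]
                if after:
                    pieces.append(make_run(after, rpr))
                for offset, piece in enumerate(pieces, start=1):
                    parent.insert(idx + offset, piece)
                break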
3. Compare Document Text With and Without Changes
from lxml import etree

def get_doc_text(doc):
    """Returns plain text content of a Word doc (ignores tracked changes)."""
    return "\n".join([p.text for p in doc.paragraphs if p.text])

def get_doc_text_with_redlines(doc):
    """Returns text with redlined (inserted/deleted) content included explicitly."""
    ns = {'w': 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'}
    w_t = f'{{{ns["w"]}}}t'
    w_del_text = f'{{{ns["w"]}}}delText'
    w_ins = f'{{{ns["w"]}}}ins'
    lines = []
    for para in doc.paragraphs:
        xml = para._element.xml.encode('utf-8')
        tree = etree.fromstring(xml)
        text_parts = []
        for node in tree.iter():
            if node.tag == w_t:
                # Text inside <w:ins> is emitted once by the w:ins branch
                # below, so skip it here to avoid duplicating it.
                if next(node.iterancestors(w_ins), None) is None:
                    text_parts.append(node.text or '')
            elif node.tag == w_del_text:
                text_parts.append(f"[DEL:{node.text or ''}]")
            elif node.tag == w_ins:
                ins_text = ''.join(node.xpath('.//w:t/text()', namespaces=ns))
                text_parts.append(f"[INS:{ins_text}]")
        lines.append("".join(text_parts))
    return "\n".join(lines)
Example Usage: Apply Redlined Edits
from docx import Document

# Load the document
doc = Document("sample.docx")

# Define your tracked replacements
replacements = [
    {'search_text': "old phrase", 'replacement_text': "new phrase"},
    {'search_text': "Company Name", 'replacement_text': "Acme Inc."},
    {'search_text': "_____________", 'replacement_text': "Safwan"},
]

# Apply the redlined replacements
apply_redline_replacement(doc, replacements, author="Editor")

# Save to new file
doc.save("sample_redlined.docx")
Outcome
You will generate a .docx file with visible tracked changes (suggestions) that Word treats as reviewer edits, and they will appear in the correct inline context.
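One way to sanity-check this outcome without opening Word is to reopen the saved package and count the revision elements. The sketch below is an assumption about how such a check might look, not part of the tool described above. `count_tracked_changes` is a hypothetical helper that uses only the standard library.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_tracked_changes(docx_file):
    """Count <w:ins> and <w:del> elements in word/document.xml.

    `docx_file` may be a path or a file-like object.
    """
    with zipfile.ZipFile(docx_file) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    return {
        "insertions": sum(1 for _ in root.iter(f"{{{W}}}ins")),
        "deletions": sum(1 for _ in root.iter(f"{{{W}}}del")),
    }
```

Calling `count_tracked_changes("sample_redlined.docx")` after the example usage should report one insertion and one deletion per replacement that matched. It only covers the main document part, so changes in headers or footnotes would not be counted.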