@sjardim
Last active December 6, 2024 14:33
A Python script to clean up the paragraph and character style names in an ICML file converted from Markdown (via Pandoc).
# clean_icml.py
import sys
import re
import os
from pathlib import Path


def load_style_mappings():
    """Load style mappings from the configuration file"""
    try:
        # Get the directory of the current script
        script_dir = Path(__file__).parent
        # Import the style mappings from the same directory
        sys.path.append(str(script_dir))
        from style_mappings import STYLE_MAP
        return STYLE_MAP
    except ImportError as e:
        sys.stderr.write(f"Error: Could not load style_mappings.py: {str(e)}\n")
        sys.stderr.write("Make sure style_mappings.py is in the same directory as this script.\n")
        sys.exit(1)


def clean_style_references(content, style_map):
    """Clean up and normalize both paragraph and character style references throughout the document"""
    def clean_xml_part(xml_part, is_definition=False):
        # Clean style references in attributes
        for old_style, new_style in style_map.items():
            # Pattern for Self attribute
            pattern_self = f'Self="ParagraphStyle/{old_style}"'
            replacement_self = f'Self="ParagraphStyle/{new_style}"'
            xml_part = xml_part.replace(pattern_self, replacement_self)
            # Pattern for Name attribute
            pattern_name = f'Name="{old_style}"'
            replacement_name = f'Name="{new_style}"'
            xml_part = xml_part.replace(pattern_name, replacement_name)
            # Pattern for AppliedParagraphStyle attribute
            pattern_applied = f'AppliedParagraphStyle="ParagraphStyle/{old_style}"'
            replacement_applied = f'AppliedParagraphStyle="ParagraphStyle/{new_style}"'
            xml_part = xml_part.replace(pattern_applied, replacement_applied)
            # Pattern for AppliedCharacterStyle attribute
            if 'CharacterStyle/' in old_style:
                pattern_char = f'AppliedCharacterStyle="{old_style}"'
                replacement_char = f'AppliedCharacterStyle="{new_style}"'
                xml_part = xml_part.replace(pattern_char, replacement_char)
        return xml_part

    # Process the style definitions sections
    # Paragraph styles
    style_def_pattern = r'(<RootParagraphStyleGroup.*?</RootParagraphStyleGroup>)'
    style_defs = re.search(style_def_pattern, content, re.DOTALL)
    if style_defs:
        style_section = style_defs.group(1)
        cleaned_styles = clean_xml_part(style_section, is_definition=True)
        content = content.replace(style_section, cleaned_styles)
    # Character styles
    char_style_pattern = r'(<RootCharacterStyleGroup.*?</RootCharacterStyleGroup>)'
    char_style_defs = re.search(char_style_pattern, content, re.DOTALL)
    if char_style_defs:
        char_section = char_style_defs.group(1)
        cleaned_char_styles = clean_xml_part(char_section, is_definition=True)
        content = content.replace(char_section, cleaned_char_styles)
    # Process the content section (Story)
    story_pattern = r'(<Story.*?</Story>)'
    story = re.search(story_pattern, content, re.DOTALL)
    if story:
        story_section = story.group(1)
        cleaned_story = clean_xml_part(story_section)
        content = content.replace(story_section, cleaned_story)
    return content


def remove_duplicate_styles(content):
    """Remove duplicate style definitions"""
    def unique_style(match):
        style = match.group(1)
        if "BulList" in style and style not in seen_styles:
            seen_styles.add(style)
            return match.group(0)
        elif "BulList" not in style:
            return match.group(0)
        return ""

    seen_styles = set()
    # Handle paragraph styles
    pattern = r'(<ParagraphStyle\s+[^>]*?>.*?</ParagraphStyle>)'
    content = re.sub(pattern, unique_style, content, flags=re.DOTALL)
    # Reset seen styles for character styles
    seen_styles = set()
    # Handle character styles
    char_pattern = r'(<CharacterStyle\s+[^>]*?>.*?</CharacterStyle>)'
    content = re.sub(char_pattern, unique_style, content, flags=re.DOTALL)
    return content


def main():
    try:
        # Load style mappings
        style_map = load_style_mappings()
        # Read input content
        input_content = sys.stdin.read()
        if not input_content:
            sys.stderr.write("Error: No input content received\n")
            sys.exit(1)
        # Clean up style references
        cleaned_content = clean_style_references(input_content, style_map)
        # Remove duplicate style definitions
        final_content = remove_duplicate_styles(cleaned_content)
        # Write output
        sys.stdout.write(final_content)
    except Exception as e:
        sys.stderr.write(f"Error: {str(e)}\n")
        sys.exit(1)


if __name__ == '__main__':
    main()

Introduction

The world has never been more dependent on the success of projects to enable and drive change. Projects are needed to reap the benefits of Generative AI, address the climate crisis, provide essential infrastructure and mitigate inequality, and they are essential to deliver long-term organizational value, successful transformations and business results. For instance, McKinsey’s research shows that we are in the midst of a 5-year wave of global capital spending on physical assets through 2027, with a surge of roughly US$130 trillion pouring into projects to decarbonize and renew critical infrastructure.

Yet, despite this urgent, existential need to produce successful outcomes, a holistic way to evaluate project success has remained elusive. A literature review of research into project success, conducted by PMI, found varying perspectives but little agreement on the definition of project success or how it should be measured. The conclusion: the definition of project success needs to be rethought for our modern era, so that it reflects not only how a project was executed but also expands accountability to encompass value delivery.

Reports that suggest high rates of project failure don’t tell the full story, since defining project success is complicated and usually viewed on a continuum rather than as a simple binary – pass or fail. Perceptions may differ depending on how updates and outcomes are received by the various stakeholders and beneficiaries of the project and may shift more positively or negatively with time. Critically, this lack of shared understanding reduces the chances of achieving success. Going forward, it will be imperative for project professionals to continually reassess and adapt to changing circumstances, all while managing perceptions.

How then to really understand what project success means to organizations today, and find a way to measure success that is accepted and easily understood by all major stakeholder groups that are typically involved in a project? PMI initiated the largest research project of this type to date during the current period of transformation in the workplace and the profession, an apt time to reexamine the changing landscape in which projects are conducted, along with the value they create. With this increasing complexity, practitioners will need to elevate their role in producing successful outcomes.

Against this backdrop, our focus on reframing project success aims to provide insights to activate the critical conversations between executives and project professionals, as well as their customers in organizations, business and government that will empower them to be more strategic and embrace responsibility for getting things done. Becoming more customer-centric and delivering results that drive value will demonstrate the power projects have to make a positive impact on our world.

This report will:

  • Define project success in a way that encompasses a shared perspective among a broad range of project stakeholders.

  • Introduce a clear, universal method to measure project success.

  • Explain the factors that influence project success in a way that will help practitioners and organizations lead their projects toward delivering greater VALUE and higher rates of success over time.

  • Provide a baseline of the current rate of project success globally, as well as by industry and project type.

  • Discuss the benefits to project success of aiming toward a higher purpose.

  • Deliver insights to activate project success for practitioners, executives and the project management profession.

In formulating a new definition of project success, we aspired to collect and analyze as much data as possible across many different reference points (see Figure 1). For a full discussion of how we arrived at this definition, refer to the section on the research process in the appendix at the end of this report.

Research and Analytic Approach{.figure width=10cm}

A Definition That Considers Execution Metrics and Outcomes

Initial qualitative interviews with project professionals, sponsors, PMO leaders, executives and intended beneficiaries showed that how project success is measured can generally be divided into two broad categories: execution-focused metrics – met requirements, on time, within scope and budget, for example – and outcome-focused metrics – customer satisfaction, commercial success, impact on productivity, etc. In these conversations, participants organically raised the idea of value beyond just meeting requirements. This exploratory research confirmed the opportunity to introduce and validate a definition that takes into account both project management success – how a project was executed – and project success – what it ultimately delivered.

::: {custom-style='Quote'}
We talk a lot about execution, and we have to talk a little bit more about outcomes because that’s why we have a lot of different opinions, a lot of confusion. People think that execution is success and in fact both things are success – execution and outcomes.

[C. Rodrigues]{custom-style='Quote_author'}, [ PMO Leader, Brazil]{custom-style='Quote_author_bio'}
:::

This Python script (alongside its style mappings file) attempts to clean up the names of the Pandoc-generated paragraph and character styles.

Based on this thread: jgm/pandoc#8333

After installing Pandoc and Python on your system, you should run the commands:

  1. pandoc example-markdown.md -s -o before.icml
  2. python clean_icml.py < before.icml > after.icml
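
To check what the cleanup changed, it can help to list the style names applied in each file. The snippet below is a minimal sketch and not part of the gist: `applied_styles` is a hypothetical helper, and the sample fragment is hand-written to resemble Pandoc's ICML output rather than copied from a real file.

```python
import re


def applied_styles(icml_text):
    """Return the sorted, de-duplicated applied paragraph/character style names."""
    pattern = r'Applied(?:Paragraph|Character)Style="([^"]*)"'
    return sorted(set(re.findall(pattern, icml_text)))


# Hand-written fragment (an assumption about what the converted file looks like).
sample = (
    '<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BulList &gt; first &gt; Paragraph">'
    '<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">'
    '<Content>Define project success</Content>'
    '</CharacterStyleRange></ParagraphStyleRange>'
)

for name in applied_styles(sample):
    print(name)
```

Running the same helper over the text of before.icml and after.icml and comparing the two lists shows which names were rewritten and which ones still need an entry in the style mappings.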

Examples of changes (you can tweak the names you want and add more):

# style_mappings.py

STYLE_MAP = {
    # Paragraph styles
    'BulList &gt; first &gt; Paragraph': 'BulList',
    'BulList &gt; Paragraph': 'BulList',
    'Quote &gt; Paragraph': 'Quote',
    # Character styles
    'CharacterStyle/Quote_author &gt; Character': 'CharacterStyle/Quote_author'
}
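
The keys use `&gt;` rather than `>` because the script does plain string replacement on the raw ICML text, where `>` inside attribute values appears in its XML-escaped form. As a rough sketch under that assumption, the snippet below builds a key from a readable style name and applies the mapping to a small hand-written fragment. The `rename_applied_paragraph_style` helper and the fragment are illustrative only and cover just one of the attributes the script rewrites.

```python
from xml.sax.saxutils import escape

STYLE_MAP = {
    'BulList &gt; first &gt; Paragraph': 'BulList',
    'BulList &gt; Paragraph': 'BulList',
}


def rename_applied_paragraph_style(xml_text, style_map):
    """Replace AppliedParagraphStyle references using plain string matching."""
    for old, new in style_map.items():
        xml_text = xml_text.replace(
            f'AppliedParagraphStyle="ParagraphStyle/{old}"',
            f'AppliedParagraphStyle="ParagraphStyle/{new}"',
        )
    return xml_text


# escape() converts the readable name into the escaped form used as a key.
readable = 'BulList > first > Paragraph'
key = escape(readable)

# Hypothetical fragment, written by hand for this example.
fragment = f'<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/{key}">'
result = rename_applied_paragraph_style(fragment, STYLE_MAP)
print(result)
```

When adding a new entry, writing the style name as it appears in InDesign and passing it through `escape` is one way to avoid typing the entities by hand.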