Problem: Determine which Knowledge Articles in NA33 contain embedded images (<img> tags). Some of these reference an opaque Salesforce URL because the images were pasted directly into the Salesforce Knowledge Article Rich Text Editor; others do not. Articles that contain no IMGs can be migrated from NA33 to NA35 with an ETL tool. Those with IMGs need to be divided further into those that have opaque Salesforce image URLs and those that have other URLs (either publicly available ones or ones that were not adequately captured when the original curation took place during TSA12).
Solution: Extract each article together with its ID so that the article body can be searched for IMG tags and the article's ID is known whenever a match is found. The procedure listed here does this by ensuring that each article sits entirely on a single line.
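As a rough illustration of why the one-article-per-line layout helps, the sketch below sorts article IDs into the three groups described in the Problem statement. It rests on several assumptions that are not part of the procedure itself: the sample file, its "ID, tab, article body" layout, and the article IDs are all made up, and "servlet/rtaImage" is only an assumed marker for the opaque Salesforce image URLs, so check it against real article content before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: one article per line, formatted as "ID<TAB>body".
# The IDs and URLs below are invented for illustration only.
articles=$(mktemp)
printf '%s\t%s\n' \
  'kA0000000000001' '<p>No images here.</p>' \
  'kA0000000000002' '<p><img src="https://na33.salesforce.com/servlet/rtaImage?eid=ka0&amp;feoid=00N&amp;refid=0EM"></p>' \
  'kA0000000000003' '<p><IMG src="https://example.com/logo.png"></p>' \
  > "$articles"

# Assumed marker for images pasted into the Rich Text Editor.
sf_marker='servlet/rtaImage'

# Articles with no <img tag at all: candidates for direct ETL migration.
no_img=$(grep -vi '<img' "$articles" | cut -f1)

# Articles with an <img tag whose line also contains the assumed marker.
sf_img=$(grep -i '<img' "$articles" | grep -i "$sf_marker" | cut -f1)

# Articles with an <img tag but no assumed Salesforce marker on the line.
other_img=$(grep -i '<img' "$articles" | grep -vi "$sf_marker" | cut -f1)

echo "No images:         $no_img"
echo "Salesforce images: $sf_img"
echo "Other images:      $other_img"

rm -f "$articles"
```

Because each article occupies exactly one line, every grep match corresponds to exactly one article, and cut -f1 recovers its ID. Note that this sketch matches the marker anywhere on the line, not only inside an img tag, and an article containing both kinds of image lands in the Salesforce group.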
Prerequisites and Tools
This process requires the following:
- Bash shell (such as Git Bash or Cygwin)
- perl installed in the Bas