Stripping Inline PHP From Drupal Text Fields During Migration

A simple regex snippet to target PHP code embedded in a long text field during a migration of Drupal content.
Paper figure in front of some PHP code - Image via Unsplash: Kobu agency

I'm currently migrating a Drupal 7 site, and the (somewhat naughty) PHP code embedded in some of its text fields isn't being removed by the Drupal 9 text format. Instead it's being rendered as plain text, so the code itself ends up on the page, which is potentially a massive security hole.

This had me stumped, as I was trying to avoid regex, which isn't particularly reliable for dealing with HTML, and XPath turned out to be a dead end too. Essentially, I needed to remove the <?php and ?> tags plus any code between them.

Anyway, despite the pedantic-style response on Stack Overflow, pointing out facts etc. (and the bleedin' obvious), here's the solution, via the marvellous Migrate Plus module, to stick in your YAML file:


plugin: str_replace
regex: true
search: '/\<\?php((.|\n)*)\?\>/mi'
replace: ' '
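
One caveat worth flagging: the (.|\n)* group in the pattern above is greedy, so if a field happens to contain more than one PHP block, the match runs from the first <?php to the last ?> and takes any HTML in between with it. A possible variant, offered as a sketch rather than something tested against this particular migration, swaps in a lazy quantifier and the s (dot-matches-newline) flag so that each block is removed on its own:

```yaml
# Hedged variant of the snippet above: '.*?' is lazy and the 's' flag lets
# '.' match newlines, so each <?php ... ?> block is matched separately.
plugin: str_replace
regex: true
search: '/<\?php.*?\?>/si'
replace: ' '
```

Like the original, this only matches blocks that have a closing ?>; an unclosed <?php at the end of a field would be left in place.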

You might consider stripping out multiple spaces too, since every removed block leaves a single space behind.
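
As a rough sketch of how that could look, a second str_replace step can be chained after the first in the same process pipeline. The destination body/value and the source property body_value are hypothetical names used for illustration, so swap in whatever your migration actually uses:

```yaml
process:
  # 'body/value' and 'body_value' are assumed names, not from the original migration.
  body/value:
    # Step 1: remove the PHP tags and the code between them.
    - plugin: str_replace
      source: body_value
      regex: true
      search: '/\<\?php((.|\n)*)\?\>/mi'
      replace: ' '
    # Step 2: collapse runs of two or more spaces into one.
    - plugin: str_replace
      regex: true
      search: '/ {2,}/'
      replace: ' '
```

The second pattern deliberately targets literal spaces rather than \s, so line breaks in the markup are left alone.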