A simple Regex snippet to target PHP code embedded into a long text field during migration of Drupal content.
I'm currently migrating a Drupal 7 site, but the (somewhat naughty) PHP code embedded in some of the text fields isn't being removed by the Drupal 9 text format: it's being rendered as plain text, which is potentially a massive security hole.
This had me stumped, as I was trying to avoid Regex, which isn't particularly reliable for dealing with HTML; XPath was also a dead end. Essentially, I need to remove the <?php and ?> tags plus any code between them.
Anyway, despite the pedantic-style response on Stack Overflow, pointing out facts etc. (and the bleedin' obvious), here's the solution, via the marvellous Migrate Plus module, to stick in your YAML file:
plugin: str_replace
regex: true
# Non-greedy (.*?) so separate PHP blocks are matched one at a time; the "s" flag lets "." span newlines.
search: '/<\?php.*?\?>/si'
replace: ' '
You might consider stripping out multiple spaces too, since each removed block is replaced with a single space.
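As a rough sketch of how the two steps could be chained in one process pipeline: the destination field body/value and the source property body/0/value below are assumptions for illustration, so substitute whatever your own migration uses. The first step uses a non-greedy variant of the pattern so that two separate PHP blocks in the same field are not swallowed as one match along with the content between them.

```yaml
process:
  # Hypothetical destination and source names; adjust to your migration.
  body/value:
    -
      plugin: str_replace
      source: body/0/value
      regex: true
      # Remove each <?php ... ?> block, including multi-line ones.
      search: '/<\?php.*?\?>/si'
      replace: ' '
    -
      # Collapse runs of two or more spaces left behind by the step above.
      plugin: str_replace
      regex: true
      search: '/ {2,}/'
      replace: ' '
```

Note that this only catches blocks that have a closing ?> tag; a field ending in an unclosed <?php block would need a separate pattern.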