[redcap] strip html and non-breaking space from field label #10387
adamdaudrich wants to merge 2 commits into aces:main
Conversation
// strip HTML-related content
$field_label = strip_tags($field_label);
// replace the non-breaking space \xao
// and \xc2, the first byte of its UTF-8 encoding
I don't have any problem with the code stripping this out if REDCap is adding them for some reason, but I don't understand the comment.
- This is 2 bytes, not 1.
- UTF-8 doesn't have any header. It's a character encoding. (Files sometimes have a BOM for legacy reasons, but that is a different byte sequence than this.)
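For reference, a minimal sketch of the two points above, assuming PHP 7 or later (for the `\u{...}` escape). The variable names are illustrative only and are not taken from the PR.

```php
<?php
// U+00A0 (non-breaking space) encoded in UTF-8 is a two-byte sequence.
$nbsp = "\u{00A0}";
echo strlen($nbsp), "\n";   // 2 (strlen counts bytes, not characters)
echo bin2hex($nbsp), "\n";  // c2a0

// The UTF-8 byte order mark is U+FEFF, a different three-byte sequence.
$bom = "\u{FEFF}";
echo bin2hex($bom), "\n";   // efbbbf
```

So `\xc2` and `\xa0` together are the character itself, not a header followed by a character.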
Maybe? Can you explain what it is you're trying to do and why?
The line removes the non-breaking space.
When you import HTML from REDCap, there are non-breaking spaces that get past the strip_tags filter. This line solves that. It's a direct result of my work on a REDCap import.
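A small sketch of that behaviour, in case it helps reviewers reproduce it. The label below is a made-up example, not an actual REDCap export, and whether REDCap emits the `&nbsp;` entity, the raw character, or both is an assumption here.

```php
<?php
// Hypothetical field label containing a tag, an &nbsp; entity,
// and a raw U+00A0 character.
$label = "<b>Age</b>&nbsp;at\u{00A0}visit";

// strip_tags() only removes markup; both forms of NBSP survive.
$stripped = strip_tags($label);

// Decoding entities turns &nbsp; into the same U+00A0 character,
// so both occurrences end up as the bytes C2 A0.
$decoded = html_entity_decode($stripped, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo substr_count($decoded, "\xc2\xa0"), "\n"; // 2
```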
@driusan @adamdaudrich I quickly looked up the non-breaking space (NBSP) code that Adam is replacing:
- Unicode character: U+00A0 (https://www.compart.com/en/unicode/U+00A0)
- UTF-8 encoding: \xc2\xa0, which is what is written in the code at L249.
So maybe the comment should be "replace the non-breaking space \xc2\xa0 with a single space"?
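To illustrate why the comment wording matters, here is a hedged sketch that compares replacing the full two-byte sequence with replacing each byte on its own. The byte-by-byte variant is hypothetical and is not claimed to be what the PR does; the sample label is invented.

```php
<?php
// Sample label: "Voilà", an NBSP, then "!". Note that "à" (U+00E0) is
// encoded as C3 A0, so it shares the byte A0 with the NBSP (C2 A0).
$label = "Voil\u{00E0}\u{00A0}!";

// Replace the whole two-byte sequence with a single space.
$good = str_replace("\xc2\xa0", ' ', $label);

// Hypothetical byte-by-byte replacement: this also hits the A0 in "à".
$bad = str_replace(["\xc2", "\xa0"], ' ', $label);

// preg_match() with the /u modifier returns false on invalid UTF-8.
var_dump(preg_match('//u', $good) === 1); // bool(true)
var_dump(preg_match('//u', $bad) === 1);  // bool(false)
```

Treating `\xc2\xa0` as one unit leaves other multi-byte characters in the label intact.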
Brief summary of changes
Sometimes a REDCap instance is set up with HTML elements to improve the user experience. This HTML is carried over into the LINST file.
This PR removes HTML from the LINST field labels during the redcap2linst import and also removes non-breaking spaces.
See the HTML tags at the bottom of this image:

Link(s) to related issue(s)