[redcap] strip html and non-breaking space from field label #10387

Open
adamdaudrich wants to merge 2 commits into aces:main from adamdaudrich:stripLinst

Conversation

@adamdaudrich
Contributor

Brief summary of changes

Sometimes a REDCap instance is set up with HTML elements in its field labels to improve the user experience. This HTML is carried over into the LINST file.

This PR removes HTML from field labels during the redcap2linst import and also removes non-breaking spaces.

See the HTML tags at the bottom of this image:
[Image: Screenshot from 2025-11-10 15-38-11]

Link(s) to related issue(s)

  • Resolves # (Reference the issue this fixes, if any.)

@github-actions (bot) added the "Language: PHP" and "Module: redcap" labels on Mar 3, 2026
Collaborator

@regisoc left a comment

LGTM.

// strip HTML-related content
$field_label = strip_tags($field_label);
// replace the non-breaking space \xa0
// and \xc2, the first byte of its UTF-8 encoding
Collaborator

I don't have any problem with the code stripping these out if REDCap is adding them for some reason, but I don't understand the comment.

  1. This is 2 bytes, not 1.
  2. UTF-8 doesn't have any header. It's a character encoding. (Files sometimes have a BOM for legacy reasons, but that is a different byte sequence from this.)
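
To make the two points above concrete, here is a minimal PHP sketch that prints the bytes involved. The variable names are illustrative and not taken from the PR, and it assumes the label is a UTF-8 string, which the thread implies but does not state outright.

```php
<?php
// Illustrative only: variable names are not taken from the PR.

// U+00A0 NO-BREAK SPACE, written as its two UTF-8 bytes.
$nbsp = "\xC2\xA0";

// The UTF-8 byte order mark (U+FEFF) is a different, three-byte sequence.
$bom = "\xEF\xBB\xBF";

echo strlen($nbsp), "\n";   // 2 (strlen() counts bytes, not characters)
echo bin2hex($nbsp), "\n";  // c2a0
echo bin2hex($bom), "\n";   // efbbbf

// Removing only the second byte leaves a stray 0xC2 behind,
// which is not valid UTF-8 on its own.
$broken = str_replace("\xA0", '', "a" . $nbsp . "b");
echo bin2hex($broken), "\n"; // 61c262
```

So 0xC2 is not a header or prefix that can be treated separately: it is simply the first of the two bytes that together encode the one character.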

Contributor Author

@driusan Should I change the comment?

Collaborator

Maybe? Can you explain what it is you're trying to do and why?

Contributor Author

The line removes the non-breaking space.

Collaborator

What non-breaking space?

Contributor Author

@adamdaudrich, Apr 2, 2026

When you import HTML from REDCap, there are non-breaking spaces (nbsp) that get past the strip_tags filter. This line takes care of them. It's a direct result of my work on a REDCap import.
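
A small sketch of why strip_tags alone does not catch these. The sample label is made up, and whether REDCap sends the literal U+00A0 character, the &nbsp; entity, or both is an assumption here, not something this thread confirms.

```php
<?php
// Hypothetical field label; not taken from a real REDCap export.
$field_label = "<b>Date of\u{00A0}birth</b>&nbsp;(required)";

// strip_tags() removes the <b> element but leaves text and entities untouched.
$stripped = strip_tags($field_label);

// Count what survived: the literal NBSP character and the &nbsp; entity.
echo substr_count($stripped, "\xC2\xA0"), "\n"; // 1
echo substr_count($stripped, '&nbsp;'), "\n";   // 1
echo substr_count($stripped, '<'), "\n";        // 0
```

Neither form of the non-breaking space is a tag, so both pass through and need a separate step.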

Collaborator

@regisoc, Apr 9, 2026

@driusan @adamdaudrich I did a quick search on the non-breaking space (NBSP) code that Adam is replacing:

[Image]

So maybe the comment should be "replace the non-breaking space \xc2\xa0 with a single space"?
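
The wording suggested above maps onto a two-step cleanup, sketched here. clean_field_label is a hypothetical helper name, not a function from the PR, and replacing with a plain space follows the suggestion in this comment rather than whatever the PR finally does.

```php
<?php
// Hypothetical helper; the name and signature are assumptions for illustration.
function clean_field_label(string $field_label): string
{
    // strip HTML-related content
    $field_label = strip_tags($field_label);
    // replace the non-breaking space \xc2\xa0 with a single space
    return str_replace("\xC2\xA0", ' ', $field_label);
}

echo clean_field_label("<p>Weight\xC2\xA0(kg)</p>"), "\n"; // Weight (kg)
```

Matching the full two-byte sequence leaves other characters alone: for example, "à" is encoded as \xc3\xa0, so replacing \xa0 on its own would corrupt it.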

Labels

Language: PHP (PR or issue that updates PHP code), Module: redcap (PR or issue related to redcap module)

3 participants