STRIP: This paragraph has junk class and style attributes. Should become plain p tag.
KEEP: This is a clean paragraph with no attributes.
STRIP DIV KEEP TEXT: This text was inside a div with a Word class.
STRIP SPAN KEEP TEXT: This was wrapped in a styled span.
KEEP: A paragraph with a clean link to CDC Gaming that should keep href, target, rel but lose style, class, and data attributes.
KEEP: A paragraph with a bold word and an italic word that should lose their style attributes.
STRIP ATTRIBUTES: This h3 should become a plain h3
- STRIP ATTRIBUTES: List item with junk attributes
- STRIP ATTRIBUTES: Another list item with junk
- KEEP: Clean list item
KEEP CLASS STRIP STYLE: This blockquote should keep class="twitter-tweet" but lose the style.
STRIP: The three lines above were empty paragraphs with &nbsp; and whitespace. They should be gone.
STRIP TABLE KEEP TEXT: This was inside a junk table.
STRIP: The style tag above should be completely removed.
STRIP: The script tag above should be completely removed.
STRIP: The meta tag above should be removed.
STRIP SECTION AND HEADER KEEP TEXT: Wrapped in section and header tags.
STRIP FONT KEEP TEXT: This was in a font tag.
KEEP: Regular paragraph before iframe test.
KEEP: The iframe above should be completely untouched.
KEEP: Paragraph with superscript that should lose its style.
KEEP: Paragraph with clean subscript that should stay as-is.
STRIP: The nested closing p tags above should collapse.
KEEP: Final clean paragraph.
Trailing nbsp test:
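
SKETCH: The STRIP and KEEP labels above describe an allow-list sanitizer. The following is a minimal illustration of that idea, not the implementation this fixture was written for. It uses Python's standard-library html.parser, and the rule tables (ALLOWED_ATTRS, KEEP_ALL_ATTRS, UNWRAP, DROP_WITH_CONTENT, DROP_TAG) and the sanitize function are hypothetical names inferred from the labels. It re-serializes iframe tags with all of their attributes rather than leaving them byte-for-byte untouched, and it does not cover the empty-paragraph, nested closing tag, or trailing nbsp cases.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical rule tables, inferred from the test labels (assumptions).
ALLOWED_ATTRS = {"a": {"href", "target", "rel"}, "blockquote": {"class"}}
KEEP_ALL_ATTRS = {"iframe"}  # iframes keep every attribute
UNWRAP = {"div", "span", "font", "table", "tbody", "tr", "td", "section", "header"}
DROP_WITH_CONTENT = {"style", "script"}  # tag and its content are removed
DROP_TAG = {"meta"}  # tag is removed; it has no content


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a style or script element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in UNWRAP or tag in DROP_TAG:
            return
        if tag not in KEEP_ALL_ATTRS:
            allowed = ALLOWED_ATTRS.get(tag, set())
            attrs = [(k, v) for k, v in attrs if k in allowed]
        parts = []
        for key, value in attrs:
            # A value of None means a bare attribute such as allowfullscreen.
            parts.append(f" {key}" if value is None else f' {key}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(parts)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax: emit the start tag only, never a stray end tag.
        if tag not in DROP_WITH_CONTENT:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in UNWRAP or tag in DROP_TAG:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text survives unless it sits inside a dropped style or script element.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = Sanitizer()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

Calling sanitize on a paragraph that carries class and style attributes should return the same paragraph as a plain p tag, and calling it on text wrapped in a div should return the text alone. An allow-list is used here because anything not named in the tables is stripped by default, which matches the way the labels treat unlisted attributes as junk.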
