[TASK] Streamline \TYPO3\HtmlSanitizer\Sanitizer (#97)
* [TASK] Adjust PHPdoc comment

* [TASK] Declare all default options for Masterminds parser

* [TASK] Organize node handling in dedicated method

Currently `sanitize` combines all steps:

- `parse` to create DOM node structures
- `handle` to apply corresponding sanitizer behavior
- `serialize` to convert DOM nodes back to HTML as string

The goal is for all of these methods to become public in future versions.

* [TASK] Deprecate superfluous property Sanitizer::$root
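
The three steps named in the commit message (`parse`, `handle`, `serialize`) can be illustrated with a minimal sketch. It is not the library's implementation: `PipelineSketch` is a hypothetical class, and it assumes PHP's bundled DOM extension in place of the Masterminds HTML5 parser, so the input has to be well-formed XML. The "behavior" applied in `handle` (dropping `script` elements) is likewise only an assumption for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in, NOT the real \TYPO3\HtmlSanitizer\Sanitizer.
// Assumption: ext-dom replaces the Masterminds HTML5 parser.
final class PipelineSketch
{
    private DOMDocument $document;

    public function __construct()
    {
        $this->document = new DOMDocument();
    }

    // Chains the three steps; the root is passed along as a local variable.
    public function sanitize(string $markup): string
    {
        $root = $this->parse($markup);
        $this->handle($root);
        return $this->serialize($root);
    }

    // Step 1: markup string -> DOM fragment.
    public function parse(string $markup): DOMDocumentFragment
    {
        $fragment = $this->document->createDocumentFragment();
        if ($markup !== '' && !$fragment->appendXML($markup)) {
            throw new InvalidArgumentException('Markup is not well-formed XML');
        }
        return $fragment;
    }

    // Step 2: apply the (illustrative) behavior: remove <script> elements.
    public function handle(DOMNode $node): DOMNode
    {
        // Snapshot the live node list, because nodes are removed while iterating.
        foreach (iterator_to_array($node->childNodes) as $child) {
            if (!$child instanceof DOMElement) {
                continue;
            }
            if (strtolower($child->nodeName) === 'script') {
                $node->removeChild($child);
                continue;
            }
            $this->handle($child);
        }
        return $node;
    }

    // Step 3: DOM nodes -> string, child by child.
    public function serialize(DOMNode $node): string
    {
        $output = '';
        foreach ($node->childNodes as $child) {
            $output .= $this->document->saveXML($child);
        }
        return $output;
    }
}

// prints <p>Hi</p>
echo (new PipelineSketch())->sanitize('<p>Hi<script>alert(1)</script></p>'), PHP_EOL;
```

Because each step takes its input as an argument and returns its result, the steps can be called on their own, which is what making them public later would require.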
ohader authored Nov 24, 2022
1 parent 1c918d7 commit 1b5eed3
Showing 3 changed files with 26 additions and 12 deletions.
1 change: 1 addition & 0 deletions UPGRADING.md
@@ -6,3 +6,4 @@
   use `\TYPO3\HtmlSanitizer\Behavior\NodeException::withDomNode(?DOMNode $domNode)` instead
 * deprecated `\TYPO3\HtmlSanitizer\Behavior\NodeException::getNode()`,
   use `\TYPO3\HtmlSanitizer\Behavior\NodeException::getDomNode()` instead
+* deprecated property `\TYPO3\HtmlSanitizer\Sanitizer::$root`, superfluous - don't use it anymore
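
For code that extends the sanitizer, the deprecation note above amounts to reading the node that is passed in instead of the shared `$root` property. The following is only a sketch under stated assumptions: `BaseSketch` and `CountingSketch` are hypothetical classes built on PHP's bundled DOM extension, not the real `\TYPO3\HtmlSanitizer\Sanitizer`, and the element counting is made up for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the base class, NOT the real library class.
class BaseSketch
{
    /**
     * @var DOMDocumentFragment|null
     * @deprecated kept only for backwards compatibility
     */
    protected $root;

    public function sanitize(string $markup): string
    {
        $document = new DOMDocument();
        $root = $document->createDocumentFragment();
        $root->appendXML($markup);
        // The deprecated property is still assigned, as in the commit.
        $this->root = $root;
        $this->handle($root);

        $output = '';
        foreach ($root->childNodes as $child) {
            $output .= $document->saveXML($child);
        }
        return $output;
    }

    protected function handle(DOMNode $domNode): DOMNode
    {
        return $domNode;
    }
}

// Hypothetical subclass that avoids the deprecated property.
class CountingSketch extends BaseSketch
{
    public int $elementCount = 0;

    protected function handle(DOMNode $domNode): DOMNode
    {
        // Avoid: foreach ($this->root->childNodes as $child) { ... }
        // Prefer the node handed to the method:
        foreach ($domNode->childNodes as $child) {
            if ($child instanceof DOMElement) {
                $this->elementCount++;
            }
        }
        return parent::handle($domNode);
    }
}

$sketch = new CountingSketch();
$sketch->sanitize('<p>a</p><p>b</p>');
echo $sketch->elementCount, PHP_EOL; // prints 2
```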
2 changes: 1 addition & 1 deletion src/Context.php
@@ -28,7 +28,7 @@ class Context
     public $parser;

     /**
-     * @var InitiatorInterface
+     * @var ?InitiatorInterface
      */
     public $initiator;

35 changes: 24 additions & 11 deletions src/Sanitizer.php
@@ -36,6 +36,14 @@
  */
 class Sanitizer
 {
+    protected const mastermindsDefaultOptions = [
+        // Whether the serializer should aggressively encode all characters as entities.
+        'encode_entities' => false,
+        // Prevents the parser from automatically assigning the HTML5 namespace to the DOM document.
+        // (adjusted due to https://github.com/Masterminds/html5-php/issues/181#issuecomment-643767471)
+        'disable_html_ns' => true,
+    ];
+
     /**
      * @var VisitorInterface[]
      */
@@ -48,6 +56,7 @@ class Sanitizer

     /**
      * @var DOMDocumentFragment
+     * @deprecated since v2.1.0, not required anymore
      */
     protected $root;

@@ -64,19 +73,27 @@ public function __construct(VisitorInterface ...$visitors)

     public function sanitize(string $html, InitiatorInterface $initiator = null): string
     {
-        $this->root = $this->parse($html);
-        $this->context = new Context($this->parser, $initiator);
-        $this->beforeTraverse();
-        $this->traverseNodeList($this->root->childNodes);
-        $this->afterTraverse();
-        return $this->serialize($this->root);
+        $root = $this->parse($html);
+        // @todo drop deprecated property
+        $this->root = $root;
+        $this->handle($root, $initiator);
+        return $this->serialize($root);
     }

     protected function parse(string $html): DOMDocumentFragment
     {
         return $this->parser->parseFragment($html);
     }

+    protected function handle(DOMNode $domNode, InitiatorInterface $initiator = null): DOMNode
+    {
+        $this->context = new Context($this->parser, $initiator);
+        $this->beforeTraverse();
+        $this->traverseNodeList($domNode->childNodes);
+        $this->afterTraverse();
+        return $domNode;
+    }
+
     protected function serialize(DOMNode $document): string
     {
         return $this->parser->saveHTML($document);
Expand Down Expand Up @@ -149,10 +166,6 @@ protected function replaceNode(DOMNode $source, ?DOMNode $target): ?DOMNode

protected function createParser(): HTML5
{
// set parser & applies work-around
// https://github.com/Masterminds/html5-php/issues/181#issuecomment-643767471
return new HTML5([
'disable_html_ns' => true,
]);
return new HTML5(self::mastermindsDefaultOptions);
}
}
