Fix html tags, close tags, repair bad quotes and more

February 3, 2010

This class can solve many problems that come from user-generated HTML content, or fix HTML content before your bots do some heavy processing on it! (It’s especially useful for web sites without PHP’s HTML Tidy module.)

Here is a quick list of the magic things it can do.

  1. delete closing tags that have no opening tag
  2. fix opening tags that have no closing tag, closing them automatically
  3. check for bad nesting and fix it (if you have a bold inside a bold… or a paragraph that contains a table…)
  4. fix bad quotes in attributes (adding quotes where they are missing…)
  5. merge multiple style attributes in the same tag
  6. remove HTML comments
  7. remove empty tags and other bad tags
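
Two of the items above, merging style attributes and removing comments, can be sketched in a few lines of PHP. This is only an illustration, not the HtmlFixer source: the function names stripHtmlComments and mergeStyleAttributes are hypothetical, and the sketch assumes attribute values are double-quoted.

```php
<?php
// Hypothetical helpers for illustration only; they are NOT part of HtmlFixer.

/** Remove HTML comments such as <!-- note -->. */
function stripHtmlComments(string $html): string
{
    // The "s" flag lets "." match newlines, so multi-line comments are removed too.
    return preg_replace('/<!--.*?-->/s', '', $html) ?? $html;
}

/** Merge repeated style="..." attributes inside a single opening tag. */
function mergeStyleAttributes(string $tag): string
{
    // Collect every double-quoted style value (assumption: values use double quotes).
    if (preg_match_all('/\sstyle="([^"]*)"/i', $tag, $m) < 2) {
        return $tag; // zero or one style attribute: nothing to merge
    }

    $rules = [];
    foreach ($m[1] as $value) {
        foreach (explode(';', $value) as $rule) {
            $rule = trim($rule);
            if ($rule !== '') {
                $rules[] = $rule;
            }
        }
    }

    // Drop all style attributes, then append one merged attribute before the tag end.
    $tag = preg_replace('/\sstyle="[^"]*"/i', '', $tag) ?? $tag;
    $merged = ' style="' . implode('; ', $rules) . ';"';
    $selfClosing = substr($tag, -2) === '/>';
    $body = rtrim(substr($tag, 0, $selfClosing ? -2 : -1));

    return $body . $merged . ($selfClosing ? ' />' : '>');
}

echo stripHtmlComments('<p>Hi<!-- hidden --> there</p>'), "\n";
// prints: <p>Hi there</p>
echo mergeStyleAttributes('<p style="color: red" class="x" style="font-weight: bold;">'), "\n";
// prints: <p class="x" style="color: red; font-weight: bold;">
```

A real fixer also has to cope with single-quoted and unquoted values, which is where most of the complexity comes from.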

It works in a complex way: it analyzes the HTML code character by character and searches for tags. When a tag is found, it cleans the tag’s attributes, stores the data it found in a matrix, and searches for the closing tag.
The data saved in the matrix is later used to rebuild the corrected HTML.
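
The scan-and-rebuild idea can be sketched roughly like this. It is a simplified illustration, not the actual HtmlFixer code: it keeps a plain stack of open tag names instead of the matrix described above, the function name balanceTags is hypothetical, and it assumes no ">" appears inside attribute values or comments.

```php
<?php
// Hypothetical sketch for illustration only; this is NOT the HtmlFixer source.

function balanceTags(string $html): string
{
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link']; // tags that never get a closing tag
    $open = [];  // stack of currently open tag names
    $out  = '';
    $len  = strlen($html);
    $i    = 0;

    while ($i < $len) {
        if ($html[$i] !== '<') {           // plain text: copy it through
            $out .= $html[$i++];
            continue;
        }
        $end = strpos($html, '>', $i);
        if ($end === false) {              // stray "<" with no ">": escape it
            $out .= '&lt;';
            $i++;
            continue;
        }
        $tag = substr($html, $i, $end - $i + 1);
        $i   = $end + 1;

        if (!preg_match('/^<(\/?)([a-zA-Z][a-zA-Z0-9]*)/', $tag, $m)) {
            $out .= $tag;                  // comment, doctype, etc.: leave untouched
            continue;
        }
        $name = strtolower($m[2]);

        if ($m[1] === '') {                // opening tag
            $out .= $tag;
            if (!in_array($name, $void, true) && substr($tag, -2) !== '/>') {
                $open[] = $name;
            }
            continue;
        }

        // Closing tag: look for the nearest matching open tag on the stack.
        $pos = false;
        for ($k = count($open) - 1; $k >= 0; $k--) {
            if ($open[$k] === $name) {
                $pos = $k;
                break;
            }
        }
        if ($pos === false) {
            continue;                      // no opening tag: drop the orphan closing tag
        }
        while (count($open) > $pos) {      // close inner tags first, then the match itself
            $out .= '</' . array_pop($open) . '>';
        }
    }

    while ($open) {                        // close whatever is still open at the end
        $out .= '</' . array_pop($open) . '>';
    }
    return $out;
}

echo balanceTags('<p><b>bold</p>'), "\n";
// prints: <p><b>bold</b></p>
```

The stack only handles missing and orphan closing tags; checking which tags may legally contain which others (a table inside a paragraph, for example) needs extra rules on top of it.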

EXAMPLE:
It’s very simple to use. Suppose you have a variable with the dirty HTML:

$a = new HtmlFixer();
$clean = $a->getFixedHtml($dirty_html);

You can download the class from the HTML FIXER page.

Author

PHP expert. WordPress plugin and theme developer. Father, Maker, Arduino and ESP8266 enthusiast.

Comments on “Fix html tags, close tags, repair bad quotes and more”

There are 8 thoughts

  1. Savita said:

    Hi,

    This class is really helpful; I just need to point out some small issues…
    1) The PHP short open tag <? is used; the full open tag should be used in its place. Also, debug is false, but it still shows all the debugging output on the screen.

  2. admin said:

    Thank you Savita, I probably uploaded the wrong file. I’ll fix it today!

  3. Reflexões said:

    Wow!!!

    This works perfectly!!! It fixes all HTML tags….
    I lost a lot of time manually fixing broken HTML code posted by users on a custom blog…

    Thx

  4. web said:

    I don’t have a PHP page; is there another way to fix this problem for .html pages?

  5. Lee Tung said:

    It is powerful! This is just what I need to fix HTML tags in PHP! Thanks admin!

  6. qinkun said:

    If there is a before ,it will delete the string before the end tag().
    e.g.
    something to dor u sure

    The result I want is:
    something to dor u sure

    What can I do in that case?

  7. Radu said:

    Bug?

    Enter the code below, replacing [ and ] with less-than / greater-than signs:

    $a = new CTS_HTMLFixer();
    echo $a->getFixedHtml('[font style="color: rgb(153, 255, 51);" size="4"]test[/font]');

    ——-

    Result: [font style="color:;" rgb(153, 255, 51);" size="4"]test[/font]

    So it replaces the valid attribute style="color: rgb(153, 255, 51);" with the invalid HTML style="color:;" rgb(153, 255, 51);"

  8. robcaa said:

    How can I disable comment deletion?
