ocPortal Developer's Guide: XHTML
An important new feature in ocPortal is the XHTML checker. It can optionally check all outgoing XHTML for errors and report them to you, which is a great help during template development (and makes you wonder why web browsers do not have an option for this).
It can also display the XHTML source whether or not there are errors, presenting a colour-coded, indented, and line-numbered representation of your XHTML output.
validation.php validates the given XHTML, CSS, and accessibility, and also finds common bugs and browser incompatibilities. To see exactly what it checks, look at the errors defined in the language file.
Blockers for XHTML-strict (which we don't actually use: we use transitional without deprecated tags/attributes)
- iframe
- a.target, form.target
Technically, innerHTML shouldn't be used for XHTML. However, Firefox 1.5 and Opera 8 support it, and IE doesn't support proper XHTML. Regardless, ocPortal does not require parsing as proper XHTML (because if it did, IE wouldn't work with it). More technically, innerHTML isn't even part of the DOM, but it is a de facto standard: JavaScript is such a mess that de facto standards need to be used and browser-specific workarounds deployed.
Blockers for WCAG2 (standard still a working draft).
- All radio button groups are marked using fieldset and legend elements. (we don't agree with this one, as the name attribute already groups them and fieldset adds a frame around them, which is unwanted because it breaks up the UI too much)
- No layout tables [make summary attribute have mandatory length, yet layout tables must have no summary => no layout tables] (we'd like to do this wherever possible; sometimes CSS is not powerful or stable enough, but this is only in a handful of instances)
The following need to be checked manually
- <noscript> is given whenever appropriate and possible
- When plugins are used, information about them must be displayed
- When an appropriate markup language exists, use markup rather than images to convey information.
- Mark up lists and list items properly.
- Ensure that all information conveyed with color is also available without color, for example from context or markup.
- <blockquote> is not used for anything other than quotations
The following are up to websites
- Until user agents allow users to freeze moving content, avoid movement in pages; and until user agents allow users to control flickering, avoid causing the screen to flicker: by default nothing flickers, but Comcode allows it. It's a question of whether a site is designed to be accessible for all, or 'fancy' for the majority
- Alternatives given to multimedia content
- Use the clearest and simplest language appropriate for a site's content.
- Divide large blocks of information into more manageable groups where natural and appropriate.
- Specify the expansion of each abbreviation or acronym in a document where it first occurs.
- Place distinguishing information at the beginning of headings, paragraphs, lists, etc.
Level 3 accessibility
- Create a logical tab order through links, form controls, and objects: it is impossible to do this for a site that is constructed modularly; arguably, it only needs doing in specific cases, which we do handle
- Form control default text invalid (null): I don't think this one exists in the latest standard ("until user agents"), and rightly so: it is often necessary to have blank inputs. Turn on $strict_form_accessibility for this.
sources/validation.php
Global_functions_validation.php
Function summary
| Return type | Function |
| void | init__validation () |
| string | html_entity_decode (string input, integer quote_style, ?string charset) |
| mixed | str_word_count (string input, integer format) |
| URLPATH | qualify_url (URLPATH url, URLPATH url_base) |
| ?string | http_download_file (URLPATH url, ?integer byte_limit, boolean trigger_error, boolean no_redirect, string ua, ?array post_params, ?array cookies, ?string accept, ?string accept_charset, ?string accept_language, ?resource write_to_file, ?string referer, ?array auth, float timeout, boolean is_xml, ?array files) |
| ?mixed | do_lang (ID_TEXT a, ?mixed param_a, ?mixed param_b, ?mixed param_c, ?LANGUAGE_NAME lang, boolean require_result) |
| string | get_forum_type () |
| string | ocp_srv (string value) |
| string | mailto_obfuscated () |
| ?mixed | mixed () |
| ?map | check_xhtml (string out, boolean well_formed_only, boolean is_fragment, boolean validation_javascript, boolean validation_css, boolean validation_wcag, boolean validation_compat, boolean validation_ext_files, boolean validation_manual) |
| map | _xhtml_error (string error, string param_a, string param_b, string param_c, boolean raw, integer rel_pos) |
| boolean | is_hex (string string) |
| ?mixed | test_entity (integer offset) |
| string | fix_entities (string in) |
| ?mixed | _get_next_tag () |
| mixed | _check_tag (string tag, map attributes, boolean self_close, boolean close, list errors) |
| string | _get_tag_basis (string full) |
void init__validation()
Standard code module initialisation function.
Parameters…
(No return value)
function init__validation()
{
if (!function_exists('html_entity_decode'))
{
/**
* Decode the HTML entity encoded input string. Can give warning if unrecognised character set.
*
* @param string The text to decode
* @param integer The quote style code
* @param ?string Character set to decode to (NULL: default)
* @return string The decoded text
*/
function html_entity_decode($input,$quote_style,$charset=NULL)
{
unset($quote_style);
unset($charset);
/* // NB: &nbsp; does not go to <space>. It's not something you use with html escaping, it's for hard-space-formatting. URL's don't contain spaces, but that's due to URL escaping (%20)
$replace_array=array(
'&amp;'=>'&',
'&gt;'=>'>',
'&lt;'=>'<',
'&#039;'=>'\'',
'&quot;'=>'"',
);
foreach ($replace_array as $from=>$to)
{
$input=str_replace($from,$to,$input);
}
return $input;
*/
$trans_tbl=get_html_translation_table(HTML_ENTITIES);
$trans_tbl=array_flip($trans_tbl);
return strtr($input,$trans_tbl);
}
}
if (!function_exists('str_word_count'))
{
/**
* Isolate the words in the input string.
*
* @param string String to count words in
* @param integer The format
* @set 0 1
* @return mixed Typically a list - the words of the input string
*/
function str_word_count($input,$format=0)
{
//count words
$pattern="/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
$all_words=trim(preg_replace($pattern,' ',$input));
$a=($all_words=='')?array():explode(' ',$all_words); // Avoid counting one empty "word" for empty input
return ($format==0)?count($a):$a;
}
}
if (!function_exists('qualify_url'))
{
/**
* Take a URL and base-URL, and fully qualify the URL according to it.
*
* @param URLPATH The URL to fully qualify
* @param URLPATH The base-URL
* @return URLPATH Fully qualified URL
*/
function qualify_url($url,$url_base)
{
if (($url!='') && ($url[0]!='#') && (substr($url,0,7)!='mailto:'))
{
if (strpos($url,'://')===false)
{
if ($url[0]=='/')
{
$parsed=parse_url($url_base);
if (!array_key_exists('scheme',$parsed)) $parsed['scheme']='http';
if (!array_key_exists('host',$parsed)) $parsed['host']='localhost';
if (substr($url,0,2)=='//')
{
$url=$parsed['scheme'].':'.$url;
} else
{
$url=$parsed['scheme'].'://'.$parsed['host'].(array_key_exists('port',$parsed)?(':'.$parsed['port']):'').$url;
}
} else $url=$url_base.'/'.$url;
}
} else return '';
return $url;
}
}
if (!function_exists('http_download_file'))
{
/**
* Return the file in the URL by downloading it over HTTP. If a byte limit is given, it will only download that many bytes. It outputs warnings, returning NULL, on error.
*
* @param URLPATH The URL to download
* @param ?integer The number of bytes to download. This is not a guarantee, it is a minimum (NULL: all bytes)
* @range 1 max
* @param boolean Whether to throw an ocPortal error, on error
* @param boolean Whether to block redirects (returns NULL when found)
* @param string The user-agent to identify as
* @param ?array An optional array of POST parameters to send; if this is NULL, a GET request is used (NULL: none)
* @param ?array An optional array of cookies to send (NULL: none)
* @param ?string 'accept' header value (NULL: don't pass one)
* @param ?string 'accept-charset' header value (NULL: don't pass one)
* @param ?string 'accept-language' header value (NULL: don't pass one)
* @param ?resource File handle to write to (NULL: do not do that)
* @param ?string The HTTP referer (NULL: none)
* @param ?array A pair: authentication username and password (NULL: none)
* @param float The timeout
* @param boolean Whether to treat the POST parameters as a raw POST (rather than using MIME)
* @param ?array Files to send. Map between field to file path (NULL: none)
* @return ?string The data downloaded (NULL: error)
*/
function http_download_file($url,$byte_limit=NULL,$trigger_error=true,$no_redirect=false,$ua='ocPortal',$post_params=NULL,$cookies=NULL,$accept=NULL,$accept_charset=NULL,$accept_language=NULL,$write_to_file=NULL,$referer=NULL,$auth=NULL,$timeout=6.0,$is_xml=false,$files=NULL)
{
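@ini_set('allow_url_fopen','1');
$data=@file_get_contents($url); // Assumes URL-wrappers is on, whilst ocPortal's is much more sophisticated
return ($data===false)?NULL:$data; // file_get_contents gives false on failure, but this function is documented to give NULL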
}
}
if (!function_exists('do_lang'))
{
/**
* Get the human-readable form of a language id, or a language entry from a language INI file. (STUB)
*
* @param ID_TEXT The language id
* @param ?mixed The first token [string or tempcode] (replaces {1}) (NULL: none)
* @param ?mixed The second token [string or tempcode] (replaces {2}) (NULL: none)
* @param ?mixed The third token (replaces {3}). May be an array of [of string], to allow any number of additional args (NULL: none)
* @param ?LANGUAGE_NAME The language to use (NULL: users language)
* @param boolean Whether to cause ocPortal to exit if the lookup does not succeed
* @return ?mixed The human-readable content (NULL: not found). String normally. Tempcode if tempcode parameters.
*/
function do_lang($a,$param_a=NULL,$param_b=NULL,$param_c=NULL,$lang=NULL,$require_result=true)
{
if (function_exists('_do_lang')) return _do_lang($a,$param_a,$param_b,$param_c,$lang,$require_result);
unset($lang);
switch ($a)
{
case 'LINK_NEW_WINDOW':
return 'new window';
case 'SPREAD_TABLE':
return 'Spread table';
case 'MAP_TABLE':
return 'Item to value mapper table';
}
return array($a,$param_a,$param_b,$param_c);
}
}
if (!function_exists('get_forum_type'))
{
/**
* Get the type of forums installed.
*
* @return string The type of forum installed
*/
function get_forum_type()
{
return 'none';
}
}
if (!function_exists('ocp_srv'))
{
/**
* Get server environment variables. (STUB)
*
* @param string The variable name
* @return string The variable value ('' means unknown)
*/
function ocp_srv($value)
{
return '';
}
}
if (!function_exists('mailto_obfuscated'))
{
/**
* Get obfuscated version of 'mailto:' (which will hopefully fool e-mail scavengers into not picking up these e-mail addresses).
*
* @return string The obfuscated 'mailto:' string
*/
function mailto_obfuscated()
{
return 'mailto:';
}
}
if (!function_exists('mixed'))
{
/**
* Assign this to explicitly declare that a variable may be of mixed type, and initialise to NULL.
*
* @return ?mixed Of mixed type (NULL: default)
*/
function mixed()
{
return NULL;
}
}
define('DOCTYPE_HTML','<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">');
define('DOCTYPE_HTML_STRICT','<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">');
define('DOCTYPE_XHTML','<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">');
define('DOCTYPE_XHTML_STRICT','<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">');
define('DOCTYPE_XHTML_NEW','<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">');
define('DOCTYPE_XHTML5','<!DOCTYPE html>');
global $XHTML_VALIDATOR_OFF,$WELL_FORMED_ONLY,$VALIDATION_JAVASCRIPT,$VALIDATION_CSS,$VALIDATION_WCAG,$VALIDATION_COMPAT,$VALIDATION_EXT_FILES,$VALIDATION_MANUAL;
$VALIDATION_JAVASCRIPT=true;
$VALIDATION_CSS=true;
$VALIDATION_WCAG=true;
$VALIDATION_COMPAT=true;
$VALIDATION_EXT_FILES=true;
$VALIDATION_MANUAL=false;
global $EXTRA_CHECK;
$EXTRA_CHECK=array();
global $VALIDATED_ALREADY;
$VALIDATED_ALREADY=array();
global $NO_XHTML_LINK_FOLLOW;
$NO_XHTML_LINK_FOLLOW=0;
global $CSS_TAG_RANGES,$CSS_VALUE_RANGES;
$CSS_TAG_RANGES=array();
$CSS_VALUE_RANGES=array();
global $ENTITIES;
$ENTITIES=array(
'quot'=>1, 'amp'=>1, 'lt'=>1, 'gt'=>1, 'nbsp'=>1, 'iexcl'=>1, 'cent'=>1,
'pound'=>1, 'curren'=>1, 'yen'=>1, 'brvbar'=>1, 'sect'=>1, 'uml'=>1,
'copy'=>1, 'ordf'=>1, 'laquo'=>1, 'not'=>1, 'shy'=>1, 'reg'=>1, 'macr'=>1,
'deg'=>1, 'plusmn'=>1, 'sup2'=>1, 'sup3'=>1, 'acute'=>1, 'micro'=>1,
'para'=>1, 'middot'=>1, 'cedil'=>1, 'sup1'=>1, 'ordm'=>1, 'raquo'=>1,
'frac14'=>1, 'frac12'=>1, 'frac34'=>1, 'iquest'=>1,
'Agrave'=>1, 'Aacute'=>1, 'Acirc'=>1, 'Atilde'=>1, 'Auml'=>1, 'Aring'=>1,
'AElig'=>1, 'Ccedil'=>1, 'Egrave'=>1, 'Eacute'=>1,
'Ecirc'=>1, 'Euml'=>1, 'Igrave'=>1, 'Iacute'=>1, 'Icirc'=>1, 'Iuml'=>1,
'ETH'=>1, 'Ntilde'=>1, 'Ograve'=>1, 'Oacute'=>1, 'Ocirc'=>1,
'Otilde'=>1, 'Ouml'=>1, 'times'=>1, 'Oslash'=>1, 'Ugrave'=>1, 'Uacute'=>1,
'Ucirc'=>1, 'Uuml'=>1, 'Yacute'=>1, 'THORN'=>1,
'szlig'=>1, 'agrave'=>1, 'aacute'=>1, 'acirc'=>1, 'atilde'=>1, 'auml'=>1,
'aring'=>1, 'aelig'=>1, 'ccedil'=>1, 'egrave'=>1,
'eacute'=>1, 'ecirc'=>1, 'euml'=>1, 'igrave'=>1, 'iacute'=>1, 'icirc'=>1,
'iuml'=>1, 'eth'=>1, 'ntilde'=>1, 'ograve'=>1, 'oacute'=>1,
'ocirc'=>1, 'otilde'=>1, 'ouml'=>1, 'divide'=>1, 'oslash'=>1, 'ugrave'=>1,
'uacute'=>1, 'ucirc'=>1, 'uuml'=>1, 'yacute'=>1,
'thorn'=>1, 'yuml'=>1, 'fnof'=>1, 'Alpha'=>1, 'Beta'=>1, 'Gamma'=>1,
'Delta'=>1, 'Epsilon'=>1, 'Zeta'=>1, 'Eta'=>1, 'Theta'=>1, 'Iota'=>1,
'Kappa'=>1, 'Lambda'=>1, 'Mu'=>1, 'Nu'=>1, 'Xi'=>1, 'Omicron'=>1, 'Pi'=>1,
'Rho'=>1, 'Sigma'=>1, 'Tau'=>1, 'Upsilon'=>1, 'Phi'=>1, 'Chi'=>1,
'Psi'=>1, 'Omega'=>1, 'alpha'=>1, 'beta'=>1, 'gamma'=>1, 'delta'=>1,
'epsilon'=>1, 'zeta'=>1, 'eta'=>1, 'theta'=>1, 'iota'=>1, 'kappa'=>1,
'lambda'=>1, 'mu'=>1, 'nu'=>1, 'xi'=>1, 'omicron'=>1, 'pi'=>1, 'rho'=>1,
'sigmaf'=>1, 'sigma'=>1, 'tau'=>1, 'upsilon'=>1, 'phi'=>1, 'chi'=>1,
'psi'=>1, 'omega'=>1, 'thetasym'=>1, 'upsih'=>1, 'piv'=>1, 'bull'=>1,
'hellip'=>1, 'prime'=>1, 'Prime'=>1, 'oline'=>1, 'frasl'=>1,
'weierp'=>1, 'image'=>1, 'real'=>1, 'trade'=>1, 'alefsym'=>1, 'larr'=>1,
'uarr'=>1, 'rarr'=>1, 'darr'=>1, 'harr'=>1, 'crarr'=>1,
'lArr'=>1, 'uArr'=>1, 'rArr'=>1, 'dArr'=>1, 'hArr'=>1, 'forall'=>1,
'part'=>1, 'exist'=>1, 'empty'=>1, 'nabla'=>1, 'isin'=>1, 'notin'=>1,
'ni'=>1, 'prod'=>1, 'sum'=>1, 'minus'=>1, 'lowast'=>1, 'radic'=>1, 'prop'=>1,
'infin'=>1, 'ang'=>1, 'and'=>1, 'or'=>1, 'cap'=>1, 'cup'=>1, 'int'=>1,
'there4'=>1, 'sim'=>1, 'cong'=>1, 'asymp'=>1, 'ne'=>1, 'equiv'=>1, 'le'=>1,
'ge'=>1, 'sub'=>1, 'sup'=>1, 'nsub'=>1, 'sube'=>1, 'supe'=>1,
'oplus'=>1, 'otimes'=>1, 'perp'=>1, 'sdot'=>1, 'lceil'=>1, 'rceil'=>1,
'lfloor'=>1, 'rfloor'=>1, 'lang'=>1, 'rang'=>1, 'loz'=>1,
'spades'=>1, 'clubs'=>1, 'hearts'=>1, 'diams'=>1, 'OElig'=>1, 'oelig'=>1,
'Scaron'=>1, 'scaron'=>1, 'Yuml'=>1, 'circ'=>1, 'tilde'=>1,
'ensp'=>1, 'emsp'=>1, 'thinsp'=>1, 'zwnj'=>1, 'zwj'=>1, 'lrm'=>1, 'rlm'=>1,
'ndash'=>1, 'mdash'=>1, 'lsquo'=>1, 'rsquo'=>1, 'sbquo'=>1,
'ldquo'=>1, 'rdquo'=>1, 'bdquo'=>1, 'dagger'=>1, 'Dagger'=>1, 'permil'=>1,
'lsaquo'=>1, 'rsaquo'=>1, 'euro'=>1);
$strict_form_accessibility=false; // Form fields may not be empty with this strict rule
global $POSSIBLY_EMPTY_TAGS;
$POSSIBLY_EMPTY_TAGS=array(
'a'=>1, // When it's an anchor only - we will detect this with custom code
'div'=>1,
'span'=>1,
// 'p'=>1, // Sometimes we need to do an empty-p to workaround browser bugs
'td'=>1,
'th'=>1, // Only use for 'corner' ones
'textarea'=>1,
'button'=>1,
'script'=>1, // If we have one of these as self-closing in IE... it kills it!
);
if ($strict_form_accessibility) unset($POSSIBLY_EMPTY_TAGS['textarea']);
global $MUST_SELFCLOSE_TAGS;
$MUST_SELFCLOSE_TAGS=array(
'img'=>1,
'hr'=>1,
'br'=>1,
'param'=>1,
'input'=>1,
'base'=>1,
'link'=>1,
'meta'=>1,
'area'=>1,
'col'=>1,
'source'=>1,
'nobr'=>1,
);
// B's may not appear under A
global $PROHIBITIONS;
$PROHIBITIONS=array(
'a'=>array('a'),
'button'=>array('input','select','textarea','label','button','form','fieldset','iframe'),
// 'label'=>array('label'), Not sure, but used this for a reason - when we had one label for two things
'p'=>array('p','table','div','form','h1','h2','h3','h4','h5','h6','blockquote','pre','hr'),
'form'=>array('form'),
'em'=>array('em'),
'abbr'=>array('abbr'),
'strong'=>array('strong'),
'label'=>array('label','div'));
// Only B's can be under A
global $ONLY_CHILDREN;
$ONLY_CHILDREN=array(
'ruby'=>array('rbc','rtc','rp'),
'tr'=>array('td','th'),
'thead'=>array('tr'),
'tbody'=>array('tr'),
'tfoot'=>array('tr'),
'table'=>array('tbody','thead','tfoot','colgroup','col','caption'),
'colgroup'=>array('col'),
'select'=>array('option','optgroup'),
'legend'=>array('ins','del'),
//'map'=>array('area'), Apparently no such rule (see w3.org)
'html'=>array('head','body'),
'embed'=>array('noembed'),
'applet'=>array('param'),
'head'=>array('meta','base','basefont','script','link','noscript','map','title','style'),
'ul'=>array('li'),
'ol'=>array('li'),
'menu'=>array('li'),
'dl'=>array('li','dt','dd'),
'dir'=>array('li'),
'hr'=>array(),
'img'=>array(),
'input'=>array(),
'br'=>array(),
'meta'=>array(),
'base'=>array(),
'title'=>array(),
'textarea'=>array(),
'style'=>array(),
'pre'=>array(),
'script'=>array(),
'param'=>array(),
/*'option'=>array(),*/
'area'=>array(),
'link'=>array('link'),
'basefont'=>array(),
'col'=>array()
);
if (get_value('html5')=='1')
{
$ONLY_CHILDREN+=array(
'details'=>array('summary'),
'datalist'=>array('option'),
);
}
// A can only occur underneath B's
global $ONLY_PARENT;
$ONLY_PARENT=array(
'rb'=>array('rbc'),
'rt'=>array('rtc'),
'rbc'=>array('ruby'),
'rtc'=>array('ruby'),
'rp'=>array('ruby'),
'area'=>array('map'),
'base'=>array('head'),
'body'=>array('html'),
'head'=>array('html'),
'param'=>array('script','object'),
'link'=>array('head','link'),
'li'=>array('ul','ol','dd','menu','dt','dl','dir'),
'style'=>array('head'),
'tbody'=>array('table'),
'tfoot'=>array('table'),
'thead'=>array('table'),
'th'=>array('tr'),
'td'=>array('tr'),
'tr'=>array('table','thead','tbody','tfoot'),
'title'=>array('head'),
'caption'=>array('table'),
'col'=>array('colgroup','table'),
'colgroup'=>array('table'),
'option'=>array('select','optgroup','datalist'),
'noembed'=>array('embed'),
);
if (get_value('html5')=='1')
{
$ONLY_PARENT+=array(
'figcaption'=>array('figure'),
'summary'=>array('details'),
);
} else
{
$ONLY_PARENT+=array(
'meta'=>array('head'),
);
}
global $REQUIRE_ANCESTER;
$REQUIRE_ANCESTER=array(
'textarea'=>'form',
'input'=>'form',
// 'button'=>'form',
'option'=>'form',
'optgroup'=>'form',
'select'=>'form',
);
global $TEXT_NO_BLOCK;
$TEXT_NO_BLOCK=array(
'table'=>1,
'tr'=>1,
'tfoot'=>1,
'thead'=>1,
'ul'=>1,
'ol'=>1,
'dl'=>1,
'optgroup'=>1,
'select'=>1,
'colgroup'=>1,
'map'=>1,
'body'=>1,
'form'=>1,
);
if (get_value('html5')=='1')
{
$TEXT_NO_BLOCK+=array(
'menu'=>1,
);
}
define('IN_XML_TAG',-3);
define('IN_DTD_TAG',-2);
define('NO_MANS_LAND',-1);
define('IN_COMMENT',0);
define('IN_TAG_NAME',1);
define('STARTING_TAG',2);
define('IN_TAG_BETWEEN_ATTRIBUTES',3);
define('IN_TAG_ATTRIBUTE_NAME',4);
define('IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_LEFT',5);
define('IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_RIGHT',7);
define('IN_TAG_ATTRIBUTE_VALUE_BIG_QUOTES',10);
define('IN_TAG_ATTRIBUTE_VALUE_NO_QUOTES',12);
define('IN_TAG_EMBEDDED_COMMENT',9);
define('IN_TAG_ATTRIBUTE_VALUE_LITTLE_QUOTES',8);
define('IN_CDATA',11);
}
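The nesting tables set up above ($PROHIBITIONS, $ONLY_CHILDREN, $ONLY_PARENT) are easier to follow with a small lookup in hand. The following is a minimal sketch and not ocPortal code: the function name is hypothetical, the tables are short excerpts of the ones in the listing, and it assumes each rule is applied to the direct parent only (the real checker may well consider further ancestors).

```php
<?php
/**
 * Check one parent/child pair against nesting tables shaped like those in init__validation.
 * (Hypothetical helper for illustration. Assumption: only the direct parent is considered.)
 *
 * @param  string $parent Name of the parent tag
 * @param  string $child Name of the child tag
 * @param  array $prohibitions Map: tag => list of tags that may not appear under it
 * @param  array $only_children Map: tag => list of the only tags allowed under it
 * @param  array $only_parent Map: tag => list of the only tags it may appear under
 * @return array List of the rule names that the pair breaks
 */
function nesting_problems_sketch($parent,$child,$prohibitions,$only_children,$only_parent)
{
    $problems=array();
    // "B's may not appear under A"
    if ((isset($prohibitions[$parent])) && (in_array($child,$prohibitions[$parent]))) $problems[]='prohibited';
    // "Only B's can be under A"
    if ((isset($only_children[$parent])) && (!in_array($child,$only_children[$parent]))) $problems[]='only_children';
    // "A can only occur underneath B's"
    if ((isset($only_parent[$child])) && (!in_array($parent,$only_parent[$child]))) $problems[]='only_parent';
    return $problems;
}

// Short excerpts of the tables from the listing
$prohibitions=array('a'=>array('a'),'form'=>array('form'));
$only_children=array('tr'=>array('td','th'),'ul'=>array('li'));
$only_parent=array('td'=>array('tr'),'li'=>array('ul','ol','dd','menu','dt','dl','dir'));

var_export(nesting_problems_sketch('ul','li',$prohibitions,$only_children,$only_parent)); // Breaks no rule
echo "\n";
var_export(nesting_problems_sketch('ul','td',$prohibitions,$only_children,$only_parent)); // Breaks two rules
echo "\n";
```

Keeping the three rule kinds in separate maps keyed by tag name means each check is a single isset plus an in_array, which matters when it is run for every tag in a page.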
string html_entity_decode(string input, integer quote_style, ?string charset)
Decode the HTML entity encoded input string. Can give warning if unrecognised character set.
Parameters…
| Name | Description | Type | Default value |
| input | The text to decode | string | (required) |
| quote_style | The quote style code | integer | (required) |
| charset | Character set to decode to (NULL: default) | ?string | NULL |
Returns…
| Description | Type |
| The decoded text | string |
function html_entity_decode($input,$quote_style,$charset=NULL)
{
    unset($quote_style);
    unset($charset);

    /* // NB: &nbsp; does not go to <space>. It's not something you use with html escaping, it's for hard-space-formatting. URL's don't contain spaces, but that's due to URL escaping (%20)
    $replace_array=array(
        '&amp;'=>'&',
        '&gt;'=>'>',
        '&lt;'=>'<',
        '&#039;'=>'\'',
        '&quot;'=>'"',
    );
    foreach ($replace_array as $from=>$to)
    {
        $input=str_replace($from,$to,$input);
    }
    return $input;
    */

    $trans_tbl=get_html_translation_table(HTML_ENTITIES);
    $trans_tbl=array_flip($trans_tbl);
    return strtr($input,$trans_tbl);
}
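The flipped translation table approach used in this fallback can be tried in isolation. The following is a minimal sketch, not ocPortal code: the function name decode_entities_sketch is hypothetical, and the flags and character set are passed explicitly (an assumption made here so that the result does not depend on the defaults of the PHP version in use).

```php
<?php
/**
 * Decode named HTML entities using a flipped translation table.
 * (Hypothetical helper for illustration; not part of ocPortal.)
 *
 * @param  string $input The text to decode
 * @return string The decoded text
 */
function decode_entities_sketch($input)
{
    // Map of character => entity, e.g. '<' => '&lt;'
    $table=get_html_translation_table(HTML_ENTITIES,ENT_QUOTES|ENT_HTML401,'UTF-8');
    // Flip it to entity => character, e.g. '&lt;' => '<'
    $table=array_flip($table);
    // strtr makes a single pass over the input, so text it has produced is never decoded a second time
    return strtr($input,$table);
}

echo decode_entities_sketch('&lt;p&gt;Fish &amp; chips&lt;/p&gt;'),"\n"; // <p>Fish & chips</p>
echo decode_entities_sketch('&amp;lt;'),"\n"; // &lt;
```

The second call shows why strtr is a better fit than a chain of str_replace calls: a doubly escaped entity is only unescaped by one level.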
mixed str_word_count(string input, integer format)
Isolate the words in the input string.
Parameters…
| Name | Description | Type | Default value | Values restricted to |
| input | String to count words in | string | (required) | |
| format | The format (0: return the number of words; 1: return a list of the words) | integer | 0 | 0 1 |
Returns…
| Description | Type |
| The number of words (format 0), or a list of the words of the input string (format 1) | mixed |
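function str_word_count($input,$format=0)
{
    // Count words: collapse every run of characters outside the allowed set into a single space, then split on spaces
    $pattern="/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
    $all_words=trim(preg_replace($pattern,' ',$input));
    $a=($all_words=='')?array():explode(' ',$all_words); // Avoid counting one empty "word" for empty input
    return ($format==0)?count($a):$a;
}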
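To see what this fallback treats as a word, it helps to run the same pattern on a short string. The following is a minimal sketch under a hypothetical name (word_count_sketch), with the pattern copied from the listing; the sketch also guards against empty input, so that an empty string counts as zero words.

```php
<?php
/**
 * Split text into words in the same way as the fallback.
 * (Hypothetical name for illustration; the pattern is copied from the listing.)
 *
 * @param  string $input String to count words in
 * @param  integer $format 0 to get the count, 1 to get the list of words
 * @return mixed The count, or the list of words
 */
function word_count_sketch($input,$format=0)
{
    // Any run of characters outside the allowed set (in practice mostly whitespace) becomes one space
    $pattern="/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
    $all_words=trim(preg_replace($pattern,' ',$input));
    // explode() on an empty string would give a list holding one empty string
    $words=($all_words=='')?array():explode(' ',$all_words);
    return ($format==0)?count($words):$words;
}

echo word_count_sketch("Hello,  world! It's   fine."),"\n"; // 4
print_r(word_count_sketch("Hello,  world! It's   fine.",1)); // Punctuation stays attached to the words
```

Note that punctuation such as commas and full stops is in the allowed set, so it stays attached to the neighbouring word rather than being stripped.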
URLPATH qualify_url(URLPATH url, URLPATH url_base)
Take a URL and base-URL, and fully qualify the URL according to it.
Parameters…
| Name | Description | Type | Default value |
| url | The URL to fully qualify | URLPATH | (required) |
| url_base | The base-URL | URLPATH | (required) |
Returns…
| Description | Type |
| Fully qualified URL | URLPATH |
function qualify_url($url,$url_base)
{
    if (($url!='') && ($url[0]!='#') && (substr($url,0,7)!='mailto:'))
    {
        if (strpos($url,'://')===false) // Not already fully qualified
        {
            if ($url[0]=='/')
            {
                $parsed=parse_url($url_base);
                if (!array_key_exists('scheme',$parsed)) $parsed['scheme']='http';
                if (!array_key_exists('host',$parsed)) $parsed['host']='localhost';
                if (substr($url,0,2)=='//') // Protocol-relative URL: only the scheme is missing
                {
                    $url=$parsed['scheme'].':'.$url;
                } else // Relative to the server root
                {
                    $url=$parsed['scheme'].'://'.$parsed['host'].(array_key_exists('port',$parsed)?(':'.$parsed['port']):'').$url;
                }
            } else $url=$url_base.'/'.$url; // Relative to the base URL
        }
    } else return ''; // Empty URLs, in-page anchors and mailto: links are not qualified
    return $url;
}
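Each branch of qualify_url handles a different kind of input, which a few worked cases make concrete. The following is a minimal sketch: qualify_url_sketch is a hypothetical name for a flattened copy of the same logic (so that it runs outside ocPortal), and the example.com URLs are made up for illustration.

```php
<?php
/**
 * Flattened copy of the qualify_url logic, under a hypothetical name.
 *
 * @param  string $url The URL to fully qualify
 * @param  string $url_base The base-URL
 * @return string Fully qualified URL ('' for empty URLs, anchors and mailto: links)
 */
function qualify_url_sketch($url,$url_base)
{
    if (($url=='') || ($url[0]=='#') || (substr($url,0,7)=='mailto:')) return '';
    if (strpos($url,'://')!==false) return $url; // Already fully qualified
    if ($url[0]!='/') return $url_base.'/'.$url; // Relative to the base URL
    $parsed=parse_url($url_base);
    $scheme=isset($parsed['scheme'])?$parsed['scheme']:'http';
    if (substr($url,0,2)=='//') return $scheme.':'.$url; // Protocol-relative
    $host=isset($parsed['host'])?$parsed['host']:'localhost';
    $port=isset($parsed['port'])?(':'.$parsed['port']):'';
    return $scheme.'://'.$host.$port.$url; // Relative to the server root
}

$base='http://example.com:8080/site';
$examples=array('page.htm','/images/logo.png','//cdn.example.com/x.js','http://other.example/y','#top','mailto:someone@example.com');
foreach ($examples as $url)
{
    echo str_pad($url,30),' => ',var_export(qualify_url_sketch($url,$base),true),"\n";
}
```

The loop prints one line per input, so the effect of the base URL's scheme, host and port on each kind of link can be compared side by side.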
?string http_download_file(URLPATH url, ?integer byte_limit, boolean trigger_error, boolean no_redirect, string ua, ?array post_params, ?array cookies, ?string accept, ?string accept_charset, ?string accept_language, ?resource write_to_file, ?string referer, ?array auth, float timeout, boolean is_xml, ?array files)
Return the file in the URL by downloading it over HTTP. If a byte limit is given, it will only download that many bytes. It outputs warnings, returning NULL, on error.
Parameters…
| Name | Description | Type | Default value | Value range |
| url | The URL to download | URLPATH | (required) | |
| byte_limit | The number of bytes to download. This is not a guarantee, it is a minimum (NULL: all bytes) | ?integer | NULL | 1 max |
| trigger_error | Whether to throw an ocPortal error, on error | boolean | true | |
| no_redirect | Whether to block redirects (returns NULL when found) | boolean | false | |
| ua | The user-agent to identify as | string | ocPortal | |
| post_params | An optional array of POST parameters to send; if this is NULL, a GET request is used (NULL: none) | ?array | NULL | |
| cookies | An optional array of cookies to send (NULL: none) | ?array | NULL | |
| accept | 'accept' header value (NULL: don't pass one) | ?string | NULL | |
| accept_charset | 'accept-charset' header value (NULL: don't pass one) | ?string | NULL | |
| accept_language | 'accept-language' header value (NULL: don't pass one) | ?string | NULL | |
| write_to_file | File handle to write to (NULL: do not do that) | ?resource | NULL | |
| referer | The HTTP referer (NULL: none) | ?string | NULL | |
| auth | A pair: authentication username and password (NULL: none) | ?array | NULL | |
| timeout | The timeout | float | 6 | |
| is_xml | Whether to treat the POST parameters as a raw POST (rather than using MIME) | boolean | false | |
| files | Files to send. Map between field to file path (NULL: none) | ?array | NULL | |
Returns…
| Description | Type |
| The data downloaded (NULL: error) | ?string |
function http_download_file($url,$byte_limit=NULL,$trigger_error=true,$no_redirect=false,$ua='ocPortal',$post_params=NULL,$cookies=NULL,$accept=NULL,$accept_charset=NULL,$accept_language=NULL,$write_to_file=NULL,$referer=NULL,$auth=NULL,$timeout=6.0,$is_xml=false,$files=NULL)
{
    @ini_set('allow_url_fopen','1');
    $data=@file_get_contents($url); // Assumes URL-wrappers is on, whilst ocPortal's is much more sophisticated
    return ($data===false)?NULL:$data; // file_get_contents gives false on failure, but this function is documented to give NULL
}
?mixed do_lang(ID_TEXT a, ?mixed param_a, ?mixed param_b, ?mixed param_c, ?LANGUAGE_NAME lang, boolean require_result)
Get the human-readable form of a language id, or a language entry from a language INI file. (STUB)
Parameters…
| Name | Description | Type | Default value |
| a | The language id | ID_TEXT | (required) |
| param_a | The first token [string or tempcode] (replaces {1}) (NULL: none) | ?mixed | NULL |
| param_b | The second token [string or tempcode] (replaces {2}) (NULL: none) | ?mixed | NULL |
| param_c | The third token (replaces {3}). May be an array [of string], to allow any number of additional args (NULL: none) | ?mixed | NULL |
| lang | The language to use (NULL: user's language) | ?LANGUAGE_NAME | NULL |
| require_result | Whether to cause ocPortal to exit if the lookup does not succeed | boolean | true |
Returns…
| Description | Type |
| The human-readable content (NULL: not found). String normally. Tempcode if tempcode parameters. | ?mixed |
function do_lang($a,$param_a=NULL,$param_b=NULL,$param_c=NULL,$lang=NULL,$require_result=true)
{
    if (function_exists('_do_lang')) return _do_lang($a,$param_a,$param_b,$param_c,$lang,$require_result);

    unset($lang);

    switch ($a)
    {
        case 'LINK_NEW_WINDOW':
            return 'new window';
        case 'SPREAD_TABLE':
            return 'Spread table';
        case 'MAP_TABLE':
            return 'Item to value mapper table';
    }
    return array($a,$param_a,$param_b,$param_c);
}
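When _do_lang does not exist, this stub returns a string for the three ids it knows and an array (the id followed by the three parameters) for anything else, so a caller has to cope with both shapes. The following is a minimal sketch of one way to do that. Both function names are hypothetical, and the first is a cut-down copy of the stub so that the example runs outside ocPortal.

```php
<?php
/**
 * Cut-down copy of the do_lang stub, under a hypothetical name.
 */
function do_lang_stub_sketch($a,$param_a=NULL,$param_b=NULL,$param_c=NULL)
{
    switch ($a)
    {
        case 'LINK_NEW_WINDOW':
            return 'new window';
        case 'SPREAD_TABLE':
            return 'Spread table';
        case 'MAP_TABLE':
            return 'Item to value mapper table';
    }
    return array($a,$param_a,$param_b,$param_c);
}

/**
 * Turn either shape of result into something printable.
 * (Hypothetical helper for illustration; not part of ocPortal.)
 */
function describe_lang_result_sketch($result)
{
    if (is_string($result)) return $result;

    // Not known to the stub: show the id, followed by any parameters that were given
    $id=array_shift($result);
    $params=array_filter($result,function ($p) { return !is_null($p); });
    return $id.((count($params)==0)?'':(' ('.implode(', ',$params).')'));
}

echo describe_lang_result_sketch(do_lang_stub_sketch('LINK_NEW_WINDOW')),"\n"; // new window
echo describe_lang_result_sketch(do_lang_stub_sketch('XHTML_ONLY_ONE_ALLOWED','title')),"\n"; // XHTML_ONLY_ONE_ALLOWED (title)
```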
string get_forum_type()
Get the type of forums installed.
Parameters…
Returns…
| Description | Type |
| The type of forum installed | string |
function get_forum_type()
{
    return 'none';
}
string ocp_srv(string value)
Get server environment variables. (STUB)
Parameters…
| Name | Description | Type | Default value |
| value | The variable name | string | (required) |
Returns…
| Description | Type |
| The variable value ('' means unknown) | string |
function ocp_srv($value)
{
    return '';
}
string mailto_obfuscated()
Get obfuscated version of 'mailto:' (which will hopefully fool e-mail scavengers into not picking up these e-mail addresses).
Parameters…
Returns…
| Description | Type |
| The obfuscated 'mailto:' string | string |
function mailto_obfuscated()
{
    return 'mailto:'; // This fallback applies no obfuscation
}
?mixed mixed()
Assign this to explicitly declare that a variable may be of mixed type, and initialise to NULL.
Parameters…
Returns…
| Description | Type |
| Of mixed type (NULL: default) | ?mixed |
function mixed()
{
    return NULL;
}
?map check_xhtml(string out, boolean well_formed_only, boolean is_fragment, boolean validation_javascript, boolean validation_css, boolean validation_wcag, boolean validation_compat, boolean validation_ext_files, boolean validation_manual)
Check the specified XHTML, and return the results.
Parameters…
| Name | Description | Type | Default value |
| out | The XHTML to validate | string | (required) |
| well_formed_only | Whether to avoid checking for relational errors (true implies just a quick structural check, aka a 'well formed' check) | boolean | false |
| is_fragment | Whether what is being validated is an HTML fragment, rather than a whole document | boolean | false |
| validation_javascript | Validate JavaScript | boolean | true |
| validation_css | Validate CSS | boolean | true |
| validation_wcag | Validate WCAG | boolean | true |
| validation_compat | Validate for compatibility | boolean | true |
| validation_ext_files | Validate external files | boolean | true |
| validation_manual | Bring up messages about manual checks | boolean | false |
Returns…
| Description | Type |
| Error information (NULL: no error) | ?map |
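A caller mostly needs to tell NULL (no error) apart from an error map. The following is a minimal sketch of that, not ocPortal code: the helper name is hypothetical, the four keys are the ones visible in the early return in the function's source, and the sample map is made-up data (a real entry under 'errors' is whatever _xhtml_error returns, which is treated as opaque here).

```php
<?php
/**
 * Summarise a result map of the shape returned by check_xhtml.
 * (Hypothetical helper for illustration; not part of ocPortal.)
 *
 * @param  ?array $result The result map (NULL: no error)
 * @return string A one-line summary
 */
function summarise_validation_sketch($result)
{
    if (is_null($result)) return 'No errors';

    $num_errors=array_key_exists('errors',$result)?count($result['errors']):0;
    $num_tags=array_key_exists('level_ranges',$result)?count($result['level_ranges']):0;
    return $num_errors.' error(s) across '.$num_tags.' non-closing tag(s)';
}

// Made-up sample data, shaped like the map that check_xhtml builds
$sample=array(
    'level_ranges'=>array(array(0,0,6),array(1,6,12)), // Each entry: stack depth, tag start position, tag end position
    'tag_ranges'=>array(),
    'value_ranges'=>array(),
    'errors'=>array(array('placeholder'=>'opaque error map')),
);

echo summarise_validation_sketch($sample),"\n";
echo summarise_validation_sketch(NULL),"\n";
```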
function check_xhtml($out,$well_formed_only=false,$is_fragment=false,$validation_javascript=true,$validation_css=true,$validation_wcag=true,$validation_compat=true,$validation_ext_files=true,$validation_manual=false)
{
global $XHTML_VALIDATOR_OFF,$WELL_FORMED_ONLY,$VALIDATION_JAVASCRIPT,$VALIDATION_CSS,$VALIDATION_WCAG,$VALIDATION_COMPAT,$VALIDATION_EXT_FILES,$VALIDATION_MANUAL,$UNDER_XMLNS;
$XHTML_VALIDATOR_OFF=mixed();
$WELL_FORMED_ONLY=$well_formed_only;
if (!$WELL_FORMED_ONLY)
{
require_code('validation2');
}
$VALIDATION_JAVASCRIPT=$validation_javascript;
$VALIDATION_CSS=$validation_css;
$VALIDATION_WCAG=$validation_wcag;
$VALIDATION_COMPAT=$validation_compat;
$VALIDATION_EXT_FILES=$validation_ext_files;
$VALIDATION_MANUAL=$validation_manual;
global $IDS_SO_FAR;
$IDS_SO_FAR=array();
$content_start_stack=array();
global $BLOCK_CONSTRAIN,$XML_CONSTRAIN,$LAST_TAG_ATTRIBUTES,$FOUND_DOCTYPE,$FOUND_DESCRIPTION,$FOUND_KEYWORDS,$FOUND_CONTENTTYPE,$THE_DOCTYPE,$TAGS_DEPRECATE_ALLOW,$URL_BASE,$PARENT_TAG,$TABS_SEEN,$KEYS_SEEN,$ANCHORS_SEEN,$ATT_STACK,$TAG_STACK,$POS,$LINENO,$LINESTART,$OUT,$T_POS,$PROHIBITIONS,$ONLY_PARENT,$ONLY_CHILDREN,$REQUIRE_ANCESTER,$LEN,$ANCESTER_BLOCK,$ANCESTER_INLINE,$POSSIBLY_EMPTY_TAGS,$MUST_SELFCLOSE_TAGS,$FOR_LABEL_IDS,$FOR_LABEL_IDS_2,$INPUT_TAG_IDS;
global $TAG_RANGES,$VALUE_RANGES,$LAST_A_TAG,$A_LINKS,$XHTML_FORM_ENCODING;
global $AREA_LINKS,$LAST_HEADING,$CRAWLED_URLS,$HYPERLINK_URLS,$EMBED_URLS,$THE_LANGUAGE,$PSPELL_LINK;
global $TAGS_BLOCK,$TAGS_INLINE,$TAGS_NORMAL,$TAGS_BLOCK_DEPRECATED,$TAGS_INLINE_DEPRECATED,$TAGS_NORMAL_DEPRECATED;
$PSPELL_LINK=NULL;
$THE_LANGUAGE='en';
$THE_DOCTYPE=$is_fragment?DOCTYPE_XHTML:DOCTYPE_HTML;
$TAGS_DEPRECATE_ALLOW=true;
$XML_CONSTRAIN=$is_fragment;
$BLOCK_CONSTRAIN=false;
$LINENO=0;
$LINESTART=0;
$HYPERLINK_URLS=array();
$EMBED_URLS=array();
$AREA_LINKS=array();
$LAST_HEADING=0;
$FOUND_DOCTYPE=false;
$FOUND_CONTENTTYPE=false;
$FOUND_KEYWORDS=false;
$FOUND_DESCRIPTION=false;
$CRAWLED_URLS=array();
$PARENT_TAG='';
$XHTML_FORM_ENCODING='';
$UNDER_XMLNS=false;
$KEYS_SEEN=array();
$TABS_SEEN=array();
$TAG_RANGES=array();
$VALUE_RANGES=array();
$LAST_A_TAG=NULL;
$ANCHORS_SEEN=array();
$FOR_LABEL_IDS=array();
$FOR_LABEL_IDS_2=array();
$INPUT_TAG_IDS=array();
$TAG_STACK=array();
$ATT_STACK=array();
$ANCESTER_BLOCK=0;
$ANCESTER_INLINE=0;
$POS=0;
$OUT=$out;
unset($out);
$LEN=strlen($OUT);
$level_ranges=array();
$stack_size=0;
$to_find=array('html'=>1,'head'=>1,'title'=>1/*,'meta'=>1*/);
$only_one_of_stack=array();
$only_one_of_template=array('title'=>1,'head'=>1,'body'=>1,'base'=>1,'thead'=>1,'tfoot'=>1);
$only_one_of=$only_one_of_template;
$A_LINKS=array();
$previous='';
if (!isset($GLOBALS['MAIL_MODE'])) $GLOBALS['MAIL_MODE']=false;
$errors=array();
$bad_root=false;
$token=_get_next_tag();
while (!is_null($token))
{
// echo $T_POS.'-'.$POS.' ('.$stack_size.')<br />';
if ((is_array($token)) && (count($token)!=0)) // Some kind of error in our token
{
if (is_null($XHTML_VALIDATOR_OFF))
{
foreach ($token[1] as $error)
{
$errors[]=_xhtml_error($error[0],array_key_exists(1,$error)?$error[1]:'',array_key_exists(2,$error)?$error[2]:'',array_key_exists(3,$error)?$error[3]:'',array_key_exists('raw',$error)?$error['raw']:false,array_key_exists('pos',$error)?$error['pos']:0);
}
if (is_null($token[0])) return array('level_ranges'=>$level_ranges,'tag_ranges'=>$TAG_RANGES,'value_ranges'=>$VALUE_RANGES,'errors'=>$errors);
}
$token=$token[0];
}
$basis_token=_get_tag_basis($token);
// Open, close, or self-closing ("monotonic")?
$term=strpos($token,'/');
if (!is_null($XHTML_VALIDATOR_OFF))
{
if ($term===false) $XHTML_VALIDATOR_OFF++;
elseif ($term==1)
{
if ($XHTML_VALIDATOR_OFF==0)
{
$XHTML_VALIDATOR_OFF=NULL;
} else
{
$XHTML_VALIDATOR_OFF--;
}
}
}
if ($term!==1)
{
if (isset($only_one_of[$basis_token]))
{
if ($only_one_of[$basis_token]==0) $errors[]=_xhtml_error('XHTML_ONLY_ONE_ALLOWED',$basis_token);
$only_one_of[$basis_token]--;
}
// echo 'Push $basis_token<br />';
$level_ranges[]=array($stack_size,$T_POS,$POS);
if (isset($to_find[$basis_token])) unset($to_find[$basis_token]);
if ((!$WELL_FORMED_ONLY) && (is_null($XHTML_VALIDATOR_OFF)))
{
if (((!$is_fragment) && ($stack_size==0)) && ($basis_token!='html'))
{
$errors[]=_xhtml_error('XHTML_BAD_ROOT');
$bad_root=true;
}
if ($stack_size!=0)
{
if (isset($ONLY_CHILDREN[$PARENT_TAG]))
{
if (!in_array($basis_token,$ONLY_CHILDREN[$PARENT_TAG]))
$errors[]=_xhtml_error('XHTML_BAD_CHILD',$basis_token,$PARENT_TAG);
}
/*if (isset($PROHIBITIONS[$PARENT_TAG]))
{
$prohibitions=$PROHIBITIONS[$PARENT_TAG];
if (in_array($basis_token,$prohibitions)) $errors[]=_xhtml_error('XHTML_PROHIBITION',$basis_token,$PARENT_TAG);
}*/
foreach ($TAG_STACK as $parent_tag)
{
if (isset($PROHIBITIONS[$parent_tag]))
{
$prohibitions=$PROHIBITIONS[$parent_tag];
if (in_array($basis_token,$prohibitions)) $errors[]=_xhtml_error('XHTML_PROHIBITION',$basis_token,$parent_tag);
}
}
}
if ((isset($REQUIRE_ANCESTER[$basis_token])) && (!$is_fragment))
{
if (!in_array($REQUIRE_ANCESTER[$basis_token],$TAG_STACK)) $errors[]=_xhtml_error('XHTML_MISSING_ANCESTER',$basis_token,$REQUIRE_ANCESTER[$basis_token]);
}
if (isset($ONLY_PARENT[$basis_token]))
{
if ($stack_size==0)
{
if (!$is_fragment) $errors[]=_xhtml_error('XHTML_BAD_PARENT',$basis_token,'/');
} else
{
if (!in_array($PARENT_TAG,$ONLY_PARENT[$basis_token])) $errors[]=_xhtml_error('XHTML_BAD_PARENT',$basis_token,$PARENT_TAG);
}
}
}
// In order to ease validation, we tolerate these in the parser (but of course, mark as errors)
if ((is_null($XHTML_VALIDATOR_OFF)) && (!$WELL_FORMED_ONLY) && ($term===false) && (isset($MUST_SELFCLOSE_TAGS[$basis_token])))
{
if ($XML_CONSTRAIN) $errors[]=_xhtml_error('XHTML_NONEMPTY_TAG',$basis_token);
}
else
{
if ($term===false)
{
$PARENT_TAG=$basis_token;
array_push($TAG_STACK,$basis_token);
array_push($ATT_STACK,$LAST_TAG_ATTRIBUTES);
array_push($content_start_stack,$POS);
array_push($only_one_of_stack,$only_one_of);
$only_one_of=$only_one_of_template;
++$stack_size;
} else
{
if ((is_null($XHTML_VALIDATOR_OFF)) && (!$WELL_FORMED_ONLY) && ((!$XML_CONSTRAIN) || (!isset($MUST_SELFCLOSE_TAGS[$basis_token])))/* && (!in_array($basis_token,array('a')))*/) // A tags must not self-close even when only an anchor; doing so produces a stray underlined line in Firefox
{
if (!$bad_root)
$errors[]=_xhtml_error('XHTML_CEMPTY_TAG',$basis_token);
}
}
}
}
elseif ($term==1) // Check it is the closing tag for the tag at the top of the stack
{
// HTML allows implicit closing. We will flag errors when we have to do it. See 1-2-3 note
do
{
// For case 3 (see note below)
if (!in_array($basis_token,$TAG_STACK))
{
if ((is_null($XHTML_VALIDATOR_OFF)) && ($XML_CONSTRAIN)) $errors[]=_xhtml_error('XML_NO_CLOSE_MATCH',$basis_token,$previous);
break;
}
$previous=array_pop($TAG_STACK);
$PARENT_TAG=($TAG_STACK==array())?'':$TAG_STACK[count($TAG_STACK)-1];
$start_pos=array_pop($content_start_stack);
array_pop($ATT_STACK);
$only_one_of=array_pop($only_one_of_stack);
if (is_null($previous))
{
if ((is_null($XHTML_VALIDATOR_OFF)) && ($XML_CONSTRAIN)) $errors[]=_xhtml_error('XML_MORE_CLOSE_THAN_OPEN',$basis_token);
break;
}
if ($basis_token!=$previous)
{
// This is really tricky, and totally non-compliant with XHTML. There are three situations:
// 1) Overlapping tags. We really can't survive this, and it's very invalid. We could only detect it if we broke support for cases (2) and (3). e.g. <i><b></i></b>
// 2) Implicit closing. We close everything implicitly until we find the matching tag. E.g. <i><b></i>
// 3) Closing something that was never open. This is tricky - we can't survive it if it was opened somewhere as a parent, as we'd end up closing a whole load of tags by rule (2) - but if it's a lone closing, we can skip it. Good e.g. <b></i></b>. Bad e.g. <div><p></div></p></div>
if ((is_null($XHTML_VALIDATOR_OFF)) && ($XML_CONSTRAIN)) $errors[]=_xhtml_error('XML_NO_CLOSE_MATCH',$basis_token,$previous);
}
if ((!$WELL_FORMED_ONLY) && (is_null($XHTML_VALIDATOR_OFF)))
{
if ((isset($MUST_SELFCLOSE_TAGS[$previous])) && ($XML_CONSTRAIN))
{
$errors[]=_xhtml_error('XHTML_NONEMPTY_TAG',$previous);
}
if ((!isset($MUST_SELFCLOSE_TAGS[$previous])) && (!isset($POSSIBLY_EMPTY_TAGS[$previous])) && (trim(substr($OUT,$start_pos,$T_POS-$start_pos))==''))
{
if ((isset($TAGS_BLOCK[$previous])) || (isset($TAGS_INLINE[$previous])) || (isset($TAGS_NORMAL[$previous])) || (isset($TAGS_BLOCK_DEPRECATED[$previous])) || (isset($TAGS_INLINE_DEPRECATED[$previous])) || (isset($TAGS_NORMAL_DEPRECATED[$previous])))
$errors[]=_xhtml_error('XHTML_EMPTY_TAG',$previous);
}
}
$stack_size--;
$level_ranges[]=array($stack_size,$T_POS,$POS);
// echo 'Popped $previous<br />';
if ((is_null($XHTML_VALIDATOR_OFF)) && (!$WELL_FORMED_ONLY))
{
if ($previous=='script')
{
$tag_contents=substr($OUT,$start_pos,$T_POS-$start_pos);
$c_section=strpos($tag_contents,']]>');
if ((trim($tag_contents)!='') && (strpos($tag_contents,'//-->')===false) && (strpos($tag_contents,'// -->')===false) && ($c_section===false))
{
$errors[]=_xhtml_error('XHTML_SCRIPT_COMMENTING',$previous);
} elseif (($c_section===false) && ((strpos($tag_contents,'<!--')!==false)))
{
if ($XML_CONSTRAIN) $errors[]=_xhtml_error('XHTML_CDATA');
}
if (/*(!$c_section) && */(strpos($tag_contents,'</')!==false)) $errors[]=_xhtml_error('XML_JS_TAG_ESCAPE');
}
}
}
while ($basis_token!=$previous);
}
/*else
{
$level_ranges[]=array($stack_size,$T_POS,$POS);
// it's self-closing ("monotonic"), so ignore
}*/
$token=_get_next_tag();
}
// Check we have everything closed
if ($stack_size!=0)
{
if ($XML_CONSTRAIN) $errors[]=_xhtml_error('XML_NO_CLOSE',array_pop($TAG_STACK));
return array('level_ranges'=>$level_ranges,'tag_ranges'=>$TAG_RANGES,'value_ranges'=>$VALUE_RANGES,'errors'=>$errors);
}
if (!$well_formed_only)
// if ((is_null($XHTML_VALIDATOR_OFF)) || (!$well_formed_only)) // validator-off check needed because it's possible a non-validateable portion foobars up possibility of interpreting the rest of the document such that checking ends early
{
if (!$is_fragment)
{
foreach (array_keys($to_find) as $tag)
$errors[]=_xhtml_error('XHTML_MISSING_TAG',$tag);
if ((!$FOUND_DOCTYPE) && (!$GLOBALS['MAIL_MODE'])) $errors[]=_xhtml_error('XHTML_DOCTYPE');
if (($FOUND_DOCTYPE) && ($GLOBALS['MAIL_MODE'])) $errors[]=_xhtml_error('MAIL_DOCTYPE');
if (!$FOUND_CONTENTTYPE) $errors[]=_xhtml_error('XHTML_CONTENTTYPE');
//if (!$FOUND_KEYWORDS) $errors[]=_xhtml_error('XHTML_KEYWORDS');
//if (!$FOUND_DESCRIPTION) $errors[]=_xhtml_error('XHTML_DESCRIPTION');
}
if (!$is_fragment)
{
// Check that all area-links have a corresponding hyperlink
foreach (array_keys($AREA_LINKS) as $id)
{
if (!in_array($id,$HYPERLINK_URLS)) $errors[]=_xhtml_error('WCAG_AREA_EQUIV',$id);
}
// Check that all labels apply to real input tags
foreach (array_keys($FOR_LABEL_IDS_2) as $id)
{
if (!isset($INPUT_TAG_IDS[$id])) $errors[]=_xhtml_error('XHTML_ID_UNBOUND',$id);
}
}
}
// Main spelling
if ((function_exists('pspell_new')) && (isset($GLOBALS['SPELLING'])))
{
$stripped=$OUT;
$matches=array();
$num_matches=preg_match_all('#\<style.*\</style\>#Umis',$stripped,$matches);
for ($i=0;$i<$num_matches;$i++)
{
$stripped=str_replace($matches[0][$i],str_repeat(' ',strlen($matches[0][$i])),$stripped);
}
$num_matches=preg_match_all('#\<script.*\</script\>#Umis',$stripped,$matches);
for ($i=0;$i<$num_matches;$i++)
{
$stripped=str_replace($matches[0][$i],str_repeat(' ',strlen($matches[0][$i])),$stripped);
}
$stripped=@html_entity_decode(strip_tags($stripped),ENT_QUOTES,get_charset());
require_code('validation2');
$new_errors=validate_spelling($stripped);
$misspellings=array();
global $POS,$LINENO,$LINESTART;
foreach ($new_errors as $error)
{
if (array_key_exists($error[1],$misspellings)) continue;
$misspellings[$error[1]]=1;
$POS=strpos($OUT,$error[1]);
$LINESTART=strrpos(substr($OUT,0,$POS),chr(10));
$LINENO=substr_count(substr($OUT,0,$LINESTART),chr(10))+1;
$errors[]=_xhtml_error($error[0],$error[1]);
}
}
unset($OUT);
return array('level_ranges'=>$level_ranges,'tag_ranges'=>$TAG_RANGES,'value_ranges'=>$VALUE_RANGES,'errors'=>$errors);
}
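The close-tag rules in the 1-2-3 note above can be illustrated in isolation. The following is a minimal sketch rather than the validator's own code: `close_tag_sketch` is a hypothetical helper working on a plain array stack, and the message strings are illustrative only. It covers case 2 (implicit closing) and case 3 (a lone closing tag); like the validator, it cannot distinguish case 1 from case 2.

```php
<?php
// Hypothetical sketch of the close-tag rules from the 1-2-3 note.
// $stack holds the currently open tag names (innermost last).
// Returns a list of problems found while handling the closing tag.
function close_tag_sketch(array &$stack, string $closing): array
{
    $problems = array();

    // Case 3: the tag was never opened, so skip the lone closing tag.
    if (!in_array($closing, $stack, true)) {
        $problems[] = "no open <$closing> to close";
        return $problems;
    }

    // Case 2: pop until the matching tag is found, flagging each implicit close.
    while (($top = array_pop($stack)) !== $closing) {
        $problems[] = "<$top> implicitly closed by </$closing>";
    }

    return $problems;
}

// <i><b></i> : the <b> is closed implicitly
$stack = array('i', 'b');
$a = close_tag_sketch($stack, 'i');

// <b></i> : lone closing tag, the stack is left untouched
$stack2 = array('b');
$b = close_tag_sketch($stack2, 'i');
```

The membership test before popping is what keeps a stray closing tag from unwinding the whole stack.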
map _xhtml_error(string error, string param_a, string param_b, string param_c, boolean raw, integer rel_pos)
Get some general debugging information for an identified XHTML error.
Parameters…
| Name | Description | Default value | Type |
|---|---|---|---|
| error | The error that occurred | (required) | string |
| param_a | The first parameter of the error | (empty string) | string |
| param_b | The second parameter of the error | (empty string) | string |
| param_c | The third parameter of the error | (empty string) | string |
| raw | Whether to skip the language lookup and use the error text as given | false | boolean |
| rel_pos | Offset of the error relative to the current parse position | 0 | integer |
Returns…
| Description | Type |
|---|---|
| A map of the error information | map |
function _xhtml_error($error,$param_a='',$param_b='',$param_c='',$raw=false,$rel_pos=0)
{
    global $POS,$OUT,$LINENO,$LINESTART;
    // Count any extra line breaks between the parse position and the error itself
    $lineno=($rel_pos==0)?0:substr_count(substr($OUT,$POS,$rel_pos),chr(10));
    $out=array();
    $out['line']=$LINENO+1+$lineno;
    if ($rel_pos==0)
    {
        $out['pos']=$POS-$LINESTART;
    } else
    {
        $out['pos']=$POS+$rel_pos-strrpos(substr($OUT,0,$POS+$rel_pos),chr(10));
    }
    $out['global_pos']=$POS+$rel_pos;
    $out['error']=$raw?$error:do_lang($error,htmlentities($param_a),htmlentities($param_b),htmlentities($param_c));
    return $out;
}
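The line and column arithmetic in `_xhtml_error` depends on parser globals, which makes it hard to read on its own. As a rough illustration only, the sketch below derives a line and column from a byte offset in a string. `offset_to_line_col` is a hypothetical helper, and its 1-based column convention is an assumption, not necessarily the validator's.

```php
<?php
// Hypothetical helper: convert a byte offset into a 1-based line and column.
function offset_to_line_col(string $text, int $offset): array
{
    $before = substr($text, 0, $offset);

    // Each newline before the offset moves us down one line.
    $line = substr_count($before, "\n") + 1;

    // The column is the distance from the last newline (or from the start of the text).
    $last_newline = strrpos($before, "\n");
    $col = ($last_newline === false) ? $offset + 1 : $offset - $last_newline;

    return array('line' => $line, 'pos' => $col);
}

// Offset 4 is the '<' of '<b>' on the second line.
$where = offset_to_line_col("<p>\n<b>x</p>", 4);
$start = offset_to_line_col("<p>\n<b>x</p>", 0);
```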
boolean is_hex(string string)
Checks to see if a string holds a hexadecimal number.
Parameters…
| Name | Description | Type |
|---|---|---|
| string | The string to check | string |
Returns…
| Description | Type |
|---|---|
| Whether the string holds a hexadecimal number | boolean |
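function is_hex($string)
{
    // One or more hex digits, in either case (so that references such as &#xA0; are accepted, and an empty string is not)
    return preg_match('#^[0-9a-fA-F]+$#',$string)!=0;
}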
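A few sample inputs make the intended behaviour of a hexadecimal check concrete. The function below is a hypothetical stand-alone variant for illustration, not the library function. It assumes that both upper and lower case digits should be accepted and that an empty string should not.

```php
<?php
// Hypothetical stand-alone check: one or more hex digits, either case.
// The D modifier stops '$' from matching before a trailing newline.
function is_hex_sketch(string $s): bool
{
    return preg_match('/^[0-9a-f]+$/iD', $s) === 1;
}

$results = array(
    'ff'  => is_hex_sketch('ff'),   // lower-case digits
    'A0'  => is_hex_sketch('A0'),   // upper-case digits, as in &#xA0;
    'xyz' => is_hex_sketch('xyz'),  // not hexadecimal
    ''    => is_hex_sketch(''),     // empty input
);
```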
?mixed test_entity(integer offset)
Test the next entity in the output stream.
Parameters…
| Name | Description | Default value | Type |
|---|---|---|---|
| offset | Checking offset | 0 | integer |
Returns…
| Description | Type |
|---|---|
| An array of error details (NULL: no errors) | ?mixed |
function test_entity($offset=0)
{
    global $OUT,$POS,$ENTITIES;
    $lump=substr($OUT,$POS+$offset,8);
    $errors=array();
    $pos=strpos($lump,';');
    //if ($pos!==0) // "&; sequence" is possible. It's in IPB's posts and to do with emoticon meta tagging
    {
        if ($pos===false)
        {
            $errors[]=array('XHTML_BAD_ENTITY');
        } else
        {
            $lump=substr($lump,0,$pos);
            // substr() is used instead of direct string offsets so that an empty or one-character $lump ("&;" or "&#;") cannot trigger an uninitialised offset
            if (!((substr($lump,0,1)=='#') && ((is_numeric(substr($lump,1))) || ((substr($lump,1,1)=='x') && (is_hex(substr($lump,2))))))) // It's ok if this is a numeric code, so no need to check further
            {
                // Check against list
                if (!isset($ENTITIES[$lump]))
                {
                    $errors[]=array('XHTML_BAD_ENTITY');
                }
            }
        }
    }
    if (count($errors)==0) return NULL;
    return $errors;
}
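The entity test reads from the global output buffer, so it is easier to see the decision it makes in a self-contained form. The sketch below is an approximation under stated assumptions: `entity_ok_sketch` is a hypothetical function, `$known` is a tiny stand-in for the `$ENTITIES` table, and it treats an empty `&;` sequence as invalid.

```php
<?php
// Hypothetical sketch: decide whether the text following an '&' is a valid entity.
// $known stands in for the validator's $ENTITIES table and is deliberately tiny.
function entity_ok_sketch(string $after_amp, array $known): bool
{
    $end = strpos($after_amp, ';');
    if ($end === false || $end === 0) {
        return false; // no terminator, or an empty "&;" sequence
    }

    $name = substr($after_amp, 0, $end);
    if ($name[0] === '#') {
        // Numeric reference: decimal (&#160;) or hexadecimal (&#xA0;)
        return preg_match('/^#(\d+|x[0-9a-fA-F]+)$/D', $name) === 1;
    }

    // Named entity: must appear in the known list
    return isset($known[$name]);
}

$known = array('amp' => 1, 'nbsp' => 1);
$named   = entity_ok_sketch('amp; more text', $known);
$hex     = entity_ok_sketch('#xA0;', $known);
$decimal = entity_ok_sketch('#160;', $known);
$bogus   = entity_ok_sketch('bogus;', $known);
$loose   = entity_ok_sketch(' b', $known);
```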
string fix_entities(string in)
Fix any invalid entities in the text.
Parameters…
| Name | Description | Type |
|---|---|---|
| in | The text to fix | string |
Returns…
| Description | Type |
|---|---|
| Fixed result | string |
function fix_entities($in)
{
    global $ENTITIES;
    $out='';
    $len=strlen($in);
    $cdata=false;
    for ($i=0;$i<$len;$i++)
    {
        $out.=$in[$i];
        if (substr($in,$i,9)=='<![CDATA[')
        {
            $cdata=true;
        }
        if ($cdata)
        {
            if (substr($in,$i,5)=='//]]>') $cdata=false;
        } else
        {
            if ($in[$i]=='&')
            {
                // Look at up to 8 characters after the ampersand for a terminating semicolon
                $lump=substr($in,$i+1,8);
                $pos=strpos($lump,';');
                if ($pos===false)
                {
                    $out.='amp;';
                } else
                {
                    $lump=substr($lump,0,$pos);
                    // substr() is used instead of direct string offsets so that an empty or one-character $lump cannot trigger an uninitialised offset
                    if (!((substr($lump,0,1)=='#') && ((is_numeric(substr($lump,1))) || ((substr($lump,1,1)=='x') && (is_hex(substr($lump,2)))))))
                    {
                        if (!isset($ENTITIES[$lump])) $out.='amp;';
                    }
                }
            }
        }
    }
    return $out;
}
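The effect of entity fixing is easiest to see on a sample string. The sketch below is a regex-based approximation, not the function above. `escape_loose_ampersands` is a hypothetical name, `$known` stands in for `$ENTITIES`, and unlike the original it does not skip CDATA sections or limit the look-ahead to 8 characters.

```php
<?php
// Hypothetical regex-based variant: escape '&' unless it starts a numeric
// reference or one of the named entities listed in $known.
function escape_loose_ampersands(string $in, array $known): string
{
    return preg_replace_callback(
        '/&(#\d+;|#x[0-9a-fA-F]+;|([A-Za-z][A-Za-z0-9]*);)?/',
        function (array $m) use ($known) {
            if (count($m) === 1) {
                return '&amp;'; // bare ampersand with nothing entity-like after it
            }
            if (isset($m[2]) && $m[2] !== '') {
                // Named entity: keep it only if it is known
                return isset($known[$m[2]]) ? $m[0] : '&amp;' . $m[1];
            }
            return $m[0]; // numeric reference, left untouched
        },
        $in
    );
}

$fixed = escape_loose_ampersands('a & b &amp; c &bogus; &#160; &#xA0;', array('amp' => 1));
```

Here the bare `&` and the unknown `&bogus;` are escaped, while `&amp;` and both numeric references pass through unchanged.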
?mixed _get_next_tag()
Get the next tag in the current XHTML document.
Parameters…
Returns…
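| Description | Type |
|---|---|
| Either a pair of (tag string, list of error details) as returned by _check_tag, a pair of (NULL, list of error details) when parsing cannot continue, or NULL when the document is finished (NULL: no next tag) | ?mixed |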
function _get_next_tag()
{
// echo '<p>!</p>';
global $PARENT_TAG,$POS,$LINENO,$LINESTART,$OUT,$T_POS,$ENTITIES,$LEN,$ANCESTER_BLOCK,$TAG_STACK,$XHTML_VALIDATOR_OFF,$TEXT_NO_BLOCK,$INBETWEEN_TEXT;
global $TAG_RANGES,$VALUE_RANGES;
$status=NO_MANS_LAND;
$current_tag='';
$current_attribute_name='';
$current_attribute_value='';
$close=false;
$doc_type='';
$INBETWEEN_TEXT='';
$attribute_map=array();
$errors=array();
$chr_10=chr(10);
$chr_13=chr(13);
$special_chars=array('='=>1,'"'=>1,'&'=>1,'/'=>1,'<'=>1,'>'=>1,' '=>1,$chr_10=>1,$chr_13=>1);
while ($POS<$LEN)
{
$next=$OUT[$POS];
$POS++;
if ($next==$chr_10)
{
$LINENO++;
$LINESTART=$POS;
}
// echo $status.' for '.$next.'<br />';
// Entity checking
if (($next=='&') && ($status!=IN_CDATA) && ($status!=IN_COMMENT) && (is_null($XHTML_VALIDATOR_OFF)))
{
$test=test_entity();
if (!is_null($test)) $errors=array_merge($errors,$test);
}
// State machine
switch ($status)
{
case NO_MANS_LAND:
$in_no_mans_land='';
$continue=($next!='<') && ($next!='&') && ($POS<$LEN-1);
if ($next!='<') $INBETWEEN_TEXT.=$next;
while ($continue)
{
$next=$OUT[$POS];
$POS++;
$continue=($next!='<') && ($next!='&') && ($POS<$LEN-1);
if ($continue) $in_no_mans_land.=$next;
if ($next!='<') $INBETWEEN_TEXT.=$next;
if ($next==$chr_10)
{
$LINENO++;
$LINESTART=$POS;
}
}
if (($next=='&') && (is_null($XHTML_VALIDATOR_OFF)))
{
$test=test_entity();
if (!is_null($test)) $errors=array_merge($errors,$test);
}
// Can't have loose text in form/body/etc
// 'x' is there for when called externally, checking on an x that has replaced, for example, a directive tag (which isn't actual text - so can't trip the error)
if (($in_no_mans_land!='x') && (trim($in_no_mans_land)!='') && (isset($TEXT_NO_BLOCK[$PARENT_TAG])) && ($GLOBALS['BLOCK_CONSTRAIN'])) $errors[]=array('XHTML_TEXT_NO_BLOCK',$PARENT_TAG);
if (($next=='<') && (isset($OUT[$POS+2])) && ($OUT[$POS]=='!'))
{
if (($OUT[$POS+1]=='-') && ($OUT[$POS+2]=='-'))
{
$status=IN_COMMENT;
$INBETWEEN_TEXT.='<!--';
$POS+=3;
}
elseif (substr($OUT,$POS-1,9)=='<![CDATA[')
{
$status=IN_CDATA;
$POS+=8;
$INBETWEEN_TEXT.='<![CDATA[';
}
else
{
$status=IN_DTD_TAG;
}
}
elseif (($next=='<') && (isset($OUT[$POS])) && ($OUT[$POS]=='?') && ($POS<10))
{
if (!isset($GLOBALS['MAIL_MODE'])) $GLOBALS['MAIL_MODE']=false;
if ($GLOBALS['MAIL_MODE']) $errors[]=array('MAIL_PROLOG');
$status=IN_XML_TAG;
}
elseif ($next=='<')
{
$T_POS=$POS-1;
$status=STARTING_TAG;
}
else
{
if ($next=='>')
{
$errors[]=array('XML_TAG_CLOSE_ANOMALY');
return array(NULL,$errors);
}
}
break;
case IN_TAG_NAME:
$more_to_come=(!isset($special_chars[$next])) && ($POS<$LEN);
while ($more_to_come)
{
$current_tag.=$next;
$next=$OUT[$POS];
$POS++;
if ($next==$chr_10)
{
$LINENO++;
$LINESTART=$POS;
}
$more_to_come=(!isset($special_chars[$next])) && ($POS<$LEN);
}
if (($next==' ') || ($next==$chr_10) || ($next==$chr_13))
{
$TAG_RANGES[]=array($T_POS+1,$POS-1,$current_tag);
$status=IN_TAG_BETWEEN_ATTRIBUTES;
}
elseif ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','1');
return array(NULL,$errors);
}
elseif ($next=='>')
{
if ($OUT[$POS-2]=='/')
{
$TAG_RANGES[]=array($T_POS+1,$POS-1,$current_tag);
return _check_tag($current_tag,array(),true,$close,$errors);
} else
{
$TAG_RANGES[]=array($T_POS+1,$POS-1,$current_tag);
return _check_tag($current_tag,array(),false,$close,$errors);
}
}
elseif ($next!='/') $current_tag.=$next;
break;
case STARTING_TAG:
if ($next=='/') $close=true;
elseif ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','2');
// return array(NULL,$errors);
// We have to assume the first < was not for a real opening tag
$POS--;
$status=NO_MANS_LAND;
}
elseif ($next=='>')
{
$errors[]=array('XML_TAG_CLOSE_ANOMALY','3');
// return array(NULL,$errors);
// We have to assume neither were for a real tag
$status=NO_MANS_LAND;
}
else
{
$current_tag.=$next;
$status=IN_TAG_NAME;
}
break;
case IN_TAG_BETWEEN_ATTRIBUTES:
if (($next=='/') && (isset($OUT[$POS])) && ($OUT[$POS]=='>'))
{
++$POS;
return _check_tag($current_tag,$attribute_map,true,$close,$errors);
}
elseif ($next=='>')
{
return _check_tag($current_tag,$attribute_map,false,$close,$errors);
}
elseif (($next=='<') && (isset($OUT[$POS+3])) && ($OUT[$POS]=='!') && ($OUT[$POS+1]=='-') && ($OUT[$POS+2]=='-'))
{
$status=IN_TAG_EMBEDDED_COMMENT;
if ($OUT[$POS+3]=='-') $errors[]=array('XHTML_WRONG_COMMENTING');
}
elseif ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','4');
return array(NULL,$errors);
}
elseif (($next!=' ') && ($next!="\t") && ($next!=$chr_10) && ($next!=$chr_13))
{
$status=IN_TAG_ATTRIBUTE_NAME;
$current_attribute_name.=$next;
}
break;
case IN_TAG_ATTRIBUTE_NAME:
$more_to_come=(!isset($special_chars[$next])) && ($POS<$LEN);
while ($more_to_come)
{
$current_attribute_name.=$next;
$next=$OUT[$POS];
$POS++;
if ($next==$chr_10)
{
$LINENO++;
$LINESTART=$POS;
}
$more_to_come=(!isset($special_chars[$next])) && ($POS<$LEN);
}
if ($next=='=') $status=IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_RIGHT;
elseif ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','5');
//return array(NULL,$errors);
// We have to assume we shouldn't REALLY have found a tag
$POS--;
$current_tag='';
$status=NO_MANS_LAND;
}
elseif ($next=='>')
{
if ($GLOBALS['XML_CONSTRAIN']) $errors[]=array('XML_TAG_CLOSE_ANOMALY');
// Things like nowrap, checked, etc
// return array(NULL,$errors);
if (isset($attribute_map[$current_attribute_name])) $errors[]=array('XML_TAG_DUPLICATED_ATTRIBUTES',$current_tag);
$attribute_map[$current_attribute_name]=$current_attribute_name;
$current_attribute_name='';
$VALUE_RANGES[]=array($POS-1,$POS-1);
return _check_tag($current_tag,$attribute_map,false,$close,$errors);
}
elseif (($next!=' ') && ($next!="\t") && ($next!=$chr_10) && ($next!=$chr_13)) $current_attribute_name.=$next;
else $status=IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_LEFT;
break;
case IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_LEFT:
if ($next=='=') $status=IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_RIGHT;
elseif (($next!=' ') && ($next!="\t") && ($next!=$chr_10) && ($next!=$chr_13))
{
if ($GLOBALS['XML_CONSTRAIN']) $errors[]=array('XML_ATTRIBUTE_ERROR');
//return array(NULL,$errors); Actually <blah nowrap ... /> could cause this
$status=IN_TAG_BETWEEN_ATTRIBUTES;
if (isset($attribute_map[$current_attribute_name])) $errors[]=array('XML_TAG_DUPLICATED_ATTRIBUTES',$current_tag);
$attribute_map[$current_attribute_name]=$current_attribute_name;
$current_attribute_name='';
$VALUE_RANGES[]=array($POS-1,$POS-1);
}
break;
case IN_TAG_BETWEEN_ATTRIBUTE_NAME_VALUE_RIGHT:
if ($next=='"')
{
$v_pos=$POS;
$status=IN_TAG_ATTRIBUTE_VALUE_BIG_QUOTES;
}
elseif (($next=='\'') && (true)) // Change to false if we want to turn off these quotes (preferred - but we can't control all input :( )
{
$v_pos=$POS;
$status=IN_TAG_ATTRIBUTE_VALUE_LITTLE_QUOTES;
}
elseif (($next!=' ') && ($next!="\t") && ($next!=$chr_10) && ($next!=$chr_13))
{
if ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','6');
// return array(NULL,$errors);
}
elseif ($next=='>')
{
$errors[]=array('XML_TAG_CLOSE_ANOMALY');
// return array(NULL,$errors);
}
if ($GLOBALS['XML_CONSTRAIN']) $errors[]=array('XML_ATTRIBUTE_ERROR');
$POS--;
$v_pos=$POS;
$status=IN_TAG_ATTRIBUTE_VALUE_NO_QUOTES;
}
break;
case IN_TAG_ATTRIBUTE_VALUE_NO_QUOTES:
if ($next=='>')
{
if (isset($attribute_map[$current_attribute_name])) $errors[]=array('XML_TAG_DUPLICATED_ATTRIBUTES',$current_tag);
$attribute_map[$current_attribute_name]=$current_attribute_value;
$current_attribute_value='';
$current_attribute_name='';
$VALUE_RANGES[]=array($v_pos,$POS-1);
return _check_tag($current_tag,$attribute_map,false,$close,$errors);
}
elseif (($next==' ') || ($next=="\t") || ($next==$chr_10) || ($next==$chr_13))
{
$status=IN_TAG_BETWEEN_ATTRIBUTES;
if (isset($attribute_map[$current_attribute_name])) $errors[]=array('XML_TAG_DUPLICATED_ATTRIBUTES',$current_tag);
$attribute_map[$current_attribute_name]=$current_attribute_value;
$current_attribute_value='';
$current_attribute_name='';
$VALUE_RANGES[]=array($v_pos,$POS-1);
}
else
{
if ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','7');
// return array(NULL,$errors);
}
$current_attribute_value.=$next;
}
break;
case IN_TAG_ATTRIBUTE_VALUE_BIG_QUOTES:
$more_to_come=(!isset($special_chars[$next])) && ($POS<$LEN);
while ($more_to_come)
{
$current_attribute_value.=$next;
$next=$OUT[$POS];
$POS++;
if ($next==$chr_10)
{
$LINENO++;
$LINESTART=$POS;
}
$more_to_come=(!isset($special_chars[$next])) && ($POS<$LEN);
}
if (($next=='&') && (is_null($XHTML_VALIDATOR_OFF)))
{
$test=test_entity();
if (!is_null($test)) $errors=array_merge($errors,$test);
}
if ($next=='"')
{
$status=IN_TAG_BETWEEN_ATTRIBUTES;
if (isset($attribute_map[$current_attribute_name])) $errors[]=array('XML_TAG_DUPLICATED_ATTRIBUTES',$current_tag);
$attribute_map[$current_attribute_name]=$current_attribute_value;
$current_attribute_value='';
$current_attribute_name='';
$VALUE_RANGES[]=array($v_pos,$POS-1);
}
else
{
if ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','7');
// return array(NULL,$errors);
}
elseif ($next=='>')
{
$errors[]=array('XML_TAG_CLOSE_ANOMALY');
// return array(NULL,$errors);
}
$current_attribute_value.=$next;
}
break;
case IN_TAG_ATTRIBUTE_VALUE_LITTLE_QUOTES:
if ($next=='\'')
{
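$status=IN_TAG_BETWEEN_ATTRIBUTES;
if (isset($attribute_map[$current_attribute_name])) $errors[]=array('XML_TAG_DUPLICATED_ATTRIBUTES',$current_tag); // Same duplicate-attribute check as the double-quoted case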
$attribute_map[$current_attribute_name]=$current_attribute_value;
$current_attribute_value='';
$current_attribute_name='';
$VALUE_RANGES[]=array($v_pos,$POS-1);
}
else
{
if ($next=='<')
{
$errors[]=array('XML_TAG_OPEN_ANOMALY','7');
// return array(NULL,$errors);
}
elseif ($next=='>')
{
$errors[]=array('XML_TAG_CLOSE_ANOMALY');
// return array(NULL,$errors);
}
$current_attribute_value.=$next;
}
break;
case IN_XML_TAG:
if (($OUT[$POS-2]=='?') && ($next=='>')) $status=NO_MANS_LAND;
break;
case IN_DTD_TAG: // This is a parser-directive, but we only use them for doctypes
$doc_type.=$next;
if ($next=='>')
{
if (substr($doc_type,0,8)=='!DOCTYPE')
{
global $THE_DOCTYPE,$TAGS_DEPRECATE_ALLOW,$FOUND_DOCTYPE,$XML_CONSTRAIN,$BLOCK_CONSTRAIN;
$FOUND_DOCTYPE=true;
$valid_doctypes=array(DOCTYPE_HTML,DOCTYPE_HTML_STRICT,DOCTYPE_XHTML,DOCTYPE_XHTML_STRICT,DOCTYPE_XHTML_NEW);
/*if (get_value('html5')==='1') */$valid_doctypes[]=DOCTYPE_XHTML5;
$doc_type=preg_replace('#//EN"\s+"#','//EN" "',$doc_type);
if (!in_array('<'.$doc_type,$valid_doctypes))
{
$errors[]=array('XHTML_DOCTYPE');
} else
{
$THE_DOCTYPE='<'.$doc_type;
if (($THE_DOCTYPE==DOCTYPE_HTML_STRICT) || ($THE_DOCTYPE==DOCTYPE_XHTML_STRICT) || ($THE_DOCTYPE==DOCTYPE_XHTML_NEW) || ($THE_DOCTYPE==DOCTYPE_XHTML5))
$TAGS_DEPRECATE_ALLOW=false;
if (($THE_DOCTYPE==DOCTYPE_XHTML_STRICT) || ($THE_DOCTYPE==DOCTYPE_XHTML_NEW) || ($THE_DOCTYPE==DOCTYPE_XHTML5))
$BLOCK_CONSTRAIN=true;
if (($THE_DOCTYPE==DOCTYPE_XHTML) || ($THE_DOCTYPE==DOCTYPE_XHTML_STRICT) || ($THE_DOCTYPE==DOCTYPE_XHTML_NEW) || ($THE_DOCTYPE==DOCTYPE_XHTML5))
$XML_CONSTRAIN=true;
}
}
$status=NO_MANS_LAND;
}
break;
case IN_CDATA:
$INBETWEEN_TEXT.=$next;
if (($next=='>') && ($OUT[$POS-2]==']') && ($OUT[$POS-3]==']')) $status=NO_MANS_LAND;
break;
case IN_COMMENT:
$INBETWEEN_TEXT.=$next;
if (($next=='>') && ($OUT[$POS-2]=='-') && ($OUT[$POS-3]=='-'))
{
if ($OUT[$POS-4]=='-') $errors[]=array('XHTML_WRONG_COMMENTING');
$status=NO_MANS_LAND;
}
break;
case IN_TAG_EMBEDDED_COMMENT:
if (($next=='>') && ($OUT[$POS-2]=='-') && ($OUT[$POS-3]=='-')) $status=IN_TAG_BETWEEN_ATTRIBUTES;
break;
}
}
if ($status!=NO_MANS_LAND)
{
$errors[]=array('XML_BROKEN_END');
return array(NULL,$errors);
}
return NULL;
}
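The function above is a character-driven state machine: each character is interpreted according to the current state, and reaching the end of input in any state other than the idle one is an error. The sketch below is a heavily reduced illustration of that pattern with only two states. `list_tags_sketch` and its state names are hypothetical, and it ignores attributes, comments, CDATA and entities.

```php
<?php
// Hypothetical, much-reduced state machine: it only separates text from tags.
function list_tags_sketch(string $html): array
{
    $state = 'text';   // comparable to NO_MANS_LAND
    $current = '';
    $tags = array();
    $len = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $ch = $html[$i];
        switch ($state) {
            case 'text':
                if ($ch === '<') {
                    $state = 'tag';
                    $current = '';
                }
                break;
            case 'tag':
                if ($ch === '>') {
                    $tags[] = $current;
                    $state = 'text';
                } else {
                    $current .= $ch;
                }
                break;
        }
    }

    // Ending inside a tag corresponds to the XML_BROKEN_END situation.
    return array('tags' => $tags, 'broken_end' => ($state !== 'text'));
}

$r = list_tags_sketch('<p>Hi <br /></p>');
$r2 = list_tags_sketch('<p>text <b');
```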
mixed _check_tag(string tag, map attributes, boolean self_close, boolean close, list errors)
Checks an XHTML tag for validity, including attributes, and returns the results.
Parameters…
| Name | Description | Type |
|---|---|---|
| tag | The name of the tag to check | string |
| attributes | A map of attributes (name=>value) the tag has | map |
| self_close | Whether this is a self-closing tag | boolean |
| close | Whether this is a closing tag | boolean |
| errors | Errors detected so far. We will add to these and return | list |
Returns…
| Description | Type |
|---|---|
| A pair of the normalised tag string (e.g. '<br/>' or '</p>') and the list of error information | mixed |
function _check_tag($tag,$attributes,$self_close,$close,$errors)
{
    global $XML_CONSTRAIN,$LAST_TAG_ATTRIBUTES,$WELL_FORMED_ONLY,$XHTML_VALIDATOR_OFF,$MUST_SELFCLOSE_TAGS;
    // Tag names are normalised to lower case; mixed case is only an error under XML rules
    $ltag=strtolower($tag);
    if ($ltag!=$tag)
    {
        if ($XML_CONSTRAIN) $errors[]=array('XHTML_CASE_TAG',$tag);
        $tag=$ltag;
    }
    $LAST_TAG_ATTRIBUTES=$attributes;
    $actual_self_close=$self_close;
    if ((!$WELL_FORMED_ONLY) && (!$self_close) && (isset($MUST_SELFCLOSE_TAGS[$tag])))
    {
        $self_close=true; // Will be flagged later
    }
    // Certain classes, and foreign XML namespaces, switch validation off for the enclosed content
    if (((isset($attributes['class'])) && (in_array($attributes['class'],array('comcode_code_content','xhtml_validator_off')))) || ((isset($attributes['xmlns'])) && (strpos($attributes['xmlns'],'xhtml')===false)))
    {
        $XHTML_VALIDATOR_OFF=0;
    }
    if ((!$WELL_FORMED_ONLY) /*&& (is_null($XHTML_VALIDATOR_OFF))*/)
    {
        $errors=__check_tag($tag,$attributes,$self_close,$close,$errors);
    }
    if ($XHTML_VALIDATOR_OFF>0) $errors=array();
    return array('<'.($close?'/':'').$tag.($actual_self_close?'/':'').'>',$errors);
}
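The tag string built on the last line of `_check_tag` is what the main loop later classifies by the position of its first slash: no slash means an opening tag, a slash at index 1 means a closing tag, and a slash anywhere else means a self-closing tag. The sketch below shows that round trip with two hypothetical helpers. The names are illustrative and not part of the library.

```php
<?php
// Hypothetical helper mirroring the token shapes '<p>', '</p>' and '<br/>'.
function make_token_sketch(string $tag, bool $close, bool $self_close): string
{
    return '<' . ($close ? '/' : '') . strtolower($tag) . ($self_close ? '/' : '') . '>';
}

// Hypothetical helper: classify a token by the position of its first slash.
function token_kind_sketch(string $token): string
{
    $slash = strpos($token, '/');
    if ($slash === false) {
        return 'open';
    }
    if ($slash === 1) {
        return 'close';
    }
    return 'self-close';
}

$br = make_token_sketch('BR', false, true);
$kinds = array(
    token_kind_sketch('<p>'),
    token_kind_sketch('</p>'),
    token_kind_sketch($br),
);
```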
string _get_tag_basis(string full)
Get the tag basis for the specified tag. For example, '<br />' becomes 'br'. Note: tags that carry attributes are not supported.
Parameters…
| Name | Description | Type |
|---|---|---|
| full | The full tag | string |
Returns…
| Description | Type |
|---|---|
| The basis of the tag | string |
function _get_tag_basis($full)
{
    // Strip slashes, spaces and angle brackets, leaving only the tag name
    $full=preg_replace('#[/ <>]#','',$full);
    return $full;
}
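A short example shows both the normal cases and the attribute caveat mentioned in the description. `tag_basis_sketch` is a hypothetical stand-alone copy of the same regular expression, used here so that the example runs without the rest of the library.

```php
<?php
// Hypothetical stand-alone copy of the same character-stripping approach.
function tag_basis_sketch(string $full): string
{
    return preg_replace('#[/ <>]#', '', $full);
}

$plain     = tag_basis_sketch('<br />');
$closing   = tag_basis_sketch('</p>');
// With attributes present, the attribute text is fused onto the tag name,
// which is why such input is not supported.
$with_attr = tag_basis_sketch('<a href="x">');
```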