{"id":66050,"date":"2024-12-17T03:01:00","date_gmt":"2024-12-16T17:01:00","guid":{"rendered":"https:\/\/www.rjmprogramming.com.au\/ITblog\/?p=66050"},"modified":"2024-12-16T12:32:01","modified_gmt":"2024-12-16T02:32:01","slug":"php-tokeniser-primer-tutorial","status":"publish","type":"post","link":"https:\/\/www.rjmprogramming.com.au\/ITblog\/php-tokeniser-primer-tutorial\/","title":{"rendered":"PHP Tokeniser Primer Tutorial"},"content":{"rendered":"<div style=\"width: 230px\" class=\"wp-caption alignnone\"><a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/using_tokeniser.php\" rel=\"noopener\"><img decoding=\"async\" style=\"border: 15px solid pink;\" alt=\"PHP Tokeniser Primer Tutorial\" src=\"http:\/\/www.rjmprogramming.com.au\/PHP\/using_tokeniser.gif\" title=\"PHP Tokeniser Primer Tutorial\"  style=\"float:left;\" \/><\/a><p class=\"wp-caption-text\">PHP Tokeniser Primer Tutorial<\/p><\/div>\n<p>Validation is a big subject area for programmers (one we last discussed with <a title='XML Lint Validation Tutorial' href='#xmllvt'>XML Lint Validation Tutorial<\/a>), and &#8220;when in a real pickle&#8221; I&#8217;ve turned to online HTML validators such as <a target=\"_blank\" title='Online HTML Validator ... thanks' href='https:\/\/www.freeformatter.com\/html-validator.html' rel=\"noopener\">this one<\/a>, but more often the web browser&#8217;s web inspector can help.<\/p>\n<p>Here, changing from CentOS to AlmaLinux as the hosting Apache\/PHP\/MySQL environment, we&#8217;ve been surprised that, even with PHP issues, the web browser web inspectors are by and large doing the job of helping us debug issues.  
Nonetheless, we want to explore the <a target=\"_blank\" title='PHP Tokeniser extension information' href='https:\/\/www.php.net\/manual\/en\/book.tokenizer.php' rel=\"noopener\">Tokeniser<\/a> PHP extension as a &#8230;<\/p>\n<ul>\n<li>means to strip out PHP comments from some Tokeniser analyzed input PHP code the user can enter into today&#8217;s <a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/using_tokeniser.php_GETME\" rel=\"noopener\">&#8220;proof of concept&#8221; using_tokeniser.php<\/a> <a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/using_tokeniser.php\" rel=\"noopener\">web application<\/a> &#8230; and &#8230;<\/li>\n<li>parse line(s) of code<\/li>\n<\/ul>\n<p> &#8230; best understood, we reckon, by trying it yourself &#8230;<\/p>\n<p><iframe style=width:98%;height:900px; src=\"https:\/\/www.rjmprogramming.com.au\/PHP\/using_tokeniser.php\"><\/iframe><\/p>\n<p><b><i>Did you know?<\/i><\/b><\/p>\n<p>For the first time we can remember we got over the perennial problem of <i>form<\/i> programming language code content encoding whereby a + equates to a space (on decoding) where we&#8217;d rather have <i>%20<\/i> be used, as <a target=\"_blank\" title='Javascript encodeURIComponent information from W3schools' href='https:\/\/www.w3schools.com\/jsref\/jsref_encodeuricomponent.asp' rel=\"noopener\">encodeURIComponent<\/a> uses to encode the space character, by &#8230;<\/p>\n<ol>\n<li>on the way out, use (a form onsubmit event function incorporating) <font color=blue>window.<a target=\"_blank\" title='window.btoa information from W3schools' href='https:\/\/www.w3schools.com\/jsref\/met_win_btoa.asp' rel=\"noopener\">btoa<\/a><\/font> &#8230;<br \/>\n&lt;?php echo &#8221;<br \/>\n<code><br \/>\nfunction btoait() {<br \/>\n  var orig=document.getElementById('html').value;<br \/>\n  <font color=blue>document.getElementById('html').value=window.btoa(orig);<\/font><br \/>\n  return true;<br \/>\n}<br \/>\n<\/code><br 
\/>\n&#8220;; ?&gt; <\/p>\n<li>teamed with, on the way in, incorporating <font color=blue><a target=\"_blank\" title='PHP base64_decode information' href='https:\/\/www.php.net\/manual\/en\/function.base64-decode.php' rel=\"noopener\">base64_decode<\/a><\/font> &#8230;<br \/>\n&lt;?php<br \/>\n<code><br \/>\n  <font color=blue>$html_fragment=base64_decode($_POST['html']);<\/font><br \/>\n<\/code><br \/>\n?&gt;\n<\/li>\n<\/ol>\n<p> &#8230; to get around that + character being misinterpreted.  We find, though, that window.btoa can fail when there is a lot of data or complex data, throwing an error, for instance, for any character outside the Latin1 range <font size=1>(or, it could have been a bad hair day?!)<\/font><\/p>\n<p><!--p>You can also see this play out at WordPress 4.1.1's <a target=\"_blank\" href='\/\/www.rjmprogramming.com.au\/ITblog\/php-tokeniser-primer-tutorial\/' rel=\"noopener\">PHP Tokeniser Primer Tutorial<\/a>.<\/p-->\n<hr>\n<p id='xmllvt'>Previous relevant <a target=\"_blank\" title='XML Lint Validation Tutorial' href='\/\/www.rjmprogramming.com.au\/ITblog\/xml-lint-validation-tutorial\/' rel=\"noopener\">XML Lint Validation Tutorial<\/a> is shown below.<\/p>\n<div style=\"width: 230px\" class=\"wp-caption alignnone\"><a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/xmllint_validation.php\" rel=\"noopener\"><img decoding=\"async\" style=\"border: 15px solid pink;\" alt=\"XML Lint Validation Tutorial\" src=\"http:\/\/www.rjmprogramming.com.au\/PHP\/xmllint_validation.jpg\" title=\"XML Lint Validation Tutorial\"  style=\"float:left;\" \/><\/a><p class=\"wp-caption-text\">XML Lint Validation Tutorial<\/p><\/div>\n<p>Do you remember when we discussed the Sanitizer API, talked about at <a title='Sanitizer API Primer Tutorial' href='#sapipt'>Sanitizer API Primer Tutorial<\/a>, regarding it as a web application HTML (and more) validation tool?<\/p>\n<p>Well, we&#8217;ve based a new &#8220;validator&#8221; of HTML or XML using the <a target=\"_blank\" title='xmllint on Linux information' 
href='https:\/\/linux.die.net\/man\/1\/xmllint' rel=\"noopener\">XML Lint<\/a> web application on what we did then, but with this code needing to be &#8230;<\/p>\n<ul>\n<li>under the auspices of a serverside scenario &#8230; ie. PHP &#8230; for us &#8230; calling on &#8230;<\/li>\n<li>underlying operating system call such as (for HTML qsall.htm incoming data file) &#8230;<br \/>\n<code><br \/>\nxmllint --html --valid --noout .\/qsall.htm<br \/>\n<\/code><br \/>\n &#8230; via &#8230;<\/li>\n<li><a target=\"_blank\" title='PHP shell_exec() method information' href='http:\/\/php.net\/manual\/en\/function.shell-exec.php' rel=\"noopener\">shell_exec<\/a><\/li>\n<\/ul>\n<p> &#8230; there&#8217;s not much left of the original HTML and Javascript!<\/p>\n<p>We had a fun time with HTML textarea elements and scrolling with the resultant <a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/xmllint_validation.php_GETME\" rel=\"noopener\">&#8220;first draft&#8221;<\/a> <a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/xmllint_validation.php_GETME\" rel=\"noopener\">xmllint_validation.php<\/a> <a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/PHP\/xmllint_validation.php\" rel=\"noopener\">you might want to try for yourself<\/a> supplying an HTML or XML URL of interest.  Why, in particular?  Well, it was the first time that we remember trying to make practically useful &#8230;<\/p>\n<ul>\n<li>a table cell (ie. td element) (the left of two) hosted &#8230;<\/li>\n<li>two textarea element arrangement whereby, ideally, 
they view &#8230;\n<ol>\n<li>side by side<\/li>\n<li>if one is scrolled the two identically scroll the same amount &#8230; <font size=1>(document.body outerHTML)<\/font> HTML &#8230;<br \/>\n&lt;?php echo &#8221;<br \/>\n<code><br \/>\n&lt;body onload=\"s1 = document.getElementById('preincoming'); s2 = document.getElementById('incoming'); s1.addEventListener('scroll', select_scroll_1, false); s2.addEventListener('scroll', select_scroll_2, false);\" data-onload='onl();'&gt;<br \/>\n&lt;h1&gt;XML Lint Validation &lt;!--button onclick='trythis();' title='Try your own'&gt;Usage&lt;\/button--&gt;&lt;\/h1&gt;<br \/>\n&lt;h3&gt;RJM Programming - June, 2024&lt;\/h3&gt;<br \/>\n&lt;form action=.\/xmllint_validation.php method=POST target=_self&gt;<br \/>\n&lt;table style=width:95%; border=5&gt;<br \/>\n&lt;tr&gt;&lt;th colspan=2 style=text-align:center;&gt;XML Lint validation of &lt;input style=width:70%; onblur=\"if (this.value.length &gt; 0) { document.getElementById('mysub').click();  }\" name=htmlfile id=myhxfile placeholder='Please enter either HTML or XML file to validate ...' 
value=\"&lt;?php echo str_replace('&gt;','&gt;',str_replace('&lt;','&lt;',$prefn)); ?&gt;\"&gt;&lt;\/input&gt;&lt;\/th&gt;&lt;\/tr&gt;<br \/>\n&lt;tr&gt;&lt;th&gt;Data to validate&lt;\/th&gt;&lt;th&gt;XML Lint results&lt;\/th&gt;&lt;\/tr&gt;<br \/>\n&lt;tr&gt;&lt;td style=vertical-align:top;&gt;&lt;textarea style=font-size:8px;display:inline-block;overflow-x:clip;text-wrap:nowrap;text-align:right; id=preincoming&gt;&lt;?php echo str_replace('&gt;','&gt;',str_replace('&lt;','&lt;',$precontents)); ?&gt;&lt;\/textarea&gt;&lt;textarea onblur=\"if (this.value.length &gt; 0 && '&lt;?echo $fn; ?&gt;' == '') { document.getElementById('mysub').click();  }\" style=font-size:8px;display:inline-block;overflow-x:clip;text-wrap:nowrap; name=content id=incoming&gt;&lt;?php echo str_replace('&gt;','&gt;',str_replace('&lt;','&lt;',$contents)); ?&gt;&lt;\/textarea&gt;&lt;\/td&gt;&lt;td style=vertical-align:top;&gt;&lt;textarea id=outgoing&gt;&lt;?php echo str_replace('&gt;','&gt;',str_replace('&lt;','&lt;',$results)); ?&gt;&lt;\/textarea&gt;&lt;\/td&gt;&lt;\/tr&gt;<br \/>\n&lt;tr&gt;&lt;td&gt;&lt;\/td&gt;&lt;td&gt;&lt;input type=submit id=mysub style=display:&lt;?php echo $vsnone; ?&gt; value=Validate&gt;&lt;\/input&gt;&lt;\/td&gt;&lt;\/tr&gt;<br \/>\n&lt;\/table&gt;<br \/>\n&lt;\/form&gt;<br \/>\n&lt;\/body&gt;<br \/>\n<\/code><br \/>\n&#8220;; ?&gt;<br \/>\n &#8230; uses Javascript &#8230;<br \/>\n&lt;?php echo &#8221;<br \/>\n<code><br \/>\nvar s1=null, s2=null;<br \/>\n<br \/>\n\/\/ Thanks to <a target=\"_blank\" href='https:\/\/stackoverflow.com\/questions\/7108270\/scrolling-2-different-elements-in-same-time' title='https:\/\/stackoverflow.com\/questions\/7108270\/scrolling-2-different-elements-in-same-time' rel=\"noopener\">https:\/\/stackoverflow.com\/questions\/7108270\/scrolling-2-different-elements-in-same-time<\/a><br \/>\nfunction select_scroll_1(e) { s2.scrollTop = s1.scrollTop; }<br \/>\nfunction select_scroll_2(e) { s1.scrollTop = s2.scrollTop; }<br \/>\n<\/code><br 
\/>\n&#8220;; ?&gt;<br \/>\n&#8230; so that &#8230;<\/li>\n<li>the left hand textarea contains code line numbers right aligned &#8230; to sidle up next to &#8230;<\/li>\n<li>the right hand textarea contains the code (HTML or XML) being validated by xmllint<\/li>\n<\/ol>\n<p> &#8230; while &#8230;\n<\/li>\n<li>the right hand table cell contains the xmllint validation (of HTML or XML) results<\/li>\n<\/ul>\n<p> &#8230; had us, in practice, thanking our lucky stars that &#8230;<\/p>\n<ol>\n<li>textarea elements are resizeable<\/li>\n<li>you can simulate &#8220;some cockpit action&#8221; aligning them vertically &#8230; <a target=\"_blank\" title='?' href='https:\/\/www.youtube.com\/watch?v=0TiqXFssKMY' rel=\"noopener\">Jim<\/a>?!<\/li>\n<\/ol>\n<p><!--p>You can also see this play out at WordPress 4.1.1's <a target=\"_blank\" href='\/\/www.rjmprogramming.com.au\/ITblog\/xml-lint-validation-tutorial\/' rel=\"noopener\">XML Lint Validation Tutorial<\/a>.<\/p-->\n<hr>\n<p id='sapipt'>Previous relevant <a target=\"_blank\" title='Sanitizer API Primer Tutorial' href='\/\/www.rjmprogramming.com.au\/ITblog\/sanitizer-api-primer-tutorial\/' rel=\"noopener\">Sanitizer API Primer Tutorial<\/a> is shown below.<\/p>\n<div style=\"width: 230px\" class=\"wp-caption alignnone\"><a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/HTMLCSS\/sanitizer_api_test.html\" rel=\"noopener\"><img decoding=\"async\" style=\"border: 15px solid pink;\" alt=\"Sanitizer API Primer Tutorial\" src=\"http:\/\/www.rjmprogramming.com.au\/HTMLCSS\/sanitizer_api_test.jpg\" title=\"Sanitizer API Primer Tutorial\"  style=\"float:left;\" \/><\/a><p class=\"wp-caption-text\">Sanitizer API Primer Tutorial<\/p><\/div>\n<p>Today we&#8217;re roadtesting the <a target=\"_blank\" title='Sanitizer API' href='https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/API\/HTML_Sanitizer_API' rel=\"noopener\">Sanitizer API<\/a> &#8230;<\/p>\n<blockquote 
cite='https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/API\/HTML_Sanitizer_API'><p>\nThe HTML Sanitizer API allow developers to take untrusted strings of HTML and Document or DocumentFragment objects, and sanitize them for safe insertion into a document&#8217;s DOM.\n<\/p><\/blockquote>\n<p> &#8230; as another validation idea for HTML to add to our previous <a target=\"_blank\" title='HTML Online Validation Tidy Errors Tutorial' href='https:\/\/www.rjmprogramming.com.au\/ITblog\/html-online-validation-tidy-errors-tutorial' rel=\"noopener\">HTML Online Validation Tidy Errors Tutorial<\/a> efforts.<\/p>\n<p>Perhaps you&#8217;d like to try the &#8220;Usage&#8221; button of the <a target=\"_blank\" href=\"http:\/\/www.rjmprogramming.com.au\/HTMLCSS\/sanitizer_api_test.html_GETME\" title=\"sanitizer_api_test.html\" rel=\"noopener\">proof of concept <a target=\"_blank\" href=\"https:\/\/www.rjmprogramming.com.au\/HTMLCSS\/sanitizer_api_test.html\" title=\"Click picture\" rel=\"noopener\">web application<\/a> below &#8230;<\/p>\n<p><iframe style=\"width:100%;height:1000px;\" src=\"https:\/\/www.rjmprogramming.com.au\/HTMLCSS\/sanitizer_api_test.html\"><\/iframe><\/p>\n<p>If this was interesting you may be interested in <a title='Click here to see topics in which you might be interested' href='#d56492' onclick='var dv=document.getElementById(\"d56492\"); dv.innerHTML = \"&lt;iframe width=670 height=600 src=\" + \"https:\/\/www.rjmprogramming.com.au\/ITblog\/tag\/validation\" + \"&gt;&lt;\/iframe&gt;\"; dv.style.display = \"block\";'>this<\/a> too.<\/p>\n<div id='d56492' style='display: none; border-left: 2px solid green; border-top: 2px solid green;'><\/div>\n<hr>\n<p>If this was interesting you may be interested in <a title='Click here to see topics in which you might be interested' href='#d63915' onclick='var dv=document.getElementById(\"d63915\"); dv.innerHTML = \"&lt;iframe width=670 height=600 src=\" + \"https:\/\/www.rjmprogramming.com.au\/ITblog\/tag\/linux\" + 
\"&gt;&lt;\/iframe&gt;\"; dv.style.display = \"block\";'>this<\/a> too.<\/p>\n<div id='d63915' style='display: none; border-left: 2px solid green; border-top: 2px solid green;'><\/div>\n<hr>\n<p>If this was interesting you may be interested in <a title='Click here to see topics in which you might be interested' href='#d66050' onclick='var dv=document.getElementById(\"d66050\"); dv.innerHTML = \"&lt;iframe width=670 height=600 src=\" + \"https:\/\/www.rjmprogramming.com.au\/ITblog\/tag\/textarea\" + \"&gt;&lt;\/iframe&gt;\"; dv.style.display = \"block\";'>this<\/a> too.<\/p>\n<div id='d66050' style='display: none; border-left: 2px solid green; border-top: 2px solid green;'><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Validation is a big subject area for programmers (we last discussed with XML Lint Validation Tutorial), and &#8220;when in a real pickle&#8221; I&#8217;ve turned to online HTML validators such as this one, but more often the web browser&#8217;s web inspector &hellip; <a href=\"https:\/\/www.rjmprogramming.com.au\/ITblog\/php-tokeniser-primer-tutorial\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12,14,37],"tags":[3697,218,5021,1985,4173,305,306,307,327,1605,1797,452,576,652,1712,899,932,997,1262,3304,5020,1319,1357,1358,4483],"class_list":["post-66050","post","type-post","status-publish","format-standard","hentry","category-elearning","category-event-driven-programming","category-tutorials","tag-base64_decode","tag-code","tag-code-data","tag-comment","tag-comments","tag-debug","tag-debugger","tag-debugging","tag-did-you-know","tag-encodeuricomponent","tag-extension","tag-form","tag-html","tag-javascript","tag-onsubmit","tag-parse","tag-php","tag-programming","tag-textarea","tag-token","tag-tokenise","tag-tutorial","tag-validate","tag-validation","tag-window-btoa"],"_links":{"self":[{"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/posts\/66050"}],"collection":[{"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/comments?post=66050"}],"version-history":[{"count":11,"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/posts\/66050\/revisions"}],"predecessor-version":[{"id":66063,"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/posts\/66050\/revisions\/66063"}],"wp:attachment":[{"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/media?parent=66050"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/categories?post=66050"},{"taxonomy":"post_tag","embeddable":tru
e,"href":"https:\/\/www.rjmprogramming.com.au\/ITblog\/wp-json\/wp\/v2\/tags?post=66050"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}