NOTE: The current preferred location for bug reports is the GitHub issue tracker.
Bug 1004 - (Orphaned_close-p) Parsing error: element within <p> causes parser to see </p> as unmatched
Status: RESOLVED INTENTIONAL
Product: Validator.nu
Classification: Unclassified
Component: HTML parser
Version: HEAD
Hardware: All All
Importance: P2 normal
Assigned To: Nobody
URL: http://www.softwaresam.us/contact/con...
Reported: 2014-10-18 14:09 CEST by sam_bugzilla
Modified: 2014-10-20 09:35 CEST (History)
Description sam_bugzilla 2014-10-18 14:09:48 CEST
When experimenting with placement of a <div><a> ... </a></div> sequence inside a set of <p> </p> tags, I encountered the following:

The parser forgot that it was inside the <p></p> sequence, and declared the </p> to be an orphan:
	No p element in scope but a p end tag seen.
	From line 53, column 1; to line 53, column 4
	v> ↩<br>↩</p>

Here is the affected sequence of HTML source:
<h3>Site Contents</h3>
<div class="indent250">
<p>
 . . .
</p><p>
All TexInfo (info) documentation posted on this site, including any corresponding 
documentation in other formats (PDF or HTML for example), are released under the 
GNU Free Documentation License, version 1.3 (or higher). <br>
For full licensing information, please see:
<div class="xreftext">
<a href="http://www.gnu.org/licenses/" target="_blank"> Free Software Foundation (licenses)</a></div> 
<br>
</p><!-- HTML5 validator sees previous p-end-tag as unmatched--><p>
 . . .
</p><br>
</div><br>

 - - - - - -
Of course, it is not necessary to have the link inside the paragraph, but the parser shouldn't complain about it. I searched on the whatwg.org site for the allowable context for <p>, <div> and <a>, but found nothing that would prohibit a <div> and/or <a> inside a <p></p>.

I will leave it online for several days, so you can see it live:
http://www.softwaresam.us/contact/contact.html

Let me know if I can do anything to help.
Comment 1 Michael[tm] Smith 2014-10-18 16:21:53 CEST
(In reply to sam_bugzilla from comment #0)
> When experimenting with placement of a <div><a> ... </a></div> sequence
> inside a set of <p> </p> tags,

<p><div></div></p> isn't valid. And per the HTML parsing algorithm, it causes a parse error and it doesn't result in the DOM that you'd probably think it should.

Take a look at http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0A%3Cp%3E%3Cdiv%3E%3C%2Fdiv%3E%3C%2Fp%3E

The end tag for the p element is optional. And in cases where a <p> start tag is followed by a start tag like <div> for some element that's not what the spec defines as "phrasing content" https://html.spec.whatwg.org/multipage/dom.html#phrasing-content-2 then the HTML parsing algorithm says a parser must close the p element https://html.spec.whatwg.org/multipage/syntax.html#close-a-p-element -- which includes generating an implied </p> end tag. So basically what the parser effectively sees is:

<p></p><div></div></p>

The trailing </p> has no open p element left to close, so it is a parse error, and in the DOM it ends up as an extra empty p element after the div.


> I encountered the following:
> 
> The parser forgot that it was inside the <p></p> sequence,

It didn't forget. As outlined above, it generated an implied </p> end tag when it found the <div> start tag. 

> and declared the
> </p> to be an orphan:
> 	No p element in scope but a p end tag seen.
> 	From line 53, column 1; to line 53, column 4
> 	v> ↩<br>↩</p>

Yup, that's because the parser already generated the implied </p> end tag for that p element, so now when it hits the other </p> end tag after the div, there's no longer any open p element for that </p> end tag to close.
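
To make the sequence of events concrete, here is a minimal sketch in Python of the behavior described above. It is not the Validator.nu implementation and not the full HTML parsing algorithm: it only tracks a stack of open elements, the CLOSES_P set is an abbreviated subset of the start tags that close an open p, and the spec's scope rules are reduced to a plain "is p on the stack" test. The names PScopeChecker and check are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

# Assumption: abbreviated subset of the start tags that close an open <p>.
CLOSES_P = {"address", "article", "aside", "blockquote", "div", "dl",
            "h1", "h2", "h3", "h4", "h5", "h6", "ol", "p", "pre",
            "section", "table", "ul"}
# Void elements never stay on the stack of open elements.
VOID = {"br", "hr", "img", "input", "wbr"}


class PScopeChecker(HTMLParser):
    """Very simplified model of how an open <p> gets closed implicitly."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # simplified stack of open elements
        self.messages = []       # what the checker noticed, in order

    def handle_starttag(self, tag, attrs):
        if tag in CLOSES_P and "p" in self.open_elements:
            # "Close a p element": pop everything up to and including the p.
            while self.open_elements.pop() != "p":
                pass
            self.messages.append(f"implied </p> before <{tag}>")
        if tag not in VOID:
            self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_elements:
            # Pop up to and including the matching open element.
            while self.open_elements.pop() != tag:
                pass
        elif tag == "p":
            # No open p is left, so this end tag has nothing to close.
            self.messages.append("No p element in scope but a p end tag seen.")


def check(markup):
    """Return the messages produced for a fragment of markup."""
    checker = PScopeChecker()
    checker.feed(markup)
    checker.close()
    return checker.messages
```

Calling check("<p><div></div></p>") reports the implied </p> before the <div> and then the stray </p>, while a fragment such as "<p><span><a href='x'>y</a></span></p>" produces no messages.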

> <p>
> For full licensing information, please see:
> <div class="xreftext">
> <a href="http://www.gnu.org/licenses/" target="_blank"> Free Software
> Foundation (licenses)</a></div> 
> <br>
> </p><!-- HTML5 validator sees previous p-end-tag as unmatched--><p>
...
> Of course, it is not necessary to have the link inside the paragraph, but
> the parser shouldn't complain about it. I searched on the whatwg.org site
> for the allowable context for <p>, <div> and <a>, but found nothing that
> would prohibit a <div> and/or <a> inside a <p></p>.

An <a> element is allowed inside a p element. Why aren't you just putting the <a> element inside the p directly? What purpose does the div serve? Why not just use a span instead of a div?

Anyway, the spec says that the content model for the p element is "phrasing content":
https://html.spec.whatwg.org/multipage/semantics.html#the-p-element

And the div element is not included in the definition of "phrasing content":
https://html.spec.whatwg.org/multipage/dom.html#phrasing-content-2

So I think the spec is quite clear on this.
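
As a rough illustration of that content-model rule, the sketch below checks the child elements of a p against a lookup set. The PHRASING set is an assumption for this example only: it is an abbreviated subset of the spec's "phrasing content" list, not the full definition, and the function name disallowed_in_p is hypothetical.

```python
# Assumption: abbreviated subset of the spec's "phrasing content" elements.
PHRASING = frozenset({
    "a", "abbr", "b", "br", "cite", "code", "em", "i", "img", "input",
    "kbd", "label", "q", "small", "span", "strong", "sub", "sup", "time",
})


def disallowed_in_p(child_tags):
    """Return the child tag names that are not phrasing content."""
    return [tag for tag in child_tags if tag.lower() not in PHRASING]
```

For the reported markup, where the p has br and div children, disallowed_in_p(["br", "div", "br"]) returns ["div"]; replacing the div with a span leaves the list empty.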
Comment 2 Henri Sivonen 2014-10-20 09:35:42 CEST
Indeed, this is not a bug for the reason Mike gave.