[RFC|Accepted] Updated BBcode engine
- callumacrae
- Former Team Member
- Posts: 1046
- Joined: Tue Apr 27, 2010 9:37 am
- Location: England
- Contact:
Re: [RFC|Accepted] Updated BBcode engine
callumacrae wrote: Making this change would require... zero changes.
The problem is that it accepts both:
- quote
- this
- message
Code: Select all
[list][*]quote[/*]
[*]this[/*]
[*]message[/*][/list]
Code: Select all
[list][*]quote
[*]this
[*]message[/list]
If phpBB uses [*] for opening and [/*] for closing, then all is OK. The problem is that I don't think that will happen very frequently, or will it?
Well, one idea so far is to use a preparser. It would scan the string like this:
Code: Select all
Check and look for "[/*]" in the string
If no "[/*]" is found:
    Locate and replace "[*]" by "[/*][*]"
    Locate and replace "[/list]" by "[/*][/list]"
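The preparser steps above can be sketched in Python roughly as follows. This is only an illustration, not phpBB code: the function name `preparse_list_items` is made up, and the last step (dropping the stray "[/*]" that the plain replacement leaves right after "[list]") is an added assumption that is not part of the steps as posted. Text placed between "[list]" and the first "[*]" is not handled.

```python
import re


def preparse_list_items(text: str) -> str:
    """Add explicit [/*] closers when the input contains none (hypothetical helper)."""
    # Step 1: if the author already wrote [/*] somewhere, leave the string alone.
    if "[/*]" in text:
        return text
    # Step 2: close the previous item in front of every [*].
    text = text.replace("[*]", "[/*][*]")
    # Step 3: close the last item right before the list ends.
    text = text.replace("[/list]", "[/*][/list]")
    # Assumption, not in the posted steps: the first [*] of a list has no
    # previous item, so the closer inserted directly after [list] is removed.
    return re.sub(r"(\[list[^\]]*\]\s*)\[/\*\]", r"\1", text)


balanced = preparse_list_items("[list][*]quote\n[*]this\n[*]message[/list]")
```

The trade-off raised later in the thread still applies: every replacement is an extra pass over the string before the real parser starts.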
- DavidIQ
- Customisations Team Leader
- Posts: 1904
- Joined: Thu Mar 02, 2006 4:29 pm
- Location: Earth
- Contact:
Re: [RFC|Accepted] Updated BBcode engine
I don't understand what the problem is. This currently works just fine with either [*] or [*][/*]. What is wrong with how it works right now?
- imkingdavid
- Registered User
- Posts: 1050
- Joined: Thu Jul 30, 2009 12:06 pm
Re: [RFC|Accepted] Updated BBcode engine
David is right,
Code: Select all
[list][*]test[/*][/list]
works just like
Code: Select all
[list][*]test[/list]
Re: [RFC|Accepted] Updated BBcode engine
It works with either [*] or [*][/*], and not just with [*][/*].
- EXreaction
- Registered User
- Posts: 1555
- Joined: Sat Sep 10, 2005 2:15 am
Re: [RFC|Accepted] Updated BBcode engine
DavidIQ wrote: I don't understand what the problem is. This currently works just fine with either [*] or [*][/*]. What is wrong with how it works right now?
The problem arises when designing a new parser to be more efficient and more valid. If BBCode parsing is treated more like a tree or valid XML, [*] will not work without hacks for this specific case, which is not particularly desirable.
Re: [RFC|Accepted] Updated BBcode engine
It doesn't have to be a hack at all. Optional end tags are part of HTML 5 and they're not limited to <li>, so if you want to produce valid HTML you'd have to handle them appropriately. For instance, [p]a[p]b[/p]c[/p] might seem legal for a BBCode parser, but <p>a<p>b</p>c</p> is not valid HTML and it will be interpreted as <p>a</p><p>b</p>c<p></p>. I'm not saying that nesting paragraphs is particularly important, but at any rate optional end tags are not an edge case or a hack. (Void elements such as <hr> are comparable in the sense that they don't use end tags, and a few people are using custom [hr] BBCodes in their forums.)
If this parser represents the BBCode markup as a tree, just don't allow [*] to be a child of [*]. Instead, move the node (with its descendants and all of its siblings) so that it becomes the next sibling(s) of its parent. [Edit: here's an illustration of the move, when parsing [list][*]foo[*]bar[/list]]
Code: Select all
[list]        ->      [list]
  |                   /    \
 [*]                [*]    [*]
 /  \                |      |
foo [*]             foo    bar
     |
    bar
Last edited by JoshyPHP on Wed Dec 12, 2012 10:54 pm, edited 1 time in total.
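A minimal sketch of the node move described above, in Python. It assumes a parser has already produced a naive tree in which the second [*] ended up nested inside the first one. The `Node` class and the `hoist_items` function are hypothetical names for illustration only and do not come from any phpBB code.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)


def hoist_items(parent: Node) -> None:
    """Rewrite the tree in place so that no [*] node is a child of a [*] node."""
    i = 0
    while i < len(parent.children):
        child = parent.children[i]
        if isinstance(child, Node):
            if child.tag == "*":
                for j, grandchild in enumerate(child.children):
                    if isinstance(grandchild, Node) and grandchild.tag == "*":
                        # Cut the nested item together with everything after it ...
                        moved = child.children[j:]
                        del child.children[j:]
                        # ... and re-attach it right after the current item.
                        parent.children[i + 1:i + 1] = moved
                        break
            # Descend so that lists nested inside an item are handled too.
            hoist_items(child)
        i += 1


# Naive tree for [list][*]foo[*]bar[/list], as in the left side of the drawing.
tree = Node("list", [Node("*", ["foo", Node("*", ["bar"])])])
hoist_items(tree)
```

After the call, `tree` has two sibling [*] nodes holding "foo" and "bar", which matches the right side of the drawing. Items that were hoisted are visited again by the same loop, so deeper chains of nested [*] nodes are flattened one level at a time.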
Re: [RFC|Accepted] Updated BBcode engine
JoshyPHP wrote: It doesn't have to be a hack at all. Optional end tags are part of HTML 5 and they're not limited to <li> so if you want to produce valid HTML you'd have to handle them appropriately.
AFAIK, I need to produce valid XHTML5. The main reason is in case phpBB changes to XHTML.
JoshyPHP wrote: For instance, [p]a[p]b[/p]c[/p] might seem legal for a BBCode parser, but <p>a<p>b</p>c</p> is not valid HTML and it will be interpreted as <p>a</p><p>b</p>c<p></p>.
Yeah... closing a <p> tag is optional in HTML, but not in XHTML.
<p> tags can only contain phrasing content. This means <p> tags cannot be nested.
The result of:
[p]a[p]b[/p]c[/p]
is:
<p>a[p]b[/p]c</p>
because p BBCode tags cannot be nested.
JoshyPHP wrote: I'm not saying that nesting paragraphs is particularly important, but at any rate optional end tags are not an edge case or a hack. (Void elements such as <hr> are similar, and a few people are using custom [hr] BBCodes in their forums)
Have you heard about... say eh... self-closing tags?
JoshyPHP wrote: If this parser represents the BBCode markup as a tree, just don't allow [*] to be a child of [*]. Instead, move the node (with its descendants and all of its siblings) at the next sibling of its parents.
If you parse all tags the same way and differentiate only in the details, then you'll see that it's not as straightforward as it seems when you write that.
Re: [RFC|Accepted] Updated BBcode engine
Can't you render the parsed message as XHTML without parsing it as such? Frankly I'd rather not require the end users to balance their BBCode tags, especially when the current implementation doesn't. It seems like an unnecessary loss.
I don't understand what you meant in your last sentence. If you use a tree structure, I'd think that moving nodes around would be relatively straightforward but since I haven't really looked into your code I wouldn't know.
Re: [RFC|Accepted] Updated BBcode engine
JoshyPHP wrote: Can't you render the parsed message as XHTML without parsing it as such? Frankly I'd rather not require the end users to balance their BBCode tags, especially when the current implementation doesn't. It seems like an unnecessary loss.
Ofc I could try to work with it in a looser way, but for that I need to know specifically how to work with it.
Anyway, any change to the original string will generate more processing overhead for the parser.
Doing these kinds of fixes requires creating zero-length tags, which is not a problem per se. The problem is knowing where to place those zero-length tags (remember that my parser only reads the input string once, and that's an advantage!).
Also, for the main part of the system, there are only 2 "classes" of tags: self-closing and non-self-closing. They are parsed almost separately, because the self-closing ones are easier to parse (no nesting issues, and they cannot be malformed).
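To make the zero-length tag idea concrete, here is a rough Python sketch of a tokenizer that reads the input once and emits a closing [*] token wherever one is implied. It is not the parser discussed in this thread. The names `close_items`, `TOKEN` and `SELF_CLOSING` are hypothetical, treating [hr] as the only self-closing tag is an assumption, and tags with attributes such as [list=1] are not recognised.

```python
import re
from typing import List, Tuple

# Matches [tag], [/tag], [*] and [/*]; attributes are not handled (assumption).
TOKEN = re.compile(r"\[(/?)(\*|[a-z]+)\]")
SELF_CLOSING = {"hr"}  # hypothetical set of self-closing BBCodes


def close_items(text: str) -> List[Tuple[str, str]]:
    """Tokenize in one pass, inserting zero-length ("close", "*") tokens."""
    out: List[Tuple[str, str]] = []
    stack: List[str] = []  # names of the currently open non-self-closing tags
    pos = 0
    for m in TOKEN.finditer(text):
        if m.start() > pos:
            out.append(("text", text[pos:m.start()]))
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if not closing and name in SELF_CLOSING:
            out.append(("void", name))  # no nesting to track
        elif not closing:
            if name == "*" and stack and stack[-1] == "*":
                out.append(("close", "*"))  # zero-length closer before a new item
                stack.pop()
            out.append(("open", name))
            stack.append(name)
        else:
            if name == "list" and stack and stack[-1] == "*":
                out.append(("close", "*"))  # zero-length closer before [/list]
                stack.pop()
            if stack and stack[-1] == name:
                stack.pop()
                out.append(("close", name))
            else:
                out.append(("text", m.group(0)))  # unmatched closer stays as text
    if pos < len(text):
        out.append(("text", text[pos:]))
    return out


tokens = close_items("[list][*]foo[*]bar[/list]")
```

In this sketch both spellings of the list yield the same token stream, and the well-formed input pays nothing extra: the implied closers are only generated on the branches where the top of the stack is an open [*]. An item that still has another tag open inside it when the next [*] arrives is not repaired here.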
JoshyPHP wrote: I don't understand what you meant in your last sentence. If you use a tree structure, I'd think that moving nodes around would be relatively straightforward but since I haven't really looked into your code I wouldn't know.
The concept of moving the nodes is straightforward from the algorithmic POV, but not that straightforward from the performance POV.
Anyway, as soon as I know exactly which rules it needs to solve, I'll see how I can do it. I'll always try to favor "XML"-conforming BBCode nesting as the fastest path to parse and treat the other form as a "recovery from failure". Because I don't work on the string itself, it should stay fast anyway.