Processing post subjects for display


Post by EA117

In https://tracker.phpbb.com/browse/PHPBB3-16399 and https://tracker.phpbb.com/browse/PHPBB3-15712, phpBB was fixed to correctly store emoji characters when entered as part of a subject line.

What comes to mind as the logical or necessary design for taking the next step, namely giving these emoji characters the same display treatment they receive when the exact same characters are entered into a message body? Specifically, I mean the Twemoji treatment that text-formatter currently applies to them, so that emojis look the same in the subject line and the message body of the same message:

[Attached screenshot: subject.png]

This is of course just an example of the display in Chrome on Windows 10. Exactly how the subject line renders, and how different or how noticeable the result is, will vary from platform to platform, browser to browser, and character to character. The point of the screenshot is just to show "not the same, and platform dependent" versus "the same, and controlled by phpBB".
  • There is no intention to store anything different for post subjects in the database.

    The stored subject will remain the UTF-8 string of characters entered. That is, we would not "parse the subject line, and then store the parsed subject line" as could be said of message bodies. Anything currently dealing with the subject lines stored in SQL would remain exactly as it is today.
  • What seems logical to me is that the emoji handling would be done in a way similar to how censor_text() is employed today.

    Meaning that right as we define the value for template variables like topicrow.LAST_POST_SUBJECT, topicrow.TOPIC_TITLE, etc., we would process the subject string so that what templates receive includes the HTML necessary to implement Twemoji replacement for any emoji characters present in the subject. Such that:

      Code: Select all

      TOPIC_TITLE => new_subject_emoji_processing( censor_text($row['topic_title']) )
      would cause the previous template variable contents:

      Code: Select all

      TOPIC_TITLE => 'Judge 👩‍⚖️ Heart ❤️ Hand ✋'
      to now become:

      Code: Select all

      TOPIC_TITLE => 'Judge <img alt="👩&zwj;⚖️" class="emoji smilies" draggable="false" src="//twemoji.maxcdn.com/2/svg/1f469-200d-2696-fe0f.svg"> Heart <img alt="❤️" class="emoji smilies" draggable="false" src="//twemoji.maxcdn.com/2/svg/2764.svg"> Hand <img alt="✋" class="emoji smilies" draggable="false" src="//twemoji.maxcdn.com/2/svg/270b.svg">'
  • Ideally our new processing would simply invoke the existing text-formatter, in order to have it "do exactly whatever text-formatter does for messages", so that whatever changes are made to text-formatter in future releases are automatically reflected in how subject lines are rendered, too.

    Perhaps we can't do "exactly that" though, because messages support BBCodes and other aliases that we do not intend to enable or process in subject lines at this time. Meaning an existing subject line like "Using [b]:smile:[/b] is broken" should continue to render as-is, without the BBCode or the :smile: alias being replaced as they would be if this same string were part of a message body.

    So maybe we would need to create and configure an entirely separate, parallel instance of text-formatter, configured to do "only that which we intend to do with subject lines." JoshyPHP would be the one who could say whether there is an easy "just turn off these switches before invoking the existing text-formatter instance, then turn them back on when you're done" approach that could lighten the implementation load compared with actually creating and maintaining an entirely separate instance.

    The alternative would be to make our subject-specific processor simply perform Twemoji replacements directly, without involving text-formatter, since "that's all we intend to do" right now. The trade-off is that if text-formatter changes in the future, e.g. to use something other than Twemoji, we would have to update our subject-specific processor to match.
  • It seems acceptable to me that "we might not hit exactly 100% of every place subject lines are displayed."

    It certainly shouldn't be difficult to cover as many of the template variables that contain the subject line as we can. But "nothing very bad happens" if an extension, or one of the more buried or esoteric locations in the ACP or MCP, still displays "just the actual UTF-8 subject string" without the benefit of the new processing. It's just "better" in terms of visual consistency when the new processing is used for the display.

    The main goal is to have the primary surfaces users interact with -- the index.php, viewforum.php and viewtopic.php displays -- all showing the emoji characters consistently during normal forum browsing, searching, and reading, rather than "one way in the subject line" and "another way in the actual message."
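
As a rough illustration of the direct-replacement alternative, here is a minimal PHP sketch of what new_subject_emoji_processing() might look like. This is not how text-formatter does it. The emoji ranges in the regular expression are a deliberately narrow assumption (a real version would work from the full Twemoji/Unicode emoji list, and would need to handle flags and keycaps), twemoji_file_name() is a hypothetical helper, and the CDN base path and <img> attributes are simply copied from the example output earlier in this post. It also assumes the subject passed in is already HTML-escaped, since the function adds markup without escaping anything itself.

```php
<?php
// Base path copied from the example output in this post; an assumption, not a phpBB setting.
const TWEMOJI_BASE = '//twemoji.maxcdn.com/2/svg/';

/**
 * Hypothetical helper: build the Twemoji file name for one emoji sequence.
 * Code points are joined with "-" in lowercase hex. U+FE0F is dropped unless
 * the sequence contains a zero-width joiner, which is consistent with the
 * 2764.svg and 1f469-200d-2696-fe0f.svg names in the example output.
 */
function twemoji_file_name(string $sequence): string
{
    if (strpos($sequence, "\u{200D}") === false) {
        $sequence = str_replace("\u{FE0F}", '', $sequence);
    }
    // UTF-32BE stores each code point as one big-endian 32-bit integer.
    $codepoints = unpack('N*', iconv('UTF-8', 'UTF-32BE', $sequence));
    return implode('-', array_map('dechex', $codepoints));
}

/**
 * Sketch of the subject processor: replaces emoji sequences with <img> tags
 * and leaves everything else (BBCode, smiley aliases, plain text) untouched.
 */
function new_subject_emoji_processing(string $subject): string
{
    // One emoji "unit": a base character from a simplified set of ranges,
    // optionally followed by a variation selector or a skin-tone modifier.
    $unit = '[\x{2600}-\x{27BF}\x{1F000}-\x{1FAFF}][\x{FE0F}\x{1F3FB}-\x{1F3FF}]?';
    // Units chained with zero-width joiners form a single emoji (e.g. "woman judge").
    $pattern = '/' . $unit . '(?:\x{200D}' . $unit . ')*/u';

    $result = preg_replace_callback($pattern, function (array $match): string {
        $sequence = $match[0];
        // The example output writes the joiner as an entity inside the alt text.
        $alt = str_replace("\u{200D}", '&zwj;', $sequence);
        return '<img alt="' . $alt . '" class="emoji smilies" draggable="false" src="'
            . TWEMOJI_BASE . twemoji_file_name($sequence) . '.svg">';
    }, $subject);

    // On a PCRE error (e.g. invalid UTF-8) fall back to the unmodified subject.
    return $result ?? $subject;
}

echo new_subject_emoji_processing("Hand \u{270B}"), "\n";
echo new_subject_emoji_processing('Using [b]:smile:[/b] is broken'), "\n";
```

At the call site this would wrap censor_text() exactly as in the TOPIC_TITLE example. A subject like "Using [b]:smile:[/b] is broken" passes through unchanged, because only characters in the emoji ranges are ever matched.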

Are there any additional considerations you can foresee that would need to be taken into account, within the described scope of intent?
