Converting plain text URLs into something else (images, videos, etc.) is usually part of the "media" features of forum software. vBulletin, IPB and XenForo all have something like that. I think WordPress does something similar too, in addition to shortcodes. They're usually called "media sites", "media BBCodes" or "media embeds". IMO it's a different concept from simply replacing the text of a link.
In general, I'm wary of scraping content because it's open to abuse: it could be used to DoS the originating server (à la Slowloris), serve as a reflector for other attacks, or even as a way to probe the server's internal network. If we're only talking about internal links, it would be interesting to change the link on the client side via XHR, directly within the textarea: run a regexp on the textarea's content to find URLs in plain text, use XHR to retrieve the page and extract the title, and replace the link text with it. That way, the connection originates from the client and the permissions are checked the normal way, with no extra precautions to take. That would make it harder to abuse.
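
To make the client-side idea a bit more concrete, here's a rough sketch of the three steps (find plain URLs, fetch the page, swap in the title). Everything in it is an assumption on my part: the function names are made up, the `[url=...]` output format is just one plausible target, the "internal" check is a simple prefix match on the site's origin, and the title is pulled out with a regexp rather than a real HTML parser. The fetching part is passed in as a callback so the logic doesn't depend on how the request is made.

```typescript
// Hypothetical helpers; none of these names come from an existing library.

// Callback that resolves with the HTML of a page (XHR, fetch, a stub, ...).
type Fetcher = (url: string) => Promise<string>;

// Plain-text URLs only: the lookbehind skips URLs glued to other markup,
// e.g. [url=http://...] or [url]http://...[/url].
const URL_PATTERN = /(?<!\S)https?:\/\/[^\s\[\]<>"]+/g;

// Drop trailing punctuation that is rarely part of the URL itself.
function trimUrl(candidate: string): string {
    return candidate.replace(/[.,;:!?)]+$/, "");
}

// Return the distinct plain-text URLs in `text` that live under `origin`.
function findInternalUrls(text: string, origin: string): string[] {
    const found = new Set<string>();
    for (const candidate of text.match(URL_PATTERN) ?? []) {
        const url = trimUrl(candidate);
        if (url === origin || url.indexOf(origin + "/") === 0) {
            found.add(url);
        }
    }
    return Array.from(found);
}

// Naive <title> extraction. HTML entities are NOT decoded here.
function extractTitle(html: string): string | null {
    const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
    if (!match) return null;
    const title = match[1].replace(/\s+/g, " ").trim();
    return title === "" ? null : title;
}

// Replace each internal plain-text URL with [url=URL]Title[/url].
// URLs whose page can't be fetched or has no title are left untouched.
async function replaceInternalUrls(
    text: string,
    origin: string,
    fetchPage: Fetcher
): Promise<string> {
    const titles = new Map<string, string>();
    for (const url of findInternalUrls(text, origin)) {
        try {
            const title = extractTitle(await fetchPage(url));
            if (title !== null) titles.set(url, title);
        } catch (e) {
            // Request failed or access was denied: keep the URL as typed.
        }
    }
    return text.replace(URL_PATTERN, (candidate) => {
        const url = trimUrl(candidate);
        const title = titles.get(url);
        if (title === undefined) return candidate;
        // Re-append whatever punctuation trimUrl() removed.
        return `[url=${url}]${title}[/url]` + candidate.slice(url.length);
    });
}
```

In the browser, `fetchPage` would wrap an XHR GET that resolves with `responseText`, `origin` would be `location.origin`, and the result would be written back to the textarea's `value`. Since the request is sent with the user's own session, a page they can't see just fails and the URL stays as typed. A real version would also need to escape `[` and `]` in the title before putting it inside a BBCode.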