I quickly glanced over your initial changes and noticed this:
Code: Select all
define('POST_REVISIONS_TABLE', $table_previx . 'post_revisions');
Code: Select all
define('POST_REVISIONS_TABLE', $table_previx . 'post_revisions');
Good catch. Fixed.t_backoff wrote:David,
I quickly glanced over your initial changes and noticed this:
That should be $table_prefixCode: Select all
define('POST_REVISIONS_TABLE', $table_previx . 'post_revisions');
Code: Select all
<imkingdavid> hmm... the diff engine packaged with phpbb only shows changes per line, not per character. so if I were to write: "Post" and then change it to "Posting", it would show a line removed "Post" and a line added "Posting" instead of 3 characters added "ing"
<imkingdavid> is that preferrable or should I figure out how to do a per-character diff?
<bantu> imkingdavid: AFAIK diff is by definition line-based
<imkingdavid> bantu: perhaps, but the diff engine I referenced in my gsoc proposal, http://www.raymondhill.net/finediff/viewdiff-ex.php has options for per-character, per-word, per-sentence, and per paragraph/line
<bantu> ah
<imkingdavid> I think that a diff for code works best as per-line, but a diff for user-written content (i.e. prose) is better either per-word or per-character
<bantu> yes, that might be the case
<imkingdavid> I don't want to package two diff engines with phpBB, though, so if we were to use a per word or per character diff, I would have to write it myself
<imkingdavid> which is okay if we want to do that
<-- stodan (~daniel@unaffiliated/stodan) has quit (Quit: Over & Out)
<bantu> imkingdavid: Have you thought about conflict resolution yet?
<imkingdavid> bantu: not yet. i'd need an example conflict to test
<imkingdavid> although...
<imkingdavid> it only ever compare two post revisions, so I'm not sure how there could be a conflict
<bantu> exactly
<bantu> so we just won't support branching off
<bantu> say two people editing at the same time
<bantu> this is fine I guess
<imkingdavid> revisions are based on posts, not other revisions. so if two people edit at the same time, two revisions will be made on the post, and the most recently finished one will be used
<imkingdavid> if the other one is decided to be better, they can swap them
<bantu> imkingdavid: they will usually want to merge them
<bantu> and the last editor will already get a warning about it
<imkingdavid> right
<bantu> I'm not sure about the overhead of using a character or word based approach for storage.
<imkingdavid> bantu: all revisions are stored in full text, nto as diffs
<imkingdavid> not*
<bantu> I'm pretty sure git does line-based and e.g. GitHub just highlights words again
<imkingdavid> right
<bantu> imkingdavid: so this is just about display
<imkingdavid> basically
<imkingdavid> when I change the word "Post" to "Posting" I only want to see "ing" in green, not "Post" in red and "Posting" in green
<imkingdavid> although, if we go word-based, it would be the latter
<imkingdavid> but the whole line wouldn't show as a change, just that word
<imkingdavid> which would be okay too
<bantu> line based is certainly bad for display
<bantu> I agree with that
<imkingdavid> alright, I'll work on either word or character based diff. which would you prefer?
<bantu> Especially since a post can be a single line that is very long.
<imkingdavid> right
<bantu> I don't think there would be a problem with packaging another diff engine.
<bantu> The current diff engine is however also used for doing actual merges.
<bantu> So it is kind of important that things actually work there.
<imkingdavid> right
<imkingdavid> i can include a new diff engine separately, since that would be easier than adding to the current engine
<imkingdavid> here's the code for that diff engine I referenced: https://github.com/gorhill/PHP-FineDiff
<-- rxu (~Miranda@phpbb/developer/rxu) has quit (Read error: Connection reset by peer)
<imkingdavid> i'll just change the class name to fit with naming guidelines and then make sure the code conforms to coding guidlines
<imkingdavid> or is that not required for 3rd party libs?
<bantu> it really is the other way around
<bantu> third party libraries should be used as is if possible
<imkingdavid> okay
<bantu> since it does support word-based and character-based diffing, it should not matter right now
<bantu> feel free to pick one
<imkingdavid> alright
It turns out we cannot use FineDiff as it does not support UTF8 properly. Also, after closer review, it does not look very mature to me.imkingdavid wrote:TL;DR: Will be using FineDiff (MIT Licensed) for post revision diff, because the currently-packaged line-based diff is not visually appealing for prose diffs. Instead, will use either character- or word-based (defaults to character) diff.
[01:45:24] <naderman> imkingdavid: please entirely ignore the diff engine used for updates, and just use a different one
[01:45:28] <naderman> oh he's gone
[01:45:33] <naderman> bantu: well you tell him that
[01:46:01] <bantu> naderman: that is what I told him
[01:46:20] <naderman> well not quite that clearly, but ok
[01:46:42] <naderman> and try and make sure whichever one you pick is proper PHP5 code
[01:49:00] <naderman> * Does it work with UTF-8?
[01:49:00] <naderman> As of now, the code assume single-byte characters. To use UTF-8 text, you can
[01:49:00] <naderman> always convert the encoding using mb_convert_encoding():
[01:49:07] <naderman> so the library he was looking at is not an option
[01:52:49] <bantu> naderman: interesting
[01:53:41] <bantu> I would also prefer if the library would be "more mature".
[01:53:48] <naderman> yeah
[01:53:59] <naderman> is he planning on doing snapshot or delta storage?
[01:55:48] <bantu> naderman: I think he doesn't want to do delta storage
[01:55:54] <naderman> k
[01:56:29] <bantu> this will cost quite a bit of storage then
[01:56:41] <naderman> yeah
[01:56:52] <naderman> but implementing a delta store on top of SQL sucks too
[01:56:54] <bantu> ideally, the new bbcode engine would be finished first
[01:57:00] <naderman> I wonder what mediawiki does
[01:57:03] <naderman> true
[01:57:19] <naderman> well if it's just about displaying diffs, I guess it doesn't matter that much
[01:57:23] <naderman> what matters is all the other stuff
[01:57:38] <naderman> I would actually recommend to him that he entirely ignores displaying diffs until the very end
[01:57:43] <naderman> and just displays them side by side for now
[01:58:04] <naderman> all the functionality around revisions is way more important
[01:58:20] <bantu> yeah, I agree
I would as well, except that it often displays very oddly. So there would have to be some added logic to get it to display correctly. For instance, take the following two strings, which I fabricated entirely for this example, even though there are probably examples that better illustrate the issue:drathbun wrote:Random scattered thoughts...
I would prefer a character diff rather than word, for reasons already listed (post versus posting example given).
I am a posterity person
Here is the resulting character diff:I ask a pottery question
Instead of the actual changes I made (word diff):I amsk a postteritypquerstion
As I said, not the best example, but I think it illustrates how often times coincidental similarities between strings can make the diff appear more confusing than a word-based diff.Iamask aposteritypotterypersonquestion
I am a posterity person
I ask a pottery question
No unsolicited PMs please except for quotes.psoTFX wrote: I went with Olympus because as I said to the teams ... "It's been one hell of a hill to climb"