About a year ago now, I figured out and implemented an "alternate" method for tracking read/unread information for forum posts. It is a form of run-length encoding/compression (or so I'm told) that efficiently stores per-post seen/unseen information while never needing more seen/unseen records per user than half the total number of posts in the database (and usually far fewer). It works by maintaining "seen boundaries," where each record lists the lower and upper boundary of a run of posts that have been seen. On page load, a subroutine maintains this list of boundaries for the user by creating a new boundary, growing an existing one, or merging two boundaries. It works for data sets that have monotonically increasing integers as their primary keys (like phpBB).
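To make the idea concrete, here is a minimal Python sketch of the boundary maintenance described above. It is an illustration under assumptions, not the actual implementation: the names `mark_seen` and `is_seen` are hypothetical, one user's boundaries are held in memory as a sorted list of inclusive `(low, high)` tuples rather than database rows, and two boundaries are merged only when the post IDs are directly adjacent.

```python
from bisect import bisect_right

def mark_seen(ranges, post_id):
    """Return a new sorted list of inclusive (low, high) ranges with post_id seen.

    Hypothetical helper: `ranges` stands in for one user's boundary records.
    """
    # i = number of ranges whose low bound is <= post_id
    i = bisect_right(ranges, (post_id, float("inf")))
    left = ranges[i - 1] if i > 0 else None
    right = ranges[i] if i < len(ranges) else None

    if left is not None and left[1] >= post_id:
        return list(ranges)  # already inside an existing boundary

    touches_left = left is not None and left[1] + 1 == post_id
    touches_right = right is not None and right[0] - 1 == post_id

    if touches_left and touches_right:
        # The new post closes the gap: merge two boundaries into one.
        return ranges[:i - 1] + [(left[0], right[1])] + ranges[i + 1:]
    if touches_left:
        # Grow the lower neighbour upward.
        return ranges[:i - 1] + [(left[0], post_id)] + ranges[i:]
    if touches_right:
        # Grow the upper neighbour downward.
        return ranges[:i] + [(post_id, right[1])] + ranges[i + 1:]
    # Isolated post: create a new single-post boundary.
    return ranges[:i] + [(post_id, post_id)] + ranges[i:]

def is_seen(ranges, post_id):
    """True if post_id falls inside any stored boundary."""
    i = bisect_right(ranges, (post_id, float("inf")))
    return i > 0 and ranges[i - 1][1] >= post_id

# Reading posts out of order: 5, 6, 9, then 7 and 8 close the gap.
ranges = []
for pid in (5, 6, 9, 7, 8):
    ranges = mark_seen(ranges, pid)
print(ranges)  # [(5, 9)]
```

Five posts read out of order end up as a single record. The worst case for this sketch is a user who reads every other post, which leaves one record per seen post, or about half the total number of posts.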
You can read about it here: http://stackoverflow.com/questions/2288 ... 27#5045827
I say in the above link that phpBB "cheats" a little because it only stores timestamps in its *_track tables rather than per-post information. However, as I understand it, phpBB is a decidedly flat forum where people very rarely read out of order and usually read through pages of posts rather than single posts, so more than per-forum or per-topic timestamps isn't likely to be needed.
In the forum that I designed this for, posts were threaded, and given some of the topics, reading out of order was entirely possible, if not encouraged. Therefore, per-post historical accuracy was important. In the year since I came up with my initial design, I've improved upon it, adding hooks to provide nice display features outside of the seen/unseen maintenance routine ("Which posts in this thread had I not read before accessing this page?"), and also to efficiently support the maintenance of multiple messages viewed at once.
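As a rough illustration of those two additions, the Python sketch below handles a whole page of posts in one pass. It is a guess at the shape of such a routine, not the actual code: `view_page` is a hypothetical name, boundaries are assumed to be inclusive `(low, high)` tuples for one user, and only directly adjacent post IDs are merged. It first records which posts on the page were unseen before this view (the display hook), then folds all of them into the boundary list with a single sort-and-coalesce step instead of one update per post.

```python
def view_page(ranges, page_post_ids):
    """Return (new_ranges, previously_unseen) for one page view.

    Hypothetical helper: `ranges` is one user's list of inclusive
    (low, high) seen boundaries; `page_post_ids` are the posts shown.
    """
    def seen(pid):
        return any(lo <= pid <= hi for lo, hi in ranges)

    # Display hook: which posts were new to the user before this page load?
    previously_unseen = [pid for pid in page_post_ids if not seen(pid)]

    # Batch maintenance: treat each newly seen post as a one-post range,
    # sort everything once, then coalesce overlapping or adjacent ranges.
    merged = []
    for lo, hi in sorted(list(ranges) + [(p, p) for p in previously_unseen]):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged, previously_unseen

new_ranges, unseen = view_page([(1, 3), (8, 9)], [2, 4, 5, 7])
print(unseen)      # [4, 5, 7]
print(new_ranges)  # [(1, 5), (7, 9)]
```

The `unseen` list can drive highlighting on the rendered page, while `new_ranges` is what would be written back for the user.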
Based on what I've seen while briefly looking around here, I know the read/unread feature comes up a lot. I appreciate phpBB's current simplicity, but given some of the use cases ("Mark as unread", reading backwards, sub-forums, sub-sub-forums, etc.), it seems... unwieldy? This is a common problem, and though my solution may not be absolutely perfect, I would have thought this would be a better-solved problem by now.
Has anyone seen this type of implementation before? If so, were there problems? Or do you have particular issues with, or questions about, my current implementation? And are you tired of this topic coming up?
Thanks for reading.



