Event Sourcing and CQRS

martti
Registered User
Posts: 45
Joined: Wed Aug 20, 2014 4:50 pm
Location: Belgium

Event Sourcing and CQRS

Post by martti »

Hi, has anyone ever thought about using Event Sourcing and CQRS for phpBB at some point in the future? It seems like a good idea to me.

Instead of storing the current state directly in the relational database, you store all events (state changes) sequentially in an append-only store. This can be done in almost any type of database; it can be a single table in an SQL database. This sequence of events is then "the source of truth" that all parts of your system can listen to and build useful view models from. In practice you separate database concerns: a side optimised for writing (the Event Store: append-only, persisted to disk) and a side optimised for reading and querying (any sort of database: SQL, NoSQL, text search engines, files, memory, etc.; it doesn't need to be persistent, since it can be rebuilt from the events).
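A minimal sketch of that split, assuming SQLite as the store. The table layout, the event names (`TopicWasCreated`, `TopicWasRenamed`) and the helper functions are made up for illustration; nothing like this exists in phpBB.

```python
import json
import sqlite3

# Write side: a single append-only table (hypothetical layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " type TEXT NOT NULL,"
    " payload TEXT NOT NULL)"
)


def append_event(event_type, payload):
    """Only INSERT is ever issued against the event store."""
    conn.execute(
        "INSERT INTO events (type, payload) VALUES (?, ?)",
        (event_type, json.dumps(payload)),
    )


def build_topic_titles():
    """Read side: a disposable view model built by reading events in order."""
    titles = {}
    rows = conn.execute("SELECT type, payload FROM events ORDER BY seq")
    for event_type, payload in rows:
        data = json.loads(payload)
        if event_type in ("TopicWasCreated", "TopicWasRenamed"):
            titles[data["topic_id"]] = data["title"]
    return titles


append_event("TopicWasCreated", {"topic_id": 1, "title": "Event Sourcing"})
append_event("TopicWasRenamed", {"topic_id": 1, "title": "Event Sourcing and CQRS"})
append_event("TopicWasCreated", {"topic_id": 2, "title": "Off topic"})

topic_titles = build_topic_titles()
print(topic_titles)
```

The view model here lives only in memory; it could just as well be written to another table or a search index, because it can always be rebuilt from the `events` table.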

This gives the following advantages:
  • Flexibility: you can come up with new view models built from the data in the Event Store at any time, and you can have as many view models as you like.
  • You keep the full history of your system. You can go back to any point in history by replaying the events to see what your board looked like (as Git does for code). You have a complete audit log, which provides traceability. E.g. when somebody edits a post 25 times, you can go back to how it was at edit 5. You don't need confirmation screens, as everything can be undone.
  • Security: you cannot accidentally throw away or overwrite data, as the Event Store is append-only (only "insert", no "update" or "delete").
  • Storing all the events that led to the current state, rather than just the current state, is much closer to how the real world works. Imagine someone from the Middle Ages being dropped into our time: that person would not understand much of it, because he/she has no knowledge of the events that led up to the present.
  • Because you are not bound to the relational database model, you can use natural language in your events to describe the intent of the user. This makes it easier to understand and to reason about what happened.
  • Testability: you can replay all your events through a test system (and commands too, if you store them), and reproduce what happened on a certain board at a certain moment in time.
  • Database migrations become a thing of the past. The Event Store data model itself is simple and unlikely to change, and the read models are rebuilt from the beginning of history rather than migrated.
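The "go back to edit 5" point can be sketched as a partial replay. This assumes every revision of a post is stored as its own event; the `PostEdited` class and the `post_text_after` helper are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostEdited:
    """One stored revision of a post (hypothetical event type)."""
    post_id: int
    text: str


def post_text_after(events, post_id, revisions):
    """Replay only the first `revisions` events of one post."""
    text = None
    seen = 0
    for event in events:
        if event.post_id != post_id:
            continue
        if seen == revisions:
            break
        text = event.text
        seen += 1
    return text


# A post that was saved 25 times.
history = [PostEdited(7, f"revision {n}") for n in range(1, 26)]

print(post_text_after(history, 7, 5))   # state of the post at revision 5
print(post_text_after(history, 7, 25))  # current state
```

Nothing is overwritten along the way: the current text is just the result of replaying everything, and any earlier state is a shorter replay.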
Disadvantages might be:
  • You need to think it through well in advance when setting up a system like this.
  • It might be too much work to change.
  • It might not be easy.
  • In general there is not much experience with this kind of model, although it is getting more and more attention. Most developers have a relational database mindset, and some might not take to it.
  • You have to deal with asynchronicity.
  • ... (things I haven't thought of)
http://www.cqrs.nu/Faq/event-sourcing
https://stackoverflow.com/questions/332 ... g-and-cqrs
https://lostechies.com/jimmybogard/2012 ... qrs-myths/
https://martinfowler.com/eaaDev/EventSourcing.html
Greg Young has done a lot to advocate this model (video).

david63
Registered User
Posts: 355
Joined: Mon Feb 07, 2005 7:23 am

Re: Event Sourcing and CQRS

Post by david63 »

I think I follow in principle what this is about, but the first thing that comes to mind is this: if you are storing all events that occur on a phpBB board in the database, will this not result in a massive data storage problem, possibly beyond the resources of many boards? Or maybe I do not fully appreciate the concept.
David
Remember: You only know what you know -
and you do not know what you do not know!

Re: Event Sourcing and CQRS

Post by martti »

david63 wrote: Thu Apr 26, 2018 8:36 pm
"I think I follow in principle what this is about, but the first thing that comes to mind is this: if you are storing all events that occur on a phpBB board in the database, will this not result in a massive data storage problem, possibly beyond the resources of many boards? Or maybe I do not fully appreciate the concept."
That's often the first concern that comes to mind when people hear about this. Yes, you have extra storage requirements. But storage is very cheap and gets cheaper every year. You can already store a lot of events in just 1 GB (which is not much these days). And we are not dealing with a massive stream of events like in financial systems. If it ever became a concern, there are strategies such as snapshotting and archiving older events.
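A rough sketch of the snapshotting idea, assuming a read model that counts posts per user. The event shape, the `PostWasCreated` name and the snapshot format are assumptions for illustration only.

```python
def apply(state, event):
    """Fold one event into a {user: post_count} state, returning a new dict."""
    new_state = dict(state)
    if event["type"] == "PostWasCreated":
        new_state[event["user"]] = new_state.get(event["user"], 0) + 1
    return new_state


def replay(events, state=None):
    """Replay events on top of an optional starting state."""
    state = {} if state is None else state
    for event in events:
        state = apply(state, event)
    return state


events = [
    {"seq": 1, "type": "PostWasCreated", "user": "martti"},
    {"seq": 2, "type": "PostWasCreated", "user": "david63"},
    {"seq": 3, "type": "PostWasCreated", "user": "martti"},
    {"seq": 4, "type": "PostWasCreated", "user": "Paul"},
]

# Snapshot taken after seq 2; events 1-2 could now be moved to an archive.
snapshot = {"up_to_seq": 2, "state": replay(events[:2])}
recent = [e for e in events if e["seq"] > snapshot["up_to_seq"]]

from_snapshot = replay(recent, snapshot["state"])
from_scratch = replay(events)
print(from_snapshot == from_scratch)
```

The snapshot is only a shortcut: replaying the recent events on top of it gives the same result as replaying the full history, so the older events can sit on cheaper storage without being thrown away.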

[Image: My 2 cents = 1 GB]

See https://youtu.be/JHGkaShoyNs?t=714

Some more links with general info on ES & CQRS:
https://community.risingstack.com/event ... g-vs-crud/
https://msdn.microsoft.com/en-us/library/jj591559.aspx
http://www.ben-morris.com/designing-an- ... -sourcing/
https://abdullin.com/post/event-sourcing-why/

Re: Event Sourcing and CQRS

Post by david63 »

I appreciate that storage is cheap these days, but many (the majority?) of phpBB users are on [cheap] shared hosting where the amount of storage is limited, and some hosts are notorious for trying to force you to upgrade your hosting package when you run out of space.

The point I am trying to make is that for some this approach could have a negative impact.

Anyway, those statistics do not represent the true picture: where can you go and buy a 1 GB drive for 2c? Yes, you can buy a 2 TB drive for $20 (possibly), but that is not quite the same.

Re: Event Sourcing and CQRS

Post by martti »

OK, that can be a point of discussion. We would then need to find out, or at least estimate, how much more data to expect, and how much the extra disk space affects the price.
E.g. my current shared hosting package is in the lowest category: I pay 0.50 euro/month and it gives 1.5 GB. Currently I have a 55 MB database, of which I suspect the logs make up a significant part (something that overlaps with ES). If my database were 10 times the size I would still be fine. Attachments currently take much more space. My board is of course a small one.
What is the size of the database of the phpBB community board, for example?
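The arithmetic behind "10 times the size would still be fine", worked out with the numbers above. The factor of 10 is the guess from this post, not a measured growth rate, and attachments are left out of the calculation.

```python
quota_mb = 1.5 * 1024   # 1.5 GB hosting package, taking 1 GB = 1024 MB
database_mb = 55        # current database size
growth_factor = 10      # assumed worst case for this estimate

projected_mb = database_mb * growth_factor
share_of_quota = projected_mb / quota_mb

print(f"{projected_mb} MB is {share_of_quota:.0%} of the quota")
```

So even the tenfold case would use roughly a third of the package, before counting attachments.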

Re: Event Sourcing and CQRS

Post by martti »

Actually, I don't think that much more data would be generated. Except maybe for the "views" (the view counter of the topics): if storing these as events were a problem, you could offer the option to keep them out of the ES and still store them as "state". Then again, storing the "views" in the ES could give a lot of interesting information, such as when people read a topic (you could even build a graph to see when a topic was popular), from what part of the world, etc.
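A small sketch of what view events would make possible, assuming each view is stored with a timestamp. The `TopicWasViewed` event and its fields are hypothetical.

```python
from collections import Counter
from datetime import datetime

view_events = [
    {"type": "TopicWasViewed", "topic_id": 42, "at": "2018-04-26T20:36:00"},
    {"type": "TopicWasViewed", "topic_id": 42, "at": "2018-04-26T21:10:00"},
    {"type": "TopicWasViewed", "topic_id": 7, "at": "2018-04-27T09:00:00"},
    {"type": "TopicWasViewed", "topic_id": 42, "at": "2018-04-27T11:28:00"},
]


def views_per_day(events, topic_id):
    """Project view events into a {date: count} series for one topic."""
    counts = Counter()
    for event in events:
        if event["type"] == "TopicWasViewed" and event["topic_id"] == topic_id:
            day = datetime.fromisoformat(event["at"]).date().isoformat()
            counts[day] += 1
    return dict(sorted(counts.items()))


per_day = views_per_day(view_events, 42)
total_views = sum(per_day.values())  # the classic view counter falls out for free
print(per_day, total_views)
```

The plain counter stored as "state" can only ever answer "how many"; the per-day series (the data for a popularity graph) needs the individual events.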

Paul
Infrastructure Team Leader
Posts: 373
Joined: Thu Sep 16, 2004 9:02 am

Re: Event Sourcing and CQRS

Post by Paul »

phpBB.com's database is around 8 GiB. (That includes the CDB and other pages of course, but we are not using the database for search; we use Sphinx instead.)
You should not forget that phpBB would require a huge number of events. It is not just posting/editing/registering, but also all actions in the MCP & UCP (and, depending on how far you want to go, also the ACP). And if you want some performance on larger sites, you will also need a read model, making the amount of data even bigger.
You should also consider what the upgrade path would look like. I think it is nearly impossible to get a larger site from the current database model to an event model without losing info, or without defining separate events, which makes it even more complex.
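To make the "separate events" option concrete, here is a sketch of seeding an event store from current-state rows. The `PostWasImported` event is a hypothetical migration-only event, and the row keys are assumed column names used for illustration.

```python
def import_events(post_rows):
    """Turn current-state rows into one synthetic event per post."""
    events = []
    for row in post_rows:
        events.append({
            "type": "PostWasImported",             # migration-only event
            "post_id": row["post_id"],
            "text": row["post_text"],              # only the final text survives
            "edit_count": row["post_edit_count"],  # how many edits, not what they were
        })
    return events


legacy_rows = [
    {"post_id": 1, "post_text": "Hello", "post_edit_count": 0},
    {"post_id": 2, "post_text": "Final wording", "post_edit_count": 25},
]

seed_events = import_events(legacy_rows)
print(seed_events[1])
```

This shows both concerns at once: the 25 earlier versions of post 2 cannot be recovered from the row, and every projection has to handle one more event type than the regular posting and editing events.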

Re: Event Sourcing and CQRS

Post by martti »

paulus wrote: Fri Apr 27, 2018 11:28 am
"phpBB.com's database is around 8 GiB. (That includes the CDB and other pages of course, but we are not using the database for search; we use Sphinx instead.)
You should not forget that phpBB would require a huge number of events. It is not just posting/editing/registering, but also all actions in the MCP & UCP (and, depending on how far you want to go, also the ACP). And if you want some performance on larger sites, you will also need a read model, making the amount of data even bigger.
You should also consider what the upgrade path would look like. I think it is nearly impossible to get a larger site from the current database model to an event model without losing info, or without defining separate events, which makes it even more complex."
That's still far from big data. I have considered all those events, and of course the upgrade path is something that would have to be thought through very well in advance. At least one read model (a view model, a projection) is a must, not an option, even for small sites I think.

That being said, I just mention ES and CQRS here as food for thought. It's a technique that brings challenges but also great benefits; it makes difficulties disappear that were introduced by the current-state model (coupling the domain model to the database model). A few years ago you couldn't find much information on it, but that has changed, and it's getting more and more attention. And the price of storage and hosting keeps dropping. So maybe by the time people start working on phpBB4 it might not be such an odd choice.
