Rewriting from ground up

sooskriszta · Post by **sooskriszta** » Fri Aug 26, 2011 12:35 pm

Old article, but certainly relevant.

Joel Spolsky, founder of Fog Creek Software and Stack Overflow, and creator of Kiln wrote: Netscape 6.0 is finally going into its first public beta. There never was a version 5.0. The last major release, version 4.0, was released almost three years ago. Three years is an awfully long time in the Internet world. During this time, Netscape sat by, helplessly, as their market share plummeted.

It's a bit smarmy of me to criticize them for waiting so long between releases. They didn't do it on purpose, now, did they?

Well, yes. They did. They did it by making the single worst strategic mistake that any software company can make:

They decided to rewrite the code from scratch.

Netscape wasn't the first company to make this mistake. Borland made the same mistake when they bought Arago and tried to make it into dBase for Windows, a doomed project that took so long that Microsoft Access ate their lunch, then they made it again in rewriting Quattro Pro from scratch and astonishing people with how few features it had. Microsoft almost made the same mistake, trying to rewrite Word for Windows from scratch in a doomed project called Pyramid which was shut down, thrown away, and swept under the rug. Lucky for Microsoft, they had never stopped working on the old code base, so they had something to ship, making it merely a financial disaster, not a strategic one.

We're programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We're not excited by incremental renovation: tinkering, improving, planting flower beds.

There's a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:

It’s harder to read code than to write it.

This is why code reuse is so hard. This is why everybody on your team has a different function they like to use for splitting strings into arrays of strings. They write their own function because it's easier and more fun than figuring out how the old function works.

As a corollary of this axiom, you can ask almost any programmer today about the code they are working on. "It's a big hairy mess," they will tell you. "I'd like nothing better than to throw it out and start over."

Why is it a mess?

"Well," they say, "look at this function. It is two pages long! None of this stuff belongs in there! I don't know what half of these API calls are for."

Before Borland's new spreadsheet for Windows shipped, Philippe Kahn, the colorful founder of Borland, was quoted a lot in the press bragging about how Quattro Pro would be much better than Microsoft Excel, because it was written from scratch. All new source code! As if source code rusted.

The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they've been fixed. There's nothing wrong with it. It doesn't acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that's kind of gross if it's not made out of all new material?

Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.

Each of these bugs took weeks of real-world usage before they were found. The programmer might have spent a couple of days reproducing the bug in the lab and fixing it. If it's like a lot of bugs, the fix might be one line of code, or it might even be a couple of characters, but a lot of work and time went into those two characters.

When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work.

You are throwing away your market leadership. You are giving a gift of two or three years to your competitors, and believe me, that is a long time in software years.

You are putting yourself in an extremely dangerous position where you will be shipping an old version of the code for several years, completely unable to make any strategic changes or react to new features that the market demands, because you don't have shippable code. You might as well just close for business for the duration.

You are wasting an outlandish amount of money writing code that already exists.

Is there an alternative? The consensus seems to be that the old Netscape code base was really bad. Well, it might have been bad, but, you know what? It worked pretty darn well on an awful lot of real world computer systems.

When programmers say that their code is a holy mess (as they always do), there are three kinds of things that are wrong with it.

First, there are architectural problems. The code is not factored correctly. The networking code is popping up its own dialog boxes from the middle of nowhere; this should have been handled in the UI code. These problems can be solved, one at a time, by carefully moving code, refactoring, changing interfaces. They can be done by one programmer working carefully and checking in his changes all at once, so that nobody else is disrupted. Even fairly major architectural changes can be done without throwing away the code. On the Juno project we spent several months rearchitecting at one point: just moving things around, cleaning them up, creating base classes that made sense, and creating sharp interfaces between the modules. But we did it carefully, with our existing code base, and we didn't introduce new bugs or throw away working code.

A second reason programmers think that their code is a mess is that it is inefficient. The rendering code in Netscape was rumored to be slow. But this only affects a small part of the project, which you can optimize or even rewrite. You don't have to rewrite the whole thing. When optimizing for speed, 1% of the work gets you 99% of the bang.

Third, the code may be doggone ugly. One project I worked on actually had a data type called a *beep*. Another project had started out using the convention of starting member variables with an underscore, but later switched to the more standard "m_". So half the functions started with "_" and half with "m_", which looked ugly. Frankly, this is the kind of thing you solve in five minutes with a macro in Emacs, not by starting from scratch.
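To give a feel for how small that kind of cleanup is, here is a minimal sketch in Python rather than an Emacs macro. It is not from Joel's article: the regular expression and the `rename_members` name are assumptions made up for illustration, and a real codebase would need more care (the pattern also touches matching names inside comments and string literals).

```python
import re

# Hypothetical pattern: an identifier that starts with exactly one underscore
# followed by a letter, e.g. "_count". Names such as "__init__", "foo_bar" or
# an already renamed "m_count" are left alone, because "\b" only matches at
# the start of an identifier and the next character must be a letter.
LEADING_UNDERSCORE = re.compile(r"\b_([A-Za-z]\w*)")


def rename_members(source: str) -> str:
    """Rewrite `_name` as `m_name` throughout a chunk of source text."""
    return LEADING_UNDERSCORE.sub(r"m_\1", source)


if __name__ == "__main__":
    before = "int _count; foo_bar = _count + this->_size;"
    print(rename_members(before))
```

Running the rename a second time changes nothing, so it is safe to apply file by file while the rest of the team keeps working.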

It's important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time. First of all, you probably don't even have the same programming team that worked on version one, so you don't actually have "more experience". You're just going to make most of the old mistakes again, and introduce some new problems that weren't in the original version.

The old mantra "build one to throw away" is dangerous when applied to large scale commercial applications. If you are writing code experimentally, you may want to rip up the function you wrote last week when you think of a better algorithm. That's fine. You may want to refactor a class to make it easier to use. That's fine, too. But throwing away the whole program is a dangerous folly, and if Netscape actually had some adult supervision with software industry experience, they might not have shot themselves in the foot so badly.

sooskriszta · Post by **sooskriszta** » Sat Aug 27, 2011 12:37 am

Chad Fowler, author of The Passionate Programmer, wrote: You’ve got an existing, successful software product. You’ve hit the ceiling on extensibility and maintainability. Your project platform is inflexible, and your application is a software house of cards that can’t support another new feature.

You’ve seen the videos, the weblog posts and the hype, and you’ve decided you’re going to re-implement your product in Rails (or Java, or .NET, or Erlang, etc.).

Beware. This is a longer, harder, more failure-prone path than you expect.

Throughout my career in software development, I’ve been involved in Big Rewrite after Big Rewrite. I suspect it’s because I have an interest in learning eclectic computer languages, operating systems, and development environments. Not being just-a-Java-guy or just-a-Windows-guy has led to me becoming a serial rewriter. I’ve been on projects to replace C, COBOL, PHP, Visual Basic, Perl, PLSQL, VBX (don’t ask!) and all manner of architectural atrocities with the latest and greatest technology of the day.

In many cases, these Big Rewrite projects have resulted in unhappy customers, political battles, missed deadlines, and sometimes complete failure to deliver. In all cases, the projects were considerably harder than the projects’ initiators ever thought they would be.

This is not a technology problem. It’s not at all Rails-specific, but being in the limelight these days, Rails implementations are both very likely to happen and very risky right now.

Why So Hard?
So, why, in software rewrites, when you’re traversing exclusively familiar territory are the results so often unpredictable and negative?

Software as spec

"Make it do what it already does." That’s a tempting and simple way to view software requirements on a rewrite project. After all, the system already exists. The question of "what should it do when…" can presumably always be answered with: "what it already does".

There are two major problems with this assumption. The first, and most disruptive, is that the programmers don’t know what questions to ask. This is especially true if the programmers weren’t the original developers of the system (most often the case on a major technology shift), but even a programmer who did the original implementation of a product won’t remember every nook, cranny, and edge case. What’s worse, with the fragile safety net of an existing implementation, programmers can easily oversimplify the interface, and assume they know the capabilities of the system. If a combination of drop-down selections results in a whole new corner of the system, how are they to know without stumbling onto it (or performing an exhaustive and expensive test cycle)?

If the software you’ve built is complex enough that it needs to be rewritten, it’s probably also so complex that it’s not discoverable in this way. This means that domain experts are going to have to be heavily involved. It means that requirements are going to need to be communicated in much the same way they are on a green-field project. And it means that, unless it’s only used as a supplement, the existing system is more a liability to the rewrite than an asset.

Optimistic programmers might think I’ve missed something important here. If you’re rewriting a system, you’ve already got the code. The code can serve as the spec, right? Probably not.

Based on my own experiences and conversations with thousands of software developers around the planet, I unscientifically conclude that almost all production software is in such bad shape that it would be nearly useless as a guide to re-implementing itself. Now take this already bad picture, and extract only those products that are big, complex, and fragile enough to need a major rewrite, and the odds of success with this approach are significantly worse.

Existing code is good for discovering algorithms—not complex, multistep processes.

Invention or Implementation?

In his article, The C2I2 Hypothesis, programmer Zed Shaw criticizes the famous C3 project at Chrysler, which is known for being the birthplace of eXtreme Programming. He says that the project was an implementation—not an invention. An invention, according to Zed, is something new which needs creativity and high customer involvement, whereas an implementation is a project which participants (including programmers) have done before.

According to Zed:
If that’s the case, why was the customer involved all the time? They had a completely working specification in an already working system. Replacing it is more a matter of reverse engineering than gathering vision, customer feedback, use cases, stories, or any of the other crap the XP team used.

Here’s the problem: when does the label "Payroll System" become so broad that you don’t know if it’s an invention or an implementation? Could it be possible that at a huge company like Chrysler the payroll system was unlike any other payroll system that had ever existed? And, within the realm of this possibility, might it also be possible that Chrysler were redesigning the system, because through such changes as globalization and evolving international tax and labor laws, the system which was being replaced was no longer valid?

My point isn’t to say Zed is wrong. He makes some excellent points and may very well be right (though, knowing some of the guys on the C3 project, I’d guess they knew the difference between invention and implementation).

My point is that it’s not always clear cut which things are implementation and which are invention. Worse than being ambiguous, it’s often not clear that it is ambiguous. My experience says that most of the time, people doing the Big Rewrite will assume that they’re doing an implementation and will staff and estimate accordingly.

Most of the time, they’re wrong.

The wish list

Imagine going to the hospital for a kidney transplant, and before and during the surgery saying to the surgeon: “Oh, and while you’re already in there digging around, I’ve had some problems with my lungs that could use a little attention. And, yes, I’ve been overeating terribly. Could you do one of those stomach reduction things I hear about? And on that note, how about a little plastic surgery since we’ve got the knives out?”

This is effectively what happens on The Big Rewrite. An existing product, no matter how successful, always has a few warts. The rewrite is seen by many people as the perfect opportunity to shave off the warts. If we’re going to do it over again, we might as well do it right this time.

Under the veil of a rewrite, the assumption is that the personality and capabilities of the software aren’t changing. So, what might start as just a few little tweaks will usually turn into an unbridled reinvention, with none of the usual checks and balances that go into new product development. With potentially many stake-holders involved and an uncontrolled process, I’ve seen little tweaks end up increasing the total effort and feature-set of a Big Rewrite by as much as 100%.

The Big Bang

When you do a technology rewrite, you want things to be clean. That’s usually a major goal in a project like this. And at the beginning of a Big Rewrite, while you’re still wide-eyed and hopeful for your application’s ultra-elegant, scalable, maintainable future, you’re faced with a question: Should we deliver the Rewrite incrementally or all in one big release? Now, imagine your existing infrastructure is a home-grown Oracle Pro*C-based CGI framework with its own cookie-based authentication mechanism which relies on carnal knowledge of an aging mainframe ERP system. Incremental deliveries means making the new technology work within the dirty framework of the old system. One big release would mean we could just turn off the old system, turn on the new one, and keep our new efforts isolated and pristine.

In most cases, it’s the Big Bang approach that wins the argument.

Now picture yourself as a developer or project leader nine months into this project. The old system, still in production, has been patched and enhanced alongside the new one as you’ve been developing it. You haven’t had time to keep up with each and every change that took place in the old system. As a result, on top of behavioral changes, you’ve got an ever-evolving database schema to port to the new platform. And the new system’s wish list has gotten so out of control that there are major differences between the old and the new. To top it all off, working from the old system as a specification didn’t work, and you’re way behind schedule due to misunderstood requirements and rework.

The table has been set. The guests are on their third course. And now you have to come along and replace the table on which they’re eating without disturbing their meal.

On a big system with a lot of customers, data migration can be a huge problem. Not only do we have to keep track of what gets migrated when, but we have to actually perform the migration at some point. The Big Bang sounds like a lovely idea until you get to the actual event, and you realize it’s kind of like preparing for a world title boxing match when you know it’s the first and last time you’ll ever compete. The processes and software you have to create, and the attention you have to pay before you can create an event like this, are often as consuming, complex, and potentially disastrous as the system development effort itself.

But by making it a Big Bang release, you’ve maximized the chances that you’ll be behind schedule when you get to the end, and you’ve therefore maximized the chances that you won’t spend enough time preparing. This results in a bad time for both you and your customers.

Unfortunately, perhaps due to something intrinsic in human nature, this scenario is a cliche for Big Rewrite projects.

Justification and Lies

To add to the stress of the Big Bang comes another, mostly people-related issue. Almost all technology rewrites are driven by some technologist. Behind almost every technologist pushing for a Big Rewrite is a business person saying “But, why?” The question is valid. The product already works. It’s successful enough to even consider re-plumbing it, so we must have already gotten something right, no?

So, then come the justifications. They start with the real reasons the software is being rewritten (but usually censored to avoid the technologist looking like he or she screwed up big time on the initial development of the product). The system will be more maintainable. It will be easier to add features. “Oh yea? So we can do more features faster?” “Uh, yea.” “How much faster?” And so on.

As those discussions get heated and prove unsatisfactory, the list of promises gets longer. The system will be more scalable. System response time will improve for our customers. We will have greater uptime. And so on.

It’s rare, in fact, that a technology rewrite can deliver on all these fronts. A J2EE Web application may not prove in practice to provide higher availability than a mainframe application. Rails might be a more flexible and productive environment for a developer, but Rails apps slightly underperform equivalent PHP apps. So, you don’t sell Rails as something that will be faster than PHP. You sell it as something that is more flexible and maintainable, and will perform reasonably compared to a PHP application.

The piles of justification lead to piles of additional work and/or piles of mismatched expectations and disappointment after release.

Who's Tending the Store?

While we’re all in the back creating the next revision of a product, who’s tending to the day to day issues of the existing product? Typically, it’s the domain experts and the original implementers of the product.

Regardless of our intentions, day to day life and in-your-face time-sensitive issues can very easily steal all of the attention from a Big Rewrite. Screaming customers need their problems solved. Outages and serious bugs need to be fixed. Enhancements have to keep rolling in if your project takes as long as projects tend to take. Somebody has to do these things. Training new people is hard and doesn’t seem to make sense. If we’re getting rid of a system, why would we train someone how to maintain it?

So, the experts keep the old system running while the new system is being built. So, who builds the new system? Not the experts, that’s who. Usually, it’s people like me: technology experts. And while we’re banging away at the existing system’s UI, trying to figure out what needs to be coded, the domain experts are doing their jobs. Unfortunately, this means the domain experts aren’t watching the Big Rewrite very closely. Regardless of how good the team, if communication is impaired between the domain experts and the technology experts, things are going to move slowly, and wrong software is going to be created.

sooskriszta · Post by **sooskriszta** » Sat Aug 27, 2011 1:00 am

Steve Blank, co-founder of E.piphany, veteran of Zilog, MIPS Computers and Convergent Technologies, author of The Four Steps to the Epiphany and one of the top 10 influencers in Silicon Valley, wrote: The benefits of customer and agile development and a minimum feature set are continuous customer feedback, rapid iteration and little wasted code. But over time if developers aren’t careful, code written to find early customers can become unwieldy, difficult to maintain and incapable of scaling. Ironically it becomes the antithesis of agile. And the magnitude of the problem increases exponentially with the success of the company. The logical solution? “Re-architect and re-write” the product.

For a company in a rapidly changing market, that’s usually the beginning of the end.

It Seems Logical

I just had lunch (at my favorite Greek restaurant in Palo Alto forgetting it looked like a VC meetup) with a friend who was technical founder of his company and is now its chairman. He hired an operating exec as the CEO a few years ago. We caught up on how the company was doing (“very well, thank you, after five years, the company is now at a $50M run rate,”) but he wanted to talk about a problem that was on his mind. “As we’ve grown we’ve become less and less responsive to changing market and customer needs. While our revenue is looking good, we can be out of business in two years if we can’t keep up with our customers’ rapid shifts in platforms. Our CEO doesn’t have a technology background, but he’s frustrated he can’t get the new features and platforms he wants (Facebook, iPhone and Android, etc.). At the last board meeting our VP of engineering explained that the root of our problems was ‘our code has accumulated a ton of “technical debt,” it’s really ugly code, and it’s not the way we would have done it today.’ He told the board that the only way to deliver these changes is to re-write our product.” My friend added, “It sounds logical to the CEO so he’s about to approve the project.”

Shooting Yourself in the Head

“Well didn’t the board read him the riot act when they heard this?” I asked. “No,” my friend replied, sadly shaking his head, “the rest of the board said it sounded like a good idea.”

With a few more questions I learned that the code base, which had now grown large, still had vestiges of the original exploratory code written back in the early days when the company was in the discovery phase of Customer Development. Engineering designs made back then with the aim of figuring out the product were not the right designs for the company’s current task of expanding to new platforms.
I reminded my friend that I’ve never been an engineering manager so any advice I could give him was just from someone who had seen the movie before.

The Siren Song to CEOs Who Aren’t Technical

CEOs face the “rewrite” problem at least once in their tenure. If they’re an operating exec brought in to replace a founding technical CEO, then it looks like an easy decision – just listen to your engineering VP compare the schedule for a rewrite (short) against the schedule of adapting the old code to the new purpose (long). In reality this is a fool’s choice. The engineering team may know the difficulty and problems adapting the old code, but has no idea what difficulties and problems it will face writing a new code base.

A CEO who had lived through a debacle of a rewrite or understood the complexity of the code would know that with the original engineering team no longer there, the odds of making the old mistakes over again are high. Add to that the new mistakes that weren’t there the first time, and Murphy’s law says that unbridled optimism will likely turn the 1-year rewrite into a multi-year project.

My observation was that the CEO and VP of Engineering were confusing cause and effect. The customers aren’t asking for new code. They are asking for new features and platforms – now. Customers couldn’t care less whether it was delivered via spaghetti code, alien spacecraft or a completely new product. While the code rewrite is going on, competitors who aren’t enamored with architectural purity will be adding features, platforms, customers and market share. The difference between being able to add them now versus a year or more in the future might be the difference between growing revenue and going out of business.

Who Wants to Work on The Old Product

Perhaps the most dangerous side-effect of embarking on a code rewrite is that the decision condemns the old code before a viable alternative exists. Who is going to want to work on the old code with all its problems when the VP Engineering and CEO have declared the new code to be the future of the company? The old code is as good as dead the moment management introduces the word “rewrite.” As a consequence, the CEO has no fallback. If the VP Engineering’s schedule ends up taking four years instead of one year, there is no way to make incremental progress on the new features during that time.

What we have is a failure of imagination

I suggested that this looked like a failure of imagination in the VP of Engineering - made worse by a CEO who’s never lived through a code rewrite – and compounded by a board that also doesn’t get it and hasn’t challenged either of them for a creative solution.

My suggestion to my friend? Given how dynamic and competitive the market is, this move is a company-killer. The heuristic should be “don’t rewrite the code base in businesses where time to market is critical and customer needs shift rapidly.” Rewrites may make sense in markets where the competitive cycle time is long.

I suggested that he lie down on the tracks in front of this train at the board meeting. Force the CEO to articulate what features and platforms he needs by when, and what measures he has in place to manage schedule risk. Figure out whether a completely different engineering approach is possible. (Refactor only the modules for the features that are needed now? Rewrite the new platforms on a different code-base? Start a separate skunk works team for the new platforms? etc.)

Lessons Learned
Not all code rewrites are the same. When the market is stable and changes are infrequent, you may have time to rewrite.

When markets/customers/competitors are shifting rapidly, you don’t get to declare a “time-out” because your code is ugly.

This is when you need to understand 1) what problem you are solving (hint: it’s not the code) and 2) how to creatively fix what’s needed.

Making the wrong choice can crater your company.

This is worth a brawl at the board meeting.

Oleg · Post by **Oleg** » Sat Aug 27, 2011 2:47 am

Some of my considerations for and against rewriting:

1. As a project matures, the amount of work needed to accomplish changes goes up. This is because the software gains options that every change has to be tested with, runs on multiple environments which have to be supported and tested on, etc. Mature projects have established development practices designed to ensure stability and functionality (code reviews, regression tests, etc.). Contributors generally want to see their changes accepted, and this process takes longer on bigger and more mature projects. The barrier to entry for contributing is higher.

2. All projects have inertia. If something is already done, there is usually resistance to doing the same thing in a different way. Some of it is driven by users who don't want change for the sake of change; some of it is driven by developers who like their way of doing things. This is one reason for forks - people want to implement ideas that are too far detached from established approaches and they are not welcomed in existing projects.

3. Mature projects are larger because they have more functionality, either due to sheer feature count or because of greater depth of each feature. A rewrite cannot match the functionality or stability of a mature, debugged product for quite some time.

4. Software developed by volunteers (as is the case with phpbb) must be sufficiently attractive to volunteers, or it will slowly lose its developers and stagnate. All projects have turnover. As technology changes, fewer people are familiar with or want to work on old technologies. Projects need to keep up with changing technology or they risk losing their developers and eventually their user base.

5. Some contributors want to contribute in order to have an impact. If they contribute to a new project they would have a lot more impact compared to contributing to an established project. Put differently, new projects offer more opportunities to write "cool" things because they don't do much, whereas established projects tend to have bugs that need to be fixed - a much less glorious undertaking - because they actually have code that was used by users and had bugs filed against it.

As was pointed out in another topic, here we are talking about rewrites from scratch of entire applications at once. The so-called "incremental rewrites" are entirely different, and actually are what allows projects to evolve with technology changes.

sooskriszta · Post by **sooskriszta** » Sat Aug 27, 2011 5:58 pm

Some excellent points, Oleg!

I think I would list my pros and cons as follows:

Why rewrites seem a good idea

As the project matures and becomes more and more complex, writing things from scratch becomes more and more enticing compared to finding and fixing bugs, writing a convoluted function, etc.
The existing code is crap (very rarely the case).
The existing code seems to be crap (very often the case). To invoke Joel: That simple function is 2 pages? None of this stuff belongs in there? It has grown little hairs and stuff on it and nobody knows why? I'll tell you why: those are bug fixes.
Ruby (or Python or whatever the flavor of the day is) seems to be generating a lot of excitement, and developers wouldn't mind getting it on their CV.

Why rewrites are *almost* never a good idea

The fundamental problem with rewrites, in my humble opinion, is that the problem they try to address is very real and serious, but they aren't really a good, viable solution to it...somewhat like fad diet pills...they draw people in with a detailed and sometimes quite scientific analysis of the problem, but when it comes to the solution they don't actually offer much.
Irrespective of whether the original's programmers are on the team or not, the rewritten software *almost* never has all the functionality of the existing software.
A rewrite will take longer than you had hoped or planned.
Invariably, several bugs that have been quashed in the existing system will creep back into the rewritten code, along with some new ones.
If you are a market leader and decide to rewrite, you are throwing away the massive advantages that make you the market leader and are no better than a brand new startup...you are hanging by the thin thread of brand recognition and testing its limits.
Because of the above 4 reasons, by the time you launch the rewritten code, the competitor will have evolved to a whole new level. Unless you are very lucky, this would be the beginning of the end. E.g., if phpBB 4 were a complete rewrite and took (very, very aggressively) 18 months from today, then by the time it is released the infant bbPress would have matured into a powerful adult and almost certainly would be more feature-rich than the brand spanking new phpBB, which ironically would be the infant.
Watching the store - while someone rewrites, 2 things could happen to the existing software: 1) Developers may stop working on it, while competitors keep improving theirs and luring away your customers, or 2) A key set of people may be tied up keeping the existing software alive, in which case the new software would be nowhere close to where it needs to be in terms of quality.
When you start from scratch there is absolutely no reason to believe that you are going to do a better job than you (or someone else) did the first time, although you almost certainly think you will. The Bible has a word for this. It's called hubris.

I'm with Joel on this one:

Code doesn't rust.
As the team learns more, as technologies evolve, almost certainly there would be parts of the software that need to be changed.
The path to take, in *almost* all cases, would be to change things piece by piece in the existing software by refactoring parts (changing the templating system in the next 6 months, modifying the plugin system in the 6 months after that, etc.) rather than a big bang rewrite of the whole thing in one go from the ground up.
If you want to shift paradigm, you don't burn your existing code or get the existing team to do it for you. You hire a team of 4 who have never seen the existing product, lock them up in a skunkworks studio, and see what concept they come up with. It may be worth developing, it may not be, but it would most certainly be a completely different product rather than a rewrite of the existing one.

imkingdavid · Post by **imkingdavid** » Wed Nov 30, 2011 1:46 am

I think that there are some great points in the above articles (although I did only read the first of the three articles before skimming the others and skipping to the follow up responses).

I think one of the reasons a complete rewrite is often the preferred option when progressing with development is, as the first article pointed out, writing code is a LOT easier than reading code. People simply do not want to have to sift through pages and pages of code to figure out how it works; they'd rather start over and write it themselves. That way they know exactly what it does, how it does it, etc. This is one reason commenting/documenting code is very important.

However, a complete rewrite creates issues because, for one, it's a ton of work. Having to rewrite each and every feature from scratch takes lots of time that could instead have been spent optimizing the current code if it were documented/readable. Secondly, you lose most or all backwards compatibility. Even if bits and pieces of the new code are similar to the old code, chances are they can't simply be integrated with existing modifications/plugins. Sometimes backwards compatible functions, etc. are added to prevent this, but that's even more work in the end.

So with that in mind, an incremental or feature rewrite might be a better option in some cases because you can optimize a certain part of the code while keeping backwards compatibility. You can take a section of the code at a time and optimize it and then move onto another part while the rest remains usable and operational.
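To make the "one section at a time" idea concrete, here is a minimal sketch in Python (chosen only for brevity; it is not phpBB code). The names `render_page` and `parse_template` are hypothetical. The assumption is that one piece has been rewritten and the old entry point is kept as a thin wrapper, so existing modifications keep working while the rest of the code stays untouched.

```python
import warnings


def render_page(template: str, context: dict) -> str:
    """New implementation of the one piece that was rewritten (hypothetical)."""
    return template.format(**context)


def parse_template(template, username, title):
    """Legacy entry point, kept so existing plugins do not break.

    It only adapts the old positional arguments to the new function.
    """
    warnings.warn(
        "parse_template() is deprecated; use render_page()",
        DeprecationWarning,
        stacklevel=2,
    )
    return render_page(template, {"username": username, "title": title})
```

Old callers keep calling `parse_template(...)` and get the same result, plus a deprecation warning that tells them where to migrate. The wrapper can be dropped in a later release once plugins have moved over.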

However, the idea for phpBB 4 is to be built upon a framework (i.e. Symfony 2). It is impractical to try to implement a framework without rebuilding the application from the ground up.

So ultimately while I agree that the code that has been tested and used and proven to work is probably not as bad as people think, for the phpBB 4 update it is a better idea to do a full rewrite.

One thing that I would like to stress, in any case, to the development team for 4.0 is commenting the code. 3.0 has some comments here and there, some of which actually serve a purpose (some seem to be light-hearted humor or inside jokes or whatever but do not actually explain what is happening). I think that for maintainability for future developers and contributors, commenting and documentation are crucial. As has been repeated, reading code is much more difficult than writing it, so you can make it a LOT easier in the future for others and even yourself by taking the extra couple of minutes to go back over your code and document what is happening, why, etc.

Oleg · Post by **Oleg** » Wed Nov 30, 2011 1:57 am

imkingdavid wrote: One thing that I would like to stress, in any case, to the development team for 4.0 is commenting the code.

It is our policy already to require docblocks for all new classes and functions.

imkingdavid · Post by **imkingdavid** » Wed Nov 30, 2011 2:06 am

Oleg wrote:
imkingdavid wrote: One thing that I would like to stress, in any case, to the development team for 4.0 is commenting the code.
It is our policy already to require docblocks for all new classes and functions.

Well yes, but within the function or class, it is good practice to tell what the code does. A docblock only contains so much information (overall function description, parameters, return, class properties, etc.), but doesn't explain what is happening. I don't need a step by step hand-holding explanation, but at least something to make it easier to follow what is happening would be nice.
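A rough illustration of the difference, sketched in Python rather than PHP and using a made-up `paginate` helper (not anything from phpBB): the docblock says what the function takes and returns, while the inline comments explain why the body does what it does.

```python
from typing import Tuple


def paginate(total_items: int, per_page: int, page: int) -> Tuple[int, int]:
    """Return the (start, end) offsets for one page of results.

    :param total_items: number of items available
    :param per_page: items shown on each page
    :param page: 1-based page number requested
    :return: start (inclusive) and end (exclusive) offsets
    """
    # Ceiling division without floats; always at least one page,
    # so callers can render "page 1 of 1" for an empty list.
    total_pages = max(1, -(-total_items // per_page))

    # Clamp instead of raising: a stale link to a page that no longer
    # exists should land on the nearest valid page.
    page = min(max(page, 1), total_pages)

    start = (page - 1) * per_page
    end = min(start + per_page, total_items)
    return start, end
```

Neither inline comment restates the code line by line. Each records a decision that a later reader could not recover from the docblock alone.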

Oleg · Post by **Oleg** » Wed Nov 30, 2011 2:36 am

I agree with your reasoning, but we cannot legislate something like "add comments whenever you write unreadable code". Hopefully with the review that we have we don't add unreadable code, and hopefully functions are small enough that their descriptions in docblocks (which are also required) adequately explain them.

If you have issues with existing code being underdocumented you are welcome to create tickets for additional comments or documentation, and if you have already figured these issues out you are very welcome to submit documentation patches.

imkingdavid · Post by **imkingdavid** » Wed Nov 30, 2011 2:26 pm

Oleg wrote: I agree with your reasoning, but we cannot legislate something like "add comments whenever you write unreadable code". Hopefully with the review that we have we don't add unreadable code, and hopefully functions are small enough that their descriptions in docblocks (which are also required) adequately explain them.

If you have issues with existing code being underdocumented you are welcome to create tickets for additional comments or documentation, and if you have already figured these issues out you are very welcome to submit documentation patches.

Well I don't have any specific examples, and I don't necessarily think there has to be a rule that you "have to document all code", but I think that in general the more explanation, the easier it is to read later on. Anyway, this is going slightly off topic.
