What is it exactly?

Discussion of general topics related to the new version and its place in the world. Don't discuss new features, report bugs, ask for support, et cetera. Don't use this to spam for other boards or attack those boards!
DarsVaeda
Registered User
Posts: 87
Joined: Thu Feb 03, 2005 11:15 pm
Location: Germany

Re: What is it exactly?

Post by DarsVaeda »

But you are definitely going where the future goes.
It's worth the effort.
Great job!
"They say time is the fire in which we burn."

TerraPedia.org

KFCSpike
Registered User
Posts: 17
Joined: Mon Oct 02, 2006 5:44 pm
Location: Scotland

Re: What is it exactly?

Post by KFCSpike »

Eelke wrote:
UTF-8 is basically ....[CUT]


Nice explanation Eelke - It has helped me understand what this UTF-8 is all about.

Thanks :D

Tymko
Registered User
Posts: 1
Joined: Thu Oct 26, 2006 9:41 pm

Re: What is it exactly?

Post by Tymko »

Eelke wrote: . . .UTF-8 is basically. . .

I thought my first post here would be put to good use congratulating you, Eelke, on a very concise explanation of what UTF-8 is. I am sure that it will now help many members understand why UTF-8 is so important, and why it took the developers this long to implement and release phpBB Beta 3 (as we can see, there was much to be done). And for anyone who would like to learn more about UTF-8, I, too, would recommend you read the Wikipedia entry --Here--.


Tymko

Spectral Dragon
Registered User
Posts: 208
Joined: Mon Feb 16, 2004 1:45 pm
Location: Milan, MI

Re: What is it exactly?

Post by Spectral Dragon »

Eelke wrote: Basically, all characters that make up the HTML you, as a designer, create have to be represented in zeroes and ones (computers are digital, remember? ;)). Of course, for everyone to understand what you mean when you send them your ones and zeroes (you may not realise it, but that's what you have been doing whenever you send someone a computer file), there needs to be an agreement on which sequence of ones and zeroes represents which character. This agreement is called an encoding.

Probably the best known encoding, and one of the oldest (I think people didn't even call this an encoding back then), is ASCII (American Standard Code for Information Interchange), and you don't get much more basic than that. ASCII just describes character-to-binary mappings for upper and lower case a-z, numbers 0-9, some control characters (newline, carriage return, etc.) and a handful of special characters (dots, commas, exclamation and question marks, tilde, etc.). You may have seen an ASCII table, where each character is mapped to a number. These numbers represent the ones and zeroes. They are usually decimal (e.g., A is represented by value 65), but that's just because decimal numbers are shorter to write; converting them to binary numbers (ones and zeroes) is pretty trivial if you know how (and often, that's handled at a lower level than a programmer will have to bother with, so it will be the computer taking care of the conversion).
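
To make the number-to-bits step concrete, here is a small Python sketch. It is only an illustration added alongside the explanation; the helper name ascii_bits is made up, and padding to 8 bits is an assumption about how you would typically see the value stored in a byte.

```python
def ascii_bits(ch: str) -> str:
    """Return the ASCII code of a single character as an 8-digit binary string."""
    code = ord(ch)  # ord() gives the numeric code of the character
    if code > 127:
        # ASCII only defines codes 0 through 127
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "08b")  # zero-padded to one byte


print(ord("A"))         # 65
print(ascii_bits("A"))  # 01000001
```

The decimal 65 and the binary 01000001 are the same value written two ways, which is why an ASCII table can list either.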

In the years following ASCII, people found the need to use more characters than ASCII described (even many western languages that use ASCII characters put accents on these characters, ë for example, which ASCII didn't provide), and various extensions were devised. Nowadays, a very common encoding on the web for western language sites is ISO-8859-1. However, we haven't even mentioned languages that use completely different characters, such as Japanese and Chinese; they had (and still have) their own encodings to represent their characters in sequences of ones and zeroes. phpBB used to use these older encodings because, especially when 2.0 was created, they were still pretty much standard. Translators and operators of sites targeting audiences whose languages don't fit the ISO-8859-1 encoding that phpBB 2.0 uses by default had to fool around with changing the encoding.
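
A rough Python sketch of why the choice of encoding mattered: the same byte means different characters depending on which agreement the reader assumes. ISO-8859-5 (a Cyrillic encoding) is picked here purely as an example of "another older encoding"; it is not mentioned in the explanation itself.

```python
e_diaeresis = "\u00eb"  # the character ë

# ASCII has no code for ë, so encoding fails.
try:
    e_diaeresis.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ISO-8859-1 represents ë as a single byte.
latin1_bytes = e_diaeresis.encode("iso-8859-1")

# Reading that same byte with a different encoding yields a different character.
misread = latin1_bytes.decode("iso-8859-5")

print(ascii_ok)                  # False
print(latin1_bytes)              # b'\xeb'
print(misread == e_diaeresis)    # False
```

This is the situation site operators had to manage by hand: the bytes alone don't say which encoding they were written in.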

UTF-8 is basically also "just" an encoding, but it is very flexible, as that Wikipedia article will tell you; you won't need any other encoding to put characters of any language on your site. This also means that no one will ever have to fool around with the encoding their site uses, because UTF-8 can handle any language you can throw at it. The web has been shifting more and more towards UTF-8, and for internationally oriented sites it is quickly becoming a necessity.
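
As a quick illustration of that flexibility, the Python sketch below encodes a Latin letter, an accented letter and a Japanese character with the one UTF-8 encoding. The helper name utf8_hex and the sample characters are chosen for illustration only.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


print(utf8_hex("A"))        # 41        (one byte, same value as ASCII)
print(utf8_hex("\u00eb"))   # c3 ab     (ë takes two bytes)
print(utf8_hex("\u65e5"))   # e6 97 a5  (日 takes three bytes)

# One encoding handles mixed-language text and round-trips it unchanged.
mixed = "A\u00eb\u65e5"
assert mixed.encode("utf-8").decode("utf-8") == mixed
```

Plain ASCII characters keep their one-byte ASCII values in UTF-8, while other characters simply use more bytes.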

Wow, detailed AND easy to understand 8O

Eelke
Registered User
Posts: 606
Joined: Thu Dec 20, 2001 8:00 am
Location: Bussum, NL

Re: What is it exactly?

Post by Eelke »

:oops: Guys, too much credit :) Is it too late to claim copyright before someone goes off and makes money with this? :)

Pace
Registered User
Posts: 16
Joined: Fri Oct 06, 2006 10:00 am

Re: What is it exactly?

Post by Pace »

:twisted: Too late.

(that said, it was indeed a very fine explanation :))
