Tuesday, October 27, 2009, 22:50
The American Standard Code for Information Interchange (ASCII) was developed in the early 1960s as a standard encoding for computer characters. It covers the 26 letters of the English alphabet in both lowercase and uppercase, the ten digits, common punctuation symbols, and a number of control characters.
ASCII uses a 7-bit encoding to represent 128 different characters.
Only the characters between #32 (Space) and #126 (Tilde) have a visual representation, as shown in the following table:
While ASCII was certainly a foundation (with its basic set of 128 characters that are still part of the core of Unicode), it was soon superseded by extended versions that used the 8th bit to add another 128 characters to the set.
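The ranges mentioned above are easy to check with a few lines of code. The following console program is only a minimal sketch: the helper names IsPrintableAscii and IsSevenBit are hypothetical, introduced here for illustration, and are not part of the FromAsciiToUnicode program.

```delphi
program AsciiRanges;
{$APPTYPE CONSOLE}

// Hypothetical helper: True for the codes with a visual representation,
// #32 (Space) through #126 (Tilde).
function IsPrintableAscii(Code: Integer): Boolean;
begin
  Result := (Code >= 32) and (Code <= 126);
end;

// Hypothetical helper: True when the code fits in 7 bits (0..127).
function IsSevenBit(Code: Integer): Boolean;
begin
  Result := (Code >= 0) and (Code <= 127);
end;

var
  Code, Printable: Integer;
begin
  // Count the printable codes inside the 7-bit range.
  Printable := 0;
  for Code := 0 to 127 do
    if IsPrintableAscii(Code) then
      Inc(Printable);

  Writeln('7-bit codes: ', 1 shl 7);                               // 128
  Writeln('Printable codes: ', Printable);                         // 95
  Writeln('Codes added by the 8th bit: ', (1 shl 8) - (1 shl 7));  // 128
end.
```

The remaining 33 codes of the 7-bit range (#0 to #31, plus #127) are the control characters.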
Now the problem is that, with so many languages around the world, there was no simple way to decide which other characters to include in the extended set (at times indicated as ASCII-8). To make a long story short, Windows adopts different sets of characters, called code pages, and the set in use depends on your locale configuration and version of Windows. Besides Windows code pages, there are many other standards based on a similar paging approach.
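To see what "depending on the code page" means in practice, you can decode the same byte under two different code pages. This is a sketch under a few assumptions: it relies on the TEncoding class of the SysUtils unit (available from Delphi 2009), it requires code pages 1252 and 1251 to be installed on the system, and the helper DecodeByteAsHex is a hypothetical name used only for this example.

```delphi
program CodePageDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: decodes a single byte under the given code page
// and returns the resulting Unicode code point as four hex digits.
function DecodeByteAsHex(B: Byte; CodePage: Integer): string;
var
  Enc: TEncoding;
  Bytes: TBytes;
  S: string;
begin
  SetLength(Bytes, 1);
  Bytes[0] := B;
  // GetEncoding returns a new instance that the caller must free.
  Enc := TEncoding.GetEncoding(CodePage);
  try
    S := Enc.GetString(Bytes);
  finally
    Enc.Free;
  end;
  if S = '' then
    Result := '(none)'
  else
    Result := IntToHex(Ord(S[1]), 4);
end;

begin
  // The same byte, $E9, read under two different code pages.
  Writeln('Code page 1252: U+', DecodeByteAsHex($E9, 1252)); // Western European
  Writeln('Code page 1251: U+', DecodeByteAsHex($E9, 1251)); // Cyrillic
end.
```

In code page 1252 the byte $E9 is the letter é (U+00E9), while in code page 1251 it is the Cyrillic letter й (U+0439). The program prints the code points rather than the characters themselves, because the console output depends in turn on the console code page.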
How did I get this and the previous image? Using a simple Delphi 2009 program (called FromAsciiToUnicode) that displays characters in a StringGrid component, initially with the numbers of the corresponding columns and rows painted on the borders. The program uses type casts to the AnsiChar type to be able to manage traditional 8-bit characters (more on this in the next chapter):
procedure TForm30.btnAscii8Click(Sender: TObject);
var
  I: Integer;
begin
  ClearGrid;
  // Place each 8-bit character in a grid of 16 columns
  for I := 32 to 255 do
  begin
    StringGrid1.Cells [I mod 16 + 1,
      I div 16 + 1] := AnsiChar (I);
  end;
end;
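The index arithmetic in the loop above can be tried in isolation. In this sketch the functions CellColumn and CellRow are hypothetical names that simply repeat the two expressions of the loop. The comments assume that index 0 is used for the border column and row, which would explain the + 1 in both expressions.

```delphi
program GridCellDemo;
{$APPTYPE CONSOLE}

// Column index, as computed in the loop: 16 characters per row,
// shifted by one (assumption: column 0 is the border column).
function CellColumn(Code: Integer): Integer;
begin
  Result := Code mod 16 + 1;
end;

// Row index, as computed in the loop (assumption: row 0 is the border row).
function CellRow(Code: Integer): Integer;
begin
  Result := Code div 16 + 1;
end;

begin
  Writeln('Space (32): column ', CellColumn(32), ', row ', CellRow(32));   // 1, 3
  Writeln('A (65):     column ', CellColumn(65), ', row ', CellRow(65));   // 2, 5
  Writeln('Last (255): column ', CellColumn(255), ', row ', CellRow(255)); // 16, 16
end.
```

Since the last character lands in column 16 and row 16, the grid needs at least 17 columns and 17 rows, counting the border.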
In previous versions of Delphi, where Char was an 8-bit type, you could obtain the same output by writing the following simpler version (that uses Char rather than AnsiChar for the conversion):
  for I := 32 to 255 do
  begin
    StringGrid1.Cells [I mod 16 + 1,
      I div 16 + 1] := Char (I);
  end;
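Why does the cast make a difference in Delphi 2009? There Char is a 16-bit type, so Char (I) treats the number as a Unicode code point, while AnsiChar (I) produces an 8-bit character that is converted through the active ANSI code page when it is assigned to a string. The following console program is a minimal sketch of the difference, assuming Delphi 2009 or later; the last value it prints depends on the code page of your system.

```delphi
program CharVsAnsiChar;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  W: Char;
  A: AnsiChar;
  S: string;
begin
  Writeln('SizeOf(Char) = ', SizeOf(Char));         // 2 in Delphi 2009 and later
  Writeln('SizeOf(AnsiChar) = ', SizeOf(AnsiChar)); // 1

  // Char (128) is simply the Unicode code point U+0080.
  W := Char(128);
  Writeln('Char(128) = U+', IntToHex(Ord(W), 4));

  // AnsiChar (128) is converted through the active ANSI code page.
  // The implicit conversion is the same one used in the grid assignment;
  // the compiler may emit a warning for it.
  A := AnsiChar(128);
  S := A;
  Writeln('AnsiChar(128) = U+', IntToHex(Ord(S[1]), 4));
end.
```

On a system using code page 1252, the last line should report U+20AC, the Euro sign, rather than U+0080.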
I don't think I really need to tell you how messy the situation is with the various ISO 8859 encodings (there are 15 of them, numbered up to 8859-16, and they are still unable to cover the more complex alphabets), the Windows code pages, and the multibyte representations used to cover Chinese and other languages. With Unicode, this is all behind us, even though the new standard has its own complexity and potential problems.
By Sais Abdelkrim - Posted in: Tutorials



