GSM Encoding and Special Characters

According to the GSM specification, a standard SMS message can carry up to 140 bytes of data (payload). The standard Latin-1 (ISO-8859-1) character encoding represents each character with 1 byte (8 bits). Therefore, an SMS can hold at most 140 Latin-1 characters.

GSM encoding represents characters using 7 bits instead of 8, which allows a maximum of 160 characters per SMS: (140 * 8 bits) / 7 bits = 160.
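The arithmetic above can be checked with a short Python sketch. The names `PAYLOAD_BYTES` and `max_chars` are hypothetical and chosen only for this illustration.

```python
PAYLOAD_BYTES = 140  # maximum SMS payload, as stated above


def max_chars(bits_per_char: int, payload_bytes: int = PAYLOAD_BYTES) -> int:
    """Return how many whole characters of the given width fit in the payload."""
    return (payload_bytes * 8) // bits_per_char


print(max_chars(8))  # 140 (ISO-8859-1, one byte per character)
print(max_chars(7))  # 160 (GSM 7-bit encoding)
```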

Using 7 bits halves the number of code points available to the GSM character set compared to ISO-8859-1 (128 instead of 256). To include common characters that ISO-8859-1 represents using the 8th bit (values above 0x7F), these characters, along with some other symbols, must be re-mapped to 7-bit values. These re-mapped characters are often referred to as special characters. This re-mapping, combined with packing 7-bit characters into 8-bit bytes, is called GSM encoding.
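The packing step can be sketched in Python as follows. This is a minimal illustration, not a complete encoder: the function name `pack_septets` is hypothetical, and the sketch assumes the usual GSM 03.38 bit order, in which each septet is placed starting at the least significant free bit of the output.

```python
def pack_septets(septets):
    """Pack a sequence of 7-bit values into bytes, least significant bits first."""
    out = bytearray()
    acc = 0    # bit accumulator holding bits not yet written out
    nbits = 0  # number of valid bits currently in acc
    for s in septets:
        if not 0 <= s <= 0x7F:
            raise ValueError(f"not a 7-bit value: {s!r}")
        acc |= s << nbits  # append the septet above the pending bits
        nbits += 7
        while nbits >= 8:  # flush every complete byte
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:  # flush the final partial byte, zero-padded
        out.append(acc & 0xFF)
    return bytes(out)


# "hello" as GSM septets (these letters share their ASCII codes)
print(pack_septets([0x68, 0x65, 0x6C, 0x6C, 0x6F]).hex())  # e8329bfd06
print(len(pack_septets([0] * 160)))                        # 140
```

The second call shows that 160 septets fill exactly the 140-byte payload.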

The table below lists the 7-bit default alphabet as specified by GSM 03.38.

GSM Hex Character name Character ISO-8859-1 Hex
0x00 COMMERCIAL AT @ 40
0x01 POUND SIGN £ A3
0x02 DOLLAR SIGN $ 24
0x03 YEN SIGN ¥ A5
0x04 LATIN SMALL LETTER E WITH GRAVE è E8
0x05 LATIN SMALL LETTER E WITH ACUTE é E9
0x06 LATIN SMALL LETTER U WITH GRAVE ù F9
0x07 LATIN SMALL LETTER I WITH GRAVE ì EC
0x08 LATIN SMALL LETTER O WITH GRAVE ò F2
0x09 LATIN CAPITAL LETTER C WITH CEDILLA Ç C7
0x0A LINE FEED 0A
0x0B LATIN CAPITAL LETTER O WITH STROKE Ø D8
0x0C LATIN SMALL LETTER O WITH STROKE ø F8
0x0D CARRIAGE RETURN 0D
0x0E LATIN CAPITAL LETTER A WITH RING ABOVE Å C5
0x0F LATIN SMALL LETTER A WITH RING ABOVE å E5
0x10 GREEK CAPITAL LETTER DELTA Δ
0x11 LOW LINE _ 5F
0x12 GREEK CAPITAL LETTER PHI Φ
0x13 GREEK CAPITAL LETTER GAMMA Γ
0x14 GREEK CAPITAL LETTER LAMBDA Λ
0x15 GREEK CAPITAL LETTER OMEGA Ω
0x16 GREEK CAPITAL LETTER PI Π
0x17 GREEK CAPITAL LETTER PSI Ψ
0x18 GREEK CAPITAL LETTER SIGMA Σ
0x19 GREEK CAPITAL LETTER THETA Θ
0x1A GREEK CAPITAL LETTER XI Ξ
0x1B ESCAPE TO EXTENSION TABLE
0x1B0A FORM FEED 0C
0x1B14 CIRCUMFLEX ACCENT ^ 5E
0x1B28 LEFT CURLY BRACKET { 7B
0x1B29 RIGHT CURLY BRACKET } 7D
0x1B2F REVERSE SOLIDUS (BACKSLASH) \ 5C
0x1B3C LEFT SQUARE BRACKET [ 5B
0x1B3D TILDE ~ 7E
0x1B3E RIGHT SQUARE BRACKET ] 5D
0x1B40 VERTICAL BAR | 7C
0x1B65 EURO SIGN € A4 (ISO-8859-15)
0x1C LATIN CAPITAL LETTER AE Æ C6
0x1D LATIN SMALL LETTER AE æ E6
0x1E LATIN SMALL LETTER SHARP S (German) ß DF
0x1F LATIN CAPITAL LETTER E WITH ACUTE É C9
0x20 SPACE   20
0x21 EXCLAMATION MARK ! 21
0x22 QUOTATION MARK " 22
0x23 NUMBER SIGN # 23
0x24 CURRENCY SIGN ¤ A4
0x25 PERCENT SIGN % 25
0x26 AMPERSAND & 26
0x27 APOSTROPHE ' 27
0x28 LEFT PARENTHESIS ( 28
0x29 RIGHT PARENTHESIS ) 29
0x2A ASTERISK * 2A
0x2B PLUS SIGN + 2B
0x2C COMMA , 2C
0x2D HYPHEN-MINUS - 2D
0x2E FULL STOP . 2E
0x2F SOLIDUS (SLASH) / 2F
0x30 DIGIT ZERO 0 30
0x31 DIGIT ONE 1 31
0x32 DIGIT TWO 2 32
0x33 DIGIT THREE 3 33
0x34 DIGIT FOUR 4 34
0x35 DIGIT FIVE 5 35
0x36 DIGIT SIX 6 36
0x37 DIGIT SEVEN 7 37
0x38 DIGIT EIGHT 8 38
0x39 DIGIT NINE 9 39
0x3A COLON : 3A
0x3B SEMICOLON ; 3B
0x3C LESS-THAN SIGN < 3C
0x3D EQUALS SIGN = 3D
0x3E GREATER-THAN SIGN > 3E
0x3F QUESTION MARK ? 3F
0x40 INVERTED EXCLAMATION MARK ¡ A1
0x41 LATIN CAPITAL LETTER A A 41
0x42 LATIN CAPITAL LETTER B B 42
0x43 LATIN CAPITAL LETTER C C 43
0x44 LATIN CAPITAL LETTER D D 44
0x45 LATIN CAPITAL LETTER E E 45
0x46 LATIN CAPITAL LETTER F F 46
0x47 LATIN CAPITAL LETTER G G 47
0x48 LATIN CAPITAL LETTER H H 48
0x49 LATIN CAPITAL LETTER I I 49
0x4A LATIN CAPITAL LETTER J J 4A
0x4B LATIN CAPITAL LETTER K K 4B
0x4C LATIN CAPITAL LETTER L L 4C
0x4D LATIN CAPITAL LETTER M M 4D
0x4E LATIN CAPITAL LETTER N N 4E
0x4F LATIN CAPITAL LETTER O O 4F
0x50 LATIN CAPITAL LETTER P P 50
0x51 LATIN CAPITAL LETTER Q Q 51
0x52 LATIN CAPITAL LETTER R R 52
0x53 LATIN CAPITAL LETTER S S 53
0x54 LATIN CAPITAL LETTER T T 54
0x55 LATIN CAPITAL LETTER U U 55
0x56 LATIN CAPITAL LETTER V V 56
0x57 LATIN CAPITAL LETTER W W 57
0x58 LATIN CAPITAL LETTER X X 58
0x59 LATIN CAPITAL LETTER Y Y 59
0x5A LATIN CAPITAL LETTER Z Z 5A
0x5B LATIN CAPITAL LETTER A WITH DIAERESIS Ä C4
0x5C LATIN CAPITAL LETTER O WITH DIAERESIS Ö D6
0x5D LATIN CAPITAL LETTER N WITH TILDE Ñ D1
0x5E LATIN CAPITAL LETTER U WITH DIAERESIS Ü DC
0x5F SECTION SIGN § A7
0x60 INVERTED QUESTION MARK ¿ BF
0x61 LATIN SMALL LETTER A a 61
0x62 LATIN SMALL LETTER B b 62
0x63 LATIN SMALL LETTER C c 63
0x64 LATIN SMALL LETTER D d 64
0x65 LATIN SMALL LETTER E e 65
0x66 LATIN SMALL LETTER F f 66
0x67 LATIN SMALL LETTER G g 67
0x68 LATIN SMALL LETTER H h 68
0x69 LATIN SMALL LETTER I i 69
0x6A LATIN SMALL LETTER J j 6A
0x6B LATIN SMALL LETTER K k 6B
0x6C LATIN SMALL LETTER L l 6C
0x6D LATIN SMALL LETTER M m 6D
0x6E LATIN SMALL LETTER N n 6E
0x6F LATIN SMALL LETTER O o 6F
0x70 LATIN SMALL LETTER P p 70
0x71 LATIN SMALL LETTER Q q 71
0x72 LATIN SMALL LETTER R r 72
0x73 LATIN SMALL LETTER S s 73
0x74 LATIN SMALL LETTER T t 74
0x75 LATIN SMALL LETTER U u 75
0x76 LATIN SMALL LETTER V v 76
0x77 LATIN SMALL LETTER W w 77
0x78 LATIN SMALL LETTER X x 78
0x79 LATIN SMALL LETTER Y y 79
0x7A LATIN SMALL LETTER Z z 7A
0x7B LATIN SMALL LETTER A WITH DIAERESIS ä E4
0x7C LATIN SMALL LETTER O WITH DIAERESIS ö F6
0x7D LATIN SMALL LETTER N WITH TILDE ñ F1
0x7E LATIN SMALL LETTER U WITH DIAERESIS ü FC
0x7F LATIN SMALL LETTER A WITH GRAVE à E0
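
As a rough illustration of how the table above can be applied, the following Python sketch maps text to GSM septets. The names `GSM_BASIC`, `GSM_EXT`, `BASIC_INDEX`, and `to_septets` are hypothetical and chosen for this example. The sketch assumes that code 0x24 is the currency sign (¤), as in GSM 03.38. Characters reached through the 0x1B escape occupy two septets, so each of them counts twice toward the 160-character limit.

```python
# Basic 7-bit alphabet: the index of each character in the string is its GSM code.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./"   # 0x24 assumed to be CURRENCY SIGN per GSM 03.38
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNO"
    "PQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmno"
    "pqrstuvwxyzäöñüà"
)

# Extension table: each character is sent as 0x1B followed by the code below.
GSM_EXT = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}

# Lookup from character to code; 0x1B is the escape and is not a printable character.
BASIC_INDEX = {ch: i for i, ch in enumerate(GSM_BASIC) if i != 0x1B}


def to_septets(text):
    """Map text to a list of GSM 7-bit codes; raise ValueError if unsupported."""
    septets = []
    for ch in text:
        if ch in BASIC_INDEX:
            septets.append(BASIC_INDEX[ch])
        elif ch in GSM_EXT:
            septets.extend([0x1B, GSM_EXT[ch]])  # escape + extension code
        else:
            raise ValueError(f"character not in GSM 03.38 alphabet: {ch!r}")
    return septets


print([hex(s) for s in to_septets("@£€")])  # ['0x0', '0x1', '0x1b', '0x65']
print(len(to_septets("[x]")))               # 5 septets for 3 characters
```

A message fits in a single SMS when `len(to_septets(text))` is at most 160, which means that text containing extension characters reaches the limit sooner than its visible length suggests.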