Bits of Characters
If you are curious what the binary representation of an ASCII or UTF-8 character is, you can use this pipeline at the command line:
echo 'A' | xxd -b
The A character is 65 in ASCII, which is 64 + 1. 01000000 is 64, 00000001 is of course 1, and 01000001 is 65. (The output also shows a second byte, 00001010, which is the newline that echo appends.)
echo 'a' | xxd -b
The a character is 97 in ASCII, which is 64 + 32 + 1. 00100000 is 32 in binary, so given this and the values above, 01100001 is 97.
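The place-value arithmetic above can be checked with a small sketch in POSIX shell. The function name bin_to_dec is made up for this example and is not a standard tool; it just walks the bit string from left to right, doubling the running total and adding each bit.

```shell
# Hypothetical helper: convert a string of 0s and 1s to decimal.
# Plain POSIX sh, no external tools.
bin_to_dec() {
  bits=$1
  value=0
  while [ -n "$bits" ]; do
    rest=${bits#?}            # everything after the first character
    first=${bits%"$rest"}     # the first character (0 or 1)
    value=$(( $value * 2 + $first ))
    bits=$rest
  done
  echo "$value"
}

bin_to_dec 01000001   # the byte for A
bin_to_dec 01100001   # the byte for a
```

Run as written, this prints 65 and then 97, matching the ASCII values for A and a.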
echo '🤓' | xxd -b
This ridiculous emoji is a multi-byte UTF-8 character. When you look at the binary for it:
11110000 10011111 10100100 10010011
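You can see that every byte begins with a 1, which marks it as part of a multi-byte sequence. The first byte begins with 11110, which announces a four-byte character, and each of the three bytes after it begins with 10, which marks it as a continuation byte. The remaining bits of all four bytes combine to form a single code point, here U+1F913.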
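To see how the four bytes combine, here is a rough sketch in POSIX shell. Both function names are invented for this example, and it only handles a four-byte sequence passed in as binary strings; it is not a general UTF-8 decoder. It strips the 11110 prefix from the first byte and the 10 prefix from the other three, joins what is left, and prints the result as a code point.

```shell
# Hypothetical helper: convert a string of 0s and 1s to decimal.
bin_to_dec() {
  bits=$1
  value=0
  while [ -n "$bits" ]; do
    rest=${bits#?}            # everything after the first character
    first=${bits%"$rest"}     # the first character (0 or 1)
    value=$(( $value * 2 + $first ))
    bits=$rest
  done
  echo "$value"
}

# Hypothetical helper: decode one four-byte UTF-8 sequence.
# Assumes $1 starts with 11110 and $2..$4 start with 10; no validation.
utf8_4byte_codepoint() {
  # Drop the marker bits and keep the 3 + 6 + 6 + 6 payload bits.
  payload="${1#11110}${2#10}${3#10}${4#10}"
  printf 'U+%X\n' "$(bin_to_dec "$payload")"
}

utf8_4byte_codepoint 11110000 10011111 10100100 10010011
```

For the bytes shown above this prints U+1F913, the code point of the emoji.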