Unicode, Charsets, Strings, and Binaries
TALK LEVEL: BEGINNER / INTERMEDIATE / ADVANCED
Writing global software means our programs need to speak the world's human languages, but writing programs that work correctly with languages other than Western European ones is at best a confusing affair. UTF-8, Latin-1, Unicode: what do these terms mean, and how are they related to one another?
And what does Erlang do with them?
This talk demystifies the terminology around character encoding, explains how to retrofit an Erlang program for Unicode using Datometry HyperQ as a case study, and offers best practices to help you break the assumption that one byte equals one character.
THIS TALK IN THREE WORDS
Character sets
Character encoding
Clarity
OBJECTIVES
- Demystify the terminology around character sets and character encodings.
- Provide best practices to avoid common pitfalls.
TARGET AUDIENCE
Erlang developers working on internationalised software.