String Encoding and Collations
MemSQL’s default character set and collation differ from MySQL’s. MySQL defaults to the latin1 character set and the latin1_swedish_ci collation, while MemSQL defaults to the utf8 character set and the utf8_general_ci collation.
We picked utf8 to support a wider set of alphabets out of the box. The trade-off is that some strings require more space to store. Utf8 can represent every ASCII and Latin-1 character, but only ASCII characters occupy a single byte; all other characters, including the accented characters of Latin-1, are stored as multi-byte sequences. If you want fine-tuned control over the byte length of your columns, use the BLOB and VARBINARY families of types.
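The storage trade-off can be checked outside the database. The sketch below is plain Python rather than MemSQL code: it relies on Python's built-in "latin-1" and "utf-8" codecs, on the assumption that they are close enough to the latin1 and utf8 character sets for illustration. The helper name encoded_sizes is hypothetical.

```python
def encoded_sizes(text):
    """Return (character count, latin-1 byte count or None, utf-8 byte count)."""
    try:
        latin1_bytes = len(text.encode("latin-1"))
    except UnicodeEncodeError:
        # The string contains characters that latin-1 cannot represent.
        latin1_bytes = None
    return len(text), latin1_bytes, len(text.encode("utf-8"))


# Escapes are used so the samples do not depend on the file's own encoding.
samples = [
    "memsql",              # ASCII only
    "caf\u00e9",           # "cafe" with an accented e (a Latin-1 character)
    "\u65e5\u672c\u8a9e",  # three CJK characters, outside Latin-1
]

for sample in samples:
    print(encoded_sizes(sample))
```

The ASCII sample takes the same space in both encodings. The accented sample needs one extra byte in utf-8, and the CJK sample cannot be encoded in latin-1 at all.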
MemSQL currently supports only the utf8 character set. Using a character set other than utf8 for a string column results in an unsupported feature error (see Unsupported Features). All utf8 collations supported by MySQL are also supported by MemSQL. Future releases of MemSQL will have full character set and collation support.
Please contact MemSQL support at support@memsql.com if you have any questions or need a feature that is not currently supported.
