Understanding SMS lengths and 'units'

Modified on Mon, 15 May 2023 at 01:34 PM

For the most in-depth information, see https://en.wikipedia.org/wiki/GSM_03.38

Keep in mind that the 'basic' set of characters is limited. Some special characters in the 'Basic Extended Set' each count as two characters in a non-unicode message: ^\[~]|€{}

Anything outside of the basic and extended set forces the message to be sent as unicode.
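As a rough illustration of the rules above, the sketch below counts a message's 7bit length, charging two characters for each extended character and reporting when the message would have to be sent as unicode. The function name `gsm7_length` is hypothetical and not part of our API, and the basic set is transcribed from the GSM 03.38 default alphabet, so treat this as an approximation rather than the exact check our platform runs.

```python
# Basic GSM 03.38 7bit alphabet: each character counts as one.
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Extended characters listed in this article: each counts as two.
GSM_EXTENDED = set("^\\[~]|€{}")


def gsm7_length(text):
    """Return the 7bit character count, or None if unicode is required."""
    total = 0
    for ch in text:
        if ch in GSM_BASIC:
            total += 1
        elif ch in GSM_EXTENDED:
            total += 2
        else:
            # Anything outside both sets forces a unicode message.
            return None
    return total
```

For example, `gsm7_length("Cost: €5")` gives 9 rather than 8, because the euro sign counts twice.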

Using the 7bit encoding character set:

- A single message unit can be a maximum of 160 characters.

- If you exceed 160 characters, all 'message units' are calculated in multiples of 153 characters, with the other 7 characters of each unit used for concatenation headers (e.g. 2 SMS units hold a message of at most 306 characters).
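The arithmetic for 7bit messages can be sketched as follows. The function name `gsm7_units` is an assumption made for illustration, and `length` is expected to be the 7bit character count, with extended characters already counted twice.

```python
import math


def gsm7_units(length):
    """Number of SMS units for a 7bit message of the given length."""
    if length <= 160:
        # Fits in a single unit, so no concatenation header is needed.
        return 1
    # Every unit of a longer message carries only 153 characters.
    return math.ceil(length / 153)
```

So a 161 character message already costs 2 units, and a 307 character message costs 3.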

If any characters outside of the 7bit encoding set are used, the message is classed as 'unicode':

- A single unicode message can be a maximum of 70 characters.

- If you exceed 70 characters, units are calculated in multiples of 67 characters, because of the concatenation headers.

- Most emojis are 2 unicode characters (a small set are a single character).

- Some emojis are actually combinations of symbols that devices then display as a single symbol.

    - e.g. A 'coloured thumbs up' emoji is the thumbs up symbol followed by a colour (skin tone) symbol. What appears to be a single emoji therefore requires 4 unicode characters.

    - e.g. The 'family' emojis are actually a combination of 4 different symbols (man, woman, child, child) held together by 3 invisible joiner characters. The symbols take 8 characters and the joiners take 3 more, 11 in total.
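To see how a unicode message is counted, the sketch below assumes that each 'unicode character' in the rules above corresponds to one 16-bit UTF-16 code unit, which is why most emojis count as two. The function names `unicode_length` and `unicode_units` are hypothetical and are not part of our API.

```python
import math


def unicode_length(text):
    """Count the text in UTF-16 code units (2 bytes each)."""
    return len(text.encode("utf-16-be")) // 2


def unicode_units(text):
    """Number of SMS units for a message sent as unicode."""
    length = unicode_length(text)
    if length <= 70:
        # Fits in a single unit, so no concatenation header is needed.
        return 1
    # Every unit of a longer message carries only 67 characters.
    return math.ceil(length / 67)
```

For example, `unicode_length("\U0001F44D")` (a plain thumbs up) is 2, while `unicode_length("\U0001F44D\U0001F3FD")` (the same emoji with a skin tone) is 4.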


  • If you are checking a message length using our message builder UI, an 'on paste' filter changes a few known problem characters. (Apostrophes, single and double quotes, and ellipses use unicode characters when pasted from Microsoft Word.)

  • To help users of our APIs, we have a set of character replacements that attempt to keep a message non-unicode. These are similar to our 'on paste' filtering but also include things like: 0-width space, unicode space, tab character, hyphens and bullet point characters.
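If you would like to apply a similar clean-up on your own side before calling the API, something along these lines may help. The mapping below is an illustrative assumption based on the character types named above. It is not our actual replacement table, and the name `to_gsm_friendly` is hypothetical.

```python
# Illustrative replacements only; the real table may differ.
REPLACEMENTS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201C": '"',    # left double quote
    "\u201D": '"',    # right double quote
    "\u2026": "...",  # ellipsis
    "\u200B": "",     # 0-width space
    "\u00A0": " ",    # no-break (unicode) space
    "\t": " ",        # tab character
    "\u2010": "-",    # unicode hyphen
    "\u2013": "-",    # en dash
    "\u2022": "*",    # bullet point (replacement choice is an assumption)
}

_TABLE = str.maketrans(REPLACEMENTS)


def to_gsm_friendly(text):
    """Swap common unicode look-alikes for 7bit equivalents."""
    return text.translate(_TABLE)
```

Note that replacing an ellipsis with three full stops makes the message two characters longer, so check the length after the replacement rather than before.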
