You’ve crafted the perfect text message. Your campaign goes off without a hitch. Then, when you look at your costs, you see they’re five times what you expected, which leaves you wondering: what is a segment, and why am I being charged for so many of them?
We’ll pull back the covers on SMS standards to give you an answer.
Here’s what we’ll cover:
Understanding what a segment is and how it affects your bill
Encoding standards and headers you use to send messages
Crafting the perfect message
Subtle gotchas & pro-tips
Looking Back on Older Phones to Understand a Segment
Think back to when you first started texting. While hammering out messages on an older phone’s keypad, you may have noticed a counter ticking down from 160 next to a 1. When that counter hit 0, the 1 sitting next to it would jump up to a 2, and you’d end up with two messages on your bill. The first number counted how many characters you had left in the current segment, and the second counted how many segments you had used.
What’s Changed About Segments Since Back in The Day
SMS standards have barely changed since the days of flip phones. Messages are still sent in 140-byte parts, known as message segments.
When CGT communicates with carriers to send out SMS messages, we send them one segment at a time. To figure out how many characters this affords you, we’re going to have to do a little math.
A Little Math, Much Clearer Insight Into Segments
Standard SMS encoding uses the GSM 03.38 character set, which takes 7 bits to encode a character. 140 bytes × 8 bits per byte ÷ 7 bits per character gives us the 160-character message segment.
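The arithmetic above can be written out as a tiny sketch. The function and constant names here are illustrative and not part of any CGT API.

```python
SEGMENT_BYTES = 140  # size of one SMS message segment, per the text above


def chars_per_segment(bits_per_char: int) -> int:
    """Return how many whole characters fit into one 140-byte segment."""
    total_bits = SEGMENT_BYTES * 8  # 140 bytes * 8 bits = 1120 bits
    return total_bits // bits_per_char


# GSM 03.38 uses 7 bits per character.
print(chars_per_segment(7))  # 160
```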
Message segments are how CGT (and the SMS industry as a whole) counts messages.
This means that in addition to your costs, you should also think in terms of segments when you’re analyzing SMS throughput. Throughput varies by the sending number you’re using, but in all cases, it’s counted in terms of Message Segments per Second rather than total messages.
If getting your message out in a certain window is important to you, make sure you know how many segments you’re sending.
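As a rough sketch of that planning step, the estimate below multiplies recipients by segments per message and divides by the sending rate. The rate in the example is a made-up placeholder; the real figure depends on the sending number you use.

```python
def estimated_send_seconds(recipients: int,
                           segments_per_message: int,
                           segments_per_second: float) -> float:
    """Estimate how long a campaign takes when throughput is metered in segments."""
    total_segments = recipients * segments_per_message
    return total_segments / segments_per_second


# Hypothetical campaign: 1,000 recipients, 4 segments each,
# on a number assumed to send 10 segments per second.
print(estimated_send_seconds(1000, 4, 10))  # 400.0
```

The same campaign trimmed to a single segment per message would finish in a quarter of the time at the same rate.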
How Does the Perfect Message Behave?
Going back to your perfect text message, you count the characters, and something still seems off. You’ve only used 210 characters, but it looks like each of these messages has more than two segments.
Part of the answer lies in the encoding. Notice that this message has UCS-2 listed as the encoding instead of GSM. CGT has to use a different character set to accommodate a message as lit as this one. If you clicked on the GSM link above, you may have noticed that the character set doesn’t contain any 🔥’s. When you send messages with non-GSM characters, such as emojis, we have to use a different type of encoding known as UCS-2. UCS-2 takes 16 bits to encode each character, so going back to the math we did above, we now have a limit of 70 characters per segment (140 bytes × 8 bits per byte ÷ 16 bits per character). Besides emojis, you should also be careful with accented characters. GSM 03.38 includes some accented characters, such as ñ, à, and ö, but does not include others, such as á, í, or ú.
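Here is a minimal sketch of how you might check a message body yourself before sending. It is not CGT's implementation, and the function names are illustrative. It also accounts for two details the text above does not spell out: characters in the GSM extension table (such as { or €) cost two 7-bit units each, and emojis outside the Basic Multilingual Plane cost two 16-bit units each.

```python
# GSM 03.38 default alphabet (basic table): one 7-bit unit per character.
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# GSM 03.38 extension table: sent as an escape plus the character (two units).
GSM_EXTENDED = set("^{}\\[~]|€\f")


def detect_encoding(text: str) -> str:
    """Return "GSM-7" if every character is in GSM 03.38, else "UCS-2"."""
    if all(ch in GSM_BASIC or ch in GSM_EXTENDED for ch in text):
        return "GSM-7"
    return "UCS-2"


def encoded_units(text: str) -> int:
    """Count the units (7-bit or 16-bit) the message body occupies."""
    if detect_encoding(text) == "GSM-7":
        return sum(2 if ch in GSM_EXTENDED else 1 for ch in text)
    # Count 16-bit code units; characters beyond U+FFFF take two.
    return len(text.encode("utf-16-be")) // 2


print(detect_encoding("ñ, à, and ö"))  # GSM-7
print(detect_encoding("á, í, or ú"))   # UCS-2
print(encoded_units("lit 🔥"))          # 6
```

A single non-GSM character anywhere in the body switches the whole message to UCS-2, which is why one emoji can more than halve the room you have per segment.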
What Exactly Does a Data Header Do?
Still, it looks like with this 70-character limit, this message should only be three segments, not four. The last piece of the puzzle lies in concatenation. When you send multi-segment messages, CGT uses User Data Headers to tell the destination how to reassemble them. The header takes up 6 bytes of every segment, which leaves 134 bytes for content: only 67 characters per segment for UCS-2 encoded messages, or 153 for GSM encoded messages.
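Putting the header math together, the sketch below estimates a segment count from a message length and encoding. The names are illustrative, and it simplifies one thing: it assumes characters can be split freely at segment boundaries, so a real count can occasionally come out one higher.

```python
import math

SEGMENT_BYTES = 140  # size of one segment
UDH_BYTES = 6        # User Data Header added to each part of a multi-segment message
BITS_PER_UNIT = {"GSM-7": 7, "UCS-2": 16}


def capacity(encoding: str, concatenated: bool) -> int:
    """Characters that fit in one segment, with or without the header."""
    payload_bytes = SEGMENT_BYTES - (UDH_BYTES if concatenated else 0)
    return payload_bytes * 8 // BITS_PER_UNIT[encoding]


def segments_needed(length: int, encoding: str) -> int:
    """Estimate segments for a message of `length` characters."""
    if length <= capacity(encoding, concatenated=False):
        return 1  # fits in a single segment, so no header is needed
    return math.ceil(length / capacity(encoding, concatenated=True))


print(capacity("GSM-7", True), capacity("UCS-2", True))  # 153 67
print(segments_needed(210, "GSM-7"))  # 2
print(segments_needed(210, "UCS-2"))  # 4
```

This reproduces the surprise from the example: the same 210 characters cost two segments in GSM but four in UCS-2.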
Maybe it turns out the fire emojis aren’t worth it after all. However, when you trim the same message down and resend it, it still doesn’t seem to work out quite right:
This message contains two of the “gotchas” that commonly cause encoding issues: smart quotes and non-GSM spaces. Look at this message that appears almost identical:
There are only three characters that have been switched: the two spaces between sentences were changed from en spaces (U+2002) to regular spaces (U+0020), and the “smart quote” after Shakespeare (’, U+2019) was replaced with a standard apostrophe (', U+0027). Smart quotes are usually a result of text editors being too darn helpful. Non-GSM spaces are usually a result of copying and pasting. Be extra careful with those: message bodies that contain non-GSM spaces in the API are shown with regular U+0020 spaces when displayed, so you can’t spot them by eye.
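One way to guard against these gotchas is to normalize the message body before sending. The sketch below is an illustrative helper, not a CGT feature, and its replacement table is a small assumed sample rather than a complete list of look-alike characters.

```python
# Map common non-GSM look-alikes to their GSM-safe equivalents.
# Escapes are used so the invisible characters are unambiguous.
REPLACEMENTS = str.maketrans({
    "\u2018": "'",   # left single smart quote
    "\u2019": "'",   # right single smart quote / curly apostrophe
    "\u201c": '"',   # left double smart quote
    "\u201d": '"',   # right double smart quote
    "\u00a0": " ",   # no-break space
    "\u2002": " ",   # en space
    "\u2003": " ",   # em space
    "\u2009": " ",   # thin space
})


def normalize_for_gsm(text: str) -> str:
    """Replace smart quotes and non-GSM spaces with plain ASCII equivalents."""
    return text.translate(REPLACEMENTS)


message = "Shakespeare\u2019s best.\u2002Reply STOP to opt out."
print(normalize_for_gsm(message))  # Shakespeare's best. Reply STOP to opt out.
```

Running a check like this just before sending catches characters that a text editor or a copy and paste slipped in after you last counted.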
To help you calculate your message segment count, you can use tools like the Message Segment Calculator. This can be especially useful in ensuring that your messages are appropriately segmented and that you are aware of the potential costs and spam risks associated with longer messages.
Understanding message segment count is essential for effective communication in the digital age. To avoid having your messages marked as spam, it's crucial to craft messages that are relevant, contain genuine content, and adhere to established messaging practices. By doing so, you can ensure that your messages reach their intended recipients without being flagged as spam.