
Parsing phone numbers is a nightmare

My expectations for parsing phone numbers:

  1. Turn +[COUNTRY] [AREA] [PREFIX][SUFFIX] into a regex.
  2. Done.
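As a minimal sketch of that expectation, here is roughly what the one-regex approach looks like. The pattern assumes a fixed shape (a 1 to 3 digit country code, a 3-digit area code and a 7-digit local part), and the name `naive_parse` and the sample numbers are made up for illustration.

```python
import re

# Naive pattern for "+[COUNTRY] [AREA] [PREFIX][SUFFIX]".
# Assumption: 1-3 digit country code, 3-digit area, 3-digit prefix, 4-digit suffix.
NAIVE = re.compile(r"^\+(\d{1,3}) (\d{3}) (\d{3})(\d{4})$")


def naive_parse(number):
    """Return (country, area, prefix, suffix), or None if the shape differs."""
    match = NAIVE.match(number)
    return match.groups() if match else None


# Made-up sample numbers, for illustration only.
samples = [
    "+1 555 0100199",  # fits the assumed shape
    "+852 2345 6789",  # 8-digit local number with no area code
    "*5555",           # star-prefixed short number
]

for sample in samples:
    print(sample, "->", naive_parse(sample))
```

Only the first sample parses; the other two return `None`, because nothing about them fits the shape the pattern takes for granted.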

Reality:

  • The “country code” is not really a country code: +1, for instance, is shared by the US, Canada and several Caribbean nations.
  • A country can have more than one calling code. In fact, it can have as many as four.
    • In China, Macau (+853) and Hong Kong (+852) have their own calling codes, distinct from the mainland one (+86)
    • Kosovan phone numbers may start with +381 (Serbia), +386 (Slovenia), +377 (Monaco) or +383 (Kosovo), depending on where/when the number was registered
  • Not exactly parsing, but formatting a number for domestic dialing is very country-dependent
  • They can contain more than just digits and the + sign
    • In Israel, some advertising numbers start with an *
    • In New Zealand, some emergency numbers also start with an *
    • If you’re parsing numbers from plates or ads, you may also have to deal with:

And those are just some examples of the annoyances you may encounter when dealing with phone numbers. So what does this leave us with?
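Before moving on, the calling-code bullets above can be made concrete with a small lookup table. This is only a sketch: the table contains nothing beyond the examples in the list, the function names are hypothetical, and a real parser would need far more data than this.

```python
# Calling code -> regions whose numbers may start with it, using only the
# examples from the list above (not a complete or authoritative table).
CALLING_CODES = {
    "86": ["China (mainland)"],
    "852": ["Hong Kong"],
    "853": ["Macau"],
    "381": ["Serbia", "Kosovo"],
    "386": ["Slovenia", "Kosovo"],
    "377": ["Monaco", "Kosovo"],
    "383": ["Kosovo"],
}


def candidate_regions(number):
    """Return the regions a '+'-prefixed number could belong to."""
    if not number.startswith("+"):
        return []
    digits = "".join(ch for ch in number if ch.isdigit())
    # Calling codes are 1 to 3 digits long; try each possible prefix.
    for length in (1, 2, 3):
        regions = CALLING_CODES.get(digits[:length])
        if regions:
            return regions
    return []


def calling_codes_for(region):
    """Return every calling code in the table that the region may use."""
    return sorted(
        code for code, regions in CALLING_CODES.items() if region in regions
    )


print(candidate_regions("+377 000 000"))  # made-up digits after the code
print(calling_codes_for("Kosovo"))
```

The mapping runs in both directions: one code can point to several regions, and one region can sit behind several codes, so the prefix alone does not tell you where a number belongs.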

libphonenumber to the rescue!

In fact, international phone numbers are so chaotic that Google maintains an open-source library, libphonenumber, dedicated to this problem. To illustrate how chaotic it is, I counted more than 200 releases of the library in the Maven repository since version 3.0 (2011); the current major version is 8.x. The latest release was published just about three weeks ago, at the time of writing. So yeah, I think it’s not just a one-line regex.

The library itself is implemented in C++, Java and JavaScript, but there are community ports for other languages too (e.g. Python, Ruby, Go, C#).

But if, after reading this, you are still willing to parse phone numbers yourself (or are just curious), you will want to read Falsehoods Programmers Believe About Phone Numbers, which I used as the basis for this article, by the same authors as libphonenumber. It’s a comprehensive document on the (lack of) standards for phone numbers, for your delight and despair.