Monday, March 28, 2011

How to be culturally sensitive with numbers!


Ernest Rutherford, from Wikipedia. Image credits.

As a developer, it is important to be culturally conscientious, specifically when it comes to the formatting and parsing of dates and numbers. I overlooked this recently, and a user noticed some map coordinates that were displayed many orders of magnitude larger than they should have been. In this post I would like to present the issue I encountered and how to resolve it.

Here is the scenario. A third-party service returns a map coordinate in decimal degrees as a string. For clarity, this is represented by the following constant.

const string COORDINATE = "-117.182";

The goal is to convert this string representation to a number and assign it to the following variable.

double longitude;

The number is formatted using English (United States), or “en-US”, conventions, and my computer’s locale is set to the same culture. This means that “-117.182” is easily converted to a double as shown below.

double.TryParse(
    COORDINATE,
    out longitude);
MessageBox.Show(longitude.ToString());

Under the hood, the .NET Framework uses the computer’s current locale to guide both the number conversion and the subsequent display. On my en-US machine, the TryParse above is therefore equivalent to the following (NumberStyles and CultureInfo live in the System.Globalization namespace):

double.TryParse(
    COORDINATE,
    NumberStyles.Float | NumberStyles.AllowThousands,
    new CultureInfo("en-US"),
    out longitude);
MessageBox.Show(longitude.ToString(new CultureInfo("en-US")));
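
To see that the short overload really follows the ambient culture, a minimal sketch along these lines can help. ParseUnder is a hypothetical helper, not part of .NET. It assumes .NET Framework 4.6 or later (or .NET Core), where CultureInfo.CurrentCulture is settable, and it swaps the current culture only for the duration of one call.

```csharp
using System;
using System.Globalization;

public static class CurrentCultureDemo
{
    // Hypothetical helper: runs the two-argument double.TryParse while the
    // current culture is temporarily replaced by the one supplied.
    public static double ParseUnder(CultureInfo culture, string text)
    {
        CultureInfo saved = CultureInfo.CurrentCulture;
        try
        {
            CultureInfo.CurrentCulture = culture;

            // No culture is passed here, so TryParse falls back to
            // CultureInfo.CurrentCulture for its separators.
            double value;
            return double.TryParse(text, out value) ? value : double.NaN;
        }
        finally
        {
            // Always restore the original culture, even if parsing throws.
            CultureInfo.CurrentCulture = saved;
        }
    }
}
```

Called with CultureInfo.InvariantCulture, the helper should return -117.182 for "-117.182". Called with a culture whose number format uses "," as the decimal separator and "." as the group separator, the same string should come back as -117182.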

Problems occur when a number formatted using “en-US” is parsed on a computer set to a different locale, such as Spanish (Spain), or “es-ES”.

double.TryParse(
    COORDINATE,
    NumberStyles.Float | NumberStyles.AllowThousands,
    new CultureInfo("es-ES"),
    out longitude);
MessageBox.Show(longitude.ToString(new CultureInfo("es-ES")));

In Spain, the convention is to use “.” as a thousands separator rather than as a decimal mark as in the US. As a result, “-117.182” is read as -117182, one thousand times larger in magnitude than intended.

We cannot change the third-party service, so the solution is to explicitly parse the string using “en-US” or, more generically, the invariant culture.

double.TryParse(
    COORDINATE,
    NumberStyles.Float | NumberStyles.AllowThousands,
    CultureInfo.InvariantCulture,
    out longitude);
MessageBox.Show(longitude.ToString(new CultureInfo("es-ES")));

Now that the string is correctly parsed as a double, it can be displayed using the current (or any other) locale.
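
In production code it is also worth checking the Boolean that TryParse returns, which the snippets above ignore for brevity. Here is a minimal sketch of how parsing and display could be kept apart. The class and method names are hypothetical, and dropping NumberStyles.AllowThousands is an assumption about what a coordinate service should be allowed to send.

```csharp
using System;
using System.Globalization;

public static class Coordinates
{
    // Hypothetical helper: reads a value from the service using invariant
    // rules, independent of the machine's locale. AllowThousands is left
    // out (an assumption), so a stray "," fails instead of being skipped.
    public static bool TryParseWire(string text, out double value)
    {
        return double.TryParse(
            text,
            NumberStyles.Float,
            CultureInfo.InvariantCulture,
            out value);
    }

    // Formats a value for display using whichever culture the caller supplies,
    // for example CultureInfo.CurrentCulture.
    public static string ToDisplay(double value, IFormatProvider culture)
    {
        return value.ToString(culture);
    }
}
```

With this split, "-117.182" parses to -117.182 on any machine, while "-117,182" is rejected rather than silently becoming -117182.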

In summary:

  1. Whenever possible, serialize dates and numbers to strings using the invariant culture, and
  2. Never assume that dates and numbers are serialized with the current locale.
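
As a rough illustration of the first point, the sketch below round-trips a double and a DateTime through strings using the invariant culture. InvariantSerializer and its members are hypothetical names. The "R" and "o" round-trip format specifiers are standard .NET ones, but choosing them here is my suggestion rather than something the scenario above requires.

```csharp
using System;
using System.Globalization;

public static class InvariantSerializer
{
    // Writes a double so that it reads back identically on any machine.
    public static string Write(double value)
    {
        return value.ToString("R", CultureInfo.InvariantCulture);
    }

    public static double ReadDouble(string text)
    {
        return double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // Writes a date in the ISO 8601 round-trip ("o") format.
    public static string Write(DateTime value)
    {
        return value.ToString("o", CultureInfo.InvariantCulture);
    }

    // RoundtripKind preserves the DateTimeKind (UTC or local) written by "o".
    public static DateTime ReadDateTime(string text)
    {
        return DateTime.ParseExact(
            text, "o", CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
    }
}
```

For example, Write(-117.182) gives "-117.182" whatever the current locale is, and a UTC DateTime for March 28, 2011 is written as "2011-03-28T00:00:00.0000000Z".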

1 comment:

  1. Hello Richie!

    Great post as usual. Most people forget how date and number formatting is variable. I came across the same problem and solved the same way as you.

    It's better to ALWAYS enforce a culture-info, whenever possible of course.
