More on OCS Phone Number Normalization
By: Jeff Schertz | Posted: April 25, 2008 at 3:42 PM

(This post is a more in-depth follow-up to my original blog entry on the subject: Enabling Custom Phone Number Normalization with the Address Book Service.)

I recently took a short hiatus from the OCS TechNet discussion forums while concentrating on an Exchange project for the past few months.  While catching up on threads, I've noticed that there is still quite a lot of confusion regarding both how OCS handles phone numbers in the directories and how to understand the normalization rules.  This is not really surprising: not only is the product documentation a little light in these categories, but the sample Address Book Normalization file probably couldn't be any more confusing.  Reverse-engineering some of those sample rules is difficult, and the patterns used don't even match how PSS recommended they be written. (In my personal experience, at least.)

So I'll start off with a quick outline of how Office Communications Server 2007 and the Office Communicator 2007 (2.0) client handle phone numbers and then dive into some basic and complex translation rules.

Behavior

The OC 2007 Enhanced Presence Model White Paper has a very important paragraph on page 7, which basically states that many of the Active Directory attributes (Title, Work Phone, Mobile Phone, Home Phone, Other Phone, Company, Office, Work Address, and SharePoint Site) are visible to all contacts in the company regardless of what Access Level is granted to an individual.  For Federated contacts, Access Levels still control the sharing of this AD-populated data, and PIC contacts never see any of these fields.  I believe the idea is that if all that information is already populated in AD, and thus viewable by all users in the Global Address List via Outlook, then there is no reason to hide it.  But I think this should still be customizable behavior, as there is a big difference between looking at someone's phone number in the Address Book and accidentally one-click dialing your VP's mobile phone when using the Office Communicator client.  This default behavior has caused a scramble to remove non-work phone numbers from AD during some client deployments.

That said, let's look at how OCS handles and processes a specific telephone number, such as a user's Home Telephone number:

The AD attribute homePhone is already populated:

  • The Address Book Service processes the value and attempts to normalize it.
    • If correctly normalized then the number is inserted into the OCS address book.
      • Communicator will only display this normalized number if it is properly formatted in E.164 (+13125551234).
        • All OC clients will see this number regardless of their access level.
        • Users cannot disable the publishing of these numbers with their client Phones options.
      • If not in E.164 format then the number will not appear in OC.
    • If the number fails to normalize then it will not be written into the OCS address book.

The AD attribute homePhone is not populated:

  • The Address Book Service ignores this attribute
  • A user can enter their Home Phone number and choose to publish it.
    • Note: this does not publish the number into Active Directory, only the OCS address book.
    • Only contacts added to the user's Personal Access Level will see this number in Communicator.

I've created this flowchart to better illustrate the observed behavior:

image

Understanding Normalization Rules

It is important to note that OCS has two places where phone number normalization rules can be utilized, and it depends on the type of deployment: Enterprise Voice (EV) or Remote Call Control (RCC).

  • When utilizing RCC, any rules added to an EV dial plan are completely ignored by OCS; the Address Book Service must be used for normalization. The opposite holds true for EV deployments, but only to a certain extent: although dialing behavior isn't affected in the same way, the appearance of numbers in internal directories still needs to be considered.
  • Rules created in the dial plan for EV need to be encapsulated with ^ and $ characters, but these are not required in the Address Book Service's configuration file (Company_Phone_Number_Normalization_Rules.txt) as the ABS automatically inserts them when it processes the file.
  • Rules created for the Address Book Service are applied to all numbers in Active Directory by the ABS, while numbers entered into the OC client and pulled from a user's Outlook Contacts are processed by these same rules, but by the OC client itself. (The rules are downloaded to the client and stored in the registry during sign-in.)

The single best benefit to utilizing the Address Book Service for normalizing numbers comes when, for some reason, the data cannot be corrected in the source: Active Directory.  For the best OCS experience it is recommended to reformat AD phone number attributes to the standard E.164 format, but if this is not an option then the ABS can be used to 'fix' the numbers so that OCS can at least display and dial them as needed in a specific implementation.  (Keep in mind that reverse-number lookup may be adversely affected.)  Due to the E.164 requirement in OCS you may need to create additional rules on the connected PBX system to drop leading characters that OCS sends in case the PBX is expecting numbers in a different format.  Typically the PBX is expecting numbers in either a 10 digit + prefix format for external calls (918005551212) or a short format denoting an internal extension (2454).  If OCS is normalizing and sending numbers as +18005551212 and +2454 then the PBX would need rules to strip the +1 and replace it with 91 for external calls (assuming 9 is needed to dial-out in this scenario) and just strip the + from the internal call.
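To make the PBX-side manipulation concrete, here is a minimal Python sketch of the two rules described above. It is only an illustration of the digit logic, not how any particular PBX is configured; the function name `pbx_dial_string` and the `outside_line_prefix` parameter are made up for this example, and the 9 prefix and 4-digit extension length are the assumptions from the scenario above.

```python
import re


def pbx_dial_string(ocs_number: str, outside_line_prefix: str = "9") -> str:
    """Convert a number as sent by OCS into the digits the PBX expects.

    Hypothetical helper: mirrors the two PBX rules from the text.
    """
    # External call: "+1" followed by a 10-digit number.
    external = re.fullmatch(r"\+1(\d{10})", ocs_number)
    if external:
        # Strip "+1" and replace it with the outside-line prefix plus 1.
        return outside_line_prefix + "1" + external.group(1)

    # Internal call: "+" followed by a 4-digit extension.
    internal = re.fullmatch(r"\+(\d{4})", ocs_number)
    if internal:
        # Just strip the leading "+".
        return internal.group(1)

    # Anything else is passed through untouched.
    return ocs_number


print(pbx_dial_string("+18005551212"))  # 918005551212
print(pbx_dial_string("+2454"))         # 2454
```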

The foundation for understanding and creating custom normalization patterns is Regular Expressions, which are special text strings describing search patterns.  Each normalization rule is comprised of two strings, the Phone Pattern and the Translation Pattern.  The Phone Pattern is written to match the incoming phone number, depending on its source (AD, Outlook Contact, manually entered into the OC Find bar, etc), while the Translation Pattern is how we want the outgoing number to be formatted.

If this idea is completely foreign then I suggest following the link above to read up on Regular Expressions, as well as reviewing the OCS Deployment Documentation and other online resources.  Assuming that you understand how expressions like [0-9] and \d{4} work, let's move on.

Creating Address Book Rules

Most of the OCS documentation covers simple expression patterns that handle incoming number strings as dialed numbers, so something like ^(312)(\d{7})$ will correctly handle the string 3125551234.  But it will not handle a number pulled from Active Directory in the format (312) 555-1212.  Since the Address Book Service needs to read in potentially thousands and thousands of phone numbers from Active Directory, the format of those attributes yet again becomes important.  So again, if you can get E.164 format forced throughout AD then you are already ahead of the game, but if that is not possible then things could get a bit messy.  Some companies allow non-Administrators to maintain phone numbers by creating a custom web portal that allows the control of certain attributes.  These solutions typically force a standard format throughout all values of a specific attribute.  This is beneficial in that a simple phone pattern can be used since the format is known.
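A quick way to see the difference is to try the simple pattern against both strings. This sketch uses Python's `re` module as a stand-in for the engine OCS uses, which is an assumption on my part; the basic constructs used here behave the same way in both.

```python
import re

# The simple dialed-number pattern from the paragraph above.
simple_rule = re.compile(r"^(312)(\d{7})$")

for number in ["3125551234", "(312) 555-1212"]:
    matched = simple_rule.match(number) is not None
    print(f"{number!r}: {'match' if matched else 'no match'}")
```

The first string matches, while the AD-style string does not because of the parentheses, space, and dash.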

Let's say your particular AD infrastructure is not so standardized and there are any number of privileged individuals entering phone number attributes in whatever format their hearts desire: some with parentheses, some without, some with dashes, some with excessive spaces, etc.  This could create a mountain of carefully-ordered normalization rules for all the possible formats that the ABS would be required to deal with.  Luckily a single expression string can be written to handle almost any possible combination of characters, assuming at least the correct number of digits and order is used.

##
## Normalize all AD phone numbers to E.164
##

\+?[\s()\-\./]*1?[\s()\-\./]*\(?\s*([2-9]\d\d)\s*\)?[\s()\-\./]*(\d\d\d)[\s()\-\./]*(\d\d\d\d)[\s]*
+1$1$2$3

Now if we dissect the entire rule, it's much easier to understand exactly what it is doing each step of the way:

EXPRESSION       ACTION
\+?              Ignore the first character if it is a +
[\s()\-\./]*     Match any immediately following characters if they are a space, ( or ), dash, period or slash
1?               Ignore the next character if it is a 1
[\s()\-\./]*     Match any immediately following characters if they are a space, ( or ), dash, period or slash
\(?              Ignore the next character if it is an open parenthesis
\s*              Ignore any number of repeated spaces
([2-9]\d\d)      Capture the first 3 digits and store as the first variable. (The first digit must be 2-9, as in US area codes)
\s*              Ignore any number of repeated spaces
\)?              Ignore the next character if it is a closed parenthesis
[\s()\-\./]*     Match any immediately following characters if they are a space, ( or ), dash, period or slash
(\d\d\d)         Capture the next 3 digits and store as the second variable.
[\s()\-\./]*     Match any immediately following characters if they are a space, ( or ), dash, period or slash
(\d\d\d\d)       Capture the last 4 digits and store as the third variable.
[\s]*            Ignore any number of repeated spaces

+1               Insert +1 into the translation pattern
$1               Insert the value of the first captured variable
$2               Insert the value of the second captured variable
$3               Insert the value of the third captured variable
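If you want to experiment with the rule outside of OCS, a rough Python sketch like the one below can help. A few assumptions: Python's `re` module stands in for the engine the ABS uses, `re.fullmatch` imitates the ^ and $ that the ABS adds to each pattern, and the helper name `normalize` is made up for this example. The `$1`-style references are converted to Python's `\g<1>` form before expanding.

```python
import re

# Phone Pattern and Translation Pattern from the rule above.
PHONE_PATTERN = (
    r"\+?[\s()\-\./]*1?[\s()\-\./]*\(?\s*([2-9]\d\d)\s*\)?"
    r"[\s()\-\./]*(\d\d\d)[\s()\-\./]*(\d\d\d\d)[\s]*"
)
TRANSLATION = "+1$1$2$3"


def normalize(number, pattern=PHONE_PATTERN, translation=TRANSLATION):
    """Return the translated number, or None if the pattern does not match."""
    # fullmatch behaves like wrapping the pattern in ^ and $.
    match = re.fullmatch(pattern, number)
    if match is None:
        return None
    # Rewrite $1, $2, ... as \g<1>, \g<2>, ... so Python can expand them.
    template = re.sub(r"\$(\d)", r"\\g<\1>", translation)
    return match.expand(template)


for raw in ["(312) 555-1234", "+1 (312)555 - 1234", "312.555.1234", "555-2299"]:
    print(f"{raw!r} -> {normalize(raw)!r}")
```

The first three inputs all come back as +13125551234, while the 7-digit 555-2299 returns None because it does not contain enough digits for this rule.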

Using this very flexible rule, strings like (312) 555-1234 or +1 (312)555 - 1234 or even something wacky like +---1( ( ))312 . 555)--1234-)(.- . .) would all be normalized into +13125551234.  Now that we have a very general rule designed to normalize AD phone numbers into a format that will both correctly populate the OCS address book, and display correctly in the OC client, let's look at creating some more specific rules to handle proper routing of internal numbers and maybe some local exchange or local area code numbers.

Assume your company has a non-contiguous number space, which is quite typical given future expansion or changes in local or incumbent exchange carriers. Here are some examples for different contiguous number blocks which are translated into 4-digit extensions for internal PBX routing.

#
# 312-555-9500...9599
#
\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(95\d\d)[\s]*
$3

#
# 312-555-3540...3569
#
\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(35[4-6]\d)[\s]*
$3

#
# 312-555-8120...8127
#
\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(812[0-7])[\s]*
$3

These rules will allow for OCS to send only the 4-digit extensions to the PBX when dialing numbers within those ranges, keeping the call routing internal.
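To check where the edges of each block fall, the three patterns can be run against a few numbers in Python. This is just a sketch under the same assumption as before (Python's `re` standing in for the real engine, `fullmatch` standing in for the implied ^ and $); the `RANGE_RULES` list and `extension_for` helper are names invented for the example. Since every translation here is simply $3, the helper returns the third captured group.

```python
import re

# The three range rules from above, paired with a label for the block they cover.
RANGE_RULES = [
    (r"\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(95\d\d)[\s]*", "9500-9599"),
    (r"\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(35[4-6]\d)[\s]*", "3540-3569"),
    (r"\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(812[0-7])[\s]*", "8120-8127"),
]


def extension_for(number):
    """Return (block label, 4-digit extension) or None if no range rule matches."""
    for pattern, label in RANGE_RULES:
        match = re.fullmatch(pattern, number)
        if match:
            return label, match.group(3)  # translation pattern is $3
    return None


for raw in ["(312) 555-9500", "312-555-3569", "312.555.3570", "(312) 555-8128"]:
    print(f"{raw!r} -> {extension_for(raw)}")
```

Numbers just outside a block, such as 3570 or 8128, fall through all three rules and would be left for a more generic rule further down the file.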

Testing the Rules

The configuration file allows for simple testing of the rules, as can be seen at the end of the ABS Sample configuration file installed by default in OCS.  In order for the test rules to function they must be commented out (prefixed with '#').  Simply enter the TestInput value to match exactly how a number would be stored in AD, and then enter what the expected result should be for the TestResult value.

#
# Test strings used with the "abserver.exe -testPhoneNorm" command to verify each rule
#
# (All Test strings below should be commented-out for proper operation, do not remove the initial '#')
#

#TestInput: (312) 555-9500 TestResult: 9500
#TestInput: (312) 555-3551 TestResult: 3551
#TestInput: (312) 555-8126 TestResult: 8126

By executing the abserver.exe -testPhoneNorm command, each rule included in the configuration file will be processed, top-down, to find the first matching normalization rule and then return the results:

image

If the returned results match the expected TestResult parameter, then Test PASSED would be the result.  If the test fails, then look at the actual result to see if there is a problem with either the normalization rule or the order of the rules in the configuration file.  The first rule from the top that fits the pattern will be used, so make sure to put the most specific rules toward the top and the most generic toward the bottom.
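The effect of rule order is easy to reproduce with a small stand-in for the test run. The sketch below is not abserver.exe; it is a rough Python imitation of the first-match-wins behavior described above, with helper names (`translate`, `run_tests`) invented for the example and Python's `re` assumed to be close enough to the real engine for these patterns.

```python
import re

GENERIC_RULE = (
    r"\+?[\s()\-\./]*1?[\s()\-\./]*\(?\s*([2-9]\d\d)\s*\)?"
    r"[\s()\-\./]*(\d\d\d)[\s()\-\./]*(\d\d\d\d)[\s]*",
    "+1$1$2$3",
)
SPECIFIC_RULES = [
    (r"\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(95\d\d)[\s]*", "$3"),
    (r"\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(35[4-6]\d)[\s]*", "$3"),
    (r"\(?\s*(312)\s*\)?[\s()\-\./]*(555)[\s()\-\./]*(812[0-7])[\s]*", "$3"),
]
# (TestInput, TestResult) pairs taken from the test strings above.
TESTS = [("(312) 555-9500", "9500"), ("(312) 555-3551", "3551"), ("(312) 555-8126", "8126")]


def translate(number, rules):
    """Apply the first rule (top-down) whose pattern matches the whole number."""
    for pattern, translation in rules:
        match = re.fullmatch(pattern, number)
        if match:
            # Rewrite $N as \g<N> so Python can expand the captured groups.
            return match.expand(re.sub(r"\$(\d)", r"\\g<\1>", translation))
    return None


def run_tests(rules):
    """Return (input, expected, actual, passed) for every test pair."""
    return [(i, e, translate(i, rules), translate(i, rules) == e) for i, e in TESTS]


for name, rules in [("specific first", SPECIFIC_RULES + [GENERIC_RULE]),
                    ("generic first", [GENERIC_RULE] + SPECIFIC_RULES)]:
    print(name)
    for test_input, expected, actual, passed in run_tests(rules):
        print(f"  {test_input} expected {expected} got {actual}: "
              f"{'PASSED' if passed else 'FAILED'}")
```

With the specific rules on top all three tests pass; with the generic rule on top it swallows every number and returns the full +1312555xxxx string, so all three fail.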

Each time the Address Book Service regenerates (1:30AM by default) it may create a new Invalid_AD_Phone_Numbers.txt file in the same \Files subdirectory where the configuration and client address book files are stored.  Each attribute value for which the ABS was unable to find a matching normalization rule will be written to this file.

Unmatched number: User: 'jeff'  AD Attribute: 'homePhone'  Number: '555-2299'
Unmatched number: User: 'jeff'  AD Attribute: 'telephoneNumber'  Number: '4774'

These numbers are not in a 10-digit format; they either need to be fixed in AD or additional normalization rules added to handle the 7 and 4 digit formats.
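When that file gets long, it can help to summarize which digit lengths are failing before deciding what rules to add. Here is a small Python sketch that does so; it assumes every line follows the same layout as the two sample lines above, which is all I have to go on, and the function name `digit_length_summary` is made up for the example.

```python
import re
from collections import Counter

# Layout inferred from the two sample lines above (an assumption, not a documented format).
LINE_RE = re.compile(
    r"Unmatched number: User: '(?P<user>[^']*)'\s+"
    r"AD Attribute: '(?P<attribute>[^']*)'\s+"
    r"Number: '(?P<number>[^']*)'"
)

SAMPLE = """\
Unmatched number: User: 'jeff'  AD Attribute: 'homePhone'  Number: '555-2299'
Unmatched number: User: 'jeff'  AD Attribute: 'telephoneNumber'  Number: '4774'
"""


def digit_length_summary(text):
    """Count unmatched numbers by how many digits they contain."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            # Drop everything that is not a digit before measuring the length.
            digits = re.sub(r"\D", "", match.group("number"))
            counts[len(digits)] += 1
    return counts


for length, count in sorted(digit_length_summary(SAMPLE).items()):
    print(f"{length}-digit numbers without a matching rule: {count}")
```

For the sample above this reports one 4-digit and one 7-digit number, matching the two formats that still need either a fix in AD or a rule of their own.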


Comments
Re: question
By: Jeff Schertz | Posted: July 28, 2010 at 10:59 AM
Jeremy, it is a best practice to configure OCS to format all outgoing numbers as +E.164 (RFC3966), thus you really shouldn't be using OCS to strip numbers down to only 4 digits for outbound calls. If you are using a media gateway then this device is typically configured to perform the digit manipulation required to send only 4 digits to the PBX. This way both OCS and the PBX are sending/receiving number patterns in their native formats. But if this is a Direct SIP implementation then your best approach is to figure out a way to have the PBX strip the OCS outbound call strings down from +13125551234 to just 1234, for example.
question
By: Jeremy | Posted: July 28, 2010 at 10:45 AM
This is an awesome post, thank you very much. I'm running into the issue where with the GAL normalization, it's adding ";phone-context=enterprise" to the normalized number in the GALcontacts.db. Based on how we're configured with multiple locations, legacy PBXs and a staged deployment, this is causing a lot of pain. Because that value is added, the FE server normalization translation service is skipping it, so I can't strip off the npa-nxx and send to the PBX as the 4 digit extension. Any suggestions on this one trying to strip off the that part, but still have the GALcontacts.db populate the right value for a contact's work number?
Sometimes, 4 digit extensions don't match 4 digit DID
By: Ron | Posted: April 20, 2010 at 6:55 PM
In a multi-office environment, sometimes the 4 digit extension doesn't match the last 4 digits of the DID. For example, the DID is (312) 555-3551 but the 4 digit extension is 1500. How do I map 3551 to 1500?
OCS Normalization Rules
By: Roshan | Posted: January 8, 2010 at 10:54 AM
Hi Jeff, Thank you for your effort here. Our organization is geographically separated and uses only one OCS server, with users spread among the locations. Can we still use the ABS rule? I would like to know how I can normalize phone numbers using a normalization rule in a Location Profile so I can assign the right profile to users, not just in basic scenarios. I need some complex examples, for instance when a user enters a phone number like (888,886,888 or (416)373-7474, 416-3235432, 416.786.5643). Thanks in advance, Roshan
Confused
By: Brent | Posted: October 6, 2009 at 12:48 AM
Thanks, Jeff, for all the great info! I read the post and inserted the rule, and it tested fine. However, the numbers pulling in to MOC do not match the phone number in AD (not just a format issue - different numbers). So, I gave MS tech support a call and they said that phone numbers in ADS must be in E.164 format. Our ADS numbers were formatted (555) 666-1234. OK, so I changed my mobile record in ADS to +15556661234. Within a few minutes, MOC R2, in the forwarding feature, showed my mobile as +15556661234. So, I thought it was all good. After exiting MOC and logging back in, the forward number choice for my mobile was (555) 666-1234. Now I am just as confused as ever. Do you have any insight on this?
Re: <no title>
By: Jeff Schertz | Posted: July 23, 2009 at 8:26 AM
Peter, the rules specific to the Address Book Service are only set in the Company_... configuration file. The rules configured in OCS under the Voice Properties are specific to Enterprise Voice only. Typically you need to configure identical rules in both locations.
By: Peter | Posted: July 23, 2009 at 2:57 AM
hi, thank you so much for this great post. but this leaves me with one question, sorry if you already answered this. can i add phone number normalization rules in office communications server voice properties to normalize ABS phone numbers, or do i still have to create the "Company_Phone_Number_Normalization_Rules.txt" file?
Great Job
By: Stephen Basinger | Posted: February 25, 2009 at 4:22 PM
Thank you so much for writing this article. It was needed badly.
Reverse Lookup
By: AK Agarwal | Posted: December 5, 2008 at 10:28 AM
Great article Jeff. You mentioned "The single best benefit to utilizing the Address Book Service for normalizing numbers is if for some reason the data cannot be corrected in the source: Active Directory. For the best OCS experience it is recommended to reformat AD phone number attributes to the standard E.164 format, but if this is not an option then the ABS can be used to 'fix' the numbers so that OCS can at least display and dial them as needed in a specific implementation. (Keep in mind that reverse-number lookup may be adversely affected.)". Unfortunately, I am in this exact situation and my reverse look ups are not working. Can you provide a more detailed explanation of what the issue with reverse lookup is and how the process works.
Brilliant Post
By: Gaz Jones | Posted: November 24, 2008 at 4:25 AM
Thank god someone actually explained this - good work.
By: Ben | Posted: November 10, 2008 at 8:04 PM
Great post - very helpful - but the handling spaces has me baffled - and i just cant get it to work. I have some very basic rules such as ^(\d{4})$ $1 ^(\d{8})$ $1 which i basically want to say "if there's a space in any of this - ignore it (or even remove it - either way, i dont care) can anyone help
Nevermind
By: Paul Brown | Posted: October 1, 2008 at 12:10 AM
I should read ALL of the fine print next time. I had not moved the file into the address book output location to activate the customized list. Sorry
"Running 0 normalization rules tests"
By: Paul Brown | Posted: October 1, 2008 at 10:00 AM
I am having a slight problem with testing these rules. When I run abserver -testPhoneNorm, it returns "Running 0 normalization rules tests". I haven't really made any changes to my rules file, except for adding your test cases and your rule (it did the same before I made those changes, too). Any ideas on what's going on, and what I could perhaps check to get this working?
Re: How does this effect 2 Toasts?
By: Jeff Schertz | Posted: September 18, 2008 at 8:18 AM
I have not seen dual-toasts before, and I'm not sure WHY it operates the way it does; that's a question for MS themselves. I'm just trying to decode the behavior.
How does this effect 2 Toasts?
By: Robert Burnett | Posted: September 17, 2008 at 3:04 PM
I have numerous customer deployments complaining about 2 toasts. I have them add the Company.txt file and the toast merge. Do you have any info on how this is used in the incoming call scenario? Why is the AD telephone entry looked at and not the teluri? Our extensions are 101-1101 in AD. (\d{7}) +1425$1;ext=$1 Thanks, Robert
Re: great article, but what about CCM?
By: Jeff Schertz | Posted: May 16, 2008 at 9:29 AM
Actually, there are a few more caveats I noticed that just didn't seem to fit any particular pattern, OC would exhibit 'strange' behavior in a number of scenarios that seemed to contradict other rules. This forum discussion (http://forums.microsoft.com/unifiedcommunications/showpost.aspx?postid=3354211&siteid=57) covers a bit more of these other example, for instance if a 10-digit number is used without the leading +, then OC would not display it, but 4 digit numbers would display, regardless of whether the + was present. You really just have to fool around with the rules and set them to what is most flexible in a specific deployment.
Re: great article, but what about CCM?
By: Cameron | Posted: May 14, 2008 at 7:50 PM
Great article! The one piece that doesn't seem accurate is the implication that E.164 is necessary for proper display in the MOC AB. That hasn't been my experience, and your examples for 4-digit dialing seem to also contradict it. Maybe I misinterpreted your E.164 comments. My experiences with RCC normalization for OCS 2007 and UCM 6.x/CUPS 6.x has been that a normalization rule like the following allows the number to be displayed and properly RCC dialed in MOC:

# Convert 999-555-XXXX in the range of
# 555: 7000-7399 to 4-digit dialing
(999)[\s()\-\./]*(555)[\s()\-\./]*(\d*7[0-3]\d{2})
$3;phone-context=dialstring

This normalization mitigates the need to do a translation pattern on the UCM side. We have similar patterns to address external numbers (local, long distance, international) and common Outlook contact numbering. None of them use E.164. All display in MOC as expected. Our toast also seems to do reverse number lookup properly with the absence of the plus.
What about RNL ?
By: Eric | Posted: May 7, 2008 at 10:24 AM
Thank you for your article, explains a lot of black holes. But I am stuck with RNL (Reverse Number Lookup) with RCC mode, CUPS/CUCM 6: AD phone numbers are 4 digits, TEL URI are tel:+4digits;phone-context=dialstring and I included in phone normalization file one rule (\d\d\d\d) -> +$1 Numbers presented in Com toaster stay 4 digits (no +) with no name presentation (just other:4 digits) Thank you.
Re: great article, but what about CCM?
By: Jeff Schertz | Posted: May 7, 2008 at 8:52 AM
I can't recall the exact version of CUCM I last integrated with, but we created two Application Dial Rules to strip the leading + that OCS was sending with 10-digit and 4-digit strings.
great article, but what about CCM?
By: Markus Weidner | Posted: May 2, 2008 at 8:14 PM
As far as I can tell, CCM 6.1 does not provide a way to strip the + sign from E.164 formatted numbers from MOC 2007. So, a bit of a catch 22. I've seen some articles about using IP-IP gateways, but we already have CUPS so I would have thought there would be a way to achieve this... Better yet, I'd like to talk directly between MOC and CUCM *and* strip the + sign.
 

 About Jeff Schertz

Senior Consultant
Jeff Schertz is a senior consultant for PointBridge, focused on unified communications. He has over 10 years of experience in the IT industry ranging from family-owned businesses to global product dev...
