Troubleshooting “DOC_TOO_HUGE” Errors in Exchange Content Indexing

Yesterday afternoon, while troubleshooting an unrelated issue, I noticed that the Application Event Log on one of my Exchange 2010 Mailbox Servers was filling up with errors like this:

Event ID 9875

Unexpected error “DOC_TOO_HUGE: There are not enough resources to process the document or row” occurred while indexing document.

What? This was a new one on me. And, in a supreme act of lameness, the error does not give the name of the mailbox with the problematic message. Or the folder within that mailbox. Just hex values for various MAPI properties. Poo.

Naturally, I turned to the Font of All Knowledge and Wisdom: Google! Google wasn’t very helpful at first. There was nothing really useful in Microsoft’s official documentation or in the blogosphere. I did manage to stumble across a handful of forum posts (such as this one, which was the most helpful), but none explained very clearly what was involved. Here, I am attempting to rectify that.

It seems that in SP1 Rollup 5, Microsoft added a limit to the number of attachments in a message that the content indexer would attempt to tackle. If the indexer encounters more than 32 attachments on a message, the 9875 event shown above gets thrown. The trick, of course, is to translate those hex MAPI properties to something readable to track down the problematic message. That is where ExFolders comes in.

If you are an Exchange 2010 admin and don’t have ExFolders, what the heck are you waiting for? Go get it now! It is basically pfDAVadmin, rewritten to get around the fact that Ex2010 does not support WebDAV. It is very handy, as you will now see.

Once you have fired up ExFolders, go to the File menu and select Connect. (Much of the interface is the same as pfDAVadmin.) Make sure “Database” is checked, use the second “Select…” button to select the database mentioned in the event log entry, and click “OK.”

Leaving the root of the mailbox database selected (“Mailboxes”), click on the “Tools” menu and select “Export Folder Properties”. In the resulting window, choose “Selected folder and subfolders” and input a path and filename in the “Output file:” field, such as “C:\FolderID.txt”, then scroll down to the “ptagFID: 0x67480014” field and check it. Click “OK.”

What you have just done is instruct ExFolders to generate a text file listing EVERY FOLDER IN EVERY MAILBOX of the database you are looking at. The beauty here is that the list includes each folder’s folder ID (or “FID”) property. If you look back at the original Event Log entry, note the item that says “Folder ID”. That is the string you’ll want to search for in the output file; the matching line tells you which folder in which mailbox is having the problem, including the path to the folder.
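Searching a large export by eye gets old quickly, so here is a minimal Python sketch of that search. It makes some assumptions that are mine, not ExFolders': that the export is plain text with one folder per line, that it can be read as UTF-8 (pass a different `encoding` if yours cannot), and that the ID in the event log may differ from the one in the file only in letter case, dashes, spaces, or a leading `0x`. The function names are made up for illustration.

```python
import re


def normalize_hex(value: str) -> str:
    """Lowercase the text, drop whitespace and dashes, and strip a leading 0x."""
    cleaned = re.sub(r"[\s\-]", "", value.strip().lower())
    return cleaned[2:] if cleaned.startswith("0x") else cleaned


def find_id(export_path: str, wanted_id: str, encoding: str = "utf-8") -> list:
    """Return every line of the export file that contains wanted_id."""
    needle = normalize_hex(wanted_id)
    hits = []
    with open(export_path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            # Normalize the whole line the same way so formatting differences
            # (case, dashes, spacing) do not hide a match.
            if needle in normalize_hex(line):
                hits.append(line.rstrip("\n"))
    return hits
```

Calling `find_id(r"C:\FolderID.txt", "<Folder ID from the event>")` should hand back the matching line or lines, folder path included. The same helper would work on any other text export that lists one item per line.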

Okay, you are halfway there. Now to figure out which MESSAGE in that folder is causing the indexer to choke! Back to ExFolders.

Expand the list of mailboxes under (of all places) “Mailboxes” in the left-hand navigation panel, and find the listing of the problematic mailbox and navigate down to the relevant folder. (It will be somewhere under “Top of Information Store”.)  With that folder selected, open the Tools menu and select “Export Item Properties” and specify an output file (such as “C:\MessageID.txt”).

Now, before we go on, we are going to have to do something a little different than last time. The property we need isn’t on the list under “Properties to Export,” so we will have to add it. In the text box at the bottom of this window, type in (without the quotes) “ptagMID : 0x674A0014”, click “Add property to list”, and then click “OK”.

What we have just instructed ExFolders to do is generate a text file containing a list of all items in the folder in question. The list will contain, along with the path and name of each item and a bunch of other metadata, the MessageID (or MID) of each item. Look for the “Message ID” listed in the event log message, and that will tell you which message the indexer is choking on.

In our case, the message was entitled “failure notice” and was from the mailer daemon external to our Exchange system. In fact, there were a lot of messages like that in the Inbox in question. A LOT. With more constantly coming in. As it turns out, the problematic mailbox had a mailbox rule sending a copy of each incoming message to an address external to the Exchange system. And that address went away. So the mailbox was filling with bounces, each containing as an attachment the previous bounce, which had its own attachment, etc. No wonder the poor indexer was confused…..
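To get a feel for how quickly a bounce loop like that piles up attachments, here is a small Python sketch using only the standard library's `email` package. It builds a chain of fake bounces, each carrying the previous one as an attachment, and counts the attached messages at every nesting level. The subject line and body text are stand-ins, and I do not know exactly how the Exchange indexer counts attachments, so treat this as an illustration of the growth, not a model of the indexer.

```python
from email.message import EmailMessage


def wrap_as_bounce(previous: EmailMessage) -> EmailMessage:
    """Build a new 'failure notice' that carries the previous message as an attachment."""
    bounce = EmailMessage()
    bounce["Subject"] = "failure notice"
    bounce.set_content("Delivery failed; the original message is attached.")
    bounce.add_attachment(previous)  # stored as a message/rfc822 part
    return bounce


def attached_message_count(msg: EmailMessage) -> int:
    """Count attached messages at every nesting level (walk() descends into them)."""
    return sum(1 for part in msg.walk()
               if part.get_content_type() == "message/rfc822")


if __name__ == "__main__":
    message = EmailMessage()
    message["Subject"] = "hello"
    message.set_content("The message that started it all.")
    for generation in range(1, 6):
        message = wrap_as_bounce(message)
        print(generation, attached_message_count(message))
```

Each trip around the loop adds one more attached message to the next bounce, so by this way of counting it takes only a few dozen trips to pass 32.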

Weird Calendaring Voodoo

As an Exchange Admin, I am frequently called upon to troubleshoot rather odd calendaring behavior. Usually, the culprit is mixing of client types (something Microsoft recommends avoiding) or synchronization issues involving cached-mode Outlook or Exchange Web Services clients such as Apple Mail or Outlook:mac misbehaving. But, through it all, I frequently find myself wondering “Why do all of these folks run into these problems? I never do.”

I can’t say that any more.

Last week, I was out one day, and, per the procedure of our group, I placed a note about it on our group’s shared out-of-office calendar, using Outlook Web App to place the item directly on that shared mailbox’s calendar. Today, I received a message from my manager asking me to put my absence on the OOO calendar. Which I had already done. But he sent me a screen grab from his Outlook:mac client showing that it did not appear for him.

“Well,” I thought. “Maybe his Outlook:mac client isn’t properly syncing with the calendar.”  I opened up the calendar in Outlook (which I always run in online mode, never cached mode.) Sure enough, there it was. But then I noticed something else. My absence was the only item I could see on the calendar for that day, but the screen capture that my manager had sent included an item for HIS absence on that day. Which wasn’t showing on my view via Outlook. Furthermore, when I opened my item in Outlook, I was listed as the only attendee. The OOO mailbox was not. Yet this was an item on the OOO calendar and not on mine. That really shouldn’t happen.

Then the plot thickened. Since I have Full Access permission on the OOO mailbox, I fired up OWA and used the  “Open Other Mailbox” feature to open the OOO mailbox, and moseyed over to the calendar on the date in question.  There was my manager’s item. Mine wasn’t there.

I had my co-worker Doug take a look, just to make sure I wasn’t crazy. After all, I’m running on about 4 hours of sleep this morning, and I could have easily been misinterpreting what I was seeing. But I wasn’t.

Okay. The next logical course of action was to delete the item and re-create it. This time, I created it on my OWN calendar, adding the OOO mailbox as an invitee. All hunky-dory, except for one thing. When I looked at the OOO calendar via Outlook, the new item wasn’t showing up (even though I could see it in OWA). And my manager’s item still wasn’t there. I restarted Outlook. No joy.

Then, at Doug’s suggestion, I removed the OOO calendar from my Outlook view and re-added it. Now both items were showing up, and everything was as it should be. And I still have no clue why things were screwed up to begin with.

Moral of the story: If you get confused by weird behavior with an Exchange calendar, don’t feel bad about it. Even an Exchange admin with over a decade of Exchange experience can get befuddled by such things.

Issues with iCal Saving Calendar Items to Exchange in Snow Leopard

Recently, one of our Exchange users encountered an odd error while trying to use iCal to save a calendar event to our Exchange 2010 server. When trying to save a new event, the following error would crop up:

iCal can’t save the event “whatever the event name is” to the Exchange server.

The account *****@*****.**** currently can’t be modified. To discard your changes and continue using the version of your calendars that’s on the server, click Revert to Server. To save your changes on your computer until the problem is resolved, click Go Offline.

(Go Offline)     (Revert to Server) (Try Again)

Weird, eh?

After doing a bit of digging, we quickly found the issue documented here and there on the web. It seems that the problem stems from the fact that iCal does not by default use the system settings for Time Zones, and the above error pops up if there is a mismatch.

To fix the issue, it is necessary to go to the Advanced tab of the iCal preferences (iCal -> Preferences -> Advanced) and enable time zone support. Also, check the system’s time zone settings (System Preferences -> Date & Time -> Time Zone) and make sure that the correct time zone is set there.

Well, that could have gone better….

So here is the skinny on what went awry with the transition of AEMS to Exchange 2010 Client Access Servers.

Although most users did not encounter any difficulties, a sizable subset ran into problems serious enough that we felt a backdown of our change was warranted. Those problems break down into the following categories:

IMAP4 issues: we’ve already identified and corrected one source of issues for IMAP4 users, but are still trying to characterize a remaining issue (a process made more difficult by the fact that we have reverted our change).

Droid and iPhones running pre-iOS 4 software: In a 2010/2007 coexistence scenario, when an ActiveSync device connects to the 2010 CAS, Exchange looks up the account, notices that the user still has their mailbox on Exchange 2007, and sends out an HTTP 451 error, saying “Hey, you can’t talk to that mailbox here, but if you go over to this legacy address, you should be able to connect.” This does not seem to work with older iPhones or with any Android phones. We are currently trying to figure out a solution that doesn’t involve having each user of these devices manually modify their device settings to point to legacy.austin.utexas.edu, then reconfigure the device to point back to wmail.austin.utexas.edu once their mailbox is moved, a dicey prospect at best. Unfortunately, the Microsoft engineer with whom I was speaking this afternoon was not optimistic about us finding a good alternative solution for that.

Users with custom-hosted e-mail domains using non-Outlook clients: We have noted widespread problems for customers who are not using Outlook and have their client software (mobile or desktop) configured to use an email address other than their @austin.utexas.edu addy. This includes custom hosting customers, and folks who have their primary SMTP address set to use their @mail.utexas.edu address. The upshot of this is that autodiscover is broken for those address spaces, resulting in their connection not getting properly redirected to legacy.austin.utexas.edu. We MAY have come up with an interim solution which would involve DNS changes in those hosted domains. Alternatively, impacted users can configure their clients with their @austin.utexas.edu address. (Outgoing mail would still APPEAR to come from their other address.)
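For the curious, the HTTP 451 redirect described above for ActiveSync devices is simple for a client to honor. As I understand the ActiveSync HTTP protocol, the new address arrives in an `X-MS-Location` response header; that detail comes from my reading of the protocol, not from anything in our logs, so verify it against your own traffic. The function names below are made up, and this is a sketch of the client-side logic only, not code from any real device.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


def redirect_target(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return the URL a 451 response points at, or None if there is no redirect."""
    if status != 451:
        return None
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-ms-location":
            return value.strip() or None
    return None


def redirected_host(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return just the host name the client should switch to, if any."""
    target = redirect_target(status, headers)
    return urlsplit(target).hostname if target else None
```

A client that behaves would take the host from `redirected_host(...)`, update its server setting, and retry the request there. The devices giving us grief apparently skip that step.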

I should point out that any user that had reconfigured their client to point to legacy.austin.utexas.edu in order to make it work need not make any changes at this point, but they will have to revert that change once we finally migrate their mailbox to 2010.

I had said in my presentation last week that problems inevitably crop up with a transition as big as this. I REALLY would not have minded being proven wrong….

Update (June 14):

For the three main issues, here is where we currently stand on trying to come up with a fix:

1) IMAP4 with NTLM – support for this seems to have been dropped from Exchange 2010 RTM, but quietly restored in SP1. Our MS support engineer is double-checking this as I type, but I suspect that, at worst, all we will be able to do on this is tell IMAP4 users to make sure their clients are set to use Password auth. At best, we’ll figure out server settings to get this working.

2) Autodiscover failing with custom hosted domain email addresses – this will be relatively simple to fix. It requires a DNS change for every domain for which AEMS does custom address hosting.

3) Android failing to redirect – this is due to a bug in the Android code, which Google will likely not be fixing until the end of the year.  Having every Android user reconfigure their client – twice, once for the wmail switchover and once for their mailbox move – is obviously not an acceptable solution. We may be able to address this issue the same way in which we are handling Snow Leopard’s inability to handle the same redirect – setting up an iRule on the F5 load balancers to route all traffic from these devices to legacy.austin.utexas.edu to handle the transition of wmail to Exchange 2010, identify those users by parsing our IIS logs, then move those users to Ex2010 mailbox servers in a single batch while removing the iRule. Both setting up the iRule and scouring the IIS logs are somewhat complicated by the fact that Android devices are bizarrely not consistent in how they populate the user-agent string in the HTTP headers, frequently filling it with manufacturer and device model (which is rather useless data for protocol logging), rather than something consistent, useful and simple to key on like “Android v#.# EAS”.
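The log-scouring half of that plan could look something like the Python sketch below. It assumes the IIS logs are in W3C extended format (a `#Fields:` header naming the columns, space-separated values) and that `cs-uri-stem`, `cs-username` and `cs(User-Agent)` are among the logged fields. Because the user-agent strings are so inconsistent, the list of substrings to match is a parameter. The default shown is a placeholder, not a vetted list, and the function name is made up.

```python
from typing import Iterable, Set, Tuple


def activesync_users(log_lines: Iterable[str],
                     markers: Tuple[str, ...] = ("android",)) -> Set[str]:
    """Collect usernames whose ActiveSync requests carry a matching user-agent."""
    fields = []
    users = set()
    for raw in log_lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names for the rows that follow
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed or truncated rows
        row = dict(zip(fields, values))
        if "/microsoft-server-activesync" not in row.get("cs-uri-stem", "").lower():
            continue
        agent = row.get("cs(User-Agent)", "-").lower()
        user = row.get("cs-username", "-")
        if user != "-" and any(marker in agent for marker in markers):
            users.add(user.lower())
    return users
```

Feed it the lines of a log file (for example `activesync_users(open(path), markers=(...))`) and you get a set of usernames to batch up for the mailbox move. Expect to add manufacturer and model strings to `markers` as you discover them.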

Accessing Exchange from Linux

Once upon a time, there was an Exchange Connector for the Linux email client Evolution, and it was fairly good. Basically an OWA screen-scraper, it enabled Evolution to function as a Linux equivalent to Outlook, and (based upon my admittedly limited tinkering with it) it seemed to get the job done.

Then along came Exchange 2007, which changed how OWA worked, thus breaking the Exchange Connector. Linux users needing to connect to Exchange were faced with being restricted to POP3 or IMAP4 (assuming their Exchange admins had those legacy protocols turned on), using OWA (which is pretty limited in a non-IE environment), or running Outlook in a Windows VM.

But there are other options these days. In addition to the dramatically improved cross-browser OWA found in Exchange 2010, there is a MAPI plugin for Evolution! I’ve not yet tinkered with it, but this opens up quite a few options.

The Perils of Running Antiquated Operating Systems (I’m looking at you, XP users!)

It is easy to forget that Windows XP will turn a decade old this fall. That is a long run for an OS, and technology has continued to march on, yet many people still cling to XP. It is easy to see why. It was one of the more nimble and stable desktop OS releases that Microsoft has ever had. Being based upon the NT kernel, rather than DOS, it stood head and shoulders above the old Windows 95, 98, and ME releases. Basically Windows 2000 Workstation with added Mac OS X-inspired eye-candy, XP was a solid OS. When its successor, Windows Vista, was released in the fall of 2006, it was less than a resounding success. Sure, it was shiny and modern, but it was a resource hog with steep hardware requirements, and less stable than its predecessors. (To be fair, the latter issue was resolved with subsequent Service Pack releases.) Microsoft’s current desktop OS is Windows 7, which is essentially a souped-up Vista Service Pack, tuned to address many of the performance issues associated with Vista. It is a very nice OS, and Windows folk really should be using it, but a lot still aren’t.

I mention all of this because I ran into an issue last week caused by people clinging to XP. A user of my Exchange system (a Mac user running Outlook:mac 2011 – and yes, I have plenty of criticisms for that product as well) had found that some of the recipients of his digitally-signed messages could not read those messages. It turned out that the common factor amongst those recipients was that they were XP users. By default, when someone digitally signs a message using a personal cert on Outlook 2011, it uses SHA512 (a member of the SHA2 family) for its signing algorithm. But, as it turns out, the signing and encryption libraries in XP SP2 or earlier can only deal with messages signed using the SHA1 signing algorithm. XP SP3 added SOME SHA2 support, but it is quite limited.

http://blogs.technet.com/b/pki/archive/2010/09/30/sha2-and-windows.aspx
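If the SHA1/SHA2 distinction is fuzzy, this tiny Python sketch may help. It hashes the same bytes with SHA1 and two SHA2 variants and reports the digest sizes. It is not an S/MIME implementation and has nothing to do with how Outlook or XP do their signing. It only shows that these are different algorithms producing different digests, which is why a library that implements only SHA1 cannot check a SHA512-based signature. The helper name is made up.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the size, in bits, of the digest the named algorithm produces."""
    return len(hashlib.new(algorithm, data).digest()) * 8


if __name__ == "__main__":
    message = b"The body of a signed message"
    for algorithm in ("sha1", "sha256", "sha512"):
        print(algorithm, digest_bits(algorithm, message))
```

SHA1 yields a 160-bit digest, while SHA256 and SHA512 yield 256 and 512 bits respectively.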

So, if you are an XP user and can’t read a signed message, you’ll have to ask the sender to resend the message either unsigned or signed with SHA1. And consider upgrading. Please.

Microsoft Unveils RDCMan 2.2

Previously an internal-only tool, Microsoft’s Remote Desktop Connection Manager is now available for general use. And considering that I tend to have about a dozen RDC sessions open on my desktop at any given time, I have no doubt that I will find this utility useful.

All the gory details are available here.

Tips for Getting Better Support

Getting a technical problem resolved can be a frustrating and time-consuming process, both for the end user who needs help and for the support personnel trying to provide assistance. But there are a few basic steps that the end user can take to help streamline the process, resulting in a better experience for all concerned.

Find out if what you are experiencing is a known problem. If you are having trouble accessing a service, check the relevant status page to find out if the administrators are already aware of the issue and working on it.  Some service providers will have a status website to provide this information, others will provide notifications via mailing lists. (Here at UT, status updates for ITS services are provided at http://www.utexas.edu/its/alerts/.) If the administrators are already aware of the problem and are working on it, further notifications that “Server X” is down don’t really help. But if there is no indication that the administrators are aware of the issue, then certainly let them know of the problem through the relevant channels. If you are encountering a problem with software, take the time to check the relevant documentation and/or consult a search engine to find out if you are dealing with a known bug. In the process of doing so, you might encounter information about workarounds or bug fixes. Google really is your friend here. If you are running into a problem, odds are that someone else has as well.

At the risk of sounding repetitious (but it is an important point): read the relevant documentation. Frequently, there will be troubleshooting guides available that will help you solve many simple, commonly-encountered issues. This frees up support resources to work on issues that aren’t documented.

When reporting an issue, be specific. Saying “I can’t get to Server X” or “E-mail isn’t working” isn’t enough information for troubleshooting. What is the actual text of the error messages that you are getting? What exactly were you trying to do when the problem occurred? What client software and version are you using? What OS version are you using? When did you first notice the problem? When did it last work correctly? Did you make any changes on your end just before the problem cropped up, such as installing a piece of software or changing your configuration? If you are requesting a restore from backup, when was the missing data last known to be present? Being vague in your request for assistance only lengthens the time that will have to be spent gathering the needed information afterward.

Communicate. If the support staff request further information, please respond. If they ask you to try something, follow up with information about whether it worked or not. (If the suggested fix DID work, the support staff need to know so that they can close out your support ticket.) Otherwise, your support ticket will languish in the queue while support staff work on other issues.

Recognize that there are limits to what your support staff can do. For example, if you are encountering a bug in your software for which the developer has not issued a patch, the best that can generally be hoped for is a workaround. As another example, I am frequently asked to do mailbox restores so that a user can recover a missing crucial message, but what users tend to fail to grasp is that I cannot actually see the contents of the backups.  You need to tell me when the missing data was last present so that I can select the appropriate backup. (This is not the case for all backup solutions, but, alas, it is the case for the one we are using.) We are not deities. We do not have access to the source code of Windows. We do not have a magical “fix all my problems” button. But we will do what we can.