How to Install, Configure, and Manage the Mailbox Role in Microsoft Exchange Server 2013

  • 5/29/2015

Objective 1.5: Develop backup and recovery solutions for the mailbox role and public folders

Objectives that define what’s expected are as important as the capabilities of a backup and recovery solution. The time required to recover data, also known as Recovery Time Objective (RTO), and the point to which the data must be restored, also known as Recovery Point Objective (RPO), are two of the most important design objectives for any backup and recovery solution. Without defined RPO and RTO objectives, a backup and recovery solution can only be as good as the IT department’s guesswork about what users expect of the system. Even though designing a backup and recovery solution is beyond the scope of this book, understanding what defined RPO and RTO mean to Exchange 2013 is important. You learn about the features and functionality of Exchange 2013 that can help with the restoration of data and meet RPO/RTO objectives when data loss has occurred.

Managing lagged copies

Lagged copies are database copies configured to delay the replay of logs into the passive copy of the database. Exchange 2013 allows for a maximum of 14 days of lag. Unlike regular database copies that are designed to provide high availability, lagged copies are designed to provide protection against logical corruption. Logical corruption can occur when the data is expected to be written to disk, but despite an acknowledgement, the disk subsystem fails to write data to the disk. This is also known as a lost flush. Another possibility is that an application adds or updates mailbox data in a way that isn’t expected by the user. Because such an update is a valid MAPI operation as far as Exchange server is concerned, the resulting unexpected or malformed data is known as store logical corruption.

While Exchange server has a built-in detection mechanism that detects and tries to correct lost flush occurrences, operations that cause store logical corruption are valid MAPI operations. Such corruptions require external backup mechanisms, such as a lagged database copy to prevent data loss.

The time it takes to recover data using a lagged copy depends on the configured lag time for the lagged copy, the number of logs that need to be replayed to get to the point before corruption, and the speed at which the underlying hardware can replay the logs into the copy used for recovery.
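To make that dependency concrete, the following is a minimal Python sketch of such an estimate, not an Exchange tool. The function name and the sample figures (log generation rate and replay speed) are assumptions chosen for illustration only; measure your own environment before relying on numbers like these.

```python
def replay_estimate(lag_hours, corruption_age_hours, logs_per_hour, replay_logs_per_second):
    """Estimate the logs to replay into a lagged copy and the hours it takes.

    The lagged copy is lag_hours behind the active copy, and the corruption
    happened corruption_age_hours ago, so only the logs generated between
    those two points need to be replayed.
    """
    if corruption_age_hours >= lag_hours:
        # The corrupting change has already been replayed into the copy.
        raise ValueError("corruption is older than the configured lag")
    logs = (lag_hours - corruption_age_hours) * logs_per_hour
    hours = logs / replay_logs_per_second / 3600
    return logs, hours


# Hypothetical figures: 7-day lag, corruption 2 days ago,
# 1,000 logs generated per hour, hardware replaying 20 logs per second.
logs, hours = replay_estimate(168, 48, 1000, 20)
print(f"{logs} logs, about {hours:.2f} hours of replay")
```

The error branch reflects the limit of a lagged copy: if the corruption is older than the lag, the copy already contains it and another recovery method is needed.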

Creating a lagged database copy is as simple an operation as creating a regular database copy. When creating a lagged copy, you need to run the Add-MailboxDatabaseCopy cmdlet with the ReplayLagTime parameter, configured with the lag time span in dd.hh:mm:ss format. The TruncationLagTime parameter provides the ability to delay the truncation of logs that have been replayed into the database. You can set the truncation lag to a maximum of 14 days, the same limit as the replay lag time, but it shouldn’t be used on its own to provide protection against corruption. The important difference is the state of the database. Replay lag time prevents logs from updating the database copy by holding back log replay for the configured lag time. This provides you with the ability to replay only the logs generated before the time of corruption. Truncation lag preserves the logs, but only after they have been replayed into the database.
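Time span parameters such as ReplayLagTime take the days separated from the hours by a period (for example, 7.00:00:00 for seven days). As a small illustration, the Python sketch below formats a duration that way and rejects values above the 14-day maximum. The helper name is hypothetical and the code only builds the string; it doesn’t call Exchange.

```python
from datetime import timedelta

MAX_LAG = timedelta(days=14)  # maximum lag stated in the text


def to_exchange_timespan(lag: timedelta) -> str:
    """Format a duration as dd.hh:mm:ss, rejecting values outside 0 to 14 days."""
    if lag < timedelta(0) or lag > MAX_LAG:
        raise ValueError("lag must be between 0 and 14 days")
    total_seconds = int(lag.total_seconds())
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}.{hours:02d}:{minutes:02d}:{seconds:02d}"


print(to_exchange_timespan(timedelta(days=7)))            # 7.00:00:00
print(to_exchange_timespan(timedelta(days=1, hours=12)))  # 1.12:00:00
```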

If a lagged copy is an important aspect of your data recovery strategy, you also need to make sure a single lagged copy isn’t susceptible to corruption itself. Storing lagged copy data on a RAID array, or having multiple lagged database copies, is ideal, so a disk failure or corruption doesn’t invalidate your lagged copy.

When configuring lagged copies, you also need to account for the additional disk space required to store additional logs that would otherwise be truncated. The importance of required disk space is worth stressing because for a very active user profile, the amount of daily mail storage can be high. Now add many active users to make up a database and you might account for more than a few gigabytes of space daily. If the database copy is lagged to a maximum of 14 days, it can easily balloon to a considerable log storage.
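A rough sizing calculation shows how quickly retained logs add up. The Python sketch below is a back-of-the-envelope estimate under stated assumptions: each transaction log file is 1 MB, and the mailbox count and the per-mailbox daily log rate are hypothetical inputs that you would replace with figures measured in your own environment.

```python
LOG_FILE_MB = 1  # size of one Exchange transaction log file, in MB


def lagged_log_storage_gb(mailboxes, logs_per_mailbox_per_day, lag_days):
    """Estimate the disk space (GB) held by logs waiting out the replay lag."""
    if not 0 <= lag_days <= 14:
        raise ValueError("lag must be between 0 and 14 days")
    total_logs = mailboxes * logs_per_mailbox_per_day * lag_days
    return total_logs * LOG_FILE_MB / 1024


# Hypothetical database: 500 mailboxes, 40 logs per mailbox per day.
for days in (1, 7, 14):
    print(days, "day lag:", round(lagged_log_storage_gb(500, 40, days), 1), "GB")
```

With these illustrative inputs, a 14-day lag holds roughly 273 GB of logs for a single database copy, before accounting for any truncation lag or the extra copy you might preserve during recovery.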

When you configure a lagged copy, it’s also important to configure SafetyNet hold time to be equal to or more than lag time. This allows a lagged copy to request missing emails from SafetyNet successfully when activated. Increasing SafetyNet hold time has a direct impact on the disk space required to store emails protected by SafetyNet.

If SafetyNet is configured to exceed the lag time of a lagged database copy, the lagged copy can be activated without replaying the pending log files. This is because the activated copy requests the missing emails from SafetyNet, and SafetyNet can provide emails from the configured lagged time window.
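The relationship between the two settings is a simple comparison, sketched below in Python. The copy names and durations are hypothetical; in practice you would read the hold time from the transport configuration and the lag from each database copy.

```python
from datetime import timedelta


def uncovered_copies(safety_net_hold, replay_lags):
    """Return the lagged copies whose replay lag exceeds the SafetyNet hold time."""
    return sorted(name for name, lag in replay_lags.items() if lag > safety_net_hold)


# Hypothetical lagged copies and their configured replay lag.
lags = {"DB1-lagged": timedelta(days=7), "DB2-lagged": timedelta(days=14)}

print(uncovered_copies(timedelta(days=7), lags))   # ['DB2-lagged']
print(uncovered_copies(timedelta(days=14), lags))  # []
```

Any copy the function returns could not be fully resupplied from SafetyNet if it were activated without replaying its pending logs.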

When a database copy is created in a DAG, the BCS process can select a lagged copy for activation if it’s the only copy available for activation when the active copy of the database fails. You can exclude a lagged database copy from the BCS process by suspending the database copy with the ActivationOnly parameter of the Suspend-MailboxDatabaseCopy cmdlet. This only excludes the database copy from activation, while allowing the logs to be replayed up to the configured lag time.

When activating a lagged copy, the best practice is to make a copy of the lagged database and log files first. This provides you with an additional copy in case the activated lagged copy turns out not to provide all of the data expected, and you need to replay more logs or fewer logs than originally determined.

Objective 1.3, previously discussed, covers the process of activating a lagged mailbox database copy.

Determining the most appropriate backup solution/strategy

When considering appropriate backup solutions, understanding the impact of defined SLAs, such as RPO and RTO, is critical. For example, if the requirement dictates that the backup solution must be able to protect the environment from database corruption for up to 30 days, lagged copies can’t protect the data because their maximum configurable lag is 14 days.

If a requirement dictates that the time to restore data after data loss is reported must be less than 24 hours, you must account for the time it takes for offsite tapes to arrive, the time it takes to restore data from the tapes to the disk, the time it takes to replay restored logs and bring the database to its consistent state, and the time it takes to extract data from the recovery database into a PST file or a target mailbox.
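Adding up the stages makes it easy to see whether a tape-based restore fits the objective. The Python sketch below sums per-stage estimates and compares the total with a 24-hour RTO. All durations are hypothetical placeholders, not measurements.

```python
def restore_budget(stage_hours, rto_hours):
    """Return the total restore time and whether it fits within the RTO."""
    total = sum(stage_hours.values())
    return total, total <= rto_hours


# Hypothetical estimates, in hours, for each stage named in the text.
stages = {
    "offsite tapes arrive": 8.0,
    "restore from tape to disk": 6.0,
    "replay logs to a consistent state": 3.0,
    "extract to PST or target mailbox": 4.0,
}

total, fits = restore_budget(stages, rto_hours=24)
print(f"total {total} hours, meets RTO: {fits}")
```

With these sample values the restore takes 21 hours, which leaves only 3 hours of slack; a delay in any single stage could break the objective.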

DAGs provide the ability to recover not only from disk, server, and other local failures, but also from disaster scenarios when a DAG spans sites, providing site resiliency and disaster recovery capabilities.

Exchange 2013 also provides the ability to recover accidentally deleted items using its single-item recovery features. When combined with an appropriate retention policy, this provides a vast improvement over a tape-based backup and restore strategy, which takes considerable time to restore a few items from the backup. Compliance and data loss prevention features of Exchange 2013 reduce the time to recover deleted items, while reducing administrative overhead associated with the restore process.

Even when using lagged database copies for recovery, the size of individual mailboxes and the size of the database on disk are the factors that greatly impact your ability to restore data while meeting RPO and RTO requirements. Exchange 2013 supports large mailboxes and databases that can be larger than 2 terabytes (TB). When recovering such mailboxes, it takes time to extract data from a lagged copy to the recovery database, and then onto the target mailbox or PST file. As the mailbox size grows, so does its time to recover. While a lagged copy eliminates the time needed to restore data from tape, it still needs a replay of the logs required to reach the determined point in time for successful recovery. The larger the database, the greater the number of stored logs that need to be replayed into the lagged copy, directly impacting the time it can take to recover the data.

When selecting backup technologies for Exchange, you also need to ensure the selected backup technology is supported by Exchange 2013. Exchange server currently supports only Volume Shadow Copy Service (VSS)-based applications that support the VSS writer for Exchange 2013. This requirement ensures that Exchange is made aware of the backup process start and completion times, as well as other important information that helps Exchange determine the state of the backup. Upon successful completion, Exchange can truncate log files appropriately. The VSS writer functionality that was part of the Microsoft Exchange Information Store service was moved to the Microsoft Exchange Replication service. This new writer, named Microsoft Exchange Writer, is now used by Exchange-aware VSS-based applications, allowing them to back up from active or passive database copies.

Protection against the deletion of an entire mailbox is also built into Exchange 2013. When configuring a mailbox database, you have the ability to specify two retention-related parameters. While the DeletedItemRetention parameter doesn’t provide retention of an entire deleted mailbox, it enables you to configure the amount of time individual items deleted from a mailbox are retained and can be recovered. By default, this retention period is 14 days. This retention attribute applies to all mailboxes that don’t have their own deleted item retention value defined. MailboxRetention is the attribute that provides the ability to configure the retention period for a mailbox that was deleted. The database cleanup process won’t delete the deleted mailbox permanently until after the configured mailbox retention time has passed. If the deleted mailbox needs to be restored within the mailbox retention period, the administrator doesn’t need to rely on any lagged copies or other backup/restore applications. You can configure deleted item retention and mailbox retention parameters using the Set-MailboxDatabase cmdlet.
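Whether a deleted item or mailbox can still be recovered without a backup comes down to a date comparison against the retention period. The Python sketch below illustrates that check. The 14-day deleted item retention comes from the text; the 30-day mailbox retention value and the dates are assumptions used for illustration, and the sketch ignores the exact timing of the database cleanup process.

```python
from datetime import datetime, timedelta

DELETED_ITEM_RETENTION = timedelta(days=14)  # default stated in the text
MAILBOX_RETENTION = timedelta(days=30)       # assumed value for illustration


def recoverable_without_backup(deleted_at, retention, now):
    """Return True while the deletion is still inside the retention window."""
    return now <= deleted_at + retention


deleted = datetime(2015, 5, 1, 9, 0)  # hypothetical deletion time

print(recoverable_without_backup(deleted, DELETED_ITEM_RETENTION, datetime(2015, 5, 10)))  # True
print(recoverable_without_backup(deleted, DELETED_ITEM_RETENTION, datetime(2015, 5, 20)))  # False
print(recoverable_without_backup(deleted, MAILBOX_RETENTION, datetime(2015, 5, 20)))       # True
```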

With the ability to integrate a Lync 2013 server with an Exchange 2013 server, administrators can choose to store users’ Lync contact information in the Unified Contact Store, which is located in the user’s mailbox. If the backup being restored was taken before the Unified Contact Store integration was enabled, the restored data won’t contain the user’s Lync contacts. This results in the loss of Lync-related data while restoring the requested mailbox data. If the backup doesn’t contain the user’s Lync contacts, then it’s important to determine the status of the Unified Contact Store and to move the user’s contacts to the Lync server before performing the restore. The other important consideration is the time passed between the backup and the restore request. The user might have added more contacts since the last backup that contains their Unified Contact Store data. Restoring such a backup results in losing the contacts added after the backup was performed. This can be prevented by moving the user’s contacts back to the Lync server before the restore.

Performing a dial tone restore

Dial tone recovery allows the restoration of service to be separated from the restoration of data. In case of data loss due to server or site failure, where restoring data from backup is the only option, you might want to provide users with the ability to continue sending and receiving emails while the lost data is being recovered. Using dial tone recovery, you can create an empty database on the same server if only the database is lost, or on an alternate mailbox server if the server has failed. Users can continue using their mailbox to send and receive emails while the data is being restored. Once data is restored successfully, the administrator can merge the data, completely restoring the user’s mailbox.

If the database has failed, but the server hosting the original database is still functional, you can choose to create a dial tone database on the same server. This eliminates the need to reconfigure client profiles that were configured manually.

If the server hosting the original mailbox database suffers hardware failures, you can create a dial tone database on a different server. Clients using Autodiscover are automatically updated to a new server. Clients configured manually might need to be updated to connect to a new server before they can connect to the dial tone database.

The process of performing dial tone recovery is mostly similar in both cases, with minor differences. Additional steps, which are listed here, are required when using a different server for dial tone recovery.

  1. Create an empty dial tone database to replace the failed database.

    Creating the empty dial tone database is no different than creating a new mailbox database using the New-MailboxDatabase cmdlet. But you might want to make sure that any existing files of the database being recovered are preserved. This can be helpful if the files are needed for recovery operations.

    Create the dial tone database using the New-MailboxDatabase cmdlet.

    After creating the new database, all of the users from the failed database need to be homed to the newly created dial tone database. Use the Set-Mailbox cmdlet to rehome all of the affected mailboxes to the new dial tone database.

    Mount the dial tone database to allow client computers to connect and start using the new empty mailbox. For computers using Autodiscover, the configuration should be automatic. For clients with manually configured profiles, manual configuration needs to be updated before clients can connect to the dial tone database.

  2. Restore the old database.

    Restoring the old database depends on your backup method. If you’re relying on a lagged copy, determine the point in time to which you need to restore. Replay the required logs into a copy of the lagged database to bring the database to a consistent state.

    If you’re using VSS-based backups using Windows Backup or third-party backup software, restore the database using its respective restore mechanism.

    If the failure doesn’t require you to revert to a specific point in time, you can copy the logs from the point of backup to the current time if they’re available from the failed copy. This lets you roll the database forward to the point of failure. This preserves all possible data up to the point of failure.

    Use eseutil to replay the log files and bring the database to a consistent Clean Shutdown state. While this isn’t required, it provides better control over a recovery process and enables you to address any failures more interactively than mounting the database and allowing it to replay the logs.

    Create a recovery database. If you used eseutil, the recovery database won’t be used for the log replay process. If you didn’t use eseutil to bring the database to a consistent state, copy the recovered database and all of the required log files to the recovery database location. Mount the recovery database to force log replay and bring the recovered database to a consistent state. Dismount the recovery database after it mounts successfully, and then copy the recovered database files to a safe location.

  3. Swap the dial tone database with the restored database.

    At this point, your users are using a dial tone database and you have recovered the failed database. Now you need to swap the database files, so the dial tone database files are replaced with the recovered database. The dial tone database is smaller compared to the recovered database, so it’s easier to take dial-tone database files, mount them in a recovery database, and merge the dial tone data with the recovered data, while users connect to their recovered mailbox and continue to use the service. This process involves downtime and, until the dial tone data is merged with the recovered database to which users are connected after the swap, users won’t be able to access their newly created data.

    To swap the database, dismount the dial tone database and copy the dial tone database files containing newly generated user data to the recovery database file location. Ensure you have preserved and moved the recovered database to a safe location to avoid the risk of overwriting recovered data in the recovery database location.

    Now, copy the recovered database from the safe location to the dial-tone database file location, and mount the dial tone database. As discussed earlier, users can connect to their mailboxes containing recovered data, but they won’t have access to their newly created data.

    Mount the recovery database, which now contains the new dial tone data generated by users after the creation of the dial tone database and before the swap with the recovered data.

  4. Merge the data from recovery database to the dial tone database.

    At this point, you can issue New-MailboxRestoreRequest against each mailbox from the dial tone database. Using the recovery database as the source and the mailbox on the dial tone database as the target, merge the dial tone data from the recovery database into the dial tone database. Once complete, users have access to both recovered and dial tone data. The recovery process is complete and the recovery database can now be removed.
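Because the swap in steps 3 and 4 is easy to get backward, the following Python sketch models only where each set of data lives at each stage. It is a toy model with hypothetical labels, not a script that touches Exchange: the three dictionary keys stand for the dial tone database location, the recovery database location, and the safe location holding the recovered database.

```python
def swap(state):
    """Step 3: move dial tone data to the recovery location, then put the
    recovered database into the dial tone location that users connect to."""
    new_state = {name: set(data) for name, data in state.items()}
    new_state["recovery"] = set(state["dial_tone"])
    new_state["dial_tone"] = set(state["safe"])
    return new_state


def merge(state):
    """Step 4: merge the data in the recovery database into the database
    that users are connected to."""
    new_state = {name: set(data) for name, data in state.items()}
    new_state["dial_tone"] |= new_state["recovery"]
    return new_state


before = {
    "dial_tone": {"mail created after the failure"},
    "recovery": set(),
    "safe": {"mail restored from backup"},
}

after_swap = swap(before)
after_merge = merge(after_swap)
print(sorted(after_swap["dial_tone"]))   # users see only restored mail
print(sorted(after_merge["dial_tone"]))  # users see restored and new mail
```

Between the swap and the merge, users see only the restored mail, which is the temporary gap the procedure describes.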

Performing item-level recovery

When a user deletes items from their mailbox and the restoration of the items is required at a later date, either for legal discovery or because the user needs access to the accidentally deleted data, compliance and retention features of Exchange 2013 provide administrators with the flexibility to perform such recovery without requiring a lengthy recovery process of restoring from backups. This certainly impacts the online storage capacity and database size, and it must be carefully balanced not to impact one aspect of the system while addressing another.

The recoverable items folder provides the ability to retain the deleted items when the user accidentally deletes the mailbox items, or the items are deleted on purpose including purging, where the user is intent on permanently removing items from their mailbox. When the user empties the deleted items folder or uses hard delete, folders within the recoverable items folder, which are only accessible by the administrator, allow for the recovery of such items to meet recovery and compliance needs of the organization.

When litigation hold or single item recovery is enabled for a mailbox, the items that are hard deleted or removed from the deleted items folder are stored in the Purges folder in the user’s mailbox. This folder isn’t accessible to the user. Enabling single item recovery is a simple operation of setting the SingleItemRecoveryEnabled parameter to $true using the Set-Mailbox cmdlet.

Recovering messages using single item recovery is a two-step process. The search performed to find deleted messages recovers found items from the user’s mailbox to a defined mailbox, which can be any other mailbox except the source from which the messages are being recovered. While this isn’t a requirement, the discovery mailbox is typically an ideal target for such operations.

After the data is recovered into the target mailbox, the next step is to restore the recovered items to the source mailbox or to a PST file if needed.

To perform the search, issue the Search-Mailbox cmdlet with the SearchQuery parameter. The search query uses Keyword Query Language (KQL) syntax. KQL includes search elements such as subject, sender, and other email properties, or a free-form text search looking for specific content within a message. You also need to specify the mailbox where items found by the search query are stored.

Once the items are found and recovered to a specified mailbox, the next step is to run the same Search-Mailbox query on the mailbox where the items are recovered, and use the user mailbox as a target. This copies the recovered items from the mailbox used as a target in the first step to the user’s mailbox. You can also use the DeleteContent parameter in this step to delete the recovered items from the source mailbox after the content is restored to the user’s mailbox.

If these steps are used for a legal discovery process, the final target mailbox might not be the user’s mailbox and you might need to adjust the second step of the process accordingly.

If you need to export recovered data from the first step of the single item recovery process to a PST file, you can use the New-MailboxExportRequest cmdlet to extract data from the mailbox where it’s stored after running the Search-Mailbox cmdlet in the first step. You also need to specify a file share location that Exchange server has permission to, ideally an Exchange server, in order to avoid permission issues. This location is then used to store the exported PST file.

While not directly a backup or restore requirement, you might have instances where a user reports corruption on their mailbox items, such as folders reporting on an incorrect item count or search folders not functioning as expected. This isn’t a data loss, but a corruption of items that exist in the mailbox database.

Exchange 2013 provides the ability to address such corruption using the New-MailboxRepairRequest cmdlet. When you issue the cmdlet, you can specify a single mailbox to run the repair request against, or a mailbox database if you believe corruption is affecting more than one mailbox in a given database.

The operation of running a repair request is disruptive and the mailbox being repaired is unavailable for the duration of the repair operation. Because of the disruption potential and performance impact on the server, only one repair request can be active for a database-level repair and only 100 repair requests can be active for a mailbox-level repair per server.

Recovering the public folder hierarchy

Recovering public folder data has historically been a difficult request. Because public folders are now located on a mailbox database and use a similar mailbox architecture, the recovery of data follows a similar logic as discussed in previous topics. Depending on whether or not the deleted items are within the retention window, you need to restore data either by using the Recover Deleted Items option in Outlook or by using the recovery database if data needs to be restored from earlier backups.

However, the mailbox containing the public folder hierarchy plays a vital role for the public folder infrastructure. The loss of a primary or secondary hierarchy mailbox requires a restoration process that’s different for a primary and a secondary hierarchy mailbox.

The impact of losing a secondary hierarchy mailbox means user mailboxes configured to use that mailbox for a hierarchy might connect to other hierarchy mailboxes in the environment, which might not be optimal, depending on the user location. Most commonly, the public folder account hosting the secondary public folder hierarchy is also used to store the public folder content. When such a public folder mailbox is accidentally deleted, users are unable to access data contained in the deleted public folder mailbox.

When a secondary hierarchy mailbox is accidentally deleted, if it’s within the database retention period, it can simply be restored using the same steps as a user mailbox. You can simply connect the public folder mailbox back to the related Active Directory user account, which is created and disabled automatically when a public folder mailbox is created. If the Active Directory account deletion is the cause of the public folder mailbox being deleted, you can simply create a new user, disable it, and connect the recovered public folder mailbox to it. Use the Connect-Mailbox cmdlet to connect the disabled public folder mailbox to the related Active Directory user.

If the deleted mailbox is beyond the retention period, you need to recover the mailbox using the backup and recovery database. The process is similar to recovering a user mailbox.

If the deleted public folder mailbox contains public folder data as well, you must also point the public folders hosted on the deleted mailbox to an existing public folder mailbox or a newly created public folder mailbox. Use the Set-PublicFolder cmdlet with the OverrideContentMailbox parameter to point the public folder to an existing public folder mailbox. If you need to also restore the data from the deleted public folder, include the IncludeFolders switch.

When a primary public folder hierarchy mailbox is deleted, the impact on the public folder environment is bigger. This is because the primary hierarchy mailbox is the only mailbox in the environment that hosts a writable copy of the hierarchy. When the only writable copy of the hierarchy is missing, you can’t create new public folders. When using EAC, administrators are able to see the list of public folders in the environment.

While the restore process of the public folder mailbox with the primary copy of hierarchy is similar to other mailboxes, the restoration of the mailbox immediately initiates the full hierarchy sync with all the secondary hierarchy mailboxes. All of the changes made to the hierarchy between the time the primary hierarchy mailbox was last backed up and when it was deleted are lost. This includes newly created public folders and any updates to permissions on public folders.

This is also why it’s critical to protect public folder mailboxes hosting the primary hierarchy by using multiple database copies in a DAG, as well as to review backup procedures to ensure appropriate coverage exists to reduce exposure to data loss affecting the hierarchy and public folder permissions and content. The public folder account in Active Directory can be protected by enabling the Protect object from accidental deletion feature on the Active Directory container where the account is located.

Recovering a mailbox server role

When you lose a mailbox server due to a hardware issue or another event affecting the mailbox server, your mailboxes might survive the event if the server was a member of a DAG and if databases were configured with additional copies. If the server wasn’t a member of a DAG or the affected databases weren’t replicated, you can use concepts discussed earlier in this chapter to restore databases on different hardware, if available.

Recovering a mailbox server from failure requires the replacement hardware to have similar performance characteristics, have the same operating system version, and have the same drive letters and/or mount point configuration. You also need to determine the installation path if Exchange 2013 was installed in a nonstandard location. Because every Exchange server object is stored in the Active Directory, you can retrieve the install path from the Active Directory object using ADSIEdit or LDP.exe, if necessary. You can do so by inspecting the msExchInstallPath attribute on the Exchange server object located in the Configuration partition of Active Directory.

Once the required information is available, reset the Active Directory account of the failed mailbox server. For recovery to succeed, you need to install the same operating system on the replacement server and give the new server the same name as the failed server. The recovery will fail if the same name isn’t used on the replacement server. Join the server to the Active Directory domain. This step will fail if you didn’t reset the Active Directory account of the failed server because you’re trying to join the new server to the domain using the same name. After successfully joining the domain and installing the required prerequisites for Exchange 2013, you can start the Exchange 2013 setup from the command line with the /m:RecoverServer switch. You must also use the /TargetDir switch if Exchange was installed in a nonstandard location on the server. After the setup is complete, you might need to restore any custom settings applied to the failed server.

If the failed server was a member of a DAG and contained replicated database copies, the process looks slightly different. Before you start with any of the previous recovery steps, you need to remove any existing mailbox database copies from the failed server. This is a configuration change only because the server doesn’t exist anymore. Use the Remove-MailboxDatabaseCopy cmdlet to do so. Similarly, you also need to remove the failed server’s configuration from the DAG by using the Remove-DatabaseAvailabilityGroupServer cmdlet. You might even need to use the ConfigurationOnly switch if the failed server isn’t reachable on the network. You also need to evict the failed server node from the cluster using the Remove-ClusterNode cmdlet with the Force switch.

After performing these steps, perform the server recovery process mentioned earlier. Because the server was part of a DAG, after recovery you need to add the server back to the DAG using the Add-DatabaseAvailabilityGroupServer cmdlet. Add the mailbox database copies that existed on the server before the failure using the Add-MailboxDatabaseCopy cmdlet. Ensure that lag configuration is accurate for the lagged copies that might have existed on the server before the failure.

Objective summary

  • When configuring lagged copies, the impact on storage is higher because you are required to store a larger number of logs that can’t be truncated as they are for normal database copies. You should also account for additional storage space when a lagged copy needs to be used for the recovery process, because it’s ideal to preserve an extra copy before replaying logs into the lagged copy during the recovery process.
  • If relying on SafetyNet when activating a lagged copy for recovery, the SafetyNet hold time must match or exceed the lag time configured on the lagged copy to be effective during the recovery operation.
  • Dial tone recovery is a fine balance between service availability and data availability. If data must be available when the user is accessing their mailbox, using dial tone might not be an effective strategy. In such cases, users could be without email access until the required data is restored.
  • When using dial tone recovery, downtime can’t be completely eliminated. When data is restored from backup, swapping the recovery database with the dial tone database involves downtime. To eliminate downtime, you can merge data from the recovered database to the dial tone database. But the time it might take to completely restore all of the data depends on the amount of data that needs to be merged from the restored database. This is always higher than the data contained in the dial tone database for each user.
  • Single item recovery reduces administrative overhead and provides protection against both accidental and intentional deletions by users, but the feature isn’t enabled by default. Single item recovery and litigation hold can provide the ultimate protection against data loss, but at an additional cost for storage and other resources.
  • While recovery of public folder hierarchy is significantly simplified compared to previous versions of Exchange server, careful planning is still required to prevent the loss of the primary hierarchy mailbox. Because public folder mailboxes can be hosted on regular mailbox databases and can be protected by a DAG, it’s highly recommended to configure multiple database copies and include site resilience in the architecture where feasible.

Objective review

Answer the following questions to test your knowledge of the information in this objective. You can find the answers to these questions and explanations of why each answer choice is correct or incorrect in the “Answers” section at the end of this chapter.

  1. An Exchange administrator reports that a lagged copy was activated during an outage at a primary datacenter. The administrator has since reconfigured the lagged copy, but wants to prevent it from being activated in the future without manual intervention. What must you do to configure the lagged copy to meet the stated requirements?

    1. Suspend lagged copy.
    2. Suspend lagged copy for activation only.
    3. Remove permissions assigned to the Exchange Trusted Subsystem on the lagged copy folder.
  2. You have received reports of corrupt search folders from 50 users. You notice all of the users are on the same mailbox database. You want to fix the corruption in the shortest amount of time. What must you do?

    1. Issue New-MailboxRepairRequest against the mailbox database.
    2. Issue New-MailboxRepairRequest against the individual mailboxes.
    3. Distribute users to multiple databases and run New-MailboxRepairRequest on their mailboxes.
    4. Perform offline repair of database.
  3. When applying a new Exchange cumulative update on one of the Mailbox servers, the update failed. You need to fix the issue. What must you do? Choose all that apply.

    1. Restart the server and apply the update again.
    2. Restart the server and uninstall the failed update.
    3. Run setup.exe from the command line with the /recoverserver switch.
    4. Restart the server using the last known good configuration option. Reinstall the update.