Having been on a few projects now where the data migration was treated as a technical exercise and felt the impact of this – now it’s time to describe the data migration from a functional perspective.
Data migration strategy also called data conversion strategy is the basis for this.
Focusing on the functional perspective helps the business taking the right decisions early on in the implementation with focus on business driven decisions.
When the data migration is technical focused, major issues surface later on in the project, sometimes even as late as during user acceptance test where the impact is felt the most.
Another danger of treating the data migration as a technical exercise is that important management or legal aspects are missed and these issues can be expensive to miss.
When the data migration has been agreed upon this may need some technical work depending on what is migrated. However this is business as usual and is handled in normal manner with functional and technical design documents based on the requirements set out in the data migration strategy.
Data migration varies immensely depending on project size and complexity so the examples below are just guidelines and should be adapted to the actual project. So this article alone will not sort out all of your data migration considerations but hopefully it will send you in the right direction.
The data migration strategy is part of the overall project key deliverables setting out scope and project outcome expectations. The documents below are directly related but should be kept separate:
- Interface Strategy: Define system scope and integration need
- Data Migration Strategy: Define data retention for decommissioned systems or systems not being interfaced
- Cut-over Strategy: Define how to orchestrate close down of decommissioned systems and their data migration, switch-over of retained systems and related interfaces, and commence use of new systems
The "strategy" nomenclature indicates these documents are likely to have a significant impact on the overall project plan. Each of the above documents may also spawn a number of functional and technical specifications based on decisions made.
The primary purpose of the data migration strategy is to facilitate management decisions caused by system change – decisions to ensure:
- Continuation of daily business activities
- Correct management reporting
- Correct legal reporting and compliance
- Correct data retention
The responsibility of ensuring the above, is not a technical decision but a high level management decision. If the decision is taken at the wrong level or in the wrong perspective the result can be serious for the business which again falls back on management.
So therefore the target audience and signee of the data migration strategy is upper management.
A secondary purpose of the data migration strategy is to give project management an idea of the effort ahead – both in terms of functional and technical work.
Therefore the data migration strategy must be created early in the project either during a pre-analysis phase or as first phase in an already started project.
Technical aspects are only considered where circumstances require technical assistance to gauge feasibility to accomplish the task due to complexity or extremely high volumes.
The scope of the data migration strategy is highly dependent on the overall project scope and timeline. Often projects are phased and may have intermediate system scenarios during which the business need to continue to operate.
On a large scale project you may want to split the data migration strategy according to the implementation phases or sub-project schedules.
The above example is a project looking to replace an old finance; payroll and treasury system with a new integrated finance and payroll system. The business seek to prioritise the business critical retail system and in a later phase to replace payroll and treasury systems.
If the phases are treated as two separate projects it may be convenient to have two data migration strategies – one for phase one and one for phase 2. In case of parallel run; parallel implementations or downstream dependency of a decommissioned system, the data migrations may impact each other and should therefore be in one document.
I recommend that the interface strategy is created before the data migration strategy as the interface strategy sets out the overall system scope which has a downstream impact on the data migration strategy. The high level logical system diagrams from the interface strategy can be re-used in the data migration strategy to describe both scope and system changes.
The chapters below are just suggestions and will vary by scope and size of project.
In all sections it is important to ensure that content provides management with information to make the right decisions.
I normally split the document into three parts to address different levels of audiences:
- General Overview: Upper management and general interest
- System Analysis: Middle management and functional interest
- Entity Analysis: Functional interest mainly
In some rare cases a technical considerations part maybe required to describe impacts of very high volume or space requirements.
First part of the document should describe general scope and limitations enabling a high level executive to create an overview of the decisions ahead.
For instance use these chapters:
Purpose: Explains about the overall project and how the data migration strategy fits into this.
Scope: Lists key objectives; target systems and high level project plan with timeline; phases and milestones.
Limitations: This describes both high level business and legal limitations. Business limitations can the areas like seasonal impacts; other projects; reorganisations and mergers and acquisitions. Legal limitations are mainly based on statutory reporting like year-end; quarterly VAT reporting. In addition, Sarbanes-Oxley; data protection and sector specific rules may impact the data migration strategy.
Second Part should contain short description of systems impacted by the above changes and an analysis of these changes and an overview of logical entities involved in the change. This will enable middle management to understand the overall impact on their functional area.
For instance use these chapters:
- System Descriptions
- Change Analysis
- Logical Entities
System Descriptions: Describe systems as-is; to-be and any intermediate scenarios. All systems creating or modifying transactions or data should be described. All related systems should be considered whether they are decommissioned or retained as this may impact the data migrations.
The description should focus on logical systems like finance; retail or banking systems. Avoid assimilation between logical systems and physical hardware like "customer server" or "payment desktop".
I normally describe systems using data flow diagrams showing logical systems with logical data flows – so we do not want a technical diagram with components that just facilitate data transfer or basic transformations like network components; servers and hubs.
Data flows should describe flows of logical entities between logical systems like invoices; customers and other complete logical data. Transfer method; handshake and control flows are not relevant so do not have a flow called "ftp" or "ready".
Change Analysis: Describe system and data flow changes at a high level with focus on changes in functional areas, as focus on too much detail may impede on the understanding. The detail will be in next section.
- New systems
- Decommissioned systems
- Retained systems with changed or new interfaces
Logical Entities: Identify data as logical components to be migrated. Logical entities are always hard to define as it is subject to many misunderstandings. A resource with skills in logical data modelling may you managing this area.
A good acid test for, if an entity is really just one logical entity:
- The entity and its contents is created; used and deleted in the same business process
- The entity is owned by one functional area
- The entity is independently transferred to other systems
- Supplier and invoice: not a valid entity as the supplier is reused for multiple invoices. If you delete an invoice the supplier is not deleted. Split into two entities
- Invoice Line: not a valid entity as it is created/deleted in the same business process as the invoice. Just use invoice as the entity
These are simple examples and in many cases it can be hard to agree to a proper definition.
Third part should describe actions needed for each logical data entity identified above. This part is very detailed but helps to determine the expected impact and effort needed for each logical entity.
Line managers and subject matter experts should verify these entities and their impacts.
For this chapter I would normally have a table per entity.
For instance these table sections:
- Source/Target Systems
- Structure Analysis
- Pre-migration steps
- Migration steps
- Post-migration steps
- Security/Data Protection
Entity/Sub-Entities/Attributes: Name of logical entity from system analysis. Subdivision of the logical entity into various type or states of the entity. Typically an entity is a sub-entity when it serves a similar purpose as the main entity but where some information is different.
Attributes are the information on the logical entity. Only significant attributes identifying sub-entities or having an impact on the data migration should be described.
For example a supplier invoice may be split into a standard and prepayment invoice which essentially serves the same purpose but contains slightly different data.
Another example is a customer entity which may have sub-entities in form of business-to-business (B2B) and business-to-consumer (B2C). Example of attributes for B2B is "company name" and for B2C it is "first name" and "last name".
Scope: Description of what should be migrated. This often causes a lot of discussion as many compromises has to be made. Management often want a complete copy of the old system but in reality it would be too expensive and maybe even impossible to do.
So the decision here is a compromise between what is needed to run the business and what is cost/effective.
For example with suppliers you may only want to migrate suppliers that have invoiced you in the last year.
Similar with customers you may want to keep customers that have ordered within the last year. However marketing may have other ideas, in case they want to target inactive customers in the new CRM system?
So the decision can very fast become very complex as various departments may have different requirements.
Scope can also include data retention in case where only current data is migrated, then all other non-current data may have to be retained in a long-term readable form for legal purposes.
Source/Target System: Normally easy to determine based on the difference between as-is and to-be system diagrams, however in some cases there are multiple options from where to obtain data needed for the migration. A downstream system may be able to provide summarised data to an upstream system.
Dependencies: Dependencies identifies relationships to other entities. For instance before you can migrate invoices you may need suppliers converted first. When all the dependencies are identified this will give you an idea of how the actual data migration will be orchestrated. Data migration orchestration is a separate topic all together but the dependencies are crucial to its success.
Structure Analysis: Analysis of differences in entity structures, between source and target system. In some cases multiple entities has to be combined into one or vice versa. Complex structures like customer and contact hierarchies often have to be reorganised.
Volumetrics/Statistics: Is essential for the approach and complexity. For entities with multiple parts be sure to obtain volumetrics for all parts. For instance you may have very few invoices but if each invoice has thousands of lines it is still a complex task to convert these.
Statistical information can be helpful to determine when or how the conversion should be done. Often the term "dynamic" and "static" data is used but this is essentially a definition of data that statistically either is "changing" or "stable".
What is "dynamic" on one project may be "static" on another project.
For example if all payments must be made by the 5th then convert invoices after that date as the number of open invoices will be at a minimum then.
If you have a surge of transactions on Weekends then statistics may help you to consider to migrate data early in the week.
Transformations: Data transformation is the process of changing the data between the source and target. This activity can take place either at source; at target or during the transfer between source and target.
Examples of transformations:
- Enrichment – adding data to source data
- Cleansing – removing unnecessary data
- Mapping – matching related data to each other
- Summarisation – accumulating detail data
- Validation – indirect a cleansing to ensure valid data
Examples of transformations are:
- Capitalising customer names
- Adding missing addresses
- Removing salacious names
- Merging of identical suppliers
- Mapping of general ledger accounts from old to new accounts
- Postcode Address File (PAF) validation
- Truncation of long item descriptions
In addition to the transformations needed on logical entities there may be technical transformations needed but they should be described in a technical design document, like mapping of customer id’s between two systems.
Pre-migration Steps: Activities to be performed before the logical entity is migrated. This activity will not move any data but may reduce volumes; clear complex scenarios or ensure source data is in a controlled state. This may include some transformations.
For example paying unpaid invoices and closing the period in the source system before migrating invoices may help the reconciliation between source and target systems.
Migration Steps: Step to move data from source system to target system. This may include transformations during the transfer.
For example run a program and any needed parameters. How the program is made is a technical design document.
Post-migration Steps: Activities performed after the logical entity is migrated. This activity will not move any data but will complete the data in the target system.
For example this may be to reverse journals generated as a bi-product by the migrated data. The post-migrations may be connected to the controls below. For instance you may want to reverse the journals after you have reconciled the data migration.
Controls: Describe how to verify if the migration was successful. What to check depends on the logical entity and care should be taken to ensure a successful data migration really has been achieved.
For example opening balance journals can be checked with a trial balance in source and target systems. If complex mappings are taking place the target trial balance should contain a reverse mapping so equal data can be compared. This may impact the design of the migration so the target data is enriched with source data to facilitate the reverse mapping.
If data structures has been change the control should ensure this has been done correctly for instance with count of master/detail parts.
Approach: This is an initial assessment of how to migrate data in terms of manual; automated or assisted migration.
Manual conversion may cause a higher level of errors but may be the only option if the migration is complex. If the volume is high you may want to get outside resources to assist in the process.
Automatic is always desirable but the costs may be too high depending on volume. Also if automated conversion require intrusive data manipulation in the target system this may not be a viable option.
Assisted is to simulate manual entry. There are dedicated tools that simulate data entry based on spreadsheets or simple data files. Often tools like these may need to be monitored during the migration in case of unexpected errors.
Security/Data Protection: Logical entities and its attributes needs to be considered for security risks and may need to be masked or encoded. Another security issue is to reuse personal information in multiple systems without consent. Worst case the business needs to obtain consent from customers before the migration can commence or make a transformation based on whether consent has been obtained or not.
Examples of data that may need security or data protection considerations:
· Personal information
· Bank accounts
· Credit card details
· Salary information
· Mailshot addresses
· Personal phone numbers