First of all let me say that generally, I wouldn’t recommend linking workbook files together, but the question of how to link files comes up frequently and there are some circumstances when it is sensible to have separate workbooks with related data. So this blog post starts off with how to minimise risk when linking files. Following that, I identify some of the risks and pitfalls – I recommend reading to the end before you decide to go ahead and link your workbooks.

Linking Excel files

There are situations, discussed below, when linking workbooks together can be useful. If this is something you need to do, be sure to follow these points to avoid corrupting your data:

  1. Limit external links to situations where they are essential – use single workbooks where it is practical to do so.
  2. Use a deliberately designed interface (an ‘airlock’) between models to avoid corruption of links – see below for how to do this.
  3. Ensure that ‘master files’ are on a network drive accessible by everyone who needs access, not on individual hard discs.
  4. Stick to a clearly defined directory (folder) structure that will not need to change.
  5. Ensure that everyone needing access to the master file(s) has the same drive mapping for the network location.
  6. Ensure all workbooks that are referenced are opened and recalculated before any output is used.
  7. Consider protecting the files to prevent unauthorised change.
  8. Design and maintain a diagram of the ‘big picture’ showing all the files that are linked and how the data should flow between them.
  9. Document your approach and;
  10. Train all the users that need to access the linked files to ensure compliance.

Linked Spreadsheets – the case for the defence:

Why might we want to link workbooks together? We’re used to compartmentalising information, so it might seem intuitive to want to keep different data in different workbooks. A common reason is that central information is required by many people for different types of analysis; there is often a ‘master file’ which many people need to link to, to keep their data up to date. Another good reason is that different people may be responsible for different sections of the data, or that some data may be confidential and therefore must be kept separate from the rest. These reasons need to be weighed up against the risks, which are discussed below.

Risks – the case for the prosecution

The main risk in linking workbooks is that the links can become corrupt. This can happen if the referenced workbook is altered when the referring workbook is closed, or if changes are made and both workbooks are not saved.

External links are vulnerable to changes in workbook structure; for example if a referenced file has a row added above one of the cells referenced by another workbook, it is essential that both workbooks are open and both are saved after the change. Failure to do this will mean that the reference will point to the original location, but the data has moved, so the link will reference different data.

Range names are not a solution: you might be tempted to suggest named ranges as a way to manage links; I had the same thought. I did some testing to find out whether references to named ranges in another workbook can be corrupted by inserting rows in the source workbook when the referencing workbook is closed. The short answer is that they can, although this doesn’t happen in all cases (ie the behaviour is inconsistent). I would therefore strongly recommend that you don’t use named ranges for linking between workbooks.

Using an ‘airlock’ to reduce risk with linked workbooks

When it is necessary to have external links in workbooks, there are some techniques that can be used to reduce the corruption risk.

Firstly we need what I call an ‘airlock’ sheet in both models. These should be identical in structure.

  • In the source (referenced) workbook, the airlock sheet contains cells that link to all the data required by the dependent (referencing) workbook.
  • In the dependent (referencing) workbook airlock sheet, cells are a copy (see below for the options here) of their equivalent cell in the source (referenced) workbook’s airlock sheet. Because the two sheets are identical the data will be correctly transferred from one workbook to the other. Everywhere that the data is needed in the dependent workbook, cells should reference the airlock sheet.

The benefit of this approach is that, provided you keep the airlock sheets the same, you can change the rest of either workbook without risk or corruption.

You have three choices about how the ‘links’ are managed:

  1. Create formulae to link the cells in one workbook to the cells in the other – these are ‘external links’ and will cause the ‘Edit Link’ button to show up in the Data tab of Excel.
  2. Manually copy and paste the data between the files.
  3. If your Excel skills run to VBA, write some code to manage the transfer of data between the airlock sheets.

In any case, the structure of the airlock sheets should only be changed when there is a change in the definitions of the data to be transferred between the two workbooks (ie not just when the data itself is updated, but when there are more or fewer categories for example). When this kind of change is required, both workbooks must be opened, the changes made in the referenced workbook and the referencing workbook then both must be saved.