Most Teamstudio Export users stay at the surface, using the product without looking at how it works underneath. For the curious, though, there is more to explore. If you want to understand the structures that allow Export to transform and archive your historical Notes/Domino application data, this look at its underlying architecture offers insights beyond everyday operational use.
The Core Purpose of Teamstudio Export Archives
When you use Teamstudio Export to create read-only HTML and/or PDF versions of your Notes databases, the crucial first step involves generating an XML archive of the data. These XML archives are incredibly comprehensive, containing everything that was in the original Notes database, including its design elements and attributes. This completeness is vital because it ensures that your Notes data remains accessible in perpetuity, without any dependency on Notes clients or Domino servers.
An important aspect of these archives is their stability and backward compatibility. For example, an archive created with Export v1.0 back in 2018 could still be used to create a fully functional HTML archive with the latest Export v5.0, including all of its newer features, even though the XML archive format itself has changed very little over that period. This commitment to a stable archive format protects your long-term investment in data preservation, and it means that an understanding of the format remains relevant across current and future versions.
A Look Inside the ".tse" File
So, what exactly is hidden within those ".tse" files that Export produces? The answer is surprisingly straightforward: a single .tse file that holds the entire contents of a Notes database is nothing more than a standard ZIP file archive. Inside, it contains numerous XML files. Because these XML files are entirely text-based and don't include the view indexes that can inflate the size of an NSF database, they can compress down remarkably well.
If you're curious to explore an archive yourself, simply add a ".zip" suffix to the .tse filename, and you can decompress it like any other ZIP file.
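Because a .tse file is a standard ZIP archive, it can also be inspected programmatically without renaming it. The following is a minimal sketch using Python's standard zipfile module; the function name is illustrative and not part of Teamstudio Export.

```python
import zipfile


def list_tse_entries(tse_path):
    """Return the names of all entries stored in a .tse archive.

    tse_path may be a file path or a binary file-like object.
    """
    # zipfile identifies the ZIP format from the file contents,
    # so the .tse extension does not need to be changed first.
    with zipfile.ZipFile(tse_path) as archive:
        return archive.namelist()
```

Calling list_tse_entries("example.tse"), where example.tse is a hypothetical file name, returns the stored paths in the order they appear in the archive.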
Understanding DXL: The Language of the Archive
As you dig into the contents, you'll notice many files are encoded in DXL. DXL, or Domino XML, is the Domino version of Extensible Markup Language (XML). The structure of DXL is defined by the Domino Document Type Definition (DTD). This DTD provides the definitions of XML tags, allowing you to validate XML documents or understand those produced when exporting Domino databases into XML. The Domino DTD includes core entities, common entities, and various Domino elements, such as the form element.
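To make this more concrete, here is a small sketch that reads the items of a DXL document with Python's standard XML parser. The sample document is hand-written for illustration and is not taken from a real archive; real DXL files contain many more elements and attributes, and the helper name read_items is hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used by DXL elements.
DXL_NS = {"dxl": "http://www.lotus.com/dxl"}

# Hand-written sample for illustration only, not real archive content.
SAMPLE_DXL = """<document xmlns='http://www.lotus.com/dxl' form='Memo'>
  <item name='Subject'><text>Quarterly report</text></item>
  <item name='Priority'><number>2</number></item>
</document>"""


def read_items(dxl_text):
    """Return the form name and a dict of item name -> text content."""
    root = ET.fromstring(dxl_text)
    items = {}
    # Only direct <item> children of the document element are read.
    for item in root.findall("dxl:item", DXL_NS):
        # Join all nested text so simple text and number items both work.
        items[item.get("name")] = "".join(item.itertext()).strip()
    return root.get("form"), items
```

This flattens every item to plain text, which is enough for a quick look at simple fields but discards the structure of rich text and other complex items.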
Addressing DXL Inconsistencies: The Role of Your Notes Client
While DXL aims for a consistent and structured representation of your Notes data, it's important to be aware of potential inconsistencies that can arise, particularly when dealing with older HCL Notes client versions. DXL inconsistencies generated by old Notes clients can cause issues during archive generation.
Teamstudio strongly recommends using the most recent version of the Notes client possible on the workstation where Teamstudio Export is being used. This is important for two key reasons:
It helps avoid bugs present in older Notes versions.
It allows you to take advantage of the improvements IBM and HCL have made in the DXLExporter.
Several improvements were made to the DXLExporter between Notes 8.5 and 9, meaning that even a seemingly small version jump can help mitigate DXL-related export issues. In fact, customers who used Notes 8.5 to create their export archives generally encountered more problems than those using later versions.
These inconsistencies from older Notes clients can show up as various oddities in the DXL output, potentially leading to issues such as:
Random XML declarations placed in incorrect locations: For instance, an unexpected XML declaration (<?xml version='1.0' encoding='utf-8'?>) might appear within an existing tag, disrupting the expected XML structure.
Data excluded from or corrupted within the DXL: For instance, an image from a Notes document might be omitted entirely from its DXL representation, despite being perfectly visible within the Notes client. As a result, the image is missing from the document exported with Teamstudio Export, and an empty frame appears in its place.
Common DXL-based Export errors can be avoided by using a more recent version of the Notes client on the Export workstation.
By ensuring your Notes client is as up-to-date as possible, you can significantly reduce the risk of these DXL-related issues and ensure the highest fidelity in your Teamstudio Export archives.
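If you suspect that an archive was affected by the first kind of problem, a rough check for misplaced XML declarations is easy to sketch. The function below is an illustrative diagnostic, not something Teamstudio Export provides: it assumes the DXL has already been read into a string and reports the position of any XML declaration that does not sit at the very start.

```python
import re

# Matches an XML declaration such as <?xml version='1.0' encoding='utf-8'?>.
# The whitespace after "xml" avoids matching targets like <?xml-stylesheet ...?>.
XML_DECL = re.compile(r"<\?xml\s[^>]*\?>")


def find_stray_declarations(dxl_text):
    """Return character offsets of XML declarations that are not at the start."""
    # Drop a leading byte-order mark so a valid declaration is found at offset 0.
    text = dxl_text.lstrip("\ufeff")
    # An XML declaration is only legitimate as the first thing in the file.
    return [m.start() for m in XML_DECL.finditer(text) if m.start() != 0]
```

An empty result means no misplaced declaration was found. It says nothing about the second kind of problem, omitted or corrupted data, which cannot be detected from the DXL text alone.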
Navigating the Archive's Structure: Folders and Files
Upon unzipping a .tse archive, you'll find four primary folders at the top level (plus a 'design2' folder in archives created with Export 3.0 or later), along with several key files:
1. The Core Folders:
Data Folder: This folder contains an individual XML file for every document within the database. Each file is encoded using the DXL format and named based on the corresponding document's Notes ID.
Design Folder: As its name suggests, this folder houses an XML file for each design element, also encoded in DXL and named by its Notes ID.
An Important Evolution: Domino supports two styles of DXL: binary and default. While both are text-based XML, binary mode exports complex data values as Base64-encoded raw binary data, offering maximum fidelity but making the output hard to interpret without deep knowledge of Domino's internal structures. Prior to Teamstudio Export version 3.0, only the default mode was used, converting complex data to human-readable XML. From Export 3.0 onwards, Teamstudio Export also includes binary mode DXL for forms and views in a separate 'design2' folder so that the archive captures complete design information, while the human-readable DXL remains in the 'design' folder.
Profile Folder: You'll find one DXL-encoded file for each profile document from the database within this folder.
Views Folder: This folder contains one XML file for each view and each folder present in the database. It's worth noting that there isn't a standard DXL format for view data, so Teamstudio has defined its own format for these files, which is documented in the Export online documentation.
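For a quick overview of how much each of these folders holds, the entries of an archive can be grouped by their top-level folder. This is a sketch using Python's standard library; it makes no assumption about the exact folder names or their casing and simply reports whatever it finds.

```python
import zipfile
from collections import Counter


def count_by_top_level_folder(tse_path):
    """Count the files stored under each top-level folder of a .tse archive."""
    counts = Counter()
    with zipfile.ZipFile(tse_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            parts = info.filename.split("/")
            if len(parts) > 1:  # files at the top level have no folder part
                counts[parts[0]] += 1
    return dict(counts)
```

Since the data folder holds one file per document, its count should correspond to the number of documents that were archived.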
2. Top-Level Configuration and Log Files:
In addition to these folders, six crucial files reside at the top level of the directory tree:
acl.dxl: This DXL file contains all of the Access Control List (ACL) information for the database.
audit.txt: This is a UTF-8 encoded plain-text log file largely generated during the DXL export. It contains high-level summary information about the archiving process, such as the user who performed the archive, the start and finish times, and counts of the number of documents that archived successfully or encountered errors. This file complements the older log.txt.
db.dxl: This file captures critical database-level information, such as the replica ID.
log.txt: A plain text file that records any errors or warnings encountered during the archive creation process.
meta.xml: An XML format file containing metadata that Teamstudio Export primarily uses to maintain its user interface.
unidindex.txt: This is a plain text CSV (Comma Separated Values) file that maps Notes IDs to Universal Notes IDs (UNIDs). This mapping is essential as it allows Teamstudio Export to convert between these IDs during the HTML export process.
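As a simple sanity check, the presence of these six files can be verified without unpacking the archive. The sketch below takes the file names from the list above and compares them case-insensitively, since the exact casing inside an archive is an assumption here; the function name is illustrative.

```python
import zipfile

# File names as listed above; compared in lower case.
EXPECTED_TOP_LEVEL_FILES = {
    "acl.dxl",
    "audit.txt",
    "db.dxl",
    "log.txt",
    "meta.xml",
    "unidindex.txt",
}


def missing_top_level_files(tse_path):
    """Return the expected top-level files that are absent from the archive."""
    with zipfile.ZipFile(tse_path) as archive:
        # Top-level entries have no "/" in their stored name.
        present = {n.lower() for n in archive.namelist() if "/" not in n}
    return sorted(EXPECTED_TOP_LEVEL_FILES - present)
```

An empty list means all six files were found. Because audit.txt is described as complementing the older log.txt, its absence may simply indicate an archive created by an earlier Export version rather than a damaged file.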
Understanding the internal structure of Teamstudio Export's .tse archives provides valuable insight into how your legacy Notes data is preserved and prepared for future access. While you may rarely need to delve into these files, knowing that they are effectively ZIP files containing carefully structured XML, much of it DXL, can be very useful for anyone concerned with the long-term fidelity and accessibility of Notes data.