Category Archives: Schema Normalization

Summer update on schema conversion progress

Fellow TARO participants,

Here’s an update for you on our schema conversion progress. Generally speaking, the work is going well.

A big thank you to Minnie Rangel at UT Libraries for her work on this! And many thanks to the repositories going through this process with such good cheer. This is an important step forward for TARO.

We had hoped to finish Groups A, B and C before the end of the calendar year. Groups A and B will meet that timeline.
It looks like Group C will need to be converted in very early 2017.

Which repositories are in which groups and how does this work?

All of our “Group A” repositories (those using software that exports XML, such as Archon, Archivists’ Toolkit, or ArchivesSpace) have had their existing files converted to schema-compliant format. Almost all of them have corrected the very minor errors that popped up.
These repositories are now refining their workflows for submitting schema-compliant, TARO-friendly files.
ArchivesSpace users are up and running, using the ArchivesSpace guidelines on the TARO Today blog.
We are working on similar how-to information for Archon, Archivists’ Toolkit, and CuadraStar users, which will also be published and announced.
(Note: We discovered that CuadraStar exports DTD-based XML, not schema-based XML, so its users will have a slightly different process.)
All will be keeping in mind the new TARO Standards / Best Practices Guidelines.

Our “Group B” repositories of hand-encoders are starting to be converted now.
These folks, who use XML editors such as Oxygen and XMetaL or other tools such as Notepad++, will be making use of the new TARO Standards / Best Practices Guidelines (which also include XML templates, very handy for hand-encoders).

The first to be converted in this group will be:

  • San Jacinto Museum of History – Oxygen users – July 12-14
  • Texas State Library and Archives Commission – Oxygen users – July 26-27
  • Texas Tech University Southwest Collection/Special Collections Library – Oxygen users – August 2-4
  • The University of Texas at Austin. Benson Latin American Collection – Oxygen users – August 16-18

The remaining Group B repositories are still being scheduled and will be contacted soon individually regarding their proposed dates.
Group C repositories will likely be converted in early 2017.

Stay tuned for updates on this conversion work as the summer goes along, as well as our NEH planning grant final reports coming out later this summer.

Upgrading to schema compliance in 2016!

As promised in August 2015, we at TARO have been working diligently on preparing our system to move to the more modern format of schema-compliant EAD.
We have conducted our pilot project for moving to schema compliance.

We will start with volunteers for early conversion, with the rest following as training and support allow. No one will be rushed into conversion.
We will contact you in January 2016 to discuss this process, answer your questions, and hear when your repository would consider participating.
A specific TARO contact person will be available to you for questions and assistance throughout this process.

We will be ready starting in January 2016 to begin working with each repository one at a time to:
1.) Convert the repository’s existing files which are on TARO over to schema compliance. TARO’s Minnie Rangel will use an automated process and then work with repositories on manually following up on any errors (at the repository’s convenience, or at the time when the repository wishes to reload a given file for content changes). The time needed for this will vary from repository to repository, but shouldn’t be significant, and is not on a particular deadline.
2.) Give you the information you need in order to start submitting schema-compliant files to TARO from then on.
(You may still submit DTD-compliant files all the way up until the time your repository converts to schema-compliant submission.)
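
For repositories wondering which of their files are still DTD-based and which are already schema-based, here is a minimal sketch in Python. It is not the process TARO uses; it simply assumes that schema-based EAD 2002 files place the root element in the EAD 2002 namespace (urn:isbn:1-931666-22-9), while DTD-based files have an un-namespaced root. The function name classify_ead is hypothetical, and files that rely on entities defined only in the DTD may fail to parse with this approach.

```python
import xml.etree.ElementTree as ET

# Namespace used by the EAD 2002 schema (W3C XML Schema version).
EAD_2002_NS = "urn:isbn:1-931666-22-9"


def classify_ead(xml_text: str) -> str:
    """Return 'schema', 'dtd', or 'unknown' for an EAD document.

    Hypothetical helper: the decision is based only on the root
    element's namespace, not on full validation.
    """
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{EAD_2002_NS}}}ead":
        # Root is <ead> in the EAD 2002 namespace: schema-style file.
        return "schema"
    if root.tag == "ead":
        # Root is a plain <ead> with no namespace: DTD-style file.
        return "dtd"
    return "unknown"
```

Calling classify_ead on the text of each finding aid gives a quick inventory of what still needs conversion. It does not replace validating the file against the schema itself.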

We look forward to working with you on this and appreciate your participation, as this step is the basis for any additional TARO improvements.

Sincerely,
Amanda Focke, on behalf of the TARO Steering Committee

Overview of Encoding Survey

Last month, I solicited EAD templates and documentation from partner institutions to get a clearer picture of TARO’s EAD landscape. Thank you to the 24 institutions that answered the questionnaire and provided documentation. The responses and accompanying documentation illuminate some of the shared (or similar) encoding practices across the TARO partners, as well as areas of encoding diversity. This knowledge will help me and the Steering Committee make useful recommendations for incorporating a schema-compliant workflow into existing practices. The goal is to find that sweet spot between breadth and specificity, so that participation in TARO is both convenient and beneficial.

Overall, there is plenty of common ground among the respondents with regard to encoding workflows and processes. The following is a very general overview of the survey responses:

24 total responses

17 of the 24 institutions that responded to the survey described a process of encoding by hand, using previous finding aids and/or templates as guides. MS Word and Excel are common tools for creating collection inventories, which are then copied and pasted into an XML editor.

13 use Oxygen XML editor  

Finding aid creation is a multi-step, multi-tool process for everyone, and common ground bodes well as TARO moves toward greater standardization. Common tools, such as MS Excel and the Oxygen XML editor, can be incorporated and leveraged in best practices guidelines.

As of right now, fewer organizations use an archival management system (AMS), though a handful of respondents expressed plans to adopt one in the near future.

7 use AMS

3 ArchivesSpace
2 Archivists’ Toolkit
1 Archon
1 CuadraStar

As you may be aware, ArchivesSpace generates schema-compliant EAD. In fact, the ArchivesSpace output is sometimes stricter than the EAD 2002 schema. Currently, the institutions that use these archival management systems must reverse-edit their EAD back to DTD compliance to make it TARO compliant. With more organizations adopting (or at least considering) management systems, TARO must plan to accommodate current and future developments in technology. Updating the XML in TARO will not only improve the front-end user experience, but will also broaden potential participation.
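
To give a sense of what that reverse editing involves, here is a rough sketch in Python of one part of it: removing the schema namespaces so that element and linking-attribute names match their DTD forms. This is an illustration under stated assumptions, not any institution's actual workflow. The function name strip_schema_namespaces is hypothetical, and the sketch assumes the schema file uses the EAD 2002 namespace and XLink-qualified linking attributes. It does not add a DOCTYPE declaration or adjust attribute values, both of which a real reverse edit would also need to consider.

```python
import xml.etree.ElementTree as ET

EAD_NS = "urn:isbn:1-931666-22-9"                   # EAD 2002 schema namespace
XLINK_NS = "http://www.w3.org/1999/xlink"           # linking attributes
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"  # schemaLocation etc.


def strip_schema_namespaces(xml_text: str) -> str:
    """Return the document with EAD, XLink, and xsi namespaces removed.

    Hypothetical helper: handles names only, not attribute values
    or the DOCTYPE declaration that a DTD-based file would carry.
    """
    root = ET.fromstring(xml_text)
    ead_prefix = f"{{{EAD_NS}}}"
    xlink_prefix = f"{{{XLINK_NS}}}"
    xsi_prefix = f"{{{XSI_NS}}}"
    for elem in root.iter():
        # <{ns}ead> becomes <ead>, and so on for every EAD element.
        if isinstance(elem.tag, str) and elem.tag.startswith(ead_prefix):
            elem.tag = elem.tag[len(ead_prefix):]
        for name in list(elem.attrib):
            if name.startswith(xsi_prefix):
                # Drop schema-instance attributes such as xsi:schemaLocation.
                del elem.attrib[name]
            elif name.startswith(xlink_prefix):
                # xlink:href becomes href, and likewise for other XLink names.
                elem.attrib[name[len(xlink_prefix):]] = elem.attrib.pop(name)
    return ET.tostring(root, encoding="unicode")
```

Moving TARO to schema-compliant EAD removes the need for this kind of round trip for institutions whose systems already export schema-based files.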

The greatest variation across the respondents appears (quite obviously) in the documentation, instructions, and templates of each contributing institution. A large consideration going forward is finding the optimal level of standardization that benefits all contributing institutions. Participation in TARO should be easy, perhaps effortless. With this goal in mind, the question we need to ask is:

How can we reduce redundancies between unique institutional workflows and contributing to TARO?

Feel free to continue this conversation, especially if you feel that the overview above does not represent how your institution creates EAD.