Category Archives: Schema Normalization

Update on schema conversion Summer 2016

Fellow TARO folks,

We are wrapping up schema conversion for the repositories that already create schema-compliant EAD using ArchivesSpace and Archon (known as “Group A”).
Directions for those repositories are online at the TARO Today blog (using Archon, and using ArchivesSpace).

We have begun working individually with the hand encoders known as “Group B,” as listed below.
San Jacinto Museum of History was converted last week and the process went smoothly.
Updated instructions for working with TARO-friendly EAD are available here.

Below is the schedule for Group B, with Group C following in early 2017.
Basic info on how the conversion process works and which group each repository is in

Thanks, everyone!
Amanda Focke, TARO Steering Committee co-chair

account | repository | conversion date
sjmh | San Jacinto Museum of History (Oxygen) | July 12-14
tslac | Texas State Library and Archives Commission (Oxygen) | July 26-27
swcpc | Texas Tech University (Oxygen) | August 2-4
tturb | Texas Tech University, Rare Books (Oxygen) | August 2-4
ttusw | Texas Tech University, Southwest Collection (Oxygen) | August 2-4
ttuua | Texas Tech University, University Archives (Oxygen) | August 2-4
ttuav | Texas Tech University, Audio Visual (Oxygen) | August 2-4
utlac | The University of Texas at Austin. Benson Latin American Collection (Oxygen) | August 16-18
utlsc | H.J. Lutcher Stark Center, University of Texas at Austin (Notepad++) | August 30-Sept 1
uttyler | University Archives and Special Collections, The University of Texas at Tyler (limbo between Archon/AS) | August 30-Sept 1
utsa | University of Texas San Antonio (Oxygen) | August 30-Sept 1
tsusm | Texas State University (Oxygen) | Sept 13-Sept 15
dalpub | Texas/Dallas History and Archives Division, Dallas Public Library (NoteTab) | Sept 13-Sept 15
utlaw | Tarlton Law Library, University of Texas at Austin (Oxygen) | Sept 13-Sept 15
utmb | Truman G. Blocker, Jr. History of Medicine Collections, Moody Medical Library, University of Texas Medical Branch (Oxygen) | Sept 27-Sept 29
smu | Southern Methodist University (Oxygen) | Sept 27-Sept 29
hamtmc | Houston Academy of Medicine-Texas Medical Center Library, John P. McGovern Historical Collections and Research Center (Oxygen) | Sept 27-Sept 29
apts | Austin Presbyterian Theological Seminary (Notepad++) | Oct 11-Oct 13
aushc | Austin History Center, Austin Public Library (NoteTab) | Oct 11-Oct 13
utarl | University of Texas Arlington Library, Special Collections (XMetal) | Oct 25-Oct 27
uthrc | Harry Ransom Humanities Research Center, University of Texas at Austin (Oxygen) | Nov 8-Nov 10
drtsa | Daughters of the Republic of Texas Library at the Alamo (Oxygen) | Nov 29-Dec 1
houpub | Houston Public Library, Houston Metropolitan Research Center (limbo between AT/AS) | Nov 29-Dec 1
utcah | The University of Texas at Austin. Dolph Briscoe Center for American History (Oxygen) | Dec 13-Dec 15

Summer update on schema conversion progress

Fellow TARO participants,

Here’s an update for you on our schema conversion progress. Generally speaking, the work is going well.

A big thank you to Minnie Rangel at UT Libraries for her work on this! And many thanks to the repositories going through this process with such good cheer. This is an important step forward for TARO.

We had hoped to finish Groups A, B and C before the end of the calendar year. Groups A and B will meet that timeline.
It is looking like Group C will need to be converted in very early 2017.
Which repositories are in which groups and how does this work?

All of our “Group A” repositories (those using software that exports XML such as Archon, Archivists’ Toolkit, or ArchivesSpace) have had their existing files converted to schema format. Almost all of them have corrected the very minor errors that popped up.
These repositories are now refining their workflows for submitting schema-compliant and TARO-friendly files.
ArchivesSpace users are up and running, using the ArchivesSpace guidelines on the TARO Today blog.
We are working on similar how-to info for Archon, Archivists’ Toolkit, and CuadraStar users, which will then also be published and announced.
(Note: It was discovered that CuadraStar exports DTD-based XML, not schema-based XML, so its users will have a slightly different process.)
All will be keeping in mind the new TARO Standards / Best Practices Guidelines.

Our “Group B” repositories of hand-encoders are starting to be converted now.
These folks, who use XML editors such as Oxygen and XMetal or other tools such as Notepad++, will be making use of the new TARO Standards / Best Practices Guidelines (which also include XML templates, very handy for hand-encoders).

The first to be converted in this group will be:

  • San Jacinto Museum of History – Oxygen users –  July 12-14
  • Texas State Library and Archives Commission – Oxygen users – July 26-27
  • Texas Tech University Southwest Collection/Special Collections Library – Oxygen users – August 2-4
  • The University of Texas at Austin. Benson Latin American Collection – Oxygen users – August 16-18

The remaining Group B repositories are still being scheduled and will be contacted soon individually regarding their proposed dates.
Group C folks will likely be in early 2017.

Stay tuned for updates on this conversion work as the summer goes along, as well as our NEH planning grant final reports coming out later this summer.

schema conversion – ready for Group B

Fellow TARO participants,

It is now time for the “Group B” TARO repositories to be scheduled for conversion to schema compliance.

If any repositories in that group are interested in being scheduled for this work sooner rather than later, please reply to Amanda Focke (afocke@rice.edu) by the end of this week, July 1.

After hearing from repositories, we will post a specific schedule for conversion, and begin working with the first repositories.

Here is the blog post with the year’s schedule and basic info on how this will work.
**Please remember Minnie Rangel at TARO will do the conversion work and each repository will have help and personal attention along the way, ending with the repository having what they need to start submitting schema compliant finding aids.**

Here is the list of the Group B repositories (from that blog post):

Group B: Roughly scheduled for Summer / early Fall

Austin History Center, Austin Public Library (NoteTab)
Austin Presbyterian Theological Seminary (Notepad++)
Daughters of the Republic of Texas Library at the Alamo (Oxygen)
Harry Ransom Humanities Research Center, University of Texas at Austin (Oxygen)
Houston Academy of Medicine-Texas Medical Center Library, John P. McGovern Historical Collections and Research Center (Oxygen)
Houston Public Library, Houston Metropolitan Research Center (limbo between AT/AS)
San Jacinto Museum of History (Oxygen)
Southern Methodist University (Oxygen)
Stark Center, University of Texas at Austin (Notepad++)
Stephen F. Austin University (limbo between Archon/AS)
Tarlton Law Library, University of Texas at Austin (Oxygen)
Texas State Library and Archives Commission (Oxygen)
Texas Tech University Southwest Collection/Special Collections Library (Oxygen)
Texas/Dallas History and Archives Division, Dallas Public Library (NoteTab)
The University of Texas at Austin. Alexander Architectural Archive (Oxygen) –CONVERTED FEB 2016 IN TARO PILOT WORK
The University of Texas at Austin. Benson Latin American Collection (Oxygen)
The University of Texas at Austin. Dolph Briscoe Center for American History (Oxygen)
Truman G. Blocker, Jr. History of Medicine Collections, Moody Medical Library, University of Texas Medical Branch (Oxygen)
Tyrrell Historical Library (Oxygen)
University Archives and Special Collections, The University of Texas at Tyler (limbo between Archon/AS)
University of Texas Arlington Library, Special Collections (XMetal)
University of Texas San Antonio (Oxygen)

 

1st draft available for review: TARO schema-compliant encoding guidelines

On behalf of Rebecca Romanchuk and Carla Alvarez, TARO Standards Committee co-chairs, please read the following request for your feedback on the new schema-compliant encoding guidelines, which will be used by all TARO repositories after each repository is converted to schema compliance later this year.
Please know that during your conversion, you will have one-on-one contact with a TARO volunteer to help you get started submitting finding aids in schema format using these guidelines, but we welcome your feedback on the guidelines now.
___________________________________________________________________________
The TARO Standards subcommittee is pleased to announce that we have completed our first draft of the EAD 2002 Schema Best Practice Guidelines for TARO!

Texas Archival Resources Online (TARO), Texas’ EAD finding aid consortial site (https://www.lib.utexas.edu/taro/), is in the midst of an NEH planning grant to develop improved systems and updated standards for TARO as it achieves sustainability to serve the archival research community into the future. Part of this work is to create new encoding guidelines for TARO repositories that conform to the EAD 2002 Schema encoding standard, to which TARO will complete its conversion in 2016. These best practice guidelines (BPG) are available as a PDF at http://bit.ly/1Wk6p6W. The BPG appendices are a TARO-friendly sample Schema-compliant template for EAD encoding for your use, and an EAD finding aid example. These appendices are also available at the same link as XML files.

We welcome feedback addressing every aspect of our BPG.

Go to http://goo.gl/forms/gaJXiCVtp4 to complete a brief survey to give us your ideas for how the BPG can better address your needs for EAD encoding. The survey is configured to adapt its questions depending on whether your repository is a TARO member, or if you are in Texas and have not yet joined TARO, or if you are outside of Texas and want to give us your general feedback.

Please complete the survey by Friday, June 3, 2016.

If you encode for TARO, we need to hear from you. The BPG, which will be a key tool for TARO participants, offers detailed guidance on creating EAD XML files. Even participants who export XML from software such as ArchivesSpace (and don’t see the raw XML) will need to follow TARO protocols as described in the BPG, such as formatting the <eadid>. You will need to follow the BPG in order to submit your Schema-compliant files to TARO, which each repository will be required to do by the end of 2016.

The co-chairs of the TARO Standards subcommittee extend sincere thanks to its members for their superb contributions to the BPG. Invaluable support has been provided during our drafting process by TARO Steering Committee co-chairs Amanda Focke and Amy Bowman, UT Libraries TARO technical support staff Minnie Rangel, and our NEH planning grant project manager Leigh Grinstead and grant consultant Jodi Allison-Bunnell. We are also grateful to the EAD consortial community at large for the encoding documentation they make available online, in particular Online Archive of California and Archives West, which are models that have guided us.

Cordially,

Carla Alvarez, MA, CA (co-chair – TARO Standards subcommittee)
Rare Books and Manuscripts
Nettie Lee Benson Latin American Collection
University of Texas at Austin

Rebecca Romanchuk, MLIS, CA (co-chair – TARO Standards subcommittee)
Team Lead, Archives / Archivist II
Archives and Information Services
Texas State Library and Archives Commission

TARO Standards subcommittee members:  
Maristella Feustle (UNT-Music Library),
Cynthia Franco (SMU-DeGolyer Library),
Molly Hults (Austin Public Library-Austin History Center),
Benna Vaughan (Baylor University-Texas Collection),
Jeffrey Warner (Rice University-Woodson Research Center).

Schema transition underway – see the year’s rough schedule

TARO repositories have been sorted into three groups for the purposes of working through the schema conversion process.

Each repository will be worked with individually to ensure their documentation and training needs are met.

Scroll down to see what to expect and where your repository is grouped, and please know we will be in touch with your repository to discuss this process and the timing.
Questions right now? Contact Amanda Focke

  • Group A – Spring / early Summer 2016: repositories already creating schema-compliant XML with software such as ArchivesSpace, Archivists’ Toolkit, CuadraStar, or Archon.
  • Group B – Summer – Fall 2016: repositories encoding by hand in an XML editor of some sort, with significant current staff experience and documentation.
  • Group C – Winter – early 2017: repositories encoding by hand in an XML or text editor, with little or no current staff experience and documentation.
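
For anyone who likes to see the grouping rule spelled out, here is a minimal Python sketch of the three groups described above. It is only an illustration: the actual assignments were made from survey responses and individual conversations, and some repositories (for example those "in limbo" between tools) do not fit a simple rule. The names `Repository`, `assign_group`, and the `experienced` flag are hypothetical and do not come from any TARO system.

```python
from dataclasses import dataclass

# Software named in this post as producing XML exports (Group A).
EXPORTING_TOOLS = {"ArchivesSpace", "Archivists' Toolkit", "CuadraStar", "Archon"}


@dataclass
class Repository:
    name: str
    tool: str          # software or editor used to create EAD
    experienced: bool  # hypothetical flag: significant staff experience and documentation


def assign_group(repo: Repository) -> str:
    """Return 'A', 'B', or 'C' following the rough rule in the list above."""
    if repo.tool in EXPORTING_TOOLS:
        return "A"  # already exporting XML from collection management software
    # Hand encoders are split by current staff experience and documentation.
    return "B" if repo.experienced else "C"


if __name__ == "__main__":
    example = Repository("Example Archive", "Oxygen", experienced=True)
    print(example.name, "is in Group", assign_group(example))
```

If your repository seems to land in the wrong group under a rule like this, that is exactly the kind of case to raise with Amanda Focke.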

TARO workflow steps for repositories moving to schema compliant XML submissions, 2016

What to expect: Overall, each repository should expect the conversion process to take about a week, with the work happening via a script run by Minnie Rangel, and the repository not having account access during that time. After that, the repository can submit edited or new finding aids as long as they are schema compliant, and guidance will be provided on how to do that.

  1. Scheduling the conversion, repository by repository

Minnie Rangel and Amanda Focke will schedule the conversion with the repository at a convenient time.

  2. Blocking repository account access during conversion

On the Tuesday of the scheduled conversion week, Minnie blocks the repository’s account access to prevent any submissions during the conversion.

  3. Schema conversion of existing files at TARO

On the Wednesday of the scheduled conversion week, Minnie runs the DTD-to-schema conversion script on the repository’s existing files in TARO. This may take 2-3 days depending on the number and size of the files. (For example, 800 files might take 2-3 days.) At the end of this process, all the XML files on TARO’s server for this repository will be schema compliant and valid, with no need for the repository to take further steps on them, unless there was an error (see below for further info on errors). The HTML webpage for the finding aids that researchers see online will not have changed at all.

  4. Repository downloads its DTD files and new schema files for local backup

The repository will log in to its TARO account in the usual manner, as access will have been restored, and use the secure-shell client’s tools to download all of its files. The old DTD XML will be in one folder and should no longer be used for current finding aid editing, only as an archived copy. The newly created schema-compliant files can be used for editing if needed.

  5. Error correction on schema-compliant XML

In the event of any errors, Minnie will supply a list of them to help the repository correct them.

  • Please note that the correction of these errors is required but can be done at the convenience of the repository, since the finding aid seen by users is still the HTML as generated by the old DTD file.
  • Advice and troubleshooting will be available from the TARO Outreach and Committee, and possibly other TARO committee members as needed.

  6. Any new or edited XML submitted will need to be schema compliant and valid

Going forward from your conversion, any edits to files, such as updates to a finding aid, will need to be submitted as a valid schema-compliant XML file in order for it to process correctly and show online as HTML. You will be given the documentation and other info needed to do this using essentially the same workflow you already have; it is not a huge change.
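
As a rough illustration of the difference between the archived DTD files and the new schema-compliant files in the steps above, here is a minimal Python sketch that guesses which flavor a downloaded file is by looking at its root element. It assumes that schema-based EAD 2002 files declare the namespace `urn:isbn:1-931666-22-9` on `<ead>`, while DTD-based files have an un-namespaced `<ead>` root. The function name `classify_ead` is hypothetical; this is not the conversion script Minnie runs, and it does not validate anything. Real validation still needs a schema-aware tool such as your XML editor.

```python
import xml.etree.ElementTree as ET

# Namespace used by the EAD 2002 schema (assumed to be declared on the root element).
EAD_2002_NS = "urn:isbn:1-931666-22-9"


def classify_ead(xml_text: str) -> str:
    """Guess whether an EAD document is 'schema' based, 'dtd' based, or 'unknown'.

    Only the root element is inspected; the file is checked for
    well-formedness by the parser but is NOT validated.
    """
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{EAD_2002_NS}}}ead":
        return "schema"
    if root.tag == "ead":
        return "dtd"
    return "unknown"


if __name__ == "__main__":
    old_style = '<!DOCTYPE ead SYSTEM "ead.dtd"><ead><eadheader/></ead>'
    new_style = '<ead xmlns="urn:isbn:1-931666-22-9"><eadheader/></ead>'
    print(classify_ead(old_style))
    print(classify_ead(new_style))
```

A check like this could help keep the archived DTD folder and the new schema folder from getting mixed up locally, but it says nothing about whether a file will pass TARO's processing.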


 

A note about groups — if you think your repository is in the wrong group, or you don’t see your repository at all, please contact Amanda Focke. The groups were made based on survey responses in Fall 2015 or by email / phone in early Spring 2016.

Group A: roughly scheduled for Spring / early Summer

African American Library at the Gregory School (AS)
Baylor University (CuadraStar)
Rice University, Fondren Library, Woodson Research Center (AS)
Texas General Land Office Archives and Records (AT)
Texas A&M Corpus Christi (AS)
Texas A&M University Cushing Memorial Library (Archon)
University of Houston Libraries, Special Collections (Archon)
University of North Texas Archives (Archon)
Vietnam Center and Archive, Texas Tech University (AS)

Group B: Roughly scheduled for Summer / early Fall

Austin History Center, Austin Public Library (NoteTab)
Austin Presbyterian Theological Seminary (Notepad++)
Daughters of the Republic of Texas Library at the Alamo (Oxygen)
Harry Ransom Humanities Research Center, University of Texas at Austin (Oxygen)
Houston Academy of Medicine-Texas Medical Center Library, John P. McGovern Historical Collections and Research Center (Oxygen)
Houston Public Library, Houston Metropolitan Research Center (limbo between AT/AS)
San Jacinto Museum of History (Oxygen)
Southern Methodist University (Oxygen)
Stark Center, University of Texas at Austin (Notepad++)
Stephen F. Austin University (limbo between Archon/AS)
Tarlton Law Library, University of Texas at Austin (Oxygen)
Texas State Library and Archives Commission (Oxygen)
Texas Tech University Southwest Collection/Special Collections Library (Oxygen)
Texas/Dallas History and Archives Division, Dallas Public Library (NoteTab)
Texas State University (Oxygen)
The University of Texas at Austin. Alexander Architectural Archive (Oxygen) –CONVERTED FEB 2016 IN TARO PILOT WORK
The University of Texas at Austin. Benson Latin American Collection (Oxygen)
The University of Texas at Austin. Dolph Briscoe Center for American History (Oxygen)
Truman G. Blocker, Jr. History of Medicine Collections, Moody Medical Library, University of Texas Medical Branch (Oxygen)
University Archives and Special Collections The University of Texas at Tyler (limbo between Archon/AS)
University of Texas Arlington Library, Special Collections (XMetal)
University of Texas San Antonio (Oxygen)

Group C:  early 2017

Tyrrell Historical Library (Oxygen)
Concordia University Texas Historical Online Collection (Oxygen)
Lamar University’s Archives and Special Collections (NoteTab)
Robert E. Nail, Jr. Historical Archives at Old Jail Art Center (NoteTab)
San Antonio Municipal Archives
South Texas Archives at Texas A&M University-Kingsville (Oxygen)
Texas Woman’s University, the Woman’s Collection (Oxygen)
University of St. Thomas Archives
University of Texas El Paso (Oxygen)
University of Texas M.D. Anderson Cancer Center (Oxygen)
UT Health Science Center San Antonio
UT Human Rights Documentation Initiative


 

Scheduling the transition to schema compliance in 2016

Dear fellow TARO members,

Our progress towards updating TARO files to schema compliance continues.

Thanks to your participation in our Fall 2015 survey regarding your repositories’ methods for creating EAD finding aids, we have been able to group our TARO repositories into 3 groups for the purpose of scheduling each repository’s schema updates in 2016:

  • Group A includes repositories already creating schema compliant finding aids (for example those using collection management software which exports schema compliant EAD). This group would go through the schema transition first (Spring 2016).
  • Group B includes repositories creating dtd compliant finding aids with significant staff experience and workflow documentation. This group would go through the schema transition second, with assistance from members of the TARO team (timeframe to be determined).
  • Group C includes repositories creating dtd compliant finding aids with less staff experience and workflow documentation, as well as those who are close to creating dtd compliant finding aids and who need training or other support to get started. This group would go through the schema transition third, with assistance from members of the TARO team (timeframe to be determined).

About half of TARO repositories responded to the survey, which allows us to sort repositories into these groups for planning purposes.
Our next step will be to follow up in the next two weeks with the repositories that did not respond so we can plan the year’s schema compliance work accordingly. We do realize that in some cases where repositories did not respond, the contact email we have could have been out of date, and we will do our best to correct that situation.

Questions about this schema compliance planning process? Contact Amanda Focke at afocke@rice.edu.

Otherwise, stay tuned!
Thanks,
Amanda
TARO Steering Committee Co-chair
TARO blog for public news | Wiki as working committee records

Upgrading to schema compliance in 2016!

As promised in August 2015, we at TARO have been working diligently on preparing our system to move to the more modern format of schema-compliant EAD.
We have conducted our pilot project for moving to schema-compliance.

We will start with volunteers for early conversion with the rest following as training and support allows. No one will be rushed into conversion.
We will contact you in January 2016 to discuss this process, answer your questions,  and hear when your repository would consider participating.
A specific TARO contact person will be available to you for questions and assistance throughout this process.

We will be ready starting in January 2016 to begin working with each repository one at a time to:
1.) Convert the repository’s existing files which are on TARO over to schema compliance. TARO’s Minnie Rangel will use an automated process and then work with repositories on manually following up on any errors (at the repository’s convenience, or at the time when the repository wishes to reload a given file for content changes). The time needed for this will vary from repository to repository, but shouldn’t be significant, and is not on a particular deadline.
2.) Give you the information you need in order to start submitting schema-compliant files to TARO from then on.
(You may still submit dtd-compliant files all the way up until the time your repository converts to schema compliant submission.)

We look forward to working with you on this and appreciate your participation, as this step is the basis for any additional TARO improvements.

Sincerely,
Amanda Focke, on behalf of the TARO Steering Committee

Overview of Encoding Survey

Last month, I solicited EAD templates and documentation from partner institutions to get a clearer picture of TARO’s EAD landscape. Thank you to the 24 institutions that answered the questionnaire and provided documentation. The responses and accompanying documentation illuminate some of the shared (or similar) encoding practices across the TARO partners, as well as areas of encoding diversity. This knowledge will help me and the Steering Committee make useful recommendations for incorporating a schema-compliant workflow into existing practices. The goal is to find that sweet spot between breadth and specificity, so that participation in TARO is both convenient and beneficial.

Overall, there is plenty of common ground amongst the respondents in regards to encoding workflows and processes. The following is a very general overview of the survey responses:   

24 total responses

17 of the 24 institutions that responded to the survey described a process of encoding by hand using previous finding aids and/or templates as guides. MS Word and Excel are common tools used for creating collection inventories that are then copied and pasted into an XML editor.

13 use Oxygen XML editor  

Finding aid creation is a multi-step, multi-tool process for everyone, and common ground bodes well as TARO moves toward greater standardization. Common tools, such as MS Excel and Oxygen XML editor, can be incorporated and leveraged in best practices guidelines.

As of right now, fewer organizations use archival management systems, while a handful of respondents expressed plans to adopt an AMS in the near future.

7 use AMS

3 ArchivesSpace
2 Archivists’ Toolkit
1 Archon
1 CuadraStar
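
To put the counts above in proportion, here is a small Python sketch that turns them into shares of the 24 responses and confirms that the per-system numbers add up to the 7 AMS users. The figures are the ones reported in this post; the variable and function names are just for illustration.

```python
TOTAL_RESPONSES = 24

# Counts reported in this overview (categories can overlap).
counts = {
    "encode by hand": 17,
    "use Oxygen XML editor": 13,
    "use an archival management system": 7,
}

# Breakdown of the archival management systems in use.
ams_breakdown = {
    "ArchivesSpace": 3,
    "Archivists' Toolkit": 2,
    "Archon": 1,
    "CuadraStar": 1,
}


def share(count: int, total: int = TOTAL_RESPONSES) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for label, count in counts.items():
        print(f"{label}: {count} of {TOTAL_RESPONSES} ({share(count)}%)")
    print("AMS breakdown total:", sum(ams_breakdown.values()))
```

In other words, roughly seven in ten respondents encode by hand, a little over half use Oxygen, and a bit under a third use an archival management system.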

As you may be aware, ArchivesSpace generates schema-compliant EAD. In fact, the AS output is sometimes stricter than the EAD 2002 schema. Currently, the institutions that use these archival management systems must reverse edit their EAD back to DTD to make it TARO compliant. With more organizations adopting (or at least considering) management systems, TARO must plan to accommodate current and future developments in technology. Updating the XML in TARO will not only improve the front-end user experience, but will also broaden potential participation.

The greatest variation across the respondents appears (quite obviously) in the documentation, instructions, and templates of each contributing institution. A large consideration going forward is finding the optimal level of standardization that benefits all contributing institutions. Participation in TARO should be easy, perhaps effortless. With this goal in mind, the question we need to ask is:

How can we reduce redundancies between unique institutional workflows and contributing to TARO?

Feel free to continue this conversation, especially if you feel that the overview above does not represent how your institution creates EAD.

 

Schema Compliance Intern

Hannah helping remote researchers at the Harry Ransom Center

Hello TARO! My name is Hannah Rainey and I am the schema compliance intern for the 21st Century Collaborative Planning Project. I am very honored and excited to join the effort to update TARO. I am passionate about improving access, both in the reading room and behind the scenes.

Before I describe my role in the project, let me tell you a little bit about myself. I grew up in lovely Boise, Idaho where I developed a love for the outdoors. I attended Wellesley College in Massachusetts where I developed a hatred of winter. In 2010, I completed a BA in Cinema and Media Studies. I began working at a music library as an undergrad and have since worked in a variety of libraries and archives, including a short and very fun stint at the Library of Congress Packard Campus for Audio Visual Conservation. Currently, I am a Graduate Intern in Reference and Public Services at the Harry Ransom Center. If all goes as planned, I will graduate with a master’s degree from the UT School of Information this December.

From now until January, I will work directly with archivists at the Briscoe Center and librarians at UT Libraries to develop workflows for testing EAD finding aids. My goal is to identify common pain points in the transformation from DTD to Schema, and get a general sense of the time it will take to correct common errors, both manually and programmatically. My work will comprise a small portion of the overall effort to update and adopt shared encoding standards across the TARO consortium.

If you have any questions or comments please email me: rainey.hannahleah@gmail.com