Introduction: Migration is More Than Just "Copy-Paste"
As an archivist, you know that your data is your most valuable asset. Moving that data from a legacy system (like Archivists' Toolkit, older DB/TextWorks databases, or even Excel spreadsheets) to a modern platform like AtoM (Access to Memory) or ArchivesSpace is rarely a simple "export/import" job.
It is an archival project, not just a technical one.
At USA Archives Hosting, we have overseen migrations for complex, multi-repository institutions. We wrote this guide to help you understand the path ahead—and how to avoid the common pitfalls that lead to data loss or "dirty" descriptions.
Phase 1: The Audit (Know Your Data)
Before you install any software, you must audit your current metadata.
- Inconsistent Dates: Do you have dates entered as "circa 1900," "c.1900," and "1900"? Modern systems rely on the ISO 8601 standard (YYYY-MM-DD), so these need to be normalized.
- Orphaned Records: Identify records that have lost their link to a parent collection or series.
- Controlled Vocabularies: Are your subject headings free-text or linked to a standard like LCSH? Migration is the perfect time to reconcile these.
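The date-normalization step above can be scripted. This is a minimal sketch (the function name and the set of patterns handled are our own, not part of any system): it maps common "circa" variants to one display form plus an ISO-friendly year, and flags anything it cannot parse for manual review.

```python
import re

# Hypothetical normalizer for legacy date strings. Patterns covered here
# are only the "circa"/bare-year cases; anything else is flagged.
CIRCA = re.compile(r"^(?:circa|ca\.?|c\.?)\s*(\d{4})$", re.IGNORECASE)
YEAR = re.compile(r"^(\d{4})$")

def normalize_date(raw):
    value = raw.strip()
    m = CIRCA.match(value)
    if m:
        # Single display form, plus the machine-readable year.
        return {"display": f"circa {m.group(1)}", "iso": m.group(1)}
    m = YEAR.match(value)
    if m:
        return {"display": m.group(1), "iso": m.group(1)}
    # Unrecognized pattern: keep as-is and flag for manual review.
    return {"display": value, "iso": None}

for raw in ["circa 1900", "c.1900", "1900", "Spring 1940"]:
    print(raw, "->", normalize_date(raw))
```

Running this over an exported column gives you a worklist: every row that comes back with `iso: None` needs a human decision before import.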
Phase 2: The Mapping Strategy
This is where the "Archivist Advantage" comes in. You need to map your old fields to the new standards (ISAD(G) or DACS).
If moving to AtoM:
AtoM relies heavily on the slug (URL) and reference code.
- Tip: Ensure your `reference_code` column in your CSV import is unique across the entire database, not just the collection.
- Hierarchy: AtoM builds hierarchy (Fonds -> Series -> File) via the `parentId` column. If this link is broken in your spreadsheet, your tree structure will collapse.
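Both of these AtoM pitfalls can be caught before import with a pre-flight check on the CSV. In this sketch we assume the column headers `reference_code`, `legacyId`, and `parentId`; match them to the headers in your actual import template before running it.

```python
import csv
from collections import Counter

def preflight(path):
    """Pre-flight check for an AtoM-style import CSV.

    Returns (duplicate reference codes, rows whose parentId points
    at no legacyId in the file). Column names are assumptions.
    """
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # Reference codes must be unique across the whole file.
    codes = Counter(r["reference_code"] for r in rows)
    duplicates = [c for c, n in codes.items() if n > 1]

    # Every non-empty parentId must resolve to some row's legacyId,
    # otherwise the tree structure will break on import.
    ids = {r["legacyId"] for r in rows}
    orphans = [r["reference_code"] for r in rows
               if r["parentId"] and r["parentId"] not in ids]
    return duplicates, orphans
```

A non-empty `duplicates` or `orphans` list means the spreadsheet needs fixing before it goes anywhere near the server.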
If moving to ArchivesSpace:
ArchivesSpace is stricter about agent links (People/Corporate Bodies).
- Tip: You cannot just type a name into a "Creator" field. You often need to create the Agent record first, obtain its URI, and link it.
- Dates: ArchivesSpace splits dates into "Expression" (what the public sees, e.g., "Spring 1940") and "Begin/End" (machine-readable, e.g., "1940-03-01").
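The expression/begin/end split can also be scripted for the common cases. The season-to-date mapping below is a simplified assumption of ours (and winter spanning the year boundary is ignored), not ArchivesSpace's own logic; anything unrecognized keeps its expression and leaves the machine-readable fields empty for an archivist to fill in.

```python
# Simplified season boundaries (an assumption, not an ArchivesSpace rule).
SEASONS = {"spring": ("03-01", "05-31"), "summer": ("06-01", "08-31"),
           "fall": ("09-01", "11-30"), "winter": ("12-01", "02-28")}

def split_date(expression):
    """Split a legacy free-text date into expression + begin/end."""
    parts = expression.lower().split()
    if len(parts) == 2 and parts[0] in SEASONS and parts[1].isdigit():
        year = parts[1]
        begin, end = SEASONS[parts[0]]
        return {"expression": expression,
                "begin": f"{year}-{begin}", "end": f"{year}-{end}"}
    # Not a pattern we handle: keep the human-readable form only.
    return {"expression": expression, "begin": None, "end": None}

print(split_date("Spring 1940"))
# {'expression': 'Spring 1940', 'begin': '1940-03-01', 'end': '1940-05-31'}
```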
Phase 3: The "Cleanup" (Sanitization)
We use custom Python scripts to clean data before it ever touches the new server. Common tasks include:
- Character Encoding: Fixing "mojibake" (broken characters such as "Ã©" where "é" should appear).
- Whitespace Removal: Stripping trailing spaces that cause search errors.
- HTML Stripping: Removing old formatting tags (`<b>`, `<font>`) that break modern themes.
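All three cleanup tasks above can be combined into one field-level function. This is a sketch, not our production script: the mojibake repair assumes the damage is specifically UTF-8 text that was mis-decoded as Latin-1 (the classic "Ã©" case), and it leaves the value untouched if that round-trip fails.

```python
import re

def clean_field(value):
    # Mojibake repair: UTF-8 bytes mis-decoded as Latin-1 ("Ã©" -> "é").
    # Only safe when the damage is exactly this double-encoding pattern;
    # if the round-trip fails, keep the original value.
    try:
        repaired = value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        repaired = value
    # Strip legacy HTML tags such as <b> or <font color="red">.
    repaired = re.sub(r"</?\w+[^>]*>", "", repaired)
    # Collapse runs of whitespace and trim leading/trailing spaces.
    return re.sub(r"\s+", " ", repaired).strip()

print(clean_field("<b>CafÃ©</b>   notes "))  # -> Café notes
```

Run a pass like this over every descriptive field, and diff the before/after files so an archivist can spot-check that nothing meaningful was stripped.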
Phase 4: The Test Import (Sandbox)
Never import directly into production. We set up a "Sandbox" environment for our clients. We run the import, and then you (the archivist) browse the collection.
- Does the hierarchy look right?
- Are the digital objects attached to the correct descriptions?
- Did the notes fields migrate?
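Some of this sandbox review can be automated before the archivist starts browsing. One approach (a sketch; the key and notes column names here are assumptions to adapt to your export templates) is to re-export from the sandbox and diff it against the source CSV: any record that went missing, or lost its notes, surfaces immediately.

```python
import csv

def compare_exports(source_path, sandbox_path, key="reference_code"):
    """Compare the source CSV with a re-export from the sandbox.

    Returns (records missing from the sandbox, sandbox records whose
    notes field came through empty). Column names are assumptions.
    """
    def load(path):
        with open(path, newline="", encoding="utf-8") as f:
            return {row[key]: row for row in csv.DictReader(f)}

    source, sandbox = load(source_path), load(sandbox_path)
    missing = sorted(set(source) - set(sandbox))
    empty_notes = sorted(k for k, row in sandbox.items()
                         if not row.get("scopeAndContent", "").strip())
    return missing, empty_notes
```

A clean run of this check does not replace the visual review of hierarchy and attached digital objects, but it catches the bulk problems first.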
How We Can Help
Migration is technically demanding. If you are comfortable with Python, SQL, and CSV manipulation, this guide gives you the roadmap.
If you would rather focus on describing archives than debugging scripts, we can handle the migration for you.
We offer:
- Full Legacy Assessment: We analyze your old database.
- Custom Mapping: We build the crosswalks between your old fields and AtoM/ArchivesSpace.
- Data Cleaning: We scrub your data of errors.
- Turnkey Import: You get a fresh, populated system ready for use.
Ready to move? Contact us at info@usa-archives-hosting.com to discuss your migration project.