7.1 Digital preservation techniques and best practices
4 min read • July 29, 2024
Digital preservation is crucial for safeguarding rescued stories. It involves comprehensive strategies, from organizational policies to technological infrastructure. Key aspects include selecting appropriate file formats, implementing robust backup systems, and designing efficient digitization workflows.
Preserving digital content requires careful planning and execution. This includes choosing non-proprietary file formats, following the 3-2-1 backup rule, and creating detailed digitization workflows. Regular quality control is essential for the long-term accessibility and integrity of digital content.
Digital Preservation Strategies
Comprehensive Approach
A comprehensive digital preservation strategy addresses organizational infrastructure, technological infrastructure, and a resources framework
Key components include policies, staffing, selection, metadata, access, storage, security, disaster preparedness, and sustainability
Policies should cover the scope of digital content preserved, retention schedules, access and use conditions, and preservation standards
Documented policies are essential for consistent implementation (retention schedules, use conditions)
Staffing and Resources
Staffing and resourcing digital preservation requires defining roles and responsibilities, assessing current and future staffing needs, and budgeting for short- and long-term requirements
Selection criteria for digital preservation consider the information content, uniqueness, and value of the materials, as well as technical feasibility and sustainability
Not all content can or should be preserved (low-value, redundant content)
Descriptive, administrative, structural, and preservation metadata are all critical for managing and preserving digital objects over time and for maintaining their meaning, authenticity, and accessibility
Storage and Security
Storage and security include evaluating and implementing storage management systems and media that align with preservation policies and practices
Multiple copies, geographical distribution, access controls, and security monitoring are best practices (offsite backup, encryption)
Planning for sustainability and disaster preparedness requires succession planning and established responses for a range of failure scenarios to ensure continuity of preservation
File Formats for Preservation
Selecting Preservation Formats
Careful selection of file formats is critical for long-term preservation
Formats should be non-proprietary, ubiquitous, and have freely available specifications (PDF/A, TIFF)
Text-based formats like XML, PDF/A, and plain text are preferable to complex proprietary formats for documents
Markdown is suitable for simple documents
For images, uncompressed TIFF is widely used for preservation
PNG, JPEG 2000, and PDF/A are other viable options
Evaluating Formats
Recommended audio formats are uncompressed WAV, AIFF, or lossless compressed FLAC
For video, uncompressed QuickTime or AVI and lossless JPEG 2000 are suitable
Evaluating formats should consider the significant properties of the original that must be maintained, support for embedded metadata, error detection/correction, and compatibility with current and future systems
Format registries like PRONOM provide detailed information about file formats to aid in evaluation and risk assessment over time
A comprehensive preservation strategy may involve preserving the original bitstream alongside a normalized version in a preservation format
Multiple versions support different future use cases (access copy, preservation master)
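Format evaluation starts with knowing what a file actually is, regardless of its extension. The sketch below illustrates signature-based identification, the general idea behind registry-driven tools, using a tiny hand-written table of well-known leading bytes. The table, the function names, and the labels are assumptions made for illustration; a real registry such as PRONOM covers far more formats and versions.

```python
from pathlib import Path

# Leading "magic bytes" for a few preservation-friendly formats.
# Illustrative only: a real registry holds many more signatures.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"%PDF-", "PDF"),
    (b"fLaC", "FLAC"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format label for the given leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    # WAV is a RIFF container: 'RIFF', four size bytes, then 'WAVE'.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    return "unknown"


def identify_file(path: Path) -> str:
    """Read the first 16 bytes of a file and identify its format."""
    with path.open("rb") as handle:
        return identify_bytes(handle.read(16))
```

A check like this can only confirm the broad format family. It cannot tell PDF/A from ordinary PDF, for example, because that distinction is recorded in the file's metadata and not in its leading bytes.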
Data Backup and Migration
Backup Strategies
A 3-2-1 backup strategy is recommended: keep 3 total copies of the data, on at least 2 different types of storage media, with at least 1 copy offsite
Cloud storage can be leveraged for offsite copies (Amazon S3, Microsoft Azure)
Backups should be automated where possible to ensure consistency and logged for verification
Backup frequency depends on data volatility and criticality (daily, weekly)
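The 3-2-1 rule can be written down as a simple check against an inventory of copies. The following is a minimal sketch; the `Copy` record, its field names, and the example locations are hypothetical and would need to match however an institution actually tracks its storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str    # hypothetical label, e.g. "archive server"
    media_type: str  # e.g. "disk", "tape", "cloud"
    offsite: bool    # True if stored away from the primary site


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 offsite."""
    media_types = {c.media_type for c in copies}
    return (
        len(copies) >= 3
        and len(media_types) >= 2
        and any(c.offsite for c in copies)
    )


plan = [
    Copy("archive server", "disk", offsite=False),
    Copy("LTO shelf", "tape", offsite=False),
    Copy("cloud bucket", "cloud", offsite=True),
]
print(satisfies_3_2_1(plan))
```

Running a check like this as part of an automated backup job gives a logged, repeatable confirmation that the plan still holds after storage changes.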
Integrity of stored data should be ensured through the use of checksums and periodic validation
Bit-level preservation is crucial
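Periodic validation with checksums can be sketched in a few lines. The example below assumes SHA-256 as the algorithm and an in-memory dictionary as the manifest; both are illustrative choices, and the function names are not taken from any particular preservation tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute a SHA-256 checksum, reading in chunks to handle large files."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

The manifest would be built once when content enters storage and kept separately from the files it describes. Each later audit calls `verify_manifest` and investigates any path it returns, typically by restoring that file from another copy.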
Storage and Migration
Storage media must be evaluated for underlying stability, vulnerability to degradation, and obsolescence
Migration to new storage media on a 3-5 year cycle is often necessary (magnetic tape to disk)
Preservation storage systems must account for scalability as data volumes grow exponentially
Consider both maximum capacity and I/O performance
Preservation repositories should conform to the ISO 14721 Open Archival Information System (OAIS) Reference Model for functionality and metadata requirements
Format migration and emulation are two key strategies for ensuring long-term access
Migration converts data to new formats, while emulation recreates the original environment
Digitization Workflows
Workflow Design
Digitization workflows should be designed to optimize efficiency, maintain quality control, and ensure consistency of practice
Workflows are highly dependent on material types and institutional requirements (maps, photographs, manuscripts)
Key stages include selection, assessment, preparation, metadata collection, digitization, quality control, data storage, and access
Documenting each stage is essential
Careful material handling and the use of specialized equipment are necessary to avoid damaging fragile analog materials during digitization
Establish processes for stabilization and conservation (humidification, flattening, repair)
Parameters and Quality Control
Digitization parameters and settings should be established based on material types and intended uses
Guidelines and recommendations are available from organizations like FADGI
Quality control processes are critical to ensure completeness, accuracy, and consistency
Both manual and automated QC should be performed, assessing image quality, metadata, naming, formats, and fixity
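Automated QC of file naming is one of the easier checks to script. The sketch below assumes a hypothetical naming convention (a lowercase collection code, a 4-digit item number, a 3-digit page number, and a `.tif` extension); the pattern and the report keys are illustrative and should be replaced with an institution's own rules.

```python
import re

# Hypothetical convention: collection code, 4-digit item, 3-digit page, .tif
# Example of a valid name: maps_0001_001.tif
NAME_PATTERN = re.compile(r"[a-z]{2,8}_\d{4}_\d{3}\.tif")


def qc_filenames(names: list[str]) -> dict[str, list[str]]:
    """Sort file names into passing, badly named, and duplicate entries."""
    report = {"ok": [], "bad_name": [], "duplicate": []}
    seen = set()
    for name in names:
        if name in seen:
            report["duplicate"].append(name)
        elif NAME_PATTERN.fullmatch(name):
            report["ok"].append(name)
        else:
            report["bad_name"].append(name)
        seen.add(name)
    return report
```

A script like this complements manual review and does not replace it: it catches naming and duplication problems across a whole batch, while image quality still needs a human eye or a dedicated imaging tool.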
Workflows should generate and capture essential metadata, including descriptive, administrative, structural and preservation metadata
Metadata should be stored independently of the objects (sidecar files, databases)
Fixity information like checksums should be generated and captured at the point of digitization and verified throughout the digitization chain of custody
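Capturing fixity and metadata at the point of digitization can be as simple as writing a sidecar file next to each new master. The following is a minimal sketch that assumes a JSON sidecar and SHA-256; the field names and the `write_sidecar` helper are hypothetical and do not follow any specific metadata standard.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def write_sidecar(master: Path, descriptive: dict) -> Path:
    """Write <master name>.json holding fixity and basic metadata."""
    data = master.read_bytes()  # fine for a sketch; stream very large files
    record = {
        "file_name": master.name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "descriptive": descriptive,  # e.g. title, creator, source identifier
    }
    sidecar = master.with_name(master.name + ".json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Because the checksum is recorded the moment the file is created, every later transfer in the chain of custody can recompute it and compare against the sidecar value.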
Scalability and Sustainability
Digitization workflows must be designed to scale and adapt to evolving technologies and standards over time
Utilize open source tools where possible to avoid vendor lock-in (ImageMagick, Tesseract OCR)