Blogs: Representation Information

Blog posts filtered by the Representation Information subject tag.

The development work on an imaging/ripping workflow for optical media is shaping up steadily, and you can expect a write-up with more information about our software and hardware setup here in the near future (you can get a sneak peek here). However, this blog is about a very specific problem that we ran into while […]

By johan, posted in johan's Blog

25th Apr 2017  2:07 PM  4658 Reads  3 Comments

While browsing ArchiveTeam's File Formats Wiki earlier this week, I came across some entries I created there on Quattro Pro spreadsheets two years ago. At the time I had also contributed some old Quattro Pro for DOS spreadsheets (here and here) from my personal archives to the OPF format corpus. Seeing those files again, I […]

By johan, posted in johan's Blog

29th Oct 2014  2:59 PM  26894 Reads  2 Comments

Anyone wishing to preserve digital content must be aware of events that might constitute a relevant risk. In SCAPE we are developing tools that will allow you to detect risks before they cause any irreversible damage. Help us understand the preservation events, threats and opportunities you find most relevant and the ways you would like […]

By lfaria, posted in lfaria's Blog

30th Jan 2014  10:05 AM  14221 Reads  2 Comments

This blog follows up on three earlier posts about detecting preservation risks in PDF files. In part 1 I explored to what extent the Preflight component of the Apache PDFBox library can be used to detect specific preservation risks in PDF documents. This was followed up by some work during the SPRUCE Hackathon in Leeds, […]

By johan, posted in johan's Blog

27th Jan 2014  3:08 PM  24080 Reads  7 Comments

My previous blog Assessing file format risks: searching for Bigfoot? resulted in some interesting feedback from a number of people. There was a particularly elaborate response from Ross Spencer, and I originally wanted to reply to that directly using the comment fields. However, my reply turned out to be a bit more lengthy than I […]

By johan, posted in johan's Blog

8th Oct 2013  4:24 PM  16474 Reads  4 Comments

Like many other organisations that are using JPEG 2000, the KB produces two representations of most of its digitised content (newspapers, books, periodicals): a high-quality, losslessly compressed JP2 that is the archival master, and a lesser-quality, lossily compressed JP2 that is used as an access image (for example, on our newspapers website). The majority […]

By johan, posted in johan's Blog

19th Aug 2013  11:22 AM  15418 Reads  No comments

Last winter I started a first attempt at identifying preservation risks in PDF files using the Apache Preflight PDF/A validator. This work was later followed up by others in two SPRUCE hackathons in Leeds (see this blog post by Peter Cliff) and London (described here). Much of this later work tacitly assumes that Apache Preflight […]

By johan, posted in johan's Blog

25th Jul 2013  12:57 PM  25035 Reads  12 Comments

Now that the subproject lead in PW is being transferred from me to Kresimir, it seems a good time to reflect a little on what we have achieved in PW since February 2011 and what is left to do! What did we set out to do? To accomplish effective digital preservation, environments with a preservation […]

By cbecker, posted in cbecker's Blog

23rd Jul 2013  9:20 AM  13585 Reads  No comments

It’s been more than two years now since I wrote my D-Lib paper JPEG 2000 for Long-term Preservation: JP2 as a Preservation Format. From time to time people ask me about the status of the issues that are mentioned in that paper, so here’s a long-overdue update. Issues addressed in the 2011 paper: The […]

By johan, posted in johan's Blog

1st Jul 2013  4:44 PM  20593 Reads  2 Comments

Following the community response to our workshop last year, we want to invite you again to contribute your future preservation challenge! Digital Preservation has emerged as a key challenge for information systems in almost any domain from eCommerce and eGovernment to finance, health, and personal life. The field is increasingly recognized and has taken major […]

By cbecker, posted in cbecker's Blog

17th Jun 2013  5:24 PM  15128 Reads  2 Comments