File Use Vocabulary

The File Use Vocabulary is a list of subclasses of pcdm:File that describe the role a pcdm:File plays within a pcdm:Object.  These classes, combined with technical metadata (MIME type, image resolution, etc.), should be used to determine which file to use in a given context.

File Use Values

  • Original File: the original file uploaded by the user
  • Thumbnail Image: low-resolution placeholder image
  • Extracted Text: text extracted from documents/OCR
  • Preservation Master File: best-quality file in a format appropriate for long-term preservation
  • Intermediate File: high-quality representation of the Object, appropriate for generating derivatives or other additional processing
  • Service File: a file in a format generated for serving to users, such as a PDF generated from a Word/LaTeX source file, an MP3 generated from a WAV file, a JPEG generated from a TIFF, etc.
  • Transcript: text representation that can be a substitute or complement for accessibility purposes, such as a transcript, subtitles, or closed captions
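
As a minimal sketch of how these classes might be combined with technical metadata to choose a file, the Python below models each file as a plain record with a set of use types and a MIME type. The FileRecord class, the pick_file helper, the file names, and the short type names are hypothetical illustrations, not part of PCDM or of any library; a real system would more likely read these values from RDF.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class FileRecord:
    """Hypothetical stand-in for a pcdm:File and its technical metadata."""
    name: str
    mime_type: str
    use_types: Set[str] = field(default_factory=set)


def pick_file(files: Iterable[FileRecord], use_type: str,
              mime_prefix: str = "") -> Optional[FileRecord]:
    """Return the first file that has the given use type and whose MIME
    type starts with mime_prefix, or None if no file matches."""
    for f in files:
        if use_type in f.use_types and f.mime_type.startswith(mime_prefix):
            return f
    return None


# Assumed example data: the files belonging to one scanned-image Object.
files = [
    FileRecord("scan.tif", "image/tiff", {"PreservationMasterFile"}),
    FileRecord("scan.jpg", "image/jpeg", {"ServiceFile"}),
    FileRecord("scan-thumb.jpg", "image/jpeg", {"ThumbnailImage"}),
    FileRecord("scan.txt", "text/plain", {"ExtractedText"}),
]

# A search-results page wants a thumbnail; a full-text indexer wants text.
thumbnail = pick_file(files, "ThumbnailImage", "image/")
full_text = pick_file(files, "ExtractedText", "text/")
```

The use type narrows the candidates by role, and the MIME prefix stands in for whatever technical metadata the calling context cares about.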

Examples

Object Type | Original File | Preservation Master File | Thumbnail Image | Extracted Text | Transcript | Service File
Audio | Logic source file | WAV | JPEG of album cover, promotional poster, etc. | | transcript | MP3
Document | Word Doc | PDF | JPEG of first page | text dump | | PDF
Image | Photoshop source file, uncropped/uncorrected TIFF | TIFF image | low-res JPEG | OCR text | | med-res JPEG
Video | Premiere source file | full-quality MOV | JPEG of title frame | speech-to-text output | subtitles | 720p MPEG4

Using Multiple Types

Multiple types may be appropriate for a single file, such as an image originally created as a medium-resolution JPEG.  In that case, you can assign both Original File (because it's the original creation format) and Service File (because it's appropriate for serving to end users).
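
The case of one file carrying two types can be sketched in Python as follows. This is an illustration only: the short type names, the file name photo.jpg, and the files_with_type helper are assumptions made for the example, not identifiers defined by the vocabulary.

```python
# Hypothetical short names for the two use classes involved.
ORIGINAL_FILE = "OriginalFile"
SERVICE_FILE = "ServiceFile"

# A medium-resolution JPEG that is both the original creation format
# and suitable for serving, so it carries both types at once.
file_types = {
    "photo.jpg": {ORIGINAL_FILE, SERVICE_FILE},
}


def files_with_type(types_by_file, use_type):
    """Return the sorted names of files tagged with use_type."""
    return sorted(
        name for name, types in types_by_file.items() if use_type in types
    )


# Looking up either role finds the same file.
originals = files_with_type(file_types, ORIGINAL_FILE)
service_copies = files_with_type(file_types, SERVICE_FILE)
```

Storing the types as a set keeps each role independently queryable, so a consumer asking for a Service File does not need to know the file is also the Original File.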