This repository was archived by the owner on Mar 13, 2022. It is now read-only.

Audio Ingestion

ajbrown edited this page Sep 13, 2010 · 4 revisions

Overview

Audio Ingestion is the process of receiving an audio file and all of its metadata, including cover art, and processing that content to make it available and purchasable on the site. The process involves encoding files, tagging ID3 frames, and pushing binaries to Amazon S3. It also includes various self-checks along the way to ensure the quality and accuracy of content.

BitNotion will minimize its local storage and bandwidth by offloading to Amazon S3. Only the original files needed to produce all of the resulting files will be kept locally. This avoids investing hundreds of thousands of dollars in hardware that would otherwise be needed to store, process, and deliver terabytes of music files. Additionally, we will use CloudFront as our cacheable content delivery system. This will allow us to scale quickly and horizontally as the site becomes more popular.

BitNotion will not modify the original files themselves.

We will strive to automate as much of the process as possible without getting in the way of content providers. This includes building as many early detection systems as possible.

Glossary of Terms

  • WAV = Original Audio File, uploaded by user (note: any original format is possible, but WAV will be used to represent all original formats)
  • ART = Original Cover Art
  • PRV = Preview File(s)
  • MP3 = Encoded Audio File, purchasable
  • STOR = Local storage, maintained by BitNotion
  • AS3 = Amazon S3 cloud storage. A “bucket” is essentially a “drive” which has global permissions. Each item within the bucket can have its own permissions, though. In our case, we have 2 buckets: “content”, which is private, and “public”, which can be accessed without authentication. Temporary public URIs can be generated for private content, which is how we’ll serve downloads for purchased content.
  • SQS = Amazon Simple Queue Service. Part of Amazon Web Services, SQS is a job queuing service, allowing remote machines to communicate with each other through messages.
  • MetaData = data about a track (artist, release, genre, etc.) and its files in the database (S3 URIs)
  • UI = User Interface
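The temporary public URIs mentioned under AS3 can be sketched with S3’s query-string authentication (Signature Version 2, current as of this writing). This is a minimal stdlib-only sketch, not our implementation; the bucket name, key, and credentials are placeholders:

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote

def presigned_url(bucket, key, aws_key_id, aws_secret, expires_in=300):
    """Build a time-limited GET URL for a private S3 object
    (S3 query-string auth, Signature Version 2)."""
    expires = int(time.time()) + expires_in
    # String-to-sign format defined by S3: verb, Content-MD5, Content-Type,
    # expiry timestamp, canonicalized resource (all on separate lines).
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{key}"
    digest = hmac.new(aws_secret.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return (f"https://{bucket}.s3.amazonaws.com/{key}"
            f"?AWSAccessKeyId={aws_key_id}&Expires={expires}"
            f"&Signature={signature}")
```

A purchased-download link would be generated this way against the private “content” bucket, with a short expiry so the URI cannot be shared indefinitely.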

Content Ingestion WorkFlow

Here’s a procedural overview of the content ingestion process for new incoming content. The process will be slightly different for content that’s being updated (e.g., a track title or cover art is changed).

Client Side: Create Encode Request

  1. Track data is entered by user, including Release, Genre, Title, Cover Art, and Master Audio File.
  2. User clicks “Submit”
  3. Track MetaData is used to generate an encode request
    1. UI POSTs MetaData to the encode request API
    2. API Returns a status message and a reference to the encode request (requestId)
  4. Track is Uploaded to AS3
    1. UI requests AS3 policy for encode request to upload track
    2. API responds with status message, signature, and encrypted policy
    3. UI uses signature and policy to upload track to AS3
    4. AS3 responds with status message. If successful, responds with ETag and URI for uploaded track
    5. UI updates encode request with ETag and S3URI for track
  5. CoverArt is Uploaded to AS3
    1. UI requests AS3 policy for encode request to upload cover art
    2. API responds with status message, signature, and encrypted policy
    3. UI uses signature and policy to upload cover art to AS3
    4. AS3 responds with status message. If successful, responds with ETag and URI for uploaded cover art
    5. UI updates encode request with ETag and S3URI for cover art
  6. Encode Job is Queued using encode request data
    1. UI updates encode request, setting status to ready
    2. API creates SQS job with encode request data, responds with status message.
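Steps 4.1–4.3 (and the matching cover-art steps) hinge on the API signing an S3 POST policy that the browser can submit directly to the bucket. A minimal stdlib-only sketch of what the API might hand back, assuming HMAC-SHA1 over the base64-encoded policy document; the bucket, key prefix, and expiration values are illustrative:

```python
import base64
import hashlib
import hmac
import json

def build_upload_policy(bucket, key_prefix, aws_secret, expiration):
    """Create the base64 policy document and signature the API would
    return to the UI for a browser-based S3 POST upload."""
    policy = {
        "expiration": expiration,  # e.g. "2010-09-14T12:00:00Z"
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],
            {"acl": "private"},
        ],
    }
    # S3 expects the policy base64-encoded, and the signature to be
    # HMAC-SHA1 of that encoded string under the AWS secret key.
    encoded = base64.b64encode(json.dumps(policy).encode()).decode()
    signature = base64.b64encode(
        hmac.new(aws_secret.encode(), encoded.encode(), hashlib.sha1).digest()
    ).decode()
    return encoded, signature
```

The UI would include the encoded policy and signature as form fields in its multipart POST to the bucket endpoint, alongside the file itself.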

Server Side: Encode Track

  1. Receive WAV, ART, and Track Metadata
    1. Use S3 ETag/Uri to retrieve master audio file from S3
    2. Use S3 ETag/Uri to retrieve cover art file from S3
  2. Process Original Files
    1. Tag WAV with all possible MetaData values
    2. Move WAV, ART to permanent location in STOR
    3. Copy WAV to AS3 private bucket
    4. Copy ART to AS3 public bucket
  3. Create Preview Files
    1. Encode WAV to single PRV
    2. Split PRV to multiple PRV, 2 minute increments each
    3. Move PRVs to AS3 public bucket; update Track MetaData with preview URIs
  4. Create Purchasable File(s)
    1. Encode WAV to MP3
    2. Tag MP3 with MetaData
    3. Move MP3 to AS3 private bucket
  5. Update MetaData, mark track as published.
  6. Cleanup
    1. Request deletion of incoming master audio file and cover art
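Step 3.2 (splitting the preview into 2-minute increments) reduces to computing segment offsets to feed whatever encoder performs the actual cut. A minimal sketch; the 120-second segment length comes from the step above, and the function name is illustrative:

```python
def preview_segments(duration_seconds, segment_length=120):
    """Compute (start, end) offsets, in seconds, for splitting a preview
    file into fixed-length segments; the last segment holds the remainder."""
    segments = []
    start = 0
    while start < duration_seconds:
        end = min(start + segment_length, duration_seconds)
        segments.append((start, end))
        start = end
    return segments
```

For a 5:30 track this yields three segments: two full 2-minute cuts and a final 90-second remainder.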

At this point, the track is ready to be made available on the site. All preview files and cover art are publicly accessible to the world (even for direct linking). All purchasable and original files (except cover art) are in a private bucket on S3. All originals are stored locally, so that we can retag at any time without the user uploading again.

  1. Check MusicBrainz database
  2. Fingerprint WAV file
    1. Check for existing data in the MusicBrainz database.
      1. If data exists and the track is not tagged as previously auto-detected, make sure the artist sounds similar. If not, flag the track for review and remove it from the site.
      2. If data does not exist, populate it and flag the track as “auto-detect passed” so that the MusicBrainz check is never performed again.
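The “sounds similar” comparison in step 2.1.1 is left unspecified; one stdlib-only way to sketch it is a string-similarity ratio over normalized artist names. The 0.6 threshold is an assumption, not a spec:

```python
from difflib import SequenceMatcher

def artist_sounds_similar(submitted, detected, threshold=0.6):
    """Rough check that the submitted artist name resembles the one found
    via the fingerprint lookup; below the threshold, the caller would
    flag the track for review.  Threshold is an assumed tuning value."""
    ratio = SequenceMatcher(None, submitted.lower(), detected.lower()).ratio()
    return ratio >= threshold
```

Anything that fails this check would be flagged for human review rather than rejected outright, since fuzzy name matching produces false negatives (aliases, featured artists, transliterations).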

Software
