Ingestion

This documentation describes our ingestion process, which consists of several automated, queue-based steps. Your data are stored in our databases immediately, but the audio analysis and the computation of the products may take up to several hours to become available.

Once configured, the ingestion can run daily to keep your catalogues permanently up to date. In the interest of fair usage of our services, please notify us ahead of any large ingestion.

Ingestion is the process that adds content to one of your catalogues, including all the related steps that make your tracks available for your licenced product.

Depending on your assets and your usage, you may use one or another of the methods below to ingest your content. Although we only require an audio file and its reference in your system (called outer_id), you may find it useful to ingest more metadata linked to your audio files. For a better experience on our Portal, we usually recommend ingesting the outer_id, the title, the related artist and the ISRC alongside each audio file. These metadata need a specific mapping for us to parse your content during the ingestion; this mapping is described below.

Note that you can create several catalogs inside your organization, and use a different process of ingestion for each of them at the same time.

Summary

AWS S3 Bucket

Your account includes access to a single dedicated Amazon S3 bucket, to which you deliver the assets to be analyzed:

{
	"bucket_name": "msm-s3-{YOUR_ORGANIZATION_ID}-{AWS_REGION}-prd",
	"bucket_region": "{AWS_REGION}",
	"aws_access_key_id": "{YOUR_PRIVATE_ACCESS_KEY_ID}",
	"aws_secret_access_key": "{YOUR_PRIVATE_ACCESS_KEY_SECRET}"
}

This S3 bucket is dedicated to, and secured for, your organization only. All the assets you deliver and all the assets we produce are stored in this single location.

Several paths are already configured:

Path	Description	Access
csv_outputs	Contains all the exports made by Musimotion	ro
delivery	Root delivery area	ro
delivery/audio	Root delivery for all audio files	rw
delivery/ddex	Root delivery for all DDEX files	rw
delivery/mapping	Root repository for all mapping files	rw
delivery/metadata	Root delivery for all CSV files	rw
INFO.json	Technical configuration (JSON format)	ro
live_analysis	Temporary storage for the Live analysis	-
logs	Storage for all the logs produced by our services	-
processed	Root storage area for processed files	-
processed/audio	Main storage for processed audio files	-
processed/features	Main storage for extracted features	-
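
As an illustration of the access rules in this table, the following Python sketch checks that an object key falls under one of the read-write delivery paths before building its S3 URI. The helper names is_writable and s3_uri and the example bucket name are hypothetical; they are not part of the Musimap API.

```python
from posixpath import normpath

# Prefixes marked "rw" in the table above.
WRITABLE_PREFIXES = (
    "delivery/audio/",
    "delivery/ddex/",
    "delivery/mapping/",
    "delivery/metadata/",
)


def is_writable(key: str) -> bool:
    """Return True if the object key sits under one of the rw delivery paths."""
    # normpath resolves "." and ".." segments so a key cannot escape its prefix.
    clean = normpath(key.lstrip("/"))
    return any(clean.startswith(prefix) for prefix in WRITABLE_PREFIXES)


def s3_uri(bucket_name: str, key: str) -> str:
    """Build the s3:// URI for a key, refusing keys outside the rw paths."""
    if not is_writable(key):
        raise ValueError(f"{key!r} is not under a writable delivery path")
    return f"s3://{bucket_name}/{normpath(key.lstrip('/'))}"


# Hypothetical bucket name, following the pattern shown in the configuration.
print(s3_uri("msm-s3-example-org-eu-west-1-prd", "delivery/audio/album/track.mp3"))
```

Checking the key locally avoids a failed upload against a read-only path such as delivery or csv_outputs.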

Access through CLI

Once you have configured your environment to access the bucket, the following commands should work:

To get the region in which your bucket is located:

aws s3api get-bucket-location --bucket {YOUR_BUCKET_NAME}
{
    "LocationConstraint": "eu-west-1"
}
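
If you read this output from a script, a small helper such as the one below can extract the region. This is a minimal sketch; the function name bucket_region is hypothetical. It also covers the documented S3 behaviour of returning a null LocationConstraint for buckets located in us-east-1.

```python
import json


def bucket_region(cli_output: str) -> str:
    """Extract the region from the JSON printed by `aws s3api get-bucket-location`."""
    constraint = json.loads(cli_output).get("LocationConstraint")
    # S3 reports a null LocationConstraint for buckets in us-east-1.
    return constraint or "us-east-1"


print(bucket_region('{"LocationConstraint": "eu-west-1"}'))
```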

To list the content of your bucket:

aws s3 ls s3://{YOUR_BUCKET_NAME}/
                           PRE csv_outputs/
                           PRE delivery/
                           PRE live_analysis/
                           PRE logs/
                           PRE processed/
                           831 INFO.json
                            36 {YOUR_ORGANIZATION_NAME}

To recursively list all the files in the delivery area:

aws s3 ls s3://{YOUR_BUCKET_NAME}/delivery/ --recursive

Access through Interface

If you prefer not to use the command line, you may connect to your dedicated bucket with any software supporting the S3 protocol.

We recommend Cyberduck, in which you will have to configure the key and the path. Once connected, you will be able to browse the directories and upload your audio files to the right folder.

Audio files only, "free naming"

This is the simplest way to ingest content in one of your catalogs. Note that the outer_id will correspond to the filename of the related audio file.

Use-cases

Your audio filenames do not follow a consistent naming convention.
You do not need to match our results with another database, or you will match the data based on the filenames.
You want to have a quick look at our results through our portal.

Preparation

Simply upload all your audio files into the right path delivery/audio/:

# one file at once
aws s3 cp LOCAL_AUDIO_FILE.mp3 s3://{YOUR_BUCKET_NAME}/delivery/audio/REMOTE_AUDIO_FILE.mp3
# all the content of a specific folder
aws s3 cp LOCAL_FOLDER/ s3://{YOUR_BUCKET_NAME}/delivery/audio/ --recursive

Note that you can create as many subfolders as you want, as long as they are located under delivery/audio/.

Read more about the S3 cp commands
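
If you upload from a script rather than from the CLI, the sketch below shows one way to derive the destination key of each local file while keeping its subfolders under delivery/audio/. The function name delivery_key and the example folder names are hypothetical.

```python
from pathlib import Path, PurePosixPath

AUDIO_ROOT = PurePosixPath("delivery/audio")


def delivery_key(local_root: str, local_file: str) -> str:
    """Map a local file to its S3 key under delivery/audio/, keeping subfolders."""
    # relative_to raises ValueError if the file is not inside local_root.
    relative = Path(local_file).relative_to(local_root)
    return (AUDIO_ROOT / PurePosixPath(*relative.parts)).as_posix()


print(delivery_key("music", "music/album_a/track01.mp3"))
```

Building the key from the path relative to the local root keeps the remote layout identical to the local one, whatever the depth of the subfolders.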

Request

curl --location -g --request POST 'https://api-v2.musimap.io/ingestion/audio_only?delivery_dir_path={THE_PARENT_DELIVERY_PATH}&catalog_id={CATALOG_ID}&delete_delivered_file={TRUE|FALSE}' \
--header 'Authorization: Bearer {VALID_ACCESS_TOKEN}'

Parameter	Value Type	Description
delivery_dir_path	string	Path on S3, relative to `./delivery/audio/`
catalog_id	string (UUID)	Musimap Internal Unique Identifier for the catalog
delete_delivered_file	boolean (default = "True")	Whether the file is deleted from the delivery once processed
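
The same request can be prepared from Python with the standard library only. This is a minimal sketch: the function name audio_only_request is hypothetical, and the TRUE/FALSE spelling of the boolean is an assumption taken from the placeholder in the curl command above. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_ROOT = "https://api-v2.musimap.io"


def audio_only_request(delivery_dir_path: str, catalog_id: str,
                       access_token: str,
                       delete_delivered_file: bool = True) -> Request:
    """Prepare the POST request for the audio_only ingestion endpoint."""
    query = urlencode({
        "delivery_dir_path": delivery_dir_path,
        "catalog_id": catalog_id,
        # Assumption: the API accepts TRUE/FALSE as in the curl example.
        "delete_delivered_file": "TRUE" if delete_delivered_file else "FALSE",
    })
    return Request(
        f"{API_ROOT}/ingestion/audio_only?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {access_token}"},
    )


request = audio_only_request("new_batch", "1234", "my-token", delete_delivered_file=False)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would actually send it.
```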

Process

Once triggered, the ingestion scans the delivery_dir_path and creates one entry for each file found. Please note that the filename becomes the outer_id, which is supposed to be unique. This means that a file can overwrite any previously uploaded file with the same name.
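
Because files sharing a name can overwrite each other, it may be worth checking a delivery for duplicate filenames before triggering the ingestion. The sketch below is one way to do so; the function name find_filename_collisions is hypothetical, and it assumes that the full filename, extension included and subfolder excluded, is what must be unique.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_filename_collisions(keys: list[str]) -> dict[str, list[str]]:
    """Return the filenames used by more than one key, with the keys involved."""
    by_name = defaultdict(list)
    for key in keys:
        # Assumption: only the last path segment is compared.
        by_name[PurePosixPath(key).name].append(key)
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


keys = ["album_a/track.mp3", "album_b/track.mp3", "album_a/other.mp3"]
print(find_filename_collisions(keys))
```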

Note that the audio files are moved during the process. Depending on the parameter delete_delivered_file, the delivery_dir_path will be empty at the end of this first step.

Once all the tracks are saved in our database, the audio analysis starts by sending every audio file to our audio analyzers. This system is based on messaging queues; the audio analysis speed depends on the number of tracks being processed.

Several times a day, another process computes the similarities for Musimatch or the tags for Musimotion and Musime. If a track hasn't been analysed yet, it will be computed later on in another thread. This third step determines the availability of the entry for your licenced product.

Audio files only, "Structured naming"

This is the simplest way to ingest formatted content in one of your catalogues.

Use-cases

All your audio files have the same naming convention.
You want to match our results with your infrastructure, based on a specific reference.
You want to benefit from our service by quickly adding simple metadata to your audio files.

Preparation

Simply upload all your audio files into the right path delivery/audio/:

# one file at once
aws s3 cp LOCAL_AUDIO_FILE.mp3 s3://{YOUR_BUCKET_NAME}/delivery/audio/REMOTE_AUDIO_FILE.mp3
# all the content of a specific folder
aws s3 cp LOCAL_FOLDER/ s3://{YOUR_BUCKET_NAME}/delivery/audio/ --recursive

Note that you can create as many subfolders as you want, as long as they are located under delivery/audio/.

Read more about the S3 cp commands

Mapping

This file should tell us about your naming convention and will allow us to parse each of your filenames to extract the right information.

As an example, if your filename is composed as "outer_id"-"isrc".ext, your mapping should look like:

mapping:
  - separator: "-"
  - column_0: "outer_id"
  - column_1: "isrc"

Available values:

outer_id
title
isrc
release_date
artist_name
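
To check a mapping locally before storing it, you can apply the same idea to a few of your filenames. The sketch below is an approximation under stated assumptions, not our actual parser: the function name parse_filename is hypothetical, the extension is assumed to be dropped before splitting, and the example filename is invented.

```python
from pathlib import PurePosixPath

# The available values listed above.
ALLOWED_FIELDS = {"outer_id", "title", "isrc", "release_date", "artist_name"}


def parse_filename(filename: str, separator: str, columns: list[str]) -> dict[str, str]:
    """Split a filename on the separator and name each part after its column."""
    unknown = set(columns) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unsupported mapping values: {sorted(unknown)}")
    # Assumption: the extension is not part of the last column.
    parts = PurePosixPath(filename).stem.split(separator)
    if len(parts) != len(columns):
        raise ValueError(f"{filename!r} has {len(parts)} parts, expected {len(columns)}")
    return dict(zip(columns, parts))


print(parse_filename("ABC123-USRC17607839.mp3", "-", ["outer_id", "isrc"]))
```

A filename whose fields contain the separator itself (a hyphenated title, for example) fails the part count check, which is a useful signal that another separator should be chosen.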

Once this file is written (and validated), you may want to store it inside your bucket:

aws s3 cp local_mapping.yaml s3://{YOUR_BUCKET_NAME}/delivery/mapping/default_mapping.yaml

Note that our Support Team will usually configure your very first ingestion, and a working example is provided as a validation.

Request

curl --location -g --request POST 'https://api-v2.musimap.io/ingestion/audio_mapping?mapping_filename=default_mapping.yaml&delivery_dir_path={THE_PARENT_DELIVERY_PATH}&catalog_id={CATALOG_ID}&delete_delivered_file={TRUE|FALSE}' \
--header 'Authorization: Bearer {VALID_ACCESS_TOKEN}'

Parameter	Value Type	Description
delivery_dir_path	string	Path on S3, relative to `./delivery/audio/`
catalog_id	string (UUID)	Musimap Internal Unique Identifier for the catalog.
delete_delivered_file	boolean (default = "True")	Whether the file is deleted from the delivery once processed
mapping_filename	string	The filename of your mapping stored into `./delivery/mapping`
mapping_content	string	The YAML file containing your mapping, if not stored in S3

Process

Once triggered, the ingestion scans the delivery_dir_path and creates one entry for each file found. The extracted data are stored alongside the audio file in our databases. Please note that the outer_id is supposed to be unique; a new entry will overwrite any previous entry with the same reference.

Note that the audio files are moved during the process. Depending on the parameter delete_delivered_file, the delivery_dir_path will be empty at the end of this first step.

Audio Files & Metadata (JSON)

This is the most complete way to ingest audio files with metadata.

Use-cases

Your audio files don't have any structured filenames.
You want us to store all the metadata related to an audio file so that you can retrieve all the information immediately.
You want to benefit from our service through our Portal.

Preparation

S3 Storage

In order to fetch the media on S3, simply upload all your audio files into the right path delivery/audio/:

# one file at once
aws s3 cp LOCAL_AUDIO_FILE.mp3 s3://{YOUR_BUCKET_NAME}/delivery/audio/REMOTE_AUDIO_FILE.mp3
# all the content of a specific folder
aws s3 cp LOCAL_FOLDER/ s3://{YOUR_BUCKET_NAME}/delivery/audio/ --recursive

Note that you can create as many subfolders as you want, as long as they are located under delivery/audio/.

Read more about the S3 cp commands

Remote Storage

If your audio files are publicly available, you can set up the ingestion to download them. In that case, you need to fill the primary_media of each track with the complete URL from which the file can be downloaded.
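
Since the media_fetch_type query parameter (see the Request section of this method) applies to a whole request, tracks stored on S3 and tracks to download have to be sent separately. The sketch below groups tracks accordingly; the function names are hypothetical, and treating any http or https value as a remote file is an assumption.

```python
from urllib.parse import urlparse


def media_fetch_type(primary_media: str) -> str:
    """Guess the fetch type: 'download' for a complete URL, 's3' for a relative path."""
    scheme = urlparse(primary_media).scheme.lower()
    return "download" if scheme in ("http", "https") else "s3"


def group_by_fetch_type(tracks: list[dict]) -> dict[str, list[dict]]:
    """Split tracks into one list per media_fetch_type value."""
    groups = {"s3": [], "download": []}
    for track in tracks:
        groups[media_fetch_type(track["primary_media"])].append(track)
    return groups


tracks = [
    {"outer_id": "a", "primary_media": "album/a.mp3"},
    {"outer_id": "b", "primary_media": "https://example.com/b.mp3"},
]
print({kind: len(items) for kind, items in group_by_fetch_type(tracks).items()})
```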

JSON Body

Once your audio files have been uploaded, you will need to query our Web-API to ingest them with the right information. You may use one single query for up to 25 tracks. The query will then contain all the information for us to retrieve the right audio file and to save it with all the information you want us to store. For each of those tracks, the following structure needs to be respected:

{
    "outer_id": "string",
    "references": [
      {
        "id": "string",
        "source": "string"
      }
    ],
    "title": "string",
    "lyrics": "string",
    "isrc": "string",
    "release_date": 0,
    "albums": [
      {
        "upc": "string",
        "title": "string",
        "release_date": "string",
        "type": "Compilation",
        "references": [
          {
            "id": "string",
            "source": "string"
          }
        ],
        "track_position": "string",
        "disk_number": "string"
      }
    ],
    "artists": [
      {
        "id": "string",
        "name": "string",
        "role": "string"
      }
    ],
    "primary_media": "string",
    "customer_tags": [
      {
        "tag": "type of Rock",
        "category": "Rock"
      }
    ]
  }

Note that only the fields outer_id and primary_media are required; all the others are optional.

Parameter	Value Type	Description
outer_id	string	Your unique identifier for this track
references	Nested object	A list of official references for this track, sorted by source
title	string	The official title for this track
lyrics	string	The complete lyrics for this track
isrc	string	The unique ISRC reference for this track
release_date	integer	The date of release (YYYYMMDD)
albums.upc	string	The unique UPC reference for this album
albums.title	string	The title for the album
albums.release_date	string	The date of release (YYYYMMDD)
albums.type	string	The type of album (Official, Compilation, Single,...)
albums.references	Nested object	A list of official references for this album, sorted by source
albums.track_position	string	The position of the track on the related disk_number
albums.disk_number	string	The disk on which the track appears
artists.id	string	A specific identifier for the related artist
artists.name	string	The name of a specific artist related to this track
artists.role	string	A specific role you would like to attach to the artist
primary_media	string	Remote URL or path on S3, relative to `./delivery/audio/`
customer_tags	Nested object	A list of tags, sorted by categories
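
Before sending a track, you may want to check it against this table on your side. The sketch below only covers the two required fields and the YYYYMMDD format of the track-level release_date; the function name validate_track and the wording of the messages are hypothetical, and the checks actually performed by the API may differ.

```python
from datetime import datetime

REQUIRED_FIELDS = ("outer_id", "primary_media")


def validate_track(track: dict) -> list[str]:
    """Return a list of problems found in a track; an empty list means it looks fine."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not track.get(name)
    ]
    release_date = track.get("release_date")
    if release_date is not None:
        text = str(release_date)
        try:
            # Exactly eight digits forming a real calendar date (YYYYMMDD).
            if len(text) != 8 or not text.isdigit():
                raise ValueError(text)
            datetime.strptime(text, "%Y%m%d")
        except ValueError:
            problems.append(f"release_date is not a valid YYYYMMDD date: {release_date}")
    return problems


print(validate_track({"outer_id": "ABC123", "release_date": 20241340}))
```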

Note that you will be able to retrieve all this information by using the enriched response of several of our endpoints.

Request

In addition to this BODY, several QUERY parameters allow you to configure the request:

Parameter	Value Type	Description
media_fetch_type	`s3` or `download` (default: s3)	Whether the file is stored on S3 or remotely
catalog_id	any string	Musimap Internal Unique Identifier for the catalog.
ingestion_id	any string	A unique reference for this ingestion
overwrite_audio_file	BOOLEAN (default: TRUE)	In case of an already existing `outer_id`, whether the audio file needs to be overwritten
delete_delivered_file	BOOLEAN (default: TRUE)	Whether the file is deleted from the delivery once processed

Note that the ingestion_id is any string that will create a sub-collection of your entries. We advise you to generate one unique ingestion_id per day or week. If omitted, a timestamp will be used.
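
One possible way to generate such an identifier per day or per week is sketched below. The function names and the "ingestion" prefix are hypothetical; any string works as an ingestion_id.

```python
from datetime import date


def daily_ingestion_id(day: date, prefix: str = "ingestion") -> str:
    """One identifier per calendar day, e.g. ingestion-2024-01-31."""
    return f"{prefix}-{day.isoformat()}"


def weekly_ingestion_id(day: date, prefix: str = "ingestion") -> str:
    """One identifier per ISO week, e.g. ingestion-2024-W05."""
    # isocalendar() gives the ISO year, which can differ from day.year
    # for the first and last days of a calendar year.
    iso_year, iso_week, _ = day.isocalendar()
    return f"{prefix}-{iso_year}-W{iso_week:02d}"


print(daily_ingestion_id(date.today()), weekly_ingestion_id(date.today()))
```

Deriving the identifier from the date keeps all the requests of the same day or week in the same sub-collection without any shared state between runs.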

curl --location -g --request POST 'https://api-v2.musimap.io/ingestion/json?media_fetch_type={s3|download}&catalog_id={CATALOG_ID}&ingestion_id={INGESTION_ID}&overwrite_audio_file={TRUE|FALSE}&delete_delivered_file={TRUE|FALSE}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {VALID_ACCESS_TOKEN}' \
--data-raw '[
  {
    "outer_id": "string",
    "references": [
      {
        "id": "string",
        "source": "string"
      }
    ],
    "title": "string",
    "lyrics": "string",
    "isrc": "string",
    "release_date": 0,
    "albums": [
      {
        "upc": "string",
        "title": "string",
        "release_date": "string",
        "type": "Compilation",
        "references": [
          {
            "id": "string",
            "source": "string"
          }
        ],
        "track_position": "string",
        "disk_number": "string"
      }
    ],
    "artists": [
      {
        "id": "string",
        "name": "string",
        "role": "string"
      }
    ],
    "primary_media": "string",
    "customer_tags": [
      {
        "tag": "type of Rock",
        "category": "Rock"
      }
    ]
  }
]'
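
For catalogues larger than the 25 tracks allowed per query, the same request can be prepared in batches from Python. This is a minimal sketch with the standard library only: the function name json_ingestion_requests is hypothetical, the requests are built but not sent, and the optional overwrite_audio_file and delete_delivered_file parameters are omitted so that their documented defaults apply.

```python
import json
from typing import Iterator
from urllib.parse import urlencode
from urllib.request import Request

API_ROOT = "https://api-v2.musimap.io"
MAX_TRACKS_PER_REQUEST = 25  # limit stated in the JSON Body section


def json_ingestion_requests(tracks: list[dict], catalog_id: str,
                            ingestion_id: str, access_token: str,
                            media_fetch_type: str = "s3") -> Iterator[Request]:
    """Yield one prepared POST request per batch of up to 25 tracks."""
    query = urlencode({
        "media_fetch_type": media_fetch_type,
        "catalog_id": catalog_id,
        "ingestion_id": ingestion_id,
    })
    for start in range(0, len(tracks), MAX_TRACKS_PER_REQUEST):
        batch = tracks[start:start + MAX_TRACKS_PER_REQUEST]
        yield Request(
            f"{API_ROOT}/ingestion/json?{query}",
            data=json.dumps(batch).encode("utf-8"),
            method="POST",
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {access_token}",
            },
        )


tracks = [{"outer_id": f"id-{i}", "primary_media": f"album/{i}.mp3"} for i in range(60)]
requests = list(json_ingestion_requests(tracks, "1234", "ingestion-2024-W05", "my-token"))
print([len(json.loads(r.data)) for r in requests])
# urllib.request.urlopen(r) would send each request in turn.
```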

Process

Once triggered, the ingestion parses the BODY of your request and creates one entry for each track found. The extracted data are stored alongside the audio file in our databases. Please note that the outer_id is supposed to be unique; a new entry will overwrite any previous entry with the same reference.

Note that the audio files are moved during the process. Depending on the parameter delete_delivered_file, the delivered files will no longer be present in the delivery area at the end of this first step.
