Audio Analysis with Essentia: The Gaia Library - Hirono's Idle Diary [IT tools, programming, software and more...]

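Hirono here.

Lately I've been hooked on bread. Even the same kind of walnut bread tastes quite different from shop to shop.
For some reason there are a lot of bakeries around my new place, so I'm working my way through them.
I'll try not to overeat.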
www.super-kinokuniya.jp

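Beyond low-level analysis of an audio source, Essentia can also do deeper analysis, such as genre,
male or female vocals, and whether a track is a karaoke version, so I've written up how to do it.

To enable high-level analysis, this post covers installing and building Gaia and running a sample.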

Music extractor — Essentia 2.1-dev documentation

The items available from the high-level analysis are listed here:
http://essentia.upf.edu/documentation/svm_models/accuracies_v2.1_beta1.html

Setup

Base Essentia environment

Build it by following my previous post:
ino-hiro1012.hatenablog.com

Installing Gaia2

Download and extract gaia: https://github.com/MTG/gaia/tree/master
brew install gaia --HEAD
./waf configure --download
./waf
./waf install
Only the (gaia_src)/src/point.h file has to be copied into /usr/local/include/gaia2/parser by hand.
cp /(gaia_src)/src/point.h /usr/local/include/gaia2/parser/

Changing the Essentia configuration

The commands below are run with the waf in the Essentia source tree, not Gaia's.
./waf configure --mode=release --build-static --with-python --with-cpptests --with-examples --with-vamp --with-gaia
The "--with-gaia" flag is the addition here.
./waf
./waf install

Running the sample

Create a YAML file (e.g. sample.yaml). Adjust it to the values you want to obtain.

highlevel:
    compute: 1
    svm_models: ['svm_models/genre_tzanetakis.history', 'svm_models/mood_sad.history']

To obtain everything, the profile looks like this:

highlevel:
    compute: 1
    svm_models: ['svm_models/danceability.history', 'svm_models/gender.history', 'svm_models/genre_dortmund.history', 'svm_models/genre_electronic.history', 'svm_models/genre_rosamerica.history', 'svm_models/genre_tzanetakis.history', 'svm_models/ismir04_rhythm.history', 'svm_models/mood_acoustic.history', 'svm_models/mood_aggressive.history', 'svm_models/mood_electronic.history', 'svm_models/mood_happy.history', 'svm_models/mood_party.history', 'svm_models/mood_relaxed.history', 'svm_models/mood_sad.history', 'svm_models/moods_mirex.history', 'svm_models/timbre.history', 'svm_models/tonal_atonal.history', 'svm_models/voice_instrumental.history']
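
Typing that long list by hand is error-prone, so here is a minimal Python sketch that builds the same profile text from bare model names. The function name build_profile and its model_dir parameter are made up for illustration. Only the highlevel / compute / svm_models keys and the svm_models/ directory come from the profile above.

```python
def build_profile(model_names, model_dir="svm_models"):
    """Return profile text listing one .history file per model name.

    build_profile and model_dir are illustrative names, not part of Essentia.
    """
    paths = ", ".join(f"'{model_dir}/{name}.history'" for name in model_names)
    return f"highlevel:\n    compute: 1\n    svm_models: [{paths}]\n"


# Reproduces the two-model example; write the text to sample.yaml to use it.
print(build_profile(["genre_tzanetakis", "mood_sad"]), end="")
```

Passing all eighteen model names gives the full profile shown above.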

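Download the history files and related files from http://essentia.upf.edu/documentation/svm_models/
and place them in build/src/examples.
(1) Run essentia_streaming_extractor_music sample.mp3 result.txt
Note: this step is the analysis by Essentia.
(2) Run essentia_streaming_extractor_music_svm result.txt result2.txt sample.yaml
The command name has _svm appended.
Note: this step passes the result file from the Essentia analysis as an argument and analyzes it with Gaia.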
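
The two steps can be chained from Python. This is only a rough sketch. It assumes both binaries are on PATH (or that you run it from build/src/examples), and build_commands and run_pipeline are names I made up for illustration. The command names and argument order are taken from steps (1) and (2) above.

```python
import subprocess

EXTRACTOR = "essentia_streaming_extractor_music"  # step (1): Essentia
SVM_EXTRACTOR = EXTRACTOR + "_svm"                # step (2): Gaia


def build_commands(audio, lowlevel_out, highlevel_out, profile):
    """Return the argv lists for steps (1) and (2), in order."""
    return [
        [EXTRACTOR, audio, lowlevel_out],
        [SVM_EXTRACTOR, lowlevel_out, highlevel_out, profile],
    ]


def run_pipeline(audio, lowlevel_out, highlevel_out, profile, run=subprocess.run):
    """Run both steps; check=True raises and stops if a step fails."""
    for argv in build_commands(audio, lowlevel_out, highlevel_out, profile):
        run(argv, check=True)


# Usage (needs the binaries, so it is left commented out):
# run_pipeline("sample.mp3", "result.txt", "result2.txt", "sample.yaml")
```

The run parameter exists only so a stand-in can be substituted when the binaries are not installed.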

http://essentia.upf.edu/documentation/svm_models/accuracies_v2.1_beta1.html
The values you can obtain are listed at the URL above.

For more background on the genre models, the following is a useful reference.
https://acousticbrainz.org/datasets/accuracy

For moods_mirex, the following is a useful reference.

http://www.music-ir.org/mirex/wiki/2015:Audio_Classification_(Train/Test)_Tasks
Cluster_1: passionate, rousing, confident, boisterous, rowdy
Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured
Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding
Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry
Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral
[Excite Translate output, in Japanese]
Cluster_1：情熱的であること、起きること、自信があること、騒々しいこと、乱暴者
Cluster_2：大いに楽しむこと、快活であること、および楽しみ、甘味性質なので、好意的／よい
Cluster_3：教育のある人、痛烈で、物欲しそうで、ビタースイート、秋であり、卵を抱く
Cluster_4：ユーモラスで、馬鹿、ホモっぽく、奇抜で、気まぐれで、機知に富み、皮肉である
Cluster_5：攻撃的で、火であり、緊張／心配し、強烈で、揮発性で、本能的である
[Google Translate output, in Japanese]
Cluster_1: 情熱的な活発な自信を持って、騒々しい、騒々しい
Cluster_2: 陽気、陽気な楽しい、甘い、愛想のいい/良い温厚な
Cluster_3: 読み書き、痛烈な哀愁を帯びた、ほろ苦い、紅葉、陰気な
Cluster_4: ユーモラスな愚かな、古くさい、風変わりな、気まぐれな、機知に富んだ、皮肉
Cluster_5: 積極的な激しい、緊張・不安、強烈な揮発性、内臓
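
To make the cluster labels easier to read in the results, here is a small Python sketch that maps a moods_mirex result back to the adjectives listed above. It assumes the step (2) output is JSON with the layout highlevel -> moods_mirex -> value / probability, and that the label contains the cluster number. Check both against your own result file. The function names are made up for illustration.

```python
import json

# Adjectives per cluster, copied from the MIREX list above.
MIREX_CLUSTERS = {
    1: ["passionate", "rousing", "confident", "boisterous", "rowdy"],
    2: ["rollicking", "cheerful", "fun", "sweet", "amiable/good natured"],
    3: ["literate", "poignant", "wistful", "bittersweet", "autumnal", "brooding"],
    4: ["humorous", "silly", "campy", "quirky", "whimsical", "witty", "wry"],
    5: ["aggressive", "fiery", "tense/anxious", "intense", "volatile", "visceral"],
}


def cluster_number(label):
    """Extract the number from a label such as 'Cluster5' or 'Cluster_5'."""
    return int("".join(ch for ch in label if ch.isdigit()))


def describe_mood(result):
    """Return (label, probability, adjectives) for the moods_mirex entry.

    Assumed layout: result["highlevel"]["moods_mirex"] has "value" and "probability".
    """
    entry = result["highlevel"]["moods_mirex"]
    label = entry["value"]
    return label, entry["probability"], MIREX_CLUSTERS[cluster_number(label)]


def load_result(path):
    """Load a result file, assuming it was written as JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Usage: describe_mood(load_result("result2.txt"))
```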

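I had a lot of tracks on hand, so I built lists of them beforehand and
ran the analysis on multiple CPU cores, with each run reading its own list.
With hundreds of thousands of tracks, this looks like it will take quite a few days.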
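
For reference, the batch run could be sketched in Python roughly as follows. This is not the script I actually used. The output naming scheme, the function names, and the use of a thread pool (one extractor process per worker) are all assumptions for illustration. Only the extractor command name comes from step (1).

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

EXTRACTOR = "essentia_streaming_extractor_music"  # command name from step (1)


def read_track_list(text):
    """Return the non-empty lines of a track list (one audio path per line)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def result_path(track, out_dir):
    """Hypothetical naming scheme: <out_dir>/<file name without extension>.txt"""
    return str(Path(out_dir) / (Path(track).stem + ".txt"))


def analyze(track, out_dir, run=subprocess.run):
    """Run the extractor on one track and return (track, succeeded)."""
    completed = run([EXTRACTOR, track, result_path(track, out_dir)])
    return track, completed.returncode == 0


def analyze_all(tracks, out_dir, workers=None, run=subprocess.run):
    """Analyze tracks in parallel; each thread just waits on one child process."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: analyze(t, out_dir, run), tracks))


# Usage (needs the binary, so it is left commented out):
# tracks = read_track_list(Path("tracks.txt").read_text())
# failed = [t for t, ok in analyze_all(tracks, "results") if not ok]
```

With this naming scheme, two tracks with the same file name in different directories would overwrite each other's results, so a real run needs a more careful output path.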