Trying the ReSpeaker 4-Mic Array for Raspberry Pi with a Raspberry Pi 4B + ODAS_ROS

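In "Trying the ReSpeaker 4-Mic Array for Raspberry Pi with a Raspberry Pi 4B + Raspberry Pi OS" I tried out a somewhat unusual microphone array device.

It detects which direction a sound is coming from and, if the source moves, tracks it.

Last time the data was sent between a server and a client on the same computer. This time I try ROS topic communication between different computers.

The ROS package for ODAS is used for this.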

odas_ros

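ROS is middleware that runs on Ubuntu, but I could not set up the ReSpeaker 4-Mic Array for Raspberry Pi on Ubuntu.

So, to use it on a Raspberry Pi:

1: Set up the ReSpeaker 4-Mic Array for Raspberry Pi on Raspberry Pi OS (Debian)

2: Create an Ubuntu container with Docker, set up ROS1 (Melodic) and odas_ros inside it, and use the ReSpeaker 4-Mic Array from the container

The published data is then viewed from a Jetson Nano.

For the communication setup, see the following:

Trying node-to-node communication with a single-board computer + ROS1 (Melodic)

The setup here: the Raspberry Pi 4B publishes from the Docker container, and the Jetson Nano subscribes.

Setting up the ReSpeaker 4-Mic Array for Raspberry Pi on the Raspberry Pi 4B

The OS image is 32-bit Bullseye; write it to a microSD card and set it up beforehand.

Clone seeed-voicecard.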

cd ~

git clone https://github.com/respeaker/seeed-voicecard.git

cd ~/seeed-voicecard

Install

sudo ./install.sh 4mic

Reboot

sudo reboot

Use the following command to confirm that ac108 and CARD=seeed4micvoicec are visible.

arecord -L

Installing Docker Engine and creating the container

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

Create a container and install ROS

ROS Melodic is used, so download the Ubuntu 18.04 LTS image from Docker Hub.

sudo docker pull ubuntu:18.04

Create a container named my_ros1

Make the sound device visible inside the container. I do not know yet whether X will be needed, but I set it up anyway.

sudo docker create -it --name my_ros1 --network host --device=/dev/snd:/dev/snd -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix ubuntu:18.04

To exchange files with the host, either use the docker cp command or create a work directory and mount it as a volume.

Start the container

sudo docker start -i my_ros1

Update and upgrade, then install the required packages and libraries

apt update
apt upgrade -y
apt install git -y
apt install alsa-utils
apt install python3-pip -y
apt install python-pip -y
apt install nano

Install ROS1 (Melodic) in the container

Add the repositories

apt install software-properties-common -y

apt-add-repository universe
apt-add-repository multiverse
apt-add-repository restricted

Add the ROS package source and the apt key

sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'

apt-key adv --keyserver 'hkp://keyserver.ubuntu.com:80' --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654

Install Melodic

apt update

apt install ros-melodic-desktop-full -y

echo "source /opt/ros/melodic/setup.bash" >> ~/.bashrc

source ~/.bashrc

Install rosinstall and the related tools needed to install ROS software packages

apt install python-rosdep python-rosinstall python-rosinstall-generator python-wstool build-essential -y

Install rosdep, which is used to install dependency packages when compiling from source

apt install python-rosdep

rosdep init

rosdep update

Create the workspace

mkdir -p ~/catkin_ws/src

cd ~/catkin_ws

catkin_make


sh -c 'echo "source ~/catkin_ws/devel/setup.bash" >> ~/.bashrc'

Install odas_ros

Install the packages and dependency libraries

apt-get install cmake gcc build-essential libfftw3-dev libconfig-dev libasound2-dev

ODAS ROS uses the audio utilities from AudioUtils, so audio_utils has to be installed in the catkin workspace.

cd ~/catkin_ws/src

git clone https://github.com/introlab/audio_utils.git

Install additional dependency libraries

apt-get install gfortran texinfo -y
pip install libconf

Fetch the submodules and build

cd audio_utils

git submodule update --init --recursive

cd ../..

catkin_make

Clone and install odas_ros

cd ~/catkin_ws/src

git clone https://github.com/introlab/odas_ros.git

cd odas_ros

git submodule update --init --recursive

cd ../..

catkin_make

At this point, exit the container once and restart it.

Edit the hardware configuration file (configuration.cfg)

The file is in the following location.

/root/catkin_ws/src/odas_ros/config/configuration.cfg

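I use the description in odas_ros as a reference, but it is not written for the ReSpeaker 4-Mic Array for Raspberry Pi, so I also refer to the respeaker_4_mic_array.cfg created in "Trying the ReSpeaker 4-Mic Array for Raspberry Pi with a Raspberry Pi 4B + Raspberry Pi OS" when making the changes.

The modified file is available for download (it is also listed in the Appendix).

Check the card number and device number with the following command.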

arecord -l

Main changes

The ReSpeaker has four microphones, so nChannels is 4. Change devicename to use the card number confirmed above.

# Raw

raw: 
{

    fS = 16000;
    hopSize = 512;
    nBits = 32;
    nChannels = 4; 

    # Input with raw signal from microphones
    interface: {    #"arecord -l" OR "aplay --list-devices" to see the devices
        type = "soundcard_name";
        devicename = "hw:CARD=3,DEV=0";
    }

}
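As a rough sketch of how the card and device numbers reported by arecord -l map onto devicename, the Python snippet below parses one line of that output and builds the string. The sample line and the function name alsa_devicename are illustrative assumptions, not output captured from the actual device; the card number will differ from machine to machine.

```python
import re

# Matches lines such as "card 3: seeed4micvoicec [...], device 0: ..."
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[.*?\], device (\d+):")


def alsa_devicename(arecord_output, card_name="seeed4micvoicec"):
    """Return 'hw:CARD=<n>,DEV=<m>' for the first matching capture device."""
    for line in arecord_output.splitlines():
        match = CARD_LINE.match(line.strip())
        if match and match.group(2) == card_name:
            return "hw:CARD={},DEV={}".format(match.group(1), match.group(3))
    return None


# Illustrative sample only; run `arecord -l` to get the real output.
sample = (
    "**** List of CAPTURE Hardware Devices ****\n"
    "card 3: seeed4micvoicec [seeed-4mic-voicecard], device 0: example [example]\n"
)
print(alsa_devicename(sample))
```

If the card number changes after a reboot or when another sound device is attached, devicename in configuration.cfg has to be updated to match.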

There are four microphones, so the mapping looks like this:

mapping:
{
    map: (1, 2, 3, 4);
}

~~mics defines 16 microphones, but delete entries 5 onward (also remove the comma after the last closing }).~~

Replace each mics entry with the one from respeaker_4_mic_array.cfg.

Also change the IP address and other settings in the interface of each of the ssl, sst, and sss sections.

See the file for details.

For reference:

ssl -> Sound Source Localization

sst -> Sound Source Tracking

sss -> Sound Source Separation

Publishing the data

Subscribe on the Jetson Nano.

For the Jetson Nano setup, see the following:

Installing ROS1 (Melodic) on Jetson Nano (memo)

The following assumes the Jetson Nano is 192.168.0.40 and the Raspberry Pi is 192.168.0.34.

On the Jetson Nano, set the master and start ROS.

export ROS_MASTER_URI=http://192.168.0.40:11311
export ROS_IP=192.168.0.40

roscore

On the Raspberry Pi, publish from the container.

sudo docker start -i my_ros1


export ROS_MASTER_URI=http://192.168.0.40:11311
export ROS_IP=192.168.0.34

roslaunch odas_ros odas.launch

On the Jetson Nano, open another terminal and check the topics.

rostopic list

If topics like the following are listed, it should be working.

/odas/ssl
/odas/ssl_pcl2
/odas/sss
/odas/sst
/odas/sst_poses

Errors on the Jetson side

I tried rostopic echo <topic name>, but /odas/ssl, /odas/sss, and /odas/sst returned an ERROR.

/odas/ssl

ERROR: Cannot load message class for [odas_ros/OdasSslArrayStamped]. Are your messages built?

/odas/sss

ERROR: Cannot load message class for [audio_utils/AudioFrame]. Are your messages built?

/odas/sst

ERROR: Cannot load message class for [odas_ros/OdasSstArrayStamped]. Are your messages built?

~~The sst ERROR may be related to the Sound Source Tracking Threshold adjustment described in odas_ros.~~

The two topics whose messages could be read looked like this.

rostopic echo /odas/ssl_pcl2

type: sensor_msgs/PointCloud2

---
header: 
  seq: 3713
  stamp: 
    secs: 1646554913
    nsecs: 599056005
  frame_id: "odas"
height: 1
width: 2
fields: 
  - 
    name: "x"
    offset: 0
    datatype: 7
    count: 1
  - 
    name: "y"
    offset: 4
    datatype: 7
    count: 1
  - 
    name: "z"
    offset: 8
    datatype: 7
    count: 1
  - 
    name: "intensity"
    offset: 12
    datatype: 7
    count: 1
is_bigendian: False
point_step: 16
row_step: 32
data: [74, 12, 66, 63, 10, 215, 35, 190, 193, 202, 33, 63, 53, 94, 58, 62, 221, 36, 198, 62, 231, 251, 169, 189, 12, 2, 107, 63, 18, 131, 64, 62]
is_dense: False
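The data field of this PointCloud2 message is a raw byte array. As a small sketch of how it can be read, the Python snippet below unpacks it using only what the message itself states: four fields at offsets 0, 4, 8, and 12 with datatype 7 (FLOAT32 in sensor_msgs/PointField), point_step 16, and is_bigendian False. The function name decode_points is made up for this example; inside a ROS environment, sensor_msgs.point_cloud2.read_points does the same job.

```python
import struct

# Values copied from the rostopic echo output of /odas/ssl_pcl2.
POINT_STEP = 16  # bytes per point
DATA = [74, 12, 66, 63, 10, 215, 35, 190, 193, 202, 33, 63, 53, 94, 58, 62,
        221, 36, 198, 62, 231, 251, 169, 189, 12, 2, 107, 63, 18, 131, 64, 62]


def decode_points(data, point_step=POINT_STEP):
    """Unpack each point as little-endian float32 (x, y, z, intensity)."""
    raw = bytes(data)
    return [struct.unpack_from("<4f", raw, offset)
            for offset in range(0, len(raw), point_step)]


points = decode_points(DATA)
for x, y, z, intensity in points:
    print("x={:.3f} y={:.3f} z={:.3f} intensity={:.3f}".format(x, y, z, intensity))
```

Both decoded points have an (x, y, z) of length close to 1, so each one appears to be a direction on the unit sphere, with intensity carried as the fourth value.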

rostopic echo /odas/sst_poses

type: geometry_msgs/PoseArray

---
header: 
  seq: 1119
  stamp: 
    secs: 1646555033
    nsecs: 919815063
  frame_id: "odas"
poses: 
  - 
    position: 
      x: 0.0
      y: 0.0
      z: 0.0
    orientation: 
      x: -0.0372528547815
      y: -0.425908955232
      z: -0.0787690638174
      w: 0.900560503936
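The position in this pose is always zero, so the information is in the orientation quaternion. RViz draws a pose arrow along the local +X axis, so rotating (1, 0, 0) by the quaternion gives the direction the arrow points. The Python sketch below does that rotation; treating this as the direction of the tracked source is my assumption about how odas_ros encodes it, and the function name rotate_x_axis is made up for the example.

```python
import math


def rotate_x_axis(qx, qy, qz, qw):
    """Rotate the unit vector (1, 0, 0) by the quaternion (qx, qy, qz, qw)."""
    norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    qx, qy, qz, qw = qx / norm, qy / norm, qz / norm, qw / norm
    # First column of the rotation matrix built from the unit quaternion.
    return (
        1.0 - 2.0 * (qy * qy + qz * qz),
        2.0 * (qx * qy + qw * qz),
        2.0 * (qx * qz - qw * qy),
    )


# Orientation copied from the /odas/sst_poses message.
direction = rotate_x_axis(-0.0372528547815, -0.425908955232,
                          -0.0787690638174, 0.900560503936)
print("direction = ({:.3f}, {:.3f}, {:.3f})".format(*direction))
```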

The topics that could not be read on the Jetson look like this when viewed with rostopic echo on the Raspberry Pi side.

rostopic echo /odas/sst

type: odas_ros/OdasSstArrayStamped

---
header: 
  seq: 38244
  stamp: 
    secs: 1646634269
    nsecs:   1620054
  frame_id: "odas"
sources: 
  - 
    id: 1
    x: 0.002
    y: 0.035
    z: 0.999
    activity: 1.0
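The x, y, z values of a tracked source have a length close to 1, so they look like a direction vector in the odas frame. A short Python sketch for turning such a vector into angles follows. The conventions (azimuth measured from +X toward +Y, elevation measured up from the XY plane) and the function name to_azimuth_elevation are choices made for this example, not something the odas_ros message defines.

```python
import math


def to_azimuth_elevation(x, y, z):
    """Convert a direction vector to (azimuth, elevation) in degrees."""
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation


# Source copied from the /odas/sst message.
azimuth, elevation = to_azimuth_elevation(0.002, 0.035, 0.999)
print("azimuth={:.1f} deg, elevation={:.1f} deg".format(azimuth, elevation))
```

In this sample the source is almost directly above the array (z is close to 1), so the azimuth is poorly determined and small changes in x and y will swing it widely.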

rostopic echo /odas/ssl

type: odas_ros/OdasSslArrayStamped

---
header: 
  seq: 86020
  stamp: 
    secs: 1646634651
    nsecs: 206928968
  frame_id: "odas"
sources: 
  - 
    x: 0.066
    y: 0.091
    z: 0.994
    E: 0.27
  - 
    x: 0.131
    y: 0.043
    z: 0.99
    E: 0.22

rostopic echo /odas/sss

type: audio_utils/AudioFrame

format: "signed_16" channel_count: 1 sampling_frequency: 44100 frame_sample_count: 256 data: [53, 0, 22, 255, 155, 254, 198, 254, 64, 255, 169, 255, 224, 255, 253, 255, 29, 0, 52, 0, 23, 0, 172, 255, 9, 255, 98, 254, 219, 253, 112, 253, 8, 253, 161, 252, 86, 252, 70, 252, 112, 252, 173, 252, 213, 252, 228, 252, 246, 252, 42, 253, 124, 253, 207, 253, 18, 254, 85, 254, 186, 254, 75, 255, 223, 255, 57, 0, 59, 0, 3, 0, 215, 255, 230, 255, 43, 0, 115, 0, 144, 0, 128, 0, 100, 0, 96, 0, 119, 0, 134, 0, 102, 0, 9, 0, 129, 255, 250, 254, 154, 254, 115, 254, 122, 254, 144, 254, 145, 254, 105, 254, 30, 254, 210, 253, 162, 253, 154, 253, 177, 253, 212, 253, 255, 253, 58, 254, 137, 254, 218, 254, 12, 255, 12, 255, 231, 254, 197, 254, 198, 254, 227, 254, 244, 254, 210, 254, 125, 254, 29, 254, 224, 253, 213, 253, 227, 253, 228, 253, 205, 253, 181, 253, 192, 253, 247, 253, 61, 254, 100, 254, 82, 254, 22, 254, 221, 253, 212, 253, 12, 254, 114, 254, 221, 254, 42, 255, 70, 255, 51, 255, 5, 255, 216, 254, 198, 254, 225, 254, 45, 255, 157, 255, 19, 0, 115, 0, 173, 0, 203, 0, 232, 0, 28, 1, 102, 1, 172, 1, 204, 1, 185, 1, 139, 1, 110, 1, 129, 1, 188, 1, 246, 1, 8, 2, 240, 1, 204, 1, 194, 1, 216, 1, 241, 1, 226, 1, 160, 1, 70, 1, 1, 1, 234, 0, 247, 0, 7, 1, 7, 1, 4, 1, 24, 1, 81, 1, 154, 1, 205, 1, 209, 1, 177, 1, 141, 1, 132, 1, 151, 1, 170, 1, 157, 1, 97, 1, 2, 1, 152, 0, 60, 0, 1, 0, 247, 255, 40, 0, 138, 0, 250, 0, 66, 1, 54, 1, 205, 0, 46, 0, 152, 255, 52, 255, 2, 255, 223, 254, 166, 254, 81, 254, 252, 253, 196, 253, 173, 253, 154, 253, 104, 253, 17, 253, 181, 252, 131, 252, 148, 252, 212, 252, 24, 253, 63, 253, 78, 253, 104, 253, 172, 253, 24, 254, 137, 254, 214, 254, 243, 254, 245, 254, 4, 255, 61, 255, 159, 255, 23, 0, 139, 0, 241, 0, 78, 1, 171, 1, 14, 2, 112, 2, 190, 2, 226, 2, 203, 2, 128, 2, 24, 2, 189, 1, 140, 1, 142, 1, 174, 1, 204, 1, 211, 1, 199, 1, 184, 1, 176, 1, 168, 1, 145, 1, 98, 1, 36, 1, 238, 0, 211, 0, 211, 0, 222, 0, 232, 0, 239, 0, 252, 0, 23, 1, 67, 1, 121, 1, 
181, 1, 240, 1, 29, 2, 37, 2, 245, 1, 145, 1, 23, 1, 180, 0, 126, 0, 105, 0, 75, 0, 8, 0, 167, 255, 82, 255, 49, 255, 70, 255, 113, 255, 141, 255, 148, 255, 162, 255, 220, 255, 83, 0, 250, 0, 181, 1, 109, 2, 17, 3, 139, 3, 193, 3, 163, 3, 71, 3, 230, 2, 186, 2, 211, 2, 5, 3, 18, 3, 215, 2]
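The data field of the AudioFrame is again a byte array. With format "signed_16" and channel_count 1, each pair of bytes should be one sample. The Python sketch below decodes the first few bytes, assuming little-endian byte order. The message does not state the byte order, but the decoded values change smoothly from sample to sample, which is consistent with that assumption. The function name decode_signed_16 is made up for the example.

```python
import struct


def decode_signed_16(data):
    """Interpret a byte list as little-endian signed 16-bit samples."""
    raw = bytes(data)
    count = len(raw) // 2
    return list(struct.unpack("<{}h".format(count), raw[:count * 2]))


# First 8 bytes of the data field in the /odas/sss message.
first_bytes = [53, 0, 22, 255, 155, 254, 198, 254]
samples = decode_signed_16(first_bytes)
print(samples)
```

The sampling_frequency of 44100 and frame_sample_count of 256 match the fS and hopSize set for separated in the sss section of configuration.cfg.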

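Visualization

The two topics that can be read on the Jetson side, /odas/ssl_pcl2 and /odas/sst_poses, are viewed in RViz.

Start RViz,

set Fixed Frame to odas, then click the Add button -> By topic tab and select both.

It looks like this:

The red arrow is sst_poses. It is a little hard to see, but the flickering point cloud is ssl_pcl2.

The window visible in the lower left shows rostopic echo /odas/sst_poses.

The arrow tracks the direction of a moving sound source. If you play something like a news clip from YouTube on a smartphone and move it around the ReSpeaker, you can see the arrow follow it.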

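Open issues

〇 Make the messages loadable on the Jetson side (a configuration.cfg problem? The error text, "Are your messages built?", suggests instead that the odas_ros and audio_utils message packages have not been built on the Jetson)

〇 Investigate what the published data actually means

〇 Try visualizing all of the data

〇 During operation, publishing from the Raspberry Pi stops (it looks like an irregular event, but it seems to happen every time). Rather than an error, it feels as if data simply stops reaching the topics; figuratively, "something is clogged". With rostopic bw, the bandwidth drops partway through from around 10 K to a few tens of bytes. Relaunching restores it. Is there a fix?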
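Until the cause is found, one workaround would be to detect the stall automatically and relaunch. The Python sketch below covers only the detection logic: it records the size of each received message and reports a stall when the bandwidth over a recent window falls below a floor. The class name StallDetector, the 5-second window, and the 1000 bytes/s floor are all assumptions chosen to match the rostopic bw observation above; hooking it into a subscriber callback and triggering the relaunch are left out.

```python
from collections import deque


class StallDetector:
    """Flag a topic as stalled when its recent bandwidth falls below a floor."""

    def __init__(self, window_sec=5.0, min_bytes_per_sec=1000.0):
        self.window_sec = window_sec
        self.min_bytes_per_sec = min_bytes_per_sec
        self._events = deque()  # (timestamp, payload size in bytes)

    def record(self, timestamp, nbytes):
        """Call once per received message."""
        self._events.append((timestamp, nbytes))

    def is_stalled(self, now):
        """Return True if bandwidth over the last window is below the floor."""
        while self._events and self._events[0][0] < now - self.window_sec:
            self._events.popleft()
        total = sum(nbytes for _, nbytes in self._events)
        return total / self.window_sec < self.min_bytes_per_sec


detector = StallDetector()
for t in range(10):        # healthy: about 10 KB every second
    detector.record(float(t), 10000)
print(detector.is_stalled(now=10.0))   # False while data is flowing
for t in range(10, 20):    # degraded: a few tens of bytes per second
    detector.record(float(t), 40)
print(detector.is_stalled(now=20.0))   # True once only small messages remain
```

Measuring bandwidth rather than message count is deliberate here, because the observation was that the topic keeps producing a trickle of bytes instead of going completely silent.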

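A similar problem also occurs with the ODAS_web version from last time. It sometimes crashes partway through with a Segmentation fault.

Changing the sst mode in configuration.cfg from the Kalman filter (kalman) to the particle filter (particle) seems to make things slightly better. I do not really understand why. Perhaps the particle filter simply suits my environment better than kalman?

Tuning the parameters is difficult. Going along with configuration.cfg and the "You should leave this parameter" guidance, it stops easily. Adjust the values a little at a time.

Searching various sites did not turn up any decisive information.

It seems to be a case of "read the paper for more details".

About ODAS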

Appendix

【configuration.cfg】

version = "2.1";

# Raw

raw: 
{

    fS = 16000;
    hopSize = 512;
    nBits = 32;
    nChannels = 4; 

    # Input with raw signal from microphones
    interface: {    #"arecord -l" OR "aplay --list-devices" to see the devices
        type = "soundcard_name";
        devicename = "hw:CARD=3,DEV=0";
    }

}

# Mapping

mapping:
{
    map: (1, 2, 3, 4);
}

# General

general:
{

    epsilon = 1E-20;

    size:   #for fft calculation
    {
        hopSize = 128;      #shift size of the cross fft
        frameSize = 256;    #size of each fft of the cross fft
    };

    samplerate:
    {
        mu = 16000;
        sigma2 = 0.01;
    };

    speedofsound:
    {
        mu = 343.0;
        sigma2 = 25.0;
    };

    mics = (
        
        # Microphone 1
        { 
            mu = ( -0.0405, +0.0000, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        },

        # Microphone 2
        { 
            mu = ( +0.0000, +0.0405, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        },

        # Microphone 3
        { 
            mu = ( +0.0405, +0.0000, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        },

        # Microphone 4
        { 
            mu = ( +0.0000, -0.0405, +0.0000 ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 80.0, 90.0 );
        }

    );

    # Spatial filters to include only a range of direction if required
    # (may be useful to remove false detections from the floor, or
    # limit the space search to a restricted region)
    spatialfilters = (
        {
            direction = ( +0.000, +0.000, +1.000 );
            angle = (80.0, 90.0);
        }
    );


    nThetas = 181;
    gainMin = 0.25;

};

# Stationnary noise estimation

sne:
{
    
    b = 3;
    alphaS = 0.1;
    L = 150;
    delta = 3.0;
    alphaD = 0.1;

}

# Sound Source Localization

ssl:
{

    nPots = 2;
    nMatches = 10;
    probMin = 0.5;
    nRefinedLevels = 1;
    interpRate = 4;

    # Number of scans: level is the resolution of the sphere
    # and delta is the size of the maximum sliding window
    # (delta = -1 means the size is automatically computed)
    scans = (
        { level = 2; delta = -1; },
        { level = 4; delta = -1; }
    );

    # Output to export potential sources
    potential: {

        format = "json";

        interface: {
            type = "socket";
            ip = "192.168.0.34";
            port = 9002;
        };

        #format = "undefined";

        #interface: {
        #   type = "blackhole";
        #};

    };

};

# Sound Source Tracking

sst:
{

    # Mode is either "kalman" or "particle"

    mode = "particle";

    # Add is either "static" or "dynamic"

    add = "dynamic";

    # Parameters used by both the Kalman and particle filter

    active = (
        { weight = 1.0; mu = 0.3; sigma2 = 0.0025 }
    );

    inactive = (
        { weight = 1.0; mu = 0.15; sigma2 = 0.0025 }
    );

    sigmaR2_prob = 0.0025;
    sigmaR2_active = 0.0225;
    sigmaR2_target = 0.0025;
    Pfalse = 0.1;
    Pnew = 0.1;
    Ptrack = 0.8;

    theta_new = 0.9;
    N_prob = 5;
    theta_prob = 0.8;
    N_inactive = ( 150 );
    theta_inactive = 0.9;

    # Parameters used by the Kalman filter only

    kalman: {

        sigmaQ = 0.001; #default 0.001 #bigger=more reactive / smaller=more robust the noise
        
    };
   
    # Parameters used by the particle filter only

    particle: {

        nParticles = 1000;
        st_alpha = 2.0;
        st_beta = 0.04;
        st_ratio = 0.5;
        ve_alpha = 0.05;
        ve_beta = 0.2;
        ve_ratio = 0.3;
        ac_alpha = 0.5;
        ac_beta = 0.2;
        ac_ratio = 0.2;
        Nmin = 0.7;

    };

    target: ();

    # Output to export tracked sources
    tracked: {

        format = "json";

        interface: {
            type = "socket";
            ip = "192.168.0.34";
            port = 9000;
        };

    };

}

# Sound Source Separation

sss:
{
    
    # Mode is either "dds", "dgss" or "dmvdr"

    mode_sep = "dds"; #delay and sum
    mode_pf = "ms";

    gain_sep = 2.0;
    gain_pf = 10.0;

    dds: {

    };

    dgss: {

        mu = 0.01;
        lambda = 0.5;

    };

    dmvdr: {

    };

    ms: {

        alphaPmin = 0.07;
        eta = 0.5;
        alphaZ = 0.8;
        thetaWin = 0.3;
        alphaWin = 0.3;
        maxAbsenceProb = 0.9;
        Gmin = 0.01;
        winSizeLocal = 3;
        winSizeGlobal = 23;
        winSizeFrame = 256;

    };

    ss: {

        Gmin = 0.01;
        Gmid = 0.5;
        Gslope = 10.0;

    };

    separated: { #packaging and destination of the separated files

        fS = 44100;
        hopSize = 256;
        nBits = 16;
        
        #interface: {
        #    type = "blackhole";
        #};
        interface: {
            type = "socket";
            ip = "192.168.0.34";
            port = 9001;
        }        
        

    };

    postfiltered: { #packaging and destination of the post filtered files

        fS = 44100;
        hopSize = 256;
        nBits = 16;

        interface: {
            type = "blackhole";
            ip = "127.0.0.1";
            port = 9002;
        }
                

    };

}

classify:
{

    frameSize = 1024;
    winSize = 3;
    tauMin = 32;
    tauMax = 200;
    deltaTauMax = 7;
    alpha = 0.3;
    gamma = 0.01;
    phiMin = 0.5;
    r0 = 0.2;

    category: {
        format = "undefined";

        interface: {
            type = "blackhole";
        }
    }
}
