Detect who is talking in the room

Suppose there are multiple speakers in the room. Is there a way to detect the active speaker(s) so that, for example, we can show who is talking and bring their video up?

Technically this should be possible, since the RTP packet header can carry an extension indicating the audio level of that specific packet (the ssrc-audio-level extension from RFC 6464). But the server-side implementation is completely missing. It would be nice to include this feature in the roadmap.
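
For reference, reading that level server side would look roughly like this. This is only a sketch, not Licode code: getAudioLevel is a made-up name, packet is the raw RTP packet as a byte array, extId is the id negotiated via a=extmap ... urn:ietf:params:rtp-hdrext:ssrc-audio-level, and I assume the one-byte extension format from RFC 5285:

// Sketch: extract the ssrc-audio-level value (RFC 6464) from a raw RTP packet
function getAudioLevel(packet, extId) {
    const hasExtension = (packet[0] & 0x10) !== 0; // X bit in the first header byte
    if (!hasExtension) return null;
    const csrcCount = packet[0] & 0x0f;
    let offset = 12 + 4 * csrcCount;               // skip fixed header + CSRC list
    const profile = (packet[offset] << 8) | packet[offset + 1];
    if (profile !== 0xbede) return null;           // not the one-byte extension format
    const lengthWords = (packet[offset + 2] << 8) | packet[offset + 3];
    offset += 4;
    const end = offset + lengthWords * 4;
    while (offset < end) {
        const header = packet[offset];
        if (header === 0) { offset++; continue; }  // padding byte
        const id = header >> 4;
        const len = (header & 0x0f) + 1;
        if (id === extId) {
            // First data byte: V flag (bit 7) + 7-bit level in -dBov
            // (0 = loudest, 127 = silence)
            return packet[offset + 1] & 0x7f;
        }
        offset += 1 + len;
    }
    return null;
}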

Hey,
I solved this by using the uncompressed erizo.js and adding analyser code to the init function. I use the Web Audio API to get the current audio level. I also added a gain node there, which makes muting clients very easy and smooth.

The code:

that.init = (gainNodeCallback) => { // Added a callback so the app can grab the gain node and mute itself

[...]

that.Connection.GetUserMedia(opt, (stream) => {
        // navigator.webkitGetUserMedia("audio, video", (stream) => {

        if (spec.audio) {

            // INJECTED CODE < Start

            var AudioContextClass = window.AudioContext || window.webkitAudioContext;
            var context = new AudioContextClass();
            var microphone = context.createMediaStreamSource(stream);
            var dest = context.createMediaStreamDestination();

            // A gain node in the chain gives us smooth muting (gain.value = 0)
            var gainNode = context.createGain();

            // The analyser samples the time-domain waveform to estimate volume
            var analyser = context.createAnalyser();
            analyser.fftSize = 2048;
            var bufferLength = analyser.frequencyBinCount;
            var dataArray = new Uint8Array(bufferLength);

            var audioVolume = 0;
            var oldAudioVolume = 0;
            function calcVolume() {
                requestAnimationFrame(calcVolume);
                analyser.getByteTimeDomainData(dataArray);
                // Byte time-domain samples are centered on 128 (silence), so the
                // mean absolute deviation from 128 approximates the volume
                var mean = 0;
                for (var i = 0; i < dataArray.length; i++) {
                    mean += Math.abs(dataArray[i] - 128);
                }
                mean /= dataArray.length;
                mean = Math.round(mean);
                // Quantize into three coarse levels: 0 = silence, 1 = medium, 2 = loud
                if (mean < 2)
                    audioVolume = 0;
                else if (mean < 5)
                    audioVolume = 1;
                else
                    audioVolume = 2;

                // Only report changes so we don't flood the signaling channel
                if (audioVolume !== oldAudioVolume) {
                    sendAudioVolume(audioVolume); // Call the global function with the current audio level
                    oldAudioVolume = audioVolume;
                }
            }
            calcVolume();

            // Wire the graph: microphone -> gain (mute) -> analyser (measure) -> destination
            microphone.connect(gainNode);
            gainNode.connect(analyser);
            analyser.connect(dest);

            // dest.stream only carries the processed audio, so keep the original
            // video tracks when video is enabled
            stream.getVideoTracks().forEach((track) => {
                dest.stream.addTrack(track);
            });
            that.stream = dest.stream;
            if (gainNodeCallback) {
                gainNodeCallback(gainNode);
            }
        } else {
            that.stream = stream;
        }

        // INJECTED CODE < END

        __WEBPACK_IMPORTED_MODULE_4__utils_Logger__["a" /* default */].info('User has granted access to local media.');
        // that.stream = stream;
[...]
}
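
With the callback in place, muting yourself is just a matter of zeroing the gain. A minimal usage sketch (localStream and setMuted are assumed names, not Licode API):

var myGainNode;
localStream.init((gainNode) => {
    myGainNode = gainNode;
});

// Smooth mute/unmute without touching the underlying track
function setMuted(muted) {
    if (myGainNode) {
        myGainNode.gain.value = muted ? 0 : 1;
    }
}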

The injected code will call the global function “sendAudioVolume” with 0 for silence, 1 for a medium audio level, and 2 for a high audio level.
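
sendAudioVolume itself isn’t shown above. A minimal version could relay the level to the other participants over Licode’s data channel, assuming the local stream was created with data: true (the message shape and updateSpeakerIndicator are made up):

// Relay the current level to everyone subscribed to our stream
function sendAudioVolume(level) {
    localStream.sendData({ type: 'audio-level', level: level });
}

// On the receiving side, each remote stream reports its speaker's level
remoteStream.addEventListener('stream-data', (evt) => {
    if (evt.msg.type === 'audio-level') {
        updateSpeakerIndicator(remoteStream.getID(), evt.msg.level);
    }
});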
Not an easy solution, but it works for me :wink:

Interesting! I’ll give it a try. Thanks for sharing. :pray:

We also implemented the “same” approach client side to detect speaking and send speaking-activity events to the server.
This is not an optimal solution, since it relies on a client-side calculation, but it is enough until we finish implementing the server-side one I described in the post above; a relay sketch follows below.
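
Until then, a plain relay on the signaling side does the job. A generic sketch with socket.io (this is not Licode’s API, just an illustration; event names are made up):

// Generic Node.js relay: clients emit their own speaking state and
// the server rebroadcasts it to the rest of the room
const io = require('socket.io')(3000);

io.on('connection', (socket) => {
    socket.on('join', (roomId) => socket.join(roomId));

    socket.on('speaking', ({ roomId, streamId, level }) => {
        // Tell everyone else in the room who is talking and how loudly
        socket.to(roomId).emit('active-speaker', { streamId, level });
    });
});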

Of course a server-side implementation is preferable. Looking forward to such a great feature :thumbsup:

Any updates on this?