Efficient buffer allocation for current and predicted active...

Telephonic communications – Special services – Conferencing

Reexamination Certificate

Details

C370S260000, C370S267000, C709S204000

active

06728358

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to computer-based telephony networks and more particularly to servers that manage telephony conferencing.
2. Related Art
In today's technological environment, there exist many ways for several people who are in multiple geographic locations to communicate with one another simultaneously. One such way is audio conferencing. Audio conferencing applications serve the needs of both business users (e.g., a national sales force meeting) and leisure users (e.g., audio chat room participants) who are geographically distributed.
Traditional audio conferencing involved a central conferencing server which hosted an audio conference. Participants would use their telephones and dial in to the conferencing server over the Public Switched Telephone Network (PSTN) (also called the Plain Old Telephone System (POTS)).
In recent years, the possibility of transmitting voice (i.e., audio) over the worldwide public Internet has been recognized. As will be appreciated by those skilled in the relevant art(s), the connectivity achieved by the Internet is based upon a common protocol suite utilized by those computers connecting to it. Part of the common protocol suite is the Internet Protocol (IP), defined in Internet Standard (STD) 5, Request for Comments (RFC) 791 (Internet Architecture Board). IP is a network-level, packet (i.e., a unit of transmitted data) switching protocol.
Transmitting voice over IP (VoIP) began with computer scientists experimenting with exchanging voice using personal computers (PCs) equipped with microphones, speakers, and sound cards. VoIP has further developed with the adoption of the H.323 Internet Telephony Standard, developed by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), and the Session Initiation Protocol (SIP), developed within the Internet Engineering Task Force (IETF) Multiparty Multimedia Session Control (MMUSIC) Working Group.
Conferencing servers (also called multipoint control units (MCUs)) were developed to host audio conferences where participants are connected to a central MCU using PC-based equipment and the Internet, or using a telephone through a gateway, rather than traditional telephone equipment over the PSTN.
One common problem, however, exists in both MCUs that support Internet-based telephony and conferencing servers that support traditional PSTN-based telephony. This problem is now described (with conferencing servers and MCUs being referred to generally herein as MCUs).
MCUs, in general, enable multipoint communications between two or more participants in a voice conference. An MCU may support many conferences at one time, each of which has many participants. Each participant in a given conference will hear a mix of up to n active speakers, except for the active speakers themselves, who hear the mix minus themselves (this is, in essence, an “echo suppression” function so that a party will not “hear themselves speak” during the audio conference). For ease of explanation, and as will be appreciated by those skilled in the relevant art(s), the module in an MCU that does the active speaker detection, mixing or multiplexing, switching and streaming of the audio is referred to herein as the “Mixer.”
In the case where the Mixer needs to do mixing of multiple audio streams or accept different packet sizes from different participants, the Mixer needs a buffer (i.e., a memory storage area) in which to receive audio data. This buffer may be large if it also needs to accommodate jitter (i.e., random variation in packet arrival times). From a memory standpoint, it would be most efficient to assign buffers only to the active speakers rather than to all participants in a conference, and to reassign the buffers as the active speakers change. However, there is a drawback to collecting data only for the active speakers. Often, the active speaker update event within a Mixer does not detect a new active speaker until enough “loud” packets have gone by to trigger the selection of the speaker as a new active speaker. This can cause the first word to be partially lost in the new active speaker's audio stream.
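The "mix minus themselves" behavior can be illustrated with a minimal sketch. It assumes each active speaker contributes one frame of equally sized integer PCM samples; the function name `mix_minus` and the dictionary layout are hypothetical and are not taken from the specification.

```python
def mix_minus(active_frames, listener):
    """Return the mix heard by `listener`.

    active_frames maps a speaker id to that speaker's current frame
    (a list of integer samples). The listener's own frame, if present,
    is left out of the sum so that active speakers do not hear themselves.
    """
    frames = [frame for speaker, frame in active_frames.items()
              if speaker != listener]
    if not frames:
        return []
    # Sum sample by sample; a real mixer would also clip or normalize.
    return [sum(samples) for samples in zip(*frames)]


active = {"alice": [1, 2, 3], "bob": [10, 20, 30]}
print(mix_minus(active, "alice"))  # alice hears only bob
print(mix_minus(active, "carol"))  # a non-active listener hears both
```

A non-active participant receives the full mix, while each active speaker receives the mix without their own contribution.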
Therefore, given the above, what is needed is a method and computer program product for the efficient allocation of buffers for current and predicted active speakers in voice conferencing systems.
SUMMARY OF THE INVENTION
The present invention is directed to a method and computer program product for the efficient allocation of first-in, first-out (FIFO) buffers (i.e., queues) for current and predicted active speakers in voice conferencing systems, which meets the above-identified needs.
The method and computer program product of the present invention receive a packet from a speaker participating in a conference, wherein the speaker is not currently designated as an “active” speaker nor as a “predicted active” speaker. Then, a first test is applied to determine whether the speaker should now be designated as a “predicted active” speaker. The test is a comparison between the energy measurement of the packet (or the speaker's energy averaged over some pre-determined time period and including such packet) and any one of numerous possible functions of the energies of the current “active” or “predicted active” speakers. The method and computer program product of the present invention discard the packet when the packet fails the first test. If the packet passes the first test, the steps described below are performed.
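As a rough illustration of the first test, the sketch below computes a packet's energy and compares it against a function of the tracked speakers' energies. The specification leaves that function open ("any one of numerous possible functions"), so the choices here are assumptions: energy is taken as the mean square of the samples, the reference is the minimum tracked energy, and `factor` is a hypothetical tuning parameter.

```python
def packet_energy(samples):
    """Mean-square energy of a packet of PCM samples (assumed measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def passes_first_test(energy, tracked_energies, factor=0.5):
    """Assumed rule: pass when the energy reaches `factor` times the
    weakest energy among current active or predicted active speakers.
    With no tracked speakers there is nothing to compare against,
    so the packet passes."""
    if not tracked_energies:
        return True
    return energy >= factor * min(tracked_energies)


energy = packet_energy([3, 4])  # (9 + 16) / 2
print(energy, passes_first_test(energy, [20.0, 40.0]))
```

A packet that fails this check would be discarded; one that passes proceeds to the buffer allocation steps.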
First, a determination is made as to whether there is an unallocated buffer from among a set of p “predicted active” speaker buffers. If so, the packet is stored in the unallocated buffer. If not, a determination is made, by using a second test on the packet, whether the speaker should now be designated as a “predicted active” speaker, thereby replacing a current predicted active speaker using one of the set of p “predicted active” speaker buffers. The second test, like the first, is a comparison between the energy measurement of the packet (or the speaker's energy averaged over some pre-determined time period including such packet) and any one of numerous possible functions of the energies of the current “active” or “predicted active” speakers, although with a higher threshold than the first test.
Next, the packet is discarded if it fails the second test. If it passes the second test, a buffer from the set of p “predicted active” speaker buffers that can be reassigned is identified and the packet is then stored in the identified buffer. At this point the speaker is considered a “predicted active speaker” and data received from that speaker will be received into their predicted active speaker buffer.
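The two-test flow described above can be sketched as a small buffer pool. This is an illustration under stated assumptions, not the claimed implementation: the class name `PredictedBufferPool`, both threshold factors, the use of the weakest tracked energy as the comparison function, and the choice of the lowest-energy predicted speaker as the buffer to reassign are all hypothetical.

```python
class PredictedBufferPool:
    """Pool of p buffers for "predicted active" speakers (illustrative)."""

    def __init__(self, p, first_factor=0.5, second_factor=1.0):
        if p < 1:
            raise ValueError("p must be at least 1")
        self.p = p
        self.first_factor = first_factor    # assumed first-test threshold
        self.second_factor = second_factor  # assumed higher second-test threshold
        self.buffers = {}  # speaker id -> list of buffered packets
        self.energy = {}   # speaker id -> most recent packet energy

    def _reference(self, active_energies):
        # Assumed comparison function: the weakest tracked speaker's energy.
        tracked = list(active_energies) + list(self.energy.values())
        return min(tracked) if tracked else 0.0

    def _assign(self, speaker, packet, energy):
        self.buffers[speaker] = [packet]
        self.energy[speaker] = energy

    def offer(self, speaker, packet, energy, active_energies=()):
        """Handle a packet from a non-active speaker.

        Returns "stored" or "discarded".
        """
        if speaker in self.buffers:
            # Already a predicted active speaker: keep collecting data.
            self.buffers[speaker].append(packet)
            self.energy[speaker] = energy
            return "stored"
        ref = self._reference(active_energies)
        if energy < self.first_factor * ref:
            return "discarded"                  # fails the first test
        if len(self.buffers) < self.p:
            self._assign(speaker, packet, energy)  # unallocated buffer found
            return "stored"
        if energy < self.second_factor * ref:
            return "discarded"                  # fails the second test
        # Assumed reassignment policy: evict the quietest predicted speaker.
        victim = min(self.energy, key=self.energy.get)
        del self.buffers[victim]
        del self.energy[victim]
        self._assign(speaker, packet, energy)
        return "stored"


pool = PredictedBufferPool(p=1)
print(pool.offer("a", b"x", 10.0, active_energies=[8.0]))  # free buffer
print(pool.offer("b", b"y", 5.0, active_energies=[8.0]))   # fails second test
print(pool.offer("d", b"w", 12.0, active_energies=[8.0]))  # replaces "a"
```

Because the second threshold is higher than the first, a newcomer only displaces an existing predicted active speaker when it is clearly louder, which limits buffer churn.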
Once that speaker becomes an “active speaker,” some of the data from their predicted active speaker buffer will be used as their active speaker data. (One way of doing this is to make that speaker's predicted active speaker buffer an active speaker buffer.) In an embodiment, the portion of the data used is equal to M-J packets, where M is a pre-determined desired jitter buffer depth and J is the current jitter buffer depth, assuming M>J. If M≦J, none (i.e., zero packets) of the data from that speaker's predicted active speaker buffer is used. This minimizes the loss of audio data for speakers as they switch from “non-active” to “active” status and ensures that the delay introduced by first using the speaker's data that has been saved into their predicted active speaker buffer is never more than the desired jitter buffer depth M.
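The M-J rule can be expressed as a short sketch. The function name `packets_to_promote` is hypothetical, and taking the most recent M-J packets (rather than some other portion of the buffer) is an assumption; the text above only fixes how many packets are used.

```python
def packets_to_promote(predicted_buffer, desired_depth, current_depth):
    """Select packets from a predicted active speaker buffer on promotion.

    desired_depth is M (the desired jitter buffer depth) and current_depth
    is J (the current jitter buffer depth). When M > J, M - J packets are
    used; otherwise none are used.
    """
    count = desired_depth - current_depth
    if count <= 0:
        return []
    # Assumption: the newest packets are kept. If fewer than M - J packets
    # are buffered, the slice simply returns everything available.
    return predicted_buffer[-count:]


print(packets_to_promote([1, 2, 3, 4, 5], desired_depth=4, current_depth=1))
print(packets_to_promote([1, 2], desired_depth=2, current_depth=3))
```

Since at most M-J packets are carried over, the promoted speaker's jitter buffer never exceeds the desired depth M.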
An advantage of the present invention is that it minimizes the loss of audio data for speakers as they switch from “non-active” to “active” status by collecting audio data from those speakers before they are actually active. This is done in a memory efficient manner and without introducing additional delay.
Another advantage of the present invention is that it provides a method of predicting future active speakers to limit the amount of non-active speaker dat
