Implementing a SIP client with eXosip and the ffmpeg/ffplay command lines


Preface

When making video calls over SIP, you may need to use an IP camera as the video source. From what I found, doing this with pjsip usually means modifying its source code. pjsip is feature-complete, but it is also rather heavy, and much of its functionality is not needed here. That led to an idea: as long as a library such as eXosip handles the SIP signalling, audio and video can be implemented separately. For example, first test audio and video with the ffmpeg and ffplay command lines, and only write code once that works. This article describes a setup that was tested successfully this way. The truly flexible approach is to call ffmpeg from code; this article is mainly about presenting the idea.


1. Key implementation

The main steps are: use eXosip to handle SIP, parse the SDP yourself, and use the ffmpeg and ffplay command lines to push and pull the media streams.

1. Main process

[Figure: main process flow diagram]

2. Resolve port conflicts

(1) Cause

Following the process above leads to a port conflict. For each media stream, pushing and pulling must use the same local UDP port, because the SIP server checks where the packets come from. ffmpeg and ffplay are two separate processes that both need to bind that port, so they conflict.
[Figure: ffmpeg and ffplay both binding the same local UDP port]

(2) Solution

The usual solution is to use a library such as jrtplib to create a single RTP session that both sends and receives, and to implement the media handling in code with the ffmpeg libraries. This article does not do that. To keep using the ffmpeg and ffplay command lines, a UDP proxy listens on the port and forwards the data, which resolves the port conflict.

[Figure: UDP proxy forwarding data between the local ffmpeg/ffplay processes and the remote peer]
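
The forwarding rule behind such a proxy can be sketched roughly as follows. This is only an illustration under stated assumptions, not the article's UdpProxy class, which is not shown: the names Endpoint, ProxyRoute and routeDatagram are hypothetical, and the sketch assumes the proxy owns the single negotiated port and identifies senders by their exact IP and port.

```cpp
#include <string>

// Hypothetical types for illustration; the article's UdpProxy class is not shown.
struct Endpoint {
    std::string ip;
    int port;
    bool operator==(const Endpoint& other) const {
        return ip == other.ip && port == other.port;
    }
};

struct ProxyRoute {
    Endpoint remote;  // peer address taken from the remote SDP
    Endpoint pusher;  // address the local ffmpeg process sends from
    Endpoint player;  // address the local ffplay process listens on
};

// Decides where a datagram received on the proxy's port should be forwarded.
// Returns false when the sender is unknown and the datagram should be dropped.
bool routeDatagram(const ProxyRoute& route, const Endpoint& from, Endpoint& to) {
    if (from == route.pusher) {  // outgoing media from ffmpeg goes to the peer
        to = route.remote;
        return true;
    }
    if (from == route.remote) {  // incoming media from the peer goes to ffplay
        to = route.player;
        return true;
    }
    return false;
}
```

Because only the proxy binds the negotiated port, ffmpeg and ffplay can each use their own local port. A real proxy would wrap this rule in a recvfrom/sendto loop on one UDP socket.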

3. Parse SDP

Although eXosip provides functions for obtaining the SDP, you still have to extract the specific fields yourself, which is fairly simple.

(1) Define entities

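// Stream type
enum StreamType {
	STREAMTYPE_VIDEO,
	STREAMTYPE_AUDIO
};
/// <summary>
/// Stream information
/// </summary>
class StreamInfo {
public:
	// Stream type
	StreamType type;
	// RTP push address; ffmpeg can push directly to this address, or the fields below can be used to build a custom push
	char rtpAdress[128] = { 0 };
	// Remote address of the stream
	char remoteIp[32] = { 0 };
	// Remote port of the stream
	int remotePort = 0;
	// Local receive/send port
	int localPort = 0;
	// Codec name
	char codec[16] = { 0 };
	// Payload type
	int payload = 0;
	union
	{
		// Sample rate (audio)
		int sampleRate = 0;
		// Time base (video)
		int timebase;
	};
	// Number of channels
	int channels = 0;
};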

(2) Parse video

std::vector<StreamInfo> SipUA::_getVideoStreams(sdp_message_t* sdp_msg)
{
	std::vector<StreamInfo> streams;
	if (!sdp_msg)
		return streams;
	sdp_connection_t* connection = eXosip_get_video_connection(sdp_msg);
	if (!connection)
		return streams;
	std::string ip = connection->c_addr; // remote video IP
	sdp_media_t* sdp = eXosip_get_video_media(sdp_msg);
	if (!sdp)
		return streams;
	int port = atoi(sdp->m_port); // remote video port
	for (int i = 0; i < sdp->a_attributes.nb_elt; i++)
	{
		sdp_attribute_t* attr = (sdp_attribute_t*)osip_list_get(&sdp->a_attributes, i);
		if (attr)
		{
			std::string field = attr->a_att_field;
			if (field == "rtpmap")
			{
				StreamInfo stream;
				stream.type = StreamType::STREAMTYPE_VIDEO;
				// Pass the value through "%s" so it is never interpreted as a format string
				snprintf(stream.remoteIp, sizeof(stream.remoteIp), "%s", ip.c_str());
				stream.remotePort = port;

				// The value looks like "96 H264/90000"
				std::string value = attr->a_att_value;

				std::string::size_type pt_idx = value.find_first_of(' ');
				if (pt_idx == std::string::npos)
					continue;
				stream.payload = atoi(value.substr(0, pt_idx).c_str());
				std::string::size_type bitrate_idx = value.find_first_of('/');
				if (bitrate_idx == std::string::npos)
					continue;
				stream.timebase = atoi(value.substr(bitrate_idx + 1).c_str());
				// Use the real buffer size (16); the original passed 32 and could overflow
				snprintf(stream.codec, sizeof(stream.codec), "%s", value.substr(pt_idx + 1, bitrate_idx - pt_idx - 1).c_str());
				streams.push_back(stream);
			}
		}
	}
	return streams;
}

(3) Parse audio

std::vector<StreamInfo> SipUA::_getAudioStreams(sdp_message_t* sdp_msg)
{
	std::vector<StreamInfo> streams;
	if (!sdp_msg)
		return streams;
	sdp_connection_t* connection = eXosip_get_audio_connection(sdp_msg);
	if (!connection)
		return streams;
	std::string audio_ip = connection->c_addr; // remote audio IP
	sdp_media_t* audio_sdp = eXosip_get_audio_media(sdp_msg);
	if (!audio_sdp)
		return streams;
	int audio_port = atoi(audio_sdp->m_port); // remote audio port
	for (int i = 0; i < audio_sdp->a_attributes.nb_elt; i++)
	{
		sdp_attribute_t* attr = (sdp_attribute_t*)osip_list_get(&audio_sdp->a_attributes, i);
		if (attr)
		{
			std::string field = attr->a_att_field;
			if (field == "rtpmap")
			{
				StreamInfo stream;
				stream.type = StreamType::STREAMTYPE_AUDIO;
				// Pass the value through "%s" so it is never interpreted as a format string
				snprintf(stream.remoteIp, sizeof(stream.remoteIp), "%s", audio_ip.c_str());
				stream.remotePort = audio_port;
				// The value looks like "0 PCMU/8000" or "101 opus/48000/2"
				std::string value = attr->a_att_value;
				auto strs = StringHelper::split(value, " ");
				if (strs.size() > 1)
				{
					stream.payload = atoi(strs[0].c_str());
					auto format = StringHelper::split(strs[1], "/");
					if (format.size() > 1)
					{
						snprintf(stream.codec, sizeof(stream.codec), "%s", format[0].c_str());
						stream.sampleRate = atoi(format[1].c_str());
						if (format.size() > 2)
							stream.channels = atoi(format[2].c_str());
					}
				}
				streams.push_back(stream);
			}
		}
	}
	return streams;
}
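
Both functions above split the rtpmap value in the same way, so the parsing step can be tried out on its own without eXosip. The following is a minimal standalone sketch, not code from the project: RtpMap and parseRtpMap are hypothetical names, and the sketch assumes the value has the form "<payload> <name>/<rate>[/<channels>]".

```cpp
#include <cstdlib>
#include <string>

// Hypothetical result type for illustration; it mirrors the related StreamInfo fields.
struct RtpMap {
    int payload = 0;
    std::string codec;
    int clockRate = 0;
    int channels = 0;  // stays 0 when the optional third field is absent
};

// Parses the value of an "a=rtpmap:" attribute, for example "96 H264/90000"
// or "101 opus/48000/2". Returns false if the value does not have that shape.
bool parseRtpMap(const std::string& value, RtpMap& out) {
    const std::string::size_type npos = std::string::npos;
    const std::string::size_type space = value.find(' ');
    if (space == npos || space == 0) return false;
    const std::string::size_type slash = value.find('/', space + 1);
    if (slash == npos || slash == space + 1) return false;
    const std::string::size_type slash2 = value.find('/', slash + 1);

    out.payload = std::atoi(value.substr(0, space).c_str());
    out.codec = value.substr(space + 1, slash - space - 1);
    out.clockRate = std::atoi(
        value.substr(slash + 1, slash2 == npos ? npos : slash2 - slash - 1).c_str());
    out.channels = (slash2 == npos) ? 0 : std::atoi(value.substr(slash2 + 1).c_str());
    return out.clockRate > 0;
}
```

Returning false for malformed values lets the caller skip the attribute instead of storing a half-filled entry.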

4. Pushing and pulling streams from the command line

(1) Video push

For example, forward an existing H.264 stream (here an RTMP source) as RTP. A local SDL preview window is shown while the stream is being pushed; the last argument is the title of the preview window.

ffmpeg -i rtmp://127.0.0.1/live/a123 -an -vcodec copy -payload_type 96 -f rtp rtp://127.0.0.1:25026?localrtpport=15514 -window_size 192x108 -f sdl "preview"

(2) Audio streaming

Taking a local file transcoded to G.711 μ-law as an example: each packet carries 160 bytes, which is 20 ms of audio at 8000 Hz with one byte per sample.

ffmpeg -re -stream_loop -1 -i D:\test_music.wav -vn -acodec pcm_mulaw -ar 8000 -ac 1 -af "aresample=8000[0];[0]asetnsamples=n=160:p=0" -payload_type 0 -f rtp rtp://127.0.0.1:15026?localrtpport=25514

Capturing from an audio device and encoding to G.711 μ-law, again with 160 bytes per packet:

ffmpeg -f dshow -i audio="audio device name" -vn -acodec pcm_mulaw -ar 8000 -ac 1 -af "aresample=8000[0];[0]asetnsamples=n=160:p=0" -payload_type 0 -f rtp rtp://127.0.0.1:15026?localrtpport=25514

Note: If audio and video are from the same input source, they can also be combined into the same command.
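
As a sketch of such a combined command: ffmpeg accepts several outputs in one invocation, and the options written before each output URL apply to that output only. The helper below builds one command string with a video output and an audio output. buildCombinedPushCommand is a hypothetical name, and the option values are simply copied from the separate commands above.

```cpp
#include <string>

// Hypothetical helper for illustration: builds a single ffmpeg command that
// pushes video and audio from the same input to two separate RTP outputs.
std::string buildCombinedPushCommand(const std::string& input,
                                     const std::string& videoRtpUrl, int videoPayload,
                                     const std::string& audioRtpUrl, int audioPayload) {
    std::string cmd = "ffmpeg -re -i \"" + input + "\"";
    // First output: video only, copied without re-encoding
    cmd += " -an -vcodec copy -payload_type " + std::to_string(videoPayload) +
           " -f rtp \"" + videoRtpUrl + "\"";
    // Second output: audio only, G.711 mu-law at 8000 Hz mono, 160 samples per packet
    cmd += " -vn -acodec pcm_mulaw -ar 8000 -ac 1"
           " -af \"aresample=8000[0];[0]asetnsamples=n=160:p=0\""
           " -payload_type " + std::to_string(audioPayload) +
           " -f rtp \"" + audioRtpUrl + "\"";
    return cmd;
}
```

With the addresses used above, the call would be buildCombinedPushCommand("test.mp4", "rtp://127.0.0.1:25026?localrtpport=15514", 96, "rtp://127.0.0.1:15026?localrtpport=25514", 0). Whether one process or two is preferable depends on whether the two streams need to be restarted independently.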

(3) Audio and video playback

Save the SDP string to a local file and play that file locally. The SDP looks like this:

v=0
o=1002 158 1 IN IP4 127.0.0.1
s=Talk
c=IN IP4 127.0.0.1
t=0 0
m=video 25008 RTP/AVP 96
a=rtpmap:96 H264/90000
a=rtcp:25008
m=audio 25310 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=rtcp:25310

Save it to test.sdp:

FILE* f = NULL;
fopen_s(&f, "test.sdp", "wb");
if (f)
{
	fwrite(call->sdp, 1, strlen(call->sdp), f);
	fclose(f);
}

Play it from the command line:

ffplay.exe -x 640 -y 360 -protocol_whitelist "file,udp,rtp" -i test.sdp
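
In the article's code the playback SDP is supplied by SipUA through call->sdp. To make clear what that file has to contain, here is a minimal sketch that builds an equivalent description from a list of local ports. LocalMedia and buildPlaybackSdp are hypothetical names, the o= line is a generic placeholder, and the a=rtcp lines from the sample above are left out for brevity.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one media line; not a type from the article's code.
struct LocalMedia {
    std::string kind;   // "video" or "audio"
    int port;           // local UDP port that ffplay should listen on
    int payload;        // RTP payload type
    std::string codec;  // encoding name, for example "H264" or "PCMU"
    int clockRate;      // RTP clock rate, for example 90000 or 8000
};

// Builds a minimal SDP description telling ffplay which local ports to listen on.
std::string buildPlaybackSdp(const std::vector<LocalMedia>& media) {
    std::string sdp =
        "v=0\r\n"
        "o=- 0 0 IN IP4 127.0.0.1\r\n"
        "s=Talk\r\n"
        "c=IN IP4 127.0.0.1\r\n"
        "t=0 0\r\n";
    for (const LocalMedia& m : media) {
        const std::string pt = std::to_string(m.payload);
        sdp += "m=" + m.kind + " " + std::to_string(m.port) + " RTP/AVP " + pt + "\r\n";
        sdp += "a=rtpmap:" + pt + " " + m.codec + "/" + std::to_string(m.clockRate) + "\r\n";
    }
    return sdp;
}
```

The connection address is 127.0.0.1 because ffplay receives the media from the local UDP proxy rather than directly from the remote peer.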

2. SipUA interface design

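#pragma once
#include <functional>
#include <string>
#include <vector>
#include "UdpProxy.h"
#include <eXosip2/eXosip.h>
#include "MessageQueue.h"

/// This is a SIP UA implemented internally with eXosip2. It only provides SIP signalling, SDP parsing and a UDP proxy.
/// Why the UDP proxy separates ports:
/// Each m= media line in the SDP needs a single port for both pushing and pulling, because the SIP server checks the packet source.
/// If ffmpeg.exe pushes and ffplay.exe pulls, both processes have to bind the same local port, which causes a port conflict.
/// The usual way out is a library such as jrtplib, which opens one connection that both sends and receives.
/// A neat alternative is a UDP proxy that forwards the data, so that the single port is extended into several.

/// <summary>
/// SIP state
/// </summary>
enum SipUAState {
	// Received an INVITE from the peer
	SIPUAEVENT_INVITE,
	// Received the peer's answer
	SIPUAEVENT_ANSWER,
	// Handle media streams; push and pull ports are separated so that pushing and pulling can be implemented independently.
	SIPUAEVENT_STREAM,
	// Call ended, the peer hung up
	SIPUAEVENT_ENDED,
};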


/// <summary>
/// Stream type
/// </summary>
enum StreamType {
	STREAMTYPE_VIDEO,
	STREAMTYPE_AUDIO
};

/// <summary>
/// Stream information
/// </summary>
class StreamInfo {
public:
	// Stream type
	StreamType type;
	// RTP push address; ffmpeg can push directly to this address, or the fields below can be used to build a custom push
	char rtpAdress[128] = { 0 };
	// Remote address of the stream
	char remoteIp[32] = { 0 };
	// Remote port of the stream
	int remotePort = 0;
	// Local receive/send port
	int localPort = 0;
	// Codec name
	char codec[16] = { 0 };
	// Payload type
	int payload = 0;
	union
	{
		// Sample rate (audio)
		int sampleRate = 0;
		// Time base (video)
		int timebase;
	};
	// Number of channels
	int channels = 0;
};

/// <summary>
/// Call object
/// </summary>
class SipCall {
public:
	int callId = 0;
	// Peer id
	const char* userId = nullptr;
	// SDP used for playback
	const char* sdp = nullptr;
	// Information about the video stream to push
	StreamInfo* video = nullptr;
	// Information about the audio stream to push
	StreamInfo* audio = nullptr;
};
class SipUA
{
public:
	/// <summary>
	/// State-change callback. In the current version, apart from media streams, only messages from the peer trigger a state change.
	/// </summary>
	std::function<void(SipUAState state, SipCall* call)> onState = [](auto, auto) {};
	SipUA(const std::string& serverIp, int serverPort, const std::string& username, const std::string& password);
	~SipUA();
	/// <summary>
	/// Starts the client. This method blocks, so it can be run in a thread.
	/// </summary>
	/// <param name="exitFlag">Exit flag; a value of true makes the method return</param>
	void exec(int* exitFlag);
	/// <summary>
	/// Makes a call
	/// </summary>
	/// <param name="remoteUserID">Peer id</param>
	/// <param name="hasVideo">Whether to include video</param>
	/// <param name="hasAudio">Whether to include audio</param>
	/// <returns>Whether the call was placed successfully</returns>
	bool call(const std::string& remoteUserID, bool hasVideo = true, bool hasAudio = true);
	/// <summary>
	/// Answers a call
	/// </summary>
	/// <param name="hasVideo">Whether to include video</param>
	/// <param name="hasAudio">Whether to include audio</param>
	void answer(bool hasVideo, bool hasAudio);
	/// <summary>
	/// Hangs up
	/// </summary>
	void hangup();
};


3. Usage examples

#include <cstdio>
#include <string>
#include <thread>
#include <windows.h> // Sleep
// The SipUA header must be included as well. StringHelper, runCmd and closeJobObject
// are helpers from the author's project and are not shown in this article.

/// <summary>
/// This example dials automatically after start-up
/// and answers incoming calls automatically
/// </summary>
int main() {
	SipUA ua("192.168.1.10", 5060, "1002", "1234");
	int exitFlag = 0;
	ua.onState = [&](SipUAState state, SipCall* call) {
		switch (state)
		{
		case SIPUAEVENT_INVITE:
			ua.answer(true, true);
			break;
		case SIPUAEVENT_ANSWER:

			break;
		case SIPUAEVENT_STREAM:

			// Push video
			if (call->video)
			{
				std::string srcUrl = "test.mp4";
				std::string format = "-re -stream_loop -1";
				auto codec = StringHelper::toLower(call->video->codec);
				std::string params = "";
				char cmd[512];
				if (codec == "h264")
				{
					params = "-preset ultrafast -tune zerolatency -level 4.2";
				}
				// Push the video stream and preview it locally with SDL at the same time
				sprintf_s(cmd, "ffmpeg %s  -i %s  -an -vcodec %s -pix_fmt yuv420p %s  -s 640x360   -b:v 500k  -r 30   -g 10   -payload_type %d   -f rtp %s -window_size 192x108 -f sdl \"%s\"  ",
					format.c_str(), srcUrl.c_str(), codec.c_str(), params.c_str(), call->video->payload, call->video->rtpAdress, srcUrl.c_str());
				// Run the command line
				runCmd(cmd);
			}
			// Push audio; if audio and video share the same input, this can be merged with the video command
			if (call->audio)
			{
				std::string srcUrl = "test_music.wav";
				std::string format = "-re -stream_loop -1";
				auto codec = StringHelper::toLower(call->audio->codec);
				std::string params = "";
				char cmd[512];
				if (codec == "opus")
				{
					codec = "libopus";
				}
				if (codec == "pcmu")
				{
					codec = "pcm_mulaw";
					// The -af filter makes sure every packet carries 160 bytes
					params = "-ac 1 -af \"aresample=8000[0];[0]asetnsamples=n=160:p=0\"";
				}
				// Forward a local file
				sprintf_s(cmd, "ffmpeg  %s -i %s -vn -acodec %s  -ar %d  %s -payload_type %d -f rtp %s",
					format.c_str(), srcUrl.c_str(), codec.c_str(), call->audio->sampleRate, params.c_str(), call->audio->payload, call->audio->rtpAdress
				);
				// Print through "%s" so the command is never interpreted as a format string
				printf("%s\n", cmd);
				// Run the command line
				runCmd(cmd);
			}
			// Play the peer's audio and video
			if (call->sdp)
			{
				FILE* f = NULL;
				fopen_s(&f, "test.sdp", "wb");
				if (f)
				{
					fwrite(call->sdp, 1, strlen(call->sdp), f);
					fclose(f);
					std::string cmd = "ffplay.exe -x 640 -y 360 -protocol_whitelist \"file,udp,rtp\" -i test.sdp";
					// Run the command line
					runCmd(cmd);
				}
				else
				{
					printf("fopen_s test.sdp error\n");
				}
			}
			break;
		case SIPUAEVENT_ENDED:
			// Close all child processes
			closeJobObject();
			break;
		default:
			break;
		}

	};

	// Start a test call from a background thread (kept on the stack instead of leaked with new)
	std::thread dialer([&]() {
		Sleep(2000);
		ua.call("1004", true);
		});
	ua.exec(&exitFlag);
	dialer.join();
}
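
The example above maps SDP encoding names to ffmpeg encoder names inline ("opus" to "libopus", "pcmu" to "pcm_mulaw"). As a sketch, that mapping could be collected in one small helper. ffmpegEncoderFor is a hypothetical name, and the "pcma" and "h264" entries are additions that do not appear in the article; the article passes "h264" to -vcodec unchanged.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper for illustration: maps an SDP encoding name (case-insensitive)
// to an ffmpeg encoder name. Unknown names are returned in lower case unchanged.
std::string ffmpegEncoderFor(std::string sdpCodec) {
    std::transform(sdpCodec.begin(), sdpCodec.end(), sdpCodec.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (sdpCodec == "pcmu") return "pcm_mulaw";  // G.711 mu-law
    if (sdpCodec == "pcma") return "pcm_alaw";   // G.711 A-law (not in the article)
    if (sdpCodec == "opus") return "libopus";
    if (sdpCodec == "h264") return "libx264";    // assumption; the article keeps "h264"
    return sdpCodec;
}
```

Keeping the mapping in one place makes it easier to support further codecs negotiated by the SIP server without touching the command-building code.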

4. Complete code

The project uses eXosip 5.1 and ffmpeg.exe 4.3, and is a Visual Studio 2022 project.

https://download.csdn.net/download/u013113678/88180712


5. Effect preview

freeswitch is used as the SIP server.
This program pushing a local mp4 over SIP:
[Figure: this program running]
linphone running as the peer:
[Figure: linphone running as the peer]


Summary

That is everything for this article. The techniques used are simple, but getting there was a bit tortuous. The port conflict in particular took a long time to diagnose, and the solution only came to mind by accident; otherwise the whole SIP client would probably have been implemented in code much earlier. The approach described here decouples SIP, media streaming and RTP cleanly: SIP can be implemented independently, the streaming side can be chosen freely, and there is no need to share one RTP session. It also makes it much easier to put together a quick test project.


Origin blog.csdn.net/u013113678/article/details/132126069