Cross-platform low-latency RTMP/RTSP live broadcast player design and implementation

Development background

In 2015, when we went looking for a low-latency player aimed specifically at live broadcast to test our RTMP push module, we found nothing truly usable on the market. Players such as VLC or Vitamio are, to put it bluntly, FFmpeg-based: they support many formats and are excellent for on-demand playback, but for live broadcast, and RTMP in particular, they carry several seconds of delay, and their support for pure-audio/pure-video playback, fast startup, abnormal network-state handling, and ease of integration is very poor. Because they are so feature-rich they also carry many bugs, and apart from experienced developers in the field, many developers spend considerable effort just getting the build environment to compile.

Our live player started on the Windows platform and was developed in parallel for Android and iOS. Given the shortcomings of the open-source players above, we decided on a fully self-developed framework: keep the overall design cross-platform, push playback delay down to the millisecond level, and unify the interface design across the three platforms so that the complexity of multi-platform integration is minimized.

Overall scheme structure

The goal of an RTMP or RTSP live player is clear: pull stream data from an RTMP server (self-built or CDN) or an RTSP source (server, NVR, IPC, encoder, etc.), then parse it, decode it, synchronize audio and video, and draw.
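
Internally, a design like this typically decouples the pull/parse stage from decoding and drawing with a bounded queue, and keeps latency from accumulating by dropping the oldest frames when the queue fills. The sketch below is illustrative only, not the SDK's actual code; the queue type and names are invented:

```c
#define FQ_CAPACITY 8

typedef struct {
    int frames[FQ_CAPACITY];  /* stand-in for decoded-frame handles */
    int head;                 /* index of the oldest frame */
    int count;                /* number of frames queued */
} FrameQueue;

void fq_init(FrameQueue *q) { q->head = 0; q->count = 0; }

/* Push a frame; when full, drop the oldest so end-to-end latency stays bounded. */
void fq_push(FrameQueue *q, int frame) {
    if (q->count == FQ_CAPACITY) {             /* full: drop the oldest frame */
        q->head = (q->head + 1) % FQ_CAPACITY;
        q->count--;
    }
    q->frames[(q->head + q->count) % FQ_CAPACITY] = frame;
    q->count++;
}

/* Pop the oldest frame; returns 0 when the queue is empty. */
int fq_pop(FrameQueue *q, int *frame) {
    if (q->count == 0) return 0;
    *frame = q->frames[q->head];
    q->head = (q->head + 1) % FQ_CAPACITY;
    q->count--;
    return 1;
}
```

A blocking queue would instead stall the network thread and let delay pile up, which is exactly what a live player must avoid.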

Specifically, this corresponds to the "receiving end" part of the following figure:

Initial module design goals

  • A self-developed framework that is easy to extend, with adaptive algorithms for lower latency and more efficient decoding and drawing;
  • Handling of abnormal network conditions, such as reconnection and network jitter;
  • Event/state callbacks so developers can track the player's overall state, turning it from an uncontrollable black box into something intelligible;
  • Support multi-instance playback;
  • Video support for H.264, audio support for AAC/PCMA/PCMU;
  • Support for setting the buffer time;
  • Real-time mute.
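
The buffer-time setting above can be pictured as a gate in front of the render stage: playback starts, or resumes after an underrun, only once the queued duration reaches the configured buffer time. A minimal illustrative sketch, with invented names and values (the real SDK exposes this via `SetBuffer`):

```c
typedef struct {
    int buffer_ms;   /* configured buffer time, e.g. 200 ms */
    int buffering;   /* 1 while (re)filling, 0 while playing */
} JitterGate;

/* Returns 1 if the frame at the queue head may be consumed now. */
int jitter_gate_may_play(JitterGate *g, int queued_ms) {
    if (g->buffering) {
        if (queued_ms >= g->buffer_ms)
            g->buffering = 0;        /* buffer filled: start playing */
        else
            return 0;                /* keep filling */
    } else if (queued_ms == 0) {
        g->buffering = 1;            /* underrun: refill before resuming */
        return 0;
    }
    return 1;
}
```

Setting `buffer_ms` to 0 degenerates into play-as-soon-as-possible, which is the low-latency end of the trade-off; larger values absorb more jitter at the cost of delay.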

Features after iteration

  • [Playback protocols] RTSP and RTMP, with millisecond-level delay;
  • [Multi-instance playback] Multiple player instances at the same time;
  • [Event callbacks] Network status, buffer status, and other callbacks;
  • [Video formats] H.264, plus H.265 as an RTMP extension;
  • [Audio formats] AAC/PCMA/PCMU/Speex;
  • [H.264/H.265 software decoding] Software decoding of H.264 and H.265;
  • [H.264 hardware decoding] Supported on Windows/Android/iOS;
  • [H.265 hardware decoding] Supported on Windows/Android/iOS;
  • [H.264/H.265 hardware decoding on Android] Hardware decoding in Surface mode or in normal mode;
  • [Buffer time setting] Configurable buffer time;
  • [Instant first frame] Fast-startup mode that renders the first frame immediately;
  • [Low-latency mode] Ultra-low-latency mode for scenarios such as online claw machines (200-400 ms on the public network);
  • [Complex network handling] Automatic adaptation to disconnects, reconnects, and other network conditions;
  • [Fast URL switching] Switch to another URL during playback with fast content changeover;
  • [Multiple render mechanisms] On Android, video: SurfaceView/OpenGL ES; audio: AudioTrack/OpenSL ES;
  • [Real-time mute] Mute/unmute during playback;
  • [Real-time snapshot] Capture the current playback frame during playback;
  • [Key-frame-only playback] Windows supports toggling key-frame-only playback at runtime;
  • [Rendering angle] Rendering angles of 0, 90, 180, and 270 degrees;
  • [Rendering mirror] Horizontal and vertical flip;
  • [Real-time download speed] Callback of the current download speed, with a configurable callback interval;
  • [ARGB overlay] Windows supports overlaying an ARGB image on the video (see the C++ demo);
  • [Pre-decode video data callback] H.264/H.265 data callback;
  • [Post-decode video data callback] YUV/RGB data callback after decoding;
  • [Scaled post-decode video callback] Windows supports specifying the size of the callback image (the original frame is scaled before being passed up);
  • [Pre-decode audio data callback] AAC/PCMA/PCMU/Speex data callback;
  • [Audio/video self-adaptation] Adapts when stream parameters change during playback;
  • [Extended recording] RTSP/RTMP H.264 and extended H.265 stream recording; PCMA/PCMU/Speex can be transcoded to AAC while recording; audio-only or video-only recording can be configured.

Design and development considerations for an RTMP/RTSP live player

1. **Low latency:** Most RTSP playback targets live scenes, so excessive delay seriously hurts the experience; low latency is a key measure of a good RTSP player. The live SDK's RTSP delay is better than that of open-source players, and it does not accumulate delay over long runs;

2. **Audio/video synchronization:** Some players skip A/V synchronization entirely in pursuit of low latency, playing audio and video as they arrive, which causes A/V desync along with problems such as randomly jumping timestamps. The player in Daniu Live SDK has solid timestamp synchronization and an abnormal-timestamp correction mechanism;

3. **Multi-instance support:** The Daniu Live SDK player supports playing multiple channels of audio and video simultaneously, e.g. 4/8/9 windows; most open-source players handle multiple instances poorly;

4. **Buffer time setting:** Under network jitter the player needs a configurable buffer time, generally expressed in milliseconds; open-source players support this poorly;

5. **TCP/UDP mode setting and automatic switching:** Many servers support only TCP or only UDP, so a good RTSP player needs a TCP/UDP mode setting. If the link does not support the chosen mode, Daniu Live SDK switches automatically; open-source players cannot switch TCP/UDP automatically;

6. **Real-time mute:** When playing RTSP streams in many windows, playing every audio track makes for a bad experience, so real-time mute is essential; open-source players lack it;

7. **Video view rotation:** Many cameras are mounted in ways that invert the image, so a good RTSP player should support real-time view rotation (0/90/180/270 degrees) plus horizontal and vertical flip; open-source players lack this;

8. **Decoded audio/video data output:** Many developers we have worked with want YUV or RGB data during playback for algorithmic analysis such as face matching; open-source players do not provide it;

9. **Real-time snapshot:** Interesting or important scenes call for real-time screenshots; general players, including open-source ones, lack this capability;

10. **Network jitter handling (e.g. disconnect and reconnect):** A stable network-handling mechanism with support for reconnection after disconnects; open-source players handle network exceptions poorly;

11. **Long-run stability:** Unlike the open-source players on the market, the Windows RTSP live SDK from Daniu Live SDK is suitable for running continuously for days; open-source players have poor long-run stability;

12. **Logging:** The overall process is recorded to a log file so there is evidence to work from when a problem occurs; open-source players keep almost no logs;

13. **Real-time download speed feedback:** Daniu Live SDK provides real-time download callbacks for the audio/video stream with a configurable callback interval, so the network state can be monitored; open-source players lack this capability;

14. **Abnormal-state handling and event callbacks:** For scenarios such as disconnection and network jitter during playback, the Daniu Live SDK player calls back the relevant state in real time so the upper layer can react; open-source players support this poorly;

15. **Real-time switching between key-frame-only and full-frame playback:** Especially with many channels playing, decoding and drawing everything drives up system resource usage. Being able to switch at any time between key-frame-only and full-frame playback greatly reduces the performance required.
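
For point 2 above, a common approach is to treat audio as the master clock and decide per video frame whether to drop, render, or wait, together with a guard against abnormal timestamp jumps. The code below is an illustrative sketch with invented names and thresholds, not the SDK's implementation:

```c
typedef enum { AV_DROP = -1, AV_RENDER = 0, AV_WAIT = 1 } AvSyncAction;

/* Compare the frame's PTS against the audio master clock.
 * Thresholds (120 ms late / 40 ms early) are illustrative. */
AvSyncAction av_sync_decide(long long video_pts_ms, long long audio_clock_ms) {
    long long diff = video_pts_ms - audio_clock_ms;
    if (diff < -120) return AV_DROP;   /* hopelessly late: discard */
    if (diff >  40)  return AV_WAIT;   /* early: hold it back */
    return AV_RENDER;                  /* within tolerance: present now */
}

/* Correct abnormal timestamp jumps: if the source timestamp goes backwards
 * or leaps by more than max_jump_ms, substitute a nominal frame duration. */
long long ts_correct(long long prev_ms, long long cur_ms,
                     long long max_jump_ms, long long nominal_step_ms) {
    long long d = cur_ms - prev_ms;
    if (d < 0 || d > max_jump_ms) return prev_ms + nominal_step_ms;
    return cur_ms;
}
```

In a real player the tolerance window and the nominal step (e.g. 40 ms for 25 fps) would be derived from the stream, and the audio clock from samples actually written to the device.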

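For points 5 and 10 above, the behavior can be sketched as a transport-flipping retry loop plus capped exponential backoff between reconnect attempts. The names, mock links, and delay values below are illustrative, not the SDK's actual logic:

```c
typedef enum { RTSP_UDP = 0, RTSP_TCP = 1 } RtspTransport;

/* Stand-in for a real RTSP SETUP attempt over the given transport. */
typedef int (*TryConnectFn)(RtspTransport transport);

/* Alternate between UDP and TCP until one works or attempts run out;
 * returns the transport that worked, or -1 on failure. */
int rtsp_connect_auto(RtspTransport initial, int max_attempts,
                      TryConnectFn try_connect) {
    RtspTransport t = initial;
    int i;
    for (i = 0; i < max_attempts; ++i) {
        if (try_connect(t)) return (int)t;
        t = (t == RTSP_UDP) ? RTSP_TCP : RTSP_UDP;  /* flip and retry */
    }
    return -1;
}

/* Capped exponential reconnect backoff; values are illustrative. */
int reconnect_delay_ms(int attempt) {  /* attempt: 0, 1, 2, ... */
    int delay = 500;                   /* first retry after 500 ms */
    int i;
    for (i = 0; i < attempt && delay < 8000; ++i) delay *= 2;
    return delay < 8000 ? delay : 8000; /* cap at 8 s */
}

/* Mock links for exercising the switching logic. */
int tcp_only_link(RtspTransport t) { return t == RTSP_TCP; }
int dead_link(RtspTransport t) { (void)t; return 0; }
```

The cap matters for long-running deployments: without it, a camera that is offline overnight would come back to a player sleeping for hours.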
Interface design

Many developers without a solid audio/video background end up repeatedly overturning their early interface designs. Taking the Windows platform as an example, we share our design approach below; the demo project source code can be downloaded from GitHub for reference:

smart_player_sdk.h

```c
#ifdef __cplusplus
extern "C" {
#endif

typedef struct _SmartPlayerSDKAPI
{
    /* flag is currently 0, reserved for later expansion; pReserve passes NULL, reserved for expansion.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *Init)(NT_UINT32 flag, NT_PVOID pReserve);

    /* This is the last interface called. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *UnInit)();

    /* flag is currently 0, reserved for later expansion; pReserve passes NULL, reserved for expansion.
       NT_HWND hwnd: the window used to draw the picture, may be NULL.
       Gets a handle. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *Open)(NT_PHANDLE pHandle, NT_HWND hwnd, NT_UINT32 flag, NT_PVOID pReserve);

    /* After this call the handle becomes invalid. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *Close)(NT_HANDLE handle);

    /* Set the event callback. To listen to events, call this right after Open succeeds. */
    NT_UINT32 (NT_API *SetEventCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, NT_SP_SDKEventCallBack call_back);

    /* Set the video-size callback. */
    NT_UINT32 (NT_API *SetVideoSizeCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, SP_SDKVideoSizeCallBack call_back);

    /* Set the video frame callback that emits decoded video data.
       frame_format: only NT_SP_E_VIDEO_FRAME_FORMAT_RGB32 or NT_SP_E_VIDEO_FRAME_FROMAT_I420. */
    NT_UINT32 (NT_API *SetVideoFrameCallBack)(NT_HANDLE handle, NT_INT32 frame_format, NT_PVOID call_back_data, SP_SDKVideoFrameCallBack call_back);

    /* Set the video frame callback with a specified output size.
       handle: player handle
       scale_width: scaled width (must be even; a multiple of 16 is recommended)
       scale_height: scaled height (must be even)
       scale_filter_mode: scaling quality; 0 uses the SDK default, valid range [1, 3];
                          higher values give better quality but cost more performance
       frame_format: only NT_SP_E_VIDEO_FRAME_FORMAT_RGB32 or NT_SP_E_VIDEO_FRAME_FROMAT_I420
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetVideoFrameCallBackV2)(NT_HANDLE handle, NT_INT32 scale_width, NT_INT32 scale_height,
                                                NT_INT32 scale_filter_mode, NT_INT32 frame_format,
                                                NT_PVOID call_back_data, SP_SDKVideoFrameCallBack call_back);

    /* Set a callback for the video frame timestamp at draw time.
       Note: no callback for pure-audio streams; valid only when there is video. */
    NT_UINT32 (NT_API *SetRenderVideoFrameTimestampCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, SP_SDKRenderVideoFrameTimestampCallBack call_back);

    /* Set the audio PCM frame callback; the current frame size is 10 ms. */
    NT_UINT32 (NT_API *SetAudioPCMFrameCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, NT_SP_SDKAudioPCMFrameCallBack call_back);

    /* Set the user-data callback. */
    NT_UINT32 (NT_API *SetUserDataCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, NT_SP_SDKUserDataCallBack call_back);

    /* Set the video SEI data callback. */
    NT_UINT32 (NT_API *SetSEIDataCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, NT_SP_SDKSEIDataCallBack call_back);

    /* Start playing, passing the URL.
       Note: no longer recommended; use StartPlay. Kept temporarily so old customers can upgrade. */
    NT_UINT32 (NT_API *Start)(NT_HANDLE handle, NT_PCSTR url, NT_PVOID call_back_data, SP_SDKStartPlayCallBack call_back);

    /* Stop playing.
       Note: no longer recommended; use StopPlay. Kept temporarily so old customers can upgrade. */
    NT_UINT32 (NT_API *Stop)(NT_HANDLE handle);

    /* ++ A set of new interfaces ++ */

    /* Set the URL. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetURL)(NT_HANDLE handle, NT_PCSTR url);

    /* Set the decryption key; currently only used to decrypt encrypted RTMP streams.
       key: decryption key; size: key length. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetKey)(NT_HANDLE handle, const NT_BYTE* key, NT_UINT32 size);

    /* Set the decryption vector; currently only used to decrypt encrypted RTMP streams.
       iv: decryption vector; size: vector length. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetDecryptionIV)(NT_HANDLE handle, const NT_BYTE* iv, NT_UINT32 size);

    /* handle: player handle
       hwnd: must be the window handle actually used for drawing
       is_support: *is_support is 1 if supported, 0 if not
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *IsSupportD3DRender)(NT_HANDLE handle, NT_HWND hwnd, NT_INT32* is_support);

    /* Set the drawing window handle. Unnecessary if a handle was set when calling Open.
       If Open was passed NULL, a drawing window handle can be set here.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetRenderWindow)(NT_HANDLE handle, NT_HWND hwnd);

    /* Set whether to output sound; this differs from the mute interface.
       Mainly for users who set the external PCM callback and do not want the SDK to also play audio.
       is_output_auido_device: 1 allows output to the audio device (default); 0 disallows it;
       other values make the call fail. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetIsOutputAudioDevice)(NT_HANDLE handle, NT_INT32 is_output_auido_device);

    /* Start playing. Note: StartPlay and Start must not be mixed; use one or the other.
       Start/Stop are old interfaces and not recommended; use StartPlay/StopPlay. */
    NT_UINT32 (NT_API *StartPlay)(NT_HANDLE handle);

    /* Stop playing. */
    NT_UINT32 (NT_API *StopPlay)(NT_HANDLE handle);

    /* Set whether to record video. By default video is recorded if the source has it.
       In some scenarios you may only want audio, hence this switch.
       is_record_video: 1 records video (default), 0 does not. */
    NT_UINT32 (NT_API *SetRecorderVideo)(NT_HANDLE handle, NT_INT32 is_record_video);

    /* Set whether to record audio. By default audio is recorded if the source has it.
       In some scenarios you may only want video, hence this switch.
       is_record_audio: 1 records audio (default), 0 does not. */
    NT_UINT32 (NT_API *SetRecorderAudio)(NT_HANDLE handle, NT_INT32 is_record_audio);

    /* Set the local recording directory; it must be an English-path directory, otherwise the call fails. */
    NT_UINT32 (NT_API *SetRecorderDirectory)(NT_HANDLE handle, NT_PCSTR dir);

    /* Set the maximum size of a single recording file; beyond it, recording is cut into a second file.
       size: in KB (1024 bytes); current range [5MB-800MB]; out-of-range values are clamped. */
    NT_UINT32 (NT_API *SetRecorderFileMaxSize)(NT_HANDLE handle, NT_UINT32 size);

    /* Set the rules for generating recording file names. */
    NT_UINT32 (NT_API *SetRecorderFileNameRuler)(NT_HANDLE handle, NT_SP_RecorderFileNameRuler* ruler);

    /* Set the recording callback. */
    NT_UINT32 (NT_API *SetRecorderCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, SP_SDKRecorderCallBack call_back);

    /* Switch for transcoding audio to AAC when recording. AAC is more widely supported, so the SDK can
       convert other audio codecs (Speex, PCMU, PCMA, etc.) to AAC.
       is_transcode: 1 converts non-AAC audio to AAC (AAC passes through unchanged); 0 (default) does not convert.
       Note: transcoding increases performance cost. */
    NT_UINT32 (NT_API *SetRecorderAudioTranscodeAAC)(NT_HANDLE handle, NT_INT32 is_transcode);

    /* Start recording. */
    NT_UINT32 (NT_API *StartRecorder)(NT_HANDLE handle);

    /* Stop recording. */
    NT_UINT32 (NT_API *StopRecorder)(NT_HANDLE handle);

    /* Set the callback for video data while pulling the stream. */
    NT_UINT32 (NT_API *SetPullStreamVideoDataCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, SP_SDKPullStreamVideoDataCallBack call_back);

    /* Set the callback for audio data while pulling the stream. */
    NT_UINT32 (NT_API *SetPullStreamAudioDataCallBack)(NT_HANDLE handle, NT_PVOID call_back_data, SP_SDKPullStreamAudioDataCallBack call_back);

    /* Switch for transcoding audio to AAC when pulling the stream (same semantics as the recorder switch).
       is_transcode: 1 converts non-AAC audio to AAC; 0 (default) does not.
       Note: transcoding increases performance cost. */
    NT_UINT32 (NT_API *SetPullStreamAudioTranscodeAAC)(NT_HANDLE handle, NT_INT32 is_transcode);

    /* Start pulling the stream. */
    NT_UINT32 (NT_API *StartPullStream)(NT_HANDLE handle);

    /* Stop pulling the stream. */
    NT_UINT32 (NT_API *StopPullStream)(NT_HANDLE handle);

    /* -- End of the new interfaces -- */

    /* Must be called when the drawing window size changes. */
    NT_UINT32 (NT_API *OnWindowSize)(NT_HANDLE handle, NT_INT32 cx, NT_INT32 cy);

    /* Generic interface for setting parameters; most problems can be solved with these. */
    NT_UINT32 (NT_API *SetParam)(NT_HANDLE handle, NT_UINT32 id, NT_PVOID pData);

    /* Generic interface for getting parameters. */
    NT_UINT32 (NT_API *GetParam)(NT_HANDLE handle, NT_UINT32 id, NT_PVOID pData);

    /* Set the buffer, minimum 0 ms. */
    NT_UINT32 (NT_API *SetBuffer)(NT_HANDLE handle, NT_INT32 buffer);

    /* Mute interface; 1 mutes, 0 unmutes. */
    NT_UINT32 (NT_API *SetMute)(NT_HANDLE handle, NT_INT32 is_mute);

    /* Set RTSP TCP mode; 1 is TCP, 0 is UDP. RTSP only. */
    NT_UINT32 (NT_API *SetRTSPTcpMode)(NT_HANDLE handle, NT_INT32 isUsingTCP);

    /* Set the RTSP timeout in seconds; must be greater than 0. */
    NT_UINT32 (NT_API *SetRtspTimeout)(NT_HANDLE handle, NT_INT32 timeout);

    /* Some RTSP sources support RTP over UDP, others RTP over TCP. For ease of use, automatic
       trial switching can be enabled: if UDP cannot play, the SDK automatically tries TCP,
       and if TCP cannot play it automatically tries UDP.
       is_auto_switch_tcp_udp: 1 tries switching between TCP and UDP; 0 does not. */
    NT_UINT32 (NT_API *SetRtspAutoSwitchTcpUdp)(NT_HANDLE handle, NT_INT32 is_auto_switch_tcp_udp);

    /* Set fast startup ("instant first frame"); 1 enables it, 0 disables it. */
    NT_UINT32 (NT_API *SetFastStartup)(NT_HANDLE handle, NT_INT32 isFastStartup);

    /* Set low-latency playback mode; the default is normal mode.
       mode: 1 is low-latency mode, 0 is normal mode; other values are invalid.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetLowLatencyMode)(NT_HANDLE handle, NT_INT32 mode);

    /* Check whether H.264 hardware decoding is supported; returns NT_ERC_OK if supported. */
    NT_UINT32 (NT_API *IsSupportH264HardwareDecoder)();

    /* Check whether H.265 hardware decoding is supported; returns NT_ERC_OK if supported. */
    NT_UINT32 (NT_API *IsSupportH265HardwareDecoder)();

    /* Set H.264 hardware decoding.
       is_hardware_decoder: 1 means hardware decoding, 0 means none.
       reserve: reserved parameter, pass 0 for now.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetH264HardwareDecoder)(NT_HANDLE handle, NT_INT32 is_hardware_decoder, NT_INT32 reserve);

    /* Set H.265 hardware decoding (same semantics as above). */
    NT_UINT32 (NT_API *SetH265HardwareDecoder)(NT_HANDLE handle, NT_INT32 is_hardware_decoder, NT_INT32 reserve);

    /* Decode only video key frames.
       is_only_dec_key_frame: 1 decodes key frames only; 0 (default) decodes everything.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetOnlyDecodeVideoKeyFrame)(NT_HANDLE handle, NT_INT32 is_only_dec_key_frame);

    /* Vertical flip (upside down). is_flip: 1 flips, 0 does not. */
    NT_UINT32 (NT_API *SetFlipVertical)(NT_HANDLE handle, NT_INT32 is_flip);

    /* Horizontal flip. is_flip: 1 flips, 0 does not. */
    NT_UINT32 (NT_API *SetFlipHorizontal)(NT_HANDLE handle, NT_INT32 is_flip);

    /* Set clockwise rotation. degress: 0, 90, 180, 270 are valid; other values are invalid.
       Note: angles other than 0 consume more CPU.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetRotation)(NT_HANDLE handle, NT_INT32 degress);

    /* With D3D rendering, draw a logo on the window; drawing is driven by video frames and an
       ARGB image must be passed in.
       argb_data: ARGB image data; passing NULL clears a previously set logo
       argb_stride: stride per line of the ARGB image (usually image_width*4)
       image_width/image_height: ARGB image dimensions
       left/top: x/y of the drawing position
       render_width/render_height: drawn width/height */
    NT_UINT32 (NT_API *SetRenderARGBLogo)(NT_HANDLE handle, const NT_BYTE* argb_data, NT_INT32 argb_stride,
                                          NT_INT32 image_width, NT_INT32 image_height,
                                          NT_INT32 left, NT_INT32 top,
                                          NT_INT32 render_width, NT_INT32 render_height);

    /* Set download-speed reporting; off by default.
       is_report: 1 reports, 0 does not; other values are invalid.
       report_interval: report interval in seconds, minimum 1; a value below 1 with reporting
       enabled makes the call fail.
       Note: if reporting is enabled, also call SetEventCallBack and handle the event in the
       callback; the reported event is NT_SP_E_EVENT_ID_DOWNLOAD_SPEED.
       Must be called before StartXXX. Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetReportDownloadSpeed)(NT_HANDLE handle, NT_INT32 is_report, NT_INT32 report_interval);

    /* Actively get the download speed. speed: returned in Byte/s.
       (Note: must be called after StartXXX, otherwise it fails.) Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *GetDownloadSpeed)(NT_HANDLE handle, NT_INT32* speed);

    /* Get the media duration. For live streams there is no duration and the result is undefined.
       For on-demand, returns NT_ERC_OK on success and NT_ERC_SP_NEED_RETRY while the SDK is still parsing. */
    NT_UINT32 (NT_API *GetDuration)(NT_HANDLE handle, NT_INT64* duration);

    /* Get the current playback timestamp in milliseconds (ms).
       Note: this is the source timestamp; on-demand only, not supported for live.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *GetPlaybackPos)(NT_HANDLE handle, NT_INT64* pos);

    /* Get the current pull-stream timestamp in milliseconds (ms).
       Note: this is the source timestamp; on-demand only, not supported for live.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *GetPullStreamPos)(NT_HANDLE handle, NT_INT64* pos);

    /* Set the playback position in milliseconds (ms).
       Note: live is not supported; this interface is for on-demand.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SetPos)(NT_HANDLE handle, NT_INT64 pos);

    /* Pause playback. isPause: 1 pauses, 0 resumes; other values are errors.
       Note: live streams have no concept of pause; this interface is for on-demand.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *Pause)(NT_HANDLE handle, NT_INT32 isPause);

    /* Switch the URL.
       url: the URL to switch to
       switch_pos: playback position after switching to the new URL; fill in 0 by default.
       Only valid for on-demand URLs that support setting a position; invalid for live URLs.
       reserve: reserved parameter
       Note: if the new URL is the same as the one being played, the SDK does nothing.
       Precondition: one of StartPlay, StartRecorder, StartPullStream has been called successfully.
       Returns NT_ERC_OK on success. */
    NT_UINT32 (NT_API *SwitchURL)(NT_HANDLE handle, NT_PCSTR url, NT_INT64 switch_pos, NT_INT32 reserve);

    /* Capture a picture.
       file_name_utf8: file name, UTF-8 encoded
       call_back_data: user data passed back in the callback
       call_back: callback notifying that the snapshot completed or failed
       Returns NT_ERC_OK on success. May only succeed while playing; otherwise returns an error.
       Generating a PNG file takes time, usually several hundred milliseconds, so to keep CPU usage
       down the SDK limits the number of outstanding snapshot requests; beyond the limit the call
       returns NT_ERC_SP_TOO_MANY_CAPTURE_IMAGE_REQUESTS. In that case, wait a moment and retry
       after the SDK has processed some requests. */
    NT_UINT32 (NT_API *CaptureImage)(NT_HANDLE handle, NT_PCSTR file_name_utf8, NT_PVOID call_back_data, SP_SDKCaptureImageCallBack call_back);

    /* Draw RGB32 data with GDI.
       32-bit RGB: r, g, b take 8 bits each and the remaining byte is reserved; the memory byte
       order is bb gg rr xx, matching Windows bitmaps. As a little-endian DWORD, the highest byte
       is xx, then rr, gg, bb.
       For Windows bitmap compatibility the stride (image_stride) must be width_*4.
       handle: player handle; hdc: drawing DC
       x_dst/y_dst: top-left corner of the destination surface
       dst_width/dst_height: width/height to draw
       x_src/y_src: source image position
       rgb32_data: RGB32 data in the format above; rgb32_data_size: data size
       image_width/image_height: actual image dimensions; image_stride: image stride */
    NT_UINT32 (NT_API *GDIDrawRGB32)(NT_HANDLE handle, NT_HDC hdc,
                                     NT_INT32 x_dst, NT_INT32 y_dst, NT_INT32 dst_width, NT_INT32 dst_height,
                                     NT_INT32 x_src, NT_INT32 y_src, NT_INT32 src_width, NT_INT32 src_height,
                                     const NT_BYTE* rgb32_data, NT_UINT32 rgb32_data_size,
                                     NT_INT32 image_width, NT_INT32 image_height, NT_INT32 image_stride);

    /* Draw ARGB data with GDI.
       The memory byte order is bb gg rr alpha, matching Windows bitmaps; as a little-endian DWORD
       the highest byte is alpha, then rr, gg, bb. The stride (image_stride) must be width_*4.
       Parameters are as for GDIDrawRGB32, with argb_data in the format above. */
    NT_UINT32 (NT_API *GDIDrawARGB)(NT_HDC hdc,
                                    NT_INT32 x_dst, NT_INT32 y_dst, NT_INT32 dst_width, NT_INT32 dst_height,
                                    NT_INT32 x_src, NT_INT32 y_src, NT_INT32 src_width, NT_INT32 src_height,
                                    const NT_BYTE* argb_data, NT_INT32 image_stride,
                                    NT_INT32 image_width, NT_INT32 image_height);
} SmartPlayerSDKAPI;

NT_UINT32 NT_API GetSmartPlayerSDKAPI(SmartPlayerSDKAPI* pAPI);

/* reserve1: pass 0; reserve2: pass NULL. Returns NT_ERC_OK on success. */
NT_UINT32 NT_API NT_SP_SetSDKClientKey(NT_PCSTR cid, NT_PCSTR key, NT_INT32 reserve1, NT_PVOID reserve2);

#ifdef __cplusplus
}
#endif
```
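
The header above exposes the whole player through a single struct of function pointers filled in by GetSmartPlayerSDKAPI, which keeps the DLL's exported surface down to one or two C symbols. The self-contained mock below illustrates only that pattern; MockPlayerAPI, GetMockPlayerAPI, and the return codes are invented stand-ins, not the real SDK:

```c
#include <stddef.h>

#define MOCK_ERC_OK 0u  /* stand-in for NT_ERC_OK */

/* Function-pointer types mirroring the table-of-functions style. */
typedef unsigned int (*InitFn)(unsigned int flag, void *reserve);
typedef unsigned int (*UnInitFn)(void);

typedef struct {
    InitFn   Init;
    UnInitFn UnInit;
} MockPlayerAPI;

static unsigned int mock_init(unsigned int flag, void *reserve) {
    (void)reserve;
    return flag == 0 ? MOCK_ERC_OK : 1u;  /* flag must currently be 0 */
}

static unsigned int mock_uninit(void) { return MOCK_ERC_OK; }

/* The single exported getter fills the caller's table. */
unsigned int GetMockPlayerAPI(MockPlayerAPI *api) {
    if (api == NULL) return 1u;
    api->Init = mock_init;
    api->UnInit = mock_uninit;
    return MOCK_ERC_OK;
}
```

The client then drives everything through the table (`api.Init(0, NULL); ... api.UnInit();`), so adding interfaces later only appends pointers to the struct without changing the export list.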

smart_player_define.h

# ifndef SMART_PLAYER_DEFINE_H_ # define SMART_PLAYER_DEFINE_H_ # ifdef WIN32 # include <windows.h> # endif # ifdef SMART_HAS_COMMON_DIC # include "../../topcommon/nt_type_define.h" # include "../../topcommon/nt_base_code_define.h" # else # include "nt_type_define.h" # include "nt_base_code_define.h" # endif # ifdef __cplusplus extern "C" { # endif # ifndef NT_HWND_ # define NT_HWND_ # ifdef WIN32 typedef HWND NT_HWND; # else typedef void * NT_HWND; # endif # endif # ifndef NT_HDC_ # define NT_HDC_ # ifdef _WIN32 typedef HDC NT_HDC; # else typedef void * NT_HDC; # endif # endif /* Error code */ typedef enum _ SP_E_ERROR_CODE { NT_ERC_SP_HWND_IS_NULL = (NT_ERC_SMART_PLAYER_SDK | 0x1 ), //window handle is empty NT_ERC_SP_HWND_INVALID = (NT_ERC_SMART_PLAYER_SDK | 0x2 ), //window handle is invalid NT_ERC_SP_TOO_MANY_CAPTURE_IMAGE_REQUESTS = (NT_ERC_SMART_PLAYER_SDK | 0x3 ), //too Screenshot request NT_ERC_SP_WINDOW_REGION_INVALID = (NT_ERC_SMART_PLAYER_SDK | 0x4 ), //invalid window area, the window width may be less than or high. 
```c
    NT_ERC_SP_DIR_NOT_EXIST = (NT_ERC_SMART_PLAYER_SDK | 0x5), // directory does not exist
    NT_ERC_SP_NEED_RETRY    = (NT_ERC_SMART_PLAYER_SDK | 0x6), // need to retry
} SP_E_ERROR_CODE;

/* Parameter-setting IDs; SmartPlayerSDK has reserved this range */
typedef enum _SP_E_PARAM_ID
{
    SP_PARAM_ID_BASE = NT_PARAM_ID_SMART_PLAYER_SDK,
} SP_E_PARAM_ID;

/* Event IDs */
typedef enum _NT_SP_E_EVENT_ID
{
    NT_SP_E_EVENT_ID_BASE = NT_EVENT_ID_SMART_PLAYER_SDK,

    NT_SP_E_EVENT_ID_CONNECTING            = NT_SP_E_EVENT_ID_BASE | 0x2,  /* connecting */
    NT_SP_E_EVENT_ID_CONNECTION_FAILED     = NT_SP_E_EVENT_ID_BASE | 0x3,  /* connection failed */
    NT_SP_E_EVENT_ID_CONNECTED             = NT_SP_E_EVENT_ID_BASE | 0x4,  /* connected */
    NT_SP_E_EVENT_ID_DISCONNECTED          = NT_SP_E_EVENT_ID_BASE | 0x5,  /* disconnected */
    NT_SP_E_EVENT_ID_NO_MEDIADATA_RECEIVED = NT_SP_E_EVENT_ID_BASE | 0x8,  /* no RTMP data received */
    NT_SP_E_EVENT_ID_RTSP_STATUS_CODE      = NT_SP_E_EVENT_ID_BASE | 0xB,  /* RTSP status code report; currently only 401 is reported, param1 is the status code */
    NT_SP_E_EVENT_ID_NEED_KEY              = NT_SP_E_EVENT_ID_BASE | 0xC,  /* a decryption key must be entered to play */
    NT_SP_E_EVENT_ID_KEY_ERROR             = NT_SP_E_EVENT_ID_BASE | 0xD,  /* the decryption key is incorrect */

    /* The following event IDs start from 0x81 */
    NT_SP_E_EVENT_ID_START_BUFFERING       = NT_SP_E_EVENT_ID_BASE | 0x81, /* start buffering */
    NT_SP_E_EVENT_ID_BUFFERING             = NT_SP_E_EVENT_ID_BASE | 0x82, /* buffering, param1 is the progress percentage */
    NT_SP_E_EVENT_ID_STOP_BUFFERING        = NT_SP_E_EVENT_ID_BASE | 0x83, /* stop buffering */

    NT_SP_E_EVENT_ID_DOWNLOAD_SPEED        = NT_SP_E_EVENT_ID_BASE | 0x91, /* download speed, param1 is the speed in bytes/s */

    NT_SP_E_EVENT_ID_PLAYBACK_REACH_EOS    = NT_SP_E_EVENT_ID_BASE | 0xA1, /* playback reached end of stream; live streams never report this, VOD only */
    NT_SP_E_EVENT_ID_RECORDER_REACH_EOS    = NT_SP_E_EVENT_ID_BASE | 0xA2, /* recording reached end of stream; live streams never report this, VOD only */
    NT_SP_E_EVENT_ID_PULLSTREAM_REACH_EOS  = NT_SP_E_EVENT_ID_BASE | 0xA3, /* pulled stream reached end of stream; live streams never report this, VOD only */

    NT_SP_E_EVENT_ID_DURATION              = NT_SP_E_EVENT_ID_BASE | 0xA8, /* media duration; never reported for live streams. For VOD it is reported if the duration can be obtained from the source; param1 is the duration in milliseconds (ms) */
} NT_SP_E_EVENT_ID;

// Video frame image formats
typedef enum _NT_SP_E_VIDEO_FRAME_FORMAT
{
    NT_SP_E_VIDEO_FRAME_FORMAT_RGB32 = 1, // 32-bit RGB; r, g, b take 8 bits each, the remaining byte is reserved. Memory byte order is bb gg rr xx, matching a Windows bitmap. Read as a little-endian DWORD, the highest byte is xx, then rr, gg, bb
    NT_SP_E_VIDEO_FRAME_FORMAT_ARGB  = 2, // 32-bit ARGB; memory byte order is bb gg rr aa, matching a Windows bitmap
    NT_SP_E_VIDEO_FRAME_FROMAT_I420  = 3, // YUV420; the three components are stored in three separate planes
} NT_SP_E_VIDEO_FRAME_FORMAT;

// Video frame structure
typedef struct _NT_SP_VideoFrame
{
    NT_INT32  format_;    // image format, see NT_SP_E_VIDEO_FRAME_FORMAT
    NT_INT32  width_;     // image width
    NT_INT32  height_;    // image height

    NT_UINT64 timestamp_; // timestamp in ms; usually 0, not used

    // Image data; ARGB and RGB32 use only the first plane, I420 uses the first three
    NT_UINT8* plane0_;
    NT_UINT8* plane1_;
    NT_UINT8* plane2_;
    NT_UINT8* plane3_;

    // Bytes per row of each plane. For ARGB and RGB32 it must be width_ * 4 to stay compatible with Windows bitmaps.
    // For I420, stride0_ is the Y stride, stride1_ the U stride, stride2_ the V stride.
    NT_INT32  stride0_;
    NT_INT32  stride1_;
    NT_INT32  stride2_;
    NT_INT32  stride3_;
} NT_SP_VideoFrame;

// If all three fields of the rule below are 0, recording cannot be started.
```
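To make the I420 layout of `NT_SP_VideoFrame` concrete, here is a small sketch of how a frame callback might copy a possibly row-padded I420 frame (strides larger than the plane widths) into a tightly packed buffer. The `I420Frame` struct is a minimal stand-in mirroring only the relevant fields, not the SDK's actual type:

```c
#include <stdint.h>
#include <string.h>

/* Minimal illustrative mirror of the I420-related fields of NT_SP_VideoFrame. */
typedef struct {
    int32_t  width_;
    int32_t  height_;
    uint8_t* plane0_;  /* Y */
    uint8_t* plane1_;  /* U */
    uint8_t* plane2_;  /* V */
    int32_t  stride0_; /* Y stride; may exceed width_ because of row padding */
    int32_t  stride1_; /* U stride */
    int32_t  stride2_; /* V stride */
} I420Frame;

/* Copy a possibly padded I420 frame into a tightly packed buffer of
 * width*height*3/2 bytes: the full Y plane, then U, then V. */
static void copy_i420_packed(const I420Frame* f, uint8_t* dst)
{
    int w = f->width_, h = f->height_;
    int cw = w / 2, ch = h / 2; /* chroma planes are subsampled 2x2 */

    for (int y = 0; y < h; ++y)  /* Y rows */
        memcpy(dst + y * w, f->plane0_ + y * f->stride0_, (size_t)w);

    uint8_t* du = dst + w * h;
    for (int y = 0; y < ch; ++y) /* U rows */
        memcpy(du + y * cw, f->plane1_ + y * f->stride1_, (size_t)cw);

    uint8_t* dv = du + cw * ch;
    for (int y = 0; y < ch; ++y) /* V rows */
        memcpy(dv + y * cw, f->plane2_ + y * f->stride2_, (size_t)cw);
}
```

Copying row by row through the strides, rather than one big `memcpy`, is what keeps this correct when the decoder hands out padded rows.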
```c
typedef struct _NT_SP_RecorderFileNameRuler
{
    NT_UINT32 type_;             // currently 0 by default; may be extended in the future
    NT_PCSTR  file_name_prefix_; // recording file name prefix, e.g. daniulive
    NT_INT32  append_date_;      // if 1, the date is appended to the file name, e.g. daniulive-2017-01-17
    NT_INT32  append_time_;      // if 1, the time is appended as well, e.g. daniulive-2017-01-17-17-10-36
} NT_SP_RecorderFileNameRuler;

/* Metadata passed along with pulled video data */
typedef struct _NT_SP_PullStreamVideoDataInfo
{
    NT_INT32  is_key_frame_;           /* 1: key frame, 0: non-key frame */
    NT_UINT64 timestamp_;              /* decode timestamp, in milliseconds */
    NT_INT32  width_;                  /* generally 0 */
    NT_INT32  height_;                 /* generally 0 */
    NT_BYTE*  parameter_info_;         /* generally NULL */
    NT_UINT32 parameter_info_size_;    /* generally 0 */
    NT_UINT64 presentation_timestamp_; /* presentation timestamp, in milliseconds; must be >= timestamp_ */
} NT_SP_PullStreamVideoDataInfo;

/* Metadata passed along with pulled audio data */
typedef struct _NT_SP_PullStreamAuidoDataInfo
{
    NT_INT32  is_key_frame_;        /* 1: key frame, 0: non-key frame */
    NT_UINT64 timestamp_;           /* in milliseconds */
    NT_INT32  sample_rate_;         /* generally 0 */
    NT_INT32  channel_;             /* generally 0 */
    NT_BYTE*  parameter_info_;      /* set for AAC; generally ignored for other codecs */
    NT_UINT32 parameter_info_size_; /* set for AAC; generally ignored for other codecs */
    NT_UINT64 reserve_;             /* reserved */
} NT_SP_PullStreamAuidoDataInfo;

/* Called back when the player obtains the video size */
typedef NT_VOID (NT_CALLBACK* SP_SDKVideoSizeCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_INT32 width, NT_INT32 height);

/* Callback interface passed in when calling Start */
typedef NT_VOID (NT_CALLBACK* SP_SDKStartPlayCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 result);

/* Video image callback
 * status: currently unused, 0 by default; may be used in the future
 */
typedef NT_VOID (NT_CALLBACK* SP_SDKVideoFrameCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 status, const NT_SP_VideoFrame* frame);

/* Audio PCM data callback; the current frame length is 10 ms
 * status: currently unused, 0 by default; may be used in the future
 * data: PCM data
 * size: data size
 * sample_rate: sample rate
 * channel: number of channels
 * per_channel_sample_number: number of samples per channel
 */
typedef NT_VOID (NT_CALLBACK* NT_SP_SDKAudioPCMFrameCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 status, NT_BYTE* data, NT_UINT32 size,
    NT_INT32 sample_rate, NT_INT32 channel, NT_INT32 per_channel_sample_number);

/* Screenshot callback
 * result: NT_ERC_OK if the screenshot succeeded, otherwise an error code
 */
typedef NT_VOID (NT_CALLBACK* SP_SDKCaptureImageCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 result, NT_PCSTR file_name);

/* Called back with the video frame timestamp while drawing video.
 * Intended for special scenarios; users without special needs can ignore it.
 * timestamp: in milliseconds
 * reserve1, reserve2: reserved parameters
 */
typedef NT_VOID (NT_CALLBACK* SP_SDKRenderVideoFrameTimestampCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT64 timestamp, NT_UINT64 reserve1, NT_PVOID reserve2);

/* Recording callback
 * status: 1: a new recording file has been opened for writing; 2: a recording file has been fully written
 * file_name: actual recording file name
 */
typedef NT_VOID (NT_CALLBACK* SP_SDKRecorderCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 status, NT_PCSTR file_name);

/* Video data callback when pulling a stream
 * video_codec_id: see NT_MEDIA_CODEC_ID
 * data: video data
 * size: video data size
 * info: information related to the video data
 * reserve: reserved parameter
 */
typedef NT_VOID (NT_CALLBACK* SP_SDKPullStreamVideoDataCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 video_codec_id, NT_BYTE* data, NT_UINT32 size,
    NT_SP_PullStreamVideoDataInfo* info, NT_PVOID reserve);

/* Audio data callback when pulling a stream
 * auido_codec_id: see NT_MEDIA_CODEC_ID
 * data: audio data
 * size: audio data size
 * info: information related to the audio data
 * reserve: reserved parameter
 */
typedef NT_VOID (NT_CALLBACK* SP_SDKPullStreamAudioDataCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 auido_codec_id, NT_BYTE* data, NT_UINT32 size,
    NT_SP_PullStreamAuidoDataInfo* info, NT_PVOID reserve);

/* Player event callback
 * event_id: event ID, see NT_SP_E_EVENT_ID
 * param1 .. param6: their meaning depends on the event ID; if an event ID does not document
 * param1-param6, that event carries no parameters
 */
typedef NT_VOID (NT_CALLBACK* NT_SP_SDKEventCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_UINT32 event_id,
    NT_INT64 param1, NT_INT64 param2, NT_UINT64 param3,
    NT_PCSTR param4, NT_PCSTR param5, NT_PVOID param6);

/* User data callback; the data is currently sent by the push (publisher) side
 * data_type: 1: binary bytes; 2: utf8 string
 * data: actual data; if data_type is 1 the type is const NT_BYTE*, if data_type is 2 the type is const NT_CHAR*
 * size: data size
 * timestamp: video timestamp
 * reserve1, reserve2, reserve3: reserved
 */
typedef NT_VOID (NT_CALLBACK* NT_SP_SDKUserDataCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_INT32 data_type, NT_PVOID data, NT_UINT32 size, NT_UINT64 timestamp,
    NT_UINT64 reserve1, NT_INT64 reserve2, NT_PVOID reserve3);

/* Video SEI data callback
 * data: SEI data
 * size: SEI data size
 * timestamp: video timestamp
 * reserve1, reserve2, reserve3: reserved
 * Note: testing shows some videos carry several SEI NALs. For the user's convenience, all parsed
 * SEI NALs are delivered together, still separated by 00 00 00 01 start codes for easy parsing;
 * the delivered SEI data currently also begins with a 00 00 00 01 prefix.
 */
typedef NT_VOID (NT_CALLBACK* NT_SP_SDKSEIDataCallBack)(NT_HANDLE handle, NT_PVOID user_data,
    NT_BYTE* data, NT_UINT32 size, NT_UINT64 timestamp,
    NT_UINT64 reserve1, NT_INT64 reserve2, NT_PVOID reserve3);

#ifdef __cplusplus
}
#endif

#endif
```
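A typical client reacts to the event IDs above inside a handler with the shape of `NT_SP_SDKEventCallBack`, switching on `event_id` and reading `param1` where the enum documents it (buffering progress, download speed). The sketch below is illustrative only: the base value, the free-standing ID macros, and the simplified handler signature are assumptions standing in for the SDK's real definitions:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical base value; in the real SDK this comes from NT_EVENT_ID_SMART_PLAYER_SDK. */
#define EVENT_ID_BASE            0x03000000u
#define EVENT_ID_CONNECTING      (EVENT_ID_BASE | 0x2)
#define EVENT_ID_CONNECTED       (EVENT_ID_BASE | 0x4)
#define EVENT_ID_DISCONNECTED    (EVENT_ID_BASE | 0x5)
#define EVENT_ID_BUFFERING       (EVENT_ID_BASE | 0x82)
#define EVENT_ID_DOWNLOAD_SPEED  (EVENT_ID_BASE | 0x91)

/* Render a player event as a short status string; param1 carries the
 * buffering percentage or the download speed in bytes/s, as documented
 * in the event-ID enum. */
static void on_player_event(char* out, size_t out_size,
                            uint32_t event_id, int64_t param1)
{
    switch (event_id) {
    case EVENT_ID_CONNECTING:
        snprintf(out, out_size, "connecting"); break;
    case EVENT_ID_CONNECTED:
        snprintf(out, out_size, "connected"); break;
    case EVENT_ID_DISCONNECTED:
        snprintf(out, out_size, "disconnected"); break;
    case EVENT_ID_BUFFERING:
        snprintf(out, out_size, "buffering %lld%%", (long long)param1); break;
    case EVENT_ID_DOWNLOAD_SPEED:
        snprintf(out, out_size, "speed %lld KB/s", (long long)(param1 / 1024)); break;
    default:
        snprintf(out, out_size, "event 0x%x", (unsigned)event_id); break;
    }
}
```

In a real integration the handler receives `handle` and `user_data` as well, and unrecognized event IDs should be ignored rather than treated as errors, since the SDK may add IDs over time.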
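Since the SEI callback delivers all parsed SEI NALs in one buffer separated by `00 00 00 01` start codes, the receiver has to split the buffer back into individual NALs. A minimal splitter, written against that delivery format (function name and the offset/length output convention are my own), might look like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Split a buffer of NALs prefixed by 00 00 00 01 start codes (as the SEI
 * callback delivers them) into at most max_nals (offset, length) pairs,
 * where offsets point past the start code. Returns the number of NALs found. */
static int split_annexb(const uint8_t* data, size_t size,
                        size_t* offsets, size_t* lengths, int max_nals)
{
    int count = 0;
    size_t i = 0, start = 0;
    int in_nal = 0;
    while (i + 4 <= size) {
        if (data[i] == 0 && data[i+1] == 0 && data[i+2] == 0 && data[i+3] == 1) {
            /* A start code ends the previous NAL, if any. */
            if (in_nal && count < max_nals) {
                offsets[count] = start;
                lengths[count] = i - start;
                ++count;
            }
            i += 4;
            start = i;
            in_nal = 1;
        } else {
            ++i;
        }
    }
    /* The last NAL runs to the end of the buffer. */
    if (in_nal && count < max_nals) {
        offsets[count] = start;
        lengths[count] = size - start;
        ++count;
    }
    return count;
}
```

Scanning for the 4-byte start code is safe inside NAL payloads because H.264 emulation-prevention bytes guarantee the pattern cannot occur within an encoded NAL.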

Summary

In general, whether you build on top of an open-source player or develop one fully in-house, designing a good RTMP or RTSP player means thinking hard about how to make it flexible, stable, and simple to integrate. Exposing only a handful of fixed interfaces is rarely enough to meet the requirements of a generalized product.

A final word of encouragement: we climb step by step to the top of the mountain, not to enjoy the view, but to look for a higher mountain!