Audio/Video Exploration (7): Using the FLV Protocol in RTMP

September 23, 2023

FLV is short for Flash Video, a streaming media format that emerged with the release of Flash MX. Because FLV files are very small and load very quickly, they made watching video over the network practical, and they solved the problem of SWF files becoming too large to deliver over the web once video was imported into Flash. Both RTMP and HTTP-FLV use the FLV format internally to encapsulate H.264 and AAC audio/video packets, and FLV is big-endian.

The FLV format consists of an FLV Header followed by an FLV Body.

The FLV Body is made up of the following parts:


FLV Body = PreviousTagSize0 + Tag1 + ... + PreviousTagSizeN + TagN
Tag = FLV tag header + Tag data
Tag data = Video (video tag header + H.264 data) / Audio (audio tag header + AAC data)
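For context, the FLV Header that precedes the body is a fixed 9 bytes: the signature "FLV", a version byte, the audio/video presence flags, and the header size. Below is a minimal sketch of building it (the function name is illustrative; the flags assume a stream that may carry both audio and video):

fun getFlvFileHeader(hasAudio: Boolean = true, hasVideo: Boolean = true): ByteArray {
    val header = ByteArray(9)
    header[0] = 'F'.code.toByte()   // signature "FLV"
    header[1] = 'L'.code.toByte()
    header[2] = 'V'.code.toByte()
    header[3] = 0x01                // version
    var flags = 0
    if (hasAudio) flags = flags or 0x04  // bit 2: audio tags present
    if (hasVideo) flags = flags or 0x01  // bit 0: video tags present
    header[4] = flags.toByte()
    // DataOffset: size of this header, always 9 for FLV version 1 (UI32, big-endian)
    header[5] = 0x00
    header[6] = 0x00
    header[7] = 0x00
    header[8] = 0x09
    return header
}

Note that RTMP never transmits this file header; it is only written when saving a standalone FLV file.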

To get more familiar with the FLV protocol, let's look at how FLV data is packaged inside RTMP.

RTMP Video FLV Data (encapsulating H.264)

Normally, an FLV Video Tag consists of the 11-byte FLV tag header plus the Video tag header.


FLV tag header fields:

PreviousTagSize: length of the previous tag (tag header + tag data); 0 for the first tag
TagType: 8 = audio, 9 = video
DataSize: length of the tag data that follows, e.g. the audio tag header plus its audio data
Timestamp: tag timestamp in milliseconds, used by the player for decode/playback timing
TimestampExtended: the upper 8 bits of the timestamp, used when Timestamp alone cannot hold the value
StreamID: always 0
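As a sketch of how these fields pack into the 11-byte tag header (a helper like this is only needed when writing a standalone FLV stream; as described below, RTMP drops this header, and the function name here is illustrative):

fun getFlvTagHeader(tagType: Int, dataSize: Int, timestampMs: Int): ByteArray {
    val header = ByteArray(11)
    header[0] = tagType.toByte()                        // TagType: 8 = audio, 9 = video
    header[1] = (dataSize shr 16 and 0xFF).toByte()     // DataSize, UI24 big-endian
    header[2] = (dataSize shr 8 and 0xFF).toByte()
    header[3] = (dataSize and 0xFF).toByte()
    header[4] = (timestampMs shr 16 and 0xFF).toByte()  // Timestamp, lower 24 bits
    header[5] = (timestampMs shr 8 and 0xFF).toByte()
    header[6] = (timestampMs and 0xFF).toByte()
    header[7] = (timestampMs shr 24 and 0xFF).toByte()  // TimestampExtended, upper 8 bits
    header[8] = 0x00                                    // StreamID, always 0
    header[9] = 0x00
    header[10] = 0x00
    return header
}

Each tag is preceded by a 4-byte PreviousTagSize, which is 0 before the first tag and 11 + DataSize of the previous tag afterwards.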

However, in the RTMP protocol the Video Tag does not include the FLV tag header; only the Video tag header is present, and the encapsulated H.264 data has to distinguish SPS/PPS NALUs from regular NALUs. The structure is as follows:

  • PPS/SPS NALU


Taking the SPS NALU and PPS NALU produced by MediaCodec as an example:

import java.nio.ByteBuffer

fun getConfigurationFlvVideoTag(ppsByteBuffer: ByteBuffer, spsByteBuffer: ByteBuffer): ByteArray {
    /**
     * Build the AVCDecoderConfigurationRecord
     */
    val configRecordData = generateAVCDecoderConfigurationRecord(ppsByteBuffer, spsByteBuffer)

    /**
     * ====================================================
     * FLV Video tag header (5 bytes) + configRecord
     * ====================================================
     */
    val flvVideoTagLen = 5
    val videoTagLen = flvVideoTagLen + configRecordData.size
    val videoTag = ByteArray(videoTagLen)

    /**
     * FLV Video tag header (5 bytes):
     *
     * UB[4]  FrameType: 1 = key frame, 2 = inter frame
     * UB[4]  CodecID: 7 = AVC (H.264)
     * UB[8]  AVCPacketType: 0x00 = AVC sequence header (SPS/PPS), 0x01 = AVC NALU
     * UB[24] CompositionTime: 0x00 0x00 0x00 (ignored here)
     */
    videoTag[0] = 0x17  // FrameType(1) + CodecID(7) = 0001 0111; the sequence header is sent as a key frame
    videoTag[1] = 0x00  // AVCPacketType = 0x00 (sequence header)
    videoTag[2] = 0x00  // Composition Time = 0x00 0x00 0x00
    videoTag[3] = 0x00
    videoTag[4] = 0x00
    
    System.arraycopy(configRecordData, 0, videoTag, flvVideoTagLen, configRecordData.size)

    return videoTag
}


fun generateAVCDecoderConfigurationRecord(ppsByteBuffer: ByteBuffer, spsByteBuffer: ByteBuffer): ByteArray {
    /**
     * RTMP FLV does not need the NALU start code,
     * so skip the 4-byte start code (0x00000001) in front of the SPS and PPS NALUs
     */
    spsByteBuffer.position(4)
    ppsByteBuffer.position(4)

    /**
     * ====================================================
     * Copy the SPS and PPS into the buffer, reserving 11 bytes for configuration:
     * sps configuration (8 bytes) + sps data + pps configuration (3 bytes) + pps data
     * ====================================================
     */
    val spsLen = spsByteBuffer.remaining()
    val ppsLen = ppsByteBuffer.remaining()
    val totalLen = 11 + spsLen + ppsLen
    val recordData = ByteArray(totalLen)
    spsByteBuffer.get(recordData, 8, spsLen)
    ppsByteBuffer.get(recordData, 8 + spsLen + 3, ppsLen)

    /**
     * SPS configuration
     * UB[8]  configurationVersion = 0x01
     * UB[8]  AVCProfileIndication  = profile_idc taken from the SPS
     * UB[8]  profile_compatibility = constraint flags taken from the SPS
     * UB[8]  AVCLevelIndication    = level_idc taken from the SPS
     * UB[8]  reserved(6) + lengthSizeMinusOne(2) = 0xFF (4-byte NALU length field)
     * UB[8]  reserved(3) + numOfSequenceParameterSets(5) = 0xE1 (one SPS)
     * UB[16] sequenceParameterSetLength: SPS length, 2 bytes
     */
    recordData[0] = 0x01.toByte()
    // profile_idc, constraint flags and level_idc are bytes 1-3 of the SPS NALU
    // (after its NAL header byte), which was copied into recordData at offset 8
    recordData[1] = recordData[9]
    recordData[2] = recordData[10]
    recordData[3] = recordData[11]
    recordData[4] = 0xFF.toByte()
    recordData[5] = 0xE1.toByte()
    recordData[6] = (spsLen shr 8 and 0xFF).toByte()
    recordData[7] = (spsLen and 0xFF).toByte()

    /**
     * PPS configuration
     * UB[8]  numOfPictureParameterSets = 0x01
     * UB[16] pictureParameterSetLength: PPS length, 2 bytes
     */
    recordData[8 + spsLen] = 0x01.toByte()
    recordData[8 + spsLen + 1] = (ppsLen shr 8 and 0xFF).toByte()
    recordData[8 + spsLen + 2] = (ppsLen and 0xFF).toByte()
    return recordData
}
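As a usage sketch (not part of the original code; mediaCodec here stands for an H.264 MediaCodec encoder), the SPS and PPS can be read from the encoder's output format once MediaCodec reports that the format has changed; csd-0 holds the SPS and csd-1 holds the PPS, each prefixed with the 0x00000001 start code:

// After dequeueOutputBuffer() returns MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
val format = mediaCodec.outputFormat
val sps = format.getByteBuffer("csd-0")  // SPS NALU with 0x00000001 start code
val pps = format.getByteBuffer("csd-1")  // PPS NALU with 0x00000001 start code
if (sps != null && pps != null) {
    val configTag = getConfigurationFlvVideoTag(pps, sps)
    // Send configTag as the very first video message, before any NALU tags
}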
  • Regular NALU (IDR frame / I-frame / non-I-frame)


Taking a NALU produced by MediaCodec as an example:

import android.media.MediaCodec
import java.nio.ByteBuffer

fun getFlvVideoTag(buffer: ByteBuffer, bufferInfo: MediaCodec.BufferInfo): ByteArray {

    /**
     * RTMP FLV does not need the NALU start code,
     * so skip the 4-byte start code (0x00000001) in front of the NALU
     */
    buffer.position(bufferInfo.offset + 4)
    buffer.limit(bufferInfo.offset + bufferInfo.size)

    /**=====================================================
     * FLV Video tag header (5 bytes) + NALU length (4 bytes) + NALU data
     * =====================================================
     */
    val flvVideoTagLen = 5
    val naluLen = 4
    val naluDataSize = buffer.remaining()
    val videoFlvDataLen = flvVideoTagLen + naluLen + naluDataSize
    val videoFlvData = ByteArray(videoFlvDataLen)

    /**
     * FLV Video tag header (5 bytes):
     *
     * UB[4]  FrameType: 1 = key frame (IDR), 2 = inter frame
     * UB[4]  CodecID: 7 = AVC (H.264)
     * UB[8]  AVCPacketType: 0x00 = AVC sequence header (SPS/PPS), 0x01 = AVC NALU
     * UB[24] CompositionTime: 0x00 0x00 0x00 (ignored here)
     */
    val isKeyFrame = (bufferInfo.flags and MediaCodec.BUFFER_FLAG_KEY_FRAME) != 0
    videoFlvData[0] = (if (isKeyFrame) 0x17 else 0x27).toByte()  // FrameType(1 or 2) + CodecID(7)
    videoFlvData[1] = 0x01  // AVCPacketType = 0x01 (NALU data, not a sequence header)
    videoFlvData[2] = 0x00  // Composition Time = 0x00 0x00 0x00
    videoFlvData[3] = 0x00
    videoFlvData[4] = 0x00

    /**
     * NALU length field (4 bytes, big-endian)
     *
     * UB[32] naluLen = length of the NALU in bytes
     */
    videoFlvData[5] = (naluDataSize shr 24 and 0xFF).toByte()
    videoFlvData[6] = (naluDataSize shr 16 and 0xFF).toByte()
    videoFlvData[7] = (naluDataSize shr 8 and 0xFF).toByte()
    videoFlvData[8] = (naluDataSize and 0xFF).toByte()

    /**
     * NALU data
     */
    buffer.get(videoFlvData, flvVideoTagLen + naluLen, naluDataSize)

    return videoFlvData
}
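To show where the helpers above fit, here is a hedged sketch of a MediaCodec drain loop; drainVideoEncoder and the sendTag callback are illustrative placeholders for whatever RTMP sender is actually used:

import android.media.MediaCodec

fun drainVideoEncoder(codec: MediaCodec, sendTag: (tag: ByteArray, timestampMs: Long) -> Unit) {
    val bufferInfo = MediaCodec.BufferInfo()
    while (true) {
        val index = codec.dequeueOutputBuffer(bufferInfo, 10_000L)
        if (index == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            // codec.outputFormat now carries csd-0/csd-1; build and send the
            // sequence-header tag here with getConfigurationFlvVideoTag()
            continue
        }
        if (index < 0) break  // no output available for now
        val buffer = codec.getOutputBuffer(index)
        if (buffer != null && bufferInfo.size > 0 &&
            (bufferInfo.flags and MediaCodec.BUFFER_FLAG_CODEC_CONFIG) == 0
        ) {
            // Regular NALU: wrap it in a video tag and hand it to the sender
            sendTag(getFlvVideoTag(buffer, bufferInfo), bufferInfo.presentationTimeUs / 1000)
        }
        codec.releaseOutputBuffer(index, false)
    }
}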

RTMP Audio FLV Data (encapsulating AAC)

Normally, an FLV Audio Tag consists of the 11-byte FLV tag header plus the Audio tag header. In the RTMP protocol, however, the Audio Tag does not include the FLV tag header; only the Audio tag header remains, formatted as follows:


FLV Audio tag header fields (2 bytes):

SoundFormat[4]: audio format, 10 = AAC
SoundRate[2]: sampling rate, 3 = 44100 Hz (always 3 for AAC)
SoundSize[1]: sample size, 0 = 8-bit, 1 = 16-bit
SoundType[1]: channel layout, 0 = mono, 1 = stereo (always 1 for AAC per the FLV spec)
AACPacketType[8]: AAC packet type, 0 = AAC sequence header (AudioSpecificConfig), 1 = AAC raw (no ADTS header)
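Before any raw AAC frames are sent, an AAC sequence header tag (AACPacketType = 0) carrying the 2-byte AudioSpecificConfig has to be sent once. A minimal sketch, assuming AAC-LC and the default 44100 Hz / stereo shown below (the function name is illustrative):

fun getFlvAudioConfigTag(sampleRateIndex: Int = 4, channels: Int = 2): ByteArray {
    // sampleRateIndex is the MPEG-4 sampling frequency index: 4 = 44100 Hz, 3 = 48000 Hz
    val tag = ByteArray(4)
    tag[0] = 0xAF.toByte()  // SoundFormat=10 (AAC), SoundRate=3, SoundSize=1, SoundType=1
    tag[1] = 0x00           // AACPacketType = 0 (sequence header)
    // AudioSpecificConfig: audioObjectType(5) + samplingFrequencyIndex(4) + channelConfiguration(4) + padding(3)
    val audioObjectType = 2  // AAC-LC
    tag[2] = ((audioObjectType shl 3) or (sampleRateIndex shr 1)).toByte()
    tag[3] = (((sampleRateIndex and 0x01) shl 7) or (channels shl 3)).toByte()
    return tag
}

The raw-frame tag shown next (getFlvAudioTag) is then used for every encoded AAC frame that follows.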


Taking AAC frames produced by MediaCodec as an example:

import android.media.MediaCodec
import java.nio.ByteBuffer

fun getFlvAudioTag(buffer: ByteBuffer, bufferInfo: MediaCodec.BufferInfo): ByteArray {
    buffer.position(bufferInfo.offset)
    buffer.limit(bufferInfo.offset + bufferInfo.size)

    /**
     * =====================================================
     * FLV Audio tag header (2 bytes) + AAC raw data (no ADTS header)
     * =====================================================
     */
    val audioTagLen = 2
    val aacDataSize = buffer.remaining()
    val audioTagData = ByteArray(audioTagLen + aacDataSize)

    /**
     * FLV Audio tag header (2 bytes):
     *
     * UB[4] SoundFormat = 10 (AAC)
     * UB[2] SoundRate = 3 (44 kHz)
     * UB[1] SoundSize = 1 (16-bit)
     * UB[1] SoundType = 1 (the FLV spec uses 1 for AAC regardless of the channel count)
     * UB[8] AACPacketType: 0 = AAC sequence header, 1 = AAC raw
     *
     * 1010 11 1 1 (0xAF)
     */
    audioTagData[0] = 0xAF.toByte()
    audioTagData[1] = 0x01.toByte() // AACPacketType = 1 (AAC raw); 0 would mean the sequence header

    /**
     * AAC raw data
     */
    buffer.get(audioTagData, audioTagLen, aacDataSize)
    
    return audioTagData
}

