content for all in one: February 2016

SPS in H264:

An h.264 bitstream contains a sequence of Network Abstraction Layer (NAL) units. The SPS and PPS are both types of NAL units. The SPS NAL unit contains parameters that apply to a series of consecutive coded video pictures, referred to as a “coded video sequence” in the h.264 standard. The PPS NAL unit contains parameters that apply to the decoding of one or more individual pictures inside a coded video sequence.

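/* h.264 bitstreams: each array is a 4-byte start code (00 00 00 01) followed by one NAL unit */

#include <stdint.h>

/* Sequence parameter set (the NAL header byte 0x67 means nal_unit_type 7) */
const uint8_t sps[] = {0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0x00, 0x0a, 0xf8, 0x41, 0xa2};

/* Picture parameter set (the NAL header byte 0x68 means nal_unit_type 8) */
const uint8_t pps[] = {0x00, 0x00, 0x00, 0x01, 0x68, 0xce, 0x38, 0x80};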
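Before going field by field, it may help to see how the first byte after the start code splits into the three NAL header fields. The following is only a minimal sketch: the names nal_header, start_code_len and parse_nal_header are my own (hypothetical) helpers, not part of any library, and it assumes the buffer begins with a 3- or 4-byte start code as in the arrays above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The one-byte NAL unit header (hypothetical struct for illustration). */
typedef struct {
    unsigned forbidden_zero_bit; /* u(1) */
    unsigned nal_ref_idc;        /* u(2) */
    unsigned nal_unit_type;      /* u(5): 7 = SPS, 8 = PPS */
} nal_header;

/* Returns the length of the start code at buf (3 or 4), or 0 if there is none. */
size_t start_code_len(const uint8_t *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 3;
    if (len >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 4;
    return 0;
}

/* Skips the start code and splits the header byte into its three fields.
 * Returns 0 on success, -1 if there is no start code or no header byte. */
int parse_nal_header(const uint8_t *buf, size_t len, nal_header *out)
{
    size_t sc;
    uint8_t b;

    assert(buf != NULL && out != NULL);
    sc = start_code_len(buf, len);
    if (sc == 0 || len <= sc)
        return -1;

    b = buf[sc];
    out->forbidden_zero_bit = (b >> 7) & 0x01u; /* top bit */
    out->nal_ref_idc        = (b >> 5) & 0x03u; /* next two bits */
    out->nal_unit_type      = b & 0x1fu;        /* low five bits */
    return 0;
}
```

Calling parse_nal_header(sps, sizeof sps, &h) on the sps array should report nal_unit_type 7, and the same call on pps should report 8.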

Let’s decode these bytes into the named fields that the spec defines. The first thing I did was look at section 7 of the h.264 specification, which showed that, at a minimum, I had to choose values for the SPS parameters in the table below. In the table, as in the standard, the type u(n) indicates an unsigned integer of n bits, and ue(v) indicates an unsigned Exp-Golomb (exponential-Golomb) coded value with a variable number of bits. The spec doesn’t seem to define the maximum number of bits anywhere, but the reference encoder software uses 32. (People wishing to explore the security of decoder software may find it interesting to violate this assumption!)
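To make u(n) and ue(v) concrete, here is a small bit-reader sketch. The names bit_reader, br_init, br_read_u and br_read_ue are hypothetical helpers of my own. A ue(v) value is read by counting leading zero bits, skipping the 1 bit that ends them, then reading that many more bits as a suffix; the value is 2^zeros - 1 + suffix. I have assumed a cap of 31 leading zeros so the result always fits in a uint32_t; that cap is my choice for this sketch, not something taken from the text above. Emulation prevention bytes are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads bits MSB-first from a byte buffer (hypothetical helper type). */
typedef struct {
    const uint8_t *data;
    size_t size_bits; /* total number of bits available */
    size_t pos;       /* index of the next bit to read */
} bit_reader;

void br_init(bit_reader *br, const uint8_t *data, size_t size_bytes)
{
    assert(br != NULL && data != NULL);
    br->data = data;
    br->size_bits = size_bytes * 8;
    br->pos = 0;
}

/* u(n): unsigned integer of n bits, n <= 32.
 * Returns 0 on success, -1 if n is too large or the buffer runs out. */
int br_read_u(bit_reader *br, unsigned n, uint32_t *out)
{
    uint32_t v = 0;
    unsigned i;

    if (n > 32 || br->size_bits - br->pos < n)
        return -1;
    for (i = 0; i < n; i++) {
        unsigned bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
        v = (v << 1) | bit;
        br->pos++;
    }
    *out = v;
    return 0;
}

/* ue(v): unsigned Exp-Golomb. Returns 0 on success, -1 on error. */
int br_read_ue(bit_reader *br, uint32_t *out)
{
    unsigned zeros = 0;
    uint32_t bit, suffix;

    for (;;) {
        if (br_read_u(br, 1, &bit) != 0)
            return -1;
        if (bit)
            break;
        if (++zeros > 31) /* assumed cap: keeps the result within 32 bits */
            return -1;
    }
    if (br_read_u(br, zeros, &suffix) != 0)
        return -1;
    *out = (((uint32_t)1) << zeros) - 1 + suffix;
    return 0;
}
```

For example, the bit string 0001000 has three leading zeros and a suffix of 000, so it decodes to 2^3 - 1 + 0 = 7, and a lone 1 bit decodes to 0.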

Parameter Name	Type	Value	Comments
forbidden_zero_bit	u(1)	0	Despite being forbidden, it must be set to 0!
nal_ref_idc	u(2)	3	3 means it is “important” (this is an SPS)
nal_unit_type	u(5)	7	Indicates this is a sequence parameter set
profile_idc	u(8)	66	Baseline profile
constraint_set0_flag	u(1)	0	We’re not going to honor constraints
constraint_set1_flag	u(1)	0	We’re not going to honor constraints
constraint_set2_flag	u(1)	0	We’re not going to honor constraints
constraint_set3_flag	u(1)	0	We’re not going to honor constraints
reserved_zero_4bits	u(4)	0	Better set them to zero
level_idc	u(8)	10	Level 1, sec A.3.1
seq_parameter_set_id	ue(v)	0	We’ll just use id 0.
log2_max_frame_num_minus4	ue(v)	0	Let’s have as few frame numbers as possible
pic_order_cnt_type	ue(v)	0	Keep things simple
log2_max_pic_order_cnt_lsb_minus4	ue(v)	0	Fewer is better.
num_ref_frames	ue(v)	0	We will only send I slices
gaps_in_frame_num_value_allowed_flag	u(1)	0	We will have no gaps
pic_width_in_mbs_minus1	ue(v)	7	SQCIF is 8 macroblocks wide
pic_height_in_map_units_minus1	ue(v)	5	SQCIF is 6 macroblocks high
frame_mbs_only_flag	u(1)	1	We will not do field/frame encoding
direct_8x8_inference_flag	u(1)	0	Used for B slices. We will not send B slices
frame_cropping_flag	u(1)	0	We will not do frame cropping
vui_parameters_present_flag	u(1)	0	We will not send VUI data
rbsp_stop_one_bit	u(1)	1	Stop bit. I missed this at first and it caused me much trouble.
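As a cross-check of the table, the sketch below writes the values in table order with a small bit writer and should reproduce the seven SPS bytes that follow the start code in the sps array. The names bit_writer, bw_put_u, bw_put_ue and build_sqcif_sps are hypothetical helpers of my own. It assumes the output buffer is large enough, does not prepend a start code, and does not insert emulation prevention bytes (this particular SPS does not seem to need any).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes bits MSB-first into a zeroed byte buffer (hypothetical helper type). */
typedef struct {
    uint8_t *buf;
    size_t cap; /* capacity in bytes */
    size_t pos; /* number of bits written so far */
} bit_writer;

/* u(n): write the low n bits of v, most significant bit first. */
void bw_put_u(bit_writer *bw, unsigned n, uint32_t v)
{
    unsigned i;

    assert(n <= 32 && bw->pos + n <= bw->cap * 8);
    for (i = n; i-- > 0;) {
        if ((v >> i) & 1u)
            bw->buf[bw->pos >> 3] |= (uint8_t)(0x80u >> (bw->pos & 7));
        bw->pos++;
    }
}

/* ue(v): len zero bits, then (v + 1) written in len + 1 bits,
 * where len = floor(log2(v + 1)). */
void bw_put_ue(bit_writer *bw, uint32_t v)
{
    uint32_t code;
    unsigned len = 0;

    assert(v != UINT32_MAX);
    code = v + 1;
    while ((code >> len) > 1)
        len++;
    bw_put_u(bw, len, 0);
    bw_put_u(bw, len + 1, code);
}

/* Writes the SPS NAL unit from the table; returns its length in bytes. */
size_t build_sqcif_sps(uint8_t *out, size_t cap)
{
    bit_writer bw;
    int i;

    bw.buf = out;
    bw.cap = cap;
    bw.pos = 0;
    memset(out, 0, cap);

    bw_put_u(&bw, 1, 0);   /* forbidden_zero_bit */
    bw_put_u(&bw, 2, 3);   /* nal_ref_idc */
    bw_put_u(&bw, 5, 7);   /* nal_unit_type: SPS */
    bw_put_u(&bw, 8, 66);  /* profile_idc: Baseline */
    bw_put_u(&bw, 4, 0);   /* constraint_set0..3_flag */
    bw_put_u(&bw, 4, 0);   /* reserved_zero_4bits */
    bw_put_u(&bw, 8, 10);  /* level_idc: Level 1 */
    for (i = 0; i < 5; i++)
        bw_put_ue(&bw, 0); /* seq_parameter_set_id through num_ref_frames */
    bw_put_u(&bw, 1, 0);   /* gaps_in_frame_num_value_allowed_flag */
    bw_put_ue(&bw, 7);     /* pic_width_in_mbs_minus1 */
    bw_put_ue(&bw, 5);     /* pic_height_in_map_units_minus1 */
    bw_put_u(&bw, 1, 1);   /* frame_mbs_only_flag */
    bw_put_u(&bw, 1, 0);   /* direct_8x8_inference_flag */
    bw_put_u(&bw, 1, 0);   /* frame_cropping_flag */
    bw_put_u(&bw, 1, 0);   /* vui_parameters_present_flag */
    bw_put_u(&bw, 1, 1);   /* rbsp_stop_one_bit */

    /* The buffer was zeroed, so the alignment padding is already 0 bits. */
    return (bw.pos + 7) / 8;
}
```

The table adds up to 55 bits, so one padding bit rounds the NAL unit up to 7 bytes. Leaving out the stop bit would still give 7 bytes here, just with a different last byte, which may be why that mistake is easy to miss.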

A handy tool for decoding h.264 bitstreams, including the SPS, is h264bitstream. It comes with a command-line program that decodes a bitstream into the parameter names defined in the h.264 specification. Let’s look at its output for a sample MP4 file I downloaded from YouTube. First, I extract the h.264 NAL units from the file using ffmpeg:

ffmpeg.exe -i video.mp4 -vcodec copy -bsf:v h264_mp4toannexb -an of.h264

The NAL units now reside in the file of.h264. I then run the h264_analyze command from the h264bitstream package to produce the following output:

h264_analyze of.h264

!! Found NAL at offset 4 (0x0004), size 25 (0x0019)
==================== NAL ====================
forbidden_zero_bit : 0
nal_ref_idc : 3
nal_unit_type : 7 ( Sequence parameter set )
======= SPS =======
profile_idc : 100
constraint_set0_flag : 0
constraint_set1_flag : 0
constraint_set2_flag : 0
constraint_set3_flag : 0
reserved_zero_4bits : 0
level_idc : 31
seq_parameter_set_id : 0
chroma_format_idc : 1
residual_colour_transform_flag : 0
bit_depth_luma_minus8 : 0
bit_depth_chroma_minus8 : 0
qpprime_y_zero_transform_bypass_flag : 0
seq_scaling_matrix_present_flag : 0
log2_max_frame_num_minus4 : 3
pic_order_cnt_type : 0
log2_max_pic_order_cnt_lsb_minus4 : 3
delta_pic_order_always_zero_flag : 0
offset_for_non_ref_pic : 0
offset_for_top_to_bottom_field : 0
num_ref_frames_in_pic_order_cnt_cycle : 0
num_ref_frames : 1
gaps_in_frame_num_value_allowed_flag : 0
pic_width_in_mbs_minus1 : 79
pic_height_in_map_units_minus1 : 44
frame_mbs_only_flag : 1
mb_adaptive_frame_field_flag : 0
direct_8x8_inference_flag : 1
frame_cropping_flag : 0
frame_crop_left_offset : 0
frame_crop_right_offset : 0
frame_crop_top_offset : 0
frame_crop_bottom_offset : 0
vui_parameters_present_flag : 1
=== VUI ===
aspect_ratio_info_present_flag : 1
aspect_ratio_idc : 1
sar_width : 0
sar_height : 0
overscan_info_present_flag : 0
overscan_appropriate_flag : 0
video_signal_type_present_flag : 0
video_signal_type_present_flag : 0
video_format : 0
video_full_range_flag : 0
colour_description_present_flag : 0
colour_primaries : 0
transfer_characteristics : 0
matrix_coefficients : 0
chroma_loc_info_present_flag : 0
chroma_sample_loc_type_top_field : 0
chroma_sample_loc_type_bottom_field : 0
timing_info_present_flag : 1
num_units_in_tick : 100
time_scale : 5994
fixed_frame_rate_flag : 1
nal_hrd_parameters_present_flag : 0
vcl_hrd_parameters_present_flag : 0
low_delay_hrd_flag : 0
pic_struct_present_flag : 0
bitstream_restriction_flag : 1
motion_vectors_over_pic_boundaries_flag : 1
max_bytes_per_pic_denom : 0
max_bits_per_mb_denom : 0
log2_max_mv_length_horizontal : 11
log2_max_mv_length_vertical : 11
num_reorder_frames : 0
max_dec_frame_buffering : 1
=== HRD ===
cpb_cnt_minus1 : 0
bit_rate_scale : 0
cpb_size_scale : 0
initial_cpb_removal_delay_length_minus1 : 0
cpb_removal_delay_length_minus1 : 0
dpb_output_delay_length_minus1 : 0
time_offset_length : 0

The only additional thing I’d like to point out here is that this particular SPS also contains information about the frame rate of the video (see timing_info_present_flag). These parameters must be closely checked when you generate bitstreams to ensure they agree with the container format that the h.264 will eventually be muxed into. Even a small error, such as 29.97 fps in one place and 30 fps in another, can result in severe audio/video synchronization problems.
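To see what frame rate the dump above implies, here is a rough sketch of the arithmetic. The function names h264_vui_fps and vui_matches_container_rate are hypothetical helpers of my own. It assumes fixed_frame_rate_flag is 1 and that one frame spans two ticks (a tick corresponds to a field), so the frame rate is time_scale / (2 * num_units_in_tick). With num_units_in_tick = 100 and time_scale = 5994 that gives 5994 / 200 = 29.97 fps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Frame rate implied by the VUI timing fields, assuming two ticks per frame.
 * Returns 0.0 if num_units_in_tick is 0 (no usable timing info). */
double h264_vui_fps(uint32_t num_units_in_tick, uint32_t time_scale)
{
    if (num_units_in_tick == 0)
        return 0.0;
    return (double)time_scale / (2.0 * (double)num_units_in_tick);
}

/* Exact comparison against a container frame rate given as fps_num / fps_den.
 * Cross-multiplying in 64 bits avoids floating-point rounding:
 *   time_scale / (2 * num_units_in_tick) == fps_num / fps_den */
bool vui_matches_container_rate(uint32_t num_units_in_tick, uint32_t time_scale,
                                uint32_t fps_num, uint32_t fps_den)
{
    assert(fps_den != 0);
    return (uint64_t)time_scale * fps_den ==
           2u * (uint64_t)num_units_in_tick * fps_num;
}
```

A check like vui_matches_container_rate(100, 5994, 30, 1) returns false, which is exactly the 29.97 versus 30 fps kind of mismatch worth catching before muxing.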

Tuesday, 23 February 2016

Sequence parameter set in H264
