
Organization of an MPEG-2 Bitstream

Here we provide a brief overview of how an MPEG-2 video bitstream is organized. This should give you an idea of how different coding elements are related to each other and what to expect when the parser starts throwing events at you. In the text below, highlighted fixed-font text refers to specific classes in the coding element event hierarchy.

In general, an MPEG-2 bitstream is comprised of the following coding structures in this order:

  • Sequence header and sequence extension header
  • Possibly user data and/or other extension headers
  • Group of pictures header
  • Possibly user data
  • For each frame in the group of pictures...
    • Picture header and picture coding extension header
    • Possibly user data and/or other extension headers
    • Picture data

At the end of a group of pictures, this structure is repeated starting either at the sequence header level or the group of pictures header level. If the sequence header is repeated, then all values in the sequence header and sequence extension header must match the values encoded in the earlier sequence header and sequence extension header (the standard makes an exception for the quantiser matrices, which may change). At the end of the entire sequence (i.e., at the end of the video file), there should be a sequence end header. Once the sequence end header is encountered, the video stream is considered finished (i.e., any and all bytes after the sequence end header are meaningless relative to this video sequence).
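To make this ordering concrete, here is a minimal Python sketch (not part of MPEG2Event) that scans a byte buffer for start codes and labels each one. The start code values are those defined by the MPEG-2 video standard; the function names are hypothetical and chosen only for illustration.

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# Start code values (the byte after the 00 00 01 prefix) from the MPEG-2 video standard.
NAMED_START_CODES = {
    0x00: "picture",
    0xB2: "user_data",
    0xB3: "sequence_header",
    0xB5: "extension",
    0xB7: "sequence_end",
    0xB8: "group_of_pictures",
}


def classify_start_code(value: int) -> str:
    """Map the last byte of a start code to a coarse structure name."""
    if 0x01 <= value <= 0xAF:
        return "slice"
    return NAMED_START_CODES.get(value, "other")


def scan_start_codes(data: bytes):
    """Yield (byte_offset, kind) for each start code, stopping after sequence_end."""
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        kind = classify_start_code(data[pos + 3])
        yield pos, kind
        if kind == "sequence_end":
            return  # bytes after the sequence end are not part of this sequence
        pos = data.find(START_CODE_PREFIX, pos + 4)
```

Running `list(scan_start_codes(data))` over a well-formed stream should show the sequence header first, then an extension, then the group of pictures and picture start codes, with slices following each picture.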

Below are more detailed descriptions of these coding structures including the exact coding element events that are published when they are encountered.

Sequence Header and Sequence Extension

The bitstream should start with a sequence header. If you use the VideoParser object, everything before the sequence header will be parsed off and published as a sequence of OpaqueBits events. The sequence header will result in a StartCode event (the value of which should indicate that a sequence header follows) followed by the events published for each atomic coding element that is part of the sequence header. These will be, in order:

  • HorizontalSizeValue
  • VerticalSizeValue
  • AspectRatioInformation
  • FrameRateCode
  • BitRateValue
  • MarkerBit
  • VBVBufferSizeValue
  • ConstrainedParametersFlag
  • LoadIntraQuantiserMatrix
  • IntraQuantiserMatrix
    Only if previous load intra quantiser matrix flag value is set to one, otherwise not present.
  • LoadNonIntraQuantiserMatrix
  • NonIntraQuantiserMatrix
    Only if previous load non-intra quantiser matrix flag value is set to one, otherwise not present.

All of this (not including the start code) is collected together and published as a SequenceHeader compound coding element event.
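As an illustration of how these elements sit in the bitstream, the following Python sketch reads them directly from the bytes that follow the sequence header start code. It is independent of MPEG2Event: the `BitReader` class and `parse_sequence_header` function are hypothetical, and the field names and bit widths are the ones given for the sequence header in the MPEG-2 video standard.

```python
class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]  # raises IndexError past the end
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def parse_sequence_header(payload: bytes) -> dict:
    """Parse the bytes that follow the sequence header start code."""
    reader = BitReader(payload)
    header = {}
    fixed_fields = [
        ("horizontal_size_value", 12),
        ("vertical_size_value", 12),
        ("aspect_ratio_information", 4),
        ("frame_rate_code", 4),
        ("bit_rate_value", 18),
        ("marker_bit", 1),
        ("vbv_buffer_size_value", 10),
        ("constrained_parameters_flag", 1),
    ]
    for name, width in fixed_fields:
        header[name] = reader.read(width)
    # Each matrix is present only when its load flag is 1 (64 entries of 8 bits).
    for kind in ("intra", "non_intra"):
        load_flag = reader.read(1)
        header[f"load_{kind}_quantiser_matrix"] = load_flag
        if load_flag:
            header[f"{kind}_quantiser_matrix"] = [reader.read(8) for _ in range(64)]
    return header
```

The two conditional matrices are why the sequence header has no fixed length: it is 8 bytes when neither matrix is present and grows by 64 bytes for each one that is.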

Following the sequence header will be an extension start code which indicates the presence of the sequence extension header. For MPEG-2 video, this is mandatory: the sequence header must be followed by the sequence extension header. The extension start code results in a StartCode event whose value must indicate that an extension header follows. Following this will be an ExtensionStartCodeIdentifier event. This 4-bit atomic coding element determines the type of extension header that follows. In this case, its value must indicate that the extension header is a sequence extension header. The sequence extension header is comprised of the following atomic coding elements in this order:

  • ProfileAndLevelIndication
  • ProgressiveSequence
  • ChromaFormat
  • HorizontalSizeExtension
  • VerticalSizeExtension
  • BitRateExtension
  • MarkerBit
  • VBVBufferSizeExtension
  • LowDelay
  • FrameRateExtensionN
  • FrameRateExtensionD

All of this is part of the compound element SequenceExtension which is published after its component parts are parsed and published individually. Again, note that the start code and the extension start code identifier are not part of the SequenceExtension coding element.
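Several of these elements extend values from the sequence header rather than standing alone. The sketch below, which is not MPEG2Event code, parses the 48 bits that follow the extension start code (the 4-bit identifier plus the sequence extension itself) and shows how an extension supplies the high-order bits of a sequence header value. Field names and widths follow the MPEG-2 video standard; `parse_sequence_extension` and `combine` are hypothetical helper names.

```python
SEQUENCE_EXTENSION_ID = 0x1

# (name, width in bits) in bitstream order, starting right after the extension start code.
FIELDS = [
    ("extension_start_code_identifier", 4),
    ("profile_and_level_indication", 8),
    ("progressive_sequence", 1),
    ("chroma_format", 2),
    ("horizontal_size_extension", 2),
    ("vertical_size_extension", 2),
    ("bit_rate_extension", 12),
    ("marker_bit", 1),
    ("vbv_buffer_size_extension", 8),
    ("low_delay", 1),
    ("frame_rate_extension_n", 2),
    ("frame_rate_extension_d", 5),
]


def parse_sequence_extension(payload: bytes) -> dict:
    """Parse the 6 bytes that follow an extension start code."""
    total = sum(width for _, width in FIELDS)  # 48 bits
    if len(payload) * 8 < total:
        raise ValueError("truncated sequence extension")
    bits = int.from_bytes(payload[: total // 8], "big")
    result = {}
    remaining = total
    for name, width in FIELDS:
        remaining -= width
        result[name] = (bits >> remaining) & ((1 << width) - 1)
    if result["extension_start_code_identifier"] != SEQUENCE_EXTENSION_ID:
        raise ValueError("not a sequence extension")
    return result


def combine(value: int, extension: int, value_bits: int) -> int:
    """Place the extension bits above the value_bits-wide sequence header value."""
    return (extension << value_bits) | value
```

For example, the full horizontal size would be `combine(horizontal_size_value, horizontal_size_extension, 12)`, and the full bit rate `combine(bit_rate_value, bit_rate_extension, 18)`.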

Group of Pictures Header

The group of pictures header is preceded by a start code with the appropriate value indicating that what follows is a group of pictures header. The start code is published as a StartCode event and is not considered to be part of the group of pictures header itself. The group of pictures header is comprised of the following atomic coding elements in this order:

  • TimeCode
  • ClosedGOP
  • BrokenLink

These are collected together in the compound coding element GroupOfPicturesHeader which is published after all of its component parts are published.
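The time code is itself a packed 25-bit value. As a rough sketch (again outside MPEG2Event, with a hypothetical function name), the 27 bits of the group of pictures header can be unpacked as follows, using the layout given in the MPEG-2 video standard:

```python
def parse_gop_header(payload: bytes) -> dict:
    """Unpack the 27 header bits that follow the group of pictures start code."""
    if len(payload) < 4:
        raise ValueError("truncated group of pictures header")
    bits = int.from_bytes(payload[:4], "big")  # 27 header bits plus 5 trailing bits

    def field(shift: int, width: int) -> int:
        return (bits >> shift) & ((1 << width) - 1)

    return {
        # time_code (25 bits)
        "drop_frame_flag": field(31, 1),
        "hours": field(26, 5),
        "minutes": field(20, 6),
        "marker_bit": field(19, 1),
        "seconds": field(13, 6),
        "pictures": field(7, 6),
        # remaining flags
        "closed_gop": field(6, 1),
        "broken_link": field(5, 1),
    }
```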

Picture Header and Picture Coding Extension Header

The picture header follows a StartCode event with the appropriate value. It is comprised of the following atomic coding elements:

  • TemporalReference
  • PictureCodingType
  • VBVDelay
  • FullPelForwardVector
    Present only if previously encountered PictureCodingType indicates that this picture is a P or B frame.
  • ForwardFCode
    Present only if FullPelForwardVector element is present.
  • FullPelBackwardVector
    Present only if previously encountered PictureCodingType indicates that this picture is a B frame.
  • BackwardFCode
    Present only if FullPelBackwardVector element is present.
  • ExtraBitPicture
  • ExtraInformationPicture
    Present only if the previous ExtraBitPicture value is set to 1. This two-element sequence is repeated until an ExtraBitPicture element with value 0 is encountered.

The compound PictureHeader element event is published after all of its components are parsed and published individually.
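The conditional elements are easier to follow in code. This Python sketch is not how MPEG2Event is implemented; it only mirrors the rules above, with field widths taken from the MPEG-2 video standard and a hypothetical function name. Picture coding type values 1, 2, and 3 denote I, P, and B pictures.

```python
def parse_picture_header(payload: bytes) -> dict:
    """Parse the bytes that follow the picture start code."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    pos = 0

    def take(nbits: int) -> int:
        nonlocal pos
        if pos + nbits > len(bits):
            raise ValueError("truncated picture header")
        value = int(bits[pos:pos + nbits], 2)
        pos += nbits
        return value

    header = {
        "temporal_reference": take(10),
        "picture_coding_type": take(3),  # 1 = I, 2 = P, 3 = B
        "vbv_delay": take(16),
    }
    if header["picture_coding_type"] in (2, 3):  # P or B picture
        header["full_pel_forward_vector"] = take(1)
        header["forward_f_code"] = take(3)
    if header["picture_coding_type"] == 3:  # B picture only
        header["full_pel_backward_vector"] = take(1)
        header["backward_f_code"] = take(3)
    # Each extra_bit_picture equal to 1 announces one byte of extra information;
    # the loop ends on the first extra_bit_picture equal to 0.
    extra = []
    while take(1) == 1:
        extra.append(take(8))
    header["extra_information_picture"] = extra
    return header
```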

The picture header must be followed by a picture coding extension header. This is introduced with a StartCode indicating the extension, followed by an ExtensionStartCodeIdentifier giving the extension type as a picture coding extension. The picture coding extension is comprised of the following atomic coding elements which follow in this order:

  • Four FCode elements
    These provide parameters required for decoding motion vectors in different types of frames.
  • IntraDCPrecision
  • PictureStructure
  • TopFieldFirst
  • FramePredFrameDCT
  • ConcealmentMotionVectors
  • QScaleType
  • IntraVLCFormat
  • AlternateScan
  • RepeatFirstField
  • Chroma420Type
  • ProgressiveFrame
  • CompositeDisplayFlag
    If CompositeDisplayFlag is set to 1 then the following coding elements will follow:
    • VAxis
    • FieldSequence
    • SubCarrier
    • BurstAmplitude
    • SubCarrierPhase

The picture coding extension is then published as a PictureCodingExtension event.
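For reference, here is a sketch of the same layout in Python. It is not taken from MPEG2Event; the function name is hypothetical, while the field names, widths, and the identifier value 8 come from the MPEG-2 video standard. The four f_code values are stored as `f_code[s][t]`, where `s` selects forward (0) or backward (1) and `t` selects horizontal (0) or vertical (1).

```python
PICTURE_CODING_EXTENSION_ID = 0x8
PICTURE_STRUCTURE = {1: "top field", 2: "bottom field", 3: "frame picture"}

FLAGS = (
    "top_field_first", "frame_pred_frame_dct", "concealment_motion_vectors",
    "q_scale_type", "intra_vlc_format", "alternate_scan", "repeat_first_field",
    "chroma_420_type", "progressive_frame", "composite_display_flag",
)
COMPOSITE_FIELDS = (
    ("v_axis", 1), ("field_sequence", 3), ("sub_carrier", 1),
    ("burst_amplitude", 7), ("sub_carrier_phase", 8),
)


def parse_picture_coding_extension(payload: bytes) -> dict:
    """Parse the bytes that follow an extension start code."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    pos = 0

    def take(nbits: int) -> int:
        nonlocal pos
        if pos + nbits > len(bits):
            raise ValueError("truncated picture coding extension")
        value = int(bits[pos:pos + nbits], 2)
        pos += nbits
        return value

    if take(4) != PICTURE_CODING_EXTENSION_ID:
        raise ValueError("not a picture coding extension")
    ext = {"f_code": [[take(4), take(4)], [take(4), take(4)]]}
    ext["intra_dc_precision"] = take(2)
    ext["picture_structure"] = take(2)
    for name in FLAGS:
        ext[name] = take(1)
    if ext["composite_display_flag"]:  # the five composite fields are optional
        for name, width in COMPOSITE_FIELDS:
            ext[name] = take(width)
    return ext
```

`PICTURE_STRUCTURE[ext["picture_structure"]]` then tells you whether the picture that follows is a single field or a whole frame.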

Picture Data

Picture data is comprised of the coding elements that actually encode the frame's visual content. MPEG-2 frames may be organized either as a single progressive frame or two fields (odd and even). If the frame is organized as two fields, the fields are encoded as two consecutive pictures with the same temporal reference value. Various values in the picture header and picture coding extension header should provide all the information you need about the type (i.e., I, P, or B) and structure (i.e., field or frame, odd first or even first, etc.) of a particular picture.

Picture data is organized into slices. Each slice starts with a StartCode. The slice start codes range from 0x00000101 through 0x000001AF. The last byte of the slice start code provides the macroblock row number of the first macroblock encoded in this slice.

After the slice start code, if the vertical size of the frame encoded in the sequence header and sequence extension header exceeds 2800 pixels, a SliceVerticalPositionExtension element is parsed. This is rarely used, but if needed, provides the additional info needed to properly encode the macroblock row number of the first macroblock in the slice.
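The row arithmetic can be sketched as follows. This follows the formula in the MPEG-2 video standard, under which the last byte of the slice start code is one-based, so the zero-based macroblock row is that value minus one, and the 3-bit extension (when present) supplies the bits above the low 7. The function names are hypothetical.

```python
def needs_vertical_position_extension(vertical_size: int) -> bool:
    """The extension element is present only for frames taller than 2800 lines."""
    return vertical_size > 2800


def slice_macroblock_row(start_code_value: int, vertical_position_extension: int = 0) -> int:
    """Zero-based macroblock row of the first macroblock in a slice.

    start_code_value is the last byte of the slice start code (0x01..0xAF).
    """
    if not 0x01 <= start_code_value <= 0xAF:
        raise ValueError("not a slice start code")
    if not 0 <= vertical_position_extension <= 7:
        raise ValueError("extension is a 3-bit value")
    return (vertical_position_extension << 7) + start_code_value - 1
```

For a typical 480-line frame, no extension is present and a slice start code of 0x00000101 simply means the first macroblock row.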

At this point, if certain scalability options are in use, there may be 7 bits encoding the priority breakpoint value. At this time, MPEG2Event doesn't support scalability options and their use would probably break the VideoParser object at just this point since these 7 bits would not be interpreted correctly.

Next a QuantiserScaleCode element is parsed. This resets the quantiser value for the slice. This value is allowed to change on a per-macroblock basis, so this element may be encountered again at any macroblock where the value changes.
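The 5-bit code is not the quantiser scale itself. A sketch of the mapping defined by the MPEG-2 video standard, selected by the QScaleType value from the picture coding extension, is shown below. The function name is hypothetical and the table is reproduced from memory of the standard's non-linear mapping, so check it against the specification before relying on it.

```python
# Non-linear mapping used when q_scale_type is 1; index 0 is a forbidden code.
NON_LINEAR_QUANTISER_SCALE = [
    0, 1, 2, 3, 4, 5, 6, 7,
    8, 10, 12, 14, 16, 18, 20, 22,
    24, 28, 32, 36, 40, 44, 48, 52,
    56, 64, 72, 80, 88, 96, 104, 112,
]


def quantiser_scale(quantiser_scale_code: int, q_scale_type: int) -> int:
    """Convert a 5-bit quantiser_scale_code into the actual quantiser scale."""
    if not 1 <= quantiser_scale_code <= 31:
        raise ValueError("quantiser_scale_code must be in 1..31")
    if q_scale_type == 0:
        return 2 * quantiser_scale_code  # linear mapping
    return NON_LINEAR_QUANTISER_SCALE[quantiser_scale_code]
```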

The next bit is parsed as the SliceExtensionFlag. If set to 1, then the following coding elements will be parsed:

  • IntraSlice
  • SlicePictureIdEnable
  • SlicePictureId
  • ExtraBitSlice
  • ExtraInformationSlice
    Only if ExtraBitSlice is set to 1. If so, the ExtraBitSlice and ExtraInformationSlice elements are parsed repeatedly until an ExtraBitSlice with value 0 is encountered.

Everything after the slice start code until this point is then published as a SliceHeader. Following this are one or more macroblocks. Each macroblock is comprised of some subset of the following coding elements:

  • MacroblockEscape
    One or more of these elements are parsed if the difference between the current macroblock address and the previous macroblock address is greater than 33.
  • MacroblockAddressIncrement
    The combination of zero or more MacroblockEscape elements and this element are published as part of a compound element called MacroblockAddressDelta that provides the actual encoded macroblock address difference.
  • MacroblockMode
    This element signals the presence of an encoded quantiser scale code, forward motion vectors, backward motion vectors, and the coded block pattern. If not encoded, then these elements will not appear in the bitstream.
  • QuantiserScaleCode
  • MotionVector
    Multiple MotionVector elements may be present depending on the type and mode of motion compensation in use. This is a compound element that encompasses a number of atomic coding elements that actually encode the various parts of each motion vector. A more detailed discussion of motion vector decoding can be found here.
  • CodedBlockPattern
  • Block
    Zero or more blocks are encoded according to the type of frame and the value of the previous CodedBlockPattern element if present. A Block is a compound element that encapsulates the following atomic elements:
    • DiffEncodedCoeff
      Present if the DC coefficient is encoded differentially with respect to a previous DC coefficient (i.e., I-frames, and I-macroblocks in P and B frames).
    • HuffmanEncodedCoeff
      One for each DCT coefficient encoded as a (run-length, value) pair.

All of these elements are encapsulated in the compound Macroblock element.
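The relationship between MacroblockEscape, MacroblockAddressIncrement, and the resulting address difference is simple arithmetic: each escape adds 33, and the increment itself encodes a value from 1 to 33. A small sketch with hypothetical function names, not MPEG2Event's MacroblockAddressDelta implementation:

```python
def macroblock_address_delta(num_escapes: int, increment: int) -> int:
    """Address difference encoded by num_escapes escapes plus one increment."""
    if num_escapes < 0:
        raise ValueError("num_escapes must be non-negative")
    if not 1 <= increment <= 33:
        raise ValueError("increment must be in 1..33")
    return 33 * num_escapes + increment


def split_address_delta(delta: int) -> tuple:
    """Inverse mapping: (number of escapes, increment value) for a given difference."""
    if delta < 1:
        raise ValueError("delta must be at least 1")
    return (delta - 1) // 33, (delta - 1) % 33 + 1
```

A difference of 33 therefore needs no escape, while a difference of 34 is coded as one escape followed by an increment of 1.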

Other Extension Headers and User Data

After the sequence extension header and the picture coding extension header, zero or more other extension headers and/or user data may be present. User data may also be present after the group of pictures header. Currently MPEG2Event only recognizes, parses, and publishes the quantization matrix extension header, which is published as a QuantMatrixExtension element. User data, if present, is parsed and published as a UserData element. All other extension headers are parsed as OpaqueBits.
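If you want to identify the extension headers yourself, the 4-bit ExtensionStartCodeIdentifier values assigned by the MPEG-2 video standard are listed in the sketch below. The split into parsed and opaque extensions reflects the description in this document; the function name and return strings are hypothetical.

```python
# extension_start_code_identifier values from the MPEG-2 video standard.
EXTENSION_NAMES = {
    1: "sequence extension",
    2: "sequence display extension",
    3: "quant matrix extension",
    4: "copyright extension",
    5: "sequence scalable extension",
    7: "picture display extension",
    8: "picture coding extension",
    9: "picture spatial scalable extension",
    10: "picture temporal scalable extension",
}

# Extensions this document describes as parsed into their own compound elements.
PARSED_EXTENSION_IDS = {1, 3, 8}


def describe_extension(first_payload_byte: int) -> str:
    """Describe an extension from the first byte after the extension start code."""
    identifier = (first_payload_byte >> 4) & 0xF  # the identifier is the top 4 bits
    name = EXTENSION_NAMES.get(identifier, "reserved")
    handling = "parsed" if identifier in PARSED_EXTENSION_IDS else "opaque"
    return f"{name} ({handling})"
```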

permalink 2004.11.23-09:24.00