jzman
IjkPlayer Series Data Reading Thread read_thread

PS: Control the impact of new technology on you rather than being controlled by it.

This article analyzes the data reading thread read_thread of IjkPlayer, aiming to clarify its basic flow and the key functions it calls. The main content is as follows:

  1. Basic usage of IjkPlayer
  2. Creation of read_thread
  3. avformat_alloc_context
  4. avformat_open_input
  5. avformat_find_stream_info
  6. avformat_seek_file
  7. av_dump_format
  8. av_find_best_stream
  9. stream_component_open
  10. Main loop of read_thread

Basic usage of IjkPlayer#

A brief review of the basic usage of IjkPlayer is as follows:

// Create IjkMediaPlayer
IjkMediaPlayer mMediaPlayer = new IjkMediaPlayer();
// Set log level
mMediaPlayer.native_setLogLevel(IjkMediaPlayer.IJK_LOG_DEBUG);
// Set options
mMediaPlayer.setOption(IjkMediaPlayer.OPT_CATEGORY_PLAYER, "mediacodec", 1);
// ...
// Set event listeners
mMediaPlayer.setOnPreparedListener(mPreparedListener);
mMediaPlayer.setOnVideoSizeChangedListener(mSizeChangedListener);
mMediaPlayer.setOnCompletionListener(mCompletionListener);
mMediaPlayer.setOnErrorListener(mErrorListener);
mMediaPlayer.setOnInfoListener(mInfoListener);
// Set surface
mMediaPlayer.setSurface(surface);
// ...
// Set URL
mMediaPlayer.setDataSource(dataSource);
// Prepare for playback
mMediaPlayer.prepareAsync();

After prepareAsync is called, the onPrepared callback fires once preparation completes; calling start there begins playback:

@Override
public void onPrepared(IMediaPlayer mp) {
    // Start playback
    mMediaPlayer.start();
}

At this point, under normal circumstances, the video should play properly; below we focus only on the calling flow.

Creation of read_thread#

Starting from the prepareAsync method of IjkMediaPlayer, the calling flow is as follows:

(Call-flow diagram: the chain from prepareAsync down to stream_open)

It can be seen that prepareAsync ultimately calls the function stream_open, which is defined as follows:

static VideoState *stream_open(FFPlayer *ffp, const char *filename, AVInputFormat *iformat){
    av_log(NULL, AV_LOG_INFO, "stream_open\n");
    assert(!ffp->is);
    // Initialize VideoState and some parameters.
    VideoState *is;
    is = av_mallocz(sizeof(VideoState));
    if (!is)
        return NULL;
    is->filename = av_strdup(filename);
    if (!is->filename)
        goto fail;
    // Here, iformat has not been assigned yet, the best AVInputFormat will be found later through probing.
    is->iformat = iformat;
    is->ytop    = 0;
    is->xleft   = 0;
#if defined(__ANDROID__)
    if (ffp->soundtouch_enable) {
        is->handle = ijk_soundtouch_create();
    }
#endif

    /* start video display */
    // Initialize the decoded frame queue
    if (frame_queue_init(&is->pictq, &is->videoq, ffp->pictq_size, 1) < 0)
        goto fail;
    if (frame_queue_init(&is->subpq, &is->subtitleq, SUBPICTURE_QUEUE_SIZE, 0) < 0)
        goto fail;
    if (frame_queue_init(&is->sampq, &is->audioq, SAMPLE_QUEUE_SIZE, 1) < 0)
        goto fail;

    // Initialize the queue for undecoded data
    if (packet_queue_init(&is->videoq) < 0 ||
        packet_queue_init(&is->audioq) < 0 ||
        packet_queue_init(&is->subtitleq) < 0)
        goto fail;

    // Initialize condition variables for the read thread and for video/audio accurate seek
    if (!(is->continue_read_thread = SDL_CreateCond())) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateCond(): %s\n", SDL_GetError());
        goto fail;
    }

    if (!(is->video_accurate_seek_cond = SDL_CreateCond())) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateCond(): %s\n", SDL_GetError());
        ffp->enable_accurate_seek = 0;
    }

    if (!(is->audio_accurate_seek_cond = SDL_CreateCond())) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateCond(): %s\n", SDL_GetError());
        ffp->enable_accurate_seek = 0;
    }
    // Initialize clocks
    init_clock(&is->vidclk, &is->videoq.serial);
    init_clock(&is->audclk, &is->audioq.serial);
    init_clock(&is->extclk, &is->extclk.serial);
    is->audio_clock_serial = -1;
    // Initialize volume range
    if (ffp->startup_volume < 0)
        av_log(NULL, AV_LOG_WARNING, "-volume=%d < 0, setting to 0\n", ffp->startup_volume);
    if (ffp->startup_volume > 100)
        av_log(NULL, AV_LOG_WARNING, "-volume=%d > 100, setting to 100\n", ffp->startup_volume);
    ffp->startup_volume = av_clip(ffp->startup_volume, 0, 100);
    ffp->startup_volume = av_clip(SDL_MIX_MAXVOLUME * ffp->startup_volume / 100, 0, SDL_MIX_MAXVOLUME);
    is->audio_volume = ffp->startup_volume;
    is->muted = 0;

    // Set audio-video synchronization method, default is AV_SYNC_AUDIO_MASTER
    is->av_sync_type = ffp->av_sync_type;

    // Playback mutex
    is->play_mutex = SDL_CreateMutex();
    // Accurate seek mutex
    is->accurate_seek_mutex = SDL_CreateMutex();

    ffp->is = is;
    is->pause_req = !ffp->start_on_prepared;

    // Video rendering thread
    is->video_refresh_tid = SDL_CreateThreadEx(&is->_video_refresh_tid, video_refresh_thread, ffp, "ff_vout");
    if (!is->video_refresh_tid) {
        av_freep(&ffp->is);
        return NULL;
    }

    is->initialized_decoder = 0;
    // Read thread
    is->read_tid = SDL_CreateThreadEx(&is->_read_tid, read_thread, ffp, "ff_read");
    if (!is->read_tid) {
        av_log(NULL, AV_LOG_FATAL, "SDL_CreateThread(): %s\n", SDL_GetError());
        goto fail;
    }
    
    // Asynchronously initialize the decoder, related to hardware decoding, default is off
    if (ffp->async_init_decoder && !ffp->video_disable && ffp->video_mime_type && strlen(ffp->video_mime_type) > 0
                    && ffp->mediacodec_default_name && strlen(ffp->mediacodec_default_name) > 0) {
        // mediacodec
        if (ffp->mediacodec_all_videos || ffp->mediacodec_avc || ffp->mediacodec_hevc || ffp->mediacodec_mpeg2) {
            decoder_init(&is->viddec, NULL, &is->videoq, is->continue_read_thread);
            ffp->node_vdec = ffpipeline_init_video_decoder(ffp->pipeline, ffp);
        }
    }
    // Allow decoder initialization
    is->initialized_decoder = 1;

    return is;
fail:
    is->initialized_decoder = 1;
    is->abort_request = true;
    if (is->video_refresh_tid)
        SDL_WaitThread(is->video_refresh_tid, NULL);
    stream_close(ffp);
    return NULL;
}

It can be seen that the function stream_open mainly does the following:

  1. Initializes VideoState and some parameters.
  2. Initializes the frame queues: the decoded video frame queue pictq, audio frame queue sampq, and subtitle frame queue subpq, plus the undecoded packet queues videoq (video), audioq (audio), and subtitleq (subtitles).
  3. Initializes audio-video synchronization method and clocks, default is AV_SYNC_AUDIO_MASTER, meaning the audio clock is the master clock.
  4. Initializes volume range.
  5. Creates a video rendering thread named ff_vout for video_refresh_thread.
  6. Creates a read thread named ff_read for read_thread.

Now we begin the analysis of the data reading thread read_thread, with the key parts of the read_thread function simplified as follows:

static int read_thread(void *arg){
    // ...
    
    // 1. Create AVFormatContext, specify default functions for opening and closing streams, etc.
    ic = avformat_alloc_context();
    if (!ic) {
        av_log(NULL, AV_LOG_FATAL, "Could not allocate context.\n");
        ret = AVERROR(ENOMEM);
        goto fail;
    }
    // ...
    
    // 2. Open the stream to get header information
    err = avformat_open_input(&ic, is->filename, is->iformat, &ffp->format_opts);
    if (err < 0) {
        print_error(is->filename, err);
        ret = -1;
        goto fail;
    }
    ffp_notify_msg1(ffp, FFP_MSG_OPEN_INPUT);
    // ...
    
    // 3. Get stream information
    if (ffp->find_stream_info) {
        err = avformat_find_stream_info(ic, opts);
    } 
    ffp_notify_msg1(ffp, FFP_MSG_FIND_STREAM_INFO);
    // ...
    
    // 4. If a start time is specified, seek to that position
    if (ffp->start_time != AV_NOPTS_VALUE) {
        int64_t timestamp;
        timestamp = ffp->start_time;
        if (ic->start_time != AV_NOPTS_VALUE)
            timestamp += ic->start_time;
        ret = avformat_seek_file(ic, -1, INT64_MIN, timestamp, INT64_MAX, 0);
    }
    // ...
    
    // 5. Print format information
    av_dump_format(ic, 0, is->filename, 0);
}

The following content is limited to the main flow of the read_thread function.

avformat_alloc_context#

The avformat_alloc_context function mainly allocates memory for AVFormatContext and initializes some parameters of ic->internal, as follows:

AVFormatContext *avformat_alloc_context(void){
    // Allocate memory for AVFormatContext
    AVFormatContext *ic;
    ic = av_malloc(sizeof(AVFormatContext));
    if (!ic) return ic;
    // Initialize default functions for opening and closing streams
    avformat_get_context_defaults(ic);
    // Allocate memory for internal
    ic->internal = av_mallocz(sizeof(*ic->internal));
    if (!ic->internal) {
        avformat_free_context(ic);
        return NULL;
    }
    ic->internal->offset = AV_NOPTS_VALUE;
    ic->internal->raw_packet_buffer_remaining_size = RAW_PACKET_BUFFER_SIZE;
    ic->internal->shortest_end = AV_NOPTS_VALUE;
    return ic;
}

Next, let's look at the avformat_get_context_defaults function as follows:

static void avformat_get_context_defaults(AVFormatContext *s){
    memset(s, 0, sizeof(AVFormatContext));
    s->av_class = &av_format_context_class;
    s->io_open  = io_open_default;
    s->io_close = io_close_default;
    av_opt_set_defaults(s);
}

Here, the default functions for opening and closing streams are specified as io_open_default and io_close_default, and we will not focus on the subsequent flow for now.

avformat_open_input#

The avformat_open_input function is mainly used to open a stream to get header information, its definition is simplified as follows:

int avformat_open_input(AVFormatContext **ps, const char *filename,
                        AVInputFormat *fmt, AVDictionary **options){
    // ...

    // Open the stream to probe the input format, returning the best demuxer score
    av_log(NULL, AV_LOG_FATAL, "avformat_open_input > init_input before > nb_streams:%d\n",s->nb_streams);
    if ((ret = init_input(s, filename, &tmp)) < 0)
        goto fail;
    s->probe_score = ret;

    // Protocol whitelist/blacklist checks, stream format whitelist checks, etc.
    // ...
    
    // Read media header
    // read_header mainly does some initialization work for certain formats, such as filling in its private structure
    // Allocates stream structures based on the number of streams and initializes them, pointing the file pointer to the start of the data area, etc.
    // Creates AVStream and waits to extract or write audio/video stream information in subsequent processes
    if (!(s->flags&AVFMT_FLAG_PRIV_OPT) && s->iformat->read_header)
        if ((ret = s->iformat->read_header(s)) < 0)
            goto fail;
    // ...
    
    // Handle additional images in audio/video, such as album art
    if ((ret = avformat_queue_attached_pictures(s)) < 0)
        goto fail;
    // ...
    
    // Update AVStream decoder-related information to AVCodecContext
    update_stream_avctx(s);
    // ...
}

It can be seen that the avformat_open_input function mainly opens the stream to probe the input format, performs protocol whitelist/blacklist checks, stream format whitelist checks, reads file header information, etc. Finally, it uses the update_stream_avctx function to update the AVStream decoder-related information to the corresponding AVCodecContext. This operation will be frequently seen in subsequent processes.

The most important tasks are to open the stream to probe the input format and read the file header information, which call the init_input and read_header functions, respectively. The read_header function will complete the initialization of AVStream during the process of reading header information.

The init_input function probes the stream format and returns a score for it, ultimately finding the best-matching AVInputFormat. This structure is the demuxer registered during initialization; each demuxer corresponds to an AVInputFormat object, just as each muxer corresponds to an AVOutputFormat. For now it is enough to be aware of this.

If init_input succeeds, the stream format has been determined, and read_header can be called. Each demuxer (AVInputFormat) provides its own xxx_read_header function; for an HLS stream the corresponding function is hls_read_header, registered as follows:

AVInputFormat ff_hls_demuxer = {
    .name           = "hls,applehttp",
    .read_header   = hls_read_header,
    // ...
};

// hls_read_header
static int hls_read_header(AVFormatContext *s, AVDictionary **options){
    // ...
}

avformat_find_stream_info#

The avformat_find_stream_info function obtains stream information, which is essential for probing formats whose files carry no header information. It can be used to obtain the video width and height, total duration, bitrate, frame rate, pixel format, etc. Its definition is simplified as follows:

int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options){
    // ...
    
    // 1. Traverse streams
    for (i = 0; i < ic->nb_streams; i++) {
        // Initialize the parser in the stream, specifically AVCodecParserContext and AVCodecParser
        st->parser = av_parser_init(st->codecpar->codec_id);
        // ...
        
        // Probe the corresponding decoder based on the decoder parameters in AVStream and return it
        codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
        // ...
        
        // If the decoder parameters are incomplete, initialize AVCodecContext based on the specified AVCodec and call the decoder's init function to initialize the decoder
        if (!has_codec_parameters(st, NULL) && st->request_probe <= 0) {
            if (codec && !avctx->codec)
                if (avcodec_open2(avctx, codec, options ? &options[i] :&thread_opt) < 0)
                    av_log(ic, AV_LOG_WARNING,
                           "Failed to open codec in %s\n",__FUNCTION__);
        }
    }
    
    // 2. Infinite loop to obtain stream information
    for (;;) {
        // ...
        // Check for interrupt requests, if any, call the interrupt function
        if (ff_check_interrupt(&ic->interrupt_callback)) {
            break;
        }
        
        // Traverse streams, check if there are still decoder-related parameters to process
        for (i = 0; i < ic->nb_streams; i++) {
            int fps_analyze_framecount = 20;
            st = ic->streams[i];
            // Check the decoder parameters in the stream, if complete, break; otherwise, continue to execute for further analysis
            if (!has_codec_parameters(st, NULL))
                break;
            // ...
        }
        
        if (i == ic->nb_streams) {
            // Mark that all streams have been analyzed
            analyzed_all_streams = 1;
            // If the current AVFormatContext sets ctx_flags to AVFMTCTX_NOHEADER, it indicates that the current stream has no header information
            // At this point, it is necessary to read some packets to obtain stream information; otherwise, break out directly, and the infinite loop ends normally
            if (!(ic->ctx_flags & AVFMTCTX_NOHEADER)) {
                /* If we found the info for all the codecs, we can stop. */
                ret = count;
                av_log(ic, AV_LOG_DEBUG, "All info found\n");
                flush_codecs = 0;
                break;
            }
        }
         
        // The read data has exceeded the allowed probing data size, but all codec information has not yet been obtained
        if (read_size >= probesize) {
            break;
        }
         
        // The following handles the case where the current stream has no header information
         
        // Read a frame of compressed encoded data
        ret = read_frame_internal(ic, &pkt1);
        if (ret == AVERROR(EAGAIN)) continue;
        if (ret < 0) {
            /* EOF or error*/
            eof_reached = 1;
            break;
        }
        
        // The read data is added to the buffer, which will be read from the buffer later
        ret = add_to_pktbuf(&ic->internal->packet_buffer, pkt,
                                &ic->internal->packet_buffer_end, 0);
        // Try to decode some compressed encoded data                       
        try_decode_frame(ic, st, pkt,(options && i < orig_nb_streams) ? &options[i] : NULL);
        
        // ...
    }
    
    // 3. Handle reading data to the end of the stream
    if (eof_reached) {
        for (stream_index = 0; stream_index < ic->nb_streams; stream_index++) {
            st = ic->streams[stream_index];
            if (!has_codec_parameters(st, NULL)) {
                const AVCodec *codec = find_probe_decoder(ic, st, st->codecpar->codec_id);
                if (codec && !avctx->codec)
                    if (avcodec_open2(avctx, codec, (options && stream_index < orig_nb_streams) ? &options[stream_index] : &opts) < 0)
                        av_log(ic, AV_LOG_WARNING,
                               "Failed to open codec in %s\n", __FUNCTION__);
            }
        }
    }
    // 4. The decoder performs flushing operations to avoid cached data not being retrieved
    if (flush_codecs) {
        AVPacket empty_pkt = { 0 };
        int err = 0;
        av_init_packet(&empty_pkt);
        for (i = 0; i < ic->nb_streams; i++) {
            st = ic->streams[i];
            /* flush the decoders */
            if (st->info->found_decoder == 1) {
                do {
                    // Feed an empty packet to drain buffered frames from the decoder
                    err = try_decode_frame(ic, st, &empty_pkt,
                                            (options && i < orig_nb_streams)
                                            ? &options[i] : NULL);
                } while (err > 0 && !has_codec_parameters(st, NULL));
        
                if (err < 0) {
                    av_log(ic, AV_LOG_INFO,
                        "decoding for stream %d failed\n", st->index);
                }
            }
        }
    }
    
    // 5. The following is some calculations of stream information, such as pix_fmt, aspect ratio SAR, actual frame rate, average frame rate, etc.
    // ...
    
    // 6. Update the decoder parameters from the internal AVCodecContext (avctx) to the stream's decoder parameters AVCodecParameters
    for (i = 0; i < ic->nb_streams; i++) {
        ret = avcodec_parameters_from_context(st->codecpar, st->internal->avctx);
        // ...
    }
    
    // ...
}

Since the avformat_find_stream_info function contains a large amount of code, the listing above omits most details and keeps the more critical parts; here we only follow the main flow. As the source shows, this function repeatedly uses has_codec_parameters to check whether the decoder context parameters in a stream are complete, and takes measures to fill them in when they are not. A non-negative return value indicates that avformat_find_stream_info executed successfully. Its main flow is as follows:

  1. Traverse streams, initialize AVCodecParser and AVCodecParserContext based on some parameters in the stream, probe the decoder using the find_probe_decoder function, and use the avcodec_open2 function to initialize AVCodecContext and call the decoder's init function to initialize the decoder's static data.
  2. The infinite loop uses ff_check_interrupt for interrupt detection, then traverses the streams, using has_codec_parameters to check whether each stream's decoder context parameters are complete. If they are complete and the stream has header information, it sets analyzed_all_streams = 1; and flush_codecs = 0; and breaks out of the loop. If the stream has no header information, i.e. ic->ctx_flags has AVFMTCTX_NOHEADER set, it calls read_frame_internal to read a frame of encoded data, adds it to the buffer, and calls try_decode_frame to decode it so as to further fill the stream's AVCodecContext.
  3. eof_reached = 1 indicates that the previous infinite loop used the read_frame_internal function to read to the end of the stream. It traverses the stream again and uses the has_codec_parameters function to check whether the decoder context parameters in the stream are reasonable. If not, it repeats the steps in 2 to initialize the decoder context-related parameters.
  4. The decoding process is a continuous process of putting data in and taking data out, corresponding to the avcodec_send_packet and avcodec_receive_frame functions. To avoid residual data in the decoder, it uses an empty AVPacket to flush the decoder, executed when flush_codecs = 1, meaning that the try_decode_frame function called in 2 executed the decoding operation.
  5. The subsequent steps involve calculations of stream information, such as pix_fmt, aspect ratio SAR, actual frame rate, average frame rate, etc.
  6. Traverse the stream and call the avcodec_parameters_from_context function to fill the previously filled decoder parameters from the internal AVCodecContext into the stream's decoder parameters st->codecpar, corresponding to the structure AVCodecParameters. Thus, the main flow analysis of the avformat_find_stream_info function is complete.

avformat_seek_file#

avformat_seek_file is mainly used to perform seek operations, its definition is simplified as follows:

int avformat_seek_file(AVFormatContext *s, int stream_index, int64_t min_ts,
                       int64_t ts, int64_t max_ts, int flags){
    // ...                   
    
    // Prefer to use read_seek2
    if (s->iformat->read_seek2) {
        int ret;
        ff_read_frame_flush(s);
        ret = s->iformat->read_seek2(s, stream_index, min_ts, ts, max_ts, flags);
        if (ret >= 0)
            ret = avformat_queue_attached_pictures(s);
        return ret;
    }
    // ...
    
    // If read_seek2 is not supported, try to use the old API seek
    if (s->iformat->read_seek || 1) {
        // ...
        int ret = av_seek_frame(s, stream_index, ts, flags | dir);
        return ret;
    }
    return -1; //unreachable                           
}

It can be seen that when the avformat_seek_file function executes, if the current demuxer (AVInputFormat) supports read_seek2, it uses the corresponding read_seek2 function; otherwise, it calls the old API's av_seek_frame function to perform seek. The av_seek_frame function is as follows:

int av_seek_frame(AVFormatContext *s, int stream_index,int64_t timestamp, int flags){
    int ret;
    if (s->iformat->read_seek2 && !s->iformat->read_seek) {
        // ...
        return avformat_seek_file(s, stream_index, min_ts, timestamp, max_ts,
                                  flags & ~AVSEEK_FLAG_BACKWARD);
    }

    ret = seek_frame_internal(s, stream_index, timestamp, flags);

    // ...
    return ret;
}

It can be seen that if the current AVInputFormat supports read_seek2 but not read_seek, av_seek_frame calls back into avformat_seek_file, i.e. the seek is performed via read_seek2. Otherwise it calls the internal seek function seek_frame_internal. The seek_frame_internal function provides several ways to seek:

  1. seek_frame_byte: Seek by byte
  2. read_seek: Seek according to the currently specified format, specifically supported by the corresponding demuxer.
  3. ff_seek_frame_binary: Seek using binary search.
  4. seek_frame_generic: Seek using a generic method.

This is also the logic for seek operations. For example, the HLS format demuxer does not support read_seek2, only supports read_seek, and the ff_hls_demuxer is defined as follows:

AVInputFormat ff_hls_demuxer = {
    // ...
    .read_seek      = hls_read_seek,
};

av_dump_format#

The av_dump_format function is used to print detailed information about the stream input format based on the current AVFormatContext. The information printed when IjkPlayer plays a video normally is as follows:

IJKMEDIA: Input #0, hls,applehttp, from 'http://devimages.apple.com.edgekey.net/streaming/examples/bipbop_4x3/gear1/prog_index.m3u8':
IJKMEDIA:   Duration: 00:30:00.00, start: 19.888800, bitrate: 0 kb/s
IJKMEDIA:   Program 0
IJKMEDIA:     Metadata:
IJKMEDIA:       variant_bitrate : 0
IJKMEDIA:     Stream #0:0, 23, 1/90000: Video: h264, 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, smpte170m/smpte170m/bt709, topleft), 400x300 (400x304), 0/1, 29.92 tbr, 90k tbn, 180k tbc
IJKMEDIA:     Metadata:
IJKMEDIA:       variant_bitrate : 0
IJKMEDIA:     Stream #0:1, 9, 1/90000: Audio: aac ([15][0][0][0] / 0x000F), 22050 Hz, stereo, fltp
IJKMEDIA:     Metadata:
IJKMEDIA:       variant_bitrate : 0

av_find_best_stream#

The av_find_best_stream function is mainly used to select the most suitable audio and video streams. Its definition is simplified as follows:

int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type,
                        int wanted_stream_nb, int related_stream,
                        AVCodec **decoder_ret, int flags){
    // ...
    
    // Traverse to select the appropriate audio and video streams
    for (i = 0; i < nb_streams; i++) {
        int real_stream_index = program ? program[i] : i;
        AVStream *st          = ic->streams[real_stream_index];
        AVCodecParameters *par = st->codecpar;
        if (par->codec_type != type)
            continue;
        if (wanted_stream_nb >= 0 && real_stream_index != wanted_stream_nb)
            continue;
        if (type == AVMEDIA_TYPE_AUDIO && !(par->channels && par->sample_rate))
            continue;
        if (decoder_ret) {
            decoder = find_decoder(ic, st, par->codec_id);
            if (!decoder) {
                if (ret < 0)
                    ret = AVERROR_DECODER_NOT_FOUND;
                continue;
            }
        }
        disposition = !(st->disposition & (AV_DISPOSITION_HEARING_IMPAIRED | AV_DISPOSITION_VISUAL_IMPAIRED));
        count = st->codec_info_nb_frames;
        bitrate = par->bit_rate;
        multiframe = FFMIN(5, count);
        if ((best_disposition >  disposition) ||
            (best_disposition == disposition && best_multiframe >  multiframe) ||
            (best_disposition == disposition && best_multiframe == multiframe && best_bitrate >  bitrate) ||
            (best_disposition == disposition && best_multiframe == multiframe && best_bitrate == bitrate && best_count >= count))
            continue;
        best_disposition = disposition;
        best_count   = count;
        best_bitrate = bitrate;
        best_multiframe = multiframe;
        ret          = real_stream_index;
        best_decoder = decoder;
        // ...
    }
    // ...
    return ret;
} 

It can be seen that av_find_best_stream selects mainly along three dimensions, compared in order: disposition, multiframe, and bitrate. When dispositions are equal it prefers the stream with more decoded frames (multiframe, capped at 5), then the one with the higher bitrate; the raw frame count serves as the final tie-breaker.

disposition corresponds to the disposition member of AVStream, whose values are AV_DISPOSITION_ flags. For example, AV_DISPOSITION_HEARING_IMPAIRED above marks a stream intended for hearing-impaired listeners; for now it is enough to be aware of this.

In the read_thread function, av_find_best_stream finds the best audio, video, and subtitle streams, and the next step is decoding and playback.

stream_component_open#

The stream_component_open function mainly creates audio rendering threads, audio, video, subtitle decoding threads, and initializes VideoState. Its definition is simplified as follows:

static int stream_component_open(FFPlayer *ffp, int stream_index){
    // ...
    // 1. Initialize AVCodecContext
    avctx = avcodec_alloc_context3(NULL);
    if (!avctx)
        return AVERROR(ENOMEM);

    // 2. Update the current AVCodecContext parameters using the stream's decoder parameters
    ret = avcodec_parameters_to_context(avctx, ic->streams[stream_index]->codecpar);
    if (ret < 0)
        goto fail;
    av_codec_set_pkt_timebase(avctx, ic->streams[stream_index]->time_base);

    // 3. Find the decoder based on the decoder ID
    codec = avcodec_find_decoder(avctx->codec_id);

    // ...

    // 4. If a decoder name has been specified, find the decoder again using the decoder's name
    if (forced_codec_name)
        codec = avcodec_find_decoder_by_name(forced_codec_name);
    if (!codec) {
        if (forced_codec_name) av_log(NULL, AV_LOG_WARNING,
                                      "No codec could be found with name '%s'\n", forced_codec_name);
        else                   av_log(NULL, AV_LOG_WARNING,
                                      "No codec could be found with id %d\n", avctx->codec_id);
        ret = AVERROR(EINVAL);
        goto fail;
    }

    // ...
    
    // 5. Create audio rendering thread, initialize audio, video, subtitle decoders, and start decoding
    switch (avctx->codec_type) {
    case AVMEDIA_TYPE_AUDIO:
        // ...
        // Open audio output, create audio output thread ff_aout_android, corresponding audio thread function aout_thread,
        // ultimately calling the AudioTrack's write method to write audio data
        // ...
        // Initialize audio decoder
        decoder_init(&is->auddec, avctx, &is->audioq, is->continue_read_thread);
        if ((is->ic->iformat->flags & (AVFMT_NOBINSEARCH | AVFMT_NOGENSEARCH | AVFMT_NO_BYTE_SEEK)) && !is->ic->iformat->read_seek) {
            is->auddec.start_pts = is->audio_st->start_time;
            is->auddec.start_pts_tb = is->audio_st->time_base;
        }
        // Start audio decoding, here creates audio decoding thread ff_audio_dec, corresponding audio decoding thread function is audio_thread
        if ((ret = decoder_start(&is->auddec, audio_thread, ffp, "ff_audio_dec")) < 0)
            goto out;
        SDL_AoutPauseAudio(ffp->aout, 0);
        break;
    case AVMEDIA_TYPE_VIDEO:
        is->video_stream = stream_index;
        is->video_st = ic->streams[stream_index];
        // Asynchronously initialize the decoder, related to MediaCodec
        if (ffp->async_init_decoder) {
            // ...
        } else {
            // Initialize video decoder
            decoder_init(&is->viddec, avctx, &is->videoq, is->continue_read_thread);
            ffp->node_vdec = ffpipeline_open_video_decoder(ffp->pipeline, ffp);
            if (!ffp->node_vdec)
                goto fail;
        }
        // Start video decoding: creates the video decoding thread ff_video_dec, whose thread function is video_thread
        if ((ret = decoder_start(&is->viddec, video_thread, ffp, "ff_video_dec")) < 0)
            goto out;

        // ...

        break;
    case AVMEDIA_TYPE_SUBTITLE:
        // ...
        // Initialize subtitle decoder
        decoder_init(&is->subdec, avctx, &is->subtitleq, is->continue_read_thread);
        // Start subtitle decoding: creates the subtitle decoding thread ff_subtitle_dec, whose thread function is subtitle_thread
        if ((ret = decoder_start(&is->subdec, subtitle_thread, ffp, "ff_subtitle_dec")) < 0)
            goto out;
        break;
    default:
        break;
    }
    goto out;

fail:
    avcodec_free_context(&avctx);
out:
    av_dict_free(&opts);

    return ret;
}

It can be seen that the stream_component_open function creates the corresponding decoding threads; the comments in the code above cover the details, so I will not elaborate further. Back in read_thread, after this function returns, some fields of IjkMediaMeta are filled in, ffp->prepared is set to true, and the prepare-complete event message FFP_MSG_PREPARED is sent to the application layer, ultimately invoking the OnPreparedListener callback.

Main loop of read_thread#

The main loop here refers to the main loop of data reading in read_thread, with the key flow as follows:

for (;;) {
    // 1. If the stream is closed or the application layer releases, is->abort_request is 1
    if (is->abort_request)
        break;
    // ...
    // 2. Handle seek operations
    if (is->seek_req) {
        // ...
        is->seek_req = 0;
        ffp_notify_msg3(ffp, FFP_MSG_SEEK_COMPLETE, (int)fftime_to_milliseconds(seek_target), ret);
        ffp_toggle_buffering(ffp, 1);
    }
    // 3. Handle attached_pic
    // If a stream has the AV_DISPOSITION_ATTACHED_PIC flag, it is an attached picture,
    // e.g. the album cover art embedded in an *.mp3 file; such a stream contains only a single AVPacket, attached_pic
    if (is->queue_attachments_req) {
        if (is->video_st && (is->video_st->disposition & AV_DISPOSITION_ATTACHED_PIC)) {
            AVPacket copy = { 0 };
            if ((ret = av_packet_ref(&copy, &is->video_st->attached_pic)) < 0)
                goto fail;
            packet_queue_put(&is->videoq, &copy);
            packet_queue_put_nullpacket(&is->videoq, is->video_stream);
        }
        is->queue_attachments_req = 0;
    }
    // 4. If the queues are full, no more data can be read for now
    // For a real-time network stream, ffp->infinite_buffer is 1 and this check is skipped
    /* if the queues are full, no need to read more */
    if (ffp->infinite_buffer < 1 && !is->seek_req
        /* && queue-size / enough-packets checks ... */) {
        SDL_LockMutex(wait_mutex);
        // Wait 10ms to give the decoding thread time to consume
        SDL_CondWaitTimeout(is->continue_read_thread, wait_mutex, 10);
        SDL_UnlockMutex(wait_mutex);
        continue;
    }
    // 5. Check if the stream has finished playing
    if ((!is->paused || completed) &&
        (!is->audio_st || (is->auddec.finished == is->audioq.serial && frame_queue_nb_remaining(&is->sampq) == 0)) &&
        (!is->video_st || (is->viddec.finished == is->videoq.serial && frame_queue_nb_remaining(&is->pictq) == 0))) {
        // Check if loop playback is set
        if (ffp->loop != 1 && (!ffp->loop || --ffp->loop)) {
            stream_seek(is, ffp->start_time != AV_NOPTS_VALUE ? ffp->start_time : 0, 0, 0);
        } else if (ffp->autoexit) {// Check if auto exit is set
            ret = AVERROR_EOF;
            goto fail;
        } else {
            // ...
            
            // Playback error...
            ffp_notify_msg1(ffp, FFP_MSG_ERROR);
            
            // Playback completed...
            ffp_notify_msg1(ffp, FFP_MSG_COMPLETED);
        }
    }
    pkt->flags = 0;
    // 6. Read data packets
    ret = av_read_frame(ic, pkt);
    // 7. Check data reading status
    if (ret < 0) {
        // ...
        
        // Handle reading to the end...
        if (pb_eof) {
            if (is->video_stream >= 0)
                packet_queue_put_nullpacket(&is->videoq, is->video_stream);
            if (is->audio_stream >= 0)
                packet_queue_put_nullpacket(&is->audioq, is->audio_stream);
            if (is->subtitle_stream >= 0)
                packet_queue_put_nullpacket(&is->subtitleq, is->subtitle_stream);
            is->eof = 1;
        }
        
        // Data reading handling...
        if (pb_error) {
            if (is->video_stream >= 0)
                packet_queue_put_nullpacket(&is->videoq, is->video_stream);
            if (is->audio_stream >= 0)
                packet_queue_put_nullpacket(&is->audioq, is->audio_stream);
            if (is->subtitle_stream >= 0)
                packet_queue_put_nullpacket(&is->subtitleq, is->subtitle_stream);
            is->eof = 1;
            ffp->error = pb_error;
            av_log(ffp, AV_LOG_ERROR, "av_read_frame error: %s\n", ffp_get_error_string(ffp->error));
            // break;
        } else {
            ffp->error = 0;
        }
        if (is->eof) {
            ffp_toggle_buffering(ffp, 0);
            SDL_Delay(100);
        }
        SDL_LockMutex(wait_mutex);
        SDL_CondWaitTimeout(is->continue_read_thread, wait_mutex, 10);
        SDL_UnlockMutex(wait_mutex);
        ffp_statistic_l(ffp);
        continue;
    } else {
        is->eof = 0;
    }
    // ...
    // 8. Fill the undecoded frame queue
    if (pkt->stream_index == is->audio_stream && pkt_in_play_range) {
        packet_queue_put(&is->audioq, pkt);
    } else if (pkt->stream_index == is->video_stream && pkt_in_play_range
               && !(is->video_st && (is->video_st->disposition & AV_DISPOSITION_ATTACHED_PIC))) {
        packet_queue_put(&is->videoq, pkt);
    } else if (pkt->stream_index == is->subtitle_stream && pkt_in_play_range) {
        packet_queue_put(&is->subtitleq, pkt);
    } else {
        av_packet_unref(pkt);
    }
    // ...
}

It can be seen that the infinite loop in read_thread is mainly responsible for data reading: each time a packet of compressed data is read, it is added to the corresponding undecoded queue. The steps are as follows:

  1. If opening the stream fails, stream_close is called; if the application layer calls release to destroy the player, is->abort_request is set to 1 and the loop breaks.
  2. Handle seek operations during playback.
  3. Handle an attached_pic in the stream. If a stream has the AV_DISPOSITION_ATTACHED_PIC flag, it is an attached picture, such as the album cover art embedded in an *.mp3 file, and contains only a single AVPacket, attached_pic.
  4. Handle the case where the queues are full, i.e. the total size of the undecoded audio, video, and subtitle queues exceeds 15 MB, or stream_has_enough_packets determines that all three streams already have enough packets to decode. In that case the thread waits up to 10 ms to give the decoding threads time to consume data.
  5. Check whether the stream has finished playing, whether loop playback or auto-exit is set, and handle playback errors.
  6. av_read_frame is the key function of read_thread: it performs the demuxing, reading one packet of compressed data per call for the corresponding decoding thread to use.
  7. Check the result of the read, mainly handling end-of-stream and read errors; at EOF a null packet is put into each stream's queue so the decoders can flush.
  8. If av_read_frame succeeds, the packet is added to the corresponding undecoded queue.

Thus, the data reading thread read_thread of IjkPlayer has been largely sorted out; most of it builds directly on FFmpeg.
