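June 29, 2020

Background

In SPDK, the bdev layer links the layers above and below it: it hides the concrete device and I/O mode and presents a unified framework to upper-layer abstractions such as vdev. Its strengths and weaknesses can only be analyzed once its external interfaces, features, and framework are understood.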

External interfaces of SPDK bdev

The bdev layer implements operations such as spdk_bdev_write, spdk_bdev_read, spdk_bdev_open, and spdk_bdev_close.

Reference: https://spdk.io/doc/bdev.html
The interface is declared in bdev.h.

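Key data structures of SPDK bdev

Reference: https://spdk.io/doc/bdev_module.html
As with registering a driver, a bdev module first has to implement a set of operation interfaces. Taking the bdev libaio module as an example, the important interfaces are:

Common bdev module interface: spdk_bdev_module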

static struct spdk_bdev_module aio_if = {
        .name           = "aio",
        .module_init    = bdev_aio_initialize,
        .module_fini    = bdev_aio_fini,
        .config_text    = bdev_aio_get_spdk_running_config,
        .get_ctx_size   = bdev_aio_get_ctx_size,
};

This resembles registering a Linux driver interface.

bdev operations interface: spdk_bdev_fn_table

static const struct spdk_bdev_fn_table aio_fn_table = {
        .destruct               = bdev_aio_destruct,
        .submit_request         = bdev_aio_submit_request,
        .io_type_supported      = bdev_aio_io_type_supported,
        .get_io_channel         = bdev_aio_get_io_channel,
        .dump_info_json         = bdev_aio_dump_info_json,
        .write_config_json      = bdev_aio_write_json_config,
};

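Likewise, this resembles registering the set of operations of a driver device. The libaio read and write operations for a Linux block device live in bdev_aio_submit_request above; for example, bdev_aio_submit_request eventually calls libaio's io_submit(). So where does the reaping of completions happen?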
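The dispatch through a function table described above can be sketched in a few lines of plain C. This is a toy model, not SPDK code: every name prefixed with mini_ is hypothetical, and the backend only sets a flag where a real aio backend would call io_submit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all mini_ names are hypothetical, not SPDK APIs. */
enum mini_io_type { MINI_IO_READ, MINI_IO_WRITE, MINI_IO_UNMAP };

struct mini_bdev;

struct mini_io {
        enum mini_io_type type;
        int submitted;          /* set to 1 by the backend's submit_request */
};

/* Per-backend operations, in the spirit of spdk_bdev_fn_table. */
struct mini_fn_table {
        bool (*io_type_supported)(struct mini_bdev *bdev, enum mini_io_type type);
        void (*submit_request)(struct mini_bdev *bdev, struct mini_io *io);
};

struct mini_bdev {
        const char *name;
        const struct mini_fn_table *fn_table;
};

/* A pretend "aio" backend: supports read and write, but not unmap. */
static bool mini_aio_io_type_supported(struct mini_bdev *bdev, enum mini_io_type type)
{
        (void)bdev;
        return type == MINI_IO_READ || type == MINI_IO_WRITE;
}

static void mini_aio_submit_request(struct mini_bdev *bdev, struct mini_io *io)
{
        (void)bdev;
        io->submitted = 1;      /* a real backend would call io_submit() here */
}

const struct mini_fn_table mini_aio_fn_table = {
        .io_type_supported = mini_aio_io_type_supported,
        .submit_request    = mini_aio_submit_request,
};

/* Generic layer: it knows only the table, nothing about aio.
 * Returns 0 on success, -1 if the backend does not support the I/O type. */
int mini_bdev_submit(struct mini_bdev *bdev, struct mini_io *io)
{
        assert(bdev != NULL && bdev->fn_table != NULL && io != NULL);
        if (!bdev->fn_table->io_type_supported(bdev, io->type)) {
                return -1;
        }
        bdev->fn_table->submit_request(bdev, io);
        return 0;
}
```

Swapping in another backend only means pointing fn_table at a different table; mini_bdev_submit() stays unchanged, which is the isolation the bdev layer is meant to provide.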

Framework flow of SPDK bdev

The question above brings us to how the SPDK bdev framework submits I/O and handles completion events.

How the framework submits I/O

The main call chain is as follows (each entry is called by the one below it):
* bdev->fn_table->submit_request(ch->channel, bdev_io);
* _spdk_bdev_qos_io_submit(struct spdk_bdev_channel *ch, struct spdk_bdev_qos *qos)
* _spdk_bdev_io_submit
* spdk_bdev_io_submit
* spdk_bdev_read_blocks/spdk_bdev_readv_blocks/spdk_bdev_write_blocks/spdk_bdev_write_zeroes_blocks/spdk_bdev_unmap_blocks
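The layering in this call chain can be illustrated with a small model. The sketch below is not SPDK code: the chain_ names are hypothetical, and the QoS step is reduced to an assumed per-timeslice budget of I/Os, which is far simpler than the real rate limiting. It only shows the shape of the path: a public entry point fills in the request, a submit step either goes straight to the backend or queues behind the QoS gate, and the bottom step stands in for fn_table->submit_request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_QUEUE_MAX 16

/* Toy model only: all chain_ names are hypothetical, not SPDK APIs. */
struct chain_io {
        uint64_t offset_blocks;
        uint64_t num_blocks;
};

struct chain_channel {
        int qos_enabled;
        uint32_t qos_budget;                    /* I/Os still allowed in this timeslice (assumed policy) */
        struct chain_io *queued[CHAIN_QUEUE_MAX];
        size_t head;
        size_t count;
        uint32_t sent_to_backend;
};

/* Bottom of the chain: stands in for fn_table->submit_request(). */
static void chain_backend_submit_request(struct chain_channel *ch, struct chain_io *io)
{
        (void)io;
        ch->sent_to_backend++;
}

/* Drain queued I/Os while the budget lasts. */
void chain_qos_io_submit(struct chain_channel *ch)
{
        while (ch->count > 0 && ch->qos_budget > 0) {
                struct chain_io *io = ch->queued[ch->head];

                ch->head = (ch->head + 1) % CHAIN_QUEUE_MAX;
                ch->count--;
                ch->qos_budget--;
                chain_backend_submit_request(ch, io);
        }
}

/* Middle of the chain: direct submit, or queue behind the QoS gate. */
int chain_io_submit(struct chain_channel *ch, struct chain_io *io)
{
        assert(ch != NULL && io != NULL);
        if (!ch->qos_enabled) {
                chain_backend_submit_request(ch, io);
                return 0;
        }
        if (ch->count == CHAIN_QUEUE_MAX) {
                return -1;                      /* queue full */
        }
        ch->queued[(ch->head + ch->count) % CHAIN_QUEUE_MAX] = io;
        ch->count++;
        chain_qos_io_submit(ch);
        return 0;
}

/* Top of the chain: the public entry point fills in the request. */
int chain_read_blocks(struct chain_channel *ch, struct chain_io *io,
                      uint64_t offset_blocks, uint64_t num_blocks)
{
        io->offset_blocks = offset_blocks;
        io->num_blocks = num_blocks;
        return chain_io_submit(ch, io);
}
```

With QoS disabled every call reaches the backend immediately; with QoS enabled and a budget of 1, a second request stays queued until the budget is refilled and chain_qos_io_submit() runs again.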

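How the framework reaps I/O

The call chain, again with each entry called by the one below it:
* io_getevents (bdev_aio.c)
* bdev_aio_group_poll
* bdev_aio_group_create_cb: ch->poller = spdk_poller_register(bdev_aio_group_poll, ch, 0);

Note in particular that the reaping function is registered as an SPDK poller. An SPDK poller can be thought of as a timer-like callback that an SPDK thread invokes repeatedly: with a period of 0, as in the call above, it runs on every pass of the thread's poll loop; otherwise it runs once per period. Its data structure shows this: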

struct spdk_poller {
        TAILQ_ENTRY(spdk_poller)        tailq;

        /* Current state of the poller; should only be accessed from the poller's thread. */
        enum spdk_poller_state          state;

        uint64_t                        period_ticks;
        uint64_t                        next_run_tick;
        spdk_poller_fn                  fn;
        void                            *arg;
};

The fn field above (of type spdk_poller_fn) is the bdev_aio_group_poll registered earlier, while period_ticks and next_run_tick determine when that function runs next.
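How period_ticks and next_run_tick drive a poller can be sketched as follows. This is a simplified model with hypothetical tp_ names, not the SPDK implementation; it assumes a period of 0 means "run on every pass" and that the caller supplies the current tick count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all tp_ names are hypothetical, not SPDK APIs. */
typedef int (*tp_poller_fn)(void *arg);

struct tp_poller {
        uint64_t period_ticks;   /* 0 means: run on every pass */
        uint64_t next_run_tick;  /* earliest tick at which a timed poller runs again */
        tp_poller_fn fn;
        void *arg;
};

void tp_poller_init(struct tp_poller *p, tp_poller_fn fn, void *arg,
                    uint64_t period_ticks, uint64_t now)
{
        p->fn = fn;
        p->arg = arg;
        p->period_ticks = period_ticks;
        p->next_run_tick = now + period_ticks;
}

/* Returns 1 if the poller function was called, 0 if it was not yet due. */
int tp_poller_run_if_due(struct tp_poller *p, uint64_t now)
{
        assert(p != NULL && p->fn != NULL);
        if (p->period_ticks != 0 && now < p->next_run_tick) {
                return 0;
        }
        p->fn(p->arg);
        p->next_run_tick = now + p->period_ticks;
        return 1;
}

/* Example poller body: counts how often it was invoked. */
int tp_count_calls(void *arg)
{
        (*(int *)arg)++;
        return 1;
}
```

Driving two such pollers for 30 ticks, one with period 0 and one with period 10, calls the first 30 times and the second twice (at ticks 10 and 20).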

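How callbacks of reaped I/O are executed

The call chain, with each entry called by the one below it:
* bdev_io->internal.cb(bdev_io, bdev_io->internal.status == SPDK_BDEV_IO_STATUS_SUCCESS, ..);
* _spdk_bdev_io_complete(void *ctx)
* spdk_bdev_io_complete()
* spdk_bdev_io_complete_scsi_status() or spdk_bdev_io_complete_nvme_status()

Taking spdk_bdev_io_complete_nvme_status as an example: it completes a bdev I/O and carries back a status value together with an entry from the completion queue.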
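The completion path can be modeled in the same spirit. The following sketch uses hypothetical cpl_ names and is not SPDK code: a backend-specific wrapper translates a device result into a generic status, and the generic completion function stores that status and hands the user callback a simple success flag, mirroring the internal.status == SPDK_BDEV_IO_STATUS_SUCCESS comparison in the chain above. Treating a negative result as failure is an assumption made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all cpl_ names are hypothetical, not SPDK APIs. */
enum cpl_status { CPL_STATUS_PENDING, CPL_STATUS_SUCCESS, CPL_STATUS_FAILED };

struct cpl_io;
typedef void (*cpl_io_cb)(struct cpl_io *io, bool success, void *cb_arg);

struct cpl_io {
        enum cpl_status status;
        cpl_io_cb cb;           /* user callback supplied at submission time */
        void *cb_arg;
};

/* Generic layer: record the status, then invoke the user callback. */
void cpl_io_complete(struct cpl_io *io, enum cpl_status status)
{
        assert(io != NULL);
        io->status = status;
        if (io->cb != NULL) {
                io->cb(io, io->status == CPL_STATUS_SUCCESS, io->cb_arg);
        }
}

/* Backend-flavoured wrapper: maps a device result to a generic status.
 * Assumption for this sketch: a negative result means failure. */
void cpl_io_complete_aio_result(struct cpl_io *io, long res)
{
        cpl_io_complete(io, res >= 0 ? CPL_STATUS_SUCCESS : CPL_STATUS_FAILED);
}

/* Example user callback: counts successes and failures. */
struct cpl_counters {
        int ok;
        int failed;
};

void cpl_count_cb(struct cpl_io *io, bool success, void *cb_arg)
{
        struct cpl_counters *c = cb_arg;

        (void)io;
        if (success) {
                c->ok++;
        } else {
                c->failed++;
        }
}
```

The user of the block layer only ever sees the boolean; the device-specific detail stays in the wrapper, just as the scsi and nvme variants sit above spdk_bdev_io_complete() in the chain.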

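Main principles of the SPDK bdev framework

A key idea in SPDK is run-to-completion: an I/O should, as far as possible, be processed on a single thread. Looking at how I/O submission and reaping (the poller) end up executing on the same thread helps to understand this.

The basis of SPDK run-to-completion: spdk_thread

Understanding the SPDK thread is the key to understanding the bdev mechanism, and the data structure that most needs to be understood here is spdk_thread: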

struct spdk_thread {
        TAILQ_HEAD(, spdk_io_channel)   io_channels; // thread-specific queue of I/O channels
        TAILQ_ENTRY(spdk_thread)        tailq;
        char                            *name;

        uint64_t                        tsc_last;
        struct spdk_thread_stats        stats;

        /*
         * Contains pollers actively running on this thread.  Pollers
         *  are run round-robin. The thread takes one poller from the head
         *  of the ring, executes it, then puts it back at the tail of
         *  the ring.
         */
        TAILQ_HEAD(, spdk_poller)       active_pollers; // pollers that reap completed I/O, gather statistics, and so on; this is the reaping side

        /**
         * Contains pollers running on this thread with a periodic timer.
         */
        TAILQ_HEAD(timer_pollers_head, spdk_poller) timer_pollers;

        struct spdk_ring                *messages; // receives I/O requests passed in from other threads and queues them for processing; this is the submission side

        SLIST_HEAD(, spdk_msg)          msg_cache;
        size_t                          msg_cache_count;
};

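When a bdev I/O is submitted, the path usually ends in spdk_thread_send_msg, typically when the request has to be handed over to another thread. This function packs the triple (const struct spdk_thread *thread, spdk_msg_fn fn, void *ctx) it receives into the data structure below and inserts it into the messages queue above. This is the crucial point:
1. Putting the operation on a queue makes the request execute asynchronously.
2. Because the queue is lock-free, concurrent submission needs no locks.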

struct spdk_msg {
        spdk_msg_fn             fn;
        void                    *arg;

        SLIST_ENTRY(spdk_msg)   link;
};
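To make the "message on a lock-free queue" idea concrete, here is a minimal sketch using C11 atomics. It is not the SPDK ring: the mq_ names are hypothetical, and the queue is deliberately restricted to one producer and one consumer to keep it short, whereas a thread's message ring has to accept messages from several producers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MQ_SLOTS 8              /* power of two, so indices wrap with a mask */

/* Toy model only: all mq_ names are hypothetical, not SPDK APIs. */
typedef void (*mq_msg_fn)(void *ctx);

struct mq_msg {
        mq_msg_fn fn;
        void *arg;
};

struct mq_thread {
        struct mq_msg slots[MQ_SLOTS];
        atomic_size_t head;     /* next slot to consume; written only by the consumer */
        atomic_size_t tail;     /* next slot to fill; written only by the producer */
};

void mq_thread_init(struct mq_thread *t)
{
        atomic_init(&t->head, 0);
        atomic_init(&t->tail, 0);
}

/* Producer side: enqueue (fn, ctx). Returns 0 on success, -1 if the ring is full. */
int mq_send_msg(struct mq_thread *t, mq_msg_fn fn, void *ctx)
{
        size_t tail = atomic_load_explicit(&t->tail, memory_order_relaxed);
        size_t head = atomic_load_explicit(&t->head, memory_order_acquire);

        assert(fn != NULL);
        if (tail - head == MQ_SLOTS) {
                return -1;
        }
        t->slots[tail & (MQ_SLOTS - 1)].fn = fn;
        t->slots[tail & (MQ_SLOTS - 1)].arg = ctx;
        /* Publish the slot only after its contents are written. */
        atomic_store_explicit(&t->tail, tail + 1, memory_order_release);
        return 0;
}

/* Consumer side: run up to max_msgs queued messages, return how many ran. */
size_t mq_run_batch(struct mq_thread *t, size_t max_msgs)
{
        size_t done = 0;

        while (done < max_msgs) {
                size_t head = atomic_load_explicit(&t->head, memory_order_relaxed);
                size_t tail = atomic_load_explicit(&t->tail, memory_order_acquire);

                if (head == tail) {
                        break;  /* ring is empty */
                }
                struct mq_msg msg = t->slots[head & (MQ_SLOTS - 1)];
                /* Free the slot before running the message. */
                atomic_store_explicit(&t->head, head + 1, memory_order_release);
                msg.fn(msg.arg);
                done++;
        }
        return done;
}

/* Example message body: increments an int counter. */
void mq_add_one(void *ctx)
{
        (*(int *)ctx)++;
}
```

Sending a message never runs it: the function executes later, on the thread that calls mq_run_batch(), which is what makes submission asynchronous without any mutex.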

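During bdev initialization, the reaping function of the actual device or I/O mode (for libaio, the function that calls io_getevents()) is registered in the poller data structure below: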

struct spdk_poller {
        TAILQ_ENTRY(spdk_poller)        tailq;

        /* Current state of the poller; should only be accessed from the poller's thread. */
        enum spdk_poller_state          state;

        uint64_t                        period_ticks;
        uint64_t                        next_run_tick;
        spdk_poller_fn                  fn;
        void                            *arg;
};

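Implementing run-to-completion on top of spdk_thread

bdev provides a threading model with asynchronous submission and reaping. The timer-like SPDK poller mechanism reaps completed I/O events periodically. With SPDK messages and the lock-free queue, a submitting thread calls spdk_thread_send_msg() to add an I/O request to the processing queue; the requests are then submitted in batches on a single thread, inside spdk_thread_poll(struct spdk_thread *thread, uint32_t max_msgs, uint64_t now), by msg_count = _spdk_msg_queue_run_batch(thread, max_msgs);. As soon as one batch has been submitted, a batch of completed I/O events is reaped (timer_rc = poller->fn(poller->arg);).

Because submission and reaping happen in the same function, this achieves run-to-completion.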
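The combined loop can be sketched as a single poll function that first runs a batch of queued messages and then calls the reaping poller. This is a toy model with hypothetical rtc_ names, not SPDK code: it is single-threaded, and the "device" completes everything that was submitted as soon as it is reaped.

```c
#include <assert.h>
#include <stddef.h>

#define RTC_MAX_MSGS 16

/* Toy model only: all rtc_ names are hypothetical, not SPDK APIs. */
typedef void (*rtc_msg_fn)(void *ctx);

struct rtc_msg {
        rtc_msg_fn fn;
        void *arg;
};

struct rtc_thread {
        struct rtc_msg msgs[RTC_MAX_MSGS];  /* pending messages (plain ring, not lock-free) */
        size_t msg_head;
        size_t msg_count;
        int (*poller_fn)(void *arg);        /* reaping function, like bdev_aio_group_poll */
        void *poller_arg;
};

/* Pretend device: tracks submitted-but-unreaped and completed I/Os. */
struct rtc_device {
        int inflight;
        int completed;
};

/* Message body: "submit" one I/O to the device. */
void rtc_submit_io(void *ctx)
{
        ((struct rtc_device *)ctx)->inflight++;
}

/* Poller body: reap everything in flight, return the number reaped. */
int rtc_reap(void *arg)
{
        struct rtc_device *d = arg;
        int n = d->inflight;

        d->completed += n;
        d->inflight = 0;
        return n;
}

/* Queue a message for the thread. Returns 0 on success, -1 if full. */
int rtc_send_msg(struct rtc_thread *t, rtc_msg_fn fn, void *ctx)
{
        assert(fn != NULL);
        if (t->msg_count == RTC_MAX_MSGS) {
                return -1;
        }
        size_t slot = (t->msg_head + t->msg_count) % RTC_MAX_MSGS;
        t->msgs[slot].fn = fn;
        t->msgs[slot].arg = ctx;
        t->msg_count++;
        return 0;
}

/* One pass: run up to max_msgs messages, then reap. Returns 1 if any work was done. */
int rtc_thread_poll(struct rtc_thread *t, size_t max_msgs)
{
        size_t ran = 0;
        int reaped = 0;

        while (t->msg_count > 0 && ran < max_msgs) {
                struct rtc_msg msg = t->msgs[t->msg_head];

                t->msg_head = (t->msg_head + 1) % RTC_MAX_MSGS;
                t->msg_count--;
                msg.fn(msg.arg);
                ran++;
        }
        if (t->poller_fn != NULL) {
                reaped = t->poller_fn(t->poller_arg);
        }
        return (ran > 0 || reaped > 0) ? 1 : 0;
}
```

Queuing five submissions and polling with a batch size of 3 completes three I/Os in the first pass and the remaining two in the second; a third pass finds nothing to do and returns 0. Submission and reaping never leave the polling thread, so no locking is needed around the device state.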

