So you observed the bandwidth from dd's output?
That's not right. dd reports page cache write bandwidth, while iocost
controls the block layer. You have to observe block-layer bandwidth through
the cgroup stats.
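For example, something like the following rough sketch should work on cgroup v1
(blkio.throttle.io_service_bytes is assumed here based on the standard blkio
controller; adjust the 8:0 device numbers to your disk):

# sample the bytes each cgroup actually issued to the block layer
grep '^8:0 Write' /sys/fs/cgroup/blkio/blkcg_be/blkio.throttle.io_service_bytes
grep '^8:0 Write' /sys/fs/cgroup/blkio/blkcg_lc/blkio.throttle.io_service_bytes
sleep 10
grep '^8:0 Write' /sys/fs/cgroup/blkio/blkcg_be/blkio.throttle.io_service_bytes
grep '^8:0 Write' /sys/fs/cgroup/blkio/blkcg_lc/blkio.throttle.io_service_bytes
# (bytes_after - bytes_before) / 10 gives each cgroup's block-layer write bandwidth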
Thanks,
Joseph
On 9/29/22 8:14 PM, 王传国 wrote:
Hi hongyun,
Still no effect. My operation steps are as follows:
1. Add a 60G SCSI disk to my qemu VM and format it as ext4;
2. python3 iocost_coef_gen.py --testdev /dev/sda got:
8:0 rbps=14335615141 rseqiops=93650 rrandiops=90693 wbps=1178201578 wseqiops=82077 wrandiops=77142
3. Re-format sda as ext4;
4. Initialize as follows:
mount /dev/sda1 /wcg/data2/
echo "8:0 rbps=14335615141 rseqiops=93650 rrandiops=90693 wbps=1178201578
wseqiops=82077 wrandiops=77142" > /sys/fs/cgroup/blkio/blkio.cost.model
echo "8:0 enable=1 ctrl=user rpct=95.00 rlat=5000 wpct=95.00 wlat=5000 min=50.00
max=150.00" > /sys/fs/cgroup/blkio/blkio.cost.qos
cd /sys/fs/cgroup/blkio
mkdir blkcg_be blkcg_lc
echo "8:0 50" > /sys/fs/cgroup/blkio/blkcg_be/blkio.cost.weight
echo "8:0 1000" > /sys/fs/cgroup/blkio/blkcg_lc/blkio.cost.weight
echo 0 > /sys/block/sda/queue/rotational
5. Execute the following two commands in two terminals at the same time:
echo $$ > /sys/fs/cgroup/blkio/blkcg_be/cgroup.procs
dd if=/dev/zero of=/wcg/data2/ddfile1 bs=1M count=20480
-------------------
echo $$ > /sys/fs/cgroup/blkio/blkcg_lc/cgroup.procs
dd if=/dev/zero of=/wcg/data2/ddfile2 bs=1M count=20480
6. Repeated this 5 times; both dd runs got about 550 MB/s, which I think is wrong.
(blkcg_be should still be only partway done when blkcg_lc finishes, so the two speeds should be nearly 1:2.)
What did I do wrong?
On 2022-09-29 16:38:41, "钱君(弘云)" <hongyun.qj(a)alibaba-inc.com> wrote:
Usage instructions
Step 1: Generate the cost model data for the target disk
When doing IO evaluation you need the iocost model data. iocost_coef_gen.py is used to obtain the model data; this script can be taken from the kernel source…
[root@iZbp14ah12fefuzd6rh5rkZ ~]# python3 iocost_coef_gen.py --testdev /dev/vdc
Test target: vdc(253:32)
Temporarily disabling elevator and merges
Determining rbps...
Jobs: 1 (f=1): [R(1)][100.0%][r=128MiB/s,w=0KiB/s][r=1,w=0 IOPS][eta 00m:00s]
rbps=179879083, determining rseqiops...
Jobs: 1 (f=1): [R(1)][100.0%][r=26.5MiB/s,w=0KiB/s][r=6791,w=0 IOPS][eta 00m:00s]
rseqiops=6862, determining rrandiops...
Jobs: 1 (f=1): [r(1)][100.0%][r=26.6MiB/s,w=0KiB/s][r=6800,w=0 IOPS][eta 00m:00s]
rrandiops=6830, determining wbps...
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=128MiB/s][r=0,w=1 IOPS][eta 00m:00s]
wbps=179882078, determining wseqiops...
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=26.6MiB/s][r=0,w=6798 IOPS][eta 00m:00s]
wseqiops=6862, determining wrandiops...
Jobs: 1 (f=1): [w(1)][100.0%][r=0KiB/s,w=26.6MiB/s][r=0,w=6806 IOPS][eta 00m:00s]
wrandiops=6830
Restoring elevator to none and nomerges to 0
253:32 rbps=179879083 rseqiops=6862 rrandiops=6830 wbps=179882078 wseqiops=6862 wrandiops=6830
Then write the data from the last line into the cost model file of the corresponding disk, as shown below:
echo "253:32 rbps=179879083 rseqiops=6862 rrandiops=6830 wbps=179882078 wseqiops=6862 wrandiops=6830" > /sys/fs/cgroup/blkio/blkio.cost.model
Note: you do not need to run this on every machine. The model data is the same for identical disks, so it only needs to be obtained once; then write the data into the blkio.cost.model interface file in the blkio root directory.
Step 2: Configure the disk QoS and enable blk-iocost
Here we use the cost.qos interface to enable blk-iocost for device 253:32. When 95% of read or write requests see a latency (rlat|wlat) above 5 ms, the disk is considered saturated, and the kernel will adjust the rate at which requests are issued to the disk, within a range from 50% to 150% of the original rate:
echo "253:32 enable=1 ctrl=user rpct=95.00 rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/blkio/blkio.cost.qos
Step 3: Assign io weights to containers
Here you can assign different io weights according to each business container's io latency class. Suppose we set the io weight of the be blkcg to 50 and that of the lc blkcg to 1000:
echo "253:32 50" > /sys/fs/cgroup/blkio/blkcg_be/blkio.cost.weight
echo "253:32 50" > /sys/fs/cgroup/blkio/blkcg_lc/blkio.cost.weight
This way, when IO resource usage is saturated, io resources are distributed among the blkcgs according to their io weights.
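As a rough way to verify the split (a sketch only; the file paths and fio options below are assumptions, and direct IO is used so the numbers reflect block-layer bandwidth rather than the page cache):

# terminal 1: low-priority cgroup
echo $$ > /sys/fs/cgroup/blkio/blkcg_be/cgroup.procs
fio --name=be --filename=/mnt/test/fio_be --rw=write --bs=1M --size=4G --ioengine=libaio --iodepth=32 --direct=1

# terminal 2: high-priority cgroup
echo $$ > /sys/fs/cgroup/blkio/blkcg_lc/cgroup.procs
fio --name=lc --filename=/mnt/test/fio_lc --rw=write --bs=1M --size=4G --ioengine=libaio --iodepth=32 --direct=1

# when both run concurrently and keep the disk saturated, the reported
# bandwidths should roughly follow the 50:1000 weight ratio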
Notes
When enabling blk-iocost with the ctrl=auto option on an ECS instance, if the corresponding disk is an ultra disk, SSD cloud disk, ESSD cloud disk, or local NVMe SSD, you need to manually set the disk's rotational attribute to 0:
# [$DISK_NAME] is the disk name
echo 0 > /sys/block/[$DISK_NAME]/queue/rotational
Also, be sure not to use a partition's major and minor numbers as the configuration parameters; you must use the whole disk's major and minor numbers.
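For example, the whole-disk numbers can be looked up like this (the device name is illustrative):
# use the MAJ:MIN shown for the disk itself (e.g. 253:32), not for a partition
lsblk -o NAME,MAJ:MIN /dev/vdc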
------------------------------------------------------------------
From: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Date: 2022-09-29 16:26:55
To: 王传国 <wangchuanguo(a)163.com>; 钱君(弘云) <hongyun.qj(a)alibaba-inc.com>
Cc: <cloud-kernel(a)lists.openanolis.cn>; <storage(a)lists.openanolis.cn>
Subject: Re: [ck]Re: cgroup2 io weight has no effect
Sure.
Hi Jun Qian, could you please share sample iocost steps for buffer
io weight control?
Thanks,
Joseph
On 9/29/22 4:09 PM, 王传国 wrote:
> Hi,
> I'm using v1.
> I do not need bfq, but mq-deadline also has no effect.
> Buffered IO is my target (I've added cgwb_v1 to grub). Without a throttle setting,
> it would be too fast to observe.
> Can you give a demo shell with 2 different blkio.cost.weight values under
> /sys/fs/cgroup/blkio?
> Thanks very much!
>
>
>
>
> On 2022-09-29 15:52:16, "Joseph Qi" <joseph.qi(a)linux.alibaba.com> wrote:
>> Hi,
>>
>> Which cgroup version do you use? cgroup v1 or v2?
>>
>> For bfq, I don't have any experience with weight control.
>> For iocost, it's better to specify qos and model, according to the documentation
>> suggested before.
>>
>> It seems you've mixed bfq, iocost, and block throttle together. I'd suggest you
>> evaluate them individually and use direct io first.
>>
>> Thanks,
>> Joseph
>>
>> On 9/29/22 3:26 PM, 王传国 wrote:
>>> Hi Joseph,
>>> Thanks for your reply! But I have 2 questions:
>>> 1. Why does blkio.bfq.weight have no effect after "echo bfq > /sys/block/vdb/queue/scheduler"?
>>> 2. iocost didn't work either; both fio jobs got 5M, but 3M and 6M is what I want. Please point out my mistakes!
>>> Thanks very much!
>>> And my shell is like below:
>>>
>>> mount /dev/vdb1 /wcg/data2/
>>>
>>> cd /sys/fs/cgroup/blkio
>>>
>>> echo bfq > /sys/block/vdb/queue/scheduler
>>>
>>> echo 0 > /sys/block/vdb/queue/iosched/low_latency
>>>
>>> echo "253:16 10485760" > blkio.throttle.write_bps_device
>>>
>>> echo "253:16 enable=1" > blkio.cost.qos
>>>
>>> echo "253:16 ctrl=auto" > blkio.cost.model
>>>
>>> echo 0 > /sys/block/vdb/queue/rotational
>>>
>>> mkdir fio1 fio2
>>>
>>> echo "253:16 100" > fio1/blkio.cost.weight
>>>
>>> echo "253:16 200" > fio2/blkio.cost.weight
>>>
>>>
>>>
>>>
>>> echo $$ > /sys/fs/cgroup/blkio/fio1/cgroup.procs
>>>
>>> fio -rw=write -ioengine=libaio -bs=4k -size=1G -numjobs=1 -name=/wcg/data2/fio_test1.log
>>>
>>>
>>>
>>>
>>> #do follows at another console
>>>
>>> echo $$ > /sys/fs/cgroup/blkio/fio2/cgroup.procs
>>>
>>> fio -rw=write -ioengine=libaio -bs=4k -size=1G -numjobs=1 -name=/wcg/data2/fio_test2.log
>>>
>>> On 2022-09-28 16:41:40, "Joseph Qi" <joseph.qi(a)linux.alibaba.com> wrote:
>>>> 'io.weight' is for the cfq io scheduler, while 'io.bfq.weight' is for the bfq io
>>>> scheduler, as its name indicates.
>>>> So you may need to configure the corresponding io scheduler as well.
>>>>
>>>> BTW, if you want io weight control, I recommend another approach named io
>>>> cost. The following documentation may help to understand the details:
>>>> https://help.aliyun.com/document_detail/155863.html
>>>>
>>>> Thanks,
>>>> Joseph
>>>>
>>>> On 9/28/22 1:50 PM, 王传国 wrote:
>>>>> Dear colleagues,
>>>>>
>>>>> I see that cgroup2 has io.weight and io.bfq.weight; what is the difference between them?
>>>>>
>>>>> My understanding is that they control the IO weights of sibling groups under a parent group. I tested this on the version below and the results don't look right. Could someone point me in the right direction? Many thanks!
>>>>>
>>>>> # uname -a
>>>>>
>>>>> Linux localhost.localdomain 4.19.91-26.an8.x86_64 #1 SMP Tue May 24 13:10:09 CST 2022 x86_64 x86_64 x86_64 GNU/Linux
>>>>>
>>>>>
>>>>>
>>>>> My test script:
>>>>>
>>>>> # switch to cgroup2 by adding cgroup_no_v1=all to the grub parameters
>>>>>
>>>>> mkdir -p /aaa/cg2
>>>>>
>>>>> mkdir -p /aaa/data2
>>>>>
>>>>> mount -t cgroup2 nodev /aaa/cg2
>>>>>
>>>>> mount /dev/sdb1 /aaa/data2/
>>>>>
>>>>> echo bfq > /sys/block/vdb/queue/scheduler  # tried with and without this
>>>>>
>>>>>
>>>>>
>>>>> mkdir /aaa/cg2/test
>>>>>
>>>>> echo "+io +memory" > /aaa/cg2/cgroup.subtree_control
>>>>>
>>>>> echo "+io +memory" >
/aaa/cg2/test/cgroup.subtree_control
>>>>>
>>>>> cat /aaa/cg2/test/cgroup.controllers
>>>>>
>>>>> echo "8:16 wbps=10485760" > /aaa/cg2/test/io.max
>>>>>
>>>>> echo $$ > /aaa/cg2/test/cgroup.procs
>>>>>
>>>>>
>>>>>
>>>>> mkdir -p /aaa/cg2/test/dd1
>>>>>
>>>>> mkdir -p /aaa/cg2/test/dd2
>>>>>
>>>>> echo 200 > /aaa/cg2/test/dd1/io.weight
>>>>>
>>>>> #echo 200 > /aaa/cg2/test/dd1/io.bfq.weight  # tried both options
>>>>>
>>>>>
>>>>>
>>>>> # Run the following 2 tests in 2 other terminals:
>>>>>
>>>>> echo $$ > /aaa/cg2/test/dd1/cgroup.procs
>>>>>
>>>>> dd if=/dev/zero of=/aaa/data2/ddfile1 bs=128M count=1
>>>>>
>>>>>
>>>>>
>>>>> echo $$ > /aaa/cg2/test/dd2/cgroup.procs
>>>>>
>>>>> dd if=/dev/zero of=/aaa/data2/ddfile2 bs=128M count=1
>>>>>
>>>>>
>>>>>
>>>>> I got two results of 500K+ each, instead of the expected 300K+ and 600K!
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Cloud Kernel mailing list -- cloud-kernel(a)lists.openanolis.cn
>>>>> To unsubscribe send an email to cloud-kernel-leave(a)lists.openanolis.cn