导读
昨天有个群友问: select x from table where varchar = 0;
(未加引号)能把所有数据查询出来, 问是否是BUG.
对于数据类型不一致的做比较, 就是mysql帮我们做了隐式转换. 隐式转换规则如下:
If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe equality comparison operator. For NULL NULL, the result is true. No conversion is needed.
If both arguments in a comparison operation are strings, they are compared as strings.
If both arguments are integers, they are compared as integers.
Hexadecimal values are treated as binary strings if not compared to a number.
If one of the arguments is a TIMESTAMP or DATETIME column and the other argument is a constant, the constant is converted to a timestamp before the comparison is performed. This is done to be more ODBC-friendly. This is not done for the arguments to IN(). To be safe, always use complete datetime, date, or time strings when doing comparisons. For example, to achieve best results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to the desired data type.
A single-row subquery from a table or tables is not considered a constant. For example, if a subquery returns an integer to be compared to a DATETIME value, the comparison is done as two integers. The integer is not converted to a temporal value. To compare the operands as DATETIME values, use CAST() to explicitly convert the subquery value to DATETIME.
If one of the arguments is a decimal value, comparison depends on the other argument. The arguments are compared as decimal values if the other argument is a decimal or integer value, or as floating-point values if the other argument is a floating-point value.
In all other cases, the arguments are compared as floating-point (double-precision) numbers. For example, a comparison of string and numeric operands takes place as a comparison of floating-point numbers.
我们这里属于最后一种, 即转为double类型来比较.
那为啥被转为0了呢? 难道预示着宇宙的终极奥秘?
深入分析
猜测1: 默认值
我们知道double的默认值是0, 那么是转换失败使用默认值吗?
我们验证发现 double 的默认值确实是0, 符合我们的猜测, 但mysql会使用这种未初始化的值吗? 感觉不太可能. 所以进一步分析
GDB分析
我们第一次分析这种问题, 还不知道在哪打断点, 咋办呢? 没关系. 我们打一个通用的断点, 即dispatch_command 从解析包开始
但手动做next/step 还是太麻烦了, 所以我们使用脚本来分析.
(echo -e "break dispatch_command\ncontinue"; while true;do echo 'step';done) | gdb -p `pidof mysqld` > /tmp/t20240702_dispatch_command.gdb.txt 2>&1
我们直接全部step, 然后搜索下关键字 double 就能看到这么一个函数double_from_string_with_check , 然后看下函数调用, 发现最终调用的是my_strtod_int (名字也能看出来是str转为double).
返回值是dval($rv) , 而该对象初始化为0. 符合我们的要求
堆栈信息就只能看到这里了. 我们修改下初始化值, 然后重新编译验证下.
为了方便调试, 就写成脚本了. 编译启动快一点.(老年人不习惯vs之类的)
我们发现 where name = 0
的时候就查询不出来数据了, 而使用我们修改的默认值1
就能查询出来数据. 说明我们找正确了.
那这是属于BUG吗? 毕竟结果不符合我们的预期. 个人认为不是BUG,而是特性. 默认值在很多场景还是很好用的, 比如你连接数据库的时候,可以不用指定默认的3306端口.
题外话
说到这里, 我想到了之前的double存在-0的时候的问题了.
感兴趣的自己去看吧: https://bugs.mysql.com/bug.php?id=114962
参考:
https://dev.mysql.com/doc/refman/8.0/en/type-conversion.html