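Common Elasticsearch Queries Explained
We use Elasticsearch (ES) for queries all the time in our projects. This post walks through the most common query types, match, term, and bool, using a products index as the running example.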
PUT /products
{
"mappings": {
"properties": {
"product_id": { "type": "integer" },
"name": { "type": "text" },
"category": { "type": "keyword" },
"price": { "type": "float" }
}
}
}
Sample data
[
{
"product_id": 1,
"name": "iPhone 13 Pro",
"category": "Electronics",
"price": 1099
},
{
"product_id": 2,
"name": "Samsung Galaxy",
"category": "Electronics",
"price": 899
},
{
"product_id": 3,
"name": "Nike Running Shoes",
"category": "Sportswear",
"price": 99
},
{
"product_id": 4,
"name": "Sony Headphones",
"category": "Electronics",
"price": 199
},
{
"product_id": 5,
"name": "Canon EOS R5",
"category": "Photography",
"price": 3499
},
{
"product_id": 6,
"name": "Adidas Soccer Ball",
"category": "Sports",
"price": 20
},
{
"product_id": 7,
"name": "Logitech Keyboard",
"category": "Electronics",
"price": 79
},
{
"product_id": 8,
"name": "Dell Laptop",
"category": "Electronics",
"price": 1299
}
]
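The array above is a list of documents, not a request that Elasticsearch accepts as-is. One common way to load such a list is the _bulk API, which expects newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end of the body. Below is a minimal sketch that only builds that body. Reusing product_id as the document _id and the helper name build_bulk_body are assumptions made for this post.

```python
import json

# Sample documents from the list above.
PRODUCTS = [
    {"product_id": 1, "name": "iPhone 13 Pro", "category": "Electronics", "price": 1099},
    {"product_id": 2, "name": "Samsung Galaxy", "category": "Electronics", "price": 899},
    {"product_id": 3, "name": "Nike Running Shoes", "category": "Sportswear", "price": 99},
    {"product_id": 4, "name": "Sony Headphones", "category": "Electronics", "price": 199},
    {"product_id": 5, "name": "Canon EOS R5", "category": "Photography", "price": 3499},
    {"product_id": 6, "name": "Adidas Soccer Ball", "category": "Sports", "price": 20},
    {"product_id": 7, "name": "Logitech Keyboard", "category": "Electronics", "price": 79},
    {"product_id": 8, "name": "Dell Laptop", "category": "Electronics", "price": 1299},
]


def build_bulk_body(docs, index="products"):
    """Build an NDJSON body for the _bulk API (hypothetical helper)."""
    lines = []
    for doc in docs:
        # Action line: index this document. Using product_id as _id is an assumption.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc["product_id"])}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(PRODUCTS)
print(bulk_body)
```

The resulting string would be sent as the body of POST /_bulk with the Content-Type header set to application/x-ndjson.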
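1. Match query:
A match query analyzes the query text before searching, so on a text field such as name a search for "galaxy" finds "Samsung Galaxy". No product name contains "Electronics", so the example below searches the category field instead. Because category is a keyword field, the query text is not analyzed there and the query behaves like an exact match.
Query:
{
"query": {
"match": {
"category": "Electronics"
}
}
}
Result:
[
{
"product_id": 1,
"name": "iPhone 13 Pro",
"category": "Electronics",
"price": 1099
},
{
"product_id": 2,
"name": "Samsung Galaxy",
"category": "Electronics",
"price": 899
},
{
"product_id": 4,
"name": "Sony Headphones",
"category": "Electronics",
"price": 199
},
{
"product_id": 7,
"name": "Logitech Keyboard",
"category": "Electronics",
"price": 79
},
{
"product_id": 8,
"name": "Dell Laptop",
"category": "Electronics",
"price": 1299
}
]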
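2. Term query:
A term query looks up the exact value without analyzing it, which makes it a good fit for keyword and numeric fields. In the mapping above category is already a keyword field, so it is queried directly; there is no category.keyword subfield.
Query:
{
"query": {
"term": {
"category": "Electronics"
}
}
}
Result: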
[
{
"product_id": 1,
"name": "iPhone 13 Pro",
"category": "Electronics",
"price": 1099
},
{
"product_id": 2,
"name": "Samsung Galaxy",
"category": "Electronics",
"price": 899
},
{
"product_id": 4,
"name": "Sony Headphones",
"category": "Electronics",
"price": 199
},
{
"product_id": 7,
"name": "Logitech Keyboard",
"category": "Electronics",
"price": 79
},
{
"product_id": 8,
"name": "Dell Laptop",
"category": "Electronics",
"price": 1299
}
]
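The difference between match and term comes down to analysis. A text field is broken into tokens at index time, and match analyzes the query text the same way, while term compares the raw value against what was indexed. The sketch below imitates this in plain Python. The analyze function is only a rough stand-in for the default standard analyzer (lowercasing and splitting on non-alphanumeric characters), and all function names are made up for illustration.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def index_text(value):
    # A text field stores the analyzed tokens.
    return set(analyze(value))


def index_keyword(value):
    # A keyword field stores the whole value as a single, unmodified term.
    return {value}


def term_query(indexed_terms, value):
    # term: the value is looked up as-is, with no analysis.
    return value in indexed_terms


def match_query(indexed_terms, text):
    # match on a text field: analyze the query text, then match any token (default OR).
    return any(token in indexed_terms for token in analyze(text))


name_terms = index_text("iPhone 13 Pro")        # {"iphone", "13", "pro"}
category_terms = index_keyword("Electronics")   # {"Electronics"}

print(match_query(name_terms, "iPhone"))           # True
print(term_query(name_terms, "iPhone"))            # False
print(term_query(category_terms, "Electronics"))   # True
print(term_query(category_terms, "electronics"))   # False
```

This is why a term query against a text field often returns nothing: the indexed tokens are lowercased, but the term value is not.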
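3. Bool query:
Query (name is a text field with no keyword subfield in the mapping above, so the exclusion uses match_phrase on name rather than a term query on name.keyword):
{
"query": {
"bool": {
"must": [
{ "match": { "category": "Electronics" }},
{ "range": { "price": { "gte": 200 }}}
],
"must_not": [
{ "match_phrase": { "name": "Samsung Galaxy" }}
]
}
}
}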
Result (Sony Headphones costs 199, so it does not satisfy price >= 200):
[
{
"product_id": 1,
"name": "iPhone 13 Pro",
"category": "Electronics",
"price": 1099
},
{
"product_id": 8,
"name": "Dell Laptop",
"category": "Electronics",
"price": 1299
}
]
4. Filter query:
Query:
{
"query": {
"bool": {
"filter": [
{ "range": { "price": { "lte": 100 }}}
]
}
}
}
Result:
[
{
"product_id": 3,
"name": "Nike Running Shoes",
"category": "Sportswear",
"price": 99
},
{
"product_id": 6,
"name": "Adidas Soccer Ball",
"category": "Sports",
"price": 20
},
{
"product_id": 7,
"name": "Logitech Keyboard",
"category": "Electronics",
"price": 79
}
]
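A filter query is in fact just one clause of the bool query. By combining the filter, must, should, and must_not clauses, a bool query can express very flexible conditions.
How filter, must, should, and must_not work in a bool query
The examples below illustrate the filter, must, should, and must_not clauses of a bool query.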
Assume the products index contains the following data:
| product_id | name | category | price |
|---|---|---|---|
| 1 | iPhone 13 Pro | Electronics | 1099 |
| 2 | Samsung Galaxy | Electronics | 899 |
| 3 | Nike Running Shoes | Sportswear | 99 |
| 4 | Sony Headphones | Electronics | 199 |
| 5 | Canon EOS R5 | Photography | 3499 |
| 6 | Adidas Soccer Ball | Sports | 20 |
| 7 | Logitech Keyboard | Electronics | 79 |
| 8 | Dell Laptop | Electronics | 1299 |
Filter query examples:
1. Price at most 100 and in the Electronics category:
{
"query": {
"bool": {
"filter": [
{ "range": { "price": { "lte": 100 }}},
{ "term": { "category": "Electronics" }}
]
}
}
}
2. Not in the Electronics category and priced between 200 and 1000 (the category condition goes in must_not only; also requiring Electronics in filter would contradict it and match nothing):
{
"query": {
"bool": {
"filter": [
{ "range": { "price": { "gte": 200, "lte": 1000 }}}
],
"must_not": [
{ "term": { "category": "Electronics" }}
]
}
}
}
3. Priced between 100 and 500, or in the Sportswear category (an OR condition goes in should only; repeating the clauses in filter would require both to match):
{
"query": {
"bool": {
"should": [
{ "range": { "price": { "gte": 100, "lte": 500 }}},
{ "term": { "category": "Sportswear" }}
],
"minimum_should_match": 1
}
}
}
4. Either not Electronics and priced at most 1000, or in the Photography category (the first alternative is a nested bool query inside should):
{
"query": {
"bool": {
"should": [
{ "bool": {
"filter": [
{ "range": { "price": { "lte": 1000 }}}
],
"must_not": [
{ "term": { "category": "Electronics" }}
]
}},
{ "term": { "category": "Photography" }}
],
"minimum_should_match": 1
}
}
}
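To sanity-check conditions like the four above without a running cluster, here is a minimal pure-Python sketch that mimics how a bool query decides whether a document matches. It is not Elasticsearch code. The helper names matches, matches_bool, and search are made up for this post, only term, range, and nested bool clauses are handled, and scoring is ignored. The four query bodies are one way to express the four conditions stated above.

```python
import operator

PRODUCTS = [
    {"product_id": 1, "name": "iPhone 13 Pro", "category": "Electronics", "price": 1099},
    {"product_id": 2, "name": "Samsung Galaxy", "category": "Electronics", "price": 899},
    {"product_id": 3, "name": "Nike Running Shoes", "category": "Sportswear", "price": 99},
    {"product_id": 4, "name": "Sony Headphones", "category": "Electronics", "price": 199},
    {"product_id": 5, "name": "Canon EOS R5", "category": "Photography", "price": 3499},
    {"product_id": 6, "name": "Adidas Soccer Ball", "category": "Sports", "price": 20},
    {"product_id": 7, "name": "Logitech Keyboard", "category": "Electronics", "price": 79},
    {"product_id": 8, "name": "Dell Laptop", "category": "Electronics", "price": 1299},
]

RANGE_OPS = {"gte": operator.ge, "gt": operator.gt, "lte": operator.le, "lt": operator.lt}


def matches(clause, doc):
    """Evaluate one term, range, or nested bool clause against a document."""
    kind, body = next(iter(clause.items()))
    if kind == "bool":
        return matches_bool(body, doc)
    field, spec = next(iter(body.items()))
    if kind == "term":
        return doc.get(field) == spec  # exact comparison, as on a keyword field
    if kind == "range":
        return all(RANGE_OPS[op](doc[field], bound) for op, bound in spec.items())
    raise ValueError(f"unsupported clause type: {kind}")


def matches_bool(body, doc):
    """Apply the matching rules of must, filter, must_not, and should."""
    required = body.get("must", []) + body.get("filter", [])
    if not all(matches(clause, doc) for clause in required):
        return False
    if any(matches(clause, doc) for clause in body.get("must_not", [])):
        return False
    should = body.get("should", [])
    # Default: one should clause is required only when there is no must or filter.
    default_min = 1 if should and not required else 0
    minimum = body.get("minimum_should_match", default_min)
    return sum(matches(clause, doc) for clause in should) >= minimum


def search(bool_body):
    return [doc["product_id"] for doc in PRODUCTS if matches_bool(bool_body, doc)]


electronics = {"term": {"category": "Electronics"}}
Q1 = {"filter": [{"range": {"price": {"lte": 100}}}, electronics]}
Q2 = {"filter": [{"range": {"price": {"gte": 200, "lte": 1000}}}], "must_not": [electronics]}
Q3 = {"should": [{"range": {"price": {"gte": 100, "lte": 500}}},
                 {"term": {"category": "Sportswear"}}],
      "minimum_should_match": 1}
Q4 = {"should": [{"bool": {"filter": [{"range": {"price": {"lte": 1000}}}],
                           "must_not": [electronics]}},
                 {"term": {"category": "Photography"}}],
      "minimum_should_match": 1}

print(search(Q1))  # [7]
print(search(Q2))  # []
print(search(Q3))  # [3, 4]
print(search(Q4))  # [3, 5, 6]
```

With this sample data the second condition matches nothing: every product outside Electronics costs either less than 200 or more than 1000.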
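Having gone through these examples, a natural question is: when all four clauses are used together, how does each one affect the result, and in what order are they executed?
I could not find documentation that spells out an execution order for the four clauses, but looking at what each one does for matching and for scoring leads to the following:
- filter and must_not only decide whether a document matches; they do not contribute to the score.
- must and should contribute to the score.
- A document is returned only if it matches every filter clause and every must clause and matches no must_not clause. should clauses are optional when must or filter is present, unless minimum_should_match says otherwise; in a bool query that has only should clauses, at least one of them must match.
Summary
In summary, filter, must, and must_not determine which documents match, while must and should determine how the matching documents are scored. Put simply, filter, must, and must_not narrow down the set of matching documents, and must and should then score the documents that remain before the result set is returned. A simple mental model is that filter, must, and must_not run first, with no particular order among them, and should matters mainly for scoring, except when minimum_should_match (explicit or by default) turns it into a matching condition as well. This ordering is only an aid to understanding, not a description of how Elasticsearch actually schedules the clauses.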
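The split between matching and scoring can be made concrete with a toy model. In the sketch below each clause is a plain Python predicate, and every matching must or should clause adds 1.0 to the score. That constant is a placeholder chosen for this post; Elasticsearch computes real relevance scores (BM25 by default), so only the shape of the result carries over, not the numbers. The function name toy_search is hypothetical.

```python
PRODUCTS = [
    {"product_id": 1, "name": "iPhone 13 Pro", "category": "Electronics", "price": 1099},
    {"product_id": 2, "name": "Samsung Galaxy", "category": "Electronics", "price": 899},
    {"product_id": 3, "name": "Nike Running Shoes", "category": "Sportswear", "price": 99},
    {"product_id": 4, "name": "Sony Headphones", "category": "Electronics", "price": 199},
    {"product_id": 5, "name": "Canon EOS R5", "category": "Photography", "price": 3499},
    {"product_id": 6, "name": "Adidas Soccer Ball", "category": "Sports", "price": 20},
    {"product_id": 7, "name": "Logitech Keyboard", "category": "Electronics", "price": 79},
    {"product_id": 8, "name": "Dell Laptop", "category": "Electronics", "price": 1299},
]


def toy_search(docs, must=(), filter_=(), must_not=(), should=()):
    """Return (product_id, score) pairs, highest score first (toy model, not BM25)."""
    hits = []
    for doc in docs:
        # Matching: every must and filter clause has to hold...
        if not all(clause(doc) for clause in (*must, *filter_)):
            continue
        # ...and no must_not clause may hold.
        if any(clause(doc) for clause in must_not):
            continue
        should_hits = sum(1 for clause in should if clause(doc))
        # With no must or filter, at least one should clause has to match.
        if should and not must and not filter_ and should_hits == 0:
            continue
        # Scoring: only must and should clauses add to the score.
        hits.append((doc["product_id"], float(len(must) + should_hits)))
    # sorted() is stable, so ties keep their original document order.
    return sorted(hits, key=lambda hit: -hit[1])


ranked = toy_search(
    PRODUCTS,
    must=[lambda d: d["category"] == "Electronics"],
    filter_=[lambda d: d["price"] <= 1500],
    must_not=[lambda d: d["name"] == "Samsung Galaxy"],
    should=[lambda d: d["price"] <= 200],
)
print(ranked)  # [(4, 2.0), (7, 2.0), (1, 1.0), (8, 1.0)]

filter_only = toy_search(PRODUCTS, filter_=[lambda d: d["price"] <= 100])
print(filter_only)  # [(3, 0.0), (6, 0.0), (7, 0.0)]
```

The should clause does not remove the iPhone or the Dell laptop from the first result; it only ranks the cheaper products higher. The filter-only search returns every hit with a score of 0.0, which mirrors what Elasticsearch reports when a bool query contains nothing but filter clauses.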