Image-to-Image Search with Redis
莫已诺 2024-05-11 2024-05-12

Build your own image search engine

The RAG search engine formula: corpus + embedding model + vector store = search engine
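Image files + image embedding model + vector store = image search engine
With a different embedding model, the same recipe extends to vector search in vertical domains such as audio or faces.
This post uses clip-ViT-B-32 and Redis (as the vector store) to search images by image or by text.
clip-ViT-B-32 is the CLIP image-and-text model; it maps text and images into a shared vector space.

Runtime environment
Docker, Redis, Python 3.10.9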

Install the dependencies

```shell
pip install sentence-transformers
pip install redis==5.0.1
pip install nacos-sdk-python
pip install flask
```

Start Redis with docker-compose

```yaml
version: "2.4"
services:
  redis-server:
    image: redis/redis-stack-server:7.2.0-v6
    container_name: redis-server
    ports:
      - "16379:6379"
    volumes:
      - /E/Docker/APP/Redis/data/redis-data:/data
```
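
Image vectorization
Load OpenAI's CLIP model from HuggingFace and encode each image into a vector, producing data that can be stored in Redis.

```python
from sentence_transformers import SentenceTransformer, util
from PIL import Image

# Load the CLIP model (downloaded from HuggingFace on first use)
model = SentenceTransformer('clip-ViT-B-32')

# Encode an image into its embedding vector
img_emb = model.encode(Image.open('two_dogs_in_snow.jpg'))
```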
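
What is an embedding?
An embedding measures how related text strings are. Embeddings are commonly used for:
Search (ranking results by relevance to a query string)
Clustering (grouping text strings by similarity)
Recommendations (recommending items with related text strings)
Anomaly detection (identifying outliers that have little relatedness to the rest)
Diversity measurement (analyzing similarity distributions)
Classification (assigning text strings to their most similar label)
An embedding is a vector (list) of floating-point numbers. The distance between two vectors measures their relatedness: a small distance indicates high relatedness, and a large distance indicates low relatedness.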
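
To make "distance measures relatedness" concrete, here is a minimal sketch of cosine distance, the metric the Redis index later in this post is configured with. The three-dimensional vectors and their names are made-up toy values, not real CLIP output (which has 512 dimensions).

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); 0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values chosen for illustration only).
dog_photo = [0.9, 0.1, 0.0]
puppy_photo = [0.8, 0.2, 0.1]
car_photo = [0.0, 0.2, 0.9]

# The two dog-like vectors point in nearly the same direction,
# so their distance is much smaller than the dog/car distance.
print(cosine_distance(dog_photo, puppy_photo))
print(cosine_distance(dog_photo, car_photo))
```

Cosine distance ignores vector length and only compares direction, which is why it is a common default for embedding search.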

Storing the image vectors in Redis

```python
import os
import time

import redis
import torch
from PIL import Image
from sentence_transformers import SentenceTransformer

# Connect to the Redis Stack container started by docker-compose
client = redis.Redis(host="localhost", port=16379, decode_responses=True)
res = client.ping()
print("redis connected:", res)

# Local copy of the CLIP model (raw string keeps the Windows backslashes literal)
img_model_path = r'E:\WorkSpace\CV\models\clip-ViT-B-32'
img_model = SentenceTransformer(model_name_or_path=img_model_path)

# Collect the image files, sorted by file name without extension
image_directory = 'E:/WorkSpace/CV/images/up1'
png_files = [filename for filename in os.listdir(image_directory)]
sorted_png_files = sorted(png_files, key=lambda x: x.split('.')[0])


def slip_replace_once(s, target, replacement):
    """Replace everything up to and including the first `target` with
    `replacement + target`. Not called by this script."""
    index = s.find(target)
    if index != -1:
        return replacement + target + s[index + len(target):]
    else:
        return replacement + target + s


pipeline = client.pipeline()

# Queue a placeholder JSON document per image; it is overwritten with the vector below
for i, png_file in enumerate(sorted_png_files, start=1):
    pipeline.json().set("zerocc-" + png_file, "$", png_file)

with torch.no_grad():
    for idx, png_file in enumerate(sorted_png_files, start=1):
        start = time.time()
        image = Image.open(f"{image_directory}/{png_file}")
        # Encode the image and convert the NumPy array to a plain list for JSON
        embeddings = img_model.encode(image).tolist()
        vector_dimension = len(embeddings)
        print('vector_dimension:', vector_dimension)
        end = time.time()
        print('%s Seconds' % (end - start))
        pipeline.json().set("zerocc-" + png_file, "$", embeddings)

# Send all queued commands to Redis in one round trip
res = pipeline.execute()
print('redis set:', res)
```
Every Redis key gets the custom prefix 'zerocc-', so that the index created in the next step can select these documents by prefix.
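
Since the key is just the prefix plus the file name, mapping between the two is worth doing in one place. The helpers below are a small sketch; `key_for` and `filename_from_key` are hypothetical names introduced here for illustration and are not part of the original scripts.

```python
KEY_PREFIX = "zerocc-"  # same prefix the post uses for its Redis keys


def key_for(filename: str) -> str:
    """Build the Redis key for an image file name (hypothetical helper)."""
    return KEY_PREFIX + filename


def filename_from_key(key: str) -> str:
    """Recover the file name by stripping only the leading prefix."""
    if not key.startswith(KEY_PREFIX):
        raise ValueError(f"not an image key: {key!r}")
    return key[len(KEY_PREFIX):]


print(key_for("we_20240506103121.jpg"))
print(filename_from_key("zerocc-we_20240506103121.jpg"))
```

Slicing off the leading prefix, rather than calling `str.replace`, leaves the file name intact even if the prefix text happens to appear again inside it.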
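
Creating the Redis vector index
This post uses the simplest option, a flat index. It has the lowest memory footprint of the index types, and because every query scans all stored vectors it is also known as "brute-force search".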
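
Conceptually, a flat index behaves like the loop below: compute the distance from the query to every stored vector and keep the k smallest. This is only an in-memory illustration of the idea under that assumption, not how Redis implements it; the keys and three-dimensional vectors are made-up toy data.

```python
import heapq
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def flat_knn(index, query, k=3):
    """Scan every stored vector and return the k nearest as (distance, key)."""
    scored = ((cosine_distance(vec, query), key) for key, vec in index.items())
    return heapq.nsmallest(k, scored)


# Toy index (hypothetical keys and values for illustration only).
index = {
    "zerocc-dog.jpg": [0.9, 0.1, 0.0],
    "zerocc-puppy.jpg": [0.8, 0.2, 0.1],
    "zerocc-car.jpg": [0.0, 0.2, 0.9],
    "zerocc-truck.jpg": [0.1, 0.1, 0.8],
}

for dist, key in flat_knn(index, [1.0, 0.0, 0.0], k=2):
    print(f"{key}: {dist:.3f}")
```

The cost of each query grows linearly with the number of stored vectors, which is acceptable for a small image collection; larger collections usually move to an approximate index type.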

```python
import redis
from redis.commands.search.field import VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

client = redis.Redis(host="localhost", port=16379, decode_responses=True)
res = client.ping()
print("redis connected:", res)

# clip-ViT-B-32 produces 512-dimensional vectors
vector_dimension = 512
vector_indexes_name = "idx:zerocc_indexes"

# Index the JSON root ("$"), which holds the embedding array, as a FLAT vector field
schema = (
    VectorField(
        "$",
        "FLAT",
        {
            "TYPE": "FLOAT32",
            "DIM": vector_dimension,
            "DISTANCE_METRIC": "COSINE",
        },
        as_name="vector",
    ),
)

# Only index JSON documents whose key starts with "zerocc-"
definition = IndexDefinition(prefix=["zerocc-"], index_type=IndexType.JSON)

res = client.ft(vector_indexes_name).create_index(
    fields=schema, definition=definition
)
print("create_index:", res)
```

Image search
The script below implements the image search.

```python
import os
import shutil
import time

import numpy as np
import redis
import torch
from PIL import Image
from redis.commands.search.query import Query
from sentence_transformers import SentenceTransformer

# Local copy of the CLIP model (raw string keeps the Windows backslashes literal)
img_model_path = r'E:\Docker\APP\Python\data\work\models\clip-ViT-B-32'
img_model = SentenceTransformer(model_name_or_path=img_model_path)

# Must match the name used when the index was created
vector_indexes_name = "idx:zerocc_indexes"

client = redis.Redis(host="localhost", port=16379, decode_responses=True)
res = client.ping()
print("redis connected:", res)


def dump_query(query, query_vector, extra_params=None):
    """Run the KNN query and return the matching documents."""
    result_docs = (
        client.ft(vector_indexes_name)
        .search(
            query,
            {"query_vector": query_vector} | (extra_params or {}),
        )
        .docs
    )
    print(result_docs)
    return result_docs


def empty_folder(folder_path):
    """Delete every regular file in folder_path (no-op if it does not exist)."""
    if not os.path.isdir(folder_path):
        return
    for file in os.listdir(folder_path):
        file_path = os.path.join(folder_path, file)
        if os.path.isfile(file_path):
            os.remove(file_path)


def move_img(img_name):
    """Copy the matched image into the output folder; img_name is the Redis key."""
    old_folder = 'E:\\WorkSpace\\CV\\images\\up2'
    new_folder = 'E:\\WorkSpace\\CV\\images\\out'
    # Strip the key prefix to recover the original file name
    old_img_path = os.path.join(old_folder, img_name.removeprefix('zerocc-'))
    new_img_path = os.path.join(new_folder, img_name)
    if not os.path.exists(new_folder):
        os.makedirs(new_folder)
    shutil.copy(old_img_path, new_img_path)


def main():
    img_path = 'we_20240506103121.jpg'
    start = time.time()
    image = Image.open("E:/WorkSpace/CV/images/up2/" + img_path)

    # Encode the query image and serialize it as raw float32 bytes
    with torch.no_grad():
        embeddings = img_model.encode(image).astype(np.float32).tobytes()
    query_vector = embeddings

    # Find the 3 nearest neighbours, ordered by distance
    query = (
        Query("(*)=>[KNN 3 @vector $query_vector AS vector_score]")
        .sort_by("vector_score")
        .return_fields("$")
        .dialect(2)
    )
    result = dump_query(query, query_vector, {})

    if len(result) > 0:
        empty_folder('E:\\WorkSpace\\CV\\images\\out')
        for doc in result:
            move_img(doc.id)

    end = time.time()
    print('%s Seconds' % (end - start))


if __name__ == '__main__':
    main()
```
Given the specified image, the script finds the three most similar images and copies them to the out directory.
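
Two details of the query are easy to get wrong: the vector parameter has to be raw float32 bytes whose length matches the index dimension, and the KNN clause has to name the indexed field. The sketch below rebuilds both with the standard library only. `to_float32_blob` and `knn_query_string` are hypothetical helper names, and little-endian byte order is an assumption that matches what NumPy's `tobytes()` yields on common x86 and ARM hosts.

```python
import struct


def to_float32_blob(vector):
    """Pack floats as little-endian float32, 4 bytes per component."""
    return struct.pack(f"<{len(vector)}f", *vector)


def knn_query_string(k, field="vector", param="query_vector",
                     score_alias="vector_score"):
    """Build the KNN clause used in this post for the k nearest neighbours."""
    return f"(*)=>[KNN {k} @{field} ${param} AS {score_alias}]"


# A 512-dimensional vector, the size the index was created with,
# serializes to 512 * 4 bytes.
blob = to_float32_blob([0.0] * 512)
print(len(blob))            # 2048
print(knn_query_string(3))  # (*)=>[KNN 3 @vector $query_vector AS vector_score]
```

Because CLIP maps text and images into the same vector space, the same query should also work for text search: encoding a sentence with the same model, instead of an image, produces a vector that can be serialized and passed as `query_vector` in the same way.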

Result

Reference: 使用 Redis 构建轻量的向量数据库应用:图片搜索引擎(一) - 苏洋博客 (soulteary.com)