KeyError when calling load_faiss_index on a dumped datastore
Created by: Maxwell-Lyu
How to reproduce
- [OK] Build a datastore and its faiss_index using scripts under knnbox-scripts/vanilla-knn-mt
- [OK] Load this datastore in my code and dump it
- [Error] Load the dumped datastore, then load its faiss_index (called by any retriever)
Cause
The "dump" and "load" methods in knn-box are not symmetric when it comes to faiss_index:
- The build_faiss_index method saves the faiss_index shape to config.json
- The dump method does not
- The load method tries to load the faiss_index shape from config.json
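The asymmetry can be illustrated without knn-box itself. The sketch below is not the library's code: the two config dictionaries and the `lookup_shape` helper are hypothetical, and the shape values are placeholders. Only the `config["data_infos"][filename]["faiss_index_shape"]` lookup is taken from the traceback.

```python
# Hypothetical config contents. Only the nesting
# config["data_infos"][filename]["faiss_index_shape"] comes from the traceback;
# the shape values are made-up placeholders.
config_after_build = {"data_infos": {"keys": {"faiss_index_shape": [1000, 1024]}}}
config_after_dump = {"data_infos": {"keys": {}}}  # dump did not write the shape


def lookup_shape(config, filename):
    """Mimic the failing lookup performed during load_faiss_index."""
    return config["data_infos"][filename]["faiss_index_shape"]


# Works for a config written by the build step.
shape_from_build = lookup_shape(config_after_build, "keys")

# Fails for a config written by the dump step.
try:
    lookup_shape(config_after_dump, "keys")
    dump_lookup_failed = False
except KeyError as err:
    dump_lookup_failed = True
    missing_key = err.args[0]
```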
Fix
faiss's Index class already stores the vector dimension and vector count in the faiss_index file, so knn-box does not need to save them in config.json.
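A minimal sketch of this idea, assuming the index has already been read from disk (for example with `faiss.read_index`): a faiss index exposes the vector count as `ntotal` and the vector dimension as `d`, so the shape can be derived from the index object. The helper name `index_shape` is hypothetical, and a stand-in object replaces the real index so the snippet runs without faiss installed.

```python
from types import SimpleNamespace


def index_shape(index):
    """Return (vector count, vector dimension) read from the index itself.

    `index` is expected to expose `ntotal` and `d`, as faiss indexes do.
    """
    return (index.ntotal, index.d)


# Stand-in for a loaded index, used only to keep this example self-contained.
# With faiss available, this would be: index = faiss.read_index(path)
fake_index = SimpleNamespace(ntotal=1000, d=1024)

shape = index_shape(fake_index)
```

With this approach the load path no longer depends on whether config.json was written by build_faiss_index or by dump.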
Error Trace
Traceback (most recent call last):
  File "/data0/lvyz/knn-box/knnbox-scripts/plac-knn-mt/../../knnbox-scripts/plac-knn-mt/save_drop_index.py", line 39, in <module>
    mt_known = retriever.retrieve(query=query, return_list=["mt_known"])["mt_known"]
  File "/data0/lvyz/knn-box/knnbox/retriever/retriever.py", line 21, in retrieve
    self.datastore.load_faiss_index("keys", move_to_gpu=True)
  File "/data0/lvyz/knn-box/knnbox/datastore/datastore.py", line 155, in load_faiss_index
    shape = config["data_infos"][filename]["faiss_index_shape"]
KeyError: 'faiss_index_shape'