KeyError in load_faiss_index when loading from a dumped datastore

Created by: Maxwell-Lyu

How to reproduce

  1. [OK] Build a datastore and its faiss_index using scripts under knnbox-scripts/vanilla-knn-mt
  2. [OK] Load this datastore in my code and dump it
  3. [Error] Load the dumped datastore and load its faiss_index (called by any retriever)

Cause

The "dump" and "load" methods in knn-box are not symmetric when it comes to faiss_index:

  • The build_faiss_index method saves faiss_index shape to config.json
  • The dump method does not
  • The load method tries to load faiss_index shape from config.json
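
The mismatch can be sketched with plain dictionaries standing in for config.json. Only the key names "data_infos" and "faiss_index_shape" and the filename "keys" come from the traceback below; the rest of the config layout, the example shape, and the helper name read_index_shape are assumptions made for illustration, not the actual knn-box code.

```python
# Hypothetical config.json contents after build_faiss_index: the shape is recorded.
config_after_build = {"data_infos": {"keys": {"faiss_index_shape": [1000, 64]}}}

# Hypothetical config.json contents after dump: the shape entry is missing.
config_after_dump = {"data_infos": {"keys": {}}}


def read_index_shape(config, filename):
    # Same lookup as the failing line in the traceback (datastore.py, line 155).
    return config["data_infos"][filename]["faiss_index_shape"]


# Works for a datastore whose config was written by build_faiss_index.
print(read_index_shape(config_after_build, "keys"))

# Fails for a datastore whose config was written by dump.
try:
    read_index_shape(config_after_dump, "keys")
except KeyError as err:
    print("KeyError:", err)
```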

Fix

faiss's Index class already stores the vector dimension and the vector count in the faiss_index file, so knn-box does not need to save them in config.json.
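
A minimal sketch of how the shape could be read from the index itself rather than from config.json. faiss index objects expose the attributes ntotal (number of stored vectors) and d (vector dimension); the helper name faiss_index_shape is hypothetical, and a stand-in object is used here so the snippet runs without faiss installed.

```python
from types import SimpleNamespace


def faiss_index_shape(index):
    # Read (vector count, vector dimension) directly from the index object.
    # A real faiss index provides both attributes after faiss.read_index(...).
    return (index.ntotal, index.d)


# Stand-in for a loaded faiss index (assumption: 1000 vectors of dimension 64).
fake_index = SimpleNamespace(ntotal=1000, d=64)
print(faiss_index_shape(fake_index))
```

With faiss installed, the same helper would be called on the object returned by faiss.read_index for the faiss_index file.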

Error Trace

Traceback (most recent call last):
  File "/data0/lvyz/knn-box/knnbox-scripts/plac-knn-mt/../../knnbox-scripts/plac-knn-mt/save_drop_index.py", line 39, in <module>
    mt_known    = retriever.retrieve(query=query, return_list=["mt_known"])["mt_known"]
  File "/data0/lvyz/knn-box/knnbox/retriever/retriever.py", line 21, in retrieve
    self.datastore.load_faiss_index("keys", move_to_gpu=True)
  File "/data0/lvyz/knn-box/knnbox/datastore/datastore.py", line 155, in load_faiss_index
    shape = config["data_infos"][filename]["faiss_index_shape"]
KeyError: 'faiss_index_shape'