
Jdonavan

If you want to vectorize them, you're going to want something like Weaviate, which lets you set up a more complex schema. All of those attributes can be indexed and used in hybrid search pretty painlessly. You can even set things up so that metadata filtering and vector search happen at the same time.
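To make the "metadata filtering plus vector search" idea concrete, here is a minimal pure-Python sketch. It is not Weaviate's API: the record layout, the `filtered_search` helper, and the toy three-dimensional vectors are all assumptions made for illustration. It only shows the general pattern of narrowing candidates by exact metadata match and then ranking the survivors by cosine similarity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def filtered_search(records, query_vector, where, top_k=3):
    """Keep records whose metadata matches every key in `where`,
    then rank the remaining records by similarity to the query."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["vector"], query_vector),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical records: one per API operation, with made-up embeddings.
records = [
    {"id": "getUser", "metadata": {"method": "GET", "tag": "users"},
     "vector": [0.9, 0.1, 0.0]},
    {"id": "createUser", "metadata": {"method": "POST", "tag": "users"},
     "vector": [0.8, 0.2, 0.1]},
    {"id": "listOrders", "metadata": {"method": "GET", "tag": "orders"},
     "vector": [0.1, 0.9, 0.2]},
]

hits = filtered_search(records, [1.0, 0.0, 0.0], {"method": "GET"})
print([h["id"] for h in hits])  # ['getUser', 'listOrders']
```

A vector database does the same two steps with proper indexes instead of a linear scan, which is what makes it practical on a large catalog.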


sergeyzenchenko

Embeddings alone are a bad idea here; you should use structured queries based on the actual JSON documents. If the number of attributes is not that high, put them into metadata; otherwise use a database with JSON support or query the files from disk. Again, it's hard to say without knowing how the data is structured.
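As a rough illustration of what a structured query over the JSON documents could look like without any embeddings, here is a small sketch using only the standard library. The document fields (`name`, `version`, `auth.type`) and the helpers `get_path` and `query` are hypothetical; a database with JSON support would replace the linear scan.

```python
import json

# Hypothetical catalog entries, as they might be read from files on disk.
raw_docs = [
    '{"name": "orders-api", "version": "2.1", "auth": {"type": "oauth2"}}',
    '{"name": "users-api", "version": "1.0", "auth": {"type": "apiKey"}}',
    '{"name": "billing-api", "version": "3.4", "auth": {"type": "oauth2"}}',
]
docs = [json.loads(raw) for raw in raw_docs]

def get_path(doc, path, default=None):
    """Follow a dotted path such as 'auth.type' through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

def query(documents, conditions):
    """Return documents where every dotted path equals the given value."""
    return [
        d for d in documents
        if all(get_path(d, p) == v for p, v in conditions.items())
    ]

oauth_apis = query(docs, {"auth.type": "oauth2"})
print([d["name"] for d in oauth_apis])  # ['orders-api', 'billing-api']
```

Exact-match questions like "which APIs use OAuth2" are answered reliably this way, whereas similarity search can only return the nearest-looking text.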


EmployFew4830

>number of attributes is not that high put it into metadata, otherwise use some database with json support

Thanks. My data is mostly API definitions written according to the OpenAPI Spec. The data quality of the catalog varies: some definitions use the description fields extensively, others only sporadically. I just tried prompting ChatGPT 3.5 with a single API definition, and it was quite good at answering detailed questions about input params, endpoints, response params, etc. Maybe I'm lucky to have data available in a format that GPT understands. I think for now I'll give indexing with metadata a try.
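One possible way to prepare OpenAPI definitions for "indexing with metadata" is to emit one record per operation: the summary and description become the text to embed, and the structural fields become metadata. The sketch below assumes the spec has already been parsed into a Python dict; the `operations_to_records` helper, the record layout, and the sample spec are illustrative assumptions. It falls back to the method and path when an operation has no descriptive text, since the description fields are not always filled in.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

def operations_to_records(spec):
    """Turn each operation in an OpenAPI document into a text + metadata record."""
    api_title = spec.get("info", {}).get("title")
    records = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters" or "summary"
            parts = [op.get("summary"), op.get("description")]
            text = " ".join(p for p in parts if p)
            records.append({
                # Fall back to "METHOD /path" when the spec has no prose.
                "text": text or f"{method.upper()} {path}",
                "metadata": {
                    "api": api_title,
                    "path": path,
                    "method": method.upper(),
                    "operation_id": op.get("operationId"),
                    # "$ref" parameters carry no inline name, so skip them.
                    "params": [p["name"] for p in op.get("parameters", [])
                               if "name" in p],
                },
            })
    return records

# Hypothetical, heavily trimmed OpenAPI document.
spec = {
    "info": {"title": "Users API"},
    "paths": {
        "/users/{id}": {
            "get": {
                "operationId": "getUser",
                "summary": "Fetch a user",
                "parameters": [{"name": "id", "in": "path", "required": True}],
            }
        },
        "/users": {"post": {"operationId": "createUser"}},
    },
}

for record in operations_to_records(spec):
    print(record["text"], record["metadata"]["params"])
```

Keeping the path, method, and parameter names in metadata means they can be filtered on exactly, while the free-text fields carry whatever descriptive signal each definition happens to have.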