Semantic Search Using CockroachDB
With our 24.2 release, CockroachDB adds support for the VECTOR data type, along with a set of pgvector-compatible functions for doing interesting things like computing similarity between vectors. This article provides a brief overview of some related concepts and introduces a semantic search application to illustrate this new capability in action. These new features demonstrate CockroachDB’s expanding support for AI-driven applications, such as those built on Large Language Models (LLMs).
Michael Goddard
October 23, 2024
Applications
Use cases for trigram indexes (when not to use Full Text Search)
We’ve been planning a visit to Orange County, California, where I grew up, over the upcoming holidays. My favorite Mexican restaurant is there, in Placentia: Q-Tortas! I have frequent cravings for their carnitas burrito, so a visit there is obligatory (I promise this blog is about search).
Michael Goddard
December 12, 2022
Product
Full text search with CockroachDB and Elasticsearch
Full text indexing and search is such a common feature of applications these days. Users expect to be able to find a restaurant, a product, a movie review, or any number of other things quickly by entering a few search terms into a user interface. Many of these apps are built using a relational database as the data store, so the apps aren’t generally dedicated to search, but incorporate that as an added feature.
Michael Goddard
July 27, 2022
Product
CockroachDB Admission Control? Yes, please!
Last week, while running a workload consisting of 200 different queries, we noticed right away that a CPU imbalance was causing a performance issue. Looking at the first graph, below, you can see that one of the three CockroachDB nodes was operating at near 100% CPU. Not ideal.
Michael Goddard
May 23, 2022
Engineering
An experiment in fuzzy matching, using SQL, with CockroachDB
A recent tweet inspired me to address the need for fuzzy matching by combining some existing capabilities of CockroachDB. Note the key features mentioned in the tweet:
- similar but not equal sporting events names: a common pattern. Users tend to mistype data in input fields, and data isn’t always correct. Nevertheless, we’d like to return the closest match.
- I’d rather use this in-built feature than pay for a whole ES cluster with added maintenance overhead to boot: This is the second time I’ve heard this sentiment in the past couple of months. ES is a full-featured search engine and delivers a great experience but, for this purpose, would be overkill and would require additional time and expense to deploy and operate.
Michael Goddard
April 18, 2022
Product
Highly available spatial data: Finding pubs in London
Imagine you’re driving a rental car in Rome and the satnav (or GPS) on your phone stops working. This happened to me two years ago when I was commuting by car each day from an Airbnb in Trastevere to an office on Via Amsterdam.
Michael Goddard
March 9, 2022
Applications
Full text indexing and search in CockroachDB
In this post, I’ll skim the surface of a very common pattern in application development: full text indexing and search. I’ll start with a bit of motivation: what prompted me to explore this using CockroachDB. Next, I’ll introduce the initial pass at a solution, followed by a deeper explanation of how that was done, and I’ll then improve on that result by adding a “score”. Finally, I’ll discuss the limitations of this simplistic approach, within the context of information retrieval, ending with my answer to “So, why’d you do it?” Let’s get started.
Michael Goddard
August 27, 2020