Full-text search with Rails and SQLite

Linked in logoX logoFacebook logo
Sergio Alvarez
October 14, 2024

In this two-part blog post series, we will use Rails 8 and SQLite to search over a collection of movies using full-text and vector search. You can find the code in this repository. In this first part, we'll be focusing on finding movies via a full-text search.

Why SQLite?

Much work has been done to make the SQLite adapter and Ruby driver suitable for real production use in Rails 8. Rails 8 transforms SQLite from a lightweight development tool to a reliable choice for production use. Moreover, SQLite now has the capability to effectively power Action Cable, Rails.cache, and Active Job.

If you are interested in learning more about running SQLite in production. I suggest taking a look at Stephen Margheim's Rails World talk SQLite on Rails.

The Application

The app is going to show a catalog of movies I got from the Movie Database (TMDB). Here's the Movie model that we will use.

# == Schema Information
#
# Table name: movies
#
#  id         :integer          not null, primary key
#  overview   :text
#  poster_url :string
#  title      :string           not null
#  created_at :datetime         not null
#  updated_at :datetime         not null
#  tmdb_id    :bigint           not null
#
# Indexes
#
#  index_movies_on_tmdb_id  (tmdb_id) UNIQUE
#
class Movie < ApplicationRecord
  include FullTextSearch
  include VectorSearch

  def full_poster_path
    "https://media.themoviedb.org/t/p/w220_and_h330_face/#{poster_url}"
  end
end

Full-text search

We'll be using FTS5 to power the full-text search.

FTS5 is an SQLite extension that provides full-text search capabilities. It uses virtual tables optimized for text searching, as well as specialized indexing for quick retrieval of text data.

Overall, the steps to follow are:
1. Create the fts5 virtual table
2. Populate the virtual table
3. Write the search query
4. Search for records

1. Create the fts5 virtual table

We'll start by generating a migration and using the recently added create_virtual_table method to create the virtual table. We'll use "title" and "overview" as the search columns.

class CreateVirtualMovieFts < ActiveRecord::Migration[8.0]  
  def change    
    create_virtual_table :movie_fts, :fts5, ["title", "overview"]
  end
end
# app/models/movie_fts.rb
class MovieFts < ApplicationRecord
  self.primary_key = "rowid"
end

2. Populate the virtual table

Next, we want to create a new MovieFts every time a new Movie is added to the database. It's important to note that we can't add a primary key declaration to fts5 virtual tables, so we'll use the implicit rowid key to match records to the "movies" table.

module Movie::FullTextSearch
  extend ActiveSupport::Concern

  included do
    has_one :movie_fts, foreign_key: "rowid"
  end

  def find_or_create_movie_fts
    return if movie_fts

    sql = ActiveRecord::Base.sanitize_sql_array(
      [
        "INSERT INTO movie_fts (rowid, title, overview) VALUES (?, ?, ?)",
        id, title, overview
      ]
    )
    ActiveRecord::Base.connection.execute(sql)
  end
end

3. Write the search query

Let's create a query that uses FTS5 to search for movies. We'll use the bm25 function to get the most relevant results first.

BM25 is a ranking function measuring relevance based on term frequency and document length.
module Movie::FullTextSearch
  extend ActiveSupport::Concern

  included do
    has_one :movie_fts, foreign_key: "rowid"
  end

  class_methods do
    def full_text_search(input:, limit:)
      where("movie_fts MATCH ?", input)
        .joins(:movie_fts)
        .limit(limit)
        .order("bm25(movie_fts)")
        .distinct
    end
  end
end

4. Search for records

We can use the full_text_search method when users want to search for a movie.

class MoviesController < ApplicationController
  def index
    @movies = if params[:search].present?
      Movie.full_text_search(input: params[:search], limit: 40)
    else
      Movie.all.limit(40)
    end
  end
end

That's it, if you have any questions you can find me at sergio@teloslabs.co. On the next blog post, we’ll go beyond keywords and explore vector search, making it possible to find similar movies based on meaning rather than just words.

READY FOR
YOUR UPCOMING VENTURE?

We are.
Let's start a conversation.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Our latest
news & insights