Hello there,
This week, we're tackling a big topic. This article follows on from "Why isn't MySQL a search engine?".
In that article, I talked about Elasticsearch, a popular indexed database engine that gets the job done.

This week, we're going to talk about Typesense. If you visit this blog from time to time, you'll know that I like to tell you about technologies I've spent time working on, and Typesense is one of them, having been a big part of the project I'm currently working on.

Are you ready? Then let's get started!

What is Typesense?

Typesense is an Open Source indexed database engine (and therefore search engine) written in C++.
On its GitHub page, it describes itself as an Open Source alternative to Algolia (a proprietary SaaS offering a search API).

Typesense exposes a fairly well-documented REST API. It lets you do everything, from creating access keys with specific rights to creating collections, indexing data, and so on.
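
To give an idea of what "access keys with specific rights" looks like, here is a minimal sketch that only describes the HTTP request, without sending it. The `/keys` endpoint, the `X-TYPESENSE-API-KEY` header and the `documents:search` action come from the Typesense API; the `ApiRequest` type, the helper name, the "circuits" collection and the key value are placeholders I made up for the example.

```typescript
// Hypothetical helper type: describes an HTTP request without sending it.
interface ApiRequest {
  method: "GET" | "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

// Build the request that would create a search-only key for one collection.
// The admin key and collection name are placeholders, not real values.
function buildCreateKeyRequest(adminKey: string, collection: string): ApiRequest {
  return {
    method: "POST",
    path: "/keys",
    headers: {
      "Content-Type": "application/json",
      "X-TYPESENSE-API-KEY": adminKey,
    },
    body: JSON.stringify({
      description: `Search-only key for ${collection}`,
      actions: ["documents:search"],
      collections: [collection],
    }),
  };
}

const createKeyRequest = buildCreateKeyRequest("admin-key-placeholder", "circuits");
```

A key like this can search the "circuits" collection and nothing else, which is what you want for anything exposed to a browser.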

What is a collection?

Unlike a relational database engine, there are no relationships in an indexed database engine. For example, if you have a "Circuit" collection and a "bike type" field in it, then "Road" or "Mountain bike" will be written as many times as necessary.
Think of it as a very large Excel table.
Each row of the Excel table is called a "document", and each column is called a "field".
It's really just text, with no other logic behind it, which is why it's fast.
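
As a rough sketch, the "Excel table" idea could be written like this. The schema shape (`name`, `fields`, `facet`, `default_sorting_field`) follows the Typesense collection format; the collection name, the field names and the three documents are purely illustrative, and the local types only list a handful of field types.

```typescript
// Local types so the snippet stands alone; only a few field types are listed.
interface FieldSchema {
  name: string;
  type: "string" | "int32" | "float" | "bool" | "string[]";
  facet?: boolean;
}

interface CollectionSchema {
  name: string;
  fields: FieldSchema[];
  default_sorting_field?: string;
}

// The "columns" of the table (illustrative names).
const circuitSchema: CollectionSchema = {
  name: "circuits",
  fields: [
    { name: "title", type: "string" },
    { name: "city", type: "string" },
    { name: "bike_type", type: "string", facet: true },
    { name: "distance_km", type: "int32" },
  ],
  default_sorting_field: "distance_km",
};

// The "rows": "Road" is simply repeated in each document, there is no join.
const circuitDocuments = [
  { id: "1", title: "Une balade en Île-de-France", city: "Paris", bike_type: "Road", distance_km: 42 },
  { id: "2", title: "Paris by night", city: "Paris", bike_type: "Road", distance_km: 18 },
  { id: "3", title: "Forest loop", city: "Fontainebleau", bike_type: "Mountain bike", distance_km: 30 },
];
```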

Some Typesense features

Although not specific to Typesense, here are a few features you'll find in most search engines:

Facets
You're probably familiar with these from e-commerce sites. Facets let you filter results according to certain criteria: on a bicycle sales site, for example, you can filter by "type of bike" or "wheel size". You can look to the right of this page (below on mobile) to see what we're talking about.
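
To make the idea concrete, here is a small sketch that computes facet counts by hand on three made-up bikes, followed by what I'd expect the equivalent Typesense search parameters to look like. `facet_by` and `filter_by` are real Typesense search parameters; the `Bike` type, the sample data and the `facetCounts` helper are hypothetical and only illustrate the concept, not how the engine does it internally.

```typescript
interface Bike {
  name: string;
  bike_type: string;
  wheel_size: number;
}

// Made-up sample data.
const bikes: Bike[] = [
  { name: "Aero One", bike_type: "Road", wheel_size: 28 },
  { name: "Trail Max", bike_type: "Mountain bike", wheel_size: 29 },
  { name: "City Sprint", bike_type: "Road", wheel_size: 28 },
];

// Count how many documents carry each value of a field: this is what a
// facet panel displays next to each filter option.
function facetCounts(docs: Bike[], field: "bike_type" | "wheel_size"): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const doc of docs) {
    const value = String(doc[field]);
    counts[value] = (counts[value] ?? 0) + 1;
  }
  return counts;
}

const bikeTypeCounts = facetCounts(bikes, "bike_type");

// Selecting the "Road" facet narrows the result set.
const roadBikes = bikes.filter((bike) => bike.bike_type === "Road");

// Roughly equivalent Typesense search parameters (field names are illustrative).
const facetSearchParams = {
  q: "*",
  query_by: "name",
  facet_by: "bike_type,wheel_size",
  filter_by: "bike_type:=Road",
};
```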

Search results by "weight"
It's very simple: some fields matter more than others. Say you search for a city name like "Paris", and your collection contains one document whose "title" field contains "Paris" and another, titled "Une balade en Île-de-France", whose "city" field reads "Paris". If the "title" field has a greater weight than the "city" field, the search engine looks in "title" first, then in "city", and sorts the results accordingly: documents with "Paris" in the "title" field appear before those with "Paris" only in the "city" field.
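
Here is a toy model of that behavior, assuming a plain substring match. It is not Typesense's real ranking algorithm, just a sketch of "the field listed first wins"; the `Ride` type, the helper names and the sample documents are invented for the example.

```typescript
interface Ride {
  title: string;
  city: string;
}

type RideField = keyof Ride;

// Return the position of the first field (in priority order) containing the
// query, or -1 when no field matches.
function firstMatchingField(doc: Ride, query: string, fields: RideField[]): number {
  const needle = query.toLowerCase();
  return fields.findIndex((field) => doc[field].toLowerCase().includes(needle));
}

// Keep only matching documents and sort them by the priority of the field
// that matched. A toy model of field weighting, nothing more.
function rankByFieldPriority(docs: Ride[], query: string, fields: RideField[]): Ride[] {
  return docs
    .map((doc) => ({ doc, rank: firstMatchingField(doc, query, fields) }))
    .filter((entry) => entry.rank >= 0)
    .sort((a, b) => a.rank - b.rank)
    .map((entry) => entry.doc);
}

const rides: Ride[] = [
  { title: "Une balade en Île-de-France", city: "Paris" },
  { title: "Paris by night", city: "Paris" },
  { title: "Forest loop", city: "Fontainebleau" },
];

// "title" is listed before "city", so title matches come first.
const rankedRides = rankByFieldPriority(rides, "paris", ["title", "city"]);
```

With this data, "Paris by night" comes out ahead of "Une balade en Île-de-France", and "Forest loop" is dropped because neither field matches.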

What does Typesense bring to the table compared to Elasticsearch?

First of all, they have a number of things in common: for example, they can both run in clusters (on several servers, which means high availability and better performance when there's a lot of traffic).

As for the rest:

Typesense is much lighter than Elasticsearch, which can be explained by the fact that it is written in C++, whereas Elasticsearch is written in Java.
It's tolerant of typos, so you can type "turtue nonja" to find "Ninja Turtle" (don't ask me why Ninja Turtle, I just wanted to...).
Setting the "weight" of search fields is very simple, something that was one of Algolia's great strengths before Typesense came along. All you have to do is list the fields Typesense should search, in order, when sending the request to the API: the first field in the list has the most weight and the last has the least. That's perfect for fields that should only produce results when nothing else matches.
It seems easier to pick up than Elasticsearch: its API is simple and logical, so you won't need to spend years on it to get something that works and, above all, returns relevant answers. There's also an official Docker image available.
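
Typo tolerance is usually described in terms of edit distance, that is, the number of single-character changes needed to turn one word into another. The sketch below is a classic Levenshtein implementation, written only to show why "turtue nonja" is close to "Ninja Turtle"; it is my own illustration, not Typesense's code. On the Typesense side, the number of tolerated typos is controlled by the `num_typos` search parameter.

```typescript
// Classic Levenshtein edit distance (insertions, deletions, substitutions),
// computed row by row to keep memory usage small.
function editDistance(a: string, b: string): number {
  // previous[j] = distance between the first i-1 characters of a
  // and the first j characters of b.
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      current.push(Math.min(previous[j] + 1, current[j - 1] + 1, substitution));
    }
    previous = current;
  }
  return previous[b.length];
}

// Each mistyped word is a single substitution away from the indexed word.
const turtleDistance = editDistance("turtue", "turtle");
const ninjaDistance = editDistance("nonja", "ninja");
```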

Still a young project

At the time of writing, the "stable" version is 0.24.1.
You can feel this when working with it...

Beware of frequent data updates

Here's a strange behavior I ran into while working with Typesense. I would sometimes make a lot of changes to the indexed data and leave Typesense running on my dev machine. After a while, the API would become much slower, and sometimes data would even disappear altogether. For example, when I wanted to update a document, the API would tell me that the document didn't exist...?
The problem is simply that when documents are updated frequently, Typesense tends to misbehave.
I imagine - and this is just my interpretation - that the team is working more on the relevance of search results and speed than on this part of the process at the moment.

Integrating Typesense into a Symfony project

As with the Elasticsearch bundle, a bundle is currently being developed for Typesense.

Like Typesense itself, the bundle is quite young, and in the course of my work I hit a snag: I had a huge number of entries to index and, at that time, the bundle sent everything to Typesense at once, without pagination, rather than in several requests...

So I was ready to code the import with pagination and make a pull request on the project, but they beat me to it.
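
For the curious, the idea of a paginated import can be sketched in a few lines. This is not the bundle's actual implementation (the bundle is PHP), just an illustration of the principle: split the entries into batches, then serialize each batch as JSON Lines, the format accepted by Typesense's bulk import endpoint. The batch size of 1000, the helper names and the generated entries are arbitrary choices for the example.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Serialize a batch as JSON Lines: one JSON document per line.
function toJsonLines(docs: object[]): string {
  return docs.map((doc) => JSON.stringify(doc)).join("\n");
}

// 2500 made-up entries, sent as three requests instead of one.
const entries = Array.from({ length: 2500 }, (_, i) => ({ id: String(i), title: `Circuit ${i}` }));
const importBodies = chunk(entries, 1000).map(toJsonLines);
```

Each string in `importBodies` would then be the body of one import request, so a very large collection never has to travel in a single call.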

How does the bundle work?

If you've already used the Elasticsearch bundle, you'll be familiar with it...

Set the URL, port and key of your Typesense server as an environment variable.
Configure your "acseo_typesense" file. To find out how, take a look at the GitHub page.
Use the "typesense:create" and "typesense:import" commands to populate your indexed database (depending on what you've configured in your configuration file).
Use the bundle's dynamically generated services to access and query your collections.

Synchronization with Doctrine

To work, the bundle listens to Doctrine events to detect when your database changes, and updates the indexed database accordingly.
As I said earlier, you need to be careful with this behavior, as Typesense doesn't like it when the documents in a collection are modified a lot.

Watch out for hydration

It's also worth noting that, by default, just like the Elasticsearch bundle, the bundle "hydrates" query results into Doctrine objects. To do this, it makes many requests to your database (MySQL, Postgres, etc.), so check whether you really need the full set of objects.

The front-end

Like Algolia, Typesense enables the front-end (the JavaScript) to connect directly to it, so you don't have to go through your application's back-end every time. It's lighter and more responsive.

Typesense-js

In addition to providing the search engine, the Typesense team also provides a JavaScript client, which makes it easy to query Typesense servers (yes, the client supports multiple nodes).

It's very simple to use. To instantiate your client, take a look at this test file, which shows how it's done.
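
As a minimal sketch, the configuration object passed to the client looks roughly like this. The `nodes`, `apiKey` and `connectionTimeoutSeconds` options are the ones typesense-js expects; the local types are only there so the snippet stands alone, and the hosts and key are placeholders.

```typescript
// Local types mirroring the part of the typesense-js configuration used here.
interface NodeConfig {
  host: string;
  port: number;
  protocol: "http" | "https";
}

interface ClientConfig {
  nodes: NodeConfig[];
  apiKey: string;
  connectionTimeoutSeconds?: number;
}

// Placeholder hosts and key; in a browser, use a search-only key.
const clientConfig: ClientConfig = {
  nodes: [
    { host: "search-1.example.com", port: 443, protocol: "https" },
    { host: "search-2.example.com", port: 443, protocol: "https" },
  ],
  apiKey: "search-only-key-placeholder",
  connectionTimeoutSeconds: 2,
};

// With the library installed, this object would be passed to the client:
//   import Typesense from "typesense";
//   const client = new Typesense.Client(clientConfig);
```

Listing several nodes is what lets the client fall back to another server of the cluster when one of them does not answer.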

Once you've instantiated your client, you can simply do a search as follows:
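
A minimal sketch of such a search, assuming a "circuits" collection with "title" and "city" fields (both names are illustrative). The `collections(...).documents().search(...)` chain is the typesense-js call; the `SearchClient` interface below is a stripped-down stand-in for the real client type so that the snippet does not depend on the library.

```typescript
interface SearchParams {
  q: string;
  query_by: string;
  filter_by?: string;
  per_page?: number;
}

// Only the parts of the response used here.
interface SearchResponse {
  found: number;
  hits?: { document: Record<string, unknown> }[];
}

// Minimal structural type covering only the calls made below.
interface SearchClient {
  collections(name: string): {
    documents(): { search(params: SearchParams): Promise<SearchResponse> };
  };
}

// Search "title" first, then "city" (illustrative field names).
function buildCircuitSearch(query: string): SearchParams {
  return { q: query, query_by: "title,city", per_page: 10 };
}

// Run the search against the hypothetical "circuits" collection.
async function searchCircuits(client: SearchClient, query: string): Promise<SearchResponse> {
  return client.collections("circuits").documents().search(buildCircuitSearch(query));
}

const parisSearch = buildCircuitSearch("paris");
```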

The result will be in JSON form, and if you're using Symfony, remember that here you're connecting directly to Typesense, so there's no hydration.

InstantSearch

InstantSearch is a JavaScript library provided by Algolia for building advanced search interfaces. It's interesting because it's well designed and, above all, extensible, so you can make it behave as you like.

I mentioned earlier that Typesense is an alternative to Algolia, so there's also an adapter for InstantSearch.

It's very easy to use too: just instantiate it in the same way as Typesense-js and pass it to InstantSearch. Take a look at the project's GitHub page to see how.
Once you've instantiated your adapter, take a look at the InstantSearch documentation.
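
As a rough sketch, the adapter's options look something like this. The `server` and `additionalSearchParameters` keys follow the adapter's README as I know it (older versions of the adapter used a camelCase `queryBy`, so check against the version you install); the local type, the hosts, the key and the "circuits" index name are placeholders.

```typescript
// Local type covering only the adapter options used in this example.
interface AdapterOptions {
  server: {
    apiKey: string;
    nodes: { host: string; port: number; protocol: "http" | "https" }[];
  };
  additionalSearchParameters: { query_by: string };
}

// Placeholder host and key; field names are illustrative.
const adapterOptions: AdapterOptions = {
  server: {
    apiKey: "search-only-key-placeholder",
    nodes: [{ host: "search-1.example.com", port: 443, protocol: "https" }],
  },
  additionalSearchParameters: { query_by: "title,city" },
};

// With the packages installed, the wiring would look like:
//   import TypesenseInstantSearchAdapter from "typesense-instantsearch-adapter";
//   import instantsearch from "instantsearch.js";
//   const adapter = new TypesenseInstantSearchAdapter(adapterOptions);
//   const search = instantsearch({
//     indexName: "circuits",
//     searchClient: adapter.searchClient,
//   });
```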

What does it look like in production?

A picture is worth 1,000 words, so I suggest you try it out for yourself:

Here you have an advanced search page, with facets on the right and complex queries.

Here, just an "input" making simple queries.

The servers that process your queries are located in Paris, France, giving you a good example of what's possible with Typesense.

Conclusion

Although it's a young project, Typesense has, I think, a future in the search engine market.
I've really enjoyed working with it and would use it again when I get the chance. It allows me to create the user experience I want to create.
Until next time.