Hello there,
This week, we're tackling a big topic. This article follows on from "Why isn't MySQL a search engine?".
In that article, I talked about Elasticsearch, a popular indexed database engine that gets the job done.
This week, we're going to talk about Typesense. If you visit this blog from time to time, you'll know that I like to tell you about technologies I've spent time working on, and Typesense is one of them, having been a big part of the project I'm currently working on.
Are you ready? Then let's get started!
Typesense is an Open Source indexed database engine (and therefore search engine) written in C++.
On its GitHub page, it describes itself as an Open Source alternative to Algolia (a proprietary SaaS offering a search API).
Typesense exposes a fairly well-documented REST API, and that's how you interact with it. The API allows you to do everything, from creating access keys with specific rights to creating collections, indexing data, and so on.
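To give you an idea of what talking to that API looks like, here's a minimal sketch that builds the request for creating a collection, without sending it. The server URL, the API key and the "circuits" schema are assumptions made up for the example; only the "/collections" endpoint and the "X-TYPESENSE-API-KEY" header come from the Typesense API.

```python
import json
import urllib.request

# Assumed values for a local dev server; replace them with your own.
TYPESENSE_URL = "http://localhost:8108"
API_KEY = "xyz"  # hypothetical key


def build_create_collection_request(schema: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /collections request."""
    return urllib.request.Request(
        url=f"{TYPESENSE_URL}/collections",
        data=json.dumps(schema).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-TYPESENSE-API-KEY": API_KEY,
        },
    )


# Hypothetical collection: "facet": True marks a field as usable for facets.
circuit_schema = {
    "name": "circuits",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "city", "type": "string"},
        {"name": "bike_type", "type": "string", "facet": True},
    ],
}

request = build_create_collection_request(circuit_schema)
# To actually send it to a running server: urllib.request.urlopen(request)
```

Nothing leaves your machine until you call urlopen, so you can inspect the request first.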
Unlike a relational database engine, there are no relationships in an indexed database engine. For example, if you have a "Circuit" collection with a "bike type" field, then "Road" or "Mountain bike" will be written as many times as necessary, once per document.
Think of it as a very large Excel table.
Each row of the Excel table is called a "document", and each column is called a "field".
It's really just text, with no relational logic behind it, which is part of why it's fast.
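Here's a small sketch of what that flattening means in practice. The table contents and the to_document helper are made up for the example: the relational side stores a bike type id, while each document carries the label itself.

```python
# Relational shape: each circuit points to a bike type by id.
bike_types = {1: "Road", 2: "Mountain bike"}
circuits = [
    {"id": 10, "title": "Une balade en Île-de-France", "bike_type_id": 1},
    {"id": 11, "title": "Forest loop", "bike_type_id": 2},
    {"id": 12, "title": "Riverside ride", "bike_type_id": 1},
]


def to_document(circuit: dict, bike_types: dict) -> dict:
    """Flatten one row into a document: the label is copied, not referenced."""
    return {
        "id": str(circuit["id"]),  # Typesense document ids are strings
        "title": circuit["title"],
        "bike_type": bike_types[circuit["bike_type_id"]],
    }


documents = [to_document(circuit, bike_types) for circuit in circuits]
# "Road" is now written once per matching document, like a repeated cell
# in a big spreadsheet.
```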
Although not specific to Typesense, here are a few features you'll find in most search engines:
Facets
You're probably familiar with this, having seen it on e-commerce sites. It's the ability to filter results according to certain criteria. For example, on a bicycle sales site, you can filter by "type of bike" or "wheel size". You can look to the right of this page (below on mobile) to see what we're talking about.
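To make the idea concrete, here's a toy version of facets in plain Python: count how many documents fall under each value, then keep only the ones matching the value the user picked. The sample bikes and both helpers are made up for the example. In Typesense itself you don't write this; you ask for it with the "facet_by" and "filter_by" search parameters.

```python
from collections import Counter

# Hypothetical catalogue for a bicycle sales site.
bikes = [
    {"name": "Bike A", "bike_type": "Road", "wheel_size": "28"},
    {"name": "Bike B", "bike_type": "Mountain bike", "wheel_size": "29"},
    {"name": "Bike C", "bike_type": "Road", "wheel_size": "28"},
]


def facet_counts(documents: list, field: str) -> Counter:
    """Count documents per distinct value of a field (the numbers shown next to each filter)."""
    return Counter(doc[field] for doc in documents)


def filter_by(documents: list, field: str, value: str) -> list:
    """Keep only the documents whose field equals the selected value."""
    return [doc for doc in documents if doc[field] == value]


counts = facet_counts(bikes, "bike_type")
road_bikes = filter_by(bikes, "bike_type", "Road")
```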
Search results by "weight"
It's very simple: some fields are more important than others. Say you're searching for a city name like "Paris", and your collection contains one document whose "city" field reads "Paris" (with the title "Une balade en Île-de-France") and others that mention "Paris" in their "title" field. If "title" has a greater weight than "city", a match in "title" counts for more than a match in "city", and the results are sorted accordingly: documents with "Paris" in the "title" field appear before those with "Paris" only in the "city" field.
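Here's a toy illustration of that ordering. To be clear, this is not Typesense's actual ranking algorithm, just a simplified scoring I made up to show the principle; the documents and weights are hypothetical too. In Typesense, you express the weights with the "query_by_weights" search parameter.

```python
def weighted_score(document: dict, query: str, weights: dict) -> int:
    """Add up the weight of every field that contains the query (case-insensitive)."""
    needle = query.lower()
    return sum(
        weight
        for field, weight in weights.items()
        if needle in document.get(field, "").lower()
    )


def search(documents: list, query: str, weights: dict) -> list:
    """Return matching documents, highest score first."""
    scored = [(weighted_score(doc, query, weights), doc) for doc in documents]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches]


circuits = [
    {"title": "Une balade en Île-de-France", "city": "Paris"},
    {"title": "Paris by bike", "city": "Versailles"},
    {"title": "Along the Rhône", "city": "Lyon"},
]

# "title" weighs more than "city", so a title match ranks higher.
results = search(circuits, "Paris", {"title": 2, "city": 1})
```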
First of all, Typesense and Elasticsearch have a number of things in common: for example, they can both run in clusters (on several servers, which means high availability and better performance when there's a lot of traffic).
As for the rest:
At the time of writing, the "stable" version is 0.24.1.
You can feel this when working with it...
Here's a strange behavior I ran into while working with Typesense. I would sometimes make a lot of changes to the index data and leave Typesense running on my dev machine. After a while, the API would become much slower, and sometimes data would even disappear altogether. For example, when I wanted to update a document, the API would tell me that the document didn't exist...?
The problem, as far as I can tell, is simply that Typesense tends to misbehave when documents are updated frequently.
I imagine - and this is just my interpretation - that the team is working more on the relevance of search results and speed than on this part of the process at the moment.
As with Elasticsearch, a Symfony bundle is currently being developed for Typesense.
Like Typesense itself, the bundle is quite young, and in the course of my work I hit a snag: I had a huge number of entries to index and, at that time, the bundle sent everything to Typesense at once, without pagination, rather than splitting it into several requests...
So I was ready to code the import with pagination and make a pull request on the project, but they beat me to it.
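For the curious, here's roughly the idea of a paginated import, sketched in Python. This is not the bundle's code (the bundle is PHP, and I haven't copied its implementation); the batch size and helper names are my own. The only Typesense-specific part is that the bulk import endpoint ("/collections/<name>/documents/import") expects one JSON document per line.

```python
import json
from typing import Iterable, Iterator, List


def batches(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` items."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # last, possibly smaller, batch
        yield batch


def to_jsonl(documents: List[dict]) -> str:
    """Serialize a batch as JSON Lines: one document per line."""
    return "\n".join(json.dumps(doc) for doc in documents)


# Hypothetical entries to index.
entries = [{"id": str(i), "title": f"Circuit {i}"} for i in range(5)]

# Each payload would be the body of one import request, instead of one giant request.
payloads = [to_jsonl(batch) for batch in batches(entries, size=2)]
```

With five entries and a batch size of two, you end up with three requests instead of one.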
If you've already used the ElasticSearch bundle, you'll be familiar with it...
To work, the bundle listens to Doctrine events to detect when your database changes, and then updates the indexed database accordingly.
As I said earlier, you need to be careful with this behavior, as Typesense doesn't like it when there are a lot of modifications to the documents in a collection.
It may also be interesting to note that, by default, just like the bundle for Elasticsearch, when you make a query, the bundle "hydrates" the results into Doctrine objects. To do this, it makes many requests to your database (MySQL, PostgreSQL, etc.), so be sure to check whether you really need the full set of objects.
Like Algolia, Typesense enables the front-end (the JavaScript) to connect directly to it, so you don't have to go through your application's back-end every time. It's lighter and more responsive.
In addition to providing the search engine, the Typesense team also provides a JavaScript client, which makes it easy to query Typesense servers (yes, the client supports multiple nodes).
It's very simple to use, so to instantiate your client, take a look at this test file which shows how it's done.
Once you've instantiated your client, you can run a search by calling "search()" on the collection's documents, passing at least the query ("q") and the fields to search in ("query_by").
The result will be in JSON form, and if you're using Symfony, remember that here you're connecting directly to Typesense, so there's no hydration.
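If you want to see what that exchange looks like without any client library, here's a sketch of the equivalent request against the REST API and of how you'd pull the documents out of the JSON answer. The server URL, the "circuits" collection and the sample response are assumptions for the example, and the response is trimmed down to the "found" and "hits" keys.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url: str, collection: str, params: dict) -> str:
    """Build the URL of a search request on one collection."""
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


def extract_documents(response_body: str) -> list:
    """Return the raw documents from a search response: plain dicts, no hydration."""
    payload = json.loads(response_body)
    return [hit["document"] for hit in payload.get("hits", [])]


url = build_search_url(
    "http://localhost:8108",  # assumed local server
    "circuits",               # hypothetical collection
    {"q": "paris", "query_by": "title,city"},
)

# Hypothetical, heavily trimmed response body.
sample_response = json.dumps({
    "found": 1,
    "hits": [{"document": {"id": "10", "title": "Une balade en Île-de-France", "city": "Paris"}}],
})

docs = extract_documents(sample_response)
```

What you get back are plain dictionaries, which is exactly the "no hydration" point: if you need your application's objects, you have to load them yourself.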
InstantSearch is a JavaScript library published by Algolia for building advanced search interfaces. It's interesting because it's well designed and, above all, extensible, so you can make it behave as you like.
I mentioned earlier that Typesense is an alternative to Algolia, so there's also an adapter for InstantSearch.
It's very easy to use too: just instantiate it in the same way as typesense-js and pass it to InstantSearch. Take a look at the project's GitHub page to see how.
Once you've instantiated your adapter, take a look at the InstantSearch documentation.
A picture is worth 1,000 words, so I suggest you try it out for yourself:
Here you have an advanced search page, with facets on the right and complex queries.
Here, just an "input", making simple queries.
The servers that process your queries are located in Paris, France, giving you a good example of what's possible with Typesense.
Although it's a young project, Typesense has, I think, a future in the search engine market.
I've really enjoyed working with it and would use it again when I get the chance. It allows me to create the user experience I want to create.
Until next time.