The remainder of this blog post takes a deep dive into the nitty-gritty details of how Postgres calculates the similarity score. Just make sure to EXPLAIN ANALYZE your queries and check that they use the index before going to production. If that is all you want to know, feel free to stop reading at this point. Now you know how to use the ILIKE and similarity operators efficiently to search for names and other short texts in your database. If you use the ILIKE method, you should seriously consider this combination for production use. It is independent of the global similarity threshold since it only sorts the results of the ILIKE query. Keep in mind that it has the same limitation as ILIKE which is that it doesn’t return non-matching, but very similar search results like Ana for the search term Anna. It uses the GIN index for all operations and returns the most relevant results on top. This query uses ILIKE to filter out any non-matching names and then orders the remaining names by their similarity to the search term. Run your your migration and then head over to your context to add the following query:ĭef similarity ( name ) do query = from ( p in Person, where: fragment ( "? % ?", ^ name, p. You can re-use the migration for ILIKE also for similiarity. Let’s first have a look at how to use this operator. We will get into the details of what can go wrong later on. So, before you deploy your query to production, please analyze the query using EXPLAIN ANALYZE first. Similarity is one of these functions that appear to be very easy to use, but it’s very easy to get it wrong. ![]() We can use Postgres’ SIMILARITY operator for that. So, names that don’t match exactly, but are very similar. For example, if you search for Ana you might also want to see Anna and Hannah in your results. Now, the ILIKE query returns all names that match a certain search term, but it won’t return names that don’t match the search term but are very similar to it. If it returns an analysis beginning with Seq Scan, it doesn’t use your index and scans every row 1-by-1, which you don’t want. It should return an analysis that starts with Bitmap Heap Scan. This is how it looks:ĮXPLAIN ANALYZE SELECT name FROM persons WHERE name ILIKE '%bert%' We also create an index for that field, which we’ll discuss later. □ The Migrationįirst, we build a migration that creates a simple persons table with only one field: a name. Now, let’s have a look at how to implement an efficient ILIKE search in Ecto. So, whereas Ber% LIKE 'bert' doesn’t match, Ber% ILIKE 'bert' does. In our queries, we usually don’t use LIKE, but its sibling ILIKE, which ignores cases in our strings and search terms. This would match Bert, but also DagoBert. If your search term may exist anywhere in the string, you can use two % signs like this: %Ber%. For example, a search term of Ber% means that a string must begin with Ber whereas %Ber means the string must end with Ber. You can specify where in the string the search term may exist using the wildcard % sign. It checks whether a string contains a search term. Probably the most common search method is the LIKE operator. Let’s have a look at how to implement each of them. The non-trivial size of the set allows us to test the speedup of using indexes. We will test our name search against an example set of 18.000 first names. ![]() similarity is a more fuzzy approach and would also return Bart, since it’s not the same, but very similar to Ber. For example, the search term Ber would return Bert or Berry, but not Bart. (i)like is better if you want to have exact matches with your query. In short, there are two good candidates of query methods you can use: (i)like and similarity. You need to get this out the door quickly, so instead of building a cool, custom search engine, you want to use what’s already at hand. You need to allow admins to search through the users by name. ![]() Imagine that you have a database with a million users. I will cover FTS in another article in the future. It focuses only on searching for short strings like names, movie titles, article names, etc. Quick note: This article does not cover full-text search (FTS) with Postgres. Let’s dive into today’s topic: How can you search through millions of usernames using Ecto and Postgres? Let’s find out!
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |