Full-text search lets users quickly find relevant data stored in a database based on the keywords or phrases in their queries, which reduces the time and effort needed to dig through large datasets. On the Node.js platform, developers can easily incorporate it into their applications. Search activity can also feed related features such as analytics and visualization.
If you are building a website, an e-commerce store, a blog, etc., you will need full-text search to find related content, much like Google does for web pages. This is a well-known problem, so you probably don’t want to implement your own solution.
One option is to use the FlexSearch module for Node.js.
So let’s create a small Proof of Concept (POC) from scratch.
The full source code is here
Keep in mind that it’s an in-memory implementation, so it won’t be possible to index a huge amount of data. You can run your own benchmarks based on your requirements.
Setting up
Install Express generator if you haven’t already.
Also, I strongly recommend you install a plugin in your browser to see JSON in a pretty-print format. I use JSONView. Another option is to use Postman to make your HTTP requests.
mkdir myflexsearch
cd myflexsearch
express --no-view --git
You can delete boilerplate code such as the /public folder and routes/users.js. After that, you will have to modify app.js because both are referenced there. Anyway, that code doesn’t affect our Proof of Concept.
Let’s install flexsearch module
npm install flexsearch --save
Optionally, you can install the nodemon module to automatically reload your app after every change. You can install it globally, but I will install it locally.
npm install nodemon --save
After that, open package.json and modify the start script:
"scripts": {
"start": "nodemon ./bin/www"
}
Let’s code
Our main code will be at routes/index.js. This will be our endpoint to expose a service to search like this:
/search?phrase=Cloud
Import the module
const FlexSearch = require("flexsearch");
const preset = "score";
const searchIndex = new FlexSearch(preset);
With preset = “score” we are defining behavior for our search. You can see more presets here. I recommend you play with different presets and see results.
We’ll need some dummy data to test.
What I’ve done is to create a file /daos/my_data.js with some content from here: https://api.publicapis.org/entries
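A minimal sketch of what such a file could contain. The `data` wrapper and the numeric `id` field are inferred from how the rest of the code reads `wsData`; the sample entries are invented for illustration and are not real API listings.

```javascript
// Hypothetical contents of /daos/my_data.js. The entries are invented samples;
// real ones would be copied from the public API listing.
const wsData = {
  data: [
    { id: 1, API: 'Example Cats', Description: 'Sample cat facts service', Category: 'Animals' },
    { id: 2, API: 'Example Drive', Description: 'Sample file hosting service', Category: 'Cloud Storage' },
    { id: 3, API: 'Example Weather', Description: 'Sample weather forecasts', Category: 'Weather' },
  ],
};

// In the real file this object would be exported, for example:
// module.exports = wsData;
```

Each element carries an `id` to use as the index key plus the text fields that will be concatenated into the searchable content.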
Summary steps
- Build our index
  - Define a key. Typically an ID field of the elements to index (user.id, book.id, etc.)
  - Define the content we want to search in. Example: the body of our blog post plus some description and its category.
- Expose a service to search through a URL parameter
  - Build our index if it is empty
  - Get the phrase to search for from a URL parameter
  - Search in our index and get a list of IDs with results
  - With the above results, get the elements from our indexed collection.
- Make requests to test our data
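The steps above can be sketched without any library. The `ToyIndex` class below is a hypothetical, deliberately naive word index used only to show the key/content/lookup flow; it is not how FlexSearch works internally, and the sample `books` are made up.

```javascript
// Toy stand-in for a search index: maps each lowercase word to the set of keys
// whose content contains it. FlexSearch does far more (scoring, partial matches).
class ToyIndex {
  constructor() {
    this.words = new Map();
  }

  // Register a key together with the text to search in.
  add(key, content) {
    for (const word of content.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!this.words.has(word)) this.words.set(word, new Set());
      this.words.get(word).add(key);
    }
  }

  // Return the keys of entries that contain every word of the phrase.
  search(phrase) {
    const terms = phrase.toLowerCase().split(/\W+/).filter(Boolean);
    if (terms.length === 0) return [];
    const sets = terms.map((t) => this.words.get(t) || new Set());
    return [...sets[0]].filter((key) => sets.every((s) => s.has(key)));
  }
}

const books = [
  { id: 10, title: 'Cloud basics', body: 'Storage and compute in the cloud' },
  { id: 20, title: 'Gardening', body: 'Soil, seeds and water' },
];

// Build the index: key = id, content = concatenated fields.
const toyIndex = new ToyIndex();
books.forEach((b) => toyIndex.add(b.id, `${b.title} ${b.body}`));

// Search, then map the returned keys back to the original elements.
const found = toyIndex.search('Cloud').map((id) => books.find((b) => b.id === id));
console.log(found.map((b) => b.title)); // [ 'Cloud basics' ]
```

The important point is that the index only hands back keys; the application is responsible for turning those keys into full records.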
Building the index
const wsData = require('../daos/my_data'); // the dummy data file, relative to routes/index.js

function buildIndex() {
  console.time('buildIndexTook');
  console.info('building index...');
  const { data } = wsData; // we could get our data from a DB, a remote web service, etc.
  for (let i = 0; i < data.length; i++) {
    // concatenate the fields we want to be searchable
    const content = `${data[i].API} ${data[i].Description} ${data[i].Category}`;
    const key = parseInt(data[i].id, 10);
    searchIndex.add(key, content);
  }
  console.info(`index built, length: ${searchIndex.length}`);
  console.info(' Open a browser at http://localhost:3000/');
  console.timeEnd('buildIndexTook'); // must use the same label as console.time
}
Keep in mind that we are working with an in-memory search, so be careful with the amount of data you load into the index.
This method shouldn’t take more than a couple of seconds to run.
In the buildIndex() method we read our data from a static file, but we could get it from a remote web service or a database.
Then we indicate a key for our index and then the content.
After that our index is ready to receive queries.
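To check how index building behaves with your own data volume, a rough measurement can help. This is only a sketch: `measureBuild` is a hypothetical helper, the workload below is a stand-in for `buildIndex`, and the heap figure is approximate because garbage collection can run at any time.

```javascript
// Hypothetical helper: runs a build function and reports elapsed time and heap growth.
function measureBuild(buildFn) {
  const heapBefore = process.memoryUsage().heapUsed;
  const start = process.hrtime.bigint();
  buildFn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  const heapDeltaMb = (process.memoryUsage().heapUsed - heapBefore) / (1024 * 1024);
  return { elapsedMs, heapDeltaMb };
}

// Example with a stand-in workload; in the app you would pass buildIndex instead.
const stats = measureBuild(() => {
  const store = new Map();
  for (let i = 0; i < 50000; i++) store.set(i, `entry number ${i}`);
  return store;
});
console.log(`took ${stats.elapsedMs.toFixed(1)} ms, heap changed by about ${stats.heapDeltaMb.toFixed(1)} MB`);
```

Running this with progressively larger datasets gives a feel for when an in-memory index stops being practical for your requirements.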
Exposing the service to search
router.get('/search', async (req, res, next) => {
  try {
    if (searchIndex.length === 0) {
      buildIndex(); // synchronous, so there is nothing to await
    }
    const { phrase } = req.query;
    if (!phrase) {
      throw new Error('phrase query parameter empty');
    }
    console.info(`Searching by: ${phrase}`);
    // search using flexsearch. It will return a list of the IDs we used as keys during indexing
    const resultIds = await searchIndex.search({
      query: phrase,
      // When suggestion is enabled, all results will be filled up (until limit, default 1000)
      // with similar matches ordered by relevance.
      suggest: true,
    });
    console.info(`results: ${resultIds.length}`);
    const results = getDataByIds(resultIds);
    res.json(results);
  } catch (e) {
    next(e);
  }
});
Here we expose a typical Express endpoint that receives the phrase to search for through a query string parameter called phrase.
The index returns the keys that match our phrase; after that, we have to look up the corresponding elements in our dataset so they can be displayed.
function getDataByIds(idsList) {
  const result = [];
  const { data } = wsData;
  for (let i = 0; i < data.length; i++) {
    // compare using the same numeric key that was used when indexing
    if (idsList.includes(parseInt(data[i].id, 10))) {
      result.push(data[i]);
    }
  }
  return result;
}
We are just iterating over our collection, but typically we would query a database.
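Note that filtering the collection returns elements in collection order rather than in the order of the returned IDs. A sketch of an alternative lookup follows, assuming the IDs come back ordered by relevance and that `id` values are unique; `getDataByIdsOrdered` and the sample records are hypothetical.

```javascript
// Hypothetical variant of getDataByIds that keeps the order of idsList.
function getDataByIdsOrdered(idsList, data) {
  // Build an id -> element lookup once, so each ID is resolved in constant time.
  const byId = new Map(data.map((item) => [Number(item.id), item]));
  return idsList
    .map((id) => byId.get(Number(id)))
    .filter((item) => item !== undefined); // skip IDs with no matching element
}

const sampleData = [
  { id: 1, API: 'Alpha' },
  { id: 2, API: 'Beta' },
  { id: 3, API: 'Gamma' },
];
console.log(getDataByIdsOrdered([3, 1], sampleData).map((item) => item.API)); // [ 'Gamma', 'Alpha' ]
```

With a database, the equivalent would be a single query by ID list followed by the same reordering step.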
Making requests
Our last step is just to make some test requests with our browser, Postman, curl, or any other tool.
Some examples:
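For instance, a small helper can build request URLs for the endpoint. This is a sketch: `buildSearchUrl` is a hypothetical name, and the base URL assumes the app listens on http://localhost:3000 as in the log message printed after building the index.

```javascript
// Hypothetical helper that builds a request URL for the /search endpoint.
// The base URL assumes the default port used in this post.
function buildSearchUrl(phrase, baseUrl = 'http://localhost:3000') {
  const url = new URL('/search', baseUrl);
  url.searchParams.set('phrase', phrase); // takes care of URL-encoding
  return url.toString();
}

console.log(buildSearchUrl('Cloud'));     // http://localhost:3000/search?phrase=Cloud
console.log(buildSearchUrl('open data')); // http://localhost:3000/search?phrase=open+data
```

The resulting URLs can be pasted into the browser, Postman, or curl.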
That’s it. See the full source code
Tip: if you are working with MySQL, you can try its own full-text implementation