Mynij Milestones 1 and 2: what is the limit of flexsearch?

We already know that Mynij can hold an index of all bee species, of which there are 20 thousand. But what about the 150 million different references on Amazon? Can they fit inside Mynij, i.e. inside the web browser's RAM and storage on a typical smartphone?
  • Last Update: 2021-03-01
  • Version: 001
  • Language: en

Mynij is an experimental offline Web search engine based on JIO and Flexsearch. The long-term goal of Mynij is to provide relevant and accurate search results faster than traditional online Web search engines by searching indices that are personalised for each user.

The goal of this milestone was to write a test and build a test environment that automatically measures disk usage, RAM usage and execution time as a function of the number of entries added to the Mynij search engine. With these results, we can determine what kind of data we can expect Mynij to index. For example, we already know that Mynij can hold an index of all bee species, of which there are 20 thousand. But what about the 150 million different references on Amazon? Can they fit inside Mynij, i.e. inside the web browser's RAM and storage on a typical smartphone?
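The kind of measurement described above can be sketched as follows. This is only an illustration under stated assumptions: `ToyIndex` is a hypothetical stand-in written for this example and is not the Flexsearch API, the synthetic entries are invented, and the length of the serialised index is used as a rough proxy for storage usage. The actual tests are in the repository linked below.

```typescript
// Hypothetical toy inverted index. It only plays the role of the real
// search library so that the measurement loop is self-contained.
class ToyIndex {
  private postings = new Map<string, number[]>();

  // Index one entry: each distinct lowercase token maps to a list of ids.
  add(id: number, text: string): void {
    const tokens = Array.from(
      new Set(text.toLowerCase().split(/\W+/).filter(Boolean))
    );
    for (const token of tokens) {
      const ids = this.postings.get(token);
      if (ids) {
        ids.push(id);
      } else {
        this.postings.set(token, [id]);
      }
    }
  }

  // Return the ids of all entries containing the exact token.
  search(term: string): number[] {
    return this.postings.get(term.toLowerCase()) || [];
  }

  // Serialised form, used here as a rough proxy for storage usage.
  export(): string {
    return JSON.stringify(Array.from(this.postings.entries()));
  }
}

interface Measurement {
  entries: number;       // number of entries added
  buildMs: number;       // wall-clock time spent adding them
  exportedChars: number; // size of the serialised index, in characters
}

// Build an index of `entryCount` synthetic entries and record time and size.
function measure(entryCount: number): Measurement {
  const index = new ToyIndex();
  const start = Date.now();
  for (let id = 0; id < entryCount; id++) {
    index.add(id, `species ${id} genus ${id % 100}`);
  }
  const buildMs = Date.now() - start;
  return { entries: entryCount, buildMs, exportedChars: index.export().length };
}

// Repeat the measurement for growing entry counts to see how cost scales.
const measurements = [1000, 10000].map(measure);
for (const m of measurements) {
  console.log(m);
}
```

Running the same loop with larger entry counts, and recording RAM with whatever memory API the target environment offers, gives the kind of scaling curve the milestone set out to produce.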

All results can be found online here: https://alpha.iodide.io/notebooks/3633/?viewMode=report

The source code to produce results can be found here: https://lab.nexedi.com/ARogova/Mynij-unit-tests

We also did some tests of the RSS and Sitemap parsers which Mynij relies on to index web sites: https://alpha.iodide.io/notebooks/3900/?viewMode=report

The simplified conclusion is that, with the current implementation of flexsearch, Mynij can store 100K to 300K entries per index on a smartphone. It is also possible to import or export about 100K entries in a matter of seconds to minutes. Within 10 years, we can expect these figures to grow to a million entries, and maybe even 10 million.

This is enough for all bee species but not enough for all Amazon references.
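The size of the gap can be made concrete with a little arithmetic on the figures quoted above. The sketch below only divides the numbers given in the text. The helper name `indicesNeeded` is hypothetical, and it does not claim that splitting data across many indices is practical on a phone.

```typescript
// Figures taken from the text above.
const beeSpecies = 20000;
const amazonReferences = 150000000;
const capacityLow = 100000;      // lower bound of entries per index today
const capacityHigh = 300000;     // upper bound of entries per index today
const capacityFuture = 10000000; // optimistic 10-year projection

// How many indices of a given capacity would be needed to hold `total` entries.
function indicesNeeded(total: number, perIndex: number): number {
  return Math.ceil(total / perIndex);
}

console.log(indicesNeeded(beeSpecies, capacityLow));           // 1
console.log(indicesNeeded(amazonReferences, capacityLow));     // 1500
console.log(indicesNeeded(amazonReferences, capacityHigh));    // 500
console.log(indicesNeeded(amazonReferences, capacityFuture));  // 15
```

In other words, the bee catalogue fits in a single index with room to spare, while the Amazon catalogue is 500 to 1500 times larger than one index can hold today, and still 15 times larger than the most optimistic projection.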

Contact

  • Alain Takoudjou
  • alain (dot) takoudjou (at) nexedi (dot) com
  • email was talino@tiolive.com
  • Jean-Paul Smets
  • jp (at) nexedi (dot) com
  • Jean-Paul Smets is the founder and CEO of Nexedi. After graduating in mathematics and computer science at ENS (Paris), he started his career as a civil servant at the French Ministry of Economy. He then left government to start a small company called “Nexedi” where he developed his first Free Software, an Enterprise Resource Planning (ERP) designed to manage the production of swimsuits in the not-so-warm but friendly north of France. ERP5 was born. In parallel, he led with Hartmut Pilch (FFII) the successful campaign to protect software innovation against the dangers of software patents. The campaign eventually succeeded by rallying more than 100,000 supporters and thousands of CEOs of European software companies (both open source and proprietary). The Proposed directive on the patentability of computer-implemented inventions was rejected on 6 July 2005 by the European Parliament by an overwhelming majority of 648 to 14 votes, showing how small companies in Europe can together defeat the powerful lobbying of large corporations. Since then, he has helped Nexedi to grow either organically or by investing in new ventures led by bright entrepreneurs.