Sophia NLU Engine - Implementation

Sophia is designed for seamless integration into any existing back-end application, regardless of tech stack, programming language, or network infrastructure. Two main implementation methods are available:

Shared Library via FFI

Sophia includes a shared library compiled for Linux, macOS, and Windows, with a small footprint (3.7 MB on Linux). The library can be loaded from your preferred programming language via FFI (Foreign Function Interface). Integration scripts are provided for Python, Rust, C++, JavaScript, and PHP, but any language that supports FFI can be used.

The library provides two primary functions:

  • tokenize(input): Tokenizes the input, returning two vectors of tokens: individual words and MWEs (multi-word entities).
  • interpret(input): Tokenizes and interprets the input, chunking it into phrases structured for easy processing by software.

Both functions take a string as input and return an object containing the results. The detailed Rust structs and their C-compatible counterparts are available in the cicero-interfaces crate on crates.io.
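
As a rough illustration of the FFI route, the sketch below loads the shared library from Python with ctypes and checks that both entry points are present. The library file names and the exported symbol names are assumptions made for this example, not taken from the Sophia distribution. The argument and return types are deliberately left undeclared, since they should come from the C-compatible structs in cicero-interfaces.

```python
import ctypes
import sys
from pathlib import Path


def library_filename(platform: str = sys.platform) -> str:
    """Return an assumed file name for the shared library on a platform."""
    if platform.startswith("win"):
        return "sophia.dll"        # hypothetical name
    if platform == "darwin":
        return "libsophia.dylib"   # hypothetical name
    return "libsophia.so"          # hypothetical name


def load_sophia(lib_dir: str) -> ctypes.CDLL:
    """Load the shared library from lib_dir and check the two entry points."""
    path = Path(lib_dir) / library_filename()
    if not path.is_file():
        raise FileNotFoundError(f"shared library not found: {path}")
    lib = ctypes.CDLL(str(path))
    # Assumption: the functions are exported under these exact symbol names.
    for symbol in ("tokenize", "interpret"):
        if not hasattr(lib, symbol):
            raise AttributeError(f"{path.name} does not export {symbol!r}")
    return lib
```

A call such as load_sophia("/opt/sophia/lib") (a hypothetical path) would return the loaded handle. Before calling either function, set argtypes and restype on it to match the published C-compatible definitions.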

RPC Daemon

Alternatively, you can use the included "sophia" binary, compiled for Linux, macOS, and Windows. It runs as an RPC daemon: it loads the vocabulary database into memory once, keeps it there, and listens on localhost as an RPC server.

  • You can call it via the command line or send an RPC request over HTTP to receive a JSON object in return.
  • The same two methods, tokenize() and interpret(), are available.
  • View the exact responses on the JSON Objects page.
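
The HTTP route might look roughly like the following Python sketch, which builds a POST request for one of the two methods and decodes the JSON reply. The port, the URL path, and the JSON-RPC 2.0 envelope are all assumptions made for illustration; the actual request format is whatever the daemon documents.

```python
import json
import urllib.request

# Hypothetical address: the daemon listens on localhost, but the port is assumed.
RPC_URL = "http://127.0.0.1:8080/"


def build_rpc_request(method, text, url=RPC_URL, request_id=1):
    """Build (but do not send) an HTTP POST request for tokenize or interpret."""
    if method not in ("tokenize", "interpret"):
        raise ValueError(f"unknown method: {method!r}")
    # Assumption: a JSON-RPC 2.0 style envelope with the input as the only parameter.
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": [text]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_rpc(method, text, url=RPC_URL, timeout=5.0):
    """Send the request to a running daemon and return the decoded JSON object."""
    request = build_rpc_request(method, text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the daemon running, call_rpc("interpret", "book a table for two") would return the parsed JSON object. Keeping request construction separate from sending makes the payload easy to inspect or adapt if the real envelope differs.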

Vocabulary Database

The vocabulary database is a simple 44.6 MB binary file with no dependencies. The only setup required is to point to its location, either when loading the shared library or in the RPC configuration.

Alongside the vocabulary data store there are two further files, one for the cache and one for any imported named entities. Both start out empty.
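
A pre-flight check along these lines can catch a wrong path before the library or daemon is started. It is only a sketch: the function name is hypothetical, the caller supplies all three paths, and treating a missing cache or named-entity file as empty is an assumption based on both files starting out empty.

```python
from pathlib import Path


def check_data_files(vocab_path, cache_path, entities_path):
    """Return the size in bytes of each data file, failing if the vocabulary is unusable."""
    vocab = Path(vocab_path)
    if not vocab.is_file() or vocab.stat().st_size == 0:
        raise FileNotFoundError(f"vocabulary database missing or empty: {vocab}")
    sizes = {"vocabulary": vocab.stat().st_size}
    for label, raw_path in (("cache", cache_path), ("entities", entities_path)):
        path = Path(raw_path)
        # Assumption: these files start empty, so a missing or zero-byte file is reported as 0.
        sizes[label] = path.stat().st_size if path.is_file() else 0
    return sizes
```

The returned dictionary has the keys "vocabulary", "cache", and "entities", which makes it easy to log the sizes at startup.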

Additional Details

Sophia also lets you import your own named entities into the vocabulary and use selectors to match user input against predefined phrases. For more information, see:

For inquiries or expert consultation, please fill out the contact form.