Analytics, Data, Strategy

Conversational Interfaces: Use AI to Chat With Your Data or App

In today’s fast-evolving digital landscape, AI is reshaping how we interact with technology. Among its transformative applications, Conversational Interfaces (CIs) stand out, fundamentally changing our approach to data interaction and application usability. Unlike traditional graphical user interfaces (GUIs), CIs leverage natural language processing, enabling users to converse with data in a more intuitive, human-like manner.

The Challenge with Traditional GUIs

For years, GUIs have been the standard for interacting with data and applications. However, they come with limitations, especially in realms like self-service BI. The steep learning curve associated with manipulating GUIs or mastering query languages like SQL can be daunting. Efforts to simplify these interfaces, such as pivot-table-like designs and drag-and-drop icons, have made strides but still fall short in adoption rates, particularly among infrequent and non-technical users. The complexity remains a barrier, stifling the potential for data-driven decision-making across all levels of an organization.

The Rise of Conversational Interfaces

Enter Conversational Interfaces. By allowing users to interact through natural language queries or commands, CIs make it possible to chat with your data or interact with your app as if you were conversing with a human. This approach can be implemented directly in the app or across various platforms, including messaging apps like Slack, social media, and even voice gateways. At the heart of this system is building an “Augmented GPT” – a model that, beyond its base capabilities, is customized with specific data and functionalities enabled by developers.

How It Works

New functionality introduced recently by vendors such as OpenAI paves the way for such conversational capabilities. Developers can now equip GPT with setup prompts, documents, and, crucially, tool specifications. This customization allows GPT to adopt a specific persona and possess additional capabilities, such as querying private datasets or executing application-specific commands.

The process is straightforward yet powerful:

Developers customize GPT with the necessary configuration, information, and tools.
Users input their queries or commands into the application.
The application forwards these to the now-augmented GPT, which then processes the request.
GPT may make one or several tool call requests to gather information or perform actions.
Importantly, these tool calls are then actually executed by the developer’s application, not GPT.
The results are fed back to GPT, which crafts and returns a coherent response.
Finally, the application presents this response to the user.

This model not only enhances user experience by enabling sophisticated interactions with custom functionality but also ensures developers maintain control at all times, setting appropriate contextual guardrails and implementing safety measures.

Addressing Concerns

In working to implement CIs, we have had to deal with various concerns, particularly around data security, access control, cost, performance, and scalability. Here’s how we have tackled these issues:

Data Security: We only use OpenAI’s zero-retention endpoints to ensure that no data is stored (much less trained on), making them suitable for even restrictive HIPAA-compliant environments.
Access Control: We have the application act as a gatekeeper, managing user access and maintaining security.
Cost-Effectiveness: Interactions are token-based, with costs depending on the model used. By using a mix of models and optimizing queries, we normally find that the efficiency and expense reduction compared to traditional BI tools can be very significant.
Performance: Model selection impacts heavily response time. For example, GPT 3.5 is much faster than GPT 4. We normally use the most advanced model, but it takes a bit getting used to especially because, due to the series of interactions described above, we typically wait until the response is completed by GPT before presenting results to the user. This means response time can take 10-20 seconds. We expect this to drop significantly as new versions are introduced.
Scalability: Most transactional systems are designed for fast transactions. These GPT sessions take time as users interact, so adequate horizontal scaling must be implemented to deal with large numbers of users accessing the system simultaneously – for example, with automatic scaling of cloud containers.
Thematic focus: One of the key concerns companies have is to make sure their GPT appropriately presents itself, stays on task, and does not say something that is incongruent with the company’s brand image, values, and tone. We have found that the risk of GPT going astray can be minimized with careful prompt design and adequate testing.
Complexity: Many interactions are more complex than a simple “question/answer” or a “command/act” model. A technique we use to tackle this is to break down complex interactions into steps, which can then be addressed by individual GPTs or a GPT with a state machine to guide it.

Building a Conversational Interface

Developing a CI is akin to any software + data engineering project. To achieve commercial-grade strength, it is important to incorporate best practices such as:

Separate development, QA, and production environments.
CI/CD tools for software and data lifecycle automation.
Rigorous testing for accuracy and effectiveness.
A semantic layer with clear definitions and understanding.
Iterative development for rapid deployment and refinement.

This last point is critical as the entire AI industry, including the Conversational Interface subsegment, is moving extremely fast. We continuously refactor and adapt our CIs as new developments pop up, often within a few weeks of one another.

The Skillsets Needed

Building a Conversational Interface requires a variety of expertise, including:

Infrastructure engineering/cloud architecture: to design, create, and manage cloud resources.
Data engineering: to create the data and models that will power the CI.
Software engineering: to create the end-user interface, orchestration functions, and all tools needed for the GPT.
Prompt engineering: to create custom prompts, GPT personalization, and fine-tune models.
DevOps engineering: to add testing, CI/CD, pipeline orchestration, environment management, etc.
Project and product management: to manage expectations, set feature roadmaps, and coordinate execution and rollout

One thing you will not need, at least for the CI itself, are people with deep AI or traditional Data Science skills. This is because CIs leverage pre-trained models like GPT, making these projects accessible, both in terms of cost and time requirements. However, you will need all the disciplines above if you are to have a solid CI implementation.

Getting Started

At Fractal River, we have a specialized practice focused on developing Conversational Interfaces that has experience building CIs and helping growth companies transform their data and application interactions.

We also understand that developing these competencies in-house at the right time is strategic, so we are prepared to not only help you through the process, from concept to deployment and ongoing management, but to grow your internal team as well so you can be self-sufficient in the future.

What are your thoughts on Conversational Interfaces? Do you have a use case in mind? Are you interested in seeing a demonstration? If so, we’d love to talk more with you about it!

Blog Categories

Pages

info@fractalriver.com

+1 832 3771028