Moving Large Language Models (LLM) into Real-World Business Applications

Large language models are everywhere. Every customer conversation or VC pitch involves questions about how ready LLM tech is and how it will drive future applications. I covered some patterns on this in my previous post. Here I will talk about some real-world patterns for an application in the pharma industry that Persistent Systems worked on.

Large Language Models and Core Strengths

LLMs are good at understanding language; that is their forte. The most common pattern we see in applications is retrieval-augmented generation (RAG), where knowledge is compiled from external data sources and provided in context, as part of the prompt, for the LLM to paraphrase into a response. In this pattern, fast search mechanisms such as vector databases and Elasticsearch-based engines serve as a first line of search. The search results are then compiled into a prompt and sent to the LLM, usually as an API call.
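As a rough illustration of the RAG flow described above, here is a minimal Python sketch. It assumes a toy bag-of-words cosine similarity in place of a real vector database or Elasticsearch index, and it stops at building the prompt rather than calling any LLM API. The function names (retrieve, build_rag_prompt) and the prompt wording are illustrative assumptions, not part of any specific product.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Toy stand-in for the search layer: rank documents by similarity."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(query, passages):
    """Compile the retrieved passages into a single prompt for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )
```

In a real application the retrieve step would be replaced by a query against the search engine, and the resulting prompt would be sent to the LLM provider's API.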

Another pattern is generating a query on structured data by feeding the LLM a data model and a specific user query in the prompt. This pattern could be used to develop an advanced “talk to your data” interface for SQL databases like Snowflake, as well as graph databases like Neo4j.
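A minimal Python sketch of how such a prompt might be assembled is shown below. The table definition, the function name build_sql_prompt, and the prompt wording are hypothetical and only meant to show the shape of the idea: the data model and the user question travel together in one prompt.

```python
def build_sql_prompt(schema: str, question: str, dialect: str = "Snowflake SQL") -> str:
    """Combine a data model and a user question into a query-generation prompt."""
    return (
        f"You write {dialect} queries.\n"
        "Use only the tables and columns defined in this data model:\n"
        f"{schema.strip()}\n\n"
        f"User question: {question.strip()}\n"
        "Reply with a single query and nothing else."
    )


# Hypothetical data model used purely for illustration.
EXAMPLE_SCHEMA = """
CREATE TABLE patients (
    patient_id INTEGER,
    age INTEGER,
    diagnosis VARCHAR
);
"""

example_prompt = build_sql_prompt(
    EXAMPLE_SCHEMA, "How many patients are older than 65?"
)
```

The same structure would apply to a graph database by swapping the schema for a node and relationship model and asking for a Cypher query instead.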

Leveraging LLM Patterns for Real-World Insights

Persistent Systems recently looked at a pattern for Blast Motion, a sports telemetry company (swing analysis for baseball, golf, etc.), where we analysed time-series data of player summaries to get recommendations.

For more complex applications, we often need to chain LLM requests, with processing in between the calls. For a pharma company, we developed a smart trials app that filters patients for clinical trials based on criteria extracted from clinical trial documents. Here we used an LLM chain approach. First, we built an LLM step that reads the trial PDF document and uses the RAG pattern to extract the inclusion and exclusion criteria.

For this step, a relatively simple LLM like GPT-3.5-Turbo (ChatGPT) was used. We then combined the extracted entities with the data model of the patients' SQL database in Snowflake to create a prompt. Fed to a more powerful LLM like GPT-4, this prompt yields a SQL query that filters patients and is ready to run on Snowflake. Because we use LLM chaining, we can use a different LLM for each step of the chain, which lets us manage cost.
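The two-step chain described above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the actual application: each model is represented as a plain callable that takes a prompt and returns text, the JSON reply format is an assumed convention, and the two fake models at the bottom are hypothetical stand-ins so the sketch runs without any API access.

```python
import json
from typing import Callable

# Assumed interface: a model is any function mapping a prompt to a completion.
LLM = Callable[[str], str]


def extract_criteria(trial_text: str, cheap_llm: LLM) -> dict:
    """Step 1: a cheaper model extracts inclusion/exclusion criteria as JSON."""
    prompt = (
        "Extract the inclusion and exclusion criteria from the trial text below. "
        'Reply with JSON of the form {"inclusion": [...], "exclusion": [...]}.\n\n'
        + trial_text
    )
    return json.loads(cheap_llm(prompt))


def generate_patient_sql(criteria: dict, schema: str, strong_llm: LLM) -> str:
    """Step 2: a stronger model turns criteria plus data model into SQL."""
    prompt = (
        "Data model:\n" + schema + "\n\n"
        "Trial criteria:\n" + json.dumps(criteria, indent=2) + "\n\n"
        "Write one SQL query that selects the patients who meet these criteria."
    )
    return strong_llm(prompt).strip()


def run_chain(trial_text: str, schema: str, cheap_llm: LLM, strong_llm: LLM) -> str:
    """Fixed, deterministic orchestration: always step 1, then step 2."""
    criteria = extract_criteria(trial_text, cheap_llm)
    return generate_patient_sql(criteria, schema, strong_llm)


# Hypothetical stand-ins for real model calls, with canned replies.
def fake_cheap_llm(prompt: str) -> str:
    return '{"inclusion": ["age >= 18"], "exclusion": ["pregnant"]}'


def fake_strong_llm(prompt: str) -> str:
    return "SELECT patient_id FROM patients WHERE age >= 18 AND pregnant = FALSE;\n"


sql = run_chain(
    "Adults only. Pregnant patients are excluded.",
    "patients(patient_id, age, pregnant)",
    fake_cheap_llm,
    fake_strong_llm,
)
```

Passing the models in as arguments is what makes it straightforward to assign a cheaper model to extraction and a stronger one to query generation.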

For now, we have decided to keep this chain deterministic for better control. That is, we put more intelligence into the individual links of the chain and keep the orchestration very simple and predictable. Each element of the chain is a complex application in itself, one that would have taken a few months to develop in the pre-LLM days.

Powering More Advanced Use Cases

For a more advanced case, we could use agent approaches like ReAct to prompt the LLM to create step-by-step instructions to follow for a particular user query. This would of course need a high-end LLM such as GPT-4, Claude 2, or a Cohere model. There is then a risk, however, of the model taking an incorrect step, so each step needs to be verified using guardrails. This is a trade-off between placing intelligence in controllable links of the chain and making the whole chain autonomous.
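One simple form a guardrail could take is a check on a model-generated SQL query before it is executed. The Python sketch below is an assumption about what such a check might look like, not a description of any particular guardrail library: it accepts only a single read-only statement and rejects anything containing a data-modifying keyword. It is deliberately conservative and keyword-based, so it may reject some harmless queries, for example one with a forbidden word inside a string literal.

```python
import re

# Keywords that indicate the statement would modify data or schema.
FORBIDDEN = {
    "insert", "update", "delete", "drop", "alter",
    "truncate", "create", "merge", "grant",
}


def is_safe_select(sql: str) -> bool:
    """Return True only for a single statement that looks read-only."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        # More than one statement was supplied.
        return False
    words = re.findall(r"[a-z_]+", statement.lower())
    if not words or words[0] not in ("select", "with"):
        return False
    return not any(word in FORBIDDEN for word in words)
```

A check like this would sit between the agent's proposed step and its execution, so an incorrect or unsafe step is stopped rather than run.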

Today, as we get used to the age of generative AI for language, the industry is starting to adopt LLM applications with predictable chains. As this adoption grows, we will soon start experimenting with more autonomy for these chains via agents. That is what the debate on AGI is all about, and we are interested to see how all of this evolves over time.

The post Moving Large Language Models (LLM) into Real-World Business Applications appeared first on Unite.AI.
