A Complete Guide to LangChain in JavaScript — SitePoint

By admin
In this comprehensive guide, we’ll dive deep into the essential components of LangChain and demonstrate how to harness its power in JavaScript.

LangChainJS is a versatile JavaScript framework that empowers developers and researchers to create, experiment with, and analyze language models and agents. It offers a rich set of features for natural language processing (NLP) enthusiasts, from building custom models to manipulating text data efficiently. As a JavaScript framework, it also allows developers to easily integrate their AI applications into web apps.

Prerequisites

To follow along with this article, create a new folder and install the LangChain npm package:

npm install -S langchain

Then, inside that folder, create a new JS module file using the .mjs suffix (such as test1.mjs).

Agents

In LangChain, an agent is an entity that can understand and generate text. These agents can be configured with specific behaviors and data sources and trained to perform various language-related tasks, making them versatile tools for a wide range of applications.

Creating a LangChain agent

Agents can be configured to use “tools” to gather the data they need and formulate a good response. Take a look at the example below. It uses SerpAPI (an internet search API) to search the Internet for information relevant to the question or input, and uses that to formulate a response. It also uses the llm-math tool to perform mathematical operations — for example, to convert units or find the percentage change between two values:

import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { SerpAPI } from "langchain/tools";
import { Calculator } from "langchain/tools/calculator";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"
process.env["SERPAPI_API_KEY"] = "YOUR_SERPAPI_KEY"

const tools = [new Calculator(), new SerpAPI()];
const model = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0 });

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "openai-functions",
  verbose: false,
});

const result = await executor.run("By searching the Internet, find how many albums has Boldy James dropped since 2010 and how many albums has Nas dropped since 2010? Find who dropped more albums and show the difference in percent.");
console.log(result);

After creating the model variable using modelName: "gpt-3.5-turbo" and temperature: 0, we create the executor that combines the created model with the specified tools (SerpAPI and Calculator). In the input, I’ve asked the LLM to search the Internet (using SerpAPI) and find which artist dropped more albums since 2010 — Nas or Boldy James — and show the percentage difference (using Calculator).

In this example, I had to explicitly tell the LLM “By searching the Internet…” to have it get data up until present day using the Internet instead of using OpenAI’s default data limited to 2021.

Here’s what the output looks like:

> node test1.mjs

Boldy James has released 4 albums since 2010. Nas has released 17 studio albums since 2010. Therefore, Nas has released more albums than Boldy James. The difference in the number of albums is 13. To calculate the difference in percent, we can use the formula: (Difference / Total) * 100. In this case, the difference is 13 and the total is 17. The difference in percent is: (13 / 17) * 100 = 76.47%. So, Nas has released 76.47% more albums than Boldy James since 2010.

Models

There are three types of models in LangChain: LLMs, chat models, and text embedding models. Let’s explore every type of model with some examples.

Language model


LangChain provides a way to use language models in JavaScript to produce a text output based on a text input. It’s not as complex as a chat model, and it’s best used for simple input–output language tasks. Here’s an example using OpenAI:

import { OpenAI } from "langchain/llms/openai";

const llm = new OpenAI({
  openAIApiKey: "YOUR_OPENAI_KEY",
  model: "gpt-3.5-turbo",
  temperature: 0
});

const res = await llm.call("List all red berries");
console.log(res);

As you can see, it uses the gpt-3.5-turbo model to list all red berries. In this example, I set the temperature to 0 to make the LLM factually accurate. Output:


1. Strawberries
2. Cranberries
3. Raspberries
4. Redcurrants
5. Red Gooseberries
6. Red Elderberries
7. Red Huckleberries
8. Red Mulberries

Chat model

If you want more sophisticated answers and conversations, you need to use chat models. How are chat models technically different from language models? Well, in the words of the LangChain documentation:

Chat models are a variation on language models. While chat models use language models under the hood, the interface they use is a bit different. Rather than using a “text in, text out” API, they use an interface where “chat messages” are the inputs and outputs.
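To make that concrete, here’s a minimal sketch of the message-based interface. It’s only a sketch: it assumes your LangChain version exports SystemMessage and HumanMessage from langchain/schema (older releases name these classes differently):

import { ChatOpenAI } from "langchain/chat_models/openai";
// Assumed export path for the message classes in recent LangChainJS versions
import { SystemMessage, HumanMessage } from "langchain/schema";

const chat = new ChatOpenAI({ openAIApiKey: "YOUR_OPENAI_KEY", temperature: 0 });

// Chat messages go in, and a chat message (an AIMessage) comes back out,
// instead of the plain "text in, text out" of a language model.
const reply = await chat.call([
  new SystemMessage("You are a helpful assistant."),
  new HumanMessage("Who created the world wide web?"),
]);
console.log(reply.content);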

Here’s a simple (pretty useless but fun) JavaScript chat model script:

import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";

const chat = new ChatOpenAI({
  openAIApiKey: "YOUR_OPENAI_KEY",
  model: "gpt-3.5-turbo",
  temperature: 0
});
const prompt = PromptTemplate.fromTemplate(`You are a poetic assistant that always answers in rhymes: {question}`);
const runnable = prompt.pipe(chat);
const response = await runnable.invoke({ question: "Who is better, Djokovic, Federer or Nadal?" });
console.log(response);

As you can see, the code first sends a system message telling the chatbot to be a poetic assistant that always answers in rhymes, and afterwards it sends a human message asking the chatbot who’s the better tennis player: Djokovic, Federer or Nadal. If you run this chatbot model, you’ll see something like this:

AIMessage.content: 'In the realm of tennis, they all shine bright,\n' +
  'Djokovic, Federer, and Nadal, a glorious sight.\n' +
  'Each with their unique style and skill,\n' +
  'Choosing the best is a difficult thrill.\n' +
  '\n' +
  'Djokovic, the Serb, a master of precision,\n' +
  'With agility and focus, he plays with decision.\n' +
  'His powerful strokes and relentless drive,\n' +
  "Make him a force that's hard to survive.\n" +
  '\n' +
  'Federer, the Swiss maestro, a true artist,\n' +
  'Graceful and elegant, his game is the smartest.\n' +
  'His smooth technique and magical touch,\n' +
  'Leave spectators in awe, oh so much.\n' +
  '\n' +
  'Nadal, the Spaniard, a warrior on clay,\n' +
  'His fierce determination keeps opponents at bay.\n' +
  'With his relentless power and never-ending fight,\n' +
  'He conquers the court, with all his might.\n' +
  '\n' +
  "So, who is better? It's a question of taste,\n" +
  "Each player's greatness cannot be erased.\n" +
  "In the end, it's the love for the game we share,\n" +
  'That makes them all champions, beyond compare.'

Pretty cool!


Embeddings

Embeddings models provide a way to turn words and numbers in a text into vectors that can then be associated with other words or numbers. This may sound abstract, so let’s look at an example:

import { OpenAIEmbeddings } from "langchain/embeddings/openai";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const embeddings = new OpenAIEmbeddings();
const res = await embeddings.embedQuery("Who created the world wide web?");
console.log(res)

This will return a long list of floats:

[
  0.02274114, -0.012759142, 0.004794503, -0.009431809, 0.01085313, 0.0019698727,
  -0.013649924, 0.014933698, -0.0038185727, -0.025400387, 0.010794181, 0.018680222,
  0.020042595, 0.004303263, 0.019937797, 0.011226473, 0.009268062, 0.016125774,
  0.0116391145, -0.0061765253, -0.0073358514, 0.00021696436, 0.004896026, 0.0034026562,
  -0.018365828,
  ... 1501 more items
]

This is what an embedding looks like. All of those floats for just six words!

This embedding can then be used to associate the input text with potential answers, related texts, names and more.

Now let’s look at a use case of embedding models. Here’s a script that takes the question “What is the heaviest animal?” and finds the right answer in the provided list of possible answers by using embeddings:

import { OpenAIEmbeddings } from "langchain/embeddings/openai";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const embeddings = new OpenAIEmbeddings();

function cosinesim(A, B) {
  var dotproduct = 0;
  var mA = 0;
  var mB = 0;

  for (var i = 0; i < A.length; i++) {
    dotproduct += A[i] * B[i];
    mA += A[i] * A[i];
    mB += B[i] * B[i];
  }

  mA = Math.sqrt(mA);
  mB = Math.sqrt(mB);
  var similarity = dotproduct / (mA * mB);
  return similarity;
}

const res1 = await embeddings.embedQuery("The Blue Whale is the heaviest animal in the world");
const res2 = await embeddings.embedQuery("George Orwell wrote 1984");
const res3 = await embeddings.embedQuery("Random stuff");

const text_arr = ["The Blue Whale is the heaviest animal in the world", "George Orwell wrote 1984", "Random stuff"]
const res_arr = [res1, res2, res3]

const question = await embeddings.embedQuery("What is the heaviest animal?");

const sims = []
for (var i = 0; i < res_arr.length; i++) {
  sims.push(cosinesim(question, res_arr[i]))
}

Array.prototype.max = function() {
  return Math.max.apply(null, this);
};
console.log(text_arr[sims.indexOf(sims.max())])

This code uses the cosinesim(A, B) function to calculate how related each candidate answer is to the question. It then finds the embedding most related to the question by using the Array.prototype.max function to pick the maximum value from the array of relatedness scores generated with cosinesim, and finally looks up which text from text_arr belongs to that most related answer: text_arr[sims.indexOf(sims.max())].

Output:


The Blue Whale is the heaviest animal in the world

Chunks


LangChain models can’t handle large texts all at once and use them to make responses. This is where chunks and text splitting come in. Let me show you two simple methods to split your text data into chunks before feeding it into LangChain.

Splitting chunks by character

To avoid abrupt breaks in chunks, you can split your texts by paragraph, splitting them at every occurrence of a newline:

import { Document } from "langchain/document";
import { CharacterTextSplitter } from "langchain/text_splitter";

const splitter = new CharacterTextSplitter({
  separator: "\n",
  chunkSize: 7,
  chunkOverlap: 3,
});
const output = await splitter.createDocuments([your_text]);

This is one useful way of splitting a text. However, you can use any character as a chunk separator, not just \n.
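For instance, here’s a minimal sketch that splits on sentence boundaries instead, using the same CharacterTextSplitter; the “. ” separator and the sizes are purely illustrative:

import { CharacterTextSplitter } from "langchain/text_splitter";

// Illustrative values: split on ". " so chunks tend to end at sentence boundaries.
const sentenceSplitter = new CharacterTextSplitter({
  separator: ". ",
  chunkSize: 200,
  chunkOverlap: 20,
});
const sentenceChunks = await sentenceSplitter.createDocuments([your_text]);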

Recursively splitting chunks

If you want to strictly split your text by a certain length of characters, you can do so using RecursiveCharacterTextSplitter:

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 100,
  chunkOverlap: 15,
});
const output = await splitter.createDocuments([your_text]);

In this example, the text gets split every 100 characters, with a chunk overlap of 15 characters.

Chunk size and overlap

By looking at those examples, you’ve probably started wondering exactly what the chunk size and overlap parameters mean, and what implications they have on performance. Well, let me explain it simply in two points.

Chunk size decides the amount of characters that will be in each chunk. The bigger the chunk size, the more data is in the chunk, and the more time it will take LangChain to process it and to produce an output, and vice versa.

Chunk overlap is what shares information between chunks so that they share some context. The higher the chunk overlap, the more redundant your chunks will be; the lower the chunk overlap, the less context will be shared between the chunks. Generally, a good chunk overlap is between 10% and 20% of the chunk size, although the ideal chunk overlap varies across different text types and use cases.
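If you want to apply that rule of thumb directly, here’s a minimal sketch that derives the overlap from the chunk size. Note that makeSplitter is just an illustrative helper name, not part of LangChain:

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

// Hypothetical helper: derive chunkOverlap as a fraction of chunkSize.
function makeSplitter(chunkSize, overlapRatio = 0.15) {
  return new RecursiveCharacterTextSplitter({
    chunkSize,
    chunkOverlap: Math.round(chunkSize * overlapRatio), // e.g. 1000 * 0.15 = 150
  });
}

const splitter = makeSplitter(1000);
const output = await splitter.createDocuments([your_text]);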

Chains

Chains are basically multiple LLM functionalities linked together to perform more complex tasks that couldn’t otherwise be done in a simple LLM input -> output fashion. Let’s look at a cool example:

import { ChatPromptTemplate } from "langchain/prompts";
import { LLMChain } from "langchain/chains";
import { ChatOpenAI } from "langchain/chat_models/openai";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const wiki_text = `
Alexander Stanislavovich 'Sasha' Bublik (Александр Станиславович Бублик; born 17 June 1997) is a Kazakhstani professional tennis player. He has been ranked as high as world No. 25 in singles by the Association of Tennis Professionals (ATP), which he achieved in July 2023, and is the current Kazakhstani No. 1 player…

Alexander Stanislavovich Bublik was born on 17 June 1997 in Gatchina, Russia and began playing tennis at the age of four. He was coached by his father, Stanislav. On the junior tour, Bublik reached a career-high ranking of No. 19 and won eleven titles (six singles and five doubles) on the International Tennis Federation (ITF) junior circuit.[4][5]…
`;

const chat = new ChatOpenAI({ temperature: 0 });
const chatPrompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant that {action} the provided text"],
  ["human", "{text}"],
]);

const chainB = new LLMChain({
  prompt: chatPrompt,
  llm: chat,
});

const resB = await chainB.call({
  action: "lists all important numbers from",
  text: wiki_text,
});
console.log({ resB });

This code takes a variable into its prompt, and formulates a factually correct answer (temperature: 0). In this example, I asked the LLM to list all important numbers from a short Wiki bio of my favorite tennis player.

Here’s the output of this code:

{
  resB: {
    text: 'Important numbers from the provided text:\n' +
      '\n' +
      "- Alexander Stanislavovich 'Sasha' Bublik's date of birth: 17 June 1997\n" +
      "- Bublik's highest singles ranking: world No. 25\n" +
      "- Bublik's highest doubles ranking: world No. 47\n" +
      "- Bublik's career ATP Tour singles titles: 3\n" +
      "- Bublik's career ATP Tour singles runner-up finishes: 6\n" +
      "- Bublik's height: 1.96 m (6 ft 5 in)\n" +
      "- Bublik's number of aces served in the 2021 ATP Tour season: unknown\n" +
      "- Bublik's junior tour ranking: No. 19\n" +
      "- Bublik's junior tour titles: 11 (6 singles and 5 doubles)\n" +
      "- Bublik's previous citizenship: Russia\n" +
      "- Bublik's current citizenship: Kazakhstan\n" +
      "- Bublik's role in the Levitov Chess Wizards team: reserve member"
  }
}

Pretty cool, but this doesn’t really show the full power of chains. Let’s take a look at a more practical example:

import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
import { ChatOpenAI } from "langchain/chat_models/openai";
import {
  ChatPromptTemplate,
  SystemMessagePromptTemplate,
  HumanMessagePromptTemplate,
} from "langchain/prompts";
import { JsonOutputFunctionsParser } from "langchain/output_parsers";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const zodSchema = z.object({
  albums: z
    .array(
      z.object({
        name: z.string().describe("The name of the album"),
        artist: z.string().describe("The artist(s) that made the album"),
        length: z.number().describe("The length of the album in minutes"),
        genre: z.string().optional().describe("The genre of the album"),
      })
    )
    .describe("An array of music albums mentioned in the text"),
});

const prompt = new ChatPromptTemplate({
  promptMessages: [
    SystemMessagePromptTemplate.fromTemplate("List all music albums mentioned in the following text."),
    HumanMessagePromptTemplate.fromTemplate("{inputText}"),
  ],
  inputVariables: ["inputText"],
});

const llm = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0 });
const functionCallingModel = llm.bind({
  functions: [
    {
      name: "output_formatter",
      description: "Should always be used to properly format output",
      parameters: zodToJsonSchema(zodSchema),
    },
  ],
  function_call: { name: "output_formatter" },
});

const outputParser = new JsonOutputFunctionsParser();
const chain = prompt.pipe(functionCallingModel).pipe(outputParser);

const response = await chain.invoke({
  inputText: "My favorite albums are: 2001, To Pimp a Butterfly and Led Zeppelin IV",
});

console.log(JSON.stringify(response, null, 2));

This code reads an input text, identifies all mentioned music albums, identifies each album’s name, artist, length and genre, and finally puts all the data into JSON format. Here’s the output given the input “My favorite albums are: 2001, To Pimp a Butterfly and Led Zeppelin IV”:

{ "albums" : [ { "name" : "

2001
DATE

" , "artist" : "Dr.

Dre
PERSON

" , "length" :

68
CARDINAL

, "genre" : "

Hip Hop
WORK_OF_ART

" } , { "name" : "

To Pimp a Butterfly
WORK_OF_ART

" , "artist" : "

Kendrick Lamar
PERSON

" , "length" :

79
CARDINAL

, "genre" : "

Hip Hop
WORK_OF_ART

" } , { "name" : "

Led Zeppelin IV
WORK_OF_ART

" , "artist" : "

Led Zeppelin
WORK_OF_ART

" , "length" :

42
CARDINAL

, "genre" : "Rock" } ] }

This is just a fun example, but this technique can be used to structure unstructured text data for countless other applications.
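For instance, swapping in a different Zod schema is all it takes to extract another kind of structure. Here’s a brief sketch, where the peopleSchema below is purely illustrative, that reuses the exact same chain setup to pull people and their birth years out of a text:

// Illustrative schema: people instead of albums. Reuse the chain above,
// passing zodToJsonSchema(peopleSchema) to the bound function and changing
// the system prompt to "List all people mentioned in the following text."
const peopleSchema = z.object({
  people: z.array(z.object({
    name: z.string().describe("The person's full name"),
    birthYear: z.number().optional().describe("The person's year of birth, if mentioned"),
  })).describe("An array of people mentioned in the text"),
});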

Going Beyond OpenAI

Even though I keep using OpenAI models as examples of the different functionalities of LangChain, it isn’t limited to OpenAI models. You can use LangChain with a multitude of other LLMs and AI services. You can find the full list of LLMs that integrate with LangChain in JavaScript in their documentation.

For example, you can use Cohere with LangChain. After installing Cohere using npm install cohere-ai, you can make a simple question -> answer code using LangChain and Cohere like this:

import { Cohere } from "langchain/llms/cohere";

const model = new Cohere({
  maxTokens: 50,
  apiKey: "YOUR_COHERE_KEY",
});
const res = await model.call("Come up with a name for a new Nas album");
console.log({ res });

Output:

{
  res: ' Here are a few possible names for a new Nas album:\n' +
    '\n' +
    "- King's Landing\n" +
    "- God's Son: The Sequel\n" +
    "- Street's Disciple\n" +
    '- Izzy Free\n' +
    '- Nas and the Illmatic Flow\n' +
    '\n' +
    'Do any'
}

Conclusion

In this guide, you’ve seen the different aspects and functionalities of LangChain in JavaScript. You can use LangChain in JavaScript to easily develop AI-powered web apps and experiment with LLMs. Be sure to refer to the LangChainJS documentation for more details on specific functionalities.

Happy coding and experimenting with LangChain in JavaScript! If you enjoyed this article, you might also like to read about using LangChain with Python.