Building a ShopifyQL Code Editor

By admin

In October 2022, Shopify released ShopifyQL Notebooks, a first-party app that lets merchants analyze their shop data to make better decisions. It puts the power of ShopifyQL into merchants’ hands with a guided code editing experience. In order to provide a first-class editing experience, we turned to CodeMirror, a code editor framework built for the web. Out of the box, CodeMirror didn’t have support for ShopifyQL. Here’s how we built it.

ShopifyQL Everywhere

ShopifyQL is an accessible, commerce-focused querying language used on both the client and server. The language is defined by an ANTLR grammar and is used to generate code for multiple targets (currently, Go and TypeScript). This lets us share the same grammar definition between both the client and server despite differences in runtime language. As an added benefit, we have types written in Protobuf so that types can be shared between targets as well.

All the ShopifyQL language features on the front end are encapsulated into a TypeScript language server, which is built on top of the ANTLR TypeScript target. It conforms to Microsoft’s Language Server Protocol (LSP) in order to keep a clear separation of concerns between the language server and a code editor. LSP defines the shape of common language features like tokenization, parsing, completion, hover tooltips, and linting.

When code editors and language servers both conform to LSP, they become interoperable because they speak a common language. For more information about LSP, read the VS Code Language Server Extension Guide.


Connecting The ShopifyQL Language Server To CodeMirror

CodeMirror has its own grammar and parser engine called Lezer. Lezer is used within CodeMirror to generate parse trees, and those trees power many of the editor features. Lezer has support for common languages, but no Lezer grammar exists for ShopifyQL. Lezer also doesn’t conform to LSP. Because ShopifyQL’s grammar and language server had already been written in ANTLR, it didn’t make sense to rewrite what we had as a Lezer grammar. Instead, we decided to create an adapter that would conform to LSP and integrate with Lezer. This allowed us to pass a ShopifyQL query to the language server, adapt the response, and return a Lezer parse tree.


Lezer supports creating a tree in one of two ways:

- Manually creating a tree by creating nodes and attaching them in the correct tree shape
- Generating a tree from a buffer of tokens

The ShopifyQL language server can create a stream of tokens from a document, so it made sense to re-shape that stream into a buffer that Lezer understands.

Converting A ShopifyQL Query Into A Lezer Tree

In order to transform a ShopifyQL query into a Lezer parse tree, the following steps occur:

1. Lezer initiates the creation of a parse tree. This happens when the document is first loaded and any time the document changes.
2. Our custom adapter takes the ShopifyQL query and passes it to the language server.
3. The language server returns a stream of tokens that describe the ShopifyQL query.
4. The adapter takes those tokens and transforms them into Lezer node types.
5. The Lezer node types are used to create a buffer that describes the document.
6. The buffer is used to build a Lezer tree.
7. Finally, the adapter returns the tree to Lezer, which completes the parse cycle.

Understanding ShopifyQL’s Token Offset


One of the biggest obstacles to transforming the language server’s token stream into a Lezer buffer was the format of the tokens. Within the ShopifyQL Language Server, the tokens come back as integers in chunks of 5, with the position of each integer having a distinct meaning: line, start character, length, token type, and token modifier.
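To make the chunked format concrete, here is a minimal sketch of splitting such a flat integer array into one object per token. The field order follows LSP's relative token encoding (line, start character, length, token type, token modifiers); that the ShopifyQL language server uses exactly this order is an assumption, and the names `RelativeToken` and `decodeTokens` are hypothetical.

```typescript
// One decoded token. Field order mirrors LSP's relative token encoding;
// treating the ShopifyQL server's output this way is an assumption.
interface RelativeToken {
  deltaLine: number;      // line, relative to the previous token
  deltaStartChar: number; // start character, relative (see the text that follows)
  length: number;         // token length in characters
  tokenType: number;      // index into a token type legend
  tokenModifiers: number; // bit set of token modifiers
}

// Split a flat integer array into chunks of 5, one chunk per token.
function decodeTokens(data: readonly number[]): RelativeToken[] {
  if (data.length % 5 !== 0) {
    throw new Error(`Expected a multiple of 5 integers, got ${data.length}`);
  }
  const tokens: RelativeToken[] = [];
  for (let i = 0; i < data.length; i += 5) {
    tokens.push({
      deltaLine: data[i],
      deltaStartChar: data[i + 1],
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return tokens;
}
```

For example, `decodeTokens([0, 0, 4, 1, 0, 0, 5, 13, 2, 0])` yields two tokens, the second with `deltaStartChar` 5 and `length` 13. The token type values here are made up for illustration.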

In this context, length, token type, and token modifier were fairly straightforward to use. However, the behavior of line and start character was more difficult to understand. Imagine a simple ShopifyQL query like this:

This query would be tokenized like this:

In the stream of tokens, even though product_title is on line 1 (using zero-based indexes), the value for its line integer is zero! This is because the tokenization happens incrementally and each computed offset value is always relative to the previous token. This becomes more confusing when you factor in whitespace. Let’s say that we add five spaces before the word SHOW:

The tokens for this query are:

Notice that only the start character for SHOW changed! It changed from 0 to 5 after adding five spaces before the SHOW keyword. However, product_title’s values remain unchanged. This is because the values are relative to the previous token, and the space between SHOW and product_title didn’t change.

This becomes especially confusing when you use certain language features that are parsed out of order. For example, in some ANTLR grammars, comments are not parsed as part of the default channel; they are parsed after everything in the main channel is parsed. Let’s add a comment to the first line:

The tokens for this query look like this (and are in this order):

Before the parser parses the comment, it points at product_title, which is two lines after the comment. When the parser finishes with the main channel and begins parsing the channel that contains the comment, the pointer needs to move two lines up to tokenize the comment, hence the value of -2 for the comment’s line integer.
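The relative encoding described above can be reproduced with a small sketch that turns absolute token positions, listed in the order the tokenizer emits them, into relative line and start character values. The token positions are illustrative and not taken from the ShopifyQL test suite, the names `AbsoluteToken` and `toRelative` are hypothetical, and the rule that the start character is only relative when the token stays on the same line is an assumption borrowed from LSP's relative token encoding.

```typescript
// Absolute position of a token, zero-based.
interface AbsoluteToken {
  line: number;
  startChar: number;
  length: number;
}

// Convert tokens (in emission order) to [deltaLine, deltaStartChar, length].
function toRelative(tokens: readonly AbsoluteToken[]): [number, number, number][] {
  let prevLine = 0;
  let prevChar = 0;
  return tokens.map((token): [number, number, number] => {
    const deltaLine = token.line - prevLine;
    // Assumption: on a new line the start character counts from the line
    // start; on the same line it counts from the previous token's start.
    const deltaChar = deltaLine === 0 ? token.startChar - prevChar : token.startChar;
    prevLine = token.line;
    prevChar = token.startChar;
    return [deltaLine, deltaChar, token.length];
  });
}

// Illustrative layout: a 10-character comment on line 0, "FROM products"
// on line 1, and "SHOW product_title" on line 2. The comment is emitted
// last because it lives outside the main channel.
const relative = toRelative([
  { line: 1, startChar: 0, length: 4 },  // FROM
  { line: 1, startChar: 5, length: 8 },  // products
  { line: 2, startChar: 0, length: 4 },  // SHOW
  { line: 2, startChar: 5, length: 13 }, // product_title
  { line: 0, startChar: 0, length: 10 }, // comment, emitted out of order
]);
// relative is [[1,0,4],[0,5,8],[1,0,4],[0,5,13],[-2,0,10]]
```

The last entry shows the negative line value: the pointer sits on line 2 after product_title and has to move back to line 0 for the comment.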

Adapting ShopifyQL’s Token Offset To Work With CodeMirror


CodeMirror treats offset values much more simply than ANTLR. In CodeMirror, everything is relative to the top of the document: the document is treated as one long string of text. This means that newlines and whitespace are meaningful to CodeMirror and affect the start offset of a token.

So to adapt the values from ANTLR to work with CodeMirror, we need to take these values:

And convert them into this:

The solution? A custom TokenIterator that could follow the “directions” of the Language Server’s offsets and convert them along the way. The final implementation of this class was fairly simple, but arriving at this solution was the hard part.

At a high level, the TokenIterator class:

- Takes in the document and derives the length of each line. This means that trailing whitespace is properly represented.
- Internally tracks the current line and character that the iterator points to.
- Ingests the ANTLR-style line, character, and token length descriptors and moves the current line and character to the appropriate place.
- Uses the current line, current character, and line lengths to compute the CodeMirror-style start offset.
- Uses the start offset combined with the token length to compute the end offset.

Here’s what the code looks like:
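The following is a minimal sketch of an iterator along those lines, not the production class. It assumes that every line break counts as a single character, and that the start character is relative to the previous token only when the token stays on the same line. The method name `next` and the `OffsetToken` shape are hypothetical.

```typescript
// A token expressed as absolute document offsets.
interface OffsetToken {
  from: number;
  to: number;
}

class TokenIterator {
  // Absolute offset at which each line starts, derived from line lengths.
  private readonly lineStarts: number[] = [];
  private line = 0; // line the iterator currently points to
  private char = 0; // character within that line

  constructor(doc: string) {
    let offset = 0;
    for (const text of doc.split("\n")) {
      this.lineStarts.push(offset);
      offset += text.length + 1; // +1 for the line break (assumed one character)
    }
  }

  // Follow one relative descriptor and return the token's absolute offsets.
  next(deltaLine: number, deltaChar: number, length: number): OffsetToken {
    if (deltaLine !== 0) {
      // Moving to another line (up or down): character counts from line start.
      this.line += deltaLine;
      this.char = deltaChar;
    } else {
      this.char += deltaChar;
    }
    const lineStart = this.lineStarts[this.line];
    if (lineStart === undefined) {
      throw new RangeError(`Line ${this.line} is outside the document`);
    }
    const from = lineStart + this.char;
    return { from, to: from + length };
  }
}

// Usage with an illustrative two-line query (five spaces before SHOW).
const doc = "FROM products\n     SHOW product_title";
const iterator = new TokenIterator(doc);
iterator.next(0, 0, 4);                       // FROM
iterator.next(0, 5, 8);                       // products
const show = iterator.next(1, 5, 4);          // { from: 19, to: 23 }
const productTitle = iterator.next(0, 5, 13); // { from: 24, to: 37 }
```

Precomputing the line start offsets once keeps each step cheap, and because the leading spaces are part of the line length, whitespace shifts the computed offsets just as it would in the editor.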

Building A Parse Tree

Now that we have a clear way to convert an ANTLR token stream into a Lezer buffer, we’re ready to build our tree! To build it, we follow the steps mentioned previously: we take in a ShopifyQL query, use the language server to convert it to a token stream, transform that stream into a buffer of nodes, and then build a tree from that buffer.
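As a rough illustration of the buffer step, the sketch below flattens offset-based tokens into the four-numbers-per-node layout that Lezer's `Tree.build` accepts when given an array (node type ID, start, end, and size, where a leaf node has size 4). The node type IDs are placeholders, the names `OffsetNode` and `toLezerBuffer` are hypothetical, and sorting by position is an assumption made here because tokens such as comments can arrive out of order.

```typescript
// A token already converted to absolute offsets, with a Lezer node type ID.
interface OffsetNode {
  typeId: number; // placeholder ID; a real adapter would map token types to a NodeSet
  from: number;
  to: number;
}

// Flatten leaf nodes into [typeId, from, to, size] quadruples.
function toLezerBuffer(nodes: readonly OffsetNode[]): number[] {
  // Assumption: nodes should appear in document order, so sort a copy first.
  const ordered = [...nodes].sort((a, b) => a.from - b.from);
  const buffer: number[] = [];
  for (const { typeId, from, to } of ordered) {
    buffer.push(typeId, from, to, 4); // 4 = this leaf node's own slots
  }
  return buffer;
}

const buffer = toLezerBuffer([
  { typeId: 2, from: 19, to: 23 },
  { typeId: 1, from: 0, to: 4 },
]);
// buffer is [1, 0, 4, 4, 2, 19, 23, 4]
```

In an adapter, an array like this would then be handed to `Tree.build` from `@lezer/common` together with a node set and a top node type ID. That call is left out here so the sketch stays dependency-free.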

Once the parse tree is generated, CodeMirror then “understands” ShopifyQL and provides useful language features such as syntax highlighting.

Providing Additional Language Features

By this point, CodeMirror can talk to the ShopifyQL Language Server and build a parse tree that describes the ShopifyQL code. However, the language server offers other useful features like code completion, linting, and tooltips. As mentioned above, Lezer/CodeMirror doesn’t conform to LSP, but it does offer many plugins that let us provide a connector between our language server and CodeMirror. In order to provide these features, we adapted the language server’s doValidate with CodeMirror’s linting plugin, the language server’s doComplete with CodeMirror’s autocomplete plugin, and the language server’s doHover with CodeMirror’s requestHoverTooltips plugin.
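Each of these connectors has to translate between LSP's line and character positions and CodeMirror's document offsets. As one example, here is a minimal sketch of converting an LSP-style diagnostic into an offset-based one of the kind a lint plugin consumes. The interfaces are local stand-ins modeled on the LSP diagnostic shape and on CodeMirror's lint diagnostics rather than imports from either library, and `toEditorDiagnostic` is a hypothetical name.

```typescript
interface Position { line: number; character: number; }

// Stand-in for an LSP diagnostic (severity: 1 error, 2 warning, 3 info, 4 hint).
interface LspDiagnostic {
  range: { start: Position; end: Position };
  message: string;
  severity?: 1 | 2 | 3 | 4;
}

// Stand-in for an offset-based editor diagnostic.
interface EditorDiagnostic {
  from: number;
  to: number;
  severity: "error" | "warning" | "info";
  message: string;
}

// Convert a line/character position into an absolute document offset.
function positionToOffset(doc: string, pos: Position): number {
  const lines = doc.split("\n");
  let offset = 0;
  for (let i = 0; i < pos.line && i < lines.length; i++) {
    offset += lines[i].length + 1; // +1 for the line break
  }
  return Math.min(offset + pos.character, doc.length);
}

function toEditorDiagnostic(doc: string, diagnostic: LspDiagnostic): EditorDiagnostic {
  // Hints are folded into "info" here; a missing severity is treated as an error.
  const severity =
    diagnostic.severity === 2 ? "warning" :
    diagnostic.severity === 3 || diagnostic.severity === 4 ? "info" : "error";
  return {
    from: positionToOffset(doc, diagnostic.range.start),
    to: positionToOffset(doc, diagnostic.range.end),
    severity,
    message: diagnostic.message,
  };
}
```

Completion and hover results need the same kind of position conversion, so a real adapter would likely share `positionToOffset` (or its inverse) across all three connectors.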

Once we connect those features, our ShopifyQL code editor is fully powered up, and we get an assistive, delightful code editing experience.

Conclusion

This approach enabled us to provide ShopifyQL features to CodeMirror while continuing to maintain a grammar that serves both client and server. The custom adapter we created allows us to pass a ShopifyQL query to the language server, adapt the response, and return a Lezer parse tree to CodeMirror, making it possible to provide features like syntax highlighting, code completion, linting, and tooltips. Because our solution utilizes CodeMirror’s internal parse tree, we are able to make better decisions in the code and craft a stronger editing experience. The ShopifyQL code editor helps merchants write ShopifyQL and get access to their data in new and delightful ways.