Building a ShopifyQL Code Editor

Created on November 12, 2023 at 10:34 am

In October 2022, Shopify released ShopifyQL Notebooks, a first-party app that lets merchants analyze their shop data to make better decisions. It puts the power of ShopifyQL into merchants’ hands with a guided code editing experience. To provide a first-class editing experience, we turned to CodeMirror, a code editor framework built for the web. Out of the box, CodeMirror didn’t support ShopifyQL. Here’s how we built it.

ShopifyQL Everywhere

ShopifyQL is an accessible, commerce-focused querying language used on both the client and server. The language is defined by an ANTLR grammar, which is used to generate code for multiple targets (currently, Go and TypeScript). This lets us share the same grammar definition between the client and server despite differences in runtime language. As an added benefit, we have types written in Protobuf so that types can be shared between targets as well.

All the ShopifyQL language features on the front end are encapsulated in a TypeScript language server, which is built on top of the ANTLR TypeScript target. It conforms to Microsoft’s Language Server Protocol (LSP) in order to keep a clear separation of concerns between the language server and a code editor. LSP defines the shape of common language features like tokenization, parsing, completion, hover tooltips, and linting.

When code editors and language servers both conform to LSP, they become interoperable because they speak a common language. For more information about LSP, read the VSCode Language Server Extension Guide.

Connecting The ShopifyQL Language Server To CodeMirror

CodeMirror has its own grammar and parser engine called Lezer. Lezer is used within CodeMirror to generate parse trees, and those trees power many of the editor features. Lezer has support for common languages, but no Lezer grammar exists for ShopifyQL. Lezer also doesn’t conform to LSP. Because ShopifyQL’s grammar and language server had already been written in ANTLR, it didn’t make sense to rewrite what we had as a Lezer grammar. Instead, we decided to create an adapter that would conform to LSP and integrate with Lezer. This allowed us to pass a ShopifyQL query to the language server, adapt the response, and return a Lezer parse tree.

Lezer supports creating a tree in one of two ways:

- Manually creating a tree by creating nodes and attaching them in the correct tree shape
- Generating a tree from a buffer of tokens

The ShopifyQL language server can create a stream of tokens from a document, so it made sense to re-shape that stream into a buffer that Lezer understands.

Converting A ShopifyQL Query Into A Lezer Tree

In order to transform a ShopifyQL query into a Lezer parse tree, the following steps occur:

1. Lezer initiates the creation of a parse tree. This happens when the document is first loaded and any time the document changes.
2. Our custom adapter takes the ShopifyQL query and passes it to the language server.
3. The language server returns a stream of tokens that describe the ShopifyQL query.
4. The adapter takes those tokens and transforms them into Lezer node types.
5. The Lezer node types are used to create a buffer that describes the document.
6. The buffer is used to build a Lezer tree.
7. Finally, the adapter returns the tree to Lezer, completing the parse cycle.

Understanding ShopifyQL’s Token Offset

One of the biggest obstacles to transforming the language server’s token stream into a Lezer buffer was the format of the tokens. Within the ShopifyQL Language Server, the tokens come back as integers in chunks of 5, with the position of each integer having distinct meaning:
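A minimal sketch of that layout in TypeScript, assuming the five positions are, in order, line, start character, length, token type, and token modifier. The interface and function names are illustrative and are not the language server's actual types.

```typescript
// Hypothetical shape for one decoded chunk. The field order is an assumption
// based on the five values this article names.
interface RelativeToken {
  deltaLine: number;      // line, relative to the previous token
  deltaStartChar: number; // start character, relative to the previous token
  length: number;         // number of characters in the token
  tokenType: number;      // index into a list of token types
  tokenModifiers: number; // token modifier value
}

// Splits the flat integer stream into chunks of 5.
function decodeTokens(data: readonly number[]): RelativeToken[] {
  if (data.length % 5 !== 0) {
    throw new Error(`Expected a multiple of 5 integers, got ${data.length}`);
  }
  const tokens: RelativeToken[] = [];
  for (let i = 0; i < data.length; i += 5) {
    tokens.push({
      deltaLine: data[i],
      deltaStartChar: data[i + 1],
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return tokens;
}
```

Decoding into named fields first keeps the later offset arithmetic readable, at the cost of one extra pass over the stream.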

In this context, length, token type, and token modifier were fairly straightforward to use. However, the behavior of line and start character was more difficult to understand. Imagine a simple ShopifyQL query like this:

This query would be tokenized like this:

In the stream of tokens, even though product_title is on line 1 (using zero-based indexes), the value for its line integer is zero! This is because the tokenization happens incrementally and each computed offset value is always relative to the previous token. This becomes more confusing when you factor in whitespace. Let’s say that we add five spaces before the word SHOW:

The tokens for this query are:

Notice that only the start character for SHOW changed! It changed from 0 to 5 after adding five spaces before the SHOW keyword. However, product_title’s values remain unchanged. This is because the values are relative to the previous token, and the space between SHOW and product_title didn’t change.
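The relative encoding can be reproduced with a small sketch. It assumes the rule works the way LSP semantic tokens do: the line is always a delta from the previous token, and the start character is a delta only when both tokens share a line. The positions below are hypothetical values for SHOW and product_title on line 1, and the token type and modifier are omitted for brevity.

```typescript
interface AbsoluteToken {
  line: number;      // zero-based line in the document
  character: number; // zero-based column on that line
  length: number;
}

// Encodes absolute positions as [deltaLine, deltaStartChar, length] triples.
// Assumption: the first token is measured from line 0, character 0.
function encodeRelative(tokens: readonly AbsoluteToken[]): number[][] {
  let prevLine = 0;
  let prevChar = 0;
  return tokens.map(({ line, character, length }) => {
    const deltaLine = line - prevLine;
    // On a new line the column is absolute; on the same line it is relative.
    const deltaChar = deltaLine === 0 ? character - prevChar : character;
    prevLine = line;
    prevChar = character;
    return [deltaLine, deltaChar, length];
  });
}

// SHOW at column 0, product_title one space after it.
const flush = encodeRelative([
  { line: 1, character: 0, length: 4 },
  { line: 1, character: 5, length: 13 },
]); // → [[1, 0, 4], [0, 5, 13]]

// The same tokens after inserting five spaces before SHOW.
const indented = encodeRelative([
  { line: 1, character: 5, length: 4 },
  { line: 1, character: 10, length: 13 },
]); // → [[1, 5, 4], [0, 5, 13]]
```

Only the triple for SHOW differs between the two results. The triple for product_title is identical because its distance from SHOW is the same in both cases.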

This becomes especially confusing when you use certain language features that are parsed out of order. For example, in some ANTLR grammars, comments are not parsed as part of the default channel; they are parsed after everything in the main channel is parsed. Let’s add a comment to the first line:

The tokens for this query look like this (and are in this order):

Before the parser parses the comment, it points at product_title, which is two lines after the comment. When the parser finishes with the main channel and begins parsing the channel that contains the comment, the pointer needs to move two lines up to tokenize the comment, hence the value of -2 for the comment’s line integer.

Adapting ShopifyQL’s Token Offset To Work With CodeMirror

CodeMirror treats offset values much more simply than ANTLR. In CodeMirror, everything is relative to the top of the document: the document is treated as one long string of text. This means that newlines and whitespace are meaningful to CodeMirror and affect the start offset of a token.

So to adapt the values from ANTLR to work with CodeMirror, we need to take these values:

And convert them into this:

The solution? A custom TokenIterator that could follow the “directions” of the Language Server’s offsets and convert them along the way. The final implementation of this class was fairly simple, but arriving at this solution was the hard part.

At a high level, the TokenIterator class:

- Takes in the document and derives the length of each line. This means that trailing whitespace is properly represented.
- Internally tracks the current line and character that the iterator points to.
- Ingests the ANTLR-style line, character, and token length descriptors and moves the current line and character to the appropriate place.
- Uses the current line, current character, and line lengths to compute the CodeMirror-style start offset.
- Uses the start offset combined with the token length to compute the end offset.

Here’s what the code looks like:
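A minimal sketch of such an iterator, reconstructed from the description above rather than taken from the production code. It assumes lines are separated by a single "\n", that the first token is measured from line 0, character 0, and that a non-zero line delta resets the character to an absolute column. The method name `next` and the `TokenRange` type are hypothetical.

```typescript
interface TokenRange {
  from: number; // start offset from the top of the document
  to: number;   // end offset (exclusive)
}

class TokenIterator {
  // Offset at which each line starts, derived from the line lengths.
  private readonly lineStarts: number[] = [];
  private line = 0;
  private character = 0;

  constructor(doc: string) {
    let offset = 0;
    for (const text of doc.split("\n")) {
      this.lineStarts.push(offset);
      offset += text.length + 1; // +1 accounts for the newline itself
    }
  }

  // Moves the pointer by one relative token and returns absolute offsets.
  next(deltaLine: number, deltaStartChar: number, length: number): TokenRange {
    if (deltaLine === 0) {
      this.character += deltaStartChar; // same line: column is relative
    } else {
      this.line += deltaLine;           // may be negative, e.g. for comments
      this.character = deltaStartChar;  // new line: column is absolute
    }
    const lineStart = this.lineStarts[this.line];
    if (lineStart === undefined) {
      throw new RangeError(`Token points at line ${this.line}, outside the document`);
    }
    const from = lineStart + this.character;
    return { from, to: from + length };
  }
}

// Hypothetical document with a comment that is tokenized last.
const sampleDoc = "-- hi\nFROM products\nSHOW product_title";
const iterator = new TokenIterator(sampleDoc);
iterator.next(1, 0, 4);   // FROM          → { from: 6, to: 10 }
iterator.next(0, 5, 8);   // products      → { from: 11, to: 19 }
iterator.next(1, 0, 4);   // SHOW          → { from: 20, to: 24 }
iterator.next(0, 5, 13);  // product_title → { from: 25, to: 38 }
iterator.next(-2, 0, 5);  // -- hi         → { from: 0, to: 5 }
```

Precomputing the line starts once per document keeps each call to `next` constant time, which matters because the tree is rebuilt whenever the document changes.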

Building A Parse Tree

Now that we have a clear way to convert an ANTLR token stream into a Lezer buffer, we’re ready to build our tree! To build it, we follow the steps mentioned previously: we take in a ShopifyQL query, use the language server to convert it to a token stream, transform that stream into a buffer of nodes, and then build a tree from that buffer.
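The buffer step can be sketched as follows. This assumes the flat layout that Lezer's `Tree.build` documents for array buffers: four numbers per node (node type ID, start offset, end offset, and the space the node occupies in the array, which is 4 for a leaf). The `AdaptedToken` type and the type IDs are hypothetical, and the call to `Tree.build` is left out so the sketch stays self-contained.

```typescript
// A token after its offsets have been converted to document offsets.
interface AdaptedToken {
  typeId: number; // hypothetical ID pointing into a Lezer NodeSet
  from: number;
  to: number;
}

// Flattens tokens into [typeId, from, to, size] quadruples in document order.
function buildBuffer(tokens: readonly AdaptedToken[]): number[] {
  // Tokens can arrive out of order (comments come last), so sort by position.
  const ordered = [...tokens].sort((a, b) => a.from - b.from);
  const buffer: number[] = [];
  for (const token of ordered) {
    buffer.push(token.typeId, token.from, token.to, 4); // 4 = leaf, no children
  }
  return buffer;
}

// Assumption: the result would then be handed to Lezer along the lines of
// Tree.build({ buffer, nodeSet, topID }) from @lezer/common.
```

Sorting is the part that is easy to miss: a comment tokenized after the main channel still has to sit at its document position in the buffer.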

Once the parse tree is generated, CodeMirror then “understands” ShopifyQL and provides useful language features such as syntax highlighting.

Providing Additional Language Features

By this point, CodeMirror can talk to the ShopifyQL Language Server and build a parse tree that describes the ShopifyQL code. However, the language server offers other useful features like code completion, linting, and tooltips. As mentioned above, Lezer/CodeMirror doesn’t conform to LSP, but it does offer many plugins that let us provide a connector between our language server and CodeMirror. In order to provide these features, we adapted the language server’s doValidate with CodeMirror’s linting plugin, the language server’s doComplete with CodeMirror’s autocomplete plugin, and the language server’s doHover with CodeMirror’s requestHoverTooltips plugin.
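As an illustration of what one of these connectors involves, here is a sketch of the linting case: converting LSP-style diagnostics, which use line and character positions, into the offset-based shape an editor like CodeMirror expects. The types are simplified local stand-ins modeled on the LSP `Diagnostic` and on CodeMirror's lint diagnostics, and the function names are hypothetical. This is not the adapter we shipped.

```typescript
interface LspPosition { line: number; character: number; }

interface LspDiagnostic {
  range: { start: LspPosition; end: LspPosition };
  severity?: 1 | 2 | 3 | 4; // LSP: 1 Error, 2 Warning, 3 Information, 4 Hint
  message: string;
}

interface EditorDiagnostic {
  from: number; // offsets from the top of the document
  to: number;
  severity: "error" | "warning" | "info" | "hint";
  message: string;
}

const SEVERITIES: Record<number, EditorDiagnostic["severity"]> = {
  1: "error", 2: "warning", 3: "info", 4: "hint",
};

// Converts a line/character position into a single document offset.
function toOffset(doc: string, pos: LspPosition): number {
  const lines = doc.split("\n");
  let offset = 0;
  for (let i = 0; i < pos.line && i < lines.length; i++) {
    offset += lines[i].length + 1; // +1 for the newline
  }
  return Math.min(offset + pos.character, doc.length);
}

function toEditorDiagnostics(doc: string, diagnostics: readonly LspDiagnostic[]): EditorDiagnostic[] {
  return diagnostics.map((d) => ({
    from: toOffset(doc, d.range.start),
    to: toOffset(doc, d.range.end),
    // Assumption: a missing severity is treated as an error.
    severity: SEVERITIES[d.severity ?? 1] ?? "error",
    message: d.message,
  }));
}
```

Completion and hover follow the same pattern in the other direction: the editor's offset is converted to a line and character before the language server is asked for results.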

Once we connect those features, our ShopifyQL code editor is fully powered up, and we get an assistive, delightful code editing experience.

Conclusion

This approach enabled us to provide ShopifyQL features to CodeMirror while continuing to maintain a grammar that serves both client and server. The custom adapter we created allows us to pass a ShopifyQL query to the language server, adapt the response, and return a Lezer parse tree to CodeMirror, making it possible to provide features like syntax highlighting, code completion, linting, and tooltips. Because our solution utilizes CodeMirror’s internal parse tree, we are able to make better decisions in the code and craft a stronger editing experience. The ShopifyQL code editor helps merchants write ShopifyQL and get access to their data in new and delightful ways.
