Bring Your Own Markup to Edge Delivery Services with Content Overlays
Many developers new to Adobe's Edge Delivery Services wonder if, and how, it supports dynamic site behavior and back end integrations. Some misconceptions I've heard: "it's just a SPA," "it's just a static site generator," or "you can't build complex sites on it like you can on traditional AEM."
In this post, I attempt to debunk some of those assumptions by exploring an enterprise-grade integration use case: dynamically rendering thousands of detail pages powered by data from an external system such as a database, PIM, search appliance, or other common systems you might integrate into a marketing site.
Data Integrations in Edge Delivery Services
There are several ways to approach these integrations, which I'll touch on while also highlighting an approach based on newer, lesser-known features of Edge Delivery Services' core architecture.
More specifically, we'll explore how you can publish your own markup by bringing your own content system (i.e. not SharePoint, Google, AEM, or Adobe's own Document Authoring) alongside any Adobe-supported sources you may already be using to author your Edge Delivery site.
Recap of Content Publishing Architecture
Let’s quickly recap how content is published and delivered in Edge Delivery Services.
Your content is authored in a document format (via document-based authoring) or in AEM's JCR format using the Universal Editor (a.k.a. "Crosswalk"). When an author Previews or Publishes the page, the authoring interface sends a request to Edge Delivery's /preview or /live Admin API.
These APIs trigger the Helix Admin service, which fetches content from the configured source (as defined in fstab.yaml or the configuration service). That content is transformed into markdown by internal microservice adapters and stored in the Content Bus (backed by AWS S3).
When a user requests a page, the Helix pipeline service fetches the markdown content from the Content Bus, transforms it into standardized HTML, and serves it as a cached static file via the CDN.
Notably, there’s no custom backend rendering logic like in AEM’s traditional delivery model. This means no HTL, Sling Models, OSGi services, etc. The final HTML is generated based on the structure of your document.
At this point it might seem like any integration with external systems must happen client-side, but as we’ll explore next, that’s not the case.
An Integration Scenario
While client-side integrations are ideal for real-time data like product pricing or inventory, server-side rendering (or a hybrid of both) is often better when content changes infrequently or SEO and indexing are priorities, especially when no real-time API is available.
Take, for example, a commerce site with thousands of product detail pages, or a directory with profile pages for each person. If the data lives in an external PIM or database, and we need these pages to be server-rendered for SEO or performance reasons, how might we approach this within Edge Delivery’s architecture?
Common Approaches to Data-Driven Integrations
Let's cover some common integration patterns for rendering data-driven detail pages from external systems in Edge Delivery Services, ranging from client-side to purely back-end approaches.
Client-side API calls
While this approach doesn't satisfy any server-side rendering requirements, it's still worth a mention. In this setup, you often rely on the URL containing an expected pattern with an identifier that is used to query an API.
Some past Edge Delivery implementations of this approach used a feature known as Folder Mapping, which allowed your detail URLs to be dynamic: an HTTP 200 response is returned even if the requested resource does not physically exist in Edge Delivery's Content Bus. JavaScript then parses the URL and invokes fetch calls to a remote API endpoint.
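As a rough sketch of that pattern (assuming a hypothetical https://api.example.com/people endpoint and the standard block decoration entry point; names here are illustrative, not from the original implementation), the client-side code might look something like this:

export default async function decorate(block) {
  // Parse the slug from the mapped URL, e.g. /people/marty-mcfly -> marty-mcfly
  const slug = window.location.pathname.split('/').pop();
  try {
    const resp = await fetch(`https://api.example.com/people?nameSlug=${encodeURIComponent(slug)}`);
    if (!resp.ok) throw new Error(`API returned ${resp.status}`);
    const person = await resp.json();
    block.innerHTML = `<h1>${person.firstName} ${person.lastName}</h1><p>${person.website}</p>`;
  } catch (e) {
    // The server already answered with HTTP 200, so the best we can do
    // client-side is render a "not found" state.
    block.textContent = 'Profile not found.';
  }
}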
While this approach has worked for many, it has some drawbacks worth highlighting:
- Page load performance is impacted if your third-party data API is slow.
- Search engines favor server-side rendered HTML at crawl time. If the client-side fetch call results in a "not found" scenario, there isn't a good way to tell search engines about this since they still see the HTTP 200 server response, potentially causing undesirable indexing behavior.
- No sitemap inclusion by default.
- And most notably, the Folder Mapping feature is now deprecated!
Folder Mapping is Deprecated
At the time of this writing, the Folder Mapping documentation does not mention what exactly should replace the deprecated feature, likely because there are several other approaches to this solution which may involve edge workers in the CDN or some of the other approaches described below.
Data Sync into the CMS
Another common approach is to sync the data as a native content format into your authoring environment via scheduled job or webhook.
If you're using document-based authoring with SharePoint or Google Docs, you might use the APIs of those platforms to programmatically manage the documents for each detail page.
If you're using AEM with Universal Editor (Crosswalk), you're likely writing Java-based Sling Servlets and OSGi services to manage cq:Pages in the Author instance.
If you're using DA, you might use its Admin APIs to manage documents which are stored in DA's "Author Bus."
Once the data is synced into the authoring environment in the right format, the Helix (AEM) Admin API is invoked to preview and publish the page to Edge Delivery Services.
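As a minimal sketch of that last step (assuming Node.js 18+ with global fetch; the owner/repo/branch values and the token are placeholders for your own setup), a sync job might finish with something like:

// Hypothetical final step of a sync job: preview, then publish, the page
// for the document we just created or updated in the CMS.
async function previewAndPublish(path, token) {
  const site = 'acme/website/main'; // placeholder owner/repo/branch
  for (const phase of ['preview', 'live']) {
    const resp = await fetch(`https://admin.hlx.page/${phase}/${site}${path}`, {
      method: 'POST',
      headers: { 'x-auth-token': token },
    });
    if (!resp.ok) {
      throw new Error(`${phase} failed for ${path}: ${resp.status}`);
    }
  }
}

// e.g. await previewAndPublish('/people/marty-mcfly', process.env.ADMIN_TOKEN);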
All of these options ensure that each programmatically managed page exists as a physical document in the CMS, giving authors flexibility to enhance it with additional content. You also gain the benefits of strong performance, high cacheability, and indexability via sitemap.xml and query-index.json.
The last two options, AEM and DA, are actually forms of BYO Content, meaning they are systems that provide HTML in the expected Helix format. This differs from SharePoint and Google Docs, which don't provide HTML outright but are instead integrated in their own way to convert those documents into a native content format.
A snippet of the Author layer of Helix 5's architecture. See the full diagram here.
As shown in the diagram, Edge Delivery's core architecture includes built-in adapters for Google Docs and Microsoft Word, reflecting its early days (back when it was known as "Helix," then "Franklin," and a few other name iterations) when those were the only supported content sources.
A Not-So-Common Approach to Data-Driven Integrations
The extensibility of Edge Delivery Services changed significantly when the concept of BYO Content was introduced. While AEM Author and DA were the first Adobe authoring systems to implement this architecture, Adobe has effectively created an extension point that allows you to bring your own system to the table via a feature called Bring Your Own Markup (BYOM), documented on aem.live here: https://www.aem.live/developer/byom
The concept is actually quite simple: no matter what system you configure as your site's content source, as long as it produces Helix-compliant HTML, it can be integrated into the publishing flow I summarized earlier. This has massive implications for how we can generate native Edge Delivery pages in a dynamic fashion. Adobe's documentation even hints at a powerful use case with this excerpt:
A typical use case for such a setup is, for example, the automatic publishing of product pages directly from the commerce backend.
The Setup
You can choose to use your custom content source as the primary source for your site, or you can configure it to be complementary to your primary source.
The latter is probably more desirable, given that you may want to use a standard Adobe-supported CMS tool as your primary content source (i.e. SharePoint, Google, AEM, or DA) and then layer your custom content source over it. This setup is referred to as a Content Overlay, a complementary feature of BYOM.
The overlay server need not be a custom server; it can also be AEM or DA, as those already follow the BYOM model.
Adobe's documentation demonstrates what this setup looks like when using the API-based Configuration Service via the Helix Admin APIs, since this setup cannot be configured via the older fstab.yaml file approach:
curl -X PUT https://admin.hlx.page/config/acme/sites/website.json \
  -H 'content-type: application/json' \
  -H 'x-auth-token: ' \
  --data '{
    "version": 1,
    "code": {
      "owner": "acme",
      "repo": "website"
    },
    "content": {
      "source": {
        "url": "https://acme.sharepoint.com/sites/aem/Shared%20Documents/website"
      },
      "overlay": {
        "url": "https://content-service.acme.com/data",
        "type": "markup"
      }
    }
  }'
Note the overlay property and the fact that it can be any arbitrary URL endpoint. It's also worth noting that only a single overlay can be configured; however, if needed, the overlay URL can point to a middleware service of yours that delegates content fetching to multiple downstream systems depending on the URL path being requested.
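For example, a thin Express-based overlay (sketched below with hypothetical downstream endpoints; this is an assumption, not a prescribed setup) could be the single configured overlay URL and fan out by path:

const express = require('express');
const app = express();

// Hypothetical middleware overlay: a single URL configured in Edge Delivery,
// delegating to different downstream systems based on the requested path.
app.get('/people/:slug', (req, res) => proxyTo('https://people-service.acme.com', req, res));
app.get('/products/:sku', (req, res) => proxyTo('https://pim.acme.com', req, res));

async function proxyTo(baseUrl, req, res) {
  const downstream = await fetch(`${baseUrl}${req.path}`);
  if (!downstream.ok) return res.status(downstream.status).send();
  res.type('html').send(await downstream.text());
}

app.listen(3000);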
How exactly does this all work? Let's dive deeper into what happens during the publishing process of Edge Delivery pages when an overlay is configured for your site.
The Overlayed Publishing Flow
Earlier I described how the publishing flow works when a document is published. When an overlay is configured, there are some slight differences in this flow.
When a document is published, the Helix Admin service will:
- Request that document's URL path from the overlay URL. For example, if our configured overlay URL is https://content-service.acme.com/data and the Admin API was invoked to publish a page at the path /people/marty-mcfly, Helix Admin sends this GET request to our overlay server: https://content-service.acme.com/data/people/marty-mcfly
- If the overlay URL returns a document (HTTP 200), then that is used to create the page. The primary content source from your Edge Delivery site's config is not queried at all.
- If the overlay URL returns a 404, Helix Admin then makes a fallback request to the site's primary content source.
- Add the fetched document to the Content Bus, where it's made available for delivery to your site users.
Now that we understand how to configure a content overlay for our site, let's render data from a third-party back end into our server-side Edge Delivery markup.
Bringing Your Own Markup with Integrations
We know that the overlay server must produce Helix-compliant HTML. One approach to this is to construct your own HTML as long as you follow the guidelines described in the BYOM documentation.
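As a simplified illustration only (consult the BYOM documentation for the authoritative requirements), the expected markup generally mirrors the semantic structure Edge Delivery produces from documents: metadata in the head, and a body containing a header, a main element whose child divs are sections (with blocks as class-named divs), and a footer. The block name below is hypothetical:

<!DOCTYPE html>
<html>
  <head>
    <title>Marty McFly</title>
    <meta name="description" content="Profile page for Marty McFly">
  </head>
  <body>
    <header></header>
    <main>
      <div>
        <h1>Marty McFly</h1>
        <p>Hill Valley, CA</p>
        <!-- a hypothetical block named "contact-info"; rows and cells as nested divs -->
        <div class="contact-info">
          <div>
            <div>Website</div>
            <div>hoverboardhero.com</div>
          </div>
        </div>
      </div>
    </main>
    <footer></footer>
  </body>
</html>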
An easier approach is to start with an existing Edge Delivery Services page as your template, then replace specific elements with data from your third-party source. This avoids building HTML from scratch and gives authors the ability to edit the template if it’s a published page within your Edge Delivery site. It could also be a static file in your overlay server.
Using this template model, you can define placeholders that your overlay logic replaces with dynamic data at publishing time. For example, a profile detail page might display a person’s name and contact details by mapping JSON paths to values returned from an external API.
Let's assume we have a data object with the following JSON structure:
{
  "id": "4983",
  "firstName": "Marty",
  "lastName": "McFly",
  "slug": "marty-mcfly",
  "email": "[email protected]",
  "address": {
    "street": "9303 Lyon Drive",
    "city": "Hill Valley",
    "state": "CA",
    "zipcode": "95420",
    "geo": {
      "lat": "34.052235",
      "lng": "-118.243683"
    }
  },
  "phone": "(555) 123-4567",
  "website": "hoverboardhero.com"
}
The data placeholders that are authored in your content template might look like this:
Contact Info:
Name: {{person.firstName}} {{person.lastName}}
Email: {{person.email}}
Website: {{person.website}}
Phone: {{person.phone}}
Address:
Street: {{person.address.street}}
City: {{person.address.city}}
Zip code: {{person.address.zipcode}}
State: {{person.address.state}}
We could also add a Schema.org object with placeholders into the json-ld field in our document's metadata block, giving us a server-side rendered JSON-LD script tag.
Metadata block containing data placeholders
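For illustration, the json-ld metadata value might hold a hypothetical Schema.org Person object using the same placeholder syntax (the mapping of fields here is an assumption based on the data structure above, not the exact template):

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "{{person.firstName}} {{person.lastName}}",
  "email": "{{person.email}}",
  "telephone": "{{person.phone}}",
  "url": "https://{{person.website}}",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "{{person.address.street}}",
    "addressLocality": "{{person.address.city}}",
    "addressRegion": "{{person.address.state}}",
    "postalCode": "{{person.address.zipcode}}"
  }
}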
Your overlay server should be configured to handle requests sent to a given route such as /people/marty-mcfly. In this example, we'll use a Node.js Express server with a route for GET requests to /people/:name, invoked for any publish requests for paths under /people.
Integration Logic Sequence
The sequence might look something like this:
Edge Delivery Services Overlays - Dynamic Content Generation Sequence Diagram
- A data object is updated in your back end data source.
- Your back end data source calls the Helix Admin /preview and /live APIs, passing the URL path to the Edge Delivery page representing that data object.
- Helix Admin invokes your overlay server with that page path.
- Overlay server reads the ID from the page path (in this example, marty-mcfly) and uses that to fetch the JSON data for that profile from an API.
- Overlay server fetches the template page from the Edge Delivery site.
- Overlay server replaces the placeholders with the JSON data fetched from the API for that profile.
- The final HTML is returned to Helix Admin, which then creates that as a native Edge Delivery page within your site.
Here's what the code in our overlay server might look like for the above flow:
const express = require('express');
const axios = require('axios');

const app = express();

// People route with name parameter
app.get('/people/:name', async (req, res) => {
  try {
    const urlName = req.params.name.toLowerCase();
    const apiUrl = new URL("https://api.example.com/people");
    apiUrl.searchParams.append('nameSlug', urlName);

    // Get person data from API
    const response = await axios.get(apiUrl.toString());
    const personData = response.data;

    // Return 404 to Helix Admin if person data is not found
    if (!personData) {
      return res.status(404).send();
    }

    // Fetch and populate template
    const templateUrl = `https://main--people-directory--ericvangeem.aem.live/templates/person-detail`;
    const html = await fetchAndPopulateTemplate(templateUrl, personData);
    res.send(html);
  } catch (error) {
    res.status(500).send();
  }
});
The fetchAndPopulateTemplate function contains the logic to replace the placeholders with the fetched data. In this example, the placeholders are actually JSON paths to the data properties:
// Replace all {{path.to.property}} placeholders, including URL-encoded versions
html = html.replace(/(%7B%7B|{{)([^}%]+)(%7D%7D|}})/g, (match, openBrace, path, closeBrace) => {
  // Decode the path in case it contains URL-encoded characters
  const decodedPath = decodeURIComponent(path);
  const value = decodedPath.split('.').reduce((obj, key) => obj?.[key], personData);
  return value !== undefined ? value : match;
});
This approach can go beyond simple placeholder replacement. For example, you can inject images using public, fully qualified URLs from any source. During publishing, Helix Admin will download the image and upload it to the Media Bus, converting it into a native Edge Delivery media URL. You can also conditionally add or remove entire HTML sections based on data-driven logic.
An Enterprise-grade Example
Now that we understand how to create native Edge Delivery pages using data from a third-party back end, let's wrap up this post with a scalable example of creating and managing thousands of these data detail pages using Microsoft Azure.
Let's build on the "profile directory" example from earlier. We'll assume these people profiles are all stored in an Azure Cosmos DB, and we have an Azure Function registered as a Cosmos DB Trigger. Any time a profile is added, updated, or deleted in the Cosmos DB, the Function Trigger adds a message to a Service Bus queue containing the payload and operation of that data event. We'll also have a separate Function registered as a Timer Trigger that periodically polls the Service Bus queue for messages. This Timer Trigger is responsible for invoking the Helix Admin /preview and /live APIs for any messages it pops from the queue, and does so in a throttled manner to ensure we don't hit the Admin API rate limits.
This results in an architecture that might look like this:
Example architecture of an Azure-based integration with a custom Overlay server
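To make the throttled publishing step concrete, here is a rough sketch of what the Timer-triggered Function could look like (assumptions: the Node.js Azure Functions programming model, the @azure/service-bus SDK, a placeholder queue name and message shape, and Node 18+ global fetch; this is a sketch, not the exact implementation):

const { ServiceBusClient } = require('@azure/service-bus');

// Hypothetical Timer-triggered Azure Function: drain a batch of messages from
// the Service Bus queue and publish the corresponding Edge Delivery pages,
// throttled to stay under the Admin API rate limits.
module.exports = async function (context, timer) {
  const sbClient = new ServiceBusClient(process.env.SERVICE_BUS_CONNECTION);
  const receiver = sbClient.createReceiver('profile-events'); // placeholder queue name

  try {
    const messages = await receiver.receiveMessages(25, { maxWaitTimeInMs: 5000 });
    for (const message of messages) {
      const { slug } = message.body; // assumed message shape, e.g. { slug: 'marty-mcfly', operation: 'upsert' }
      const path = `/people/${slug}`;

      // Preview, then publish, via the Helix Admin API
      for (const phase of ['preview', 'live']) {
        await fetch(`https://admin.hlx.page/${phase}/ericvangeem/people-directory/main${path}`, {
          method: 'POST',
          headers: { 'x-auth-token': process.env.HLX_ADMIN_TOKEN },
        });
      }

      await receiver.completeMessage(message);

      // Crude throttle between pages
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  } finally {
    await receiver.close();
    await sbClient.close();
  }
};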
For the Overlay server, I've chosen to use a Node.js Express server deployed on Vercel. This detail doesn't really matter; I could have kept everything in Azure using any language or web framework of choice. When my overlay server receives a /people/:name request from Helix Admin, it calls our third-party profile API, which is yet another Azure Function that queries the Cosmos DB for that person, then runs through the sequence I described earlier to return the data-integrated, Helix-compliant HTML back to Edge Delivery Services, thus creating the page in our site.
Our Edge Delivery site's content configuration ends up looking like this:
curl -X PUT https://admin.hlx.page/config/ericvangeem/sites/people-directory.json \
  -H 'content-type: application/json' \
  -H 'x-auth-token: ' \
  --data '{
    "version": 1,
    "code": {
      "owner": "ericvangeem",
      "repo": "people-directory",
      "source": {
        "type": "github",
        "url": "https://github.com/ericvangeem/people-directory"
      }
    },
    "content": {
      "source": {
        "url": "https://content.da.live/ericvangeem/people-directory/",
        "type": "markup"
      },
      "overlay": {
        "url": "https://www.vercel.app/", // note: intentionally fake Vercel app URL to prevent abuse.
        "type": "markup"
      }
    }
  }'
Implementation
I've implemented this architecture in my example site using around 10,000 profile objects stored in my DB, though this can certainly scale to a much higher number. All data shown here is randomly generated. Here are the sitemap.xml and query-index.json demonstrating that these are now real server-side rendered (or "prerendered") pages in Edge Delivery, based on an external data source:
- Sitemap.xml: https://main--people-directory--ericvangeem.aem.live/sitemap.xml
- Query-index.json: https://main--people-directory--ericvangeem.aem.live/profile-index.json?limit=100
Additionally, you can see the template page used for merging data along with an example detail page:
- Template page: https://main--people-directory--ericvangeem.aem.live/templates/person-detail
- Example detail page: https://main--people-directory--ericvangeem.aem.live/people/marty-mcfly
On the example detail page, you'll notice all placeholders have been replaced with the back end data for that profile, which came from our Azure Cosmos DB. If you View Source to see the server-rendered HTML, you'll see the JSON-LD script tag as well.
Conclusion
The Bring Your Own Markup (BYOM) feature offers a powerful way to integrate your existing content into an Edge Delivery Services site without a content migration, while still benefiting from Edge Delivery's performance and server-side HTML rendering.
Adobe is already leveraging this architecture for Edge Delivery's Commerce Storefront to enable server-rendered product detail pages. It appears that Adobe I/O App Builder is used to support the integration, as you can see in their GitHub project aem-commerce-prerender.
My example focuses on a profile directory use case, but it can apply to any site with dynamic detail page patterns. The choice of supporting back end services is a matter of preference; I used Azure, but you might use something else depending on your stack.
Happy integrating!
Additional Resources
- My Overlay server deployed to Vercel: https://github.com/ericvangeem/eds-overlay-server
- My Azure services: https://github.com/ericvangeem/azure-user-processor
- Bring Your Own Markup documentation: https://www.aem.live/developer/byom
Note: these projects were created to support this blog post and are meant for conceptual reference only.